CN109522299B - Data processing method, device, system and storage medium - Google Patents

Data processing method, device, system and storage medium

Info

Publication number
CN109522299B
Authority
CN
China
Prior art keywords
data
key value
hot spot
spatial
hotspot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811240090.6A
Other languages
Chinese (zh)
Other versions
CN109522299A (en)
Inventor
丁鹏云
沈军栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MIGU Digital Media Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
MIGU Digital Media Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIGU Digital Media Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN201811240090.6A
Publication of CN109522299A
Application granted
Publication of CN109522299B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a data processing method, apparatus, system, and storage medium. The data processing method comprises: performing at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results; obtaining a spatial coordinate value for each key value data by fusing the spatial mapping conversion results corresponding to that key value data; acquiring a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values having the highest occurrence counts in the statistical result; and determining that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data. The data processing method identifies hotspot data efficiently and reduces CPU consumption on the client.

Description

Data processing method, device, system and storage medium
Technical Field
The present invention relates to the field of data caching, and in particular, to a data processing method, apparatus, system, and storage medium.
Background
The service platform depends heavily on the cache: most of the data corresponding to the main service content is stored in the cache. The cache must therefore meet high requirements on performance and robustness, withstand highly concurrent access, and be supported by a complete system for disaster tolerance, backup, and capacity expansion under load.
When a sudden burst of access occurs, the existing capacity expansion scheme meets the demand for data by copying the full data set of the cache cluster to a new node. Such full-copy schemes do not differentiate between service data, which results in blind consumption of processing resources.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data processing method, apparatus, system and storage medium, so as to solve the problem in the prior art that blind consumption of processing resources is caused due to "no distinction" of service data.
The technical scheme of the embodiment of the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a data processing method, where the data processing method includes:
performing at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results; obtaining a spatial coordinate value for each key value data by fusing the spatial mapping conversion results corresponding to that key value data;
acquiring a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values having the highest occurrence counts in the statistical result;
and determining that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
a conversion module, configured to perform at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results, and to obtain a spatial coordinate value for each key value data by fusing the spatial mapping conversion results corresponding to that key value data;
an acquisition module, configured to acquire a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values having the highest occurrence counts in the statistical result;
and a determining module, configured to determine that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data.
In a third aspect, an embodiment of the present invention further provides a data processing system, where the data processing system includes:
a client, configured to perform at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results; to obtain a spatial coordinate value for each key value data by fusing the spatial mapping conversion results corresponding to that key value data; to acquire a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values having the highest occurrence counts in the statistical result; and to determine that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data;
and a monitoring system, configured to periodically send the hotspot coordinate set to the client at intervals of the preset time length.
In a fourth aspect, an embodiment of the present invention further provides a computer storage medium, which stores an executable program, and when the executable program is executed by a processor, the data processing method according to any of the above embodiments is implemented.
In the technical scheme of the embodiment of the invention, the spatial coordinate value of each key value data is obtained by converting the key value data in each piece of service data, the hotspot coordinate set is generated according to the statistical result of the number of occurrences of the spatial coordinate values within the preset time length, and service data whose spatial coordinate value belongs to the hotspot coordinate set is determined to be hotspot data. Hotspot data is therefore identified efficiently, and because identification only requires a lookup in the hotspot coordinate set, CPU consumption on the client is reduced.
Drawings
FIG. 1 is a flow chart illustrating a data processing method according to a first embodiment of the present invention;
FIG. 2 is a flow chart illustrating a data processing method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a data processing apparatus according to a first embodiment of the present invention;
FIG. 4 is a diagram illustrating a data processing system according to a first embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further elaborated by combining the drawings and the specific embodiments in the specification. It should be understood that the examples provided herein are merely illustrative of the present invention and are not intended to limit the present invention. In addition, the following embodiments are provided as partial embodiments for implementing the present invention, not all embodiments for implementing the present invention, and the technical solutions described in the embodiments of the present invention may be implemented in any combination without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.
It should be noted that, in the embodiments of the present invention, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus including a series of elements includes not only the explicitly recited elements but also other elements not explicitly listed or inherent to the method or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other related elements in a method or apparatus including the element (e.g., steps in a method or elements in an apparatus, such as a unit may be part of a circuit, part of a processor, part of a program or software, etc.).
Fig. 1 is a schematic flow chart of a data processing method according to a first embodiment of the present invention. Referring to fig. 1, the data processing method of this embodiment is applied to a client, and includes:
Step 101: perform at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results, and obtain the spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data.
The service data is request data sent by a service system; the request data carries key value data and is used to access the value corresponding to that key value data. For example, the service data may be request data for a service request sent through a web page after login, or request data for a service request sent through an APP after login. It should be noted that the service system may be located locally on the client side or at a remote end connected to the client over a network.
In an optional embodiment, before the spatial mapping conversion is performed on the key value data in each piece of service data, the method further includes: cleaning the key value data in the received service data according to a preset rule, and using the cleaned key value data as the object of the spatial mapping conversion.
Specifically, routing information and a system flag are generally added as a prefix to the key value data in the service data generated by each service system. For example, in the key sys1_route2group_Liaoning ship, the part of actual interest is "Liaoning ship", and it is the value content corresponding to "Liaoning ship" that is to be acquired. The technical scheme therefore first removes the sys1_route2group_ prefix, eliminating the redundant characters in the key value data. The cleaned key value data obtained in this way greatly reduces the amount of calculation in the subsequent spatial mapping conversion. Optionally, the client locally stores a dictionary for data cleaning in which the prefixes to be removed may be listed, and the client cleans the key value data in the service data according to the dictionary.
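The cleaning step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary contents, the function name `clean_key`, and the longest-prefix-first rule are assumptions, and the example prefix is the one from the text with its spacing normalized.

```python
# Hypothetical cleaning dictionary. The text only states that the prefixes
# to be removed may be listed in a dictionary stored locally on the client.
PREFIX_DICTIONARY = ("sys1_route2group_", "sys1_")


def clean_key(raw_key: str, prefixes=PREFIX_DICTIONARY) -> str:
    """Return raw_key with the first matching routing/system prefix removed."""
    # Assumption: longer prefixes are tried first so the most specific wins.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if raw_key.startswith(prefix):
            return raw_key[len(prefix):]
    # Keys without a known prefix pass through unchanged.
    return raw_key
```

With this sketch, `clean_key("sys1_route2group_Liaoning ship")` yields the bare key "Liaoning ship", which is then the input to the spatial mapping conversion.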
In an optional embodiment, performing at least two different spatial mapping conversions on the key value data in each piece of service data to obtain the spatial mapping conversion results includes: performing a hash operation on the key value data through a first spatial mapping function, a second spatial mapping function, and a third spatial mapping function respectively to obtain three first integer data; and taking the remainder of each of the three first integer data modulo the preset number to obtain three second integer data. Obtaining the spatial coordinate value of each key value data by fusing the spatial mapping conversion results includes: filling the three second integer data into preset bit positions of a third integer data, wherein the third integer data represents the three-dimensional coordinate value of the key value data.
In an optional embodiment, two different spatial mapping conversions may be performed on the key value data in each service data, and a two-dimensional coordinate value used for representing each key value data is obtained by fusing each spatial mapping conversion result corresponding to each key value data.
In another optional implementation manner, four different spatial mapping conversions may be performed on the key value data in each service data, and four-dimensional coordinate values used for representing each key value data are obtained by fusing each spatial mapping conversion result corresponding to each key value data.
Optionally, the key value data is converted through three different spatial mapping functions, which reduces the dimensionality of the data. The three sets of reduced data are projected onto the x, y, and z axes of a three-dimensional coordinate system to obtain a three-dimensional coordinate value representing the key value data. Compared with a two-dimensional coordinate value, a three-dimensional coordinate value provides more classification categories for the same amount of service data, so the granularity of the statistical classification of key value data is effectively improved. Specifically, the three spatial mapping functions are murmur3 (a non-cryptographic hash function suited to general hash-based lookup), sha1 (Secure Hash Algorithm 1), and crc (cyclic redundancy check). Three hash operations are performed, one with each function, to obtain three first integer data; each is then reduced modulo the preset number of hotspot data to be counted to obtain three second integer data.
For example, assume that the system treats the service data corresponding to the top 10 three-dimensional coordinate values in the ranking as hotspot data, that is, the preset number is 10. Then each of the three first integer data obtained from the three hash operations is reduced modulo 10 to obtain three second integer data, and the three second integer data are used as the filling data of one third integer data.
In an optional embodiment, the third integer data is a 32-bit integer in which the lower 4 bits of the upper 8 bits store the murmur3 result after the modulo operation (i.e., the second integer data corresponding to murmur3), the upper 4 bits of the lower 8 bits store the sha1 result after the modulo operation (i.e., the second integer data corresponding to sha1), and the lower 4 bits of the lower 8 bits store the crc result after the modulo operation (i.e., the second integer data corresponding to crc). The single integer obtained in this way is the three-dimensional coordinate value.
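The conversion and packing can be sketched as below. Several details are assumptions rather than statements of the source: the bit layout follows one literal reading of the text (murmur3 in bits 24 to 27, sha1 in bits 4 to 7, crc in bits 0 to 3), the 32-bit x86 variant of MurmurHash3 with seed 0 is used, the sha1 integer is taken from the first four digest bytes, and crc is taken to be CRC-32. The function names are hypothetical.

```python
import hashlib
import zlib

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (_rotl32((k * c1) & MASK32, 15) * c2) & MASK32
        h = (_rotl32(h ^ k, 13) * 5 + 0xE6546B64) & MASK32
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (_rotl32((k * c1) & MASK32, 15) * c2) & MASK32
        h ^= k
    h ^= len(data)
    # Final avalanche mix.
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def spatial_coordinate(key: str, preset_number: int = 10) -> int:
    """Fuse three hash results into one 32-bit coordinate value."""
    # Each axis is stored in 4 bits, so the modulus cannot exceed 16.
    if not 1 <= preset_number <= 16:
        raise ValueError("preset_number must fit in a 4-bit field")
    data = key.encode("utf-8")
    x = murmur3_32(data) % preset_number
    # Assumption: the sha1 "first integer" is the first 4 digest bytes.
    y = int.from_bytes(hashlib.sha1(data).digest()[:4], "big") % preset_number
    z = zlib.crc32(data) % preset_number
    # Assumed layout: x in bits 24-27, y in bits 4-7, z in bits 0-3.
    return (x << 24) | (y << 4) | z
```

With a preset number of 10, each axis takes one of 10 values, so at most 10 × 10 × 10 = 1000 distinct coordinate values exist. Distinct keys can therefore share a coordinate, and a hotspot coordinate identifies a class of keys rather than a single key.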
Step 102: acquire a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set includes a preset number of spatial coordinate values having the highest occurrence counts in the statistical result.
It should be noted that the preset number of spatial coordinate values having the highest occurrence counts in the statistical result are the spatial coordinate values that occur most often; they represent the key value data accessed most frequently at the client within the preset time length.
In an optional embodiment, acquiring the hotspot coordinate set includes: receiving a hotspot coordinate set sent by a monitoring system, wherein the monitoring system periodically sends the hotspot coordinate set at intervals of the preset time length.
Specifically, in an embodiment, the client sends the three-dimensional coordinate values obtained from the spatial mapping conversion to a statistical system. The statistical system forms time series data from the received three-dimensional coordinate values, that is, a mapping between the time at which each three-dimensional coordinate value was received and the value itself, and counts the number of occurrences of each three-dimensional coordinate value in each time period of the preset time length (e.g., one hour). It should be noted that the statistical results of different time periods are independent of each other. The monitoring system periodically acquires the statistical result from the statistical system at intervals of the preset time length, obtains the preset number of three-dimensional coordinate values with the highest occurrence counts, generates the hotspot coordinate set from them, and sends it to the client.
For example, the statistical system forms time series data from the three-dimensional coordinate values received from the client and determines the 10 three-dimensional coordinate values that occur most frequently in each hour. Every hour, the monitoring system obtains the statistical result from the statistical system, generates a hotspot coordinate set from these 10 three-dimensional coordinate values, and sends the hotspot coordinate set to the client.
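The per-period counting on the statistical system side might look like the following sketch. The class name `CoordinateStatistics`, its methods, and the use of integer window indices are hypothetical; the source specifies only that occurrences are counted independently per time period and that the most frequent coordinate values form the hotspot coordinate set.

```python
from collections import Counter


class CoordinateStatistics:
    """Counts coordinate occurrences in independent fixed-length windows."""

    def __init__(self, window_seconds: int = 3600, top_n: int = 10):
        self.window_seconds = window_seconds
        self.top_n = top_n
        self._windows = {}  # window index -> Counter of coordinate values

    def record(self, coordinate: int, timestamp: float) -> None:
        # Each window keeps its own counter, so periods never mix.
        window = int(timestamp // self.window_seconds)
        self._windows.setdefault(window, Counter())[coordinate] += 1

    def hotspot_set(self, window: int) -> set:
        """Return the top_n most frequent coordinate values of one window."""
        counts = self._windows.get(window, Counter())
        return {coord for coord, _ in counts.most_common(self.top_n)}
```

In this sketch the monitoring system would call `hotspot_set` once per period for the window that has just closed and forward the result to the client.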
Step 103: determine that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data.
For example, the client uses the hotspot coordinate set formed by the 10 most frequently occurring three-dimensional coordinate values it has received as the current judgment condition, and performs hotspot identification on subsequently accessed service data according to the three-dimensional coordinate values corresponding to that service data.
In an optional embodiment, the client uses the received hotspot coordinate set as a filtering condition and filters the three-dimensional coordinate values of subsequently accessed service data.
During filtering, if the three-dimensional coordinate value obtained by spatial mapping conversion of the key value data of subsequently accessed service data belongs to the hotspot coordinate set, the service data is determined to be hotspot data.
For example, if the three-dimensional coordinate value corresponding to subsequently accessed service data is one of the 10 three-dimensional coordinate values in the current hotspot coordinate set, the service data is determined to be hotspot data.
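The filtering step reduces to a set membership test per request, as the sketch below illustrates. The function name `split_by_hotspot` and the caller-supplied `coordinate_of` function are hypothetical; any function that maps a key to its spatial coordinate value can be passed in.

```python
def split_by_hotspot(keys, hotspot_set, coordinate_of):
    """Partition keys into (hot, normal) by hotspot-set membership."""
    # A frozenset gives constant-time average lookup per request.
    hotspots = frozenset(hotspot_set)
    hot, normal = [], []
    for key in keys:
        if coordinate_of(key) in hotspots:
            hot.append(key)
        else:
            normal.append(key)
    return hot, normal
```

Because the test is a single lookup in a set of at most the preset number of entries, the cost per request does not depend on how much data the cache holds.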
According to the data processing method, the hotspot coordinate set is acquired, the three-dimensional coordinate values of subsequently accessed service data are filtered against it, and service data whose three-dimensional coordinate value belongs to the hotspot coordinate set is determined to be hotspot data, so hotspot data is identified efficiently. Moreover, because only the hotspot coordinate set is used as the filtering condition, hotspot identification over the full data set is avoided, which reduces CPU consumption on the client.
Optionally, in an embodiment, determining that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data includes: detecting whether a capacity expansion request is received, and, when the capacity expansion request is received, determining that service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data.
Here, after receiving the hotspot coordinate set, the client need not filter subsequent service data immediately; it may instead identify hotspot data only after a trigger condition is met. Specifically, the monitoring system monitors the load of each cache node and, when the load of a cache node is higher than a preset threshold, generates a capacity expansion request and sends it to the client. After receiving the capacity expansion request, the client filters subsequently accessed service data according to the known filtering condition (i.e., the hotspot coordinate set corresponding to the previous time period), thereby identifying hotspot data. The load of a cache node includes at least one of QPS (queries per second) and I/O performance.
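The trigger logic can be sketched in two parts: a threshold check on the monitoring side and a gate on the client side. All names here (`LoadSample`, `overloaded_nodes`, `GatedHotspotIdentifier`), the I/O metric, and the thresholds are illustrative assumptions; the source does not specify how the load is measured or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadSample:
    node_id: str
    qps: float             # queries per second reported by the node
    io_utilization: float  # hypothetical I/O metric, 0.0 to 1.0


def overloaded_nodes(samples, max_qps, max_io_utilization):
    """Monitoring side: list the nodes whose load exceeds a threshold."""
    return [
        s.node_id
        for s in samples
        if s.qps > max_qps or s.io_utilization > max_io_utilization
    ]


class GatedHotspotIdentifier:
    """Client side: identify hotspots only after an expansion request."""

    def __init__(self):
        self.hotspots = frozenset()
        self.expansion_requested = False

    def on_hotspot_set(self, coordinates):
        # Replace the previous period's set with the newly received one.
        self.hotspots = frozenset(coordinates)

    def on_expansion_request(self):
        self.expansion_requested = True

    def is_hot(self, coordinate):
        return self.expansion_requested and coordinate in self.hotspots
```

Until `on_expansion_request` is called, `is_hot` returns False for every coordinate, which matches the behaviour in which the client holds the hotspot coordinate set but does not yet filter with it.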
After identifying the hotspot data, the method further includes, at the client: marking the access request corresponding to the identified hotspot data with an identifier, and sending the access request carrying the identifier to the cache proxy gateway, wherein the access request carrying the identifier is used by the cache proxy gateway to perform replication-based capacity expansion at the capacity expansion storage node.
Specifically, the client adds an identifier, such as a prefix identifier, to each access request determined to correspond to hotspot data, and sends the access request carrying the identifier to the cache proxy gateway. On receiving the access request, the cache proxy gateway recognizes from the prefix identifier that the service data corresponding to the access request is hotspot data, and copies and distributes the operation data corresponding to the access request to the capacity expansion storage node.
For example, if the access request carrying the identifier is a write operation request, the cache proxy gateway copies and distributes the write operation data to the capacity expansion storage node; if it is a read operation request, the cache proxy gateway retrieves the corresponding data from the corresponding cache node, copies and distributes the retrieved data to the capacity expansion storage node, and returns the retrieved data to the client.
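The tagging and gateway behaviour can be sketched as below, with plain dictionaries standing in for the cache node and the capacity expansion storage node. The prefix value `hot:`, the class and function names, and the choice to also apply a hot write to the original cache node are assumptions made for illustration.

```python
HOT_PREFIX = "hot:"  # hypothetical prefix identifier added by the client


def tag_hot_request(key: str) -> str:
    """Client side: mark a request key as belonging to hotspot data."""
    return HOT_PREFIX + key


class CacheProxyGateway:
    def __init__(self, cache_node: dict, expansion_node: dict):
        self.cache_node = cache_node
        self.expansion_node = expansion_node

    def handle(self, op: str, key: str, value=None):
        is_hot = key.startswith(HOT_PREFIX)
        real_key = key[len(HOT_PREFIX):] if is_hot else key
        if op == "write":
            # Assumption: the original cache node is still written.
            self.cache_node[real_key] = value
            if is_hot:
                # Copy the written data to the expansion node.
                self.expansion_node[real_key] = value
            return value
        if op == "read":
            result = self.cache_node.get(real_key)
            if is_hot and result is not None:
                # Copy the retrieved data to the expansion node.
                self.expansion_node[real_key] = result
            return result
        raise ValueError(f"unsupported operation: {op}")
```

Only requests carrying the prefix cause data to reach the expansion node, so the expansion node ends up holding hotspot data alone rather than a full copy of the cluster.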
In this embodiment, identifying hotspot data and expanding capacity specifically for that data solves the problems of long capacity expansion warm-up time and cluster avalanche that arise in existing capacity expansion systems because hotspots cannot be predicted, and effectively improves throughput under sudden bursts of data access.
Optionally, the monitoring system continues to monitor whether the load of the newly added capacity expansion storage node exceeds the preset threshold. If so, it sends a capacity expansion request to the client; the client adds a further capacity expansion storage node, and the operation data corresponding to the hotspot data is copied and distributed to the newly added capacity expansion storage node, until the load of all storage nodes meets the requirement.
Therefore, when the monitoring system detects that the load of a cache node in the distributed storage system is higher than the preset threshold, it sends a capacity expansion request to the client. After receiving the capacity expansion request, the client identifies hotspot data using the hotspot coordinate set as the filtering condition, and the operation data of the hotspot data is copied and distributed to the capacity expansion storage node. The capacity expansion is thus highly targeted, and the drawback of capacity expansion nodes wasting memory on data that is not hotspot data is avoided. As described above, the data processing method of this embodiment performs warm-up capacity expansion for the identified hotspot data only, that is, only the operation data corresponding to hotspot data is copied and distributed to the newly added capacity expansion storage node, which achieves rapid capacity expansion and effectively improves throughput under sudden bursts of data access.
Fig. 2 is a flowchart illustrating a data processing method according to a second embodiment of the present invention. The data processing system corresponding to the data processing method shown in fig. 2 includes: a client, a cache proxy gateway, a monitoring system, a statistical system, and cache nodes.
Referring to fig. 2, the data processing method may include the steps of:
Step 201: clean the key value data in the received service data.
The client locally stores a dictionary for data cleaning in which the prefixes to be removed may be listed, and the client cleans the key value data in the service data according to the dictionary.
Step 202: perform spatial mapping conversion on the cleaned key value data to obtain a three-dimensional coordinate value.
Specifically, the client converts the key value data through three different spatial mapping functions and projects the three sets of dimensionality-reduced data onto the x, y, and z axes of a three-dimensional coordinate system, which effectively improves the granularity of data classification. Illustratively, the three spatial mapping functions are murmur3 (a non-cryptographic hash function suited to general hash-based lookup), sha1 (Secure Hash Algorithm 1), and crc (cyclic redundancy check). Three hash operations are performed, one with each function, to obtain three first integer data; each is then reduced modulo the preset number of hotspot data to be counted to obtain three second integer data, and the three second integer data are used as the filling data of one third integer data. In an optional embodiment, the third integer data is a 32-bit integer in which the lower 4 bits of the upper 8 bits store the murmur3 result after the modulo operation (i.e., the second integer data corresponding to murmur3), the upper 4 bits of the lower 8 bits store the sha1 result after the modulo operation (i.e., the second integer data corresponding to sha1), and the lower 4 bits of the lower 8 bits store the crc result after the modulo operation (i.e., the second integer data corresponding to crc). The single integer obtained in this way is the three-dimensional coordinate value.
Step 203: send the three-dimensional coordinate value to the statistical system.
The client sends the three-dimensional coordinate value corresponding to the service data to the statistical system.
Step 204: send the request to the cache proxy gateway.
It should be noted that, at this point, the client has not yet acquired a hotspot coordinate set, so hotspot identification is not performed on the service data; that is, warm-up capacity expansion cannot yet be performed for hotspot data.
Step 205: route the request to the cache node.
The cache proxy gateway routes the request to the corresponding cache node according to the received service data.
Step 206: count the number of hits of the three-dimensional coordinate values.
The statistical system generates time series data from the received three-dimensional coordinate values and counts the number of hits of each three-dimensional coordinate value in each time period of the preset time length.
Step 207: request the statistical result.
The monitoring system periodically sends an acquisition instruction to the statistical system at the set interval.
Step 208: return the statistical result.
The statistical system returns the statistical result to the monitoring system in response to the acquisition instruction.
Step 209: send the hotspot coordinate set.
The monitoring system generates a hotspot coordinate set from the received statistical result and sends it to the client. It should be noted that, if this is not the first time a hotspot coordinate set has been sent, the client updates its hotspot coordinate set according to the newly received data.
Step 210: monitor the node load state.
The monitoring system periodically sends a request for monitoring the node load state to the cache node.
Step 211: the node returns a load detection value.
The cache node returns a load detection value in response to the monitoring request, and the monitoring system receives the load detection value, which represents the load state of the cache node and includes at least one of QPS (queries per second) and I/O performance.
Step 212: send a capacity expansion request.
The monitoring system evaluates the received load detection value and, when it determines that the load detection value is higher than the preset threshold, generates a capacity expansion request and sends it to the client.
Step 213: and filtering the service data to be sent according to the hot spot coordinate set.
And the client filters the three-dimensional coordinate value of the service data to be sent according to the received hot spot coordinate set as a filtering condition.
Step 214: and adding identification to the identified hot spot data.
The client adds an identifier, such as a prefix identifier, to the service data determined to be the hotspot data.
Step 215: and sending an access request carrying the identifier.
And the client sends the access request carrying the identifier to the caching proxy gateway.
Step 216: and copying and distributing the operation data of the hot spot data.
And the cache proxy gateway identifies the hot data according to the prefix identifier, and copies and distributes the operation data corresponding to the hot data to the capacity expansion storage node. For example, when the service data is a write operation request, the cache proxy gateway copies and distributes the write operation data to the capacity expansion storage node; when the service data is a read operation request, the cache proxy gateway retrieves the data from the corresponding cache node, copies and distributes the retrieved data to the capacity expansion storage node, and returns the retrieved data to the client.
With this data processing method, hot spot data can be identified quickly on the client side. The identified hot spot data is tagged with an identifier and sent to the cache proxy gateway, which copies and distributes the operation data of the hot spot data to the capacity expansion storage node. Capacity expansion is therefore targeted, which overcomes the defect that a capacity expansion node wastes memory when expansion is not directed at hot spot data. The method preheats and expands only the identified hot spot data; that is, only the operation data corresponding to the hot spot data is copied and distributed to the newly added capacity expansion storage nodes. This achieves rapid capacity expansion and can effectively improve throughput under instantaneous bursts of data access.
An embodiment of the invention also provides a data processing apparatus applied to the client. The apparatus belongs to the same inventive concept as the data processing method, and its specific implementation can be found in the method embodiment. Fig. 3 shows a schematic structural diagram of the data processing apparatus. Referring to Fig. 3, the data processing apparatus includes:
the conversion module 301 is configured to perform at least two different spatial mapping conversions on the key value data in each service data to obtain each spatial mapping conversion result; obtaining a space coordinate value of each key value data by fusing each space mapping conversion result corresponding to each key value data;
the obtaining module 302 is configured to obtain a hot spot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of each key value data within a preset time duration, where the hot spot coordinate set includes a preset number of spatial coordinate values that rank highest by number of occurrences in the statistical result;
the determining module 303 is configured to determine that the service data of which the spatial coordinate value belongs to the hot spot coordinate set is hot spot data.
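The statistics behind the hot spot coordinate set amount to a top-N count over one time window. A minimal sketch, assuming the coordinates observed during the preset time duration are already collected in a list (the function name and parameters are illustrative):

```python
from collections import Counter
from typing import Iterable, Set


def hotspot_set(coords_in_window: Iterable[str], top_n: int) -> Set[str]:
    """Return the top_n coordinates by number of occurrences in the window."""
    counts = Counter(coords_in_window)
    return {coord for coord, _ in counts.most_common(top_n)}
```

For example, `hotspot_set(["a", "b", "a", "c", "a", "b"], 2)` returns `{"a", "b"}`.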
In an optional embodiment, the conversion module 301 is further configured to:
clean each key value data in the received service data according to a preset rule, and take the cleaned key value data as the object of the spatial mapping conversion.
In an optional embodiment, the conversion module 301 is configured to perform a hash operation on the key value data through a first, a second, and a third spatial mapping function, respectively, to obtain three first integer data; take the remainder of each of the three first integer data with respect to a preset number to obtain three second integer data; and pad each of the three second integer data to a preset number of digits to obtain third integer data, wherein the third integer data represents the three-dimensional coordinate value of the key value data.
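The hash, remainder, and padding steps can be sketched as below. The choice of SHA-256, SHA-512, and BLAKE2b as the three spatial mapping functions, the modulus of 1000, the three-digit width, and the concatenation of the padded parts into one string are all assumptions; the text fixes none of them.

```python
import hashlib

MODULUS = 1000  # hypothetical "preset number" for the remainder step
DIGITS = 3      # hypothetical preset digit width for padding

# Three stand-in spatial mapping functions (assumed, not named in the source).
HASH_NAMES = ("sha256", "sha512", "blake2b")


def _hash_to_int(key: str, name: str) -> int:
    """First integer: the digest of the key interpreted as a big-endian integer."""
    digest = hashlib.new(name, key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def to_coordinate(key: str) -> str:
    """Map a key to a zero-padded three-part coordinate string."""
    parts = []
    for name in HASH_NAMES:
        first = _hash_to_int(key, name)         # first integer data
        second = first % MODULUS                # second integer data (remainder)
        parts.append(str(second).zfill(DIGITS)) # padded to the preset digits
    return "".join(parts)
```

Using three independent hash functions means two distinct keys collide on the full coordinate only if they collide on all three remainders, which makes false hot spot matches much less likely than with a single hash.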
In an optional embodiment, the obtaining module 302 obtains the hot spot coordinate set by receiving a hot spot coordinate set sent by a monitoring system, wherein the monitoring system periodically sends the hot spot coordinate set at intervals of the preset duration.
In an optional embodiment, the determining module 303 is specifically configured to:
detect whether a capacity expansion request is received and, when the capacity expansion request is received, determine that the service data whose spatial coordinate value belongs to the hot spot coordinate set is hot spot data.
Here, after receiving the hot spot coordinate set, the client need not immediately filter subsequent service data; it may instead start identifying hot spot data only after a trigger condition is met. Specifically, the monitoring system monitors the load of each cache node and, when the load of a cache node is found to be higher than a preset threshold, generates a capacity expansion request and sends it to the client. After receiving the capacity expansion request, the client filters the service data to be sent according to the known filtering condition (the hot spot coordinate set), thereby identifying the hot spot data. The load of the cache node comprises at least one of QPS (queries per second) and I/O performance.
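The trigger described above can be sketched as a small piece of client state. The class and method names are hypothetical; the point is only that hot spot identification stays inactive until a capacity expansion request arrives.

```python
from typing import Iterable, Set


class HotspotClient:
    """Holds the latest hot spot coordinate set and the expansion trigger."""

    def __init__(self) -> None:
        self.hot_coords: Set[str] = set()
        self.expansion_requested = False

    def on_hotspot_set(self, coords: Iterable[str]) -> None:
        """Called when the monitoring system pushes a new coordinate set."""
        self.hot_coords = set(coords)

    def on_expansion_request(self) -> None:
        """Called when the monitoring system requests capacity expansion."""
        self.expansion_requested = True

    def is_hot(self, coord: str) -> bool:
        """Identify hot spot data only after the trigger has fired."""
        return self.expansion_requested and coord in self.hot_coords
```

Deferring the filter in this way avoids tagging requests while the cache nodes are still within their normal load range.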
In an optional embodiment, the data processing apparatus further includes a request module 304, the request module 304 being configured to:
add an identifier to the access request corresponding to the identified hot spot data, and send the access request carrying the identifier to the cache proxy gateway, wherein the access request carrying the identifier is used by the cache proxy gateway to perform copy-based capacity expansion at the capacity expansion storage node.
Fig. 4 is a schematic structural diagram of a data processing system according to an embodiment of the present invention, and referring to fig. 4, the data processing system includes:
the client 401 is configured to perform at least two different spatial mapping conversions on each key value data in each service data to obtain each spatial mapping conversion result; obtain a spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data; obtain a hot spot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of each key value data within a preset time duration, wherein the hot spot coordinate set comprises a preset number of spatial coordinate values that rank highest by number of occurrences in the statistical result; and determine that the service data whose spatial coordinate values belong to the hot spot coordinate set is hot spot data;
and the monitoring system 402 is configured to periodically send the hotspot coordinate set to the client at intervals of the preset duration.
In an embodiment, optionally, the client 401 is further configured to detect whether a capacity expansion request is received, and filter the three-dimensional coordinate value of the service data to be sent according to the hot spot coordinate set when the capacity expansion request is determined to be received.
In an embodiment, optionally, the client 401 is further configured to add an identifier to the access request corresponding to the identified hot spot data, and send the access request carrying the identifier to the cache proxy gateway, where the access request carrying the identifier is used by the cache proxy gateway to perform copy-based capacity expansion at the capacity expansion storage node.
In an embodiment, optionally, the monitoring system 402 is further configured to monitor loads of the cache nodes, and generate and send a capacity expansion request to the client when it is monitored that the loads of the cache nodes are higher than a preset threshold.
An embodiment of the present invention further provides a computer storage medium, where the computer storage medium may include: various media that can store program codes, such as a removable Memory device, a Random Access Memory (RAM), a Read-Only Memory (ROM), a magnetic disk, and an optical disk. The readable storage medium stores an executable program; the executable program is used for realizing the data processing method of any embodiment of the invention when being executed by a processor.
It should be understood by those skilled in the art that the functions of the programs in the storage medium of the present embodiment can be understood by referring to the related description of the data processing method described in the foregoing embodiment.
The technical schemes described in the embodiments of the present invention can be combined arbitrarily without conflict.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, embodiments of the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing system to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing system, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing system to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing system to cause a series of operational steps to be performed on the computer or other programmable system to produce a computer implemented process such that the instructions which execute on the computer or other programmable system provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A data processing method, comprising:
performing at least two different spatial mapping conversions on key value data in each service data, and taking the remainder of the result of each spatial mapping conversion to obtain each spatial mapping conversion result; obtaining a spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data;
acquiring a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of each key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values that rank highest by number of occurrences in the statistical result;
and determining that the service data whose spatial coordinate values belong to the hotspot coordinate set is hotspot data.
2. The data processing method of claim 1, wherein the performing at least two different spatial mapping conversions on the key value data in each service data, and taking the remainder of the result of each spatial mapping conversion to obtain each spatial mapping conversion result, comprises:
performing a hash operation on the key value data through a first spatial mapping function, a second spatial mapping function and a third spatial mapping function respectively to obtain three first integer data;
taking the remainder of each of the three first integer data with respect to a preset number to obtain three second integer data;
and wherein the obtaining of the spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data comprises:
padding each of the three second integer data to a preset number of digits to obtain third integer data, wherein the third integer data represents the spatial coordinate value of the key value data.
3. The data processing method of claim 1, wherein the obtaining of the hotspot coordinate set according to the statistical result of the occurrence times of the spatial coordinate values of the key value data within the preset time duration comprises:
receiving a hotspot coordinate set sent by a monitoring system;
and the monitoring system is used for periodically sending the hotspot coordinate set by taking the preset time length as an interval.
4. The data processing method of claim 1, wherein before performing at least two different spatial mapping transformations on key value data in each service data, further comprising:
and cleaning the key value data in the service data according to a preset rule, and taking the cleaned key value data as an object to be converted.
5. The data processing method according to claim 1, wherein the determining that the service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data comprises:
and detecting whether a capacity expansion request is received or not, and determining that the service data of which the space coordinate value belongs to the hot spot coordinate set is hot spot data when the capacity expansion request is received.
6. The data processing method according to claim 1, wherein after determining that the service data whose spatial coordinate value belongs to the hotspot coordinate set is hotspot data, the method further comprises:
adding an identifier to the access request corresponding to the hotspot data, and sending the access request carrying the identifier to the cache proxy gateway, wherein the access request carrying the identifier is used by the cache proxy gateway to determine the data to be copied to the capacity expansion storage node.
7. A data processing apparatus, comprising:
a conversion module, configured to perform at least two different spatial mapping conversions on the key value data in each service data, and take the remainder of the result of each spatial mapping conversion to obtain each spatial mapping conversion result; and to obtain a spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data;
an acquisition module, configured to acquire a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of the key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values that rank highest by number of occurrences in the statistical result;
and a determining module, configured to determine that the service data whose spatial coordinate values belong to the hotspot coordinate set is hotspot data.
8. The data processing apparatus of claim 7, further comprising:
a request module, configured to add an identifier to the access request corresponding to the identified hotspot data and send the access request carrying the identifier to the cache proxy gateway, wherein the access request carrying the identifier is used by the cache proxy gateway to determine the data to be copied to the capacity expansion storage node.
9. A data processing system, comprising:
a client, configured to perform at least two different spatial mapping conversions on the key value data in each service data, and take the remainder of the result of each spatial mapping conversion to obtain each spatial mapping conversion result; obtain a spatial coordinate value of each key value data by fusing the spatial mapping conversion results corresponding to that key value data; acquire a hotspot coordinate set according to a statistical result of the number of occurrences of the spatial coordinate values of each key value data within a preset time length, wherein the hotspot coordinate set comprises a preset number of spatial coordinate values that rank highest by number of occurrences in the statistical result; and determine that the service data whose spatial coordinate values belong to the hotspot coordinate set is hotspot data;
and the monitoring system is used for periodically sending the hotspot coordinate set to the client at intervals of the preset duration.
10. A computer storage medium, characterized in that an executable program is stored, which when executed by a processor implements the data processing method according to any one of claims 1 to 6.
CN201811240090.6A 2018-10-23 2018-10-23 Data processing method, device, system and storage medium Active CN109522299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811240090.6A CN109522299B (en) 2018-10-23 2018-10-23 Data processing method, device, system and storage medium

Publications (2)

Publication Number Publication Date
CN109522299A CN109522299A (en) 2019-03-26
CN109522299B true CN109522299B (en) 2020-12-18

Family

ID=65773024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811240090.6A Active CN109522299B (en) 2018-10-23 2018-10-23 Data processing method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN109522299B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609130B (en) * 2021-07-30 2023-06-13 中电金信软件有限公司 Method, device, electronic equipment and storage medium for acquiring gateway access data
CN114971079B (en) * 2022-06-29 2024-05-28 中国工商银行股份有限公司 Second killing type transaction processing optimization method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291628A (en) * 2017-07-04 2017-10-24 北京京东尚科信息技术有限公司 The method and apparatus of accessing data storage devices
CN107623702A (en) * 2016-07-13 2018-01-23 阿里巴巴集团控股有限公司 A kind of data cache method, apparatus and system
CN107911447A (en) * 2017-11-15 2018-04-13 聚好看科技股份有限公司 Operation system expansion method and device
CN108683695A (en) * 2018-03-23 2018-10-19 阿里巴巴集团控股有限公司 Hot spot access processing method, cache access agent equipment and distributed cache system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8606885B2 (en) * 2003-06-05 2013-12-10 Ipass Inc. Method and system of providing access point data associated with a network access point

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
热点key的发现与解决之道;林一木;《云栖社区》;20180127;全文 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant