CN109783523B - Data processing method, device, equipment and storage medium - Google Patents

Info

Publication number
CN109783523B
Authority
CN
China
Prior art keywords
user operation
operation data
target
value
processing
Prior art date
Legal status
Active
Application number
CN201910069219.XA
Other languages
Chinese (zh)
Other versions
CN109783523A (en)
Inventor
陈武
Current Assignee
Guangzhou Huya Information Technology Co Ltd
Original Assignee
Guangzhou Huya Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huya Information Technology Co Ltd
Priority to CN201910069219.XA
Publication of CN109783523A
Application granted
Publication of CN109783523B
Active
Anticipated expiration

Abstract

The invention discloses a data processing method, a data processing device, data processing equipment and a storage medium. The method comprises the following steps: determining user operation data; determining a key and a value from the user operation data; determining a target space from more than two cache spaces according to the key of the user operation data; writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target; and, if a preset processing condition is met, processing the values of the user operation data in the candidate set according to the processing target. The method addresses the problems that arise when every piece of data triggers two network interactions in real time: high time cost, a tendency toward network congestion, and results that cannot be obtained in real time.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present invention relates to data statistics technologies, and in particular, to a data processing method, an apparatus, a device, and a storage medium.
Background
With the development of computer technology, more and more websites choose to use Redis to store information. As a high-performance key-value database, Redis offers considerable convenience for data statistics.
In the prior art, each time a website receives a piece of data, it must process that data together with all the data in the database to obtain the data required by the user. Each such round of processing requires two network interactions. When the data volume is large, the time cost of these network interactions is high, which easily causes network congestion and prevents results from being obtained in real time.
Disclosure of Invention
The invention provides a data processing method, a data processing device, data processing equipment and a storage medium, which address the problem that each time a website receives a piece of data, it must process that data together with all the data in the database to obtain the data required by the user.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
determining user operation data;
determining keys and values from the user operation data;
determining a target space from more than two cache spaces according to the keys of the user operation data;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target;
and if a preset processing condition is met, processing the value of the user operation data in the candidate set according to the processing target.
On this basis, each cache space has a space number;
the determining a target space from more than two cache spaces according to the key of the user operation data includes:
processing the keys of the user operation data to obtain data numbers;
and when the space number is matched with the data number, determining the cache space corresponding to the data number as a target space.
On this basis, the processing target comprises deduplication processing;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the values in the candidate set are the same as the values of the user operation data or not;
if yes, ignoring the value of the user operation data;
and if not, writing the value of the user operation data into the candidate set.
On this basis, each cache space is allocated a storage partition in the database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the target set are the same as the values in the candidate set or not;
if yes, values in the candidate set are ignored to determine the number of values in the target set;
if not, writing the values in the candidate set into the target set to determine the number of values in the target set.
On this basis, the deduplication processing comprises a first type of deduplication processing and a second type of deduplication processing;
the first type of deduplication processing comprises constructing a candidate set of character string types, wherein values in the candidate set are user identification numbers;
the second type of deduplication process includes constructing a candidate set of cardinality statistics types, the values in the candidate set being user identification numbers that have been subjected to a reduction process.
On this basis, the value of the user operation data comprises a number, and the processing target is processing for acquiring a maximum value;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is larger than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
On this basis, each cache space has a corresponding storage partition in the database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the value in the candidate set is larger than the value in the target set;
if yes, writing values in the candidate set into the target set to determine the maximum value of the values in the target set;
if not, values in the candidate set are ignored to determine the maximum value of the values in the target set.
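The two-stage maximum described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the cache space and the storage partition are both modeled as plain Python dictionaries, each candidate set is reduced to the single largest value seen so far, and the names `write_max` and `flush_max` are hypothetical.

```python
def write_max(candidates: dict, key: str, value: float) -> None:
    """Stage 1 (cache space): keep the value only if it exceeds the cached one."""
    current = candidates.get(key)
    if current is None or value > current:
        candidates[key] = value
    # otherwise the incoming value is ignored


def flush_max(candidates: dict, partition: dict) -> None:
    """Stage 2 (storage partition): merge cached maxima into the target sets."""
    for key, candidate in candidates.items():
        target = partition.get(key)
        if target is None or candidate > target:
            partition[key] = candidate
    candidates.clear()  # assumption: the cache is emptied after the merge


cache = {}
for amount in (88, 20, 95):
    write_max(cache, "201811171605|guangdong", amount)

partition = {"201811171605|guangdong": 90}  # hypothetical earlier maximum
flush_max(cache, partition)
print(partition)
```

Only one comparison per key reaches the storage partition, regardless of how many records arrived in the cache space.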
On this basis, the value of the user operation data comprises a number, and the processing target is processing for acquiring a minimum value;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is smaller than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
On this basis, each cache space has a corresponding storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the candidate set are smaller than the values in the target set or not;
if yes, writing values in the candidate set into the target set to determine a minimum value of the values in the target set;
if not, values in the candidate set are ignored to determine the minimum value of the values in the target set.
On this basis, the value of the user operation data includes a number, and the processing target includes a summing process;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
writing values of the user operation data to the candidate set;
the sum of the values in the candidate set is calculated to determine a first numerical value.
On this basis, each cache space has a corresponding storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
writing a first value in the candidate set to the target set;
the sum of the values in the target set is calculated to determine a second value.
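The two-stage summation can be sketched in the same way. This is a simplified illustration under assumptions: instead of storing every value in the candidate set and summing afterwards, the sketch keeps a running total per key (the "first value"), and the storage partition holds the overall total (the "second value"). The dictionaries and the names `write_sum` and `flush_sum` are hypothetical.

```python
def write_sum(candidates: dict, key: str, value: float) -> None:
    """Stage 1 (cache space): accumulate the first value for this key."""
    candidates[key] = candidates.get(key, 0) + value


def flush_sum(candidates: dict, partition: dict) -> None:
    """Stage 2 (storage partition): add each first value to the stored total."""
    for key, first_value in candidates.items():
        partition[key] = partition.get(key, 0) + first_value
    candidates.clear()  # assumption: the cache is emptied after the merge


cache = {}
write_sum(cache, "201811171605|guangdong", 88)
write_sum(cache, "201811171605|guangdong", 12)

partition = {"201811171605|guangdong": 50}  # hypothetical earlier total
flush_sum(cache, partition)
print(partition)
```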
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, including:
the operation data determining module is used for determining user operation data;
a key value determining module for determining a key and a value from the user operation data;
the target space determining module is used for determining a target space from more than two cache spaces according to the keys of the user operation data;
a candidate set writing module, configured to write the value of the user operation data into a candidate set corresponding to a key of the user operation data in the target space according to a preset processing target;
and the value processing module is used for processing the value of the user operation data in the candidate set according to the processing target if the preset processing condition is met.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a data processing method as in any one of the embodiments.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement a data processing method according to any one of the embodiments.
The invention determines a key and a value from user operation data; determines a target space from more than two cache spaces according to the key of the user operation data; writes the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target; and, when a preset processing condition is met, processes the values of the user operation data in the candidate set according to the processing target to obtain the data required by the user. This addresses the problems that arise when every piece of data triggers two network interactions in real time: high time cost, a tendency toward network congestion, and results that cannot be obtained in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Drawings
Fig. 1A is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 1B is a schematic diagram of a set in a target space according to an embodiment of the present invention;
Fig. 1C is a schematic diagram of a data storage format before processing values in a candidate set according to a processing target according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data processing method according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a data processing method according to a third embodiment of the present invention;
Fig. 4 is a flowchart of a data processing method according to a fourth embodiment of the present invention;
Fig. 5 is a flowchart of a data processing method according to a fifth embodiment of the present invention;
Fig. 6 is a structural diagram of a data processing apparatus according to a sixth embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device according to a seventh embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1A is a flowchart of a data processing method according to an embodiment of the present invention. This embodiment is applicable to scenarios in which user operation data is first processed in the target space and the user operation data in the candidate set is then processed. The method may be performed by a data processing apparatus, which may be implemented in software and/or hardware and is typically configured in an electronic device, usually in its processor. Referring to Fig. 1A, the method specifically includes:
S101, determining user operation data.
The user operation data refers to records generated when the user interacts with the website. Determining user operation data refers to analyzing the format of a record generated when a user interacts with a website and determining different contents corresponding to different fields in the format.
For example, operator A runs a website that may have X different users at any moment, accessing it from different regions. The operator wants to know the number of users accessing the website from each province in each minute, so as to apply a corresponding operation strategy. Each time a user interacts with the website, a record is generated in the format: time, user ID, province. For example: 2018-11-17 16:05:10, user1, guangdong. This is the user operation data.
S102, determining keys and values from the user operation data.
A key-value pair (key-value) represents a correspondence between a key and a value, and a key may correspond to one or more values.
One part of the fields of the user operation data is selected as the key, another part is selected as the value, and a correspondence between the key and the value is established.
The keys and values in the user operation data are selected by those skilled in the art according to actual service conditions, and this embodiment is not limited thereto.
S103, determining a target space from more than two cache spaces according to the keys of the user operation data.
Through a series of processing steps, the key of the user operation data is mapped to a cache space, and that cache space is the target space for the key. The processing is deterministic and involves no random numbers, so a given key is always mapped to the same cache space.
The key of the user operation data may be processed to obtain the target space using, for example: a round-robin algorithm (Round Robin), a hash algorithm (HASH), a least-connection algorithm (Least Connection), a response-time algorithm (Response Time), a weighted method (Weighted), etc.
S104, writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target.
The preset processing target refers to an operation required to be performed on the user operation data, and may generally be to perform deduplication, maximum value acquisition, minimum value acquisition, summation processing, or the like on the user operation data.
Fig. 1B is a schematic diagram of a set in a target space according to an embodiment of the present invention. The target space 11 includes a plurality of sets, each set including a key 12 and values 13. Once the target space has been determined from the key of the user operation data, the user operation data is written into that target space. A set may be understood as a queue whose head is the key and whose body holds a plurality of values. The set to which the user operation data belongs is determined according to the key of the user operation data, and this set is the candidate set. The value of the user operation data is written into the candidate set.
Fig. 1C is a schematic diagram of a data storage format before processing values in a candidate set according to a processing target according to an embodiment of the present invention. It is understood that the data in the target space needs to be organized before step S105 is executed. The target space includes a key list 14 and a value list 15. The key list 14 stores the keys of the user operation data, and the value list 15 includes time data 151, length data 152, and the values of the user operation data. The time data 151 indicates how long the data has existed in the target space, the length data 152 indicates the length of the values corresponding to a key of the user operation data, and a, b, and c in the length data 152 each indicate a number of values.
S105, if the preset processing condition is met, processing the values of the user operation data in the candidate set according to the processing target.
A preset processing condition is set for a plurality of cache spaces, for a single cache space, or for a particular set within a cache space. When the preset processing condition is met, the values of the user operation data in the set are processed so that the result conforms to the preset processing target.
The preset processing condition may be set in the time dimension, for example: the time elapsed since the values of the user operation data in the candidate set were last processed according to the processing target exceeds a time threshold.
The preset processing condition may also be set in the data-amount dimension, for example: the total amount of data in the cache space reaches a quantity threshold.
The preset processing condition may also be a "fixed length of X records, timed at Y seconds" strategy, with the cache space emptied immediately after the interaction.
Fixed length of X records means: once the data in the cache space reaches X records, the values of the user operation data in the candidate set are processed immediately.
Timed at Y seconds means: if the current time minus the arrival time of the first piece of data in the cache space is greater than Y seconds, the values of the user operation data in the candidate set are processed immediately.
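The "fixed length of X records, timed at Y seconds" condition can be sketched as a small policy object. This is a minimal sketch under assumptions: the class name `FlushPolicy`, its method names, and the use of a monotonic clock are all hypothetical and are not taken from the patent; timestamps may be passed in explicitly so the behavior is easy to check.

```python
import time


class FlushPolicy:
    """Decides when a cache space should be processed (hypothetical helper)."""

    def __init__(self, max_records: int, max_age_seconds: float):
        self.max_records = max_records          # X in the description
        self.max_age_seconds = max_age_seconds  # Y in the description
        self.count = 0
        self.first_arrival = None

    def record_arrival(self, now: float = None) -> None:
        now = time.monotonic() if now is None else now
        if self.count == 0:
            self.first_arrival = now  # arrival time of the first piece of data
        self.count += 1

    def should_flush(self, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.count == 0:
            return False
        if self.count >= self.max_records:  # fixed length of X records
            return True
        return now - self.first_arrival > self.max_age_seconds  # timed at Y seconds

    def reset(self) -> None:
        """Call after processing, when the cache space has been emptied."""
        self.count = 0
        self.first_arrival = None


policy = FlushPolicy(max_records=3, max_age_seconds=10.0)
policy.record_arrival(now=0.0)
print(policy.should_flush(now=5.0))   # neither limit reached yet
print(policy.should_flush(now=10.5))  # older than Y seconds
```

Combining the two limits bounds both the batch size and the delay before a result becomes visible.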
The embodiment of the invention determines a key and a value from user operation data; determines a target space from more than two cache spaces according to the key of the user operation data; writes the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target; and, when a preset processing condition is met, processes the values of the user operation data in the candidate set according to the processing target to obtain the data required by the user. This addresses the problems that arise when every piece of data triggers two network interactions in real time: high time cost, a tendency toward network congestion, and results that cannot be obtained in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Example two
Fig. 2 is a flowchart of a data processing method according to a second embodiment of the present invention. The present embodiment is a refinement performed on the basis of the first embodiment, and specifically describes a specific process of data processing when the processing target is deduplication processing. Referring to fig. 2, the method specifically includes:
S201, determining user operation data.
The format of the record generated when the user interacts with the website is analyzed, and the content corresponding to each field in the format is determined.
In this embodiment, the content corresponding to the time field, the region field, and the user identification number field may be determined as the user operation data.
For example, if a user in Guangdong with user identification number user1 interacts with the website at 2018-11-17 16:05:10, the generated user operation data can be expressed as:
Time=2018-11-17 16:05:10
userId=user1
province=guangdong
S202, determining keys and values from the user operation data.
The content corresponding to the time field and the region field in the user operation data is determined as the key of the user operation data, and the content corresponding to the user identification number field is determined as the value of the user operation data.
key=201811171605|guangdong
value=user1
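The key and value above can be derived from the record as sketched below. This is an illustrative sketch only: the record is assumed to arrive as a dictionary with the field names shown in the example, the key is assumed to be the time truncated to the minute joined to the province with "|", and the function name `to_key_value` is hypothetical.

```python
from datetime import datetime


def to_key_value(record: dict) -> tuple:
    """Split one user operation record into (key, value)."""
    ts = datetime.strptime(record["Time"], "%Y-%m-%d %H:%M:%S")
    # Key: time at minute granularity plus region, e.g. 201811171605|guangdong
    key = ts.strftime("%Y%m%d%H%M") + "|" + record["province"]
    # Value: the user identification number
    value = record["userId"]
    return key, value


record = {"Time": "2018-11-17 16:05:10", "userId": "user1", "province": "guangdong"}
key, value = to_key_value(record)
print(key, value)  # 201811171605|guangdong user1
```

Truncating the time to the minute means all records from the same province within one minute share a key, which matches the per-province, per-minute statistic in the example.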
S203, processing the key of the user operation data to obtain a data number.
S204, when the space number is matched with the data number, determining the cache space corresponding to the data number as a target space.
Steps S203-S204 describe determining a target space from the cache spaces according to the key of the user operation data. Each cache space has a space number, and a data number is obtained by processing the key of the user operation data. According to a defined correspondence, the data number is mapped to a cache space.
The key of the user operation data may be processed using, for example: a round-robin algorithm (Round Robin), a hash algorithm (HASH), a least-connection algorithm (Least Connection), a response-time algorithm (Response Time), a weighted method (Weighted), etc.
In a specific implementation, if there are N cache spaces, they may be numbered sequentially from 1 to N, so that each cache space has a unique space number. A hashCode(key) operation is performed on the key of the user operation data to obtain an integer. The integer is taken modulo N and the absolute value of the result is taken, i.e. |hashCode(key) % N|, to obtain the positive integer M. The positive integer M indicates the M-th of the N cache spaces, whose order is fixed. It can be understood that the cache space with space number M is the target space.
In one embodiment, N = 8 (8 cache spaces) is taken as an example for explanation:
key=201811171605|guangdong
userId=user1
hashCode(key)=-1826944887
M=|hashCode(key)%N|=|-1826944887%8|=|-7|=7
That is, 201811171605|guangdong -> user1 is placed into the cache space with space number 7. Here user1 is the value of the user operation data.
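The arithmetic in this example can be checked with a short sketch. Two assumptions are worth stating. First, the source does not say which hash function produces hashCode(key), so the sketch takes the hash code from the example as given and offers CRC-32 only as a hypothetical, deterministic stand-in. Second, the formula relies on a truncating remainder (as in Java or C), whereas Python's `%` rounds toward negative infinity, so the absolute value is taken before the modulo. Note that the formula yields values from 0 to N-1.

```python
import zlib


def space_number(hash_code: int, n_spaces: int) -> int:
    """Return |hash_code % n_spaces| as computed with a truncating remainder."""
    # In Python, -1826944887 % 8 evaluates to 1 rather than -7,
    # so abs() is applied first to reproduce the result in the example.
    return abs(hash_code) % n_spaces


def stable_hash(key: str) -> int:
    """Hypothetical stand-in for hashCode(key): CRC-32 of the UTF-8 bytes."""
    # Python's built-in hash() is randomized per process for strings,
    # so it would not give the fixed mapping the method requires.
    return zlib.crc32(key.encode("utf-8"))


print(space_number(-1826944887, 8))  # 7, as in the worked example

# With the stand-in hash, the same key always lands in the same cache space.
m = space_number(stable_hash("201811171605|guangdong"), 8)
print(m)
```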
S205, in the target space, determining a candidate set corresponding to the key of the user operation data.
The target space includes a plurality of sets, each of which may include a unique key of the user operation data and a plurality of values of the user operation data.
The target space is searched for a set corresponding to the key of the user operation data, and that set is determined as the candidate set. If the target space has no set corresponding to the key of the user operation data, a set corresponding to the key is created, and the newly created set is taken as the candidate set corresponding to the key of the user operation data.
S206, judging whether the values in the candidate set are the same as the values of the user operation data or not. If yes, go to step S207; if not, go to step S208.
After the candidate set is determined, the candidate set is traversed to judge whether the value of the user operation data is the same as any value in the candidate set. If it is the same, the value of the user operation data is ignored; if it is different, the value of the user operation data is written into the candidate set.
S207, ignoring the value of the user operation data.
If the value of the user operation data is the same as a value in the candidate set, the value of the user operation data is not processed and is not added to the candidate set. Local deduplication processing is thereby realized in the cache space.
S208, writing the value of the user operation data into the candidate set.
If the value of the user operation data is not the same as any value in the candidate set, the value of the user operation data is written into the candidate set.
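Steps S205 to S208 can be sketched as follows. This is a minimal sketch under assumptions: the cache space is modeled as an in-memory mapping from key to a Python `set`, so the membership test uses hashing instead of the traversal described above (the outcome is the same), and the class name `CacheSpace` and method `write_dedup` are hypothetical.

```python
from collections import defaultdict


class CacheSpace:
    """One cache space holding a candidate set per key (hypothetical model)."""

    def __init__(self):
        self.sets = defaultdict(set)  # key -> candidate set of values

    def write_dedup(self, key: str, value: str) -> bool:
        """Write value into the candidate set for key unless already present.

        Returns True if the value was written, False if it was ignored.
        """
        candidate = self.sets[key]  # S205: found, or created on first use
        if value in candidate:      # S206: same value already cached?
            return False            # S207: ignore the value
        candidate.add(value)        # S208: write the value
        return True


space = CacheSpace()
key = "201811171605|guangdong"
print(space.write_dedup(key, "user1"))  # True: first occurrence is written
print(space.write_dedup(key, "user1"))  # False: duplicate is ignored
print(len(space.sets[key]))             # 1
```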
S209, searching a target set corresponding to the key in the target space from the storage partition corresponding to the target space.
Each cache space is allocated a storage partition in the database. The storage partition holds multiple sets, each set including a key and values. A set may be understood as a queue whose head is the key and whose body holds a plurality of values. If a set in the cache space has the same key as a set in the storage partition, the values of the set in the cache space are stored into the set with that key in the storage partition. The set in the storage partition with the same key is the target set.
If no set in the storage partition has the same key as the set in the cache space, the set in the cache space is stored directly into the storage partition.
S210, judging whether the values in the target set are the same as the values in the candidate set. If yes, go to step S211; if not, go to step S212.
When a set in the cache space has the same key as a set in the storage partition, it is necessary to judge whether the values in the target set are the same as the values in the candidate set before the values of the set in the cache space are stored into the set with the same key in the storage partition. If the values are the same, the values in the candidate set are ignored, so as to determine the number of values in the target set. If the values are different, the values in the candidate set are written into the target set, so as to determine the number of values in the target set.
S211, ignoring the values in the candidate set to determine the number of values in the target set.
If the values in the candidate set are the same as the values in the target set, the values in the target set are not processed and the values in the candidate set are not added to the target set. Overall deduplication processing is thereby realized in the database storage partition, and the number of values in the target set can be determined.
S212, writing the values in the candidate set into the target set to determine the number of values in the target set.
If the values in the candidate set are different from the values in the target set, the values in the candidate set are written into the target set (a new set being created in the storage partition as the target set where none exists), so as to determine the number of values in the target set.
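Steps S209 to S212 can be sketched as a merge from the cache space into the storage partition. This is an illustration under assumptions: both sides are modeled as dictionaries mapping a key to a Python `set`, set union stands in for the value-by-value comparison, the cache is emptied after the merge, and the function name `flush_dedup` is hypothetical.

```python
def flush_dedup(candidate_sets: dict, partition: dict) -> dict:
    """Merge candidate sets into target sets; return the count per key."""
    counts = {}
    for key, candidate in candidate_sets.items():
        # S209: find the target set with the same key, or create it
        target = partition.setdefault(key, set())
        # S210-S212: values already present are ignored, new ones are written
        target |= candidate
        counts[key] = len(target)  # number of distinct values for this key
    candidate_sets.clear()  # assumption: cache emptied after the interaction
    return counts


partition = {"201811171605|guangdong": {"user1"}}  # hypothetical stored data
cache = {
    "201811171605|guangdong": {"user1", "user2"},
    "201811171605|hunan": {"user3"},
}
print(flush_dedup(cache, partition))
# {'201811171605|guangdong': 2, '201811171605|hunan': 1}
```

Because duplicates were already removed locally in the cache space, each flush sends at most one copy of a value per key to the storage partition.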
On the basis of the above-described embodiment, the deduplication processing includes a first type of deduplication processing and a second type of deduplication processing. The two types of deduplication processing employ different approaches to building a set.
The first type of deduplication processing includes constructing a candidate set of the string type, the values in the candidate set being user identification numbers.
In one implementation, an unordered set of the string type may be constructed as the candidate set. The set members (user identification numbers) are unique, which means that duplicate data cannot appear in the set. An exact deduplication counting method for real-time large data streams is realized by using the set feature of multiple Redis instances, combined with a micro-batch mode that packages data at a fixed time and a fixed length.
The second type of deduplication processing includes constructing a candidate set of the cardinality statistics type, the values in the candidate set being user identification numbers that have been subjected to a reduction process.
In one implementation, an unordered set of the cardinality statistics type may be constructed as the candidate set. Each time a new piece of data (a user identification number subjected to reduction processing such as hashing) is added, it is necessary to compare the new piece of data with the set members (user identification numbers subjected to reduction processing such as hashing) one by one. A high-precision deduplication counting method for real-time large data streams is realized by using the HyperLogLog feature of multiple Redis instances, combined with a micro-batch mode that packages data at a fixed time and a fixed length.
The embodiment of the invention determines a key and a value from user operation data; determines a target space from more than two cache spaces according to the key of the user operation data; writes the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to the deduplication processing; and, when a preset processing condition is met, processes the values of the user operation data in the candidate set according to the deduplication processing to obtain the data required by the user. This addresses the problems that arise when every piece of data triggers two network interactions in real time: high time cost, a tendency toward network congestion, and results that cannot be obtained in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Example Three
Fig. 3 is a flowchart of a data processing method according to a third embodiment of the present invention. The present embodiment is a refinement of the first embodiment and describes the data processing procedure when the processing target is to obtain the maximum value. Referring to fig. 3, the method specifically includes:
S301, determining user operation data.
In this embodiment, the content corresponding to the time field, the region field, and the numeric field may be determined as the user operation data.
For example, if a user in Guangdong with user identification number user1 spends 88 yuan on the website at 16:05:10 on 2018-11-17, the resulting user operation data may be expressed as:
Time=2018-11-17 16:05:10
userId=user1
payCoin=88
province=guangdong
S302, determining keys and values from the user operation data.
The content corresponding to the time field and the region field in the user operation data is determined as the key of the user operation data. The content corresponding to the numeric field is determined as the value of the user operation data.
The key (key) is 201811171605|guangdong.
The value (value) is 88.
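Steps S301 and S302 can be sketched as follows. This is a minimal illustration, assuming the record arrives as a dictionary of strings; the function name `split_key_value` is hypothetical, and the minute-level granularity of the key is inferred from the example key 201811171605.

```python
from datetime import datetime


def split_key_value(record: dict) -> tuple[str, int]:
    """Build the key from the time and region fields, the value from the numeric field."""
    ts = datetime.strptime(record["Time"], "%Y-%m-%d %H:%M:%S")
    # Assumption: the key keeps the timestamp down to the minute, as in the example.
    key = f"{ts:%Y%m%d%H%M}|{record['province']}"
    value = int(record["payCoin"])
    return key, value


record = {
    "Time": "2018-11-17 16:05:10",
    "userId": "user1",
    "payCoin": "88",
    "province": "guangdong",
}
key, value = split_key_value(record)
print(key, value)
```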
S303, processing the key of the user operation data to obtain a data number.
S304, when a space number matches the data number, determining the cache space corresponding to the data number as the target space.
S305, determining a candidate set corresponding to the key of the user operation data in the target space.
Steps S303 to S305 describe a process of determining a target space from the cache spaces according to the key of the user operation data, and further determining a candidate set in the target space. For details, see the descriptions of steps S203-S205 in embodiment two.
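One possible reading of steps S303 to S305 is sketched below. The way the key is processed into a data number is described in embodiment two and is not repeated here, so the CRC32-modulo scheme, the number of cache spaces, and the names `data_number` and `locate_candidate_set` are assumptions for illustration only.

```python
import zlib


def data_number(key: str, num_spaces: int) -> int:
    # Assumption: a stable hash of the key, reduced modulo the number of cache spaces.
    # crc32 is used because Python's built-in hash() of str varies between processes.
    return zlib.crc32(key.encode("utf-8")) % num_spaces


def locate_candidate_set(key: str, cache_spaces: list) -> tuple:
    number = data_number(key, len(cache_spaces))       # S303
    target_space = cache_spaces[number]                # S304: space number matches data number
    candidate_set = target_space.setdefault(key, [])   # S305
    return number, candidate_set


# Four in-memory dictionaries stand in for the cache spaces (hypothetical count).
cache_spaces = [{} for _ in range(4)]
number, candidates = locate_candidate_set("201811171605|guangdong", cache_spaces)
print(number)
```

Because the data number depends only on the key, every record with the same key lands in the same cache space, which is what allows the first-stage processing to be done locally.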
S306, judging whether the value of the user operation data is larger than the value in the candidate set. If yes, go to step S307, otherwise go to step S308.
S307, writing the value of the user operation data into the candidate set.
S308, ignoring the value of the user operation data.
Steps S306-S308 describe a method of determining whether to write a value of the user operation data to the candidate set, thereby determining the maximum value of the candidate set.
Since the present embodiment is to obtain the maximum value, the candidate set may, in order to save resources, store only one number, which is the maximum value of the candidate set. Alternatively, the candidate set may store the values of the user operation data, and the maximum value of the candidate set is then obtained when necessary.
When the value of the user operation data is larger than the value in the candidate set, the value of the user operation data would become the maximum value of the candidate set once written into it. The value of the user operation data therefore needs to be written into the candidate set; if the candidate set stores only one number, the value of the candidate set is overwritten with the value of the user operation data.
When the value of the user operation data is not larger than the value in the candidate set, writing it into the candidate set would not change the maximum value of the candidate set. The value of the user operation data is therefore ignored.
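Steps S306 to S308 can be sketched as below for the case where the candidate set stores a single number. The dictionary standing in for the target space and the function name `update_max` are hypothetical; treating an empty candidate set as "always write" is an assumption, since the text does not describe that case.

```python
def update_max(target_space: dict, key: str, value: float) -> bool:
    """Return True if the value was written (S307), False if it was ignored (S308)."""
    current = target_space.get(key)
    # Assumption: the first value seen for a key is always written.
    if current is None or value > current:   # S306
        target_space[key] = value            # S307: overwrite the single stored number
        return True
    return False                             # S308


space = {}
results = [update_max(space, "201811171605|guangdong", v) for v in (88, 50, 120, 120)]
print(results, space)
```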
S309, searching a target set corresponding to the key in the target space from the storage partition corresponding to the target space.
S310, judging whether the value in the candidate set is larger than the value in the target set. If yes, go to step S311, otherwise go to step S312.
S311, writing the values in the candidate set into the target set to determine the maximum value of the values in the target set.
S312, ignoring the values in the candidate set to determine the maximum value of the values in the target set.
Steps S309-S312 describe a method of determining whether to write a value in the cache space to the target set of the storage partition, thereby determining the target set maximum value.
Since the maximum value is to be obtained in this embodiment, the target set may, in order to save resources, store only one number, which is the maximum value of the target set. Alternatively, the target set may store the values of the user operation data, and the maximum value of the target set is then obtained when necessary.
When the value in the candidate set is larger than the value in the target set, the value in the candidate set would become the maximum value of the target set once written into it. The value in the candidate set therefore needs to be written into the target set; if the target set stores only one number, the value of the target set is overwritten with the value in the candidate set. In this way the maximum of the values in the target set is determined.
When the value in the candidate set is not larger than the value in the target set, writing it into the target set would not change the maximum value of the target set. The value in the candidate set is therefore ignored, and the maximum of the values in the target set is determined.
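Steps S309 to S312 can be sketched in the same style. Here one dictionary stands in for the target space in the cache and another for its storage partition in the database; both, along with the name `flush_max`, are assumptions for illustration. The return value counts how many target sets were actually overwritten, which is the number of database writes this stage would need.

```python
def flush_max(target_space: dict, storage_partition: dict) -> int:
    """Merge cached maxima into the storage partition; return the number of writes."""
    writes = 0
    for key, candidate in target_space.items():
        stored = storage_partition.get(key)          # S309: look up the target set
        if stored is None or candidate > stored:     # S310
            storage_partition[key] = candidate       # S311
            writes += 1
        # S312: otherwise the candidate is ignored
    return writes


cache = {"201811171605|guangdong": 120, "201811171605|hunan": 30}
partition = {"201811171605|guangdong": 200}
writes = flush_max(cache, partition)
print(writes, partition)
```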
In this embodiment of the invention, keys and values are determined from the user operation data; a target space is determined from more than two cache spaces according to the key of the user operation data; in the target space, the value of the user operation data is written into the candidate set corresponding to the key of the user operation data according to the maximum value acquisition processing; and when the preset processing condition is met, the values of the user operation data in the candidate set are processed according to the maximum value acquisition processing to determine the maximum value of the values in the target set. This solves the problems that arise when every piece of data requires two network interactions in real time: high time cost, a tendency toward network congestion, and the inability to obtain data in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Example Four
Fig. 4 is a flowchart of a data processing method according to a fourth embodiment of the present invention. The present embodiment is a refinement of the first embodiment and describes the data processing procedure when the processing target is to obtain the minimum value. Referring to fig. 4, the method specifically includes:
S401, determining user operation data.
In this embodiment, the content corresponding to the time field, the region field, and the numeric field may be determined as the user operation data.
For example, if a user in Guangdong with user identification number user1 spends 88 yuan on the website at 16:05:10 on 2018-11-17, the resulting user operation data may be expressed as:
Time=2018-11-17 16:05:10
userId=user1
payCoin=88
province=guangdong
S402, determining keys and values from the user operation data.
The content corresponding to the time field and the region field in the user operation data is determined as the key of the user operation data. The content corresponding to the numeric field is determined as the value of the user operation data.
The key (key) is 201811171605|guangdong.
The value (value) is 88.
S403, processing the key of the user operation data to obtain a data number.
S404, when a space number matches the data number, determining the cache space corresponding to the data number as the target space.
S405, in the target space, determining a candidate set corresponding to the key of the user operation data.
Steps S403 to S405 describe a process of determining a target space from the cache spaces according to the key of the user operation data, and further determining a candidate set in the target space. For details, see the descriptions of steps S203-S205 in embodiment two.
S406, judging whether the value of the user operation data is smaller than the value in the candidate set. If yes, go to step S407; if not, step S408 is executed.
S407, writing the value of the user operation data into the candidate set.
S408, ignoring the value of the user operation data.
Steps S406-S408 describe a method of determining whether to write a value of the user operation data to the candidate set, thereby determining the minimum value of the candidate set.
Since the minimum value is to be obtained in this embodiment, the candidate set may, in order to save resources, store only one number, which is the minimum value of the candidate set. Alternatively, the candidate set may store the values of the user operation data, and the minimum value of the candidate set is then obtained when necessary.
When the value of the user operation data is smaller than the value in the candidate set, the value of the user operation data would become the minimum value of the candidate set once written into it. The value of the user operation data therefore needs to be written into the candidate set; if the candidate set stores only one number, the value of the candidate set is overwritten with the value of the user operation data.
When the value of the user operation data is not smaller than the value in the candidate set, writing it into the candidate set would not change the minimum value of the candidate set. The value of the user operation data is therefore ignored.
S409, searching a target set corresponding to the key in the target space from the storage partition corresponding to the target space.
S410, judging whether the value in the candidate set is smaller than the value in the target set. If yes, go to step S411; if not, go to step S412.
S411, writing the values in the candidate set into the target set to determine the minimum value of the values in the target set.
S412, ignoring the values in the candidate set to determine the minimum value of the values in the target set.
Steps S409-S412 describe a method of determining whether to write a value in the cache space to the target set of the storage partition, thereby determining the target set minimum value.
Since the minimum value is to be obtained in this embodiment, the target set may, in order to save resources, store only one number, which is the minimum value of the target set. Alternatively, the target set may store the values of the user operation data, and the minimum value of the target set is then obtained when necessary.
When the value in the candidate set is smaller than the value in the target set, the value in the candidate set would become the minimum value of the target set once written into it. The value in the candidate set therefore needs to be written into the target set; if the target set stores only one number, the value of the target set is overwritten with the value in the candidate set. In this way the minimum of the values in the target set is determined.
When the value in the candidate set is not smaller than the value in the target set, writing it into the target set would not change the minimum value of the target set. The value in the candidate set is therefore ignored, and the minimum of the values in the target set is determined.
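The minimum-value flow differs from the maximum-value flow only in the direction of the comparison, so both stages (S406 to S408 in the cache, S409 to S412 against the storage partition) can be sketched with one small class parameterized by a comparison function. The class name `TwoLevelExtremum`, the in-memory dictionaries, and the handling of empty sets are assumptions for illustration, not part of the embodiment.

```python
import operator


class TwoLevelExtremum:
    """Two-stage extremum tracking; 'better' decides whether a new value wins."""

    def __init__(self, better=operator.lt):
        self.better = better      # operator.lt for minimum, operator.gt for maximum
        self.candidates = {}      # stands in for candidate sets in the cache space
        self.targets = {}         # stands in for target sets in the storage partition

    def write(self, key, value) -> bool:
        current = self.candidates.get(key)
        if current is None or self.better(value, current):   # S406
            self.candidates[key] = value                     # S407
            return True
        return False                                         # S408

    def flush(self) -> dict:
        for key, candidate in self.candidates.items():
            stored = self.targets.get(key)                   # S409
            if stored is None or self.better(candidate, stored):   # S410
                self.targets[key] = candidate                # S411
            # S412: otherwise the candidate is ignored
        return dict(self.targets)


agg = TwoLevelExtremum(operator.lt)
written = [agg.write("201811171605|guangdong", v) for v in (88, 120, 50)]
print(written, agg.flush())
```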
In this embodiment of the invention, keys and values are determined from the user operation data; a target space is determined from more than two cache spaces according to the key of the user operation data; in the target space, the value of the user operation data is written into the candidate set corresponding to the key of the user operation data according to the minimum value acquisition processing; and when the preset processing condition is met, the values of the user operation data in the candidate set are processed according to the minimum value acquisition processing to determine the minimum value of the values in the target set. This solves the problems that arise when every piece of data requires two network interactions in real time: high time cost, a tendency toward network congestion, and the inability to obtain data in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Example Five
Fig. 5 is a flowchart of a data processing method according to a fifth embodiment of the present invention. The present embodiment is a refinement of the first embodiment and describes the data processing procedure when the processing target is to obtain the sum value. Referring to fig. 5, the method specifically includes:
S501, determining user operation data.
In this embodiment, the content corresponding to the time field, the region field, and the numeric field may be determined as the user operation data.
For example, if a user in Guangdong with user identification number user1 spends 88 yuan on the website at 16:05:10 on 2018-11-17, the resulting user operation data may be expressed as:
Time=2018-11-17 16:05:10
userId=user1
payCoin=88
province=guangdong
S502, determining keys and values from the user operation data.
The content corresponding to the time field and the region field in the user operation data is determined as the key of the user operation data. The content corresponding to the numeric field is determined as the value of the user operation data.
The key (key) is 201811171605|guangdong.
The value (value) is 88.
S503, processing the key of the user operation data to obtain a data number.
S504, when a space number matches the data number, determining the cache space corresponding to the data number as the target space.
S505, determining a candidate set corresponding to the key of the user operation data in the target space.
Steps S503 to S505 describe a process of determining a target space from the cache spaces according to the key of the user operation data, and further determining a candidate set in the target space. For details, see the descriptions of steps S203-S205 in embodiment two.
S506, writing the value of the user operation data into the candidate set.
S507, calculating the sum of the values in the candidate set to determine a first numerical value.
Steps S506-S507 describe writing the value of the user operation data into the candidate set to determine a first numerical value, that is, the sum value of the candidate set.
The value of the user operation data is written into the candidate set, and the values in the candidate set are summed to obtain the sum value corresponding to each key, which serves as the first numerical value.
S508, searching a target set corresponding to the key in the target space from the storage partition corresponding to the target space.
S509, writing the first numerical value in the candidate set into the target set.
S510, calculating the sum of the values in the target set to determine a second numerical value.
Steps S508-S510 describe a method of writing the values in the cache space into the target set of the storage partition, thereby determining the sum value of the target set.
The first numerical value corresponding to each key in the candidate set is written into the target set of the storage partition, and the values in the target set are summed to obtain the sum value corresponding to each key, which serves as the second numerical value.
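The two summation stages can be sketched as follows. The class name `SumAggregator` and the in-memory lists are hypothetical. Clearing the candidate sets after they are written to the target set is an assumption: the text does not say what happens to a candidate set after S509, and without some such reset a repeated flush would count the same values twice.

```python
from collections import defaultdict


class SumAggregator:
    def __init__(self):
        self.candidates = defaultdict(list)   # key -> values held in the cache space
        self.targets = defaultdict(list)      # key -> first numerical values in storage

    def write(self, key, value):
        self.candidates[key].append(value)            # S506
        return sum(self.candidates[key])              # S507: first numerical value

    def flush(self) -> dict:
        for key, values in self.candidates.items():
            self.targets[key].append(sum(values))     # S508-S509
        self.candidates.clear()                       # assumption: reset after flushing
        # S510: second numerical value per key
        return {key: sum(values) for key, values in self.targets.items()}


agg = SumAggregator()
first_a = agg.write("201811171605|guangdong", 88)
first_b = agg.write("201811171605|guangdong", 12)
second_1 = agg.flush()
agg.write("201811171605|guangdong", 5)
second_2 = agg.flush()
print(first_a, first_b, second_1, second_2)
```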
In this embodiment of the invention, keys and values are determined from the user operation data; a target space is determined from more than two cache spaces according to the key of the user operation data; in the target space, the value of the user operation data is written into the candidate set corresponding to the key of the user operation data according to the summation processing; and when the preset processing condition is met, the values of the user operation data in the candidate set are processed according to the summation processing to determine the sum value of the values in the target set. This solves the problems that arise when every piece of data requires two network interactions in real time: high time cost, a tendency toward network congestion, and the inability to obtain data in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
Example Six
Fig. 6 is a structural diagram of a data processing apparatus according to a sixth embodiment of the present invention, including: an operation data determination module 61, a key value determination module 62, a target space determining module 63, a candidate set writing module 64, and a value processing module 65. Wherein:
an operation data determination module 61 for determining user operation data;
a key value determination module 62 for determining a key and a value from the user operation data;
a target space determining module 63, configured to determine a target space from more than two cache spaces according to the key of the user operation data;
a candidate set writing module 64, configured to write the value of the user operation data into a candidate set corresponding to a key of the user operation data in the target space according to a preset processing target;
and a value processing module 65, configured to, if a preset processing condition is met, process the value of the user operation data in the candidate set according to the processing target.
In this embodiment of the invention, keys and values are determined from the user operation data; a target space is determined from more than two cache spaces according to the key of the user operation data; in the target space, the value of the user operation data is written into the candidate set corresponding to the key of the user operation data according to a preset processing target; and when the preset processing condition is met, the values of the user operation data in the candidate set are processed according to the processing target to obtain the data required by the user. This solves the problems that arise when every piece of data requires two network interactions in real time: high time cost, a tendency toward network congestion, and the inability to obtain data in real time. The user operation data is first processed in the cache space, and the values in the candidate set are then processed a second time when the preset processing condition is met. Distributing the processing of the user operation data in this way reduces the frequency of network interaction.
On the basis of the foregoing embodiment, each cache space has a space number, and the target space determining module is further configured to:
processing the keys of the user operation data to obtain data numbers;
and when the space number is matched with the data number, determining the cache space corresponding to the data number as a target space.
On the basis of the above embodiment, the processing target includes a deduplication process, and the candidate set writing module is further configured to:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the values in the candidate set are the same as the values of the user operation data or not;
if yes, ignoring the value of the user operation data;
and if not, writing the value of the user operation data into the candidate set.
On the basis of the foregoing embodiment, each of the cache spaces is allocated with a storage partition in the database, and the value processing module is further configured to:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the target set are the same as the values in the candidate set or not;
if yes, values in the candidate set are ignored to determine the number of values in the target set;
if not, writing the values in the candidate set into the target set to determine the number of values in the target set.
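The deduplication behaviour of the candidate set writing module and the value processing module described above can be sketched as two functions. Python sets stand in for the candidate set (in the cache space) and the target set (in the storage partition); the function names and the example user identification numbers are hypothetical.

```python
def write_dedup(target_space: dict, key: str, user_id: str) -> bool:
    """Cache stage: write the value only if the candidate set does not contain it."""
    candidate_set = target_space.setdefault(key, set())
    if user_id in candidate_set:
        return False               # same value already present: ignore
    candidate_set.add(user_id)
    return True


def flush_dedup(target_space: dict, storage_partition: dict) -> dict:
    """Storage stage: merge candidate sets into target sets and count the values."""
    counts = {}
    for key, candidate_set in target_space.items():
        target_set = storage_partition.setdefault(key, set())
        target_set |= candidate_set        # values already in the target set are skipped
        counts[key] = len(target_set)
    return counts


cache = {}
key = "201811171605|guangdong"
written = [write_dedup(cache, key, uid) for uid in ("user1", "user2", "user1")]
partition = {key: {"user2", "user3"}}
counts = flush_dedup(cache, partition)
print(written, counts)
```

Most duplicates are removed in the cache stage, so only the distinct values per key reach the storage partition.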
On the basis of the above embodiment, the deduplication processing includes a first type of deduplication processing and a second type of deduplication processing;
the first type of deduplication processing comprises constructing a candidate set of character string types, wherein values in the candidate set are user identification numbers;
the second type of deduplication process includes constructing a candidate set of cardinality statistics types, the values in the candidate set being user identification numbers that have been subjected to a reduction process.
On the basis of the foregoing embodiment, the value of the user operation data includes a number, the processing target is processing for obtaining a maximum value, and the candidate set writing module is further configured to:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is larger than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
On the basis of the foregoing embodiment, each cache space has a corresponding storage partition in the database, and the value processing module is further configured to:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the value in the candidate set is larger than the value in the target set;
if yes, writing values in the candidate set into the target set to determine the maximum value of the values in the target set;
if not, values in the candidate set are ignored to determine the maximum value of the values in the target set.
On the basis of the foregoing embodiment, the value of the user operation data includes a number, the processing target is processing for obtaining a minimum value, and the candidate set writing module is further configured to:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is smaller than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
On the basis of the foregoing embodiment, each of the cache spaces has a corresponding storage partition in the database, and the value processing module is further configured to:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the candidate set are smaller than the values in the target set or not;
if yes, writing values in the candidate set into the target set to determine a minimum value of the values in the target set;
if not, values in the candidate set are ignored to determine the minimum value of the values in the target set.
On the basis of the above embodiment, the value of the user operation data includes a number, the processing target includes a summing process, and the candidate set writing module is further configured to:
determining a candidate set corresponding to the keys of the user operation data in the target space;
writing values of the user operation data to the candidate set;
the sum of the values in the candidate set is calculated to determine a first numerical value.
On the basis of the foregoing embodiment, each of the cache spaces has a corresponding storage partition in the database, and the value processing module is further configured to:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
writing a first value in the candidate set to the target set;
the sum of the values in the target set is calculated to determine a second value.
The data processing apparatus provided by this embodiment can be used to execute the data processing method provided by any of the above embodiments, and has corresponding functions and advantages.
Example Seven
Fig. 7 is a schematic structural diagram of an electronic device according to a seventh embodiment of the present invention. As shown in fig. 7, the electronic apparatus includes a processor 70, a memory 71, a communication module 72, an input device 73, and an output device 74; the number of the processors 70 in the electronic device may be one or more, and one processor 70 is taken as an example in fig. 7; the processor 70, the memory 71, the communication module 72, the input device 73 and the output device 74 in the electronic device may be connected by a bus or other means, and the bus connection is exemplified in fig. 7.
The memory 71, as a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules corresponding to a data processing method in the present embodiment (for example, the operation data determining module 61, the key value determining module 62, the target space determining module 63, the candidate set writing module 64, and the value processing module 65 in a data processing apparatus). The processor 70 executes various functional applications and data processing of the electronic device by executing software programs, instructions and modules stored in the memory 71, that is, implements one of the data processing methods described above.
The memory 71 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 71 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 71 may further include memory located remotely from the processor 70, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The communication module 72 is configured to establish a connection with a display screen and to exchange data with the display screen. The input device 73 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device.
The electronic device provided in this embodiment can perform the data processing method provided in any embodiment of the present invention, and has the corresponding functions and advantages.
Example Eight
An eighth embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a data processing method, including:
determining user operation data;
determining keys and values from the user operation data;
determining a target space from more than two cache spaces according to the keys of the user operation data;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target;
and if the processing target meets the preset processing condition, processing the value of the user operation data in the candidate set according to the processing target.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the data processing method provided by any embodiment of the present invention.
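The method steps above leave the preset processing condition open. One reading, suggested by the micro-batch mode of packaging at fixed time and fixed length mentioned for the deduplication counting, is to trigger the second-stage processing once a fixed number of records has been written or a fixed interval has elapsed. The sketch below illustrates that reading only; the class name `MicroBatcher`, its parameters, and the default thresholds are assumptions.

```python
import time


class MicroBatcher:
    """Calls flush_fn when a record count or a time interval is reached."""

    def __init__(self, flush_fn, max_records=1000, max_seconds=1.0, clock=time.monotonic):
        self.flush_fn = flush_fn          # second-stage processing, supplied by the caller
        self.max_records = max_records    # fixed length (assumed threshold)
        self.max_seconds = max_seconds    # fixed time (assumed threshold)
        self.clock = clock                # injectable so the behaviour can be tested
        self.pending = 0
        self.last_flush = clock()

    def record_written(self) -> bool:
        """Call after each write to a candidate set; returns True if a flush ran."""
        self.pending += 1
        due = (self.pending >= self.max_records
               or self.clock() - self.last_flush >= self.max_seconds)
        if due:
            self.flush_fn()
            self.pending = 0
            self.last_flush = self.clock()
        return due


now = [0.0]
flushes = []
batcher = MicroBatcher(lambda: flushes.append(now[0]),
                       max_records=3, max_seconds=10.0, clock=lambda: now[0])
by_length = [batcher.record_written() for _ in range(3)]
now[0] = 11.0
by_time = batcher.record_written()
print(by_length, by_time, flushes)
```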
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes instructions for enabling a computer electronic device (which may be a personal computer, a server, or a network electronic device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data processing apparatus, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (14)

1. A data processing method, comprising:
determining user operation data;
determining keys and values from the user operation data;
determining a target space from more than two cache spaces according to the keys of the user operation data;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target;
if the processing target meets the preset processing condition, processing the value of the user operation data in the candidate set according to the processing target;
wherein the processing target comprises a duplicate removal processing, a maximum value acquisition processing or a minimum value acquisition processing.
2. The method of claim 1, wherein each cache space has a space number;
the determining a target space from more than two cache spaces according to the key of the user operation data includes:
processing the keys of the user operation data to obtain data numbers;
and when a space number matches the data number, determining the cache space having that space number as the target space.
3. The method according to claim 1 or 2, wherein the processing target includes a deduplication process;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether any value in the candidate set is the same as the value of the user operation data;
if yes, ignoring the value of the user operation data;
and if not, writing the value of the user operation data into the candidate set.
4. The method of claim 3, wherein each of the cache spaces is allocated a storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the target set are the same as the values in the candidate set;
if yes, ignoring the values in the candidate set, so as to determine the number of values in the target set;
if not, writing the values in the candidate set into the target set, so as to determine the number of values in the target set.
5. The method of claim 3, wherein the deduplication processing comprises a first type of deduplication processing and a second type of deduplication processing;
the first type of deduplication processing comprises constructing a candidate set of character string types, wherein values in the candidate set are user identification numbers;
the second type of deduplication process includes constructing a candidate set of cardinality statistics types, the values in the candidate set being user identification numbers that have been subjected to a reduction process.
6. The method according to claim 1 or 2, wherein the value of the user operation data includes a number, and the processing target is processing for obtaining a maximum value;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is larger than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
7. The method of claim 6, wherein each of the cache spaces has a corresponding storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the value in the candidate set is larger than the value in the target set;
if yes, writing the value in the candidate set into the target set, so as to determine the maximum value of the values in the target set;
if not, ignoring the value in the candidate set, so as to determine the maximum value of the values in the target set.
8. The method according to claim 1 or 2, wherein the value of the user operation data includes a number, and the processing target is processing for obtaining a minimum value;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
judging whether the value of the user operation data is smaller than the value in the candidate set or not;
if yes, writing the value of the user operation data into the candidate set;
and if not, ignoring the value of the user operation data.
9. The method of claim 8, wherein each of the cache spaces has a corresponding storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
judging whether the values in the candidate set are smaller than the values in the target set or not;
if yes, writing the value in the candidate set into the target set, so as to determine the minimum value of the values in the target set;
if not, ignoring the value in the candidate set, so as to determine the minimum value of the values in the target set.
10. The method according to claim 1 or 2, wherein the value of the user operation data includes a number, and the processing target includes a summing process;
writing the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target, including:
determining a candidate set corresponding to the keys of the user operation data in the target space;
writing values of the user operation data to the candidate set;
calculating the sum of the values in the candidate set to determine a first value.
11. The method of claim 10, wherein each of the cache spaces has a corresponding storage partition in a database;
the processing the value of the user operation data in the candidate set according to the processing target includes:
searching a target set corresponding to the key in the target space from a storage partition corresponding to the target space;
writing the first value in the candidate set to the target set;
calculating the sum of the values in the target set to determine a second value.
12. A data processing apparatus, comprising:
an operation data determining module, configured to determine user operation data;
a key value determining module, configured to determine a key and a value from the user operation data;
a target space determining module, configured to determine a target space from more than two cache spaces according to the key of the user operation data;
a candidate set writing module, configured to write the value of the user operation data into a candidate set corresponding to the key of the user operation data in the target space according to a preset processing target;
a value processing module, configured to process the value of the user operation data in the candidate set according to the processing target if a preset processing condition is met;
wherein the processing target comprises deduplication processing, maximum value acquisition processing or minimum value acquisition processing.
13. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method of any one of claims 1-11.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the data processing method of any one of claims 1-11.
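
For illustration only, and not as part of the claims, the flow recited in claims 1 to 11 might be sketched in Python as follows. The sketch rests on several assumptions that the claims do not specify: the data number is obtained with a CRC32 hash taken modulo the number of cache spaces, the preset processing condition is a simple per-space write count (`flush_threshold`), and an in-memory dictionary stands in for each database storage partition. The names `space_number`, `Aggregator`, `write`, `flush` and `result` are hypothetical.

```python
import zlib

NUM_SPACES = 4  # assumption: any count greater than two would do


def space_number(key: str, num_spaces: int = NUM_SPACES) -> int:
    """Turn a key into a data number (assumed scheme: CRC32 modulo space count)."""
    return zlib.crc32(key.encode("utf-8")) % num_spaces


class Aggregator:
    """Buffers values in per-space candidate sets, then merges them into target sets."""

    def __init__(self, target: str, num_spaces: int = NUM_SPACES, flush_threshold: int = 3):
        if target not in ("dedup", "max", "min", "sum"):
            raise ValueError(f"unknown processing target: {target}")
        self.target = target
        self.num_spaces = num_spaces
        self.flush_threshold = flush_threshold  # assumed processing condition
        self.caches = [dict() for _ in range(num_spaces)]      # key -> candidate set
        self.partitions = [dict() for _ in range(num_spaces)]  # stand-in for DB partitions
        self.pending = [0] * num_spaces

    def write(self, key: str, value) -> None:
        """Write one value into the candidate set of the key's target space."""
        n = space_number(key, self.num_spaces)
        cache = self.caches[n]
        if self.target == "dedup":
            cache.setdefault(key, set()).add(value)  # a repeated value is ignored
        elif self.target == "max":
            if key not in cache or value > cache[key]:
                cache[key] = value
        elif self.target == "min":
            if key not in cache or value < cache[key]:
                cache[key] = value
        else:  # "sum": the candidate holds the running first value
            cache[key] = cache.get(key, 0) + value
        self.pending[n] += 1
        if self.pending[n] >= self.flush_threshold:
            self.flush(n)

    def flush(self, n: int) -> None:
        """Merge every candidate set of space n into its target set, then clear the cache."""
        cache, part = self.caches[n], self.partitions[n]
        for key, cand in cache.items():
            if self.target == "dedup":
                part.setdefault(key, set()).update(cand)
            elif self.target == "max":
                part[key] = max(part.get(key, cand), cand)
            elif self.target == "min":
                part[key] = min(part.get(key, cand), cand)
            else:
                part[key] = part.get(key, 0) + cand
        cache.clear()
        self.pending[n] = 0

    def result(self, key: str):
        """Return the distinct count (dedup) or the max, min or sum for a key."""
        n = space_number(key, self.num_spaces)
        self.flush(n)
        value = self.partitions[n].get(key)
        if value is None:
            return None
        return len(value) if self.target == "dedup" else value
```

In this sketch, many writes to the same key touch only the in-memory candidate set, and the storage partition is visited once per flush rather than once per record, which is the saving in network interaction that the abstract describes. For example, writing the user identifiers `u1`, `u2`, `u1`, `u3`, `u2` under one key to `Aggregator("dedup")` and then calling `result` on that key yields a distinct count of 3.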

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910069219.XA CN109783523B (en) 2019-01-24 2019-01-24 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910069219.XA CN109783523B (en) 2019-01-24 2019-01-24 Data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109783523A CN109783523A (en) 2019-05-21
CN109783523B true CN109783523B (en) 2022-02-25

Family

ID=66502220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910069219.XA Active CN109783523B (en) 2019-01-24 2019-01-24 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109783523B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102843396A (en) * 2011-06-22 2012-12-26 中兴通讯股份有限公司 Data writing and reading method and device in distributed caching system
CN104424263A (en) * 2013-08-29 2015-03-18 腾讯科技(深圳)有限公司 Data recording method and data recording device
CN105354247A (en) * 2015-10-13 2016-02-24 武汉大学 Geographical video data organization management method supporting storage and calculation linkage
CN105354151A (en) * 2014-08-19 2016-02-24 阿里巴巴集团控股有限公司 Cache management method and device
CN107766529A (en) * 2017-10-27 2018-03-06 合肥城市云数据中心股份有限公司 A kind of mass data storage means for sewage treatment industry
CN108595268A (en) * 2018-04-24 2018-09-28 咪咕文化科技有限公司 A kind of data distributing method, device and computer readable storage medium based on MapReduce

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7801844B2 (en) * 2005-11-23 2010-09-21 Microsoft Corporation Surrogate key generation and utilization
CN102014158B (en) * 2010-11-29 2013-07-10 北京兴宇中科科技开发股份有限公司 Cloud storage service client high-efficiency fine-granularity data caching system and method
CN109634933A (en) * 2014-10-29 2019-04-16 北京奇虎科技有限公司 The method, apparatus and system of data processing
WO2016122547A1 (en) * 2015-01-29 2016-08-04 Hewlett Packard Enterprise Development Lp Foster twin data structure
CN106407207B (en) * 2015-07-29 2020-06-16 阿里巴巴集团控股有限公司 Real-time newly-added data updating method and device
CN107133329B (en) * 2017-05-09 2022-03-08 腾讯科技(深圳)有限公司 Data processing method, data processing apparatus, and storage medium
CN107391034B (en) * 2017-07-07 2019-05-10 华中科技大学 A kind of repeated data detection method based on local optimization
CN107515931B (en) * 2017-08-28 2023-04-25 华中科技大学 Repeated data detection method based on clustering

Also Published As

Publication number Publication date
CN109783523A (en) 2019-05-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant