CN111858586B - Data processing method and device - Google Patents

Info

Publication number
CN111858586B
Authority
CN
China
Prior art keywords
data
hash
sub
value
hash table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010641089.5A
Other languages
Chinese (zh)
Other versions
CN111858586A (en
Inventor
刘中砥
刘霖
徐超
赵福仁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Skyguard Network Security Technology Co ltd
Original Assignee
Beijing Skyguard Network Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Skyguard Network Security Technology Co ltd
Priority to CN202010641089.5A
Publication of CN111858586A
Application granted
Publication of CN111858586B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2255Hash tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation

Abstract

The invention discloses a data processing method and device, and relates to the technical field of computers. In one embodiment, the method processes data details that comprise a plurality of index key values and looks up the memory address of a data detail based on any one of those index key values, so that the same data detail can be queried with different index key values, which improves data query efficiency. The hash tables of the index key values store the memory address of the data detail instead of the data detail itself, which reduces memory usage. When data processing operations are executed on a hash table, a sub-hash table lock and its corresponding queue to be processed are used, which addresses the high data processing complexity caused by the need for multi-thread scheduling.

Description

Data processing method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for data processing.
Background
With the rapid development of information technology, application systems place ever higher performance requirements on data storage and query, including the storage and query of in-memory data. At present, in-memory data is generally stored as key-value pairs, that is, a storage structure in which one key value corresponds to one piece of data, and the in-memory data is often used in multi-threaded application systems.
In the process of implementing the present invention, the inventor finds that at least the following problems exist in the prior art:
In practical application scenarios there is a business need to query in-memory data that contains a plurality of key values by using different key values. With the existing structure, in which one key value corresponds to one piece of data, the same in-memory data cannot be queried with different key values, and the data corresponding to the other key values cannot be obtained directly; meanwhile, storing the data as a plurality of key-value pairs increases memory consumption. In a multi-threaded application system, the threads involved in storing and querying in-memory data must be scheduled, which increases the complexity of data processing.
Disclosure of Invention
In view of this, the embodiments of the present invention provide a data processing method and apparatus that can process a data detail comprising a plurality of index key values and look up the memory address of the data detail based on any one of those index key values. The same data detail can therefore be queried with different index key values, which improves data query efficiency. The hash tables of the index key values store the memory address of the data detail instead of the data detail itself, which reduces memory usage. When data processing operations are executed on a hash table, a sub-hash table lock and its corresponding queue to be processed are used, which addresses the high data processing complexity caused by the need for multi-thread scheduling.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided a data processing method, including: acquiring data detail to be processed, the data detail comprising at least two index key values and the index values corresponding to the index key values, wherein each index key value is associated with a corresponding hash table, each hash table comprises a first number of sub-hash tables, and each sub-hash table comprises a second number of hash chains; calculating, according to the hash value of each index value, the first number and the second number, the element position that the hash value of the index value corresponds to in the hash table, the sub-hash table and the hash chain, the element position being used for storing the memory address of the data detail; and receiving a data read-write instruction that comprises an index key value to be operated on and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing a read-write operation on the memory address of the data detail.
Optionally, the method of data processing is characterized in that,
according to the hash value of the index value, the first quantity and the second quantity, calculating the element positions of the hash table, the sub hash table and the hash chain corresponding to the hash value of the index value comprises the following steps:
determining the sub-hash table to which the element position belongs according to the value of the predefined part of the hash value of the index value.
Optionally, the method of data processing is characterized in that,
according to the hash value of the index value, the first quantity and the second quantity, calculating the element positions of the hash table, the sub hash table and the hash chain corresponding to the hash value of the index value comprises the following steps:
determining, according to the number of bits of the hash value of the index value, the remaining part of the hash value after the predefined part is removed, and determining the element position in the hash chain corresponding to the hash value of the index value by using the remainder obtained by taking the remaining part modulo the second number.
Optionally, the method of data processing is characterized in that,
the sub-hash table is provided with a corresponding sub-hash table lock and a data queue to be processed corresponding to the sub-hash table lock.
Optionally, the method of data processing is characterized in that,
when the data read-write instruction is a store instruction, an attempt is made to acquire the sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, the memory address of the data detail is stored in the element position, and the reference count of the data detail pointed to by the memory address is increased by one;
and when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data storage instruction are stored in the data queue to be processed.
Optionally, the method of data processing is characterized in that,
when the data read-write instruction is a delete instruction, an attempt is made to acquire the sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, the memory address of the data detail is deleted based on the element position, and the reference count of the data detail pointed to by the memory address is decreased by one;
and when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data deletion instruction are stored in the data queue to be processed.
Optionally, the method of data processing is characterized in that,
when the data read-write instruction is a query instruction, the sub-hash table lock corresponding to the index value is acquired:
when the sub-hash table lock is acquired, the number of items to be processed in the data queue to be processed is obtained; when the number is 0, the memory address of the data detail stored in the element position is obtained, the data detail is obtained according to the memory address, and the reference count of the data detail pointed to by the memory address is increased by one;
when the number is not less than 1, the corresponding data read-write operations are executed according to the store or delete instructions of the items to be processed;
and after the data read-write instruction on the data detail is completed, the reference count of the data detail pointed to by the memory address is decreased by one.
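The query flow described above can be illustrated with a minimal Python sketch. Everything named here (`SubHashTable`, `make_record`, `drain`, `query`, `release`) is hypothetical and not taken from the patent. Python object references stand in for memory addresses, the lock is taken in blocking mode, and the lookup is assumed to proceed once the queued store or delete instructions have been applied; the text does not state these details.

```python
import threading
from collections import deque

class SubHashTable:
    """One sub-hash table with its own lock and its own queue to be processed."""
    def __init__(self):
        self.lock = threading.Lock()
        self.pending = deque()   # queued (op, chain, record) tuples
        self.chains = {}         # hash chain index -> list of record references

def make_record(**fields):
    # A dict stands in for a data detail; "ref_count" is its reference count.
    return {"fields": fields, "ref_count": 0}

def drain(sub):
    """Apply queued store/delete instructions; the caller must hold sub.lock."""
    while sub.pending:
        op, chain, record = sub.pending.popleft()
        entries = sub.chains.setdefault(chain, [])
        if op == "store":
            entries.append(record)
            record["ref_count"] += 1
        else:  # "delete"
            for i, entry in enumerate(entries):
                if entry is record:
                    del entries[i]
                    record["ref_count"] -= 1
                    break

def query(sub, chain, index_key, index_value):
    with sub.lock:                       # acquire the sub-hash table lock
        if sub.pending:                  # number of queued items is not less than 1
            drain(sub)
        for record in sub.chains.get(chain, []):
            if record["fields"].get(index_key) == index_value:
                record["ref_count"] += 1  # protect the record while it is in use
                return record
    return None

def release(record):
    """Called once the caller has finished with the queried data detail."""
    record["ref_count"] -= 1

sub = SubHashTable()
rec = make_record(name="Zhang San")
sub.pending.append(("store", 152, rec))  # a store that was queued earlier
found = query(sub, 152, "name", "Zhang San")
print(found is rec, rec["ref_count"])    # True 2
release(found)
print(rec["ref_count"])                  # 1
```

In this sketch the count of 2 is one reference held by the index entry plus one held by the reader; releasing the reader's reference leaves the index entry's count intact.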
To achieve the above object, according to a second aspect of an embodiment of the present invention, there is provided an apparatus for data processing, including: an index key value and data acquiring module, an index value element position calculating module, and an executing data read-write processing module; wherein,
the index key value and data acquiring module is configured to acquire data detail to be processed, the data detail comprising at least two index key values and the index values corresponding to the index key values; each index key value is associated with a corresponding hash table, each hash table comprises a first number of sub-hash tables, and each sub-hash table comprises a second number of hash chains;
the index value element position calculating module is configured to calculate element positions of the index value hash value corresponding to the hash table, the sub hash table, and the hash chain according to the index value hash value, the first number, and the second number, respectively; the element position is used for storing the memory address of the data detail;
The executing data read-write processing module is used for receiving a data read-write instruction, wherein the data read-write instruction comprises an index key value to be operated and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing read-write operation on the memory address of the data detail.
Optionally, the data processing apparatus is characterized in that,
according to the hash value of the index value, the first quantity and the second quantity, calculating the element positions of the hash table, the sub hash table and the hash chain corresponding to the hash value of the index value comprises the following steps:
determining the sub-hash table to which the element position belongs according to the value of the predefined part of the hash value of the index value.
Optionally, the data processing apparatus is characterized in that,
according to the hash value of the index value, the first quantity and the second quantity, calculating the element positions of the hash table, the sub hash table and the hash chain corresponding to the hash value of the index value comprises the following steps:
determining, according to the number of bits of the hash value of the index value, the remaining part of the hash value after the predefined part is removed, and determining the element position in the hash chain corresponding to the hash value of the index value by using the remainder obtained by taking the remaining part modulo the second number.
Optionally, the data processing apparatus is characterized in that,
the sub-hash table is provided with a corresponding sub-hash table lock and a data queue to be processed corresponding to the sub-hash table lock.
Optionally, the data processing apparatus is characterized in that,
when the data read-write instruction is a store instruction, an attempt is made to acquire the sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, the memory address of the data detail is stored in the element position, and the reference count of the data detail pointed to by the memory address is increased by one;
and when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data storage instruction are stored in the data queue to be processed.
Optionally, the data processing apparatus is characterized in that,
when the data read-write instruction is a delete instruction, an attempt is made to acquire the sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, the memory address of the data detail is deleted based on the element position, and the reference count of the data detail pointed to by the memory address is decreased by one;
and when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data deletion instruction are stored in the data queue to be processed.
Optionally, the data processing apparatus is characterized in that,
when the data read-write instruction is a query instruction, the sub-hash table lock corresponding to the index value is acquired:
when the sub-hash table lock is acquired, the number of items to be processed in the data queue to be processed is obtained; when the number is 0, the memory address of the data detail stored in the element position is obtained, the data detail is obtained according to the memory address, and the reference count of the data detail pointed to by the memory address is increased by one;
when the number is not less than 1, the corresponding data read-write operations are executed according to the store or delete instructions of the items to be processed;
and after the data read-write instruction on the data detail is completed, the reference count of the data detail pointed to by the memory address is decreased by one.
To achieve the above object, according to a third aspect of an embodiment of the present invention, there is provided an electronic device for data processing, including: one or more processors; and storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the methods of data processing described above.
To achieve the above object, according to a fourth aspect of embodiments of the present invention, there is provided a computer-readable medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements a method as described in any one of the above-described data processing methods.
One embodiment of the above invention has the following advantages or benefits: the method can process data details comprising a plurality of index key values and look up the memory address of a data detail based on any one of those index key values, so that the same data detail can be queried with different index key values, which improves data query efficiency; the hash tables of the index key values store the memory address of the data detail instead of the data detail itself, which reduces memory usage; and when data access is executed on a hash table, the sub-hash table lock and the corresponding queue to be processed are used, which addresses the high data processing complexity caused by the need for multi-thread scheduling.
Further effects of the above-described non-conventional alternatives are described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a flow chart of a method for data processing according to an embodiment of the present invention;
FIG. 2 is an exemplary diagram of a correspondence between data details and hash tables according to one embodiment of the present invention;
FIG. 3 is a flow chart of a data write operation of a data process according to an embodiment of the present invention;
FIG. 4 is a flow diagram of a data query operation for data processing according to one embodiment of the present invention;
FIG. 5 is a schematic diagram of an apparatus for data processing according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;
FIG. 7 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
As shown in FIG. 1, an embodiment of the present invention provides a data processing method, which may include the following steps:
Step S101: acquiring data detail to be processed, the data detail comprising at least two index key values and the index values corresponding to the index key values; each index key value is associated with a corresponding hash table, each hash table comprises a first number of sub-hash tables, and each sub-hash table comprises a second number of hash chains.
Specifically, a data detail to be processed is obtained, where the data detail comprises data in a plurality of key/value structures, such as "key1: value1; key2: value2; key3: value3; … keyN: valueN". Here key1, key2, key3 … keyN denote the key values of a group of related data, and the corresponding values are value1, value2, value3 … valueN respectively. According to the business scenario, at least two key values can be selected as index key values; for example, key1 and key2 are selected as index key values, and value1 and value2 are then the index values corresponding to key1 and key2. That is, the data detail comprises at least two index key values and the index values corresponding to the index key values. It will be appreciated that when key1 and key2 are selected as index key values, the keys other than key1 and key2 are non-index key values contained in the data detail, and that value1, value2, value3 … valueN may all be acquired when the data detail is acquired. For example, FIG. 2 shows a data detail that includes:
"IP (Internet Protocol) address: 123.123.1.1; name: Zhang San; staff number: A0010; telephone number: 188 8888", which corresponds to "key1: value1; key2: value2; key3: value3; key4: value4". The data detail is stored in an in-memory database and has a corresponding memory address. When a data detail is to be located, it can be queried through any of the index key values that have been set. For example, in the example shown in FIG. 2, assume that the index key values are "IP address" and "name". Querying "123.123.1.1" through "IP address" finds the data detail and shows that the user logged in from that IP address is "Zhang San". Assume that "Zhang San" then logs in to the same system again from "IP address: 123.123.1.2". When it is necessary to verify whether a user has logged in from different IP addresses, "Zhang San" can be looked up through the index key value "name", which returns the data detail containing "IP address: 123.123.1.1; name: Zhang San", and the next data processing operation can then be performed. For example, to prevent repeated login, the memory address of one of the data details is deleted, and when the reference count of the data detail pointed to by that memory address reaches 0, the data of the data detail is deleted from the in-memory database.
Further, the hash table corresponding to the index key value is used for storing memory addresses of a plurality of data details corresponding to the index key value; one index key value corresponds to one hash table; the index key values are respectively associated with corresponding hash tables, the hash tables comprise a first number of sub hash tables, and the sub hash tables comprise a second number of hash chains; for example, as shown in the example of fig. 2, the hash table corresponding to the index key "IP address" is "IP address hash table"; the hash table corresponding to the index key value name is a name hash table.
Still further, each hash table includes a first number of sub-hash tables. It can be understood that dividing the hash table into a plurality of sub-hash tables reduces the granularity of the data set, which helps to locate data quickly. Each sub-hash table contains a second number of hash chains, and each hash chain contains the element positions indicated by index values, where an element position is used for storing the memory address of a data detail. The first number may be determined based on the number of bits of the hash value. For example, the hash value of the index value "Li Si" is calculated and converted into a 32-bit binary hash value. Assuming that the 32-bit binary hash value is divided into the first 6 bits and the last 26 bits, the number of sub-hash tables is 2^6 (64); that is, 2^6 is the first number. The sub-hash table is then selected by the value of the first 6 bits of the 32-bit binary hash value of the index value: for example, if the first six bits are 110011, which is 51 in decimal, the index value corresponds to the 51st of the 64 sub-hash tables in the name hash table.
It will be appreciated that the first number depends on the radix of the hash value and on the number of bits of the predefined part; for example, if the hash value is binary and the predefined part has 8 bits, the first number is 2^8 (256). The predefined part may be the first N bits of the hash value, N bits anywhere in the middle of the hash value, or the last N bits. That is, calculating the element positions of the hash table, the sub-hash table, and the hash chain corresponding to the hash value of the index value according to the hash value of the index value, the first number, and the second number includes: determining the sub-hash table to which the element position belongs according to the value of the predefined part of the hash value of the index value.
Further, the second number is the number of hash chains contained in each sub-hash table; for example, the second number is set to 1000. Assume again that the first number is determined by the predefined part (the first 6 bits) of the hash value. The value of the last 26 bits of the hash value of the index value, that is, the part remaining after the predefined part (the first 6 bits) is removed, is then calculated; for example, the value of the last 26 bits is 2097152. Since 2097152 modulo 1000 is 152, the index value is associated with the 152nd hash chain of the 51st sub-hash table. It can be understood that the element positions of each hash chain form a linked list structure; after the hash chain is located, the element position is looked up in the corresponding linked list according to the data contained in the hash chain. In other words, the remaining part of the hash value after the predefined part is removed is determined according to the number of bits of the hash value of the index value, and the element position in the hash chain corresponding to the hash value of the index value is determined by using the remainder obtained by taking the remaining part modulo the second number. Further, when the amount of data in a sub-hash table exceeds the predefined load of its hash chains, the sub-hash table is expanded according to a predefined sub-hash table expansion strategy, and the data already in the sub-hash table is stored again according to the expanded capacity.
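The position calculation in the example above (a 32-bit hash value, the first 6 bits as the predefined part, 1000 hash chains per sub-hash table) can be sketched in Python as follows. The function names `locate` and `hash32` are hypothetical, and `zlib.crc32` is only a stand-in, since the text does not name a hash function.

```python
import zlib
from typing import Tuple

HASH_BITS = 32                    # length of the hash value in bits (from the example)
PREFIX_BITS = 6                   # predefined part: the first 6 bits
FIRST_NUMBER = 1 << PREFIX_BITS   # 2^6 = 64 sub-hash tables
SECOND_NUMBER = 1000              # hash chains per sub-hash table

def locate(hash_value: int) -> Tuple[int, int]:
    """Return (sub-hash table index, hash chain index) for a 32-bit hash value."""
    rest_bits = HASH_BITS - PREFIX_BITS
    sub_table = hash_value >> rest_bits          # value of the first 6 bits
    rest = hash_value & ((1 << rest_bits) - 1)   # remaining 26 bits
    chain = rest % SECOND_NUMBER                 # remainder modulo the second number
    return sub_table, chain

def hash32(index_value: str) -> int:
    # Assumption: CRC-32 is used only to obtain some 32-bit value for the demo.
    return zlib.crc32(index_value.encode("utf-8")) & 0xFFFFFFFF

# Worked example from the text: first six bits 110011 (51), last 26 bits equal to 2097152.
example = (0b110011 << 26) | 2097152
print(locate(example))   # (51, 152)
```

For an arbitrary index value, `locate(hash32("Zhang San"))` returns a sub-hash table index in the range 0 to 63 and a hash chain index in the range 0 to 999.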
The present invention does not limit the specific content of the data detail, the specific content and number of the index key values, the length and radix of the hash values, the number and position of the bits of the predefined part, or the first number and the second number.
Step S102: calculating element positions of the hash values of the index values corresponding to the hash table, the sub hash table and the hash chain according to the hash values of the index values, the first quantity and the second quantity respectively; the element position is used for storing the memory address of the data detail;
Specifically, the description of how the sub-hash table corresponding to the hash value of the index value and the element position of the hash chain in that sub-hash table are calculated is consistent with step S101 and is not repeated here. It will be understood that different index values correspond to different index key values and are each associated with the hash table of the corresponding index key value. Still taking FIG. 2 as an example, assume that calculation determines the element position corresponding to the hash value of the index value "123.123.1.1" to be "element position 1" of "hash chain 1" in "sub-hash table 1" of the "IP address hash table", and the element position corresponding to the hash value of the index value "Zhang San" to be "element position 2" of "hash chain n" in "sub-hash table 1" of the "name hash table". It will be appreciated that when "Zhang San" logs in again from the IP address "123.123.1.2", two data details containing different IP addresses can be found.
The element position is used for storing the memory address of the data detail. For example, assume that the example data detail shown in FIG. 2, "IP address: 123.123.1.1; name: Zhang San; staff number: A0010; telephone number: 188 8888", has the memory address "0x12345678". Data details and memory addresses are in one-to-one correspondence, and the content of a data detail can be obtained through its memory address. "0x12345678" is stored at element position 1 of hash chain 1 in sub-hash table 1 of the IP address hash table, and also at element position 2 of hash chain n in sub-hash table 1 of the name hash table. Storing the memory address of the data detail instead of its content thus reduces the memory consumed by storing the same data multiple times and improves data processing performance. That is, according to the hash value of the index value, the first number and the second number, the element positions that the hash value of the index value corresponds to in the hash table, the sub-hash table and the hash chain are calculated, and the element position is used to store the memory address of the data detail.
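As a rough illustration of storing one address under several index key values, the following Python sketch keeps one simplified table per index key and puts the same object reference into each. A Python reference stands in for the memory address, a plain `dict` replaces the sub-hash table and hash chain layout, and the names `DataDetail`, `index_tables`, `store` and `query` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class DataDetail:
    fields: dict
    ref_count: int = 0   # how many index entries refer to this record

INDEX_KEYS = ("IP address", "name")              # index key values from the FIG. 2 example
index_tables = {key: {} for key in INDEX_KEYS}   # one simplified hash table per index key

def store(detail: DataDetail) -> None:
    for key in INDEX_KEYS:
        # The list plays the role of a hash chain: several data details
        # may share one index value (for example, the same name).
        index_tables[key].setdefault(detail.fields[key], []).append(detail)
        detail.ref_count += 1

def query(index_key: str, index_value: str) -> list:
    return index_tables[index_key].get(index_value, [])

first = DataDetail({"IP address": "123.123.1.1", "name": "Zhang San", "staff number": "A0010"})
second = DataDetail({"IP address": "123.123.1.2", "name": "Zhang San", "staff number": "A0010"})
store(first)
store(second)

# Both indexes lead to the very same object; no copy of the record is made.
print(query("IP address", "123.123.1.1")[0] is query("name", "Zhang San")[0])   # True
print(len(query("name", "Zhang San")))                                            # 2
```

Querying by name returns both login records, which matches the repeated-login check described for FIG. 2, while each record exists only once in memory.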
Step S103: receiving a data read-write instruction, wherein the data read-write instruction comprises an index key value to be operated and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing read-write operation on the memory address of the data detail.
Specifically, as described in steps S101 to S102, the element positions of the memory address of the data detail in the hash tables of the different index key values are determined, and the read-write operation on the memory address of the data detail is executed according to the data read-write instruction. The read operation includes query, and the write operations include store (insert) and delete. The data read-write instruction comprises an index key value to be operated on and a corresponding index value; for example, the index key value to be operated on (to be queried) is "name" and the corresponding index value is "Zhang San". The element position corresponding to the index value is then determined according to the index value; for example, the hash value of "Zhang San" is calculated to determine the element position, in a hash chain of a sub-hash table of the name hash table, that holds the memory address of the data detail of "Zhang San", and the read-write operation on the memory address of the data detail is performed. It is understood that because the element position stores the memory address of the data detail, access to the content of the data detail is protected by the reference count of the data detail pointed to by the memory address; for example, when that reference count is 0, the data of the data detail can be deleted.
Further, the sub-hash table is provided with a corresponding sub-hash table lock and a data queue to be processed corresponding to the sub-hash table lock. Specifically, a lock is set for each sub-hash table, which, compared with setting one lock for the whole hash table, reduces the granularity of data processing and improves its flexibility. The sub-hash table lock can be set by using a function or method of a programming language; the present invention does not limit the specific implementation of the sub-hash table lock, or the specific implementation and storage mode of the data queue to be processed.
The detailed description about the storage and deletion of the data based on the element position is identical to the description of step S301 to step S304, and will not be repeated here; the detailed description about the element position-based data query is identical to the descriptions of step S401 to step S405, and will not be repeated here.
As shown in FIG. 3, an embodiment of the present invention provides a flow of a data storage or deletion operation, which may include the following steps:
Step S301: calculating the hash value of the index value, and determining the Nth sub-hash table to which the index value belongs and the element position in the corresponding hash chain contained in the Nth sub-hash table.
Specifically, the method and description for determining the nth sub-hash table and the element position of the index value by calculating the hash value of the index value are consistent with steps S101 to S102, and will not be described herein.
Step S302: the lock of the nth sub-hash table is obtained.
Specifically, each sub-hash table is provided with a corresponding sub-hash table lock, and for data processing operation of the sub-hash table, the lock of the sub-hash table needs to be acquired first. Further, when the lock of the nth sub hash table is acquired, step S303 is executed; when the lock of the nth sub-hash table is not acquired, step S304 is performed.
Step S303: the memory address of the data detail is stored (deleted) in the element position corresponding to the hash chain, and the reference count of the data detail pointed by the memory address is increased by one (reduced by one).
When the data read-write instruction is a store (or delete) and the sub-hash table lock corresponding to the index value has been acquired, the memory address of the data detail is stored in (or deleted from) the element position corresponding to the hash chain, and the reference count of the data detail pointed to by the memory address is increased by one (on store) or reduced by one (on delete).
Step S304: store the element position, the memory address of the data detail and the corresponding data storage instruction in the data queue to be processed.
Specifically, when the lock of the nth sub-hash table is not acquired, the element position, the memory address of the data detail and the corresponding data storage instruction are stored in the data queue to be processed corresponding to the nth sub-hash table.
Steps S303 to S304 can be summarized as follows. When the data read-write instruction is a store, acquisition of the sub-hash table lock corresponding to the index value is attempted: when the sub-hash table lock is acquired, the memory address of the data detail is stored in the element position, and the reference count of the data detail pointed to by the memory address is increased by one; when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data storage instruction are stored in the data queue to be processed.
When the data read-write instruction is a delete, acquisition of the sub-hash table lock corresponding to the index value is attempted: when the sub-hash table lock is acquired, the memory address of the data detail is deleted based on the element position, and the reference count of the data detail pointed to by the memory address is reduced by one; when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data deleting instruction are stored in the data queue to be processed.
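The store and delete flow of steps S302 to S304 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: Python object references stand in for memory addresses, `DataDetail`, `SubHashTable`, `apply_write` and `write` are hypothetical names, and the sketch relies on `collections.deque.append` being safe to call from several threads in CPython.

```python
import threading
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)  # details compare by identity, like a memory address
class DataDetail:
    fields: dict
    ref_count: int = 0


@dataclass(eq=False)
class SubHashTable:
    chain_count: int
    lock: threading.Lock = field(default_factory=threading.Lock)
    pending: deque = field(default_factory=deque)
    chains: list = field(init=False)

    def __post_init__(self):
        self.chains = [[] for _ in range(self.chain_count)]


def apply_write(sub, op, position, detail):
    """Apply one store or delete; the caller must hold sub.lock (step S303)."""
    chain = sub.chains[position]
    if op == "store":
        chain.append(detail)
        detail.ref_count += 1
    elif op == "delete" and detail in chain:
        chain.remove(detail)
        detail.ref_count -= 1


def write(sub, op, position, detail):
    """Return True if applied at once, False if deferred to the pending queue."""
    if sub.lock.acquire(blocking=False):      # step S302: try to take the lock
        try:
            apply_write(sub, op, position, detail)
        finally:
            sub.lock.release()
        return True
    # Step S304: lock not acquired, so record the instruction for later.
    sub.pending.append((op, position, detail))
    return False
```

The writer never blocks: if another thread holds the sub-hash table lock, the instruction is queued and applied by whichever later operation acquires the lock and processes the queue.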
As shown in fig. 4, an embodiment of the present invention provides a flow of a data query operation, which may include the following steps:
Step S401: calculate the hash value of the index value, and determine the Nth sub-hash table for the index value and the element position in the corresponding hash chain contained in the Nth sub-hash table.
Specifically, the method and description for determining the nth sub-hash table and the element position of the index value by calculating the hash value of the index value are consistent with steps S101 to S102, and will not be described herein.
Step S402: the lock of the nth sub-hash table is obtained.
Specifically, each sub-hash table is provided with a corresponding sub-hash table lock, and for data processing operation of the sub-hash table, the lock of the sub-hash table needs to be acquired first. When the lock of the nth sub hash table is acquired, step S403 is executed; when the lock of the Nth sub-hash table is not acquired, waiting for next scheduling, and re-executing the flow of acquiring the lock of the Nth sub-hash table.
Step S403: obtain the quantity of data to be processed in the data queue to be processed corresponding to the sub-hash table.
Specifically, as shown in the description of step S304, when the lock of the nth sub-hash table is not obtained, the element position, the memory address of the data detail, and the corresponding data storage instruction are stored in the data queue to be processed corresponding to the nth sub-hash table; further, the number of data to be processed in the data queue to be processed is obtained, and when the number is 0, step S405 is executed; if not 0, step S404 is executed.
Step S404: execute the corresponding data read-write operation according to the store or delete instruction of the data to be processed.
Specifically, the quantity of data to be processed in the data queue to be processed is obtained; when the quantity is not less than 1 (for example, 1 or 5), the data to be processed in the data queue to be processed is obtained, wherein the data to be processed comprises the element position, the memory address of the data detail and the corresponding data read-write (store or delete) instruction; further, the data read-write operation corresponding to the store or delete instruction is executed. It can be understood that the data to be processed contained in a queue to be processed is associated with a sub-hash table, and each sub-hash table corresponds to one data queue to be processed. The detailed description of data storage and deletion is consistent with the description of steps S301 to S304 and is not repeated here.
Further, after the corresponding data read-write operation is performed according to the read-write instruction for storing or deleting the data to be processed, step S405 is performed.
Step S405: acquire the memory address of the data detail stored in the element position, and acquire the data detail according to the memory address.
Specifically, the memory address of the stored data detail is obtained from the element position of the corresponding hash chain, the data detail and its content are obtained according to that memory address, and the reference count of the data detail pointed to by the memory address is increased by one.
It can be appreciated that, when querying data, incrementing and decrementing the reference count protects access to the memory address. Specifically, after the lock of the Nth sub-hash table is obtained, the queue to be processed is handled first; after that processing is finished, the corresponding data is queried in the hash table, the reference count of the data detail is increased by one after the query, and the query result is returned to the user querying the data detail. After the querying user finishes the data processing, the reference count is decremented, that is, the reference count of the data detail pointed to by the memory address is reduced by one once the data processing of the data detail is completed. The data processing comprises data query and acquisition; for example, the data detail related to an IP address is queried through the IP address, and information such as the name, employee number and telephone number corresponding to the IP address is further acquired from the data detail, the data detail being protected by the reference count.
For example, as illustrated in fig. 2, the memory address "0x12345678" of the data detail is obtained through "storage address 2" of "hash chain n" in "sub-hash table 1" of the "name hash table", and the content of the data detail obtained from the memory address "0x12345678" is "IP address: 123.123.1.1; name: Zhang San; employee number: A0010; telephone number: 188 x 8888". Assume that another data detail generated later is "IP address: 123.123.1.2; name: Zhang San; employee number: A0010; telephone number: 188 x 8888", with memory address "0x87654321". The two data details contain different IP addresses. Because querying with the index key value "name" and the index value "Zhang San" reaches element positions that store the memory addresses of both data details, if a specific service (for example, an identity verification system of an application proxy server) allows a user to log in from only one IP address, the data detail with "IP address: 123.123.1.1" is deleted according to a predefined deletion strategy (for example, the earlier generated data detail is deleted in chronological order). The deletion of the data detail is consistent with step S303 and is not described again here.
Steps S402 to S405 can be summarized as follows. When the data read-write instruction is a query, acquisition of the sub-hash table lock corresponding to the index value is attempted. When the sub-hash table lock is acquired, the quantity of data to be processed in the data queue to be processed is obtained: when the quantity is 0, the memory address of the data detail stored in the element position is acquired, the data detail is acquired according to the memory address, and the reference count of the data detail pointed to by the memory address is increased by one; when the quantity is not less than 1, the corresponding data read-write operations are first executed according to the store or delete instructions of the data to be processed, and the query then proceeds as in the case where the quantity is 0. After the data processing of the data detail is completed, the reference count of the data detail pointed to by the memory address is reduced by one.
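The query flow of steps S402 to S405 can be sketched as follows. This is a hedged illustration rather than the embodiment's implementation: `Detail`, `SubTable`, `drain_pending`, `query` and `release` are hypothetical names, Python references stand in for memory addresses, and the reference count is a plain integer, whereas a multi-threaded implementation would need an atomic counter.

```python
import threading
from collections import deque


class Detail:
    def __init__(self, fields):
        self.fields = fields
        self.ref_count = 0


class SubTable:
    def __init__(self, chain_count):
        self.lock = threading.Lock()
        self.pending = deque()                     # queued (op, position, detail)
        self.chains = [[] for _ in range(chain_count)]


def drain_pending(sub):
    """Step S404: apply queued store/delete instructions; caller holds sub.lock."""
    while sub.pending:
        op, position, detail = sub.pending.popleft()
        chain = sub.chains[position]
        if op == "store":
            chain.append(detail)
            detail.ref_count += 1
        elif op == "delete" and detail in chain:
            chain.remove(detail)
            detail.ref_count -= 1


def query(sub, position, key, value):
    """Return matching details with ref_count incremented, or None if the lock is busy."""
    if not sub.lock.acquire(blocking=False):       # step S402
        return None                                # caller retries at next scheduling
    try:
        drain_pending(sub)                         # steps S403 and S404
        hits = [d for d in sub.chains[position] if d.fields.get(key) == value]
        for d in hits:                             # step S405: protect each result
            d.ref_count += 1
        return hits
    finally:
        sub.lock.release()


def release(details):
    """Called by the querying user once it has finished with the results."""
    for d in details:
        d.ref_count -= 1
```

A caller would pair each successful `query` with a later `release`, so that a data detail whose reference count has dropped to 0 can be reclaimed safely.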
As shown in fig. 5, an embodiment of the present invention provides an apparatus 500 for data processing, including: an index key value and data module 501, an index value element position calculating module 502 and an execution data read-write processing module 503; wherein,
the index key value and data module 501 includes a data detail to be processed, where the data detail includes at least two index key values and index values corresponding to the index key values; the index key values are respectively associated with corresponding hash tables, the hash tables comprise a first number of sub hash tables, and the sub hash tables comprise a second number of hash chains;
The index value element position calculating module 502 is configured to calculate element positions of the index value hash value corresponding to the hash table, the sub hash table, and the hash chain according to the index value hash value, the first number, and the second number, respectively; the element position is used for storing the memory address of the data detail;
the executing data read-write processing module 503 is configured to receive a data read-write instruction, where the data read-write instruction includes an index key value to be operated and a corresponding index value, determine the element position corresponding to the index value according to the index value, and execute a read-write operation on a memory address of the data detail.
Optionally, the index value element position calculating module 502 is configured to determine the sub-hash table to which the element position belongs according to the value of a predefined part of the hash value of the index value.
Optionally, the index value element position calculating module 502 is configured to obtain, according to the number of bits of the hash value of the index value, the remaining part of the hash value after the predefined part is removed, take that remaining part modulo the second number, and use the resulting remainder to determine the element position of the hash chain corresponding to the hash value of the index value.
Optionally, for the executing data read-write processing module 503, the sub-hash table is provided with a corresponding sub-hash table lock and a data queue to be processed corresponding to the sub-hash table lock.
Optionally, the executing data read-write processing module 503 is configured to attempt, when the data read-write instruction is a store, to acquire the sub-hash table lock corresponding to the index value: when the sub-hash table lock is acquired, the memory address of the data detail is stored in the element position, and the reference count of the data detail pointed to by the memory address is increased by one; when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data storage instruction are stored in the data queue to be processed.
Optionally, the executing data read-write processing module 503 is configured to attempt, when the data read-write instruction is a delete, to acquire the sub-hash table lock corresponding to the index value: when the sub-hash table lock is acquired, the memory address of the data detail is deleted based on the element position, and the reference count of the data detail pointed to by the memory address is reduced by one; when the sub-hash table lock is not acquired, the element position, the memory address of the data detail and the corresponding data deleting instruction are stored in the data queue to be processed.
Optionally, the executing data read-write processing module 503 is configured to obtain, when the data read-write instruction is a query, a sub-hash table lock corresponding to the index value: when the sub-hash table lock is acquired, acquiring the quantity of data to be processed in the data queue to be processed, when the quantity is 0, acquiring a memory address of the data detail stored in the element position, acquiring the data detail according to the memory address, and adding one to a reference count of the data detail pointed by the memory address; when the number is not less than 1, corresponding data read-write operation is executed according to the read-write instruction for storing or deleting the data to be processed; and after the data processing of the data detail is completed, reducing the reference count of the data detail pointed by the memory address by one.
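The position calculation attributed to module 502 above can be sketched as follows. The text does not say which bits form the predefined part or which hash function is used, so this example assumes a 32-bit hash from `zlib.crc32`, takes the high-order bits as the predefined part, and requires the first number (the sub-hash table count) to be a power of two; the function name `locate` is hypothetical.

```python
import zlib

HASH_BITS = 32  # assumed width of the hash value


def locate(index_value, sub_table_count, chain_count):
    """Map an index value to (sub-hash table index, hash chain position)."""
    if sub_table_count < 1 or sub_table_count & (sub_table_count - 1):
        raise ValueError("this sketch assumes a power-of-two sub-table count")
    h = zlib.crc32(index_value.encode("utf-8"))    # unsigned 32-bit hash value
    prefix_bits = (sub_table_count - 1).bit_length()
    rest_bits = HASH_BITS - prefix_bits
    sub_index = h >> rest_bits                     # predefined part picks the sub-table
    rest = h & ((1 << rest_bits) - 1)              # hash with the predefined part removed
    return sub_index, rest % chain_count           # remainder picks the hash chain


sub_index, chain_position = locate("Zhang San", sub_table_count=16, chain_count=64)
```

Using disjoint parts of the hash for the two decisions keeps the choice of sub-hash table independent of the choice of chain within it.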
The embodiment of the invention also provides electronic equipment for data processing, which comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method provided by any of the embodiments described above.
The embodiment of the invention also provides a computer readable medium, on which a computer program is stored, which when executed by a processor implements the method provided by any of the above embodiments.
Fig. 6 illustrates an exemplary system architecture 600 of a data processing method or apparatus to which embodiments of the present invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 is used as a medium to provide communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may interact with the server 605 via the network 604 using the terminal devices 601, 602, 603 to receive or send messages, etc. Various client applications may be installed on the terminal devices 601, 602, 603, such as enterprise application clients, information system clients, web browser applications, search class applications, instant messaging tools, mailbox clients, and the like.
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting the operation of various clients, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server providing support for client applications run by the user with the terminal devices 601, 602, 603. The background management server can process the received data processing request and feed back the processing result to the terminal equipment.
It should be noted that, the method for processing data provided in the embodiment of the present invention is generally executed by the server 605, and accordingly, the device for processing data is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, there is illustrated a schematic diagram of a computer system 700 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the system 700 are also stored. The CPU 701, ROM 702, and RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. The drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is installed into the storage section 708 as necessary.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 701.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules and/or units involved in the embodiments of the present invention may be implemented in software, or may be implemented in hardware. The described modules and/or units may also be provided in a processor, which may, for example, be described as: a processor comprising an index key value and data module, an index value element position calculating module and an execution data read-write processing module. In some cases the names of these modules do not limit the modules themselves; for example, the index value element position calculating module may also be described as "a module for determining, according to the hash value of the index value, the element position in which the memory address of the data detail associated with the index key value is stored".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: the data detail to be processed comprises at least two index key values and index values corresponding to the index key values; the index key values are respectively associated with corresponding hash tables, the hash tables comprise a first number of sub hash tables, and the sub hash tables comprise a second number of hash chains; calculating element positions of the hash values of the index values corresponding to the hash table, the sub hash table and the hash chain according to the hash values of the index values, the first quantity and the second quantity respectively; the element position is used for storing the memory address of the data detail; receiving a data read-write instruction, wherein the data read-write instruction comprises an index key value to be operated and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing read-write operation on the memory address of the data detail.
According to the technical scheme of the embodiment of the invention, data details that include a plurality of index key values can be processed, and the memory address of a data detail can be looked up based on any of its index key values, so that the same data detail can be found using different index key values and data lookup efficiency is improved. The hash tables of the plurality of index key values store the memory address of the data detail rather than the data detail itself, which reduces memory occupation. When data processing operations are executed on a hash table, the sub-hash table lock and the corresponding queue to be processed address the high data processing complexity that multi-thread scheduling would otherwise cause.
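The multi-index idea summarized above, in which several hash tables store only the address of one shared data detail, can be illustrated with a deliberately simplified sketch. Plain Python dictionaries stand in for the sharded hash tables, object references stand in for memory addresses, and the names `DataDetail` and `MultiIndex` are hypothetical.

```python
class DataDetail:
    def __init__(self, **fields):
        self.fields = fields
        self.ref_count = 0


class MultiIndex:
    """One lookup table per index key value; each stores references, not copies."""

    def __init__(self, index_keys):
        self.tables = {key: {} for key in index_keys}

    def store(self, detail):
        for key, table in self.tables.items():
            # Only the reference is stored; the detail itself exists once.
            table.setdefault(detail.fields[key], []).append(detail)
            detail.ref_count += 1

    def lookup(self, key, value):
        return list(self.tables[key].get(value, []))


index = MultiIndex(["ip", "name"])
detail = DataDetail(ip="123.123.1.1", name="Zhang San", employee_no="A0010")
index.store(detail)

by_ip = index.lookup("ip", "123.123.1.1")
by_name = index.lookup("name", "Zhang San")
```

Both lookups return the same object, so adding a further index key value costs one extra reference per data detail rather than another copy of its content.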
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (7)

1. A method of data processing, comprising:
the data detail to be processed comprises at least two index key values and index values corresponding to the index key values; the index key values are respectively associated with corresponding hash tables, the hash tables comprise a first number of sub hash tables, and the sub hash tables comprise a second number of hash chains;
Calculating element positions of the hash values of the index values corresponding to the hash table, the sub hash table and the hash chain according to the hash values of the index values, the first quantity and the second quantity respectively; comprising the following steps: determining the sub-hash table to which the element position belongs according to the value of the predefined part of the hash value of the index value;
obtaining, according to the number of bits of the hash value of the index value, the remaining part of the hash value after the predefined part is removed, taking the remaining part modulo the second number, and using the resulting remainder to determine the element position of the hash chain corresponding to the hash value of the index value;
the sub-hash table is provided with a corresponding sub-hash table lock and a data waiting queue corresponding to the sub-hash table lock;
the element position is used for storing the memory address of the data detail;
receiving a data read-write instruction, wherein the data read-write instruction comprises an index key value to be operated and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing read-write operation on the memory address of the data detail.
2. The method of claim 1, wherein,
when the data read-write instruction is a store, acquiring a sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, storing the memory address of the data detail in the element position, and adding one to the reference count of the data detail pointed by the memory address;
and when the sub-hash table lock is not acquired, storing the element position, the memory address of the data detail and the corresponding data storage instruction in the data queue to be processed.
3. The method of claim 1, wherein,
when the data read-write instruction is a delete, acquiring a sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, deleting the memory address of the data detail based on the element position, and subtracting one from the reference count of the data detail pointed by the memory address;
and when the sub-hash table lock is not acquired, storing the element position, the memory address of the data detail and the corresponding data deleting instruction in the data queue to be processed.
4. A method according to any one of claims 2 to 3, wherein,
When the data read-write instruction is a query, acquiring a sub-hash table lock corresponding to the index value:
when the sub-hash table lock is acquired, acquiring the quantity of data to be processed in the data queue to be processed, when the quantity is 0, acquiring a memory address of the data detail stored in the element position, acquiring the data detail according to the memory address, and adding one to a reference count of the data detail pointed by the memory address;
when the number is not less than 1, corresponding data read-write operation is executed according to the read-write instruction for storing or deleting the data to be processed;
and after the data processing of the data detail is completed, reducing the reference count of the data detail pointed by the memory address by one.
5. An apparatus for data processing, comprising: an index key value and data module, an index value element position calculating module and an execution data read-write processing module; wherein,
the data detail to be processed comprises at least two index key values and index values corresponding to the index key values; the index key values are respectively associated with corresponding hash tables, the hash tables comprise a first number of sub hash tables, and the sub hash tables comprise a second number of hash chains;
the index value element position calculating module is configured to calculate element positions of the hash value of the index value corresponding to the hash table, the sub hash table, and the hash chain according to the hash value of the index value, the first number, and the second number, respectively; comprising the following steps: determining the sub-hash table to which the element position belongs according to the value of the predefined part of the hash value of the index value; obtaining, according to the number of bits of the hash value of the index value, the remaining part of the hash value after the predefined part is removed, taking the remaining part modulo the second number, and using the resulting remainder to determine the element position of the hash chain corresponding to the hash value of the index value; the sub-hash table is provided with a corresponding sub-hash table lock and a data waiting queue corresponding to the sub-hash table lock; the element position is used for storing the memory address of the data detail;
the executing data read-write processing module is used for receiving a data read-write instruction, wherein the data read-write instruction comprises an index key value to be operated and a corresponding index value, determining the element position corresponding to the index value according to the index value, and executing read-write operation on the memory address of the data detail.
6. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.
7. A computer readable medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-4.
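The element position calculation and the per-sub-hash-table locking recited in claim 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. The 32-bit hash width, the choice of the top 4 bits as the "predefined part", the values of the first and second numbers, and all function and class names are assumptions made for illustration. The data waiting queue is only approximated by threads blocking on `threading.Lock`, and a Python object reference stands in for the memory address of the data detail.

```python
import hashlib
import threading

HASH_BITS = 32                    # assumed hash width
PREFIX_BITS = 4                   # assumed "predefined part": the top 4 bits
FIRST_NUMBER = 1 << PREFIX_BITS   # sub-hash tables per hash table (16)
SECOND_NUMBER = 1024              # hash chains per sub-hash table (assumed)


def hash_index_value(index_value):
    """Hypothetical 32-bit hash of an index value (first 4 bytes of SHA-256)."""
    digest = hashlib.sha256(index_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def element_position(hash_value):
    """Return (sub-hash table number, hash chain number) for a hash value."""
    rest_bits = HASH_BITS - PREFIX_BITS
    sub_table = hash_value >> rest_bits          # value of the predefined part
    rest = hash_value & ((1 << rest_bits) - 1)   # hash with that part removed
    return sub_table, rest % SECOND_NUMBER       # remaining value mod second number


class SubHashTable:
    """One sub-hash table: its own lock plus a fixed number of hash chains."""

    def __init__(self):
        self.lock = threading.Lock()  # waiting threads queue up on this lock
        self.chains = [[] for _ in range(SECOND_NUMBER)]


class IndexHashTable:
    """Hash table associated with one index key value."""

    def __init__(self):
        self.sub_tables = [SubHashTable() for _ in range(FIRST_NUMBER)]

    def _locate(self, index_value):
        sub_no, chain_no = element_position(hash_index_value(index_value))
        sub = self.sub_tables[sub_no]
        return sub, sub.chains[chain_no]

    def write(self, index_value, detail):
        sub, chain = self._locate(index_value)
        with sub.lock:  # only this sub-hash table is locked
            chain.append((index_value, detail))  # stores a reference, not a copy

    def read(self, index_value):
        sub, chain = self._locate(index_value)
        with sub.lock:
            return [d for k, d in chain if k == index_value]


# One hash table per index key value; both index the same data detail object.
tables = {"user": IndexHashTable(), "ip": IndexHashTable()}
detail = {"user": "alice", "ip": "10.0.0.1"}
for key, table in tables.items():
    table.write(detail[key], detail)

print(element_position(0xF0000005))                # (15, 5)
print(tables["ip"].read("10.0.0.1")[0] is detail)  # True
```

Because the lock belongs to the sub-hash table rather than to the whole table, operations whose hash values fall in different sub-hash tables do not contend with each other in this sketch.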
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010641089.5A CN111858586B (en) 2020-07-06 2020-07-06 Data processing method and device

Publications (2)

Publication Number Publication Date
CN111858586A (en) 2020-10-30
CN111858586B (en) 2024-04-09

Family

ID=73153169

Country Status (1)

Country Link
CN (1) CN111858586B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860712B (en) * 2021-04-13 2024-02-09 深圳前海移联科技有限公司 Block chain-based transaction database construction method, system and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102194002A (en) * 2011-05-25 2011-09-21 中兴通讯股份有限公司 Table entry adding, deleting and searching method of hash table and hash table storage device
CN106096023A (en) * 2016-06-24 2016-11-09 腾讯科技(深圳)有限公司 Method for reading data, method for writing data and data server
CN108153757A (en) * 2016-12-02 2018-06-12 深圳市中兴微电子技术有限公司 Method and apparatus for hash table management
CN109902092A (en) * 2019-02-22 2019-06-18 广州荔支网络技术有限公司 Operating method and device for a data storage system, and mobile terminal
CN110929103A (en) * 2019-11-20 2020-03-27 车智互联(北京)科技有限公司 Method for constructing index for data set, data query method and computing equipment

Similar Documents

Publication Publication Date Title
CN109657174B (en) Method and device for updating data
CN110909022A (en) Data query method and device
CN116303608A (en) Data processing method and device for application service
CN110555068A (en) Data export method and device
CN111858586B (en) Data processing method and device
CN111401684A (en) Task processing method and device
CN113761565B (en) Data desensitization method and device
CN113312355A (en) Data management method and device
CN112948138A (en) Method and device for processing message
CN113722113A (en) Traffic statistic method and device
CN109213815B (en) Method, device, server terminal and readable medium for controlling execution times
CN113760861A (en) Data migration method and device
CN113779122A (en) Method and apparatus for exporting data
CN111737218A (en) File sharing method and device
CN113347052A (en) Method and device for counting user access data through access log
CN117478535B (en) Log storage method and device
CN111459981A (en) Query task processing method, device, server and system
CN110866002A (en) Method and device for processing sub-table data
CN110262756B (en) Method and device for caching data
CN113778909B (en) Method and device for caching data
CN104156358A (en) Method, device and system for reading tables of database in batches
CN113741796B (en) Data persistence method and device for terminal application
CN117478535A (en) Log storage method and device
CN113111275A (en) Method and device for generating short website, electronic equipment and storage medium
US8650153B2 (en) Storing records in databases in a randomized manner to effectively utilize database servers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Liu Zhongdi, Liu Lin, Xu Chao, Zhao Furen
Inventor before: Liu Zhongdi, Xu Chao, Zhao Furen

GR01 Patent grant