CN115729847A - Data storage method and related equipment - Google Patents


Info

Publication number
CN115729847A
Authority
CN
China
Prior art keywords
data
list
coordinate
index
network device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111017533.7A
Other languages
Chinese (zh)
Inventor
李晟如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202111017533.7A priority Critical patent/CN115729847A/en
Publication of CN115729847A publication Critical patent/CN115729847A/en
Pending legal-status Critical Current

Abstract

The application provides a data storage method and related equipment. After a network device obtains a first key value and first data, the network device can determine a first seed function directly from the first key value through a single calculation. The first seed function is used to calculate a first index, and the first index indicates the storage address of the first data. The network device calculates the first index according to the first seed function and stores the first data in a storage list according to the first index. Because the process of determining the first seed function is simple, the overhead of the network device is reduced and the efficiency with which the network device stores the first data is improved.

Description

Data storage method and related equipment
Technical Field
The embodiment of the application relates to the field of communication, in particular to a data storage method and related equipment.
Background
With the continuous increase of the scale of the internet and the continuous increase of network users and contents, the requirement of the technical field of data communication on the performance of storing and searching mass data is higher and higher.
Key-value pair lookup systems are widely used in data communication. For example, a network forwarding device must perform high-speed lookups in a MAC address table, a flow table, and the like, and content distribution networks (CDNs) and data center distributed storage systems must perform large-scale, fast content lookups. In the forwarding plane of a network device, a hash table is a common way to store and look up key-value pairs.
Key-value pair storage in high-performance network devices places extremely high demands on scale and lookup performance. However, the storage resources of the forwarding chip in such a device are limited, and the chip can hardly support complex operations. Designing and implementing a hash table, as a key-value pair storage and lookup method, that has high storage efficiency and simple operations is therefore key to improving the table entry specification and the forwarding performance of the network device. How to improve the efficiency of storing key-value pairs is a problem worth considering.
Disclosure of Invention
The embodiment of the application provides a data storage method, and the process of determining the first seed function by the network equipment is simple, so that the overhead of the network equipment is reduced, and the efficiency of storing the first data by the network equipment is improved.
A first aspect of the present application provides a data storage method, including: a network device obtains a first key value and first data, where the first key value is used to look up the first data; the network device determines a first seed function according to the first key value, where the first seed function is used to calculate a first index, and the first index indicates the storage address of the first data; the network device calculates the first index according to the first seed function; and the network device stores the first data in a storage list according to the first index.
In the application, the network device can determine the first seed function directly from the first key value through a single calculation. The first seed function is used to calculate the first index, and the first index indicates the storage address of the first data. The network device calculates the first index according to the first seed function and stores the first data in the storage list according to the first index. Because the process of determining the first seed function is simple, the overhead of the network device is reduced and the efficiency with which the network device stores the first data is improved.
In a possible implementation manner, the determining, by the network device, of the first seed function according to the first key value includes: the network device calculates the first key value by using a first function to obtain a first row coordinate; and the network device obtains the first seed function according to the first row coordinate.
In this possible implementation manner, a first coordinate includes the first row coordinate and a first column coordinate. In the hash remainder address mapping formulas given in the detailed description, key is the first key value, i is the first row coordinate, and j is the first column coordinate. Substituting the key into the hash function yields the values of i and j, that is, the first coordinate.
In a possible implementation manner, the acquiring, by the network device, the first seed function according to the first row coordinate includes: the network equipment acquires a plurality of key values with the same row coordinate as the first row coordinate according to the first row coordinate; and the network equipment generates the first seed function according to the first key value and the plurality of key values.
In this possible implementation manner, once the network device has obtained all the key values that share the first row coordinate, it can compute a new seed from the new key together with the original keys and thereby obtain a perfect hash seed. That is, the network device generates the first seed function according to the first key value and the plurality of key values. This possible implementation manner improves the realizability of the solution.
In a possible implementation manner, the storage list includes a first list and a second list, and the storing, by the network device, of the first data in the storage list according to the first index includes: the network device stores the first data in the first list according to the first index. The method further includes: the network device sets a target operation result, where the target operation result indicates the result of an operation between the value recorded at the address corresponding to the first coordinate and the value recorded at the address corresponding to a second coordinate; the first coordinate is the coordinate of a first address located in the first list, the second coordinate is the coordinate of a second address located in the second list, and the target operation result indicates that the first data is stored in the first list.
In this possible implementation manner, the network device uses a d-left hash algorithm to calculate the sub-table number d and the hash bucket index i (the first row coordinate). The network device then searches for a perfect hash function (the first seed function) for all keys already in hash bucket HT_d[i] of sub-table HT_d together with the currently inserted key. If a perfect hash function is found, its seed is recorded in the hash bucket; otherwise, the insertion fails. According to the entry index (the first index) calculated by the perfect hash function, each value in the hash bucket is stored at the entry position corresponding to its key. The mapping between the key and the sub-table number d is inserted into an Othello hash to obtain one one-dimensional bitmap. The logical address index to be modified is mapped to the two-dimensional address space by the hash remainder address mapping method to obtain the row-column address coordinate (i, j) (the first coordinate), and the two-dimensional bitmap is written into the corresponding hash bucket according to the row-column address. For example, if the data in fig. 7 is finally stored in hash table HT0, the XOR of the values Bit0 and Bit1 written in the two-dimensional bitmap may be 0 (the target operation result); an XOR value of 0 indicates that the first data is written in hash table HT0. This possible implementation manner provides a specific implementation and improves the realizability of the solution.
In a possible implementation, the method further includes: the network equipment receives a data query instruction, wherein the data query instruction comprises the first key value; and the network equipment queries the first data according to the first key value.
In a possible implementation manner, the storage list includes a first list and a second list, and the querying, by the network device, of the first data according to the first key value includes: the network device obtains a first index and a second index according to the first key value; the network device obtains third data from the first list according to the first index; the network device obtains fourth data from the second list according to the second index; and the network device confirms, according to the target operation result, that the third data is the first data. The target operation result indicates the result of an operation between the value recorded at the address corresponding to the first coordinate and the value recorded at the address corresponding to a second coordinate; the first coordinate is the coordinate of a first address located in the first list, the second coordinate is the coordinate of a second address located in the second list, and the target operation result indicates that the first data is stored in the first list.
In this possible implementation manner, the network device calculates the hash bucket indexes in HT0 and HT1 corresponding to the lookup key: i_0 = Hash0(key) % M and i_1 = Hash1(key) % M. The network device accesses hash buckets HT0[i_0] and HT1[i_1] of the two sub-tables to obtain the bitmaps BM[i_0] and BM[i_1] and the perfect hash seeds stored in the two buckets (written here as Seed[i_0] and Seed[i_1]). Seed[i_0] and Seed[i_1] are used to construct the perfect hash functions of the respective hash buckets, and these perfect hash functions are used to calculate the entry indexes k_0 and k_1 (the first index and the second index) corresponding to the lookup key. The network device accesses the entries corresponding to the key in the two hash buckets to obtain Value_0 = HT0[i_0][k_0] and Value_1 = HT1[i_1][k_1]. It then calculates the indexes of the bit values in the bitmaps of the two hash buckets corresponding to the key: j_0 = Hash0(key) % N and j_1 = Hash1(key) % N. If BM[i_0][j_0] XOR BM[i_1][j_1] = 0, Value_0 is returned; otherwise, Value_1 is returned. This possible implementation manner provides a specific implementation for querying a key-value pair and improves the realizability of the solution.
A second aspect of the present application provides a network device comprising at least one processor, a memory, and a communication interface. The processor is coupled with the memory and the communication interface. The memory is configured to store instructions, the processor is configured to execute the instructions, and the communication interface is configured to communicate with other network devices under control of the processor. The instructions, when executed by the processor, cause the network device to perform the method of the first aspect or any possible implementation of the first aspect.
A third aspect of the present application provides a computer program product storing one or more computer executable instructions that, when executed by a processor, perform the method of the first aspect or any one of the possible implementations of the first aspect.
A fourth aspect of the present application provides a chip, which includes a processor and a communication interface, where the processor is coupled to the communication interface, and the processor is configured to read an instruction to execute the method of the first aspect or any one of the possible implementation manners of the first aspect.
A fifth aspect of the present application provides a network system, where the system includes the network device described in the foregoing first aspect or any one of the possible implementation manners of the first aspect.
According to the technical scheme, the embodiment of the application has the following advantages:
In the application, after the network device obtains the first key value and the first data, the network device can determine the first seed function directly from the first key value through a single calculation. The first seed function is used to calculate a first index, and the first index indicates the storage address of the first data. The network device calculates the first index according to the first seed function and stores the first data in a storage list according to the first index. Because the process of determining the first seed function is simple, the overhead of the network device is reduced and the efficiency with which the network device stores the first data is improved.
Drawings
Fig. 1 is a schematic structural diagram of a network system provided in the present application;
Fig. 2 is a schematic diagram of an application of a data storage method provided in the present application;
Fig. 3 is a schematic diagram of another application of a data storage method provided in the present application;
Fig. 4 is a schematic diagram of a hash remainder address mapping method provided in the present application;
Fig. 5 is a schematic structural diagram of a hash table provided in the present application;
Fig. 6 is a schematic diagram of a keyword search process provided in the present application;
Fig. 7 is a schematic diagram of another application of a data storage method provided in the present application;
Fig. 8 is a schematic diagram of an embodiment of a data storage method provided in the present application;
Fig. 9 is a schematic diagram of an embodiment of a data storage method provided in the present application;
Fig. 10 is a schematic structural diagram of a network device provided in the present application;
Fig. 11 is a schematic structural diagram of another network device provided in the present application.
Detailed Description
The embodiments provided in this application are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. As those skilled in the art will appreciate, with the development of technology and the emergence of new scenarios, the technical solutions provided in the present application are also applicable to similar technical problems.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the examples described herein are capable of being carried out in sequences other than those illustrated or otherwise described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the increasing scale of the internet and the increasing of network users and contents, the requirement of the data communication technology field for the storage and searching performance of mass data is higher and higher.
Key-value pair lookup systems are widely used in data communication. For example, a network forwarding device must perform high-speed lookups in a MAC address table, a flow table, and the like, and content distribution networks (CDNs) and data center distributed storage systems must perform large-scale, fast content lookups. In the forwarding plane of a network device, a hash table is a common way to store and look up key-value pairs.
A basic hash table algorithm suitable for the forwarding plane of a network device generally consists of a hash function and a table. Given a lookup key, the hash function computes the position index of the key in the table. The table consists of several hash buckets. Because the position indexes computed by the hash function can collide, each bucket contains several entries so that a certain degree of hash collision can be tolerated. An entry stores a key and its corresponding value (that is, a key-value pair). To insert a key-value pair, the hash function is applied to the key to obtain the position index of a hash bucket, and the key-value pair is stored in a free entry of that bucket. To look up a key-value pair, the hash function is applied to the key to obtain the position index of a hash bucket, and the key is compared one by one with the keys in the entries of that bucket; if a stored key is identical, the lookup hits and the corresponding value is returned.
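The basic scheme described above can be sketched as follows. This is a minimal illustration, assuming BLAKE2b as a stand-in hash function; the class and method names are hypothetical and do not come from the application.

```python
import hashlib


def bucket_index(key: bytes, num_buckets: int) -> int:
    """Map a key to a bucket index with a stand-in hash function."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "little") % num_buckets


class BucketHashTable:
    def __init__(self, num_buckets: int, entries_per_bucket: int):
        self.num_buckets = num_buckets
        self.entries_per_bucket = entries_per_bucket
        # Each bucket holds up to entries_per_bucket (key, value) entries.
        self.buckets = [[] for _ in range(num_buckets)]

    def insert(self, key: bytes, value) -> bool:
        bucket = self.buckets[bucket_index(key, self.num_buckets)]
        for pos, (stored_key, _) in enumerate(bucket):
            if stored_key == key:          # key already present: overwrite
                bucket[pos] = (key, value)
                return True
        if len(bucket) >= self.entries_per_bucket:
            return False                   # no free entry: insertion fails
        bucket.append((key, value))
        return True

    def lookup(self, key: bytes):
        bucket = self.buckets[bucket_index(key, self.num_buckets)]
        for stored_key, value in bucket:   # compare keys one by one
            if stored_key == key:
                return value
        return None

    def load_rate(self) -> float:
        """Fraction of all entries that are occupied."""
        used = sum(len(bucket) for bucket in self.buckets)
        return used / (self.num_buckets * self.entries_per_bucket)
```

The `insert` failure path is the case in which a key-value pair cannot be stored even though other buckets may still have free entries, which is why the load rate is used as an efficiency measure.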
In the key value pair insertion process, hash collision exists, which may cause some key value pairs to be unsuccessfully inserted into the table, and also may cause some table entries to become empty, that is, the storage space of the hash table may not be fully utilized. Therefore, evaluating the memory utilization efficiency of a hash table algorithm usually takes the load rate as an index, and the higher the load rate is, the more efficient the hash table storage is.
Generally, the higher the load rate of a hash table algorithm, the more complex its key-value pair insertion and lookup algorithms. For example, the d-left hash table algorithm splits one table into d sub-tables. When a key-value pair is inserted, one hash bucket is indexed in each sub-table, and the bucket with the lightest load is selected for insertion. During lookup, the lookup key must be searched in all d hash buckets to obtain the corresponding value. In this way, the d-left hash table algorithm achieves a higher load rate. The cuckoo hash table is another hash table algorithm commonly used in the data plane. When a key-value pair is inserted, two hash buckets are indexed by two independent hash functions. If all entries in both buckets are occupied, a random kick-out operation is performed: the key-value pair of a randomly chosen existing entry in a bucket is replaced by the currently inserted key-value pair, the displaced key-value pair in turn replaces a key-value pair in its other candidate bucket, and the process repeats until the last displaced key-value pair finds a free entry in which it can be stored. In this way, the cuckoo hash algorithm can also achieve a very high load rate. However, the insertion and lookup processes of these improved hash table algorithms are both more complex than those of the basic hash table algorithm.
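As an illustration of the d-left scheme described above, the following sketch inserts into the least loaded of d candidate buckets and probes all d buckets on lookup. It assumes a salted BLAKE2b hash per sub-table and breaks ties toward the leftmost sub-table; these choices and all names are hypothetical.

```python
import hashlib


def sub_hash(key: bytes, table_id: int, num_buckets: int) -> int:
    """Independent stand-in hash per sub-table (salted by table_id)."""
    salt = table_id.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % num_buckets


class DLeftTable:
    def __init__(self, d: int, num_buckets: int, entries_per_bucket: int):
        self.d = d
        self.num_buckets = num_buckets
        self.entries_per_bucket = entries_per_bucket
        self.tables = [[{} for _ in range(num_buckets)] for _ in range(d)]

    def _candidates(self, key: bytes):
        # One candidate bucket in each of the d sub-tables.
        return [self.tables[t][sub_hash(key, t, self.num_buckets)]
                for t in range(self.d)]

    def insert(self, key: bytes, value):
        """Return the chosen sub-table number, or None if insertion fails."""
        buckets = self._candidates(key)
        for t, bucket in enumerate(buckets):
            if key in bucket:              # key already present: overwrite
                bucket[key] = value
                return t
        # min() returns the first minimum, so ties go to the leftmost sub-table.
        t = min(range(self.d), key=lambda n: len(buckets[n]))
        if len(buckets[t]) >= self.entries_per_bucket:
            return None                    # even the lightest bucket is full
        buckets[t][key] = value
        return t

    def lookup(self, key: bytes):
        for bucket in self._candidates(key):   # all d buckets are probed
            if key in bucket:
                return bucket[key]
        return None
```

The lookup loop shows the cost mentioned in the text: without extra information, every one of the d candidate buckets has to be searched.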
In addition, storage resources and I/O resources on the forwarding chip of a network device are scarce. The key-value pair storage and lookup algorithm of the forwarding plane is therefore usually designed around heterogeneous on-chip/off-chip storage as a compromise between storage and throughput. For example, to reduce on-chip memory usage, each entry of the on-chip hash table stores only a short key fingerprint and an address pointing to off-chip data, and space is allocated in off-chip storage for the complete key-value pair data. During lookup, only the key fingerprints in each hash bucket are compared one by one; on a hit, the off-chip storage is accessed to obtain the corresponding value. However, a fingerprint contains only a small amount of key information, and fingerprint collisions between different keys may cause key-value pair insertion to fail. Generally, the shorter the fingerprint bit width, the lower the storage overhead, but the higher the collision probability and the lower the load rate of the hash table.
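The fingerprint scheme described above can be sketched as follows. This is a minimal sketch under assumptions made here: the fingerprint is taken from the upper bits of a BLAKE2b digest, two Python lists stand in for on-chip and off-chip memory, and the lookup additionally verifies the full key after the off-chip access. None of these details come from the application.

```python
import hashlib


class FingerprintTable:
    def __init__(self, num_buckets: int, entries_per_bucket: int, fp_bits: int):
        self.num_buckets = num_buckets
        self.entries_per_bucket = entries_per_bucket
        self.fp_mask = (1 << fp_bits) - 1
        self.on_chip = [[] for _ in range(num_buckets)]  # (fingerprint, pointer)
        self.off_chip = []                               # full (key, value) pairs

    def _locate(self, key: bytes):
        digest = hashlib.blake2b(key, digest_size=8).digest()
        h = int.from_bytes(digest, "little")
        # The remainder picks the bucket; bits 32 and up form the fingerprint.
        return h % self.num_buckets, (h >> 32) & self.fp_mask

    def insert(self, key: bytes, value) -> bool:
        index, fp = self._locate(key)
        bucket = self.on_chip[index]
        if any(stored_fp == fp for stored_fp, _ in bucket):
            return False                   # fingerprint collision: insertion fails
        if len(bucket) >= self.entries_per_bucket:
            return False                   # no free entry
        self.off_chip.append((key, value))
        bucket.append((fp, len(self.off_chip) - 1))
        return True

    def lookup(self, key: bytes):
        index, fp = self._locate(key)
        for stored_fp, pointer in self.on_chip[index]:
            if stored_fp == fp:            # on-chip hit: one off-chip access
                stored_key, value = self.off_chip[pointer]
                return value if stored_key == key else None
        return None
```

With a 1-bit fingerprint and a single bucket, at most two keys can ever be stored, which illustrates how a shorter fingerprint lowers the achievable load rate.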
Generally, key-value pair storage in high-performance network devices places extremely high demands on scale and lookup performance. However, the storage resources of the forwarding chip in such a device are limited, and the chip can hardly support complex operations. Designing and implementing a hash table, as a key-value pair storage and lookup method, that has high storage efficiency and simple operations is therefore key to improving the table entry specification and the forwarding performance of the network device. How to improve the efficiency of storing key-value pairs is a problem worth considering.
In order to solve the problems in the above solutions, the present application provides a data storage method, a related device, and a network system. The process of determining the first seed function by the network equipment is simple, the overhead of the network equipment is reduced, and the efficiency of storing the first data by the network equipment is improved. The network system, the data storage method, and the network device provided by the present application will be respectively described below with reference to the accompanying drawings.
The network system provided by the present application will be described first.
Fig. 1 is a schematic structural diagram of a network system provided in the present application.
In the present application, the network system includes at least a network device 101 as shown in fig. 1.
After the network device 101 obtains the first key value and the first data, the network device 101 can determine a first seed function directly from the first key value through a single calculation. The first seed function is used to calculate a first index, and the first index indicates the storage address of the first data. The network device 101 calculates the first index according to the first seed function and stores the first data in a storage list according to the first index. Because the process of determining the first seed function is simple, the overhead of the network device 101 is reduced and the efficiency with which the network device 101 stores the first data is improved.
The storage method of the data provided by the present application is described based on the network system described in fig. 1.
Fig. 2 is a schematic application diagram of a data storage method provided in the present application.
Referring to fig. 2, an example of a data storage method provided by the present application includes steps 201 to 204.
201. The network device obtains a first key value and first data.
In the present application, the first key value is used to look up the first data, and the first key value may take various forms. For example, given a lookup key (the key value), a hash function can compute the position index of the key in a data table. The data table may consist of several hash buckets and may store a plurality of pieces of first data. Because the position indexes computed by the hash function can collide, each bucket contains several entries so that a certain degree of hash collision can be tolerated. An entry stores a key (the first key value) and its corresponding value (the first data); the first key value and the first data form a key-value pair.
202. The network device determines a first seed function according to the first key value.
In the application, the network device may obtain the first seed function according to the first key value. Wherein the first seed function may be used to calculate a first index indicating a storage address of the first data.
203. The network device calculates a first index according to a first seed function.
204. The network device stores the first data in a storage list according to the first index.
In the present application, data (first data) is typically stored in a hash table. In the data storage method provided by the application, the hash table for storing the first data is composed of two sub-tables HT0 and HT1, and the hash bucket position for inserting the key value pair is calculated by a d-left hash algorithm. The number of the buckets of the two hash tables is M, and each hash bucket comprises 1 Bitmap with the length of N, 1 seed of a perfect hash function and K table entries.
In the application, the network device may directly determine a first seed function through one-time calculation according to the first key value, where the first seed function is used to calculate a first index, the first index is used to indicate a storage address of the first data, and the network device calculates the first index according to the first seed function. The network device stores the first data in a storage list according to the first index. Therefore, the process of determining the first seed function by the network equipment is simple, the overhead of the network equipment is reduced, and the efficiency of storing the first data by the network equipment is improved.
In this application, the network device in step 202 may use a specific implementation manner to determine the first seed function according to the first key value. This implementation manner is described in the following embodiment.
Fig. 3 is a schematic diagram of another application of a data storage method provided in the present application.
301. The network equipment calculates a first key value by using a first function to obtain a first row coordinate.
First, a hash remainder address mapping method is introduced.
Fig. 4 is a schematic diagram of a hash remainder address mapping method provided in the present application.
As shown in fig. 4, the logical address space is a one-dimensional address index space containing B address units. The physical address space is a two-dimensional address index space consisting of M rows by N columns of address units, indexed by row-column coordinates. The parameters satisfy the following relationships:
B = M * N
M and N are coprime.
In this application, the address index k in the logical address space is calculated as follows:
k = Hash(key) % B
The address index (i, j) in the physical address space is calculated as follows:
i = Hash(key) % M
j = Hash(key) % N
Because M and N both divide B, i = k % M and j = k % N; because M and N are coprime, distinct values of k map to distinct coordinates (i, j). By the above hash remainder address mapping method, a one-dimensional address space can therefore be mapped to a two-dimensional address space. Using this method, the Bitmap of length B of an Othello hash can be mapped into an M × N two-dimensional Bitmap, and each row of the resulting two-dimensional Bitmap is stored in one bucket of the hash table, which yields the structure shown in fig. 5.
In this application, the first coordinate includes a first row coordinate and a first column coordinate, where a key used in the formula in the above example is a first key value, i is the first row coordinate, and j is the first column coordinate. The key is substituted into the hash function, and the values of i and j, i.e. the first coordinate, can be obtained respectively.
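The role of the coprimality condition in the hash remainder address mapping can be checked with a short sketch. The function names below are hypothetical; the inverse mapping uses the standard Chinese remainder construction, which the application does not spell out.

```python
from math import gcd


def to_physical(k: int, rows: int, cols: int) -> tuple:
    """Map a logical index k to the row-column coordinate (i, j)."""
    return k % rows, k % cols


def to_logical(i: int, j: int, rows: int, cols: int) -> int:
    """Recover k from (i, j); this is only possible when rows and cols are coprime."""
    if gcd(rows, cols) != 1:
        raise ValueError("rows and cols must be coprime")
    inv = pow(rows, -1, cols)              # modular inverse (Python 3.8+)
    return i + rows * (((j - i) * inv) % cols)
```

For example, with M = 3 and N = 4 the logical index 7 maps to (1, 3), and all 12 logical indexes land on distinct coordinates. With M = 2 and N = 4, which are not coprime, the 8 logical indexes collapse onto only 4 coordinates, so the two-dimensional Bitmap could not represent the one-dimensional one.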
302. The network device obtains a first seed function according to the first row coordinate.
In the application, the value of the first seed function can be obtained after the network device obtains the first row coordinate.
Fig. 5 is a schematic structural diagram of a hash table provided in the present application.
In the key-value pair storage structure shown in fig. 5, each hash table includes a plurality of hash buckets, and each hash bucket contains three types of data: one Bitmap of length N, one seed of a perfect hash function (that is, the first seed function), and the values of several key-value pairs. One hash table contains M hash buckets, where M * N = B and M and N are coprime. The data bit widths of the Bitmap and the seed must satisfy the following condition:
The sum of the bit widths of the Bitmap and the seed is smaller than the bit width of a single data access supported by the hardware.
The hash bucket index is calculated as follows:
i = Hash(key) % M
In this application, with this storage structure and access mode, the Bitmap bit related to a key and the seed of the perfect hash function are stored in the same hash bucket and can be placed at adjacent positions in physical memory, so that both can be read in a single access when the key is looked up. That is, once the first row coordinate i is obtained, the first seed function can be obtained in a single access.
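One way to picture the single-access property is to pack a bucket's Bitmap and seed into one word. The layout below (Bitmap in the low bits, seed above it) and the function names are assumptions made for illustration; the application only states the bit-width condition.

```python
def pack_header(bitmap: int, seed: int, bitmap_bits: int,
                seed_bits: int, access_bits: int) -> int:
    """Pack Bitmap and seed into one word that a single access can return."""
    if bitmap_bits + seed_bits >= access_bits:
        raise ValueError("Bitmap plus seed must be narrower than one access")
    if bitmap >> bitmap_bits or seed >> seed_bits:
        raise ValueError("field value does not fit its bit width")
    return (seed << bitmap_bits) | bitmap


def unpack_header(word: int, bitmap_bits: int, seed_bits: int) -> tuple:
    """Split one fetched word back into (bitmap, seed)."""
    bitmap = word & ((1 << bitmap_bits) - 1)
    seed = (word >> bitmap_bits) & ((1 << seed_bits) - 1)
    return bitmap, seed


def bitmap_bit(bitmap: int, j: int) -> int:
    """Read the bit at column coordinate j of the bucket's Bitmap."""
    return (bitmap >> j) & 1
```

With a 4-bit Bitmap, a 3-bit seed, and an 8-bit access width, `pack_header(0b1011, 5, 4, 3, 8)` yields 91, and unpacking 91 returns the Bitmap 0b1011 and the seed 5 from that one word.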
Fig. 6 is a schematic diagram of a keyword search process provided in the present application.
In the key value pair inserting process, the bit value at each index position in the original Othello hashed one-dimensional Bitmap is written into the bit map of each bucket of the hash table according to the result of the hash remainder address mapping method.
In this application, optionally, when the network device obtains the first seed function according to the first row coordinate, there may be a specific implementation manner, which will be described in the following embodiments.
Fig. 7 is a schematic diagram of another application of a data storage method provided in the present application.
401. The network equipment acquires a plurality of key values with the same row coordinate as the first row coordinate according to the first row coordinate.
For example, as in fig. 5, assuming that the value i of the key value calculated by the hash function is 1, the network device may obtain the original keys corresponding to the 4 values on the right side of Seed in the first row. That is, the network device obtains a plurality of key values having the same row coordinate as the first row coordinate according to the first row coordinate.
402. The network device generates a first seed function according to the first key value and the plurality of key values.
For example, as shown in fig. 5, if the network device obtains all the key values, the network device may calculate a new seed according to the new key and the original key together, so as to obtain a perfect hash seed, that is, the network device generates a first seed function according to the first key value and the plurality of key values.
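Steps 401 and 402 can be sketched as a brute-force seed search. This is only an illustration under stated assumptions: the application does not specify how the seed is searched, so the linear scan, the entry count `K`, the bound `max_seed`, and the helper names `slot` and `find_perfect_seed` are hypothetical.

```python
import hashlib
from typing import Optional

K = 4  # table entries per hash bucket (assumed value)


def slot(key: bytes, seed: int, k: int = K) -> int:
    """Seeded hash of `key`, reduced to a table entry index in [0, k)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % k


def find_perfect_seed(keys: list[bytes], k: int = K,
                      max_seed: int = 1 << 16) -> Optional[int]:
    """Return a seed under which all keys occupy distinct slots, or None."""
    if len(keys) > k:
        return None  # more keys than entries: no collision-free mapping exists
    for seed in range(max_seed):
        if len({slot(key, seed, k) for key in keys}) == len(keys):
            return seed
    return None


# Step 401: keys already stored in the bucket with the same row coordinate.
existing = [b"key-a", b"key-b", b"key-c"]
# Step 402: search one seed that is perfect for the old keys plus the new key.
seed = find_perfect_seed(existing + [b"key-new"])
```

If no seed is found, the insertion into this bucket fails, which matches the failure case described for the insertion process.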
Fig. 8 is a schematic diagram of an embodiment of a data storage method provided in the present application.
In this application, the network device stores the first data in the storage list according to the first index, and this specific storage process will be described in detail in the following embodiments.
The network device stores the first data in a first list according to the first index.
The network device sets a target operation result.
In this application, the storage list includes a first list and a second list. The target operation result is used to indicate an operation result between a value recorded at the address corresponding to the first coordinate and a value recorded at the address corresponding to the second coordinate. The second coordinate is the coordinate of a second address, the second address is located in the second list, and the first address is located in the first list. The target operation result is used to indicate that the first data is stored in the first list.
Illustratively, the key-value pair insertion process is described below in conjunction with FIG. 8. The network device calculates the sub-table number d and the hash bucket index i (the first row coordinate) by using a d-left hash algorithm. The network device then searches for a perfect hash function (the first seed function) over all keywords already corresponding to the hash bucket HT_d[i] of the sub-table HT_d together with the keyword currently being inserted. If a perfect hash function can be found, its seed is recorded in the hash bucket; otherwise, the insertion fails. Each value in the hash bucket is stored at the table entry position of its keyword, according to the table entry index (the first index) calculated by the perfect hash function. The mapping between the keyword key and the sub-table number d is inserted into the Othello hash to obtain one one-dimensional bitmap. Each logical address index to be modified is mapped to a two-dimensional address space by the hash remainder address mapping method to obtain row-column address coordinates (i, j) (the first coordinate), and the two-dimensional bitmap is written into the corresponding hash buckets according to the row-column addresses. For example, assuming that the data in FIG. 7 is finally stored in the hash table HT0, the XOR of the values Bit0 and Bit1 written in the two-dimensional bitmap may be 0 (the target operation result); an XOR value of 0 indicates that the first data is written in the hash table HT0.
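The way the sub-table number d can be encoded as the XOR of two bitmap bits may be sketched as below. This is a simplified illustration in the spirit of Othello hashing, not the application's implementation: the class `XorBitmaps`, the parameters `M` and `N`, the BLAKE2b stand-ins for Hash0 and Hash1, and the choice to reject a conflicting insertion instead of rehashing are all assumptions.

```python
import hashlib
from collections import defaultdict

M, N = 5, 8   # buckets per sub-table and bitmap bits per bucket (assumed)


def h(key: bytes, which: int) -> int:
    """Stand-in for Hash0 / Hash1, selected by `which`."""
    digest = hashlib.blake2b(key, digest_size=8, salt=bytes([which])).digest()
    return int.from_bytes(digest, "little")


def coord(key: bytes, which: int) -> tuple[int, int, int]:
    """Bit position (sub-table, row i, column j): i = Hash % M, j = Hash % N."""
    v = h(key, which)
    return which, v % M, v % N


class XorBitmaps:
    def __init__(self) -> None:
        # bits[t][i][j]: bit j of the bitmap in bucket i of sub-table t.
        self.bits = [[[0] * N for _ in range(M)] for _ in range(2)]
        self.adj = defaultdict(list)  # bit position -> positions linked by a key

    def _bit(self, node: tuple[int, int, int]) -> int:
        t, i, j = node
        return self.bits[t][i][j]

    def _component(self, start):
        """All bit positions reachable from `start` through inserted keys."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in self.adj[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def record(self, key: bytes, d: int) -> bool:
        """Make bit(u) XOR bit(v) equal d without disturbing earlier keys."""
        u, v = coord(key, 0), coord(key, 1)
        component = self._component(v)
        if (self._bit(u) ^ self._bit(v)) != d:
            if u in component:
                return False  # cycle conflict; a full scheme would rehash
            # Flipping every bit of v's component keeps the XOR of all
            # earlier keys unchanged and toggles the XOR for this key.
            for t, i, j in component:
                self.bits[t][i][j] ^= 1
        self.adj[u].append(v)
        self.adj[v].append(u)
        return True

    def sub_table(self, key: bytes) -> int:
        """Decode d as the XOR of the two bits addressed by `key`."""
        return self._bit(coord(key, 0)) ^ self._bit(coord(key, 1))
```

For a key stored in HT0, `record(key, 0)` leaves the two bits with XOR 0, which corresponds to the target operation result in the example above.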
In this application, the network device queries the first data in the storage list according to the first key value, and this specific query process will be described in detail in the following embodiments.
The method comprises the steps that network equipment receives a data query instruction, wherein the data query instruction comprises a first key value;
the network equipment acquires the first index and the second index according to the first key value.
And the network equipment acquires the third data from the first list according to the first index.
And the network equipment acquires the fourth data from the second list according to the second index.
And the network equipment confirms that the third data is the first data according to the target operation result.
In this application, the target operation result is used to indicate an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to the second coordinate, the second coordinate is a coordinate of the second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used to indicate that the first data is stored in the first list.
Illustratively, the key-value pair lookup process is described below in conjunction with FIG. 8. The network device calculates the hash bucket indexes in HT0 and HT1 corresponding to the search keyword: i_0 = Hash0(key) % M and i_1 = Hash1(key) % M. The hash buckets HT0[i_0] and HT1[i_1] of the two sub-tables are accessed separately to obtain the bitmaps BM[i_0] and BM[i_1] and the perfect hash seeds Seed_(i_0) and Seed_(i_1). Seed_(i_0) and Seed_(i_1) are used respectively to construct the perfect hash function of the corresponding hash bucket, and the perfect hash functions are used to calculate the table entry indexes k_0 and k_1 (the first index and the second index) corresponding to the search keyword. The table entries corresponding to the keyword in the two hash buckets are accessed to obtain Value_0 = HT0[i_0][k_0] and Value_1 = HT1[i_1][k_1]. The indexes of the bit values in the bitmaps of the two hash buckets corresponding to the keyword are calculated: j_0 = Hash0(key) % N and j_1 = Hash1(key) % N. If BM[i_0][j_0] XOR BM[i_1][j_1] = 0, Value_0 is returned; otherwise, Value_1 is returned.
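The lookup steps can be sketched as follows. This is a minimal illustration under stated assumptions: the `Bucket` layout, the sizes `M`, `N` and `K`, the BLAKE2b-based stand-ins for Hash0, Hash1 and the seeded perfect hash function, and the hand-built table contents are all hypothetical, and the XOR test assumes that a result of 0 selects HT0 as in the insertion example.

```python
import hashlib
from dataclasses import dataclass, field

M, N, K = 8, 16, 4   # buckets, bitmap bits and entries per bucket (assumed)


def _hash(key: bytes, salt: int, person: bytes) -> int:
    digest = hashlib.blake2b(key, digest_size=8,
                             salt=salt.to_bytes(8, "little"),
                             person=person).digest()
    return int.from_bytes(digest, "little")


def bucket_hash(key: bytes, which: int) -> int:
    """Stand-in for Hash0 (which=0) and Hash1 (which=1)."""
    return _hash(key, which, b"bucket")


def perfect_hash(key: bytes, seed: int) -> int:
    """Stand-in for the perfect hash function built from a bucket's seed."""
    return _hash(key, seed, b"perfect") % K


@dataclass
class Bucket:
    bitmap: list = field(default_factory=lambda: [0] * N)
    seed: int = 0
    entries: list = field(default_factory=lambda: [None] * K)


def lookup(key: bytes, ht0: list, ht1: list):
    h0, h1 = bucket_hash(key, 0), bucket_hash(key, 1)
    b0, b1 = ht0[h0 % M], ht1[h1 % M]                 # buckets i_0 and i_1
    value0 = b0.entries[perfect_hash(key, b0.seed)]   # entry k_0
    value1 = b1.entries[perfect_hash(key, b1.seed)]   # entry k_1
    bit0, bit1 = b0.bitmap[h0 % N], b1.bitmap[h1 % N]  # bits j_0 and j_1
    return value0 if (bit0 ^ bit1) == 0 else value1


# Hand-built state: one key stored in HT1, so its two bits must XOR to 1.
ht0 = [Bucket() for _ in range(M)]
ht1 = [Bucket() for _ in range(M)]
key = b"10.0.0.1"
h1 = bucket_hash(key, 1)
bucket = ht1[h1 % M]
bucket.seed = 5
bucket.entries[perfect_hash(key, bucket.seed)] = "port-7"
bucket.bitmap[h1 % N] = 1
result = lookup(key, ht0, ht1)
```

Note that this sketch, like the described lookup, selects between the two candidate values only by the XOR of the two bits; it does not by itself verify that the returned entry belongs to the queried keyword.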
In this application, optionally, another hash algorithm may be used to store the key-value pairs, which will be described in detail below.
Fig. 9 is a schematic diagram of an embodiment of a data storage method provided in the present application.
The technical solution of the application can also be applied in the manner shown in FIG. 9. In this second embodiment, only one hash table HT is included, and the hash bucket position at which a key-value pair is inserted is calculated by a Cuckoo hash algorithm. The hash table has M buckets, and each hash bucket includes one array of length N in which each element is a 2-bit value, one seed of a perfect hash function, and K table entries. The values in the array are generated by the Coloring Embedder algorithm.
Illustratively, the key-value pair insertion process is described below in conjunction with FIG. 9. A Cuckoo hash algorithm is used to calculate the hash bucket index i and the index d of the hash function used to calculate that bucket index. A perfect hash function is searched for over all keywords corresponding to the hash bucket HT_d[i] together with the keyword currently being inserted. If a perfect hash function can be found, its seed is recorded in the hash bucket; otherwise, the insertion fails. Each value in the hash bucket is stored at the table entry position of its keyword, according to the table entry index calculated by the perfect hash function. The mapping between the keyword key and the sub-table number d is inserted into the Coloring Embedder to obtain one one-dimensional array. Each logical address index to be modified is mapped to a two-dimensional address space by the hash remainder address mapping method to obtain row-column address coordinates (i, j), and the two-dimensional array is written into the corresponding hash buckets according to the row-column addresses.
Illustratively, the key-value pair lookup process is described below in conjunction with FIG. 9. Two hash bucket indexes i_0 = Hash0(key) % M and i_1 = Hash1(key) % M are calculated by using hash function 0 and hash function 1, respectively. The two hash buckets HT0[i_0] and HT1[i_1] are accessed separately to obtain the 2-bit arrays BM[i_0] and BM[i_1] and the perfect hash seeds Seed_(i_0) and Seed_(i_1). Seed_(i_0) and Seed_(i_1) are used to construct the perfect hash function of the corresponding hash bucket, and the perfect hash functions are used to calculate the table entry indexes k_0 and k_1 corresponding to the search keyword. The table entries corresponding to the keyword in the two hash buckets are accessed to obtain Value_0 = HT0[i_0][k_0] and Value_1 = HT1[i_1][k_1]. The indexes into the 2-bit arrays of the two hash buckets corresponding to the keyword are calculated: j_0 = Hash0(key) % N and j_1 = Hash1(key) % N. If the condition given by the formula in the original figure (BDA0003240459550000091) is satisfied, Value_0 is returned; otherwise, Value_1 is returned.
The foregoing examples provide different embodiments of a data storage method, and a network device 50 is provided below, as shown in fig. 10, where the network device 50 is configured to execute steps executed by the network device in the foregoing examples, and the executing steps and corresponding beneficial effects are specifically understood with reference to the foregoing corresponding examples, which are not described herein again, and the network device 50 includes:
an obtaining unit 501, configured to obtain a first key value and first data, where the first key value is used to search for the first data;
a determining unit 502, configured to determine a first seed function according to the first key value, where the first seed function is used to calculate a first index, and the first index is used to indicate a storage address of the first data;
a calculating unit 503, configured to calculate the first index according to the first seed function;
a storage unit 504, configured to store the first data in a storage list according to the first index.
In one possible implementation, the first coordinate includes a first row coordinate and a first column coordinate,
the calculating unit is configured to:
calculate the first key value by using a first function to obtain the first row coordinate;
and acquire the first seed function according to the first row coordinate.
In a possible implementation manner, the obtaining unit is configured to obtain, according to the first row coordinate, a plurality of key values having a same row coordinate as the first row coordinate;
a generating unit, configured to generate the first seed function according to the first key value and the plurality of key values.
In one possible implementation, the storage list includes a first list and a second list,
the storage unit is configured to store the first data in the first list according to the first index;
a setting unit is configured to set a target operation result, where the target operation result is used to indicate an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used to indicate that the first data is stored in the first list.
In one possible implementation, the network device further includes:
a receiving unit, configured to receive a data query instruction, where the data query instruction includes the first key value;
and the query unit is used for querying the first data according to the first key value.
In a possible implementation manner, the storage list includes a first list and a second list, and for querying the first data according to the first key value:
the acquisition unit is used for:
acquiring a first index and a second index according to a first key value;
acquiring third data from the first list according to the first index;
acquiring fourth data from the second list according to the second index;
the confirming unit is configured to confirm that the third data is the first data according to the target operation result, where the target operation result is used to indicate an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used to indicate that the first data is stored in the first list.
It should be noted that, because the information interaction and the execution processes between the modules of the network device 50 are based on the same concept as the method examples of this application, their details are consistent with the method steps above, and reference may be made to the description in the above method examples.
Referring to fig. 11, a schematic structural diagram of a network device is provided for the present application, where the network device 600 includes: a processor 602, a communication interface 603, and a memory 601. Optionally, a bus 604 may be included. Wherein, the communication interface 603, the processor 602 and the memory 601 may be connected to each other through a bus 604; the bus 604 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 11, but this is not intended to represent only one bus or type of bus. The network device 600 may implement the functionality of any of the network devices in the example shown in fig. 10. The processor 602 and the communication interface 603 may perform operations corresponding to the network device in the above method examples.
The following specifically describes each constituent element of the network device with reference to fig. 11:
the memory 601 may be a volatile memory (volatile memory), such as a random-access memory (RAM); or a non-volatile memory (non-volatile memory), such as a read-only memory (ROM), a flash memory (flash memory), a Hard Disk Drive (HDD) or a solid-state drive (SSD); or a combination of the above types of memories, for storing program code, configuration files, or other content that may implement the methods of the present application.
The processor 602 is a control center of the network device, and may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the examples provided in this application, such as: one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs).
The communication interface 603 is used for communication with other network devices.
The processor 602 may perform operations performed by any of the network devices in the foregoing examples shown in fig. 10, which are not described herein again in detail.
It should be noted that, because the information interaction and the execution processes between the modules of the network device 600 are based on the same concept as the method examples of this application, their details are consistent with the method steps above, and reference may be made to the description in the above method examples.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, devices and units may refer to the corresponding processes in the foregoing examples, and are not described herein again.
In the several examples provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus examples are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the elements may be selected according to actual needs to achieve the purpose of the present example.
In addition, functional units in the examples of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the present application, in essence or part of the technical solutions contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the examples of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
The above-mentioned embodiments, objects, technical solutions and advantages of the present invention are further described in detail, it should be understood that various examples can be combined, and the above-mentioned embodiments are only examples of the present invention and should not be used to limit the scope of the present invention, and any combination, modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the present invention. The above examples are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing examples, those of ordinary skill in the art will appreciate that: the technical solutions described in the foregoing examples can be modified, or some technical features can be equivalently replaced; such modifications or substitutions do not depart from the scope of the exemplary embodiments of the present application.

Claims (13)

1. A method of storing data, comprising:
the method comprises the steps that network equipment obtains a first key value and first data, wherein the first key value is used for searching the first data;
the network equipment determines a first seed function according to the first key value, wherein the first seed function is used for calculating a first index, and the first index is used for indicating a storage address of the first data;
the network equipment calculates the first index according to the first seed function;
and the network equipment stores the first data in a storage list according to the first index.
2. The data storage method of claim 1, wherein the first coordinate comprises a first row coordinate and a first column coordinate, and wherein the network equipment determining the first seed function according to the first key value comprises:
the network equipment calculates the first key value by using a first function to obtain the first row coordinate;
and the network equipment acquires the first seed function according to the first row coordinate.
3. The data storage method of claim 2, wherein the network device obtains the first seed function according to the first row coordinate, and wherein the obtaining the first seed function comprises:
the network equipment acquires a plurality of key values with the same row coordinate as the first row coordinate according to the first row coordinate;
and the network equipment generates the first seed function according to the first key value and the plurality of key values.
4. The data storage method according to any one of claims 1 to 3, wherein the storage list comprises a first list and a second list, and the network device stores the first data in the storage list according to the first index, including:
the network device stores the first data in the first list according to the first index;
the method further comprises the following steps:
the network device sets a target operation result, wherein the target operation result is used for indicating an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used for indicating that the first data is stored in the first list.
5. The data storage method of any of claims 1 to 4, wherein the method further comprises:
the network equipment receives a data query instruction, wherein the data query instruction comprises the first key value;
and the network equipment queries the first data according to the first key value.
6. The data storage method of claim 5, wherein the storage list comprises a first list and a second list, and wherein the querying, by the network device, the first data according to the first key value comprises:
the network equipment acquires a first index and a second index according to a first key value;
the network equipment acquires third data from the first list according to the first index;
the network equipment acquires fourth data from the second list according to the second index;
the network device confirms that the third data is the first data according to the target operation result, the target operation result is used for indicating an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used for indicating that the first data is stored in the first list.
7. A network device, comprising:
an obtaining unit, configured to obtain a first key value and first data, where the first key value is used to search for the first data;
a determining unit, configured to determine a first seed function according to the first key value, where the first seed function is used to calculate a first index, and the first index is used to indicate a storage address of the first data;
a calculating unit, configured to calculate the first index according to the first seed function;
and the storage unit is used for storing the first data in a storage list according to the first index.
8. The network device of claim 7, wherein the first coordinate comprises a first row coordinate and a first column coordinate,
the calculating unit is configured to:
calculate the first key value by using a first function to obtain the first row coordinate;
and acquire the first seed function according to the first row coordinate.
9. The network device of claim 8,
the acquiring unit is used for acquiring a plurality of key values with the same row coordinate as the first row coordinate according to the first row coordinate;
a generating unit, configured to generate the first seed function according to the first key value and the plurality of key values.
10. The network device according to any one of claims 7 to 9, wherein the storage list comprises a first list and a second list,
the storage unit is configured to store the first data in the first list according to the first index;
a setting unit is configured to set a target operation result, where the target operation result is used to indicate an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used to indicate that the first data is stored in the first list.
11. The network device of any one of claims 7 to 10, further comprising:
a receiving unit, configured to receive a data query instruction, where the data query instruction includes the first key value;
and the query unit is used for querying the first data according to the first key value.
12. The network device of claim 11, wherein the storage list comprises a first list and a second list, and wherein querying, by the network device, the first data according to the first key value comprises:
the acquisition unit is used for:
acquiring a first index and a second index according to a first key value;
acquiring third data from the first list according to the first index;
acquiring fourth data from the second list according to the second index;
a confirming unit is configured to confirm that the third data is the first data according to the target operation result, where the target operation result is used to indicate an operation result between a value recorded by an address corresponding to the first coordinate and a value recorded by an address corresponding to a second coordinate, the second coordinate is a coordinate of a second address, the second address is located in the second list, the first address is located in the first list, and the target operation result is used to indicate that the first data is stored in the first list.
13. A network device, comprising:
a processor, a memory, and a communication interface;
the processor is connected with the memory and the communication interface;
the communication interface is to:
sending the first information to the second network equipment;
receiving a data message;
the processor is configured to read the instructions stored in the memory and cause the network device to perform the method of any of claims 1 to 6.
CN202111017533.7A 2021-08-31 2021-08-31 Data storage method and related equipment Pending CN115729847A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111017533.7A CN115729847A (en) 2021-08-31 2021-08-31 Data storage method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111017533.7A CN115729847A (en) 2021-08-31 2021-08-31 Data storage method and related equipment

Publications (1)

Publication Number Publication Date
CN115729847A true CN115729847A (en) 2023-03-03

Family

ID=85291814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111017533.7A Pending CN115729847A (en) 2021-08-31 2021-08-31 Data storage method and related equipment

Country Status (1)

Country Link
CN (1) CN115729847A (en)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination