CN117539408B - Integrated index system for memory and calculation and key value pair memory system - Google Patents

Integrated index system for memory and calculation and key value pair memory system

Info

Publication number
CN117539408B
Authority
CN
China
Prior art keywords
hash
key
array
value
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410030995.XA
Other languages
Chinese (zh)
Other versions
CN117539408A (en
Inventor
吴兵
冯丹
童薇
刘景宁
罗弘杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202410030995.XA priority Critical patent/CN117539408B/en
Publication of CN117539408A publication Critical patent/CN117539408A/en
Application granted granted Critical
Publication of CN117539408B publication Critical patent/CN117539408B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C16/00Erasable programmable read-only memories
    • G11C16/02Erasable programmable read-only memories electrically programmable
    • G11C16/06Auxiliary circuits, e.g. for writing into memory
    • G11C16/24Bit-line control circuits

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a storage-computation integrated index system and a key-value pair storage system, belonging to the intersecting field of storage and computation. The index system comprises a controller and a plurality of array clusters operable in parallel; each array cluster includes a plurality of content-addressable cross-point arrays; in each array, each row stores one key and a valid flag bit; the controller performs index operations on the key-value pair data stored in the arrays; the index operations include an insertion operation; the insertion operation includes: for key-value pair data [k_i, v_i] to be inserted, searching the array cluster mapped by the hash bucket B_i corresponding to key k_i for a row that does not store a valid key; if the search succeeds, storing key k_i in the allocated row, setting the valid flag bit to valid, and writing value v_i into an array row; otherwise, returning an operation failure. The invention completes index operations on key-value pairs inside the memory, improves the parallelism of index operations, and reduces index operation latency.

Description

Integrated index system for memory and calculation and key value pair memory system
Technical Field
The invention belongs to the intersecting field of storage and computation, and particularly relates to a storage-computation integrated index system and a key-value pair storage system.
Background
Key-value storage systems organize data as simple key-value pairs, each data item having a unique key and associated data (value), and are widely used in applications that require high performance, low latency, scalability, and high concurrency. The data index is an important component of a high-performance key-value storage system: it provides efficient data operations and supports fast insertion, query, update, and deletion of the data (value) corresponding to a given key. Index structures differ in design and each performs data operations in its own way, but they generally fall into two broad categories: 1) tree index structures, in which the index is organized as a multi-level tree that is traversed level by level from the root to a leaf node to operate on the target data, such as the B+ tree, log-structured merge tree, skip list, and radix tree; 2) hash index structures, which apply a hash function to a given key to obtain an index value directly and then locate the data corresponding to that index value in a linear table to complete the operation. Tree indexes must traverse nodes in a multi-level tree, so their query time complexity is typically O(log N), where N is the amount of stored data. Hash indexes, thanks to their flat storage structure (a linear table), can in theory provide queries in O(1) time. However, hash indexes are inevitably affected by hash collisions, in which the hash function maps several keys to the same position in the linear table; whatever collision-handling method is adopted, such as separate chaining or linear probing, handling the collision introduces additional data accesses and comparisons.
Both the node traversal of tree indexes and the collision handling of hash indexes cause more frequent comparisons and accesses, which reduces index operation efficiency and degrades the performance of key-value pair storage systems. Although researchers have proposed novel index structures such as learned indexes, these are generally still organized as trees and still suffer reduced index operation efficiency caused by multiple rounds of computation, comparison, and access, so the performance of key-value pair storage systems needs further improvement.
Disclosure of Invention
In view of the shortcomings of the prior art and the need for improvement, the invention provides a storage-computation integrated index system and a key-value pair storage system, which aim to complete index operations on key-value pair data inside the memory by using the parallel data comparison capability of cross-point arrays, thereby improving the parallelism of index operations and reducing their latency.
To achieve the above object, according to one aspect of the present invention, there is provided a storage-computation integrated index system including: a controller and a plurality of array clusters;
each array cluster comprises a plurality of arrays; each array is a cross-point array composed of resistive memory cells and is content addressable; in the array, each row stores one key and a valid flag bit; a valid flag bit value of f_v indicates that the corresponding row stores a valid key, and a value of f_i indicates that the corresponding row does not store a valid key; f_v ≠ f_i; initially, the valid flag bit of every row is f_i; the plurality of array clusters can operate in parallel;
the controller is configured to perform index operations on the key-value pair data stored in the arrays; the index operations include an insertion operation; the insertion operation includes:
(I1) for key-value pair data [k_i, v_i] to be inserted, search the array cluster mapped by hash bucket B_i for a row whose valid flag bit is f_i; if the search succeeds, go to step (I2); otherwise, return an operation failure;
hash bucket B_i is the hash bucket in the hash table corresponding to the hash value of key k_i; in the hash table, each hash bucket is mapped to one array cluster and stores the addresses of the arrays in that array cluster;
(I2) store key k_i of the key-value pair data [k_i, v_i] in the allocated row, set the valid flag bit to f_v, and write value v_i into an array row.
Further, the index operations further include a query operation; the query operation includes:
(S1) for a key k_s to be queried, sequentially read the array addresses in hash bucket B_s and search the corresponding arrays for a row L_s that stores key k_s and whose valid flag bit is f_v; if the search succeeds, go to step (S2); otherwise, return that the data does not exist;
hash bucket B_s is the hash bucket in the hash table corresponding to the hash value of key k_s;
(S2) read value v_s from the value-storage row corresponding to row L_s and return it.
Further, the index operations further include an update operation; the update operation includes:
(U1) for key-value pair data [k_u, v_u] to be updated, sequentially read the array addresses in hash bucket B_u and search the corresponding arrays for a row L_u that stores key k_u and whose valid flag bit is f_v; if the search succeeds, go to step (U2); otherwise, return that the data does not exist;
hash bucket B_u is the hash bucket in the hash table corresponding to the hash value of key k_u;
(U2) update the value stored in the value-storage row corresponding to row L_u to v_u.
Further, the index operations further include a delete operation; the delete operation includes:
(D1) for a key k_d to be deleted, sequentially read the array addresses in hash bucket B_d and search the corresponding arrays for a row L_d that stores key k_d and whose valid flag bit is f_v; if the search succeeds, go to step (D2); otherwise, return that the data does not exist;
hash bucket B_d is the hash bucket in the hash table corresponding to the hash value of key k_d;
(D2) set the valid flag bit of row L_d to f_i.
Further, the controller is further configured to perform a hash expansion operation; the hash expansion operation comprises the following steps:
(E1) expand the capacity of the hash table to 2N, and determine the correspondence between each hash bucket and the array clusters; N is the capacity of the current hash table, and N is a power of 2;
(E2) for the valid keys stored in each array cluster, determine their corresponding hash buckets in the new hash table, and move each valid key to the array cluster mapped by its corresponding hash bucket;
for any valid key k, denote by i the sequence number of the hash bucket in the original hash table that corresponds to its hash value, and determine whether the (log_2 N)-th bit of the hash value is 0; if so, the hash bucket sequence number of valid key k in the new hash table is i; otherwise, it is i+N.
Further, the insertion operation further includes:
while key k_i is stored in the allocated row, the high (H - log_2 N) bits of the hash value of key k_i are stored as hash valid bits in an array row; for keys stored in the same array, their hash valid bits are stored in the same array and aligned by column; H denotes the length of the hash value;
in the hash expansion operation, the (log_2 N)-th bit of the hash value of a valid key is obtained by:
reading, from the array, the column in which the hash valid bit corresponding to the (log_2 N)-th bit of the hash value is located.
Further, in the hash table, the size of each hash bucket is the same as the size of one CPU cache line.
Further, in the hash table, different hash buckets map to different array clusters.
According to still another aspect of the present invention, there is provided a key value pair storage system including the above-mentioned integrated index system of the present invention.
In general, through the above technical solutions conceived by the present invention, the following beneficial effects can be obtained:
(1) The storage-computation integrated index system provided by the invention is realized through the cooperation of a hash table and cross-point arrays composed of resistive memory cells, and performs in-situ index operations using the content-addressable function (parallel data comparison function) of the arrays: the key of a key-value pair is mapped to a hash bucket by the hash function, and the specific operation on the data is then completed by the in-situ index operation of the array. Overall, the invention can effectively improve index operation performance.
(2) In a preferred technical scheme of the invention, during hash expansion, the destination of the data is determined from just the corresponding bit of the hash value, based on the relationship between the capacities of the hash table before and after expansion. The destinations of all rows in an array can therefore be determined inside the index system and the data movement completed in memory without CPU participation, which reduces data movement between the CPU and the memory, improves the efficiency of hash expansion, reduces the impact of hash expansion on the index function, and further improves the overall performance of the index system.
(3) In a further preferred technical scheme of the invention, hash valid bits are determined in advance to indicate the destinations of the valid keys in an array at each hash expansion, and are stored in the array aligned by column. During hash expansion, the destinations of the valid keys in the array can be determined by reading in parallel, according to the capacity of the current hash table, the column of the corresponding hash valid bit from the array. The whole process is completed inside the index system, the CPU need not take part in the data movement, and hash expansion efficiency is further improved.
(4) In a further preferred embodiment of the present invention, in the hash table, the capacity of each hash bucket is the same as one CPU cache line, so that the cache utilization of the CPU can be improved.
(5) In a further preferred scheme of the invention, different hash buckets are mapped to different array clusters in the hash table, so that parallelism of a hardware bottom layer can be maximized, and overall index operation performance is further improved.
Drawings
Fig. 1 is a schematic diagram of a conventional resistive random access memory.
Fig. 2 is a schematic diagram of a conventional cross-point array structure composed of a resistive random access memory.
FIG. 3 is a schematic diagram of a content addressable process of an array in accordance with an embodiment of the present invention.
Fig. 4 is a schematic diagram of the hardware architecture of the storage-computation integrated index system according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a hardware and software collaborative hash index according to an embodiment of the present invention.
Fig. 6 is a schematic diagram of an inserting operation according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of a query operation according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating an update operation according to an embodiment of the present invention.
Fig. 9 is a schematic diagram of a deletion operation in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
In the present invention, the terms "first," "second," and the like in the description and in the drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
To solve the technical problem that existing index schemes for key-value pair data cause frequent comparisons and accesses and thereby reduce index operation efficiency, the invention provides a storage-computation integrated index system and a key-value pair storage system.
Before explaining the technical scheme of the invention in detail, a brief description is given below of a resistive random access memory cell and a cross-point array structure formed by the resistive random access memory cell.
The structure of the resistive memory cell is shown in fig. 1; its basic structure consists of a resistive switching material, a bottom electrode, and a top electrode. The resistive switching material may be a metal oxide or the like, whose resistance can switch reversibly between a high-resistance state and a low-resistance state under an external electric field, thereby completing the writing of data. Data is read by applying a small voltage and measuring the resulting current or the converted voltage, without changing the state of the cell.
As shown in fig. 2, in a cross-point array composed of resistive memory cells, computation voltages can be applied to the word lines (rows) of the array, and reading the currents flowing on the bit lines (columns) yields the result of the matrix-vector multiplication between the input voltage vector and the conductance matrix stored in the array. The matrix-vector multiplication implemented this way has a time complexity of only O(1), so it has been widely studied for accelerating applications dominated by matrix-vector multiplication, such as neural networks and graph computation. The cross-point array can also perform parallel data comparison, also known as the content-addressable function, by which the rows stored in the array that are identical to the input data can be determined. FIG. 3 is a schematic diagram of one possible content-addressable scheme. In this implementation, two resistive memory cells form one content-addressable cell, and the two corresponding columns form one input comparison unit. The content stored in a content-addressable cell is determined by the mapping shown at the bottom of fig. 3; for example, if the logical value to be stored is 1, the data actually stored in the content-addressable cell is (1, 0). With the help of the input registers, each 1-bit input is converted into inputs to two bit lines; the conversion rule is likewise determined by the mapping at the bottom of fig. 3; for example, a logic value "0" is converted into (0, 1) at the input comparison unit. Driving more bit lines allows multi-bit input data to be compared in parallel with the data stored in every row. If the data stored in a row of the array matches (is the same as) the given input data, the sense amplifier (SA) outputs a "1" signal; otherwise it outputs a "0".
It should be noted that fig. 3 shows only one possible way of implementing the content-addressable function in a resistive cross-point array; other implementations, other nonvolatile memories, and even conventional memories such as DRAM and SRAM may also be used.
In practical applications, the key-value pair data is generally represented by [ key, value ], where key represents a key, value represents a value corresponding to the key, and in the following embodiments, a similar representation manner will be adopted, specifically, [ k, v ] is used to represent the key-value pair data, and k and v represent a key and a value, respectively.
The following are examples.
Example 1:
a storage-computation integrated index system, as shown in fig. 4, comprising: a controller and a plurality of array clusters;
multiple array clusters may operate in parallel; each array cluster includes a plurality of arrays; the array is a cross-point array consisting of resistive memory cells and is content addressable; in the array, each row is used for storing one key and a valid flag bit; the valid flag bit is used for indicating whether the corresponding row stores a valid key; in this embodiment, when the value of the valid flag bit is 1, it indicates that the corresponding row stores a valid key, and when the value of the valid flag bit is 0, it indicates that the corresponding row does not store a valid key; at the initial time, the valid flag bit of each row is 0.
Optionally, as shown in fig. 4, in this embodiment the arrays are organized using a memory architecture similar to that of DRAM to build the storage-computation integrated index hardware, which avoids a complex on-chip interconnect. The input registers and sense amplifiers required for the content-addressable function can be shared with those used for the storage function; the sense amplifiers need only be adapted to support comparison against two reference values, one for normal read operations and one for content-addressable operations. Note that the hardware organization shown in fig. 4 is only one optional implementation of this embodiment; in other embodiments of the invention, the arrays may be organized in other forms, such as an H-tree.
The controller is configured to perform index operations on the key-value pair data stored in the arrays; in this embodiment, the index operations specifically include an insertion operation, a query operation, an update operation, and a delete operation. These operations are all completed within the arrays.
In this embodiment, a key and its valid flag bit are stored in the same array row, so the input key and valid flag bit can be compared in parallel. It is easy to understand that in the initialized state no data has been inserted into any row of the array and the valid flag bit of every row is 0; after the data in a row is deleted, its valid flag bit is also 0.
As shown in fig. 5, in this embodiment a hardware-software cooperative hash table is designed on top of the storage-computation integrated index hardware. Each hash bucket in the hash table is mapped to one array cluster and stores the addresses of the arrays in that array cluster; optionally, in this embodiment, the sequence number of an array within its array cluster is used as the array's address, and each array address occupies 8 B. The capacity of each hash bucket equals that of one CPU cache line, so one hash bucket can store multiple array addresses that together fill one CPU cache line, which improves the CPU's cache utilization. To maximize the parallelism of the underlying hardware, in this embodiment different hash buckets are mapped to different array clusters; since different array clusters can operate in parallel, system performance is further improved.
In this embodiment, when a key-value pair is mapped to the hash table, the sequence number of the hash bucket it maps to is hash(k) % N, where hash(k) denotes the hash value of key k, N denotes the current capacity of the hash table, and % denotes the remainder operation.
Based on the designed hash table, in this embodiment the key k of key-value pair data [k, v] is mapped to a hash bucket by the hash function, after which the corresponding operation can be performed on the arrays in the array cluster mapped by that hash bucket. As in a conventional hash-table-based index system, computing the hash value hash(k) of key k and locating the corresponding hash bucket in the hash table are done by the CPU.
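The CPU-side step described above (hash the key, then select bucket hash(k) % N) can be sketched as follows. The hash function, the 64-byte cache-line size, the tuple form of an array address, and the rule that bucket i maps to cluster i are all assumptions made for illustration; the text fixes only the 8 B address size, the hash(k) % N rule, and the mapping of different buckets to different clusters.

```python
import hashlib

CACHE_LINE_BYTES = 64   # assumption: typical CPU cache-line size
ADDR_BYTES = 8          # from the text: each array address occupies 8 B
ADDRS_PER_BUCKET = CACHE_LINE_BYTES // ADDR_BYTES


def hash_key(key: bytes) -> int:
    """Hypothetical 64-bit hash; the text does not fix a hash function."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")


def build_table(n_buckets: int):
    """Assumed layout: bucket i maps to cluster i and lists its array addresses."""
    return [[(i, a) for a in range(ADDRS_PER_BUCKET)] for i in range(n_buckets)]


def locate(table, key: bytes):
    """CPU-side step: hash the key, then pick bucket hash(k) % N."""
    index = hash_key(key) % len(table)
    return index, table[index]


table = build_table(4)
index, addresses = locate(table, b"user:42")
print(index, len(addresses))
```

With these assumed sizes one bucket holds eight array addresses, so reading a bucket touches exactly one cache line before the remaining work is handed to the arrays.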
In this embodiment, the inserting operation includes:
(I1) for key-value pair data [k_i, v_i] to be inserted, search the array cluster mapped by hash bucket B_i for a row whose valid flag bit is f_i; if the search succeeds, go to step (I2); otherwise, return an operation failure;
hash bucket B_i is the hash bucket in the hash table corresponding to the hash value of key k_i; in the hash table, each hash bucket is mapped to one array cluster and stores the addresses of the arrays in that array cluster;
(I2) store key k_i of the key-value pair data [k_i, v_i] in the allocated row, set the valid flag bit to f_v, and write value v_i into an array row.
FIG. 6 shows an example of an insertion operation in which the key to be inserted is k_i = 0011; the left side of FIG. 6 shows the search for empty rows within the array, i.e. rows whose valid flag bit is 0, and the right side of FIG. 6 shows writing the data into the empty row found. Note that if several empty rows are found in the array, any one of them may be selected; in this embodiment, the first empty row is selected. If there is no empty row, an operation failure is returned. Furthermore, the value v can be stored in ordinary storage cells rather than content-addressable cells, because it does not take part in data comparison. The value v_i may be stored in the same array as key k_i and the valid flag bit, or in a different array; once the row number of the matching key is obtained, only the corresponding row of the array holding the value needs to be operated on.
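Steps (I1) and (I2) can be modeled in software as below. This is a minimal sketch, not the hardware implementation: the `Array` class, its field names, and the choice to keep values in a list alongside the keys are hypothetical, and the sequential scan stands in for the array's parallel search for rows whose valid flag bit is 0.

```python
class Array:
    """Software stand-in for one content-addressable array plus its value rows."""

    def __init__(self, n_rows):
        self.keys = [None] * n_rows    # key field of each row
        self.valid = [0] * n_rows      # valid flag bit, 0 at initialization
        self.values = [None] * n_rows  # value rows (ordinary storage cells)

    def find_empty_row(self):
        """Model the parallel search for valid flag == 0; take the first hit."""
        for row, flag in enumerate(self.valid):
            if flag == 0:
                return row
        return None


def insert(cluster, key, value):
    """Steps (I1)/(I2): returns (array index, row) or None on failure."""
    for a, array in enumerate(cluster):
        row = array.find_empty_row()
        if row is not None:
            array.keys[row] = key
            array.valid[row] = 1
            array.values[row] = value
            return a, row
    return None  # no empty row in the mapped cluster


cluster = [Array(2), Array(2)]
print(insert(cluster, 0b0011, "v_i"))  # prints (0, 0)
```

As in the embodiment, the first empty row is taken, and a full cluster yields a failure rather than an overwrite.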
In this embodiment, the query operation includes:
(S1) for a key k_s to be queried, sequentially read the array addresses in hash bucket B_s and search the corresponding arrays for a row L_s that stores key k_s and whose valid flag bit is f_v; if the search succeeds, go to step (S2); otherwise, return that the data does not exist;
hash bucket B_s is the hash bucket in the hash table corresponding to the hash value of key k_s;
(S2) read value v_s from the value-storage row corresponding to row L_s and return it.
FIG. 7 shows an example of a query operation in which the key to be queried is k_s = 1010; the left side of FIG. 7 shows the search for the row matching k_s, and the right side of FIG. 7 shows reading the required value v_s from the row of the value-storage array corresponding to that row.
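The query steps (S1) and (S2) can be sketched in the same spirit. The data layout (a dictionary from array address to three parallel lists) and the function names are hypothetical; the point illustrated is that the search pattern combines the key with valid flag bit = 1, so a deleted row holding the same key is not returned.

```python
def match_rows(keys, valid, query_key):
    """Model one content-addressable search over (key, valid flag = 1)."""
    return [r for r, (k, f) in enumerate(zip(keys, valid))
            if f == 1 and k == query_key]


def query(bucket, arrays, key):
    """Steps (S1)/(S2): returns the value, or None if the data does not exist."""
    for addr in bucket:                      # sequentially read array addresses
        keys, valid, values = arrays[addr]
        hits = match_rows(keys, valid, key)
        if hits:
            return values[hits[0]]           # read the corresponding value row
    return None


arrays = {
    0: ([0b0011, 0b1010], [1, 0], ["a", "stale"]),   # 1010 here was deleted
    1: ([0b0100, 0b1010], [1, 1], ["b", "v_s"]),
}
print(query([0, 1], arrays, 0b1010))  # prints v_s
```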
In this embodiment, the updating operation includes:
(U1) for key-value pair data [k_u, v_u] to be updated, sequentially read the array addresses in hash bucket B_u and search the corresponding arrays for a row L_u that stores key k_u and whose valid flag bit is f_v; if the search succeeds, go to step (U2); otherwise, return that the data does not exist;
hash bucket B_u is the hash bucket in the hash table corresponding to the hash value of key k_u;
(U2) update the value stored in the value-storage row corresponding to row L_u to v_u.
FIG. 8 shows an example of an update operation in which the key of the key-value pair data to be updated is k_u = 0100; the left side of FIG. 8 shows the search for the row matching k_u, and the right side of FIG. 8 shows updating the value-storage row corresponding to that row.
For the delete operation, only the valid flag bit of the matching row is cleared. Accordingly, in the present embodiment, the deleting operation includes:
(D1) for a key k_d to be deleted, sequentially read the array addresses in hash bucket B_d and search the corresponding arrays for a row L_d that stores key k_d and whose valid flag bit is f_v; if the search succeeds, go to step (D2); otherwise, return that the data does not exist;
hash bucket B_d is the hash bucket in the hash table corresponding to the hash value of key k_d;
(D2) set the valid flag bit of row L_d to f_i.
FIG. 9 shows an example of a delete operation in which the key of the key-value pair data to be deleted is k_d = 1010; the left side of FIG. 9 shows the search for the row matching k_d, and the right side of FIG. 9 shows clearing the valid flag bit of that row.
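The update and delete operations share the same search step and differ only in what they do to the matching row, which the following sketch makes explicit. The dictionary layout and function names are hypothetical; the behavior shown (update rewrites the value row, delete only clears the valid flag bit and leaves the key in place) follows the steps above.

```python
def find_row(bucket, arrays, key):
    """Shared step (U1)/(D1): locate (address, row) of a valid matching key."""
    for addr in bucket:
        array = arrays[addr]
        for row, (k, f) in enumerate(zip(array["keys"], array["valid"])):
            if f == 1 and k == key:
                return addr, row
    return None


def update(bucket, arrays, key, value):
    hit = find_row(bucket, arrays, key)
    if hit is None:
        return False                      # data does not exist
    addr, row = hit
    arrays[addr]["values"][row] = value   # step (U2)
    return True


def delete(bucket, arrays, key):
    hit = find_row(bucket, arrays, key)
    if hit is None:
        return False
    addr, row = hit
    arrays[addr]["valid"][row] = 0        # step (D2): only the flag is cleared
    return True


arrays = {0: {"keys": [0b0100, 0b1010], "valid": [1, 1], "values": ["old", "x"]}}
print(update([0], arrays, 0b0100, "v_u"), delete([0], arrays, 0b1010))
print(arrays[0])
```

Because a deleted row keeps its old key bits, every later search has to include the valid flag bit in the comparison, as the query sketch does.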
Based on content-addressable arrays, this embodiment designs a corresponding hash table so that index operations can be performed in situ within the arrays, eliminating a large number of CPU memory accesses, effectively reducing the number of index operation steps, and improving index operation efficiency. It also makes full use of the parallelism of the underlying hardware, further improving overall system performance.
As the key value to be indexed increases for data, the hash table is gradually filled, and when the remaining space of the hash table is small, hash expansion is required. In the hash capacity expansion process, data movement is involved, and in order to further improve the performance of the index system, the embodiment is implemented by using in-memory data movement. When the hash is expanded, the calculation of the index value is converted from the original hash (k)% N to hash (k)% (2N). In this embodiment, the capacity N of the hash table is a power of 2, so that after the data originally stored in the ith hash bucket is moved to a new hash bucket, the data may be moved to the ith hash bucket or the (i+n) th hash bucket of the new hash table, depending on the log of the hash value hash (k) 2 In this embodiment, the bit is used as a hash valid bit, if the bit is 0, after calculating the index value according to hash (k)% (2N), the array address of the key k is still located in the ith hash bucket, otherwise, in the (i+n) th hash bucket, and based on this, the controller is further configured to perform a hash expansion operation; the Hash capacity expansion operation comprises the following steps:
(E1) Expand the capacity of the hash table to 2N and determine the correspondence between each hash bucket and the array clusters, where N is the capacity of the current hash table and N is a power of 2;
(E2) For each valid key stored in the array clusters, determine its hash bucket in the new hash table, and move the valid key to the array cluster mapped by that hash bucket;
for any valid key k, let i denote the sequence number of the hash bucket to which its hash value corresponds in the original hash table; determine whether the (log_2 N)-th bit of the hash value is 0; if so, the hash bucket sequence number of valid key k in the new hash table is i; otherwise, it is i+N.
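The bucket choice in step (E2) can be checked with a short sketch. It is only an illustration: the function name new_bucket is hypothetical, and bits are assumed to be counted from 0 starting at the least significant bit, so that the (log_2 N)-th bit is the lowest bit not already used by hash(k) % N.

```python
def new_bucket(hash_value: int, n: int) -> int:
    """Return the bucket index in the doubled table (capacity 2n)."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of 2"
    i = hash_value % n                      # bucket index in the original table
    log_n = n.bit_length() - 1              # log_2(n) for a power of 2
    bit = (hash_value >> log_n) & 1         # bit number log_2(n), counted from 0
    return i if bit == 0 else i + n         # stay in bucket i, or go to i + n


# The single-bit test agrees with recomputing hash(k) % (2N) directly.
N = 8
for h in range(1024):
    assert new_bucket(h, N) == h % (2 * N)
```

Because only one bit decides the destination, no hash value needs to be recomputed during expansion.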
In fact, the high (H - log_2 N) bits of a key's hash value serve as hash valid bits, each indicating where the valid key in the array should go on a given hash expansion, where H is the length of the hash value. To make this information easy to use, this embodiment stores the hash valid bits in the array when the data is inserted. Using the array's parallel read capability, a given hash valid bit can be read from all rows of the array at once, so the destination of every row's data is determined in memory and the data movement is completed in memory. Accordingly, in this embodiment, the insert operation further includes:
while storing key k_i in the allocated row, storing the high (H - log_2 N) bits of the hash value of key k_i as hash valid bits in one array row; for keys stored in the same array, their hash valid bits are stored in the same array and aligned by column;
in the hash expansion operation, the (log_2 N)-th bit of the hash value of a valid key is obtained by:
reading from the array the column that holds the hash valid bit corresponding to the (log_2 N)-th bit of the hash value.
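The column-aligned layout can be modelled in software as follows. This is a minimal sketch under stated assumptions: the hash length H = 16, the helper names hash_valid_bits and read_column, and the lowest-order-first ordering of the stored bits are all hypothetical, and a Python list comprehension merely stands in for the array's parallel column read.

```python
H = 16  # assumed hash length in bits (hypothetical value)


def hash_valid_bits(hash_value: int, n: int) -> list[int]:
    """High (H - log_2 n) bits of the hash, lowest-order bit first."""
    log_n = n.bit_length() - 1
    return [(hash_value >> b) & 1 for b in range(log_n, H)]


def read_column(rows: list[list[int]], col: int) -> list[int]:
    """Software stand-in for a parallel column read: one bit from every row."""
    return [row[col] for row in rows]


N = 4
hashes = [0b0110, 0b1010, 0b0010]          # all map to bucket 2 when N = 4
rows = [hash_valid_bits(h, N) for h in hashes]

# Column 0 holds bit log_2(N) of every stored hash, i.e. the bit that decides
# the destination on the next expansion (0: bucket i, 1: bucket i + N).
direction = read_column(rows, 0)
```

Here the first key would move to bucket 2 + 4 = 6 and the other two would stay in bucket 2, matching hash(k) % 8 for each hash.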
In this embodiment, in-memory data movement during hash expansion can be implemented by supporting a command of the form mv(ori_addr, to_addr0, to_addr1, bit_pos), which acts on all valid keys stored in one array. Here ori_addr is the original array address, and bit_pos identifies the bit_pos-th hash valid bit of the hash value corresponding to each key. to_addr0 and to_addr1 are the array addresses to which data in the original array must be moved when that hash valid bit is 0 and 1, respectively; they are determined from the array clusters mapped by the i-th and the (i+N)-th hash buckets of the new hash table. When the hash valid bit is 0, the valid key is moved from the array at ori_addr to the array at to_addr0; when it is 1, the valid key is moved from the array at ori_addr to the array at to_addr1.
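The effect of such a command can be sketched in software. This is not the hardware mechanism, which moves the data inside the memory without the CPU; the row layout (a dict with "key", "hvb" for the hash valid bits, and a boolean "valid" flag) and the returned move count are assumptions made only for illustration.

```python
def mv(arrays, ori_addr, to_addr0, to_addr1, bit_pos):
    """Move each valid row of arrays[ori_addr] according to one hash valid bit."""
    moved = 0
    for row in arrays[ori_addr]:
        if not row["valid"]:
            continue                       # skip rows that hold no valid key
        dest = to_addr1 if row["hvb"][bit_pos] == 1 else to_addr0
        if dest == ori_addr:
            continue                       # row already sits in its destination
        arrays[dest].append({"key": row["key"], "hvb": row["hvb"], "valid": True})
        row["valid"] = False               # clear the valid flag of the source row
        moved += 1
    return moved


arrays = {
    0: [{"key": "a", "hvb": [1, 0], "valid": True},
        {"key": "b", "hvb": [0, 1], "valid": True},
        {"key": "c", "hvb": [1, 1], "valid": False}],
    1: [],
    2: [],
}
moved = mv(arrays, ori_addr=0, to_addr0=1, to_addr1=2, bit_pos=0)
```

In this model, key "a" (bit 1) lands in the array at address 2, key "b" (bit 0) lands in the array at address 1, and the invalid row is ignored.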
With this hash expansion operation, expansion is completed solely by moving data within the memory, without CPU involvement, which further improves hash expansion efficiency.
Embodiment 2:
A key-value pair storage system comprising the integrated storage-and-computation index system provided in Embodiment 1 above.
In a key-value pair storage system, the index system defines how key-value data is stored and carries out the storage of key-value pair data. Built on the integrated storage-and-computation index system of Embodiment 1, the key-value pair storage system of this embodiment provides efficient data operations and effectively improved performance.
It should be understood that this embodiment further includes other modules that work with the index system to improve availability and reliability, such as modules for monitoring and management, security and permission control, load balancing, and backup. These modules are implemented in the same way as in a conventional key-value pair storage system and are not described again here.
Those skilled in the art will readily appreciate that the foregoing is merely a preferred embodiment of the invention and is not intended to limit it; any modifications, equivalent replacements, or improvements made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. An integrated storage-and-computation index system, comprising: a controller and a plurality of array clusters;
the plurality of array clusters are operable in parallel; each array cluster includes a plurality of arrays; each array is a cross-point array composed of resistive memory cells and is content addressable; each row in an array is used to store one key and its valid flag bit; a valid flag bit value of f_v indicates that the corresponding row stores a valid key, and a valid flag bit value of f_i indicates that the corresponding row does not store a valid key; f_v ≠ f_i; initially, the valid flag bit of each row is f_i;
the controller is configured to perform indexing operations on the key-value pair data stored in the arrays; the indexing operations include an insert operation; the insert operation includes:
(I1) for key-value pair data [k_i, v_i] to be inserted, searching the array cluster mapped by hash bucket B_i for a row whose valid flag bit is f_i; if the search succeeds, going to step (I2); otherwise, returning an operation failure;
hash bucket B_i is the hash bucket in the hash table corresponding to the hash value of key k_i; in the hash table, each hash bucket is mapped to one array cluster and is used to store the addresses of the arrays in that array cluster;
(I2) after storing key k_i of the key-value pair data [k_i, v_i] in the allocated row, setting the valid flag bit of that row to f_v, and writing value v_i into one array row.
2. The integrated storage-and-computation index system of claim 1, wherein the indexing operations further include a query operation; the query operation includes:
(S1) for a key k_s to be queried, sequentially reading hash bucket B_s and searching the corresponding arrays for a row L_s that stores key k_s and whose valid flag bit is f_v; if the search succeeds, going to step (S2); otherwise, returning that the data does not exist;
hash bucket B_s is the hash bucket in the hash table corresponding to the hash value of key k_s;
(S2) reading the value v_s from the row that stores the value corresponding to row L_s, and returning it.
3. The integrated storage-and-computation index system of claim 1, wherein the indexing operations further include an update operation; the update operation includes:
(U1) for key-value pair data [k_u, v_u] to be updated, sequentially reading hash bucket B_u and searching the corresponding arrays for a row L_u that stores key k_u and whose valid flag bit is f_v; if the search succeeds, going to step (U2); otherwise, returning that the data does not exist;
hash bucket B_u is the hash bucket in the hash table corresponding to the hash value of key k_u;
(U2) updating the value stored in the row that stores the value corresponding to row L_u to v_u.
4. The integrated storage-and-computation index system of claim 1, wherein the indexing operations further include a delete operation; the delete operation includes:
(D1) for a key k_d to be deleted, sequentially reading hash bucket B_d and searching the corresponding arrays for a row L_d that stores key k_d and whose valid flag bit is f_v; if the search succeeds, going to step (D2); otherwise, returning that the data does not exist;
hash bucket B_d is the hash bucket in the hash table corresponding to the hash value of key k_d;
(D2) setting the valid flag bit of row L_d to f_i.
5. The integrated storage-and-computation index system of any one of claims 1 to 4, wherein the controller is further configured to perform a hash expansion operation; the hash expansion operation includes:
(E1) expanding the capacity of the hash table to 2N and determining the correspondence between each hash bucket and the array clusters, where N is the capacity of the current hash table and N is a power of 2;
(E2) for each valid key stored in the array clusters, determining its hash bucket in the new hash table, and moving the valid key to the array cluster mapped by that hash bucket;
for any valid key k, the sequence number of the hash bucket to which its hash value corresponds in the original hash table is denoted i; it is determined whether the (log_2 N)-th bit of the hash value is 0; if so, the hash bucket sequence number of valid key k in the new hash table is determined to be i; otherwise, it is determined to be i+N.
6. The integrated storage-and-computation index system of claim 5, wherein the insert operation further includes:
while storing key k_i in the allocated row, storing the high (H - log_2 N) bits of the hash value of key k_i as hash valid bits in one array row; for keys stored in the same array, their hash valid bits are stored in the same array and aligned by column; H represents the length of the hash value;
in the hash expansion operation, the (log_2 N)-th bit of the hash value of a valid key is obtained by:
reading from the array the column that holds the hash valid bit corresponding to the (log_2 N)-th bit of the hash value.
7. The integrated storage-and-computation index system of any one of claims 1 to 4, wherein the size of each hash bucket in the hash table is the same as the size of a CPU cache line.
8. The integrated storage-and-computation index system of any one of claims 1 to 4, wherein different hash buckets in the hash table are mapped to different array clusters.
9. A key-value pair storage system, comprising the integrated storage-and-computation index system of any one of claims 1 to 8.
CN202410030995.XA 2024-01-09 2024-01-09 Integrated index system for memory and calculation and key value pair memory system Active CN117539408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410030995.XA CN117539408B (en) 2024-01-09 2024-01-09 Integrated index system for memory and calculation and key value pair memory system

Publications (2)

Publication Number Publication Date
CN117539408A CN117539408A (en) 2024-02-09
CN117539408B true CN117539408B (en) 2024-03-12

Family

ID=89790397

Country Status (1)

Country Link
CN (1) CN117539408B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117891858A (en) * 2024-03-14 2024-04-16 苏州大学 Space-time efficient parallel approximate member query method and system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108053852A (en) * 2017-11-03 2018-05-18 华中科技大学 A kind of wiring method of the resistance-variable storing device based on crosspoint array
CN109461811A (en) * 2018-09-12 2019-03-12 华中科技大学 A kind of mixing of CRS resistance-variable storing device can reallocating method
CN109683811A (en) * 2018-11-22 2019-04-26 华中科技大学 A kind of request processing method mixing memory key-value pair storage system
CN113196260A (en) * 2018-12-14 2021-07-30 美光科技公司 Key value storage tree capable of selectively using key portions
CN113924625A (en) * 2019-06-07 2022-01-11 美光科技公司 Operational consistency in non-volatile memory systems
CN115599532A (en) * 2021-07-07 2023-01-13 华为技术有限公司(Cn) Index access method and computer cluster
EP4137963A1 (en) * 2021-08-18 2023-02-22 Samsung Electronics Co., Ltd. Persistent key value storage device with hashing and method for operating the same
CN116719813A (en) * 2023-05-26 2023-09-08 华中科技大学 Hash table processing method and device and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10599647B2 (en) * 2017-12-22 2020-03-24 Oracle International Corporation Partitioning-based vectorized hash join with compact storage footprint

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
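Liu Yong; Zhao Qinde; Lai Zhengwen; Huang Dongping; Wang Xing. Research on multi-dimensional linear hashing on heterogeneous platforms. Computer Science (计算机科学). 2012, (10), full text. *
Wang Chengning. Research on read/write optimization methods for high-density resistive memory arrays. China Doctoral Dissertations Full-text Database (Information Science and Technology). 2022, I137-5. *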

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant