CN115964394A - Method and device for cache management of graph database - Google Patents


Publication number: CN115964394A
Authority: CN (China)
Prior art keywords: data, cache, key, graph database, value
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202211707990.3A
Other languages: Chinese (zh)
Inventors: 冷建琴, 程萍, 唐俊, 丁先胜, 张睿, 王振宇
Current and original assignee: Sichuan Shutian Mengtu Data Technology Co., Ltd.
Application filed by Sichuan Shutian Mengtu Data Technology Co., Ltd.
Priority: CN202211707990.3A
Publication: CN115964394A (pending legal status)

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of databases, and in particular to a method and apparatus for cache management of a graph database. The method mainly comprises: organizing the data in the database cache in a skiplist data structure, wherein the key-values of the skiplist correspond to the key-values of the database; and operating on the data in the cache using lock-free CAS primitives and verifying the correctness of the data in the cache, wherein the lock-free CAS primitives guarantee concurrency safety during execution by means of a volatile flag variable and a memory barrier instruction. The method provided by the embodiments of the invention can increase data concurrency while guaranteeing data correctness, thereby improving system performance.

Description

Method and device for cache management of graph database
Technical Field
The present invention relates to the field of databases, and in particular, to a method and apparatus for cache management of a graph database.
Background
Graph databases are a type of non-relational database that primarily applies graph theory to store relationship information between entities. There are three basic concepts in graph databases: a vertex is generally used to represent an entity; an edge is used to represent a relationship between entities, in most cases a directed edge; and an attribute is information describing a vertex or an edge. Modeling in a graph database differs from that of a traditional relational database: it is based directly on real-world entities and relationships, without the high level of abstraction of the relational model, and is therefore simpler and easier to understand. In terms of complex-relationship query capability, graph databases are well suited to analyzing multi-level, complex relationships and are a better fit for today's massive data-processing workloads.
When data in a database is processed, the data must first be read into memory as a cache. On processors that support multithreading, and especially on multi-core processors, data in memory is often read and written concurrently. Due to the processor's thread-concurrency mechanisms, reads and writes of data in memory do not always occur in the order specified in the user code, which may cause read/write conflicts and produce dirty data. To avoid this, a locking (lock) mechanism is commonly used in multi-threaded algorithms. For example, before writing to a certain memory address, the address is locked via a flag value, and after the operation completes the lock is released to inform other threads that the data has been fully written and can be used. However, since the processor may reorder the operation instructions of different threads during concurrent processing, other threads may observe the flag being set before the actual write completes. FIG. 1 summarizes the main situations in which read/write reordering occurs on x86/x64 processors.
Using locks to achieve thread synchronization can prevent the dirty data caused by reordering, but it has several disadvantages: (1) when contention occurs, threads block and wait, so real-time responsiveness cannot be achieved; (2) deadlock; (3) livelock; (4) priority inversion; and (5) improper use of locks can cause performance degradation.
In view of this, how to overcome the defects of the prior art and solve the drawbacks of the existing lock-based cache management of graph databases is a problem to be solved in this technical field.
Disclosure of Invention
In view of the above-identified deficiencies in the art and needs for improvement, the present invention addresses the problem of cache performance degradation caused by managing the graph database cache with locks.
The embodiment of the invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for cache management of a graph database, which specifically comprises: organizing the data in the graph database cache in a skiplist data structure, wherein the key-values of the skiplist correspond to the key-values of the graph database; and operating on the data in the cache using lock-free CAS primitives and verifying the correctness of the data in the cache, wherein the lock-free CAS primitives guarantee concurrency safety during execution by means of a volatile flag variable and a memory barrier instruction.
Preferably, organizing the data in the graph database cache in a skiplist data structure specifically includes: creating a memtable cache, wherein the data structure used by the memtable is a skiplist; sorting the data in the graph database in a specified order; and using the keys of the data in the graph database as the keys of the memtable's skiplist data structure.
Preferably, creating the memtable cache further includes: replacing all exclusive locks (X locks) in the memtable with shared locks (S locks), and using S locks as the memtable's concurrent locking mode.
Preferably, operating on the data in the cache using the lock-free CAS primitive specifically includes: comparing the memory-address parameter and the expected-value parameter in the primitive corresponding to each operation, and, when the value at the memory address equals the expected value, exchanging the forward and backward pointers of the corresponding data in the skiplist according to the primitive's new-value parameter, as the operation requires.
Preferably, when the operation item is a query, operating on the data in the cache using the lock-free CAS primitive specifically includes: traversing from the head of the linked list at the highest level of the current skiplist, and, when the queried data key is larger than the currently traversed data key, continuing to traverse the next piece of data at the current level; if the next piece of data exists, reducing the skiplist level and continuing to compare backward until the data key to be queried is found; and, if the next piece of data is the tail of the skiplist's linked list, concluding that the data does not exist.
Preferably, when the operation item is an insert, operating on the data in the cache using the lock-free CAS primitive specifically includes: the atomic operations in the primitive for the insert operation include: the new value of the backward pointer of the data before the insertion position points to the data to be inserted; the new value of the forward pointer of the data after the insertion position points to the data to be inserted; the new value of the forward pointer of the data to be inserted points to the data before the insertion position; and the new value of the backward pointer of the data to be inserted points to the data after the insertion position. The method further includes: searching for the insertion position of the data, and performing a concurrent pointer exchange on the data to be inserted using the primitive of the insert operation; and specifying the number of levels into which the data is inserted according to a random algorithm, sorting the levels in ascending order, and inserting the data into the skiplist at the corresponding position of each level.
Preferably, searching for the insertion position of the data specifically includes: starting from the highest level of the skiplist, comparing the key of the data to be inserted with the key of each piece of data in that level's linked list, level by level, until the insertion position is found, such that the key of the data to be inserted is larger than the key of the data before the insertion position and smaller than the key of the data after the insertion position.
Preferably, when the operation item is a delete, operating on the data in the cache using the lock-free CAS primitive specifically includes: the atomic operations in the primitive for the delete operation include: the new value of the backward pointer of the data before the data to be deleted points to the data after the data to be deleted, and the new value of the forward pointer of the data after the data to be deleted points to the data before the data to be deleted. The method further includes: searching the skiplist for the data item corresponding to the key of the data to be deleted, and performing a concurrent pointer exchange on the data to be deleted using the primitive of the delete operation.
Preferably, verifying the correctness of the data in the cache specifically includes: checking the total number of data items and judging whether it is consistent with the expectation; checking the keys of the inserted data and of the deleted data, and judging whether each key exists as expected; and checking the content of the inserted data and judging whether it is consistent with the expectation.
On the other hand, the present invention provides a device for cache management of a graph database, comprising at least one processor and a memory connected by a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the processor, performing the method for cache management of a graph database according to the first aspect.
Compared with the prior art, the embodiments of the invention have the following beneficial effects: using the skiplist as the in-memory data structure of the storage system provides better query performance; meanwhile, using the lock-free CAS approach ensures the validity of data reads and writes, and other lock-free-safe data structures can also be used, giving better generality. The method provided by the embodiments of the invention can increase data concurrency while guaranteeing data correctness, thereby improving system performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 illustrates the main situations in which read/write reordering occurs on x86/x64 processors;
FIG. 2 is a schematic core flow diagram of the LSM tree read/write operation;
FIG. 3 compares the processing efficiency of a read-write lock with lock-free CAS;
FIG. 4 is a schematic diagram of the basic structure of the skiplist;
FIG. 5 is a flowchart of a method for cache management of a graph database according to an embodiment of the present invention;
FIG. 6 is a graph of the average temporal complexity of data operations in a cache after using a skiplist;
FIG. 7 is a flowchart of another method for cache management of a graph database according to an embodiment of the present invention;
FIG. 8 is a flowchart of another method for cache management of a graph database according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of an apparatus for cache management of a graph database according to an embodiment of the present invention;
wherein the reference numbers are as follows:
11: a processor; 12: a memory.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The present invention is an architecture for a system with specific functions, so the specific embodiments mainly explain the functional and logical relationships of the structural modules, without limiting the concrete software and hardware implementation.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The invention will be described in detail below with reference to the figures and examples.
Example 1:
in a graph database, to improve query efficiency, a Log-Structured Merge-Tree (LSM tree) may be used to store the graph database's key-value (KV) data pairs. LSM-tree storage is KV storage based on sequential disk writes and a multi-layer structure. FIG. 2 shows the core flow of LSM-tree read/write operations. When a write request is received, the data to be written is written directly into an in-memory memtable that serves as the database cache; once the memtable is full, it becomes an immutable memtable, and a new memtable is created for subsequent write requests. Meanwhile, the contents of the immutable memtable are written to disk to form SSTable files. Both the memtable and the immutable memtable in memory are ordered by key to support lookups, so the data in each SSTable is also ordered by key. Furthermore, SSTable files are organized hierarchically: level 0 is written out directly from memory, and once the level-0 data reaches a certain size it is merged into level 1, in a manner similar to merge sort. The merged files of level 1 are also written sequentially, and when level 1 reaches a certain size its files are merged into higher levels, and so on. During merging, duplicate or deleted data is cleaned up. When a read request is received, the memtable in memory is searched first; if the key is not found there, the search proceeds level by level starting from the level-0 files.
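As a rough illustration of the write and read path just described, the following C++ sketch models a memtable that is frozen when it fills up. The names (LsmSketch, Put, Get, threshold) are illustrative and not from the patent; flushing frozen tables to on-disk SSTables and merging levels are omitted.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy LSM write path: writes land in an in-memory memtable ordered by key;
// when it exceeds a threshold it is frozen as an immutable memtable and a
// fresh memtable takes over.
struct LsmSketch {
    std::map<std::string, std::string> memtable;              // active, sorted by key
    std::vector<std::map<std::string, std::string>> frozen;   // immutable memtables
    std::size_t threshold = 4;                                // rotate when this full

    void Put(const std::string& key, const std::string& value) {
        memtable[key] = value;
        if (memtable.size() >= threshold) {
            frozen.push_back(std::move(memtable));            // becomes immutable
            memtable.clear();                                 // new active memtable
        }
    }

    // Reads check the active memtable first, then the newest frozen table,
    // mirroring the memtable-then-level-0 search order described above.
    const std::string* Get(const std::string& key) const {
        if (auto it = memtable.find(key); it != memtable.end()) return &it->second;
        for (auto f = frozen.rbegin(); f != frozen.rend(); ++f)
            if (auto it = f->find(key); it != f->end()) return &it->second;
        return nullptr;
    }
};
```

Because std::map keeps keys ordered, each frozen table is already sorted when it would be flushed, which is what makes the sequential SSTable write possible.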
To solve the problems caused by using locks for thread synchronization in the prior art, the method provided in this embodiment uses a lock-free mechanism to ensure the correctness of data synchronization under multithreaded concurrency. A lock-free approach guarantees the atomicity of a piece of state in some manner and thereby guarantees safe access to shared data; there are three common approaches: Compare-And-Swap (CAS), linked-list-based lock-free queues, and array-based lock-free queues. The method provided in this embodiment uses CAS.
CAS is a set of primitive instructions used to implement variable synchronization across multiple threads. A primitive is a program segment consisting of several instructions that implements a specific function and cannot be interrupted during execution. In an operating system, operations invoked by a process, such as queue operations, semaphore operations, and checking and starting peripheral operations, must not be interrupted once started; otherwise operation errors occur and the system is thrown into disorder. A primitive always acts as a basic unit. It differs from an ordinary procedure in that it is an "atomic operation": an operation in which all actions are either completed or not performed at all. In other words, it is an indivisible basic unit and is therefore not allowed to be interrupted during execution. Atomic operations execute in supervisor state and reside in memory. CAS compares the value in memory with a specified value and replaces the data in memory with a new value only when the two are the same. Compared with locking approaches such as read-write locks, mutexes, spin locks, and condition locks, CAS has the following main advantages: (1) only one statement (the CAS atomic operation) is needed to access the shared variable, saving the lock and unlock actions otherwise performed on every operation on the shared variable and thus reducing system overhead; (2) a spin-retry approach is used under concurrent contention, avoiding the frequent thread switching that blocking under a lock would cause. FIG. 3 compares the processing efficiency of a read-write lock with lock-free CAS; it can be seen that lock-free CAS significantly improves the memtable data-write (memtable put) efficiency.
Further, to realize the key-value structure of the graph database in the cache, support data insertion, lookup, iteration, modification and similar functions in the cache, and meet the requirement of concurrent operation, the memtable needs a data structure with high query efficiency that supports high concurrency and a lock-free mechanism. Among the common data structures supporting key-value: hash tables have high iteration time complexity, while red-black trees require external locking, which limits concurrency. The bottom layer of the skip list (skiplist) is implemented with linked lists, which can be made lock-free, and the skiplist also has good performance; therefore, this embodiment organizes the data in the memtable using a skiplist data structure.
A skiplist is an in-memory data structure that can be used in place of a search tree; its basic structure is shown in FIG. 4. The basic principle is to add indexes on top of an ordered linked list and to simulate binary search through a randomization rule that guarantees a certain probability distribution. The whole skiplist can be viewed intuitively as a multi-level structure; for example, FIG. 4 contains a 7-level structure, where the lowest level is the ordered linked list serving as the base, the arrows indicate the direction of the links, and the remaining levels are index levels built above the base list. Through the index levels, an approximate binary search can be realized and lookup efficiency improved. Each data unit in the linked list is connected to the previous and next data through a forward pointer and a backward pointer, and the forward and backward pointers of each data unit can be operated on independently, which both guarantees the uniqueness of the data and supports a lock-free mechanism.
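The description above can be sketched as a node layout. The following C++ fragment is illustrative (the names Node and kMaxLevel are not from the patent): each data unit holds a key-value pair plus per-level links, and because every link is an independent std::atomic pointer it can be swapped on its own with CAS. The text describes both forward and backward pointers; only forward links are shown here for brevity.

```cpp
#include <atomic>
#include <string>

constexpr int kMaxLevel = 7;   // matches the 7-level example of FIG. 4

// One data unit of the skiplist.
struct Node {
    std::string key;
    std::string value;
    int level;                            // how many index levels this node spans
    std::atomic<Node*> next[kMaxLevel];   // independent forward pointer per level

    Node(std::string k, std::string v, int lvl)
        : key(std::move(k)), value(std::move(v)), level(lvl) {
        for (auto& p : next) p.store(nullptr, std::memory_order_relaxed);
    }
};
```

Keeping each level's link in its own atomic slot is what lets a single level be re-pointed by CAS without touching the others.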
In this embodiment, to ensure that data read and write operations are not interrupted, and to prevent data from being modified by other operations, or stale data from being read, before the current read or write completes, CAS primitives are used as the operation instructions, guaranteeing the determinacy of the data and realizing lock-free cache management.
As shown in FIG. 5, the method for cache management of a graph database according to the embodiment of the present invention includes the following specific steps:
Step 101: organize the data in the database cache in a skiplist data structure, wherein the key-values of the skiplist correspond to the key-values of the database.
To improve the performance of the graph database cache, this embodiment uses a skiplist structure to cache the data in the graph database; creation, insertion, query, deletion and other operations on cached data are implemented within the skiplist, and data correctness is handled appropriately when shared in-memory data is operated on concurrently.
Specifically, a memtable cache is created as needed; the data structure used by the memtable is a skiplist. The data in the graph database is sorted in a specified order, and the keys of the data in the graph database are used as the keys of the memtable's skiplist structure. When querying data in the skiplist, fast lookup by key is possible, with time complexity O(log n) in most cases. FIG. 6 shows the average time complexity of each data operation in the cache when a skiplist is used, where n is the length of the skiplist. In the memtable, all data is sorted by a user-defined ordering method and then stored in order, which guarantees the ordering of data in the cache and further improves lookup efficiency.
Step 102: operate on the data in the cache using lock-free CAS primitives, and verify the correctness of the data in the cache.
In this embodiment, concurrency control of the skiplist is achieved with the lock-free CAS technique, and uninterruptible data-exchange operations are realized through primitive-based atomic operations, avoiding the data inconsistency caused by an indeterminate execution order and unpredictable interruption when multiple threads rewrite the same data at the same time.
In a specific implementation scenario, most multi-core processors to date support CAS, with different CAS instruction implementations for processors of different architectures. For example, the CMPXCHG instruction on x86 implements CAS; prefixed with LOCK, it achieves an atomic operation.
Further, since lock-free programming is constrained by the implementation optimizations of the physical machine, to prevent the processor from reordering reads and writes, in some specific scenarios the lock-free CAS primitives guarantee concurrency safety during execution by means of a volatile flag variable and a memory barrier instruction (memory barrier). Memory barrier instructions have corresponding implementations on most platforms; for example, some platforms provide InterlockedXxx function interfaces with a built-in memory barrier attached. On the x86 platform, the implementation can be based on the Interlocked-prefixed lock-free CAS function interfaces, for example the InterlockedCompareExchangePointer and InterlockedCompareExchange interfaces.
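A portable way to sketch the same idea is std::atomic's compare-exchange, which combines the CAS and the memory-barrier behavior discussed above (on x86 it compiles down to LOCK CMPXCHG, the analogue of InterlockedCompareExchange). The g_slot and TryPublish names below are illustrative, not from the patent.

```cpp
#include <atomic>

std::atomic<long> g_slot{0};

// Install newValue only if the slot still holds `expected`. The acquire/
// release orderings forbid the read/write reordering described in the text:
// writes before a successful CAS become visible to threads that later read
// the slot.
bool TryPublish(long expected, long newValue) {
    return g_slot.compare_exchange_strong(expected, newValue,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire);
}
```

On failure, compare_exchange_strong also writes the value it actually saw back into `expected`, which is what enables the spin-retry pattern mentioned earlier.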
After the steps 101 to 102 provided in this embodiment, efficient query and lock-free management of the graph database cache can be realized, and the efficiency of data processing in the cache is improved.
Furthermore, in some skiplist data-management scenarios, not every operation can do without shared locks (S locks) and exclusive locks (X locks); however, exclusivity during writes can be guaranteed without X locks. Therefore, all X locks in the memtable can be replaced by S locks, with S locks used as the memtable's concurrent locking mode; replacing exclusive X locks with shareable S locks gives the program higher concurrency.
In the method provided by the embodiment, a primitive is used as an instruction for data operation. The CAS primitive has three parameters: memory address, expected value, and new value. In the skiplist of this embodiment, functions such as creation, insertion, search, iteration, deletion, and modification of data are implemented by exchanging data pointers in a linked list.
When data processing is performed using the CAS primitive, it is necessary to first make a judgment on the validity of the data. If the value of the memory address is equal to the desired value, indicating that the value is not modified, the value may be modified to a new value at this time. Otherwise, the modification is failed, false is returned, and the user determines the subsequent operation.
For a specific implementation of the validity check, the following pseudocode can be referred to (in a real CAS instruction, the entire compare-and-set sequence executes as a single uninterruptible atomic operation):
bool CAS(T* addr, T expected, T newValue)
{
    if (*addr == expected)    // the value has not been modified
    {
        *addr = newValue;     // install the new value
        return true;
    }
    else
        return false;         // modified by another thread; the caller decides what to do next
}
For data in the cached skiplist: the memory-address parameter and the expected-value parameter in the primitive corresponding to each operation are compared, and, when the value at the memory address equals the expected value, the forward and backward pointers of the corresponding data in the skiplist are exchanged according to the primitive's new-value parameter, as the operation requires.
Further, in a specific implementation, using CAS directly for the validity check may give rise to the ABA problem: after a data value is read from a storage location, that location may be modified several times before the operation completes, and even though the final value may equal the value read earlier, a hidden data-consistency problem can still arise. This is especially true when concurrent data structures are manipulated through pointers, as in the skiplist data operations of this embodiment. Therefore, an additional flag needs to be added to each data item's storage location to indicate whether that location has been modified.
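One common form of that extra flag is a version counter packed next to the value: every successful write bumps it, so a CAS holding a stale snapshot fails even if the value itself has been restored (the A-to-B-to-A case). The following sketch is illustrative; the Tagged/TaggedCAS names are not from the patent.

```cpp
#include <atomic>
#include <cstdint>

struct Tagged {
    std::uint32_t value;
    std::uint32_t version;   // incremented on every modification
};

std::atomic<Tagged> g_cell{Tagged{0, 0}};

// CAS on the (value, version) pair: because the comparison covers the
// version too, an intermediate modification is always detected.
bool TaggedCAS(Tagged snapshot, std::uint32_t newValue) {
    Tagged desired{newValue, snapshot.version + 1};
    return g_cell.compare_exchange_strong(snapshot, desired);
}
```

A 64-bit (value, version) pair stays within the size most platforms can CAS in one instruction, which is why the counter is kept small and packed beside the value.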
Hereinafter, the specific process of data processing with the lock-free CAS primitives in step 102 is described, taking common cache data operations as examples; the following steps can be implemented with the corresponding CAS primitive interfaces of the processor. In a practical implementation, the specific statements and instructions may be determined according to the interfaces supported by the processor and the specific data-processing needs, and other data operations can be completed in a similar manner.
(1) When the operation item is a query
As shown in FIG. 7, according to the structure of the skiplist, the query process is as follows:
Step 201: start traversing from the head of the linked list at the highest level of the current skiplist; when the queried data key is larger than the currently traversed data key, continue traversing to the next piece of data at the current level.
When querying the skiplist, start from its highest level and proceed level by level in linked-list fashion. When the queried data key equals the currently traversed data key, the value corresponding to that key is the data being queried. When the queried data key is larger than the current data key, continue to the next piece of data at the current level until the traversal of the highest level is complete.
Step 202: if the next piece of data exists, reduce the skiplist level and continue comparing backward until the data key to be queried is found.
After the traversal of the highest level is finished, reduce the skiplist level from the highest toward the lowest, traversing level by level and continuing to compare the queried data key with each data key in the other levels, until the data being queried is found and the skiplist traversal ends.
Step 203: if the next piece of data is the tail of the skiplist's linked list, the data does not exist.
If the same key has still not been found when the tail of the skiplist is reached, the data corresponding to that key does not exist.
After steps 201 to 203 provided in this embodiment, the query of data in the cached skiplist can be completed. Organizing the data in the graph database cache with a skiplist gives an average query time complexity of O(log n) and good lookup performance for single-key queries; combined with lock-free CAS, query performance is further improved.
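The layer-by-layer traversal of steps 201 to 203 can be sketched as follows. Plain pointers are used here so the traversal logic is easy to follow (the lock-free version would load each link atomically); the Node and Search names are illustrative.

```cpp
#include <string>
#include <vector>

struct Node {
    std::string key;
    std::string value;
    std::vector<Node*> next;   // next[l] = forward link on level l
    Node(std::string k, std::string v, int levels)
        : key(std::move(k)), value(std::move(v)), next(levels, nullptr) {}
};

// Start at the highest level; while the next key is still smaller than the
// target, move right; otherwise drop one level. Finish on the base list.
// Reaching the tail (nullptr) without a match means the data is absent.
Node* Search(Node* head, const std::string& key) {
    Node* x = head;
    for (int lvl = static_cast<int>(head->next.size()) - 1; lvl >= 0; --lvl)
        while (x->next[lvl] && x->next[lvl]->key < key)
            x = x->next[lvl];
    Node* cand = x->next[0];
    return (cand && cand->key == key) ? cand : nullptr;
}
```

Each level skips over a fraction of the base list, which is where the O(log n) average cost comes from.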
(2) When the operation item is insert
The atomic operations in the primitive for the insert operation resemble linked-list insertion and include: the new value of the backward pointer of the data before the insertion position points to the data to be inserted; the new value of the forward pointer of the data after the insertion position points to the data to be inserted; the new value of the forward pointer of the data to be inserted points to the data before the insertion position; and the new value of the backward pointer of the data to be inserted points to the data after the insertion position.
As shown in fig. 8, the data insertion process is as follows:
step 301: and searching the insertion position of the data, and performing concurrent pointer exchange on the data needing to be inserted by using the primitive of the insertion operation.
Because the data in the skiplist are all sorted according to the key value, the insertion position needs to be searched according to the key value of the inserted data when the data are inserted. Specifically, the method comprises the following steps: and comparing the key value of the data to be inserted with the key value of each piece of data in the layer chain table layer by layer from the highest layer of the skiplist until the insertion position is found, so that the key value of the data to be inserted is smaller than the key value of the data in front of the insertion position and larger than the key value of the data behind the insertion position. The order of keys is maintained during insertion, so that the efficiency of subsequent searching can be ensured.
Step 302: and the number of layers where the data are inserted is specified according to a random algorithm, the layers are sorted in an ascending mode, and the layers are inserted into the skiplist at the corresponding positions of the layers.
In this embodiment, the skip is concurrently inserted, and after the insertion, the index data may need to modify the forward pointer and the backward pointer of the index data according to the new data sequence in the basic linked list. Therefore, it is necessary to ensure lock-free modification of both data and data pointer values, which in particular implementations can be done through a corresponding interface of the processor. For example, in an x86 processor, two CAS primitive interfaces of InterlockedCompareExchangePointer and InterlockedCompareExchange are mainly used.
After steps 301 to 302 of this embodiment, insertion of data into the cached skiplist is complete. Further, to prevent the memory at the insertion position from being released during a concurrent insertion, a reference count may be set for each memory location in use, with the S-lock implemented through the reference count.
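The four pointer updates of the insert primitive can be sketched on a doubly linked base list as follows. The `LNode` layout and the helper `insert_between` are hypothetical; note that in the patent's terminology the forward pointer references the preceding datum and the backward pointer the following one:

```cpp
#include <atomic>
#include <cassert>

// Doubly linked node with atomic links (illustrative, not the patent's layout).
struct LNode {
    int key;
    std::atomic<LNode*> prev{nullptr};  // "forward" pointer: preceding datum
    std::atomic<LNode*> next{nullptr};  // "backward" pointer: following datum
};

// Insert `node` between `before` and `after`: set the new node's own two
// links plainly (no other thread can see them yet), then swing each
// neighbour's link with a CAS so a concurrent change is detected.
bool insert_between(LNode* before, LNode* after, LNode* node) {
    node->prev = before;
    node->next = after;
    LNode* expect = after;
    if (!before->next.compare_exchange_strong(expect, node)) return false;
    expect = before;
    if (!after->prev.compare_exchange_strong(expect, node)) return false;
    return true;
}
```

A `false` return means another thread touched the neighbourhood first; the caller would re-locate the insertion position and retry.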
(3) When the operation item is delete
The atomic operations in the delete-operation primitive resemble a linked-list deletion, comprising: the new value of the backward pointer of the data before the data to be deleted is the data after it, and the new value of the forward pointer of the data after the data to be deleted is the data before it.
Search the skiplist for the data item corresponding to the key of the data to be deleted, and perform a concurrent pointer exchange on that data using the delete-operation primitive.
As with the insert operation, the delete operation first locates the data item with the specified key, then exchanges its forward and backward pointers with CAS atomic operations to delete the node.
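A minimal sketch of this delete path, under the same kind of hypothetical doubly linked layout (the `DNode` and `remove_node` names are illustrative): the predecessor's backward pointer and the successor's forward pointer are each swung with a CAS, so a concurrent modification makes the CAS fail rather than corrupt the list:

```cpp
#include <atomic>
#include <cassert>

// Doubly linked node with atomic links (illustrative layout).
struct DNode {
    int key;
    std::atomic<DNode*> prev{nullptr};
    std::atomic<DNode*> next{nullptr};
};

// Unlink `victim`: predecessor's next -> successor, successor's prev ->
// predecessor. Each CAS only succeeds if the link still points at `victim`.
bool remove_node(DNode* victim) {
    DNode* p = victim->prev.load();
    DNode* n = victim->next.load();
    DNode* expect = victim;
    if (p && !p->next.compare_exchange_strong(expect, n)) return false;
    expect = victim;
    if (n && !n->prev.compare_exchange_strong(expect, p)) return false;
    return true;
}
```

This sketch omits the reference-counting safeguard the text describes; in a real concurrent delete the victim's memory must not be reclaimed while other threads may still be traversing it.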
Through the above process examples, the method provided by this embodiment completes the management of data in the graph database cache.
In the above process, especially after insertions and deletions, the correctness of the data in the cache must also be verified to ensure the correctness of the operation. Depending on specific needs, one of the following modes may be selected, several may be combined for verification, and other verification modes may be added as required.
(1) Check the total number of data items, and judge whether it is consistent with the expectation.
After an insertion or deletion, the total number of data items changes, so the correctness of the operation can be preliminarily determined by examining it: after an insert operation, the total increases by the number of inserted items; after a delete operation, it decreases by the number of deleted items.
(2) Check the key of inserted data and the key of deleted data, and judge whether each key exists.
After an insertion or deletion, verification can also be performed by querying. After an insert operation, the corresponding data item should be found normally by the inserted key; after a delete operation, no data item should be found by the deleted key.
(3) Check the content of inserted data, and judge whether it is consistent with the expectation.
After an insert operation, whether the inserted value is correct can be further determined on the basis of a query: judge whether the content written under each inserted key is consistent with the content of the data entry found by querying the same key.
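The three verification modes can be sketched against an ordered map standing in for the cached skiplist; the helper names `check_count`, `check_key`, and `check_content` are illustrative:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for the cached skiplist: an ordered key -> value map.
using Cache = std::map<int, std::string>;

// (1) Count check: the total changed by exactly the number of inserted rows.
bool check_count(const Cache& c, std::size_t before, std::size_t inserted) {
    return c.size() == before + inserted;
}

// (2) Key check: an inserted key must be findable, a deleted key must be gone.
bool check_key(const Cache& c, int key, bool should_exist) {
    return (c.count(key) != 0) == should_exist;
}

// (3) Content check: the value found under the key matches what was written.
bool check_content(const Cache& c, int key, const std::string& expected) {
    auto it = c.find(key);
    return it != c.end() && it->second == expected;
}
```

In practice the three checks escalate in cost: the count check is a single comparison, the key check is one lookup per operated key, and the content check additionally compares values.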
Through these verification methods, the operated data can be checked, further ensuring the correctness of data management operations in the cache.
According to the method for cache management of a graph database provided by this embodiment, the skiplist serves as the basic data structure of the graph database cache; its structural characteristics improve data processing efficiency and support lock-free operation. Meanwhile, the lock-free CAS primitive serves as the instruction for data operations, using kernel-level atomic operations to guarantee data consistency during concurrent operations without locks.
Embodiment 2:

On the basis of the method for cache management of a graph database provided in Embodiment 1, the present invention further provides a device for cache management of a graph database for implementing that method. Fig. 9 is a schematic diagram of the device architecture according to an embodiment of the present invention. The device of this embodiment includes one or more processors 11 and a memory 12; in fig. 9, one processor 11 is taken as the example.
The processor 11 and the memory 12 may be connected by a bus or other means; connection by bus is taken as the example in fig. 9.
The memory 12, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as those implementing the method for cache management of a graph database in Embodiment 1. The processor 11, by running the non-volatile software programs, instructions, and modules stored in the memory 12, executes the various functional applications and data processing of the device, that is, implements the method for cache management of a graph database of Embodiment 1.
The memory 12 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 12 may optionally include memory located remotely from the processor 11, and these remote memories may be connected to the processor 11 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Program instructions/modules are stored in memory 12 that, when executed by one or more processors 11, perform the method of cache management of a graph database in embodiment 1 described above, e.g., perform the various steps shown in FIGS. 5, 7, and 8 described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for cache management of a graph database, characterized in that it specifically comprises:
organizing data in the graph database cache in a skiplist data structure, wherein the key-value of the skiplist corresponds to the key-value of the graph database;
and operating on the data in the cache using a lock-free CAS primitive, and verifying the correctness of the data in the cache, wherein the lock-free CAS primitive ensures the concurrency safety of execution through a volatile flag variable and memory barrier instructions.
2. The method for cache management of a graph database according to claim 1, wherein said organizing data in the graph database cache in a skiplist data structure comprises:
creating a memtable cache, wherein the data structure used by the memtable is a skiplist; sorting the data in the graph database in a specified order, and taking the key of the data in the graph database as the key of the memtable's skiplist data structure.
3. A method for cache management of a graph database according to claim 2, wherein said creating a memtable cache further comprises:
all X locks in memtable are replaced by S locks, and the S locks are used as concurrent locking modes of memtable
4. The method for cache management of a graph database according to claim 1, wherein said using a lockless CAS primitive to operate on data in a cache, comprises:
and comparing the memory address parameter and the expected value parameter in the primitive corresponding to each operation, and exchanging a forward pointer and a backward pointer corresponding to the data in the skiplist according to the new value parameter of the primitive when the value of the memory address is equal to the expected value according to the operation requirement.
5. The method for cache management of a graph database according to claim 1, wherein said using a lock-less CAS primitive to operate on data in the cache when the operation item is a query, specifically comprises:
traversing from the head of the linked list at the highest level of the current skiplist, and when the queried data key is larger than the currently traversed data key, continuing to traverse the next piece of data at the current level;
if the next piece of data exists, reducing the level of the skiplist and continuing to compare backwards until the data key to be queried is found;
if the next piece of data is the tail of the skiplist, the data does not exist.
6. The method for cache management of a graph database according to claim 1, wherein said using a lock-less CAS primitive to operate on data in the cache when the operation item is insert, specifically comprises:
the atomic operations in the insert-operation primitive comprising: the new value of the backward pointer of the data before the insertion position points to the data to be inserted, the new value of the forward pointer of the data after the insertion position points to the data to be inserted, the new value of the forward pointer of the data to be inserted points to the data before the insertion position, and the new value of the backward pointer of the data to be inserted points to the data after the insertion position;
searching for the insertion position of the data, and performing a concurrent pointer exchange on the data to be inserted using the insert-operation primitive;
and determining the number of levels the data is inserted into with a random algorithm, sorting the levels in ascending order, and inserting the data into the skiplist at the corresponding position of each level.
7. The method for cache management of a graph database according to claim 6, wherein said locating an insertion location of data comprises:
comparing, level by level from the highest level of the skiplist, the key of the data to be inserted with the key of each piece of data in that level's linked list until the insertion position is found, such that the key of the data to be inserted is smaller than the key of the data before the insertion position and larger than the key of the data after it.
8. The method for cache management of a graph database according to claim 1, wherein said using a lock-less CAS primitive to operate on data in the cache when the operation item is delete, specifically comprises:
the atomic operations in the delete-operation primitive comprising: the new value of the backward pointer of the data before the data to be deleted is the data after it, and the new value of the forward pointer of the data after the data to be deleted is the data before it;
and searching the skiplist for the data item corresponding to the key of the data to be deleted, and performing a concurrent pointer exchange on the data to be deleted using the delete-operation primitive.
9. The method for cache management of a graph database according to claim 1, wherein said verifying the correctness of data in the cache comprises:
checking the total number of the data, and judging whether the total number of the data is consistent with the expectation;
checking the key of the inserted data and the key of the deleted data, and judging whether the key of the inserted data or the key of the deleted data exists or not;
and checking the content of the inserted data, and judging whether the content of the inserted data is consistent with the expectation.
10. An apparatus for cache management of a graph database, comprising:
at least one processor and a memory connected by a data bus, the memory storing instructions executable by the at least one processor, the instructions, when executed by the processor, performing the method for cache management of a graph database according to any one of claims 1-9.
CN202211707990.3A 2022-12-28 2022-12-28 Method and device for cache management of graph database Pending CN115964394A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211707990.3A CN115964394A (en) 2022-12-28 2022-12-28 Method and device for cache management of graph database


Publications (1)

Publication Number Publication Date
CN115964394A true CN115964394A (en) 2023-04-14

Family

ID=87361427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211707990.3A Pending CN115964394A (en) 2022-12-28 2022-12-28 Method and device for cache management of graph database

Country Status (1)

Country Link
CN (1) CN115964394A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination