WO2023197404A1 - Object storage method and apparatus based on a distributed database - Google Patents
Object storage method and apparatus based on a distributed database
- Publication number
- WO2023197404A1 (application PCT/CN2022/094380)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- index
- disk
- memory
- internal request
- distributed database
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/176—Support for shared access to files; File sharing support
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
Definitions
- the present application relates to the technical fields of distributed databases and object-oriented storage, and in particular to an object storage method and device based on a distributed database.
- Object storage, also known as object-oriented storage, is a storage technology suited to unstructured data. At present it is often the best solution for reading and writing massive numbers of small files. Object storage applies an append-write mode that aggregates writes to small files, thereby greatly improving read/write IOPS (Input/Output operations Per Second, the number of read and write operations per second) and bandwidth.
- An index is the location and size of a specific small file within the aggregate file, that is, the mapping relationship between the small file and the aggregate file.
- Although object storage technology solved the problem of reading and writing massive numbers of small files, it also introduced a large number of new small files (i.e., numerous index records), bringing new read/write problems in turn. In other words, the existence of numerous index records has become a new bottleneck constraining read/write performance.
- One existing solution is to use an in-memory database such as Redis, exploiting fast memory reads and writes to solve the performance problem of reading and writing index data.
- However, memory is non-persistent, which brings a series of problems, and memory is also costly.
- Another existing approach is to use a disk-based database such as MySQL, ensuring the durability of the index itself at the cost of read/write performance; a large share of the disk's IOPS is consumed in processing the index.
- This application provides an object storage method and apparatus based on a distributed database that achieve highly concurrent read/write characteristics while preserving the persistence of the entire system, thereby truly solving the problem of reading and writing massive numbers of small files.
- an object storage method based on a distributed database is provided.
- the method is used in a distributed database system.
- the distributed database system includes multiple nodes, and the nodes share disk and memory; the method includes:
- when an internal request for index writing is received, the behavior log of the current node's application programming interface (API) is written to the disk and to the current node's memory, and the index corresponding to the internal request for index writing is then written into a first-in-first-out (FIFO) queue in the current node's memory;
- all indexes in the queue are written to the disk periodically or when the queue is full;
- when an internal request for index reading is received, the index corresponding to the internal request for index reading is read from the disk and returned.
- writing the behavior log of the current node's application programming interface (API) to the disk may include:
- opening a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
- writing the behavior log to the target log file in append-write form.
- the method further includes:
- it is determined whether an index key conflict occurs during writing; if a key conflict occurs, a timestamp is used to judge which data is newer, ensuring that non-latest data does not overwrite the latest data.
- the method also includes:
- a time-consistency check is performed on each node at preset intervals.
- the method further includes:
- it is determined whether a duplicate index (an index with the same key as the index being read) exists in memory; if it exists, the duplicate index is deleted from memory.
- the method also includes:
- the converted internal request is sent to the selected node.
- the load balancing strategy includes:
- a hash value is first computed from the file name of the file, and a node is then selected based on the hash value to achieve load balancing among the nodes.
- the nodes of the distributed database system share disk and memory through the Gluster file system.
- an object storage device based on a distributed database is provided.
- the device is used in a distributed database system.
- the distributed database system includes multiple nodes, and the nodes share disk and memory;
- the device includes:
- a logging unit, configured to write the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory when the current node receives an internal request for index writing, and then trigger the first index-writing unit;
- the first index-writing unit is used to write the index corresponding to the internal request for index writing into the first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user;
- a second index-writing unit, configured to write all indexes in the queue to the disk periodically or when the queue is full;
- the index-reading unit is used to read the index corresponding to the internal request for index reading from the disk and return it when the current node receives an internal request for index reading.
- when the logging unit is used to write the behavior log of the current node's application programming interface (API) to the disk, it is specifically configured to:
- open a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
- write the behavior log to the target log file in append-write form.
- the second index writing unit is also used for:
- determine whether an index key conflict occurs during writing; if a key conflict occurs, use a timestamp to judge which data is newer, ensuring that non-latest data does not overwrite the latest data.
- the device also includes:
- the time-consistency check unit is used to perform a time-consistency check on each node at preset intervals.
- the index reading unit is also used for:
- determine whether a duplicate index (an index with the same key as the index being read) exists in memory; if it exists, delete the duplicate index from memory.
- the device also includes:
- an internal request generation unit, used to obtain the user's operation on a file and convert the operation into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading;
- a task allocation unit, used to select a node in the distributed database system according to a preset load-balancing policy and send the converted internal request to the selected node.
- the load balancing strategy includes:
- a hash value is first computed from the file name of the file, and a node is then selected based on the hash value to achieve load balancing among the nodes.
- disk and memory are shared among the nodes of the distributed database system through the Gluster file system.
- the solution of this application improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time.
- behavior logs are recorded on the disk and in memory before the operation, for data recovery in the event of a failure.
- Figure 1 is a schematic flow chart of an object storage method based on a distributed database provided by an embodiment of the present application
- Figure 2 is a schematic diagram of the workflow of a node in the embodiment of this application.
- Figure 3 is another schematic flow chart of an object storage method based on a distributed database provided by an embodiment of the present application
- Figure 4 is a schematic diagram of an object storage device based on a distributed database provided by an embodiment of the present application.
- Figure 1 is a schematic flow chart of an object storage method based on a distributed database provided by an embodiment of the present application.
- the method can be used in a distributed database system, which can include multiple nodes, with disk and memory shared among the nodes.
- the nodes of the distributed database system may share disk and memory through the Gluster file system.
- GlusterFS (Gluster File System) is an open-source distributed file system.
- the method can be applied to any node in the distributed database system. As shown in Figure 1, the method may include the following steps:
- step S101: when an internal request for index writing is received, the behavior log of the current node's application programming interface (API) is written to the disk and to the current node's memory, and the index corresponding to the internal request for index writing is then written into the first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user.
- a mapping relationship is formed between small files and large files (that is, aggregate files).
- this mapping relationship is an index.
- the index can include a key, the key being the file name of the file operated on by the user (that is, of a particular small file).
- the index can also include the location of the small file in the aggregate file, the size of the small file, and so on.
- disk and memory are shared among the nodes (for example, via the Gluster file system), which keeps data synchronized across different nodes and thereby preserves data persistence when a node fails.
- the API behavior log will be recorded first for data recovery in case of failure.
- writing the behavior log of the current node's application programming interface (API) to the disk may specifically include:
- opening a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
- writing the behavior log to the target log file in append-write form.
- the current node will directly write the API behavior log to the shared disk to ensure data recovery in the event of node failure.
- the log file can be given a distinctive name, such as the node number as a suffix, or the node number plus a random hash, to ensure mutual exclusion between nodes.
- the shared-memory mechanism can be used to also write the API behavior log in memory, so that when a node fails unexpectedly, this part of the memory data can be accessed by other nodes.
- a ring structure can be used to maintain primary and backup memory data across multiple nodes.
- node 2 holds the backup data of node 1;
- node 3 holds the backup data of node 2, and so on.
- although there is also a behavior log on the disk, recovery is faster when the shared-memory data of other nodes is available.
- the two copies of data, on disk and in memory, can mutually corroborate the credibility of the backup data.
- step S102: all indexes in the queue are written to the disk periodically or when the queue is full.
- Figure 2 is a schematic diagram of the workflow of a node in the embodiment of the present application.
- each node maintains a FIFO queue in memory.
- the queue is set up with a flush-on-full mechanism and a periodic-flush mechanism; outside these two occasions (periodic and full), index data is not written directly to the disk-based database, which greatly improves the write rate.
- data persistence can be ensured, thereby supporting the stability of the entire data system.
- contention conflicts are key conflicts.
- the key (the keyword in the index) is the file name of the small file operated on by the user.
- when one node is preparing to write an index for a key, another node may also be about to write an index for the same key; for example, a user operates multiple times and each operation is distributed to a different node by the load-balancing mechanism.
- this kind of contention can be avoided through the operation-time attribute of the database: by adding timestamps and using conditional SQL statements when flushing, it can be ensured that non-latest data does not overwrite the correct data and is instead correctly discarded.
- the method may also include:
- it is determined whether an index key conflict occurs during writing; if a key conflict occurs, a timestamp is used to judge which data is newer, ensuring that non-latest data does not overwrite the latest data.
- the method may also include:
- a time-consistency check is performed on each node at preset intervals.
- step S103: when an internal request for index reading is received, the index corresponding to the internal request for index reading is read from the disk and returned.
- in the process of reading from the disk the index corresponding to the internal request for index reading, the method may also include:
- it is determined whether a duplicate index (an index with the same key as the index being read) exists in memory; if it exists, the duplicate index is deleted from memory.
- the method may also include:
- step S301: the user's operation on a file is obtained.
- the files here are small files; a user's operation on a small file is converted into an internal request within the distributed database system. For example, if the user modifies a small file, the index of the small file will usually change accordingly, and the system will generate an index-write request internally.
- step S302: the operation is converted into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading.
- step S303: a node is selected in the distributed database system according to a preset load-balancing strategy.
- this embodiment does not limit the specific load-balancing strategy; those skilled in the art can choose and design one according to different needs and different scenarios, and such choices and designs do not depart from the spirit and protection scope of this application.
- the load balancing strategy may specifically include:
- a hash value is first computed from the file name of the file, and a node is then selected based on the hash value to achieve load balancing among the nodes.
- step S304: the converted internal request is sent to the selected node.
- the selected node is the current node of step S101.
- when the load is light, requests can be distributed evenly across the nodes.
- when the load is heavy, the hash value of the file name in the request is computed and requests are distributed among nodes based on that hash value; hash-based allocation may weaken load balancing to some extent, but it reduces data contention in the background and implicitly improves concurrency performance.
- what counts as light or heavy load may differ by scenario: for example, when multiple files operated on by a user have similar file names, they will all be hashed to the same node, making the load unbalanced instead; in that case, the load-balancing rules can be adjusted to the situation.
- suppose hashing a file name yields a number from 1 to 15;
- 1 to 5 are assigned to node A;
- 6 to 10 are assigned to node B;
- 11 to 15 are assigned to node C, achieving balance.
- in practice, the numbers that appear may all turn out to be from 1 to 10;
- the strategy can then be adjusted, for example assigning 1 to 3 to node A, 4 to 7 to node B, and 8 to 10 to node C.
- this embodiment improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time; at the same time, behavior logs are recorded on disk and in memory before the operation, for data recovery in case of failure.
- Figure 4 is a schematic diagram of an object storage device based on a distributed database provided by an embodiment of the present application.
- the device is used in a distributed database system, which includes multiple nodes, with disk and memory shared among the nodes.
- disk and memory can be shared among the nodes of the distributed database system through the Gluster file system.
- the device may include:
- the logging unit 401 is configured to write the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory when the current node receives an internal request for index writing, and then trigger the first index-writing unit;
- the first index-writing unit 402 is used to write the index corresponding to the internal request for index writing into the first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user;
- the second index-writing unit 403 is used to write all indexes in the queue to the disk periodically or when the queue is full;
- the index-reading unit 404 is configured to read the index corresponding to the internal request for index reading from the disk and return it when the current node receives an internal request for index reading.
- when the logging unit is used to write the behavior log of the current node's application programming interface (API) to the disk, it can be specifically configured to:
- open a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
- write the behavior log to the target log file in append-write form.
- the second index writing unit can also be used for:
- determine whether an index key conflict occurs during writing; if a key conflict occurs, use a timestamp to judge which data is newer, ensuring that non-latest data does not overwrite the latest data.
- the device may further include:
- the time-consistency check unit is used to perform a time-consistency check on each node at preset intervals.
- the index reading unit can also be used for:
- determine whether a duplicate index (an index with the same key as the index being read) exists in memory; if it exists, delete the duplicate index from memory.
- the device may further include:
- an internal request generation unit, used to obtain the user's operation on a file and convert the operation into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading;
- a task allocation unit, used to select a node in the distributed database system according to a preset load-balancing policy and send the converted internal request to the selected node.
- the load balancing strategy may specifically include:
- a hash value is first computed from the file name of the file, and a node is then selected based on the hash value to achieve load balancing among the nodes.
- this embodiment improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time; at the same time, behavior logs are recorded on disk and in memory before the operation, for data recovery in case of failure.
Abstract
An object storage method and apparatus based on a distributed database, used in a distributed database system in which disk and memory are shared among nodes. When the current node receives an internal request for index writing, the behavior log of the current node's API is written to the disk and to the current node's memory, and the index corresponding to the index write is then written into a FIFO queue in the current node's memory (S101); all indexes in the queue are written to the disk periodically or when the queue is full (S102); when an internal request for index reading is received, the index corresponding to the index read is read from the disk and returned (S103). The index is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time, and the behavior log is recorded in advance. Through the combined and complementary use of memory and disk, this not only improves the write rate and satisfies highly concurrent read/write requirements, but also ensures high availability of the index data through the memory and disk sharing mechanism and the dual recording of the behavior log in memory and on disk.
Description
The present application relates to the technical fields of distributed databases and object-oriented storage, and in particular to an object storage method and apparatus based on a distributed database.
With the rapid development of science and technology, we have entered the era of cloud computing. In this era, a storage technology distinct from traditional technologies such as file storage has emerged: object storage. Object storage, also known as object-oriented storage, is a storage technology suited to unstructured data, and at present it is often the best solution for scenarios involving reads and writes of massive numbers of small files. Object storage applies an append-write mode that aggregates writes to small files, thereby greatly improving read/write IOPS (Input/Output operations Per Second) and bandwidth.
In object storage, the location and size of a specific small file within an aggregate file, that is, the mapping relationship formed between the small file and the aggregate file, is called an index. In the course of implementing the solution of the present application, the inventors found that although object storage technology solves the problem of reading and writing massive numbers of small files, it introduces a new mass of small files (namely, the numerous index records), which in turn brings new read/write problems. In other words, the existence of numerous index records has become a new bottleneck constraining read/write performance.
For index handling, one solution in the prior art is to use an in-memory database such as Redis, exploiting fast memory reads and writes to solve the performance problem of reading and writing index data; however, memory is non-persistent, which brings a series of problems, and memory is also expensive. Another prior-art approach is to use a disk-based database such as MySQL, sacrificing read/write performance to ensure the durability of the index itself; the cost is a loss of performance, with a large share of the disk's IOPS consumed in processing the index.
Summary of the Invention
The present application provides an object storage method and apparatus based on a distributed database, which achieve highly concurrent read/write characteristics while preserving the persistence of the entire system, thereby truly solving the problem of reading and writing massive numbers of small files.
According to a first aspect of the embodiments of the present application, an object storage method based on a distributed database is provided. The method is used in a distributed database system, the distributed database system includes multiple nodes, and disk and memory are shared among the nodes; the method includes:
For the current node:
when an internal request for index writing is received, writing the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory, and then writing the index corresponding to the internal request for index writing into a first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user;
writing all indexes in the queue to the disk periodically or when the queue is full;
when an internal request for index reading is received, reading from the disk the index corresponding to the internal request for index reading and returning it.
Optionally, writing the behavior log of the current node's application programming interface (API) to the disk may specifically include:
opening a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
writing the behavior log to the target log file in append-write form.
Optionally, in the process of writing all indexes in the queue to the disk, the method further includes:
determining whether an index key conflict occurs during writing;
if a key conflict occurs, judging which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
Optionally, the method further includes:
performing a time-consistency check on each node at preset intervals.
Optionally, in the process of reading from the disk the index corresponding to the internal request for index reading, the method further includes:
determining whether a duplicate index exists in memory, where the duplicate index is an index with the same key as the index being read;
if the duplicate index exists in memory, deleting the duplicate index from memory.
Optionally, the method further includes:
For the distributed database system:
obtaining the user's operation on a file;
converting the operation into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading;
selecting a node in the distributed database system according to a preset load-balancing policy;
sending the converted internal request to the selected node.
Optionally, the load-balancing policy includes:
when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy;
when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file, and then selecting a node based on the hash value to achieve load balancing among the nodes.
Optionally, the nodes of the distributed database system share disk and memory through the Gluster file system.
According to a second aspect of the embodiments of the present application, an object storage apparatus based on a distributed database is provided. The apparatus is used in a distributed database system, the distributed database system includes multiple nodes, and disk and memory are shared among the nodes;
the apparatus includes:
a logging unit, configured to write the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory when the current node receives an internal request for index writing, and then trigger a first index-writing unit;
the first index-writing unit, configured to write the index corresponding to the internal request for index writing into a first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user;
a second index-writing unit, configured to write all indexes in the queue to the disk periodically or when the queue is full;
an index-reading unit, configured to read from the disk the index corresponding to the internal request for index reading and return it when the current node receives an internal request for index reading.
Optionally, when the logging unit is used to write the behavior log of the current node's application programming interface (API) to the disk, it is specifically configured to:
open a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
write the behavior log to the target log file in append-write form.
Optionally, the second index-writing unit is further configured to:
determine whether an index key conflict occurs during writing;
if a key conflict occurs, judge which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
Optionally, the apparatus further includes:
a time-consistency check unit, configured to perform a time-consistency check on each node at preset intervals.
Optionally, the index-reading unit is further configured to:
determine whether a duplicate index exists in memory, where the duplicate index is an index with the same key as the index being read;
if the duplicate index exists in memory, delete the duplicate index from memory.
Optionally, the apparatus further includes:
an internal request generation unit, configured to obtain the user's operation on a file and convert the operation into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading;
a task allocation unit, configured to select a node in the distributed database system according to a preset load-balancing policy and send the converted internal request to the selected node.
Optionally, the load-balancing policy includes:
when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy;
when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file, and then selecting a node based on the hash value to achieve load balancing among the nodes.
Optionally, the nodes of the distributed database system share disk and memory through the Gluster file system.
The technical solutions provided by the embodiments of the present application may include the following beneficial effects:
To truly solve the problem of reading and writing massive numbers of small files, the solution of the present application improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time; at the same time, before each operation, a behavior log is recorded on the disk and in memory for data recovery in the event of a failure. In this way, the combined and complementary use of memory and disk, each making up for the other's weaknesses, not only greatly improves the write rate and greatly reduces the disk IOPS consumed on the index, achieving highly concurrent read/write characteristics, but also ensures high availability of the index data through the disk/memory sharing mechanism and the dual recording of the behavior log in memory and on disk, thereby guaranteeing data persistence, preserving the persistence of the entire system, and enhancing the stability of the whole data architecture.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, persons of ordinary skill in the art can derive other drawings from these drawings without creative effort. These introductions do not limit the embodiments; elements with the same reference numerals in the drawings denote similar elements, and unless otherwise stated, the figures are not drawn to scale.
Figure 1 is a schematic flowchart of an object storage method based on a distributed database provided by an embodiment of the present application;
Figure 2 is a schematic diagram of the workflow of a node in an embodiment of the present application;
Figure 3 is another schematic flowchart of an object storage method based on a distributed database provided by an embodiment of the present application;
Figure 4 is a schematic diagram of an object storage apparatus based on a distributed database provided by an embodiment of the present application.
The technical solutions in the embodiments of the present application are described in detail below with reference to the accompanying drawings. Where drawings are involved, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. Obviously, the embodiments described below are only some rather than all of the embodiments of the present application; the implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Where the terms "first", "second", "third", and the like appear in the specification, claims, and drawings of the embodiments of the present application, they are used to distinguish different objects rather than to define a particular order. In the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation; any embodiment or design described as "exemplary" or "for example" should not be construed as preferred or more advantageous than other embodiments or designs. Rather, such words are intended to present the relevant concepts in a concrete manner.
Figure 1 is a schematic flowchart of an object storage method based on a distributed database provided by an embodiment of the present application. The method can be used in a distributed database system; the distributed database system may include multiple nodes, with disk and memory shared among the nodes.
As an example, in implementation, the nodes of the distributed database system may share disk and memory through the Gluster file system.
GlusterFS (Gluster File System) is an open-source distributed file system. It is currently suited mainly to large-file storage scenarios and applies no additional optimizations for small files, so for small files, and especially massive numbers of them, GlusterFS's storage efficiency and access performance are both poor. The solution in the embodiments of the present application can remedy this deficiency of GlusterFS.
The method can be used in any node of the distributed database system. As shown in Figure 1, the method may include the following steps:
For the current node:
In step S101, when an internal request for index writing is received, the behavior log (log) of the current node's application programming interface (API) is written to the disk and to the current node's memory, and the index corresponding to the internal request for index writing is then written into a first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user.
In object storage, a mapping relationship forms between a small file and a large file (that is, an aggregate file); this mapping relationship is the index. The index may include a key, the key being the file name of the file operated on by the user (that is, of a particular small file). In addition, the index may include the location of the small file within the aggregate file, the size of the small file, and so on.
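To make the index structure concrete, the following is a minimal sketch of such an index record in Python. The field names (key, aggregate_file, offset, length, timestamp) are illustrative assumptions, not names taken from the present application; the timestamp field anticipates the key-conflict handling described later.

```python
from dataclasses import dataclass

@dataclass
class IndexRecord:
    """Mapping from one small file to its location inside an aggregate file."""
    key: str             # file name of the small file operated on by the user
    aggregate_file: str  # name of the aggregate file containing the small file
    offset: int          # location (byte offset) of the small file in the aggregate file
    length: int          # size of the small file in bytes
    timestamp: float     # operation time, used later to resolve key conflicts
```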
A user's operation on a small file is converted into an internal request within the distributed database system. For example, if the user modifies a small file, the index of the small file will usually change accordingly, and the system will internally generate an index-write request.
Before writing the index, some work oriented toward high availability can be done first. Disk and memory are shared among the nodes (for example, using the Gluster file system), which keeps data synchronized across different nodes and thereby preserves data persistence when a node fails. In this step, the API behavior log is recorded first, for data recovery in the event of a failure.
As an example, in this embodiment or certain other embodiments of the present application, writing the behavior log of the current node's application programming interface (API) to the disk may specifically include:
opening a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
writing the behavior log to the target log file in append-write form.
The current node writes the API behavior log directly to the shared disk to ensure data recovery in the event of a node failure. To avoid contention between different nodes and any impact on read/write performance, the log file can be given a distinctive name, for example saved with the node number as a suffix, or with the node number plus a random hash, to ensure mutual exclusion between nodes.
Meanwhile, to avoid the bottleneck of the write operation itself, the log file can be opened directly and written in append-write form, improving write IOPS. Through the back-end disk/memory sharing mechanism, it can be ensured that when a node fails, its data remains visible to the other nodes.
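The following is a minimal sketch of this logging behavior in Python, assuming a POSIX environment and an assumed mount point for the shared volume; the path layout and function name are illustrative. The node-numbered file name provides the mutual exclusion described above, and the O_APPEND flag makes every write an append.

```python
import json
import os
import time

SHARED_MOUNT = "/mnt/gluster"  # assumed mount point of the shared volume

def append_behavior_log(node_id: int, api_name: str, payload: dict) -> None:
    # The node number as a file-name suffix keeps log files mutually exclusive.
    path = os.path.join(SHARED_MOUNT, f"behavior_{node_id}.log")
    record = json.dumps({"ts": time.time(), "api": api_name, "args": payload})
    # O_APPEND turns every write into an append, avoiding seeks and improving write IOPS.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, (record + "\n").encode())
    finally:
        os.close(fd)
```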
In addition, to doubly ensure the high availability of this behavior log so that it can be restored during disaster recovery, the shared-memory mechanism can be used to write the API behavior log into memory as well, so that when a node fails unexpectedly, that portion of memory data can be accessed through the other nodes.
As an example, a ring structure can be used to maintain primary and backup memory data across multiple nodes: for example, node 2 holds the backup data of node 1, node 3 holds the backup data of node 2, and so on. Although the behavior log also exists on disk, recovery is more efficient when the shared-memory data of other nodes is available. Moreover, the two copies of the data, on disk and in memory, can mutually corroborate the credibility of the backup data.
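A small sketch of the ring arrangement, assuming nodes are numbered consecutively from 1; the function name is hypothetical.

```python
def backup_holder(node_id: int, num_nodes: int) -> int:
    # Each node's in-memory behavior log is mirrored on its successor in the ring:
    # node 1's backup lives on node 2, node 2's on node 3, and the last wraps around.
    return node_id % num_nodes + 1  # nodes numbered 1..num_nodes

# Example: with 3 nodes, node 1 -> 2, node 2 -> 3, node 3 -> 1.
```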
In step S102, all indexes in the queue are written to the disk periodically or when the queue is full.
As an example, see Figure 2, which is a schematic diagram of the workflow of a node in an embodiment of the present application. Each node maintains a FIFO queue in memory, and the queue is set up with a flush-on-full mechanism and a periodic-flush mechanism. Outside these two occasions (periodic and full), index data is not written directly into the disk-based database, which greatly improves the write rate; at the same time, because of the disk/memory sharing mechanism and the API behavior log, data persistence can be guaranteed, supporting the stability of the whole data architecture.
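A minimal sketch of this buffering behavior in Python, assuming a flush_to_disk callback that performs the batched database write (one possible callback is sketched below); the class and parameter names are illustrative.

```python
import threading
from collections import deque

class IndexBuffer:
    def __init__(self, capacity: int, flush_interval: float, flush_to_disk):
        self.queue = deque()           # FIFO queue of pending IndexRecord objects
        self.capacity = capacity
        self.flush_to_disk = flush_to_disk
        self.lock = threading.Lock()
        # Periodic flush: re-arm a timer every flush_interval seconds.
        self.timer = threading.Timer(flush_interval, self._periodic, [flush_interval])
        self.timer.daemon = True
        self.timer.start()

    def put(self, record) -> None:
        with self.lock:
            self.queue.append(record)
            full = len(self.queue) >= self.capacity
        if full:                       # flush-on-full
            self.flush()

    def flush(self) -> None:
        with self.lock:
            batch, self.queue = list(self.queue), deque()
        if batch:
            self.flush_to_disk(batch)  # one batched write instead of many small ones

    def _periodic(self, interval: float) -> None:
        self.flush()
        self.timer = threading.Timer(interval, self._periodic, [interval])
        self.timer.daemon = True
        self.timer.start()
```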
In addition, because of the load-balancing mechanism across nodes, different nodes may hold different versions of the same data, producing contention conflicts at flush time. A contention conflict is a key conflict: the key (the keyword in the index) is the file name of the small file operated on by the user, and when one node is preparing to write an index for a key, another node may also be about to write an index for the same key, for example when a user performs multiple operations and the load-balancing mechanism distributes each operation to a different node.
This contention can be avoided through the operation-time attribute of the database: by adding timestamps and using conditional SQL statements at flush time, it can be guaranteed that non-latest data does not overwrite the correct data and is instead correctly discarded.
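One way to express such a conditional flush, sketched as a MySQL-style upsert driven from Python; the table name and columns (k, agg_file, offset, length, ts) are an assumed schema rather than one given in the present application. This function can serve as the flush_to_disk callback assumed in the queue sketch above.

```python
# On a key conflict, each column is kept only if the incoming row is newer,
# so non-latest data is discarded instead of overwriting the latest data.
# Note: ts is assigned last, so the earlier comparisons still see the old value.
UPSERT_SQL = """
INSERT INTO small_file_index (k, agg_file, offset, length, ts)
VALUES (%s, %s, %s, %s, %s)
ON DUPLICATE KEY UPDATE
    agg_file = IF(VALUES(ts) > ts, VALUES(agg_file), agg_file),
    offset   = IF(VALUES(ts) > ts, VALUES(offset),   offset),
    length   = IF(VALUES(ts) > ts, VALUES(length),   length),
    ts       = IF(VALUES(ts) > ts, VALUES(ts),       ts)
"""

def flush_to_disk_with(cursor, batch):
    # Batched conditional write used by the FIFO queue sketch above.
    cursor.executemany(UPSERT_SQL, [
        (r.key, r.aggregate_file, r.offset, r.length, r.timestamp) for r in batch
    ])
```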
Therefore, in this embodiment or certain other embodiments of the present application, in the process of writing all indexes in the queue to the disk, the method may further include:
determining whether an index key conflict occurs during writing;
if a key conflict occurs, judging which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
In addition, to ensure timestamp consistency when different nodes arbitrate between old and new data, a heartbeat script can be added to perform periodic time-consistency checks. Meanwhile, when the flush SQL is written, a conditional double confirmation can be performed to avoid occasional conflicts within the script's synchronization interval: for example, if the synchronization period is 5 seconds, a SQL write whose time difference is greater than 5 seconds can be written directly, whereas otherwise the consistency of the current time needs to be confirmed a second time.
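A sketch of that double confirmation, using the 5-second period from the example above; confirm_time_consistency stands in for a hypothetical helper that re-checks clock agreement between the nodes involved.

```python
SYNC_PERIOD = 5.0  # seconds between heartbeat time-consistency checks

def timestamp_comparison_is_safe(new_ts: float, existing_ts: float) -> bool:
    # Timestamps further apart than one sync period cannot be confused by any
    # clock skew the heartbeat has not yet corrected, so write directly.
    if abs(new_ts - existing_ts) > SYNC_PERIOD:
        return True
    # Otherwise, double-confirm that the nodes' current times agree.
    return confirm_time_consistency()  # hypothetical helper, not a real API
```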
Therefore, in this embodiment or certain other embodiments of the present application, the method may further include:
performing a time-consistency check on each node at preset intervals.
In step S103, when an internal request for index reading is received, the index corresponding to the internal request for index reading is read from the disk and returned.
The result is returned immediately after the read, but through asynchronous execution a certain improvement can be made in memory: if the same file name (key) exists in memory, that in-memory entry can be deleted directly, reducing write overhead. This is because the data read from the disk is the latest, while the data in memory is either as new as the data read or older than it, and can therefore be deleted.
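A sketch of this read path, reusing the IndexBuffer and the assumed table schema from the sketches above; the names remain illustrative.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

cleanup_pool = ThreadPoolExecutor(max_workers=1)

def read_index(cursor, buffer, key: str):
    cursor.execute(
        "SELECT agg_file, offset, length, ts FROM small_file_index WHERE k = %s",
        (key,),
    )
    row = cursor.fetchone()
    # Return immediately; drop any in-memory duplicate of this key asynchronously,
    # since the in-memory copy is at best as new as what was just read from disk.
    cleanup_pool.submit(drop_duplicate, buffer, key)
    return row

def drop_duplicate(buffer, key: str) -> None:
    with buffer.lock:
        buffer.queue = deque(r for r in buffer.queue if r.key != key)
```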
Hence, in this embodiment or certain other embodiments of the present application, in the process of reading from the disk the index corresponding to the internal request for index reading, the method may further include:
determining whether a duplicate index exists in memory, where the duplicate index is an index with the same key as the index being read;
if the duplicate index exists in memory, deleting the duplicate index from memory.
In addition, referring to Figure 3, in this embodiment or certain other embodiments of the present application, the method may further include:
For the distributed database system:
In step S301, the user's operation on a file is obtained.
The files here are small files. A user's operation on a small file is converted into an internal request within the distributed database system; for example, if the user modifies a small file, the index of the small file will usually change accordingly, and the system will internally generate an index-write request.
In step S302, the operation is converted into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading.
In step S303, a node is selected in the distributed database system according to a preset load-balancing policy.
This embodiment does not limit the specific load-balancing policy; those skilled in the art can choose and design one according to different needs and different scenarios, and the choices and designs usable here do not depart from the spirit and protection scope of the present application.
For example, the load-balancing policy may specifically include:
when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy;
when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file, and then selecting a node based on the hash value to achieve load balancing among the nodes.
In step S304, the converted internal request is sent to the selected node. The selected node is the current node of step S101.
Specifically, when the load is light, requests can be distributed to the nodes evenly. When the load is heavy, the hash value of the file name in the request is computed and requests are allocated among nodes by that hash value. Hash-based allocation may weaken load balancing to some extent, but it reduces data contention in the background and implicitly improves concurrency performance.
Furthermore, what counts as light or heavy load may differ across user scenarios. For example, when the multiple files operated on by a user have similar file names, they may all be hashed to the same node, making the load unbalanced instead; in that case, the load-balancing rules can be adjusted to fit the situation.
As a simple example, suppose that hashing a file name yields a number from 1 to 15, and that 1 to 5 are assigned to node A, 6 to 10 to node B, and 11 to 15 to node C to achieve balance. In practice it may turn out that the numbers that appear are all in 1 to 10; the policy can then be adjusted, for example assigning 1 to 3 to node A, 4 to 7 to node B, and 8 to 10 to node C.
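A sketch of such an adjustable range-based assignment; the bucket count of 15 and both range tables mirror the example above, and the choice of hash function is an illustrative assumption.

```python
import hashlib

def bucket(file_name: str, buckets: int = 15) -> int:
    # Stable hash of the file name mapped into 1..buckets.
    digest = hashlib.md5(file_name.encode()).digest()
    return int.from_bytes(digest[:4], "big") % buckets + 1

# Initial ranges: 1-5 -> node A, 6-10 -> node B, 11-15 -> node C.
RANGES = [(5, "A"), (10, "B"), (15, "C")]
# If the observed buckets cluster in 1-10, the table can be rebalanced, e.g.:
# RANGES = [(3, "A"), (7, "B"), (10, "C")]

def select_node(file_name: str) -> str:
    b = bucket(file_name)
    for upper, node in RANGES:
        if b <= upper:
            return node
    return RANGES[-1][1]  # fall back to the last node if the ranges were narrowed
```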
To truly solve the problem of reading and writing massive numbers of small files, the solution of this embodiment improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time; at the same time, before each operation, a behavior log is recorded on the disk and in memory for data recovery in the event of a failure. In this way, the combined and complementary use of memory and disk not only greatly improves the write rate and greatly reduces the disk IOPS consumed on the index, achieving highly concurrent read/write characteristics, but also ensures high availability of the index data through the disk/memory sharing mechanism and the dual recording of the behavior log in memory and on disk, thereby guaranteeing data persistence, preserving the persistence of the entire system, and enhancing the stability of the whole data architecture.
The following are apparatus embodiments of the present application, which can be used to perform the method embodiments of the present application. For details not disclosed in the apparatus embodiments of the present application, please refer to the method embodiments of the present application.
Figure 4 is a schematic diagram of an object storage apparatus based on a distributed database provided by an embodiment of the present application. The apparatus is used in a distributed database system, the distributed database system includes multiple nodes, and disk and memory are shared among the nodes.
As an example, the nodes of the distributed database system may share disk and memory through the Gluster file system.
Referring to Figure 4, the apparatus may include:
a logging unit 401, configured to write the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory when the current node receives an internal request for index writing, and then trigger the first index-writing unit;
the first index-writing unit 402, configured to write the index corresponding to the internal request for index writing into a first-in-first-out (FIFO) queue in the current node's memory, where the index includes a key, the key being the file name of the file operated on by the user;
a second index-writing unit 403, configured to write all indexes in the queue to the disk periodically or when the queue is full;
an index-reading unit 404, configured to read from the disk the index corresponding to the internal request for index reading and return it when the current node receives an internal request for index reading.
In this embodiment or certain other embodiments of the present application, when the logging unit is used to write the behavior log of the current node's application programming interface (API) to the disk, it may be specifically configured to:
open a target log file, where the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive;
write the behavior log to the target log file in append-write form.
In this embodiment or certain other embodiments of the present application, the second index-writing unit may be further configured to:
determine whether an index key conflict occurs during writing;
if a key conflict occurs, judge which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
In this embodiment or certain other embodiments of the present application, the apparatus may further include:
a time-consistency check unit, configured to perform a time-consistency check on each node at preset intervals.
In this embodiment or certain other embodiments of the present application, the index-reading unit may be further configured to:
determine whether a duplicate index exists in memory, where the duplicate index is an index with the same key as the index being read;
if the duplicate index exists in memory, delete the duplicate index from memory.
In this embodiment or certain other embodiments of the present application, the apparatus may further include:
an internal request generation unit, configured to obtain the user's operation on a file and convert the operation into an internal request of the distributed database system, where internal requests are divided into internal requests for index writing and internal requests for index reading;
a task allocation unit, configured to select a node in the distributed database system according to a preset load-balancing policy and send the converted internal request to the selected node.
In this embodiment or certain other embodiments of the present application, the load-balancing policy may specifically include:
when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy;
when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file, and then selecting a node based on the hash value to achieve load balancing among the nodes.
As for the apparatus in the above embodiments, the specific manner in which each unit or module performs its operations has been described in detail in the embodiments of the related method and will not be repeated here. In the present application, the names of the above units or modules do not limit the units or modules themselves; in actual implementations these units or modules may appear under other names, and as long as the functions of each unit or module are similar to those in the present application, they fall within the scope of the claims of the present application and their equivalents.
To truly solve the problem of reading and writing massive numbers of small files, the solution of this embodiment improves on the traditional disk-based database by introducing an in-memory index mechanism, so that index data is not written directly into the disk-based database but is first written to memory, with memory serving as a buffer pool, and then written to disk at an appropriate time; at the same time, before each operation, a behavior log is recorded on the disk and in memory for data recovery in the event of a failure. In this way, the combined and complementary use of memory and disk not only greatly improves the write rate and greatly reduces the disk IOPS consumed on the index, achieving highly concurrent read/write characteristics, but also ensures high availability of the index data through the disk/memory sharing mechanism and the dual recording of the behavior log in memory and on disk, thereby guaranteeing data persistence, preserving the persistence of the entire system, and enhancing the stability of the whole data architecture.
The above are merely preferred embodiments of the present application and do not limit the present application in any form. Although the present application has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person familiar with the art may, without departing from the scope of the technical solution of the present application, use the technical content disclosed above to make minor changes or modifications amounting to equivalent embodiments; any simple modification, equivalent substitution, or improvement made to the above embodiments in accordance with the technical essence of the present application, within the spirit and principles of the technical solution of the present application, still falls within the protection scope of the technical solution of the present application.
Other embodiments of the present application will readily occur to those skilled in the art from consideration of the specification and practice of the solutions disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present application that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present application indicated by the appended claims.
It should be understood that the present application is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.
Claims (16)
- An object storage method based on a distributed database, characterized in that the method is used in a distributed database system, the distributed database system comprises multiple nodes, and disk and memory are shared among the nodes; the method comprises: for the current node: when an internal request for index writing is received, writing the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory, and then writing the index corresponding to the internal request for index writing into a first-in-first-out (FIFO) queue in the current node's memory, wherein the index includes a key, the key being the file name of the file operated on by the user; writing all indexes in the queue to the disk periodically or when the queue is full; and, when an internal request for index reading is received, reading from the disk the index corresponding to the internal request for index reading and returning it.
- The method according to claim 1, characterized in that writing the behavior log of the current node's application programming interface (API) to the disk comprises: opening a target log file, wherein the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive; and writing the behavior log to the target log file in append-write form.
- The method according to claim 1, characterized in that, in the process of writing all indexes in the queue to the disk, the method further comprises: determining whether an index key conflict occurs during writing; and, if a key conflict occurs, judging which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
- The method according to claim 3, characterized in that the method further comprises: performing a time-consistency check on each node at preset intervals.
- The method according to claim 1, characterized in that, in the process of reading from the disk the index corresponding to the internal request for index reading, the method further comprises: determining whether a duplicate index exists in memory, wherein the duplicate index is an index with the same key as the index being read; and, if the duplicate index exists in memory, deleting the duplicate index from memory.
- The method according to claim 1, characterized in that the method further comprises: for the distributed database system: obtaining the user's operation on a file; converting the operation into an internal request of the distributed database system, wherein internal requests are divided into internal requests for index writing and internal requests for index reading; selecting a node in the distributed database system according to a preset load-balancing policy; and sending the converted internal request to the selected node.
- The method according to claim 6, characterized in that the load-balancing policy comprises: when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy; and, when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file and then selecting a node based on the hash value to achieve load balancing among the nodes.
- The method according to claim 1, characterized in that the nodes of the distributed database system share disk and memory through the Gluster file system.
- An object storage apparatus based on a distributed database, characterized in that the apparatus is used in a distributed database system, the distributed database system comprises multiple nodes, and disk and memory are shared among the nodes; the apparatus comprises: a logging unit, configured to write the behavior log of the current node's application programming interface (API) to the disk and to the current node's memory when the current node receives an internal request for index writing, and then trigger a first index-writing unit; the first index-writing unit, configured to write the index corresponding to the internal request for index writing into a first-in-first-out (FIFO) queue in the current node's memory, wherein the index includes a key, the key being the file name of the file operated on by the user; a second index-writing unit, configured to write all indexes in the queue to the disk periodically or when the queue is full; and an index-reading unit, configured to read from the disk the index corresponding to the internal request for index reading and return it when the current node receives an internal request for index reading.
- The apparatus according to claim 9, characterized in that, when the logging unit is used to write the behavior log of the current node's application programming interface (API) to the disk, it is specifically configured to: open a target log file, wherein the target log file is a log file on the disk corresponding to the current node, and the file names of log files corresponding to different nodes are mutually exclusive; and write the behavior log to the target log file in append-write form.
- The apparatus according to claim 9, characterized in that the second index-writing unit is further configured to: determine whether an index key conflict occurs during writing; and, if a key conflict occurs, judge which data is newer by timestamp to ensure that non-latest data does not overwrite the latest data.
- The apparatus according to claim 11, characterized in that the apparatus further comprises: a time-consistency check unit, configured to perform a time-consistency check on each node at preset intervals.
- The apparatus according to claim 9, characterized in that the index-reading unit is further configured to: determine whether a duplicate index exists in memory, wherein the duplicate index is an index with the same key as the index being read; and, if the duplicate index exists in memory, delete the duplicate index from memory.
- The apparatus according to claim 9, characterized in that the apparatus further comprises: an internal request generation unit, configured to obtain the user's operation on a file and convert the operation into an internal request of the distributed database system, wherein internal requests are divided into internal requests for index writing and internal requests for index reading; and a task allocation unit, configured to select a node in the distributed database system according to a preset load-balancing policy and send the converted internal request to the selected node.
- The apparatus according to claim 14, characterized in that the load-balancing policy comprises: when the load of the distributed database system is not higher than a preset threshold, selecting a node for the internal request according to an even-distribution policy; and, when the load of the distributed database system is higher than the preset threshold, first computing a hash value from the file name of the file and then selecting a node based on the hash value to achieve load balancing among the nodes.
- The apparatus according to claim 9, characterized in that the nodes of the distributed database system share disk and memory through the Gluster file system.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210391992.X | 2022-04-14 | ||
CN202210391992.XA CN114741449A (zh) | 2022-04-14 | 2022-04-14 | Object storage method and apparatus based on a distributed database
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023197404A1 true WO2023197404A1 (zh) | 2023-10-19 |
Family
ID=82280812
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/094380 WO2023197404A1 (zh) | 2022-04-14 | 2022-05-23 | Object storage method and apparatus based on a distributed database
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114741449A (zh) |
WO (1) | WO2023197404A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115543214B (zh) * | 2022-11-25 | 2023-03-28 | 深圳华锐分布式技术股份有限公司 | Data storage method, apparatus, device, and medium for low-latency scenarios |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020029214A1 (en) * | 2000-08-10 | 2002-03-07 | Nec Corporation | Synchronizable transactional database method and system |
US20040107381A1 (en) * | 2002-07-12 | 2004-06-03 | American Management Systems, Incorporated | High performance transaction storage and retrieval system for commodity computing environments |
CN103577339A (zh) * | 2012-07-27 | 2014-02-12 | 深圳市腾讯计算机系统有限公司 | Data storage method and system |
CN104133867A (zh) * | 2014-07-18 | 2014-11-05 | 中国科学院计算技术研究所 | Intra-tablet secondary index method and system for distributed ordered tables |
CN104731921A (zh) * | 2015-03-26 | 2015-06-24 | 江苏物联网研究发展中心 | Storage and processing method of the Hadoop distributed file system for log-type small files |
CN111046044A (zh) * | 2019-12-13 | 2020-04-21 | 南京富士通南大软件技术有限公司 | High-reliability architecture for a distributed object storage system based on an in-memory database |
CN113961153A (zh) * | 2021-12-21 | 2022-01-21 | 杭州趣链科技有限公司 | Method, apparatus, and terminal device for writing index data to disk |
- 2022-04-14 CN CN202210391992.XA patent/CN114741449A/zh active Pending
- 2022-05-23 WO PCT/CN2022/094380 patent/WO2023197404A1/zh unknown
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117596211A (zh) * | 2024-01-18 | 2024-02-23 | 湖北省楚天云有限公司 | IP fragment multi-core load balancing apparatus and method |
CN117596211B (zh) * | 2024-01-18 | 2024-04-05 | 湖北省楚天云有限公司 | IP fragment multi-core load balancing apparatus and method |
Also Published As
Publication number | Publication date |
---|---|
CN114741449A (zh) | 2022-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11153380B2 (en) | Continuous backup of data in a distributed data store | |
US10642840B1 (en) | Filtered hash table generation for performing hash joins | |
US10423493B1 (en) | Scalable log-based continuous data protection for distributed databases | |
US10853182B1 (en) | Scalable log-based secondary indexes for non-relational databases | |
US11720594B2 (en) | Synchronous replication in a distributed storage environment | |
US9946735B2 (en) | Index structure navigation using page versions for read-only nodes | |
CN105393243B (zh) | 事务定序 | |
US11841844B2 (en) | Index update pipeline | |
WO2023197404A1 (zh) | 一种基于分布式数据库的对象存储方法及装置 | |
US8868487B2 (en) | Event processing in a flash memory-based object store | |
US8700842B2 (en) | Minimizing write operations to a flash memory-based object store | |
US20180218023A1 (en) | Database concurrency control through hash-bucket latching | |
US20120158650A1 (en) | Distributed data cache database architecture | |
US20070288526A1 (en) | Method and apparatus for processing a database replica | |
JP7549137B2 (ja) | トランザクション処理方法、システム、装置、機器、及びプログラム | |
WO2023165196A1 (zh) | 一种日志存储加速方法、装置、电子设备及非易失性可读存储介质 | |
WO2021057108A1 (zh) | 一种读数据方法、写数据方法及服务器 | |
JPWO2011108695A1 (ja) | 並列データ処理システム、並列データ処理方法及びプログラム | |
WO2023077971A1 (zh) | 事务处理方法、装置、计算设备及存储介质 | |
US20240028598A1 (en) | Transaction Processing Method, Distributed Database System, Cluster, and Medium | |
WO2024131379A1 (zh) | 一种数据存储方法、装置及系统 | |
WO2019109256A1 (zh) | 一种日志管理方法、服务器和数据库系统 | |
WO2020119709A1 (zh) | 数据合并的实现方法、装置、系统及存储介质 | |
US11442663B2 (en) | Managing configuration data | |
US10922012B1 (en) | Fair data scrubbing in a data storage system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22937045; Country of ref document: EP; Kind code of ref document: A1 |