WO2024021488A1 - Metadata storage method and apparatus based on distributed key-value database - Google Patents

Metadata storage method and apparatus based on distributed key-value database

Info

Publication number
WO2024021488A1
Authority
WO
WIPO (PCT)
Prior art keywords
metadata
storage
preset
value
hash
Prior art date
Application number
PCT/CN2022/141807
Other languages
French (fr)
Chinese (zh)
Inventor
胡爱存
侯飞
梁成武
陈玉鹏
张翼
Original Assignee
天翼云科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天翼云科技有限公司
Publication of WO2024021488A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2219 Large Object storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24573 Query processing with adaptation to user needs using data annotations, e.g. user-defined metadata

Definitions

  • the invention relates to the field of data processing, and in particular to a metadata storage method and device based on a distributed key-value database.
  • the distributed object storage system maintains an index table for each bucket, which stores the mapping relationship between the bucket and the metadata of all objects in the bucket.
  • the existing technology adopts a dynamic sharding mechanism by creating multiple index tables.
  • the object metadata in the bucket will be rebalanced. If the bucket stores a large number of objects, the data rebalancing operation during sharding is very time-consuming, blocks front-end writes, and gives users a poor experience.
  • embodiments of the present invention provide a metadata storage method and device based on a distributed key-value database, to solve the problems of limited single-bucket object storage scale and service blocking caused by the sharding mechanism in a distributed object storage system.
  • an embodiment of the present invention provides a metadata storage method based on a distributed key-value database.
  • the method is applied to a server, and the server is installed with a distributed object storage system.
  • the method includes:
  • the metadata is saved into the corresponding hash table and ordered list; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, where the hash table is used to store the metadata and the ordered list is used to store retrieval information of the metadata.
  • the preset types include first, second, third and fourth types: the first type of metadata is basic metadata, the second type is object attribute metadata, the third type is index metadata, and the fourth type is index sequence metadata.
  • the step of saving the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata specifically includes:
  • the metadata is stored in a container group corresponding to the preset type; each preset type of metadata corresponds to one container group, each container group corresponds to at least one hash slot, and every container group corresponds to the same number of hash slots;
  • assign a corresponding score value to the metadata, determine the ordered list corresponding to the metadata based on the assigned score value, and store the metadata in the corresponding ordered list;
  • the metadata of each item in the ordered list is sorted in order according to the score value.
  • the number of hash tables is determined based on a preset number and the total number of hash slots: the number of hash tables exceeds the preset number and is a factor of the total number.
  • the method further includes the following steps:
  • the metadata retrieval request includes retrieval information, and the retrieval information includes the index order and number of index elements of each of the ordered lists;
  • the metadata is stored in a preset map table, the metadata stored in the preset map table is sorted sequentially based on the preset map table, and the preset map table that has completed the storage and sorting of the metadata is returned to the client.
  • the method further includes the following steps:
  • the metadata retrieval request includes retrieval information, and the retrieval information includes reference metadata and the number of index elements;
  • the metadata is stored in a preset map table, the metadata stored in the preset map table is sorted sequentially based on the preset map table, and the preset map table that has completed the storage and sorting of the metadata is returned to the client.
  • the step of storing the metadata in a preset map table, sequentially sorting the metadata stored in the preset map table, and returning to the client the preset map table that has completed the storage and sorting of the metadata also includes:
  • an embodiment of the present invention also provides a metadata storage device based on a distributed key-value database; the device is applied to a server on which a distributed object storage system is installed, and the device includes:
  • the first determination module is used to determine the metadata of the objects stored in the bucket, and determine the preset type and key-value pair of the metadata;
  • a data storage module configured to save the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, where the hash table is used to store the metadata and the ordered list is used to store retrieval information of the metadata.
  • an embodiment of the present invention also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
  • when the processor executes the program, the steps of the metadata storage method based on a distributed key-value database described above are implemented.
  • embodiments of the present invention also provide a non-transitory computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the steps of the metadata storage method based on a distributed key-value database described above are implemented.
  • embodiments of the present invention further provide a computer program product, including a computer program that, when executed by a processor, implements the steps of any one of the above-mentioned metadata storage methods based on a distributed key-value database.
  • the metadata storage method and device based on the distributed key-value database provided by the present invention store the metadata of the object in the form of key-value pairs in the distributed key-value cluster and, by splitting tables, use multiple hash tables and multiple ordered lists to carry the metadata of the objects in a bucket.
  • the storage of metadata through hash tables reduces the complexity of I/O operations and increases the storage scale of a single bucket while ensuring efficiency.
  • the storage of metadata in ordered lists provides an interface for adding, deleting, modifying, and querying object metadata. Without increasing storage costs, it achieves efficient metadata retrieval with low space usage across multiple ordered lists.
  • Figure 1 shows a schematic flow chart of a metadata storage method based on a distributed key-value database provided by the present invention
  • Figure 2 shows a schematic flow chart of step S20 in the metadata storage method based on a distributed key-value database provided by the present invention
  • Figure 3 shows a schematic structural diagram of a metadata storage device based on a distributed key-value database provided by the present invention
  • Figure 4 shows a schematic structural diagram of the electronic device provided by the present invention.
  • Object metadata in a distributed object storage system is stored in two parts. One part is the index metadata of the object, called omap (object map), which is an object used to save key-value pair map data.
  • omap objects play a very important role.
  • the performance of omap directly affects the cluster.
  • omap is stored in an independent key-value storage system outside the local file system: LevelDB when using FileStore, and RocksDB when using BlueStore. The other part is the extended attributes of the object, called xattr, which are usually used to save the version information of the object.
  • xattr is stored in the RADOS (Reliable Autonomic Distributed Object Store) object of the bucket.
  • the RADOS object is stored in the local file system, and its size is limited by the file system, which limits the number of objects it can carry. As a result, reading the metadata of an object requires reading related data over two I/O paths, corresponding to the local file system and the key-value storage system respectively.
  • the distributed object storage system maintains an index table for each bucket, which stores the mapping relationship between the bucket and the metadata of all objects in the bucket.
  • the object metadata in the bucket will be rebalanced, and the index data in the old RADOS object will be recalculated, organized and migrated to the new RADOS object. Then, if a large number of objects are stored in the bucket, the data rebalancing operation during sharding will be very time-consuming, blocking front-end writes, and giving users a very unfriendly experience.
  • the metadata storage method based on the distributed key-value database of the present invention is described below with reference to Figure 1.
  • This method is designed to solve the problems caused by the limited single-bucket object storage scale and the sharding mechanism in native distributed object storage.
  • this method is applied to the server side.
  • the server side is installed with a distributed object storage system. The method includes:
  • S10 Determine the metadata of the objects stored in the bucket, and determine the preset type (type) and key-value pairs of the metadata. Specifically, determine the metadata stored in each bucket on the server, as well as the preset type of metadata and key-value pairs.
  • the preset types include first, second, third and fourth types.
  • the first type of metadata is basic metadata (object metadata), which saves the basic information of the latest version of the object, including size, instance and last version
  • the second type of metadata is object attribute metadata (xattrs), which is the attribute information of the object itself
  • the third type of metadata is index metadata (omap), which stores the bucket's object index information, that is, the omap information
  • the fourth type of metadata is index sequence metadata (omap order), which stores the sequence list of all omaps of the objects in the bucket.
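The four preset types and the key-value form of a metadata item can be pictured with a small data model. This is only an illustrative sketch: the names `MetadataType` and `MetadataRecord`, the field names, and the example key are hypothetical and do not appear in the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataType(Enum):
    """The four preset metadata types named in step S10."""
    BASIC = "object metadata"    # size, instance, last version of the latest object version
    ATTRIBUTES = "xattrs"        # attribute information of the object itself
    INDEX = "omap"               # bucket object index information
    INDEX_ORDER = "omap order"   # sequence list of all omaps in the bucket


@dataclass(frozen=True)
class MetadataRecord:
    """One metadata item: its preset type plus a key-value pair (hypothetical layout)."""
    kind: MetadataType
    key: str
    value: bytes


record = MetadataRecord(MetadataType.INDEX, "bucket1/photo.jpg", b"index-entry")
```

Keeping the type alongside the key-value pair is what lets a later step pick the container group for the record.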
  • each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list.
  • the hash table is used to store metadata
  • the ordered list is used to store retrieval information of the metadata.
  • a single-bucket multi-table metadata storage model is constructed based on the data structure of distributed key-value data.
  • the native distributed object storage system only includes object gateway and back-end data storage.
  • the present invention adds a proprietary distributed key-value module to store object metadata, so that object data and metadata are stored separately.
  • this application reorganizes the object's metadata, removes redundant data types, and reclassifies the metadata into four types. Because the dozens of metadata types in the native distributed object storage system are not all operated on at once during object operations, the granularity of object metadata can be reduced, making metadata reads and writes more flexible and efficient.
  • this method uses at least one hash table and at least one ordered list for storage and, when storing metadata, also ensures that the metadata is evenly distributed inside the container group (container). How uniform distribution within the container group is achieved is explained below.
  • the metadata of the object is stored in the distributed key-value cluster in the form of key-value pairs.
  • This method splits tables and uses multiple hash tables to carry the metadata of the objects in a bucket.
  • Each hash table can store up to 4.2 billion key-value pairs.
  • Multiple hash tables can easily support the metadata of tens of billions of objects by uniformly storing the metadata of objects in distributed key-value hash tables.
  • the time complexity of read and write operations is O(1), which reduces the complexity of I/O operations and increases the scale of single-bucket storage while ensuring efficiency. There is no need to load RADOS object data from the local file system.
  • the metadata storage method based on a distributed key-value database stores the object's metadata in the form of key-value pairs in a distributed key-value cluster and, by splitting tables, uses multiple hash tables and multiple ordered lists to carry the metadata of the objects in a bucket.
  • the storage of metadata through hash tables reduces the complexity of I/O operations and increases the storage scale of a single bucket while ensuring efficiency.
  • The ordered lists store metadata and provide an interface for adding, deleting, modifying, and querying object metadata. Without increasing storage costs, this achieves efficient metadata retrieval with low space usage across multiple ordered lists.
  • Step S20 specifically includes:
  • each preset type of metadata corresponds to a container group, each container group corresponds to at least one hash slot, and the number of hash slots is the same for every container group.
  • each hash table corresponds to at least one hash slot.
  • each hash table needs to be allocated a slot partition in advance.
  • the total number of slots in the entire container group is 16384.
  • the hash slot corresponding to the metadata can be determined in the following way:
  • the HASH_SLOT algorithm can be used to map the metadata of the bucket object to the corresponding hash table, so that the metadata can be evenly distributed in the corresponding container group.
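A slot mapping of this kind can be sketched as follows. The text names the HASH_SLOT algorithm and a total of 16384 slots but does not reproduce the formula, so this sketch assumes the Redis Cluster convention of CRC16 (XMODEM variant) of the key modulo 16384. The function names, the example key, and the choice of giving each hash table a contiguous, equally sized slot range are all assumptions made for illustration.

```python
import binascii

TOTAL_SLOTS = 16384  # total number of hash slots stated in the text


def hash_slot(key: bytes) -> int:
    """Map a metadata key to a hash slot (assumed: CRC16/XMODEM mod 16384)."""
    # binascii.crc_hqx with an initial value of 0 computes CRC-CCITT (XMODEM).
    return binascii.crc_hqx(key, 0) % TOTAL_SLOTS


def table_for_slot(slot: int, num_tables: int) -> int:
    """Return the index of the hash table that owns `slot`.

    Assumption: every table is pre-allocated a contiguous slot range of the
    same size, so `num_tables` must be a factor of TOTAL_SLOTS.
    """
    if num_tables <= 0 or TOTAL_SLOTS % num_tables != 0:
        raise ValueError("num_tables must be a positive factor of TOTAL_SLOTS")
    return slot // (TOTAL_SLOTS // num_tables)


slot = hash_slot(b"bucket1/photo.jpg")
table = table_for_slot(slot, num_tables=8)
```

Requiring the table count to divide the slot total is what keeps the slot ranges equal in size, which in turn spreads keys evenly as long as the hash itself is uniform.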
  • S24 Assign a corresponding score value (score) to the metadata. Based on the assigned score value, determine the ordered list corresponding to the metadata, and store the metadata in the corresponding ordered list. In this method, the metadata items in an ordered list are sorted by score value. For example, based on the score values assigned to the metadata, the metadata in a single ordered list is sorted from the lowest score to the highest.
  • a score value is assigned to the key of the metadata.
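Sorting within a single ordered list by score can be sketched as below. This is a minimal in-memory stand-in rather than the distributed ordered list itself; the class name `OrderedList` and its methods are hypothetical. How a score selects one ordered list among several is not specified in the text and is not modeled here, and breaking ties by key is an assumption of this sketch.

```python
import bisect


class OrderedList:
    """A minimal in-memory stand-in for one ordered list (zset)."""

    def __init__(self):
        self._items = []  # (score, key) tuples, kept sorted from low to high

    def add(self, key, score):
        # insort keeps the list sorted by score; equal scores fall back to key order.
        bisect.insort(self._items, (score, key))

    def keys(self):
        return [key for _, key in self._items]


zset = OrderedList()
zset.add("c.txt", 3.0)
zset.add("a.txt", 1.0)
zset.add("b.txt", 2.0)
print(zset.keys())  # ['a.txt', 'b.txt', 'c.txt']
```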
  • the method also includes the following steps, which aim to achieve efficient metadata retrieval with low space usage across multiple ordered lists without increasing storage costs:
  • the metadata retrieval request contains retrieval information, and the retrieval information includes the index order and number of index elements of each ordered list, etc.
  • the index order and number of index elements of different ordered lists are consistent.
  • For example, the metadata retrieval request is to retrieve the first 10 elements of each ordered list; that is, the index order starts from the frontmost element and proceeds consecutively, and the number of index elements is 10.
  • A20 Based on the retrieval information, retrieve the corresponding metadata from each ordered list. For example, take n consecutive elements (metadata) from each ordered list, starting from the frontmost element, ending at the last element, or spanning the i-th to the (i+n-1)-th element.
  • A30 Store the metadata in the preset map table, sort the metadata stored in the preset map table based on the preset map table, and return the preset map table that has completed metadata storage and sorting to the client.
  • the preset map table can sort the stored metadata sequentially.
  • the preset map table is a list in which the retrieved metadata is sorted again. Based on the preset map table, users can retrieve metadata across multiple ordered lists efficiently and with low space usage.
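The retrieve-then-re-sort flow of steps A20 and A30 can be sketched with plain Python lists standing in for the ordered lists. The function name `retrieve_first_n` is hypothetical, and sorting the combined result by key is an assumption, since the text only says the map table sorts the stored metadata sequentially.

```python
def retrieve_first_n(ordered_lists, n):
    """Take the first n keys of every ordered list, then re-sort them together."""
    candidates = []
    for keys in ordered_lists:
        candidates.extend(keys[:n])  # A20: same index order and count for every list
    return sorted(candidates)        # A30: the sorted map table returned to the client


lists = [["a", "d", "g"], ["b", "e", "h"], ["c", "f", "i"]]
result = retrieve_first_n(lists, 2)
print(result)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Taking n elements from every list is enough to guarantee that the n smallest keys overall are among the candidates, because each of them must be within the first n of the list that holds it.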
  • the method further includes the following steps:
  • the metadata retrieval request also contains retrieval information.
  • the retrieval information includes reference metadata and the number of index elements; the reference metadata is the start key.
  • A50 Determine the storage location of the reference metadata in each ordered list.
  • the reference metadata is metadata that is known to exist in the bucket, and it is stored in exactly one of the ordered lists. In step A50, for the ordered list that stores the reference metadata, its actual storage location is obtained, that is, its storage order (element number) in that ordered list. For each ordered list that does not store the reference metadata, a pre-storage location is determined instead, that is, the position the reference metadata would occupy in that list. For example, the pre-storage location is determined by comparing the letters of the reference metadata with the letters of the metadata already stored in that ordered list.
  • Based on the storage location, which is either the real storage location or the pre-storage location, retrieve the corresponding metadata from each ordered list. For example, starting from the pre-storage location, fetch n consecutive elements (metadata) from an ordered list that does not store the reference metadata.
  • A70 Store the metadata in the preset map table, sort the metadata stored in the preset map table based on the preset map table, and return the preset map table that has completed metadata storage and sorting to the client.
  • the preset map table can sort the stored metadata sequentially.
  • the preset map table is a list in which the retrieved metadata is sorted again. Based on the preset map table, users can retrieve metadata across multiple ordered lists efficiently and with low space usage.
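Retrieval from a start key can be sketched with a binary search per ordered list. The function name `retrieve_from_start_key` is hypothetical, the lists are plain sorted Python lists, and treating the start key as inclusive is an assumption, since the text does not say whether the reference metadata itself is returned.

```python
import bisect


def retrieve_from_start_key(ordered_lists, start_key, n):
    """Locate start_key in every list, take n consecutive keys, then re-sort."""
    candidates = []
    for keys in ordered_lists:
        # bisect_left gives the real position when start_key is present, and
        # otherwise the position where it would be inserted (the pre-storage position).
        pos = bisect.bisect_left(keys, start_key)
        candidates.extend(keys[pos:pos + n])
    return sorted(candidates)  # the sorted map table returned to the client


lists = [["a", "d", "g"], ["b", "e", "h"], ["c", "f", "i"]]
result = retrieve_from_start_key(lists, "d", 2)
print(result)  # ['d', 'e', 'f', 'g', 'h', 'i']
```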
  • the preset map table has a storage limit, which can be set by the user.
  • the storage limit is N elements.
  • when the storage limit of the preset map table has not been reached and metadata needs to be stored, the metadata is stored directly.
  • steps A30 and A70 will also include:
  • A80 Determine that the preset map table has reached its storage upper limit and that there is unstored metadata.
  • the storage upper limit N is less than the total number of metadata taken out S.
  • the sorting value can be understood as the storage location/storage order.
  • A90 If the map table sorting value of the last metadata item exceeds the map table sorting value of the unstored metadata, delete the last metadata item and store the unstored metadata in the preset map table. Use this method to decide, for each unstored metadata item, whether it needs to be stored in the preset map table. Note that after the last element is deleted and the new metadata is stored, all metadata in the preset map table is reordered by map table sorting value, so that the metadata at the end of the preset map table always has the maximum map table sorting value among all elements.
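The bounded map table of steps A80 and A90 can be sketched as a sorted list with a capacity. The class name `BoundedMapTable` and its `offer` method are hypothetical, and the key itself is used as the map table sorting value, which is an assumption of this sketch.

```python
import bisect


class BoundedMapTable:
    """Keeps at most `limit` keys in sorted order."""

    def __init__(self, limit):
        if limit <= 0:
            raise ValueError("limit must be positive")
        self.limit = limit
        self.keys = []

    def offer(self, key):
        if len(self.keys) < self.limit:
            bisect.insort(self.keys, key)  # below the limit: store directly
        elif key < self.keys[-1]:
            self.keys.pop()                # A90: drop the last (largest) entry
            bisect.insort(self.keys, key)  # insert and keep the table sorted
        # otherwise the key sorts after the last entry and is not stored


table = BoundedMapTable(limit=3)
for key in ["d", "a", "e", "b", "c"]:
    table.offer(key)
print(table.keys)  # ['a', 'b', 'c']
```

Because the table is kept sorted after every insertion, the last entry is always the one with the largest sorting value, so a single comparison decides whether a new item belongs in the table.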
  • the metadata storage device based on the distributed key-value database provided by the present invention is described below.
  • the metadata storage device described below and the metadata storage method described above may be referred to in correspondence with each other.
  • the metadata storage device based on the distributed key-value database of the present invention is described below with reference to Figure 3.
  • This device is designed to solve the problems of limited single-bucket object storage scale and business blocking caused by the sharding mechanism in native distributed object storage.
  • the device is applied to the server, and the server is installed with a distributed object storage system.
  • the device includes:
  • the first determination module 10 is used to determine the metadata of the objects stored in the bucket, and determine the preset type (type) and key-value pairs of the metadata. Specifically, determine the metadata stored in each bucket on the server, as well as the preset type of metadata and key-value pairs.
  • the preset types include first, second, third and fourth types.
  • the first type of metadata is basic metadata (object metadata), which saves the basic information of the latest version of the object, including size, instance and last version
  • the second type of metadata is object attribute metadata (xattrs), which is the attribute information of the object itself
  • the third type of metadata is index metadata (omap), which stores the bucket's object index information, that is, the omap information
  • the fourth type of metadata is index sequence metadata (omap order), which stores the sequence list of all omaps of the objects in the bucket.
  • the data storage module 20 is used to save the metadata into the corresponding hash table (hash table) and ordered list (zset), based on the determined preset type and key-value pair of the metadata.
  • each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list.
  • the hash table is used to store metadata
  • the ordered list is used to store retrieval information of the metadata.
  • a single-bucket multi-table metadata storage model is constructed based on the data structure of distributed key-value data.
  • the native distributed object storage system only includes object gateway and back-end data storage.
  • the present invention adds a proprietary distributed key-value module to store object metadata, so that object data and metadata are stored separately.
  • this application reorganizes the object's metadata, removes redundant data types, and reclassifies the metadata into four types. Because the dozens of metadata types in the native distributed object storage system are not all operated on at once during object operations, the granularity of object metadata can be reduced, making metadata reads and writes more flexible and efficient.
  • the device uses at least one hash table and at least one ordered list for storage and, when storing metadata, also ensures that the metadata is evenly distributed inside the container group (container). How uniform distribution within the container group is achieved is explained below.
  • the metadata of the object is stored in the distributed key-value cluster in the form of key-value pairs.
  • the device splits tables and uses multiple hash tables to carry the metadata of the objects in a bucket.
  • Each hash table can store up to 4.2 billion key-value pairs.
  • Multiple hash tables can easily support the metadata of tens of billions of objects by uniformly storing the metadata of objects in distributed key-value hash tables.
  • the time complexity of read and write operations is O(1), which reduces the complexity of I/O operations and increases the scale of single-bucket storage while ensuring efficiency. There is no need to load RADOS object data from the local file system.
  • the metadata storage device based on a distributed key-value database provided by this application stores the object's metadata in the form of key-value pairs in a distributed key-value cluster and, by splitting tables, uses multiple hash tables and multiple ordered lists to carry the metadata of the objects in a bucket.
  • the storage of metadata through hash tables reduces the complexity of I/O operations and increases the storage scale of a single bucket while ensuring efficiency.
  • The ordered lists store metadata and provide an interface for adding, deleting, modifying, and querying object metadata. Without increasing storage costs, this achieves efficient metadata retrieval with low space usage across multiple ordered lists.
  • Figure 4 illustrates a schematic diagram of the physical structure of an electronic device.
  • the electronic device may include: a processor 810, a communications interface 820, a memory 830 and a communication bus 840.
  • the processor 810, the communication interface 820, and the memory 830 complete communication with each other through the communication bus 840.
  • the processor 810 can call logical instructions in the memory 830 to execute a metadata storage method based on a distributed key-value database, which method includes:
  • the metadata is saved into the corresponding hash table and ordered list; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, where the hash table is used to store the metadata and the ordered list is used to store retrieval information of the metadata.
  • the above-mentioned logical instructions in the memory 830 can be implemented in the form of software functional units and can be stored in a computer-readable storage medium when sold or used as an independent product.
  • the technical solution of the present invention, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of the present invention.
  • the aforementioned storage media include various media that can store program code, such as a U disk, a mobile hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disks or optical disks.
  • the present invention also provides a computer program product.
  • the computer program product includes a computer program.
  • the computer program can be stored on a non-transitory computer-readable storage medium.
  • the computer program can execute the metadata storage method based on the distributed key-value database provided by each of the above methods, which method includes:
  • the metadata is saved into the corresponding hash table and ordered list; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, where the hash table is used to store the metadata and the ordered list is used to store retrieval information of the metadata.
  • the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored.
  • when executed by the processor, the computer program performs the metadata storage method based on the distributed key-value database provided by the above methods, which includes:
  • the metadata is saved into the corresponding hash table and ordered list; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, where the hash table is used to store the metadata and the ordered list is used to store retrieval information of the metadata.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units; that is, they may be located in one location or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement the method without any creative effort.
  • each embodiment can be implemented by software plus a necessary general hardware platform, and of course, it can also be implemented by hardware.
  • the computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disc, optical disk, etc., including a number of instructions to cause a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods described in various embodiments or certain parts of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the field of data processing. Disclosed are a metadata storage method and apparatus based on a distributed key-value database. The method comprises: determining metadata of objects stored in buckets, and determining preset types and key-value pairs of the metadata; and storing the metadata in corresponding hash tables and ordered lists on the basis of the determined preset types and key-value pairs of the metadata, wherein each bucket in a distributed storage system stores metadata by means of at least one hash table and at least one ordered list, the hash table is used for storing metadata, and the ordered list is used for storing retrieval information of metadata. The present invention reduces the complexity of I/O operations, and increases the storage scale of a single bucket while ensuring efficiency, thereby achieving efficient and low-space-utilization retrieval suitable for metadata across multiple ordered lists.

Description

A metadata storage method and device based on a distributed key-value database
Technical field
The present invention relates to the field of data processing, and in particular to a metadata storage method and device based on a distributed key-value database.
Background art
A distributed object storage system maintains an index table for each bucket, which stores the mapping between the bucket and the metadata of all objects in it. When a user accesses an object in a bucket, the object's data is located through the bucket index. When the number of objects stored in a bucket becomes too large, the oversized index causes performance and reliability problems.
Technical problem
To overcome the limit on the number of objects in a single bucket, the prior art adopts a dynamic sharding mechanism that creates multiple index tables. However, sharding a bucket rebalances the object metadata in the bucket. If the bucket stores a large number of objects, the rebalancing during sharding is very time-consuming and blocks front-end writes, which gives users a poor experience.
Therefore, how to solve the limited single-bucket object storage scale in distributed object storage systems, and the service blocking caused by the sharding mechanism, is an important problem that the industry urgently needs to solve.
Technical solution
In view of this, embodiments of the present invention provide a metadata storage method and device based on a distributed key-value database, to solve the problems of limited single-bucket object storage scale and of service blocking caused by the sharding mechanism in distributed object storage systems.
According to a first aspect, an embodiment of the present invention provides a metadata storage method based on a distributed key-value database. The method is applied to a server on which a distributed object storage system is installed, and the method includes:
determining the metadata of objects stored in a bucket, and determining the preset type and key-value pair of the metadata;
based on the determined preset type and key-value pair of the metadata, saving the metadata into the corresponding hash table and ordered list; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata and the ordered list being used to store retrieval information of the metadata.
With reference to the first aspect, in a first implementation of the first aspect, the preset types include a first, second, third and fourth type: metadata of the first type is basic metadata, metadata of the second type is object attribute metadata, metadata of the third type is index metadata, and metadata of the fourth type is index order metadata;
saving the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata specifically includes:
based on the determined preset type, storing the metadata into the container group corresponding to the preset type; each preset type of metadata corresponds to one container group, each container group corresponds to at least one hash slot, and all container groups correspond to the same number of hash slots;
based on the key-value pair of the metadata, determining the CRC16 value of the metadata, and, based on the total number of hash slots and the determined CRC16 value, determining the hash slot corresponding to the metadata;
mapping and storing the metadata into the hash table corresponding to the determined hash slot; each hash table corresponds to at least one hash slot;
assigning a score value to the metadata, determining the ordered list corresponding to the metadata based on the assigned score value, and storing the metadata into the corresponding ordered list; the items of metadata in the ordered list are sorted by score value.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, the number of hash tables is determined based on a preset number and the total number of hash slots: the number of hash tables exceeds the preset number and is a factor of the total number.
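The constraint on the number of hash tables can be illustrated with a minimal sketch. The function name `choose_table_count` and the choice of the smallest qualifying factor are assumptions made for illustration; the text only requires that the count exceed the preset number and divide the total number of hash slots. The values 16384 and 5 are the example figures the description itself uses for the slot total and the preset number.

```python
def choose_table_count(total_slots: int, preset_min: int) -> int:
    """Return the smallest factor of total_slots that is strictly greater than preset_min.

    Picking the smallest such factor is an illustrative assumption; any factor
    above preset_min satisfies the stated constraint.
    """
    for candidate in range(preset_min + 1, total_slots + 1):
        if total_slots % candidate == 0:
            return candidate
    raise ValueError("no factor of total_slots exceeds preset_min")


# With 16384 slots and a preset number of 5, the smallest valid count is 8.
print(choose_table_count(16384, 5))
```

Because the count divides the slot total exactly, every hash table can be given the same number of slots.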
With reference to the first aspect, in a third implementation of the first aspect, the method further includes the following steps:
determining a metadata retrieval request from a client; the metadata retrieval request contains retrieval information, which includes the index order and the number of index elements for each ordered list;
based on the retrieval information, retrieving the corresponding metadata from each ordered list;
storing the metadata into a preset map table, sorting the metadata stored in the preset map table based on the preset map table, and returning the preset map table, with the metadata stored and sorted, to the client.
With reference to the first aspect, in a fourth implementation of the first aspect, the method further includes the following steps:
determining a metadata retrieval request from a client; the metadata retrieval request contains retrieval information, which includes baseline metadata and the number of index elements;
determining the storage location of the baseline metadata in each ordered list;
based on the retrieval information and the storage location, retrieving the corresponding metadata from each ordered list;
storing the metadata into a preset map table, sorting the metadata stored in the preset map table based on the preset map table, and returning the preset map table, with the metadata stored and sorted, to the client.
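The fourth implementation can be sketched as follows, under stated assumptions: each ordered list is modeled as a Python list of keys already sorted alphabetically, the baseline metadata is a key string, and its storage location in a list is the index where it is stored or would be stored, found by binary search. The function name `retrieve_from_start_key` is hypothetical, and including the baseline key itself in the result is an assumption the text does not settle.

```python
import bisect


def retrieve_from_start_key(ordered_lists, start_key, n):
    """Take up to n keys from each sorted list, starting at start_key's position.

    bisect_left returns the index of start_key if the list contains it, and
    otherwise the index at which it would be inserted to keep the list sorted.
    """
    merged = set()  # stand-in for the preset map table
    for keys in ordered_lists:
        position = bisect.bisect_left(keys, start_key)
        merged.update(keys[position:position + n])
    return sorted(merged)


lists = [["a", "d", "g"], ["b", "e", "h"], ["c", "f", "i"]]
print(retrieve_from_start_key(lists, "d", 2))
```

Only the first list contains "d"; in the other two lists the search yields the position where "d" would sit, so each list contributes its n keys at or after the baseline key.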
With reference to the third or fourth implementation of the first aspect, in a fifth implementation of the first aspect, storing the metadata into the preset map table, sorting the metadata stored in the preset map table based on the preset map table, and returning the preset map table, with the metadata stored and sorted, to the client further includes:
determining that the preset map table has reached its storage upper limit and that some metadata remains unstored, and, based on the preset map table, determining the map table sort values of the unstored metadata and of the tail (last) metadata in the preset map table;
determining that the map table sort value of the tail metadata exceeds that of the unstored metadata, deleting the tail metadata, and storing the unstored metadata into the preset map table.
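The eviction rule of the fifth implementation can be sketched as below. This is an illustration under assumptions: the preset map table is modeled as a sorted Python list of keys, the map table sort value of an item is taken to be the key itself, and the name `insert_bounded` is hypothetical. Duplicate keys are not handled.

```python
import bisect


def insert_bounded(sorted_keys, new_key, capacity):
    """Insert new_key into a sorted list that holds at most `capacity` keys.

    If the list is full, new_key replaces the tail key only when the tail
    sorts after new_key. Returns True if new_key was stored.
    """
    if len(sorted_keys) < capacity:
        bisect.insort(sorted_keys, new_key)
        return True
    if sorted_keys and sorted_keys[-1] > new_key:
        sorted_keys.pop()  # delete the tail metadata
        bisect.insort(sorted_keys, new_key)
        return True
    return False  # new_key sorts after everything kept, so it is dropped


table = []
for key in ["d", "b", "e", "a"]:
    insert_bounded(table, key, capacity=3)
print(table)
```

With this rule the table always holds the smallest `capacity` keys seen so far, which is what a client asking for the first entries in order needs.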
According to a second aspect, an embodiment of the present invention further provides a metadata storage device based on a distributed key-value database. The device is applied to a server on which a distributed object storage system is installed, and the device includes:
a first determination module, configured to determine the metadata of objects stored in a bucket, and to determine the preset type and key-value pair of the metadata;
a data storage module, configured to save the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata; each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata and the ordered list being used to store retrieval information of the metadata.
According to a third aspect, an embodiment of the present invention further provides an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of any of the above metadata storage methods based on a distributed key-value database are implemented.
According to a fourth aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the above metadata storage methods based on a distributed key-value database are implemented.
According to a fifth aspect, an embodiment of the present invention further provides a computer program product including a computer program; when the computer program is executed by a processor, the steps of any of the above metadata storage methods based on a distributed key-value database are implemented.
Beneficial effects
The metadata storage method and device based on a distributed key-value database provided by the present invention store object metadata as key-value pairs in a distributed key-value cluster and use table partitioning, so that multiple hash tables and multiple ordered lists carry the metadata of the objects in one bucket. Storing metadata in hash tables reduces the complexity of I/O operations and increases the storage scale of a single bucket while preserving efficiency. Storing metadata in ordered lists provides an interface for creating, deleting, updating and querying object metadata and, without increasing storage cost, achieves efficient, low-space-utilization retrieval of metadata across multiple ordered lists.
Description of drawings
The features and advantages of the present invention will be understood more clearly by referring to the accompanying drawings, which are schematic and should not be construed as limiting the invention in any way. In the drawings:
Figure 1 shows a schematic flow chart of the metadata storage method based on a distributed key-value database provided by the present invention;
Figure 2 shows a schematic flow chart of step S20 in the metadata storage method based on a distributed key-value database provided by the present invention;
Figure 3 shows a schematic structural diagram of the metadata storage device based on a distributed key-value database provided by the present invention;
Figure 4 shows a schematic structural diagram of the electronic device provided by the present invention.
Embodiments of the invention
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Distributed object storage systems built on a distributed storage architecture have become a preferred solution for cloud computing, and their file storage has advantages such as shareability and low cost. In a distributed object storage system, object metadata is stored in two parts. One part is the index metadata of the object, called omap. omap stands for object map, that is, an object used to hold key-value map data. omap objects play a very important role in distributed storage: in the file and object services that distributed storage provides, omap performance directly affects the storage performance of the cluster. omap is kept in an independent key-value storage system outside the local file system, which is LevelDB when FileStore is used and RocksDB when BlueStore is used. The other part is the extended attributes of the object, called xattr, which are usually used to hold information such as the object's version. xattr is stored in the RADOS (Reliable Autonomic Distributed Object Store) objects of the bucket, and RADOS objects are in turn stored in the local file system, so their size is limited by the file system, which limits the number of objects they can carry. As a result, reading an object's metadata requires two I/O paths, one to the local file system and one to the key-value storage system.
A distributed object storage system maintains an index table for each bucket, which stores the mapping between the bucket and the metadata of all objects in it. When a user accesses an object in a bucket, the object's data is located through the bucket index. When the number of objects stored in a bucket becomes too large, the oversized index causes performance and reliability problems. To overcome the limit on the number of objects in a single bucket, the prior art adopts a dynamic sharding mechanism: multiple RADOS objects maintain the bucket index, and creating multiple index tables avoids a single index object growing too large to meet business needs such as data growth. However, sharding a bucket rebalances the object metadata in the bucket: the index data in the old RADOS objects is recomputed, reorganized and migrated to new RADOS objects. If the bucket stores a large number of objects, the rebalancing during sharding is very time-consuming and blocks front-end writes, which gives users a poor experience.
In view of the above problems, the metadata storage method based on a distributed key-value database of the present invention is described below with reference to Figure 1. The method aims to solve the limited single-bucket object storage scale and the service blocking caused by the sharding mechanism in native distributed object storage. The method is applied to a server on which a distributed object storage system is installed, and includes:
S10. Determine the metadata of the objects stored in a bucket, and determine the preset type and the key-value pair of the metadata. Specifically, determine the metadata stored in each bucket on the server, together with the preset type and key-value pair of that metadata.
In this application, the preset types include a first, second, third and fourth type. Metadata of the first type is basic metadata (object metadata), which holds the basic information of the latest version of the object, including size, instance and last version. Metadata of the second type is object attribute metadata (xattrs), that is, the attribute information of the object itself. Metadata of the third type is index metadata (omap), which holds the bucket object index information, that is, the omap information. Metadata of the fourth type is index order metadata (omap order), which holds the ordered list of all omap entries of the objects in the bucket.
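The four preset types can be written down as a small enumeration, as sketched below. The enum member names follow the labels given in the text; the helper `container_group` and its `bucket:type` naming scheme are hypothetical, shown only to illustrate that each type of a bucket's metadata maps to its own container group.

```python
from enum import Enum


class MetadataType(Enum):
    """The four preset metadata types named in the description."""
    OBJECT_METADATA = "object_metadata"  # basic info: size, instance, last version
    XATTRS = "xattrs"                    # attributes of the object itself
    OMAP = "omap"                        # bucket object index information
    OMAP_ORDER = "omap_order"            # ordered list of all omap entries


def container_group(bucket: str, mtype: MetadataType) -> str:
    """Return a container group name for a bucket and type (hypothetical naming)."""
    return f"{bucket}:{mtype.value}"


print(container_group("photos", MetadataType.OMAP))
```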
S20. Based on the determined preset type and key-value pair of the metadata, save the metadata into the corresponding hash table and ordered list (zset). In this application, each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list. Specifically, the hash table is used to store the metadata, and the ordered list is used to store retrieval information of the metadata.
In this application, a single-bucket, multi-table metadata storage model is built on the data structures of a distributed key-value database. The native distributed object storage system includes only an object gateway and back-end data storage; on this basis the present invention adds a dedicated distributed key-value module to store object metadata, so that object data and metadata are stored separately. This application also reorganizes the object metadata, removing redundant data types and regrouping the metadata into four types. Since an object operation never touches all of the dozens of metadata types of the native distributed object storage system at once, the granularity of object metadata can be reduced, which makes metadata reads and writes more flexible and efficient.
For all four preset types of metadata in the same bucket, the method uses at least one hash table and at least one ordered list for storage, and it also ensures that stored metadata is evenly distributed inside each container group (container). How this even distribution is achieved is explained below.
In this application, object metadata is stored as key-value pairs in a distributed key-value cluster. The method uses table partitioning: multiple hash tables carry the metadata of the objects in one bucket. Each hash table can store up to about 4.2 billion key-value pairs, so multiple hash tables can readily support the metadata of tens of billions of objects. Because object metadata is stored uniformly in the hash tables of the distributed key-value database, reads and writes have O(1) time complexity. This reduces the complexity of I/O operations and increases the storage scale of a single bucket while preserving efficiency, and RADOS object data no longer needs to be loaded from the local file system.
The metadata storage method based on a distributed key-value database provided by this application stores object metadata as key-value pairs in a distributed key-value cluster and uses table partitioning, so that multiple hash tables and multiple ordered lists carry the metadata of the objects in one bucket. Storing metadata in hash tables reduces the complexity of I/O operations and increases the storage scale of a single bucket while preserving efficiency. Storing metadata in ordered lists provides an interface for creating, deleting, updating and querying object metadata and, without increasing storage cost, achieves efficient, low-space-utilization retrieval of metadata across multiple ordered lists.
Step S20 of the metadata storage method based on a distributed key-value database of the present invention is described below with reference to Figure 2. Step S20 specifically includes:
S21. Based on the determined preset type, store the metadata into the container group corresponding to the preset type. In this application, each preset type of metadata corresponds to one container group, each container group corresponds to at least one hash slot, and all container groups correspond to the same number of hash slots.
S22. Based on the key-value pair of the metadata, determine the CRC16 value of the metadata, and, based on the total number of hash slots and the determined CRC16 value, determine the hash slot corresponding to the metadata.
Specifically, the CRC16 value of the metadata is first computed from the key of the metadata and then taken modulo the total number of hash slots, which yields the hash slot corresponding to the metadata.
S23. Map and store the metadata into the hash table corresponding to the determined hash slot. In this method, each hash table corresponds to at least one hash slot.
To distribute metadata evenly across the hash tables of a container group, this application allocates a slot partition to each hash table in advance. In each container group, the total number of slots is 16384. The number of hash tables is configured to exceed a preset number (for example, 5) and to be a factor of 16384, and the 16384 slots are then divided equally among the hash tables of the container group.
Therefore, the hash slot corresponding to the metadata can be determined as follows:
HASH_SLOT = CRC16(key) mod 16384
The HASH_SLOT algorithm maps the metadata of bucket objects to the corresponding hash tables, so that the metadata is evenly distributed in the corresponding container group.
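The slot computation and the slot-to-table mapping can be sketched as follows. The description does not name the CRC16 variant; this sketch assumes CRC-16/XMODEM (polynomial 0x1021, initial value 0), which Python's `binascii.crc_hqx` computes when given an initial value of 0. The function names and the assignment of contiguous slot ranges to tables are also assumptions; the text only states that the slots are divided equally among the hash tables.

```python
import binascii

TOTAL_SLOTS = 16384  # slot total per container group, as stated in the text


def hash_slot(key: str) -> int:
    """HASH_SLOT = CRC16(key) mod 16384, with CRC-16/XMODEM assumed."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


def table_index(slot: int, table_count: int) -> int:
    """Map a slot to a hash table, assuming each table owns a contiguous slot range."""
    if TOTAL_SLOTS % table_count != 0:
        raise ValueError("table_count must be a factor of TOTAL_SLOTS")
    slots_per_table = TOTAL_SLOTS // table_count
    return slot // slots_per_table


slot = hash_slot("123456789")
print(slot, table_index(slot, 8))
```

With 8 tables each table owns 2048 slots, so a key's table follows directly from its slot without any lookup structure.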
S24. Assign a score value (score) to the metadata, determine the ordered list corresponding to the metadata based on the assigned score value, and store the metadata into the corresponding ordered list. In this method, the items of metadata in an ordered list are sorted by score value; for example, the metadata within a single ordered list is sorted from the lowest score value to the highest.
In this embodiment, the score value is assigned to the key of the metadata.
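The ordering behavior within a single ordered list can be illustrated with a small in-memory stand-in. This is a sketch, not the storage engine's structure: the class name `OrderedList` is hypothetical, scores are supplied by the caller because the text does not say how they are derived, and breaking ties between equal scores by key is an assumption.

```python
import bisect


class OrderedList:
    """Keeps (score, key) pairs sorted by ascending score, then by key."""

    def __init__(self):
        self._items = []

    def add(self, key: str, score: float) -> None:
        # insort keeps the list sorted on every insertion
        bisect.insort(self._items, (score, key))

    def keys(self):
        return [key for _, key in self._items]


ordered = OrderedList()
ordered.add("obj-b", 2.0)
ordered.add("obj-a", 1.0)
ordered.add("obj-c", 3.0)
print(ordered.keys())
```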
Compared with single-object metadata operations such as upload, download and delete, the ordered object listing service is undoubtedly more complex. This application uses multiple ordered lists to maintain the ordered metadata (omap keys) of the objects in a bucket. These omap keys are ordered within a single ordered list, but, because metadata is distributed evenly, omap keys are not ordered across ordered lists. To address the low efficiency of data retrieval, the prior art stores metadata separately on high-performance disks such as solid-state disks (Solid State Disk, SSD); this improves object read and write efficiency but also greatly increases the storage cost.
In some possible embodiments, the method further includes the following steps, which aim to achieve efficient, low-space-utilization retrieval of metadata across multiple ordered lists without increasing storage cost:
A10. Determine a metadata retrieval request from a client. In these embodiments, the metadata retrieval request contains retrieval information, which includes the index order and the number of index elements for each ordered list.
In this embodiment, the index order and the number of index elements are the same for all ordered lists. For example, if the metadata retrieval request asks for the first 10 elements of each ordered list, the index order is a continuous sequential retrieval starting from the first element, and the number of index elements is 10.
A20. Based on the retrieval information, retrieve the corresponding metadata from each ordered list. For example, n consecutive elements (items of metadata) are taken from each ordered list: the first n, the last n, or the i-th to the (i+n-1)-th.
A30. Store the metadata into a preset map table, sort the metadata stored in the preset map table based on the preset map table, and return the preset map table, with the metadata stored and sorted, to the client.
Since a map table sorts its elements alphabetically, once the metadata is stored in the preset map table, the map table orders it. The preset map table is thus a list in which the retrieved items of metadata are sorted again; based on it, users can retrieve metadata across multiple ordered lists with high efficiency and low space utilization.
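Steps A10 to A30 can be sketched as follows, under stated assumptions: each ordered list is modeled as a Python list of keys already sorted alphabetically, the preset map table is modeled by a set that is sorted before being returned, and the function name `retrieve_first_n` is hypothetical. The sketch covers only the "first n elements" form of the request.

```python
def retrieve_first_n(ordered_lists, n):
    """Take the first n keys of every ordered list and return them in sorted order."""
    merged = set()  # stand-in for the preset map table
    for keys in ordered_lists:
        merged.update(keys[:n])  # A20: first n elements of this list
    return sorted(merged)        # A30: alphabetical ordering of the map table


lists = [["a", "d", "g"], ["b", "e", "h"], ["c", "f", "i"]]
result = retrieve_first_n(lists, 2)
print(result)
print(result[:2])
```

Taking n elements from every list is sufficient for a globally ordered answer: any key among the n smallest overall is also among the n smallest of the list that holds it, so the first n entries of the merged result are the first n keys of the whole bucket.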
In some possible embodiments, the method further includes the following steps:
A40. Determine the client's metadata retrieval request. As in step A10, the metadata retrieval request in these embodiments also contains retrieval information; the difference is that the retrieval information here includes, among other things, reference metadata and the number of index elements. The reference metadata is the start key.
A50. Determine the storage position of the reference metadata in each ordered list.
It should be noted that the reference metadata (start key) is metadata that is known to exist in the bucket, and it is stored in exactly one of the ordered lists. Step A50 therefore yields, for the ordered list that actually stores the reference metadata, its specific storage position, that is, its storage order or element number/sequence number in that list. For each ordered list that does not store the reference metadata, the step determines the position the reference metadata would occupy if it were stored in that list. For example, this pre-storage position is determined by comparing the letters of the reference metadata with the letters of the metadata already stored in that ordered list.
A60. Based on the retrieval information and the storage positions, which include both the real storage position and the pre-storage positions, take the corresponding metadata out of each ordered list. For example, starting from the pre-storage position, take n consecutive elements (metadata items) from an ordered list that does not store the reference metadata.
A70. Store the metadata in the preset map table, sort the metadata stored in the preset map table by means of the map table itself, and return the preset map table, with the metadata stored and sorted, to the client.
Because a map table inherently orders its elements by their letters (that is, lexicographically), the preset map table sorts the metadata as it is stored. The preset map table is thus a list in which the metadata items taken from the individual ordered lists are sorted once more, and from it the user can retrieve metadata across multiple ordered lists efficiently and with low space overhead.
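Steps A40 to A70 can be sketched in the same style. The example assumes that each ordered list is a sorted Python list, that the pre-storage position is the binary-search insertion point, and that the start key itself is included in the result; the source does not state whether the start key is inclusive, and the name `retrieve_from_start_key` is hypothetical.

```python
import bisect
from typing import List


def retrieve_from_start_key(ordered_lists: List[List[str]], start_key: str, n: int) -> List[str]:
    """Take n consecutive keys from every ordered list, beginning at start_key."""
    collected: List[str] = []
    for ordered in ordered_lists:
        # Step A50: bisect_left returns the real position when start_key is present,
        # and the would-be (pre-storage) position when it is absent.
        pos = bisect.bisect_left(ordered, start_key)
        # Step A60: n consecutive elements from that position onward.
        collected.extend(ordered[pos:pos + n])
    # Step A70: the preset map table returns everything in sorted order.
    return sorted(collected)


lists = [["a", "d", "g"], ["b", "e", "i"], ["c", "f", "h"]]
print(retrieve_from_start_key(lists, "d", 2))
```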
The preset map table has a storage upper limit, which can be set by the user, for example N elements. While the limit has not been reached, metadata that needs to be stored is stored directly. Once the preset map table has reached its limit, a corresponding retrieval algorithm is needed; steps A30 and A70 therefore further include:
A80. Determine that the preset map table has reached its storage upper limit and that unstored metadata remains, for example because the storage upper limit N is smaller than the total number S of metadata items taken out. Based on the preset map table, determine the map-table sort value of the unstored metadata and that of the metadata at the tail of the preset map table. In this embodiment, the sort value can be understood as the storage position or storage order.
A90. If the map-table sort value of the tail metadata exceeds that of the unstored metadata, delete the tail metadata and store the unstored metadata in the preset map table. Every unstored metadata item is checked in this way to decide whether it should be stored in the preset map table. Note that after the tail element has been deleted and the new metadata stored, all metadata in the preset map table is re-sorted by map-table sort value, so that the tail metadata always has the largest sort value among all elements.
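The capacity rule of steps A80 and A90 can be sketched as a bounded sorted container. This is an illustrative sketch only: the class name `BoundedSortedMap` and its `offer` method are hypothetical, and the map-table sort value is assumed to be the lexicographic order of the key.

```python
import bisect
from typing import List


class BoundedSortedMap:
    """Keeps at most `capacity` keys in sorted order, evicting the largest first."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.keys: List[str] = []

    def offer(self, key: str) -> bool:
        """Try to store key; return True if it was stored."""
        if len(self.keys) < self.capacity:
            # Below the storage upper limit: store directly, keeping sorted order.
            bisect.insort(self.keys, key)
            return True
        if self.keys and self.keys[-1] > key:
            # Step A90: the tail sorts after the new key, so drop the tail
            # and insert the new key at its sorted position.
            self.keys.pop()
            bisect.insort(self.keys, key)
            return True
        # The new key sorts after everything already kept; discard it.
        return False


m = BoundedSortedMap(3)
for k in ["e", "b", "g", "a", "z", "c"]:
    m.offer(k)
print(m.keys)
```

Because insertion keeps the list sorted, the tail is always the element with the largest sort value, which matches the invariant stated in step A90.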
The metadata storage apparatus based on a distributed key-value database provided by the present invention is described below. The apparatus described below and the metadata storage method based on a distributed key-value database described above may be referred to in correspondence with each other.
The metadata storage apparatus based on a distributed key-value database of the present invention is described below with reference to FIG. 3. The apparatus aims to solve two problems of native distributed object storage: the limited scale of single-bucket object storage and the service blocking caused by the sharding mechanism. The apparatus is applied to a server on which a distributed object storage system is installed, and includes:
A first determination module 10, configured to determine the metadata of the objects stored in a bucket, and to determine the preset type and the key-value pair of the metadata. Specifically, it determines the metadata stored in each bucket on the server, together with the preset type and key-value pair of that metadata.
In this application, the preset types include a first, a second, a third and a fourth type. Metadata of the first type is basic metadata (object metadata), which holds the basic information of the latest version of the object, including size, instance and last version. Metadata of the second type is object attribute metadata (xattrs), that is, the attribute information of the object itself. Metadata of the third type is index metadata (omap), which holds the bucket object index information, that is, the omap information. Metadata of the fourth type is index order metadata (omap order), which holds the ordered list of all omaps of the objects in the bucket.
A data storage module 20, configured to save the metadata into the corresponding hash table and ordered list (zset) based on the determined preset type and key-value pair of the metadata. In this application, each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list. Specifically, the hash table is used to store the metadata, and the ordered list is used to store the retrieval information of the metadata.
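The division of labour between the two modules can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the names `MetaType` and `BucketMetadataStore` are hypothetical, a Python dict stands in for the hash table, and a sorted list of `(score, key)` pairs stands in for the zset, with ties on score broken lexicographically by key. How scores are assigned is not specified in this section, so the caller passes them in.

```python
import bisect
from enum import Enum
from typing import Dict, List, Tuple


class MetaType(Enum):
    """The four preset metadata types named in the description."""
    OBJECT_METADATA = 1  # basic metadata: size, instance, last version
    XATTRS = 2           # object attribute metadata
    OMAP = 3             # index metadata
    OMAP_ORDER = 4       # index order metadata


class BucketMetadataStore:
    """One bucket's metadata: a hash table for values, a zset-like list for retrieval."""

    def __init__(self) -> None:
        self.hash_table: Dict[Tuple[MetaType, str], bytes] = {}
        self.zset: List[Tuple[float, str]] = []  # kept sorted by (score, key)

    def put(self, meta_type: MetaType, key: str, value: bytes, score: float) -> None:
        # The hash table stores the metadata itself.
        self.hash_table[(meta_type, key)] = value
        # The ordered list stores the retrieval information (here: the key).
        entry = (score, key)
        i = bisect.bisect_left(self.zset, entry)
        if i == len(self.zset) or self.zset[i] != entry:
            self.zset.insert(i, entry)

    def keys_in_order(self) -> List[str]:
        return [key for _, key in self.zset]


store = BucketMetadataStore()
store.put(MetaType.OMAP, "obj-b", b"index-b", 0.0)
store.put(MetaType.OMAP, "obj-a", b"index-a", 0.0)
print(store.keys_in_order())
```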
In this application, a single-bucket, multi-table metadata storage model is built on the data structures of a distributed key-value database. A native distributed object storage system includes only an object gateway and back-end data storage; the present invention adds a dedicated distributed key-value module for storing object metadata, so that object data and metadata are stored separately. This application also reorganizes the object metadata, removing redundant data types and regrouping the metadata into four types. Because an object operation never touches all of the dozens of metadata types of the native distributed object storage system at once, the granularity of object metadata can be reduced, which makes metadata reads and writes more flexible and more efficient.
For the four preset types of metadata in the same bucket, the apparatus uses at least one hash table and at least one ordered list for storage, and when storing metadata it also ensures that the metadata is evenly distributed within each container group (container). How this even distribution within the container group is achieved will be explained below.
In this application, object metadata is stored as key-value pairs in a distributed key-value cluster. The apparatus uses table sharding: multiple hash tables carry the metadata of the objects in one bucket. Each hash table can hold up to about 4.2 billion key-value pairs, so multiple hash tables can readily support the metadata of tens of billions of objects. Because the object metadata is stored uniformly in the hash tables of the distributed key-value store, reads and writes have constant time complexity, O(1), which reduces the complexity of I/O operations and raises the single-bucket storage scale while preserving efficiency. RADOS object data no longer needs to be loaded from the local file system.
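The table sharding described above, combined with the CRC16 and hash-slot routing recited in claims 2 and 3, can be sketched as follows. The total of 16384 slots, the use of only the key (not the value) as CRC input, and the contiguous slot-range-to-table mapping are assumptions borrowed from Redis Cluster conventions, not details given in this document. `binascii.crc_hqx` with an initial value of 0 computes the CRC-16/XMODEM checksum.

```python
import binascii

TOTAL_SLOTS = 16384  # assumption: Redis-Cluster-style slot count


def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (polynomial 0x1021, initial value 0)."""
    return binascii.crc_hqx(data, 0)


def hash_slot(key: str, total_slots: int = TOTAL_SLOTS) -> int:
    """Map a metadata key to a hash slot: CRC16(key) mod total slot count."""
    return crc16(key.encode("utf-8")) % total_slots


def table_index(key: str, num_tables: int, total_slots: int = TOTAL_SLOTS) -> int:
    """Map a key to one of num_tables hash tables.

    The number of tables must be a factor of the slot total (claim 3), so
    every table owns the same number of slots.
    """
    if total_slots % num_tables != 0:
        raise ValueError("num_tables must be a factor of total_slots")
    slots_per_table = total_slots // num_tables
    return hash_slot(key, total_slots) // slots_per_table


print(hex(crc16(b"123456789")))      # standard CRC-16/XMODEM check value
print(table_index("123456789", 8))
```

Requiring the table count to divide the slot total gives every hash table an equal share of slots, which is one plausible way to obtain the even distribution the description asks for.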
The metadata storage apparatus based on a distributed key-value database provided by this application stores object metadata as key-value pairs in a distributed key-value cluster and uses table sharding, so that multiple hash tables and multiple ordered lists carry the metadata of the objects in one bucket. Storing metadata in hash tables reduces the complexity of I/O operations and raises the single-bucket storage scale while preserving efficiency. Storing metadata in ordered lists provides interfaces for creating, deleting, updating and querying object metadata and, without increasing storage cost, enables efficient, low-space-overhead retrieval of metadata across multiple ordered lists.
FIG. 4 is a schematic diagram of the physical structure of an electronic device. As shown in FIG. 4, the electronic device may include a processor 810, a communications interface 820, a memory 830 and a communication bus 840, where the processor 810, the communications interface 820 and the memory 830 communicate with one another over the communication bus 840. The processor 810 can invoke logic instructions in the memory 830 to execute the metadata storage method based on a distributed key-value database, the method including:
Determining the metadata of the objects stored in a bucket, and determining the preset type and the key-value pair of the metadata;
Saving the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata, where each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata and the ordered list being used to store the retrieval information of the metadata.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
In another aspect, the present invention further provides a computer program product. The computer program product includes a computer program, which may be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer can execute the metadata storage method based on a distributed key-value database provided by the methods above, the method including:
Determining the metadata of the objects stored in a bucket, and determining the preset type and the key-value pair of the metadata;
Saving the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata, where each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata and the ordered list being used to store the retrieval information of the metadata.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the metadata storage method based on a distributed key-value database provided by the methods above, the method including:
Determining the metadata of the objects stored in a bucket, and determining the preset type and the key-value pair of the metadata;
Saving the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata, where each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata and the ordered list being used to store the retrieval information of the metadata.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the description of the embodiments above, a person skilled in the art can clearly understand that each embodiment can be implemented by software together with the necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the technical solution above, in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the embodiments above are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A metadata storage method based on a distributed key-value database, characterized in that the method is applied to a server on which a distributed object storage system is installed, the method comprising:
    determining metadata of objects stored in a bucket, and determining a preset type and a key-value pair of the metadata;
    saving the metadata into a corresponding hash table and a corresponding ordered list based on the determined preset type and key-value pair of the metadata; wherein each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata, and the ordered list being used to store retrieval information of the metadata.
  2. The metadata storage method based on a distributed key-value database according to claim 1, characterized in that the preset types comprise a first, a second, a third and a fourth type, metadata of the first type being basic metadata, metadata of the second type being object attribute metadata, metadata of the third type being index metadata, and metadata of the fourth type being index order metadata;
    the saving of the metadata into the corresponding hash table and ordered list based on the determined preset type and key-value pair of the metadata specifically comprises:
    storing the metadata, based on the determined preset type, in a container group corresponding to the preset type; wherein the metadata of each preset type corresponds to one container group, each container group corresponds to at least one hash slot, and the container groups correspond to equal numbers of hash slots;
    determining a CRC16 value of the metadata based on the key-value pair of the metadata, and determining the hash slot corresponding to the metadata based on the total number of hash slots and the determined CRC16 value;
    mapping the metadata to, and storing it in, the hash table corresponding to the determined hash slot; wherein each hash table corresponds to at least one hash slot;
    assigning a corresponding score value to the metadata, determining the ordered list corresponding to the metadata based on the assigned score value, and storing the metadata in the corresponding ordered list; wherein the metadata items in the ordered list are sorted in order of their score values.
  3. The metadata storage method based on a distributed key-value database according to claim 2, characterized in that the number of hash tables is determined based on a preset number and the total number of hash slots, the number of hash tables exceeding the preset number and being a factor of the total number.
  4. The metadata storage method based on a distributed key-value database according to claim 1, characterized in that the method further comprises the following steps:
    determining a metadata retrieval request of a client; wherein the metadata retrieval request contains retrieval information, and the retrieval information comprises an index order and a number of index elements for each ordered list;
    taking the corresponding metadata out of each ordered list based on the retrieval information;
    storing the metadata in a preset map table, sorting the metadata stored in the preset map table by means of the preset map table, and returning the preset map table, with the metadata stored and sorted, to the client.
  5. The metadata storage method based on a distributed key-value database according to claim 1, characterized in that the method further comprises the following steps:
    determining a metadata retrieval request of a client; wherein the metadata retrieval request contains retrieval information, and the retrieval information comprises reference metadata and a number of index elements;
    determining a storage position of the reference metadata in each ordered list;
    taking the corresponding metadata out of each ordered list based on the retrieval information and the storage position;
    storing the metadata in a preset map table, sorting the metadata stored in the preset map table by means of the preset map table, and returning the preset map table, with the metadata stored and sorted, to the client.
  6. The metadata storage method based on a distributed key-value database according to claim 4 or 5, characterized in that the storing of the metadata in the preset map table, the sorting of the metadata stored in the preset map table by means of the preset map table, and the returning of the preset map table, with the metadata stored and sorted, to the client further comprise:
    determining that the preset map table has reached a storage upper limit and that unstored metadata remains, and determining, based on the preset map table, a map-table sort value of the unstored metadata and a map-table sort value of the metadata at the tail of the preset map table;
    determining that the map-table sort value of the tail metadata exceeds the map-table sort value of the unstored metadata, deleting the tail metadata, and storing the unstored metadata in the preset map table.
  7. A metadata storage apparatus based on a distributed key-value database, characterized in that the apparatus is applied to a server on which a distributed object storage system is installed, the apparatus comprising:
    a first determination module, configured to determine metadata of objects stored in a bucket, and to determine a preset type and a key-value pair of the metadata;
    a data storage module, configured to save the metadata into a corresponding hash table and a corresponding ordered list based on the determined preset type and key-value pair of the metadata; wherein each bucket in the distributed storage system stores metadata through at least one hash table and at least one ordered list, the hash table being used to store the metadata, and the ordered list being used to store retrieval information of the metadata.
  8. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the metadata storage method based on a distributed key-value database according to any one of claims 1 to 6.
  9. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the metadata storage method based on a distributed key-value database according to any one of claims 1 to 6.
  10. A computer program product, comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the metadata storage method based on a distributed key-value database according to any one of claims 1 to 6.
PCT/CN2022/141807 2022-07-29 2022-12-26 Metadata storage method and apparatus based on distributed key-value database WO2024021488A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210906067.6A CN115454994A (en) 2022-07-29 2022-07-29 Metadata storage method and device based on distributed key value database
CN202210906067.6 2022-07-29

Publications (1)

Publication Number Publication Date
WO2024021488A1 true WO2024021488A1 (en) 2024-02-01

Family

ID=84297062

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/141807 WO2024021488A1 (en) 2022-07-29 2022-12-26 Metadata storage method and apparatus based on distributed key-value database

Country Status (2)

Country Link
CN (1) CN115454994A (en)
WO (1) WO2024021488A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115454994A (en) * 2022-07-29 2022-12-09 天翼云科技有限公司 Metadata storage method and device based on distributed key value database

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150154422A1 (en) * 2013-11-29 2015-06-04 Thomson Licensing Method for determining a statistic value on data based on encrypted data
CN113821171A (en) * 2021-09-01 2021-12-21 浪潮云信息技术股份公司 Key value storage method based on hash table and LSM tree
CN113886331A (en) * 2021-12-03 2022-01-04 苏州浪潮智能科技有限公司 Distributed object storage method and device, electronic equipment and readable storage medium
CN115454994A (en) * 2022-07-29 2022-12-09 天翼云科技有限公司 Metadata storage method and device based on distributed key value database

Also Published As

Publication number Publication date
CN115454994A (en) 2022-12-09

Similar Documents

Publication Publication Date Title
JP7053682B2 (en) Database tenant migration system and method
US11693830B2 (en) Metadata management method, system and medium
US9684702B2 (en) Database redistribution utilizing virtual partitions
US10248676B2 (en) Efficient B-Tree data serialization
US20160350302A1 (en) Dynamically splitting a range of a node in a distributed hash table
US8924357B2 (en) Storage performance optimization
US20160283538A1 (en) Fast multi-tier indexing supporting dynamic update
US11580162B2 (en) Key value append
WO2009009556A1 (en) Method and system for performing a scan operation on a table of a column-oriented database
US20100082546A1 (en) Storage Tiers for Database Server System
EP3803615A1 (en) Scalable multi-tier storage structures and techniques for accessing entries therein
US10909091B1 (en) On-demand data schema modifications
JP6707797B2 (en) Database management system and database management method
CN113535670A (en) Virtual resource mirror image storage system and implementation method thereof
WO2024021488A1 (en) Metadata storage method and apparatus based on distributed key-value database
US10127238B1 (en) Methods and apparatus for filtering dynamically loadable namespaces (DLNs)
EP4302200A1 (en) Measuring and improving index quality in a distributed data system
Eisa et al. A fragmentation algorithm for storage management in cloud database environment
WO2022267508A1 (en) Metadata compression method and apparatus
WO2022121274A1 (en) Metadata management method and apparatus in storage system, and storage system
JP2012243039A (en) Snap shot data storage method
US10997126B1 (en) Methods and apparatus for reorganizing dynamically loadable namespaces (DLNs)
WO2023237120A1 (en) Data processing system and apparatus
Herodotou Towards a distributed multi-tier file system for cluster computing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952917

Country of ref document: EP

Kind code of ref document: A1