WO2020052379A1 - Method and apparatus for processing object metadata in a distributed storage system - Google Patents

Method and apparatus for processing object metadata in a distributed storage system

Info

Publication number
WO2020052379A1
Authority
WO
WIPO (PCT)
Prior art keywords
bucket
logical sub-bucket
name
service node
storage system
Prior art date
Application number
PCT/CN2019/099610
Other languages
English (en)
French (fr)
Inventor
谢晓芹
李坤
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2020052379A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20: Information retrieval of structured data, e.g. relational data
    • G06F 16/22: Indexing; Data structures therefor; Storage structures
    • G06F 16/2228: Indexing structures
    • G06F 16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F 16/278: Data partitioning, e.g. horizontal or vertical partitioning
    • G06F 16/90: Details of database functions independent of the retrieved data types
    • G06F 16/907: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Abstract

A method and apparatus for processing object metadata in a distributed storage system, relating to the field of storage technologies and capable of solving the problem of uneven distribution of object metadata. Here, a bucket of the distributed storage system includes at least two logical sub-buckets, the at least two logical sub-buckets are mapped to different partitions, a first logical sub-bucket among the at least two logical sub-buckets is mapped to a first partition, and an index node of the distributed storage system manages the first partition. The method is as follows: a service node of the distributed storage system receives an IO operation that includes the name of the bucket and the name of a first object; the service node selects the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object; and the service node, according to the mapping relationship between the first logical sub-bucket and the first partition, sends the index node a processing request that includes the name of the first logical sub-bucket and the name of the first object, where the processing request is used to request that the metadata of the first object be processed in the first partition.

Description

Method and apparatus for processing object metadata in a distributed storage system
Technical Field
Embodiments of the present invention relate to the field of storage technologies, and in particular, to a method and apparatus for processing object metadata in a distributed storage system.
Background
In a distributed storage system, a service node determines the partition to which an object's metadata belongs according to the shard key (ShardKey) included in the object's identifier (that is, it determines into which partition the object's metadata is written), so that the index node managing that partition manages the object's metadata. Generally, the shard key includes the name of the bucket and the name of the object. As a result, in scenarios where key values are arranged sequentially, the metadata of the objects in a given bucket of the distributed storage system always belongs to one partition, which creates a partition hotspot, and the metadata of the objects in that bucket is distributed unevenly. Even if the number of partitions in the distributed storage system grows later, the problem of the uneven distribution of object metadata still cannot be solved.
Summary
This application provides a method and apparatus for processing object metadata in a distributed storage system, which can solve the problem of uneven distribution of object metadata.
To achieve the above objective, this application adopts the following technical solutions:
According to a first aspect, a method for processing object metadata in a distributed storage system is provided. A bucket of the distributed storage system includes at least two logical sub-buckets, the at least two logical sub-buckets are mapped to different partitions, a first logical sub-bucket among the at least two logical sub-buckets is mapped to a first partition, and an index node in the distributed storage system is used to manage the first partition. Specifically, the method is as follows: after receiving an input/output (IO) operation that includes the name of the bucket and the name of a first object, a service node in the distributed storage system selects the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object, and, according to the mapping relationship between the first logical sub-bucket and the first partition, sends the index node a processing request that includes the name of the first logical sub-bucket and the name of the first object, where the processing request is used to request that the metadata of the first object be processed in the first partition.
In the method provided by this application, the concept of a logical sub-bucket is introduced between buckets and partitions: a bucket includes at least two logical sub-buckets, and each logical sub-bucket is mapped to a different partition. In this way, a service node can hash the metadata of different objects in one bucket to different logical sub-buckets and thus write it to different partitions, which effectively improves the uniformity of the metadata distribution of different objects in the same bucket and avoids partition hotspots.
Optionally, in one possible implementation of this application, the service node selects the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object as follows: the service node determines the bucket according to the name of the bucket, and then determines the number of the first logical sub-bucket according to the name of the first object, the number of logical sub-buckets in the bucket, the starting number of the logical sub-bucket numbers in the bucket, and the initial number of partitions of the distributed storage system. Once the number of the first logical sub-bucket has been determined, the service node can select the first logical sub-bucket from the at least two logical sub-buckets according to that number.
Further, before sending the processing request to the index node according to the mapping relationship between the first logical sub-bucket and the first partition, the service node also generates the name of the first logical sub-bucket, which includes the number of the first logical sub-bucket and the name of the bucket. A bucket includes at least two logical sub-buckets, which can be distinguished within the bucket by their numbers. Because the distributed storage system may include at least two buckets, the name of each logical sub-bucket needs to be expressed using the name of the bucket to which it belongs together with the number of the logical sub-bucket.
Optionally, in another possible implementation of this application, before receiving the input/output (IO) operation, the service node also establishes the bucket, determines the number of logical sub-buckets in the bucket according to the performance indicators of the bucket, and determines the starting number of the logical sub-buckets in the bucket according to the initial number of partitions of the distributed storage system and a random number.
Optionally, in another possible implementation of this application, the initial number of partitions of the distributed storage system is not less than 2. With an initial partition count of not less than 2, the initial performance of the distributed storage system is effectively improved.
According to a second aspect, a service node is provided, applied to a distributed storage system that further includes an index node. A bucket of the distributed storage system includes at least two logical sub-buckets, the at least two logical sub-buckets are mapped to different partitions, a first logical sub-bucket among the at least two logical sub-buckets is mapped to a first partition, and the index node is used to manage the first partition. The service node provided by this application includes a receiving unit, a processing unit, and a sending unit.
Specifically, the receiving unit is configured to receive an IO operation that includes the name of the bucket and the name of a first object. The processing unit is configured to select the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object. The sending unit is configured to send a processing request to the index node according to the mapping relationship between the first logical sub-bucket and the first partition, where the processing request includes the name of the first logical sub-bucket and the name of the first object and is used to request that the metadata of the first object be processed in the first partition.
Optionally, in one possible implementation of this application, the processing unit is specifically configured to determine the bucket according to the name of the bucket, and to determine the number of the first logical sub-bucket according to the name of the first object, the number of logical sub-buckets in the bucket, the starting number of the logical sub-bucket numbers in the bucket, and the initial number of partitions of the distributed storage system. In addition, the processing unit is further configured to generate the name of the first logical sub-bucket, which includes the number of the first logical sub-bucket and the name of the bucket, before the sending unit sends the processing request to the index node according to the mapping relationship between the first logical sub-bucket and the first partition.
Optionally, in another possible implementation of this application, the processing unit is further configured to establish the bucket before the receiving unit receives the IO operation, to determine the number of logical sub-buckets in the bucket according to the performance indicators of the bucket, and to determine the starting number of the logical sub-buckets in the bucket according to the initial number of partitions of the distributed storage system and a random number.
Optionally, in another possible implementation of this application, the initial number of partitions of the distributed storage system is not less than 2.
According to a third aspect, a service node is provided that includes one or more processors and a memory. The memory is connected to the one or more processors and is configured to store computer instructions; when the one or more processors execute the computer instructions, the service node performs the method according to the first aspect or any one of its possible implementations.
According to a fourth aspect, a computer program product containing instructions is provided. The computer program product includes computer instructions; when the processor of the service node according to the third aspect executes the computer instructions, the service node is caused to perform the method according to the first aspect or any one of its possible implementations.
According to a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium includes computer instructions; when the processor of the service node according to the third aspect executes the computer instructions, the service node is caused to perform the method according to the first aspect or any one of its possible implementations.
In this application, the name of the above service node does not limit the device or functional module itself; in actual implementations, these devices or functional modules may appear under other names. As long as the functions of each device or functional module are similar to those of this application, they fall within the scope of the claims of this application and their equivalent technologies.
For detailed descriptions of the second to fifth aspects and their various implementations, refer to the detailed description of the first aspect and its various implementations; for the beneficial effects of the second to fifth aspects and their various implementations, refer to the analysis of the beneficial effects of the first aspect and its various implementations. Details are not repeated here.
These and other aspects of this application will be more concise and comprehensible in the following description.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of partitions in an embodiment of the present invention;
FIG. 2 is a first schematic structural diagram of a distributed storage system in an embodiment of the present invention;
FIG. 3 is a second schematic structural diagram of a distributed storage system in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the hardware structure of a service node in an embodiment of the present invention;
FIG. 5 is a schematic flowchart of a method for processing object metadata in an embodiment of the present invention;
FIG. 6 is a first schematic structural diagram of a service node in an embodiment of the present invention;
FIG. 7 is a second schematic structural diagram of a service node provided in an embodiment of the present invention.
Detailed Description
The terms "first", "second", "third", "fourth", and the like in the specification, claims, and accompanying drawings of the embodiments of the present invention are used to distinguish different objects, not to define a particular order.
In the embodiments of the present invention, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present invention should not be construed as more preferred or advantageous than other embodiments or designs. Rather, the words "exemplary" and "for example" are intended to present the relevant concept in a concrete manner.
A distributed storage system includes multiple servers. Generally, the servers in a distributed storage system are divided, according to the functions they perform, into service nodes, index nodes, and storage nodes. A storage node is used to store object data and/or object metadata; specifically, the data and/or metadata of an object are stored on the storage media (such as hard disks) of the storage node. An index node is used to manage object metadata and can write an object's metadata into one of the partitions it manages. A service node is used to determine into which partition the metadata of the objects in a bucket is written.
In a distributed storage system, the metadata of each object is stored in a metadata table in the form of a data entry. The identifier of each object includes the name of the bucket, the name of the object, and the version number of the object.
To accommodate the ever-growing number of data entries in the metadata table, a distributed storage system usually uses partitioning technology to split the metadata table dynamically; after the split, each partition manages different data entries. Each partition is managed by one index node in the distributed storage system. Dynamic partitioning enables dynamic horizontal scaling of the distributed storage system.
In partitioning technology, a distributed storage system usually uses the values of at least one column of a data entry to determine the partition to which the data entry belongs; the combination of the values of that at least one column is called the shard key (ShardKey). That is, for each data entry in the metadata table, the partition to which the data entry belongs can be uniquely determined from the shard key of the data entry.
Exemplarily, Table 1 below is a metadata table in which the combination of "bucket name", "object name", and "object version number" uniquely identifies an object, that is, an object's identifier is {bucket name, object name, object version number}. To accommodate the ever-growing amount of metadata, the distributed storage system can use {bucket name, object name} as the ShardKey to make it easy to index different partitions.
Table 1

Bucket name | Object name | Object version number | …
A | 0109 | 1 | …
A | 0109 | 2 | …
B | 0201 | … | …
The ShardKey of the data entry in row 1 of Table 1 is {bucket name "A", object name "0109"}. The ShardKey of the data entry in row 2 is {bucket name "A", object name "0109"}, the same as the ShardKey of the data entry in row 1; therefore, the metadata represented by the row-1 entry and the metadata represented by the row-2 entry belong to the same partition. However, the object identifier represented by the row-1 entry, {bucket name "A", object name "0109", object version number "1"}, differs from the object identifier represented by the row-2 entry, {bucket name "A", object name "0109", object version number "2"}. The ShardKey of the data entry in row 3 is {bucket name "B", object name "0201"}.
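To make the shard-key behavior above concrete, here is a minimal Python sketch (the tuple layout and function name are illustrative assumptions, not the patent's implementation; row 3's version number is not given in the text, so the value used here is also illustrative):

```python
# Sketch: the shard key is the (bucket name, object name) pair; the
# version number is part of the object's identifier but not of the key,
# so every version of an object lands in the same partition.
def shard_key(bucket_name: str, object_name: str, version: int) -> tuple:
    return (bucket_name, object_name)

rows = [("A", "0109", 1), ("A", "0109", 2), ("B", "0201", 1)]
keys = [shard_key(*row) for row in rows]
assert keys[0] == keys[1]  # rows 1 and 2 of Table 1 share a partition
assert keys[0] != keys[2]  # row 3 keys to a different partition
```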
To ensure that object metadata is arranged in the natural order of the ShardKey, distributed storage systems usually adopt range partitioning (Range Partition) technology. For metadata that falls on a partition boundary, a left-closed/right-open or left-open/right-closed rule is used to determine the partition to which the metadata belongs.
Exemplarily, FIG. 1 shows the structure of partitions in a distributed storage system. In FIG. 1, bn denotes the name of the bucket, on denotes the name of the object, and ver denotes the version number of the object. For metadata on a partition boundary, if the left-open/right-closed rule is used to determine the partition to which the metadata belongs, the ShardKey range of partition 1 is ({bn: min; on: max}, {bn: 0011; on: max}], and the ShardKey range of partition 2 is ({bn: 0011; on: max}, {bn: 0020; on: max}]. If an object's identifier is {bn: 0010; on: max; ver: 1}, the partition to which that object's metadata belongs is partition 1.
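The left-open/right-closed lookup in FIG. 1 can be sketched as a binary search over the partitions' upper bounds (a sketch under assumed lexicographic tuple ordering; the boundary values are taken from the figure):

```python
import bisect

# Upper bounds of partitions 1 and 2 in FIG. 1, as (bn, on) pairs;
# partition i covers the interval (previous bound, own bound].
PARTITION_UPPER_BOUNDS = [("0011", "max"), ("0020", "max")]

def partition_of(bn: str, on: str) -> int:
    """Return the 1-based partition index of a shard key."""
    # bisect_left yields the first bound >= key, i.e. the partition
    # whose right-closed interval contains the key.
    return bisect.bisect_left(PARTITION_UPPER_BOUNDS, (bn, on)) + 1

assert partition_of("0010", "max") == 1  # the example object in FIG. 1
assert partition_of("0011", "max") == 1  # boundary key stays in partition 1
```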
In a distributed storage system, because an object's shard key uniquely determines the partition to which the object's metadata belongs, in scenarios where shard keys are arranged sequentially the metadata of multiple objects in the same bucket always belongs to the same partition. This creates a partition hotspot, and the metadata of the objects in the bucket is distributed unevenly.
In addition, after a traditional distributed storage system starts for the first time, the whole distributed storage system contains only one partition, and metadata operations on all objects are served by that partition. After running for some time and undergoing multiple splits, the distributed storage system contains multiple partitions, and its performance scales linearly. Under this approach, however, the initial performance of the distributed storage system is low.
In view of this, embodiments of the present invention provide a method and apparatus for processing object metadata in a distributed storage system. The basic principle is to establish at least two logical sub-buckets in each bucket of the distributed storage system, hash the metadata of the objects in one bucket to different logical sub-buckets, and map each logical sub-bucket to a different partition, so that the metadata of the objects in the same bucket belongs to different partitions, effectively improving the uniformity of the distribution of object metadata.
Further, in the method for processing object metadata in a distributed storage system provided by the embodiments of the present invention, multiple partitions can also be configured according to the scale and performance of each service node when the distributed storage system is initially deployed. That is, when the distributed storage system is initially deployed, an initial number of partitions is configured for the distributed storage system, and this initial number of partitions is not less than 2. This guarantees the initial multi-partition capability of the distributed storage system and effectively improves its initial performance.
The technical solutions provided by the embodiments of the present invention are described below by way of example with reference to the accompanying drawings.
FIG. 2 is a schematic structural diagram of a distributed storage system according to an embodiment of the present invention. As shown in FIG. 2, the distributed storage system includes at least one service node 20, at least two index nodes 21, and at least two storage nodes 22. Each index node 21 can communicate with each service node 20 and can also communicate with each storage node 22. The at least one service node 20 in the distributed storage system may form a service node cluster, the index nodes 21 may form an index node cluster, and the storage nodes 22 may form a storage array.
A service node 20 can construct buckets in the distributed storage system and establish at least two logical sub-buckets in a bucket that has been created. The numbers of the at least two logical sub-buckets are arranged in lexicographic order, and the name of each logical sub-bucket includes the number of the logical sub-bucket and the name of the bucket to which the logical sub-bucket belongs.
The at least two logical sub-buckets are mapped to different partitions. That is, for the metadata of an object in a logical sub-bucket, the partition to which the object's metadata belongs has a mapping relationship with that logical sub-bucket. The service node 20 can determine the logical sub-bucket according to the object's shard key, and determine the index node according to the mapping relationship between the logical sub-bucket and the partition.
Specifically, in the case where the first logical sub-bucket in a target bucket is mapped to a first partition and a target index node in the distributed storage system manages the first partition, with reference to FIG. 5 below, the service node 20 is configured to: establish the target bucket, determine the number of logical sub-buckets in the target bucket according to the performance indicators of the target bucket, and determine the starting number of the logical sub-buckets in the target bucket (S500); receive a write operation that includes the name of the target bucket and the name of a first object (S501); select the first logical sub-bucket from the target bucket according to the name of the target bucket and the name of the first object (S502); and send the target index node a processing request that includes the name of the first logical sub-bucket and the name of the first object according to the mapping relationship between the first logical sub-bucket and the first partition (S503).
In the embodiments of the present invention, an IO operation may be a write operation or a read operation. A write operation may carry a to-be-written logical block address (LBA) and the data to be written, and is used to write the to-be-written data into the physical storage space corresponding to the to-be-written LBA. A read operation may carry the to-be-read LBA, and is used to read the data stored in the physical storage space corresponding to the to-be-read LBA (that is, the to-be-read data).
An index node 21 is used to manage object metadata and can write an object's metadata into one of the partitions it manages. An index node 21 may manage at least one partition of the distributed storage system. After receiving a processing request sent by a service node 20 that includes the name of a logical sub-bucket and the name of an object, the index node 21 processes the object's metadata in the partition that has a mapping relationship with that logical sub-bucket.
A storage node 22 is used to store object data and/or object metadata; specifically, the data and/or metadata of an object are stored on the storage media of the storage node. Optionally, the physical form of the storage media of a storage node 22 may be a solid-state drive (SSD) or a hard disk drive (HDD); this is not specifically limited in the embodiments of the present invention.
The service node 20, index node 21, and storage node 22 in the embodiments of the present invention may each be a physical machine (such as a server), a virtual machine, or any other device that provides an object storage service; this is not specifically limited in the embodiments of the present invention.
Exemplarily, FIG. 3 is another schematic structural diagram of a distributed storage system in an embodiment of the present invention. As shown in FIG. 3, the distributed storage system includes n (n ≥ 1) service nodes, m (m ≥ 2) index nodes, and k (k ≥ 2) storage nodes, where the n service nodes are service node 1, service node 2, ..., service node n; the m index nodes are index node 1, index node 2, ..., index node m; and the k storage nodes are storage node 1, storage node 2, ..., storage node k.
Each service node can create a bucket and establish at least two logical sub-buckets in a created bucket; the numbers of the at least two logical sub-buckets are arranged in lexicographic order, and the name of each logical sub-bucket includes the number of the logical sub-bucket and the name of the bucket to which the logical sub-bucket belongs. In FIG. 3, service node 1 has created bucket 1, and five logical sub-buckets have been established in bucket 1, named 001-bck1, 002-bck1, 003-bck1, 004-bck1, and 005-bck1. 001-bck1 indicates the logical sub-bucket numbered 001 in bucket 1, 002-bck1 indicates the logical sub-bucket numbered 002 in bucket 1, and so on.
Each logical sub-bucket is mapped to one partition, and the at least two logical sub-buckets of the same bucket are mapped to different partitions. That is, for the metadata of an object in a logical sub-bucket, the partition to which the object's metadata belongs has a mapping relationship with that logical sub-bucket. In FIG. 3, logical sub-bucket 001-bck1 is mapped to partition 1, and logical sub-bucket 003-bck1 is mapped to partition i; that is, there is a mapping relationship between logical sub-bucket 001-bck1 and partition 1, and between logical sub-bucket 003-bck1 and partition i.
Each index node in the distributed storage system manages at least one partition. In FIG. 3, index node 2 manages partition 1 and index node 1 manages partition i. Thus, for a write operation, index node 2 can write the metadata of the objects in logical sub-bucket 001-bck1 into partition 1, and index node 1 can write the metadata of the objects in logical sub-bucket 003-bck1 into partition i. Specifically, index node 2 writes the metadata of the objects in logical sub-bucket 001-bck1 to the storage node corresponding to partition 1, and index node 1 writes the metadata of the objects in logical sub-bucket 003-bck1 to the storage node corresponding to partition i.
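The FIG. 3 mapping chain (logical sub-bucket to partition to index node) amounts to two lookups; the sketch below mirrors the figure's example values (the dictionary-based layout is an assumption for illustration, not the patent's data structure):

```python
# Mappings taken from the FIG. 3 example.
SUB_BUCKET_TO_PARTITION = {"001-bck1": "partition 1", "003-bck1": "partition i"}
PARTITION_TO_INDEX_NODE = {"partition 1": "index node 2", "partition i": "index node 1"}

def route(sub_bucket_name: str) -> str:
    """Return the index node that handles a logical sub-bucket's metadata."""
    return PARTITION_TO_INDEX_NODE[SUB_BUCKET_TO_PARTITION[sub_bucket_name]]

assert route("001-bck1") == "index node 2"
assert route("003-bck1") == "index node 1"
```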
Usually, before an IO operation is executed, the distributed storage system first needs to be set up; specifically, partitions are deployed on the service nodes of the distributed storage system and logical sub-buckets are established according to a configuration file. The configuration file can be used to record, among other things, the initial number of partitions in the distributed storage system (that is, the number of initial partitions), the partition that has a mapping relationship with each logical sub-bucket, and the number of index nodes.
FIG. 4 is a schematic structural diagram of a service node 20 according to an embodiment of the present invention. As shown in FIG. 4, the service node 20 provided by this embodiment of the present invention includes a communication interface 40, a communication interface 41, and a control module 42. The communication interface 40 and the communication interface 41 are each connected to the control module 42. In the service node 20, the communication interface 40 is used to communicate with clients, and the communication interface 41 is used to communicate with the index nodes 21. The communication interface 40 and the communication interface 41 communicate over a communication network, such as Ethernet or a wireless local area network (WLAN).
The control module 42 establishes at least two logical sub-buckets in the target bucket and is further configured to select the first logical sub-bucket in the target bucket.
As shown in FIG. 4, the control module 42 includes a processor 421 and a memory 422, and the processor 421 and the memory 422 are connected. The communication interface 40, the communication interface 41, the processor 421, and the memory 422 may be connected via a system bus 43. The memory 422 may exist independently and be connected to the processor 421 via the system bus 43, or the memory 422 may be integrated with the processor 421.
The processor 421 is configured to establish at least two logical sub-buckets in the target bucket, and is further configured to select the first logical sub-bucket in the target bucket.
The memory 422 is used to temporarily store information received by the service node 20. The memory 422 is also used to store software programs and application modules; the processor 421 performs the various functional applications and data processing of the service node 20 by running the software programs and application modules stored in the memory 422.
The memory 422 mainly includes a program storage area 4221, which can store an operating system and the application programs required for at least one function, such as selecting the first logical sub-bucket.
The processor 421 may be any computing device: a general-purpose central processing unit (CPU), a microprocessor, a programmable controller, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the above solutions. The processor 421 is the control center of the service node 20. The processor 421 connects the parts of the service node 20 through various interfaces and lines and, by running or executing the software programs and/or application modules stored in the memory 422, performs the various functions of the service node 20 and processes data, thereby monitoring the service node 20 as a whole. In a specific implementation, as one embodiment, the processor may include one or more CPUs; for example, the processor in FIG. 4 includes CPU 0 and CPU 1.
The memory 422 may include volatile memory, for example random-access memory (RAM); the memory 422 may also include non-volatile memory, for example read-only memory (ROM), flash memory, a hard disk drive (HDD), a solid-state drive (SSD), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a network device, but it is not limited to these.
The system bus 43 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
The system bus 43 can be divided into an address bus, a data bus, a control bus, and so on. For clarity of description in the embodiments of the present invention, the various buses are all drawn as the system bus 43 in FIG. 4.
The method for processing object metadata in a distributed storage system provided by the embodiments of the present invention is described below with reference to FIG. 2 to FIG. 4.
For ease of description, the case in which a target index node in the distributed storage system manages the first partition is taken as an example. FIG. 5 is an interaction diagram of a method for processing object metadata in a distributed storage system according to an embodiment of the present invention; the description specifically takes the case in which the IO operation is a write operation as an example. The method shown in FIG. 5 may include the following steps:
S500: The service node establishes a target bucket, determines the number of logical sub-buckets in the target bucket according to the performance indicators of the target bucket, and determines the starting number of the logical sub-buckets in the target bucket.
The service node is any one of the at least one service node in the distributed storage system.
Specifically, when a user needs to create a target bucket, the user can send a bucket-creation request to the distributed storage system. After receiving the bucket-creation request, a service node in the distributed storage system creates the target bucket according to the relevant attributes of the target bucket, such as the name and size of the bucket.
After creating the target bucket, the service node determines the number of logical sub-buckets according to the performance indicators of the target bucket. It is easy to see that the number of logical sub-buckets in the target bucket directly affects the initial performance of the target bucket. Therefore, if the user has not set the number of logical sub-buckets, the number of logical sub-buckets in the target bucket takes a default value; if the number of logical sub-buckets set by the user is too large, the number of logical sub-buckets in the target bucket is the maximum sub-bucket count of the distributed storage system.
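A minimal sketch of this rule (the default and maximum values are assumptions; the patent names neither):

```python
# Sketch: resolve the logical sub-bucket count for a new target bucket.
DEFAULT_SUB_BUCKET_COUNT = 8   # assumed system default
MAX_SUB_BUCKET_COUNT = 64      # assumed system-wide maximum

def resolve_sub_bucket_count(requested: int | None) -> int:
    if requested is None:                        # user set no count
        return DEFAULT_SUB_BUCKET_COUNT
    return min(requested, MAX_SUB_BUCKET_COUNT)  # cap oversized requests
```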
The service node needs to determine not only the number of logical sub-buckets in the target bucket but also the starting number of the logical sub-buckets in the target bucket. Optionally, the numbers of the logical sub-buckets are arranged in lexicographic order; in this way, once the starting number of the logical sub-buckets has been determined, the service node can determine the number of every logical sub-bucket in the target bucket.
Optionally, according to the initial number of partitions of the distributed storage system and a random number, the service node may determine the starting number of the logical sub-buckets in the target bucket using a modulo operation, using simple addition and subtraction, or using multiply-and-add operations; this is not specifically limited in the embodiments of the present invention.
Exemplarily, the service node calculates the starting number of the logical sub-buckets in the target bucket using the following formula:
S0 = Rand() % c
where S0 is the starting number of the logical sub-bucket numbers in the target bucket, Rand() is a random number, and c is the initial number of partitions of the distributed storage system.
Because a random number is used in calculating the starting number of the logical sub-buckets with the above formula, the starting number of the logical sub-buckets in the target bucket is itself random.
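A sketch of this step, together with the consecutive numbering described above (wrapping the numbers modulo c is an inference from the formula for S1 in S502 below, not something the text states):

```python
import random

def starting_number(c: int) -> int:
    """S0 = Rand() % c: a random starting number in [0, c)."""
    return random.getrandbits(31) % c

# Example: c = 64 initial partitions, b = 5 sub-buckets in the bucket.
c, b = 64, 5
s0 = starting_number(c)
sub_bucket_numbers = [(s0 + i) % c for i in range(b)]  # assumed numbering
```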
After determining the number of logical sub-buckets in the target bucket and the starting number of the logical sub-buckets in the target bucket, the service node stores them.
The initial number of partitions in the embodiments of the present invention is the total number of partitions configured for the distributed storage system when the distributed system is newly created. The initial number of partitions may be 1, or it may be not less than 2.
When the initial number of partitions is 1, the number of partitions in the distributed system increases after it has run for some time (refer to the description above). In that case, the method for processing object metadata provided by the embodiments of the present invention is applicable to the scenario in which the number of partitions subsequently increases.
When the initial number of partitions is not less than 2, the distributed storage system determines the initial number of partitions as follows: a certain number of partitions is configured for each service node according to the performance of that service node; the sum of the partition counts of all service nodes is then the initial number of partitions.
Exemplarily, if the distributed storage system includes 16 service nodes, and 4 partitions per service node are enough to exploit the full capability of each service node, the initial number of partitions is 16 × 4 = 64.
Compared with an initial partition count of 1, an initial partition count of not less than 2 effectively improves the initial performance of the distributed storage system.
S501. The service node receives a write operation, where the write operation is used to request writing the metadata of a first object in the target bucket.
The write operation includes the identifier of the first object, and the identifier includes the name of the target bucket, the name of the first object, and the version number of the first object.
S502. The service node selects a first logical sub-bucket from the target bucket according to the name of the target bucket and the name of the first object.
Specifically, the service node determines the target bucket according to its name; the service node then computes the number of the target logical sub-bucket from the name of the first object, the number of logical sub-buckets in the target bucket, the starting number of the logical sub-buckets in the target bucket, and the initial number of partitions.
Optionally, the service node may compute the number of the target logical sub-bucket using a modulo operation, a simple addition/subtraction operation, or a multiply-add operation; this is not specifically limited in the embodiments of the present invention.
Exemplarily, the service node computes the number of the target logical sub-bucket using the following formula:
S1 = (S0 + hash(a) % b) % c
where S1 denotes the number of the target logical sub-bucket, S0 denotes the starting number of the logical sub-buckets in the target bucket, a denotes the name of the first object, b is the number of logical sub-buckets in the target bucket, c is the initial number of partitions, and hash(a) denotes the hash value of the name of the first object.
Once the service node has computed the number of the target logical sub-bucket, it can determine the target logical sub-bucket.
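A sketch of the selection step under the formula above. The patent does not fix a particular hash function, so an MD5 digest of the object name is used here purely for illustration:

import hashlib

def target_sub_bucket_number(s0: int, object_name: str,
                             sub_bucket_count: int,
                             initial_partition_count: int) -> int:
    """S1 = (S0 + hash(a) % b) % c."""
    # Stable integer hash of the object name (MD5 chosen only as an example).
    h = int(hashlib.md5(object_name.encode("utf-8")).hexdigest(), 16)
    return (s0 + h % sub_bucket_count) % initial_partition_count

s1 = target_sub_bucket_number(s0=3, object_name="photo.jpg",
                              sub_bucket_count=5, initial_partition_count=64)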
S503. The service node sends a processing request to the target index node, where the processing request includes the name of the target logical sub-bucket and the name of the first object, and the processing request is used to request that the metadata of the first object be written into the partition corresponding to the target logical sub-bucket.
After computing the number of the target logical sub-bucket, the service node generates the name of the target logical sub-bucket from that number and the name of the target bucket; the name of the target logical sub-bucket includes the number of the target logical sub-bucket and the name of the target bucket.
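Continuing the sketch, the sub-bucket name could be assembled from the computed number and the bucket name and carried in the processing request; the padding width and the request field names below are assumptions, not part of the claimed format:

def sub_bucket_name(number: int, bucket_name: str) -> str:
    # e.g. number=1, bucket_name='bck1' -> '001-bck1' (padding width assumed)
    return f"{number:03d}-{bucket_name}"

# Hypothetical shape of the processing request sent to the target index node.
processing_request = {
    "sub_bucket_name": sub_bucket_name(1, "bck1"),
    "object_name": "photo.jpg",
    "op": "write_metadata",
}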
S504. The target index node writes the metadata of the first object into a storage node.
For the method by which the target index node writes the metadata of the first object into the storage node, reference may be made to the prior-art method in which an index node stores the metadata of an object in a storage node; this is not specifically limited in the embodiments of the present invention.
Because at least two logical sub-buckets are established in the target bucket and are mapped to different partitions, the metadata of different objects in the same bucket can be hashed into different logical sub-buckets and thus written into different partitions, which effectively improves the uniformity of the distribution of object metadata.
When the IO operation is a read operation, the flow of the method for processing the metadata of an object is similar to the flow shown in FIG. 5, except that the target index node needs to read, from the storage node, the metadata carrying the name of the target logical sub-bucket. The target index node then sends the read metadata, carrying the name of the target logical sub-bucket, to the service node; the service node removes the name of the target logical sub-bucket and sorts the metadata from which that name has been removed, so that the service node obtains all the metadata of the first object.
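For the read path, a minimal sketch of removing the sub-bucket name and sorting the remaining entries; the '<sub-bucket-name>/<metadata-key>' key layout is an assumption made only for this illustration:

def collect_object_metadata(entries):
    """entries: iterable of (key, value) pairs whose keys are assumed
    to look like '<sub-bucket-name>/<metadata-key>'."""
    # Strip the sub-bucket name from each key, then sort what remains.
    stripped = [(key.split("/", 1)[1], value) for key, value in entries]
    return sorted(stripped)

meta = collect_object_metadata([
    ("003-bck1/photo.jpg#v2", "{...}"),
    ("003-bck1/photo.jpg#v1", "{...}"),
])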
In summary, in the method for processing object metadata in a distributed storage system provided by the embodiments of the present invention, at least two logical sub-buckets are established in a bucket of the distributed storage system, the metadata of objects in the same bucket is hashed into different logical sub-buckets, and each logical sub-bucket is in turn mapped to a different partition, so that the metadata of objects in the same bucket belongs to different partitions, effectively improving the uniformity of the distribution of object metadata.
In the embodiments of the present invention, the service node may be divided into functional modules according to the above method examples; for example, each functional module may correspond to one function, or two or more functions may be integrated in one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. The division of modules in the embodiments of the present invention is illustrative and is merely a logical functional division; there may be other division manners in actual implementation.
In the case where each functional module corresponds to one function, FIG. 6 shows a possible schematic structural diagram of the service node involved in the above embodiments. As shown in FIG. 6, the service node 6 includes a receiving unit 60, a processing unit 61, and a sending unit 62.
The receiving unit 60 is configured to support the service node in performing S501 and the like in the above embodiments, and/or other processes of the techniques described herein.
The processing unit 61 is configured to support the service node in performing S500, S502, and the like in the above embodiments, and/or other processes of the techniques described herein.
The sending unit 62 is configured to support the service node in performing S503 and the like in the above embodiments, and/or other processes of the techniques described herein.
Of course, the service node provided by the embodiments of the present invention includes but is not limited to the above modules; for example, the service node may further include a storage unit 63, which may be used to store the program code of the service node.
All relevant content of the steps involved in the above method embodiment may be cited in the functional descriptions of the corresponding functional modules and is not repeated here.
In the case where an integrated unit is adopted, FIG. 7 shows a schematic structural diagram of the service node provided by an embodiment of the present invention. In FIG. 7, the service node 7 includes a processing module 70 and a communication module 71. The processing module 70 is configured to control and manage the actions of the service node, for example, to perform the steps performed by the processing unit 61 above and/or other processes of the techniques described herein. The communication module 71 is configured to support interaction between the service node and other devices, for example, to perform the steps performed by the receiving unit 60 and the sending unit 62 above. As shown in FIG. 7, the service node 7 may further include a storage module 72, configured to store the program code and data of the service node 7, for example, the content saved by the storage unit 63 above.
The processing module 70 corresponds to the processor 421 in FIG. 4, the communication module 71 corresponds to the communication interface 40 and the communication interface 41 in FIG. 4, and the storage module corresponds to the memory 422 in FIG. 4.
All relevant content of the scenarios involved in the above method embodiment may be cited in the functional descriptions of the corresponding functional modules and is not repeated here.
Both the service node 6 and the service node 7 can perform the method for processing the metadata of an object in a distributed storage system shown in FIG. 5.
The functions performed by the service node and the index node in the embodiments of the present invention may also be performed by other nodes in the distributed object storage system; the specific implementation may be determined according to the requirements of the distributed object storage system.
An embodiment of the present invention further provides a computer-readable storage medium that includes computer instructions; when a processor of a service node executes the computer instructions, the service node performs the method for processing the metadata of an object in a distributed storage system shown in FIG. 5.
An embodiment of the present invention further provides a computer program product that includes computer instructions; when a processor of a service node executes the computer instructions, the service node is caused to perform the method for processing the metadata of an object in a distributed storage system shown in FIG. 5.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by a software program, they may appear in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present invention are produced in whole or in part.
The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, digital subscriber line (DSL), or Ethernet) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to the computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)), among others.
From the description of the above implementations, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division into the above functional modules is used as an example; in practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above.
In the embodiments of the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of modules or units is merely a logical functional division, and there may be other division manners in actual implementation; multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and a component displayed as a unit may be one physical unit or multiple physical units; that is, it may be located in one place or distributed in multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.

Claims (11)

  1. A method for processing metadata of an object in a distributed storage system, the distributed storage system including a service node and an index node, wherein a bucket of the distributed storage system includes at least two logical sub-buckets, the at least two logical sub-buckets are mapped to different partitions, a first logical sub-bucket of the at least two logical sub-buckets is mapped to a first partition, and the index node is configured to manage the first partition; the method comprising:
    receiving, by the service node, an input/output (IO) operation, the IO operation including the name of the bucket and the name of a first object;
    selecting, by the service node, the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object;
    sending, by the service node according to the mapping relationship between the first logical sub-bucket and the first partition, a processing request to the index node, the processing request including the name of the first logical sub-bucket and the name of the first object, and the processing request being used to request that the metadata of the first object be processed in the first partition.
  2. The method according to claim 1, wherein the selecting, by the service node, the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object specifically comprises:
    determining, by the service node, the bucket according to the name of the bucket;
    determining, by the service node, the number of the first logical sub-bucket according to the name of the first object, the number of logical sub-buckets in the bucket, the starting number of the logical sub-buckets in the bucket, and the initial number of partitions of the distributed storage system;
    and wherein, before the service node sends the processing request to the index node according to the mapping relationship between the first logical sub-bucket and the first partition, the method further comprises:
    generating, by the service node, the name of the first logical sub-bucket, wherein the name of the first logical sub-bucket includes the number of the first logical sub-bucket and the name of the bucket.
  3. The method according to claim 1 or 2, wherein, before the service node receives the IO operation, the method further comprises:
    establishing, by the service node, the bucket;
    determining, by the service node, the number of logical sub-buckets in the bucket according to a performance indicator of the bucket;
    determining, by the service node, the starting number of the logical sub-buckets in the bucket according to the initial number of partitions of the distributed storage system and a random number.
  4. The method according to claim 3, wherein
    the initial number of partitions of the distributed storage system is not less than 2.
  5. A service node, applied to a distributed storage system, the distributed storage system further including an index node, wherein a bucket of the distributed storage system includes at least two logical sub-buckets, the at least two logical sub-buckets are mapped to different partitions, a first logical sub-bucket of the at least two logical sub-buckets is mapped to a first partition, and the index node is configured to manage the first partition; the service node comprising:
    a receiving unit, configured to receive an input/output (IO) operation, the IO operation including the name of the bucket and the name of a first object;
    a processing unit, configured to select the first logical sub-bucket from the bucket according to the name of the bucket and the name of the first object;
    a sending unit, configured to send, according to the mapping relationship between the first logical sub-bucket and the first partition, a processing request to the index node, the processing request including the name of the first logical sub-bucket and the name of the first object, and the processing request being used to request that the metadata of the first object be processed in the first partition.
  6. The service node according to claim 5, wherein
    the processing unit is specifically configured to determine the bucket according to the name of the bucket, and to determine the number of the first logical sub-bucket according to the name of the first object, the number of logical sub-buckets in the bucket, the starting number of the logical sub-buckets in the bucket, and the initial number of partitions of the distributed storage system;
    the processing unit is further configured to generate the name of the first logical sub-bucket before the sending unit sends the processing request to the index node according to the mapping relationship between the first logical sub-bucket and the first partition, the name of the first logical sub-bucket including the number of the first logical sub-bucket and the name of the bucket.
  7. The service node according to claim 5 or 6, wherein
    the processing unit is further configured to, before the receiving unit receives the IO operation, establish the bucket, determine the number of logical sub-buckets in the bucket according to a performance indicator of the bucket, and determine the starting number of the logical sub-buckets in the bucket according to the initial number of partitions of the distributed storage system and a random number.
  8. The service node according to claim 7, wherein
    the initial number of partitions of the distributed storage system is not less than 2.
  9. A service node, applied to a distributed storage system, the service node comprising one or more processors and a memory;
    the memory being connected to the one or more processors and being configured to store computer instructions, wherein, when the one or more processors execute the computer instructions, the service node performs the method according to any one of claims 1 to 4.
  10. A computer program product containing instructions, the computer program product comprising computer instructions which, when executed by a processor of a service node, cause the service node to perform the method according to any one of claims 1 to 4.
  11. A computer-readable storage medium comprising computer instructions which, when executed by a processor of a service node, cause the service node to perform the method according to any one of claims 1 to 4.