CN112422611B - Virtual bucket storage processing method and system based on distributed object storage

Info

Publication number: CN112422611B
Authority: CN (China)
Prior art keywords: bucket, weight, virtual, space, weight information
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202010953846.2A
Other languages: Chinese (zh)
Other versions: CN112422611A (en)
Inventors: 唐卓, 宋柏森, 刘玲星
Current Assignee: Shenzhen Zhengtong Electronics Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Shenzhen Zhengtong Electronics Co Ltd
Events:
    • Application filed by Shenzhen Zhengtong Electronics Co Ltd
    • Priority to CN202010953846.2A
    • Publication of CN112422611A
    • Application granted
    • Publication of CN112422611B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0803: Configuration setting
    • H04L41/0813: Configuration setting characterised by the conditions triggering a change of settings
    • H04L41/0816: Configuration setting characterised by the conditions triggering a change of settings the condition being an adaptation, e.g. in response to network events
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0896: Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1029: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers using data related to the state of servers by a load balancer
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1001: Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1031: Controlling of the operation of servers by a load balancer, e.g. adding or removing servers that serve requests
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to a virtual bucket storage processing method and system based on distributed object storage. Physical hosts run a software-defined storage service and store user data and metadata; a plurality of physical hosts form a bucket, a plurality of buckets form a virtual bucket, and a plurality of virtual buckets form the storage space of a control node cluster. The physical hosts, buckets, and virtual buckets each periodically send weight information to the level above them. The control node cluster records the weight information of the physical hosts, buckets, and virtual buckets in a space weight table, allocates storage space according to the space weight table, and sends the space weight table to a client. The method and system address the problem that, when an object storage user's data is stored in a bucket and the data of a single bucket is stored in a key-value database, the capacity limit of that single key-value database limits the capacity of the bucket and therefore the amount of data the user can store. The utilization of the physical host cluster is improved, and the capacity available for user data is increased.

Description

Virtual bucket storage processing method and system based on distributed object storage
Technical Field
The present application relates to the field of computers, and in particular, to a virtual bucket storage processing method and system based on distributed object storage.
Background
With the rapid development of wireless network technology and the spread of surveillance cameras, there is more and more unstructured data, such as pictures and videos, that is not modified after it is stored. Unlike traditional file storage and block storage, object storage is an emerging storage mode that is becoming increasingly popular. In the related art, however, the data of an object storage user is stored in a bucket, and the data of a single bucket is stored in a key-value database. The capacity limit of a single key-value database limits the capacity of a single bucket and therefore the amount of data a user can store, so the requirements of many real services cannot be met.
At present, the related art offers no effective solution to the problem that storing an object storage user's data in a bucket limits the capacity of a single bucket and therefore the amount of data the user can store.
Disclosure of Invention
The embodiments of the application provide a virtual bucket storage processing method and system based on distributed object storage, which aim to at least solve the problem in the related art that, because an object storage user's data is stored in a bucket, the capacity of a single bucket is limited and so is the amount of data the user can store.
In a first aspect, an embodiment of the present application provides a virtual bucket storage processing method based on distributed object storage, which is applied to a virtual bucket storage processing system based on distributed object storage, where the system includes a physical host and a control node cluster, the method comprising:
the physical hosts run software-defined storage services and store user data and metadata, a plurality of the physical hosts form a bucket, a plurality of the buckets form a virtual bucket, and a plurality of the virtual buckets form a storage space of the control node cluster;
the physical host periodically sends first weight information to the bucket; after receiving the first weight information, the bucket checks whether it has changed and, if so, updates the second weight information of the bucket;
the bucket periodically sends the second weight information to the virtual bucket; after receiving the second weight information, the virtual bucket checks whether it has changed and, if so, updates the third weight information of the virtual bucket;
the virtual bucket periodically sends the third weight information to the control node cluster; after receiving the third weight information, the control node cluster checks whether it has changed and, if so, updates a space weight table of the control node cluster;
and the control node cluster records the weight information of the physical host, the bucket and the virtual bucket into the space weight table, allocates storage space according to the space weight table, and sends the space weight table to a client.
In some embodiments, the client requests the space weight table from the control node cluster, updates the space tree information of the client's virtual bucket according to the space weight table, finds a target physical host according to the space tree information of the virtual bucket, and sends a data read/write request to the target physical host.
In some embodiments, finding the target physical host according to the space tree information of the virtual bucket comprises:
the client performs a hash weight calculation according to the space tree information of the virtual bucket to obtain the virtual bucket for storing the data, performs the hash weight calculation according to the third weight information of the virtual bucket to obtain the bucket for storing the data, and performs the hash weight calculation according to the second weight information of the bucket to obtain the target physical host for storing the data.
In some of these embodiments, the first weight information includes current capacities of the physical hosts, the second weight information includes current capacities of a number of the physical hosts that make up the bucket, the third weight information includes current capacities of a number of the buckets that make up the virtual bucket, and the spatial weight table includes current capacities of a number of the virtual buckets that make up a storage space of the cluster of control nodes.
In some of these embodiments, the client updating the spatial tree information of the virtual bucket comprises: the client side actively applies for a new space from the control node cluster; the client periodically sends a heartbeat message to the control node cluster to inquire whether a new space weight table exists or not; and after the control node cluster generates the new space weight table, pushing the latest space weight table to the client.
In a second aspect, the present application provides a virtual bucket storage processing system based on distributed object storage, the system including a physical host and a control node cluster,
the physical hosts run software-defined storage services and store user data and metadata, a plurality of the physical hosts form a bucket, a plurality of the buckets form a virtual bucket, and a plurality of the virtual buckets form a storage space of the control node cluster;
the physical host periodically sends first weight information to the bucket; after receiving the first weight information, the bucket checks whether it has changed and, if so, updates the second weight information of the bucket;
the bucket periodically sends the second weight information to the virtual bucket; after receiving the second weight information, the virtual bucket checks whether it has changed and, if so, updates the third weight information of the virtual bucket;
the virtual bucket periodically sends the third weight information to the control node cluster; after receiving the third weight information, the control node cluster checks whether it has changed and, if so, updates a space weight table of the control node cluster;
and the control node cluster records the weight information of the physical host, the bucket and the virtual bucket into the space weight table, allocates storage space according to the space weight table, and sends the space weight table to a client.
In some embodiments, the client requests the space weight table from the control node cluster, updates the space tree information of the client's virtual bucket according to the space weight table, finds a target physical host according to the space tree information of the virtual bucket, and sends a data read/write request to the target physical host.
In some embodiments, finding the target physical host according to the space tree information of the virtual bucket comprises:
the client performs a hash weight calculation according to the space tree information of the virtual bucket to obtain the virtual bucket for storing the data, performs the hash weight calculation according to the third weight information of the virtual bucket to obtain the bucket for storing the data, and performs the hash weight calculation according to the second weight information of the bucket to obtain the target physical host for storing the data.
In some of these embodiments, the first weight information includes current capacities of the physical hosts, the second weight information includes current capacities of a number of the physical hosts that make up the bucket, the third weight information includes current capacities of a number of the buckets that make up the virtual bucket, and the spatial weight table includes current capacities of a number of the virtual buckets that make up a storage space of the cluster of control nodes.
In some of these embodiments, the client updating the spatial tree information of the virtual bucket comprises: the client actively applies for a new space from the control node cluster; the client periodically sends a heartbeat message to the control node cluster to inquire whether a new space weight table exists or not; and after the control node cluster generates the new space weight table, pushing the latest space weight table to the client.
Compared with the related art, in the virtual bucket storage processing method based on distributed object storage provided by the embodiments of the application, physical hosts run a software-defined storage service and store user data and metadata; a plurality of physical hosts form a bucket, a plurality of buckets form a virtual bucket, and a plurality of virtual buckets form the storage space of a control node cluster. The physical host periodically sends first weight information to the bucket; after receiving it, the bucket checks whether the first weight information has changed and, if so, updates the second weight information of the bucket. The bucket periodically sends the second weight information to the virtual bucket; after receiving it, the virtual bucket checks whether the second weight information has changed and, if so, updates the third weight information of the virtual bucket. The virtual bucket periodically sends the third weight information to the control node cluster; after receiving it, the control node cluster checks whether the third weight information has changed and, if so, updates the space weight table of the control node cluster. The control node cluster records the weight information of the physical hosts, the buckets, and the virtual buckets in the space weight table, allocates storage space according to the space weight table, and sends the space weight table to a client. This addresses the problem that, when an object storage user's data is stored in a bucket and the data of a single bucket is stored in a key-value database, the capacity limit of the single key-value database limits the capacity of the bucket and therefore the amount of data the user can store; the capacity available for user data is increased.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic diagram of an application environment of a distributed object storage based virtual bucket storage processing method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a distributed object storage based virtual bucket storage processing method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating another method for processing a virtual bucket storage based on distributed object storage according to an embodiment of the present application;
FIG. 4 is a block diagram of a distributed object storage based virtual bucket storage processing system according to an embodiment of the present application;
FIG. 5 is a block diagram of another distributed object storage based virtual bucket storage processing system according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a capacity allocation flow of a virtual bucket space tree according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a framework of a virtual bucket storage processing method based on distributed object storage according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. Reference to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The use of the terms "including," "comprising," "having," and any variations thereof herein, is meant to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but rather can include electrical connections, whether direct or indirect. Reference herein to "a plurality" means greater than or equal to two. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
The virtual bucket storage processing method based on distributed object storage provided by the present application can be applied in the application environment shown in FIG. 1, which is a schematic diagram of that environment according to an embodiment of the present application. As shown in FIG. 1, the bottom layer consists of a plurality of physical hosts, which run a software-defined storage service and store user data and metadata. Above the physical hosts are the users' buckets; each bucket consists of several physical hosts, and data written to a bucket is distributed across those hosts. Above the buckets are the virtual buckets: virtual bucket 4 (Virtual Bucket4) comprises bucket 0 (Bucket0), bucket 1 (Bucket1), and bucket 2 (Bucket2), and virtual bucket 5 (Virtual Bucket5) comprises bucket 3 (Bucket3). At the top is the storage space of the control node cluster (Controllers), which comprises virtual bucket 4 and virtual bucket 5. A client (Clients) finds the target physical host for stored data by a calculation over the space tree of the virtual buckets.
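The layered layout of FIG. 1 can be pictured as a small tree. The following is a minimal sketch only, not the patented implementation: the class names, the `capacity_gb` unit, and all host names and capacity figures are hypothetical, and the second virtual bucket's member is read as Bucket3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalHost:
    name: str
    capacity_gb: int  # hypothetical unit; the source only speaks of "capacity"


@dataclass
class Bucket:
    name: str
    hosts: List[PhysicalHost] = field(default_factory=list)

    def capacity_gb(self) -> int:
        # A bucket's capacity is taken here as the sum of its hosts' capacities.
        return sum(h.capacity_gb for h in self.hosts)


@dataclass
class VirtualBucket:
    name: str
    buckets: List[Bucket] = field(default_factory=list)

    def capacity_gb(self) -> int:
        # A virtual bucket's capacity is the sum of its buckets' capacities.
        return sum(b.capacity_gb() for b in self.buckets)


def make_bucket(index: int, host_capacities: List[int]) -> Bucket:
    """Build a bucket with illustrative host names and capacities."""
    hosts = [PhysicalHost(f"host{index}-{j}", cap)
             for j, cap in enumerate(host_capacities)]
    return Bucket(f"Bucket{index}", hosts)


def build_example_space() -> List[VirtualBucket]:
    """Storage space of the control node cluster, mirroring FIG. 1 as described."""
    vb4 = VirtualBucket("VirtualBucket4", [
        make_bucket(0, [100, 100]),
        make_bucket(1, [200]),
        make_bucket(2, [100, 300]),
    ])
    vb5 = VirtualBucket("VirtualBucket5", [make_bucket(3, [400, 400])])
    return [vb4, vb5]
```

Calling `build_example_space()` yields two virtual buckets whose capacities roll up from the hosts beneath them, which is the quantity each level would report upward as its weight.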
The application provides a virtual bucket storage processing method based on distributed object storage, and fig. 2 is a schematic flow chart of the virtual bucket storage processing method based on distributed object storage according to the embodiment of the application, and as shown in fig. 2, the flow includes the following steps:
Step S201: a physical host runs a software-defined storage service and stores user data and metadata; a plurality of physical hosts form a bucket, a plurality of buckets form a virtual bucket, and a plurality of virtual buckets form the storage space of a control node cluster. In this embodiment, different buckets may be composed of physical hosts of the same data volume or of different data volumes, forming buckets of different capacities; likewise, different virtual buckets may be composed of buckets of the same data volume or of different data volumes, forming virtual buckets of different capacities. The virtual buckets of different capacities form the storage space of the control node cluster;
Step S202: the physical host periodically sends first weight information to the bucket; after receiving the first weight information, the bucket checks whether it has changed and, if so, updates the second weight information of the bucket. Specifically, all physical hosts under a bucket periodically exchange heartbeat data with the bucket service. The exchanged data mainly comprises the hard disk capacity and the state of the current physical host, and the first weight information is the hard disk capacity of the current physical host. If that hard disk capacity changes, the change is written into the second weight information of the bucket service, where it serves as a reference for data distribution at the next moment. A physical host newly discovered during capacity expansion is not added at the current moment; it is added at the next data distribution moment;
Step S203: the bucket periodically sends the second weight information to the virtual bucket; after receiving the second weight information, the virtual bucket checks whether it has changed and, if so, updates the third weight information of the virtual bucket. Specifically, all buckets under a virtual bucket periodically exchange heartbeat data with the virtual bucket service. The exchanged data mainly comprises the capacity and the state of the current bucket, and the second weight information is the capacity of the current bucket. If the capacity of the current bucket changes, the change is written into the third weight information of the virtual bucket service. Similarly, the changed weight does not affect the space tree weights under which data has already been allocated at the current moment; it only affects the allocation of the space tree at the next moment;
Step S204: the virtual bucket periodically sends the third weight information to the control node cluster; after receiving the third weight information, the control node cluster checks whether it has changed and, if so, updates the space weight table of the control node cluster. Specifically, all virtual buckets under the control node cluster periodically exchange heartbeat data with the control node cluster. The exchanged data mainly comprises the capacity and the state of the current virtual bucket, and the space weight table records the capacity of the current virtual bucket. If the capacity of the current virtual bucket changes, the change is written into the space weight table of the control node cluster;
Step S205: the control node cluster records the weight information of the physical host, the bucket and the virtual bucket into the space weight table, allocates storage space according to the space weight table, and sends the space weight table to the client. Specifically, each time the control node cluster allocates formal storage space, it generates a formal space weight table, which records how the space trees were allocated at different times.
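The upward reporting in steps S202 to S204 can be sketched as follows. This is an illustrative model under stated assumptions, not the patent's code: the `Node` class and its method names are hypothetical, a level's weight is assumed to be the sum of the weights reported by its children, and real heartbeats would be sent by timers over the network rather than by direct method calls.

```python
from typing import Dict, Optional


class Node:
    """One level of the hierarchy: a bucket, a virtual bucket, or the control node cluster."""

    def __init__(self, name: str, parent: Optional["Node"] = None):
        self.name = name
        self.parent = parent
        # Last weight reported by each child (host, bucket, or virtual bucket).
        self.child_weights: Dict[str, int] = {}

    def weight(self) -> int:
        # Assumption: a level's own weight is the sum of its children's weights.
        return sum(self.child_weights.values())

    def receive(self, child: str, weight: int) -> bool:
        """Handle a child's heartbeat; update and return True only if the weight changed."""
        if self.child_weights.get(child) == weight:
            return False
        self.child_weights[child] = weight
        return True

    def heartbeat(self) -> None:
        """Periodic report of this level's weight to the level above it."""
        if self.parent is not None:
            self.parent.receive(self.name, self.weight())
```

For example, with `cluster = Node("Controllers")`, `vb = Node("VirtualBucket4", cluster)` and `b = Node("Bucket0", vb)`, two host reports to `b` followed by `b.heartbeat()` and `vb.heartbeat()` leave the cluster holding the rolled-up weight of the virtual bucket, which plays the role of an entry in the space weight table.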
In the related art, the data of an object storage user is stored in a bucket and the data of a single bucket is stored in a key-value database, so the capacity limit of the single key-value database limits the capacity of the bucket and therefore the amount of data the user can store. Through steps S201 to S205, by contrast, a client's data in this system is spread across different virtual buckets, then across different buckets, and finally across all physical hosts. The performance bottleneck is therefore no longer that of a single bucket but that of all physical hosts of the whole cluster, which improves the utilization and the read/write performance of the physical host cluster and increases the capacity available for user data.
In some embodiments, after the space weight table is updated, the client updates the space tree information of its local virtual buckets according to the space weight table, and the client relies on this space tree information to allocate and look up data. Accordingly, FIG. 3 is a schematic flowchart of another virtual bucket storage processing method based on distributed object storage according to an embodiment of the present application, and as shown in FIG. 3, the flow includes step S301:
Step S301: the client requests the space weight table from the control node cluster, updates the space tree information of the virtual bucket according to the space weight table, finds the target physical host according to the space tree information of the virtual bucket, and sends a data read/write request to the target physical host.
Through step S301, the client sends its data read/write request to the target physical host directly, without passing through the control node cluster, which increases the number of read/write operations the client can perform per second (Input/Output Operations Per Second, IOPS). After receiving the read/write request from the client, the target physical host looks up the corresponding data on its local hard disk and returns the data to the client.
In some embodiments, the first weight information includes a current capacity of the physical hosts, the second weight information includes a current capacity of a number of the physical hosts that make up the bucket, the third weight information includes a current capacity of a number of the buckets that make up the virtual bucket, and the space weight table includes a current capacity of a number of the virtual buckets that make up the storage space of the control node cluster. The larger the capacity of a physical host, the greater its weight, the more data is allocated to that physical host, and further the larger the capacity of a bucket, the greater its weight, the more data is allocated to that bucket, and finally the larger the capacity of a virtual bucket, the greater its weight, the more data is allocated to that virtual bucket.
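As a worked illustration of "larger capacity, greater weight, more data", the sketch below assumes the simplest possible rule: weight equals capacity, and the expected share of data is the weight divided by the total weight at that level. The source does not state this exact proportionality, and the function name and capacity figures are hypothetical.

```python
from fractions import Fraction
from typing import Dict


def expected_shares(capacities: Dict[str, int]) -> Dict[str, Fraction]:
    """Expected fraction of data per member, assuming share = capacity / total capacity."""
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return {name: Fraction(cap, total) for name, cap in capacities.items()}


# Two hosts in one bucket: the host with three times the capacity
# is expected to receive three times as much data.
shares = expected_shares({"host-a": 100, "host-b": 300})
```

Here `shares["host-a"]` is 1/4 and `shares["host-b"]` is 3/4; the same calculation applies one level up to buckets within a virtual bucket, and to virtual buckets within the storage space.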
In some embodiments, the client updates the space tree information of the virtual bucket in any of the following ways: the client actively applies to the control node cluster for new space; the client periodically sends a heartbeat message to the control node cluster to ask whether a new space weight table exists; or, after the control node cluster generates a new space weight table, it pushes the latest space weight table to the client. Because the target physical host can only be found from the latest space tree information of the virtual bucket, the client needs several ways to keep that information up to date. When the storage space in use by a user is used up, the client can apply for new storage space, so the user's storage requirements are met dynamically through the space tree and the weights of the virtual buckets.
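The three update paths (active request, heartbeat query, and push) can be sketched as below. This is a hedged in-process model: the class and method names and the integer version counter are assumptions introduced for illustration, and the source does not say how a "new" space weight table is detected.

```python
from typing import Callable, Dict, List, Tuple

Table = Dict[str, int]  # virtual bucket name -> capacity (illustrative shape)


class ControlCluster:
    def __init__(self) -> None:
        self.version = 0  # hypothetical version counter
        self.table: Table = {}
        self.subscribers: List[Callable[[int, Table], None]] = []

    def publish(self, table: Table) -> None:
        """Generate a new space weight table and push it to subscribed clients."""
        self.version += 1
        self.table = dict(table)
        for callback in self.subscribers:
            callback(self.version, dict(self.table))

    def get_table(self) -> Tuple[int, Table]:
        """Answer a client's active request."""
        return self.version, dict(self.table)

    def has_newer(self, known_version: int) -> bool:
        """Answer a client's heartbeat query."""
        return self.version > known_version


class Client:
    def __init__(self, cluster: ControlCluster, subscribe: bool = True) -> None:
        self.cluster = cluster
        self.version = 0
        self.space_tree: Table = {}
        if subscribe:
            cluster.subscribers.append(self.on_push)

    def on_push(self, version: int, table: Table) -> None:
        # Accept only tables newer than the one already held.
        if version > self.version:
            self.version, self.space_tree = version, table

    def request_table(self) -> None:
        self.on_push(*self.cluster.get_table())

    def heartbeat(self) -> None:
        if self.cluster.has_newer(self.version):
            self.request_table()
```

A subscribed client sees a new table as soon as `publish` runs, while a client created with `subscribe=False` catches up on its next `heartbeat()` or explicit `request_table()`.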
In some embodiments, finding the target physical host from the spatial tree information of the virtual bucket comprises:
the client performs a hash weight calculation according to the space tree information of the virtual bucket to obtain the virtual bucket for storing data, performs a hash weight calculation according to the third weight information of the virtual bucket to obtain the bucket for storing data, and performs a hash weight calculation according to the second weight information of the bucket to obtain the target physical host for storing data. From the continuously updated space tree information of the virtual buckets, the client knows the latest weight information of every virtual bucket, bucket, and physical host. The client computes an MD5 hash of the object name and then takes a modulus; from the resulting value it obtains the virtual bucket to which the data is allocated according to the third weight information, the bucket to which the data is allocated according to the second weight information, and the target physical host to which the data is allocated according to the first weight information.
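One way to read the hash weight calculation is a weighted selection repeated at each level of the space tree. The sketch below is an assumption-laden illustration, not the patent's algorithm: the function names, the nested-dictionary layout of the space tree, and the interval-walk used to map the modulus onto weights are all hypothetical. Only the use of an MD5 hash of the object name followed by a modulus comes from the description.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical space tree layout: virtual bucket -> bucket -> host -> capacity.
SpaceTree = Dict[str, Dict[str, Dict[str, int]]]


def pick_weighted(value: int, weights: Dict[str, int]) -> str:
    """Map value modulo the total weight onto the members, each owning a span equal to its weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("no capacity available at this level")
    point = value % total
    for name in sorted(weights):  # fixed order so every client agrees
        if point < weights[name]:
            return name
        point -= weights[name]
    raise AssertionError("unreachable: point is always below the total weight")


def locate(object_name: str, tree: SpaceTree) -> Tuple[str, str, str]:
    """Return (virtual bucket, bucket, target physical host) for an object name."""
    value = int.from_bytes(hashlib.md5(object_name.encode("utf-8")).digest(), "big")
    vb_weights = {vb: sum(sum(hosts.values()) for hosts in buckets.values())
                  for vb, buckets in tree.items()}
    vb = pick_weighted(value, vb_weights)
    bucket_weights = {b: sum(hosts.values()) for b, hosts in tree[vb].items()}
    bucket = pick_weighted(value, bucket_weights)
    host = pick_weighted(value, tree[vb][bucket])
    return vb, bucket, host
```

Because the result depends only on the object name and the cached weights, any client holding the same space tree computes the same target host without contacting the control node cluster. Reusing one hash value at every level follows the description literally; an implementation might instead mix the level name into the hash to decorrelate the choices.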
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The present embodiment further provides a virtual bucket storage processing system based on distributed object storage. The system is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. Although the system described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of a distributed object storage based virtual bucket storage processing system according to an embodiment of the present application, and as shown in fig. 4, the system includes a physical host 41 and a control node cluster 42:
the physical host 41 runs a software-defined storage service and stores user data and metadata. A plurality of physical hosts 41 form a bucket, a plurality of buckets form a virtual bucket, and a plurality of virtual buckets form the storage space of the control node cluster 42. The physical host 41 periodically sends first weight information to the bucket; after receiving it, the bucket checks whether the first weight information has changed and, if so, updates the second weight information of the bucket. The bucket periodically sends the second weight information to the virtual bucket; after receiving it, the virtual bucket checks whether the second weight information has changed and, if so, updates the third weight information of the virtual bucket. The virtual bucket periodically sends the third weight information to the control node cluster 42. Physical hosts 41 with different data volumes form buckets with different capacities, buckets with different data volumes form virtual buckets with different capacities, and virtual buckets with different data volumes form the storage space of the control node cluster 42. Each physical host 41 in a bucket periodically reports a heartbeat to the bucket service, and the bucket in turn reports a heartbeat to the virtual bucket service. Service monitoring is thus divided into three levels: the control node cluster 42 supervises the virtual bucket service, the virtual bucket service supervises the bucket service, and the bucket service supervises the physical hosts 41 at the bottom layer.
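The upward propagation of weight information can be sketched as follows. The class name `WeightAggregator`, the host and bucket names, and the choice to summarize a level as the sum of its children's capacities are assumptions made for illustration only.

```python
from typing import Dict


class WeightAggregator:
    """One level of the hierarchy: a bucket or a virtual bucket."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.child_weights: Dict[str, int] = {}

    def receive(self, child: str, weight: int) -> bool:
        """Record a child's periodic report; return True only if it changed."""
        if self.child_weights.get(child) == weight:
            return False
        self.child_weights[child] = weight
        return True

    def own_weight(self) -> int:
        """Weight reported upward: here, the sum of the children's capacities."""
        return sum(self.child_weights.values())


bucket = WeightAggregator("bucket-1")
vbucket = WeightAggregator("vbucket-1")

bucket.receive("host-a", 40)  # first weight information from a physical host
bucket.receive("host-b", 60)
vbucket.receive(bucket.name, bucket.own_weight())  # second weight information
```

The boolean returned by `receive` corresponds to the "checks whether the weight information changes" step: a level only needs to refresh what it reports upward when a child's report differs from the recorded value.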
After receiving the third weight information, the control node cluster 42 checks whether it has changed and, if so, updates the space weight table of the control node cluster 42. The control node cluster 42 records the weight information of the physical hosts 41, the buckets, and the virtual buckets in the space weight table, allocates storage space according to the space weight table, and sends the space weight table to the client. The control node cluster 42 stores the topology of the entire set of virtual buckets and records the node information and capacity usage of each virtual bucket in the space weight table. The control node cluster 42 ensures the consistency of cluster data through the PAXOS consistency protocol. Each time the control node cluster 42 allocates formal storage space, it generates a formal space weight table, which records how the space tree was allocated at different times.
In the prior art, the data of an object storage user is stored in a bucket, and the data of a single bucket is stored in a key-value database, so the capacity of a single bucket is limited by the capacity of a single key-value database and the amount of data a user can store is limited. In the present system, by contrast, the data of a client 51 is spread across different virtual buckets, then across different buckets, and finally across all the physical hosts. The performance bottleneck of a single bucket thus becomes the performance bottleneck of all the physical hosts 41 of the whole cluster, which improves the utilization and read-write performance of the cluster of physical hosts 41 and increases the amount of data a user can store.
In some embodiments, after the space weight table is updated, the client updates its local space tree information of the virtual bucket according to the space weight table, and relies on the space tree information to allocate and look up data. Accordingly, fig. 5 is a block diagram of another distributed object storage based virtual bucket storage processing system according to an embodiment of the present application, and as shown in fig. 5, the system further includes a client 51:
the client 51 requests the space weight table from the control node cluster 42, updates its own space tree information of the virtual bucket according to the space weight table, finds the target physical host 41 according to that space tree information, and sends a data read-write request to the target physical host 41. In this embodiment, the client 51 sends the data read-write request to the target physical host 41 without passing through the control node cluster 42, which improves the read-write performance of the client 51. After receiving the read-write request from the client 51, the target physical host 41 looks up the corresponding data on its local hard disk and returns the data to the client 51.
An embodiment of the present invention is described in detail below with reference to a specific application scenario. Fig. 6 is a schematic diagram of a capacity allocation process of a virtual bucket space tree according to an embodiment of the present invention. As shown in fig. 6, in the virtual bucket storage processing system based on distributed object storage:
the topology of the entire set of virtual buckets is stored by the control node cluster 42. Each virtual bucket periodically sends the node information it contains, together with its capacity usage, to the control node cluster 42. The control node cluster 42 then ensures the consistency of cluster data through the PAXOS consistency protocol. Running the PAXOS protocol across an odd number of nodes keeps the topology information of each virtual bucket consistent on every node, thereby ensuring data consistency across the entire control node cluster 42. Each physical node in a bucket periodically reports a heartbeat to the bucket service, and the bucket in turn reports a heartbeat to the virtual bucket service. Service monitoring is thus divided into three levels: the control node cluster 42 only needs to supervise the virtual bucket service, the virtual bucket service supervises the bucket service, and the bucket service supervises the underlying physical host 41 service.
For storage allocation of user data, different virtual buckets may be composed of physical hosts 41 with the same amount of data or of physical hosts 41 with different amounts of data, so virtual buckets have different capacities. For virtual buckets with different capacities, storage capacity is allocated based on a space tree. The first time, 100T of space is allocated and distributed according to the current weights of the virtual buckets; by default, the larger the capacity of a bucket, the higher its allocation weight. When that space is about to be used up, new storage space can be allocated for the user. Alternatively, several segments of storage space may be generated at the same time and then allocated to different users, who use storage spaces of different capacities. When the storage space used by a user is used up, new storage space can be applied for. Combining the space tree with virtual bucket weights dynamically meets the user's storage needs while keeping data more evenly balanced. Under the space-tree-based capacity allocation, the virtual buckets allocate 500T of space at the initial time T1, and data is then distributed according to the actual physical storage space of each virtual bucket, so the more physical space a virtual bucket has available, the more resources it is allocated. After a period of use, the user needs 900T of space at time T2; the space is then allocated by re-evaluating the capacity actually in use in each virtual bucket at that time.
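The proportional split of a requested capacity across virtual buckets can be sketched as follows. The function name `allocate_space`, the virtual bucket names, the available-space figures, and the largest-remainder rounding are all assumptions for illustration; only the 500T and 900T request sizes come from the description.

```python
from typing import Dict


def allocate_space(request: int, available: Dict[str, int]) -> Dict[str, int]:
    """Split `request` units across virtual buckets in proportion to available space."""
    total = sum(available.values())
    if total <= 0 or request > total:
        raise ValueError("request exceeds the available space")
    # Integer floor share for each virtual bucket.
    shares = {name: request * free // total for name, free in available.items()}
    # Hand out the units lost to rounding, largest fractional remainder first.
    leftover = request - sum(shares.values())
    by_remainder = sorted(
        available, key=lambda n: (-(request * available[n] % total), n)
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Time T1: 500T requested; hypothetical available space per virtual bucket (in T).
t1 = allocate_space(500, {"vb1": 1000, "vb2": 3000, "vb3": 1000})
# Time T2: 900T requested after usage has reduced the available space.
t2 = allocate_space(900, {"vb1": 900, "vb2": 2700, "vb3": 900})
```

Re-running the split at T2 with the then-current figures mirrors the re-evaluation described above: each allocation uses the weights that hold at the moment the space is requested.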
Finally, when the client 51 wants to access data, it obtains the space tree structure of the virtual buckets from the control node cluster 42 and then performs a weight-based hash lookup locally on that structure. The data is first assigned to a virtual bucket; the bucket that carries the data is then computed by a hash based on the bucket weights within that virtual bucket; finally, within that bucket, the data is hashed to a specific physical host 41 based on the weights of the physical hosts 41. Through this calculation, the client 51 knows exactly which physical host 41 stores the data. The client 51 can then read from and write to that physical machine directly, which avoids lookup tables and the hot spot of a central metadata server. The data of the client 51 is spread across different virtual buckets, then across different buckets, and finally across all the physical hosts. The performance bottleneck of a single bucket thus becomes the performance bottleneck of all the physical hosts 41 of the whole cluster, which can greatly improve the utilization and read-write performance of the physical host cluster.
Fig. 7 is a schematic frame diagram of a virtual bucket storage processing method based on distributed object storage according to an embodiment of the present application, and as shown in fig. 7, a virtual bucket storage processing method based on distributed object storage includes the following steps:
S1: all physical hosts 41 under a bucket periodically exchange heartbeat data with the bucket service. The exchanged data is mainly the hard disk capacity and state of the current physical host 41. On receiving a heartbeat message from a physical host 41, the bucket service checks whether the current capacity of that physical host 41 has changed. If it has, the data in the weight table is updated, and the modified weight serves as the reference when data is distributed over the space allocated at the next moment. A newly discovered physical node added through capacity expansion cannot join at the current moment; it joins at the next space allocation.
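The heartbeat handling in step S1, including the deferred admission of newly discovered hosts, can be sketched as follows. The class `BucketService`, its method names, and the host names are hypothetical; the description does not define a concrete interface.

```python
from typing import Dict


class BucketService:
    def __init__(self, members: Dict[str, int]) -> None:
        self.weights = dict(members)        # hosts taking part in allocation
        self.pending: Dict[str, int] = {}   # newly discovered hosts, not yet active

    def on_heartbeat(self, host: str, capacity: int) -> bool:
        """Handle one heartbeat; return True if the weight table was modified."""
        if host not in self.weights:
            self.pending[host] = capacity   # deferred until the next allocation
            return False
        if self.weights[host] == capacity:
            return False
        self.weights[host] = capacity
        return True

    def begin_next_allocation(self) -> None:
        """Admit pending hosts at the next space-allocation moment."""
        self.weights.update(self.pending)
        self.pending.clear()


svc = BucketService({"h1": 10})
unchanged = svc.on_heartbeat("h1", 10)   # same capacity: nothing to update
changed = svc.on_heartbeat("h1", 8)      # capacity changed: weight table updated
svc.on_heartbeat("h2", 20)               # new host: parked in `pending`
```

Keeping new hosts in a separate `pending` map is one simple way to model the rule that an expansion node only joins at the next space allocation.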
S2: the virtual bucket service periodically receives heartbeat information from the bucket service and then updates its own weight table. Similarly, the weight values in this table do not affect the space tree to which data has already been allocated at the current time; they only affect the allocation of the space tree at the next time. That is, once space has been allocated at a given time, the space tree recording the weights of each virtual bucket and physical host 41 for that allocation is not modified. After space is allocated at a given time, the weights in the temporary space weight table continue to change, so that the next space allocation more accurately reflects the actual usage of each physical host 41 belonging to the buckets under each virtual bucket, which improves the utilization of physical resources and the read-write access performance of the buckets.
S3: the virtual buckets periodically report their weight information to the control node cluster 42. After receiving a periodic report from a virtual bucket, the control node cluster 42 updates its space weight table, which records the weight information of each virtual bucket, bucket, and physical host 41. Each time the control node allocates formal storage space, a formal weight table is generated. The formal weight table records how the space tree was allocated at different times. The space tree is what the client 51 relies on to allocate and look up data.
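The relationship between the continuously updated (temporary) weights and the formal weight table of steps S2 and S3 can be sketched as follows. The class `ControlNodeTable`, its fields, and the labels "T1" and "T2" are illustrative assumptions; the PAXOS replication of the table is not modeled here.

```python
from typing import Dict, List, Tuple

Weights = Dict[str, int]  # virtual bucket name -> weight


class ControlNodeTable:
    def __init__(self) -> None:
        self.temporary: Weights = {}                 # updated by every report
        self.formal: List[Tuple[str, Weights]] = []  # one frozen entry per allocation

    def report(self, vbucket: str, weight: int) -> None:
        """Periodic report from a virtual bucket; only touches the temporary table."""
        self.temporary[vbucket] = weight

    def allocate(self, label: str) -> Weights:
        """Freeze the current temporary weights as the tree for this allocation."""
        snapshot = dict(self.temporary)
        self.formal.append((label, snapshot))
        return snapshot


table = ControlNodeTable()
table.report("vb1", 100)
table.report("vb2", 300)
tree_t1 = table.allocate("T1")
table.report("vb1", 50)        # later report: does not alter the T1 entry
tree_t2 = table.allocate("T2")
```

Copying the weights at allocation time is what keeps an already allocated space tree stable while later heartbeats only influence the next allocation.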
S4: after the client 51 comes online, it actively contacts the control node cluster 42 to request the space tree of the virtual buckets. For an initial cluster in which no space has yet been allocated, the client 51 includes the required size of the space in the request, and the control node cluster 42 then allocates the space tree. After a period of operation, when the previously allocated space is about to be used up or the client has a new space demand, the client 51 again requests the control node cluster 42 to generate new space, which triggers the generation of a new virtual bucket space tree.
S5: on receiving the message returned from the control node cluster 42, the client 51 updates its local space tree information of the virtual buckets. Because this information does not change frequently, the client 51 generally has three ways to update it: first, the client 51 actively applies to the control node for new space; second, the client 51 periodically sends a heartbeat message to the control node cluster 42 to ask whether a new virtual bucket space tree exists; third, the control node cluster 42 pushes the latest space tree information to all clients 51 after generating a new space tree.
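The three update ways in step S5 can be sketched in-process as follows. The classes `ControlNodeCluster` and `Client`, the integer version counter, and the `subscribe` flag are hypothetical stand-ins; real networking, space sizing, and PAXOS replication are omitted.

```python
from typing import Callable, Dict, List, Tuple

Tree = Dict[str, int]  # stand-in for real space tree information
Versioned = Tuple[int, Tree]


class ControlNodeCluster:
    def __init__(self, weights: Tree) -> None:
        self.weights = dict(weights)
        self.version = 0
        self.push_targets: List[Callable[[int, Tree], None]] = []

    def generate_tree(self) -> Versioned:
        """Produce a new tree version and push it to subscribed clients (way 3)."""
        self.version += 1
        for push in self.push_targets:
            push(self.version, dict(self.weights))
        return self.version, dict(self.weights)

    def latest(self) -> Versioned:
        return self.version, dict(self.weights)


class Client:
    def __init__(self, cluster: ControlNodeCluster, subscribe: bool) -> None:
        self.cluster = cluster
        self.version = 0
        self.tree: Tree = {}
        if subscribe:
            cluster.push_targets.append(self.update)

    def update(self, version: int, tree: Tree) -> None:
        """Adopt a tree only if it is newer than the local copy."""
        if version > self.version:
            self.version, self.tree = version, tree

    def apply_for_space(self) -> None:
        """Way 1: actively request new space and take the tree in the reply."""
        self.update(*self.cluster.generate_tree())

    def heartbeat(self) -> bool:
        """Way 2: poll; return True if a newer tree was found."""
        before = self.version
        self.update(*self.cluster.latest())
        return self.version > before


cluster = ControlNodeCluster({"vb1": 1, "vb2": 3})
applicant = Client(cluster, subscribe=False)
pushed = Client(cluster, subscribe=True)
poller = Client(cluster, subscribe=False)
applicant.apply_for_space()  # way 1 for `applicant`, way 3 reaches `pushed`
```

After this call, `poller` still holds its old copy and only catches up on its next `heartbeat()`, which is the trade-off between polling and pushing.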
S6: after obtaining the space tree of the virtual buckets, the client 51 performs hash weight calculations on the virtual buckets at different time points based on the object name. It first obtains the virtual bucket through a weight-based hash calculation over the virtual buckets, then enters the next-level child node of the virtual bucket space tree and obtains the bucket through a weight-based hash calculation over the buckets. Finally, at the leaf nodes of the virtual bucket space tree, it performs a hash weight calculation over the different physical hosts 41 to obtain the target physical host 41. The client 51 can then communicate directly with the target physical host 41.
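The three-level descent of step S6 can be sketched as follows. The nested-dictionary layout of the tree, the names `locate` and `_pick`, the use of summed leaf capacities as inner-node weights, and the per-level salting of the hash key are all assumptions, not details taken from the description.

```python
import hashlib
from typing import Dict, Tuple

# virtual bucket -> bucket -> physical host -> weight (hypothetical layout)
SpaceTree = Dict[str, Dict[str, Dict[str, int]]]


def _pick(key: str, weights: Dict[str, int]) -> str:
    """Map the MD5 hash of `key` onto the cumulative weight range."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    point = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % total
    for name in sorted(weights):
        if point < weights[name]:
            return name
        point -= weights[name]
    raise AssertionError("unreachable")


def locate(object_name: str, tree: SpaceTree) -> Tuple[str, str, str]:
    """Descend virtual bucket -> bucket -> physical host for one object."""
    # Weight of an inner node = sum of the host weights below it (assumption).
    vb_weights = {
        vb: sum(sum(hosts.values()) for hosts in buckets.values())
        for vb, buckets in tree.items()
    }
    vb = _pick(object_name, vb_weights)
    bucket_weights = {b: sum(hosts.values()) for b, hosts in tree[vb].items()}
    # Salt the key per level so the three choices do not reuse one hash value.
    bucket = _pick(f"{object_name}/{vb}", bucket_weights)
    host = _pick(f"{object_name}/{vb}/{bucket}", tree[vb][bucket])
    return vb, bucket, host


tree: SpaceTree = {
    "vb1": {"b1": {"h1": 10, "h2": 30}, "b2": {"h3": 20}},
    "vb2": {"b3": {"h4": 40}},
}
vb, bucket, host = locate("videos/intro.mp4", tree)
```

The returned triple identifies the physical host the client would contact directly for this object.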
S7: the client 51 obtains the target physical host 41 by calculation over the virtual bucket space tree and then sends the data read-write request directly to the target physical host 41. Because the request does not pass through a control node, this improves the read-write performance of the client 51. The principle of the weight-based hash calculation is that the larger the capacity of a physical host 41, the larger its weight and the more data is allocated to it; likewise, the larger the capacity of a bucket, the larger its weight and the more data is allocated to it; and finally, the larger the capacity of a virtual bucket, the larger its weight and the more data is allocated to it.
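The capacity-proportional principle can be checked with a short calculation. The sketch below assumes that each level picks a child with probability proportional to its weight, that the weight of a bucket or virtual bucket is the sum of the capacities beneath it, and that host names are unique with positive capacities; the function name and the example tree are hypothetical.

```python
from fractions import Fraction
from typing import Dict

SpaceTree = Dict[str, Dict[str, Dict[str, int]]]


def expected_shares(tree: SpaceTree) -> Dict[str, Fraction]:
    """Expected fraction of objects landing on each host."""
    bucket_cap = {
        vb: {b: sum(hosts.values()) for b, hosts in buckets.items()}
        for vb, buckets in tree.items()
    }
    vb_cap = {vb: sum(caps.values()) for vb, caps in bucket_cap.items()}
    total = sum(vb_cap.values())
    shares: Dict[str, Fraction] = {}
    for vb, buckets in tree.items():
        for b, hosts in buckets.items():
            for host, cap in hosts.items():
                # P(virtual bucket) * P(bucket | virtual bucket) * P(host | bucket)
                shares[host] = (
                    Fraction(vb_cap[vb], total)
                    * Fraction(bucket_cap[vb][b], vb_cap[vb])
                    * Fraction(cap, bucket_cap[vb][b])
                )
    return shares


shares = expected_shares({
    "vb1": {"b1": {"h1": 10, "h2": 30}},
    "vb2": {"b2": {"h3": 60}},
})
```

Under these assumptions the three factors telescope to a host's capacity divided by the total capacity, so a host with three times the capacity is expected to receive three times the data.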
S8: after receiving the read-write request from the client 51, the physical host 41 looks up the corresponding data on its local hard disk and returns the data to the client 51. As the above allocation method based on the virtual bucket space tree shows, the storage of the client 51 maps onto all the physical hosts 41, and all user data and metadata are stored across all the physical hosts 41. The performance of the whole cluster can therefore be fully used, the IOPS of the client 51 is improved, and the capacity and read-write performance bottleneck of a single bucket is removed. In addition, under the capacity allocation based on the virtual bucket space tree, new space tree child nodes are generated whenever space is applied for at a different time. This makes full use of the cluster and of the node hard disks and improves the utilization of the whole physical cluster.
It should be understood by those skilled in the art that the technical features of the above-described embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of the technical features involves no contradiction, it should be considered to be within the scope of the present description.
The above-mentioned embodiments express only several implementations of the present application. Their description is specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (8)

1. A virtual bucket storage processing system based on distributed object storage, the system comprising a cluster of physical hosts and control nodes,
the physical hosts run software defined storage service and store user data and metadata, a plurality of the physical hosts form a bucket, a plurality of the buckets form a virtual bucket, and a plurality of the virtual buckets form a storage space of the control node cluster;
the physical host periodically sends first weight information to the bucket, the bucket checks whether the first weight information changes after receiving the first weight information, and if so, the physical host updates second weight information of the bucket;
the bucket periodically sends the second weight information to the virtual bucket, the virtual bucket checks whether the second weight information changes after receiving the second weight information, and if so, the third weight information of the virtual bucket is updated;
the virtual bucket periodically sends the third weight information to the control node cluster, and after receiving the third weight information, the control node cluster checks whether the third weight information changes, and if so, updates a spatial weight table of the control node cluster; the control node cluster records the weight information of the physical host, the buckets and the virtual buckets into the space weight table, distributes storage space according to the space weight table and sends the space weight table to a client;
wherein the first weight information includes current capacities of the physical hosts, the second weight information includes current capacities of a number of the physical hosts that make up the bucket, the third weight information includes current capacities of a number of the buckets that make up the virtual bucket, and the space weight table includes current capacities of a number of the virtual buckets that make up a storage space of the cluster of control nodes.
2. The system according to claim 1, wherein the client requests the space weight table from the control node cluster, updates space tree information of its own virtual bucket according to the space weight table, and sends a read-write application of data to a target physical host after finding the target physical host according to the space tree information of the virtual bucket.
3. The system of claim 2, wherein finding the target physical host according to the spatial tree information of the virtual bucket comprises:
and the client performs Hash weight calculation according to the space tree information of the virtual bucket to obtain the virtual bucket for storing the data, performs the Hash weight calculation according to the third weight information of the virtual bucket to obtain the bucket for storing the data, and performs the Hash weight calculation according to the second weight information of the bucket to obtain the target physical host for storing the data.
4. The system of claim 1, wherein the client updating the spatial tree information for the virtual bucket comprises: the client actively applies for a new space from the control node cluster; the client periodically sends a heartbeat message to the control node cluster to inquire whether a new space weight table exists or not; and after the control node cluster generates the new space weight table, pushing the latest space weight table to the client.
5. A virtual bucket storage processing method based on distributed object storage is applied to a virtual bucket storage processing system based on distributed object storage, and the system comprises the following steps: a cluster of physical hosts and control nodes, the method comprising:
the physical hosts run software-defined storage services and store user data and metadata, a plurality of the physical hosts form a bucket, a plurality of the buckets form a virtual bucket, and a plurality of the virtual buckets form a storage space of the control node cluster;
the physical host periodically sends first weight information to the bucket, the bucket checks whether the first weight information changes after receiving the first weight information, and if so, the physical host updates second weight information of the bucket;
the bucket periodically sends the second weight information to the virtual bucket, the virtual bucket checks whether the second weight information changes after receiving the second weight information, and if so, the third weight information of the virtual bucket is updated;
the virtual bucket periodically sends the third weight information to the control node cluster, and after receiving the third weight information, the control node cluster checks whether the third weight information changes, and if so, updates a spatial weight table of the control node cluster; the control node cluster records the weight information of the physical host, the buckets and the virtual buckets into the space weight table, distributes storage space according to the space weight table and sends the space weight table to a client;
wherein the first weight information includes a current capacity of the physical host, the second weight information includes a current capacity of a number of the physical hosts that make up the bucket, the third weight information includes a current capacity of a number of the buckets that make up the virtual bucket, and the space weight table includes a current capacity of a number of the virtual buckets that make up a storage space of the cluster of control nodes.
6. The method according to claim 5, wherein the client requests the control node cluster for the spatial weight table, updates spatial tree information of its own virtual bucket according to the spatial weight table, and sends a read-write application of data to a target physical host after finding the target physical host according to the spatial tree information of the virtual bucket.
7. The method of claim 6, wherein finding the target physical host according to the spatial tree information of the virtual bucket comprises:
and the client performs Hash weight calculation according to the space tree information of the virtual bucket to obtain the virtual bucket for storing the data, performs the Hash weight calculation according to the third weight information of the virtual bucket to obtain the bucket for storing the data, and performs the Hash weight calculation according to the second weight information of the bucket to obtain the target physical host for storing the data.
8. The method of claim 5, wherein the client updating the spatial tree information of the virtual bucket comprises: the client actively applies for a new space from the control node cluster; the client periodically sends a heartbeat message to the control node cluster to inquire whether a new space weight table exists or not; and after the control node cluster generates the new space weight table, pushing the latest space weight table to the client.
CN202010953846.2A 2020-09-11 2020-09-11 Virtual bucket storage processing method and system based on distributed object storage Active CN112422611B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010953846.2A CN112422611B (en) 2020-09-11 2020-09-11 Virtual bucket storage processing method and system based on distributed object storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010953846.2A CN112422611B (en) 2020-09-11 2020-09-11 Virtual bucket storage processing method and system based on distributed object storage

Publications (2)

Publication Number Publication Date
CN112422611A CN112422611A (en) 2021-02-26
CN112422611B true CN112422611B (en) 2023-04-18

Family

ID=74854756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010953846.2A Active CN112422611B (en) 2020-09-11 2020-09-11 Virtual bucket storage processing method and system based on distributed object storage

Country Status (1)

Country Link
CN (1) CN112422611B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114416737B (en) * 2022-01-04 2022-08-05 北京中电兴发科技有限公司 Time sequence data storage method based on dynamic weight balance time sequence database cluster
CN116827947B (en) * 2023-08-31 2024-01-19 联通在线信息科技有限公司 Distributed object storage scheduling method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103929500A (en) * 2014-05-06 2014-07-16 刘跃 Method for data fragmentation of distributed storage system
CN109189329A (en) * 2018-08-08 2019-01-11 杭州数梦工场科技有限公司 The method of adjustment and device of memory node weight
CN109831540A (en) * 2019-04-12 2019-05-31 成都四方伟业软件股份有限公司 Distributed storage method, device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7813276B2 (en) * 2006-07-10 2010-10-12 International Business Machines Corporation Method for distributed hierarchical admission control across a cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
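Migration-based virtual machine cluster optimization algorithm in a cloud environment (云环境下基于迁移的虚拟机集群优化算法); Ji Lili et al. (季莉莉等); 《电子科技》; 20160815 (No. 08); full text *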

Also Published As

Publication number Publication date
CN112422611A (en) 2021-02-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant