CN107211003A - Distributed storage system and method for managing metadata - Google Patents

Info

Publication number
CN107211003A
Authority
CN
China
Prior art keywords
mdc
resource pool
metadata
node
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201580070472.7A
Other languages
Chinese (zh)
Other versions
CN107211003B (en)
Inventor
谢会云
陈钟平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a distributed storage system and a method for managing metadata. The distributed storage system includes a metadatabase, multiple metadata controllers (MDCs), and multiple resource pools. The metadatabase stores the metadata corresponding to the multiple resource pools. The main MDC among the multiple MDCs manages the mapping relations, stored in the metadatabase, between the standby MDCs among the multiple MDCs and the resource pools. Each standby MDC among the multiple MDCs manages the metadata, stored in the metadatabase, that corresponds to the resource pool mapped to that standby MDC. The distributed storage system of the embodiments of the present invention can therefore manage larger storage clusters and isolate fault domains.

Description

Distributed storage system and method for managing metadata
Technical field
Embodiments of the present invention relate to the computer field, and more particularly, to a distributed storage system and a method for managing metadata.
Background
The basic architecture of a typical distributed storage system includes a ZooKeeper (ZK) cluster, a metadata controller (Metadata Controller, "MDC" for short) cluster, resource pools (Pool), and a client (Client) cluster. The MDC cluster is deployed with one main MDC and multiple standby MDCs, and the main MDC is responsible for services such as metadata computation, metadata reads and writes, and Pool fault handling. The metadata storage nodes in the ZK cluster are divided into data (Data) nodes and ephemeral (Ephemeral) nodes. Data under the Data nodes is modified by the main MDC and can be read by the other MDCs. The Ephemeral nodes include a main ephemeral node and standby ephemeral nodes: the main ephemeral node stores the identification information of the main MDC, and each standby ephemeral node stores the identification information of one standby MDC. The main MDC monitors the standby ephemeral nodes to determine whether each standby MDC is in a normal state, and the standby MDCs monitor the main ephemeral node to determine whether the main MDC is in a normal state. Once the main MDC fails, all standby MDCs receive a ZK event notification and enter the main-election procedure; the newly elected main MDC reads the metadata from the ZK cluster and provides services externally after completing initialization.
Because all metadata reads and writes are handled by the main MDC, the system can support resource pool services only up to a certain scale and cannot be scaled out in the resource pool dimension. The standby MDCs do not process services, which wastes system resources. During the period after the main MDC fails and before the new main MDC begins to provide services, the services of the entire distributed storage system are affected. In addition, existing distributed storage systems cannot support dynamic scale-out and scale-in of the metadata control cluster.
Summary of the invention
The present invention provides a distributed storage system and a method for managing metadata that can manage larger storage clusters and isolate fault domains.
According to a first aspect, a distributed storage system is provided, including a metadatabase, multiple metadata controllers (MDCs), and multiple resource pools. The metadatabase is configured to store metadata corresponding to the multiple resource pools. The main MDC among the multiple MDCs is configured to manage the mapping relations, stored in the metadatabase, between the standby MDCs among the multiple MDCs and the resource pools. Each standby MDC among the multiple MDCs is configured to manage the metadata, stored in the metadatabase, that corresponds to the resource pool mapped to that standby MDC.
In the distributed storage system of the embodiments of the present invention, the main MDC manages the mapping relations between MDCs and resource pools, and each standby MDC manages the metadata, stored in the metadatabase, of the resource pool mapped to it, including services such as metadata computation, metadata reads and writes, and resource pool fault handling for that resource pool. The distributed storage system can therefore manage larger storage clusters and isolate fault domains.
With reference to the first aspect, in a first possible implementation of the first aspect, the metadata storage nodes in the metadatabase include a public node, a private node, and an ephemeral node. The metadata stored in the public node is modified by the main MDC. The private node stores the metadata corresponding to each of the multiple resource pools, and the metadata corresponding to each resource pool is read and modified by the standby MDC that manages that resource pool. The ephemeral node stores the identification information of each of the multiple MDCs.
That is, different types of metadata storage nodes can be established in the metadatabase so that the main MDC manages the public metadata, each standby MDC manages the metadata of the resource pool mapped to it, and the main MDC and the standby MDCs monitor each other's state. Every MDC can thus take part in metadata management, so a larger storage cluster can be managed; and because each MDC manages only the metadata of its own resource pool, fault domains can be isolated.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the main MDC is specifically configured to: when a resource pool needs to be created, determine, among the multiple MDCs, the owning MDC of the resource pool to be created, and write the mapping relation between the resource pool to be created and its owning MDC into the public node. The owning MDC of the resource pool to be created is configured to read the topology information of that resource pool from the private node according to the mapping relation, stored in the public node, between the resource pool to be created and its owning MDC.
Optionally, metadata can be stored in the metadatabase in the form of a multi-level directory, that is, in multi-level nodes. For example, the public node can serve as a root node, with different types of metadata stored in different leaf nodes under it. The private node can likewise serve as a root node, with the metadata corresponding to each resource pool stored in one leaf node.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the main MDC is further configured to: receive a resource pool creation request sent by a user, where the request carries the topology information; and write the topology information into the private node.
With reference to the second or third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the owning MDC of the resource pool to be created is further configured to: generate, according to the topology information stored in the private node, metadata corresponding to the resource pool to be created; and write that metadata into the private node.
The number of resource pools can thus be increased according to the user's service needs, raising the storage capacity of the system and better meeting the user's service demands.
With reference to any of the foregoing possible implementations, in a fifth possible implementation of the first aspect, the main MDC is further configured to delete the mapping relation, stored in the public node, between a first resource pool among the multiple resource pools and a first standby MDC among the standby MDCs. The first standby MDC is configured to stop managing the metadata corresponding to the first resource pool when it determines that the mapping relation between the first resource pool and the first standby MDC has been deleted.
With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the main MDC is specifically configured to delete the mapping relation between the first resource pool and the first standby MDC when it determines that the first standby MDC has failed.
When a user no longer needs a resource pool because of a change in services, the mapping relation between that resource pool and its standby MDC can be deleted. The standby MDC mapped to the resource pool then stops managing the metadata corresponding to that resource pool and can, when needed, manage the metadata of other resource pools, which improves the utilization of system resources.
With reference to the sixth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the main MDC is further configured to: when it determines that the first standby MDC has failed, determine a second standby MDC among the multiple MDCs, and write the mapping relation between the first resource pool and the second standby MDC into the public node. The second standby MDC is configured to read the metadata of the first resource pool from the private node according to the mapping relation, stored in the public node, between the first resource pool and the second standby MDC.
With reference to the seventh possible implementation of the first aspect, in an eighth possible implementation of the first aspect, the main MDC is specifically configured to determine, as the second standby MDC, one of the standby MDCs among the multiple MDCs whose load is below a preset threshold.
In the embodiments of the present invention, when a standby MDC fails, another standby MDC can be designated to manage the metadata of the resource pool that the failed standby MDC managed, which improves the reliability of the system.
With reference to any possible implementation of the first aspect, in a ninth possible implementation of the first aspect, the main MDC is further configured to: receive an MDC creation request sent by a user; and write the identification information of the MDC that the request asks to create into the public node.
The distributed storage system of the embodiments of the present invention can thus add MDCs online at the user's request, improving the processing capacity and reliability of the system.
With reference to any possible implementation of the first aspect, in a tenth possible implementation of the first aspect, the standby MDCs among the multiple MDCs are further configured to initiate a main-election procedure when they determine that the main MDC has failed. One standby MDC among the multiple MDCs, acting as the new main MDC, is configured to load the metadata in the public node.
With reference to any possible implementation of the first aspect, in an eleventh possible implementation of the first aspect, the main MDC is further configured to: determine that the view state of a client has changed; and update the metadata in the public node that corresponds to the view of the client.
With reference to any possible implementation of the first aspect, in a twelfth possible implementation of the first aspect, a standby MDC among the multiple MDCs is further configured to update the metadata of the resource pool mapped to that standby MDC when it determines that the state of that resource pool has changed.
According to a second aspect, a method for managing metadata in a distributed storage system is provided, including: receiving, by a first metadata controller (MDC), a mapping relation query request sent by a client, where the request is used to query the MDC that manages the metadata of the resource pool corresponding to a user request; sending, by the first MDC, mapping relation indication information to the client, where the indication information indicates that the MDC managing the resource pool corresponding to the user request is a second MDC; and reading, by the client from the second MDC, the metadata of the resource pool corresponding to the user request.
With reference to the second aspect, in a first possible implementation of the second aspect, before the first MDC sends the mapping relation indication information to the client, the method further includes: reading, by the first MDC, a mapping relation list stored in a metadatabase, where the metadata storage nodes in the metadatabase include a public node, a private node, and an ephemeral node; the metadata stored in the public node is modified by the main MDC; the private node stores the metadata corresponding to each of multiple resource pools, and the metadata corresponding to each resource pool is read and modified by the standby MDC that manages that resource pool; and the ephemeral node stores the identification information of each of multiple MDCs; and determining, by the first MDC according to the mapping relation list, the second MDC as the MDC that manages the resource pool corresponding to the user request.
In the embodiments of the present invention, the main MDC manages the public metadata, each standby MDC manages the metadata of the resource pool mapped to it, and the identification information of each MDC is stored in the ephemeral node so that the main MDC and the standby MDCs can monitor each other's state. Because every MDC can take part in metadata management, a larger storage cluster can be managed; and because each MDC manages only the metadata of its own resource pool, fault domains can be isolated.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the main MDC determines that a resource pool needs to be created; the main MDC determines the owning MDC of the resource pool to be created; the main MDC writes the mapping relation between the resource pool to be created and its owning MDC into the public node; and the owning MDC of the resource pool to be created reads the topology information of that resource pool from the private node according to the mapping relation, stored in the public node, between the resource pool to be created and its owning MDC.
In the embodiments of the present invention, the main MDC may determine, at system initialization, that a resource pool needs to be created according to the user's needs; or the main MDC may determine that a resource pool needs to be created when it determines that the storage capacity of the user's existing resource pools cannot meet the user's service needs; or the main MDC may determine that a resource pool needs to be created when it receives a resource pool creation request sent by the user. The resource pool creation request sent by the user carries the topology information of the resource pool to be created; when the main MDC receives the request, it writes the topology information under the private (Private) node in the metadatabase.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the method further includes: receiving, by the main MDC, a resource pool creation request sent by a user, where the request carries the topology information; and writing, by the main MDC, the topology information into the private node.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the method further includes: generating, by the owning MDC of the resource pool to be created, metadata corresponding to that resource pool according to the topology information; and writing, by the owning MDC of the resource pool to be created, that metadata into the private node.
With reference to any of the foregoing possible implementations, in a fifth possible implementation of the second aspect, the method further includes: deleting, by the main MDC, the mapping relation, stored in the public node, between a first resource pool and a first standby MDC; and stopping, by the first standby MDC, management of the metadata corresponding to the first resource pool when the first standby MDC determines that the mapping relation between the first resource pool and the first standby MDC has been deleted.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the method further includes: determining, by the main MDC, a second standby MDC when the main MDC determines that the first standby MDC has failed; writing, by the main MDC, the mapping relation between the first resource pool and the second standby MDC into the public node; and reading, by the second standby MDC, the metadata of the first resource pool from the private node according to the mapping relation, stored in the public node, between the first resource pool and the second standby MDC.
With reference to any of the foregoing possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the method further includes: receiving, by the main MDC, an MDC creation request sent by a user; and writing, by the main MDC, the related information of the MDC that the request asks to create into the public node.
With reference to any of the foregoing possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the method further includes: initiating, by the standby MDCs, a main-election procedure when they determine that the main MDC has failed; and loading, by one of the standby MDCs acting as the new main MDC, the metadata in the public node.
With reference to any of the foregoing possible implementations of the second aspect, in a ninth possible implementation of the second aspect, the method further includes: updating, by a standby MDC, the metadata of the resource pool mapped to that standby MDC when it determines that the state of that resource pool has changed.
According to a third aspect, a metadata controller (MDC) is provided, configured to perform the method in the second aspect or any possible implementation of the second aspect. Specifically, the MDC includes units for performing the method in the second aspect or any possible implementation of the second aspect.
According to a fourth aspect, a data service unit is provided, including a processor and a memory that are connected by a bus system. The memory is configured to store metadata, and the processor is configured to manage the metadata stored in the memory, so that the data service unit performs the method in the second aspect or any possible implementation of the second aspect.
According to a fifth aspect, a computer-readable medium is provided for storing a computer program, where the computer program includes instructions for performing the method in the second aspect or any possible implementation of the second aspect.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required in the embodiments are briefly described below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is an architecture diagram of a distributed storage system according to an embodiment of the present invention;
Fig. 2 is an architecture diagram of a distributed storage system according to a specific embodiment of the present invention;
Fig. 3 is a schematic diagram of how the nodes in a metadatabase are divided according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a method for managing metadata in a distributed storage system according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of a method for establishing the mapping relation between a POOL and an MDC according to a specific embodiment of the present invention;
Fig. 6 is a schematic diagram of mapping relations according to an embodiment of the present invention;
Fig. 7 is a schematic flowchart of a method by which an owning MDC creates a POOL according to an embodiment of the present invention;
Fig. 8 is a schematic flowchart of a method for deleting a mapping relation according to a specific embodiment of the present invention;
Fig. 9 is a schematic flowchart of a method for rebuilding the mapping relation between an MDC and a POOL according to a specific embodiment of the present invention;
Fig. 10 is a schematic flowchart of a method for adding an MDC according to a specific embodiment of the present invention;
Fig. 11 is a schematic flowchart of a method for electing a new main MDC according to a specific embodiment of the present invention;
Fig. 12 is a schematic flowchart of a method by which the main MDC handles a client view change according to a specific embodiment of the present invention;
Fig. 13 is a schematic flowchart of a method by which an owning MDC handles a resource pool state change according to a specific embodiment of the present invention;
Fig. 14 is a schematic block diagram of a data service unit according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Fig. 1 is a schematic architecture diagram of a distributed storage system according to an embodiment of the present invention. As shown in Fig. 1, the distributed storage system 10 includes a metadatabase 11, multiple metadata controllers (Metadata Controller, "MDC" for short) 12, and multiple resource pools 13.
The metadatabase 11 is configured to store metadata corresponding to the multiple resource pools 13.
The main MDC 121 among the multiple MDCs is configured to manage the mapping relations, stored in the metadatabase 11, between the standby MDCs 122 among the multiple MDCs and the resource pools.
Each standby MDC 122 among the multiple MDCs is configured to manage the metadata, stored in the metadatabase, that corresponds to the resource pool mapped to that standby MDC 122.
In the embodiments of the present invention, optionally, the metadatabase may be an open-source distributed application coordination service such as ZooKeeper ("ZK" for short), or it may be Google Chubby. The multiple MDCs may form an MDC cluster, and the standby MDCs in the MDC cluster and the resource pools are in one-to-one correspondence.
In general, the distributed storage system of the embodiments of the present invention may be the distributed storage system shown in Fig. 2. As shown in Fig. 2, the distributed storage system includes a ZooKeeper cluster, an MDC cluster, multiple resource pools (POOL), and a client (Client) cluster. The MDC cluster includes one main MDC and multiple standby MDCs. The main MDC is responsible for controlling global resources: managing the mapping relations between standby MDCs and POOLs, monitoring the health of the standby MDCs, scaling the MDC cluster out and in, and creating and deleting POOLs. Each standby MDC is responsible for the computation and the reads and writes of the metadata (such as views) of the POOL mapped to it, and for POOL fault handling.
As shown in Fig. 3, the metadata storage nodes in ZK are divided into three types: public (Public) nodes, private (Private) nodes, and ephemeral (Ephemeral) nodes. The Public node and the Private node are data nodes used to store metadata; the metadata stored in the Public node may be called public data, and the metadata stored in the Private node may be called private data. The metadata under the Public node generally includes the resource pool mapping relations, the MDC list, the client view, and so on. It can be modified only by the main MDC and can be read by the other MDCs. The metadata under the Private node corresponds to specific POOLs, and the metadata corresponding to a POOL can be read and modified only by the MDC mapped to that POOL. The Ephemeral node stores the identification information of each MDC. It may further include a main ephemeral node and standby ephemeral nodes: the main ephemeral node stores the identification information of the main MDC, and each standby ephemeral node stores the identification information of one standby MDC. The main MDC monitors the standby ephemeral nodes to determine whether each standby MDC is in a normal state, and the standby MDCs monitor the main ephemeral node to determine whether the main MDC is in a normal state. The identification information of an MDC may include the Internet Protocol (Internet Protocol, "IP" for short) address and/or the identity (Identification, "ID" for short) of the MDC.
Optionally, as an example, and as shown in Fig. 3, metadata can be stored in the metadatabase in the form of a multi-level directory, that is, in multi-level nodes. For example, the public node can serve as a root node, with different types of metadata stored in different leaf nodes under it: in Fig. 3, one leaf node under the public node stores the resource pool mapping relations, one stores the MDC list, and one stores the client view. The private node can likewise serve as a root node, with the metadata corresponding to each resource pool stored in one leaf node. As shown in Fig. 3, one leaf node under the private node stores the metadata corresponding to resource pool 0, and each next-level leaf node under that leaf node can store a different type of metadata; for example, one next-level leaf node stores the topology information of resource pool 0 and another stores the view information of resource pool 0.
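The multi-level layout described for Fig. 3 can be pictured as a small set of hierarchical paths. The following is a minimal sketch only: the root names, the leaf names, and the helper functions are hypothetical and chosen for illustration; the patent does not define concrete path strings.

```python
# Illustrative layout only: every path and leaf name below is an assumption,
# not an identifier defined by the patent.
PUBLIC_ROOT = "/public"
PRIVATE_ROOT = "/private"
PUBLIC_LEAVES = ("pool_mapping", "mdc_list", "client_view")
POOL_LEAVES = ("topology", "view")


def public_path(leaf: str) -> str:
    """Return the path of one leaf node under the Public root."""
    if leaf not in PUBLIC_LEAVES:
        raise ValueError(f"unknown public leaf: {leaf}")
    return f"{PUBLIC_ROOT}/{leaf}"


def private_path(pool_id: int, leaf: str) -> str:
    """Return the path of one next-level leaf under a pool's Private node."""
    if leaf not in POOL_LEAVES:
        raise ValueError(f"unknown pool leaf: {leaf}")
    return f"{PRIVATE_ROOT}/pool{pool_id}/{leaf}"


def build_layout(pool_ids):
    """List every metadata node path for the given resource pools."""
    paths = [public_path(leaf) for leaf in PUBLIC_LEAVES]
    for pool_id in pool_ids:
        paths.extend(private_path(pool_id, leaf) for leaf in POOL_LEAVES)
    return paths
```

Under these assumed names, `private_path(0, "topology")` addresses the topology information of resource pool 0, and the three Public leaves hold the mapping relations, the MDC list, and the client view.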
The Client cluster provides the volume service of the distributed storage system externally and needs to interact with the MDC cluster and the POOLs. A Client uses the view information in the MDC cluster to find the specific location of data in a POOL, and then sends a request to the POOL to complete the read or write of the data requested by the user. A POOL is a collection of object storage device (Object-based Storage Device, "OSD" for short) resources and needs to interact with the MDC cluster and the Client cluster. The distribution of data in a POOL depends on the metadata view in the MDC cluster.
Correspondingly, Fig. 4 shows a method for managing metadata in a distributed storage system according to an embodiment of the present invention. As shown in Fig. 4, the method 100 includes:
S110: a first metadata controller (MDC) receives a mapping relation query request sent by a client, where the request is used to query the MDC that manages the metadata of the resource pool corresponding to a user request;
S120: the first MDC sends mapping relation indication information to the client, where the indication information indicates that the MDC managing the resource pool corresponding to the user request is a second MDC;
S130: the client reads, from the second MDC, the metadata of the resource pool corresponding to the user request.
Specifically, a client in the client cluster receives a read/write request sent by a user and parses it. If the client cannot determine, from the locally stored partition view (Partition View), the specific storage location of the data that the read/write request asks for, the client can ask any MDC in the MDC cluster to query the mapping relations, that is, to find the MDC corresponding to the resource pool that holds the requested data. The MDC that receives the query request can read the mapping relations from the metadatabase and return them to the client. After the client determines which MDC manages the resource pool that holds the requested data, it requests the latest Partition View from that MDC, finds the specific location of the data in the resource pool, and then sends a request to the resource pool to complete the user's read or write of the data.
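The lookup path of steps S110 to S130 can be sketched as follows. This is a simplified in-memory model, not the patented implementation: the `MDC` and `Client` classes, their method names, and the dictionary that stands in for the metadatabase are all hypothetical, and "any MDC" is reduced to "the first MDC known to the client" to keep the sketch deterministic.

```python
class MDC:
    """Hypothetical metadata controller backed by a shared dict 'metadatabase'."""

    def __init__(self, name, metadb):
        self.name = name
        self.metadb = metadb

    def query_mapping(self, pool):
        # S110/S120: any MDC can read the pool -> MDC mapping from the public node.
        return self.metadb["public"]["pool_mapping"][pool]

    def partition_view(self, pool):
        # S130: the owning MDC serves the view stored under the pool's private node.
        return self.metadb["private"][pool]["view"]


class Client:
    """Hypothetical client that caches partition views per pool."""

    def __init__(self, mdcs):
        self.mdcs = mdcs      # name -> MDC
        self.views = {}       # pool -> cached partition view

    def locate(self, pool, key):
        view = self.views.get(pool)
        if view is None or key not in view:
            first_mdc = next(iter(self.mdcs.values()))   # stands in for "any MDC"
            owner = first_mdc.query_mapping(pool)        # ask who manages the pool
            view = self.mdcs[owner].partition_view(pool) # fetch the latest view
            self.views[pool] = view
        return view[key]
```

A client built over two MDCs sharing one metadatabase would call `client.locate("POOL1", "vol-a")`; the first call goes through the mapping query, and later calls for the same pool are answered from the cached view.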
In the embodiments of the present invention, optionally, the main MDC 121 is specifically configured to: when a resource pool needs to be created, determine, among the multiple MDCs, the owning MDC of the resource pool to be created, and write the mapping relation between the resource pool to be created and its owning MDC into the public node. The owning MDC of the resource pool to be created is configured to read the topology information of that resource pool from the private node according to the mapping relation, stored in the public node, between the resource pool to be created and its owning MDC. Specifically, the owning MDC reads the mapping relations stored in the public node and compares the mapping relations it has just read with those it read previously. When this comparison confirms that a mapping relation with the resource pool to be created needs to be established, the owning MDC reads the topology information of that resource pool from the private node.
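The comparison of the current mapping relations with the previously read ones amounts to a small diff. The sketch below assumes the mappings are held as plain pool-to-MDC dictionaries; the function name `new_pools_for` and the identifiers used are hypothetical.

```python
def new_pools_for(mdc_id, previous, current):
    """Return the pools mapped to mdc_id in `current` that were not mapped to it in `previous`.

    `previous` and `current` are pool -> MDC dicts, as read from the public node
    on two successive reads (an assumed representation).
    """
    return sorted(
        pool
        for pool, owner in current.items()
        if owner == mdc_id and previous.get(pool) != mdc_id
    )
```

For each pool returned, the owning MDC would then go on to read that pool's topology information from the private node.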
In embodiments of the present invention, when newly increasing resource pool, it is necessary first to establish resource pool and the mapping relations of MDC.Optionally, main MDC can determination needs establishing resource pond according to the demand of user in system initialization;Or main MDC, when the business that the memory capacity of the determining existing resource pool of user is not able to satisfy user needs, determination needs establishing resource pond;Or, main MDC is when the establishing resource pond for receiving user's transmission is requested, determination needs establishing resource pond, wherein, the establishing resource pond request that user sends carries the topology information for needing the resource pool created, when main MDC receives the establishing resource pond request of user's transmission, topology information therein is written under the Private node in metadatabase.
Optionally, when determining the ownership MDC of the resource pool to be created among the multiple MDCs, the main MDC may select a standby MDC that is not in a fault state. Preferably, the main MDC selects the most lightly loaded standby MDC among those not in a fault state. When all standby MDCs are in a fault state, the main MDC may also designate itself as the ownership MDC of the resource pool to be created.
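The selection rule in the preceding paragraph can be illustrated with a short sketch. The function name `choose_owner` and the representation of each standby MDC as a `(faulty, load)` pair are assumptions made for the example; the embodiment does not prescribe how load is measured.

```python
def choose_owner(main_mdc, standbys):
    """Pick the ownership MDC for a new resource pool.

    `standbys` maps a standby MDC name to a (faulty, load) pair, where
    `load` is any comparable load figure (an assumption of this sketch).
    """
    healthy = {name: load for name, (faulty, load) in standbys.items() if not faulty}
    if not healthy:
        # All standby MDCs are in a fault state: the main MDC takes the pool itself.
        return main_mdc
    # Preferred case: the most lightly loaded standby MDC that is not faulty.
    return min(healthy, key=healthy.get)
```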
Optionally, when writing the mapping relationship between the resource pool to be created and its ownership MDC into the public node, the main MDC may, in order to guarantee data consistency, modify the content of the public node by rewriting all existing mapping relationships together with the new mapping relationship.
For example, assume that the public node already holds the mapping relationships POOL1->MDC1 and POOL2->MDC2, that the POOL to be created is POOL3, and that its ownership MDC is MDC3, so that a mapping relationship between POOL3 and MDC3 must be established. When modifying the content of the public node, the main MDC may add the mapping by appending POOL3->MDC3 to the existing content, or it may write POOL1->MDC1, POOL2->MDC2, and POOL3->MDC3 into the public node as a whole.
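The two update styles in this example can be compared in a short sketch. The comma-separated `POOL->MDC` text format and the function names are assumptions for illustration; the embodiment does not specify how the public node's content is encoded.

```python
def serialize(mapping):
    # Hypothetical encoding of the public node content: "POOL1->MDC1,POOL2->MDC2".
    return ",".join(f"{pool}->{mdc}" for pool, mdc in mapping.items())


def parse(content):
    return dict(item.split("->") for item in content.split(",") if item)


def add_incremental(content, pool, mdc):
    """Append only the new entry to the existing content."""
    entry = f"{pool}->{mdc}"
    return f"{content},{entry}" if content else entry


def add_by_rewrite(content, pool, mdc):
    """Rewrite every existing mapping together with the new one."""
    mapping = parse(content)
    mapping[pool] = mdc
    return serialize(mapping)
```

Both functions yield the same content for the POOL3 example; the rewrite variant replaces the node content as a whole, which is the approach the text associates with data consistency.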
The method for establishing a mapping relationship between a POOL and an MDC according to a specific embodiment of the present invention is described in detail below with reference to Fig. 5. As shown in Fig. 5, method 200 includes:
S201, all standby MDCs monitor the resource pool mapping (POOL Mapping) node in ZK;
The resource pool mapping node stores the mapping relationships between all standby MDCs and resource pools. Optionally, it may also store the state of each MDC that has a mapping relationship with a resource pool (whether it is in a fault state).
S202, the main MDC modifies the content of the POOL Mapping node;
When the main MDC determines that a new resource pool needs to be created, it first selects a standby MDC as the ownership MDC of the resource pool to be created according to the method described above, creates a node corresponding to that resource pool under the Private node in ZK, and writes the topology information of the resource pool into that node.
As shown in Fig. 6, the POOL to be created is POOL3 and the ownership MDC determined by the main MDC is MDC3. The main MDC therefore writes the topology information of POOL3 into ZK and adds the mapping relationship between POOL3 and MDC3 to the POOL Mapping node.
S203, ZK returns modification-success information to the main MDC;
S204, MDC3 receives a node event notification;
Because the content of the POOL Mapping node has been modified, ZK is triggered to send a node event notification to all standby MDCs.
S205, after receiving the node event notification, MDC3 obtains the content of the POOL Mapping node;
S206, ZK returns the content read by MDC3 to MDC3;
S207, MDC3 determines, according to the content of the POOL Mapping node, that a mapping relationship between POOL3 and MDC3 has been added;
After reading the content of the POOL Mapping node, MDC3 may compare the content read this time with the content read previously, and determine from the two reads whether it needs to establish a mapping relationship with a POOL.
S208, MDC3 reads the service data associated with POOL3 from ZK;
When a POOL is being created, what MDC3 reads from ZK is the topology information of POOL3.
S209, ZK returns data-read-success information to MDC3;
S210, MDC3 determines that the mapping relationship with POOL3 is successfully established.
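The comparison of two successive reads of the POOL Mapping node, which a standby MDC uses to decide whether a mapping has been added for it, can be sketched as below. The function name `mapping_changes` and the dictionary form of the node content are assumptions of the example.

```python
def mapping_changes(previous, current, me):
    """Compare two reads of the POOL Mapping content for one MDC.

    `previous` and `current` map pool names to MDC names. Returns a pair
    (pools newly mapped to `me`, pools no longer mapped to `me`).
    """
    mine_before = {pool for pool, mdc in previous.items() if mdc == me}
    mine_now = {pool for pool, mdc in current.items() if mdc == me}
    return sorted(mine_now - mine_before), sorted(mine_before - mine_now)
```

For a newly added pool the MDC would then read that pool's topology information; the second element of the result supports the same comparison when a mapping is removed.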
In embodiments of the present invention, optionally, after reading the topology information of the resource pool to be created, the ownership MDC may generate the metadata corresponding to that resource pool according to the topology information, and then write the metadata into the private node.
Specifically, Fig. 7 is a schematic flowchart of a method by which the ownership MDC creates a POOL according to a specific embodiment of the present invention. As shown in Fig. 7, method 300 includes:
S301, the ownership MDC generates data information according to the topology information;
The data information may include an object storage device (Object-based Storage Device, "OSD" for short) view (OSD View) and a Partition View.
S302, the ownership MDC sends the topology information, the data information, and the like to ZK through the interface provided by ZK;
S303, after receiving the information sent by the ownership MDC, ZK creates a node corresponding to the resource pool to be created and writes the topology information, the data information, and the like into that node for the ownership MDC;
S304, ZK returns write-success indication information to the ownership MDC;
S305, the ownership MDC determines that the POOL is created successfully and waits for the OSDs to provide service.
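As a rough illustration of steps S301 to S305, the sketch below derives an OSD View and a Partition View from topology information and stores them under a per-pool entry. The embodiment does not describe how these views are computed; the topology format, the round-robin partition placement, the partition count, and all names here are assumptions chosen only to make the flow concrete.

```python
def build_pool_metadata(topology, partitions=8):
    """Derive illustrative views from topology information.

    `topology` is assumed to look like {"osds": ["osd-1", "osd-2", ...]}.
    """
    osds = sorted(topology["osds"])
    osd_view = {osd: "UP" for osd in osds}
    # Round-robin placement is an assumption of this sketch, not the patented method.
    partition_view = {p: osds[p % len(osds)] for p in range(partitions)}
    return {"topology": topology, "osd_view": osd_view, "partition_view": partition_view}


def create_pool(private_node, pool, topology):
    """Write the generated metadata under the pool's entry in the private node."""
    private_node[pool] = build_pool_metadata(topology)
    return True  # stands in for the write-success indication of S304
```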
In embodiments of the present invention, optionally, the main MDC is further configured to delete the mapping relationship, stored in the public node, between a first resource pool among the multiple resource pools and a first standby MDC among the standby MDCs;
The first standby MDC is configured to stop managing the metadata of the first resource pool when it determines that the mapping relationship between the first resource pool and the first standby MDC has been deleted.
Specifically, when a change in the user's services means that the current POOL is no longer needed, the POOL may be deleted. Before the POOL is deleted, the mapping relationship between the resource pool and the MDC is deleted first, and then the Private node corresponding to the POOL in the metadatabase is deleted. Alternatively, when the main MDC determines that the first standby MDC has failed, it may delete the mapping relationship between the first resource pool and the first standby MDC.
For example, assume that the mapping relationship between MDC2 and POOL2 needs to be deleted. This may be done according to the method shown in Fig. 8. As shown in Fig. 8, method 400 includes:
S401, MDC2 monitors the POOL Mapping node in ZK;
All MDCs are required to monitor the POOL Mapping node in ZK.
S402, the main MDC modifies the content of the POOL Mapping node;
The main MDC deletes the mapping relationship between MDC2 and POOL2 stored in POOL Mapping. Assume that POOL Mapping already holds the mapping relationships POOL1->MDC1, POOL2->MDC2, and POOL3->MDC3. When modifying the content of the POOL Mapping node, the main MDC may delete the mapping by removing POOL2->MDC2 from the existing content, or it may rewrite the content of POOL Mapping with only POOL1->MDC1 and POOL3->MDC3.
S403, ZK returns modification-success information to the main MDC;
S404, MDC2 receives a node event notification;
Because the content of the POOL Mapping node has been modified, ZK is triggered to send a node event notification to all standby MDCs.
S405, after receiving the node event notification, MDC2 obtains the content of the POOL Mapping node;
S406, ZK returns the content read by MDC2 to MDC2;
S407, MDC2 determines, according to the content of the POOL Mapping node, that the mapping relationship between MDC2 and POOL2 has been deleted;
After reading the content of the POOL Mapping node, MDC2 may compare the content read this time with the content read previously, and determine from the two reads whether a mapping relationship needs to be deleted.
S408, MDC2 unloads the resources related to POOL2 and stops managing the metadata corresponding to POOL2;
S409, the main MDC checks with MDC2 whether unloading is complete;
S410, MDC2 returns unloading-complete confirmation information;
S411, the main MDC confirms that the mapping relationship between MDC2 and POOL2 is successfully deleted.
In embodiments of the present invention, optionally, the main MDC is further configured to: when determining that the first standby MDC has failed, determine a second standby MDC among the multiple MDCs, and write the mapping relationship between the first resource pool and the second standby MDC into the public node;
The second standby MDC is configured to read the metadata of the first resource pool from the private node according to the mapping relationship, stored in the public node, between the first resource pool and the second standby MDC.
When determining the second standby MDC among the multiple MDCs, the main MDC may select one of the standby MDCs whose load is below a preset threshold; for example, the main MDC may select the most lightly loaded MDC as the second standby MDC.
In other words, when a standby MDC fails, the mapping relationship between the POOL and a standby MDC can be rebuilt. Optionally, the main MDC may periodically query the standby transient node that each standby MDC has created in the metadatabase; when it finds that a standby transient node has become invalid, it determines that the standby MDC corresponding to that transient node has failed and then sets the state of that standby MDC to the fault state. Alternatively, each standby MDC may periodically report its own health status to the main MDC; when a standby MDC reports a fault or fails to report within the deadline, the main MDC sets the state of that standby MDC to the fault state.
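The two alternative detection approaches just described can be sketched as two small functions. The function names, the `"FAULT"` status string, and the use of plain numeric timestamps are assumptions of the example.

```python
def failed_by_transient_node(standbys, live_transient_nodes):
    """Standby MDCs whose standby transient node no longer exists."""
    return sorted(set(standbys) - set(live_transient_nodes))


def failed_by_report(last_report, health, now, deadline):
    """Standby MDCs that reported a fault or whose last report is overdue.

    `last_report` maps an MDC name to the time of its latest health report;
    `health` maps an MDC name to its latest reported status.
    """
    return sorted(
        name
        for name, reported_at in last_report.items()
        if health.get(name) == "FAULT" or now - reported_at > deadline
    )
```

Either result could then be used by the main MDC to mark the listed standby MDCs as being in the fault state.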
The method for re-establishing the mapping relationship between a POOL and standby MDC3 when a standby MDC (standby MDC2) fails is described in detail below with reference to Fig. 9. As shown in Fig. 9, method 500 includes:
S501, the main MDC queries the state of the standby transient nodes that standby MDC2 and standby MDC3 have created in ZK;
S502, on determining that standby MDC2 has failed, the main MDC establishes a new mapping relationship for POOL2, which was previously mapped to MDC2;
For example, the main MDC selects the most lightly loaded MDC, MDC3, to establish a mapping relationship with POOL2, deletes the mapping relationship between MDC2 and POOL2 from the POOL Mapping node in ZK, and adds the mapping relationship between MDC3 and POOL2 to the POOL Mapping node.
S503, MDC3 receives a POOL Mapping node change event notification;
S504, MDC3 reads the content of the POOL Mapping node;
S505, ZK returns the content of the POOL Mapping node to MDC3;
S506, MDC3 determines, according to the content of the POOL Mapping node, that a mapping relationship between MDC3 and POOL2 has been added;
S507, MDC3 reads the service data associated with POOL2 from ZK;
S508, ZK returns the service data associated with POOL2 to MDC3;
S509, MDC3 determines that the mapping relationship with POOL2 is successfully established.
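The reassignment performed by the main MDC in S502 can be sketched as follows. The function name `reassign_pools`, the numeric load table, and the rule that each reassigned pool adds one unit of load are assumptions of the example; the sketch also assumes at least one MDC other than the failed one is available.

```python
def reassign_pools(mapping, failed, loads):
    """Move every pool owned by `failed` to the most lightly loaded other MDC.

    `mapping` maps pool -> MDC; `loads` maps MDC -> current load figure.
    Returns a new mapping and leaves the inputs untouched.
    """
    new_mapping = dict(mapping)
    loads = {mdc: load for mdc, load in loads.items() if mdc != failed}
    for pool, owner in mapping.items():
        if owner != failed:
            continue
        target = min(loads, key=loads.get)  # most lightly loaded candidate
        new_mapping[pool] = target          # replaces the failed MDC's entry
        loads[target] += 1                  # assumed cost of taking over one pool
    return new_mapping
```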
In embodiments of the present invention, optionally, the main MDC 121 is further configured to: receive a create-MDC request sent by a user, and write the identification information of the MDC that the request asks to create into the public node. The identification information of an MDC may be the IP address or port number of the MDC, but the present invention is not limited thereto.
Therefore, the distributed storage system of the embodiment of the present invention can dynamically increase or decrease the number of resource pools, and can also smoothly scale the MDCs out or in online, thereby improving the processing capacity and reliability of the system.
The method for adding an MDC according to an embodiment of the present invention is described in detail below with reference to Figure 10. As shown in Figure 10, method 600 includes:
S601, the user sends a request to add an MDC to the main MDC;
Assume that the MDC that the user's request asks to add is MDC3.
S602, the main MDC checks the validity of the request;
For example, the main MDC checks whether the IP address and port of MDC3, the MDC to be added, are valid.
S603, the main MDC updates the MDC list and adds monitoring of MDC3;
S604, ZK updates the MDC list under the Public node;
S605, ZK returns the update result to the main MDC;
S606, the main MDC returns to the user that MDC3 was added successfully;
S607, the user requests an agent (Agent) to start a new MDC process;
S608, the Agent starts the MDC process;
S609, the Agent returns the operation result to the user;
S610, the main MDC establishes a long-lived connection with MDC3;
S611, MDC3 creates a transient node in ZK;
S612, ZK returns the creation result to MDC3;
S613, MDC3 initializes, loads the necessary data, and starts working normally.
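The validity check in S602 and the list update in S603 and S604 can be sketched as below. The embodiment only says that the IP address and port are checked; the specific rules here (a syntactically valid IP address, a port in the range 1 to 65535, and no duplicate name) and the function names are assumptions of the example.

```python
import ipaddress


def is_valid_endpoint(ip, port):
    """Return True if `ip` parses as an IPv4/IPv6 address and `port` is in range."""
    try:
        ipaddress.ip_address(ip)
    except ValueError:
        return False
    return isinstance(port, int) and 0 < port < 65536


def add_mdc(mdc_list, name, ip, port):
    """Record a new MDC in the (hypothetical) MDC list if the request is valid."""
    if name in mdc_list or not is_valid_endpoint(ip, port):
        return False
    mdc_list[name] = (ip, port)
    return True
```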
In embodiments of the present invention, optionally, if an MDC needs to be deleted, this may be done by following the reverse of method 600; details are not described here.
In embodiments of the present invention, optionally, when the main MDC fails, a new main MDC needs to be determined. Therefore, the standby MDCs among the multiple MDCs are further configured to initiate a competition procedure for the main role when determining that the main MDC has failed; one standby MDC among the multiple MDCs becomes the new main MDC and is configured to load the metadata in the public node.
Specifically, only one standby MDC is allowed to create the main transient node in the metadatabase at any one time. The standby MDC that first creates the main transient node in the metadatabase may be determined to be the new main MDC. Alternatively, after a standby MDC creates the main transient node, it may determine the load of the other standby MDCs from the MDC-to-POOL mapping relationships stored in the metadatabase; if it determines that its own load is higher than that of some other standby MDC, it may invalidate the main transient node it created so that another standby MDC can create the main transient node in the metadatabase. In this way, the most lightly loaded standby MDC can be promoted to the new main MDC, which provides service after loading the metadata of the public node in the metadatabase.
The method for re-electing the main MDC according to an embodiment of the present invention is described in detail below with reference to Figure 11. As shown in Figure 11, assume that the MDC cluster in the distributed storage system includes one main MDC and three standby MDCs, namely standby MDC1, standby MDC2, and standby MDC3. Method 700 includes:
S701, the main MDC registers the main transient node in ZK;
S702, the standby MDCs monitor the main transient node;
S703, the main MDC fails, causing the main transient node to become invalid;
S704, the standby MDCs receive a main transient node change notification;
S705, the standby MDCs initiate the competition procedure;
S706, the standby MDC that first creates the main transient node in ZK is promoted to the new main MDC;
For example, standby MDC3 is promoted to the new main MDC as shown in Figure 11.
S707, MDC3 (the new main MDC) provides service after loading the data in the Public node in ZK.
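The "first to create the main transient node wins" rule of S703 to S706 can be modeled with a small in-memory sketch. The class `MetaDB` and its methods are hypothetical stand-ins for the metadatabase, and the order of the `contenders` list stands in for the order in which creation attempts arrive; a real deployment would rely on the metadatabase itself to make node creation atomic.

```python
class MetaDB:
    """In-memory stand-in that holds at most one main transient node."""

    def __init__(self, main=None):
        self.main_node = main

    def create_main_node(self, mdc):
        # Creation succeeds only if no main transient node currently exists.
        if self.main_node is not None:
            return False
        self.main_node = mdc
        return True

    def expire_main_node(self):
        # Models the main transient node becoming invalid when the main MDC fails.
        self.main_node = None


def compete(metadb, contenders):
    """Each contender tries to create the main node; return the resulting main MDC."""
    for mdc in contenders:
        metadb.create_main_node(mdc)  # later attempts fail once a node exists
    return metadb.main_node
```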
In embodiments of the present invention, optionally, the main MDC 121 is further configured to: determine that the view of a client has changed, and update the metadata, stored in the public node, that corresponds to the view of the client.
Optionally, a client may periodically report a heartbeat to the main MDC; if the main MDC does not receive a heartbeat from the client within a certain period, it can confirm that the view of the client has changed. The main MDC then modifies the metadata in the metadatabase that corresponds to the view of the client.
The method by which the main MDC handles a client view change according to an embodiment of the present invention is described in detail below with reference to Figure 12. As shown in Figure 12, method 800 includes:
S801, the main MDC confirms that the client view (Client View) has changed;
S802, the main MDC modifies the data in the Client View node in ZK;
S803, ZK returns modification success to the main MDC;
S804, the main MDC returns the processing result to the client.
It should be understood that Figure 12 uses Client View as an example. Changes to other public data are likewise handled only by the main MDC, so no write conflicts occur; the processing flow for other public data is similar to method 800.
In embodiments of the present invention, optionally, a standby MDC among the multiple MDCs is further configured to: when determining that the state of a resource pool having a mapping relationship with the standby MDC has changed, update the metadata of that resource pool. A state change of a resource pool includes a change in the OSD View or a change in the Partition View.
The method by which a standby MDC handles a resource pool state change according to an embodiment of the present invention is described in detail below with reference to Figure 13. As shown in Figure 13, method 900 includes:
S901, the ownership MDC determines that the OSD View has changed;
S902, the ownership MDC modifies the data of the OSD View node in ZK;
S903, ZK returns modification success to the ownership MDC;
S904, the ownership MDC returns the processing result to the OSD.
It should be understood that Figure 13 uses OSD View as an example. Changes to other private metadata of a resource pool are likewise handled only by the ownership MDC that has a mapping relationship with that resource pool, and the processing flow for other private data is similar to method 900. Because the private metadata of a resource pool can be handled only by the MDC that has a mapping relationship with that resource pool, the services of multiple POOLs can be processed concurrently at the service layer without conflicting with one another.
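The write rules that make this concurrency safe (public data written only by the main MDC, private data of a pool written only by its ownership MDC) can be sketched as follows. The class `MetadataStore`, its method names, and the use of `PermissionError` to reject a write are assumptions of the example, not part of the embodiment.

```python
class MetadataStore:
    """Toy model of single-writer rules for public and private metadata."""

    def __init__(self, main, mapping):
        self.main = main        # name of the main MDC
        self.mapping = mapping  # pool -> ownership MDC
        self.public = {}
        self.private = {}

    def write_public(self, writer, key, value):
        # Only the main MDC may change public data such as the Client View.
        if writer != self.main:
            raise PermissionError(f"{writer} is not the main MDC")
        self.public[key] = value

    def write_private(self, writer, pool, key, value):
        # Only the MDC mapped to the pool may change that pool's private metadata.
        if self.mapping.get(pool) != writer:
            raise PermissionError(f"{writer} does not own {pool}")
        self.private.setdefault(pool, {})[key] = value
```

Because each key has exactly one permitted writer in this model, updates to different pools never contend with one another.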
Therefore, the distributed storage system and the method for managing metadata of the embodiments of the present invention can manage larger storage clusters and can achieve fault domain isolation.
Figure 14 shows a data service unit 100 according to an embodiment of the present invention. The data service unit 100 includes multiple processors 101, a memory 102, and a bus system 103; the multiple processors 101 are connected to the memory 102 through the bus system 103. The memory 102 is configured to store metadata corresponding to multiple resource pools. A main processor among the multiple processors 101 is configured to manage the mapping relationships, stored in the memory 102, between the standby processors among the multiple processors and the resource pools. A standby processor among the multiple processors is configured to manage the metadata, stored in the memory, of the resource pools that have a mapping relationship with that standby processor.
A first processor among the multiple processors 101 is configured to receive a mapping relationship query request sent by a client, where the request is used to query which processor manages the metadata of the resource pool corresponding to a user request. The first processor is further configured to send mapping relationship indication information to the client, indicating that the processor managing the resource pool corresponding to the user request is a second processor, so that the client reads the metadata of that resource pool from the second processor.
The data service unit of the embodiment of the present invention can manage larger storage clusters and can achieve fault domain isolation.
It should be understood that, in embodiments of the present invention, optionally, the processor 101 may be a central processing unit (Central Processing Unit, CPU for short). The processor 101 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP for short), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field-Programmable Gate Array, FPGA for short) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 102 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A part of the memory 102 may also include non-volatile random access memory. For example, the memory 102 may also store information about the device type.
In addition to a data bus, the bus system 103 may include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, the various buses are all denoted as the bus system 103 in the figure.
During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 101 or by instructions in the form of software. The steps of the method disclosed in the embodiments of the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory 102; the processor 101 reads the metadata in the memory 102 and completes the steps of the above method in combination with its hardware. To avoid repetition, details are not described here.
Optionally, as one embodiment, the metadata storage nodes in the memory 102 include a public node, a private node, and a transient node. The metadata stored in the public node is modified by the main processor. The private node stores the metadata corresponding to each resource pool among the multiple resource pools, and the metadata corresponding to each resource pool is read and modified by the standby processor that manages that resource pool. The transient node stores the identification information of each processor among the multiple processors.
Optionally, as one embodiment, the main processor is specifically configured to: when a resource pool needs to be created, determine, among the multiple processors, the ownership processor of the resource pool to be created, and write the mapping relationship between the resource pool to be created and its ownership processor into the memory;
The ownership processor of the resource pool to be created is configured to read the topology information of that resource pool from the private node according to the mapping relationship, stored in the memory, between the resource pool to be created and its ownership processor.
Optionally, as one embodiment, the main processor is further configured to: receive a create-resource-pool request sent by a user, where the request carries the topology information; and write the topology information into the memory.
Optionally, as one embodiment, the ownership processor of the resource pool to be created is further configured to: generate, according to the topology information stored in the memory, the metadata corresponding to the resource pool to be created; and write that metadata into the memory.
Optionally, as one embodiment, the main processor is further configured to delete the mapping relationship, stored in the memory, between a first resource pool among the multiple resource pools and a first standby processor among the standby processors. The first standby processor is configured to stop managing the metadata corresponding to the first resource pool when it determines that the mapping relationship between the first resource pool and the first standby processor has been deleted.
Optionally, as one embodiment, the main processor is further configured to: when determining that the first standby processor has failed, determine a second standby processor among the multiple processors, and write the mapping relationship between the first resource pool and the second standby processor into the memory. The second standby processor is configured to read the metadata of the first resource pool from the memory according to the mapping relationship, stored in the memory, between the first resource pool and the second standby processor.
Optionally, as one embodiment, the main processor is further configured to: receive a create-processor request sent by a user; and write the identification information of the processor that the request asks to create into the memory.
Optionally, as one embodiment, the standby processors among the multiple processors are further configured to initiate a competition procedure for the main role when determining that the main processor has failed; one standby processor among the multiple processors becomes the new main processor and is configured to load the metadata in the memory.
Optionally, as one embodiment, a standby processor among the multiple processors is further configured to: when determining that the state of a resource pool having a mapping relationship with the standby processor has changed, update the metadata of that resource pool.
In the data service unit of the embodiment of the present invention, the main processor manages the mapping relationships between processors and resource pools, and each standby processor manages the metadata, stored in the memory, of the resource pools that have a mapping relationship with it. Every processor can therefore take part in metadata management, so that larger storage clusters can be managed; and because each processor manages only the metadata of its own resource pools, fault domain isolation can be achieved.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments; details are not described here.
In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; other divisions are possible in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect coupling or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (20)

  1. A kind of distributed memory system characterized by comprising metadatabase, multiple metadata controller MDC and multiple resource pools;
    The metadatabase, for storing metadata corresponding with the multiple resource pool;
    Main MDC in the multiple MDC, for managing the mapping relations between standby MDC and resource pool in the multiple MDC stored in the metadatabase;
    Standby MDC in the multiple MDC, for manage stored in the metadatabase there is the corresponding metadata of the resource pool of mapping relations with the standby MDC.
  2. The method according to claim 1, wherein the metadata memory node in the metadatabase includes publicly-owned node, privately owned node and transient node;
    Wherein, the metadata stored in the publicly-owned node is modified by the main MDC;The corresponding metadata of each resource pool in the multiple resource pool is stored in the privately owned node, and the corresponding metadata of each resource pool is read out and is modified by the standby MDC for managing the resource pool in the multiple resource pool;The identification information of each MDC in the multiple MDC is stored in the transient node.
  3. Distributed memory system according to claim 2, it is characterized in that, the main MDC, specifically for when needing establishing resource pond, the ownership MDC for needing the resource pool created is determined in the multiple MDC, and the mapping relations for needing the ownership MDC of the resource pool created and the resource pool for needing to create are written in the publicly-owned node;
    The ownership MDC of the resource pool for needing to create, for reading the topology information of the resource pool for needing to create from the privately owned node according to the resource pool of the needs creation stored in the publicly-owned node and the mapping relations for belonging to MDC of the resource pool for needing to create.
  4. The distributed storage system according to claim 3, characterized in that the main MDC is further configured to:
    receive a resource pool creation request sent by a user, the resource pool creation request carrying the topology information;
    write the topology information into the private node.
  5. The distributed storage system according to claim 4, characterized in that the owning MDC of the resource pool to be created is further configured to:
    generate, according to the topology information stored in the private node, metadata corresponding to the resource pool to be created;
    write the metadata corresponding to the resource pool to be created into the private node.
  6. The distributed storage system according to any one of claims 2 to 5, characterized in that the main MDC is further configured to: delete the mapping relation, stored in the public node, between a first resource pool among the multiple resource pools and a first standby MDC among the standby MDCs;
    the first standby MDC is configured to stop managing the metadata corresponding to the first resource pool when determining that the mapping relation between the first resource pool and the first standby MDC has been deleted.
  7. The distributed storage system according to claim 6, characterized in that the main MDC is further configured to: when determining that the first standby MDC has failed, determine a second standby MDC among the multiple MDCs, and write the mapping relation between the first resource pool and the second standby MDC into the public node;
    the second standby MDC is configured to read the metadata of the first resource pool from the private node according to the mapping relation, stored in the public node, between the first resource pool and the second standby MDC.
  8. The distributed storage system according to any one of claims 2 to 7, characterized in that the main MDC is further configured to:
    receive an MDC creation request sent by a user;
    write the identification information of the MDC that the MDC creation request asks to create into the public node.
  9. The distributed storage system according to any one of claims 2 to 8, characterized in that the standby MDCs among the multiple MDCs are further configured to initiate a procedure of competing for the main role when determining that the main MDC has failed;
    one standby MDC among the multiple MDCs, acting as the new main MDC, is configured to load the metadata in the public node.
  10. The distributed storage system according to any one of claims 2 to 9, characterized in that the standby MDC among the multiple MDCs is further configured to:
    update the metadata of the resource pool that has a mapping relation with the standby MDC when determining that the state of that resource pool has changed.
  11. A method for managing metadata in a distributed storage system, characterized in that the method comprises:
    a first metadata controller (MDC) receives a mapping relation query request sent by a client, the mapping relation query request being used to query the MDC that manages the metadata of the resource pool corresponding to a user request;
    the first MDC sends mapping relation indication information to the client, the mapping relation indication information indicating that the management MDC of the resource pool corresponding to the user request is a second MDC;
    the client reads the metadata of the resource pool corresponding to the user request from the second MDC.
  12. The method according to claim 11, characterized in that before the first MDC sends the mapping relation indication information to the client, the method further comprises:
    the first MDC reads a mapping relation list stored in a metadata database, wherein the metadata storage nodes in the metadata database include a public node, a private node and a transient node, the metadata stored in the public node is modified by a main MDC, the metadata corresponding to each resource pool among multiple resource pools is stored in the private node and is read and modified by the standby MDC that manages that resource pool, and the identification information of each MDC among multiple MDCs is stored in the transient node;
    the first MDC determines, according to the mapping relation list, the second MDC as the management MDC of the resource pool corresponding to the user request.
  13. The method according to claim 12, characterized in that the method further comprises:
    the main MDC determines that a resource pool needs to be created;
    the main MDC determines the owning MDC of the resource pool to be created;
    the main MDC writes the mapping relation between the resource pool to be created and its owning MDC into the public node;
    the owning MDC of the resource pool to be created reads the topology information of the resource pool to be created from the private node according to the mapping relation, stored in the public node, between the resource pool to be created and its owning MDC.
  14. The method according to claim 13, characterized in that the method further comprises:
    the main MDC receives a resource pool creation request sent by a user, the resource pool creation request carrying the topology information;
    the main MDC writes the topology information into the private node.
  15. The method according to claim 14, characterized in that the method further comprises:
    the owning MDC of the resource pool to be created generates, according to the topology information, metadata corresponding to the resource pool to be created;
    the owning MDC of the resource pool to be created writes the metadata corresponding to the resource pool to be created into the private node.
  16. The method according to any one of claims 12 to 15, characterized in that the method further comprises:
    the main MDC deletes the mapping relation, stored in the public node, between a first resource pool and a first standby MDC;
    the first standby MDC stops managing the metadata corresponding to the first resource pool when determining that the mapping relation between the first resource pool and the first standby MDC has been deleted.
  17. The method according to claim 16, characterized in that the method further comprises:
    the main MDC determines a second standby MDC when determining that the first standby MDC has failed;
    the main MDC writes the mapping relation between the first resource pool and the second standby MDC into the public node;
    the second standby MDC reads the metadata of the first resource pool from the private node according to the mapping relation, stored in the public node, between the first resource pool and the second standby MDC.
  18. The method according to any one of claims 12 to 17, characterized in that the method further comprises:
    the main MDC receives an MDC creation request sent by a user;
    the main MDC writes the identification information of the MDC that the MDC creation request asks to create into the public node.
  19. The method according to any one of claims 12 to 18, characterized in that the method further comprises:
    a standby MDC initiates a procedure of competing for the main role when determining that the main MDC has failed;
    one standby MDC among the standby MDCs, acting as the new main MDC, loads the metadata in the public node.
  20. The method according to any one of claims 12 to 19, characterized in that the method further comprises:
    a standby MDC updates the metadata of the resource pool that has a mapping relation with the standby MDC when determining that the state of that resource pool has changed.
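
The division of roles recited in the claims above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the class and function names (`MetadataDatabase`, `MainMDC`, `lookup_manager`), the in-memory dictionaries standing in for the public, private and transient nodes, and the least-loaded rule for choosing a managing MDC are all hypothetical, since the claims do not specify how the managing MDC is selected.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataDatabase:
    """In-memory stand-in for the metadata database (hypothetical layout)."""
    public_node: dict = field(default_factory=dict)    # pool id -> id of the managing standby MDC
    private_node: dict = field(default_factory=dict)   # pool id -> per-pool record
    transient_node: set = field(default_factory=set)   # ids of the MDCs currently registered


class MainMDC:
    """Only the main MDC writes the pool-to-MDC mapping in the public node."""

    def __init__(self, db, mdc_id):
        self.db = db
        self.mdc_id = mdc_id

    def standbys(self):
        # Every registered MDC other than the main one is treated as a standby.
        return sorted(self.db.transient_node - {self.mdc_id})

    def pick_owner(self):
        # Assumption: choose the standby that currently manages the fewest pools.
        load = {m: 0 for m in self.standbys()}
        for owner in self.db.public_node.values():
            if owner in load:
                load[owner] += 1
        return min(load, key=lambda m: (load[m], m))

    def create_pool(self, pool_id, topology):
        # Claims 3 and 4: store the topology, then record the mapping.
        self.db.private_node[pool_id] = {"topology": topology}
        owner = self.pick_owner()
        self.db.public_node[pool_id] = owner
        return owner

    def handle_standby_failure(self, failed_id):
        # Claim 7: remap every pool of the failed standby to another standby.
        self.db.transient_node.discard(failed_id)
        for pool_id, owner in list(self.db.public_node.items()):
            if owner == failed_id:
                self.db.public_node[pool_id] = self.pick_owner()


def lookup_manager(db, pool_id):
    # Claims 11 and 12: any MDC can tell a client which MDC manages a pool.
    return db.public_node[pool_id]
```

In this sketch a client would call `lookup_manager` on any MDC and then read the pool metadata from the MDC it names. `pick_owner` raises `ValueError` when no standby is registered; a real system would need an explicit policy for that case.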
CN201580070472.7A 2015-12-31 2015-12-31 Distributed storage system and method for managing metadata Active CN107211003B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/100088 WO2017113280A1 (en) 2015-12-31 2015-12-31 Distributed storage system and metadata managing method

Publications (2)

Publication Number Publication Date
CN107211003A (en) 2017-09-26
CN107211003B (en) 2020-07-14

Family

ID=59224094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580070472.7A Active CN107211003B (en) 2015-12-31 2015-12-31 Distributed storage system and method for managing metadata

Country Status (2)

Country Link
CN (1) CN107211003B (en)
WO (1) WO2017113280A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704212A (en) * 2017-10-31 2018-02-16 紫光华山信息技术有限公司 A kind of data processing method and device
CN111414136A (en) * 2020-03-13 2020-07-14 苏州浪潮智能科技有限公司 Method, system, device and medium for creating storage pool
CN112260874A (en) * 2020-10-23 2021-01-22 南京鹏云网络科技有限公司 Management system and method based on distributed storage unit
CN112667577A (en) * 2020-12-25 2021-04-16 浙江大华技术股份有限公司 Metadata management method, metadata management system and storage medium
CN116560818A (en) * 2023-06-29 2023-08-08 深圳市易图资讯股份有限公司 Method and system for distributing and scheduling space data service

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108540832A (en) * 2018-03-12 2018-09-14 四川合智聚云科技有限公司 A kind of intelligent management based on television services
CN111131441A (en) * 2019-12-21 2020-05-08 西安天互通信有限公司 Real-time file sharing system and method
US11223681B2 (en) 2020-04-10 2022-01-11 Netapp, Inc. Updating no sync technique for ensuring continuous storage service in event of degraded cluster state

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030220943A1 (en) * 2002-05-23 2003-11-27 International Business Machines Corporation Recovery of a single metadata controller failure in a storage area network environment
US20110107139A1 (en) * 2009-11-09 2011-05-05 Quantum Corporation Timer bounded arbitration protocol for resource control
WO2011130185A2 (en) * 2010-04-11 2011-10-20 Alex Grossman Systems and methods for raid metadata storage
CN103503414A (en) * 2012-12-31 2014-01-08 华为技术有限公司 Computing storage integration cluster system
CN104135539A (en) * 2014-08-15 2014-11-05 华为技术有限公司 Data storage method, SDN controller and distributed network storage system
CN104820670A (en) * 2015-03-13 2015-08-05 国家电网公司 Method for acquiring and storing big data of power information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103384550B (en) * 2012-12-28 2016-05-25 华为技术有限公司 The method of storage data and device


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107704212A (en) * 2017-10-31 2018-02-16 紫光华山信息技术有限公司 A kind of data processing method and device
CN111414136A (en) * 2020-03-13 2020-07-14 苏州浪潮智能科技有限公司 Method, system, device and medium for creating storage pool
CN111414136B (en) * 2020-03-13 2023-01-06 苏州浪潮智能科技有限公司 Method, system, device and medium for creating storage pool
CN112260874A (en) * 2020-10-23 2021-01-22 南京鹏云网络科技有限公司 Management system and method based on distributed storage unit
CN112667577A (en) * 2020-12-25 2021-04-16 浙江大华技术股份有限公司 Metadata management method, metadata management system and storage medium
CN116560818A (en) * 2023-06-29 2023-08-08 深圳市易图资讯股份有限公司 Method and system for distributing and scheduling space data service
CN116560818B (en) * 2023-06-29 2023-09-12 深圳市易图资讯股份有限公司 Method and system for distributing and scheduling space data service

Also Published As

Publication number Publication date
WO2017113280A1 (en) 2017-07-06
CN107211003B (en) 2020-07-14

Similar Documents

Publication Publication Date Title
CN107211003A (en) Distributed storage system and method for managing metadata
JP4700459B2 (en) Data processing system, data management method, and storage system
KR102376713B1 (en) Composite partition functions
US10235047B2 (en) Memory management method, apparatus, and system
CN109299190B (en) Method and device for processing metadata of object in distributed storage system
CN104765661B (en) The multinode hot spare method of Metadata Service node in a kind of cloud storage service
US20130019087A1 (en) System structure management device, system structure management method, and program
CN111585887B (en) Communication method and device based on multiple networks, electronic equipment and storage medium
US20210132845A1 (en) Method for storage management, electronic device and computer program product
CN110751458A (en) Business approval method, device and system
CN111147312B (en) Resource allocation management method and device, resource allocation cache management method and device, and allocation management system
CN109726546A (en) A kind of right management method and device
JP2005208999A (en) Virtual machine management program
CN113535087B (en) Data processing method, server and storage system in data migration process
CN110609656B (en) Storage management method, electronic device, and computer program product
CN113419672B (en) Storage capacity management method, system and storage medium
US11429311B1 (en) Method and system for managing requests in a distributed system
CN111858188A (en) Method, apparatus and computer program product for storage management
CN111478953B (en) Self-construction method, device, system, equipment and storage medium of server cluster
CN112541039A (en) Database processing method and device, computer equipment and storage medium
CN115470303A (en) Database access method, device, system, equipment and readable storage medium
CN112685218B (en) Method, apparatus and computer program product for managing backup systems
CN110472167B (en) Data management method, device and computer readable storage medium
CN110058790B (en) Method, apparatus and computer program product for storing data
CN115269530A (en) Data synchronization method, electronic device and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220211

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Patentee after: Huawei Cloud Computing Technologies Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.
