CN103384267B

CN103384267B - A high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device

Info

Publication number: CN103384267B
Application number: CN201310226210.8A
Authority: CN
Inventors: 刘冠川; 秦东明; 杨亮; 曹振南; 王勇; 何牧君; 张新风; 陈飞; 刘超; 龚超; 明立波; 王慧; 吕永安
Original assignee: Dawning Information Industry Beijing Co Ltd
Current assignee: Dawning Information Industry Beijing Co Ltd; Dawning Information Industry Co Ltd
Priority date: 2013-06-07
Filing date: 2013-06-07
Publication date: 2017-09-01
Anticipated expiration: 2033-06-07
Also published as: CN103384267A

Abstract

The present invention relates to a high-availability method for ParaStor200 management nodes based on a distributed block device. The method is realized through two aspects: (1) synchronization of the storage-system information held on the management node; (2) management node failover. By making the ParaStor200 management node highly available, the invention gives ParaStor200 a fully redundant design in the complete sense: damage to any component in the system does not affect use of the storage system. When any component of the management node is damaged, service can be switched to the standby node within a few seconds. Normal use is therefore unaffected, and there is ample time to repair the fault.

Description

A high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device

Technical field

The present invention relates to a high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device.

Background technology

The ParaStor200 parallel storage system adopts a parallel architecture that represents the direction of development of storage technology, network communication technology, and data management technology. It is a high-end storage system with independent intellectual property rights, designed for processing massive unstructured data. It can provide TB/s-level bandwidth and EB-level storage space, and can meet the storage capacity and I/O performance demands of applications in fields such as aircraft, automobile, and ship design, biological gene research, materials science research, weather forecasting, seismic monitoring, environmental monitoring and analysis, energy exploration, e-commerce, online gaming, social networking and video-sharing website hosting, animation rendering, and video editing. It can be widely applied in industries such as government, education, scientific research, manufacturing, enterprise, healthcare, petroleum, broadcasting, and the Internet.

MGR denotes the ParaStor200 management node. It provides a unified control and management interface, through which the administrator manages the entire storage system.

oPara denotes the ParaStor200 metadata node. It manages all index data and the namespace of the storage system, presents a single global image externally, and supports multiple nodes working in Active-Active cluster mode.

oStor denotes the ParaStor200 data node. It provides data storage space, embeds a high-performance data access engine, processes the data access requests of all clients in parallel, and supports fault tolerance across multiple oStor nodes through replication (1 to 3 copies).

The ParaStor200 management node provides a unified control and management interface and holds the important topology and configuration information of the whole system; the administrator manages the entire storage system through this node. Within the storage system as a whole, the management node is used relatively infrequently: it is needed only for management operations such as mounting clients, checking storage system status, and adding or removing storage units. In a small cluster, management is usually simple and management operations are few, so the management node is of relatively low importance. Even if the management node fails, there is ample time to repair it, and even permanent damage to the management node's disk does not lead to catastrophic consequences, because the important information on the management node can be reconstructed from the configuration information on the metadata nodes and data nodes. Only some historical data and client authorization information are lost, which has little effect on the storage system. The current solution to this problem is to back up the management node's configuration information on a schedule through the management interface. When the management node fails, the management node's graphical interface program is installed on a standby node and the backed-up information is then imported to complete the recovery. Another technique attaches a shared disk array to the primary and standby management nodes through an optical fibre switch. When the primary management node fails, the standby management node obtains all of the management node's information by mounting the partition that holds the storage-system information.

The existing schemes carry several potential risks. First, however frequent the backups, the system configuration may still be modified between two backups. In particular, after operations such as adding or removing storage units or changing client authorization information, the restored information differs from the actual information, which affects normal operation of the system. Second, even if no information is lost, rebuilding a management node still takes a long time, which is clearly unacceptable for larger systems that have many users and require frequent management operations. A shared disk array solves these problems, but its cost is too high.

Summary of the invention

In view of the shortcomings of the prior art, the present invention provides a high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device. By making the ParaStor200 management node highly available, the invention gives ParaStor200 a fully redundant design in the complete sense: damage to any component in the system does not affect use of the storage system. When any component of the management node is damaged, service can be switched to the standby management node within a few seconds. Normal use is therefore unaffected, and there is ample time to repair the fault. Using distributed block device technology, true real-time synchronization can be achieved at very low cost, ensuring that the storage-system information on the primary and standby management nodes is completely consistent.

The object of the present invention is achieved by the following technical solution:

A high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device, the improvement being that the method is realized through the following two aspects:

(1) Synchronization of the management node's storage-system information files, realized using a distributed block device.

(2) Management node failover.

Wherein, in (1), synchronization of the management node's storage-system information means that whenever the storage-system information on the management node changes, the information under the corresponding directories on the primary and standby management nodes remains consistent.

Wherein, the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;

when data is written to the file system on the local host's distributed block device, the data is simultaneously sent to another, remote host on the network and recorded in that host's file system in the same form; the file system itself is also created through synchronization of the distributed block device;

the data write as a whole returns success only when both the remote host and the local host report that the write succeeded; when the primary management node fails, an identical copy of the data remains on the standby management node.
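The write rule described above (a write succeeds only once both the local and the remote copy acknowledge it) can be illustrated with a minimal sketch. The `BlockStore` class, the `replicated_write` function, and the node names are hypothetical stand-ins and are not part of the patent or of any real block-device driver; a real distributed block device operates below the file system and would also handle resynchronization, which this sketch omits.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlockStore:
    """Hypothetical in-memory stand-in for one node's block device."""
    name: str
    online: bool = True
    blocks: Dict[int, bytes] = field(default_factory=dict)

    def write(self, offset: int, data: bytes) -> bool:
        # A node that is down cannot acknowledge the write.
        if not self.online:
            return False
        self.blocks[offset] = data
        return True


def replicated_write(local: BlockStore, remote: BlockStore,
                     offset: int, data: bytes) -> bool:
    """Report success only when both copies acknowledge the write.

    Simplification: a failed remote write is reported but the local
    copy is not rolled back.
    """
    local_ok = local.write(offset, data)
    remote_ok = remote.write(offset, data)
    return local_ok and remote_ok


primary = BlockStore("mgr-primary")
standby = BlockStore("mgr-standby")
ok = replicated_write(primary, standby, 0, b"topology-v1")
```

After a successful call both stores hold the same block, which is the property that lets the standby continue with identical data when the primary fails.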

Wherein, in (2), a heartbeat mechanism is used to identify a failed management node: the online management node and the standby management node are connected by a heartbeat link over which each sends messages and monitors the other's responses, and pinging a third-party node is used to determine which management node has failed, after which failover is carried out automatically.
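One way to read the combination of heartbeat and third-party ping is sketched below. The patent does not spell out how the two signals are combined, so the decision table is an assumption: a lost heartbeat together with a reachable third-party node is taken to mean the peer has failed, while a lost heartbeat together with an unreachable third-party node is taken to mean the local node itself is cut off and should not take over. `Verdict` and `judge_peer` are hypothetical names.

```python
from enum import Enum


class Verdict(Enum):
    PEER_HEALTHY = "peer healthy"
    TAKE_OVER = "peer failed, take over"
    SELF_ISOLATED = "local node isolated, do not take over"


def judge_peer(heartbeat_ok: bool, third_party_ping_ok: bool) -> Verdict:
    """Combine the heartbeat result with a ping to a third-party node."""
    if heartbeat_ok:
        # The peer still answers, so nothing needs to move.
        return Verdict.PEER_HEALTHY
    if third_party_ping_ok:
        # Our own network works, so the silence points at the peer.
        return Verdict.TAKE_OVER
    # We cannot reach anyone: the fault is more likely local (assumption).
    return Verdict.SELF_ISOLATED
```

Using a third node as a tie-breaker is a common way to avoid both management nodes believing the other has failed when only the link between them is down.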

Wherein, failover is carried out together with the migration of resources and services, which include:

1) the management node's storage-system information files;

2) the management node's management IP;

3) the ParaStor200 management service and the ParaStor200 graphical interface service;

4) the data synchronization service.

Wherein, in 1), the management node's storage-system information file resources are synchronously backed up to the standby management node.

Wherein, in 2), the management node's management IP is the IP through which the management node sends management commands to the metadata nodes and data nodes; during failover, the management IP migrates from the online management node to the standby management node.

Wherein, in 3), the ParaStor200 management service and the ParaStor200 graphical interface service are switched from the online management node to the standby management node during failover.

Wherein, in 4), after the switch the standby management node becomes the primary management node (primary and standby are relative terms; whichever management node is online is the primary), and its information is in turn backed up to the original primary management node.
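The four resources and services listed above can be modelled as state that moves from one node to the other. The sketch below is illustrative only: `ManagementNode`, `fail_over`, the field names, and the node names are hypothetical, and the ordering of the steps is an assumption, since the patent lists the resources without fixing a sequence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ManagementNode:
    """Hypothetical model of the state that moves during failover."""
    name: str
    is_primary: bool = False
    has_management_ip: bool = False
    services_running: bool = False
    replication_target: Optional[str] = None


def fail_over(old_primary: ManagementNode,
              standby: ManagementNode) -> List[str]:
    """Move the four resources/services to the standby and log each step."""
    steps: List[str] = []

    # The failed node gives up everything it held.
    old_primary.is_primary = False
    old_primary.has_management_ip = False
    old_primary.services_running = False
    old_primary.replication_target = None

    # 1) Information files: already present through synchronous replication.
    standby.is_primary = True
    steps.append("use replicated storage-system information files")
    # 2) Management IP moves to the new primary.
    standby.has_management_ip = True
    steps.append("take over management IP")
    # 3) Management service and graphical interface service start here.
    standby.services_running = True
    steps.append("start management and graphical interface services")
    # 4) Data synchronization now runs in the opposite direction.
    standby.replication_target = old_primary.name
    steps.append("replicate back to " + old_primary.name)
    return steps


mgr_a = ManagementNode("mgr-a", True, True, True, "mgr-b")
mgr_b = ManagementNode("mgr-b")
steps = fail_over(mgr_a, mgr_b)
```

After the call, `mgr_b` holds the management IP and services and replicates toward `mgr-a`, which matches the remark that primary and standby are relative roles.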

Compared with the prior art, the beneficial effects of the present invention are:

The present invention provides a high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device. By making the ParaStor200 management node highly available, it gives ParaStor200 a fully redundant design in the complete sense: damage to any component in the system does not affect use of the storage system. When any component of the management node is damaged, service can be switched to the standby management node within a few seconds. Normal use is therefore unaffected, and there is ample time to repair the fault. Using distributed block device technology, true real-time synchronization can be achieved at very low cost, ensuring that the storage-system information on the primary and standby management nodes is completely consistent.

Embodiment

Embodiments of the present invention are described in further detail below.

The present invention aims to make the ParaStor200 management node highly available. Analysis of the problems in the prior art shows that the invention must solve two problems: (1) synchronization of the management node's storage-system information; (2) management node failover.

Solving synchronization of the management node's storage-system information files means ensuring that whenever those files change on the management node, the storage-system information files under the corresponding directory on the standby management node change at the same time, so that the information under the corresponding directories on the primary and standby nodes is completely consistent. This patent solves the problem with a distributed block device. A distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers. When data is written to the file system on the local distributed block device, it is simultaneously sent to another host on the network and recorded in that host's file system in the same form (in fact, the file system itself is also created through synchronization of the distributed block device). The write as a whole returns success only when both the remote host and the local host report success. The local node's data is therefore kept synchronized with the remote node's in real time, and I/O consistency is guaranteed. When the primary management node fails, an identical copy of the data remains on the standby management node and can continue to be used, which achieves high availability.
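The consistency requirement stated above (identical information under the corresponding directories on both nodes) can be checked with a simple digest comparison. The patent does not describe such a check; the sketch below is only an illustration of what "completely consistent" means in practice, and `directory_digest`, `in_sync`, and any directory paths passed to them are hypothetical.

```python
import hashlib
from pathlib import Path


def directory_digest(root: Path) -> str:
    """Hash every file's relative path and content under root."""
    h = hashlib.sha256()
    # Sorting makes the digest independent of directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        h.update(path.relative_to(root).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def in_sync(primary_dir: Path, standby_dir: Path) -> bool:
    """True when both directories hold the same files with the same content."""
    return directory_digest(primary_dir) == directory_digest(standby_dir)
```

With block-level replication such a check should not be needed during normal operation; it is mainly useful as a sanity test after a repair or resynchronization.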

For management node failover, the first problem to solve is how to detect a failure. A heartbeat mechanism is used here: the primary and standby management nodes are connected by a heartbeat link over which each sends messages and monitors the other's responses, and methods such as pinging a third-party node are used to determine which node has failed and to carry out failover automatically. Another important problem in failover is the migration of services and resources. In this invention the resources and services include: 1) The management node's storage-system information files, which are synchronously backed up to the standby node. 2) The management node's management IP. This IP is distinct from the IP of the network over which the two nodes synchronize files; it is the IP through which the management node sends management commands to the metadata nodes and data nodes. During failover it must migrate from the primary management node to the standby management node. 3) The ParaStor200 management service and the ParaStor200 graphical interface service, which are likewise switched from the primary management node to the standby node during failover. 4) The data synchronization service: after the switch, the standby management node becomes the primary management node and must in turn back up its information to the original primary management node.
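The heartbeat monitoring described above amounts to tracking when the peer was last heard from and flagging it once it has been silent for too long. The sketch below assumes a timeout-based rule, which the patent does not specify; `HeartbeatMonitor`, its methods, and the 3-second timeout are illustrative. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat from the peer management node."""

    def __init__(self, timeout_s: float, now: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self.last_seen = now

    def record_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat message or response arrives.
        self.last_seen = now

    def peer_silent(self, now: float) -> bool:
        # The peer is suspect once it has been quiet longer than the timeout.
        return now - self.last_seen > self.timeout_s


monitor = HeartbeatMonitor(timeout_s=3.0, now=0.0)
monitor.record_heartbeat(now=1.0)
```

A short timeout is what makes switching "within a few seconds" possible, at the price of more false alarms on a congested heartbeat link, which is one reason to confirm the verdict through a third-party node before taking over.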

Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently substituted, and that any modification or equivalent substitution that does not depart from the spirit and scope of the invention shall fall within the scope of the claims of the invention.

Claims

1. A high-availability method for ParaStor200 parallel storage management nodes based on a distributed block device, characterized in that the method is realized through the following two aspects:

(1) synchronization of the management node's storage-system information files, realized using a distributed block device;

(2) management node failover;

in (1), synchronization of the management node's storage-system information means that whenever the storage-system information on the management node changes, the information under the corresponding directories on the primary and standby management nodes remains consistent;

the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;

when data is written to the file system on the local host's distributed block device, the data is simultaneously sent to another, remote host on the network and recorded in that host's file system in the same form; the file system itself is created through synchronization of the distributed block device;

the data write as a whole returns success only when both the remote host and the local host report that the write succeeded; when the primary management node fails, an identical copy of the data remains on the standby management node;

in (2), a heartbeat mechanism is used to identify a failed management node: the online management node and the standby management node are connected by a heartbeat link over which each sends messages and monitors the other's responses, and pinging a third-party node is used to determine which management node has failed, after which failover is carried out automatically;

failover is carried out together with the migration of resources and services, which include:

1) the management node's storage-system information files;

2) the management node's management IP;

3) the ParaStor200 management service and the ParaStor200 graphical interface service;

4) the data synchronization service;

in 1), the management node's storage-system information file resources are synchronously backed up to the standby management node;

in 2), the management node's management IP is the IP through which the management node sends management commands to the metadata nodes and data nodes, and during failover the management IP migrates from the online management node to the standby management node;

in 3), the ParaStor200 management service and the ParaStor200 graphical interface service are switched from the online management node to the standby management node during failover;

in 4), after the switch the standby management node becomes the primary management node, and its information is in turn backed up to the original primary management node.