CN103384267B - High-availability method for Parastor200 parallel storage management nodes based on a distributed block device - Google Patents

High-availability method for Parastor200 parallel storage management nodes based on a distributed block device

Info

Publication number
CN103384267B
CN103384267B CN201310226210.8A CN201310226210A CN103384267B CN 103384267 B CN103384267 B CN 103384267B CN 201310226210 A CN201310226210 A CN 201310226210A CN 103384267 B CN103384267 B CN 103384267B
Authority
CN
China
Prior art keywords
management node
management
node
standby
realized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310226210.8A
Other languages
Chinese (zh)
Other versions
CN103384267A (en)
Inventor
刘冠川
秦东明
杨亮
曹振南
王勇
何牧君
张新风
陈飞
刘超
龚超
明立波
王慧
吕永安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201310226210.8A
Publication of CN103384267A
Application granted
Publication of CN103384267B
Active
Anticipated expiration

Abstract

The present invention relates to a high-availability method for Parastor200 management nodes based on a distributed block device. The method is realized through two aspects: (1) synchronization of the storage system information held by the management node; and (2) management node failover. By making the Parastor200 management node highly available, the invention gives Parastor200 a fully redundant design in which damage to any single component does not affect use of the storage system. If any component of the management node is damaged, its services can be switched to the standby node within a few seconds. Normal use is therefore unaffected, and there is sufficient time to repair the fault.

Description

A kind of Parastor200 parallel memorizings management node based on distributed block equipment is high Methods availalbe
Technical field
The present invention relates to a kind of Parastor200 parallel memorizing management nodes High Availabitity side based on distributed block equipment Method.
Background technology
ParaStor200 parallel memory systems, which are employed, represents memory technology, the network communications technology and data management skill The parallel architecture framework of art developing direction, is a height for handling, possessing independent intellectual property right towards magnanimity unstructured data Hold storage system.It can provide TB/s grades of high speed bandwidth and EB grades of massive storage space, disclosure satisfy that aircraft automobile ship Oceangoing ship design, biological gene research, material science research, weather forecast, seismic monitoring, environmental monitoring and analysis, energy exploration, electronics Hold in the fields such as commercial affairs, online game, social and video sharing Web Hosting, animation rendering, video editing processing for storage Amount and the high application of I/O performance requirements, can be widely applied to government, education, scientific research, manufacture, enterprise, medical treatment, oil, wide The industries such as electricity, internet.
MGR represents Parastor200 management node there is provided unified control administration interface, and keeper passes through the node Manage whole storage system.
OPara represents Parastor200 metadata nodes, and all index datas and name for managing storage system are empty Between, single global image is externally provided, supports multiple nodes to be worked with Active-Active cluster modes.
OStor represents Parastor200 back end, for providing data space, embeds high-performance data access Engine, the data access request of parallel processing all clients supports multiple oStor in copy mode(1-3 copy)It is fault-tolerant.
Parastor200 management node is there is provided unified control administration interface, and its in store whole system is important to open up Structure and configuration information are flutterred, keeper passes through the whole storage system of the node administration.In whole storage system, management node Usage frequency is relatively low, only when carry client, checks storage system status, addition memory cell, deletion memory cell Management node can be just used when being operated Deng management.Generally manage relatively simple in small-scale cluster, management operation is also fewer, Now the importance of management node is relatively low, even if management node breaks down, and we also have the sufficient time to go to repair pipe Node is managed, even if management node disk permanent damages occur also is unlikely to catastrophic effect occur, because we can pass through member The important information that configuration information on back end, back end comes on reconfiguration management node.And simply some history lost Data and client authorization information, will not cause too much influence to storage system.At present, for the solution of this problem It is that, by administration interface schedule backup management node configuration information, when management node breaks down, secondary node can be used Management node diagram shape interface program is installed, is then introduced into the information of backup to complete.It is exactly using altogether to also have a kind of technology in addition Disk battle array is enjoyed to be mounted in active and standby management node by optical fiber switch.When main management node breaks down, standby management node passes through Carry preserves all information of the subregion acquisition storage management node of storing system information.
Existing scheme has several potential risks.First, even if frequency of your backups is higher, but it can not avoid backing up twice Between the possibility that is modified of system configuration.Particularly carry out increasing or reduced memory cell, change client authorization information etc. Operation, the information and real information after recovery is different, it will the normal operation of influence system.Secondly, even without any letter Breath is lost, time that one management node of reconstruct expends or long, larger for those, user it is more, it is necessary to The system for being often managed operation is clearly unacceptable.Problem above, but share dish can be solved using share dish battle array The cost of battle array is too high.
Summary of the invention
In view of the shortcomings of the prior art, the present invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, the invention gives Parastor200 a fully redundant design in which damage to any single component does not affect use of the storage system. If any component of the management node is damaged, its services can be switched to the standby management node within a few seconds. Normal use is therefore unaffected, and there is sufficient time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost and ensures that the storage system information on the primary and standby management nodes is fully consistent.
The purpose of the present invention is achieved by the following technical solution:
A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, the improvement being that the method is realized through the following two aspects:
(1) Synchronization of the management node's storage system information files, realized using a distributed block device.
(2) Management node failover.
In (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information in the corresponding directories of the primary and standby management nodes remains consistent.
The distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;
When data is written to the file system on the distributed block device of the local host, the data is simultaneously sent to a remote host on the network and recorded in its file system in the same form; the file system itself is also created through synchronization of the distributed block device;
The write operation as a whole returns success only when both the remote host and the local host report a successful write; when the primary management node fails, an identical copy of the data remains on the standby management node.
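The acknowledgement rule described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `BlockStore` class, the node names and the in-memory dictionary standing in for a block device are all assumptions made for illustration. The sketch shows only that a write is reported as successful when both the local and the remote copy have been written.

```python
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """In-memory stand-in for one node's block device (hypothetical)."""
    name: str
    healthy: bool = True
    blocks: dict = field(default_factory=dict)

    def write(self, offset: int, data: bytes) -> bool:
        # Return True only if the block was stored on this node.
        if not self.healthy:
            return False
        self.blocks[offset] = data
        return True


def mirrored_write(local: BlockStore, remote: BlockStore,
                   offset: int, data: bytes) -> bool:
    """Report success only when both the local and the remote write succeed."""
    local_ok = local.write(offset, data)
    remote_ok = remote.write(offset, data)
    return local_ok and remote_ok


primary = BlockStore("mgr-primary")
standby = BlockStore("mgr-standby")
ok = mirrored_write(primary, standby, 0, b"topology-v1")
```

When the remote write fails, the sketch simply reports failure. A real replication layer would presumably also need to track and resynchronize the blocks that diverged, which the source does not describe.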
In (2), a heartbeat mechanism is used to identify the failed management node: the online management node and the standby management node exchange heartbeat messages over a heartbeat link and monitor each other's responses, and, in combination with pinging a third-party node, the failed management node is identified and failover is performed automatically.
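One plausible reading of how the heartbeat and the third-party ping combine is sketched below in Python. The source does not spell out the decision rule, so the rule used here is an assumption: the standby takes over only when the peer's heartbeat has timed out and the standby can still reach the third-party node, which suggests the peer, not the standby's own network, has failed. The function name, parameters and timeout values are hypothetical.

```python
def should_take_over(seconds_since_heartbeat: float,
                     heartbeat_timeout: float,
                     third_party_reachable: bool) -> bool:
    """Decide, from the standby's point of view, whether to start failover.

    Assumed rule (not stated in the source): heartbeat loss alone is not
    enough, because the standby itself might be the isolated node.
    """
    heartbeat_lost = seconds_since_heartbeat > heartbeat_timeout
    return heartbeat_lost and third_party_reachable


# Three representative situations with a 5-second timeout.
decisions = {
    "peer healthy": should_take_over(1.0, 5.0, True),
    "peer silent, third party reachable": should_take_over(12.0, 5.0, True),
    "peer silent, third party unreachable": should_take_over(12.0, 5.0, False),
}
```

Under this assumed rule, only the second situation triggers a takeover; in the third, the standby treats itself as the isolated node and stays passive.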
Failover is carried out together with the migration of resources and services, which include:
1) Management node storage system information files;
2) Management node management IP;
3) Parastor200 management service and Parastor200 graphical interface service;
4) Data synchronization service.
In 1), the management node's storage system information file resources are synchronously backed up to the standby management node.
In 2), the management IP is the IP through which the management node sends management commands to the metadata nodes and data nodes; during failover it is migrated from the online management node to the standby management node.
In 3), the Parastor200 management service and the Parastor200 graphical interface service are switched from the online management node to the standby management node during failover.
In 4), after the switch the standby management node becomes the primary management node (primary and standby are relative terms; whichever management node is online is the primary), and its information is in turn backed up to the original primary management node.
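The four resource groups listed above can be pictured as a takeover sequence. The Python sketch below is illustrative only: the `ManagementNode` record, the service identifiers and the order of the steps are assumptions, since the source lists the resources but does not prescribe data structures, names or ordering.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementNode:
    """Hypothetical record of the resources one management node holds."""
    name: str
    role: str
    has_management_ip: bool = False
    services: set = field(default_factory=set)
    sync_target: str = ""


# Hypothetical identifiers; the source names the services but not their labels.
SERVICES = ("parastor200-management", "parastor200-gui")


def fail_over(old_primary: ManagementNode, standby: ManagementNode) -> list:
    """Move the four resource groups to the standby and return the steps taken."""
    steps = []

    # 1) The information files are already mirrored, so the standby only
    #    needs to start treating its own copy as the authoritative one.
    standby.role, old_primary.role = "primary", "standby"
    steps.append("promote replicated information files")

    # 2) Move the management IP used to reach metadata and data nodes.
    old_primary.has_management_ip = False
    standby.has_management_ip = True
    steps.append("migrate management IP")

    # 3) Switch the management and graphical interface services.
    old_primary.services.clear()
    standby.services.update(SERVICES)
    steps.append("start management and GUI services")

    # 4) Reverse the synchronization direction toward the original primary.
    old_primary.sync_target = ""
    standby.sync_target = old_primary.name
    steps.append("reverse data synchronization")
    return steps


mgr_a = ManagementNode("mgr-a", "primary", True, set(SERVICES), "mgr-b")
mgr_b = ManagementNode("mgr-b", "standby")
steps = fail_over(mgr_a, mgr_b)
```

After the call, `mgr_b` holds the management IP and both services and replicates back toward `mgr-a`, mirroring the relative primary/standby roles described in 4).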
Compared with the prior art, the present invention achieves the following beneficial effects:
The present invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, it gives Parastor200 a fully redundant design in which damage to any single component does not affect use of the storage system. If any component of the management node is damaged, its services can be switched to the standby management node within a few seconds. Normal use is therefore unaffected, and there is sufficient time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost and ensures that the storage system information on the primary and standby management nodes is fully consistent.
Detailed description of the embodiments
Embodiments of the present invention are described in further detail below.
The present invention aims to make the Parastor200 management node highly available. Analysis of the problems in the prior art shows that the invention must solve two problems: (1) synchronization of the management node's storage system information; and (2) management node failover.
Solving synchronization of the management node's storage system information files means ensuring that whenever a storage system information file on the management node changes in any way, the corresponding file in the matching directory on the standby management node is changed at the same time, so that the information in the corresponding directories on the primary and standby nodes is fully consistent. This patent solves the problem with a distributed block device, a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers. When data is written to the file system on the local distributed block device, it is simultaneously sent to another host on the network and recorded in that host's file system in the same form (in fact, the file system itself is also created through synchronization of the distributed block device). The write operation as a whole returns success only when both the remote host and the local host report a successful write. The local node's data is therefore kept synchronized with the remote node's in real time, and I/O consistency is guaranteed. When the primary management node fails, an identical copy of the data remains on the standby management node and can continue to be used, which achieves high availability.
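The consistency requirement on the corresponding directories can be checked with a simple digest comparison. The Python sketch below is a verification aid written for illustration, not part of the patented method, which replicates at block level. The function name `tree_digest`, the file name `topology.conf` and its contents are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash relative paths and file contents under root in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    primary_dir, standby_dir = Path(a), Path(b)
    (primary_dir / "topology.conf").write_text("units=4\n")
    (standby_dir / "topology.conf").write_text("units=4\n")
    in_sync = tree_digest(primary_dir) == tree_digest(standby_dir)

    # Simulate a change that reached only the primary copy.
    (primary_dir / "topology.conf").write_text("units=5\n")
    diverged = tree_digest(primary_dir) != tree_digest(standby_dir)
```

Equal digests indicate that the two directories hold the same files with the same contents; a mismatch corresponds to the kind of divergence that backups taken at intervals cannot rule out.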
For management node failover, the first problem to solve is how to detect a failure. A heartbeat mechanism is used here: the management node and the standby management node exchange heartbeat messages over a heartbeat link and monitor each other's responses, and, together with measures such as pinging a third-party node, the failed node is identified and failover is performed automatically. Failover must also solve a second important problem, the migration of services and resources. In the present invention the resources and services include: 1) The management node's storage system information files, which are synchronously backed up to the standby node. 2) The management node's management IP. This IP is distinct from the IP of the network over which the two nodes synchronize files; it is the IP through which the management node sends management commands to the metadata nodes and data nodes, and during failover it must be migrated from the primary management node to the standby management node. 3) The Parastor200 management service and the Parastor200 graphical interface service, which are likewise switched from the management node to the standby node during failover. 4) The data synchronization service: after the switch, the standby management node becomes the primary management node and must in turn back up its information to the original primary management node.
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments may still be modified or equivalently substituted, and that any modification or equivalent substitution that does not depart from the spirit and scope of the invention shall fall within the scope of its claims.

Claims (1)

1. A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, characterized in that the method is realized through the following two aspects:
(1) synchronization of the management node's storage system information files, realized using a distributed block device;
(2) management node failover;
in (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information in the corresponding directories of the primary and standby management nodes remains consistent;
the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;
when data is written to the file system on the distributed block device of the local host, the data is simultaneously sent to a remote host on the network and recorded in its file system in the same form; the file system is created through synchronization of the distributed block device;
the write operation as a whole returns success only when both the remote host and the local host report a successful write; when the primary management node fails, an identical copy of the data remains on the standby management node;
in (2), a heartbeat mechanism is used to identify the failed management node: the online management node and the standby management node exchange heartbeat messages over a heartbeat link and monitor each other's responses, and, in combination with pinging a third-party node, the failed management node is identified and failover is performed automatically;
failover is carried out together with the migration of resources and services, which include:
1) management node storage system information files;
2) management node management IP;
3) Parastor200 management service and Parastor200 graphical interface service;
4) data synchronization service;
in 1), the management node's storage system information file resources are synchronously backed up to the standby management node;
in 2), the management IP is the IP through which the management node sends management commands to the metadata nodes and data nodes, and during failover it is migrated from the online management node to the standby management node;
in 3), the Parastor200 management service and the Parastor200 graphical interface service are switched from the online management node to the standby management node during failover;
in 4), after the switch the standby management node becomes the primary management node, and its information is in turn backed up to the original primary management node.
CN201310226210.8A 2013-06-07 2013-06-07 High-availability method for Parastor200 parallel storage management nodes based on a distributed block device Active CN103384267B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310226210.8A CN103384267B (en) 2013-06-07 2013-06-07 High-availability method for Parastor200 parallel storage management nodes based on a distributed block device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310226210.8A CN103384267B (en) 2013-06-07 2013-06-07 High-availability method for Parastor200 parallel storage management nodes based on a distributed block device

Publications (2)

Publication Number Publication Date
CN103384267A (en) 2013-11-06
CN103384267B (en) 2017-09-01

Family

ID=49491958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310226210.8A Active CN103384267B (en) 2013-06-07 2013-06-07 High-availability method for Parastor200 parallel storage management nodes based on a distributed block device

Country Status (1)

Country Link
CN (1) CN103384267B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631903B (en) * 2013-11-22 2017-09-01 曙光信息产业股份有限公司 A database data synchronization system
CN105516365A (en) * 2016-01-22 2016-04-20 浪潮电子信息产业股份有限公司 Network-based method for managing a distributed mirrored storage block device
CN107256131B (en) * 2017-06-15 2019-10-01 深圳市云舒网络技术有限公司 Performance optimization method for a distributed block storage system based on TCMU virtual disks
CN116185697B (en) * 2023-05-04 2023-08-04 苏州浪潮智能科技有限公司 Container cluster management method, device and system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102170460A (en) * 2011-03-10 2011-08-31 浪潮(北京)电子信息产业有限公司 Cluster storage system and data storage method thereof
CN103095837A (en) * 2013-01-18 2013-05-08 浪潮电子信息产业股份有限公司 Method achieving lustre metadata server redundancy

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
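Analysis and improvement of a distributed block device replication system; 陈嘉迅, 刘晓洁; 《计算机工程与设计》; 20120319; Vol. 32 (No. 11); pp. 3599-3601, 3806 *
Introduction to and application of the cluster file system Lustre; 马艳军, et al.; 《科技信息》; 20120531 (No. 5); pp. 139-140 *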

Also Published As

Publication number Publication date
CN103384267A (en) 2013-11-06

Similar Documents

Publication Publication Date Title
US10896104B2 (en) Heartbeat monitoring of virtual machines for initiating failover operations in a data storage management system, using ping monitoring of target virtual machines
KR101547719B1 (en) Maintaining data integrity in data servers across data centers
US9529883B2 (en) Maintaining two-site configuration for workload availability between sites at unlimited distances for products and services
US10084858B2 (en) Managing continuous priority workload availability and general workload availability between sites at unlimited distances for products and services
US9280396B2 (en) Lock state synchronization for non-disruptive persistent operation
US20160246865A1 (en) Zero-data loss recovery for active-active sites configurations
DE102021113808A1 (en) Handling replications between different networks
CN107430606B (en) Message broker system with parallel persistence
CN106502823A (en) data cloud backup method and system
CN103457775B A role-based highly available virtual machine pool management system
CN108604164A Synchronous replication for storage area network protocol storage
JP7389793B2 (en) Methods, devices, and systems for real-time checking of data consistency in distributed heterogeneous storage systems
CN103384266B High-availability method for Parastor200 management nodes based on file-level real-time synchronization
CN103384267B High-availability method for Parastor200 parallel storage management nodes based on a distributed block device
US10852985B2 (en) Persistent hole reservation
US20120278817A1 (en) Event distribution pattern for use with a distributed data grid
CN102833580A (en) High-definition video application system and method based on infiniband
US20150317223A1 (en) Method and system for handling failures by tracking status of switchover or switchback
US20200301948A1 (en) Timestamp consistency for synchronous replication
CN109739435A (en) File storage and update method and device
US9367413B2 (en) Detecting data loss during site switchover
CN116389233A (en) Container cloud management platform active-standby switching system, method and device and computer equipment
Yadav et al. Fault tolerant algorithm for Replication Management in distributed cloud system
Bartkowski et al. High availability and disaster recovery options for DB2 for Linux, UNIX, and Windows
Hu et al. Research on the Architecture of Cloud Host Autonomous Backup System in a Cloud Data Center

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220725

Address after: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: DAWNING INFORMATION INDUSTRY Co.,Ltd.

Address before: 100193 No.36 Zhongguancun Software Park, No.8 Dongbeiwang West Road, Haidian District, Beijing

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.