CN103384267B - High-availability method for Parastor200 parallel storage management nodes based on a distributed block device - Google Patents
- Publication number
- CN103384267B (application number CN201310226210.8A)
- Authority
- CN
- China
- Prior art keywords
- management node
- management
- node
- standby
- realized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention relates to a high-availability method for Parastor200 management nodes based on a distributed block device. The method consists of two parts: (1) synchronization of the storage system information held by the management node; (2) management node failover. By making the Parastor200 management node highly available, the invention gives Parastor200 a fully redundant design in which damage to any single component does not prevent use of the storage system. If any component of the management node is damaged, its services can be switched to the standby node within a few seconds, so normal use is unaffected and there is ample time to repair the fault.
Description
Technical field
The present invention relates to a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device.
Background art
The ParaStor200 parallel storage system adopts a parallel architecture that represents the direction of development in storage, network communication, and data management technology. It is a high-end storage system with independent intellectual property rights, designed for processing massive unstructured data. It can provide TB/s-level bandwidth and EB-level storage capacity, and can meet the demands of applications with high storage capacity and I/O performance requirements in fields such as aircraft, automobile, and ship design, biological gene research, materials science research, weather forecasting, seismic monitoring, environmental monitoring and analysis, energy exploration, e-commerce, online games, social and video-sharing web hosting, animation rendering, and video editing. It is widely applicable in industries such as government, education, scientific research, manufacturing, enterprise, healthcare, petroleum, broadcasting and television, and the internet.
MGR denotes the Parastor200 management node. It provides a unified control and management interface through which the administrator manages the whole storage system.
oPara denotes the Parastor200 metadata node. It manages all index data and the namespace of the storage system, presents a single global image externally, and supports multiple nodes working in Active-Active cluster mode.
oStor denotes the Parastor200 data node. It provides data storage space and embeds a high-performance data access engine that processes the data access requests of all clients in parallel. Multiple oStor nodes support fault tolerance in replica mode (1-3 replicas).
The Parastor200 management node provides a unified control and management interface and stores the important topology and configuration information of the whole system; the administrator manages the whole storage system through this node. Within the storage system, the management node is used relatively infrequently: it is needed only for management operations such as mounting clients, checking storage system status, and adding or deleting storage units. In a small cluster, management is usually simple and management operations are few, so the management node is of relatively low importance. Even if the management node fails, there is ample time to repair it, and even permanent damage to the management node's disk does not lead to catastrophic consequences, because the important information on the management node can be reconstructed from the configuration information on the metadata nodes and data nodes. Only some historical data and client authorization information are lost, which has little effect on the storage system. At present, one solution to this problem is to back up the management node configuration information periodically through the management interface; when the management node fails, the management node graphical interface program is installed on a standby node and the backed-up information is imported. Another technique mounts a shared disk array on the primary and standby management nodes through a Fibre Channel switch. When the primary management node fails, the standby management node obtains all the information of the management node by mounting the partition that holds the storage system information.
The existing schemes carry several potential risks. First, however frequent the backups, the system configuration may still be modified between two backups. In particular, after operations such as adding or removing storage units or changing client authorization information, the restored information differs from the actual information, which affects the normal operation of the system. Second, even if no information is lost, rebuilding a management node takes a long time, which is clearly unacceptable for large systems with many users that require frequent management operations. A shared disk array solves both problems, but its cost is too high.
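The first risk, configuration changes made between two backups, can be illustrated with a small simulation. This is only a sketch: the `ManagementNode` class, the configuration keys, and the unit and client names are hypothetical and are not taken from Parastor200.

```python
import copy


class ManagementNode:
    """Hypothetical management node that keeps its configuration in memory."""

    def __init__(self):
        self.config = {"storage_units": ["unit-1"], "authorized_clients": []}
        self.last_backup = None

    def backup(self):
        # Periodic backup: snapshot the configuration as it is right now.
        self.last_backup = copy.deepcopy(self.config)

    def restore_on_standby(self):
        # A standby node can only import what the last backup captured.
        return copy.deepcopy(self.last_backup)


node = ManagementNode()
node.backup()                                          # scheduled backup runs
node.config["storage_units"].append("unit-2")          # change after the backup
node.config["authorized_clients"].append("client-a")   # another later change

# The primary fails before the next scheduled backup.
restored = node.restore_on_standby()
lost_units = [u for u in node.config["storage_units"]
              if u not in restored["storage_units"]]
```

In this run the restored configuration is missing the storage unit and the client authorization added after the last backup, which is the mismatch between restored and actual information described above.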
Summary of the invention
In view of the shortcomings of the prior art, the present invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, the invention gives Parastor200 a fully redundant design in which damage to any single component does not prevent use of the storage system. If any component of the management node is damaged, its services can be switched to the standby management node within a few seconds, so normal use is unaffected and there is ample time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost and ensures that the storage system information on the primary and standby management nodes is fully consistent.
The object of the present invention is achieved by the following technical solution:
A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, the improvement being that the method is realized through the following two aspects:
(1) Synchronization of the management node's storage system information files, implemented with a distributed block device.
(2) Management node failover.
Wherein, in (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information under the corresponding directories of the primary and standby management nodes remains consistent.
Wherein, the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;
When data is written to the file system on the local host's distributed block device, the data is simultaneously sent over the network to a remote host and recorded in its file system in the same form; the creation of the file system is itself carried out through distributed block device synchronization;
The write as a whole returns success only when both the remote host and the local host report a successful write; when the primary management node fails, an identical copy of the data remains on the standby management node.
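The write rule described above (success only when both the local and the remote host acknowledge the write) can be sketched as follows. This is a minimal in-memory illustration, not the product's implementation: `BlockStore`, `MirroredDevice`, the 4096-byte block size, and the `healthy` flag are all assumptions, and the sketch does not roll back a local write when the remote write fails.

```python
class BlockStore:
    """In-memory stand-in for one host's block device (hypothetical)."""

    def __init__(self, num_blocks, block_size=4096, healthy=True):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.healthy = healthy

    def write_block(self, index, data):
        # An unhealthy host never acknowledges the write.
        if not self.healthy:
            return False
        if len(data) != self.block_size or not 0 <= index < len(self.blocks):
            raise ValueError("bad block write")
        self.blocks[index] = bytes(data)
        return True


class MirroredDevice:
    """Sends every write to both hosts and reports success only if both succeed."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def write(self, index, data):
        local_ok = self.local.write_block(index, data)
        remote_ok = self.remote.write_block(index, data)
        # The caller sees success only when both copies were written.
        return local_ok and remote_ok


primary = BlockStore(4)
standby = BlockStore(4)
device = MirroredDevice(primary, standby)
first_write_ok = device.write(1, b"\xab" * 4096)
```

Because the caller is told a write succeeded only after both copies exist, every acknowledged write is already present on the standby when the primary fails.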
Wherein, in (2), a heartbeat mechanism is used to identify the failed management node: the online management node and the standby management node are connected by a heartbeat link over which each sends messages and monitors the other's responses, and, together with pinging a third-party node, this identifies the failed management node and triggers failover automatically.
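One way to combine the heartbeat result with the third-party ping is sketched below. The text states only that both signals are used to identify the failed node; the specific decision table here (treating an unreachable third party as a sign that the standby itself is isolated) is an assumption, and the `Action` names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    STAY_STANDBY = "stay_standby"   # peer is healthy, nothing to do
    TAKE_OVER = "take_over"         # peer judged failed, start failover
    HOLD = "hold"                   # fault is probably local, do not take over


def decide(peer_heartbeat_ok: bool, third_party_reachable: bool) -> Action:
    """Decide what the standby management node should do (assumed policy)."""
    if peer_heartbeat_ok:
        return Action.STAY_STANDBY
    if third_party_reachable:
        # The peer is silent but our own network path works:
        # treat the peer as the failed management node.
        return Action.TAKE_OVER
    # The peer is silent and the third party is unreachable too:
    # this node is likely the isolated one, so it must not take over.
    return Action.HOLD
```

The third-party check is what keeps a standby with a broken network link from promoting itself while the primary is still serving.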
Wherein, failover is carried out together with the migration of resources and services; the resources and services include:
1) the management node's storage system information files;
2) the management node's management IP;
3) the Parastor200 management service and the Parastor200 graphical interface service;
4) the data synchronization service.
Wherein, in 1), the management node's storage system information file resources are synchronously backed up to the standby management node.
Wherein, in 2), the management IP is the IP over which the management node sends management commands to the metadata nodes and data nodes; during failover the management IP migrates from the online management node to the standby management node.
Wherein, in 3), the Parastor200 management service and the Parastor200 graphical interface service are switched from the online management node to the standby management node during failover.
Wherein, in 4), after the switch the standby management node becomes the primary management node (primary and standby are relative concepts; the online management node is the primary management node), and its information is in turn backed up to the original primary management node.
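The four resources and services listed above can be sketched as a single failover routine. This is an illustrative model only: the `Node` fields, the service names in `SERVICES`, and the node names are hypothetical, and a real deployment would act on network interfaces and system services rather than on in-memory objects.

```python
from dataclasses import dataclass, field
from typing import Optional

SERVICES = {"parastor-management", "parastor-gui"}  # hypothetical service names


@dataclass
class Node:
    name: str
    role: str                                   # "primary" or "standby"
    has_management_ip: bool = False
    services: set = field(default_factory=set)
    replicates_to: Optional[str] = None         # target of the synchronization service


def fail_over(failed: Node, standby: Node) -> list:
    """Move every resource and service from the failed primary to the standby."""
    steps = []
    # 1) The information files are already on the standby (synchronous backup),
    #    so promotion only swaps the roles.
    failed.role, standby.role = "standby", "primary"
    steps.append("promote replicated information files")
    # 2) The management IP follows the new primary.
    failed.has_management_ip = False
    standby.has_management_ip = True
    steps.append("migrate management IP")
    # 3) Management and graphical interface services move over.
    failed.services -= SERVICES
    standby.services |= SERVICES
    steps.append("switch management and GUI services")
    # 4) Replication now runs from the new primary back to the old one.
    failed.replicates_to = None
    standby.replicates_to = failed.name
    steps.append("reverse data synchronization")
    return steps


mgr_a = Node("mgr-a", "primary", True, set(SERVICES), "mgr-b")
mgr_b = Node("mgr-b", "standby")
steps = fail_over(mgr_a, mgr_b)
```

After the call, `mgr_b` holds the management IP and both services, and the synchronization target points back at `mgr-a`, matching item 4).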
Compared with the prior art, the beneficial effects of the present invention are:
The present invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, it gives Parastor200 a fully redundant design in which damage to any single component does not prevent use of the storage system. If any component of the management node is damaged, its services can be switched to the standby management node within a few seconds, so normal use is unaffected and there is ample time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost and ensures that the storage system information on the primary and standby management nodes is fully consistent.
Detailed description
Specific embodiments of the present invention are described in further detail below.
The present invention aims to make the Parastor200 management node highly available. From the analysis of the problems in the prior art, the invention must solve two problems: (1) synchronization of the management node's storage system information; (2) management node failover.
Solving synchronization of the management node's storage system information files means ensuring that whenever a storage system information file on the management node changes in any way, the corresponding file under the matching directory on the standby management node changes at the same time, so that the information under the corresponding directories of the primary and standby nodes is fully consistent. This patent solves the problem with a distributed block device. A distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers. When data is written to the file system on the local distributed block device, it is simultaneously sent to another host on the network and recorded in that host's file system in the same form (in fact, the creation of the file system is also carried out through distributed block device synchronization). The write as a whole returns success only when both the remote host and the local host report a successful write. The local node and the remote node are therefore kept synchronized in real time, and I/O consistency is guaranteed. When the primary management node fails, an identical copy of the data remains on the standby management node and can continue to be used, which achieves high availability.
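The consistency requirement (identical information under the corresponding directories of both nodes) can be checked with a simple digest comparison. This check is a verification aid assumed here for illustration; it is not part of the described method, and the function names are hypothetical.

```python
import hashlib
import os


def tree_digest(root):
    """Return {relative_path: SHA-256 hex digest} for every file under root."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def is_consistent(primary_dir, standby_dir):
    """True when both directories hold the same files with the same content."""
    return tree_digest(primary_dir) == tree_digest(standby_dir)
```

Calling `is_consistent` on the information directory of each node returns `True` only when file names and contents match exactly; empty directories are ignored by this sketch.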
For management node failover, the first problem to solve is how to detect a failure. Here a heartbeat mechanism is used: the management node and the standby management node are connected by a heartbeat link over which each sends messages and monitors the other's responses, and, together with methods such as pinging a third-party node, this identifies the failed node and triggers failover automatically. The other important problem in failover is the migration of services and resources. In this invention the resources and services include: 1) The management node's storage system information files, which are synchronously backed up to the standby node. 2) The management node's management IP. This IP is distinct from the IP of the network used to synchronize files between the two nodes; it is the IP over which the management node sends management commands to the metadata nodes and data nodes. During failover it must migrate from the primary management node to the standby management node. 3) The Parastor200 management service and the Parastor200 graphical interface service, which are likewise switched from the management node to the standby node during failover. 4) The data synchronization service: after the switch the standby management node becomes the primary management node and must in turn back up its information to the original primary management node.
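Heartbeat monitoring of the peer can be sketched as a timeout on the last message received. The class below is a hypothetical illustration: the 3-second timeout is an assumed value chosen to fit the "within a few seconds" goal, and timestamps are passed in explicitly so the logic stays deterministic.

```python
class HeartbeatMonitor:
    """Tracks when the peer was last heard from (hypothetical sketch)."""

    def __init__(self, started_at, timeout_s=3.0):
        self.timeout_s = timeout_s
        # Until the first heartbeat arrives, count from the start time.
        self.last_seen = started_at

    def record_heartbeat(self, now):
        # Called whenever a heartbeat message or response arrives from the peer.
        self.last_seen = now

    def peer_alive(self, now):
        # The peer is considered alive while the silence is within the timeout.
        return (now - self.last_seen) <= self.timeout_s


monitor = HeartbeatMonitor(started_at=0.0, timeout_s=3.0)
monitor.record_heartbeat(10.0)
alive_soon_after = monitor.peer_alive(12.0)
alive_much_later = monitor.peer_alive(13.5)
```

A shorter timeout detects failures faster but raises the chance of a false alarm on a briefly congested link, which is one reason the text pairs the heartbeat with a ping to a third-party node.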
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently substituted, and any modification or equivalent substitution that does not depart from the spirit and scope of the invention shall fall within the scope of the claims of the invention.
Claims (1)
1. A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, characterized in that the method is realized through the following two aspects:
(1) synchronization of the management node's storage system information files, implemented with a distributed block device;
(2) management node failover;
in (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information under the corresponding directories of the primary and standby management nodes remains consistent;
the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;
when data is written to the file system on the local host's distributed block device, the data is simultaneously sent over the network to a remote host and recorded in its file system in the same form; the creation of the file system is carried out through distributed block device synchronization;
the write as a whole returns success only when both the remote host and the local host report a successful write; when the primary management node fails, an identical copy of the data remains on the standby management node;
in (2), a heartbeat mechanism is used to identify the failed management node, that is, the online management node and the standby management node are connected by a heartbeat link over which each sends messages and monitors the other's responses, and, together with pinging a third-party node, this identifies the failed management node and triggers failover automatically;
failover is carried out together with the migration of resources and services; the resources and services include:
1) the management node's storage system information files;
2) the management node's management IP;
3) the Parastor200 management service and the Parastor200 graphical interface service;
4) the data synchronization service;
in 1), the management node's storage system information file resources are synchronously backed up to the standby management node;
in 2), the management IP is the IP over which the management node sends management commands to the metadata nodes and data nodes, and during failover the management IP migrates from the online management node to the standby management node;
in 3), the Parastor200 management service and the Parastor200 graphical interface service are switched from the online management node to the standby management node during failover;
in 4), after the switch the standby management node becomes the primary management node, and its information is in turn backed up to the original primary management node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310226210.8A CN103384267B (en) | 2013-06-07 | 2013-06-07 | High-availability method for Parastor200 parallel storage management nodes based on a distributed block device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310226210.8A CN103384267B (en) | 2013-06-07 | 2013-06-07 | High-availability method for Parastor200 parallel storage management nodes based on a distributed block device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103384267A CN103384267A (en) | 2013-11-06 |
CN103384267B true CN103384267B (en) | 2017-09-01 |
Family
ID=49491958
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310226210.8A Active CN103384267B (en) | High-availability method for Parastor200 parallel storage management nodes based on a distributed block device | 2013-06-07 | 2013-06-07 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103384267B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103631903B (en) * | 2013-11-22 | 2017-09-01 | 曙光信息产业股份有限公司 | A kind of system of database synchronization data |
CN105516365A (en) * | 2016-01-22 | 2016-04-20 | 浪潮电子信息产业股份有限公司 | Method for managing a distributed type mirror image storage block device based on network |
CN107256131B (en) * | 2017-06-15 | 2019-10-01 | 深圳市云舒网络技术有限公司 | A kind of performance optimization method based on TCMU virtual disk distributed block storage system |
CN116185697B (en) * | 2023-05-04 | 2023-08-04 | 苏州浪潮智能科技有限公司 | Container cluster management method, device and system, electronic equipment and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102170460A (en) * | 2011-03-10 | 2011-08-31 | 浪潮(北京)电子信息产业有限公司 | Cluster storage system and data storage method thereof |
CN103095837A (en) * | 2013-01-18 | 2013-05-08 | 浪潮电子信息产业股份有限公司 | Method achieving lustre metadata server redundancy |
Non-Patent Citations (2)
Title |
---|
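Analysis and improvement of a distributed block device replication system (分布式块设备复制系统的分析与改进); Chen Jiaxun, Liu Xiaojie (陈嘉迅,刘晓洁); Computer Engineering and Design (《计算机工程与设计》); 20120319; Vol. 32 (No. 11); pp. 3599-3601, 3806 * |
Introduction and application of the cluster file system Lustre (集群文件系统lustre的介绍及应用); Ma Yanjun, et al. (马艳军,等); Science & Technology Information (《科技信息》); 20120531 (No. 5); pp. 139-140 * |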
Also Published As
Publication number | Publication date |
---|---|
CN103384267A (en) | 2013-11-06 |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20220725. Address after: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing. Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.; DAWNING INFORMATION INDUSTRY Co.,Ltd. Address before: 100193 No.36 Zhongguancun Software Park, No.8 Dongbeiwang West Road, Haidian District, Beijing. Patentee before: Dawning Information Industry (Beijing) Co.,Ltd. |