CN103384267A - Parastor200 parallel storage management node high availability method based on distributed block device - Google Patents

Info

Publication number
CN103384267A
Authority
CN
China
Prior art keywords
management node
node
management
parallel high
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013102262108A
Other languages
Chinese (zh)
Other versions
CN103384267B (en)
Inventor
刘冠川
秦东明
杨亮
曹振南
王勇
何牧君
张新风
陈飞
刘超
龚超
明立波
王慧
吕永安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201310226210.8A
Publication of CN103384267A
Application granted
Publication of CN103384267B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention relates to a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. The method is realized through two mechanisms: synchronization of the storage system information held on the management node, and management node failover. Making the Parastor200 management node highly available gives Parastor200 a fully redundant design, so that damage to any single component in the system does not affect use of the storage system. When any component of the management node is damaged, services can be switched to the standby node within a few seconds. Normal use is therefore unaffected, and there is sufficient time to repair the fault.

Description

A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device
Technical field
The present invention relates to a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device.
Background
The ParaStor200 parallel storage system adopts a parallel architecture that reflects the direction in which storage, network communication, and data management technologies are developing. It is a high-end storage system with independent intellectual property rights, designed for processing massive unstructured data. It can provide TB/s-level bandwidth and EB-level storage space, meeting the needs of applications with high capacity and I/O performance requirements in fields such as aircraft, automobile, and ship design, biological gene research, materials science research, weather forecasting, seismic monitoring, environmental monitoring and analysis, energy exploration, e-commerce, online gaming, social and video-sharing website hosting, animation rendering, and video editing. It can be widely used in industries such as government, education, scientific research, manufacturing, enterprise, healthcare, oil, broadcasting and television, and the Internet.
MGR denotes the Parastor200 management node. It provides a unified control and management interface, and the administrator manages the whole storage system through this node.
oPara denotes the Parastor200 metadata node. It manages all index data and the namespace of the storage system, presents a single global image externally, and supports multiple nodes working in an Active-Active cluster mode.
oStor denotes the Parastor200 data node. It provides data storage space and embeds a high-performance data access engine that processes the data access requests of all clients in parallel. Multiple oStor nodes provide fault tolerance through replication (1 to 3 copies).
The Parastor200 management node provides a unified control and management interface and stores the important topology and configuration information of the whole system; the administrator manages the whole storage system through this node. Within the storage system, the management node is used relatively infrequently: it is needed only for management operations such as mounting a client, checking storage system status, adding a storage unit, or deleting a storage unit. In a small cluster, management is usually simple and management operations are few, so the management node is relatively unimportant. Even if the management node fails, there is enough time to repair it. Even permanent damage to the management node's disk does not lead to catastrophic consequences, because the important information on the management node can be reconstructed from the configuration information on the metadata nodes and data nodes. Only some historical data and client authorization information are lost, which does not greatly affect the storage system. The current solution to this problem is to back up the management node's configuration information on a schedule through the management interface. When the management node fails, the management node graphical interface program is installed on a standby node and the backed-up information is imported. Another technique mounts a shared disk array on the active and standby management nodes through a fiber-optic switch. When the primary management node fails, the standby management node obtains all of the management node's information by mounting the partition that holds the storage system information.
The existing schemes carry several potential risks. First, however frequent the backups, the system configuration may still be modified between two backups. In particular, after operations such as adding or removing storage units or changing client authorization information, the restored information differs from the actual information, which affects normal operation of the system. Second, even if no information is lost, rebuilding a management node takes a long time, which is unacceptable for larger systems with many users that require frequent management operations. A shared disk array solves both problems, but its cost is too high.
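The first risk can be made concrete with a small sketch. It is illustrative only: the function name and the use of plain numeric timestamps are assumptions, not part of the patent. Given the times of scheduled backups and of configuration changes, it lists the changes that a restore from the most recent backup would miss.

```python
def changes_lost_on_restore(backup_times, change_times, failure_time):
    """Return the configuration changes that a restore from backup would miss.

    All arguments are hypothetical numeric timestamps (for example, minutes).
    A change is lost if it happened before the failure but after the last
    backup that was completed before the failure.
    """
    usable_backups = [t for t in sorted(backup_times) if t <= failure_time]
    last_backup = usable_backups[-1] if usable_backups else None
    return [
        t
        for t in sorted(change_times)
        if t <= failure_time and (last_backup is None or t > last_backup)
    ]


# Backups every 60 minutes; the node fails at minute 115.
lost = changes_lost_on_restore([0, 60, 120], [30, 75, 110], 115)
print(lost)
```

Shortening the backup interval shrinks this window but never closes it, which is the motivation for the synchronous approach described in the summary.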
Summary of the invention
To address the deficiencies of the prior art, the invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, the invention gives Parastor200 a fully redundant design in which damage to any single component does not affect use of the storage system. When any component of the management node is damaged, services can be switched to the standby management node within a few seconds. Normal use is unaffected, and there is sufficient time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost, ensuring that the storage system information on the primary and standby management nodes is fully consistent.
The object of the invention is achieved by the following technical solution:
A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, wherein the improvement is that the method is realized through the following two aspects:
(1) Synchronization of the management node's storage system information files, implemented with a distributed block device.
(2) Management node failover.
In (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information under the corresponding directories of the primary and standby management nodes remains consistent.
The distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers.
When data is written to the file system on the local host's distributed block device, the data is simultaneously sent to another, remote host on the network and recorded there in a file system of the same form. The creation of the file system is itself synchronized through the distributed block device.
The write as a whole returns success only when both the remote host and the local host report that the write succeeded. When the primary management node fails, an identical copy of the data remains on the standby management node.
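The write rule described above can be sketched in a few lines. This is a minimal in-memory model, not the patented implementation: the class names BlockStore and MirroredDevice, the 4096-byte block size, and the healthy flag are all assumptions introduced for illustration.

```python
class BlockStore:
    """Hypothetical in-memory stand-in for one host's block device."""

    def __init__(self, num_blocks, block_size=4096, healthy=True):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.healthy = healthy

    def write(self, index, data):
        # Report failure instead of raising, so the caller can combine results.
        if not self.healthy or len(data) != self.block_size:
            return False
        self.blocks[index] = bytes(data)
        return True


class MirroredDevice:
    """Mirrors every block write to a local and a remote store."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def write(self, index, data):
        local_ok = self.local.write(index, data)
        remote_ok = self.remote.write(index, data)
        # The write as a whole succeeds only if BOTH hosts acknowledged it.
        return local_ok and remote_ok


primary = BlockStore(num_blocks=8)
standby = BlockStore(num_blocks=8)
device = MirroredDevice(primary, standby)
ok = device.write(0, b"\x01" * 4096)
print(ok, primary.blocks[0] == standby.blocks[0])
```

Because success is reported only after both acknowledgements, any write the caller believes to be durable is already present on the standby. The sketch does not model what a real replication layer does when only one side succeeds (for example, marking the device degraded and resynchronizing later).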
In (2), a heartbeat mechanism is used to detect a failed management node: the management nodes monitor each other over the heartbeat connection between them, with the online management node sending heartbeat messages and replying to its peer. A failed management node is also identified by pinging a third-party node, and failover is then carried out automatically.
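One plausible way to combine the two signals is sketched below. The patent does not spell out the decision rule, so the interpretation here is an assumption: a missed heartbeat alone is ambiguous, and reaching the third-party node is taken as evidence that the local node's network is fine and the peer is the one that failed. The names Verdict and judge_peer are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    PEER_HEALTHY = "peer_healthy"
    PEER_FAILED = "peer_failed"      # safe to start failover
    SELF_ISOLATED = "self_isolated"  # local network problem, do not take over


def judge_peer(heartbeat_replied: bool, third_party_reachable: bool) -> Verdict:
    """Classify the peer from one heartbeat result and one third-party ping result."""
    if heartbeat_replied:
        return Verdict.PEER_HEALTHY
    if third_party_reachable:
        # We can reach the outside world but not the peer: blame the peer.
        return Verdict.PEER_FAILED
    # We can reach nobody: the fault is more likely on our own side.
    return Verdict.SELF_ISOLATED


print(judge_peer(heartbeat_replied=False, third_party_reachable=True))
```

Checking a third party before taking over is a common guard against both nodes claiming the primary role when only the link between them has failed.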
Failover is carried out together with the migration of resources and services, which comprise:
1) the management node's storage system information files;
2) the management node's management IP;
3) the Parastor200 management service and the Parastor200 graphical interface service;
4) the data synchronization service.
In 1), the management node's storage system information files are synchronously backed up to the standby management node.
In 2), the management IP is the IP address through which the management node sends management commands to the metadata nodes and data nodes. During failover, the management IP migrates from the online management node to the standby management node.
In 3), the Parastor200 management service and the Parastor200 graphical interface service switch from the online management node to the standby management node during failover.
In 4), after the switchover the standby management node becomes the primary management node (primary and standby are relative roles; the online management node is the primary management node), and its information is in turn backed up to the original primary management node.
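The four items above can be read as an ordered takeover procedure. The sketch below models it on plain Python objects; the ManagementNode fields, the service names "management" and "gui", and the function fail_over are assumptions for illustration, and no real network or service operations are performed.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical labels for the management service and graphical interface service.
SERVICES = {"management", "gui"}


@dataclass
class ManagementNode:
    name: str
    role: str  # "primary" or "standby"
    holds_management_ip: bool = False
    running_services: Set[str] = field(default_factory=set)
    replicates_to: Optional[str] = None  # peer that receives this node's updates


def fail_over(primary: ManagementNode, standby: ManagementNode) -> ManagementNode:
    """Move resources and services from the failed primary to the standby."""
    # 1) Information files: already present on the standby through synchronous
    #    replication, so nothing needs to be copied at failover time.
    # 2) Management IP moves to the standby.
    primary.holds_management_ip = False
    standby.holds_management_ip = True
    # 3) Management and graphical interface services move to the standby.
    primary.running_services -= SERVICES
    standby.running_services |= SERVICES
    # 4) Roles swap and the synchronization direction is reversed.
    primary.role, standby.role = "standby", "primary"
    standby.replicates_to = primary.name
    primary.replicates_to = None
    return standby


a = ManagementNode("mgr-a", "primary", True, set(SERVICES), "mgr-b")
b = ManagementNode("mgr-b", "standby")
print(fail_over(a, b).name)
```

Step 1 being a no-op is the point of the design: because the files are mirrored on every write, the switchover only has to move an address and restart services, which is consistent with the stated takeover time of a few seconds.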
Compared with the prior art, the beneficial effects of the invention are as follows:
The invention provides a high-availability method for Parastor200 parallel storage management nodes based on a distributed block device. By making the Parastor200 management node highly available, it gives Parastor200 a fully redundant design in which damage to any single component does not affect use of the storage system. When any component of the management node is damaged, services can be switched to the standby management node within a few seconds. Normal use is unaffected, and there is sufficient time to repair the fault. Distributed block device technology achieves true real-time synchronization at very low cost, ensuring that the storage system information on the primary and standby management nodes is fully consistent.
Embodiments
Specific embodiments of the invention are described in further detail below.
The invention aims to make the Parastor200 management node highly available. Analysis of the problems in the prior art shows that the invention must solve two problems: (1) synchronization of the management node's storage system information; (2) management node failover.
Synchronizing the management node's storage system information files means that whenever any storage system information file on the management node changes, the corresponding file under the matching directory on the standby management node changes at the same time, so that the information under the corresponding directories of the primary and standby nodes is fully consistent. The invention uses a distributed block device to solve this problem. A distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers. When data is written to the file system on the local distributed block device, the data is simultaneously sent to another host on the network and recorded there in a file system of the same form (the creation of the file system is itself synchronized through the distributed block device). The write as a whole returns success only when both the remote host and the local host report that the write succeeded. The local node's data is therefore kept synchronized with the remote node's in real time, and I/O consistency is guaranteed. When the primary management node fails, an identical copy of the data remains on the standby management node and can continue to be used, which achieves the goal of high availability.
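The consistency goal stated above ("the information under the corresponding directories is fully consistent") can be checked with a simple comparison. The sketch below is a verification aid only, not part of the replication mechanism, and it assumes both directory trees are readable from one place; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def directory_digest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(primary_dir, standby_dir):
    """Return the relative paths that are missing on one side or differ in content."""
    primary = directory_digest(primary_dir)
    standby = directory_digest(standby_dir)
    all_names = primary.keys() | standby.keys()
    return sorted(n for n in all_names if primary.get(n) != standby.get(n))
```

An empty result from differing_files means the two trees hold the same files with the same contents, which is the state the synchronous replication is meant to maintain after every write.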
For management node failover, the first problem to solve is how to detect a failure. The invention uses a heartbeat mechanism: the management nodes monitor each other by sending heartbeat messages and replies over the heartbeat connection between them. A failed node is also identified by other means, such as pinging a third-party node, and failover is then carried out automatically. Failover must also solve another important problem, the migration of services and resources. In the invention these comprise: 1) The management node's storage system information files, which are synchronously backed up to the standby node. 2) The management node's management IP. This IP is distinct from the IP of the network over which the two nodes synchronize files; it is the IP through which the management node sends management commands to the metadata nodes and data nodes. During failover it must migrate from the primary management node to the standby management node. 3) The Parastor200 management service and the Parastor200 graphical interface service, which also switch from the primary management node to the standby node during failover. 4) The data synchronization service: after the switchover, the standby management node becomes the primary management node and must in turn back up its information to the original primary management node.
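The timing side of heartbeat monitoring can be sketched as follows. The patent gives no interval or timeout values, so the 3-second default is an assumption, as are the class and method names. The clock is injectable so the logic can be exercised without waiting in real time.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from and flags a timeout."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # assumed value, not taken from the patent
        self.clock = clock
        self.last_seen = clock()

    def record_heartbeat(self):
        """Call whenever a heartbeat message or reply arrives from the peer."""
        self.last_seen = self.clock()

    def peer_timed_out(self):
        """True if no heartbeat has arrived within the timeout window."""
        return self.clock() - self.last_seen > self.timeout_s


# Drive the monitor with a fake clock to show the behavior deterministically.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])
now[0] = 2.0
monitor.record_heartbeat()
now[0] = 5.5
print(monitor.peer_timed_out())
```

A timeout here would only be the trigger for further checks, such as the third-party ping mentioned in the text, before any resources are moved.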
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments may still be modified or replaced with equivalents, and that any modification or equivalent replacement that does not depart from the spirit and scope of the invention falls within the scope of the claims.

Claims (10)

1. A high-availability method for Parastor200 parallel storage management nodes based on a distributed block device, characterized in that the method is realized through the following two aspects:
(1) synchronization of the management node's storage system information files, implemented with a distributed block device;
(2) management node failover.
2. The high-availability method for Parastor200 parallel storage management nodes according to claim 1, characterized in that, in (1), synchronization of the management node's storage system information means that whenever the storage system information on the management node changes, the information under the corresponding directories of the primary and standby management nodes remains consistent.
3. The high-availability method for Parastor200 parallel storage management nodes according to claim 1, characterized in that the distributed block device is a software-implemented, shared-nothing storage replication solution that mirrors block device content between servers;
when data is written to the file system on the local host's distributed block device, the data is simultaneously sent to another, remote host on the network and recorded there in a file system of the same form; the creation of the file system is itself synchronized through the distributed block device.
4. The high-availability method for Parastor200 parallel storage management nodes according to claim 3, characterized in that the write as a whole returns success only when both the remote host and the local host report that the write succeeded; when the primary management node fails, an identical copy of the data remains on the standby management node.
5. The high-availability method for Parastor200 parallel storage management nodes according to claim 1, characterized in that, in (2), a heartbeat mechanism is used to detect a failed management node: the management nodes monitor each other over the heartbeat connection between them, with the online management node sending heartbeat messages and replying to its peer; a failed management node is also identified by pinging a third-party node, and failover is carried out automatically.
6. The high-availability method for Parastor200 parallel storage management nodes according to claim 5, characterized in that failover is carried out together with the migration of resources and services, which comprise:
1) the management node's storage system information files;
2) the management node's management IP;
3) the Parastor200 management service and the Parastor200 graphical interface service;
4) the data synchronization service.
7. The high-availability method for Parastor200 parallel storage management nodes according to claim 6, characterized in that, in 1), the management node's storage system information files are synchronously backed up to the standby management node.
8. The high-availability method for Parastor200 parallel storage management nodes according to claim 6, characterized in that, in 2), the management IP is the IP through which the management node sends management commands to the metadata nodes and data nodes, and the management IP migrates from the online management node to the standby management node during failover.
9. The high-availability method for Parastor200 parallel storage management nodes according to claim 6, characterized in that, in 3), the Parastor200 management service and the Parastor200 graphical interface service switch from the online management node to the standby management node during failover.
10. The high-availability method for Parastor200 parallel storage management nodes according to claim 6, characterized in that, in 4), after the switchover the standby management node becomes the primary management node, and its information is in turn backed up to the original primary management node.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310226210.8A CN103384267B (en) 2013-06-07 2013-06-07 A kind of Parastor200 parallel memorizing management node high availability methods based on distributed block equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310226210.8A CN103384267B (en) 2013-06-07 2013-06-07 A kind of Parastor200 parallel memorizing management node high availability methods based on distributed block equipment

Publications (2)

Publication Number Publication Date
CN103384267A (en) 2013-11-06
CN103384267B (en) 2017-09-01

Family

ID=49491958

Family Applications (1)

Application Number Priority Date Filing Date Title
CN201310226210.8A (Active, CN103384267B) 2013-06-07 2013-06-07 Parastor200 parallel storage management node high availability method based on distributed block device

Country Status (1)

Country Link
CN (1) CN103384267B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631903A (en) * 2013-11-22 2014-03-12 曙光信息产业股份有限公司 System for synchronizing data of database
CN105516365A (en) * 2016-01-22 2016-04-20 浪潮电子信息产业股份有限公司 Management method of distributed mirror image storage block equipment based on network
CN107256131A (en) * 2017-06-15 2017-10-17 深圳市云舒网络技术有限公司 A kind of performance optimization method based on TCMU virtual disk distributed block storage systems
CN116185697A (en) * 2023-05-04 2023-05-30 苏州浪潮智能科技有限公司 Container cluster management method, device and system, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102170460A (en) * 2011-03-10 2011-08-31 浪潮(北京)电子信息产业有限公司 Cluster storage system and data storage method thereof
CN103095837A (en) * 2013-01-18 2013-05-08 浪潮电子信息产业股份有限公司 Method achieving lustre metadata server redundancy

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
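陈嘉迅,刘晓洁 (Chen Jiaxun, Liu Xiaojie): "分布式块设备复制系统的分析与改进" [Analysis and improvement of the distributed block device replication system], 《计算机工程与设计》 [Computer Engineering and Design] *
马艳军,等 (Ma Yanjun, et al.): "集群文件系统lustre的介绍及应用" [Introduction and application of the cluster file system Lustre], 《科技信息》 [Science & Technology Information] *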

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631903A (en) * 2013-11-22 2014-03-12 曙光信息产业股份有限公司 System for synchronizing data of database
CN103631903B (en) * 2013-11-22 2017-09-01 曙光信息产业股份有限公司 A kind of system of database synchronization data
CN105516365A (en) * 2016-01-22 2016-04-20 浪潮电子信息产业股份有限公司 Management method of distributed mirror image storage block equipment based on network
CN107256131A (en) * 2017-06-15 2017-10-17 深圳市云舒网络技术有限公司 A kind of performance optimization method based on TCMU virtual disk distributed block storage systems
CN107256131B (en) * 2017-06-15 2019-10-01 深圳市云舒网络技术有限公司 A kind of performance optimization method based on TCMU virtual disk distributed block storage system
CN116185697A (en) * 2023-05-04 2023-05-30 苏州浪潮智能科技有限公司 Container cluster management method, device and system, electronic equipment and storage medium
CN116185697B (en) * 2023-05-04 2023-08-04 苏州浪潮智能科技有限公司 Container cluster management method, device and system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN103384267B (en) 2017-09-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220725

Address after: 100193 No. 36 Building, No. 8 Hospital, Wangxi Road, Haidian District, Beijing

Patentee after: Dawning Information Industry (Beijing) Co.,Ltd.

Patentee after: DAWNING INFORMATION INDUSTRY Co.,Ltd.

Address before: 100193 No.36 Zhongguancun Software Park, No.8 Dongbeiwang West Road, Haidian District, Beijing

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.
