CN103986755A - Implementation method of high-security full-redundancy parallel file system - Google Patents

Info

Publication number
CN103986755A
Authority
CN
China
Prior art keywords
file system
lustre
mds
disk
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410196050.1A
Other languages
Chinese (zh)
Inventor
孙玉超
陈良华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410196050.1A
Publication of CN103986755A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides an implementation method for a high-security, fully redundant parallel file system. The implementation process is as follows: the Lustre file system configuration is set up; a Lustre object storage file system is set up, consisting of clients, storage servers (OST) and a metadata server (MDS), where the Lustre client runs the Lustre file system, exchanges file data I/O with the OSTs, and exchanges namespace operations with the MDS; the MDS nodes serve as the Lustre metadata servers; and two OSS nodes serve as the Lustre object data servers. Compared with the prior art, the method has the advantage that when a point of failure occurs, the Lustre file system can switch over automatically, which preserves the continuity of the file system and reduces the risk of downtime.

Description

Implementation method of a high-security, fully redundant parallel file system
Technical field
The present invention relates to the field of cloud computing, and specifically to an implementation method for a high-security, fully redundant parallel file system for a private cloud that is kept separate from the public cloud.
Background art
With the rapid development of computer technology, high-performance computing clusters and cloud computing systems are being applied more and more widely. Building a cluster or cloud computing system often requires a parallel file system with high read/write bandwidth, and for such a file system, besides read/write speed, safety is the first problem to consider. Lustre has become the first choice among parallel file systems thanks to its strong scalability, high read/write bandwidth, and high concurrency; in supercomputing centers in particular, more and more users choose to build Lustre parallel file systems. By itself, however, Lustre is not a safe file system: when any one of the hard disks, the storage, or the MDS and OSS nodes that make up a Lustre file system fails, the whole Lustre file system goes down. How to improve the safety of the Lustre file system and reduce downtime is therefore a problem that people have long been considering.
Summary of the invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing a more efficient, safer, and more convenient implementation method for a high-security, fully redundant parallel file system.
The technical solution of the present invention is realized as follows. The specific implementation process of this high-security, fully redundant parallel file system implementation method is:
One, set up the Lustre file system configuration, which comprises:
MDS nodes: 2 servers;
OSS nodes: 2 servers;
SAN storage;
Controllers: Active/Active dual controllers;
Disk expansion enclosures: 5;
Physical disks: 240 3 TB SATA hard disks and 10 300 GB SAS hard disks;
Two, set up the Lustre object storage file system, which consists of three parts: clients, storage servers (OST), and the metadata server (MDS). The Lustre client runs the Lustre file system; it exchanges file data I/O with the OSTs and exchanges namespace operations with the MDS;
Three, the MDS nodes serve as the Lustre metadata servers;
Four, the two OSS nodes serve as the Lustre object data servers.
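The division of labor described in step two can be illustrated with a toy model. The following Python sketch is not Lustre code: the class names MetadataServer, ObjectStorageTarget and Client, the round-robin chunk placement, and the stripe_size parameter are all assumptions made for illustration, and real Lustre file layouts are more involved. It only shows the client asking the MDS for namespace information and exchanging file data with the OSTs.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Toy MDS: maps a path to the list of (ost_index, object_id) holding its data."""
    namespace: dict = field(default_factory=dict)

    def create(self, path, layout):
        self.namespace[path] = layout

    def lookup(self, path):
        return self.namespace[path]


@dataclass
class ObjectStorageTarget:
    """Toy OST: stores opaque byte objects keyed by object id."""
    objects: dict = field(default_factory=dict)

    def write(self, object_id, data):
        self.objects[object_id] = data

    def read(self, object_id):
        return self.objects[object_id]


class Client:
    """Toy client: namespace operations go to the MDS, file data goes to the OSTs."""

    def __init__(self, mds, osts):
        self.mds = mds
        self.osts = osts

    def write_file(self, path, data, stripe_size=4):
        layout = []
        # Split the data into chunks and place them round-robin on the OSTs.
        for n, offset in enumerate(range(0, len(data), stripe_size)):
            ost_index = n % len(self.osts)
            object_id = f"{path}#{n}"
            self.osts[ost_index].write(object_id, data[offset:offset + stripe_size])
            layout.append((ost_index, object_id))
        # Only the layout (metadata) is recorded on the MDS.
        self.mds.create(path, layout)

    def read_file(self, path):
        layout = self.mds.lookup(path)  # namespace operation
        return b"".join(self.osts[i].read(oid) for i, oid in layout)  # data I/O


mds = MetadataServer()
osts = [ObjectStorageTarget(), ObjectStorageTarget()]
client = Client(mds, osts)
client.write_file("/demo.txt", b"hello world")
print(client.read_file("/demo.txt"))
```

In this model the MDS never holds file contents and the OSTs never hold path names, which is the separation that step two describes.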
The detailed procedure of step three is: the 10 300 GB SAS disks in the storage are configured as a RAID 6 array and mounted to both MDS nodes at the same time. Heartbeat provides two-node redundant failover, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.
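The heartbeat-based takeover can be sketched as a small state machine. This is a minimal illustration, not the Heartbeat software's API: the FailoverPair class, its timeout value, and the explicit timestamps are assumptions, and the node names mds01 and mds02 are borrowed from the embodiment. A production setup would also need fencing so that a failed node cannot keep writing to the shared disks, which this sketch omits.

```python
class FailoverPair:
    """Two nodes attached to the same storage; only the active node serves it."""

    def __init__(self, primary, standby, timeout=3.0):
        self.active = primary
        self.standby = standby
        self.timeout = timeout  # seconds of silence before a node counts as failed
        self.last_beat = {primary: 0.0, standby: 0.0}

    def heartbeat(self, node, now):
        """Record that `node` was alive at time `now`."""
        self.last_beat[node] = now

    def check(self, now):
        """Promote the standby if the active node's heartbeat has timed out."""
        active_silent = now - self.last_beat[self.active] > self.timeout
        standby_alive = now - self.last_beat[self.standby] <= self.timeout
        if active_silent and standby_alive:
            self.active, self.standby = self.standby, self.active
        return self.active


pair = FailoverPair("mds01", "mds02", timeout=3.0)
pair.heartbeat("mds01", 1.0)
pair.heartbeat("mds02", 1.0)
print(pair.check(2.0))   # mds01: both nodes are healthy

pair.heartbeat("mds02", 5.0)  # mds01 has stopped sending heartbeats
print(pair.check(5.5))   # mds02: the standby takes over the disks
```

The same logic applies unchanged to the pair of OSS nodes.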
The detailed procedure of step four is: each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total. Two disks are taken from each of the 5 enclosures, 10 disks in all, to form one RAID 6 group, giving 24 RAID 6 groups in total. The 24 RAID 6 groups are mounted to both OSS nodes at the same time. Heartbeat provides two-node redundant failover, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.
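The disk counts above can be checked with a few lines of arithmetic. The Python sketch below is only a worked calculation; the function name raid6_layout is hypothetical, and the capacities are nominal figures that assume RAID 6 gives up two disks' worth of space per group and ignore any formatting overhead. The MDS figure assumes the 10 SAS disks form a single group with 2 disks per enclosure, as stated in the embodiment.

```python
RAID6_PARITY_DISKS = 2  # RAID 6 uses two disks' worth of capacity per group for parity


def raid6_layout(enclosures, disks_per_enclosure, disks_per_enclosure_per_group, disk_size):
    """Return (total_disks, group_size, group_count, usable_capacity)."""
    total_disks = enclosures * disks_per_enclosure
    group_size = enclosures * disks_per_enclosure_per_group
    group_count = total_disks // group_size
    usable = group_count * (group_size - RAID6_PARITY_DISKS) * disk_size
    return total_disks, group_size, group_count, usable


# OSS side: 5 enclosures x 48 SATA disks of 3 TB, 2 disks per enclosure per group
print(raid6_layout(5, 48, 2, 3))    # (240, 10, 24, 576), i.e. 576 TB nominal

# MDS side: 10 SAS disks of 300 GB in one group, 2 disks per enclosure
print(raid6_layout(5, 2, 2, 300))   # (10, 10, 1, 2400), i.e. 2400 GB nominal
```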
Compared with the prior art, the beneficial effects of the present invention are:
The high-security, fully redundant parallel file system implementation method of the present invention makes every component of the Lustre parallel file system redundant. This ensures that when a point of failure occurs, the Lustre file system can switch over automatically, which preserves the continuity of the file system and reduces the risk of downtime. The method is practical, widely applicable, and easy to promote.
Brief description of the drawings
Figure 1 is a schematic diagram of the interactions in the Lustre object storage file system of the present invention.
Embodiment
The high-security, fully redundant parallel file system implementation method of the present invention is described in detail below with reference to the accompanying drawing.
As shown in Figure 1, the invention provides an implementation method for a high-security, fully redundant parallel file system. The specific implementation process of the method is:
One, set up the Lustre file system configuration, which comprises:
MDS nodes: 2 servers;
OSS nodes: 2 servers;
SAN storage;
Controllers: Active/Active dual controllers;
Disk expansion enclosures: 5;
Physical disks: 240 3 TB SATA hard disks and 10 300 GB SAS hard disks.
Two, set up the Lustre object storage file system, which consists of three parts: clients, storage servers (OST), and the metadata server (MDS). The Lustre client runs the Lustre file system; it exchanges file data I/O with the OSTs and exchanges namespace operations with the MDS.
Three, the mds01 and mds02 nodes serve as the Lustre metadata servers.
The 10 300 GB SAS disks in the storage (2 in each disk expansion enclosure) are configured as a RAID 6 array and mounted to both mds01 and mds02 at the same time. Heartbeat provides two-node redundant failover, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.
Four, oss01 and oss02 serve as the Lustre object data servers.
Each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total. Two disks are taken from each of the 5 enclosures, 10 disks in all, to form one RAID 6 group, giving 24 RAID 6 groups in total. The 24 RAID 6 groups are mounted to both the oss01 and oss02 nodes at the same time. Heartbeat provides two-node redundant failover, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.
Each RAID 6 group consists of 10 disks spread across the 5 expansion enclosures, 2 disks per enclosure, and RAID 6 allows two disks in a group to fail without affecting use. Therefore, even if one expansion enclosure goes down completely, each group loses only 2 disks, and thanks to the RAID 6 mechanism the whole system does not go down and its use is not affected.
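The enclosure-failure argument can be checked by enumerating the layout. The Python sketch below is an illustration under stated assumptions: the helper names make_groups and survives are hypothetical, and the assignment of specific slots to groups is invented, since the text only fixes the number of disks each group takes from each enclosure.

```python
RAID6_MAX_FAILED = 2  # RAID 6 tolerates up to two failed disks per group


def make_groups(enclosures=5, disks_per_enclosure=48, per_enclosure_per_group=2):
    """Build RAID 6 groups as lists of (enclosure, slot) pairs."""
    group_count = disks_per_enclosure // per_enclosure_per_group
    return [
        [(enc, g * per_enclosure_per_group + k)
         for enc in range(enclosures)
         for k in range(per_enclosure_per_group)]
        for g in range(group_count)
    ]


def survives(groups, failed_enclosures):
    """True if no group loses more disks than RAID 6 can tolerate."""
    for members in groups:
        lost = sum(1 for enc, _slot in members if enc in failed_enclosures)
        if lost > RAID6_MAX_FAILED:
            return False
    return True


groups = make_groups()
print(len(groups))                                      # 24
print(all(survives(groups, {e}) for e in range(5)))     # True: any single enclosure may fail
print(survives(groups, {0, 1}))                         # False: two enclosures cost each group 4 disks
```

The last line marks the limit of the scheme: the layout covers the loss of one whole enclosure, not two at once.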
The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (3)

1. An implementation method for a high-security, fully redundant parallel file system, characterized in that its specific implementation process is:
One, set up the Lustre file system configuration, which comprises:
MDS nodes: 2 servers;
OSS nodes: 2 servers;
SAN storage;
Controllers: Active/Active dual controllers;
Disk expansion enclosures: 5;
Physical disks: 240 3 TB SATA hard disks and 10 300 GB SAS hard disks;
Two, set up the Lustre object storage file system, which consists of three parts: clients, storage servers (OST), and the metadata server (MDS). The Lustre client runs the Lustre file system; it exchanges file data I/O with the OSTs and exchanges namespace operations with the MDS;
Three, the MDS nodes serve as the Lustre metadata servers;
Four, the two OSS nodes serve as the Lustre object data servers.
2. The implementation method for a high-security, fully redundant parallel file system according to claim 1, characterized in that the detailed procedure of step three is: the 10 300 GB SAS disks in the storage are configured as a RAID 6 array and mounted to both MDS nodes at the same time; heartbeat provides two-node redundant failover, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.
3. The implementation method for a high-security, fully redundant parallel file system according to claim 1, characterized in that the detailed procedure of step four is: each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total; two disks are taken from each of the 5 enclosures, 10 disks in all, to form one RAID 6 group, giving 24 RAID 6 groups in total; the 24 RAID 6 groups are mounted to both OSS nodes at the same time; heartbeat provides two-node redundant failover, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410196050.1A CN103986755A (en) 2014-05-12 2014-05-12 Implementation method of high-security full-redundancy parallel file system

Publications (1)

Publication Number Publication Date
CN103986755A true CN103986755A (en) 2014-08-13

Family

ID=51278578

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410196050.1A Pending CN103986755A (en) 2014-05-12 2014-05-12 Implementation method of high-security full-redundancy parallel file system

Country Status (1)

Country Link
CN (1) CN103986755A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227839A (en) * 2016-07-26 2016-12-14 浪潮电子信息产业股份有限公司 The expansion method of a kind of lustre file system and device
CN107633070A (en) * 2017-09-22 2018-01-26 郑州云海信息技术有限公司 Balance Control Scheme method, apparatus and storage medium without the MDS of configuration

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103095837A (en) * 2013-01-18 2013-05-08 浪潮电子信息产业股份有限公司 Method achieving lustre metadata server redundancy
CN103279386A (en) * 2013-06-09 2013-09-04 浪潮电子信息产业股份有限公司 Method for achieving high availability of computer operation scheduling system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
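Zhang Xiaobo: "Research on Key Technologies of Parallel File Systems Based on High-Performance Cluster Computing", China Master's Theses Full-text Database *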

Similar Documents

Publication Publication Date Title
US9639437B2 (en) Techniques to manage non-disruptive SAN availability in a partitioned cluster
CN105141456A (en) Method for monitoring high-availability cluster resource
CN104361030A (en) Distributed cache architecture with task distribution function and cache method
US10244069B1 (en) Accelerated data storage synchronization for node fault protection in distributed storage system
CN104331254A (en) Dual-active storage system design method based on dual-active logical volumes
CN103279386A (en) Method for achieving high availability of computer operation scheduling system
CN105468297A (en) Quick synchronization method for master and slave device data in cloud storage system
CN102404201B (en) Method of realizing maximum bandwidth of Lustre concurrent file system
CN104298574A (en) Data high-speed storage processing system
CN103095837A (en) Method achieving lustre metadata server redundancy
CN103986755A (en) Implementation method of high-security full-redundancy parallel file system
CN103209219A (en) Distributed cluster file system
CN103209218A (en) Management system for disaster-tolerant all-in-one machine
CN103309774A (en) Construction method of virtual cluster double-layer redundancy framework
CN203054824U (en) Server storage system
CN105227394A (en) A kind of fault detect of the DB2 database based on x86 platform and changing method
CN106445729A (en) Backup virtualization-based method
CN103268271A (en) Disaster tolerance realizing method of all-in-one machine
CN107145409A (en) A kind of method of file multichannel backup
CN105354156A (en) Mainboard design method capable of supporting NVDIMM (Non-Volatile Dual In-line Memory Module)
US20200319989A1 (en) Collecting performance metrics of a device
Gong et al. Research and application of distributed storage technology in power grid enterprise database
CN105630420A (en) Network computer storage system and storage method thereof
CN204650521U (en) A kind of TB DBMS library storage system
CN203179007U (en) Cloud backup system based on asynchronous data cloud

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140813