CN103986755A

CN103986755A - Implementation method of high-security full-redundancy parallel file system

Info

Publication number: CN103986755A
Application number: CN201410196050.1A
Authority: CN
Inventors: 孙玉超; 陈良华
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2014-05-12
Filing date: 2014-05-12
Publication date: 2014-08-13

Abstract

The invention provides an implementation method of a high-security, fully redundant parallel file system. The implementation process comprises: setting the Lustre file system configuration; setting up a Lustre object storage file system composed of clients, storage servers (OST) and metadata servers (MDS), in which the Lustre file system runs on the Lustre client, and the client exchanges file data I/O with the OST and performs namespace operations with the MDS; using the MDS nodes as the Lustre metadata servers; and using two OSS nodes as the Lustre object data servers. Compared with the prior art, the method has the advantage that, after a point of failure occurs, the Lustre file system can switch over automatically, which preserves the continuity of the file system and reduces the risk of downtime.

Description

Implementation method of a high-security, fully redundant parallel file system

Technical field

The present invention relates to the field of cloud computing, and specifically to an implementation method of a high-security, fully redundant parallel file system for a private cloud that is kept separate from the public cloud.

Background technology

With the rapid development of computer technology, high-performance computing clusters and cloud computing systems are being applied more and more widely. Building a cluster or cloud computing system usually calls for a parallel file system with high read/write bandwidth, and for such a file system security is, besides read/write speed, the first issue people consider. With its strong scalability, high read/write bandwidth and high concurrency, the Lustre parallel file system has become the first choice among parallel file systems, particularly in supercomputing centers, where more and more users choose to build Lustre parallel file systems. However, the Lustre parallel file system is not in itself a safe file system: a failure of any of the hard disks, storage devices, or MDS and OSS nodes that make up a Lustre file system brings the whole file system down. How to improve the security of the Lustre file system and reduce downtime has therefore been a long-standing concern.

Summary of the invention

The technical task of the present invention is to remedy the deficiencies of the prior art by providing a more efficient, safer and more convenient implementation method of a high-security, fully redundant parallel file system.

The technical solution of the present invention is realized as follows. The specific implementation process of the method is:

Step 1: Set the Lustre file system configuration, which comprises:

MDS nodes: 2 servers;

OSS nodes: 2 servers;

SAN storage;

Controllers: Active/Active dual controllers;

Disk expansion enclosures: 5;

Physical disks: 240 × 3 TB SATA hard disks and 10 × 300 GB SAS hard disks;

Step 2: Set up the Lustre object storage file system, which consists of three parts: clients, storage servers (OST) and metadata servers (MDS). The Lustre client runs the Lustre file system; it exchanges file data I/O with the OST and performs namespace operations with the MDS;

Step 3: Use the MDS nodes as the Lustre metadata servers;

Step 4: Use the two OSS nodes as the Lustre object data servers.
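The division of labour in step 2 (namespace operations go to the MDS, file data I/O goes to the OST) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Lustre's actual protocol: the class names, the round-robin placement, and the choice of 24 OSTs (one per RAID6 group, which the source does not state) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Toy MDS: maps a path to (OST index, object id). Holds no file data."""
    namespace: dict = field(default_factory=dict)
    next_object_id: int = 0

    def create(self, path: str, ost_count: int) -> tuple:
        # Namespace operation: choose an OST round-robin (an assumption)
        # and record where the file's data object will live.
        layout = (self.next_object_id % ost_count, self.next_object_id)
        self.namespace[path] = layout
        self.next_object_id += 1
        return layout

    def lookup(self, path: str) -> tuple:
        return self.namespace[path]


@dataclass
class ObjectStorageTarget:
    """Toy OST: stores file data as objects keyed by object id."""
    objects: dict = field(default_factory=dict)

    def write(self, object_id: int, data: bytes) -> None:
        self.objects[object_id] = data

    def read(self, object_id: int) -> bytes:
        return self.objects[object_id]


class Client:
    """Toy client: asks the MDS where a file lives, then does I/O with the OST."""

    def __init__(self, mds: MetadataServer, osts: list):
        self.mds = mds
        self.osts = osts

    def write_file(self, path: str, data: bytes) -> None:
        ost_index, object_id = self.mds.create(path, len(self.osts))
        self.osts[ost_index].write(object_id, data)

    def read_file(self, path: str) -> bytes:
        ost_index, object_id = self.mds.lookup(path)
        return self.osts[ost_index].read(object_id)


mds = MetadataServer()
osts = [ObjectStorageTarget() for _ in range(24)]  # assumed: one OST per RAID6 group
client = Client(mds, osts)
client.write_file("/data/a.bin", b"hello")
client.write_file("/data/b.bin", b"world")
```

In this model the MDS only ever stores placement tuples, while the bytes themselves live on the OSTs, which mirrors the separation the step describes.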

The detailed procedure of step 3 is: the 10 × 300 GB SAS disks in the storage are configured as a RAID6 array, which is mounted on both MDS nodes at the same time. A dual-node redundant failover function is implemented through heartbeat, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.
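The heartbeat-driven takeover can be sketched as a small simulation. This is a minimal illustration, not the configuration of any particular heartbeat software: the `FailoverPair` class, the 10-second timeout and the timestamps are hypothetical, and the node names mds01 and mds02 are borrowed from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time in seconds of the last heartbeat seen from this node


class FailoverPair:
    """Two nodes attached to the same shared array; one serves it at a time."""

    def __init__(self, primary: Node, standby: Node, timeout: float):
        self.nodes = {primary.name: primary, standby.name: standby}
        self.active = primary.name
        self.timeout = timeout  # hypothetical dead-time threshold in seconds

    def heartbeat(self, name: str, now: float) -> None:
        self.nodes[name].last_heartbeat = now

    def check(self, now: float) -> str:
        """Move the service to the peer if the active node has been silent too long."""
        active = self.nodes[self.active]
        if now - active.last_heartbeat > self.timeout:
            peer = next(n for n in self.nodes.values() if n.name != self.active)
            # Take over only if the peer itself is still sending heartbeats.
            if now - peer.last_heartbeat <= self.timeout:
                self.active = peer.name
        return self.active


pair = FailoverPair(Node("mds01", 0.0), Node("mds02", 0.0), timeout=10.0)
for t in (2.0, 4.0, 6.0):
    pair.heartbeat("mds01", t)
    pair.heartbeat("mds02", t)
before = pair.check(now=8.0)

# mds01 falls silent after t=6 while mds02 keeps sending heartbeats.
for t in (8.0, 10.0, 12.0, 14.0, 16.0, 18.0):
    pair.heartbeat("mds02", t)
after = pair.check(now=18.0)
```

Here `before` is `"mds01"` and `after` is `"mds02"`. The sketch leaves out how the failed node is kept away from the shared disks during takeover, which a real deployment has to handle.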

The detailed procedure of step 4 is: each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total. Two disks are taken from each of the 5 enclosures to form one RAID6 group of 10 disks, giving 24 RAID6 groups in total. The 24 RAID6 groups are mounted on both OSS nodes at the same time, and a dual-node redundant failover function is implemented through heartbeat, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.
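The grouping arithmetic can be checked with a short script. The disk counts and the 3 TB size come from the text; which slots each group uses is not specified, so the slot choice below (group g takes slots 2g and 2g+1 in every enclosure) is an assumption. The capacity figure is raw arithmetic that ignores file system overhead.

```python
ENCLOSURES = 5
SLOTS_PER_ENCLOSURE = 48
DISKS_PER_ENCLOSURE_PER_GROUP = 2
DISK_TB = 3
RAID6_PARITY_DISKS = 2  # RAID6 spends two disks' worth of capacity on parity


def build_groups():
    """Return the RAID6 groups; each group is a list of (enclosure, slot) pairs."""
    group_count = SLOTS_PER_ENCLOSURE // DISKS_PER_ENCLOSURE_PER_GROUP
    groups = []
    for g in range(group_count):
        # Assumed slot choice: group g uses slots 2g and 2g+1 in every enclosure.
        first = g * DISKS_PER_ENCLOSURE_PER_GROUP
        members = [
            (enclosure, slot)
            for enclosure in range(ENCLOSURES)
            for slot in range(first, first + DISKS_PER_ENCLOSURE_PER_GROUP)
        ]
        groups.append(members)
    return groups


groups = build_groups()
disks_per_group = len(groups[0])
usable_tb_per_group = (disks_per_group - RAID6_PARITY_DISKS) * DISK_TB
total_usable_tb = usable_tb_per_group * len(groups)

print(len(groups), disks_per_group, usable_tb_per_group, total_usable_tb)
# prints: 24 10 24 576
```

Under these assumptions the 240 SATA disks yield 24 groups of 10 disks, each with 24 TB of data capacity, for 576 TB in total.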

The beneficial effects of the present invention compared with the prior art are:

The implementation method of a high-security, fully redundant parallel file system of the present invention makes every component of the Lustre parallel file system redundant, so that after a point of failure occurs the Lustre file system switches over automatically. This preserves the continuity of the file system and reduces the risk of downtime. The method is practical, widely applicable and easy to adopt.

Brief description of the drawings

Figure 1 is a schematic diagram of the interactions in the Lustre object storage file system of the present invention.

Embodiment

The implementation method of a high-security, fully redundant parallel file system of the present invention is described in detail below with reference to the accompanying drawing.

As shown in Figure 1, the present invention provides an implementation method of a high-security, fully redundant parallel file system. Its specific implementation process is:

Step 1: Set the Lustre file system configuration, which comprises:

MDS nodes: 2 servers;

OSS nodes: 2 servers;

SAN storage;

Controllers: Active/Active dual controllers;

Disk expansion enclosures: 5;

Physical disks: 240 × 3 TB SATA hard disks and 10 × 300 GB SAS hard disks.

Step 2: Set up the Lustre object storage file system, which consists of three parts: clients, storage servers (OST) and metadata servers (MDS). The Lustre client runs the Lustre file system; it exchanges file data I/O with the OST and performs namespace operations with the MDS.

Step 3: Use the mds01 and mds02 nodes as the Lustre metadata servers.

The 10 × 300 GB SAS disks in the storage (2 in each disk expansion enclosure) are configured as a RAID6 array, which is mounted on both mds01 and mds02 at the same time. A dual-node redundant failover function is implemented through heartbeat, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.

Step 4: Use oss01 and oss02 as the Lustre object data servers.

Each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total. Two disks are taken from each of the 5 enclosures to form one RAID6 group of 10 disks, giving 24 RAID6 groups in total. The 24 RAID6 groups are mounted on both the oss01 and oss02 nodes at the same time, and a dual-node redundant failover function is implemented through heartbeat, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.

Because each RAID6 group consists of 10 disks spread over the 5 expansion enclosures, 2 disks per enclosure, and RAID6 allows two disks in a group to fail without affecting use, the complete failure of one expansion enclosure is absorbed by the RAID6 mechanism: the whole system does not go down and use is not affected.
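The enclosure-failure argument can be checked by counting, for each RAID6 group, how many member disks sit in the failed enclosure. The sketch below assumes only the layout stated in the text (24 groups, 2 disks per group in each of 5 enclosures); the function name and the disk tuple encoding are hypothetical.

```python
from collections import Counter

ENCLOSURES = 5
GROUPS = 24
DISKS_PER_ENCLOSURE_PER_GROUP = 2
RAID6_TOLERATED_FAILURES = 2  # RAID6 survives the loss of any two member disks

# Each disk is identified as (enclosure, group, index within that enclosure).
disks = [
    (e, g, i)
    for e in range(ENCLOSURES)
    for g in range(GROUPS)
    for i in range(DISKS_PER_ENCLOSURE_PER_GROUP)
]


def groups_lost(failed_enclosures):
    """Return the groups that lose more member disks than RAID6 tolerates."""
    lost_per_group = Counter(g for (e, g, i) in disks if e in failed_enclosures)
    return sorted(
        g for g, n in lost_per_group.items() if n > RAID6_TOLERATED_FAILURES
    )


# Any single enclosure failure removes exactly 2 disks from every group.
single_failures = {e: groups_lost({e}) for e in range(ENCLOSURES)}

# Two enclosures failing together remove 4 disks from every group.
double_failure = groups_lost({0, 1})
```

Every entry of `single_failures` is an empty list, which matches the claim that one enclosure can fail completely. `double_failure` contains all 24 groups, so the layout as described covers the loss of one enclosure but not two at once. The same counting applies to the 10-disk SAS RAID6 array used by the MDS nodes, which is also spread 2 disks per enclosure.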

The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. An implementation method of a high-security, fully redundant parallel file system, characterized in that its specific implementation process is:

MDS nodes: 2 servers;

OSS nodes: 2 servers;

SAN storage;

Controllers: Active/Active dual controllers;

Disk expansion enclosures: 5;

Physical disks: 240 × 3 TB SATA hard disks and 10 × 300 GB SAS hard disks;

Step 3: Use the MDS nodes as the Lustre metadata servers;

Step 4: Use the two OSS nodes as the Lustre object data servers.

2. The implementation method of a high-security, fully redundant parallel file system according to claim 1, characterized in that the detailed procedure of step 3 is: the 10 × 300 GB SAS disks in the storage are configured as a RAID6 array, which is mounted on both MDS nodes at the same time; a dual-node redundant failover function is implemented through heartbeat, so that when one MDS fails, the other MDS takes over the disks and continues to provide service.

3. The implementation method of a high-security, fully redundant parallel file system according to claim 1, characterized in that the detailed procedure of step 4 is: each expansion enclosure holds 48 disks, so the 5 enclosures hold 240 disks in total; two disks are taken from each of the 5 enclosures to form one RAID6 group of 10 disks, giving 24 RAID6 groups in total; the 24 RAID6 groups are mounted on both OSS nodes at the same time, and a dual-node redundant failover function is implemented through heartbeat, so that when one OSS node fails, the other OSS takes over the disks and continues to provide service.