CN101329691B - Redundant magnetic disk array sharing file system and read-write method - Google Patents

Info

Publication number
CN101329691B
Authority
CN
China
Prior art keywords
files
blocks
disk
client
copy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN200810142716XA
Other languages
Chinese (zh)
Other versions
CN101329691A (en)
Inventor
程剑
王日红
王魏强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN200810142716XA
Publication of CN101329691A
Application granted
Publication of CN101329691B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention provides a redundant disk array shared file system and a read/write method for it. The system comprises at least one server; at least one disk whose file storage policy is controlled by the server; at least one client, which responds to file access requests from upper-layer applications and learns from the server, over a network connection, which disk holds the requested file; and at least one disk access agent, which responds to the client's file access requests and accesses the disk directly. The system can be expanded and smoothly upgraded by adding servers and disks, which makes it usable in fields with high real-time requirements.

Description

Redundant disk array shared file system and read/write method thereof
Technical field
The invention belongs to the field of distributed storage technology, and specifically relates to a redundant disk array shared file system and a read/write method thereof.
Background technology
Applications such as IPTV (network television) and media servers access massive amounts of data: the stored data volume is huge and read/write access is concurrent over many channels, so an efficient file system is needed to access such data. The prior art generally uses disk array storage systems, which support fiber-optic data transmission but are costly. The prior art also includes Ethernet-based disk array storage systems, whose cost is still high. The prior art further includes the GFS file system, which provides an inexpensive distributed storage method; however, it offers an append-once, read-many access model, so it suits only applications with modest real-time requirements and is unsuitable for applications with high real-time requirements such as IPTV and media servers.
The prior art therefore still needs to be improved and developed.
Summary of the invention
The object of the invention is to provide a redundant disk array shared file system and a read/write method thereof that guarantee the integrity and reliability of data, support data sharing among multiple nodes, reduce cost, and allow smooth expansion by adding disks, so that the system can be applied in fields with high real-time requirements.
To solve the above technical problem, the technical solution of the invention is as follows:
A redundant disk array shared file system, comprising:
At least one server;
At least one disk, the file storage policy of which is controlled by the server;
At least one client, which responds to file access requests from upper-layer applications and learns from the server, over a network connection, which disk holds the accessed file;
At least one disk access agent, which responds to the file access requests of the client and accesses the disk directly;
There are several disks, each disk holds a number of file blocks, each file block has a corresponding file block copy, and the file block copy is stored on one or more other disks;
The server is also used to periodically scan the data of each file block copy and, if data loss is found, to carry out the flow for creating a new file block copy.
In the redundant disk array shared file system, the system further has an Ethernet interface, and the server, the client and the disk access agent are each connected to the Ethernet so that the client, the disk access agent and the server can communicate with one another.
In the redundant disk array shared file system, the disk is configured on the server and holds a number of file blocks.
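The relationships among files, file blocks, copies, disks and disk access agents listed above can be pictured with a small data model. The following is only an illustrative sketch: the class names (`BlockLocation`, `FileBlock`, `FileNode`, `MetadataServer`) and the rule enforced in `register` are assumptions made for this example and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class BlockLocation:
    """Where one stored instance of a file block lives (hypothetical naming)."""
    disk_id: str   # disk holding the data
    agent_id: str  # disk access agent that serves that disk


@dataclass
class FileBlock:
    index: int
    primary: BlockLocation
    copies: List[BlockLocation] = field(default_factory=list)

    def is_redundant(self) -> bool:
        """True if at least one copy is on a disk other than the primary's."""
        return any(c.disk_id != self.primary.disk_id for c in self.copies)


@dataclass
class FileNode:
    name: str
    size: int
    blocks: List[FileBlock] = field(default_factory=list)


class MetadataServer:
    """Keeps the file-to-block placement that clients query on open."""

    def __init__(self) -> None:
        self._files: Dict[str, FileNode] = {}

    def register(self, node: FileNode) -> None:
        # Assumed check: reject placements whose copy is not on another disk.
        for block in node.blocks:
            if not block.is_redundant():
                raise ValueError(f"block {block.index} has no copy on another disk")
        self._files[node.name] = node

    def open(self, name: str) -> FileNode:
        """Return file size and block placement, as in an open response."""
        return self._files[name]
```

In this sketch the server stores only placement metadata; file data would flow between clients and disk access agents, which matches the division of roles described above.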
A read method for the redundant disk array shared file system, comprising the following steps:
A1. The client responds to a file access request from an upper-layer application and sends a file open command to the server;
A2. The server returns the information of the accessed file blocks to the client;
A3. The client requests the corresponding file block content from the corresponding disk access agent according to the file block information;
A4. The disk access agent reads the corresponding file block content and sends it back to the client; if the disk access agent fails to read the file block content on the corresponding disk, it notifies the client to read the file block copy;
The server also periodically scans the data of each file block copy and, if data loss is found, carries out the flow for creating a new file block copy.
In the read method, in step A4 the disk access agent reads the file block content on the corresponding disk in units of pages and sends the content it has read to the client in units of pages.
In the read method, the disk access agent where a file block resides is defined as the source disk access agent, the disk access agent where its file block copy resides is defined as the destination disk access agent, and the flow for creating a new file block copy comprises the following steps:
C1. A suitable disk access agent is selected as the new destination disk access agent for the source disk access agent, and a file block copy command is sent to the source disk access agent;
C2. The source disk access agent writes the file block information to the new destination disk access agent;
C3. The new destination disk access agent writes the file block copy to its corresponding disk.
A write method for the redundant disk array shared file system, comprising the following steps:
B1. The client responds to a file access request from an upper-layer application and sends an open-file request to the server;
B2. The server returns the information of the accessed file blocks to the client;
B3. The client writes data page by page to the disk access agent corresponding to the accessed file block information, and also sends a write-copy request message to the disk access agent that holds the file block copy;
B4. The disk access agent writes the data written by the client to the file block on the corresponding disk;
The disk access agent where the file block copy resides writes the file block copy and returns a write-copy response message to the client;
The server periodically scans the data of each file block copy and, if data loss is found, carries out the flow for creating a new file block copy.
The redundant disk array shared file system and read/write method provided by the invention use several disks as storage devices, and the disks have redundant backups, which improves the reliability of the stored data. The architecture can easily be expanded and smoothly upgraded by increasing the number of servers and disks, so that the system can be applied in fields with high real-time requirements.
Description of drawings
Fig. 1 is a schematic structural diagram of the redundant disk array shared file system provided by an embodiment of the invention;
Fig. 2 is a flowchart of creating a new file block copy in the redundant disk array shared file system of Fig. 1;
Fig. 3 is a flowchart of the read method of the redundant disk array shared file system of Fig. 1;
Fig. 4 is a flowchart of the write method of the redundant disk array shared file system of Fig. 1.
Embodiment
The invention is described in detail below with reference to the drawings and specific embodiments.
The redundant disk array shared file system provided by the embodiment of the invention is described in detail with reference to the schematic diagram in Fig. 1.
The redundant disk array shared file system uses a client-server structure and comprises at least one server, at least one client, at least one disk access agent, at least one disk and an Ethernet.
The server, the clients and the disk access agents are each connected to the Ethernet through an Ethernet interface, so that the clients, the disk access agents and the server can communicate with one another.
It will be understood that in practice each server can be a computer. In this embodiment, the at least one disk can be several disks, each disk holds a number of file blocks, and a disk access agent can access each file block directly. Preferably, in practice, the disks are installed on the server; the server is responsible for creating files, deleting files, and allocating and reclaiming file blocks on the disks, and it uses dual-machine hot standby.
In this embodiment, there can be several clients and several disk access agents, with the disk access agents serving the respective clients.
The disk access agents correspond to the clients respectively; each disk access agent responds to the file read/write requests of its corresponding client and directly manages the files on the disks.
The client is responsible for handling the file access requests of upper-layer applications and reads or writes file blocks through the disk access agent.
A file is stored as a sequence of file blocks. Each file block has a corresponding file block copy, which is stored on another disk; if the file block on one disk cannot be accessed, the client can access the corresponding file block copy on the other disk. The placement policy for file blocks is controlled by the server: the client learns from the server which disk access agent, and therefore which disk, corresponds to the accessed file block, and then accesses the data directly through that disk access agent.
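To make the block-based layout concrete, the sketch below maps a byte offset in a file to the file block that holds it and to the agents the client would contact. The file block size is not stated in the patent, so `FILE_BLOCK_SIZE` (64 MB here) is purely an assumed value, and `block_for_offset` and the table layout are hypothetical names used for illustration.

```python
from typing import List, Tuple

# Assumed file block size; the patent does not specify one.
FILE_BLOCK_SIZE = 64 * 1024 * 1024


def block_for_offset(
    offset: int, block_table: List[Tuple[str, str]]
) -> Tuple[int, int, Tuple[str, str]]:
    """Map a file offset to (block index, offset inside block, agents).

    block_table[i] is (agent holding block i, agent holding its copy),
    as the client would have received it from the server on open.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index = offset // FILE_BLOCK_SIZE
    if index >= len(block_table):
        raise IndexError("offset is beyond the last file block")
    return index, offset % FILE_BLOCK_SIZE, block_table[index]


# Example: a two-block file whose blocks and copies sit on different agents.
table = [("agent-1", "agent-2"), ("agent-2", "agent-3")]
print(block_for_offset(FILE_BLOCK_SIZE + 5, table))
```

Because the table is obtained once from the server, later reads and writes can go straight to the disk access agents without involving the server again.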
Preferably, the server periodically scans the data of each file block copy and, if it finds that data has been lost, carries out a flow for creating a new file block copy; see Fig. 2 for the detailed flow. The disk access agent where a file block resides is defined as the source disk access agent, and the disk access agent where its copy resides is defined as the destination disk access agent. Preferably, the server controls the placement policy for file blocks, that is, the client can learn from the server the source disk access agent of each file block and the destination disk access agent of its copy.
The flow for creating a new file block copy comprises the following steps:
C1. A suitable disk access agent is selected as the new destination disk access agent for the source disk access agent, and a file block copy command 401 is sent to the source disk access agent.
C2. The source disk access agent sends a write request message 402 to write the file block information to the new destination disk access agent. Preferably, the source disk access agent writes to the new destination disk access agent page by page until the whole file block has been written.
C3. The destination disk access agent writes the file block copy.
When the destination disk access agent has finished writing the file block copy, it returns a completion message 403.
Preferably, after step C3, the source disk access agent returns a disk write completion message 404 to the server.
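The scan-and-repair cycle and steps C1 to C3 can be sketched in memory as follows. This is a simplified illustration, not the patented implementation: `DiskAgent`, `scan_and_repair` and the "fewest stored blocks" selection rule in C1 are assumptions, since the text only says that a "suitable" agent is selected, and network messages 401 to 404 are replaced by direct method calls.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 128 * 1024  # page size given in the description


class DiskAgent:
    """Hypothetical in-memory stand-in for a disk access agent."""

    def __init__(self, agent_id: str) -> None:
        self.agent_id = agent_id
        self.blocks: Dict[str, bytes] = {}  # block id -> stored data

    def copy_block_to(self, block_id: str, dest: "DiskAgent") -> int:
        """C2/C3: send a block to dest page by page; return pages sent."""
        data = self.blocks[block_id]
        received = bytearray()
        pages = 0
        for offset in range(0, len(data), PAGE_SIZE):
            received += data[offset:offset + PAGE_SIZE]  # one write request
            pages += 1
        dest.blocks[block_id] = bytes(received)
        return pages


def scan_and_repair(
    placement: Dict[str, Tuple[str, str]], agents: Dict[str, DiskAgent]
) -> List[str]:
    """Recreate lost copies; placement maps block id -> (source, destination)."""
    repaired = []
    for block_id, (src_id, dst_id) in list(placement.items()):
        if block_id in agents[dst_id].blocks:
            continue  # copy is still present
        # C1 (assumed policy): pick the agent with the fewest blocks that is
        # neither the source nor the destination that lost the copy.
        candidates = [a for a in agents.values()
                      if a.agent_id not in (src_id, dst_id)]
        if not candidates:
            raise RuntimeError("no agent available for a new copy")
        new_dst = min(candidates, key=lambda a: (len(a.blocks), a.agent_id))
        agents[src_id].copy_block_to(block_id, new_dst)
        placement[block_id] = (src_id, new_dst.agent_id)
        repaired.append(block_id)
    return repaired
```

Updating `placement` only after the copy has been written mirrors the order of messages 403 and 404, where completion is reported after the destination has finished writing.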
The file is divided into access units, each access unit is greater than or equal to one page (128 KB), and the file is accessed in these access units. Preferably, each data transfer between the client and the disk access agent is at least one page.
The data is split into packets and transmitted over a connectionless network packet send/receive interface. Because a disk is a slow block device, large data blocks (typically 1024 KB) must be used for reads and writes to improve throughput; the 128 KB pages are adapted to the 1024 KB blocks by a cache maintained by the disk access agent and the associated algorithms.
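With the sizes given above, eight 128 KB pages fit in one 1024 KB disk transfer. The patent does not describe the caching algorithm, so the following is only one possible sketch of the adaptation: a single-entry read cache that fetches a whole 1024 KB unit from disk and serves individual pages out of it. The names `locate_page` and `PageReadCache` are hypothetical.

```python
from typing import Callable, Optional, Tuple

PAGE_SIZE = 128 * 1024          # network transfer unit from the description
IO_BLOCK_SIZE = 1024 * 1024     # typical disk read/write unit from the description
PAGES_PER_IO = IO_BLOCK_SIZE // PAGE_SIZE  # 8 pages per disk transfer


def locate_page(page_index: int) -> Tuple[int, int]:
    """Return (disk I/O unit index, byte offset of the page inside it)."""
    io_index = page_index // PAGES_PER_IO
    offset = (page_index % PAGES_PER_IO) * PAGE_SIZE
    return io_index, offset


class PageReadCache:
    """Serve 128 KB pages from the most recently read 1024 KB unit."""

    def __init__(self, read_io_block: Callable[[int], bytes]) -> None:
        self._read_io_block = read_io_block  # performs the large disk read
        self._cached_index: Optional[int] = None
        self._cached_data = b""
        self.disk_reads = 0  # number of large reads actually issued

    def read_page(self, page_index: int) -> bytes:
        io_index, offset = locate_page(page_index)
        if io_index != self._cached_index:
            self._cached_data = self._read_io_block(io_index)
            self._cached_index = io_index
            self.disk_reads += 1
        return self._cached_data[offset:offset + PAGE_SIZE]
```

For sequential streaming, which is the typical pattern for media servers, this arrangement turns eight page requests into one disk read.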
The embodiment of the invention provides a file read method for the redundant disk array shared file system, comprising the following steps:
A1. The client sends a file open command to the server;
The client sends a file open command (OPEN) 201, which contains the file node information.
Preferably, the request is sent to the server only if the file is not yet open on the client; otherwise the client simply increments its read reference count;
A2. The server responds to the file open command and returns the file size information and file block information (OPEN_ACK) 202;
Specifically, the server looks up the file node information to obtain the file size information and file block information and returns them to the client.
Preferably, the server records the client information when it increments the read reference count;
A3. The client requests the corresponding file block content from the corresponding disk access agent according to the file block information;
According to the file block information, the client selects a suitable disk access agent and sends it a read-one-page request (READ_A_PAGE_REQ) 203. Preferably, the file block content is requested in units of pages.
If the client considers the file block information stale, it can request the file block information from the server again;
A4. The disk access agent reads the corresponding file block content and sends it to the client;
Specifically, the disk access agent returns a read-one-page response message (READ_A_PAGE_ACK) 204, sending the file block content it has read to the client in units of pages.
It will be understood that the disk access agent reads the file block content on the corresponding disk in units of pages and sends it to the client in units of pages, that is, the content is read and sent to the client page by page.
If the client fails to read a file block, it can read the copy of that file block.
Preferably, if the copy information of the file block to be read is not cached locally, the client sends a get-file-block-copy-information message (GET_CHUNK_REQ) 205 to the server. The server returns the file block copy information (GET_CHUNK_ACK) 206 so that the client can read the file block copy.
Preferably, step A5 is carried out after step A4:
A5. Having received the requested file block content, the client sends a close request (CLOSE) 207 to the server;
Specifically, when the client finishes reading the file, it decrements the read reference count; if no process still has the file open, it sends the close request to the server.
Preferably, after step A5, the server returns a close response message (CLOSE_ACK) 208 to the client.
Specifically, the server receives the close request and releases the corresponding data area.
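Two client-side behaviours in steps A1 to A5 lend themselves to a short sketch: the read reference count that decides when OPEN and CLOSE actually reach the server, and the fallback from a file block to its copy when a page read fails. The code below is an assumption-laden illustration; `ReadClient`, `read_page` and the callables standing in for the OPEN, CLOSE and READ_A_PAGE_REQ exchanges are invented for the example.

```python
from typing import Callable, Dict, List


class ReadFailed(Exception):
    """Raised when neither the file block nor any copy can be read."""


class ReadClient:
    """Tracks local read reference counts (hypothetical client-side helper)."""

    def __init__(self, send_open: Callable[[str], dict],
                 send_close: Callable[[str], None]) -> None:
        self._send_open = send_open    # stands in for OPEN / OPEN_ACK
        self._send_close = send_close  # stands in for CLOSE / CLOSE_ACK
        self._refs: Dict[str, int] = {}
        self._info: Dict[str, dict] = {}

    def open(self, name: str) -> dict:
        if self._refs.get(name, 0) == 0:
            # First local open: ask the server for size and block information.
            self._info[name] = self._send_open(name)
        self._refs[name] = self._refs.get(name, 0) + 1
        return self._info[name]

    def close(self, name: str) -> None:
        self._refs[name] -= 1
        if self._refs[name] == 0:
            # Last local reader is done: tell the server to release its state.
            del self._refs[name]
            del self._info[name]
            self._send_close(name)


def read_page(readers: List[Callable[[int], bytes]], page_index: int) -> bytes:
    """Try the primary agent first, then each copy, for one page."""
    for read in readers:
        try:
            return read(page_index)
        except OSError:
            continue  # this agent failed; fall back to the next copy
    raise ReadFailed(f"page {page_index} unreadable on all agents")
```

Keeping the reference count on the client means that several local readers of the same file cost the server only one open and one close.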
With reference to Fig. 3, the embodiment of the invention provides a file write method for the redundant disk array shared file system, comprising the following steps:
B1. The client sends an open-file request to the server;
Preferably, if the file has not been opened for writing locally, the client sends a write-open request (OPEN) 301 to the server.
B2. In response to the open-file request, the server returns to the client a response message (OPEN_ACK) 302 that contains the file size information and file block information;
Specifically, the server looks up the file node information according to the open-file request, returns the file size and file block information, increments the reference count and records the client.
Preferably, the server looks up the file node information and acquires a write lock.
Preferably, if the write appends to the file and new file blocks must be allocated, the server selects a group of disk access agents according to its policy to create the file blocks and their copies.
B3. The client writes data to the corresponding disk access agent;
Specifically, the client sends a write-one-page request message (WRITE_A_PAGE_REQ) 303 to the disk access agent according to the file block information.
Preferably, the data is written in units of pages.
Preferably, at the same time as it sends the write-one-page request message (WRITE_A_PAGE_REQ) 303 to the disk access agent, the client sends a write-copy request message (WRITE_A_PAGE_REQ) 305 to the disk access agent that holds the file block copy.
B4. The disk access agent writes the file block;
Specifically, the disk access agent writes the file block and returns a write-one-page response message (WRITE_A_PAGE_ACK) 304 to the client;
Preferably, the disk access agent where the file block copy resides writes the file block copy and returns a write-copy response message (WRITE_A_PAGE_ACK) 306 to the client.
Preferably, based on the write-one-page response message 304 and the write-copy response message 306, the client periodically sends a notification message (ACTIVE) 307 to the server to report the file block write status and the file size. If the write to either the file block or its copy fails, the client notifies the server of this; if the writes to both the file block and its copy fail, the write as a whole fails.
It will be understood that if the information of the file block to be written is not cached, the client sends a get-file-block-information message (GET_CHUNK_REQ) 308 to the server, and the server returns the file block information (GET_CHUNK_ACK) 309 to the client.
B5. The client sends a write-close request (CLOSE) 310 to the server, and the server returns a close response (CLOSE_ACK) 311 to the client.
The server reclaims the file block information and releases the write lock.
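The success rule for a page write in steps B3 and B4 (both succeed, exactly one fails, both fail) can be written down compactly. The sketch below is illustrative only: `WriteOutcome` and `write_page` are hypothetical names, the two writes are issued one after the other for simplicity although the text has the client send both requests at the same time, and the callables stand in for the WRITE_A_PAGE_REQ / WRITE_A_PAGE_ACK exchanges.

```python
from enum import Enum
from typing import Callable


class WriteOutcome(Enum):
    OK = "ok"              # file block and copy both written
    DEGRADED = "degraded"  # one of the two failed; the server must be told
    FAILED = "failed"      # both failed; the write as a whole fails


def write_page(
    write_primary: Callable[[bytes], None],
    write_copy: Callable[[bytes], None],
    page: bytes,
) -> WriteOutcome:
    """Write one page to the file block and to its copy, then classify."""
    results = []
    for write in (write_primary, write_copy):
        try:
            write(page)
            results.append(True)
        except OSError:
            results.append(False)
    if all(results):
        return WriteOutcome.OK
    if any(results):
        return WriteOutcome.DEGRADED
    return WriteOutcome.FAILED
```

A `DEGRADED` result corresponds to the case the client reports in its notification message, after which the server's periodic scan can restore the missing copy.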
The redundant disk array shared file system and read/write method provided by the embodiment of the invention use several disks as storage devices, and the disks have redundant backups, which improves the reliability of the stored data. The architecture can easily be expanded and smoothly upgraded by increasing the number of servers and disks. The invention also uses multiple clients to expose the file access interface externally; every client can access every disk, and the file access interface of each client is compatible with standard file system access interfaces.
The specific embodiments described above are only preferred embodiments of the invention and do not limit the invention in any form. Although the invention is disclosed above by way of preferred embodiments, they are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the invention, use the methods and technical content disclosed above to make minor changes or modifications that yield equivalent embodiments; any simple modification, equivalent change or refinement made to the above embodiments according to the technical essence of the invention, without departing from the content of the technical solution, still falls within the scope of the technical solution of the invention.

Claims (7)

1. A redundant disk array shared file system, characterized in that it comprises:
At least one server;
At least one disk, the file storage policy of which is controlled by the server;
At least one client, which responds to file access requests from upper-layer applications and learns from the server, over a network connection, which disk holds the accessed file;
At least one disk access agent, which responds to the file access requests of the client and accesses the disk directly;
There are several disks, each disk holds a number of file blocks, each file block has a corresponding file block copy, and the file block copy is stored on one or more other disks;
The server is also used to periodically scan the data of each file block copy and, if data loss is found, to carry out the flow for creating a new file block copy.
2. The redundant disk array shared file system as claimed in claim 1, characterized in that the system further has an Ethernet interface, and the server, the client and the disk access agent are each connected to the Ethernet so that the client, the disk access agent and the server can communicate with one another.
3. The redundant disk array shared file system as claimed in claim 2, characterized in that the disk is configured on the server and holds a number of file blocks.
4. A read method for the redundant disk array shared file system according to claim 1, comprising the following steps:
A1. The client responds to a file access request from an upper-layer application and sends a file open command to the server;
A2. The server returns the information of the accessed file blocks to the client;
A3. The client requests the corresponding file block content from the corresponding disk access agent according to the file block information;
A4. The disk access agent reads the corresponding file block content and sends it back to the client; if the disk access agent fails to read the file block content on the corresponding disk, it notifies the client to read the file block copy;
The server also periodically scans the data of each file block copy and, if data loss is found, carries out the flow for creating a new file block copy.
5. The read method as claimed in claim 4, characterized in that in step A4 the disk access agent reads the file block content on the corresponding disk in units of pages and sends the content it has read to the client in units of pages.
6. The read method as claimed in claim 4, characterized in that the disk access agent where a file block resides is defined as the source disk access agent, the disk access agent where its file block copy resides is defined as the destination disk access agent, and the flow for creating a new file block copy comprises the following steps:
C1. A suitable disk access agent is selected as the new destination disk access agent for the source disk access agent, and a file block copy command is sent to the source disk access agent;
C2. The source disk access agent writes the file block information to the new destination disk access agent;
C3. The new destination disk access agent writes the file block copy to its corresponding disk.
7. A write method for the redundant disk array shared file system according to claim 1, comprising the following steps:
B1. The client responds to a file access request from an upper-layer application and sends an open-file request to the server;
B2. The server returns the information of the accessed file blocks to the client;
B3. The client writes data page by page to the disk access agent corresponding to the accessed file block information, and also sends a write-copy request message to the disk access agent that holds the file block copy;
B4. The disk access agent writes the data written by the client to the file block on the corresponding disk;
The disk access agent where the file block copy resides writes the file block copy and returns a write-copy response message to the client;
The server periodically scans the data of each file block copy and, if data loss is found, carries out the flow for creating a new file block copy.
CN200810142716XA 2008-07-30 2008-07-30 Redundant magnetic disk array sharing file system and read-write method Expired - Fee Related CN101329691B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810142716XA CN101329691B (en) 2008-07-30 2008-07-30 Redundant magnetic disk array sharing file system and read-write method

Publications (2)

Publication Number Publication Date
CN101329691A CN101329691A (en) 2008-12-24
CN101329691B true CN101329691B (en) 2011-06-22

Family

ID=40205499

Country Status (1)

Country Link
CN (1) CN101329691B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110622

Termination date: 20190730