CN105095352B - Data processing method and device applied to distributed system - Google Patents
- Publication number
- CN105095352B
- Authority
- CN
- China
- Prior art keywords
- memory
- data
- access
- file data
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a data processing method and device applied to a distributed system. The method comprises: performing erasure-code encoding on written file data to generate redundant data for the file data; storing the file data and the redundant data in a first memory; transferring to a second memory the redundant data of file data whose access heat is below a preset value; and, when the data information corresponding to the file data needs to be read, reading the file data directly from the first memory to obtain the data information. The method effectively preserves the response speed of data access and substantially reduces storage cost and operation-and-maintenance cost.
Description
Technical field
The present invention relates to the field of computers, and in particular to a data processing method and device applied to a distributed system.
Background
A distributed file system generally comprises clients, a metadata server and data servers. The client provides the access interface for file data, the metadata server manages the layout and attributes of files, and the data servers store the file contents.
The most important features of a distributed file system are that it can store massive amounts of data and that it is highly reliable. Storing a large number of files requires a large amount of disk storage, but disks cost considerably more than tape libraries, so tiering the stored data across tape libraries and disks becomes necessary.
The conventional approach is to store all the data of files that have been accessed infrequently over a period of time directly on a tape library, and to store frequently accessed files on SATA disks. The drawback is that when a requested file resides on the tape library, access is much slower than from a SATA disk and the user experience is poor. Repeated access also accelerates wear of the tape library, which can leave files unrecoverable.
No effective solution to these problems in the related art has yet been proposed.
Summary of the invention
To address the problems in the related art, the present invention proposes a data processing method and device applied to a distributed system.
The technical solution of the present invention is realized as follows:
According to one aspect of the invention, a data processing method applied to a distributed system is provided.
The method comprises:
performing erasure-code encoding on written file data to generate redundant data for the file data;
storing the file data and the redundant data in a first memory;
transferring to a second memory the redundant data of file data whose access heat is below a preset value;
when the data information corresponding to the file data needs to be read, reading the file data directly from the first memory to obtain the data information.
When the first memory is damaged, erasure-code coding is performed on the redundant data stored in the second memory to generate the file data corresponding to that redundant data, and the file data is stored in a first memory that is functioning normally.
When the second memory is damaged, erasure-code encoding is performed on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and the regenerated redundant data is stored in a second memory that is functioning normally.
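The two repair directions described above can be illustrated with a minimal sketch. The patent does not name a specific erasure code, so the example assumes the simplest possible one, a single XOR parity block over k data blocks, which can rebuild at most one lost block. The function names (`encode`, `rebuild_block`, and so on) are hypothetical and chosen only for illustration; a real deployment would more likely use a Reed-Solomon style code that tolerates several losses.

```python
from __future__ import annotations

from functools import reduce


def split_blocks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-sized blocks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data: bytes, k: int = 4) -> tuple[list[bytes], bytes]:
    """Return the data blocks (file data) and one parity block (redundant data)."""
    blocks = split_blocks(data, k)
    return blocks, xor_blocks(blocks)


def rebuild_block(blocks: list[bytes | None], parity: bytes) -> list[bytes]:
    """First-memory repair: rebuild one lost data block (None) from the parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


def rebuild_parity(blocks: list[bytes]) -> bytes:
    """Second-memory repair: regenerate the redundant data from intact file data."""
    return xor_blocks(blocks)
```

In this sketch the data blocks stand for what stays in the first memory and the parity block stands for what is moved to the second memory, so losing either side can be repaired from the other.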
The first memory is any one of the following:
SATA disk, SSD disk, SAS disk;
and the second memory is any one of the following:
tape library, SATA disk, SSD disk, SAS disk;
provided that the first memory and the second memory are not of the same type.
In addition, the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
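As a hedged illustration of how such a heat check might look, the sketch below treats a file as cold when any configured criterion falls below its threshold. The class `AccessStats`, the function `is_cold`, and all threshold parameters are assumptions made for this example; the patent only lists the possible metrics and does not say how they are combined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessStats:
    access_count: int      # number of accesses in the observation window
    window_seconds: float  # length of the observation window
    last_access: float     # UNIX timestamp of the most recent access

    @property
    def frequency(self) -> float:
        """Accesses per second over the observation window."""
        if self.window_seconds <= 0:
            return 0.0
        return self.access_count / self.window_seconds


def is_cold(stats: AccessStats,
            now: float,
            max_idle_seconds: float = 3 * 24 * 3600.0,
            min_count: Optional[int] = None,
            min_frequency: Optional[float] = None) -> bool:
    """Return True if the file's access heat is below the configured thresholds."""
    if now - stats.last_access > max_idle_seconds:
        return True  # not accessed recently enough
    if min_count is not None and stats.access_count < min_count:
        return True  # too few accesses
    if min_frequency is not None and stats.frequency < min_frequency:
        return True  # accessed too rarely
    return False
```

The default idle limit of three days mirrors the example period given in the embodiment; the other two criteria are optional so that a policy can rely on a single metric.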
According to another aspect of the present invention, a data processing device applied to a distributed system is also provided, comprising:
a generation module, configured to perform erasure-code encoding on written file data to generate redundant data for the file data;
a first storage module, configured to store the file data and the redundant data in a first memory;
a second storage module, configured to transfer to a second memory the redundant data of file data whose access heat is below a preset value;
a read module, configured to read the file data directly from the first memory to obtain the data information when the data information corresponding to the file data needs to be read.
In addition, the device may further comprise:
a first dump module, configured to, when the first memory is damaged, perform erasure-code coding on the redundant data stored in the second memory to generate the file data corresponding to that redundant data, and store the file data in a first memory that is functioning normally;
a second dump module, configured to, when the second memory is damaged, perform erasure-code encoding on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and store the regenerated redundant data in a second memory that is functioning normally.
The first memory is any one of the following:
SATA disk, SSD disk, SAS disk;
and the second memory is any one of the following:
tape library, SATA disk, SSD disk, SAS disk;
provided that the first memory and the second memory are not of the same type.
In addition, the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
The data processing method of the present invention effectively preserves the response speed of data access and substantially reduces storage cost and operation-and-maintenance cost.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the data processing method applied to a distributed system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the data processing method applied to a distributed system according to an embodiment of the present invention;
Fig. 3 is a block diagram of the data processing device applied to a distributed system according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.
According to an embodiment of the invention, a data processing method applied to a distributed system is provided.
As shown in Fig. 1, the data processing method applied to a distributed system according to an embodiment of the present invention comprises:
Step S101: performing erasure-code encoding on written file data to generate redundant data for the file data;
Step S103: storing the file data and the redundant data in a first memory;
Step S105: transferring to a second memory the redundant data of file data whose access heat is below a preset value;
Step S107: when the data information corresponding to the file data needs to be read, reading the file data directly from the first memory to obtain the data information.
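Steps S101 to S107 can be sketched with two in-memory dictionaries standing in for the first and second memory. This is an illustrative model only: the class `TieredStore`, its method names, and the placeholder encoder `toy_redundancy` are all hypothetical, and the placeholder is not a real erasure code.

```python
from typing import Callable, Dict, List


def toy_redundancy(data: bytes) -> bytes:
    """Placeholder encoder: XOR of the two halves of the data (not a real erasure code)."""
    half = (len(data) + 1) // 2
    left, right = data[:half], data[half:].ljust(half, b"\0")
    return bytes(x ^ y for x, y in zip(left, right))


class TieredStore:
    def __init__(self, encode: Callable[[bytes], bytes] = toy_redundancy) -> None:
        self.encode = encode
        self.first: Dict[str, bytes] = {}             # file data on the fast tier
        self.first_redundancy: Dict[str, bytes] = {}  # redundancy still on the fast tier
        self.second: Dict[str, bytes] = {}            # redundancy moved to the slow tier

    def write(self, name: str, data: bytes) -> None:
        """S101 + S103: encode, then store data and redundancy on the first memory."""
        self.first[name] = data
        self.first_redundancy[name] = self.encode(data)

    def migrate_cold(self, is_cold: Callable[[str], bool]) -> List[str]:
        """S105: move the redundancy of cold files to the second memory."""
        moved = []
        for name in list(self.first_redundancy):
            if is_cold(name):
                self.second[name] = self.first_redundancy.pop(name)
                moved.append(name)
        return moved

    def read(self, name: str) -> bytes:
        """S107: reads are always served from the first memory."""
        return self.first[name]
```

Note that `read` never touches `self.second`, which is the point of the scheme: only redundancy leaves the fast tier, so a cold file is still read at first-memory speed.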
In addition, when the first memory is damaged, erasure-code coding is performed on the redundant data stored in the second memory to generate the file data corresponding to that redundant data, and the restored file data is stored in a first memory that is functioning normally.
When the second memory is damaged, erasure-code encoding is performed on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and the regenerated redundant data is stored in a second memory that is functioning normally.
The first memory may be any one of the following:
SATA disk, SSD disk, SAS disk;
and the second memory may be any one of the following:
tape library, SATA disk, SSD disk, SAS disk;
provided that the first memory selected is not of the same type as the second memory.
In addition, the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
To better show how the technical solution of the present invention differs from and improves on the prior art, in one embodiment the first memory is a SATA disk and the second memory is a tape library. When a file is written to the distributed file system, erasure-code encoding is first performed on the file to generate the corresponding redundancy check data, and the original data and the redundant data are both stored on the SATA disks of the data storage servers. Then, based on each file's last access time, the redundant data of infrequently accessed files is dumped to the tape library and the redundant copy on the SATA disk is deleted. When such a file is accessed again, no data needs to be read from the tape library; only the original data is read from the SATA disk. If data on the disk is damaged, the redundant data can be read from the tape library to recover the original data; if data on the tape library is damaged, it can be reconstructed from the data on the SATA disk. This preserves the response speed of data access, makes full use of the tape library to reduce storage cost, and still meets high-reliability requirements.
For a clearer understanding of the technical solution of the present invention, in a specific embodiment, as shown in Fig. 2, the technical solution is implemented in the following steps:
1. When a file is written, the client requests the file layout information from the metadata server;
2. After performing erasure-code encoding on the file, the client writes the redundant data and the original data to the SATA disks specified in the layout;
3. The metadata server periodically scans the file directory and, according to the configured file access policy, screens out the files whose last access was more than a certain period (for example, 3 days) ago;
4. The redundant data of the screened files is moved to the tape library, and the file layout information is modified accordingly;
5. When a SATA disk is damaged, the data on the tape library is read according to the file layout information, the original data is computed by erasure-code coding, and the repaired data is written to a healthy SATA disk;
6. When data in the tape library is damaged, the data on the SATA disk is read according to the file layout information, the redundant data is computed by erasure-code encoding, and it is rewritten to the tape library.
When a user reads data, only the original data on the ordinary disk needs to be read; the data on the tape library is not accessed.
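The periodic scan in steps 3 and 4 can be sketched as follows. The record type `FileLayout`, its fields, the location strings such as `"sata:1"` and `"tape:7"`, and both function names are hypothetical; the patent does not describe the format of the file layout information, so this is only one plausible shape for it.

```python
from dataclasses import dataclass
from typing import List

THREE_DAYS = 3 * 24 * 3600.0  # example threshold from the embodiment


@dataclass
class FileLayout:
    path: str
    data_location: str        # where the original data lives, e.g. "sata:1"
    redundancy_location: str  # where the redundant data lives
    last_access: float        # UNIX timestamp of the most recent access


def scan_for_migration(layouts: List[FileLayout],
                       now: float,
                       max_idle: float = THREE_DAYS) -> List[FileLayout]:
    """Step 3: pick files idle longer than max_idle whose redundancy is still on SATA."""
    return [
        layout for layout in layouts
        if layout.redundancy_location.startswith("sata")
        and now - layout.last_access > max_idle
    ]


def migrate_redundancy(layout: FileLayout, tape_location: str) -> None:
    """Step 4: record that the redundancy now lives on tape; the data stays put."""
    layout.redundancy_location = tape_location
```

Because only `redundancy_location` changes, a later read resolves `data_location` exactly as before, and a second scan skips files whose redundancy has already been moved.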
According to an embodiment of the invention, a data processing device applied to a distributed system is also provided.
As shown in Fig. 3, the data processing device applied to a distributed system according to an embodiment of the present invention comprises:
a generation module 31, configured to perform erasure-code encoding on written file data to generate redundant data for the file data;
a first storage module 32, configured to store the file data and the redundant data in a first memory;
a second storage module 33, configured to transfer to a second memory the redundant data of file data whose access heat is below a preset value;
a read module 34, configured to read the file data directly from the first memory to obtain the data information when the data information corresponding to the file data needs to be read.
In addition, the device may further comprise:
a first dump module (not shown), configured to, when the first memory is damaged, perform erasure-code coding on the redundant data stored in the second memory to generate the file data corresponding to that redundant data, and store the file data in a first memory that is functioning normally;
a second dump module (not shown), configured to, when the second memory is damaged, perform erasure-code encoding on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and store the regenerated redundant data in a second memory that is functioning normally.
The first memory may be any one of the following:
SATA disk, SSD disk, SAS disk;
and the second memory may be any one of the following:
tape library, SATA disk, SSD disk, SAS disk;
provided that the first memory and the second memory are not of the same type.
In addition, the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
The data processing method of the present invention effectively preserves the response speed of data access and substantially reduces storage cost and operation-and-maintenance cost.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (6)
1. A data processing method applied to a distributed system, characterized by comprising:
performing erasure-code encoding on written file data to generate redundant data for the file data;
storing the file data and the redundant data in a first memory;
transferring to a second memory the redundant data of file data whose access heat is below a preset value, wherein the access speed of the second memory is lower than the access speed of the first memory;
when the data information corresponding to the file data needs to be read, reading the file data directly from the first memory to obtain the data information;
when the first memory is damaged, performing erasure-code coding on the redundant data stored in the second memory to generate the file data corresponding to the redundant data, and storing the file data in a first memory that is functioning normally;
when the second memory is damaged, performing erasure-code encoding on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and storing the regenerated redundant data in a second memory that is functioning normally.
2. The method according to claim 1, characterized in that:
the first memory comprises:
SATA disk, SSD disk, SAS disk; and
the second memory comprises:
tape library, SATA disk, SSD disk, SAS disk;
wherein the first memory and the second memory are not of the same type.
3. The method according to claim 1, characterized in that the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
4. A data processing device applied to a distributed system, characterized by comprising:
a generation module, configured to perform erasure-code encoding on written file data to generate redundant data for the file data;
a first storage module, configured to store the file data and the redundant data in a first memory;
a second storage module, configured to transfer to a second memory the redundant data of file data whose access heat is below a preset value, wherein the access speed of the second memory is lower than the access speed of the first memory;
a read module, configured to read the file data directly from the first memory to obtain the data information when the data information corresponding to the file data needs to be read;
a first dump module, configured to, when the first memory is damaged, perform erasure-code coding on the redundant data stored in the second memory to generate the file data corresponding to the redundant data, and store the file data in a first memory that is functioning normally;
a second dump module, configured to, when the second memory is damaged, perform erasure-code encoding on the file data that is stored in the first memory and corresponds to the redundant data stored in the second memory, and store the regenerated redundant data in a second memory that is functioning normally.
5. The device according to claim 4, characterized in that:
the first memory comprises:
SATA disk, SSD disk, SAS disk; and
the second memory comprises:
tape library, SATA disk, SSD disk, SAS disk;
wherein the first memory and the second memory are not of the same type.
6. The device according to claim 4, characterized in that the access heat includes at least one of the following:
number of accesses, access frequency, access time, and the time elapsed between the most recent access and the current time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510344249.9A CN105095352B (en) | 2015-06-19 | 2015-06-19 | Data processing method and device applied to distributed system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510344249.9A CN105095352B (en) | 2015-06-19 | 2015-06-19 | Data processing method and device applied to distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105095352A CN105095352A (en) | 2015-11-25 |
CN105095352B true CN105095352B (en) | 2019-03-05 |
Family
ID=54575789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510344249.9A Active CN105095352B (en) | 2015-06-19 | 2015-06-19 | Data processing method and device applied to distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105095352B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106528002A (en) * | 2016-12-06 | 2017-03-22 | 郑州云海信息技术有限公司 | Time-based storage scheduling method |
CN106649891A (en) * | 2017-02-24 | 2017-05-10 | 深圳市中博睿存信息技术有限公司 | Distributed data storage method and system |
CN112256472B (en) * | 2020-10-20 | 2024-06-25 | 平安科技(深圳)有限公司 | Distributed data retrieval method and device, electronic equipment and storage medium |
CN112558886A (en) * | 2020-12-25 | 2021-03-26 | 北京嘀嘀无限科技发展有限公司 | Data storage method and system |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101488104A (en) * | 2009-02-26 | 2009-07-22 | 北京世纪互联宽带数据中心有限公司 | System and method for implementing high-efficiency security memory |
CN102508789A (en) * | 2011-10-14 | 2012-06-20 | 浪潮电子信息产业股份有限公司 | Grading storage method for system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103631666B (en) * | 2012-08-24 | 2018-04-20 | 中兴通讯股份有限公司 | The fault-tolerant adaptation management equipment of data redundancy, service equipment, system and method |
CN102937967B (en) * | 2012-10-11 | 2018-02-27 | 南京中兴新软件有限责任公司 | Data redundancy realization method and device |
US9495246B2 (en) * | 2013-01-21 | 2016-11-15 | Kaminario Technologies Ltd. | Raid erasure code applied to partitioned stripe |
CN104281533B (en) * | 2014-09-18 | 2018-03-20 | 深圳市中博科创信息技术有限公司 | A kind of method and device of data storage |
Also Published As
Publication number | Publication date |
---|---|
CN105095352A (en) | 2015-11-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10802727B2 (en) | Solid-state storage power failure protection using distributed metadata checkpointing | |
US10977124B2 (en) | Distributed storage system, data storage method, and software program | |
US8103847B2 (en) | Storage virtual containers | |
US10127166B2 (en) | Data storage controller with multiple pipelines | |
TWI645404B (en) | Data storage device and control method for non-volatile memory | |
CN105573681B (en) | Method and system for establishing RAID in SSD | |
CN101916173B (en) | RAID (Redundant Array of Independent Disks) based data reading and writing method and system thereof | |
US20160217040A1 (en) | Raid parity stripe reconstruction | |
KR101870521B1 (en) | Methods and systems for improving storage journaling | |
CN107391027A (en) | Redundant Array of Inexpensive Disc storage device and its management method | |
KR101678868B1 (en) | Apparatus for flash address translation apparatus and method thereof | |
US8843704B2 (en) | Stride based free space management on compressed volumes | |
JP2007012058A (en) | File system for storing transaction records in flash-like media | |
CN105095352B (en) | Data processing method and device applied to distributed system | |
CN102799533B (en) | Method and apparatus for shielding damaged sector of disk | |
US11379155B2 (en) | System and method for flash storage management using multiple open page stripes | |
CN103064765A (en) | Method and device for data recovery and cluster storage system | |
CN103425589A (en) | Control apparatus, storage device, and storage control method | |
CN111124262B (en) | Method, apparatus and computer readable medium for managing Redundant Array of Independent Disks (RAID) | |
TW200907995A (en) | Method and system of defect management for storage medium | |
CN106686095A (en) | Data storage method and device based on erasure code technology | |
CN102024021A (en) | Method for logging metadata in logical file system | |
CN105302665A (en) | Improved copy-on-write snapshot method and system | |
CN107728943B (en) | Method for delaying generation of check optical disc and corresponding data recovery method | |
US11347860B2 (en) | Randomizing firmware loaded to a processor memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |