CN109101639B - Aggregation mode for improving performance of file system - Google Patents

Info

Publication number
CN109101639B
Authority
CN
China
Prior art keywords
file
aggregation
files
database
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810948103.9A
Other languages
Chinese (zh)
Other versions
CN109101639A (en)
Inventor
吴火城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cyphy Technology Xiamen Co ltd
Original Assignee
Cyphy Technology Xiamen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cyphy Technology Xiamen Co ltd filed Critical Cyphy Technology Xiamen Co ltd
Priority to CN201810948103.9A priority Critical patent/CN109101639B/en
Publication of CN109101639A publication Critical patent/CN109101639A/en
Application granted granted Critical
Publication of CN109101639B publication Critical patent/CN109101639B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an aggregation mode for improving the performance of a file system. No temporary large file is generated; instead, file header information is spliced in front of the small files, which are read in sequence and uploaded as one large file, avoiding one file write to disk and one subsequent file read. The header of the aggregated file stores the database information of all files in the aggregated file, so that if the metadata database of the file system is damaged, the database information can be recovered from the header information of the aggregated file. When a backup file needs to be restored, a single small file in the aggregated file is restored on its own rather than reading the whole aggregated file and restoring every file in it, which avoids wasted system resources and long restoration times. Because the metadata database information for all files is recorded in the file header, restoring the database requires reading only the header rather than the whole aggregated file.

Description

Aggregation mode for improving performance of file system
Technical Field
The invention relates to the field of big data storage, in particular to an aggregation mode for improving the performance of a file system.
Background
A Blu-ray device stores data for more than 10 years by burning it onto Blu-ray discs, without manual intervention. Because Blu-ray discs have a large storage capacity, a large amount of data can be stored for a long time.
The cloud platform provides storage services externally in the form of object storage. Data files are saved to disk. The archiving system runs automatically in the background and, according to rules configured by the user, automatically backs up files to the Blu-ray device once the configured conditions are met.
The archiving system is slow backup software running at the back end of the cloud platform. It backs up files on disk to the Blu-ray device and can restore them when they are needed.
The size of a file uploaded by a user is essentially unlimited: it may be only a few bytes, or hundreds of gigabytes, or tens of terabytes.
The Blu-ray device imposes two physical requirements: files uploaded to it must be between 256 MB and 5 GB, and the number of burn operations on the device must stay within 7000. In addition, the upload rate to the Blu-ray device is highest for data files of about 4 GB.
Given these requirements of the Blu-ray device, the archiving system must use an aggregation mode: a large number of small files are aggregated into one large file of about 4 GB, which is then backed up to the Blu-ray device through a network interface. This has three benefits: 1. file transfer over the network is faster; 2. the number of burn operations on the Blu-ray device is reduced, extending the service life of the device; 3. the number of files per volume on the Blu-ray device is reduced, improving the efficiency of the device.
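The size constraints above can be illustrated with a minimal sketch of how small files might be grouped into aggregates. The patent does not specify a grouping algorithm; the greedy strategy, the function name `plan_batches`, and the decision to reject files above the upper limit are assumptions made only for illustration. The sketch also ignores the size of the aggregate header, and the final batch may fall below the 256 MB lower bound.

```python
from typing import Iterable, List, Tuple

GIB = 1024 ** 3
TARGET_BATCH_BYTES = 4 * GIB  # from the text: upload rate is highest around 4 GB
MAX_BATCH_BYTES = 5 * GIB     # from the text: uploaded files must not exceed 5 GB


def plan_batches(
    files: Iterable[Tuple[str, int]],
    target: int = TARGET_BATCH_BYTES,
    limit: int = MAX_BATCH_BYTES,
) -> List[List[str]]:
    """Greedily group (name, size) pairs into batches (hypothetical helper).

    A batch is closed as soon as it reaches `target` bytes, and a file that
    would push the batch past `limit` starts a new batch instead.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in files:
        if size > limit:
            # Handling of oversized files is not described in the text.
            raise ValueError(f"{name} exceeds the per-file limit")
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
        if current_size >= target:
            batches.append(current)
            current, current_size = [], 0
    if current:
        batches.append(current)  # last batch may be smaller than the target
    return batches
```

For example, with toy limits `target=4` and `limit=5`, files of sizes 2, 2 and 1 are planned as one full batch followed by a remainder batch.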
Disclosure of Invention
The object of the present invention is to provide an aggregation mode for improving the performance of a file system, so as to solve the problems described in the background above.
To achieve this object, the invention provides the following technical solution:
An aggregation mode for improving the performance of a file system comprises the following specific steps:
(1) a database, a compression unit, an aggregation storage unit, a Blu-ray device system, a reading unit and a database updating unit are arranged in sequence in the aggregation mode system, and the system data read by the reading unit is fed back to the aggregation storage unit;
(2) an archiving task library and a recovery task library are provided in the archiving system.
As a further aspect of the invention, the database in step (1) stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database.
As a further aspect of the invention, the compression unit in step (1) compresses the files that need to be compressed.
As a further aspect of the invention, the aggregation storage unit in step (1) aggregates files without generating a temporary large file on disk: file header information is spliced in front of the small files, which are read in sequence and uploaded as one large file, avoiding one file write and one subsequent file read. The database information of all files in the aggregated file is stored in the header of the aggregated file. In essence, the technique reads all file information and all file contents in turn into a large block of memory and then uploads that block of memory to the Blu-ray device as a large aggregated file that exists only temporarily in memory.
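The idea of uploading a header followed by the small files, without first writing an aggregated file to disk, can be sketched as a generator. The patent does not define the header encoding; the 8-byte length prefix, the JSON body, the field names, and the functions `build_header` and `stream_aggregate` are all assumptions chosen for illustration. Offsets here are counted from the first byte after the header.

```python
import io
import json
import struct
from typing import BinaryIO, Callable, Iterator, List, Tuple

# (name, stored size in bytes, whether the stored bytes are compressed)
FileInfo = Tuple[str, int, bool]


def build_header(files: List[FileInfo]) -> bytes:
    """Serialize per-file metadata as a length-prefixed JSON header (assumed layout)."""
    records = []
    offset = 0  # offsets count from the first byte after the header
    for name, size, compressed in files:
        records.append(
            {"name": name, "offset": offset, "size": size, "compressed": compressed}
        )
        offset += size
    body = json.dumps(records).encode("utf-8")
    return struct.pack(">Q", len(body)) + body


def stream_aggregate(
    files: List[FileInfo],
    open_file: Callable[[str], BinaryIO],
    block_size: int = 64 * 1024,
) -> Iterator[bytes]:
    """Yield the header, then each file's bytes in order, with no temporary file."""
    yield build_header(files)
    for name, _size, _compressed in files:
        with open_file(name) as src:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                yield block


# Usage with in-memory stand-ins for the small files.
small_files = {"a.txt": b"hello", "b.txt": b"world!"}
listing = [(name, len(data), False) for name, data in small_files.items()]
aggregate = b"".join(
    stream_aggregate(listing, lambda name: io.BytesIO(small_files[name]))
)
```

In a real uploader the chunks would be handed to the network client as they are produced rather than joined in memory; the join here only makes the result easy to inspect.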
As a further aspect of the invention, when a file needs to be restored, the recovery task library in step (2) searches the metadata database of the archiving task library using the object's information to obtain the information about the file as stored on the Blu-ray device system: the IP address of the Blu-ray device system, the volume, the file name, the offset within the aggregated file, the file size, and whether the file is compressed within the aggregated file. The sub-file is then read on its own using this information and its offset within the aggregated file, decompressed if it was stored compressed, and restored to the original file.
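Restoring one sub-file from its recorded offset and size can be sketched as follows. The record fields mirror the list in the paragraph above, but the class name `SubFileRecord`, the function `restore_sub_file`, and the use of zlib are assumptions; the patent does not name a compression algorithm. The offset in this sketch is an absolute position within the aggregated file.

```python
import io
import zlib
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class SubFileRecord:
    """Metadata kept per archived file (field names are illustrative)."""
    device_ip: str
    volume: str
    aggregate_name: str
    offset: int       # absolute byte offset inside the aggregated file
    size: int         # number of stored bytes
    compressed: bool  # whether the stored bytes are compressed


def restore_sub_file(aggregate: BinaryIO, record: SubFileRecord) -> bytes:
    """Read only the bytes of one sub-file and undo compression if needed."""
    aggregate.seek(record.offset)
    payload = aggregate.read(record.size)
    if len(payload) != record.size:
        raise IOError("aggregated file is shorter than the record claims")
    return zlib.decompress(payload) if record.compressed else payload


# Usage: an in-memory aggregate holding one plain and one compressed sub-file.
packed = zlib.compress(b"squeezed data")
blob = io.BytesIO(b"HDR" + b"plain" + packed)
plain_rec = SubFileRecord("192.0.2.1", "vol1", "agg-0001", 3, 5, False)
packed_rec = SubFileRecord("192.0.2.1", "vol1", "agg-0001", 8, len(packed), True)
plain_bytes = restore_sub_file(blob, plain_rec)
unpacked_bytes = restore_sub_file(blob, packed_rec)
```

Only `record.size` bytes are read per restore, which is the point of storing the offset and size in the metadata database.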
As a still further aspect of the invention, the reading unit in step (1) reads the metadata database information of the files; the files are read out in sequence, loaded into a fixed-size buffer (for example, 512 KB), and then uploaded to the Blu-ray device system as a single file.
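A minimal sketch of the fixed-size buffer follows, assuming the buffer is filled across file boundaries so that every uploaded chunk except the last is exactly 512 KB. The generator name `fixed_size_chunks` is hypothetical, and the patent does not say whether a buffer may span two files; that behavior is an assumption here.

```python
import io
from typing import BinaryIO, Iterable, Iterator

BUFFER_SIZE = 512 * 1024  # from the text: a fixed-size buffer such as 512 KB


def fixed_size_chunks(
    sources: Iterable[BinaryIO], buffer_size: int = BUFFER_SIZE
) -> Iterator[bytes]:
    """Yield the concatenated sources in chunks of exactly buffer_size bytes.

    Only the final chunk may be shorter. The buffer carries leftover bytes
    from one source into the next, so file boundaries are invisible to the
    consumer of the chunks.
    """
    buffer = bytearray()
    for src in sources:
        while True:
            # Ask only for the bytes still missing from the current buffer.
            block = src.read(buffer_size - len(buffer))
            if not block:
                break
            buffer += block
            if len(buffer) == buffer_size:
                yield bytes(buffer)
                buffer.clear()
    if buffer:
        yield bytes(buffer)


# Usage with a toy 4-byte buffer so the chunk boundaries are easy to see.
parts = [io.BytesIO(b"abcde"), io.BytesIO(b"fgh"), io.BytesIO(b"ij")]
chunks = list(fixed_size_chunks(parts, buffer_size=4))
```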
As a still further aspect of the invention, the database updating unit in step (1) updates the data stored in the database, and the next round of aggregation storage is performed until the archiving task is complete.
As a still further aspect of the invention, if the metadata database of the archiving task library is damaged, the database information can be recovered from the file header information of the aggregated files. Because the metadata database information for all files is recorded in the header of the aggregated file, restoring the database requires reading only the header rather than the whole aggregated file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backup file needs to be restored, a single small file in the aggregated file is restored on its own rather than reading the whole aggregated file and restoring every file in it, which avoids wasted system resources and long restoration times.
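Recovering the metadata records by reading only the header can be sketched as below. The header encoding is not given in the patent; this sketch assumes an 8-byte big-endian length prefix followed by a JSON list of records, and the function name `read_header_records` is hypothetical.

```python
import io
import json
import struct
from typing import BinaryIO, Dict, List


def read_header_records(aggregate: BinaryIO) -> List[Dict]:
    """Return the per-file records stored at the start of an aggregated file.

    Only the length prefix and the header body are read; the file contents
    that follow the header are never touched.
    """
    prefix = aggregate.read(8)
    if len(prefix) != 8:
        raise ValueError("missing header length prefix")
    (length,) = struct.unpack(">Q", prefix)
    body = aggregate.read(length)
    if len(body) != length:
        raise ValueError("truncated header")
    return json.loads(body.decode("utf-8"))


# Usage: build a small aggregate in memory, then rebuild the records from it.
records = [
    {"name": "a.txt", "offset": 0, "size": 5, "compressed": False},
    {"name": "b.txt", "offset": 5, "size": 6, "compressed": False},
]
header_body = json.dumps(records).encode("utf-8")
blob = io.BytesIO(struct.pack(">Q", len(header_body)) + header_body + b"helloworld!")
recovered = read_header_records(blob)
bytes_read = blob.tell()  # position after the header, before any file content
```

The recovered records could then be written back into a rebuilt metadata database; how that database is structured is outside what the text describes.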
Compared with the prior art, the invention has the following beneficial effects. In the aggregation mode, no temporary large file is generated; file header information is spliced in front of the small files, which are read in sequence and uploaded as one large file, avoiding one file write and one subsequent file read. The header of the aggregated file stores the database information of all files in the aggregated file, so that if the metadata database of the file system is damaged, the database information can be recovered from the file header information of the aggregated file. Because the metadata database information for all files is recorded in the header, restoring the database requires reading only the header rather than the whole aggregated file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backup file needs to be restored, a single small file in the aggregated file is restored on its own rather than reading the whole aggregated file and restoring every file in it, which avoids wasted system resources and long restoration times.
Drawings
FIG. 1 is a flow diagram of an aggregation mode for improving file system performance.
FIG. 2 is a block diagram of an embodiment of an aggregation mode for improving file system performance.
FIG. 3 is a diagram of the aggregated file structure in an aggregation mode for improving file system performance.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the embodiment of the invention, an aggregation mode for improving the performance of a file system comprises the following specific management steps:
(1) A database, a compression unit, an aggregation storage unit, a Blu-ray device system, a reading unit and a database updating unit are arranged in sequence in the aggregation mode system, and the system data read by the reading unit is fed back to the aggregation storage unit. The database stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database. The compression unit compresses the files that need to be compressed. The aggregation storage unit aggregates files without generating a large file stored on disk: file header information is spliced in front of the small files, which are read in sequence and uploaded as one large file (equivalent to generating the aggregated file in memory), avoiding one file write to disk and one subsequent file read. The database information of all files in the aggregated file is stored in the header of the aggregated file. The reading unit reads the metadata database information of the files in sequence; each file is loaded into a fixed-size buffer such as 512 KB (network transmission is fastest when data is sent in large blocks of a fixed size) and then uploaded to the Blu-ray device system as a single file. The database updating unit updates the data stored in the database, and the next round of aggregation storage is performed until the archiving task is complete.
(2) An archiving task library and a recovery task library are provided in the archiving system. When a file needs to be restored, the recovery task library searches the metadata database of the archiving task library using the object's information to obtain the information about the file as stored on the Blu-ray device system: the IP address of the Blu-ray device system, the volume, the file name, the offset within the aggregated file, the file size, and whether the file is compressed within the aggregated file. The file is then read on its own using this information and its offset within the aggregated file, decompressed if it was stored compressed, and restored to the original file. If the metadata database of the archiving task library is damaged, the database information can be recovered from the file header information of the aggregated files. Because the metadata database information for all files is recorded in the header of the aggregated file, restoring the database requires reading only the header rather than the whole aggregated file, and keeping the key metadata database information together improves the read efficiency of the optical disc. When a backup file needs to be restored, a single small file in the aggregated file is restored on its own rather than reading the whole aggregated file and restoring every file in it, which avoids wasted system resources and long restoration times.
Example:
[Figures BDA0001771449980000041 to BDA0001771449980000091: the example content appears as images in the original publication.]
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. The description is written this way for clarity only; those skilled in the art should treat the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (5)

1. An aggregation mode for improving the performance of a file system, characterized by comprising the following specific management steps:
(1) a database, a compression unit, an aggregation storage unit, a Blu-ray device system, a reading unit and a database updating unit are arranged in sequence in the aggregation mode system, and the system data read by the reading unit is fed back to the aggregation storage unit;
(2) an archiving task library and a recovery task library are provided in the archiving system;
the database in step (1) stores the files to be aggregated, and the set of files to be archived is obtained by scanning the database;
the compression unit in step (1) compresses the files that need to be compressed and skips the files that do not need to be compressed;
the aggregation storage unit in step (1) aggregates files without generating a temporary large file: file header information is spliced in front of the small files, which are read in sequence and uploaded as one large file, avoiding one file write and one subsequent file read, and the database information of all files in the aggregated file is stored in the header of the aggregated file.
2. The aggregation mode for improving the performance of a file system according to claim 1, wherein, when a file needs to be restored, the recovery task library in step (2) searches the metadata database of the archiving task library using the object's information to obtain the information about the file as stored on the Blu-ray device system: the IP address of the Blu-ray device system, the volume, the file name, the offset within the aggregated file, the file size, and whether the file is compressed within the aggregated file; that section of the file is read on its own using the information in the database and the offset within the aggregated file, decompressed if it was stored compressed, and restored to the original file.
3. The aggregation mode for improving the performance of a file system according to claim 1, wherein the reading unit in step (1) reads the metadata database information of the files, and each file is read out in sequence, loaded into a fixed-size buffer, and then uploaded to the Blu-ray device system as a single file; this is a virtual aggregation technique in which file header information is spliced with the content of each file and the spliced data is passed in sequence to the external interface through the fixed-size buffer, so that the external interface treats the content it reads as coming from one complete large file.
4. The aggregation mode for improving the performance of a file system according to claim 1, wherein the database updating unit in step (1) updates the data stored in the database and the next round of aggregation storage is performed until the archiving task is complete.
5. The aggregation mode for improving the performance of a file system according to claim 4, wherein, if the metadata database of the archiving system is damaged, the database information can be recovered from the file header information of the aggregated file; because the metadata database information for all files is recorded in the header of the aggregated file, restoring the database requires reading only the header rather than the whole aggregated file, and keeping the key metadata database information together improves the read efficiency of the optical disc; when a backup file needs to be restored, a single small file in the aggregated file is restored on its own rather than reading the whole aggregated file and restoring every file in it, which avoids wasted system resources and long restoration times.
CN201810948103.9A 2018-08-21 2018-08-21 Aggregation mode for improving performance of file system Active CN109101639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810948103.9A CN109101639B (en) 2018-08-21 2018-08-21 Aggregation mode for improving performance of file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810948103.9A CN109101639B (en) 2018-08-21 2018-08-21 Aggregation mode for improving performance of file system

Publications (2)

Publication Number Publication Date
CN109101639A CN109101639A (en) 2018-12-28
CN109101639B true CN109101639B (en) 2021-03-23

Family

ID=64850240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810948103.9A Active CN109101639B (en) 2018-08-21 2018-08-21 Aggregation mode for improving performance of file system

Country Status (1)

Country Link
CN (1) CN109101639B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113704027B (en) * 2021-10-29 2022-02-18 苏州浪潮智能科技有限公司 File aggregation compatible method and device, computer equipment and storage medium
CN114116652A (en) * 2021-11-29 2022-03-01 苏州浪潮智能科技有限公司 Data aggregation storage method, system, device and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1248749A (en) * 1998-09-18 2000-03-29 英业达股份有限公司 Method for merging files
CN101478370A (en) * 2009-01-20 2009-07-08 中兴通讯股份有限公司 File compression method and apparatus based on file system
CN105630688A (en) * 2014-10-30 2016-06-01 国际商业机器公司 Aggregate file storage method and system as well as aggregate file compression method and system

Also Published As

Publication number Publication date
CN109101639A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
US11907168B2 (en) Data object store and server for a cloud storage environment, including data deduplication and data management across multiple cloud storage sites
US11016858B2 (en) Systems and methods for managing single instancing data
CN101937377B (en) Data recovery method and device
US11593217B2 (en) Systems and methods for managing single instancing data
US7925623B2 (en) Method and apparatus for integrating primary data storage with local and remote data protection
CN110109778B (en) Large-amount small data file backup method and recovery method
US9773059B2 (en) Tape data management
CN104462563A (en) File storage method and system
WO2017009828A1 (en) A system and method for mainframe computers backup and restore
CN109101639B (en) Aggregation mode for improving performance of file system
US10359964B2 (en) Reducing time to read many files from tape
CN105100716A (en) Safe memory cell used for network video monitoring and system thereof
US10802719B2 (en) Method and system for data compression and data storage optimization
CN112800019A (en) Data backup method and system based on Hadoop distributed file system
CN109766218A (en) Data back up method based on distributed storage
CN102340544B (en) Method and device for downloading upgrade file packet
CN109753381A (en) A kind of continuous data protection method based on object storage
CN103970869A (en) Large file storage method
US11409604B1 (en) Storage optimization of pre-allocated units of storage
US9122405B1 (en) Fast initialization of storage device
CN102543108A (en) Video redundancy strategy optimization method based on distributed storage
CN105573677A (en) Implementation method of efficient storage
CN101577143A (en) Method, device and system for storing files
JP2008310889A (en) Recording and reproducing device
CN109660611B (en) Data storage method for cloud backup and data cloud backup method for storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant