CN107291915A - Small file storage method, small file reading method, and system - Google Patents

Small file storage method, small file reading method, and system

Publication number
CN107291915A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710501667.3A
Other languages
Chinese (zh)
Inventor
李杰辉
牛立国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201710501667.3A
Publication of CN107291915A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/172: Caching, prefetching or hoarding of files
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a small file storage method, a small file reading method, and corresponding systems. The small file storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file, the data content being appended to the end of the large file; and storing the metadata of the small file to be stored in the metadata database, the metadata including the name, size, type, checksum, and timestamp of the small file, the offset of the small file in the large file, and custom metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.

Description

Small file storage method, small file reading method, and system
Technical field
The present invention relates to the technical field of data storage, and more particularly to a small file storage method, a small file reading method, and a system.
Background
The rapid development of the Internet has produced massive numbers of files such as pictures and documents. These files are small (typically under 100 KB) and extremely numerous (hundreds of millions), and traditional POSIX-interface file systems can no longer meet the demand. This is the well-known massive small file problem. The common industry solution is merged storage, that is, merging small files into a traditional POSIX file, as in Facebook's Haystack, LinkedIn's Ambry, and Taobao's TFS. These systems merge files in similar ways: the server keeps only part of the metadata, and the remaining metadata is encoded into the file ID and handed to the client to keep, while an index is built over the small files in the large file to improve read performance. The size and type of the metadata kept by the server are fixed, and this metadata is stored in the large file together with the original data.
The above storage approach has the following defects: (1) Custom metadata is not supported; supporting a new type of metadata requires changing the organization of the small files or the file ID. (2) When existing metadata is modified, the size of the new value is strictly limited; otherwise it overwrites other valid data. (3) Some operations need only the metadata, yet reading can start only after locating the small file's offset in the large file. It would be desirable to cache this metadata in memory, but under the existing architecture doing so requires scanning the entire large file from the beginning, which is inefficient. The main cause of these defects is that small files are stored on disk in merged form and strict boundaries are needed to distinguish one small file from the next. Once written, a boundary cannot be changed, so the format of a small file within the large file is strictly constrained, and metadata cannot be freely modified, added, or deleted.
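Defect (2) can be illustrated with a deliberately simplified layout. The sketch below assumes a hypothetical merged format in which each record is a fixed 8-byte metadata slot followed directly by the file's content. This is not the actual on-disk format of Haystack, Ambry, or TFS, and `SLOT`, `pack_record`, and `overwrite_meta_in_place` are illustrative names only.

```python
# Hypothetical merged layout: each record is a fixed 8-byte metadata slot
# followed immediately by the file's content bytes.
SLOT = 8  # assumed fixed width of the inline metadata field


def pack_record(meta: bytes, content: bytes) -> bytes:
    """Pad the metadata to SLOT bytes and place the content right after it."""
    if len(meta) > SLOT:
        raise ValueError("metadata does not fit in its fixed slot")
    return meta.ljust(SLOT, b"\x00") + content


def overwrite_meta_in_place(blob: bytearray, record_start: int, new_meta: bytes) -> None:
    """Naive in-place update that performs no length check."""
    blob[record_start:record_start + len(new_meta)] = new_meta


blob = bytearray(pack_record(b"jpg", b"CONTENT-A") + pack_record(b"png", b"CONTENT-B"))
overwrite_meta_in_place(blob, 0, b"image/jpeg")  # 10 bytes written into an 8-byte slot
corrupted = bytes(blob[SLOT:SLOT + 9])           # first two content bytes were overwritten
```

Because the boundary between metadata and content is fixed at write time, a longer value has nowhere to go except into the neighbouring bytes, which is the constraint the invention aims to remove.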
Summary of the invention
The purpose of the present invention is to provide a small file storage method, a small file reading method, and a system that solve the following problem in the prior art: because small files are stored on disk in merged form, strict boundaries are needed to distinguish them, and a boundary cannot be modified once written; the format of a small file within the large file is therefore strictly constrained, and metadata cannot be freely modified, added, or deleted.
To achieve the above purpose, the present invention provides the following technical solutions:
A small file storage method, including:
presetting a metadata database in the logical storage unit of a large file;
storing the data content of a small file to be stored in the large file, the data content of each small file being appended in turn to the end of the large file;
storing the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the offset of the small file in the large file, and custom metadata.
The metadata database is the key-value database RocksDB.
Preferably, the method further includes:
redundantly storing part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored together in one large file. The partial metadata includes: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
A small file storage system, including:
a presetting unit, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit, configured to store the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
a second storage unit, configured to store the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the path of the large file, the offset of the small file in the large file, and custom metadata.
The metadata database is the key-value database RocksDB.
Preferably, the first storage unit is further configured to redundantly store part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored together in one large file. The partial metadata includes: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
A small file reading method, including:
obtaining the file name of the small file;
looking up, according to the file name of the small file, the metadata information of the small file in the metadata database that stores the metadata of the small file, the metadata information including: the size of the small file, the path of the large file in which the small file resides, and the offset of the small file in the large file;
reading, according to the metadata information of the small file, the data content of the small file from the large file that stores it;
returning the metadata information of the small file and the data content of the small file to the user.
A small file reading system, including:
an acquiring unit, configured to obtain the file name of the small file;
a lookup unit, configured to look up, according to the file name of the small file, the metadata information of the small file in the metadata database that stores the metadata of the small file, the metadata information including: the size of the small file, the path of the large file in which the small file resides, and the offset of the small file in the large file;
a reading unit, configured to read, according to the metadata information of the small file, the data content of the small file from the large file that stores it;
a feedback unit, configured to return the metadata information of the small file and the data content of the small file to the user.
As can be seen from the above technical solutions, compared with the prior art, the invention discloses a small file storage method, a small file reading method, and a system. The small file storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database, the metadata including the name, size, type, checksum, and timestamp of the small file, its offset in the large file, and custom metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching does not require scanning the entire large file.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a small file storage method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a file system composed of multiple large files, provided by an embodiment of the present invention;
Fig. 3 is a list of the metadata of a small file stored in the metadata database, provided by an embodiment of the present invention;
Fig. 4 is a list of the partial metadata of a small file stored in the large file, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a small file storage system provided by an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a small file reading method provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a small file reading system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Merged storage of massive small files involves both writing small files and reading them. The storage approach is described below from each of these two perspectives.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a merged storage method for massive small files provided by an embodiment of the present invention. As shown in Fig. 1, the invention discloses a merged storage method for massive small files, which specifically includes the following steps:
S101: preset a metadata database in the logical storage unit of a large file;
Referring to Fig. 2, Fig. 2 is a schematic diagram of a file system composed of multiple large files, provided by an embodiment of the present invention. As shown in Fig. 2, a metadata database must be preset in each logical storage unit that contains a POSIX large file; the metadata database is the key-value database RocksDB.
The implementation of the present invention is not restricted to any programming language or platform; in practice, Go can be used as the implementation language on the Linux platform.
S102: store the data content of the small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
S103: store the metadata of the small file to be stored in the metadata database.
Specifically, as shown in Fig. 3, the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the offset of the small file in the large file, and custom metadata.
The metadata database is the key-value database RocksDB.
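Steps S102 and S103 can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the patent: a plain Python `dict` stands in for the RocksDB metadata database, metadata is serialized as JSON, CRC-32 is used as the checksum, and there is a single writer per large file. The function name `store_small_file` and the field names are hypothetical.

```python
import json
import os
import tempfile
import time
import zlib


def store_small_file(large_file_path, metadata_db, name, content, file_type, custom=None):
    """Append content to the large file and record its metadata separately."""
    # S102: append the content at the current end of the large file.
    with open(large_file_path, "ab") as large_file:
        large_file.seek(0, os.SEEK_END)
        offset = large_file.tell()
        large_file.write(content)
    # S103: store the metadata under the file name in the key-value store.
    record = {
        "size": len(content),
        "type": file_type,
        "checksum": zlib.crc32(content),  # assumed checksum algorithm
        "timestamp": time.time(),
        "offset": offset,
        "custom": custom or {},
    }
    metadata_db[name] = json.dumps(record)
    return record


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "volume_0.dat")
    db = {}  # stand-in for the per-unit metadata database
    first = store_small_file(path, db, "a.jpg", b"AAAA", "image/jpeg")
    second = store_small_file(path, db, "b.txt", b"BB", "text/plain", {"owner": "demo"})
    offsets = (first["offset"], second["offset"])
```

The large file only ever grows by appending, while every descriptive field lives in the key-value store, so the two can evolve independently.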
In addition, part of the metadata of the small file to be stored can be redundantly stored in the large file. Specifically, as shown in Fig. 4, the partial metadata of the small file to be stored and the data content of the small file are stored together in one large file. The partial metadata includes: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
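One way to picture the redundant copy is as a small header written in front of each file's content. The patent lists the fields but not their widths or order, so the binary layout below (little-endian, with the fixed fields first and the file name after them) is purely an assumption, as are the names `FIXED`, `pack_record`, and `unpack_header`. In this sketch "offset" is taken to mean the start of the record, and CRC-32 again stands in for the unspecified checksum.

```python
import struct
import zlib

# Assumed fixed-width header fields, little-endian:
# record offset (u64), name length (u16), metadata size (u32),
# checksum (u32), start position of the content (u64).
FIXED = struct.Struct("<QHIIQ")


def pack_record(offset: int, name: str, content: bytes) -> bytes:
    """Build header + file name + content for a record starting at `offset`."""
    name_bytes = name.encode("utf-8")
    meta_size = FIXED.size + len(name_bytes)
    data_start = offset + meta_size
    header = FIXED.pack(offset, len(name_bytes), meta_size,
                        zlib.crc32(content), data_start)
    return header + name_bytes + content


def unpack_header(blob: bytes, offset: int) -> dict:
    """Parse the redundant metadata of the record that starts at `offset`."""
    rec_offset, name_len, meta_size, checksum, data_start = FIXED.unpack_from(blob, offset)
    name_start = offset + FIXED.size
    name = blob[name_start:name_start + name_len].decode("utf-8")
    return {"offset": rec_offset, "name": name, "meta_size": meta_size,
            "checksum": checksum, "data_start": data_start}


record = pack_record(0, "a.jpg", b"AAAA")
header = unpack_header(record, 0)
content = record[header["data_start"]:header["data_start"] + 4]
```

Only fields that never change after the write are placed in the header here; everything mutable stays in the metadata database.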
The invention discloses a small file storage method, which includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database, the metadata including the name, size, type, checksum, and timestamp of the small file, its offset in the large file, and custom metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching does not require scanning the entire large file.
On the basis of the method disclosed above, the present invention also discloses a corresponding system.
The small file storage system provided by an embodiment of the present invention is introduced below. For details of this system, refer to the small file storage method described above; they are not repeated below.
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a small file storage system provided by an embodiment of the present invention. As shown in Fig. 5, the invention discloses a small file storage system, which specifically includes:
a presetting unit 501, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit 502, configured to store the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
a second storage unit 503, configured to store the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the path of the large file, the offset of the small file in the large file, and custom metadata.
The first storage unit 502 can also redundantly store part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored together in one large file. The partial metadata includes: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
The invention discloses a small file storage system. By adding a metadata database to the logical storage unit of the large file, the system stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching does not require scanning the entire large file.
This embodiment provides a small file storage method and system. The small file being written is appended to the end of the large file, and its size and its start position in the large file (that is, its offset) are recorded. The metadata of the small file, such as the write timestamp and file type, is then written to the metadata database together with that offset.
To sum up, the invention discloses a small file storage method and system: a metadata database is preset in the logical storage unit of a large file; the data content of a small file to be stored is stored in the large file by appending it to the end of the large file; and the metadata of the small file is stored in the metadata database, the metadata including the name, size, type, checksum, and timestamp of the small file, its offset in the large file, and custom metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching does not require scanning the entire large file.
On the basis of the small file storage method and system disclosed above, the present invention also discloses a small file reading method and system.
Referring to Fig. 6, Fig. 6 is a schematic flowchart of a small file reading method provided by an embodiment of the present invention. As shown in Fig. 6, the invention discloses a small file reading method, which specifically includes the following steps:
S601: obtain the file name of the small file;
S602: look up, according to the file name of the small file, the metadata information of the small file in the metadata database that stores the metadata of the small file, the metadata information including: the size of the small file, the path of the large file in which the small file resides, and the offset of the small file in the large file;
S603: read, according to the metadata information of the small file, the data content of the small file from the large file that stores it;
S604: return the metadata information of the small file and the data content of the small file to the user.
This embodiment provides a small file reading method. The metadata information of the small file, including its size, the path of the large file in which it resides, and its offset in that large file, is first looked up in the metadata database by the name of the small file. The content of the small file is then read according to this information, and finally the data content is returned to the user together with the metadata information of the small file.
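The read path in S601 to S604 can be sketched in the same spirit. As an assumption for illustration, a `dict` of JSON strings stands in for the metadata database, and the key names `size`, `large_file`, and `offset` are hypothetical labels for the three fields the method requires. An unknown file name raises `KeyError` in this sketch.

```python
import json
import os
import tempfile


def read_small_file(metadata_db: dict, name: str):
    """Return (metadata, content) for the small file called `name`."""
    # S602: look up the metadata by file name.
    meta = json.loads(metadata_db[name])
    # S603: open the large file named in the metadata and read `size` bytes at `offset`.
    with open(meta["large_file"], "rb") as large_file:
        large_file.seek(meta["offset"])
        content = large_file.read(meta["size"])
    if len(content) != meta["size"]:
        raise IOError("large file is shorter than the metadata indicates")
    # S604: hand both the metadata and the content back to the caller.
    return meta, content


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "volume_0.dat")
    with open(path, "wb") as f:
        f.write(b"AAAABB")  # two small files stored back to back
    db = {
        "a.jpg": json.dumps({"size": 4, "large_file": path, "offset": 0}),
        "b.txt": json.dumps({"size": 2, "large_file": path, "offset": 4}),
    }
    meta, content = read_small_file(db, "b.txt")
```

A single key lookup followed by one positioned read is all that is needed; no other record in the large file is touched.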
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of a small file reading system provided by an embodiment of the present invention. As shown in Fig. 7, the invention discloses a small file reading system, which specifically includes:
an acquiring unit 701, configured to obtain the file name of the small file;
a lookup unit 702, configured to look up, according to the file name of the small file, the metadata information of the small file in the metadata database that stores the metadata of the small file, the metadata information including: the size of the small file, the path of the large file in which the small file resides, and the offset of the small file in the large file;
a reading unit 703, configured to read, according to the metadata information of the small file, the data content of the small file from the large file that stores it;
a feedback unit 704, configured to return the metadata information of the small file and the data content of the small file to the user.
This embodiment provides a small file reading system. The metadata information of the small file, including its size, the path of the large file in which it resides, and its offset in that large file, is first looked up in the metadata database by the name of the small file. The content of the small file is then read according to this information, and finally the data content is returned to the user together with the metadata information of the small file.
The functions of the present invention require the large file and the metadata database to work together. In this scheme, each large file is assigned one metadata database. This metadata database is a key-value database; put simply, content stored in the database can be retrieved by name. In a file system, a file actually consists of two parts. One part is the content data, for example the content of a photo. The other part is the metadata, which describes other information about the file; for a photo file, this includes when and where the photo was taken. In this scheme, the content data of small files is stored in the large file and the metadata is stored in the database. That is, the database holds only the metadata of small files. In short, the large file holds the content data, and the database holds the metadata and index information of the small files.
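Because the metadata lives in a key-value store rather than inside the large file, custom fields can be added, changed to values of any length, or removed without touching the content data, and a memory cache can be filled by iterating the store alone. The sketch below illustrates this with a `dict` of JSON strings standing in for the metadata database; the helper names `set_custom`, `delete_custom`, and `load_cache` and the photo fields are assumptions made for the example.

```python
import json


def set_custom(metadata_db: dict, name: str, key: str, value) -> None:
    """Add or modify one custom metadata field of a small file."""
    record = json.loads(metadata_db[name])
    record.setdefault("custom", {})[key] = value
    metadata_db[name] = json.dumps(record)


def delete_custom(metadata_db: dict, name: str, key: str) -> None:
    """Remove one custom metadata field if it exists."""
    record = json.loads(metadata_db[name])
    record.get("custom", {}).pop(key, None)
    metadata_db[name] = json.dumps(record)


def load_cache(metadata_db: dict) -> dict:
    """Build an in-memory cache from the database only, without reading the large file."""
    return {name: json.loads(raw) for name, raw in metadata_db.items()}


db = {"photo.jpg": json.dumps({"size": 4, "offset": 0, "type": "image/jpeg"})}
set_custom(db, "photo.jpg", "location", "Beijing")
set_custom(db, "photo.jpg", "location", "a much longer value than before")
set_custom(db, "photo.jpg", "taken_at", "2017-06-27")
delete_custom(db, "photo.jpg", "taken_at")
cache = load_cache(db)
```

The longer replacement value causes no problem here, in contrast to an in-place update inside a merged file, because the key-value store manages its own space.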
Merged storage is mainly used to meet the storage needs of massive small files. Common small files include pictures and text; specifically, the scheme can be used to store pictures, some UGC short videos, subtitle files, and the like.
In summary, the invention discloses a small file storage method, a small file reading method, and a system. The small file storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database, the metadata including the name, size, type, checksum, and timestamp of the small file, its offset in the large file, and custom metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching does not require scanning the entire large file.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for the parts that are the same or similar, the embodiments can be referred to one another.
It should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that an article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the article or device that includes that element.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A small file storage method, characterized by including:
presetting a metadata database in the logical storage unit of a large file;
storing the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
storing the metadata of the small file to be stored in the metadata database.
2. The small file storage method according to claim 1, characterized in that the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the offset of the small file in the large file, and custom metadata.
3. The small file storage method according to claim 1, characterized in that the metadata database is the key-value database RocksDB.
4. The small file storage method according to claim 1, characterized by further including:
redundantly storing part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file to be stored are stored together in one large file, the partial metadata of the small file to be stored including: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
5. A small file storage system, characterized by including:
a presetting unit, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit, configured to store the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
a second storage unit, configured to store the metadata of the small file to be stored in the metadata database.
6. The small file storage system according to claim 5, characterized in that the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the checksum of the small file, the timestamp of the small file, the path of the large file, the offset of the small file in the large file, and custom metadata.
7. The small file storage system according to claim 5, characterized in that the metadata database is the key-value database RocksDB.
8. The small file storage system according to claim 5, characterized in that the first storage unit is further configured to:
redundantly store part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file to be stored are stored together in one large file, the partial metadata of the small file to be stored including: the offset of the small file in the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the checksum of the small file, and the start position of the data content of the small file.
9. A small file reading method, characterized by comprising:
Obtaining the filename of the small file;
Looking up the metadata information of the small file, according to its filename, in the metadata database that stores the metadata of the small file, the metadata information comprising: the size of the small file, the path of the large file in which the small file is located, and the offset of the small file in the large file;
Reading the data content of the small file, according to its metadata information, from the large file that stores the data content of the small file;
Returning the metadata information and the data content of the small file to the user.
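The steps of claim 9 can be sketched as a single function. As before, a `dict` stands in for the metadata database, and the function name `read_small_file` and the key names `path`, `offset`, and `size` are assumptions for this example.

```python
def read_small_file(name: str, meta_db: dict):
    """Look up metadata by filename, then read the data from the large file."""
    # Step 2: look up the metadata information by filename.
    meta = meta_db.get(name)
    if meta is None:
        raise KeyError(name)
    # Step 3: seek to the recorded offset in the large file and read `size` bytes.
    with open(meta["path"], "rb") as f:
        f.seek(meta["offset"])
        data = f.read(meta["size"])
    if len(data) != meta["size"]:
        raise IOError("large file is shorter than the metadata indicates")
    # Step 4: return both the metadata information and the data content.
    return meta, data
```

A read therefore costs one metadata lookup plus one seek and one sequential read, regardless of how many small files share the large file.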
10. A small file reading system, characterized by comprising:
An acquiring unit, configured to obtain the filename of the small file;
A lookup unit, configured to look up the metadata information of the small file, according to its filename, in the metadata database that stores the metadata of the small file, the metadata information comprising: the size of the small file, the path of the large file in which the small file is located, and the offset of the small file in the large file;
A reading unit, configured to read the data content of the small file, according to its metadata information, from the large file that stores the data content of the small file;
A feedback unit, configured to return the metadata information and the data content of the small file to the user.
CN201710501667.3A 2017-06-27 2017-06-27 A kind of small documents storage method, small documents read method and system Pending CN107291915A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710501667.3A CN107291915A (en) 2017-06-27 2017-06-27 A kind of small documents storage method, small documents read method and system

Publications (1)

Publication Number Publication Date
CN107291915A true CN107291915A (en) 2017-10-24

Family

ID=60098716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710501667.3A Pending CN107291915A (en) 2017-06-27 2017-06-27 A kind of small documents storage method, small documents read method and system

Country Status (1)

Country Link
CN (1) CN107291915A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102332027A (en) * 2011-10-15 2012-01-25 西安交通大学 Mass non-independent small file associated storage method based on Hadoop
CN104572670A (en) * 2013-10-15 2015-04-29 方正国际软件(北京)有限公司 Small file storage, query and deletion method and system
CN103577123A (en) * 2013-11-12 2014-02-12 河海大学 Small file optimization storage method based on HDFS
CN103605726A (en) * 2013-11-15 2014-02-26 中安消技术有限公司 Method and system for accessing small files, control node and storage node
CN104133882A (en) * 2014-07-28 2014-11-05 四川大学 HDFS (Hadoop Distributed File System)-based old file processing method
CN104731921A (en) * 2015-03-26 2015-06-24 江苏物联网研究发展中心 Method for storing and processing small log type files in Hadoop distributed file system
CN105404652A (en) * 2015-10-29 2016-03-16 河海大学 Mass small file processing method based on HDFS

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108287869A (en) * 2017-12-20 2018-07-17 江苏省公用信息有限公司 A kind of mass small documents solution based on speedy storage equipment
CN111258955A (en) * 2018-11-30 2020-06-09 北京白山耘科技有限公司 File reading method and system, storage medium and computer equipment
CN111258955B (en) * 2018-11-30 2023-09-19 北京白山耘科技有限公司 File reading method and system, storage medium and computer equipment
CN110531929A (en) * 2019-08-09 2019-12-03 济南浪潮数据技术有限公司 The small documents processing method and processing device of storage system
CN112235422A (en) * 2020-12-11 2021-01-15 浙江大华技术股份有限公司 Data processing method and device, computer readable storage medium and electronic device
CN113722279A (en) * 2021-08-19 2021-11-30 北京达佳互联信息技术有限公司 Method, device and equipment for determining size of folder and storage medium
CN113722279B (en) * 2021-08-19 2024-03-01 北京达佳互联信息技术有限公司 Method, device, equipment and storage medium for determining size of folder
CN114327285A (en) * 2021-12-30 2022-04-12 南京中孚信息技术有限公司 Data storage method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107291915A (en) A kind of small documents storage method, small documents read method and system
CN104731921B (en) Storage and processing method of the Hadoop distributed file systems for log type small documents
US10248356B2 (en) Using scratch extents to facilitate copying operations in an append-only storage system
US9710535B2 (en) Object storage system with local transaction logs, a distributed namespace, and optimized support for user directories
US8171202B2 (en) Asynchronous distributed object uploading for replicated content addressable storage clusters
CN103812939B (en) Big data storage system
US8762353B2 (en) Elimination of duplicate objects in storage clusters
US8046331B1 (en) Method and apparatus for recreating placeholders
US10296518B2 (en) Managing distributed deletes in a replicated storage system
CN102629247B (en) Method, device and system for data processing
CN103282899B (en) The storage method of data, access method and device in file system
JP5233233B2 (en) Information search system, information search index registration device, information search method and program
US9772783B2 (en) Constructing an index to facilitate accessing a closed extent in an append-only storage system
CN104133882A (en) HDFS (Hadoop Distributed File System)-based old file processing method
JP2005267600A5 (en)
CN104408111A (en) Method and device for deleting duplicate data
CN103246700A (en) Mass small file low latency storage method based on HBase
US9720607B2 (en) Append-only storage system supporting open and closed extents
Zhai et al. Hadoop perfect file: A fast and memory-efficient metadata access archive file to face small files problem in hdfs
US8612717B2 (en) Storage system
Samundiswary et al. Object storage architecture in cloud for unstructured data
Riegger et al. Efficient data and indexing structure for blockchains in enterprise systems
Cheng et al. Optimizing small file storage process of the HDFS which based on the indexing mechanism
Rahmanto et al. Data preservation process in big data environment using open archival information system
Luan et al. Towards effective 3D model management on hadoop

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171024