CN107291915A - Small file storage method, small file reading method, and system - Google Patents
Small file storage method, small file reading method, and system
- Publication number
- CN107291915A (application CN201710501667.3A)
- Authority
- CN
- China
- Prior art keywords
- small files
- stored
- metadata
- large file
- small
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a small file storage method, a small file reading method, and corresponding systems. The storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database. The metadata includes the name, size, type, check value, and timestamp of the small file, its offset within the large file, and user-defined metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
Description
Technical field
The present invention relates to the technical field of data storage, and more particularly to a small file storage method, a small file reading method, and corresponding systems.
Background
The rapid development of the Internet has produced massive numbers of files such as pictures and documents. These files are small (typically under 100 KB) and extremely numerous (hundreds of millions), and traditional POSIX-interface file systems struggle to meet the demand. This is known in the industry as the massive small files problem. The common industry practice is merged storage: many small files are merged into one traditional POSIX file, as in Facebook's Haystack, LinkedIn's Ambry, and Taobao's TFS. The merged storage schemes of these systems are all similar: the server keeps only part of the metadata, the remaining metadata is encoded into the file ID and kept by the client, and an index is built over the small files in the large file to improve read performance. The metadata kept by the server has a fixed size and type and is stored in the large file together with the raw data.
This storage approach has the following defects: (1) User-defined metadata is not supported; supporting a new type of metadata requires changing either the organization of the small files or the file ID. (2) When existing metadata is modified, the size of the new value is strictly limited, because a larger value would overwrite other valid data. (3) Some operations only need to read metadata, yet they must still locate the offset of the small file in the large file before reading can start. It would be desirable to cache this metadata in memory, but under the existing architecture that requires scanning the entire large file from the beginning, which is inefficient. The main cause of these defects is that small files are stored on disk in merged form: strict boundaries are needed to separate the small files, and once data is written the boundaries can no longer be changed. The format of a small file within the large file is therefore strictly constrained, and metadata cannot be freely modified, added, or deleted.
Summary of the invention
The object of the present invention is to provide a small file storage method, a small file reading method, and corresponding systems, in order to solve the following problem of the prior art: because small files are stored on disk in merged form, strict boundaries are needed to separate them and cannot be changed once written, so the format of a small file within the large file is strictly constrained and metadata cannot be freely modified, added, or deleted.
To achieve the above object, the present invention provides the following technical solutions:
A small file storage method, including:
presetting a metadata database in the logical storage unit of a large file;
storing the data content of a small file to be stored in the large file, the data content of each small file being appended in turn to the end of the large file;
storing the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the offset of the small file within the large file, and user-defined metadata.
The metadata database is the key-value database RocksDB.
Preferably, the method further includes:
redundantly storing part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored in the same large file. The partial metadata includes: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
A small file storage system, including:
a presetting unit, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit, configured to store the data content of a small file to be stored in the large file, the data content of each small file being appended in turn to the end of the large file;
a second storage unit, configured to store the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the path of the large file, the offset of the small file within the large file, and user-defined metadata.
The metadata database is the key-value database RocksDB.
Preferably, the first storage unit is further configured to redundantly store part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored in the same large file. The partial metadata includes: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
A small file reading method, including:
obtaining the file name of a small file;
looking up, by the file name of the small file, the metadata of the small file in the metadata database that stores it, the metadata including: the size of the small file, the path of the large file that holds the small file, and the offset of the small file within the large file;
reading, according to the metadata of the small file, the data content of the small file from the large file that stores it;
returning the metadata and the data content of the small file to the user.
A small file reading system, including:
an obtaining unit, configured to obtain the file name of a small file;
a lookup unit, configured to look up, by the file name of the small file, the metadata of the small file in the metadata database that stores it, the metadata including: the size of the small file, the path of the large file that holds the small file, and the offset of the small file within the large file;
a reading unit, configured to read, according to the metadata of the small file, the data content of the small file from the large file that stores it;
a feedback unit, configured to return the metadata and the data content of the small file to the user.
As the above technical solutions show, compared with the prior art, the invention discloses a small file storage method, a small file reading method, and corresponding systems. The storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database. The metadata includes the name, size, type, check value, and timestamp of the small file, its offset within the large file, and user-defined metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a small file storage method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a file system composed of multiple large files provided by an embodiment of the present invention;
Fig. 3 is a list of the metadata of a small file stored in the metadata database, provided by an embodiment of the present invention;
Fig. 4 is a list of the partial metadata of a small file stored in the large file, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a small file storage system provided by an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a small file reading method provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a small file reading system provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the invention.
Merged storage of massive small files involves both writing and reading small files. The storage scheme is described below from each of these two aspects.
Referring to Fig. 1, which is a schematic flowchart of a merged storage method for massive small files provided by an embodiment of the present invention: as shown in Fig. 1, the invention discloses a merged storage method for massive small files that includes the following steps:
S101: preset a metadata database in the logical storage unit of a large file.
Referring to Fig. 2, which is a schematic diagram of a file system composed of multiple large files provided by an embodiment of the present invention: as shown in Fig. 2, a metadata database must be preset in each logical storage unit that contains a POSIX large file. This metadata database is the key-value database RocksDB.
The implementation of the invention is not restricted to any programming language or platform. In practice, Go on the Linux platform can be used as the implementation language.
S102: store the data content of the small file to be stored in the large file, the data content of each small file being appended in turn to the end of the large file.
S103: store the metadata of the small file to be stored in the metadata database.
Specifically, as shown in Fig. 3, the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the offset of the small file within the large file, and user-defined metadata.
The metadata database is the key-value database RocksDB.
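As a non-limiting illustration of the metadata record listed above, the following Go sketch models the fields as a struct and stores each record under the file name as its key. The names `SmallFileMeta`, `MetaStore`, and `ErrNotFound`, the JSON encoding, and the field types are assumptions made for illustration only. The in-memory map merely stands in for the key-value metadata database (RocksDB in the text) and does not reproduce its API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// SmallFileMeta mirrors the metadata fields named in the text.
// Field names, types, and the JSON encoding are illustrative assumptions.
type SmallFileMeta struct {
	Name      string            `json:"name"`
	Size      int64             `json:"size"`
	Type      string            `json:"type"`
	Checksum  uint32            `json:"checksum"`
	Timestamp int64             `json:"timestamp"`
	Offset    int64             `json:"offset"` // position inside the large file
	Custom    map[string]string `json:"custom,omitempty"`
}

// ErrNotFound is returned when no record exists for a file name.
var ErrNotFound = errors.New("metadata not found")

// MetaStore is a hypothetical in-memory stand-in for the key-value database:
// key = file name, value = encoded metadata record.
type MetaStore map[string][]byte

// Put encodes the record and stores it under the file name.
func (s MetaStore) Put(m SmallFileMeta) error {
	v, err := json.Marshal(m)
	if err != nil {
		return err
	}
	s[m.Name] = v
	return nil
}

// Get looks up a record by file name and decodes it.
func (s MetaStore) Get(name string) (SmallFileMeta, error) {
	var m SmallFileMeta
	v, ok := s[name]
	if !ok {
		return m, ErrNotFound
	}
	err := json.Unmarshal(v, &m)
	return m, err
}

func main() {
	store := MetaStore{}
	err := store.Put(SmallFileMeta{
		Name:   "cat.jpg",
		Size:   2048,
		Type:   "image/jpeg",
		Offset: 4096,
		Custom: map[string]string{"owner": "alice"},
	})
	if err != nil {
		panic(err)
	}
	m, err := store.Get("cat.jpg")
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Offset, m.Custom["owner"])
}
```

Because the record is an ordinary value in a key-value store, a new user-defined field can be added without touching the layout of the large file.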
In addition, part of the metadata of the small file to be stored can be stored redundantly in the large file. Specifically, as shown in Fig. 4, this partial metadata and the data content of the small file are stored in the same large file. The partial metadata includes: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
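One possible on-disk form of this redundant partial metadata is a small header written immediately before the data content. The Go sketch below is only an illustration: the byte layout, the field widths, the big-endian encoding, the CRC-32 check value, and the names `EncodeRecord`, `DecodeHeader`, and `RecordHeader` are all assumptions, since the text lists the fields but does not specify their encoding.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// Assumed layout: offset(8) nameLen(2) metaSize(4) checksum(4) dataStart(8) name data.
const fixedHeaderLen = 8 + 2 + 4 + 4 + 8

// RecordHeader holds the partial metadata kept redundantly in the large file.
type RecordHeader struct {
	Offset    uint64 // where the record begins in the large file
	Name      string // file name of the small file
	MetaSize  uint32 // space occupied by the header, including the name
	Checksum  uint32 // CRC-32 of the data content (assumed check value)
	DataStart uint64 // where the data content begins in the large file
}

// EncodeRecord builds header plus data for a record placed at the given offset.
func EncodeRecord(offset uint64, name string, data []byte) ([]byte, error) {
	if len(name) > 0xFFFF {
		return nil, errors.New("file name too long for 16-bit length field")
	}
	metaSize := fixedHeaderLen + len(name)
	buf := make([]byte, metaSize+len(data))
	binary.BigEndian.PutUint64(buf[0:8], offset)
	binary.BigEndian.PutUint16(buf[8:10], uint16(len(name)))
	binary.BigEndian.PutUint32(buf[10:14], uint32(metaSize))
	binary.BigEndian.PutUint32(buf[14:18], crc32.ChecksumIEEE(data))
	binary.BigEndian.PutUint64(buf[18:26], offset+uint64(metaSize))
	copy(buf[fixedHeaderLen:], name)
	copy(buf[metaSize:], data)
	return buf, nil
}

// DecodeHeader parses the header at the start of an encoded record.
func DecodeHeader(buf []byte) (RecordHeader, error) {
	if len(buf) < fixedHeaderLen {
		return RecordHeader{}, errors.New("record too short")
	}
	nameLen := int(binary.BigEndian.Uint16(buf[8:10]))
	if len(buf) < fixedHeaderLen+nameLen {
		return RecordHeader{}, errors.New("truncated file name")
	}
	return RecordHeader{
		Offset:    binary.BigEndian.Uint64(buf[0:8]),
		Name:      string(buf[fixedHeaderLen : fixedHeaderLen+nameLen]),
		MetaSize:  binary.BigEndian.Uint32(buf[10:14]),
		Checksum:  binary.BigEndian.Uint32(buf[14:18]),
		DataStart: binary.BigEndian.Uint64(buf[18:26]),
	}, nil
}

func main() {
	rec, err := EncodeRecord(100, "a.txt", []byte("hello"))
	if err != nil {
		panic(err)
	}
	h, err := DecodeHeader(rec)
	if err != nil {
		panic(err)
	}
	// The check value lets a reader verify the payload that follows the header.
	ok := crc32.ChecksumIEEE(rec[h.MetaSize:]) == h.Checksum
	fmt.Println(h.Name, h.MetaSize, h.DataStart, ok)
}
```

A redundant header of this kind would allow the metadata database to be rebuilt by walking the large file if the database were lost; that use is an inference, not something the text states.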
The invention discloses a small file storage method that includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database. The metadata includes the name, size, type, check value, and timestamp of the small file, its offset within the large file, and user-defined metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
On the basis of the method disclosed above, the present invention also discloses a corresponding system.
The small file storage system provided by an embodiment of the present invention is introduced below. For details of this system, refer to the small file storage method described above; they are not repeated here.
Referring to Fig. 5, which is a schematic structural diagram of a small file storage system provided by an embodiment of the present invention: as shown in Fig. 5, the invention discloses a small file storage system that includes:
a presetting unit 501, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit 502, configured to store the data content of a small file to be stored in the large file, the data content of each small file being appended in turn to the end of the large file;
a second storage unit 503, configured to store the metadata of the small file to be stored in the metadata database.
The metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the path of the large file, the offset of the small file within the large file, and user-defined metadata.
The first storage unit 502 can also redundantly store part of the metadata of the small file to be stored in the large file, so that this partial metadata and the data content of the small file are stored in the same large file. The partial metadata includes: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
The invention discloses a small file storage system. By adding a metadata database to the logical storage unit of the large file, the system stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
This embodiment provides a small file storage method and system. First, a small file being written is appended to the end of the large file, and its size and its start position in the large file (that is, its offset) are recorded. Second, the metadata of the small file, such as the write timestamp and the file type, is written to the metadata database together with the offset obtained from the append.
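The write path just described can be sketched in Go as follows. This is a minimal illustration under stated assumptions: `Volume`, `Meta`, `Put`, and `newTempVolume` are hypothetical names, an in-memory map stands in for the metadata database, CRC-32 is an assumed check value, and the sketch is not safe for concurrent writers.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"io"
	"os"
	"time"
)

// Meta holds the metadata recorded for each small file (illustrative fields).
type Meta struct {
	Size      int64
	Type      string
	Checksum  uint32
	Timestamp time.Time
	Offset    int64 // start position of the content inside the large file
}

// Volume pairs one large file with its metadata database.
// The map is a stand-in for the key-value database named in the text.
type Volume struct {
	large  *os.File
	metaDB map[string]Meta
}

// newTempVolume creates a volume backed by a temporary large file.
func newTempVolume() (*Volume, func(), error) {
	f, err := os.CreateTemp("", "largefile-*")
	if err != nil {
		return nil, nil, err
	}
	cleanup := func() {
		f.Close()
		os.Remove(f.Name())
	}
	return &Volume{large: f, metaDB: map[string]Meta{}}, cleanup, nil
}

// Put appends the content to the end of the large file, remembers the offset
// at which it landed, and then writes the metadata to the metadata database.
func (v *Volume) Put(name, typ string, data []byte) (Meta, error) {
	offset, err := v.large.Seek(0, io.SeekEnd) // current end of the large file
	if err != nil {
		return Meta{}, err
	}
	if _, err := v.large.Write(data); err != nil {
		return Meta{}, err
	}
	m := Meta{
		Size:      int64(len(data)),
		Type:      typ,
		Checksum:  crc32.ChecksumIEEE(data),
		Timestamp: time.Now(),
		Offset:    offset,
	}
	v.metaDB[name] = m
	return m, nil
}

func main() {
	v, cleanup, err := newTempVolume()
	if err != nil {
		panic(err)
	}
	defer cleanup()
	a, err := v.Put("a.txt", "text/plain", []byte("hello"))
	if err != nil {
		panic(err)
	}
	b, err := v.Put("b.txt", "text/plain", []byte("world!"))
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Offset, a.Size, b.Offset, b.Size)
}
```

The second file lands directly after the first, so its offset equals the size of the first file.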
In summary, the invention discloses a small file storage method and system that preset a metadata database in the logical storage unit of a large file, store the data content of a small file to be stored in the large file by appending it to the end of the large file, and store the metadata of the small file in the metadata database. The metadata includes the name, size, type, check value, and timestamp of the small file, its offset within the large file, and user-defined metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
On the basis of the small file storage method and system disclosed above, the present invention also discloses a small file reading method and system.
Referring to Fig. 6, which is a schematic flowchart of a small file reading method provided by an embodiment of the present invention: as shown in Fig. 6, the invention discloses a small file reading method that includes the following steps:
S601: obtain the file name of the small file.
S602: look up, by the file name of the small file, the metadata of the small file in the metadata database that stores it. The metadata includes: the size of the small file, the path of the large file that holds the small file, and the offset of the small file within the large file.
S603: read, according to the metadata of the small file, the data content of the small file from the large file that stores it.
S604: return the metadata and the data content of the small file to the user.
This embodiment provides a small file reading method. The metadata of the small file, including its size, the path of the large file that holds it, and its offset within that large file, is looked up in the metadata database by the name of the small file. The content of the small file is then read using this information, and finally the data content is returned to the user together with the metadata.
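The read path can be sketched in Go as a lookup followed by one positioned read. The sketch is illustrative only: `ReadSmallFile`, `Meta`, `ErrNotFound`, and the fixture helper `writeFixture` are hypothetical names, and a plain map again stands in for the metadata database.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Meta holds the three fields the read path needs, as listed in the text.
type Meta struct {
	Size          int64
	LargeFilePath string
	Offset        int64
}

// ErrNotFound is returned when the metadata database has no such file name.
var ErrNotFound = errors.New("small file not found")

// ReadSmallFile looks up the metadata by name, then reads exactly Size bytes
// at Offset from the large file. It returns both metadata and content.
func ReadSmallFile(metaDB map[string]Meta, name string) (Meta, []byte, error) {
	m, ok := metaDB[name]
	if !ok {
		return Meta{}, nil, ErrNotFound
	}
	f, err := os.Open(m.LargeFilePath)
	if err != nil {
		return Meta{}, nil, err
	}
	defer f.Close()
	buf := make([]byte, m.Size)
	if _, err := f.ReadAt(buf, m.Offset); err != nil {
		return Meta{}, nil, err
	}
	return m, buf, nil
}

// writeFixture creates a temporary large file holding two small files and
// returns a matching metadata map plus a cleanup function.
func writeFixture() (map[string]Meta, func(), error) {
	f, err := os.CreateTemp("", "largefile-*")
	if err != nil {
		return nil, nil, err
	}
	defer f.Close()
	path := f.Name()
	if _, err := f.WriteString("helloworld!"); err != nil {
		os.Remove(path)
		return nil, nil, err
	}
	db := map[string]Meta{
		"a.txt": {Size: 5, LargeFilePath: path, Offset: 0},
		"b.txt": {Size: 6, LargeFilePath: path, Offset: 5},
	}
	return db, func() { os.Remove(path) }, nil
}

func main() {
	db, cleanup, err := writeFixture()
	if err != nil {
		panic(err)
	}
	defer cleanup()
	m, data, err := ReadSmallFile(db, "b.txt")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (offset %d, size %d)\n", data, m.Offset, m.Size)
}
```

No scan of the large file is involved: the offset and size from the metadata database are enough to fetch the content directly.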
Referring to Fig. 7, which is a schematic structural diagram of a small file reading system provided by an embodiment of the present invention: as shown in Fig. 7, the invention discloses a small file reading system that includes:
an obtaining unit 701, configured to obtain the file name of the small file;
a lookup unit 702, configured to look up, by the file name of the small file, the metadata of the small file in the metadata database that stores it, the metadata including: the size of the small file, the path of the large file that holds the small file, and the offset of the small file within the large file;
a reading unit 703, configured to read, according to the metadata of the small file, the data content of the small file from the large file that stores it;
a feedback unit 704, configured to return the metadata and the data content of the small file to the user.
This embodiment provides a small file reading system. The metadata of the small file, including its size, the path of the large file that holds it, and its offset within that large file, is looked up in the metadata database by the name of the small file. The content of the small file is then read using this information, and finally the data content is returned to the user together with the metadata.
The functions of the present invention require the large file and the metadata database to work together. In this scheme, each large file is assigned one metadata database. This metadata database is a key-value database; put simply, content stored in the database can be retrieved by name. In a file system, a file actually consists of two parts. One part is the content data, for example the image data of a photo. The other part is the metadata, which describes other information about the file; for a photo file, this could be the time and place the photo was taken. In this scheme, the content data of a small file is stored in the large file and its metadata is stored in the database. That is, the database holds only the metadata of the small files. In short, the large file holds content data, and the database holds the metadata and index information of the small files.
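A consequence of this split is that metadata can change freely while the large file stays untouched, and an in-memory cache can be built from the database alone. The Go sketch below illustrates this under stated assumptions: `Volume`, `Meta`, `SetCustom`, `DeleteCustom`, and `BuildCache` are hypothetical names, and a byte slice and a map stand in for the large file and the key-value database.

```go
package main

import (
	"fmt"
	"sort"
)

// Meta holds two of the metadata fields from the text plus user-defined entries.
type Meta struct {
	Size   int64
	Offset int64
	Custom map[string]string
}

// Volume pairs a large file with its metadata database (in-memory stand-ins).
type Volume struct {
	Large  []byte          // content data only
	MetaDB map[string]Meta // metadata and index information only
}

// SetCustom adds or changes a user-defined entry. The new value may be any
// length, because nothing in the large file is rewritten.
func (v *Volume) SetCustom(name, key, value string) bool {
	m, ok := v.MetaDB[name]
	if !ok {
		return false
	}
	if m.Custom == nil {
		m.Custom = map[string]string{}
	}
	m.Custom[key] = value
	v.MetaDB[name] = m
	return true
}

// DeleteCustom removes a user-defined entry from the metadata only.
func (v *Volume) DeleteCustom(name, key string) bool {
	m, ok := v.MetaDB[name]
	if !ok {
		return false
	}
	delete(m.Custom, key)
	return true
}

// BuildCache loads all metadata into memory by iterating the database.
// The large file is never read.
func (v *Volume) BuildCache() map[string]Meta {
	cache := make(map[string]Meta, len(v.MetaDB))
	for name, m := range v.MetaDB {
		cache[name] = m
	}
	return cache
}

func main() {
	v := &Volume{
		Large: []byte("helloworld!"),
		MetaDB: map[string]Meta{
			"a.txt": {Size: 5, Offset: 0},
			"b.txt": {Size: 6, Offset: 5},
		},
	}
	v.SetCustom("a.txt", "owner", "alice")
	v.SetCustom("a.txt", "owner", "a much longer replacement value")
	cache := v.BuildCache()
	names := make([]string, 0, len(cache))
	for name := range cache {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names, cache["a.txt"].Custom["owner"], len(v.Large))
}
```

Replacing a short value with a longer one succeeds here without any size limit, which is the behavior the merged-storage formats described in the background could not offer.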
Merged storage mainly addresses the storage needs of massive numbers of small files. Common small files include pictures and text; specifically, the scheme can be used to store pictures, user-generated (UGC) short videos, subtitle files, and the like.
In summary, the invention discloses a small file storage method, a small file reading method, and corresponding systems. The storage method includes: presetting a metadata database in the logical storage unit of a large file; storing the data content of a small file to be stored in the large file by appending it to the end of the large file; and storing the metadata of the small file in the metadata database. The metadata includes the name, size, type, check value, and timestamp of the small file, its offset within the large file, and user-defined metadata. By adding a metadata database to the logical storage unit of the large file, the invention stores the data content of a small file separately from its metadata, so that metadata can be modified, added, and deleted independently. In addition, loading the metadata into memory for caching no longer requires scanning the entire large file.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for parts that are the same or similar the embodiments may be referred to one another.
It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that an article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the article or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the article or device that includes it.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A small file storage method, characterized by comprising:
presetting a metadata database in the logical storage unit of a large file;
storing the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
storing the metadata of the small file to be stored in the metadata database.
2. The small file storage method according to claim 1, characterized in that the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the offset of the small file within the large file, and user-defined metadata.
3. The small file storage method according to claim 1, characterized in that the metadata database is the key-value database RocksDB.
4. The small file storage method according to claim 1, characterized by further comprising:
redundantly storing part of the metadata of the small file to be stored in the large file, so that the partial metadata and the data content of the small file to be stored are stored in the same large file, the partial metadata including: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
5. A small file storage system, characterized by comprising:
a presetting unit, configured to preset a metadata database in the logical storage unit of a large file;
a first storage unit, configured to store the data content of a small file to be stored in the large file, the data content of each small file to be stored being appended in turn to the end of the large file;
a second storage unit, configured to store the metadata of the small file to be stored in the metadata database.
6. The small file storage system according to claim 5, characterized in that the metadata includes: the file name of the small file, the size of the small file, the type of the small file, the check value of the small file, the timestamp of the small file, the path of the large file, the offset of the small file within the large file, and user-defined metadata.
7. The small file storage system according to claim 5, characterized in that the metadata database is the key-value database RocksDB.
8. The small file storage system according to claim 5, characterized in that the first storage unit is further configured to:
redundantly store part of the metadata of the small file to be stored in the large file, so that the partial metadata and the data content of the small file to be stored are stored in the same large file, the partial metadata including: the offset of the small file within the large file, the file name of the small file and its length, the amount of space occupied by the metadata of the small file, the check value of the small file, and the start position of the data content of the small file.
9. A small file reading method, characterized by comprising:
obtaining the file name of a small file;
looking up, by the file name of the small file, the metadata of the small file in the metadata database that stores it, the metadata including: the size of the small file, the path of the large file that holds the small file, and the offset of the small file within the large file;
reading, according to the metadata of the small file, the data content of the small file from the large file that stores it;
returning the metadata and the data content of the small file to the user.
10. A small documents read system, characterized by comprising:
an acquiring unit, configured to obtain the filename of the small documents;
a searching unit, configured to search, according to the filename of the small documents, the metadatabase that stores the metadata of the small documents for the metadata information of the small documents, the metadata information including: the size of the small documents, the path of the big file in which the small documents are located, and the offset of the small documents in the big file;
a reading unit, configured to read, according to the metadata information of the small documents, the data content information of the small documents from the big file that stores the data content information of the small documents;
a feedback unit, configured to return the metadata information of the small documents and the data content information of the small documents to the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710501667.3A CN107291915A (en) | 2017-06-27 | 2017-06-27 | A kind of small documents storage method, small documents read method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710501667.3A CN107291915A (en) | 2017-06-27 | 2017-06-27 | A kind of small documents storage method, small documents read method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107291915A (en) | 2017-10-24 |
Family ID: 60098716
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710501667.3A Pending CN107291915A (en) | 2017-06-27 | 2017-06-27 | A kind of small documents storage method, small documents read method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107291915A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108287869A (en) * | 2017-12-20 | 2018-07-17 | 江苏省公用信息有限公司 | A kind of mass small documents solution based on speedy storage equipment |
CN110531929A (en) * | 2019-08-09 | 2019-12-03 | 济南浪潮数据技术有限公司 | The small documents processing method and processing device of storage system |
CN111258955A (en) * | 2018-11-30 | 2020-06-09 | 北京白山耘科技有限公司 | File reading method and system, storage medium and computer equipment |
CN112235422A (en) * | 2020-12-11 | 2021-01-15 | 浙江大华技术股份有限公司 | Data processing method and device, computer readable storage medium and electronic device |
CN113722279A (en) * | 2021-08-19 | 2021-11-30 | 北京达佳互联信息技术有限公司 | Method, device and equipment for determining size of folder and storage medium |
CN114327285A (en) * | 2021-12-30 | 2022-04-12 | 南京中孚信息技术有限公司 | Data storage method, device, equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102332027A (en) * | 2011-10-15 | 2012-01-25 | 西安交通大学 | Mass non-independent small file associated storage method based on Hadoop |
CN103577123A (en) * | 2013-11-12 | 2014-02-12 | 河海大学 | Small file optimization storage method based on HDFS |
CN103605726A (en) * | 2013-11-15 | 2014-02-26 | 中安消技术有限公司 | Method and system for accessing small files, control node and storage node |
CN104133882A (en) * | 2014-07-28 | 2014-11-05 | 四川大学 | HDFS (Hadoop Distributed File System)-based old file processing method |
CN104572670A (en) * | 2013-10-15 | 2015-04-29 | 方正国际软件(北京)有限公司 | Small file storage, query and deletion method and system |
CN104731921A (en) * | 2015-03-26 | 2015-06-24 | 江苏物联网研究发展中心 | Method for storing and processing small log type files in Hadoop distributed file system |
CN105404652A (en) * | 2015-10-29 | 2016-03-16 | 河海大学 | Mass small file processing method based on HDFS |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108287869A (en) * | 2017-12-20 | 2018-07-17 | 江苏省公用信息有限公司 | A kind of mass small documents solution based on speedy storage equipment |
CN111258955A (en) * | 2018-11-30 | 2020-06-09 | 北京白山耘科技有限公司 | File reading method and system, storage medium and computer equipment |
CN111258955B (en) * | 2018-11-30 | 2023-09-19 | 北京白山耘科技有限公司 | File reading method and system, storage medium and computer equipment |
CN110531929A (en) * | 2019-08-09 | 2019-12-03 | 济南浪潮数据技术有限公司 | The small documents processing method and processing device of storage system |
CN112235422A (en) * | 2020-12-11 | 2021-01-15 | 浙江大华技术股份有限公司 | Data processing method and device, computer readable storage medium and electronic device |
CN113722279A (en) * | 2021-08-19 | 2021-11-30 | 北京达佳互联信息技术有限公司 | Method, device and equipment for determining size of folder and storage medium |
CN113722279B (en) * | 2021-08-19 | 2024-03-01 | 北京达佳互联信息技术有限公司 | Method, device, equipment and storage medium for determining size of folder |
CN114327285A (en) * | 2021-12-30 | 2022-04-12 | 南京中孚信息技术有限公司 | Data storage method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107291915A (en) | A kind of small documents storage method, small documents read method and system | |
CN104731921B (en) | Storage and processing method of the Hadoop distributed file systems for log type small documents | |
US10248356B2 (en) | Using scratch extents to facilitate copying operations in an append-only storage system | |
US9710535B2 (en) | Object storage system with local transaction logs, a distributed namespace, and optimized support for user directories | |
US8171202B2 (en) | Asynchronous distributed object uploading for replicated content addressable storage clusters | |
CN103812939B (en) | Big data storage system | |
US8762353B2 (en) | Elimination of duplicate objects in storage clusters | |
US8046331B1 (en) | Method and apparatus for recreating placeholders | |
US10296518B2 (en) | Managing distributed deletes in a replicated storage system | |
CN102629247B (en) | Method, device and system for data processing | |
CN103282899B (en) | The storage method of data, access method and device in file system | |
JP5233233B2 (en) | Information search system, information search index registration device, information search method and program | |
US9772783B2 (en) | Constructing an index to facilitate accessing a closed extent in an append-only storage system | |
CN104133882A (en) | HDFS (Hadoop Distributed File System)-based old file processing method | |
JP2005267600A5 (en) | ||
CN104408111A (en) | Method and device for deleting duplicate data | |
CN103246700A (en) | Mass small file low latency storage method based on HBase | |
US9720607B2 (en) | Append-only storage system supporting open and closed extents | |
Zhai et al. | Hadoop perfect file: A fast and memory-efficient metadata access archive file to face small files problem in hdfs | |
US8612717B2 (en) | Storage system | |
Samundiswary et al. | Object storage architecture in cloud for unstructured data | |
Riegger et al. | Efficient data and indexing structure for blockchains in enterprise systems | |
Cheng et al. | Optimizing small file storage process of the HDFS which based on the indexing mechanism | |
Rahmanto et al. | Data preservation process in big data environment using open archival information system | |
Luan et al. | Towards effective 3D model management on hadoop |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171024 |