CN103617293A - Key-Value storage method oriented towards storage system of mass small files - Google Patents
- Publication number
- CN103617293A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/172—Caching, prefetching or hoarding of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
Abstract
The invention provides a Key-Value storage method for mass small-file storage systems. The method meets the application needs of a small-file storage system. Compared with the commonly used Redis and LevelDB, it is a lightweight, persistent Key-Value storage method for small-file storage systems with lower overhead and shorter latency. As shown in Figure 1, the storage structure comprises two parts: a HashTable structure in memory and a persistent store, Store, on disk.
Description
Technical field
The present invention relates to the field of mass small-file storage, and specifically to a Key-Value storage method for mass small-file storage systems.
Background art
In recent years, network services such as the social networking sites Facebook, Twitter, and Renren and the e-commerce sites eBay and Alibaba have developed rapidly, and all such services need to store large numbers of small files such as pictures and short texts. A small file usually means a file smaller than 64 MB, for example the pictures, e-mails, e-books, music files, microblog posts, and text content that internet applications commonly need to store.
Small-file storage has gradually attracted attention in both academia and industry. The social networking site Facebook has stored 260 billion pictures with a total capacity of more than 20 PB, and the overwhelming majority of these files are smaller than 64 MB. In supercomputing, applications on, for example, ORNL's Cray XT5 cluster (18,688 nodes, 12 processors per node) periodically write application state to files, causing the system to accumulate a large number of small files. In specific scientific computing environments, for example some biological computations, 30 million files may be produced with an average size of only 190 KB. Data in a research report published in 2007 by the U.S. Pacific Northwest National Laboratory show that the system used by the laboratory stored 12 million files, of which files smaller than 64 MB accounted for 94% of the total and files smaller than 64 KB accounted for 58%. The music website Jujing (literally "Giant Whale") has indexed 3.6 million music files in MP3 format. All of these data indicate that most data accessed on the internet are small files with high access frequency.
The retrieval information of small files is kept as Key-Value pairs, usually in a NoSQL database such as Redis or LevelDB. However, mainstream Key-Value storage engines such as LevelDB and Redis are not effective for storing this retrieval information, for the following reasons:
1. The values recorded by a small-file storage system have a fixed data layout and a uniform size. Whereas Redis is designed to suit a variety of data structures, a small-file storage system can simplify its data structures for its specific application, reducing the packing and parsing of data structures and avoiding the need to implement a communication protocol, which saves CPU overhead and reduces latency.
2. Redis is aimed mainly at applications that cache data in memory. Redis can be configured with persistence, but a small-file storage system, being built for its specific application, can support persistence better.
3. LevelDB is a single-machine persistent Key-Value storage engine. It reduces write latency by means of an in-memory MemTable and Level files on disk, but a read request may require multiple disk operations (in the worst case, six or more disk-file reads), so LevelDB is well suited to workloads with many writes and few reads. In real internet applications, however, such as photo albums on Facebook or product photos on Taobao, the access pattern is write once, read many times, so LevelDB is not suitable for storing retrieval information in a small-file storage system.
Summary of the invention
The present invention proposes KVDB, a lightweight, persistent Key-Value storage engine for small-file storage systems.
The structure of KVDB is shown in Fig. 1. KVDB comprises two parts: a HashTable structure in memory and a persistent store, Store, on disk.
The class diagram of HashTable is shown in Fig. 2. DictEntry represents one entry of the hash table in that diagram. Hash collisions are resolved by chaining (linked lists). HashTable holds the key used for retrieval and the position of the corresponding value in Store.
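The chained hash table described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names DictEntry and HashTable come from the text, while the field names, the bucket count, and the use of Python's built-in hash function are assumptions made for the example.

```python
from typing import List, Optional


class DictEntry:
    """One hash-table entry: a key, the value's offset in Store, and a chain link."""

    def __init__(self, key: str, offset: int,
                 next_entry: "Optional[DictEntry]" = None):
        self.key = key
        self.offset = offset    # position of the value inside Store
        self.next = next_entry  # next entry in the same bucket (chaining)


class HashTable:
    """In-memory index mapping a key to the position of its value in Store."""

    def __init__(self, num_buckets: int = 1024):  # bucket count is an assumption
        self.buckets: List[Optional[DictEntry]] = [None] * num_buckets

    def _bucket(self, key: str) -> int:
        # Python's built-in hash is used for brevity; any hash function would do.
        return hash(key) % len(self.buckets)

    def put(self, key: str, offset: int) -> None:
        """Insert a key, or update the offset if the key already exists."""
        i = self._bucket(key)
        entry = self.buckets[i]
        while entry is not None:
            if entry.key == key:
                entry.offset = offset
                return
            entry = entry.next
        # Key not found: prepend a new entry to this bucket's chain.
        self.buckets[i] = DictEntry(key, offset, self.buckets[i])

    def get(self, key: str) -> Optional[int]:
        """Return the stored offset for key, or None if the key is absent."""
        entry = self.buckets[self._bucket(key)]
        while entry is not None:
            if entry.key == key:
                return entry.offset
            entry = entry.next
        return None
```

Because the table stores only keys and offsets, its memory footprint does not depend on the size of the values, which stay in Store.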
Store holds the value content. Store can write a value and return the offset of that value inside Store; given a starting position, Store can read a value back. The class diagram of Store is shown in Fig. 3. In KVDB, every data structure that is to be saved must support serialization and deserialization (serialize and deserialize), so that the Store class can place it on disk for persistence. A Store object corresponds to one disk file; the class contains a handle to that file and some statistics. All data structures to be saved are written to this file, and Store provides the read and write interface functions. To support recovery after a power failure, Store also provides an interface function that loads from disk. To improve read speed, when the file holding the value content is loaded, its content is mapped into memory with the mmap function, so that accesses to the disk file backing Store can usually be served from memory.
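A sketch of such an append-only store is given below, assuming a 4-byte length prefix in front of each value; the text does not specify the on-disk framing, and the method names write, load, read, and close as well as the values_written statistic are likewise hypothetical. Python's standard mmap module stands in for the mmap function mentioned above.

```python
import mmap
import os
import struct

_LEN = struct.Struct("<I")  # assumed framing: 4-byte little-endian length prefix


class Store:
    """Append-only value file; each value is addressed by its byte offset."""

    def __init__(self, path: str):
        self.path = path
        self.file = open(path, "a+b")  # file handle; creates the file if missing
        self.map = None                # read-only mapping, set by load()
        self.values_written = 0        # example statistic (hypothetical)

    def write(self, payload: bytes) -> int:
        """Append one serialized value and return its offset inside Store."""
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(_LEN.pack(len(payload)) + payload)
        self.file.flush()
        self.values_written += 1
        return offset

    def load(self) -> None:
        """Map the file's current content into memory, e.g. after a restart."""
        if self.map is not None:
            self.map.close()
            self.map = None
        if os.path.getsize(self.path) > 0:  # mmap cannot map an empty file
            self.map = mmap.mmap(self.file.fileno(), 0, access=mmap.ACCESS_READ)

    def read(self, offset: int) -> bytes:
        """Return the value at offset, from the mapping when it covers it."""
        if self.map is not None and offset < len(self.map):
            (size,) = _LEN.unpack_from(self.map, offset)
            start = offset + _LEN.size
            return bytes(self.map[start:start + size])
        # Values appended after load() are not in the mapping; read the file.
        self.file.seek(offset)
        (size,) = _LEN.unpack(self.file.read(_LEN.size))
        return self.file.read(size)

    def close(self) -> None:
        if self.map is not None:
            self.map.close()
        self.file.close()
```

Note that flush() only hands the data to the operating system. An engine that must survive power failure would additionally need to force the data to disk, for example with os.fsync, which this sketch omits.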
The class diagram of KVDB is shown in Fig. 4. The public interface of KVDB comprises a constructor, an initialization function init, a formatted-output function format, a function put that writes a key-value pair, and a function get that reads a key-value pair.
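The interface listed above might look like the following sketch. Only the function names init, format, put, and get are taken from the text; everything else is an assumption. In particular, a plain dict stands in for the chained HashTable, each record is framed as key length, value length, key, value so that init can rebuild the index by scanning the file, and format is interpreted as returning a short human-readable summary.

```python
import os
import struct
from typing import Dict, Optional

_HDR = struct.Struct("<II")  # assumed framing: key length, value length


class KVDB:
    """Minimal persistent key-value store: in-memory index plus append-only file."""

    def __init__(self, path: str):
        self.path = path
        self.index: Dict[str, int] = {}  # stand-in for the chained HashTable
        self.file = None

    def init(self) -> None:
        """Open the store file and rebuild the index by scanning its records."""
        self.file = open(self.path, "a+b")
        self.index.clear()
        size = os.path.getsize(self.path)
        offset = 0
        self.file.seek(0)
        while offset + _HDR.size <= size:
            klen, vlen = _HDR.unpack(self.file.read(_HDR.size))
            if offset + _HDR.size + klen + vlen > size:
                break  # ignore a truncated trailing record
            key = self.file.read(klen).decode("utf-8")
            self.file.seek(vlen, os.SEEK_CUR)  # skip the value bytes
            self.index[key] = offset           # later records override earlier ones
            offset += _HDR.size + klen + vlen

    def put(self, key: str, value: bytes) -> None:
        """Append the pair to the file and record its offset in the index."""
        k = key.encode("utf-8")
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(_HDR.pack(len(k), len(value)) + k + value)
        self.file.flush()
        self.index[key] = offset

    def get(self, key: str) -> Optional[bytes]:
        """Look up the offset in memory, then read the value from the file."""
        offset = self.index.get(key)
        if offset is None:
            return None
        self.file.seek(offset)
        klen, vlen = _HDR.unpack(self.file.read(_HDR.size))
        self.file.seek(klen, os.SEEK_CUR)  # skip the key bytes
        return self.file.read(vlen)

    def format(self) -> str:
        """Return a short summary of the store (assumed meaning of 'format')."""
        return f"KVDB(path={self.path!r}, keys={len(self.index)})"

    def close(self) -> None:
        if self.file is not None:
            self.file.close()
            self.file = None
```

With this layout a get costs one in-memory lookup plus one positioned read, which matches the write-once, read-many access pattern the invention targets.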
Brief description of the drawings
Fig. 1 is a structural diagram of KVDB.
Fig. 2 is the class diagram of the HashTable in KVDB.
Fig. 3 is the class diagram of the Store in KVDB.
Fig. 4 is the class diagram of KVDB.
Detailed description
KVDB stores data structures as key-value pairs using the HashTable structure in memory and the persistent store Store on disk. For example, when used for the retrieval information RetrieveInfo, the key is the file name of the small file (full-path file name) and the value is the retrieval information RetrieveInfo.
When a small file is written, the retrieval information RetrieveInfo generated for that file is written to KVDB<RetrieveInfo>. When a small file is read, the RetrieveInfo corresponding to its file name (full-path file name) is read from KVDB<RetrieveInfo>.
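The text does not list the fields of RetrieveInfo, so the following sketch uses three hypothetical fields (block_id, offset, length) purely to illustrate a value type with a fixed layout and uniform size that supports the required serialize and deserialize operations.

```python
import struct
from dataclasses import dataclass

# Assumed fixed layout: two unsigned 64-bit integers and one unsigned 32-bit
# integer, little-endian, no padding.
_LAYOUT = struct.Struct("<QQI")
RECORD_SIZE = _LAYOUT.size  # every serialized RetrieveInfo has this length


@dataclass(frozen=True)
class RetrieveInfo:
    """Retrieval information for one small file (all fields are hypothetical)."""

    block_id: int  # which large data block holds the small file
    offset: int    # byte offset of the small file inside that block
    length: int    # size of the small file in bytes

    def serialize(self) -> bytes:
        """Pack the fields into a fixed-size byte string."""
        return _LAYOUT.pack(self.block_id, self.offset, self.length)

    @classmethod
    def deserialize(cls, raw: bytes) -> "RetrieveInfo":
        """Rebuild a RetrieveInfo from bytes produced by serialize."""
        return cls(*_LAYOUT.unpack(raw))
```

Because every record has the same size, packing and parsing reduce to a single fixed-format conversion, which is the kind of simplification the background section contrasts with general-purpose engines.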
Claims (1)
1. A Key-Value storage method for mass small-file storage systems, being a lightweight, persistent storage method that meets the specific application needs of a small-file storage system while also reducing overhead and latency.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310688270.1A CN103617293A (en) | 2013-12-16 | 2013-12-16 | Key-Value storage method oriented towards storage system of mass small files |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310688270.1A CN103617293A (en) | 2013-12-16 | 2013-12-16 | Key-Value storage method oriented towards storage system of mass small files |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103617293A true CN103617293A (en) | 2014-03-05 |
Family
ID=50167996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310688270.1A Pending CN103617293A (en) | 2013-12-16 | 2013-12-16 | Key-Value storage method oriented towards storage system of mass small files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103617293A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104657500A (en) * | 2015-03-12 | 2015-05-27 | 浪潮集团有限公司 | Distributed storage method based on KEY-VALUE pair |
CN108108247A (en) * | 2017-12-28 | 2018-06-01 | 大唐软件技术股份有限公司 | Distributed picture storage service system and method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1692356A (en) * | 2002-11-14 | 2005-11-02 | 易斯龙系统公司 | Systems and methods for restriping files in a distributed file system |
CN101551807A (en) * | 2009-05-07 | 2009-10-07 | 山东中创软件商用中间件股份有限公司 | Multilevel index technology for file database |
US20110060876A1 (en) * | 2009-09-08 | 2011-03-10 | Brocade Communications Systems, Inc. | Exact Match Lookup Scheme |
CN103279568A (en) * | 2013-06-18 | 2013-09-04 | 无锡紫光存储系统有限公司 | System and method for metadata management |
Non-Patent Citations (3)
Title |
---|
何文: "改进的key/value数据存储设计方案", 《东北电力大学学报》 * |
刘小军: "论文分享系统中的海量图片存储研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
杨乐: "基于属性与链接的海量文件组织机制研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104731921B (en) | Storage and processing method of the Hadoop distributed file systems for log type small documents | |
CN104866497B (en) | The metadata updates method, apparatus of distributed file system column storage, host | |
US9311252B2 (en) | Hierarchical storage for LSM-based NoSQL stores | |
CN101866359B (en) | Small file storage and visit method in avicade file system | |
CN105183839A (en) | Hadoop-based storage optimizing method for small file hierachical indexing | |
US9141626B2 (en) | Volume having tiers of different storage traits | |
US10019457B1 (en) | Multi-level compression for storing data in a data store | |
US20130219135A1 (en) | Dynamic time reversal of a tree of images of a virtual hard disk | |
TW201520889A (en) | Hybrid storage | |
CN102024047B (en) | Data searching method and device thereof | |
CN103020315A (en) | Method for storing mass of small files on basis of master-slave distributed file system | |
CN103559027A (en) | Design method of separate-storage type key-value storage system | |
Adya et al. | Fast key-value stores: An idea whose time has come and gone | |
JP7153420B2 (en) | Using B-Trees to Store Graph Information in a Database | |
CN103793475A (en) | Distributed file system data migration method | |
CN108509507A (en) | The account management system and its implementation of unified entrance | |
CN104158863A (en) | Cloud storage mechanism based on transaction-level whole-course high-speed buffer | |
CN103617293A (en) | Key-Value storage method oriented towards storage system of mass small files | |
Zhang et al. | FlameDB: A key-value store with grouped level structure and heterogeneous Bloom filter | |
CN107273443B (en) | Mixed indexing method based on metadata of big data model | |
CN105022822A (en) | PHP (Professional Hypertext Preprocessor) based log collection and storage method and system | |
US11586353B2 (en) | Optimized access to high-speed storage device | |
Ma et al. | Efficient attribute-based data access in astronomy analysis | |
Hongyu et al. | PCRAM-based data management method for storage and computation integration | |
Ha et al. | Ink: In-kernel key-value storage with persistent memory |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20140305 |