CN103617293A - Key-Value storage method oriented towards storage system of mass small files - Google Patents

Publication number
CN103617293A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310688270.1A
Other languages
Chinese (zh)
Inventor
王雷
王振
王平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201310688270.1A
Publication of CN103617293A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/172: Caching, prefetching or hoarding of files
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638: Organizing or formatting or addressing of data

Abstract

The invention provides a Key-Value storage method for mass small-file storage systems. The method meets the application needs of a small-file storage system. Compared with the commonly used Redis and LevelDB, it is a lightweight, persistent Key-Value storage method for small-file storage systems with lower overhead and lower latency. The storage structure comprises two parts, as shown in Fig. 1: a HashTable structure in memory and a persistent storage, Store, on disk.

Description

A Key-Value storage method for mass small-file storage systems
Technical field
The present invention relates to the field of mass small-file storage, and specifically to a Key-Value storage method for mass small-file storage systems.
Background art
In recent years, network services such as the social networking sites Facebook, Twitter and Renren and the e-commerce sites eBay and Alibaba have developed rapidly, and all of these services need to store large numbers of small files such as pictures and short texts. A small file usually means a file smaller than 64 MB, such as the pictures, e-mails, e-books, music files, microblog posts and text content that internet applications often need to store.
Small-file storage has gradually drawn attention in both academia and industry. The social networking site Facebook has stored 260 billion pictures with a total capacity of more than 20 PB, and the overwhelming majority of these files are smaller than 64 MB. In supercomputing, for example, applications on ORNL's Cray XT5 cluster (18,688 nodes, 12 processors per node) periodically write application state to files, so the system accumulates a large number of small files. In specific scientific computing environments, for example some biological computations, 30 million files may be produced with an average size of only 190 KB. Data in a research report published by the U.S. Pacific Northwest National Laboratory in 2007 show that the system used by the laboratory stored 12 million files, of which files smaller than 64 MB accounted for 94% and files smaller than 64 KB accounted for 58%. The music website Jujing ("Giant Whale") holds 3.6 million music files in MP3 format. All of these figures indicate that most of the data accessed on the internet consists of frequently accessed small files.
The retrieval information of small files is kept as Key-Value pairs, usually in a NoSQL database such as Redis or LevelDB. However, mainstream Key-Value storage engines such as LevelDB and Redis perform poorly when used to store this retrieval information, for the following reasons:
1. The value records of a small-file storage system have a fixed format and a uniform size. Whereas Redis is designed to support a variety of data structures, a small-file storage system can simplify the data structure for its specific application, which reduces the packing and parsing of data structures and avoids implementing a communication protocol, thereby saving CPU overhead and reducing latency.
2. Redis is mainly intended for applications that cache data in memory. Redis can be configured for persistence, but a store designed for the specific application of a small-file storage system can support persistence better.
3. LevelDB is a standalone persistent Key-Value storage engine. It reduces write latency by means of the MemTable in memory and the Level files on disk, but a read request may require several disk operations (in the worst case, more than 6 reads of disk files), so LevelDB is best suited to workloads with many writes and few reads. In real internet applications, such as photo albums on Facebook or product photos on Taobao, the access pattern is write once, read many times, so LevelDB is not well suited to storing retrieval information in a small-file storage system either.
Summary of the invention
The present invention proposes KVDB, a lightweight, persistent Key-Value storage engine for small-file storage systems.
The structure of KVDB is shown in Fig. 1. KVDB comprises two parts: a HashTable structure in memory and a persistent storage, Store, on disk.
The class diagram of HashTable is shown in Fig. 2. DictEntry represents one entry of the hash table. Hash collisions are resolved by chaining (linked lists). HashTable stores each key used for retrieval together with the position of the corresponding value in Store.
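The HashTable described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: only the names HashTable and DictEntry come from the text, while the field names, the bucket count and the use of Python's built-in hash() are assumptions made for illustration.

```python
class DictEntry:
    """One hash-table entry: a key, the value's offset in Store, and a chain link."""

    __slots__ = ("key", "offset", "next")

    def __init__(self, key, offset, next_entry=None):
        self.key = key
        self.offset = offset      # position of the value inside Store
        self.next = next_entry    # next entry in the same bucket (chaining)


class HashTable:
    """In-memory index from key to Store offset, using separate chaining."""

    def __init__(self, num_buckets=1024):  # bucket count is an assumed default
        self.buckets = [None] * num_buckets
        self.count = 0

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, offset):
        i = self._index(key)
        entry = self.buckets[i]
        while entry is not None:           # walk the chain looking for the key
            if entry.key == key:
                entry.offset = offset      # key exists: update its offset
                return
            entry = entry.next
        # Key not found: insert a new entry at the head of the chain.
        self.buckets[i] = DictEntry(key, offset, self.buckets[i])
        self.count += 1

    def get(self, key):
        entry = self.buckets[self._index(key)]
        while entry is not None:
            if entry.key == key:
                return entry.offset
            entry = entry.next
        return None                        # key is not in the table
```

Because the table only holds keys and offsets, its memory footprint depends on the number of files rather than on the size of the stored values.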
Store holds the value contents. Store can write one value and return the offset of that value inside Store; given a starting position, Store can read one value back. The class diagram of Store is shown in Fig. 3. In KVDB, every data structure to be stored must support serialization and deserialization (serialize and deserialize), so that the Store class can place the data structure on disk for persistence. A Store instance corresponds to one disk file: the class contains a handle to that file and some statistics, all data structures to be stored are written to that file, and Store provides the read and write interface functions. To support recovery after a power failure, Store also provides an interface function that loads from disk. To improve read speed, when Store loads the file that holds the value contents, it maps the file contents into memory with the mmap function. As a result, accesses to the disk file behind Store can usually be served from memory.
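A rough Python sketch of such a Store follows. The append-only layout, the 4-byte length prefix and the method names write, read, load and close are assumptions; the text only states that Store returns an offset on write, reads from a given position, can be loaded from disk, and maps the file with mmap.

```python
import mmap
import os
import struct

_LEN = struct.Struct("<I")  # assumed 4-byte length prefix in front of each value


class Store:
    """Append-only value file; write() returns an offset that read() accepts."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()            # create the file if it is missing
        self.file = open(path, "r+b")       # handle to the disk file
        self.size = os.path.getsize(path)   # simple statistic: bytes stored
        self.map = None

    def load(self):
        """Map the existing file contents into memory, e.g. after a restart."""
        self.size = os.path.getsize(self.path)
        if self.size > 0:                   # an empty file cannot be mapped
            self.map = mmap.mmap(self.file.fileno(), 0, access=mmap.ACCESS_READ)

    def write(self, value: bytes) -> int:
        """Append one serialized value and return its starting offset."""
        offset = self.size
        self.file.seek(offset)
        self.file.write(_LEN.pack(len(value)) + value)
        self.file.flush()
        self.size += _LEN.size + len(value)
        return offset

    def read(self, offset: int) -> bytes:
        """Return the value whose record starts at `offset`."""
        if self.map is not None and offset < len(self.map):
            # Record was on disk at load time: serve it from the mapping.
            (n,) = _LEN.unpack_from(self.map, offset)
            start = offset + _LEN.size
            return bytes(self.map[start:start + n])
        # Record was appended after load(): fall back to a file read.
        self.file.seek(offset)
        (n,) = _LEN.unpack(self.file.read(_LEN.size))
        return self.file.read(n)

    def close(self):
        if self.map is not None:
            self.map.close()
        self.file.close()
```

In this sketch the mapping is taken once in load(), so values appended afterwards are read through the file handle; a fuller implementation would remap as the file grows.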
The class diagram of KVDB is shown in Fig. 4. The public interface of KVDB comprises a constructor, the initialization function init, the formatted-output function format, the put function, which writes a key-value pair, and the get function, which reads a key-value pair.
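The interface listed above could look roughly like the following Python sketch. Only the names KVDB, init, format, put and get come from the text. Everything else is an assumption: a built-in dict stands in for the HashTable, each record stores the key next to the value so that init can rebuild the index after a restart, and format returns a text listing of the index.

```python
import os
import struct

_HEADER = struct.Struct("<HI")  # assumed record header: key length, value length


class KVDB:
    def __init__(self, path):
        self.path = path
        self.index = {}    # key -> record offset (stand-in for the HashTable)
        self.file = None
        self.end = 0       # offset at which the next record is appended

    def init(self):
        """Open the data file and rebuild the in-memory index by scanning it."""
        open(self.path, "ab").close()       # create the file if it is missing
        self.file = open(self.path, "r+b")
        self.index.clear()
        size = os.path.getsize(self.path)
        offset = 0
        while offset + _HEADER.size <= size:
            self.file.seek(offset)
            klen, vlen = _HEADER.unpack(self.file.read(_HEADER.size))
            record_end = offset + _HEADER.size + klen + vlen
            if record_end > size:
                break                       # incomplete trailing record
            key = self.file.read(klen).decode("utf-8")
            self.index[key] = offset        # a later record replaces an earlier one
            offset = record_end
        self.end = offset

    def put(self, key: str, value: bytes) -> None:
        """Append the pair to disk, then point the index at the new record."""
        kb = key.encode("utf-8")
        self.file.seek(self.end)
        self.file.write(_HEADER.pack(len(kb), len(value)) + kb + value)
        self.file.flush()
        self.index[key] = self.end
        self.end += _HEADER.size + len(kb) + len(value)

    def get(self, key: str):
        """Return the stored value, or None if the key is unknown."""
        offset = self.index.get(key)
        if offset is None:
            return None
        self.file.seek(offset)
        klen, vlen = _HEADER.unpack(self.file.read(_HEADER.size))
        self.file.seek(offset + _HEADER.size + klen)
        return self.file.read(vlen)

    def format(self) -> str:
        """Return a human-readable listing of the index."""
        return "\n".join(f"{k} -> offset {o}" for k, o in sorted(self.index.items()))

    def close(self):
        self.file.close()
```

A lookup costs one in-memory index probe plus at most one file read, which matches the write-once, read-many pattern the background section describes.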
Brief description of the drawings
Fig. 1 is the structural diagram of KVDB
Fig. 2 is the class diagram of HashTable in KVDB
Fig. 3 is the class diagram of Store in KVDB
Fig. 4 is the class diagram of KVDB
Embodiment
KVDB performs key-value storage of a data structure by means of the HashTable structure in memory and the persistent storage Store on disk. For example, when KVDB is used for the retrieval information RetrieveInfo, the key is the file name of the small file (its full path name) and the value is the retrieval information RetrieveInfo.
When a small file is written, the retrieval information RetrieveInfo generated for that file is written to KVDB<RetrieveInfo>. When a small file is read, the RetrieveInfo corresponding to the file name (full path name) of that file is read from KVDB<RetrieveInfo>.
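As an illustration of a fixed-size, serializable value type such as RetrieveInfo, the Python sketch below packs a record into exactly 20 bytes. The patent does not list the fields of RetrieveInfo, so volume_id, offset and size are hypothetical, as is the byte layout.

```python
import struct
from dataclasses import dataclass

# Hypothetical fixed layout: two unsigned 64-bit fields and one unsigned
# 32-bit field, little-endian and unpadded, 20 bytes in total.
_FORMAT = struct.Struct("<QQI")


@dataclass(frozen=True)
class RetrieveInfo:
    volume_id: int   # hypothetical: which large data file holds the small file
    offset: int      # hypothetical: where the small file starts in that data file
    size: int        # hypothetical: length of the small file in bytes

    def serialize(self) -> bytes:
        """Pack the record into its fixed-size on-disk form."""
        return _FORMAT.pack(self.volume_id, self.offset, self.size)

    @classmethod
    def deserialize(cls, data: bytes) -> "RetrieveInfo":
        """Rebuild a record from its fixed-size on-disk form."""
        return cls(*_FORMAT.unpack(data))


# Usage: the key is the full path name, the value is the serialized record.
key = "/albums/2013/photo_0001.jpg"
value = RetrieveInfo(volume_id=3, offset=4096, size=190_000).serialize()
```

Because every value has the same size, no per-record parsing beyond a single unpack call is needed, which is the kind of saving the background section attributes to a fixed value format.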

Claims (1)

1. A Key-Value storage method for mass small-file storage systems, the method being a lightweight, persistent storage method that meets the specific application needs of a small-file storage system while reducing overhead and latency.
CN201310688270.1A 2013-12-16 2013-12-16 Key-Value storage method oriented towards storage system of mass small files Pending CN103617293A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310688270.1A CN103617293A (en) 2013-12-16 2013-12-16 Key-Value storage method oriented towards storage system of mass small files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310688270.1A CN103617293A (en) 2013-12-16 2013-12-16 Key-Value storage method oriented towards storage system of mass small files

Publications (1)

Publication Number Publication Date
CN103617293A 2014-03-05

Family

ID=50167996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310688270.1A Pending CN103617293A (en) 2013-12-16 2013-12-16 Key-Value storage method oriented towards storage system of mass small files

Country Status (1)

Country Link
CN (1) CN103617293A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104657500A (en) * 2015-03-12 2015-05-27 浪潮集团有限公司 Distributed storage method based on KEY-VALUE pair
CN108108247A (en) * 2017-12-28 2018-06-01 大唐软件技术股份有限公司 Distributed picture storage service system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1692356A (en) * 2002-11-14 2005-11-02 易斯龙系统公司 Systems and methods for restriping files in a distributed file system
CN101551807A (en) * 2009-05-07 2009-10-07 山东中创软件商用中间件股份有限公司 Multilevel index technology for file database
US20110060876A1 (en) * 2009-09-08 2011-03-10 Brocade Communications Systems, Inc. Exact Match Lookup Scheme
CN103279568A (en) * 2013-06-18 2013-09-04 无锡紫光存储系统有限公司 System and method for metadata management

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
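何文: "Improved key/value data storage design scheme", 《东北电力大学学报》 (Journal of Northeast Electric Power University) *
刘小军: "Research on mass picture storage in a paper-sharing system", 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *
杨乐: "Research and implementation of an attribute- and link-based organization mechanism for massive files", 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *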

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140305
