CN109101640A - A kind of distribution scheme of object data in file system - Google Patents

A distribution scheme for object data in a file system

Info

Publication number
CN109101640A
Authority
CN
China
Prior art keywords
catalogue
deep
file system
data
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810951038.5A
Other languages
Chinese (zh)
Inventor
傅金地
黄键明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Saifun Information Technology (xiamen) Co Ltd
Original Assignee
Saifun Information Technology (xiamen) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-08-21
Filing date
2018-08-21
Publication date
2018-12-28

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1004 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distribution scheme for object data in a file system. The directory (location) in which an object is stored is computed from the MD5 value of the object name. Because MD5 values are well dispersed, object data is spread reasonably across the directories of the file system, so that large numbers of objects do not accumulate in a single directory and degrade the performance of the file system itself. With the invention, object data is evenly dispersed into different subdirectories and is stored in the file system in a reasonable, orderly distribution, which improves lookup speed and thereby working efficiency and convenience.

Description

A distribution scheme for object data in a file system
Technical field
The present invention relates to the field of object data, and specifically to a distribution scheme for object data in a file system.
Background art
An object-based storage system (Object-Based Storage System) combines the advantages of NAS and SAN: it offers both the high-speed direct access of SAN and the data sharing of NAS, providing a data-sharing storage architecture with high reliability, platform independence, and security. An MD5 checksum checks the correctness of data by performing a hash operation on the received transmitted data. Object data can be written into an object storage system, but because object storage has no concept of directories, when a large amount of object data is placed into a file system, individual directories of the file system end up holding an enormous number of files, which severely degrades the performance of the entire storage system. The distribution of data in the file system therefore needs to be laid out reasonably, so that object data is evenly dispersed into different subdirectories.
Summary of the invention
The purpose of the present invention is to provide a distribution scheme for object data in a file system, so as to solve the problem raised in the background art above.
To achieve the above purpose, the invention provides the following technical solution:
A distribution scheme for object data in a file system, with the following specific steps:
S1. Data is written to the object storage system. When data is written, the object storage system computes the MD5 check value of the object name. Assume the object name is cyphy-objecter-test-1 and its MD5 check value is 0c6bccf9d390407ae92e02ed7b1286a4;
S2. The MD5 check value is split into m substrings of n characters each. Assuming n is 2, the substrings are, for example: 0c, 6b, cc, f9, d3, 90, 40, 7a, e9, 2e, 02, ed, 7b, 12, 86, a4. A 2-character substring of the MD5 check value then has x possible combinations;
S3. The directory depth is planned according to the characteristics and capacity of the file system. Assume that a directory holding M files still has only a small impact on performance, that the total capacity of the file system is cap, and that the average size of a single file is fileSize. The ideal number of directories is then dirs = cap/fileSize/M. If each directory has x subdirectories, the required directory depth is dirs^(1/x), i.e. the x-th root of dirs, which is the directory depth deep. If deep is greater than m, the value of m is used as the value of deep. The deep value is computed at system initialization and, once determined, is never changed;
S4. Assume deep = 5. The 1st through deep-th substrings of the MD5 check value form a path, for example: d2/30/66/e1/bc/. If the directory does not exist, it is created level by level, and the object data is stored under it. The object is therefore located at: d2/30/66/e1/bc/cyphy-objecter-test-1;
S5. When the user accesses the data again, steps S1, S2 and S4 are performed again to obtain the location of the object data, d2/30/66/e1/bc/cyphy-objecter-test-1, which guarantees that the object data can be accessed again;
S6. When objects are listed, the directory prefix of each object is removed and the true object name, cyphy-objecter-test-1, is returned to the user.
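The path derivation in steps S1, S2, S4 and S6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `split_md5`, `object_path` and `strip_prefix` are hypothetical, the object name is assumed to be hashed as UTF-8 bytes, and the actual digest of a given name depends on that encoding, so no particular digest is asserted here.

```python
import hashlib


def split_md5(object_name: str, n: int = 2) -> list[str]:
    """S1 + S2: hash the object name, then cut the hex digest into n-character substrings."""
    # Assumption: the name is encoded as UTF-8 before hashing.
    digest = hashlib.md5(object_name.encode("utf-8")).hexdigest()
    return [digest[i:i + n] for i in range(0, len(digest), n)]


def object_path(object_name: str, deep: int = 5, n: int = 2) -> str:
    """S4: join the first `deep` substrings into a directory prefix and append the name."""
    parts = split_md5(object_name, n)
    deep = min(deep, len(parts))  # deep may not exceed m, the number of substrings
    return "/".join(parts[:deep] + [object_name])


def strip_prefix(path: str, deep: int = 5) -> str:
    """S6: drop the `deep` directory levels and return the true object name."""
    return path.split("/", deep)[deep]


path = object_path("cyphy-objecter-test-1")
print(path)                # five 2-character directories followed by the object name
print(strip_prefix(path))  # the original object name
```

Because the path is a pure function of the object name and the fixed depth, repeating the computation on a later access (step S5) yields the same location without any lookup table.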
As a further solution of the present invention: in step S1, it is assumed that the computed hash value is identical to the hash value transmitted with the data.
As a further solution of the present invention: in step S3, x is generally required to be less than M when planning.
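The depth planning of step S3 can be sketched as below. The function names are hypothetical, rounding up to a whole number of levels is an assumption (the text does not say how a fractional result is handled), and x = 256 assumes n = 2 hexadecimal characters per substring. `depth_as_stated` evaluates the expression dirs^(1/x) exactly as the text gives it. `depth_by_capacity` is not from the text: it is an alternative reading, shown only for comparison, that picks the smallest depth whose x^deep leaf directories cover dirs.

```python
import math


def ideal_dir_count(cap: int, file_size: int, M: int) -> int:
    """dirs = cap / fileSize / M, rounded up (rounding is an assumption)."""
    return math.ceil(cap / file_size / M)


def depth_as_stated(dirs: int, x: int, m: int) -> int:
    """Literal reading of S3: deep = dirs^(1/x), rounded up and capped at m."""
    return min(math.ceil(dirs ** (1.0 / x)), m)


def depth_by_capacity(dirs: int, x: int, m: int) -> int:
    """Alternative reading (assumption): smallest deep with x**deep >= dirs, capped at m."""
    deep, leaves = 1, x
    while leaves < dirs and deep < m:
        leaves *= x
        deep += 1
    return deep


# Hypothetical sizing: 100 TiB capacity, 1 MiB average file, M = 1000 files per directory.
dirs = ideal_dir_count(cap=100 * 2**40, file_size=2**20, M=1000)
print(dirs, depth_as_stated(dirs, x=256, m=16), depth_by_capacity(dirs, x=256, m=16))
```

Either way, the result is computed once at initialization and then kept fixed, since changing deep later would change the path of every stored object.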
Compared with the prior art, the beneficial effects of the present invention are as follows: the directory (location) in which an object is stored is computed from the MD5 value of the object name, and because MD5 values are well dispersed, object data is spread reasonably across the directories of the file system. Large numbers of objects therefore do not accumulate in a single directory and degrade the performance of the file system itself. With the invention, object data is evenly dispersed into different subdirectories and stored in the file system in a reasonable, orderly distribution, which improves lookup speed and thereby working efficiency and convenience.
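The dispersion claim can be checked informally with a small experiment. The sketch below is an illustration only: the helper names are hypothetical, the synthetic object names are made up for the test, and UTF-8 encoding of names is assumed. It counts how many of 100,000 names fall into each first-level directory when n = 2.

```python
import hashlib
from collections import Counter


def first_level_dir(object_name: str, n: int = 2) -> str:
    """Return the first n hex characters of the MD5 of the name (the top-level directory)."""
    return hashlib.md5(object_name.encode("utf-8")).hexdigest()[:n]


def bucket_counts(names) -> Counter:
    """Count how many object names land in each first-level directory."""
    return Counter(first_level_dir(name) for name in names)


# Synthetic names that differ only in a trailing counter.
names = [f"cyphy-objecter-test-{i}" for i in range(100_000)]
counts = bucket_counts(names)
print(len(counts), min(counts.values()), max(counts.values()))
```

With two hexadecimal characters there are at most 256 first-level directories, so an even spread would put roughly 390 of the 100,000 names in each; the minimum and maximum printed above show how far the actual counts deviate from that.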
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In an embodiment of the present invention, a distribution scheme for object data in a file system has the following specific steps:
S1. Data is written to the object storage system. When data is written, the object storage system computes the MD5 check value of the object name; it is assumed that the computed hash value is identical to the hash value transmitted with the data. Assume the object name is cyphy-objecter-test-1 and its MD5 check value is 0c6bccf9d390407ae92e02ed7b1286a4;
S2. The MD5 check value is split into m substrings of n characters each. Assuming n is 2, the substrings are, for example: 0c, 6b, cc, f9, d3, 90, 40, 7a, e9, 2e, 02, ed, 7b, 12, 86, a4. A 2-character substring of the MD5 check value then has x possible combinations;
S3. The directory depth is planned according to the characteristics and capacity of the file system. Assume that a directory holding M files still has only a small impact on performance, with x generally required to be less than M when planning, that the total capacity of the file system is cap, and that the average size of a single file is fileSize. The ideal number of directories is then dirs = cap/fileSize/M. If each directory has x subdirectories, the required directory depth is dirs^(1/x), i.e. the x-th root of dirs, which is the directory depth deep. If deep is greater than m, the value of m is used as the value of deep. The deep value is computed at system initialization and, once determined, is never changed;
S4. Assume deep = 5. The 1st through deep-th substrings of the MD5 check value form a path, for example: d2/30/66/e1/bc/. If the directory does not exist, it is created level by level, and the object data is stored under it. The object is therefore located at: d2/30/66/e1/bc/cyphy-objecter-test-1;
S5. When the user accesses the data again, steps S1, S2 and S4 are performed again to obtain the location of the object data, d2/30/66/e1/bc/cyphy-objecter-test-1, which guarantees that the object data can be accessed again;
S6. When objects are listed, the directory prefix of each object is removed and the true object name, cyphy-objecter-test-1, is returned to the user. With the invention, object data is evenly dispersed into different subdirectories and stored in the file system in a reasonable, orderly distribution, which improves lookup speed and thereby working efficiency and convenience.
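As an illustration of how the embodiment might map onto an ordinary file system, the sketch below stores, reads back and lists objects under a root directory. It is a minimal sketch, not the applicant's implementation: `put_object`, `get_object`, `list_objects` and the constant `DEEP` are hypothetical names, names are assumed to be UTF-8 and to contain no path separator, and DEEP = 5 is simply the value used in the example of step S4.

```python
import hashlib
import tempfile
from pathlib import Path

DEEP = 5  # fixed once at initialization (S3); 5 is the value assumed in S4


def _dir_for(root: Path, name: str, deep: int = DEEP, n: int = 2) -> Path:
    """S1, S2, S4: derive the directory of an object from the MD5 of its name."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    parts = [digest[i:i + n] for i in range(0, len(digest), n)][:deep]
    return root.joinpath(*parts)


def put_object(root: Path, name: str, data: bytes) -> Path:
    """S4: create any missing directory levels, then store the object under them."""
    target_dir = _dir_for(root, name)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / name
    path.write_bytes(data)
    return path


def get_object(root: Path, name: str) -> bytes:
    """S5: recompute the same path from the name and read the object back."""
    return (_dir_for(root, name) / name).read_bytes()


def list_objects(root: Path) -> list[str]:
    """S6: walk the tree and return only the object names, without directory prefixes."""
    return sorted(p.name for p in root.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    stored = put_object(root, "cyphy-objecter-test-1", b"payload")
    print(stored.relative_to(root))
    print(get_object(root, "cyphy-objecter-test-1"))
    print(list_objects(root))
```

No index is needed to find an object again, because the location is recomputed from the name; the trade-off is that listing requires walking the whole directory tree.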
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments should therefore be regarded in every respect as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the above description, and all variations that fall within the meaning and scope of equivalents of the claims are intended to be included in the invention.
In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way merely for clarity; those skilled in the art should consider the specification as a whole, and the technical solutions in the various embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

Claims (3)

1. A distribution scheme for object data in a file system, characterized in that its specific steps are as follows:
S1. Data is written to the object storage system. When data is written, the object storage system computes the MD5 check value of the object name. Assume the object name is cyphy-objecter-test-1 and its MD5 check value is 0c6bccf9d390407ae92e02ed7b1286a4;
S2. The MD5 check value is split into m substrings of n characters each. Assuming n is 2, the substrings are, for example: 0c, 6b, cc, f9, d3, 90, 40, 7a, e9, 2e, 02, ed, 7b, 12, 86, a4. A 2-character substring of the MD5 check value then has x possible combinations;
S3. The directory depth is planned according to the characteristics and capacity of the file system. Assume that a directory holding M files still has only a small impact on performance, that the total capacity of the file system is cap, and that the average size of a single file is fileSize. The ideal number of directories is then dirs = cap/fileSize/M. If each directory has x subdirectories, the required directory depth is dirs^(1/x), i.e. the x-th root of dirs, which is the directory depth deep. If deep is greater than m, the value of m is used as the value of deep. The deep value is computed at system initialization and, once determined, is never changed;
S4. Assume deep = 5. The 1st through deep-th substrings of the MD5 check value form a path, for example: d2/30/66/e1/bc/. If the directory does not exist, it is created level by level, and the object data is stored under it. The object is therefore located at: d2/30/66/e1/bc/cyphy-objecter-test-1;
S5. When the user accesses the data again, steps S1, S2 and S4 are performed again to obtain the location of the object data, d2/30/66/e1/bc/cyphy-objecter-test-1, which guarantees that the object data can be accessed again;
S6. When objects are listed, the directory prefix of each object is removed and the true object name, cyphy-objecter-test-1, is returned to the user, which improves lookup speed and thereby working efficiency and convenience.
2. The distribution scheme for object data in a file system according to claim 1, characterized in that in step S1 it is assumed that the computed hash value is identical to the hash value transmitted with the data.
3. The distribution scheme for object data in a file system according to claim 1, characterized in that in step S3, x is generally required to be less than M when planning.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810951038.5A CN109101640A (en) 2018-08-21 2018-08-21 A kind of distribution scheme of object data in file system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810951038.5A CN109101640A (en) 2018-08-21 2018-08-21 A kind of distribution scheme of object data in file system

Publications (1)

Publication Number Publication Date
CN109101640A true CN109101640A (en) 2018-12-28

Family

ID=64850372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810951038.5A Pending CN109101640A (en) 2018-08-21 2018-08-21 A kind of distribution scheme of object data in file system

Country Status (1)

Country Link
CN (1) CN109101640A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0528194A (en) * 1991-07-23 1993-02-05 Fuji Xerox Co Ltd Data access system
FR2805691A1 (en) * 2000-02-29 2001-08-31 Safelogic Controlling integrity of data files of client station, encrypting hashing values of file references and registering in local directory
CN1773497A (en) * 2004-11-12 2006-05-17 国际商业机器公司 Method and system for rearranging files in a computing system
CN1973288A (en) * 2004-06-24 2007-05-30 西姆毕恩软件有限公司 File management in a computing device
CN102819599A (en) * 2012-08-15 2012-12-12 华数传媒网络有限公司 Method for constructing hierarchical catalogue based on consistent hashing data distribution
CN103902632A (en) * 2012-12-31 2014-07-02 华为技术有限公司 File system building method and device in key-value storage system, and electronic device
CN105279258A (en) * 2015-10-21 2016-01-27 Tcl集团股份有限公司 File storage method and system with even distribution function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181228
