CN106570113A - Cloud storage method and system for mass vector slice data - Google Patents



Publication number
CN106570113A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610939884.6A
Other languages
Chinese (zh)
Other versions
CN106570113B (en
Inventor
马潇
王景朝
费香泽
王宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Anhui Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
China Electric Power Research Institute Co Ltd CEPRI
State Grid Anhui Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC, China Electric Power Research Institute Co Ltd CEPRI, State Grid Anhui Electric Power Co Ltd
Priority to CN201610939884.6A
Publication of CN106570113A
Application granted
Publication of CN106570113B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G06F16/134 Distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

The invention discloses a cloud storage method for mass vector slice data. The method comprises the following steps of: establishing a distributed file system directory tree file; establishing all the metadata nodes corresponding to a distributed file system directory tree file; aggregating the mass vector slice data under the same directories in a distributed file system so as to generate a mass vector slice data packet; storing the mass vector slice data packet in the metadata nodes; establishing indexes for the mass vector slice data, and correlating the mass vector slice data through the indexes so as to form a net-structure data index table of the mass vector slice data, wherein the index table is used for recording paths of the mass vector slice data in the mass vector slice data packets; and providing mass vector slice data indexing services through a mass vector slice data packet index table.

Description

A cloud storage method and system for mass vector slice data
Technical Field
The present invention relates to the field of mass data storage, and more particularly to a cloud storage method and system for mass vector slice data.
Background
With the continuous development of science and technology, the era of mass data has arrived. How to optimize the load on a file system and improve load balance has therefore become an important requirement. When a data set is larger than the storage capacity of a single physical computer, it must be partitioned and stored across several independent computers. Major international companies such as Google, Amazon, IBM and Microsoft have invested substantial research effort in this field and have proposed a variety of innovative mass data management techniques. Current research focuses mainly on three levels: the storage layer, the computation layer and the interface layer. In the prior art, the Hadoop project implements the Hadoop distributed file system, Hadoop DFS (HDFS for short), and the parallel programming framework Hadoop MapReduce. A distributed file system is built on a network and therefore brings in the complexity of network programming, which makes it more complex than an ordinary disk file system. The goal of a distributed file system is to share resources, so that a program can store and access remote files in much the same way as it accesses local files. Typical examples are the Google File System (GFS), the Hadoop file system HDFS, Dynamo and TFS. Present-day distributed file systems generally keep an access interface and object model almost identical to those of a local file system, mainly to provide users with backward compatibility.
The prior art is mainly directed at very large data files (file sizes of hundreds of MB, GB or TB), which are stored and read using a distributed file system. For large numbers of small files, however, storage on a distributed file system is slow and cannot meet the storage demand. There is currently no technical scheme for storing and reading large numbers of small files on a distributed file system.
Summary of the Invention
To solve the speed problem that arises when large numbers of small files are stored on a distributed file system, the present invention provides a method comprising:
establishing all metadata nodes corresponding to the distributed file system directory tree;
aggregating the mass vector slice data under the same directory in the distributed file system to generate a mass vector slice data packet;
storing the mass vector slice data packet in the metadata nodes;
establishing indexes for the mass vector slice data, the mass vector slice data being associated through the indexes to form a net-structured data index of the mass vector slice data; and
providing a mass vector slice data index service through the mass vector slice data packet index table.
Preferably, in the method:
the mass vector slice data index comprises the path and name of the mass vector slice data and its offset within the mass vector slice data packet; and
the mass vector slice data path comprises a metadata node position, a mass vector slice data row position and a mass vector slice data column position.
Preferably, the method comprises:
presetting one metadata node at each layer, and storing the index table in the metadata node preset at each layer; and
transmitting the mass vector slice data index table stored in the metadata to the client, and establishing a persistent mapping table of the mass vector slice data index table.
Preferably, the mass vector slice data packet comprises a file header and at least one record;
the file header comprises a file type, a version number, a file keyword, a file name and the position corresponding to each record; and
each record corresponds to one vector slice data item and comprises the length of the vector slice data, a key length, a key and a value.
Preferably, the mass vector slice data packet is stored using a data file serialization method.
Preferably, the method further comprises: storing additional data by appending it to the tail of the mass vector slice data packet.
Preferably, the method comprises: caching the mass vector slice data index table at the client, which reduces the number of accesses to the metadata nodes and so improves the efficiency of access to the mass vector slice data.
Preferably, the method further comprises reading the mass vector slice data by:
determining, from the mass vector slice data index table, the shortest path to the metadata node that corresponds to the mass vector slice data packet; and
determining the position of the vector slice data from the file header of the data packet file in the metadata node so determined.
Based on the embodiments of the present invention, the present invention provides a cloud storage system for mass vector slice data, the system comprising:
a first generating unit, for establishing a distributed file system directory tree file;
a second generating unit, for establishing all metadata nodes corresponding to the distributed file system directory tree;
an aggregation unit, for aggregating the mass vector slice data under the same directory in the distributed file system to generate a mass vector slice data packet;
a storage unit, for storing the mass vector slice data packet in the metadata nodes;
a third generating unit, for generating the mass vector slice data index table and establishing, through the index table, the net structure of the mass vector slice data packets, the index table recording the paths of the mass vector slice data within the mass vector slice data packets; and
an indexing unit, for providing the mass vector slice data index service through the mass vector slice data index.
The beneficial effects of the present invention are as follows. The mass vector slice data under the same directory in the distributed file system is aggregated to generate a mass vector slice data packet, so that the mass vector slice data can be stored quickly. Indexes are also established for the mass vector slice data, through which the data items are associated to form a net-structured data index. With this net-structured index, the corresponding metadata node is found by the shortest path, which speeds up data access.
Brief Description of the Drawings
The exemplary embodiments of the present invention can be understood more fully by reference to the following drawings:
Fig. 1 is a flow chart of a cloud storage method for mass vector slice data according to an embodiment of the present invention; and
Fig. 2 is a structural diagram of a cloud storage system for mass vector slice data according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are now described with reference to the drawings. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here; these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the invention to those of ordinary skill in the art. The terms used for the exemplary embodiments illustrated in the drawings do not limit the invention. In the drawings, identical units or elements are given identical reference numerals.
Unless otherwise stated, the terms used here (including scientific and technical terms) have the meanings commonly understood by those of ordinary skill in the art. It should further be understood that terms defined in commonly used dictionaries are to be read as having meanings consistent with their context in the relevant field, and are not to be read in an idealized or overly formal sense.
Fig. 1 is a flow chart of a cloud storage method for mass vector slice data according to an embodiment of the present invention. The present invention proposes a method for storing mass vector slice data on a distributed file system. The scheme builds on the existing directory tree structure of a distributed file system: the many vector slice data items in one directory are packed into a mass vector slice data packet for storage. Each packet is a large data file, at the level of 100 MB or more. At the same time, the scheme builds an index for the mass vector slice data that records the path of each item within its packet, and provides the client with an interface for accessing the mass vector slice data. The method takes full advantage of the high fault tolerance, scalability and distribution of a master-slave distributed file system and, on the basis of a distributed file system oriented to files larger than 100 MB, achieves efficient storage of mass vector data. The proposed method stores mass vector data in a distributed file system and indexes it, which solves the current problem of slow storage of mass vector data and improves access speed through the index.
Preferably, method 100 starts at step 101: establish the distributed file system directory tree file. Building the directory tree structure file makes full use of the high fault tolerance, scalability and distributed nature of the distributed file system.
Preferably, step 102: establish all metadata nodes corresponding to the distributed file system directory tree. The metadata nodes are used to store data.
Preferably, step 103: aggregate the mass vector slice data under the same directory in the distributed file system to generate a mass vector slice data packet. The file structure of the packet is designed as follows. The packet comprises a file header and at least one record. The file header comprises a file type, a version number, a file keyword, a file name and the position corresponding to each record. Each record corresponds to one vector slice data item and comprises the length of the vector slice data, a key length, a key and a value. Additional vector slice data is stored by appending it to the tail of the packet. The packet is stored using a data file serialization method. In the proposed embodiment, the cloud storage method for mass vector data is based on a distributed system architecture made up of one metadata node and a multi-level hierarchy of data nodes beneath it. All the mass vector slice data under the same directory is saved into one data file under that directory; this data file is the mass vector slice data packet of the present invention and is a file in the distributed file system.
In this embodiment, the key to the aggregated storage technique is the design of the packet file. The packet file is a distributed file system file that uses a persistent binary key/value data structure, and it consists of a file header followed by one or more records. The first three bytes of the header are the file type, SEQ, followed by one byte giving the version number of the file data structure. The header also contains several other fields, including the names of the key and value types. When vector slice data is stored, it is appended directly to the tail of the packet file. Each record represents one vector slice data item and consists of four parts: the record length, the key length, the key and the value. The key is the file name of the vector slice data, and the value is its content.
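The packet layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's actual byte layout: the field widths, the version value and the function names write_header, append_record and read_record are hypothetical. The description resembles Hadoop's SequenceFile, but the sketch does not reproduce that format.

```python
import io
import struct

MAGIC = b"SEQ"   # three-byte file type, as stated in the description
VERSION = 1      # one-byte version number; the value 1 is an assumption


def write_header(buf, key_type="str", value_type="bytes"):
    """Write the header: file type, version, then the key/value type names."""
    buf.write(MAGIC)
    buf.write(struct.pack(">B", VERSION))
    for name in (key_type, value_type):
        raw = name.encode("utf-8")
        buf.write(struct.pack(">H", len(raw)))  # assumed 2-byte length prefix
        buf.write(raw)


def append_record(buf, key, value):
    """Append one record at the tail of the packet and return its offset."""
    buf.seek(0, io.SEEK_END)
    offset = buf.tell()
    raw_key = key.encode("utf-8")
    # Record length covers the key-length field, the key and the value.
    record_len = 4 + len(raw_key) + len(value)
    buf.write(struct.pack(">II", record_len, len(raw_key)))
    buf.write(raw_key)   # key: file name of the vector slice
    buf.write(value)     # value: content of the vector slice
    return offset


def read_record(buf, offset):
    """Read back the (key, value) pair stored at the given offset."""
    buf.seek(offset)
    record_len, key_len = struct.unpack(">II", buf.read(8))
    key = buf.read(key_len).decode("utf-8")
    value = buf.read(record_len - 4 - key_len)
    return key, value
```

The offset returned by append_record is the value that an index record would keep, so that a single slice can later be read without scanning the whole packet.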
Preferably, step 104: store the mass vector slice data packet in the metadata nodes. The packet storage method is implemented on top of the distributed file system, and operations that access the mass vector slice data depend on the distributed file system. Additional vector slice data is stored by appending it to the tail of the packet, and the packet is stored using a data file serialization method. When a client writes vector slice data under a directory, it performs a write operation on that directory's data file, and the distributed file system records a lease on the data file; the lease can be regarded as a write lock on the file. If another client now needs to store its own vector slice data under the same directory, it too applies to write to the packet file under that directory. Because the packet file already holds a write lock, and the distributed file system does not maintain a queue of transaction requests, the system returns an operation failure to that client directly. From the user's point of view, creating different vector slice data files under the same directory should not conflict; in fact, however, the back end performs these operations on the same packet file, and this lock mechanism is what causes write conflicts when multiple users write different vector slice data under the same directory. The packet file is implemented mainly with the serialization and deserialization methods of the data file. Serialization means converting a structured object into a byte stream so that it can be transmitted over a network or written to disk for permanent storage. Deserialization is the inverse process of converting a byte stream back into a structured object.
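The write conflict described above can be illustrated with a toy model. The class LeaseTable and its methods are hypothetical names introduced for illustration only; they do not correspond to any real distributed file system API. The sketch assumes one lease per packet file and no request queue, so a second writer fails immediately.

```python
class LeaseTable:
    """Toy model of per-file write leases; not a real file system API."""

    def __init__(self):
        self._holders = {}  # packet file path -> id of the client holding the lease

    def try_acquire(self, packet_path, client_id):
        """Grant the lease if it is free; otherwise fail at once (no queue)."""
        holder = self._holders.get(packet_path)
        if holder is not None and holder != client_id:
            return False
        self._holders[packet_path] = client_id
        return True

    def release(self, packet_path, client_id):
        """Release the lease, but only if this client actually holds it."""
        if self._holders.get(packet_path) == client_id:
            del self._holders[packet_path]
```

Two clients writing different slices under the same directory still contend for the same packet file path, which is why the second try_acquire call fails until the first client releases the lease.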
Preferably, step 105: establish indexes for the mass vector slice data; the data items are associated through the indexes to form a net-structured data index. The index table records the path of each item within its mass vector slice data packet. A mass vector slice data index comprises the item's path and name and its offset within the packet; the path comprises a metadata node position, a row position and a column position. For example, a path may be <18,0506>, where 18 is the metadata node position, 05 is the row position and 06 is the column position. To look up this item, metadata node position 18 is located first, then the corresponding row 05 is searched, and then the corresponding column 06. In the index table, all items are arranged by the metadata node position, row position and column position of their paths, which together form a spatial net-like index structure. Embodiments of the present invention can thus find mass vector slice data by the shortest path.
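The three-step lookup in the <18,0506> example can be sketched with nested dictionaries. This is only an illustration: the function names parse_path and lookup, the sample packet name and the offset are hypothetical, and the sketch assumes the second field is always two digits of row followed by two digits of column.

```python
def parse_path(path):
    """Split '<18,0506>' into (node position, row, column)."""
    node_part, cell_part = path.strip("<>").split(",")
    # Assumption: two digits of row followed by two digits of column.
    return int(node_part), int(cell_part[:2]), int(cell_part[2:])


def lookup(index, path):
    """Locate the node position first, then the row, then the column."""
    node, row, col = parse_path(path)
    return index.get(node, {}).get(row, {}).get(col)


# Hypothetical index content: node 18, row 5, column 6 -> (packet name, offset).
index = {18: {5: {6: ("packet-0001", 4096)}}}
```

Each level of the dictionary narrows the search, which mirrors the order given in the text: node position, then row, then column.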
Each layer of metadata nodes presets one metadata node for storing the data index table, and the mass vector slice data index is stored in the corresponding metadata node. The mass vector slice data index table recorded in the metadata is transmitted to the directory file, and a persistent mapping table of the mass vector slice data index is established at the client.
The index of a vector slice data item records the item's position within a specific packet file together with its other attributes. The client must create the index for the item after it has stored the item's data. An index record contains the name of the item, the path of the packet file in which it is located, and its offset within that packet file. The number of bits occupied by the packet file name determines how many data files a directory can hold, and the number of bits occupied by the offset determines the size of a data file; the capacity of data that can be stored under one directory is therefore limited.
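The capacity limit that follows from the field widths can be worked through numerically. The patent does not state the widths; the 16-bit and 32-bit values used in the note below, and the function name directory_limits, are assumptions chosen only to make the arithmetic concrete.

```python
def directory_limits(name_bits, offset_bits):
    """Return (max packet files, max bytes per packet, max bytes per directory)."""
    max_packets = 2 ** name_bits          # distinct packet file names
    max_packet_bytes = 2 ** offset_bits   # largest addressable offset range
    return max_packets, max_packet_bytes, max_packets * max_packet_bytes
```

With a 16-bit file name field and a 32-bit offset, for example, a directory could address 65,536 packet files of up to 4 GiB each, or 256 TiB in total.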
Preferably, the mass vector slice data index is distributed to the individual data nodes for management. Although the index data is very large in total, once it is distributed across the metadata nodes the index data on any single metadata node is relatively small, and the cluster's capacity to store mass vector slice data depends on the scale of the cluster. The scale of the cluster determines not only the storage capacity but also, more significantly, the number of vector slice data items that can be stored. The metadata nodes maintain the indexes of the vector slice data and provide the index service to the client. The index position of a vector slice data item identifies the metadata node that maintains that item's index.
Preferably, mass vector slice data indexes are classified by the parent directory in which the data is located, so that the indexes under the same directory are managed by metadata nodes of the same level. For this reason, the embodiment creates an index position mapping table that records the mapping between directories and metadata nodes. The index position mapping table is managed by the metadata node. When a client queries a mass vector slice data index, it first needs to know the position of the metadata node that maintains that index. It passes the path of the mass vector slice data to the metadata node, which looks up the parent directory of the path in the index position mapping table and returns the metadata node position. The present invention provides an index position maintenance module in the metadata node, dedicated to assigning data nodes to directories and maintaining the index position mapping table.
Preferably, the index position maintenance module selects, from all the data nodes maintained by the metadata node, the nodes to assign to a directory. The index position mapping table is persisted to the local disk, and whenever its data changes, its on-disk content is updated as well. If the module cannot find enough metadata nodes when assigning nodes to a directory, it inserts the unassigned directory directly into an allocation queue to wait. The content of this queue is also persisted to disk, and the disk copy is updated whenever a directory is added to or removed from the queue. When the metadata node starts, it reads the queue data from disk into memory. The purpose of the queue is to wait until a new data node registers with the distributed file system, at which point the index position maintenance module performs allocation again for the directories in the queue. Every such update to the queue must likewise be persisted.
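The mapping table and waiting queue described above can be sketched as follows. This is a simplified illustration, not the patent's implementation: the class IndexLocationMaintainer, its method names and the JSON state file are hypothetical, one node is assigned per directory, and the list of free nodes is not itself persisted.

```python
import json
import os
import posixpath


class IndexLocationMaintainer:
    """Maps directories to nodes and queues directories that cannot be served yet."""

    def __init__(self, state_file, free_nodes):
        self.state_file = state_file
        self.free_nodes = list(free_nodes)
        self.mapping = {}   # directory -> node id
        self.pending = []   # directories waiting for a node to register
        if os.path.exists(state_file):
            # On start-up, read the persisted table and queue back into memory.
            with open(state_file) as fh:
                state = json.load(fh)
            self.mapping, self.pending = state["mapping"], state["pending"]

    def _persist(self):
        # Every change to the table or the queue is written back to disk.
        with open(self.state_file, "w") as fh:
            json.dump({"mapping": self.mapping, "pending": self.pending}, fh)

    def assign(self, directory):
        """Assign a node to the directory, or queue the directory if none is free."""
        if self.free_nodes:
            self.mapping[directory] = self.free_nodes.pop(0)
        else:
            self.pending.append(directory)
        self._persist()
        return self.mapping.get(directory)

    def register_node(self, node_id):
        """A newly registered node serves the oldest waiting directory first."""
        if self.pending:
            self.mapping[self.pending.pop(0)] = node_id
        else:
            self.free_nodes.append(node_id)
        self._persist()

    def locate(self, slice_path):
        """Look up the node by the parent directory of the slice path."""
        return self.mapping.get(posixpath.dirname(slice_path))
```

Persisting on every change keeps the on-disk copy in step with memory, at the cost of one file write per update; a production design would likely batch or journal these writes.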
In this embodiment, a vector slice data index module on the data node maintains and manages the indexes of the vector slice data and provides the index service to the client. The module maintains the index records in memory, the index file, and the log file corresponding to the index file. The metadata node sorts the index records with a B-tree to speed up index lookups. An update to an index record first modifies the in-memory data structure and is not synchronized to the index file immediately; instead, the update is recorded in the log file corresponding to the index file. After the data node starts, it reads the index file sequentially into memory, updates the index data structure according to the log, writes the index records now in memory back to the data node to replace the old index file, and empties the log. The purpose is to prevent a sudden power failure on the data node from losing the index data held in memory.
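The log-then-replay behaviour described above can be sketched as follows. The sketch makes several assumptions that are not in the patent: the class name TileIndex is hypothetical, the index and log are stored as JSON, and a plain dictionary stands in for the B-tree that the text mentions.

```python
import json
import os


class TileIndex:
    """In-memory index backed by an index file and an append-only log."""

    def __init__(self, index_file, log_file):
        self.index_file, self.log_file = index_file, log_file
        self.records = {}  # a dict stands in for the B-tree (assumption)
        self._recover()

    def _recover(self):
        # Start-up: load the index file, then replay the log over it.
        if os.path.exists(self.index_file):
            with open(self.index_file) as fh:
                self.records = json.load(fh)
        if os.path.exists(self.log_file):
            with open(self.log_file) as fh:
                for line in fh:
                    name, packet, offset = json.loads(line)
                    self.records[name] = [packet, offset]
        self._checkpoint()

    def _checkpoint(self):
        # Replace the old index file with the current state, then empty the log.
        tmp = self.index_file + ".tmp"
        with open(tmp, "w") as fh:
            json.dump(self.records, fh)
        os.replace(tmp, self.index_file)
        open(self.log_file, "w").close()

    def put(self, name, packet, offset):
        # Log the update first, then change memory; the index file is untouched.
        with open(self.log_file, "a") as fh:
            fh.write(json.dumps([name, packet, offset]) + "\n")
        self.records[name] = [packet, offset]
```

If the process stops after put but before the next start-up, the update survives in the log and is folded into the index file during recovery.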
Preferably, the mass vector slice data index table is cached at the client, which reduces the number of accesses to the metadata nodes and so improves the efficiency of access to the mass vector slice data. In embodiments of the present invention, caching the indexes that a user uses frequently at the client reduces the number of times the client accesses the metadata node and improves the efficiency of access to the mass vector slice data.
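A client-side cache of frequently used index records might look like the sketch below. The patent does not specify an eviction policy; least-recently-used eviction is an assumption here, and the class CachedIndexClient and the fetch_from_node callback are hypothetical names.

```python
from collections import OrderedDict


class CachedIndexClient:
    """Keeps recently used index records so repeat lookups skip the metadata node."""

    def __init__(self, fetch_from_node, capacity=1024):
        self._fetch = fetch_from_node   # callable that queries the metadata node
        self._cache = OrderedDict()
        self._capacity = capacity
        self.remote_calls = 0           # counts accesses to the metadata node

    def lookup(self, slice_path):
        if slice_path in self._cache:
            self._cache.move_to_end(slice_path)   # mark as most recently used
            return self._cache[slice_path]
        self.remote_calls += 1
        record = self._fetch(slice_path)
        self._cache[slice_path] = record
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)       # evict least recently used
        return record
```

The remote_calls counter makes the saving visible: a repeated lookup of the same path is answered from the cache and does not increase it.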
Preferably, step 106: provide the mass vector slice data index service through the mass vector slice data packet index table. The shortest path to the metadata node corresponding to the packet is determined from the index table. The position of the vector slice data is then determined from the file header of the data packet file in the metadata node so determined.
Fig. 2 is a structural diagram of a cloud storage system for mass vector slice data according to an embodiment of the present invention. System 200 comprises:
a first generating unit 201, for establishing a distributed file system directory tree file;
a second generating unit 202, for establishing all metadata nodes corresponding to the distributed file system directory tree;
an aggregation unit 203, for aggregating the mass vector slice data under the same directory in the distributed file system to generate a mass vector slice data packet;
a storage unit 204, for storing the mass vector slice data packet in the metadata nodes;
a third generating unit 205, for generating the mass vector slice data index table and establishing, through the index table, the net structure of the mass vector slice data packets, the index table recording the paths of the mass vector slice data within the mass vector slice data packets; and
an indexing unit 206, for providing the mass vector slice data index service through the mass vector slice data index.
The cloud storage system 200 for mass vector slice data according to this embodiment corresponds to the cloud storage method 100 for mass vector slice data according to another embodiment of the present invention, and is not described again here.
The beneficial effects of the present invention are as follows. The mass vector slice data under the same directory in the distributed file system is aggregated to generate a mass vector slice data packet, so that the mass vector slice data can be stored quickly. Indexes are also established for the mass vector slice data, through which the data items are associated to form a net-structured data index. With this net-structured index, the corresponding metadata node is found by the shortest path, which speeds up data access.
The present invention has been described with reference to a small number of embodiments. However, as is known to those skilled in the art, and as defined by the appended patent claims, embodiments other than those disclosed above equally fall within the scope of the present invention.
In general, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/the/said [device, component, etc.]" are to be interpreted openly as referring to at least one instance of the device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein need not be performed in the exact order disclosed, unless explicitly stated.
In addition, those skilled in the art should understand that embodiments of the present application may be provided as a method, a system or a computer program product. The application may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) that contain computer-usable program code.
The application is described with reference to flow charts and/or block diagrams of the method, device (system) and computer program product according to its embodiments. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.

Claims (9)

1. A cloud storage method for mass vector slice data, the method comprising:
establishing all metadata nodes corresponding to the directory tree of a distributed file system;
aggregating the mass vector slice data located under the same-level directory in the distributed file system to generate a mass vector slice data package;
storing the mass vector slice data package in the metadata node;
establishing an index for the mass vector slice data, the mass vector slice data being associated through the index to form a mesh-structured data index of the mass vector slice data;
providing a mass vector slice data index service through the index table of the mass vector slice data package.
2. The method according to claim 1, wherein:
the mass vector slice data index comprises the path and name of the mass vector slice data and its offset within the mass vector slice data package;
the mass vector slice data path comprises a metadata node position, a mass vector slice data row position, and a mass vector slice data column position.
3. The method according to claim 1, the method comprising:
presetting a metadata node for each layer, and storing the index table in the preset metadata node of each layer;
transmitting the mass vector slice data index table stored in the metadata to a client, and establishing a persistent mapping table of the mass vector slice data index table.
4. The method according to claim 1, wherein the mass vector slice data package comprises a file header and at least one record;
the file header comprises a file type, a version number, a file keyword, a file name, and the position corresponding to each record;
each record corresponds to one piece of vector slice data, and each record comprises the length of the vector slice data, a key length, a key, and a value.
5. The method according to claim 1, wherein the mass vector slice data package is stored using a data file serialization method.
6. The method according to claim 1, further comprising: performing append storage at the tail of the mass vector slice data package.
7. The method according to claim 1, the method comprising: caching the mass vector slice data index table at the client.
8. The method according to claim 4, further comprising a method of reading the mass vector slice data:
determining, by means of the mass vector slice data index table, the shortest path to the metadata node corresponding to the mass vector slice data package;
determining the position of the vector slice data from the file header of the data package file in the determined metadata node.
9. A cloud storage system for mass vector slice data, the system comprising:
a first generation unit, configured to establish a distributed file system directory tree file;
a second generation unit, configured to establish all metadata nodes corresponding to the distributed file system directory tree;
an aggregation unit, configured to aggregate, based on the distributed file system, the mass vector slice data located under the same-level directory to generate a mass vector slice data package;
a storage unit, configured to store the mass vector slice data package in the metadata node;
a third generation unit, configured to generate the mass vector slice data index table, the index table establishing a mesh structure of the mass vector slice data packages and recording the paths of the mass vector slice data within the mass vector slice data package;
an indexing unit, configured to provide the mass vector slice data index service through the mass vector slice data index.
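
The package layout and index recited in claims 2, 4, 6 and 8 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `SlicePackage` and `IndexEntry`, the `VSP1` file-type tag, the fixed-width big-endian length fields, and the in-memory `io.BytesIO` buffer (standing in for a package file held on a metadata node of the distributed file system) are all hypothetical. For brevity the sketch keeps record offsets only in the index table, whereas claim 4 also places record positions in the file header.

```python
import io
import struct
from dataclasses import dataclass

MAGIC = b"VSP1"                      # hypothetical file-type tag
VERSION = 1                          # hypothetical version number
HEADER = struct.Struct(">4sH")       # file type, version number
RECORD_HEAD = struct.Struct(">II")   # value length, key length


@dataclass(frozen=True)
class IndexEntry:
    """One row of the index table: path, name, and offset (claim 2)."""
    node: str    # metadata node position
    row: int     # slice row position
    col: int     # slice column position
    name: str    # slice name, also used as the record key
    offset: int  # byte offset of the record inside the package


class SlicePackage:
    """Aggregates many small slices into one append-only byte stream."""

    def __init__(self, node: str):
        self.node = node
        self.stream = io.BytesIO()   # stand-in for a file on the node
        self.index = {}              # (row, col) -> IndexEntry
        self.stream.write(HEADER.pack(MAGIC, VERSION))

    def append(self, row: int, col: int, name: str, value: bytes) -> IndexEntry:
        """Append one record at the tail and register it in the index."""
        key = name.encode("utf-8")
        offset = self.stream.seek(0, io.SEEK_END)
        self.stream.write(RECORD_HEAD.pack(len(value), len(key)))
        self.stream.write(key)
        self.stream.write(value)
        entry = IndexEntry(self.node, row, col, name, offset)
        self.index[(row, col)] = entry
        return entry

    def read(self, row: int, col: int) -> bytes:
        """Look up the offset in the index, then read that single record."""
        entry = self.index[(row, col)]
        self.stream.seek(entry.offset)
        head = self.stream.read(RECORD_HEAD.size)
        value_len, key_len = RECORD_HEAD.unpack(head)
        key = self.stream.read(key_len)
        if key.decode("utf-8") != entry.name:
            raise ValueError("index entry does not match record key")
        return self.stream.read(value_len)


# Two slices from the same directory level go into one package.
package = SlicePackage(node="node-0")
first = package.append(row=3, col=5, name="3_5.pbf", value=b"abc")
second = package.append(row=3, col=6, name="3_6.pbf", value=b"defgh")
assert package.read(3, 5) == b"abc"
```

Because records are only ever appended at the tail, offsets recorded earlier do not change when new slices are added, so an index table already cached at a client would stay valid for the existing records. The source does not state this rationale; it is offered only as one reading of how claims 6 and 7 fit together.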
CN201610939884.6A 2016-10-25 2016-10-25 Mass vector slice data cloud storage method and system Active CN106570113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610939884.6A CN106570113B (en) 2016-10-25 2016-10-25 Mass vector slice data cloud storage method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610939884.6A CN106570113B (en) 2016-10-25 2016-10-25 Mass vector slice data cloud storage method and system

Publications (2)

Publication Number Publication Date
CN106570113A true CN106570113A (en) 2017-04-19
CN106570113B CN106570113B (en) 2022-04-01

Family

ID=58536334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610939884.6A Active CN106570113B (en) 2016-10-25 2016-10-25 Mass vector slice data cloud storage method and system

Country Status (1)

Country Link
CN (1) CN106570113B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291889A (en) * 2017-06-20 2017-10-24 郑州云海信息技术有限公司 Data storage method and system
CN108172277A (en) * 2017-12-19 2018-06-15 浙江大学 Method and system for storing and browsing multiple-magnification digital slice image
CN109767274A (en) * 2018-12-05 2019-05-17 航天信息股份有限公司 Method and system for carrying out associated storage on massive invoice data
CN111459882A (en) * 2020-03-30 2020-07-28 北京百度网讯科技有限公司 Namespace transaction processing method and device of distributed file system
CN111782663A (en) * 2020-05-21 2020-10-16 浙江邦盛科技有限公司 Aggregation index structure and aggregation index method for improving aggregation query efficiency
CN115373645A (en) * 2022-10-24 2022-11-22 济南新语软件科技有限公司 Complex data packet operation method and system based on dynamic definition

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102332027A (en) * 2011-10-15 2012-01-25 西安交通大学 Mass non-independent small file associated storage method based on Hadoop
CN102385623A (en) * 2011-10-25 2012-03-21 曙光信息产业(北京)有限公司 Catalogue access method in DFS (distributed file system)
CN102541985A (en) * 2011-10-25 2012-07-04 曙光信息产业(北京)有限公司 Organization method of client directory cache in distributed file system
CN103020315A (en) * 2013-01-10 2013-04-03 中国人民解放军国防科学技术大学 Method for storing mass of small files on basis of master-slave distributed file system
CN103473287A (en) * 2013-08-30 2013-12-25 中国科学院信息工程研究所 Method and system for automatically distributing, running and updating executable programs
CN103577123A (en) * 2013-11-12 2014-02-12 河海大学 Small file optimization storage method based on HDFS
US8825652B1 (en) * 2012-06-28 2014-09-02 Emc Corporation Small file aggregation in a parallel computing system
CN104731921A (en) * 2015-03-26 2015-06-24 江苏物联网研究发展中心 Method for storing and processing small log type files in Hadoop distributed file system
CN105404691A (en) * 2015-12-14 2016-03-16 曙光信息产业股份有限公司 File storage method and apparatus

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291889A (en) * 2017-06-20 2017-10-24 郑州云海信息技术有限公司 Data storage method and system
CN108172277B (en) * 2017-12-19 2020-07-07 浙江大学 Method and system for storing and browsing multiple-magnification digital slice image
CN108172277A (en) * 2017-12-19 2018-06-15 浙江大学 Method and system for storing and browsing multiple-magnification digital slice image
CN109767274B (en) * 2018-12-05 2023-04-25 航天信息股份有限公司 Method and system for carrying out associated storage on massive invoice data
CN109767274A (en) * 2018-12-05 2019-05-17 航天信息股份有限公司 Method and system for carrying out associated storage on massive invoice data
CN111459882A (en) * 2020-03-30 2020-07-28 北京百度网讯科技有限公司 Namespace transaction processing method and device of distributed file system
CN111459882B (en) * 2020-03-30 2023-08-29 北京百度网讯科技有限公司 Namespace transaction processing method and device for distributed file system
CN111782663A (en) * 2020-05-21 2020-10-16 浙江邦盛科技有限公司 Aggregation index structure and aggregation index method for improving aggregation query efficiency
WO2021232645A1 (en) * 2020-05-21 2021-11-25 浙江邦盛科技有限公司 Aggregation index structure and aggregation index method for improving aggregate query efficiency
CN111782663B (en) * 2020-05-21 2023-09-01 浙江邦盛科技股份有限公司 Aggregation index structure and aggregation index method for improving aggregation query efficiency
US11928113B2 (en) 2020-05-21 2024-03-12 Zhejiang Bangsun Technology Co., Ltd. Structure and method of aggregation index for improving aggregation query efficiency
CN115373645A (en) * 2022-10-24 2022-11-22 济南新语软件科技有限公司 Complex data packet operation method and system based on dynamic definition
CN115373645B (en) * 2022-10-24 2023-02-03 济南新语软件科技有限公司 Complex data packet operation method and system based on dynamic definition

Also Published As

Publication number Publication date
CN106570113B (en) 2022-04-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant