CN105187565A - Method for utilizing network storage data - Google Patents
Method for utilizing network storage data
- Publication number
- CN105187565A (application number CN201510660935.7A)
- Authority
- CN
- China
- Prior art keywords
- cloud platform
- client
- data server
- topology information
- access
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention provides a method for utilizing network storage data. The method includes the following steps: the IDs of the data servers in a cloud platform storage system are hashed according to a predefined hash algorithm, and each data server is placed on a ring covering the whole hash value range; each data server periodically sends a heartbeat message to a host node, and the host node manages the topology information of the cloud platform; and a client accesses the host node to obtain the topology information of the cloud platform and caches it locally, hashes the file name, determines the data server on which the corresponding small file resides, and performs the read or write operation. The invention proposes a new small-file-oriented cloud platform storage system and storage method that reduce the number of network connections needed to read and write small files continuously, make storage more lightweight and latency lower, reduce the load on the host node, speed up service recovery after a failure, and improve the availability of the whole storage system.
Description
Technical field
The present invention relates to data storage, and in particular to a remote storage method.
Background technology
In recent years, network services have needed to store large numbers of small files, such as pictures, e-mails, e-books, music files, and microblog text content. Existing storage and processing systems designed for massive large files, which consist mainly of nodes that store metadata and nodes that store file data, run into several problems when applied to massive numbers of small files. First, massive numbers of small files produce a large amount of metadata. Because the metadata of every directory and file is held in the memory of the name node, a large number of small files in the system reduces the storage efficiency and storage capacity of the whole cloud platform storage system. Second, accessing a large number of small files is far slower than accessing a few large files of the same total capacity, because accessing many small files requires connecting to different data servers over and over, which is an inefficient data access pattern. Existing cloud platform storage systems for massive small files merge small files into large blocks, which reduces the number of local nodes that small files occupy. However, they need several service units of the cloud platform to complete the file storage function, which makes the read/write flow cumbersome: a single read or write requires several network connections to be set up. They also use a heavily loaded name server, so the availability of existing small-file systems remains low.
Therefore, no effective solution has yet been proposed for the above problems in the related art.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a method for utilizing network storage data, comprising:
In a cloud platform storage system, multiple data servers are logically organized into a ring; the ID of each data server is hashed according to a predefined hash algorithm, and each data server is placed on the ring covering the whole hash value range according to its hash value;
A host node is arranged in the cloud platform storage system; each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform;
A client accesses the host node to obtain the cloud platform topology information and caches it locally; when the client reads or writes, it hashes the file name, determines the data server on which the corresponding small file resides, and performs the read or write operation on the small file at that data server.
Preferably, the small file write process is implemented by performing the following steps:
(1) if the client is accessing the cloud platform storage system for the first time, the client accesses the host node, requests the topology information of the cloud platform, and records it locally; during continuous access, if this is not the first access to the cloud platform storage system, the client already holds the cloud platform topology information in its local cache;
(2) the client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle the small file;
(3) the client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the content buffer of the small file;
(4) the data server first determines whether the cloud platform topology information cached by the client is out of date, that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message; if they match, go to step (5); if they do not match, the data server compares its own topology information with that in the write request message and determines whether the difference affects this write operation; if it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client;
(5) the data server queries the access information management unit to check whether the file name of the small file already exists; if it does, the client is notified that the file name exists; otherwise go to step (6);
(6) the data server writes the content of the small file into a block through the file block management unit, and at the same time writes the access information returned by the file block management unit, together with the file name, into the access information management unit as a key-value pair; it returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client; the write operation then ends.
Compared with the prior art, the present invention, which realizes a new small-file-oriented cloud platform storage system and its storage method, has the following advantages:
(1) When the cloud platform topology information has not changed, each of a series of continuous small file reads and writes needs only one network connection.
(2) Key-value storage manages the mapping between file names and the corresponding access information, so storage is more lightweight and latency is lower.
(3) The load on the host node is low and service recovers faster after a failure, so the availability of the cloud platform storage system is higher.
Brief description of the drawings
Fig. 1 is a flowchart of the method for utilizing network storage data according to an embodiment of the present invention.
Detailed description of the embodiments
A detailed description of one or more embodiments of the present invention is provided below together with the accompanying drawing that illustrates the principles of the invention. The invention is described in connection with such embodiments, but is not limited to any embodiment. The scope of the invention is defined only by the claims, and the invention encompasses many alternatives, modifications, and equivalents. Many specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of them.
One aspect of the present invention provides a method for utilizing network storage data. Fig. 1 is a flowchart of the method according to an embodiment of the present invention.
The present invention first organizes all data servers in the cloud platform storage system logically into a ring. The system adopts a consistent hashing scheme: the ID of each data server is hashed with a predefined hash algorithm, and each data server is placed on the ring covering the whole hash value range according to its hash value.
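The ring placement described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent only says "a predefined hash algorithm", so the use of MD5, the 32-bit ring size, the clockwise-successor lookup rule, and the names `ring_hash`, `HashRing`, and `locate` are all assumptions.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string onto the ring [0, 2**32). MD5 is an illustrative choice."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Data servers placed on a hash ring by the hash of their ID."""

    def __init__(self, server_ids):
        # Sorted (position, server_id) pairs form the ring.
        self._ring = sorted((ring_hash(sid), sid) for sid in server_ids)
        self._positions = [pos for pos, _ in self._ring]

    def locate(self, filename: str) -> str:
        """Return the first server at or after the file name's hash (assumed rule)."""
        if not self._ring:
            raise ValueError("ring has no data servers")
        idx = bisect.bisect_left(self._positions, ring_hash(filename))
        # Wrap around to the first server when the hash is past the last position.
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["ds-1", "ds-2", "ds-3"])
owner = ring.locate("photo_0001.jpg")
```

With this lookup rule, removing one server only remaps the files that server owned; every other file keeps its data server, which is the property that lets a stale client cache remain usable for most requests.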
A host node is arranged in the cloud platform. Each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform. The topology information managed by the host node comprises the list of all active data servers in the cloud platform and the version number of the current topology information. The data server list holds, for each active data server, its ID and the IP address and port on which it listens. The version number is represented by a monotonically increasing timestamp. When a new data server joins the cloud platform or an existing data server leaves, the host node generates new topology information, sets its version number to the current timestamp, and sends the new topology information to all currently active data servers. In this way all data servers hold the same global view of the cloud platform.
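A rough sketch of this bookkeeping on the host node is given below. The class name `HostNode`, the heartbeat timeout, the millisecond version granularity, and the injectable `clock` parameter are assumptions made for illustration; the patent does not specify how a missing heartbeat is detected.

```python
import time


class HostNode:
    """Tracks live data servers and versions the topology (illustrative)."""

    def __init__(self, timeout=10.0, clock=time.time):
        self.timeout = timeout      # assumed heartbeat timeout, in seconds
        self.clock = clock          # injectable clock, useful for testing
        self.last_seen = {}         # server ID -> time of last heartbeat
        self.servers = {}           # server ID -> (ip, port)
        self.version = 0            # timestamp-based topology version

    def _bump_version(self):
        # Keep the version strictly increasing even if the clock stalls.
        self.version = max(self.version + 1, int(self.clock() * 1000))

    def heartbeat(self, server_id, ip, port):
        """Record a heartbeat; a previously unknown server changes the topology."""
        is_new = server_id not in self.servers
        self.servers[server_id] = (ip, port)
        self.last_seen[server_id] = self.clock()
        if is_new:
            self._bump_version()

    def expire(self):
        """Drop servers whose heartbeats stopped; bump the version if any left."""
        now = self.clock()
        dead = [s for s, t in self.last_seen.items() if now - t > self.timeout]
        for s in dead:
            del self.servers[s], self.last_seen[s]
        if dead:
            self._bump_version()
        return dead

    def topology(self):
        """Snapshot to hand to clients and to all active data servers."""
        return {"version": self.version, "servers": dict(self.servers)}
```

In a real deployment `expire` would run on a timer and each version bump would trigger a push of `topology()` to every active data server; both are left out here.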
The client caches the cloud platform topology information locally. The first time the client accesses the cloud platform, it contacts the host node to obtain the topology information and caches it locally; subsequent reads and writes use the cached copy. For each read or write, the client first hashes the file name according to the consistent hashing scheme and determines the data server on which the small file resides. The version number of the topology information held by the data server is then compared with that held by the client; if the version numbers match, the actual read or write is performed on the data server.
A data server has two main units: a file block management unit and an access information management unit. The file block management unit merges small files into large blocks: the cloud platform storage system preallocates large file blocks, and newly written small files are written into these blocks. Given the access information of a small file, namely the ID of the block that holds it, its offset within the block, and its size, the small file can be retrieved from a data server. The cloud platform storage system uses key-value storage to manage the mapping from file name to access information, that is:
Key:filename→Value:(BlockId,Offset,Size)
The present invention implements a key-value store with persistence and uses this key-value store to manage the access information.
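The block packing and the filename-to-access-information mapping can be illustrated with an in-memory sketch. The names `BlockStore`, `append`, and `read` are hypothetical, blocks here grow lazily rather than being preallocated on disk, and a plain `dict` stands in for the persistent key-value store, so this shows only the data layout, not durability.

```python
class BlockStore:
    """Packs small files into large blocks (in-memory sketch, no persistence)."""

    def __init__(self, block_size=64 * 1024 * 1024):
        self.block_size = block_size       # assumed block capacity in bytes
        self.blocks = [bytearray()]        # block ID is the list index

    def append(self, content: bytes):
        """Write content into the current block; return (block_id, offset, size)."""
        if len(content) > self.block_size:
            raise ValueError("file larger than a block")
        if len(self.blocks[-1]) + len(content) > self.block_size:
            self.blocks.append(bytearray())   # current block is full: start a new one
        block_id = len(self.blocks) - 1
        offset = len(self.blocks[block_id])
        self.blocks[block_id] += content
        return block_id, offset, len(content)

    def read(self, block_id, offset, size) -> bytes:
        """Fetch a small file given its access information."""
        return bytes(self.blocks[block_id][offset:offset + size])


store = BlockStore(block_size=16)   # tiny blocks so the rollover is visible
index = {}                          # stand-in for the key-value store

index["a.txt"] = store.append(b"hello")
index["b.txt"] = store.append(b"small file")
index["c.txt"] = store.append(b"next block")
```

Here `index["b.txt"]` is `(0, 5, 10)` and `index["c.txt"]` is `(1, 0, 10)`: the third file does not fit in the 16-byte block, so it starts a new one.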
The small file write flow in the cloud storage platform is described below:
(1) If the client is accessing the cloud platform storage system for the first time, the client accesses the host node, requests the topology information of the cloud platform, and records it locally. During continuous access, if this is not the first access to the cloud platform storage system, the client already holds the cloud platform topology information in its local cache.
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle the small file.
(3) The client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the content buffer of the small file.
(4) The data server first determines whether the cloud platform topology information cached by the client is out of date; that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message. If they match, go to step (5). If they do not match, the data server compares its own topology information with that in the write request message and determines whether the difference affects this write operation. If it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client, and the write operation ends.
(5) The data server queries the access information management unit to check whether the file name of the small file already exists. If it does, the client is notified that the file name exists; otherwise go to step (6).
(6) The data server writes the content of the small file into a block through the file block management unit, and at the same time writes the access information returned by the file block management unit, together with the file name, into the access information management unit as a key-value pair. It returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client. The write operation then ends.
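Steps (4) to (6) on the data server side can be sketched as below. The patent does not say how the server decides whether a topology difference "affects" a write; this sketch assumes the test is whether the server still owns the file name under its own, newer topology. The names `DataServer`, `locate`, and the reply dictionary layout are hypothetical, and a single in-memory block replaces the file block management unit.

```python
import bisect
import hashlib


def ring_hash(key):
    """Illustrative ring hash (MD5 truncated to 32 bits)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")


def locate(server_ids, filename):
    """Consistent-hash lookup: first server at or after the file's hash."""
    ring = sorted((ring_hash(s), s) for s in server_ids)
    idx = bisect.bisect_left([p for p, _ in ring], ring_hash(filename))
    return ring[idx % len(ring)][1]


class DataServer:
    def __init__(self, server_id, topology):
        self.server_id = server_id
        self.topology = topology   # {"version": int, "servers": [server IDs]}
        self.index = {}            # filename -> (block_id, offset, size)
        self.block = bytearray()   # a single block, for brevity

    def write(self, client_version, filename, content):
        needs_update = False
        # Step (4): compare topology versions.
        if client_version != self.topology["version"]:
            # Assumed "affects this write" test: does this server still own the file?
            if locate(self.topology["servers"], filename) != self.server_id:
                return {"status": "failed", "topology": self.topology}
            needs_update = True
        # Step (5): reject a file name that already exists.
        if filename in self.index:
            return {"status": "exists"}
        # Step (6): append to the block and record the access information.
        offset = len(self.block)
        self.block += content
        self.index[filename] = (0, offset, len(content))
        reply = {"status": "ok"}
        if needs_update:
            reply["topology"] = self.topology   # piggyback the newer topology
        return reply
```

A client holding a stale version therefore gets one of two outcomes in a single round trip: a successful write with the newer topology attached, or a failure with the newer topology attached so it can retry against the right server.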
The small file read flow in the cloud storage platform is described below:
(1) If the client is accessing the cloud platform storage system for the first time, the client accesses the host node, requests the topology information of the cloud platform, and records it locally. During continuous access, if this is not the first access to the cloud platform storage system, the client already holds the cloud platform topology information in its local cache.
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle the small file.
(3) The client accesses the data server obtained in step (2). The data server determines whether the version number of the cloud platform topology information attached to the client's read request matches the version number of the topology information it has recorded locally. If they match, go to step (4); if they do not, the request is marked as needing an update before continuing to step (4).
(4) The data server queries the access information management unit for the file name of the small file to check whether it exists. If it exists, the data server reads out the access information and goes to step (5). If it does not exist, the data server sends a file-not-found message to the client; if the needs-update mark was set in step (3), the new cloud platform topology information is attached to the file-not-found message to tell the client to update its cached topology information. The read operation then ends.
(5) Using the access information obtained in step (4), the data server reads the content of the small file from the file block management unit and sends it to the client; if the request was marked as needing an update, the new cloud platform topology information is attached to the message. The read operation then ends.
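Steps (3) to (5) of the read flow on the data server reduce to a short handler. Unlike a write, a version mismatch here never fails the request; it only marks the reply so that the newer topology is attached. The function name `handle_read`, its parameters, and the reply layout are hypothetical, and a single `bytes` block stands in for the file block management unit.

```python
def handle_read(server_topology, index, block, client_version, filename):
    """Steps (3)-(5) of the read flow on the data server (illustrative)."""
    # Step (3): a version mismatch only marks the reply; the read still proceeds.
    needs_update = client_version != server_topology["version"]
    reply = {}
    # Step (4): look up the access information for the file name.
    info = index.get(filename)
    if info is None:
        reply["status"] = "not_found"
    else:
        # Step (5): fetch the bytes from the block using (block_id, offset, size).
        _block_id, offset, size = info
        reply["status"] = "ok"
        reply["content"] = bytes(block[offset:offset + size])
    if needs_update:
        reply["topology"] = server_topology   # attached to either kind of reply
    return reply


topology = {"version": 7, "servers": ["ds-1"]}
block = b"hellosmall file"
index = {"a.txt": (0, 0, 5), "b.txt": (0, 5, 10)}

fresh = handle_read(topology, index, block, 7, "b.txt")
stale = handle_read(topology, index, block, 6, "missing.txt")
```

In this example `fresh` carries only the status and the file content, while `stale` is a not-found reply that also carries the version 7 topology.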
As the read and write flows above show, during continuous access the client of the cloud platform storage system reads the cloud platform topology information on its first access to the host node and caches it. While the cloud platform topology does not change, subsequent reads and writes use the cached topology information to determine directly which data server to access, and a single access to that data server completes one read or write request.
When the cloud platform topology changes, a subsequent read or write request from the client first uses the stale cached topology information to determine the data server to access. If the connection succeeds and the read or write is carried out correctly, the data server has determined that the topology change (a node joining or leaving) does not affect the request for this small file; after completing the request, the client reads the latest topology information attached to the returned message and updates its stale local cache, so later requests access the cloud platform storage system with the latest topology information. If the connection fails, the node the client wants to access has failed and left, and the client must access the host node once more to obtain the current topology information and then retry the read or write. If the access succeeds but the data server determines that the topology change affects the request, the data server replies that the read or write failed and attaches the latest topology information, and the client retries. The cloud platform storage system thus simplifies the read/write flow and reduces the number of network connections in each read or write. Experimental results show that this improvement effectively reduces latency. In addition, the cloud platform storage system uses lighter-weight key-value storage to manage access information. When there is a large amount of access information and it is written continuously, key-value storage often shows lower latency and higher throughput than a traditional database such as MySQL.
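The three client-side cases above (success with a piggybacked update, connection failure, and server-reported failure) can be sketched as one retry loop. The `Client` class and its three callables `fetch_topology`, `locate`, and `send` are hypothetical hooks standing in for the host node request, the consistent-hash lookup, and the network call; the retry limit is also an assumption.

```python
class Client:
    """Client with a cached topology and the retry rules described above."""

    def __init__(self, fetch_topology, locate, send):
        self.fetch_topology = fetch_topology   # () -> topology from the host node
        self.locate = locate                   # (topology, filename) -> server ID
        self.send = send                       # (server_id, request) -> reply dict;
                                               # raises ConnectionError if unreachable
        self.topology = None                   # local cache, empty before first access

    def request(self, op, filename, content=None, max_attempts=3):
        for _ in range(max_attempts):
            if self.topology is None:
                # First access, or the cache was dropped after a failed connection.
                self.topology = self.fetch_topology()
            server = self.locate(self.topology, filename)
            req = {"op": op, "name": filename, "content": content,
                   "version": self.topology["version"]}
            try:
                reply = self.send(server, req)
            except ConnectionError:
                self.topology = None           # target node left: ask the host node again
                continue
            if "topology" in reply:
                self.topology = reply["topology"]   # refresh the stale cache
            if reply["status"] != "failed":
                return reply                   # done in a single data-server round trip
            # status == "failed": the topology change affected this request; retry.
        raise RuntimeError("request did not succeed after retries")
```

The host node is contacted only on the first access and after a connection failure; in every other case the newer topology arrives on the reply the client was already waiting for.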
In summary, the present invention proposes a new small-file-oriented cloud platform storage system and storage method that reduce the number of network connections needed to read and write small files continuously, make storage more lightweight and latency lower, and reduce the load on the host node so that service recovers faster after a failure, improving the availability of the whole storage system.
Obviously, those skilled in the art should understand that each module or step of the present invention described above can be implemented with a general-purpose computing system; they can be concentrated on a single computing system or distributed over a network of multiple computing systems. Optionally, they can be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by a computing system. The present invention is therefore not restricted to any specific combination of hardware and software.
It should be understood that the above embodiments of the present invention are only used to illustrate or explain the principles of the invention and do not limit it. Therefore, any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the invention shall fall within its scope of protection. In addition, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or the equivalents of such scope and boundaries.
Claims (2)
1. A method for utilizing network storage data, characterized by comprising:
In a cloud platform storage system, multiple data servers are logically organized into a ring; the ID of each data server is hashed according to a predefined hash algorithm, and each data server is placed on the ring covering the whole hash value range according to its hash value;
A host node is arranged in the cloud platform storage system; each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform;
A client accesses the host node to obtain the cloud platform topology information and caches it locally; when the client reads or writes, it hashes the file name, determines the data server on which the corresponding small file resides, and performs the read or write operation on the small file at that data server.
2. The method according to claim 1, characterized by further comprising performing the following steps to implement the small file write process:
(1) if the client is accessing the cloud platform storage system for the first time, the client accesses the host node, requests the topology information of the cloud platform, and records it locally; during continuous access, if this is not the first access to the cloud platform storage system, the client already holds the cloud platform topology information in its local cache;
(2) the client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle the small file;
(3) the client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the content buffer of the small file;
(4) the data server first determines whether the cloud platform topology information cached by the client is out of date, that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message; if they match, go to step (5); if they do not match, the data server compares its own topology information with that in the write request message and determines whether the difference affects this write operation; if it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client;
(5) the data server queries the access information management unit to check whether the file name of the small file already exists; if it does, the client is notified that the file name exists; otherwise go to step (6);
(6) the data server writes the content of the small file into a block through the file block management unit, and at the same time writes the access information returned by the file block management unit, together with the file name, into the access information management unit as a key-value pair; it returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client; the write operation then ends.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510660935.7A CN105187565A (en) | 2015-10-14 | 2015-10-14 | Method for utilizing network storage data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105187565A (en) | 2015-12-23 |
Family
ID=54909404
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510660935.7A Pending CN105187565A (en) | 2015-10-14 | 2015-10-14 | Method for utilizing network storage data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105187565A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109634965A (en) * | 2018-12-17 | 2019-04-16 | 郑州云海信息技术有限公司 | Backboard configuration information access method, device, equipment and medium |
CN112714155A (en) * | 2020-12-14 | 2021-04-27 | 国电南瑞科技股份有限公司 | Electric power operation data consistency verification method and device based on end cloud cooperative service |
CN114745281A (en) * | 2022-04-11 | 2022-07-12 | 京东科技信息技术有限公司 | Data processing method and device |
CN116541365A (en) * | 2023-07-06 | 2023-08-04 | 成都泛联智存科技有限公司 | File storage method, device, storage medium and client |
CN116541537A (en) * | 2023-06-06 | 2023-08-04 | 简单汇信息科技(广州)有限公司 | Knowledge graph-based enterprise trade information visual display method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102143215A (en) * | 2011-01-20 | 2011-08-03 | 中国人民解放军理工大学 | Network-based PB level cloud storage system and processing method thereof |
CN103501319A (en) * | 2013-09-18 | 2014-01-08 | 北京航空航天大学 | Low-delay distributed storage system for small files |
CN104965845A (en) * | 2014-12-30 | 2015-10-07 | 浙江大华技术股份有限公司 | Small file positioning method and system |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109634965A (en) * | 2018-12-17 | 2019-04-16 | 郑州云海信息技术有限公司 | Backboard configuration information access method, device, equipment and medium |
CN109634965B (en) * | 2018-12-17 | 2021-10-29 | 郑州云海信息技术有限公司 | Backboard configuration information access method, device, equipment and medium |
CN112714155A (en) * | 2020-12-14 | 2021-04-27 | 国电南瑞科技股份有限公司 | Electric power operation data consistency verification method and device based on end cloud cooperative service |
CN114745281A (en) * | 2022-04-11 | 2022-07-12 | 京东科技信息技术有限公司 | Data processing method and device |
CN114745281B (en) * | 2022-04-11 | 2023-12-05 | 京东科技信息技术有限公司 | Data processing method and device |
CN116541537A (en) * | 2023-06-06 | 2023-08-04 | 简单汇信息科技(广州)有限公司 | Knowledge graph-based enterprise trade information visual display method |
CN116541537B (en) * | 2023-06-06 | 2023-11-03 | 简单汇信息科技(广州)有限公司 | Knowledge graph-based enterprise trade information visual display method |
CN116541365A (en) * | 2023-07-06 | 2023-08-04 | 成都泛联智存科技有限公司 | File storage method, device, storage medium and client |
CN116541365B (en) * | 2023-07-06 | 2023-09-15 | 成都泛联智存科技有限公司 | File storage method, device, storage medium and client |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10198356B2 (en) | Distributed cache nodes to send redo log records and receive acknowledgments to satisfy a write quorum requirement | |
US9547706B2 (en) | Using colocation hints to facilitate accessing a distributed data storage system | |
US11005717B2 (en) | Storage capacity evaluation method based on content delivery network application and device thereof | |
US9715507B2 (en) | Techniques for reconciling metadata and data in a cloud storage system without service interruption | |
CN102708165B (en) | Document handling method in distributed file system and device | |
CN107025243B (en) | Resource data query method, query client and query system | |
US9699017B1 (en) | Dynamic utilization of bandwidth for a quorum-based distributed storage system | |
US9317213B1 (en) | Efficient storage of variably-sized data objects in a data store | |
US9330108B2 (en) | Multi-site heat map management | |
CN105187565A (en) | Method for utilizing network storage data | |
US20130339314A1 (en) | Elimination of duplicate objects in storage clusters | |
CN105549905A (en) | Method for multiple virtual machines to access distributed object storage system | |
US20100312749A1 (en) | Scalable lookup service for distributed database | |
CN104111804A (en) | Distributed file system | |
CN104184812B (en) | A kind of multipoint data transmission method based on private clound | |
CN108021717B (en) | Method for implementing lightweight embedded file system | |
US20150149500A1 (en) | Multi-level lookup architecture to facilitate failure recovery | |
EP2710477B1 (en) | Distributed caching and cache analysis | |
CN107003814A (en) | Effective metadata in storage system | |
CN103501319A (en) | Low-delay distributed storage system for small files | |
US8423517B2 (en) | System and method for determining the age of objects in the presence of unreliable clocks | |
CN104660643A (en) | Request response method and device and distributed file system | |
CN105159845A (en) | Memory reading method | |
CN108540510B (en) | Cloud host creation method and device and cloud service system | |
US9380127B2 (en) | Distributed caching and cache analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20151223 |