CN105187565A - Method for utilizing network storage data - Google Patents

Method for utilizing network storage data

Info

Publication number
CN105187565A
Authority
CN
China
Prior art keywords
cloud platform
client
data server
topology information
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510660935.7A
Other languages
Chinese (zh)
Inventor
汪正冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Xiechuang Information Technology Service Co Ltd
Original Assignee
Sichuan Xiechuang Information Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Xiechuang Information Technology Service Co Ltd
Priority to CN201510660935.7A
Publication of CN105187565A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a method for utilizing network storage data. The method includes the following steps: the IDs of the data servers in a cloud platform storage system are hashed with a predefined hash algorithm, and each data server is placed on a ring that covers the whole hash value range; each data server periodically sends a heartbeat message to a host node, and the host node manages the topology information of the cloud platform; a client accesses the host node to obtain the topology information of the cloud platform and caches it locally, hashes the file name, determines the data server that holds the corresponding small file, and performs the read or write operation there. The invention proposes a new small-file-oriented cloud platform storage system and storage method that reduces the number of network connections needed to read and write small files continuously. Storage is more lightweight, latency is lower, the host node load is reduced, service is restored faster after a failure, and the availability of the whole storage system is improved.

Description

A method for utilizing network storage data
Technical field
The present invention relates to data storage, and in particular to a remote storage method.
Background
In recent years, network services have needed to store large numbers of small files, such as pictures, e-mail, e-books, music files, and microblog text. Existing storage and processing systems for massive large files, which consist mainly of a node that stores metadata and nodes that store file data, run into many problems when applied to massive numbers of small files. First, massive numbers of small files bring a large amount of metadata. Because the metadata of every directory and file is kept in the memory of the name node, a large number of small files in the system reduces the storage efficiency and storage capacity of the whole cloud platform storage system. Second, accessing a large number of small files is far slower than accessing a few large files of the same total size, because the client must keep connecting to different data servers, which is an inefficient data access pattern. Existing cloud platform storage systems for massive small files merge small files into large blocks, which reduces the number of local nodes occupied by small files. However, they need several service components to complete a file storage operation, so the read/write flow is cumbersome and a single read or write requires several network connections. They also rely on a heavily loaded name server, which keeps the availability of existing small-file systems low.
No effective solution has yet been proposed for the above problems in the related art.
Summary of the invention
To solve the above problems of the prior art, the present invention proposes a method for utilizing network storage data, comprising:
In a cloud platform storage system, multiple data servers are logically organized into a ring: the ID of each data server is hashed with a predefined hash algorithm, and each data server is placed on a ring covering the whole hash value range according to its hash value;
A host node is arranged in the cloud platform storage system; each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform;
A client accesses the host node to obtain the cloud platform topology information and caches it locally. When the client reads or writes, it hashes the file name, determines the data server that holds the corresponding small file, and performs the read or write operation on the small file at that data server.
Preferably, the small file write process is carried out by the following steps:
(1) If the client is accessing the cloud platform storage system for the first time, it accesses the host node, requests the topology information of the cloud platform, and records it locally; during continuous access, when the access is not the first, the client already holds the cloud platform topology information in its local cache;
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle this small file;
(3) The client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the buffered content of the small file;
(4) The data server first judges whether the cloud platform topology information cached by the client is out of date, that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message. If they are the same, go to step (5). If they differ, the data server compares its topology information with that in the write request message and judges whether the difference affects this write operation; if it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client;
(5) The data server queries the access information management unit to check whether the file name of this small file already exists. If it exists, the client is notified that the file name exists; otherwise go to step (6);
(6) The data server writes the content of the small file into a block through the file block management unit and, at the same time, writes the access information returned by the file block management unit together with the file name, as a key-value pair, into the access information management unit. It returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client. The write operation ends.
The present invention provides a new cloud platform storage system for small files and a corresponding storage method. Compared with the prior art, it has the following advantages:
(1) When the cloud platform topology information has not changed, each read or write in a run of continuous small-file accesses needs only one network connection.
(2) Key-value storage is used to manage the mapping between file names and their access information, so storage is more lightweight and latency is lower.
(3) The host node load is low and service recovers faster after a failure, so the availability of the cloud platform storage system is higher.
Brief description of the drawings
Fig. 1 is a flow chart of the method for utilizing network storage data according to an embodiment of the present invention.
Detailed description
A detailed description of one or more embodiments of the present invention is provided below together with the accompanying drawing that illustrates the principle of the invention. The invention is described in connection with such embodiments, but it is not limited to any embodiment. The scope of the invention is defined only by the claims, and the invention covers many alternatives, modifications, and equivalents. Many details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may also be practiced according to the claims without some or all of them.
One aspect of the present invention provides a method for utilizing network storage data. Fig. 1 is a flow chart of the method according to an embodiment of the present invention.
The present invention first logically organizes all data servers in the cloud platform storage system into a ring. The cloud platform storage system adopts a consistent hashing scheme: the ID of each data server is hashed with a predefined hash algorithm, and each data server is placed on a ring covering the whole hash value range according to its hash value.
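The ring placement and lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name the predefined hash algorithm, so MD5 truncated to 64 bits is a stand-in, and `ring_hash`, `HashRing`, `locate`, and the server IDs are hypothetical names.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string onto the ring (assumption: MD5 stands in for the predefined hash)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, server_ids):
        # Each data server sits at the ring position given by the hash of its ID.
        self._points = sorted((ring_hash(sid), sid) for sid in server_ids)
        self._keys = [point for point, _ in self._points]

    def locate(self, filename: str) -> str:
        """Return the first data server clockwise from the file name's hash."""
        if not self._points:
            raise ValueError("ring has no data servers")
        idx = bisect.bisect_left(self._keys, ring_hash(filename))
        return self._points[idx % len(self._points)][1]  # wrap around the ring


ring = HashRing(["ds-1", "ds-2", "ds-3"])
owner = ring.locate("photo_0001.jpg")
```

With this layout, removing a data server other than `owner` leaves the placement of `photo_0001.jpg` unchanged, which is the property that lets a stale client cache still reach the right server in many cases.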
A host node is arranged in the cloud platform. Each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform. The topology information managed by the host node comprises the list of all active data servers in the cloud platform and the version number of the current topology information. The data server list stores, for each active data server, its ID and the IP address and port on which it listens. The version number of the topology information is a monotonically increasing timestamp. When a new data server joins the cloud platform or an existing data server leaves, the host node generates new topology information, sets its version number to the current timestamp, and sends it to all currently active data servers. In this way every data server holds the same global view of the cloud platform.
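A minimal sketch of the host node's bookkeeping follows. The heartbeat timeout, the millisecond timestamp, the injected `clock`, and the names `HostNode`, `Topology`, `heartbeat`, and `expire` are assumptions made for illustration; the patent only states that the version is a monotonically increasing timestamp. Pushing the new topology to the data servers is omitted.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Topology:
    version: int                                   # monotonically increasing timestamp
    servers: dict = field(default_factory=dict)    # server ID -> (ip, port)


class HostNode:
    def __init__(self, heartbeat_timeout=10.0, clock=time.time):
        self._timeout = heartbeat_timeout          # assumed liveness window, in seconds
        self._clock = clock
        self._last_seen = {}                       # server ID -> time of last heartbeat
        self.topology = Topology(version=0)

    def _publish(self, servers):
        # Keep versions strictly increasing even if the clock repeats a value.
        version = max(int(self._clock() * 1000), self.topology.version + 1)
        self.topology = Topology(version, dict(servers))

    def heartbeat(self, server_id, ip, port):
        self._last_seen[server_id] = self._clock()
        if server_id not in self.topology.servers:     # a new data server joined
            self._publish({**self.topology.servers, server_id: (ip, port)})

    def expire(self):
        now = self._clock()
        alive = {sid: addr for sid, addr in self.topology.servers.items()
                 if now - self._last_seen[sid] <= self._timeout}
        if len(alive) != len(self.topology.servers):   # a data server left
            self._publish(alive)
```

A heartbeat from an already known server only refreshes its liveness, so the version changes only on joins and exits, which keeps client caches valid for as long as the membership is stable.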
The client caches the cloud platform topology information locally. When the client accesses the cloud platform for the first time, it accesses the host node to obtain the topology information and caches it locally; subsequent reads and writes use the locally cached topology information. When the client reads or writes, it first hashes the file name according to the consistent hashing scheme and determines the data server that holds the small file. The version number of the topology information held by the data server is then compared with the version number of the topology information held by the client; if they are the same, the actual read or write operation is carried out on the data server.
A data server has two main units: a file block management unit and an access information management unit. The file block management unit merges small files into large blocks: the cloud platform storage system preallocates large file blocks, and newly written small files are written into them. Given a small file's access information, namely the ID of the block that holds it, its offset within the block, and its size, the file can be retrieved from a data server. The cloud platform storage system uses key-value storage to manage the mapping from file name to access information, that is:
Key: filename → Value: (BlockId, Offset, Size)
The present invention implements a key-value store with persistence and uses it to manage the access information.
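The two units can be illustrated with a small sketch. It assumes in-memory blocks and a plain dictionary in place of the persistent key-value store; `BlockStore`, `AccessInfo`, and the tiny 16-byte block size are hypothetical and chosen only to make block rollover visible.

```python
from typing import Dict, NamedTuple


class AccessInfo(NamedTuple):
    block_id: int
    offset: int
    size: int


class BlockStore:
    """Packs small files into large blocks (held in memory in this sketch)."""

    def __init__(self, block_size: int = 64 * 1024 * 1024):
        self._block_size = block_size
        self._blocks = [bytearray()]

    def append(self, content: bytes) -> AccessInfo:
        if len(content) > self._block_size:
            raise ValueError("file is larger than one block")
        if len(self._blocks[-1]) + len(content) > self._block_size:
            self._blocks.append(bytearray())       # current block is full
        block_id = len(self._blocks) - 1
        offset = len(self._blocks[block_id])
        self._blocks[block_id] += content
        return AccessInfo(block_id, offset, len(content))

    def read(self, info: AccessInfo) -> bytes:
        block = self._blocks[info.block_id]
        return bytes(block[info.offset:info.offset + info.size])


store = BlockStore(block_size=16)
index: Dict[str, AccessInfo] = {}    # stand-in for the persistent key-value store
index["a.txt"] = store.append(b"hello")
index["b.txt"] = store.append(b"small files!")
```

Here `a.txt` lands at offset 0 of block 0, and `b.txt` no longer fits in the 16-byte block, so it starts block 1; `store.read(index["b.txt"])` returns its content with a single lookup.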
The small file write flow in the cloud storage platform is as follows:
(1) If the client is accessing the cloud platform storage system for the first time, it accesses the host node, requests the topology information of the cloud platform, and records it locally. During continuous access, when the access is not the first, the client already holds the cloud platform topology information in its local cache.
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle this small file.
(3) The client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the buffered content of the small file.
(4) The data server first judges whether the cloud platform topology information cached by the client is out of date, that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message. If they are the same, go to step (5). If they differ, the data server compares its topology information with that in the write request message and judges whether the difference affects this write operation; if it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client, and the write operation ends.
(5) The data server queries the access information management unit to check whether the file name of this small file already exists. If it exists, the client is notified that the file name exists; otherwise go to step (6).
(6) The data server writes the content of the small file into a block through the file block management unit and, at the same time, writes the access information returned by the file block management unit together with the file name, as a key-value pair, into the access information management unit. It returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client. The write operation ends.
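Steps (4) to (6) on the data server side can be sketched as below. The patent does not say how the server decides whether a topology difference affects a write; this sketch assumes the write is unaffected when the server is still the owner of the file name under its own, newer topology. `DataServer`, `handle_write`, `owner_of`, the reply dictionaries, and the single in-memory block are all hypothetical.

```python
import bisect
import hashlib


def ring_hash(key):
    # Assumption: MD5 stands in for the predefined hash algorithm.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def owner_of(filename, server_ids):
    """First server clockwise from the file name's position on the ring."""
    points = sorted((ring_hash(s), s) for s in server_ids)
    idx = bisect.bisect_left([p for p, _ in points], ring_hash(filename))
    return points[idx % len(points)][1]


class DataServer:
    def __init__(self, server_id, version, server_ids):
        self.server_id = server_id
        self.version = version             # topology version pushed by the host node
        self.server_ids = list(server_ids)
        self.index = {}                    # filename -> (block_id, offset, size)
        self.block = bytearray()           # a single block, for brevity

    def _topology(self):
        return {"version": self.version, "servers": list(self.server_ids)}

    def handle_write(self, client_version, filename, content):
        reply = {}
        if client_version != self.version:                          # step (4)
            if owner_of(filename, self.server_ids) != self.server_id:
                return {"status": "failed", "topology": self._topology()}
            reply["topology"] = self._topology()                    # needs-update mark
        if filename in self.index:                                  # step (5)
            return {"status": "exists"}
        offset = len(self.block)                                    # step (6)
        self.block += content
        self.index[filename] = (0, offset, len(content))
        return {**reply, "status": "ok"}
```

A successful reply carries a `topology` entry only when the client's version was stale, so in the common case the whole write costs one request and one small response.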
The small file read flow in the cloud storage platform is as follows:
(1) If the client is accessing the cloud platform storage system for the first time, it accesses the host node, requests the topology information of the cloud platform, and records it locally. During continuous access, when the access is not the first, the client already holds the cloud platform topology information in its local cache.
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle this small file.
(3) The client accesses the data server obtained in step (2). The data server judges whether the version number of the cloud platform topology information attached to the client's read request matches the version number of the topology information recorded on the data server. If they match, go to step (4); if not, the request is marked as needing an update.
(4) The data server queries the access information management unit for the file name of this small file to check whether it exists. If it exists, the access information is read out and processing goes to step (5). If it does not exist, a file-not-found message is sent to the client; if the needs-update mark was set in step (3), the new cloud platform topology information is attached to the file-not-found message so that the client updates the topology information in its cache. The read operation ends.
(5) Using the access information obtained in step (4), the data server reads the content of the small file from the file block management unit and sends it to the client. If the request was marked as needing an update, the new cloud platform topology information is attached to the message. The read operation ends.
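The read path in steps (3) to (5) can be sketched in the same style. `ReadServer`, `handle_read`, and the reply format are hypothetical, and the index and blocks are in-memory stand-ins for the access information management unit and the file block management unit.

```python
class ReadServer:
    """Read path of a data server; storage is an in-memory stand-in."""

    def __init__(self, version, servers):
        self.version = version     # topology version recorded on this server
        self.servers = servers
        self.index = {}            # filename -> (block_id, offset, size)
        self.blocks = {}           # block_id -> bytes

    def handle_read(self, client_version, filename):
        reply = {}
        if client_version != self.version:                  # step (3): mark for update
            reply["topology"] = {"version": self.version,
                                 "servers": list(self.servers)}
        info = self.index.get(filename)                     # step (4)
        if info is None:
            return {**reply, "status": "not_found"}
        block_id, offset, size = info                       # step (5)
        data = self.blocks[block_id][offset:offset + size]
        return {**reply, "status": "ok", "content": data}
```

Both the success reply and the file-not-found reply carry the new topology when the client's version is stale, mirroring the piggybacking described in steps (4) and (5).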
As the read and write flows above show, during continuous access a client of the cloud platform storage system reads the cloud platform topology information from the host node on its first access and caches it. As long as the cloud platform topology does not change, each subsequent read or write uses the cached topology information to determine directly which data server to access, and a single access to that data server completes the read or write request.
When the cloud platform topology changes, a subsequent read or write request from the client first uses the stale topology information in the client's cache to determine which data server to access. If the connection succeeds and the read or write is carried out correctly, the data server has judged that the topology change (a node joining or leaving) does not affect the request for this small file; after completing the request, the client reads the latest topology information attached to the reply message and updates its stale local cache, so that later requests access the cloud platform storage system with the latest topology information. If the connection fails, the node the client wanted to access has failed and left; the client must access the host node once more to obtain the latest topology information and then retry the read or write. If the access succeeds but the data server judges that the topology change affects the request, the data server replies that the read or write failed and attaches the latest topology information, and the client retries the read or write. The cloud platform storage system therefore simplifies the read/write flow and reduces the number of network connections in each read or write. Experimental results show that this improvement effectively reduces latency. In addition, the cloud platform storage system uses more lightweight key-value storage to manage access information. When there is a large amount of access information and it is written continuously, a key-value store often shows lower latency and higher throughput than a traditional database such as MySQL.
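The client-side handling of the three outcomes above (success with a piggybacked update, connection failure, and rejection because the topology change affects the request) can be sketched as a retry loop. The transport is abstracted behind injected callables; `Client`, `StaleTopology`, `fetch_topology`, `locate`, `send`, and the retry limit are hypothetical and not part of the patent.

```python
class StaleTopology(Exception):
    """Raised by the transport when the data server rejects the request."""

    def __init__(self, topology):
        super().__init__("topology is stale")
        self.topology = topology


class Client:
    def __init__(self, fetch_topology, locate, send, max_attempts=3):
        self._fetch = fetch_topology   # () -> topology dict from the host node
        self._locate = locate          # (topology, filename) -> server ID
        self._send = send              # (server, topology version, filename) -> reply dict
        self._max_attempts = max_attempts
        self.topology = None           # local cache of the topology information

    def read(self, filename):
        for _ in range(self._max_attempts):
            if self.topology is None:              # first access, or cache dropped
                self.topology = self._fetch()
            server = self._locate(self.topology, filename)
            try:
                reply = self._send(server, self.topology["version"], filename)
            except ConnectionError:                # target data server has left
                self.topology = None               # ask the host node again
                continue
            except StaleTopology as exc:           # change affects this request
                self.topology = exc.topology
                continue
            if "topology" in reply:                # piggybacked update
                self.topology = reply["topology"]
            return reply
        raise RuntimeError("request failed after retries")
```

The host node is contacted only on the first access and after a connection failure, which is why its load stays low while the topology is stable.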
In summary, the present invention proposes a new cloud platform storage system for small files and a corresponding storage method. It reduces the number of network connections needed to read and write small files continuously, makes storage more lightweight, lowers latency, and reduces the load on the host node so that service recovers faster after a failure, which improves the availability of the whole storage system.
Those skilled in the art should understand that each module or step of the present invention described above can be implemented with a general-purpose computing system. The modules or steps can be concentrated on a single computing system or distributed over a network formed by multiple computing systems. Optionally, they can be implemented as program code executable by a computing system, so that they can be stored in a storage system and executed by the computing system. The present invention is therefore not restricted to any specific combination of hardware and software.
It should be understood that the above embodiments of the present invention are only used to illustrate or explain the principle of the invention and do not limit it. Any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present invention shall therefore fall within its scope of protection. In addition, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the claims, or the equivalents of such scope and boundaries.

Claims (2)

1. A method for utilizing network storage data, characterized by comprising:
In a cloud platform storage system, multiple data servers are logically organized into a ring: the ID of each data server is hashed with a predefined hash algorithm, and each data server is placed on a ring covering the whole hash value range according to its hash value;
A host node is arranged in the cloud platform storage system; each data server periodically sends a heartbeat message to the host node, and the host node uses these messages to manage the topology information of the cloud platform;
A client accesses the host node to obtain the cloud platform topology information and caches it locally. When the client reads or writes, it hashes the file name, determines the data server that holds the corresponding small file, and performs the read or write operation on the small file at that data server.
2. The method according to claim 1, characterized by further comprising carrying out the small file write process by the following steps:
(1) If the client is accessing the cloud platform storage system for the first time, it accesses the host node, requests the topology information of the cloud platform, and records it locally; during continuous access, when the access is not the first, the client already holds the cloud platform topology information in its local cache;
(2) The client hashes the file name and determines, according to the consistent hashing algorithm, which data server should handle this small file;
(3) The client accesses the data server obtained in step (2) and sends it the cached cloud platform topology information, the file name of the small file, and the buffered content of the small file;
(4) The data server first judges whether the cloud platform topology information cached by the client is out of date, that is, it compares the version number of the topology information it has recorded with the version number of the topology information in the client's write request message. If they are the same, go to step (5). If they differ, the data server compares its topology information with that in the write request message and judges whether the difference affects this write operation; if it does not, the request is marked as needing an update and processing goes to step (5); otherwise the data server notifies the client that the write operation failed and sends the new cloud platform topology information to the client;
(5) The data server queries the access information management unit to check whether the file name of this small file already exists. If it exists, the client is notified that the file name exists; otherwise go to step (6);
(6) The data server writes the content of the small file into a block through the file block management unit and, at the same time, writes the access information returned by the file block management unit together with the file name, as a key-value pair, into the access information management unit. It returns a write success message to the client; if the needs-update mark is set, it also sends the new cloud platform topology information to the client. The write operation ends.
CN201510660935.7A 2015-10-14 2015-10-14 Method for utilizing network storage data Pending CN105187565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510660935.7A CN105187565A (en) 2015-10-14 2015-10-14 Method for utilizing network storage data

Publications (1)

Publication Number Publication Date
CN105187565A true CN105187565A (en) 2015-12-23

Family

ID=54909404

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510660935.7A Pending CN105187565A (en) 2015-10-14 2015-10-14 Method for utilizing network storage data

Country Status (1)

Country Link
CN (1) CN105187565A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102143215A (en) * 2011-01-20 2011-08-03 中国人民解放军理工大学 Network-based PB level cloud storage system and processing method thereof
CN103501319A (en) * 2013-09-18 2014-01-08 北京航空航天大学 Low-delay distributed storage system for small files
CN104965845A (en) * 2014-12-30 2015-10-07 浙江大华技术股份有限公司 Small file positioning method and system

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109634965A (en) * 2018-12-17 2019-04-16 郑州云海信息技术有限公司 Backboard configuration information access method, device, equipment and medium
CN109634965B (en) * 2018-12-17 2021-10-29 郑州云海信息技术有限公司 Backboard configuration information access method, device, equipment and medium
CN112714155A (en) * 2020-12-14 2021-04-27 国电南瑞科技股份有限公司 Electric power operation data consistency verification method and device based on end cloud cooperative service
CN114745281A (en) * 2022-04-11 2022-07-12 京东科技信息技术有限公司 Data processing method and device
CN114745281B (en) * 2022-04-11 2023-12-05 京东科技信息技术有限公司 Data processing method and device
CN116541537A (en) * 2023-06-06 2023-08-04 简单汇信息科技(广州)有限公司 Knowledge graph-based enterprise trade information visual display method
CN116541537B (en) * 2023-06-06 2023-11-03 简单汇信息科技(广州)有限公司 Knowledge graph-based enterprise trade information visual display method
CN116541365A (en) * 2023-07-06 2023-08-04 成都泛联智存科技有限公司 File storage method, device, storage medium and client
CN116541365B (en) * 2023-07-06 2023-09-15 成都泛联智存科技有限公司 File storage method, device, storage medium and client

Similar Documents

Publication Publication Date Title
US10198356B2 (en) Distributed cache nodes to send redo log records and receive acknowledgments to satisfy a write quorum requirement
US9547706B2 (en) Using colocation hints to facilitate accessing a distributed data storage system
US11005717B2 (en) Storage capacity evaluation method based on content delivery network application and device thereof
US9715507B2 (en) Techniques for reconciling metadata and data in a cloud storage system without service interruption
CN102708165B (en) Document handling method in distributed file system and device
CN107025243B (en) Resource data query method, query client and query system
US9699017B1 (en) Dynamic utilization of bandwidth for a quorum-based distributed storage system
US9317213B1 (en) Efficient storage of variably-sized data objects in a data store
US9330108B2 (en) Multi-site heat map management
CN105187565A (en) Method for utilizing network storage data
US20130339314A1 (en) Elimination of duplicate objects in storage clusters
CN105549905A (en) Method for multiple virtual machines to access distributed object storage system
US20100312749A1 (en) Scalable lookup service for distributed database
CN104111804A (en) Distributed file system
CN104184812B (en) A kind of multipoint data transmission method based on private clound
CN108021717B (en) Method for implementing lightweight embedded file system
US20150149500A1 (en) Multi-level lookup architecture to facilitate failure recovery
EP2710477B1 (en) Distributed caching and cache analysis
CN107003814A (en) Effective metadata in storage system
CN103501319A (en) Low-delay distributed storage system for small files
US8423517B2 (en) System and method for determining the age of objects in the presence of unreliable clocks
CN104660643A (en) Request response method and device and distributed file system
CN105159845A (en) Memory reading method
CN108540510B (en) Cloud host creation method and device and cloud service system
US9380127B2 (en) Distributed caching and cache analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20151223
