CN104850548B - Method and system for implementing big data platform input/output processing - Google Patents

Method and system for implementing big data platform input/output processing

Info

Publication number
CN104850548B
CN104850548B (application CN201410050179.1A)
Authority
CN
China
Prior art keywords
server
signal
write
primary server
primary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410050179.1A
Other languages
Chinese (zh)
Other versions
CN104850548A (en)
Inventor
鲁瑞
侯建卫
王晓颖
李栓林
付长冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Shanxi Co Ltd
Original Assignee
China Mobile Group Shanxi Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Shanxi Co Ltd filed Critical China Mobile Group Shanxi Co Ltd
Priority to CN201410050179.1A priority Critical patent/CN104850548B/en
Publication of CN104850548A publication Critical patent/CN104850548A/en
Application granted granted Critical
Publication of CN104850548B publication Critical patent/CN104850548B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

An embodiment of the invention discloses a method for implementing big data platform I/O processing: after a Primary server writes the data pushed by a client into its own cache, it sends the master a signal indicating that the Primary server's cache has been written; the Primary server feeds back a write-completion signal to the client; after the Primary server sends the client-pushed data to each Secondary server, each Secondary server asynchronously writes the data into its own cache; after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, it sends the master a signal indicating that the Secondary servers' caches have been written. An embodiment of the invention further discloses a system for implementing big data platform I/O processing.

Description

Method and system for implementing big data platform input/output processing
Technical field
The present invention relates to cloud storage technology, and in particular to a method and system for implementing big data platform input/output (Input/Output, I/O) processing.
Background art
With the rapid development of Internet business, massive data of many forms appear within a short time; it has been predicted that the total data volume will reach 35 ZB by 2020. The core challenges of the big data era are large volume, numerous varieties and high velocity, and various big data platforms have therefore come into being.
At present, people mainly use the Hadoop Distributed File System (HDFS), which follows the Google File System (GFS) design, together with the MapReduce parallel environment and data warehouses such as HBase or Hive, to process and apply various big data services.
In practical systems, distributed file systems such as GFS, HDFS and the Parallel Network File System (pNFS) share a similar basic structure; taking GFS as an example, it is shown in Figure 1. Each GFS deployment includes one primary storage server (master) and multiple block storage servers (chunk servers); multiple clients interact with the master and with each chunk server through the GFS client (client), thereby accessing the cloud storage data in GFS.
In a concrete implementation, GFS stores the metadata (metadata) on the master, while the actual data to be stored is kept on the chunk servers. The metadata service mainly serves the client and the master for operations on information such as chunk server locations and block data positions, while the read and write of the actual data is completed directly with each chunk server. In the GFS and HDFS distributed file systems, the main read process is:
(1) Using the fixed chunk (chunk) size, the client converts the file name (file name) and the byte offset specified by the application into a chunk index (chunk index) within the file;
(2) The client sends the master a request containing the file name and the chunk index;
(3) The master returns a response to the client, containing the chunk handle (chunk handle) and the locations of the chunk servers; here, the location information covers multiple chunk servers;
(4) The client caches the chunk handle and the chunk server locations, keyed by the file name and the chunk index;
(5) The client sends a request to one of the chunk servers, generally the nearest one; the request specifies the chunk handle and a byte range (chunk servers identify a chunk by its chunk handle);
(6) The chunk server sends the specified data to the client.
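The read process above can be sketched roughly as follows. This is a minimal illustrative model, not an implementation of GFS or of this patent; the class names (`Master`, `Client`), `CHUNK_SIZE`, and `plan_read` are all assumptions introduced here.

```python
# Hypothetical sketch of the GFS-style read path (steps (1)-(6) above).
CHUNK_SIZE = 64 * 1024 * 1024  # fixed chunk size; 64 MB is the classic GFS value

class Master:
    def __init__(self, locations):
        # metadata: (file_name, chunk_index) -> (chunk_handle, [chunk server addresses])
        self.locations = locations

    def lookup(self, file_name, chunk_index):      # steps (2)-(3)
        return self.locations[(file_name, chunk_index)]

class Client:
    def __init__(self, master):
        self.master = master
        self.cache = {}                            # step (4): keyed by (file_name, chunk_index)

    def plan_read(self, file_name, byte_offset):
        chunk_index = byte_offset // CHUNK_SIZE    # step (1): offset -> chunk index
        key = (file_name, chunk_index)
        if key not in self.cache:                  # steps (2)-(4)
            self.cache[key] = self.master.lookup(file_name, chunk_index)
        handle, servers = self.cache[key]
        # step (5): pick one server (here simply the first, standing in for "nearest")
        return servers[0], handle, byte_offset % CHUNK_SIZE
```

A second `plan_read` for the same chunk is served from the client-side cache without contacting the master, which is the point of step (4).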
Here, the main write process is:
(1) The client asks the master for the location of the primary replica server (Primary server) of the current chunk and the locations of the other replica servers (Secondary servers);
(2) The master feeds back a response containing the location of the Primary server holding the lease for the current chunk and the locations of the other Secondary servers; the client can cache this data to avoid frequently accessing the master;
(3) The client pushes the data to the chunk servers; here, the chunk servers comprise the Primary server and the Secondary servers;
(4) Once all the chunk servers have received the data, the client sends a write request to the Primary server;
(5) The Primary server modifies its own local state in serial number order (serial number order);
(6) The Primary server forwards the write request to all the Secondary servers; each Secondary server applies the modification in the same serial number order;
(7) The Secondary servers send responses back to the Primary server, indicating that the write has been completed;
(8) The Primary server returns a completion response to the client.
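The synchronous write path above can be sketched as follows; the key property is that the client's completion response arrives only after every Secondary server has applied the mutation. All names (`Replica`, `synchronous_write`, the string signals) are illustrative assumptions, not identifiers from GFS or from this patent.

```python
# Minimal sketch of the synchronous write path (steps (1)-(8) above): data is
# pushed to every replica, and the client is answered only after all
# Secondary servers have applied the mutation.
class Replica:
    def __init__(self):
        self.buffered = {}   # step (3): data pushed by the client
        self.state = []      # mutations applied in serial number order

    def push(self, serial, data):
        self.buffered[serial] = data

    def apply(self, serial):                 # steps (5)/(6): apply in order
        self.state.append(self.buffered.pop(serial))
        return "write completed"             # step (7): response to the primary

def synchronous_write(primary, secondaries, serial, data):
    for replica in [primary] + secondaries:  # step (3): push data to all replicas
        replica.push(serial, data)
    primary.apply(serial)                    # step (5): primary mutates first
    acks = [s.apply(serial) for s in secondaries]      # steps (6)-(7)
    if all(a == "write completed" for a in acks):      # step (8): ack the client
        return "completed"
    return "failed"
```

This whole-pipeline wait is exactly what the embodiments below set out to remove from the client-visible path.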
In the prior art, the HDFS+MapReduce+Hive+HBase big data platform based on the Hadoop framework has high scalability, high reliability and high fault tolerance. However, in actual big data processing, such as Wireless Application Protocol (WAP) Internet logs (Web log, Blog), large-scale user mail systems, Blog log analysis, and user information tracking and analysis, current big data platforms are defective in their data I/O processing; in particular, for unstructured, semi-structured, large-volume services, I/O processing speed is a serious problem.
The main problems include:
(1) With large data volumes, especially under continuous writes, I/O performance is slow, and the I/O speed-up is non-linear in the number of server nodes;
(2) When processing unstructured and semi-structured data such as LOG, BLOG, video and social-relationship information, storage is not optimized for the types and characteristics of big data, so processing is slow;
(3) Writing multiple chunk servers synchronously makes the synchronization time long when the network or the storage devices are congested, so data-consistency processing takes a long time;
(4) The read/write speed and physical condition of the chunk servers' storage devices are not taken into account, and the read/write performance of some Secondary servers seriously degrades the I/O performance of the whole system.
Therefore, in big data platforms, I/O processing performance and the consistency management strategy are currently the core problems and the key factors affecting the overall I/O performance of the platform.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a method and system for implementing big data platform I/O processing that can effectively improve big data platform I/O performance while solving the data consistency problem.
To achieve the above objectives, the technical solution of the invention is realized as follows:
An embodiment of the present invention provides a method for implementing big data platform input/output (I/O) processing, the method including:
after a primary replica server (Primary server) writes the data pushed by a client into its own cache, sending the primary storage server (master) a signal indicating that the Primary server's cache has been written;
the Primary server feeding back a write-completion signal to the client;
after the Primary server sends the client-pushed data to each replica server (Secondary server), each Secondary server asynchronously writing the data into its own cache;
after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, sending the master a signal indicating that the Secondary servers' caches have been written.
In the above scheme, each Secondary server asynchronously writing the data into its own cache includes:
the Secondary server asynchronously waiting for and receiving the write command and the specific data from the Primary server, then writing the specific data into its own cache; afterwards, feeding back a cache-write-completion signal to the Primary server.
In the above scheme, the method further includes:
after sending the master the signal indicating that the Secondary servers' caches have been written, the Primary server directly writing the data block to be stored into its own storage; afterwards, sending the master a signal indicating that the Primary server has been written;
the Primary server asynchronously waiting for the feedback signals sent by all the Secondary servers after their writes complete, and checking the feedback signals;
when the feedback signals are write-completion signals, releasing its own cache space; afterwards, sending the master a signal indicating that the Secondary servers have been written.
In the above scheme, the write operation of the Secondary server includes:
after sending the cache-write-completion signal, the Secondary server asynchronously waiting for the detection signal sent by the Primary server while directly writing the data to be stored into its own storage; after receiving the detection signal, and after judging that the current write has completed, feeding back a write-completion signal to the Primary server and releasing its own cache space.
In the above scheme, the method further includes: when a write-failure signal appears among the feedback signals, the Primary server determining the Secondary server that fed back the write-failure signal and transferring the data block in its own cache to that Secondary server; afterwards, the Primary server continuing to asynchronously wait for the feedback signals sent by all the Secondary servers after their writes complete and checking the feedback signals.
An embodiment of the present invention further provides a system for implementing big data platform I/O processing, the system including: a Primary server, Secondary servers, a client and a master; wherein
the Primary server is configured to, after writing the data pushed by the client into its own cache, send the master a signal indicating that the Primary server's cache has been written; feed back a write-completion signal to the client; send the client-pushed data to each Secondary server; and, after asynchronously waiting for and receiving the cache-write-completion signals fed back by all the Secondary servers, send the master a signal indicating that the Secondary servers' caches have been written;
the Secondary servers are configured to asynchronously receive the data sent by the Primary server and write it into their own caches;
the client is configured to push data to the Primary server and to receive the write-completion signal sent by the Primary server;
the master is configured to receive the signal, sent by the Primary server, indicating that the Primary server's cache has been written, and to receive the signal, sent by the Primary server, indicating that the Secondary servers' caches have been written.
In the above scheme, the Secondary server asynchronously receiving the data sent by the Primary server and writing it into its own cache includes: the Secondary server asynchronously waiting for and receiving the write command and the specific data from the Primary server, then writing the specific data into its own cache, and feeding back a cache-write-completion signal to the Primary server.
In the above scheme, the Primary server is further configured to, after sending the master the signal indicating that the Secondary servers' caches have been written, directly write the data block to be stored into its own storage; send the master a signal indicating that the Primary server has been written; asynchronously wait for the feedback signals sent by all the Secondary servers after their writes complete and check the feedback signals; when the check determines that the feedback signals are write-completion signals, release its own cache space; and send the master a signal indicating that the Secondary servers have been written;
the master is further configured to receive the signal, sent by the Primary server, indicating that the Primary server has been written, and to receive the signal, sent by the Primary server, indicating that the Secondary servers have been written.
In the above scheme, during the write operation, the Secondary server is further configured to, after sending the cache-write-completion signal, asynchronously wait for the detection signal sent by the Primary server while directly writing the data to be stored into its own storage; and, after receiving the detection signal and judging that the current write has completed, feed back a write-completion signal to the Primary server and release its own cache space.
In the above scheme, the Primary server is further configured to, when a write-failure signal appears among the feedback signals of the Secondary servers, determine the Secondary server that fed back the write-failure signal and transfer the data block in its own cache to that Secondary server; the Primary server then continues to asynchronously wait for the feedback signals sent by all the Secondary servers after their writes complete and to check the feedback signals.
In the method and system for implementing big data platform I/O processing provided by the embodiments of the present invention, after the Primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the Primary server's cache has been written; the Primary server feeds back a write-completion signal to the client; after the Primary server sends the client-pushed data to each Secondary server, each Secondary server asynchronously writes the data into its own cache; after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, it sends the master a signal indicating that the Secondary servers' caches have been written. In this way, big data platform I/O performance can be effectively improved. Further, after sending the master the signal indicating that the Secondary servers' caches have been written, the Primary server directly writes the data block to be stored into its own storage and then sends the master a signal indicating that the Primary server has been written; the Primary server asynchronously waits for the feedback signals sent by all the Secondary servers after their writes complete and checks them; when the feedback signals are write-completion signals, it releases its own cache space and then sends the master a signal indicating that the Secondary servers have been written, thereby effectively solving the data consistency problem and improving the overall I/O performance of the big data platform.
Description of the drawings
Fig. 1 is a schematic diagram of the basic structure of GFS;
Fig. 2 is a schematic flowchart of the method for implementing big data platform I/O processing according to an embodiment of the present invention;
Fig. 3 is a detailed flowchart of the method for implementing big data platform I/O processing according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the composition of the system for implementing big data platform I/O processing according to an embodiment of the present invention.
Specific embodiment
At present, the key factor limiting the I/O access performance of the GFS and HDFS distributed file systems lies in the data write operation; that is, in order to keep the data of the Primary server and the Secondary servers consistent, on the one hand all the Secondary servers write directly to storage, and on the other hand the Primary server handles the follow-up work only after all the Secondary servers have completed all their storage I/O and responded. Thus a single write operation requires every Secondary server to complete its own I/O processing, so the I/O processing time of the whole system is long; in particular, when some Secondary servers are busy or their network transmission is seriously affected, the I/O write time becomes unpredictable, which in turn degrades the I/O performance of the whole system.
In the embodiments of the present invention, after the Primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the Primary server's cache has been written; the Primary server feeds back a write-completion signal to the client; after the Primary server sends the client-pushed data to each Secondary server, each Secondary server asynchronously writes the data into its own cache; after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, it sends the master a signal indicating that the Secondary servers' caches have been written.
Here, in a big data platform, the algorithm that implements write buffering in a cache under a distributed environment, without writing the data directly to storage, also called the buffered-write (Cache-Write) algorithm, can effectively improve the overall I/O performance of the system. However, the biggest problem of this algorithm is data consistency.
Therefore, further, on the basis of the Cache-Write algorithm, a method and mechanism for guaranteeing data consistency is added; this mechanism is also called the direct read/write (Direct-IO) mechanism. Specifically,
after sending the master the signal indicating that the Secondary servers' caches have been written, the Primary server directly writes the data block to be stored into its own storage; afterwards, it sends the master a signal indicating that the Primary server has been written; the Primary server asynchronously waits for the feedback signals sent by all the Secondary servers after their writes complete and checks them; when the feedback signals are write-completion signals, it releases its own cache space; afterwards, it sends the master a signal indicating that the Secondary servers have been written.
In this way, the big data platform I/O processing method based on the Cache-Write algorithm and the Direct-IO mechanism not only achieves higher I/O performance than the HDFS distributed file system, but also provides strict data consistency.
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 2 is a schematic flowchart of the method for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 2, the method includes:
Step S100: after the Primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the Primary server's cache has been written;
Step S101: the Primary server feeds back a write-completion signal to the client;
Step S102: after the Primary server sends the client-pushed data to each Secondary server, each Secondary server asynchronously writes the data into its own cache;
Step S103: after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, it sends the master a signal indicating that the Secondary servers' caches have been written.
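The Cache-Write flow of steps S100 to S103 can be sketched as follows. Threads stand in for the asynchronous waits, and the "signals" are plain strings appended to log lists; the class `Server`, the function `cache_write` and the signal strings are all assumptions made for illustration, not part of the patent.

```python
# Sketch of the Cache-Write flow of Fig. 2 (steps S100-S103): the client is
# acknowledged as soon as the Primary server's cache holds the data, and the
# Secondary caches are filled asynchronously afterwards.
import queue
import threading

class Server:
    def __init__(self):
        self.cache = []

def cache_write(primary, secondaries, master_log, client_log, data):
    primary.cache.append(data)                     # S100: write own cache
    master_log.append("primary cache written")     # S100: signal the master
    client_log.append("write complete")            # S101: early ack to the client
    acks = queue.Queue()

    def fill_cache(secondary):                     # S102: asynchronous cache write
        secondary.cache.append(data)
        acks.put("cache-write complete")           # feedback to the primary

    workers = [threading.Thread(target=fill_cache, args=(s,)) for s in secondaries]
    for w in workers:
        w.start()
    for _ in secondaries:                          # S103: wait for every feedback
        acks.get()
    for w in workers:
        w.join()
    master_log.append("secondary caches written")  # S103: signal the master
```

Note that the client is acknowledged at S101, before any Secondary server has been touched; that early acknowledgement is the source of the performance gain claimed above.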
Fig. 3 is a detailed flowchart of the method for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 3, the method specifically includes:
Step S200: after the Primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the Primary server's cache has been written;
Here, while the Primary server writes the client-pushed data into its own cache, if the data is too large and its own cache does not have enough space, it asynchronously performs the I/O storage operation.
Step S201: the Primary server feeds back a write-completion signal to the client;
Step S202: after the Primary server sends the client-pushed data to each Secondary server, each Secondary server asynchronously writes the data into its own cache;
Here, each Secondary server asynchronously writing the data into its own cache specifically includes:
the Secondary server asynchronously waiting for and receiving the write command and the specific data from the Primary server, then writing the specific data into its own cache; afterwards, feeding back a cache-write-completion signal to the Primary server.
Here, while the Secondary server writes the specific data into its own cache, if the data is too large and its own cache does not have enough space, it asynchronously performs the I/O storage operation.
Step S203: after the Primary server has asynchronously waited for and received the cache-write-completion signals fed back by all the Secondary servers, it sends the master a signal indicating that the Secondary servers' caches have been written.
Step S204: after sending the master the signal indicating that the Secondary servers' caches have been written, the Primary server directly writes the data block to be stored into its own storage; afterwards, it sends the master a signal indicating that the Primary server has been written;
Step S205: the Primary server asynchronously waits for the feedback signals sent by all the Secondary servers after their writes complete and checks the feedback signals;
Step S206: when the feedback signals are write-completion signals, it releases its own cache space; afterwards, it sends the master a signal indicating that the Secondary servers have been written.
Here, when a write-failure signal appears among the feedback signals, the Primary server determines the Secondary server that fed back the write-failure signal and transfers the data block in its own cache to that Secondary server; afterwards, the Primary server continues to asynchronously wait for the feedback signals sent by all the Secondary servers after their writes complete and to check the feedback signals.
Here, the write operation of the Secondary server includes:
after sending the cache-write-completion signal, the Secondary server asynchronously waits for the detection signal sent by the Primary server while directly writing the data to be stored into its own storage; after receiving the detection signal, and after judging that the current write has completed, it feeds back a write-completion signal to the Primary server and releases its own cache space.
Here, when the judgment determines that the current write has not completed, the Secondary server keeps waiting until the write completes, then feeds back a write-completion signal to the Primary server and releases its own cache space.
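The Direct-IO consistency phase of steps S204 to S206, including the retransmission on a write-failure signal, can be sketched as follows. Disks are modeled as lists and a flaky secondary fails once before succeeding; every name here (`Node`, `direct_io`, the signal strings) is an illustrative assumption, not an identifier from the patent.

```python
# Sketch of the Direct-IO consistency phase of Fig. 3 (steps S204-S206),
# including the retry when a Secondary feeds back a write-failure signal.
class Node:
    def __init__(self, fail_once=False):
        self.cache = []
        self.disk = []
        self._fail_once = fail_once

    def write_disk(self, block):          # flush the block to own storage
        if self._fail_once:
            self._fail_once = False
            return "write failed"         # write-failure feedback signal
        self.disk.append(block)
        return "write complete"

def direct_io(primary, secondaries, master_log, block):
    primary.cache.append(block)
    primary.disk.append(block)            # S204: primary writes its own storage
    master_log.append("primary written")  # S204: signal the master
    pending = list(secondaries)
    while pending:                        # S205: wait for and check all feedback
        retry = []
        for sec in pending:
            if sec.write_disk(block) == "write complete":
                sec.cache.clear()         # secondary releases its cache space
            else:
                sec.cache.append(block)   # retransmit the cached data block
                retry.append(sec)
        pending = retry
    primary.cache.clear()                 # S206: release own cache space
    master_log.append("secondary written")
```

The retry loop models the "Here" note above: a failed Secondary receives the Primary's cached copy of the block and is polled again until every feedback signal is a write-completion signal.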
Fig. 4 is a schematic diagram of the composition of the system for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 4, the system includes: a Primary server 10, Secondary servers 11, a client 12 and a master 13; wherein
the Primary server 10 is configured to, after writing the data pushed by the client 12 into its own cache, send the master 13 a signal indicating that the cache of the Primary server 10 has been written; feed back a write-completion signal to the client 12; send the data pushed by the client 12 to each Secondary server 11; and, after asynchronously waiting for and receiving the cache-write-completion signals fed back by all the Secondary servers 11, send the master 13 a signal indicating that the Secondary servers' caches have been written;
the Secondary server 11 is configured to asynchronously receive the data sent by the Primary server 10 and write it into its own cache;
specifically, the Secondary server 11 asynchronously waits for and receives the write command and the specific data from the Primary server 10, then writes the specific data into its own cache, and feeds back a cache-write-completion signal to the Primary server 10;
the client 12 is configured to push data to the Primary server 10 and to receive the write-completion signal sent by the Primary server 10;
the master 13 is configured to receive the signal, sent by the Primary server 10, indicating that the cache of the Primary server 10 has been written, and to receive the signal, sent by the Primary server 10, indicating that the caches of the Secondary servers 11 have been written.
Further, after the Direct-IO mechanism for improving data consistency is added to the Cache-Write algorithm,
the Primary server 10 is further configured to, after sending the master 13 the signal indicating that the caches of the Secondary servers 11 have been written, directly write the data block to be stored into its own storage; send the master 13 a signal indicating that the Primary server 10 has been written; asynchronously wait for the feedback signals sent by all the Secondary servers 11 after their writes complete and check the feedback signals; when the check determines that the feedback signals are write-completion signals, release its own cache space; and send the master 13 a signal indicating that the Secondary servers 11 have been written;
the master 13 is further configured to receive the signal, sent by the Primary server 10, indicating that the Primary server 10 has been written, and to receive the signal, sent by the Primary server 10, indicating that the Secondary servers 11 have been written.
Here, during the write operation, the Secondary server 11 specifically:
is further configured to, after sending the cache-write-completion signal, asynchronously wait for the detection signal sent by the Primary server 10 while directly writing the data to be stored into its own storage; and, after receiving the detection signal and judging that the current write has completed, feed back a write-completion signal to the Primary server 10 and release its own cache space.
Here, when a write-failure signal appears among the feedback signals of the Secondary servers 11, the Primary server 10 is further configured to determine the Secondary server 11 that fed back the write-failure signal and to transfer the data block in its own cache to that Secondary server 11; the Primary server 10 then continues to asynchronously wait for the feedback signals sent by all the Secondary servers 11 after their writes complete and to check the feedback signals.
In conclusion cache-Write algorithms and Direct-IO mechanism based on the embodiment of the present invention, big data platform The write operation of distributed file system HDFS and client be specifically simplified as:
The first step:Client asked to master the token position of the Primary server of current chunk server with And the position of other all Secondary server;
Second step:Master to client feed back response, specifically include Primary server token position and its The location information of his all Secondary server.
3rd step:The location information of data and correlation Secondary server is pushed to Primary by client server。
4th step:Primary directly invokes Cache-Write algorithms and carries out write operation;
5th step:Primary is in the location information of processing completion local data cache and Secondary server Write operation is directly fed back after storage and completes signal to client.
6th step:Primary server and Secondary server backstages asynchronous process data write magnetic disk(Write The memory of itself)With data consistency relevant operation.
7th step:Asynchronous process various metadata informations in Primary server and master backstages write behaviour until all It completes, terminates flow.
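The seven simplified steps above can be condensed into an ordered event trace; the property it illustrates is ordering, namely that the client's acknowledgement (fifth step) precedes the background disk writes (sixth and seventh steps). The event strings are illustrative labels, not protocol messages defined by the patent.

```python
# Condensed event-trace sketch of the simplified write flow (first to seventh
# steps above); events after "write complete" model background asynchronous work.
def simplified_write_flow():
    return [
        "client -> master: request primary/secondary locations",   # first step
        "master -> client: primary lease + secondary locations",   # second step
        "client -> primary: data + secondary locations",           # third step
        "primary: Cache-Write",                                    # fourth step
        "primary -> client: write complete",                       # fifth step
        "background: replicas flush caches to disk (Direct-IO)",   # sixth step
        "background: primary/master metadata sync",                # seventh step
    ]
```

Compared with the eight-step prior-art flow, the client-facing portion ends at the fifth step; everything after it is off the critical path.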
Comparing the new distributed write operation based on the Cache-Write algorithm and the Direct-IO mechanism with the previous HDFS distributed file system write operation, it can be concluded that:
(1) the client writes data only to the Primary server, which removes the transmissions to the multiple Secondary servers; (2) once the Primary server's cache is written, the client may regard the write as complete and proceed with other client operations, which removes the intermediate, complicated write-and-confirm process among the Secondary servers; (3) for the client's read process, since the Primary server and the master are fully aware of the storage location and state of the data, the master can provide the most suitable storage location and chunk handle, so the asynchronous write does not affect the progress of reads; (4) the new distributed file system write operation has higher I/O performance.
Theoretically, compared with the previous HDFS/GFS method, the big data platform I/O processing method based on the Cache-Write algorithm and the Direct-IO mechanism of the embodiments of the present invention reduces the write process by 3*N write operations; compared with the previous read method, read operations can be reduced by N*1 times. Therefore, the overall I/O performance of the system is effectively improved.
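One way to read the "3*N write operations" figure, assuming N denotes the number of Secondary servers (the text does not define N explicitly): in the old flow the client pushes data to the Primary plus N Secondaries, the write request is forwarded to N Secondaries, and N acks flow back, while in the new flow the client-visible path touches only the Primary server. This is a back-of-the-envelope reading under stated assumptions, not a derivation from the patent:

```python
# Hedged per-write operation count, assuming N = number of Secondary servers.
def old_flow_ops(n):
    push = 1 + n        # client pushes data to the primary and every secondary
    forward = n         # primary forwards the write request to each secondary
    ack = n             # each secondary acks the primary synchronously
    return push + forward + ack          # = 1 + 3*n

def new_flow_client_visible_ops(n):
    return 1            # client pushes to the primary only; the rest is async

saved = old_flow_ops(4) - new_flow_client_visible_ops(4)   # 3*N = 12 for N = 4
```

Under this accounting, the saving on the client-visible write path is exactly 3*N operations, matching the figure claimed above.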
To verify the specific improvement of the embodiments of the present invention, the HDFS and MapReduce parallel environment of the Hadoop platform and databases such as Hive and HBase can be used to test the I/O behavior patterns of a WAP log system and a Customer Relationship Management (CRM) message data system. The tests show that, for unstructured, frequently read-and-written workloads, this method improves performance by about 113% compared with the previous system; for the large-capacity CRM message data system, which belongs to the continuous-write, continuous-read case, performance improves by about 81%. It can be seen that the performance improvement is larger for services with frequent read/write operations; clearly, the embodiments of the present invention can effectively improve big data platform I/O performance.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention.

Claims (8)

  1. A method for implementing big data platform input/output (I/O) processing, characterized in that the method comprises:
    after a primary server (Primary server) writes the data pushed by a client into its own cache, it sends a signal to the master storage server (master) indicating that the Primary server cache has been written;
    the Primary server feeds back a write-operation-complete signal to the client;
    after the Primary server sends the data pushed by the client to each secondary server (Secondary server), each Secondary server asynchronously writes the data into its own cache;
    after the Primary server asynchronously waits for and receives the cache-write-complete signals fed back by all Secondary servers, it sends a signal to the master indicating that the Secondary server caches have been written;
    after sending the master the signal indicating that the Secondary server caches have been written, the Primary server directly writes the data block to be stored into its own storage; afterwards, it sends the master a signal indicating that the Primary server has been written;
    the Primary server asynchronously waits for the feedback signals sent by all Secondary servers after their write operations are completed and detects the feedback signals;
    when a feedback signal is a write-operation-complete signal, the Primary server releases its own cache space; afterwards, it sends the master a signal indicating that the Secondary servers have been written.
  2. The method according to claim 1, characterized in that each Secondary server asynchronously writing the data into its own cache comprises:
    after a Secondary server asynchronously waits for and receives the write-operation command and the specific data from the Primary server, it writes the specific data into its own cache; afterwards, it feeds back a cache-write-complete signal to the Primary server.
  3. The method according to claim 1, characterized in that the write operation of a Secondary server comprises:
    after sending the cache-write-complete signal, the Secondary server asynchronously waits for the detection signal sent by the Primary server and directly writes the data to be stored into its own storage; after receiving the detection signal and determining by judgment that the current write operation has been completed, it feeds back a write-operation-complete signal to the Primary server and releases its own cache space.
  4. The method according to claim 1, characterized in that the method further comprises: when a feedback signal is a write-operation-failure signal, the Primary server determines the Secondary server that fed back the write-operation-failure signal and sends the data block in its own cache to that Secondary server; afterwards, the Primary server continues to asynchronously wait for the feedback signals sent by all Secondary servers after their write operations are completed and to detect the feedback signals.
  5. A system for implementing big data platform I/O processing, characterized in that the system comprises: a Primary server, Secondary servers, a client, and a master; wherein,
    the Primary server is configured to, after writing the data pushed by the client into its own cache, send the master a signal indicating that the Primary server cache has been written; feed back a write-operation-complete signal to the client; send the data pushed by the client to each Secondary server; and, after asynchronously waiting for and receiving the cache-write-complete signals fed back by all Secondary servers, send the master a signal indicating that the Secondary server caches have been written;
    the Secondary servers are configured to asynchronously receive the data sent by the Primary server and write it into their own caches;
    the client is configured to push data to the Primary server and to receive the write-operation-complete signal sent by the Primary server;
    the master is configured to receive the signal, sent by the Primary server, indicating that the Primary server cache has been written, and the signal, sent by the Primary server, indicating that the Secondary server caches have been written;
    wherein the Primary server is further configured to, after sending the master the signal indicating that the Secondary server caches have been written, directly write the data block to be stored into its own storage; send the master a signal indicating that the Primary server has been written; asynchronously wait for the feedback signals sent by all Secondary servers after their write operations are completed and detect the feedback signals; when detection determines that a feedback signal is a write-operation-complete signal, release its own cache space; and send the master a signal indicating that the Secondary servers have been written;
    the master is further configured to receive the signal, sent by the Primary server, indicating that the Primary server has been written, and the signal, sent by the Primary server, indicating that the Secondary servers have been written.
  6. The system according to claim 5, characterized in that the Secondary servers being configured to asynchronously receive the data sent by the Primary server and write it into their own caches comprises: a Secondary server asynchronously waits for and receives the write-operation command and the specific data from the Primary server, writes the specific data into its own cache, and feeds back a cache-write-complete signal to the Primary server.
  7. The system according to claim 5, characterized in that, during the write operation, a Secondary server is further configured to, after sending the cache-write-complete signal, asynchronously wait for the detection signal sent by the Primary server and directly write the data to be stored into its own storage; and, after receiving the detection signal and determining by judgment that the current write operation has been completed, feed back a write-operation-complete signal to the Primary server and release its own cache space.
  8. The system according to claim 5, characterized in that the Primary server is further configured to, when a feedback signal from a Secondary server is a write-operation-failure signal, determine the Secondary server that fed back the write-operation-failure signal and send the data block in its own cache to that Secondary server; the Primary server then continues to asynchronously wait for the feedback signals sent by all Secondary servers after their write operations are completed and to detect the feedback signals.
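The failure handling described in claims 4 and 8 — resend the cached block to any Secondary server that feeds back a write-operation-failure signal, then resume waiting for feedback — can be sketched as below. This is an illustrative model, not the claimed implementation: the names and the bounded retry count are assumptions added for the sketch.

```python
class FlakySecondary:
    """Test double: fails the first `fail_times` write attempts."""
    def __init__(self, fail_times=0):
        self.fail_times = fail_times
        self.store = {}
    def write_block(self, block_id, data):
        if self.fail_times > 0:
            self.fail_times -= 1
            return False           # write-operation-failure signal
        self.store[block_id] = data
        return True                # write-operation-complete signal

def replicate_with_retry(primary_cache, secondaries, block_id, max_rounds=3):
    # Detect each secondary's feedback; on a failure signal, resend the
    # block from the primary's cache to that secondary only, then keep
    # waiting for feedback, as claims 4 and 8 describe. The bounded
    # max_rounds is an assumption; the claims state no retry limit.
    pending = list(secondaries)
    for _ in range(max_rounds):
        pending = [s for s in pending
                   if not s.write_block(block_id, primary_cache[block_id])]
        if not pending:
            return True            # every secondary reported write-complete
    return False                   # a secondary is still failing after retries
```

Because the primary keeps the block in its own cache until every feedback signal is a write-operation-complete signal, it can always resend from cache without asking the client again.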
CN201410050179.1A 2014-02-13 2014-02-13 A kind of method and system for realizing big data platform input/output processing Active CN104850548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410050179.1A CN104850548B (en) 2014-02-13 2014-02-13 A kind of method and system for realizing big data platform input/output processing

Publications (2)

Publication Number Publication Date
CN104850548A CN104850548A (en) 2015-08-19
CN104850548B true CN104850548B (en) 2018-05-22

Family

ID=53850196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410050179.1A Active CN104850548B (en) 2014-02-13 2014-02-13 A kind of method and system for realizing big data platform input/output processing

Country Status (1)

Country Link
CN (1) CN104850548B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106980645B (en) * 2017-02-24 2020-09-15 北京同有飞骥科技股份有限公司 Distributed file system architecture implementation method and device
CN109222853A (en) * 2018-11-19 2019-01-18 苏州新光维医疗科技有限公司 Endoscope and endoscope working method
CN112866339B (en) * 2020-12-30 2022-12-06 金蝶软件(中国)有限公司 Data transmission method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7149923B1 (en) * 2003-01-17 2006-12-12 Unisys Corporation Software control using the controller as a component to achieve resiliency in a computer system utilizing separate servers for redundancy
CN101854388A (en) * 2010-05-17 2010-10-06 浪潮(北京)电子信息产业有限公司 Method and system concurrently accessing a large amount of small documents in cluster storage
CN102129434A (en) * 2010-01-13 2011-07-20 腾讯科技(北京)有限公司 Method and system for reading and writing separation database
CN102882983A (en) * 2012-10-22 2013-01-16 南京云创存储科技有限公司 Rapid data memory method for improving concurrent visiting performance in cloud memory system
CN103078936A (en) * 2012-12-31 2013-05-01 网宿科技股份有限公司 Metadata hierarchical storage method and system for Global file system (GFS)-based distributed file system
CN103530387A (en) * 2013-10-22 2014-01-22 浪潮电子信息产业股份有限公司 Improved method aimed at small files of HDFS

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant