CN104850548B - Method and system for implementing big data platform input/output processing - Google Patents
- Publication number
- CN104850548B CN104850548B CN201410050179.1A CN201410050179A CN104850548B CN 104850548 B CN104850548 B CN 104850548B CN 201410050179 A CN201410050179 A CN 201410050179A CN 104850548 B CN104850548 B CN 104850548B
- Authority
- CN
- China
- Prior art keywords
- server
- signal
- write
- primary server
- primary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
An embodiment of the invention discloses a method for implementing big data platform I/O processing: after the primary server writes the data pushed by a client into its own cache, it sends the master a signal indicating that the primary server cache has been written; the primary server feeds back a write-operation-complete signal to the client; after the primary server sends the client-pushed data to each secondary server, each secondary server asynchronously writes the data into its own cache; and after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, it sends the master a signal indicating that the secondary server caches have been written. The embodiment of the present invention further discloses a system for implementing big data platform I/O processing.
Description
Technical field
The present invention relates to cloud storage technology, and in particular to a method and system for implementing big data platform input/output (I/O) processing.
Background
With the rapid development of Internet business, vast amounts of data in various forms appear within short periods of time; it has been predicted that the global data volume will reach 35 ZB by 2020. The core challenges of the big data era are large volume, great variety and high velocity, and various big data platforms have therefore emerged.
At present, people mainly use the Hadoop Distributed File System (HDFS), which follows the design of the Google File System (GFS), together with the MapReduce parallel environment and data warehouses such as HBase or Hive, to process and apply various big data businesses.
In practical systems, distributed file systems such as GFS, HDFS and the Parallel Network File System (pNFS) share a similar basic structure; taking GFS as an example, it is shown in Figure 1. Each GFS application includes one primary storage server (master) and multiple block storage servers (chunk servers); multiple clients can interact with the master and each chunk server through the GFS client, thereby accessing the cloud storage data in GFS.
In a concrete implementation, GFS stores the metadata on the master, while the specific data to be stored is kept on the chunk servers. The metadata service mainly lets the client obtain from the master information such as chunk server locations and block data positions, while read/write operations on the specific data are completed directly with each chunk server. In the GFS and HDFS distributed file systems, the main read-process operations on data are:
(1) Using the fixed chunk size, the client converts the file name and the byte offset specified by the program into a chunk index within the file;
(2) The client sends the master a request containing the file name and the chunk index;
(3) The master returns a response to the client, including the chunk handle and the chunk server locations; here, the locations cover multiple chunk servers;
(4) The client caches the chunk handle and the chunk server location information, keyed by file name and chunk index;
(5) The client sends a request to one of the chunk servers, generally choosing the nearest one; the request specifies the chunk handle and a byte range, the chunk servers identifying each chunk by its chunk handle;
(6) The chunk server sends the specified data to the client.
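The read process in steps (1) to (6) can be sketched in miniature. This is an illustrative model only, not the patent's implementation: the class names (Master, ChunkServer, Client) and the small 64-byte chunk size are assumptions chosen for demonstration.

```python
# Hypothetical sketch of the GFS-style read path described above.

CHUNK_SIZE = 64  # assumed fixed chunk size in bytes (GFS itself uses 64 MB)

class Master:
    """Holds metadata only: (file_name, chunk_index) -> (handle, locations)."""
    def __init__(self):
        self.table = {}

    def lookup(self, file_name, chunk_index):
        # Step (3): return the chunk handle and chunk server locations.
        return self.table[(file_name, chunk_index)]

class ChunkServer:
    def __init__(self):
        self.chunks = {}  # chunk handle -> bytes

    def read(self, handle, start, length):
        # Step (6): chunk servers identify chunks by their handles.
        return self.chunks[handle][start:start + length]

class Client:
    def __init__(self, master):
        self.master = master
        self.cache = {}  # step (4): cached metadata lookups

    def read(self, file_name, offset, length):
        # Step (1): convert the byte offset into a chunk index.
        chunk_index = offset // CHUNK_SIZE
        key = (file_name, chunk_index)
        if key not in self.cache:
            # Step (2): ask the master, then cache the answer.
            self.cache[key] = self.master.lookup(file_name, chunk_index)
        handle, locations = self.cache[key]
        # Step (5): pick one replica (here simply the first, the "nearest").
        server = locations[0]
        return server.read(handle, offset % CHUNK_SIZE, length)

# Minimal demonstration.
master = Master()
cs = ChunkServer()
cs.chunks["h1"] = b"x" * CHUNK_SIZE
cs.chunks["h2"] = b"hello world" + b"\x00" * (CHUNK_SIZE - 11)
master.table[("f", 0)] = ("h1", [cs])
master.table[("f", 1)] = ("h2", [cs])

client = Client(master)
print(client.read("f", CHUNK_SIZE, 5))  # reads from chunk index 1 -> b'hello'
```

A second read of the same chunk hits the client-side metadata cache and never touches the master, which is the point of step (4).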
Here, the main write-operation process includes:
(1) The client asks the master for the token position of the primary server of the current chunk servers and the positions of the other dependent servers (secondary servers);
(2) The master feeds back a response that includes the token position of the primary server of the current chunk servers and the location messages of the other secondary servers; here, the client can cache these data so as to avoid frequent access to the master;
(3) The client pushes the data to the chunk servers; here, the chunk servers include the primary server and the secondary servers;
(4) After all chunk servers have received the corresponding data, the client issues a write request to the primary server;
(5) The primary server changes its own local state in serial number order;
(6) The primary server publishes the write request to all secondary servers; each secondary server applies the changes in the same serial number order;
(7) The secondary servers feed back responses to the primary server, indicating that the write operation has been completed;
(8) The primary server returns a completion response to the client.
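The synchronous write path of steps (4) to (8) can likewise be sketched; note that the client's write only completes after every replica has applied the mutation, which is exactly the bottleneck the following paragraphs criticize. The class names and the "done" acknowledgement string are assumptions for illustration.

```python
# Illustrative sketch of the prior-art synchronous write path (steps 4-8).

class Secondary:
    def __init__(self):
        self.log = []  # mutations applied, in serial number order

    def apply(self, serial, data):
        # Step (6): apply the change in the same serial number order.
        self.log.append((serial, data))
        return "done"

class Primary:
    def __init__(self, secondaries):
        self.secondaries = secondaries
        self.log = []
        self.serial = 0

    def write(self, data):
        # Step (5): change local state in serial number order.
        self.serial += 1
        self.log.append((self.serial, data))
        # Steps (6)-(7): forward to all secondaries and wait for each ack;
        # the whole write blocks on the slowest replica.
        acks = [s.apply(self.serial, data) for s in self.secondaries]
        # Step (8): only now report completion to the client.
        return all(a == "done" for a in acks)

secondaries = [Secondary(), Secondary(), Secondary()]
primary = Primary(secondaries)
assert primary.write(b"record-1")
assert primary.write(b"record-2")
# Every replica holds the same mutations in the same order.
assert all(s.log == primary.log for s in secondaries)
```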
In the prior art, the HDFS + MapReduce + Hive + HBase big data platform based on the Hadoop framework has high scalability, high reliability and high fault tolerance. However, in actual big data business processing, in applications such as Wireless Application Protocol (WAP) web logs (blogs), large user mail systems, blog log analysis, and user-information tracking and analysis, the current big data platforms have defects in their data I/O processing methods; in particular, for unstructured, semi-structured, large-data-volume businesses, there is a rather serious problem with I/O processing speed.
The main problems include:
(1) With large data volumes, especially under continuous writing, I/O performance is slow, and the I/O speed-up ratio is nonlinear in the number of server nodes;
(2) When processing unstructured and semi-structured data such as logs, blogs, video and social-relationship information, storage is not optimized according to the data type and its characteristics, so the processing speed is somewhat slow;
(3) The technique of writing to multiple chunk servers synchronously makes the synchronization time longer when the network or the storage devices are in a disordered state, so that data-consistency processing costs more time;
(4) The read/write speed of the chunk servers' storage devices and the condition of the physical devices are not taken into account, and the read/write performance of some secondary servers seriously affects the I/O performance of the whole system.
Therefore, in big data platforms, I/O processing performance and the consistency-management strategy are currently the core problems and the key factors affecting the overall I/O system performance of the big data platform.
Summary of the invention
In view of this, the embodiments of the present invention are intended to provide a method and system for implementing big data platform I/O processing that can effectively improve big data platform I/O performance while solving the data-consistency problem.
To achieve the above objectives, the technical solution of the present invention is realized as follows:
An embodiment of the present invention provides a method for implementing big data platform input/output (I/O) processing, the method including:
after the base server (primary server) writes the data pushed by a client into its own cache, sending the primary storage server (master) a signal indicating that the primary server cache has been written;
the primary server feeding back a write-operation-complete signal to the client;
after the primary server sends the client-pushed data to each dependent server (secondary server), each secondary server asynchronously writing the data into its own cache;
after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, sending the master a signal indicating that the secondary server caches have been written.
In the above solution, each secondary server asynchronously writing the data into its own cache includes:
after the secondary server asynchronously waits for and receives the write-operation command and the specific data from the primary server, writing the specific data into its own cache; afterwards, feeding back a write-cache-complete signal to the primary server.
In the above solution, the method further includes:
after sending the master the signal indicating that the secondary server caches have been written, the primary server directly writes the data block to be stored into its own memory; afterwards, the primary server sends the master a signal indicating that the primary server has been written;
the primary server asynchronously waits for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals;
when a feedback signal is a write-operation-complete signal, the primary server releases its own cache space; afterwards, it sends the master a signal indicating that the secondary servers have been written.
In the above solution, the write operation of the secondary server includes:
after sending the write-cache-complete signal, the secondary server asynchronously waits for the detection signal sent by the primary server and directly writes the data to be stored into its own memory; after receiving the detection signal and judging that the current write operation has been completed, it feeds back a write-operation-complete signal to the primary server and releases its own cache space.
In the above solution, the method further includes: when a feedback signal is a write-operation-failure signal, the primary server determines the secondary server that fed back the write-operation-failure signal and transfers the data block in its own cache to that secondary server; afterwards, the primary server continues to asynchronously wait for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals.
An embodiment of the present invention further provides a system for implementing big data platform I/O processing, the system including: a primary server, secondary servers, a client and a master; wherein
the primary server is configured, after writing the data pushed by the client into its own cache, to send the master a signal indicating that the primary server cache has been written; to feed back a write-operation-complete signal to the client; to send the client-pushed data to each secondary server; and, after asynchronously waiting for and receiving the write-cache-complete signals fed back by all secondary servers, to send the master a signal indicating that the secondary server caches have been written;
the secondary servers are configured to asynchronously receive the data sent by the primary server and write it into their own caches;
the client is configured to push data to the primary server and to receive the write-operation-complete signal sent by the primary server;
the master is configured to receive the signal, sent by the primary server, indicating that the primary server cache has been written, and to receive the signal, sent by the primary server, indicating that the secondary server caches have been written.
In the above solution, the secondary servers being configured to asynchronously receive the data sent by the primary server and write it into their own caches includes: after a secondary server asynchronously waits for and receives the write-operation command and the specific data from the primary server, it writes the specific data into its own cache and feeds back a write-cache-complete signal to the primary server.
In the above solution, the primary server is further configured, after sending the master the signal indicating that the secondary server caches have been written, to directly write the data block to be stored into its own memory; to send the master a signal indicating that the primary server has been written; to asynchronously wait for the feedback signals sent by all secondary servers after their write operations are completed and to check the feedback signals; when checking determines that a feedback signal is a write-operation-complete signal, to release its own cache space; and to send the master a signal indicating that the secondary servers have been written;
the master is further configured to receive the signal, sent by the primary server, indicating that the primary server has been written, and to receive the signal, sent by the primary server, indicating that the secondary servers have been written.
In the above solution, during the write operation the secondary servers are further configured, after sending the write-cache-complete signal, to asynchronously wait for the detection signal sent by the primary server and to directly write the data to be stored into their own memories; after receiving the detection signal and judging that the current write operation has been completed, to feed back a write-operation-complete signal to the primary server and release their own cache space.
In the above solution, the primary server is further configured, when a feedback signal from a secondary server is a write-operation-failure signal, to determine the secondary server that fed back the write-operation-failure signal and to transfer the data block in its own cache to that secondary server; the primary server then continues to asynchronously wait for the feedback signals sent by all secondary servers after their write operations are completed and to check the feedback signals.
With the method and system for implementing big data platform I/O processing provided by the embodiments of the present invention, the primary server, after writing the data pushed by the client into its own cache, sends the master a signal indicating that the primary server cache has been written; the primary server feeds back a write-operation-complete signal to the client; after the primary server sends the client-pushed data to each secondary server, each secondary server asynchronously writes the data into its own cache; and after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, it sends the master a signal indicating that the secondary server caches have been written. In this way, big data platform I/O performance can be effectively improved. Further, after sending the master the signal indicating that the secondary server caches have been written, the primary server directly writes the data block to be stored into its own memory; afterwards, it sends the master a signal indicating that the primary server has been written; the primary server asynchronously waits for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals; when a feedback signal is a write-operation-complete signal, it releases its own cache space; then it sends the master a signal indicating that the secondary servers have been written. The data-consistency problem is thereby effectively solved, and the overall I/O performance of the big data platform is improved.
Description of the drawings
Fig. 1 is a schematic diagram of the basic structure of GFS;
Fig. 2 is a schematic flow diagram of a method for implementing big data platform I/O processing according to an embodiment of the present invention;
Fig. 3 is a detailed flow diagram of the method for implementing big data platform I/O processing according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the composition of a system for implementing big data platform I/O processing according to an embodiment of the present invention.
Detailed description
At present, the key problem affecting the I/O access performance of the GFS and HDFS distributed file systems lies in the data write operation: in order to keep the data of the primary server and the secondary servers consistent, on the one hand all secondary servers write directly to memory, and on the other hand the primary server handles the follow-up work only after all secondary servers have completed all of their memory I/O and their responses have been received. Thus a single write operation requires every secondary server to complete its own I/O processing, which makes the I/O processing time of the whole system long; in particular, when some secondary servers are busy or the network transmission of some secondary servers is seriously affected, the I/O write-operation time becomes unpredictable, which in turn degrades the I/O performance of the whole system.
In the embodiments of the present invention, after the primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the primary server cache has been written; the primary server feeds back a write-operation-complete signal to the client; after the primary server sends the client-pushed data to each secondary server, each secondary server asynchronously writes the data into its own cache; and after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, it sends the master a signal indicating that the secondary server caches have been written.
Here, in a big data platform, implementing write buffering on the basis of a cache in a distributed system environment, without writing the data directly to memory (an algorithm also referred to as Cache-Write), can effectively improve the overall I/O performance of the system. However, the biggest problem of this algorithm is data consistency.
Therefore, on the basis of the Cache-Write algorithm, a method and mechanism providing data consistency is further added; this mechanism is also referred to as direct read/write (Direct-IO). Specifically:
after sending the master the signal indicating that the secondary server caches have been written, the primary server directly writes the data block to be stored into its own memory; afterwards, it sends the master a signal indicating that the primary server has been written; the primary server asynchronously waits for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals; when a feedback signal is a write-operation-complete signal, it releases its own cache space; afterwards, it sends the master a signal indicating that the secondary servers have been written.
In this way, the big data platform I/O processing method based on the Cache-Write algorithm and the Direct-IO mechanism not only has higher I/O performance than the HDFS distributed file system, but also provides strict data consistency.
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 2 is a schematic flow diagram of the method for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 2, the method includes:
Step S100: after the primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the primary server cache has been written;
Step S101: the primary server feeds back a write-operation-complete signal to the client;
Step S102: after the primary server sends the client-pushed data to each secondary server, each secondary server asynchronously writes the data into its own cache;
Step S103: after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, it sends the master a signal indicating that the secondary server caches have been written.
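Steps S100 to S103 can be sketched as follows. Threads stand in for the "asynchronous wait", and every class, queue and message name is an assumption made for illustration rather than the patent's implementation; the point shown is that the client is acknowledged as soon as the primary's cache holds the data.

```python
# Minimal sketch of the Cache-Write phase (steps S100-S103).
import threading
import queue

class Node:
    def __init__(self, name):
        self.name = name
        self.cache = []

class PrimaryServer(Node):
    def __init__(self, master, client, secondaries):
        super().__init__("primary")
        self.master, self.client, self.secondaries = master, client, secondaries

    def handle_push(self, data):
        # S100: write the pushed data into our own cache, tell the master.
        self.cache.append(data)
        self.master.put(("primary-cache-written", self.name))
        # S101: feed back write-complete to the client immediately.
        self.client.put("write-complete")
        # S102: send the data to each secondary; they cache it asynchronously.
        acks = queue.Queue()
        for s in self.secondaries:
            threading.Thread(target=s.cache_write, args=(data, acks)).start()
        # S103: wait for every secondary's cache-written ack, tell the master.
        for _ in self.secondaries:
            acks.get()
        self.master.put(("secondary-caches-written", self.name))

class SecondaryServer(Node):
    def cache_write(self, data, acks):
        self.cache.append(data)             # write into own cache
        acks.put(("cache-written", self.name))

master_inbox, client_inbox = queue.Queue(), queue.Queue()
secs = [SecondaryServer(f"sec{i}") for i in range(2)]
primary = PrimaryServer(master_inbox, client_inbox, secs)
primary.handle_push(b"blob")
print(client_inbox.get())       # 'write-complete'
print(master_inbox.get()[0])    # 'primary-cache-written'
print(master_inbox.get()[0])    # 'secondary-caches-written'
```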
Fig. 3 is a detailed flow diagram of the method for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 3, the method specifically includes:
Step S200: after the primary server writes the data pushed by the client into its own cache, it sends the master a signal indicating that the primary server cache has been written;
Here, while the primary server writes the data pushed by the client into its own cache, if the data to be written is too large and its own cache does not have enough space, it asynchronously performs the I/O storage operation.
Step S201: the primary server feeds back a write-operation-complete signal to the client;
Step S202: after the primary server sends the client-pushed data to each secondary server, each secondary server asynchronously writes the data into its own cache;
Here, each secondary server asynchronously writing the data into its own cache specifically includes:
after the secondary server asynchronously waits for and receives the write-operation command and the specific data from the primary server, it writes the specific data into its own cache; afterwards, it feeds back a write-cache-complete signal to the primary server.
Here, while the secondary server writes the specific data into its own cache, if the data to be written is too large and its own cache does not have enough space, it asynchronously performs the I/O storage operation.
Step S203: after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, it sends the master a signal indicating that the secondary server caches have been written.
Step S204: after sending the master the signal indicating that the secondary server caches have been written, the primary server directly writes the data block to be stored into its own memory; afterwards, it sends the master a signal indicating that the primary server has been written;
Step S205: the primary server asynchronously waits for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals;
Step S206: when a feedback signal is a write-operation-complete signal, the primary server releases its own cache space; afterwards, it sends the master a signal indicating that the secondary servers have been written.
Here, when a feedback signal is a write-operation-failure signal, the primary server determines the secondary server that fed back the write-operation-failure signal and transfers the data block in its own cache to that secondary server; afterwards, the primary server continues to asynchronously wait for the feedback signals sent by all secondary servers after their write operations are completed and checks the feedback signals.
Here, the write operation of the secondary server includes:
after sending the write-cache-complete signal, the secondary server asynchronously waits for the detection signal sent by the primary server and directly writes the data to be stored into its own memory; after receiving the detection signal and judging that the current write operation has been completed, it feeds back a write-operation-complete signal to the primary server and releases its own cache space.
Here, when it judges that the current write operation has not been completed, the secondary server continues to wait until the write operation is completed, then feeds back a write-operation-complete signal to the primary server and releases its own cache space.
Fig. 4 is a schematic diagram of the composition of the system for implementing big data platform I/O processing according to an embodiment of the present invention. As shown in Fig. 4, the system includes: a primary server 10, secondary servers 11, a client 12 and a master 13; wherein
the primary server 10 is configured, after writing the data pushed by the client 12 into its own cache, to send the master 13 a signal indicating that the primary server 10 cache has been written; to feed back a write-operation-complete signal to the client 12; to send the data pushed by the client 12 to each secondary server 11; and, after asynchronously waiting for and receiving the write-cache-complete signals fed back by all secondary servers 11, to send the master 13 a signal indicating that the secondary server caches have been written;
the secondary servers 11 are configured to asynchronously receive the data sent by the primary server 10 and write it into their own caches;
specifically, after a secondary server 11 asynchronously waits for and receives the write-operation command and the specific data from the primary server 10, it writes the specific data into its own cache and feeds back a write-cache-complete signal to the primary server 10;
the client 12 is configured to push data to the primary server 10 and to receive the write-operation-complete signal sent by the primary server 10;
the master 13 is configured to receive the signal, sent by the primary server 10, indicating that the primary server 10 cache has been written, and to receive the signal, sent by the primary server 10, indicating that the secondary server 11 caches have been written.
Further, after the Direct-IO mechanism for improving data consistency is added to the Cache-Write algorithm:
the primary server 10 is further configured, after sending the master 13 the signal indicating that the secondary server 11 caches have been written, to directly write the data block to be stored into its own memory; to send the master 13 a signal indicating that the primary server 10 has been written; to asynchronously wait for the feedback signals sent by all secondary servers 11 after their write operations are completed and to check the feedback signals; when checking determines that a feedback signal is a write-operation-complete signal, to release its own cache space; and to send the master 13 a signal indicating that the secondary servers 11 have been written;
the master 13 is further configured to receive the signal, sent by the primary server 10, indicating that the primary server 10 has been written, and to receive the signal, sent by the primary server 10, indicating that the secondary servers 11 have been written.
Here, during the write operation, the secondary servers 11 are specifically further configured, after sending the write-cache-complete signal, to asynchronously wait for the detection signal sent by the primary server 10 and to directly write the data to be stored into their own memories; after receiving the detection signal and judging that the current write operation has been completed, to feed back a write-operation-complete signal to the primary server 10 and release their own cache space.
Here, when a feedback signal from a secondary server 11 is a write-operation-failure signal, the primary server 10 is further configured to determine the secondary server 11 that fed back the write-operation-failure signal and to transfer the data block in its own cache to that secondary server 11; the primary server 10 then continues to asynchronously wait for the feedback signals sent by all secondary servers 11 after their write operations are completed and to check the feedback signals.
In conclusion cache-Write algorithms and Direct-IO mechanism based on the embodiment of the present invention, big data platform
The write operation of distributed file system HDFS and client be specifically simplified as:
The first step:Client asked to master the token position of the Primary server of current chunk server with
And the position of other all Secondary server;
Second step:Master to client feed back response, specifically include Primary server token position and its
The location information of his all Secondary server.
3rd step:The location information of data and correlation Secondary server is pushed to Primary by client
server。
4th step:Primary directly invokes Cache-Write algorithms and carries out write operation;
5th step:Primary is in the location information of processing completion local data cache and Secondary server
Write operation is directly fed back after storage and completes signal to client.
6th step:Primary server and Secondary server backstages asynchronous process data write magnetic disk(Write
The memory of itself)With data consistency relevant operation.
7th step:Asynchronous process various metadata informations in Primary server and master backstages write behaviour until all
It completes, terminates flow.
Comparing the new distributed write operation based on the Cache-Write algorithm and the Direct-IO mechanism with the previous HDFS distributed file system write operation, it can be concluded that:
(1) The client writes data only to the primary server, reducing the transmissions to multiple secondary servers;
(2) After writing into the cache of the primary server, the client can consider the write operation complete and proceed with other client operations, eliminating the complicated intermediate write-and-confirm process between the secondary servers;
(3) As for the client read process, since the primary server and the master have a very clear view of the storage location and condition of the data, the master can provide the most suitable storage location and chunk handle, so the asynchronous write operations do not affect the progress of read operations;
(4) The new distributed file system write operation has higher I/O performance.
Theoretically, compared with the previous HDFS/GFS methods, the big data platform I/O processing method of the embodiments of the present invention, based on the Cache-Write algorithm and the Direct-IO mechanism, reduces the write-operation process by about 3*N write operations; compared with the previous read method, read operations can also be reduced by a factor on the order of N. Therefore, the overall system I/O performance is effectively improved.
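The claimed saving can be checked with back-of-the-envelope arithmetic. The operation counts below are assumptions, not figures from the patent: the old synchronous path is modeled as one data push, one forwarded write request and one acknowledgement per secondary, while the new path touches only the primary on the client-visible critical path.

```python
# Rough model of synchronous operations per write, old path vs new path.

def old_sync_ops(n_secondaries):
    # per secondary: push data + forward write request + wait for ack,
    # plus the push to the primary and the primary's ack to the client.
    return 3 * n_secondaries + 2

def new_sync_ops(n_secondaries):
    # only the push to the primary and its immediate ack are synchronous.
    return 2

for n in (1, 3, 5):
    saved = old_sync_ops(n) - new_sync_ops(n)
    print(n, saved)   # the saving grows as 3*N, matching the text's estimate
```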
To verify the specific improvement effect of the embodiments of the present invention, the HDFS and MapReduce parallel environment of the Hadoop platform and databases such as Hive and HBase can be used to test the I/O behavior patterns of a WAP log system and a customer relationship management (CRM) message data system. The tests show that, in unstructured, read/write-intensive scenarios, this method improves system performance by about 113% compared with the previous system; and in a large-capacity CRM message data system with continuous writing and continuous reading, performance improves by about 81%. It can be seen that the performance improvement is larger for businesses with frequent read/write operations; clearly, the embodiments of the present invention can effectively improve big data platform I/O performance.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the protection scope of the present invention.
Claims (8)
- 1. A method for implementing big data platform input/output (I/O) processing, characterized in that the method comprises:
after the base server (primary server) writes the data pushed by a client into its own cache, sending the primary storage server (master) a signal indicating that the primary server cache has been written;
the primary server feeding back a write-operation-complete signal to the client;
after the primary server sends the client-pushed data to each dependent server (secondary server), each secondary server asynchronously writing the data into its own cache;
after the primary server asynchronously waits for and receives the write-cache-complete signals fed back by all secondary servers, sending the master a signal indicating that the secondary server caches have been written;
after sending the master the signal indicating that the secondary server caches have been written, the primary server directly writing the data block to be stored into its own memory and afterwards sending the master a signal indicating that the primary server has been written;
the primary server asynchronously waiting for the feedback signals sent by all secondary servers after their write operations are completed and checking the feedback signals;
when a feedback signal is a write-operation-complete signal, releasing its own cache space and afterwards sending the master a signal indicating that the secondary servers have been written.
- 2. The method according to claim 1, characterized in that each Secondary server asynchronously writing the data into its own cache includes: after a Secondary server asynchronously waits for and receives the write-operation order and the specific data from the Primary server, it writes the specific data into its own cache; afterwards, it feeds back a write-cache-complete signal to the Primary server.
- 3. The method according to claim 1, characterized in that the write operation of the Secondary server includes: after sending the write-cache-complete signal, the Secondary server asynchronously waits for the detection signal sent by the Primary server and directly writes the data to be stored into its own memory; after receiving the detection signal and determining by judgment that the current write operation is completed, it feeds back a write-operation-complete signal to the Primary server and releases its own cache space.
- 4. The method according to claim 1, characterized in that the method further includes: when a feedback signal is a write-operation-failure signal, the Primary server determines the Secondary server corresponding to the fed-back write-operation-failure signal, and sends the data block in its own cache to the corresponding Secondary server; afterwards, the Primary server continues to asynchronously wait for the feedback signals sent by all Secondary servers after the write operation is completed and detects the feedback signals.
- 5. A system for realizing big data platform I/O processing, characterized in that the system includes: a Primary server, Secondary servers, a client, and a master; wherein, the Primary server, after writing the data pushed by the client into its own cache, sends to the master a signal indicating that the Primary server cache has been written; feeds back a write-operation-complete signal to the client; sends the data pushed by the client to each Secondary server; and, after asynchronously waiting for and receiving the write-cache-complete signals fed back by all Secondary servers, sends to the master a signal indicating that the Secondary server caches have been written; the Secondary servers are configured to asynchronously receive the data sent by the Primary server and write it into their own caches; the client is configured to push data to the Primary server and to receive the write-operation-complete signal sent by the Primary server; the master is configured to receive the signal sent by the Primary server indicating that the Primary server cache has been written, and to receive the signal sent by the Primary server indicating that the Secondary server caches have been written; wherein the Primary server is further configured to: after sending to the master the signal indicating that the Secondary server caches have been written, directly write the data blocks to be stored into its own memory; send to the master a signal indicating that the Primary server has been written; asynchronously wait for the feedback signals sent by all Secondary servers after the write operation is completed and detect the feedback signals; when detection determines that a feedback signal is a write-operation-complete signal, release its own cache space; and send to the master a signal indicating that the Secondary servers have been written; the master is further configured to receive the signal sent by the Primary server indicating that the Primary server has been written, and to receive the signal sent by the Primary server indicating that the Secondary servers have been written.
- 6. The system according to claim 5, characterized in that the Secondary servers asynchronously receiving the data sent by the Primary server and writing it into their own caches includes: after a Secondary server asynchronously waits for and receives the write-operation order and the specific data from the Primary server, it writes the specific data into its own cache and feeds back a write-cache-complete signal to the Primary server.
- 7. The system according to claim 5, characterized in that, during the write operation, the Secondary server is further configured to: after sending the write-cache-complete signal, asynchronously wait for the detection signal sent by the Primary server and directly write the data to be stored into its own memory; after receiving the detection signal and determining by judgment that the current write operation is completed, feed back a write-operation-complete signal to the Primary server and release its own cache space.
- 8. The system according to claim 5, characterized in that the Primary server is further configured to: when the feedback signal from a Secondary server is a write-operation-failure signal, determine the Secondary server corresponding to the fed-back write-operation-failure signal, and send the data block in its own cache to the corresponding Secondary server; the Primary server then continues to asynchronously wait for the feedback signals sent by all Secondary servers after the write operation is completed and detects the feedback signals.
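The write flow claimed above — cache on the primary, acknowledge the client early, fan out to secondaries, persist, then release caches — can be sketched in miniature. The following is an illustrative Python sketch only, not part of the patent; all names (`PrimaryServer`, `SecondaryServer`, the log lists standing in for signals to the master and client) are hypothetical, and threads plus barriers stand in for the asynchronous waits described in the claims.

```python
import threading

class SecondaryServer:
    """Illustrative secondary: caches pushed data, then persists it."""
    def __init__(self, name):
        self.name = name
        self.cache = None
        self.memory = None

    def write_cache(self, data, on_cached):
        # Asynchronously write the pushed data into the local cache,
        # then feed back a write-cache-complete signal (cf. claim 2).
        def work():
            self.cache = data
            on_cached(self.name)
        threading.Thread(target=work).start()

    def persist(self, on_done):
        # On the primary's detection signal, move the cached data into
        # memory, report completion, and release the cache (cf. claim 3).
        def work():
            self.memory = self.cache
            self.cache = None
            on_done(self.name, "write_ok")
        threading.Thread(target=work).start()

class PrimaryServer:
    def __init__(self, secondaries, master_log, client_log):
        self.secondaries = secondaries
        self.master_log = master_log   # stands in for signals to the master
        self.client_log = client_log   # stands in for signals to the client
        self.cache = None
        self.memory = None

    def handle_push(self, data):
        # Step 1: cache locally, notify the master, ack the client early.
        self.cache = data
        self.master_log.append("primary_cache_written")
        self.client_log.append("write_complete")

        # Step 2: fan the data out; wait for every secondary's cache ack.
        cached = threading.Barrier(len(self.secondaries) + 1)
        for s in self.secondaries:
            s.write_cache(data, lambda name: cached.wait())
        cached.wait()
        self.master_log.append("secondary_caches_written")

        # Step 3: persist locally, then notify the master.
        self.memory = self.cache
        self.master_log.append("primary_written")

        # Step 4: signal secondaries to persist; wait for all completions,
        # then release the primary's own cache space.
        done = threading.Barrier(len(self.secondaries) + 1)
        for s in self.secondaries:
            s.persist(lambda name, status: done.wait())
        done.wait()
        self.cache = None
        self.master_log.append("secondaries_written")

master_log, client_log = [], []
secondaries = [SecondaryServer(f"sec{i}") for i in range(2)]
primary = PrimaryServer(secondaries, master_log, client_log)
primary.handle_push(b"crm-record")
print(master_log)
```

Note how the client's acknowledgement is appended in step 1, before any secondary has persisted — this early ack is the source of the latency improvement the description reports; claim 4's failure path (re-sending the cached block to a failed secondary) is omitted here for brevity.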
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410050179.1A CN104850548B (en) | 2014-02-13 | 2014-02-13 | A kind of method and system for realizing big data platform input/output processing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104850548A CN104850548A (en) | 2015-08-19 |
CN104850548B true CN104850548B (en) | 2018-05-22 |
Family
ID=53850196
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410050179.1A Active CN104850548B (en) | 2014-02-13 | 2014-02-13 | A kind of method and system for realizing big data platform input/output processing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104850548B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106980645B (en) * | 2017-02-24 | 2020-09-15 | 北京同有飞骥科技股份有限公司 | Distributed file system architecture implementation method and device |
CN109222853A (en) * | 2018-11-19 | 2019-01-18 | 苏州新光维医疗科技有限公司 | Endoscope and endoscope working method |
CN112866339B (en) * | 2020-12-30 | 2022-12-06 | 金蝶软件(中国)有限公司 | Data transmission method and device, computer equipment and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7149923B1 (en) * | 2003-01-17 | 2006-12-12 | Unisys Corporation | Software control using the controller as a component to achieve resiliency in a computer system utilizing separate servers for redundancy |
CN101854388A (en) * | 2010-05-17 | 2010-10-06 | 浪潮(北京)电子信息产业有限公司 | Method and system concurrently accessing a large amount of small documents in cluster storage |
CN102129434A (en) * | 2010-01-13 | 2011-07-20 | 腾讯科技(北京)有限公司 | Method and system for reading and writing separation database |
CN102882983A (en) * | 2012-10-22 | 2013-01-16 | 南京云创存储科技有限公司 | Rapid data memory method for improving concurrent visiting performance in cloud memory system |
CN103078936A (en) * | 2012-12-31 | 2013-05-01 | 网宿科技股份有限公司 | Metadata hierarchical storage method and system for Global file system (GFS)-based distributed file system |
CN103530387A (en) * | 2013-10-22 | 2014-01-22 | 浪潮电子信息产业股份有限公司 | Improved method aimed at small files of HDFS |
- 2014-02-13 CN CN201410050179.1A patent/CN104850548B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN104850548A (en) | 2015-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107169083B (en) | Mass vehicle data storage and retrieval method and device for public security card port and electronic equipment | |
JP6044539B2 (en) | Distributed storage system and method | |
US11119654B2 (en) | Determining an optimal storage environment for data sets and for migrating data sets | |
US9952940B2 (en) | Method of operating a shared nothing cluster system | |
CN109086388A (en) | Block chain date storage method, device, equipment and medium | |
CN109710614A (en) | A kind of method and device of real-time data memory and inquiry | |
CN106775446A (en) | Based on the distributed file system small documents access method that solid state hard disc accelerates | |
CN106570113B (en) | Mass vector slice data cloud storage method and system | |
CN103207894A (en) | Multipath real-time video data storage system and cache control method thereof | |
TW201702860A (en) | Storage apparatus and method for autonomous space compaction | |
CN103312624A (en) | Message queue service system and method | |
CN103559229A (en) | Small file management service (SFMS) system based on MapFile and use method thereof | |
Zeng et al. | Optimal metadata replications and request balancing strategy on cloud data centers | |
CN111159176A (en) | Method and system for storing and reading mass stream data | |
CN109471843A (en) | A kind of metadata cache method, system and relevant apparatus | |
CN104850548B (en) | A kind of method and system for realizing big data platform input/output processing | |
US20150058438A1 (en) | System and method providing hierarchical cache for big data applications | |
WO2013172405A1 (en) | Storage system and data access method | |
US11157456B2 (en) | Replication of data in a distributed file system using an arbiter | |
CN105335450B (en) | Data storage processing method and device | |
CN106254270A (en) | A kind of queue management method and device | |
CN106940712A (en) | Sequence generating method and equipment | |
CN107493309A (en) | File wiring method and device in a kind of distributed system | |
WO2024001025A1 (en) | Pre-execution cache data cleaning method and blockchain node | |
CN106528667A (en) | Low-power-consumption mass data full-text retrieval system frame capable of carrying out read-write separation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||