CN105391755B - Data processing method, apparatus, and system in a distributed system - Google Patents
- Publication number
- CN105391755B (CN201510644448.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- node
- memory
- updated
- version
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data processing method, apparatus, and system in a distributed system. They address the problem that the data processing efficiency of a distributed system drops because a compute node must wait for tasks based on the former partition view to complete before it can process data based on the new partition view. The data processing method includes: while a first compute node is performing an update operation on first data using first update data according to a first partition view, a second compute node receives a second partition view; and the second compute node performs an update operation on the first data using second update data according to the second partition view. The at least two storage nodes that the first partition view indicates as storing the first data at a first moment are not exactly the same as the at least two storage nodes that the second partition view indicates as storing the first data at a second moment.
Description
Technical field
The present invention relates to the field of data storage, and more particularly to a data processing method, apparatus, and system in a distributed system.
Background art
A distributed system is a system in which computing resources and storage resources are distributed across different nodes that are connected through a network. A distributed system completes a task by managing and controlling multiple nodes, yet it appears externally as a single, complete, independent system; users are unaware of how the task is executed internally.
Fig. 1 is a schematic diagram of a distributed system. The client receives user requests (such as write requests and read requests), the compute nodes are responsible for distribution and routing, and the storage nodes are responsible for single-disk management and the actual storage. The system also includes a management node, which manages the compute nodes and the storage nodes.
A distributed system stores data using a multi-replica policy: the same data is stored on multiple storage nodes, which avoids data loss caused by a storage node failure and improves the reliability of data storage. The system must also keep the replicas of the same data consistent: when multiple storage nodes store the same data, the data stored on those nodes must be identical.
Chain replication is one strategy for keeping multiple replicas consistent. Referring to Fig. 2, the storage nodes that store a piece of data are linked in a chain. The storage node at the head of the chain is called the head node (also called the primary node), and the storage node at the end of the chain is called the tail node. To perform a write, a compute node first writes the data to the head node, then to node 2, which follows the head node in the chain, and so on until the data has been written to the tail node; the compute node then reports to the client that the write succeeded. To read data, the compute node reads only the data stored on the tail node.
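The write and read paths of chain replication described above can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation; the class names `StorageNode` and `Chain` and the key `"d1"` are hypothetical and are introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """One replica in the chain (hypothetical in-memory stand-in)."""
    name: str
    store: dict = field(default_factory=dict)


class Chain:
    """Replicas ordered from the head node to the tail node."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            raise ValueError("a chain needs at least a head and a tail")
        self.nodes = list(nodes)

    def write(self, key, value):
        # Apply the write to the head first, then to each successor in order.
        for node in self.nodes:
            node.store[key] = value
        # Success is reported only after the tail holds the data.
        return True

    def read(self, key):
        # Reads are served from the tail node only.
        return self.nodes[-1].store.get(key)


chain = Chain([StorageNode("head"), StorageNode("node2"), StorageNode("tail")])
chain.write("d1", "v1")
print(chain.read("d1"))  # v1
```

Reading from the tail means a reader only sees data that has already passed through every node in the chain.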
In a distributed system, each compute node stores a mapping table, called a partition view, that the management node manages, updates, and maintains in a unified way. The partition view records the storage nodes that store each piece of data. A compute node uses its stored partition view to determine the storage nodes for a given piece of data and then requests those storage nodes to perform the corresponding operation.
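A partition view of this kind can be pictured as a lookup table from a data identifier to its storage nodes. The sketch below is only an illustration under stated assumptions: the dictionary layout, the node names `A`, `B`, `C`, the identifier `data1`, and the convention that the first listed node is the head (primary) node are all hypothetical and do not come from the patent text.

```python
from typing import Dict, List

# Hypothetical partition view: data identifier -> ordered list of storage nodes.
# Assumption for this sketch: the first entry is the head (primary) node.
PartitionView = Dict[str, List[str]]

partition_view: PartitionView = {"data1": ["A", "B", "C"]}


def replicas_for(view: PartitionView, data_id: str) -> List[str]:
    """Return the storage nodes that the view lists for data_id."""
    return view[data_id]


def primary_for(view: PartitionView, data_id: str) -> str:
    """Return the head (primary) storage node for data_id."""
    return replicas_for(view, data_id)[0]


print(replicas_for(partition_view, "data1"))  # ['A', 'B', 'C']
print(primary_for(partition_view, "data1"))   # A
```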
In practice, the storage nodes that store a piece of data may change. After such a change, the management node generates a new partition view based on the storage nodes that store the data after the change and sends the new partition view to each compute node. If, when a compute node receives the new partition view, it still has an unfinished task based on the former partition view (such as a data update operation), the remaining compute nodes must wait for that compute node to finish the task based on the former partition view before they can operate on data based on the new partition view. Because a compute node that has received the new partition view cannot immediately operate on data based on it, tasks are blocked and the data processing efficiency of the distributed system is reduced.
Summary of the invention
Embodiments of the present invention provide a data processing method, apparatus, and system in a distributed system, to solve the problem that the data processing efficiency of the distributed system is reduced because a compute node must wait for tasks based on the former partition view to complete before it can process data based on the new partition view.
In a first aspect, an embodiment of the present invention provides a data processing method in a distributed system. The distributed system includes a management node, a first compute node, a second compute node, and multiple storage nodes, and the management node, the first compute node, the second compute node, and the multiple storage nodes communicate with one another. The method includes:
while the first compute node is performing an update operation on first data using first update data according to a first partition view, the second compute node receives a second partition view; and
the second compute node performs an update operation on the first data using second update data according to the second partition view;
wherein the first partition view and the second partition view are generated by the management node; the first partition view indicates at least two storage nodes that store the first data at a first moment; the second partition view indicates at least two storage nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment.
With reference to the first aspect, in a first possible implementation of the first aspect, the at least two storage nodes that the first partition view indicates as storing the first data and the at least two storage nodes that the second partition view indicates as storing the first data both include a first storage node;
the second compute node performing an update operation on the first data using second update data according to the second partition view includes:
the second compute node competes with the first compute node for write permission on the first storage node; and
after obtaining write permission on the first storage node, the second compute node performs an update operation on the first data in the first storage node using the second update data.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the first storage node is a storage node other than the primary storage node among the at least two storage nodes that the first partition view indicates as storing the first data, and the first storage node is the primary storage node among the at least two storage nodes that the second partition view indicates as storing the first data;
the second compute node competing with the first compute node for write permission on the first storage node includes:
the second compute node determines, according to the second partition view, that the primary storage node storing the first data is the first storage node; and
the second compute node sends the first storage node a request to perform an update operation on the first data;
wherein, after the first compute node performs an update operation on the first data in the primary storage node that the first partition view indicates as storing the first data, the first compute node sends the first storage node a request to perform an update operation on the first data; and the first storage node grants write permission for updating the first data to the compute node whose request to update the first data is received first.
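The rule that the first storage node grants write permission to whichever compute node's update request arrives first can be sketched as follows. This is a simplified, single-process illustration; the class name `FirstComeWriteGate`, its method names, and the `release` step are assumptions made for the example and are not described in the text above.

```python
class FirstComeWriteGate:
    """Grants write permission for a data item to the first requester."""

    def __init__(self):
        # data_id -> compute node that currently holds write permission
        self._owner = {}

    def request_update(self, data_id, compute_node):
        # The first request received wins; a later request from another
        # compute node is refused while the permission is held.
        owner = self._owner.setdefault(data_id, compute_node)
        return owner == compute_node

    def release(self, data_id, compute_node):
        # Hypothetical release step so that another node can write later.
        if self._owner.get(data_id) == compute_node:
            del self._owner[data_id]


gate = FirstComeWriteGate()
print(gate.request_update("data1", "compute2"))  # True
print(gate.request_update("data1", "compute1"))  # False
gate.release("data1", "compute2")
print(gate.request_update("data1", "compute1"))  # True
```

In this sketch the second compute node reaches the storage node first and therefore wins; had the first compute node's request arrived first, the outcome would be reversed.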
With reference to the first aspect, in a third possible implementation of the first aspect, the at least two storage nodes that the second partition view indicates as storing the first data include a first storage node and a second storage node;
the second compute node performing an update operation on the first data using second update data according to the second partition view includes:
the second compute node updates the first data stored in the first storage node using the second update data, forming data of a first version in the first storage node;
the second compute node determines that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version; and
after the first data stored in the second storage node has been updated to the data of the version immediately preceding the first version, the second compute node updates that data in the second storage node using the second update data, forming data of the first version in the second storage node.
With reference to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the second compute node determining that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version includes:
the second compute node sends a data update request to the second storage node, where the data update request asks to update the first data with the second update data as the data of the first version; and
the second compute node receives error information returned by the second storage node, where the error information indicates that the version of the first data in the second storage node is lower than the version immediately preceding the first version;
the second compute node updating, using the second update data, the data of the version immediately preceding the first version in the second storage node, after the first data stored in the second storage node has been updated to the data of that version, includes:
after a set duration has elapsed since receiving the error information, the second compute node sends the data update request to the second storage node again; and
the second compute node receives a data update success message returned by the second storage node, wherein, after the first compute node has updated the first data stored in the second storage node to the data of the version immediately preceding the first version, the second storage node responds to the data update request sent by the second compute node and, after the first data has been updated to the second update data, returns the data update success message to the first compute node.
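The version check and delayed resend described in this implementation can be sketched as follows. This is a minimal single-process illustration under stated assumptions: the names `Replica`, `VersionError`, and `update_with_retry`, the integer version numbers, and the retry limit are hypothetical. The text only describes the case where the stored version is too low; this sketch rejects any version that is not the immediate predecessor.

```python
import time


class VersionError(Exception):
    """Stored version is not the immediate predecessor of the requested one."""


class Replica:
    def __init__(self, version=0, value=None):
        self.version = version
        self.value = value

    def update(self, new_version, value):
        # Accept only an update that directly follows the stored version.
        if self.version != new_version - 1:
            raise VersionError(f"stored {self.version}, need {new_version - 1}")
        self.version, self.value = new_version, value


def update_with_retry(replica, new_version, value, delay=0.01, attempts=5):
    """Send the update; on a version error, wait a set duration and resend."""
    for _ in range(attempts):
        try:
            replica.update(new_version, value)
            return True
        except VersionError:
            time.sleep(delay)
    return False


replica = Replica(version=1, value="old")
# Version 3 cannot be applied yet because version 2 has not arrived.
first_try = update_with_retry(replica, 3, "second update", attempts=1)
replica.update(2, "first update")  # the first compute node catches the replica up
second_try = update_with_retry(replica, 3, "second update")
print(first_try, second_try, replica.version)  # False True 3
```

Gating each update on the preceding version keeps the second update data from being applied before the first update data has reached the same storage node.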
In a second aspect, an embodiment of the present invention provides a data processing apparatus in a distributed system. The distributed system includes a management node, a first compute node, the data processing apparatus, and multiple storage nodes, and the management node, the first compute node, the data processing apparatus, and the multiple storage nodes communicate with one another. The data processing apparatus includes:
a receiving module, configured to receive a second partition view while the first compute node is performing an update operation on first data using first update data according to a first partition view; and
an update module, configured to perform an update operation on the first data using second update data according to the second partition view;
wherein the first partition view and the second partition view are generated by the management node; the first partition view indicates at least two storage nodes that store the first data at a first moment; the second partition view indicates at least two storage nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment.
With reference to the second aspect, in a first possible implementation of the second aspect, the at least two storage nodes that the first partition view indicates as storing the first data and the at least two storage nodes that the second partition view indicates as storing the first data both include a first storage node;
the update module is specifically configured to:
compete with the first compute node for write permission on the first storage node; and
after obtaining write permission on the first storage node, perform an update operation on the first data in the first storage node using the second update data.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the first storage node is a storage node other than the primary storage node among the at least two storage nodes that the first partition view indicates as storing the first data, and the first storage node is the primary storage node among the at least two storage nodes that the second partition view indicates as storing the first data;
the update module is specifically configured to:
determine, according to the second partition view, that the primary storage node storing the first data is the first storage node; and
send the first storage node a request to perform an update operation on the first data;
wherein, after the first compute node performs an update operation on the first data in the primary storage node that the first partition view indicates as storing the first data, the first compute node sends the first storage node a request to perform an update operation on the first data; and the first storage node grants write permission for updating the first data to the compute node whose request to update the first data is received first.
With reference to the second aspect, in a third possible implementation of the second aspect, the at least two storage nodes that the second partition view indicates as storing the first data include a first storage node and a second storage node;
the update module is specifically configured to:
update the first data stored in the first storage node using the second update data, forming data of a first version in the first storage node;
determine that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version; and
after the first data stored in the second storage node has been updated to the data of the version immediately preceding the first version, update that data in the second storage node using the second update data, forming data of the first version in the second storage node.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the update module is specifically configured to:
send a data update request to the second storage node, where the data update request asks to update the first data with the second update data as the data of the first version;
receive error information returned by the second storage node, where the error information indicates that the version of the first data in the second storage node is lower than the version immediately preceding the first version;
after a set duration has elapsed since receiving the error information, send the data update request to the second storage node again; and
receive a data update success message returned by the second storage node, wherein, after the first compute node has updated the first data stored in the second storage node to the data of the version immediately preceding the first version, the second storage node responds to the data update request sent by the data processing apparatus and, after the first data has been updated to the second update data, returns the data update success message to the first compute node.
In a third aspect, an embodiment of the present invention provides a compute node in a distributed system, applied in the distributed system. The distributed system further includes a management node, a first compute node, and multiple storage nodes, and the management node, the first compute node, the compute node, and the multiple storage nodes communicate with one another. The compute node includes:
a processor, which communicates with an input/output interface and is configured to instruct the input/output interface to receive a second partition view while the first compute node is performing an update operation on first data using first update data according to a first partition view;
the input/output interface, configured to receive the second partition view as instructed by the processor;
the input/output interface is further configured to receive second update data; and
the processor is further configured to perform an update operation on the first data using the second update data according to the second partition view;
wherein the first partition view and the second partition view are generated by the management node; the first partition view indicates at least two storage nodes that store the first data at a first moment; the second partition view indicates at least two storage nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment.
With reference to the third aspect, in a first possible implementation of the third aspect, the at least two storage nodes that the first partition view indicates as storing the first data and the at least two storage nodes that the second partition view indicates as storing the first data both include a first storage node;
the processor performing an update operation on the first data using second update data according to the second partition view includes:
competing with the first compute node for write permission on the first storage node; and
after obtaining write permission on the first storage node, performing an update operation on the first data in the first storage node using the second update data.
With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the first storage node is a storage node other than the primary storage node among the at least two storage nodes that the first partition view indicates as storing the first data, and the first storage node is the primary storage node among the at least two storage nodes that the second partition view indicates as storing the first data;
the processor competing with the first compute node for write permission on the first storage node includes:
determining, according to the second partition view, that the primary storage node storing the first data is the first storage node; and
instructing the input/output interface to send the first storage node a request to perform an update operation on the first data;
the input/output interface is further configured to send the first storage node the request to perform the update operation on the first data;
wherein, after the first compute node performs an update operation on the first data in the primary storage node that the first partition view indicates as storing the first data, the first compute node sends the first storage node a request to perform an update operation on the first data; and the first storage node grants write permission for updating the first data to the compute node whose request to update the first data is received first.
With reference to the third aspect, in a third possible implementation of the third aspect, the at least two storage nodes that the second partition view indicates as storing the first data include a first storage node and a second storage node;
the processor performing an update operation on the first data using second update data according to the second partition view includes:
updating the first data stored in the first storage node using the second update data, forming data of a first version in the first storage node;
determining that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version; and
after the first data stored in the second storage node has been updated to the data of the version immediately preceding the first version, updating that data in the second storage node using the second update data, forming data of the first version in the second storage node.
With reference to the third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the processor determining that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version includes:
sending a data update request to the second storage node, where the data update request asks to update the first data with the second update data as the data of the first version; and
instructing the input/output interface to receive error information returned by the second storage node, where the error information indicates that the version of the first data in the second storage node is lower than the version immediately preceding the first version;
the input/output interface is further configured to send the data update request to the second storage node and to receive the error information returned by the second storage node;
the processor updating, using the second update data, the data of the version immediately preceding the first version in the second storage node, after the first data stored in the second storage node has been updated to the data of that version, includes:
after a set duration has elapsed since receiving the error information, instructing the input/output interface to send the data update request to the second storage node again; and
instructing the input/output interface to receive a data update success message returned by the second storage node;
the input/output interface is further configured to send the data update request to the second storage node again and to receive the data update success message returned by the second storage node;
wherein, after the first compute node has updated the first data stored in the second storage node to the data of the version immediately preceding the first version, the second storage node responds to the data update request sent by the second compute node and, after the first data has been updated to the second update data, returns a data update success message to the first compute node.
In a fourth aspect, an embodiment of the present invention provides a distributed system. The distributed system includes a management node, a first compute node, a second compute node, and multiple storage nodes, and the management node, the first compute node, the second compute node, and the multiple storage nodes communicate with one another;
the management node is configured to generate a first partition view that indicates at least two storage nodes that store first data at a first moment, and to generate a second partition view that indicates at least two storage nodes that store the first data at a second moment, where the first moment is different from the second moment, and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment;
the first compute node is configured to perform an update operation on the first data using first update data according to the first partition view; and
the second compute node is configured to receive the second partition view while the first compute node is performing an update operation on the first data using the first update data according to the first partition view, and to perform an update operation on the first data using second update data according to the second partition view.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the at least two storage nodes that the first partition view indicates as storing the first data and the at least two storage nodes that the second partition view indicates as storing the first data both include a first storage node;
the second compute node being configured to perform an update operation on the first data using second update data according to the second partition view includes:
competing with the first compute node for write permission on the first storage node; and
after obtaining write permission on the first storage node, performing an update operation on the first data in the first storage node using the second update data.
With reference to the fourth aspect, in a second possible implementation of the fourth aspect, the at least two storage nodes that the second partition view indicates as storing the first data include a first storage node and a second storage node;
the second compute node being configured to perform an update operation on the first data using second update data according to the second partition view includes:
updating the first data stored in the first storage node using the second update data, forming data of a first version in the first storage node;
determining that the version of the first data stored in the second storage node is lower than the version immediately preceding the first version; and
after the first data stored in the second storage node has been updated to the data of the version immediately preceding the first version, updating that data in the second storage node using the second update data, forming data of the first version in the second storage node.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In the embodiments of the present invention, while the first compute node's data update operation on the first data based on the first partition view is still unfinished, the second compute node can perform a data update operation on the first data based on the second partition view. Because a data update operation based on the second partition view, which is issued after the first partition view, can proceed without waiting for the data update operation based on the first partition view to finish, the waiting time and processing time of service handling are reduced, tasks based on the second partition view are not blocked or delayed by tasks based on the first partition view, and the efficiency of data operations in the distributed system is improved.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below.
Fig. 1 is a schematic diagram of a distributed system;
Fig. 2 is a schematic diagram of a chain storage structure;
Fig. 3 is a schematic flowchart of the data processing method in an embodiment of the present invention;
Fig. 4 is another schematic flowchart of the data processing method in an embodiment of the present invention;
Fig. 5 is a schematic diagram of a storage node change in an embodiment of the present invention;
Fig. 6 is yet another schematic flowchart of the data processing method in an embodiment of the present invention;
Fig. 7 is another schematic diagram of a storage node change in an embodiment of the present invention;
Fig. 8 is a schematic structural block diagram of the data processing apparatus in an embodiment of the present invention;
Fig. 9 is a schematic structural block diagram of the compute node in an embodiment of the present invention.
Detailed description
The management node in the distributed system is responsible for maintaining information about the storage nodes of data. It generates a mapping table between data identifiers and data storage locations and sends the mapping table to each compute node. Each compute node saves the mapping table in local storage; the mapping table is also called the partition view on the compute node. Based on the saved partition view, the compute node determines the storage nodes of the data and then initiates data service requests (such as read operations and write operations) to the corresponding storage nodes.
Embodiments of the present invention provide a data processing method, apparatus, and system in a distributed system. The technical solutions of the present invention are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments of the present invention and the specific features in the embodiments are detailed descriptions of the technical solutions of the present invention, and that, where no conflict arises, the embodiments and the technical features in the embodiments may be combined with one another.
Fig. 3 is a schematic flowchart of the data processing method in a distributed system. The flow includes the following steps:
Step 101: While the first compute node is performing an update operation on the first data using first update data according to the first partition view, the second compute node receives the second partition view;
Step 102: The second compute node performs an update operation on the first data using second update data according to the second partition view;
wherein the first partition view and the second partition view are generated by the management node; the first partition view indicates at least two storage nodes that store the first data at the first moment; the second partition view indicates at least two storage nodes that store the first data at the second moment; the first moment is different from the second moment; and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment.
Specifically, the distributed system includes a management node, a first compute node, a second compute node and multiple storage nodes, where the first compute node and the second compute node are two of the at least two compute nodes in the distributed system. With continued reference to Fig. 1, the management node, the first compute node, the second compute node and the multiple storage nodes communicate with one another over a network.
At the first moment, the management node generates the first partition view, which indicates the at least two storage nodes that store the first data at the first moment. After the first moment, the management node detects that the storage nodes storing the first data have changed and, at the second moment, generates the second partition view, which indicates the at least two storage nodes that store the first data at the second moment. Because the second partition view reflects the storage nodes used to store the first data after the change, the at least two storage nodes indicated by the first partition view are not exactly the same as the at least two storage nodes indicated by the second partition view.
After generating the second partition view, the management node can send it to each compute node so that the compute nodes can perform data operations (such as read and write operations) based on the second partition view.
In step 101, when the second compute node receives the second partition view, the first compute node is still performing an update operation on the first data using the first update data according to the first partition view.
In step 102, the second compute node receives a request from a client and executes it: while the first compute node has not yet finished its update operation on the first data according to the first partition view, the second compute node performs an update operation on the first data using the second update data.
Because a data update operation based on the second partition view, which is issued after the first partition view, can proceed without waiting for the data update operation based on the first partition view to finish, the waiting time and total time of service processing are reduced. Tasks based on the second partition view are not blocked by delayed tasks based on the first partition view, which improves the efficiency of data operations in the distributed system.
Optionally, one situation that may arise in the embodiments of the present invention is that the at least two storage nodes storing the first data indicated by the first partition view and the at least two storage nodes storing the first data indicated by the second partition view both include a first storage node. In this case, to prevent the data operations of the first compute node and the second compute node on the first storage node from conflicting, the embodiments of the present invention introduce a competition mechanism.
In a specific implementation, with reference to Fig. 4, step 102 (the second compute node performs an update operation on the first data using the second update data according to the second partition view) includes the following steps:
Step 1021: The second compute node competes with the first compute node for the write permission of the first storage node;
Step 1022: After obtaining the write permission of the first storage node, the second compute node performs an update operation on the first data in the first storage node using the second update data.
Specifically, a compute node must obtain the write permission of the first storage node before performing a data update operation on it. In this embodiment, when the first compute node updates the first data according to the first partition view, it needs to update the first data in the first storage node using the first update data; the second compute node, which updates data according to the second partition view, also needs to perform a data update operation in the first storage node.
In the embodiments of the present invention, the second compute node competes with the first compute node for the write permission of the first storage node. If the second compute node wins, it updates the first data in the first storage node using the second update data. Conversely, if the first compute node wins, the first compute node updates the first data in the first storage node using the first update data.
In the above technical solution, when the first compute node, which updates the first data according to the first partition view, and the second compute node, which updates the first data according to the second partition view, both intend to update the first data in the first storage node, having them compete for the write permission of the first storage node avoids concurrency conflicts.
Optionally, in the embodiments of the present invention, one possible situation in which the first compute node and the second compute node both intend to update the first data in the first storage node is as follows: the first storage node is one of the storage nodes other than the primary storage node among the at least two storage nodes storing the first data indicated by the first partition view, and the first storage node is the primary storage node among the at least two storage nodes storing the first data indicated by the second partition view.
In a specific implementation, step 1021 (the second compute node competes with the first compute node for the write permission of the first storage node) may include the following steps:
The second compute node determines, according to the second partition view, that the primary storage node storing the first data is the first storage node;
The second compute node sends to the first storage node a request to perform an update operation on the first data;
Wherein, after the first compute node has updated the first data in the primary storage node indicated by the first partition view, the first compute node sends to the first storage node a request to perform an update operation on the first data; the first storage node grants the write permission for updating the first data to the compute node whose request to update the first data is received first.
Specifically, when the second compute node updates the first data using the second update data, it first performs the data update in the primary storage node determined from the second partition view, that is, the first storage node. The second compute node therefore sends the first storage node a request to update the first data.
When the first compute node updates the first data according to the first partition view, it first updates the first data in the primary storage node determined from the first partition view. After that update, it continues to update the first data in the (backup) storage nodes other than the primary storage node among the at least two storage nodes determined from the first partition view, which includes updating the first data in the first storage node.
The first storage node therefore receives requests to update the first data from two compute nodes (the first compute node and the second compute node), and the second compute node competes with the first compute node for the write permission of the first storage node. Here, competition means that when two or more compute nodes send data update requests for the same data to the same storage node, only one of them obtains the write permission. The compute node that obtains the write permission can successfully update the data, and the storage node does not respond to the data update requests of the compute nodes that lose the competition. In a specific implementation, the competition between the second compute node and the first compute node for the write permission of the first storage node may be realized in the following ways:
First, the first compute node and the second compute node each send a data update request to the first storage node, and the first storage node decides which compute node to grant the write permission to according to the order in which it receives the requests. That is, if the first storage node receives the data update request of the first compute node first, it grants the write permission to the first compute node; conversely, if it receives the data update request of the second compute node first, it grants the write permission to the second compute node.
Second, the first compute node and the second compute node each send a data update request to the first storage node, and the first storage node decides which compute node to grant the write permission to according to the order in which the compute nodes sent their requests. That is, each data update request contains a timestamp giving the time at which the compute node sent the request to the first storage node. When the first storage node has received the data update requests for the first data from both the first compute node and the second compute node, it determines the sending order from the timestamps in the requests and grants the write permission to the party that sent its request first.
In addition, the first compute node and the second compute node may compete for the write permission of the first storage node using other competition mechanisms in the prior art, which are not described in detail in the embodiments of the present invention.
In the above technical solution, granting the write permission of the first storage node to the compute node that sent its request first not only avoids concurrency conflicts but also improves the efficiency of the distributed system.
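The two arbitration rules above can be contrasted in a short sketch. The names `UpdateRequest`, `grant_by_arrival` and `grant_by_timestamp` are hypothetical, and the timestamp rule additionally assumes that the clocks of the compute nodes are comparable, which the text does not discuss.

```python
from dataclasses import dataclass


@dataclass
class UpdateRequest:
    """Hypothetical data update request carrying the sender's timestamp."""
    sender: str
    sent_at: float  # time at which the compute node sent the request


def grant_by_arrival(requests_in_arrival_order):
    """First rule: the request the storage node receives first wins."""
    return requests_in_arrival_order[0].sender


def grant_by_timestamp(requests):
    """Second rule: the request that was sent earliest wins."""
    return min(requests, key=lambda r: r.sent_at).sender


# The second compute node's request arrives first although it was sent later.
arrived = [UpdateRequest("compute-2", sent_at=10.5),
           UpdateRequest("compute-1", sent_at=10.2)]

winner_by_arrival = grant_by_arrival(arrived)
winner_by_timestamp = grant_by_timestamp(arrived)
print(winner_by_arrival, winner_by_timestamp)
```

In this example the two rules pick different winners, which shows that the choice of rule matters whenever network delay reorders requests.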
Optionally, in the embodiments of the present invention, a primary storage node locking mechanism is also used to ensure data consistency across storage nodes. Specifically, when the first compute node performs a data update operation on the first data, it first determines, according to the second partition view, the at least two storage nodes storing the first data, which include one primary storage node (also called the head node). The first compute node must first compete for and obtain the write permission of the primary storage node before it can perform the data update operation in the primary storage node. After receiving the data update success message returned by the primary storage node, it performs the data update operation in the storage nodes located after the primary storage node in the chain structure shown in Fig. 2. Each time, the data update operation is performed in the next node only after the data has been successfully written to the previous node in the chain (also called the parent node), which ensures data consistency across the multiple copies.
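The ordering constraint of the chain structure can be sketched as below. The `Replica` class and `chain_write` function are hypothetical, and the acquisition of the write permission on the head node is omitted; the sketch only shows that a node is written after its parent has reported success.

```python
class Replica:
    """Hypothetical storage node that can be made to fail for illustration."""

    def __init__(self, name, fail=False):
        self.name = name
        self.value = None
        self.fail = fail

    def write(self, value):
        """Store the value and report success, or report failure."""
        if self.fail:
            return False
        self.value = value
        return True


def chain_write(chain, value):
    """Write to the head first, then to each successor only after its parent succeeded.

    Returns the names of the nodes that were written.
    """
    written = []
    for replica in chain:
        if not replica.write(value):
            break  # stop: later nodes must not get ahead of a failed parent
        written.append(replica.name)
    return written


healthy = [Replica("head"), Replica("middle"), Replica("tail")]
broken = [Replica("head"), Replica("middle", fail=True), Replica("tail")]

healthy_result = chain_write(healthy, "v1")
broken_result = chain_write(broken, "v1")
print(healthy_result, broken_result)
```

With the failing middle node, the tail is never written, so no replica is ahead of its parent in the chain.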
Optionally, in the embodiments of the present invention, a version control mechanism is also used to ensure the stability and validity of data. Version control means that each written piece of data corresponds to a version number. When a compute node performs a write operation, it first reads the version value V_i of the data in the primary node; the version value of the data written by this write operation is V_i + 1. With the version control mechanism in place, a storage node does not accept a request in which lower-version data would overwrite higher-version data, which ensures the stability and consistency of data.
Optionally, in the embodiments of the present invention, the competition mechanism and version control are combined to ensure the consistency and stability of data.
An example is given below. Suppose the three nodes storing the first data in the first partition view are node A, node B (the first storage node) and node C, where node A is the primary storage node. Before the first compute node performs the data update according to the first partition view, the version of the first data in all three nodes is t, and the state of the first data can be written as {A[t], B[t], C[t]}.
The first compute node responds, according to the first partition view, to a client request to update the first data. It first reads the original version value t of the first data in node A and then requests that the first data in node A be updated to data of version t+1. After the data update in node A succeeds, the first compute node goes on to send node B a request to update the first data to data of version t+1.
Before node B receives the data update request sent by the first compute node, with reference to Fig. 5, the management node detects that node A has failed. It takes node B as the new primary storage node, determines a temporary node (labeled hint) according to a hash algorithm to replace node A (the hint node can copy the data back to node A using a heartbeat detection mechanism, which is not described in detail in the embodiments of the present invention), and generates the second partition view. The three nodes storing the first data in the second partition view are node B, the hint node and node C.
After the second compute node receives the second partition view sent by the management node, it receives a client request to update the first data and determines from the second partition view that the primary storage node is node B. Suppose that at this point node B has not yet received the data update request sent by the first compute node, so the version of the first data in node B is still t. After reading the version of the first data in node B, the second compute node therefore sends node B a request to update the first data to data of version t+1.
In this case, node B receives both the request sent by the second compute node and the request sent by the first compute node. The data version to be written by each request is valid, and node B grants the write permission to the compute node whose request it receives first.
If the second compute node wins the competition, it updates the first data using the second update data in node B, the hint node and node C in turn, and the state of the updated data is {B[t+1], hint[t+1], C[t+1]}. Because the version number of the data in node B is now t+1, node B rejects the first compute node's request to write version t+1, which prevents data with the same version number from being overwritten, and the first compute node returns a data write failure to the client.
Correspondingly, if the first compute node wins the competition, it updates the first data using the first update data in node B and node C in turn, and the state of the updated data in the first partition view is {A[t+1], B[t+1], C[t+1]}. Because the version number of the data in node B is now t+1, node B rejects the second compute node's request to write version t+1, which prevents data with the same version number from being overwritten, and the second compute node returns a data write failure to the client.
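The two outcomes above follow from a single check in node B, sketched here under stated assumptions: the class `StorageNode` and its method `handle_update` are hypothetical, requests are handled one at a time in arrival order, and the concrete value chosen for t is arbitrary.

```python
class StorageNode:
    """Hypothetical storage node that accepts only the next version."""

    def __init__(self, name, version, value):
        self.name = name
        self.version = version
        self.value = value

    def handle_update(self, sender, new_version, value):
        """Accept the write only if it targets the version right after the stored one."""
        if new_version != self.version + 1:
            return f"rejected: {sender}"
        self.version = new_version
        self.value = value
        return f"accepted: {sender}"


t = 7  # arbitrary starting version for the illustration
node_b = StorageNode("B", version=t, value="old data")

# Both compute nodes read version t and ask to write version t + 1.
# Here the second compute node's request happens to arrive first.
results = [
    node_b.handle_update("compute-2", t + 1, "second update data"),
    node_b.handle_update("compute-1", t + 1, "first update data"),
]
print(results, node_b.version, node_b.value)
```

Whichever request arrives second finds the stored version already at t+1 and is rejected, so data with the same version number is never overwritten.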
In the above technical solution, combining version control with the locking mechanism allows a compute node working from the second partition view to perform data update operations while tasks based on the first partition view are still unfinished, and also guarantees the consistency of the data across the different storage nodes in the second partition view.
Optionally, if the second compute node fails in the competition for the write permission of the first storage node, the second compute node returns a data write failure message to the client.
Optionally, in another embodiment, the at least two storage nodes storing the first data indicated by the second partition view include a first storage node and a second storage node;
In a specific implementation, with reference to Fig. 6, step 102 (the second compute node performs an update operation on the first data using the second update data according to the second partition view) includes the following steps:
Step 1023: The second compute node updates the first data stored in the first storage node using the second update data, forming data of a first version in the first storage node;
Step 1024: The second compute node determines that the version of the first data held by the second storage node is lower than the version preceding the first version;
Step 1025: After the first data held by the second storage node has been updated to data of the version preceding the first version, the second compute node updates that data in the second storage node using the second update data, forming data of the first version in the second storage node.
Specifically, the second compute node determines from the second partition view the at least two storage nodes storing the first data, which specifically include the first storage node and the second storage node. Before the second compute node updates the first data in the first storage node, the first data is at the version preceding the first version; after the second compute node updates the first data in the first storage node using the second update data, the version corresponding to the updated data is the first version.
The second compute node then goes on to update the first data in the second storage node but finds that the version of the first data there is lower than the version preceding the first version. In the embodiments of the present invention, the second compute node waits until the first data in the second storage node has been updated to data of the version preceding the first version, and then uses the second update data to update that data in the second storage node.
In the above technical solution, overwriting data across versions is forbidden; that is, a storage node accepts only a data update request for the version immediately following the stored version. This ensures data consistency and, when multiple compute nodes update data, avoids interfering with the successful execution of other compute nodes' data updates.
Although the second compute node must wait for the first data in the second storage node to be updated to the version immediately preceding the version it wants to write, it can perform the data update in the first storage node directly, without waiting for the first compute node to finish its data update operation based on the first partition view, so service processing time is still reduced.
Optionally, in the embodiments of the present invention, in a specific implementation, step 1024 (the second compute node determines that the version of the first data held by the second storage node is lower than the version preceding the first version) includes the following steps:
The second compute node sends a data update request to the second storage node, the data update request asking to update the first data with the second update data as data of the first version;
The second compute node receives error information returned by the second storage node, the error information indicating that the version of the first data in the second storage node is lower than the version preceding the first version.
In a specific implementation, step 1025 (after the first data held by the second storage node has been updated to data of the version preceding the first version, the second compute node updates that data in the second storage node using the second update data) includes the following steps:
After a set duration has elapsed since receiving the error information, the second compute node sends the data update request to the second storage node again;
The second compute node receives a data update success message returned by the second storage node, wherein, after the first data stored in the second storage node has been updated by the first compute node to data of the version preceding the first version, the second storage node responds to the data update request sent by the second compute node and, after the first data has been updated with the second update data, returns the data update success message to the second compute node.
Specifically, after receiving a cross-version data update request, the second storage node returns error information to the second compute node so that the second compute node knows that its current data update request has a cross-version error.
The second compute node waits for the first data in the second storage node to be updated to data of the version preceding the first version. This can be implemented in the following ways:
First, the second compute node periodically resends the data update request to the second storage node according to a set period. After the first data in the second storage node has been updated to data of the version preceding the first version, the resent data update request is accepted, and the second storage node, after updating the first data to data of the first version, returns a data update success message to the second compute node.
Second, after the first data in the second storage node has been updated by the first compute node to data of the version preceding the first version, the second storage node actively returns to the second compute node a message indicating that the first data has been updated to data of the version preceding the first version. The second compute node then sends the data update request to the second storage node again, and the second storage node accepts the request and performs the data update.
With either of these two approaches, the second compute node can update the first data in the second storage node promptly after the first data stored there has been updated to data of the version preceding the first version, which reduces service processing time.
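The first approach (periodic resending) can be sketched as a bounded retry loop. Everything named here is hypothetical: the status strings, the `max_attempts` limit (the text does not state one), and the `on_wait` hook, which only exists so the example can simulate the first compute node catching up between attempts.

```python
import time


class StorageNode:
    """Hypothetical storage node that reports cross-version requests."""

    def __init__(self, version):
        self.version = version

    def handle_update(self, new_version):
        if new_version == self.version + 1:
            self.version = new_version
            return "ok"
        if new_version > self.version + 1:
            return "cross_version_error"
        return "stale"  # same or lower version than the stored one


def update_with_retry(node, new_version, wait_seconds, max_attempts, on_wait=None):
    """Resend the request after a set duration until it stops being cross-version."""
    for attempt in range(1, max_attempts + 1):
        status = node.handle_update(new_version)
        if status != "cross_version_error":
            return status, attempt
        if on_wait is not None:
            on_wait()  # simulation hook: another compute node makes progress
        time.sleep(wait_seconds)
    return "gave_up", max_attempts


second_storage = StorageNode(version=5)


def first_compute_catches_up():
    # Simulates the first compute node bringing the data to the preceding version.
    second_storage.version = 6


outcome = update_with_retry(second_storage, 7, wait_seconds=0.0,
                            max_attempts=3, on_wait=first_compute_catches_up)
print(outcome, second_storage.version)
```

The second approach would replace the timed loop with a notification from the storage node; the version check itself stays the same.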
In practice, the cross-version error described above can arise in data update requests in situations including the following:
Situation 1: when a storage node is added to store the first data, the management node instructs the first compute node to copy the historical versions of the first data to the newly added storage node by background copy. In this situation a data update request may encounter a cross-version error, as illustrated below.
Suppose the nodes storing the first data in the first partition view are node A (the first storage node) and node C, where node A is the primary storage node and node C is the tail node. At a user's request or by automatic allocation of the management node, node D is added as a node storing the first data, with node D as the child node of node A and the parent node of node C.
With reference to Fig. 7, the management node assigns the first compute node to perform the background copy, copying the historical version data of node C to node D. Suppose that before node D is added the state of the data is {A[4], C[4]}. The first compute node then copies the data from the first version C[1] through the fourth version C[4] to node D (because the copies are consistent, A[i] = C[i] for i = 1, 2, 3, 4).
In addition, while the first compute node is performing the background copy, the data on node A, node D and node C is not yet consistent, so the management node does not generate the second partition view and compute nodes continue to process data according to the first partition view. Suppose a third compute node updates the data while the first compute node is performing the background copy; the state of the data becomes {A[5], C[5] | D[3]}, where D[3] indicates that the first compute node has copied the third-version data C[3] to node D.
After the first compute node has copied the fifth-version data C[5] to node D, the state of the data becomes {A[5], C[5] | D[5]}. Once the management node determines that the data version of node D is consistent with that of node C, it generates the second partition view and sends it to all compute nodes, including the second compute node.
However, different compute nodes receive the second partition view at different times. For example, at moment T1 the second compute node receives the second partition view. At moment T2, after T1, the third compute node has not yet received the second partition view but receives a client request to update the data, so it updates the data in node A and node C according to the first partition view it holds. At moment T3, after T2, the third compute node has updated the data according to the first partition view to {A[6], C[6] | D[5]}.
At moment T4, after T3, the second compute node receives a client request to update the data. It reads the version of the data in the primary storage node A, which is 6, and then sends node A a request to write version 7 of the data. After the write succeeds, the state of the data according to the second partition view is {A[7], D[5], C[6]}. At moment T5, after T4, the second compute node goes on to send node D a request to write version 7 of the data. However, the version of the data in node D is 5, so node D does not accept the cross-version write request and returns error information to the second compute node, informing it that the write operation has a cross-version error.
In practice, after the third compute node updates the data to {A[6], C[6] | D[5]} at the earlier moment T3, the management node again assigns the first compute node to perform a background copy according to the first partition view, copying C[6] to node D. At moment T5 this background copy has not yet completed, which is why the second compute node's write operation encounters the cross-version error. The second compute node can periodically resend the data update request (that is, the request to write version 7 of the data) to node D. At moment T6, after T5, the first compute node has successfully copied C[6] to node D, the second compute node's data update request becomes valid, and the second compute node goes on to update the data in node D and node C. After the update, the state of the data is {A[7], D[7], C[7]}.
In situation 1, with the technical solution provided by the present invention, the second compute node can update data according to the second partition view before the background data copy performed by the first compute node according to the first partition view is complete, which reduces service waiting time and service processing time and improves system efficiency.
Situation 2: a storage node is deleted.
Suppose the nodes storing the data in the first partition view are node A (the first storage node), node B and node C (the second storage node). On a user's instruction or by automatic scheduling of the management node, node B is removed, that is, node B no longer serves as a storage node of the data. The management node generates the second partition view and sends it to each compute node.
However, different compute nodes receive the second partition view at different times. For example, at moment T1 the second compute node receives the second partition view. At moment T2, after T1, the first compute node has not yet received the second partition view but receives a client request to update the data, so it updates the data in node A and node C according to the first partition view it holds. Suppose that before T2 the state of the data is {A[2], B[2], C[2]}; the first compute node's data update then writes version 3 of the data.
At moment T3, the first compute node has successfully updated the data in node A to A[3], while the data update in node C has not yet started (or not yet completed), and the second compute node receives a client request to update the data. In the embodiments of the present invention, the second compute node does not wait for the first compute node's data update operation based on the first partition view to finish, but performs its data update operation directly based on the second partition view.
First, the second compute node reads the version of the data in the primary storage node A, which is version 3, and sends node A a request to write version 4 of the data. After the write succeeds, it goes on to update the data in node C according to the second partition view. However, when the second compute node sends node C the request to write version 4 of the data, the first compute node's data update operation in node C has not yet finished. According to the second partition view the state of the data is therefore {A[4], C[2]}, and the second compute node's data update request has a cross-version error. The second compute node waits until the first compute node has updated the data in node C to C[3] and then updates the data in node C to C[4].
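The sequence of events in situation 2 can be replayed as a small simulation. This is only an illustration of the version states named in the text: the `write` helper and the labels `compute-1` and `compute-2` are hypothetical, and node B is tracked but not written because the passage follows only node A and node C.

```python
# Versions held by each storage node before moment T2, as given in the text.
versions = {"A": 2, "B": 2, "C": 2}
log = []


def write(node, new_version, who):
    """Apply a write only if it targets the version right after the stored one."""
    ok = new_version == versions[node] + 1
    if ok:
        versions[node] = new_version
    log.append((who, node, new_version, "ok" if ok else "cross-version"))
    return ok


# First compute node (old view) updates node A to version 3.
write("A", 3, "compute-1")

# Second compute node (new view, nodes A and C) reads A and targets the next version.
target = versions["A"] + 1
write("A", target, "compute-2")
write("C", target, "compute-2")   # node C is still at version 2: cross-version error

# First compute node finishes its pending update of node C to version 3.
write("C", 3, "compute-1")

# Second compute node resends its request to node C, which is now valid.
write("C", target, "compute-2")

for entry in log:
    print(entry)
```

The second compute node's first request to node C is the only rejected write; every other step advances a node by exactly one version.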
In case 2, with the technical solution provided by the present invention, the second calculate node can update the data according to the second subregion view before the data update performed by the first calculate node according to the first subregion view has completed, which reduces service waiting time and service processing time and improves system efficiency.
Based on the same technical idea, an embodiment of the present invention further provides data processing equipment 200 in a distributed system. The distributed system includes a management node, a first calculate node, the data processing equipment 200 and multiple memory nodes, which communicate with one another. With reference to Fig. 8, the data processing equipment 200 includes:
Receiving module 201, configured to receive a second subregion view while the first calculate node is performing an update operation on first data using first update data according to a first subregion view;
Update module 202, configured to perform an update operation on the first data using second update data according to the second subregion view;
Wherein, the first subregion view and the second subregion view are generated by the management node; the first subregion view indicates at least two memory nodes that store the first data at a first moment; the second subregion view indicates at least two memory nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two memory nodes that store the first data at the first moment are not exactly the same as the at least two memory nodes that store the first data at the second moment.
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the first subregion view and the at least two memory nodes storing the first data indicated by the second subregion view both include a first memory node;
Update module 202 is specifically configured to:
Compete with the first calculate node for the write permission of the first memory node;
After obtaining the write permission of the first memory node, perform an update operation on the first data in the first memory node using the second update data.
Optionally, in the embodiment of the present invention, the first memory node is a memory node other than the primary storage node among the at least two memory nodes storing the first data indicated by the first subregion view, and the first memory node is the primary storage node among the at least two memory nodes storing the first data indicated by the second subregion view;
Update module 202 is specifically configured to:
Determine, according to the second subregion view, that the primary storage node storing the first data is the first memory node;
Send the first memory node a request to perform an update operation on the first data;
Wherein, after the first calculate node has performed the update operation on the first data in the primary storage node storing the first data indicated by the first subregion view, the first calculate node sends the first memory node a request to perform an update operation on the first data; the first memory node grants the write permission for updating the first data to the calculate node corresponding to the request to update the first data that it receives first.
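The rule that the first memory node grants the write permission to whichever calculate node's request it receives first can be sketched as follows. The class and method names are hypothetical, and the waiting queue and the release step are assumptions added for illustration; the embodiment only states which request wins.

```python
class PrimaryStorageNode:
    """Hypothetical memory node arbitrating the write permission for one piece of data."""

    def __init__(self):
        self.holder = None  # calculate node currently holding the write permission
        self.waiting = []   # assumption: later requesters queue in arrival order

    def request_write(self, calculate_node):
        """Return True if the caller is granted the write permission."""
        if self.holder is None:
            # The first request received wins the competition.
            self.holder = calculate_node
            return True
        if calculate_node != self.holder and calculate_node not in self.waiting:
            self.waiting.append(calculate_node)
        return False

    def release(self, calculate_node):
        """Give up the permission and return the next holder, if any."""
        if self.holder != calculate_node:
            raise ValueError("only the current holder can release the write permission")
        self.holder = self.waiting.pop(0) if self.waiting else None
        return self.holder
```

For example, if the second calculate node's request arrives before the first calculate node's request, `request_write("second")` returns True and `request_write("first")` returns False until the permission is released.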
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the second subregion view include a first memory node and a second memory node;
Update module 202 is specifically configured to:
Update the first data stored in the first memory node using the second update data, forming data of a first version in the first memory node;
Determine that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version;
After the first data held by the second memory node has been updated to the data of the previous version of the first version, update the data of the previous version of the first version in the second memory node using the second update data, forming data of the first version in the second memory node.
Optionally, in the embodiment of the present invention, update module 202 is specifically configured to:
Send a data update request to the second memory node, the data update request requesting that the first data be updated with the second update data as the data of the first version;
Receive error information returned by the second memory node, the error information indicating that the version corresponding to the first data in the second memory node is lower than the previous version of the first version;
After a set duration has elapsed since the error information was received, send the data update request to the second memory node again;
Receive a data update success message returned by the second memory node, wherein, after the first data it stores has been updated by the first calculate node to the data of the previous version of the first version, the second memory node responds to the data update request sent by the data processing equipment and, after the first data has been updated to the second update data, returns the data update success message to the first calculate node.
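The resend-after-a-set-duration behaviour described above can be sketched as a small retry loop. The reply constants, the function name and the max_attempts limit are assumptions made for illustration; the embodiment only specifies resending after a set duration following the error information.

```python
import time

ERROR = "error"      # stands in for the error information from the second memory node
SUCCESS = "success"  # stands in for the data update success message


def update_with_retry(send_request, set_duration, max_attempts=5, sleep=time.sleep):
    """Resend the data update request after each error until it succeeds.

    send_request: callable that sends the request and returns ERROR or SUCCESS.
    set_duration: seconds to wait after an error before resending.
    max_attempts: assumption, not part of the embodiment, to keep the loop bounded.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if send_request() == SUCCESS:
            return attempt
        if attempt < max_attempts:
            sleep(set_duration)  # wait the set duration, then resend
    raise TimeoutError("second memory node never reached the previous version")
```

Passing the sleep function as a parameter keeps the sketch easy to exercise without real delays.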
The data processing equipment 200 in this embodiment and the data processing method corresponding to Fig. 3 are two aspects of the same inventive concept. The implementation of the method has been described in detail above, so those skilled in the art can clearly understand the structure and implementation of the data processing equipment 200 in this embodiment from the foregoing description; for brevity of the specification, details are not repeated here.
Based on the same inventive concept, an embodiment of the present invention further provides a calculate node 300 in a distributed system, applied in the distributed system. The distributed system includes a management node, a first calculate node, the calculate node 300 and multiple memory nodes, which communicate with one another. With reference to Fig. 9, the calculate node 300 includes a processor 301 and an input/output interface 302.
Wherein, processor 301 communicates with input/output interface 302 and is configured to instruct input/output interface 302 to receive a second subregion view while the first calculate node is performing an update operation on first data using first update data according to a first subregion view;
Input/output interface 302 is configured to receive the second subregion view as instructed by processor 301;
Input/output interface 302 is further configured to receive second update data;
Processor 301 is further configured to perform an update operation on the first data using the second update data according to the second subregion view;
Wherein, the first subregion view and the second subregion view are generated by the management node; the first subregion view indicates at least two memory nodes that store the first data at a first moment; the second subregion view indicates at least two memory nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two memory nodes that store the first data at the first moment are not exactly the same as the at least two memory nodes that store the first data at the second moment.
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the first subregion view and the at least two memory nodes storing the first data indicated by the second subregion view both include a first memory node;
Processor 301 performing an update operation on the first data using the second update data according to the second subregion view includes:
Competing with the first calculate node for the write permission of the first memory node;
After obtaining the write permission of the first memory node, performing an update operation on the first data in the first memory node using the second update data.
Optionally, in the embodiment of the present invention, the first memory node is a memory node other than the primary storage node among the at least two memory nodes storing the first data indicated by the first subregion view, and the first memory node is the primary storage node among the at least two memory nodes storing the first data indicated by the second subregion view;
Processor 301 competing with the first calculate node for the write permission of the first memory node includes:
Determining, according to the second subregion view, that the primary storage node storing the first data is the first memory node;
Instructing input/output interface 302 to send the first memory node a request to perform an update operation on the first data;
Input/output interface 302 is further configured to send the first memory node the request to perform an update operation on the first data;
Wherein, after the first calculate node has performed the update operation on the first data in the primary storage node storing the first data indicated by the first subregion view, the first calculate node sends the first memory node a request to perform an update operation on the first data; the first memory node grants the write permission for updating the first data to the calculate node corresponding to the request to update the first data that it receives first.
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the second subregion view include a first memory node and a second memory node;
Processor 301 performing an update operation on the first data using the second update data according to the second subregion view includes:
Updating the first data stored in the first memory node using the second update data, forming data of a first version in the first memory node;
Determining that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version;
After the first data held by the second memory node has been updated to the data of the previous version of the first version, updating the data of the previous version of the first version in the second memory node using the second update data, forming data of the first version in the second memory node.
Optionally, in the embodiment of the present invention, processor 301 determining that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version includes:
Instructing input/output interface 302 to send a data update request to the second memory node, the data update request requesting that the first data be updated with the second update data as the data of the first version;
Instructing input/output interface 302 to receive error information returned by the second memory node, the error information indicating that the version corresponding to the first data in the second memory node is lower than the previous version of the first version;
Input/output interface 302 is further configured to send the data update request to the second memory node and to receive the error information returned by the second memory node;
Processor 301 updating, after the first data held by the second memory node has been updated to the data of the previous version of the first version, the data of the previous version of the first version in the second memory node using the second update data includes:
After a set duration has elapsed since the error information was received, instructing input/output interface 302 to send the data update request to the second memory node again;
Instructing input/output interface 302 to receive a data update success message returned by the second memory node;
Input/output interface 302 is further configured to send the data update request to the second memory node again and to receive the data update success message returned by the second memory node;
Wherein, after the first data it stores has been updated by the first calculate node to the data of the previous version of the first version, the second memory node responds to the data update request sent by the second calculate node and, after the first data has been updated to the second update data, returns the data update success message to the first calculate node.
It should be noted that the processor 301 may be a single processor or a collective term for multiple processors. For example, the processor 301 may be a central processing unit, an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention, such as one or more microprocessors or one or more field-programmable gate arrays.
Optionally, the processor 301 is connected to the input/output interface 302 through a bus, which may be an Industry Standard Architecture bus, a Peripheral Component Interconnect bus, an Extended Industry Standard Architecture bus, or the like. The bus may be divided into an address bus, a data bus, a control bus and so on. For ease of illustration it is drawn as a single line in the figure, but this does not mean that there is only one bus or only one type of bus.
The calculate node 300 in this embodiment and the data processing method corresponding to Fig. 3 are two aspects of the same inventive concept. The implementation of the method has been described in detail above, so those skilled in the art can clearly understand the structure and implementation of the calculate node 300 in this embodiment from the foregoing description; for brevity of the specification, details are not repeated here.
Based on the same inventive concept, an embodiment of the present invention provides a distributed system. The distributed system includes a management node, a first calculate node, a second calculate node and multiple memory nodes, which communicate with one another;
The management node is configured to generate a first subregion view indicating at least two memory nodes that store first data at a first moment, and to generate a second subregion view indicating at least two memory nodes that store the first data at a second moment, wherein the first moment is different from the second moment, and the at least two memory nodes that store the first data at the first moment are not exactly the same as the at least two memory nodes that store the first data at the second moment;
The first calculate node is configured to perform an update operation on the first data using first update data according to the first subregion view;
The second calculate node is configured to receive the second subregion view while the first calculate node is performing the update operation on the first data using the first update data according to the first subregion view, and to perform an update operation on the first data using second update data according to the second subregion view.
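The constraints on the two subregion views can be made concrete with a small sketch. The SubregionView class, the convention that the first listed memory node is the primary storage node, and the node names N1 to N3 are all assumptions for illustration; the embodiment does not prescribe a data layout for a view.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubregionView:
    """Hypothetical record of which memory nodes store the first data at a moment."""

    moment: int
    memory_nodes: tuple  # assumption: the first entry is the primary storage node

    def __post_init__(self):
        # A view must indicate at least two (distinct) memory nodes.
        if len(set(self.memory_nodes)) < 2:
            raise ValueError("a view must indicate at least two memory nodes")

    @property
    def primary(self):
        return self.memory_nodes[0]


def views_are_valid_pair(first, second):
    """Check that the moments differ and the node sets are not exactly the same."""
    return (
        first.moment != second.moment
        and set(first.memory_nodes) != set(second.memory_nodes)
    )


# Example pair: N2 is a non-primary node in the first view and the primary in the second.
first_view = SubregionView(moment=1, memory_nodes=("N1", "N2"))
second_view = SubregionView(moment=2, memory_nodes=("N2", "N3"))
```

With this example pair, a memory node shared by both views (N2) is the one for which the two calculate nodes may compete for write permission.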
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the first subregion view and the at least two memory nodes storing the first data indicated by the second subregion view both include a first memory node;
The second calculate node performing an update operation on the first data using the second update data according to the second subregion view includes:
Competing with the first calculate node for the write permission of the first memory node;
After obtaining the write permission of the first memory node, performing an update operation on the first data in the first memory node using the second update data.
Optionally, in the embodiment of the present invention, the at least two memory nodes storing the first data indicated by the second subregion view include a first memory node and a second memory node;
The second calculate node performing an update operation on the first data using the second update data according to the second subregion view includes:
Updating the first data stored in the first memory node using the second update data, forming data of a first version in the first memory node;
Determining that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version;
After the first data held by the second memory node has been updated to the data of the previous version of the first version, updating the data of the previous version of the first version in the second memory node using the second update data, forming data of the first version in the second memory node.
The distributed system in this embodiment and the data processing method corresponding to Fig. 3 are two aspects of the same inventive concept. The implementation of the method has been described in detail above, so those skilled in the art can clearly understand the structure and implementation of the distributed system in this embodiment from the foregoing description; for brevity of the specification, details are not repeated here.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In the embodiment of the present invention, while the data update operation performed by the first calculate node on the first data based on the first subregion view has not yet finished, the second calculate node can perform a data update operation on the first data based on the second subregion view. Because the data update operation can be performed based on the second subregion view, which is issued after the first subregion view, without waiting for the data update operation based on the first subregion view to finish, the waiting time and processing time of services are reduced, the blocking and delay of tasks based on the second subregion view caused by tasks based on the first subregion view are avoided, and the efficiency of data operations in the distributed system is improved.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Claims (18)
1. A data processing method in a distributed system, characterized in that the distributed system includes a management node, a first calculate node, a second calculate node and multiple memory nodes, and the management node, the first calculate node, the second calculate node and the multiple memory nodes communicate with one another, the method including:
While the first calculate node is performing an update operation on first data using first update data according to a first subregion view, the second calculate node receives a second subregion view;
The second calculate node performs an update operation on the first data using second update data according to the second subregion view;
Wherein, the first subregion view and the second subregion view are generated by the management node; the first subregion view indicates at least two memory nodes that store the first data at a first moment; the second subregion view indicates at least two memory nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two memory nodes that store the first data at the first moment are not exactly the same as the at least two memory nodes that store the first data at the second moment.
2. The method as claimed in claim 1, characterized in that the at least two memory nodes storing the first data indicated by the first subregion view and the at least two memory nodes storing the first data indicated by the second subregion view both include a first memory node;
The second calculate node performing an update operation on the first data using the second update data according to the second subregion view includes:
The second calculate node competes with the first calculate node for the write permission of the first memory node;
After obtaining the write permission of the first memory node, the second calculate node performs an update operation on the first data in the first memory node using the second update data.
3. The method as claimed in claim 2, characterized in that the first memory node is a memory node other than the primary storage node among the at least two memory nodes storing the first data indicated by the first subregion view, and the first memory node is the primary storage node among the at least two memory nodes storing the first data indicated by the second subregion view;
The second calculate node competing with the first calculate node for the write permission of the first memory node includes:
The second calculate node determines, according to the second subregion view, that the primary storage node storing the first data is the first memory node;
The second calculate node sends the first memory node a request to perform an update operation on the first data;
Wherein, after the first calculate node has performed the update operation on the first data in the primary storage node storing the first data indicated by the first subregion view, the first calculate node sends the first memory node a request to perform an update operation on the first data; the first memory node grants the write permission for updating the first data to the calculate node corresponding to the request to update the first data that it receives first.
4. The method as claimed in claim 1, characterized in that the at least two memory nodes storing the first data indicated by the second subregion view include a first memory node and a second memory node;
The second calculate node performing an update operation on the first data using the second update data according to the second subregion view includes:
The second calculate node updates the first data stored in the first memory node using the second update data, forming data of a first version in the first memory node;
The second calculate node determines that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version;
After the first data held by the second memory node has been updated to the data of the previous version of the first version, the second calculate node updates the data of the previous version of the first version in the second memory node using the second update data, forming data of the first version in the second memory node.
5. The method as claimed in claim 4, characterized in that the second calculate node determining that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version includes:
The second calculate node sends a data update request to the second memory node, the data update request requesting that the first data be updated with the second update data as the data of the first version;
The second calculate node receives error information returned by the second memory node, the error information indicating that the version corresponding to the first data in the second memory node is lower than the previous version of the first version;
The second calculate node updating, after the first data held by the second memory node has been updated to the data of the previous version of the first version, the data of the previous version of the first version in the second memory node using the second update data includes:
After a set duration has elapsed since the error information was received, the second calculate node sends the data update request to the second memory node again;
The second calculate node receives a data update success message returned by the second memory node, wherein, after the first data it stores has been updated by the first calculate node to the data of the previous version of the first version, the second memory node responds to the data update request sent by the second calculate node and, after the first data has been updated to the second update data, returns the data update success message to the first calculate node.
6. Data processing equipment in a distributed system, characterized in that the distributed system includes a management node, a first calculate node, the data processing equipment and multiple memory nodes, and the management node, the first calculate node, the data processing equipment and the multiple memory nodes communicate with one another, the data processing equipment including:
A receiving module, configured to receive a second subregion view while the first calculate node is performing an update operation on first data using first update data according to a first subregion view;
An update module, configured to perform an update operation on the first data using second update data according to the second subregion view;
Wherein, the first subregion view and the second subregion view are generated by the management node; the first subregion view indicates at least two memory nodes that store the first data at a first moment; the second subregion view indicates at least two memory nodes that store the first data at a second moment; the first moment is different from the second moment; and the at least two memory nodes that store the first data at the first moment are not exactly the same as the at least two memory nodes that store the first data at the second moment.
7. The data processing equipment as claimed in claim 6, characterized in that the at least two memory nodes storing the first data indicated by the first subregion view and the at least two memory nodes storing the first data indicated by the second subregion view both include a first memory node;
The update module is specifically configured to:
Compete with the first calculate node for the write permission of the first memory node;
After obtaining the write permission of the first memory node, perform an update operation on the first data in the first memory node using the second update data.
8. The data processing equipment as claimed in claim 7, characterized in that the first memory node is a memory node other than the primary storage node among the at least two memory nodes storing the first data indicated by the first subregion view, and the first memory node is the primary storage node among the at least two memory nodes storing the first data indicated by the second subregion view;
The update module is specifically configured to:
Determine, according to the second subregion view, that the primary storage node storing the first data is the first memory node;
Send the first memory node a request to perform an update operation on the first data;
Wherein, after the first calculate node has performed the update operation on the first data in the primary storage node storing the first data indicated by the first subregion view, the first calculate node sends the first memory node a request to perform an update operation on the first data; the first memory node grants the write permission for updating the first data to the calculate node corresponding to the request to update the first data that it receives first.
9. The data processing equipment as claimed in claim 6, characterized in that the at least two memory nodes storing the first data indicated by the second subregion view include a first memory node and a second memory node;
The update module is specifically configured to:
Update the first data stored in the first memory node using the second update data, forming data of a first version in the first memory node;
Determine that the version corresponding to the first data held by the second memory node is lower than the previous version of the first version;
After the first data held by the second memory node has been updated to the data of the previous version of the first version, update the data of the previous version of the first version in the second memory node using the second update data, forming data of the first version in the second memory node.
10. The data processing equipment as claimed in claim 9, characterized in that the update module is specifically configured to:
Send a data update request to the second memory node, the data update request requesting that the first data be updated with the second update data as the data of the first version;
Receive error information returned by the second memory node, the error information indicating that the version corresponding to the first data in the second memory node is lower than the previous version of the first version;
After a set duration has elapsed since the error information was received, send the data update request to the second memory node again;
Receive a data update success message returned by the second memory node, wherein, after the first data it stores has been updated by the first calculate node to the data of the previous version of the first version, the second memory node responds to the data update request sent by the data processing equipment and, after the first data has been updated to the second update data, returns the data update success message to the first calculate node.
11. A computing node in a distributed system, wherein the computing node is applied in the distributed system, the distributed system further comprises a management node, a first computing node and a plurality of storage nodes, the management node, the first computing node, the computing node and the plurality of storage nodes communicate with one another, and the computing node comprises:
a processor, in communication with an input/output interface, configured to instruct the input/output interface to receive a second partition view while the first computing node performs an update operation on first data using first update data according to a first partition view;
the input/output interface, configured to receive the second partition view as instructed by the processor;
the input/output interface being further configured to receive second update data;
the processor being further configured to perform an update operation on the first data using the second update data according to the second partition view;
wherein the first partition view and the second partition view are generated by the management node; the first partition view indicates at least two storage nodes that store the first data at a first moment; the second partition view indicates at least two storage nodes that store the first data at a second moment; the first moment differs from the second moment; and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment.
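The relationship between the two partition views in claim 11 can be sketched as a small data structure. This is an illustrative assumption, not the format used by the management node: `PartitionView`, `views_differ`, the integer `moment` field, the node names, and the convention that the first listed node is the primary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PartitionView:
    """Hypothetical record of which storage nodes hold the first data at a moment."""
    moment: int
    storage_nodes: Tuple[str, ...]  # assumption: first entry is the primary

    def __post_init__(self) -> None:
        # The claim requires at least two storage nodes per view.
        if len(set(self.storage_nodes)) < 2:
            raise ValueError("a partition view needs at least two storage nodes")

    @property
    def primary(self) -> str:
        return self.storage_nodes[0]


def views_differ(first: PartitionView, second: PartitionView) -> bool:
    """True when the moments differ and the node sets are not exactly the same."""
    return (
        first.moment != second.moment
        and set(first.storage_nodes) != set(second.storage_nodes)
    )


# Node A holds the data at moment 1 but is replaced by node C at moment 2.
first_view = PartitionView(moment=1, storage_nodes=("A", "B"))
second_view = PartitionView(moment=2, storage_nodes=("B", "C"))
shared_nodes = set(first_view.storage_nodes) & set(second_view.storage_nodes)
```

Here node B appears in both views, so two computing nodes working from different views can both address it, which is the situation the dependent claims deal with.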
12. The computing node according to claim 11, wherein the at least two storage nodes storing the first data indicated by the first partition view and the at least two storage nodes storing the first data indicated by the second partition view both include a first storage node;
the processor performing the update operation on the first data using the second update data according to the second partition view comprises:
competing with the first computing node for write permission on the first storage node; and
after obtaining the write permission on the first storage node, performing the update operation on the first data in the first storage node using the second update data.
13. The computing node according to claim 12, wherein the first storage node is a storage node other than the primary storage node among the at least two storage nodes storing the first data indicated by the first partition view, and the first storage node is the primary storage node among the at least two storage nodes storing the first data indicated by the second partition view;
the processor competing with the first computing node for the write permission on the first storage node comprises:
determining, according to the second partition view, that the primary storage node storing the first data is the first storage node; and
instructing the input/output interface to send, to the first storage node, a request to perform the update operation on the first data;
the input/output interface is further configured to send, to the first storage node, the request to perform the update operation on the first data;
wherein, after the first computing node has performed the update operation on the first data in the primary storage node storing the first data indicated by the first partition view, the first computing node sends, to the first storage node, a request to perform an update operation on the first data; and the first storage node grants the write permission for updating the first data to the computing node corresponding to the first-received request to perform an update operation on the first data.
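The first-come grant of write permission described in claim 13 might look like the following minimal Python sketch. The class `PrimaryStorageNode`, its method names and the node identifiers are hypothetical, and the `release` step is an assumption added so the example is complete; the claim does not say how or when the permission is given up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrimaryStorageNode:
    """Hypothetical storage node that arbitrates write permission."""
    holder: Optional[str] = None
    data: bytes = b""

    def request_write(self, compute_node_id: str) -> bool:
        # The first request received wins; later requests are refused
        # while the permission is held.
        if self.holder is None:
            self.holder = compute_node_id
        return self.holder == compute_node_id

    def write(self, compute_node_id: str, payload: bytes) -> None:
        # Only the current holder of the write permission may update the data.
        if self.holder != compute_node_id:
            raise PermissionError(f"{compute_node_id} does not hold write permission")
        self.data = payload

    def release(self, compute_node_id: str) -> None:
        # Assumption: the holder hands the permission back explicitly.
        if self.holder == compute_node_id:
            self.holder = None


node = PrimaryStorageNode()
# The second computing node's request reaches the storage node first.
second_granted = node.request_write("second-computing-node")
first_granted = node.request_write("first-computing-node")
if second_granted:
    node.write("second-computing-node", b"second update data")
```

Arbitrating at the storage node itself means neither computing node needs to know that the other is working from a different partition view.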
14. The computing node according to claim 11, wherein the at least two storage nodes storing the first data indicated by the second partition view include a first storage node and a second storage node;
the processor performing the update operation on the first data using the second update data according to the second partition view comprises:
updating the first data stored by the first storage node using the second update data, to form data of a first version in the first storage node;
determining that the version of the first data stored by the second storage node is lower than the version preceding the first version; and
after the first data stored by the second storage node has been updated to the data of the version preceding the first version, updating the data of that preceding version in the second storage node using the second update data, to form the data of the first version in the second storage node.
15. The computing node according to claim 14, wherein:
the processor determining that the version of the first data stored by the second storage node is lower than the version preceding the first version comprises:
instructing the input/output interface to send a data update request to the second storage node, the data update request requesting that the first data be updated using the second update data as the data of the first version; and
instructing the input/output interface to receive error information returned by the second storage node, the error information indicating that the version of the first data in the second storage node is lower than the version preceding the first version;
the input/output interface is further configured to send the data update request to the second storage node and to receive the error information returned by the second storage node;
the processor updating, after the first data stored by the second storage node has been updated to the data of the version preceding the first version, the data of that preceding version in the second storage node using the second update data comprises:
after a set duration has elapsed since the error information was received, instructing the input/output interface to send the data update request to the second storage node again; and
instructing the input/output interface to receive a data update success message returned by the second storage node;
the input/output interface is further configured to send the data update request to the second storage node again and to receive the data update success message returned by the second storage node;
wherein the second storage node, after the first data it stores has been updated by the first computing node to the data of the version preceding the first version, responds to the data update request and, after the first data has been updated to the second update data, returns the data update success message to the first computing node.
16. A distributed system, comprising a management node, a first computing node, a second computing node and a plurality of storage nodes, wherein the management node, the first computing node, the second computing node and the plurality of storage nodes communicate with one another;
the management node is configured to generate a first partition view indicating at least two storage nodes that store first data at a first moment, and to generate a second partition view indicating at least two storage nodes that store the first data at a second moment, wherein the first moment differs from the second moment, and the at least two storage nodes that store the first data at the first moment are not exactly the same as the at least two storage nodes that store the first data at the second moment;
the first computing node is configured to perform an update operation on the first data using first update data according to the first partition view; and
the second computing node is configured to receive the second partition view while the first computing node performs the update operation on the first data using the first update data according to the first partition view, and to perform an update operation on the first data using second update data according to the second partition view.
17. The system according to claim 16, wherein the at least two storage nodes storing the first data indicated by the first partition view and the at least two storage nodes storing the first data indicated by the second partition view both include a first storage node;
the second computing node performing the update operation on the first data using the second update data according to the second partition view comprises:
competing with the first computing node for write permission on the first storage node; and
after obtaining the write permission on the first storage node, performing the update operation on the first data in the first storage node using the second update data.
18. The system according to claim 16, wherein the at least two storage nodes storing the first data indicated by the second partition view include a first storage node and a second storage node;
the second computing node performing the update operation on the first data using the second update data according to the second partition view comprises:
updating the first data stored by the first storage node using the second update data, to form data of a first version in the first storage node;
determining that the version of the first data stored by the second storage node is lower than the version preceding the first version; and
after the first data stored by the second storage node has been updated to the data of the version preceding the first version, updating the data of that preceding version in the second storage node using the second update data, to form the data of the first version in the second storage node.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510644448.1A CN105391755B (en) | 2015-09-30 | 2015-09-30 | Data processing method, apparatus and system in a kind of distributed system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105391755A CN105391755A (en) | 2016-03-09 |
CN105391755B true CN105391755B (en) | 2018-10-19 |
Family
ID=55423585
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510644448.1A Active CN105391755B (en) | 2015-09-30 | 2015-09-30 | Data processing method, apparatus and system in a kind of distributed system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105391755B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102384351B1 (en) * | 2018-05-09 | 2022-04-06 | 삼성에스디에스 주식회사 | Method for generating a block in a blockchain-based system |
CN109655072B (en) * | 2018-10-31 | 2021-01-12 | 百度在线网络技术(北京)有限公司 | Map generation method and device |
CN110069494A (en) * | 2019-03-12 | 2019-07-30 | 北京字节跳动网络技术有限公司 | Date storage method, device, electronic equipment and computer readable storage medium |
KR102331734B1 (en) | 2021-03-19 | 2021-12-01 | 쿠팡 주식회사 | Method for processing data of distributed coordination system and electronic apparatus therefor |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7239605B2 (en) * | 2002-09-23 | 2007-07-03 | Sun Microsystems, Inc. | Item and method for performing a cluster topology self-healing process in a distributed data system cluster |
CN103518364A (en) * | 2013-03-19 | 2014-01-15 | 华为技术有限公司 | Data update method for distributed storage system and server |
CN104283906A (en) * | 2013-07-02 | 2015-01-14 | 华为技术有限公司 | Distributed storage system, cluster nodes and range management method of cluster nodes |
CN104902009A (en) * | 2015-04-27 | 2015-09-09 | 浙江大学 | Erasable encoding and chained type backup-based distributed storage system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102693168B (en) * | 2011-03-22 | 2014-12-31 | 中兴通讯股份有限公司 | A method, a system and a service node for data backup recovery |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105391755B (en) | Data processing method, apparatus and system in a kind of distributed system | |
CN108140028B (en) | Method and architecture for providing database access control in a network with a distributed database system | |
CN105700907B (en) | Utilize the Uninstaller model for local checkpoint | |
JP3779263B2 (en) | Conflict resolution for collaborative work systems | |
CN108351860A (en) | The distributed storage devices based on RDMA of low latency | |
CN106302122B (en) | Virtual objects management method and device | |
CN106843749A (en) | Write request processing method, device and equipment | |
CN110471688A (en) | Operation system processing method, device, equipment and storage medium | |
CN106899648A (en) | A kind of data processing method and equipment | |
CN106484311A (en) | A kind of data processing method and device | |
CN108334396A (en) | The creation method and device of a kind of data processing method and device, resource group | |
US20120167105A1 (en) | Determining the processing order of a plurality of events | |
CN108572876A (en) | A kind of implementation method and device of Read-Write Locks | |
CN109739684A (en) | The copy restorative procedure and device of distributed key value database based on vector clock | |
CN111240806A (en) | Distributed container mirror image construction scheduling system and method | |
CN107368324A (en) | A kind of component upgrade methods, devices and systems | |
CN102413166B (en) | Distributed transaction method and system thereof | |
CN109189431A (en) | A kind of client side upgrading method, device, equipment and readable storage medium storing program for executing | |
US11522966B2 (en) | Methods, devices and systems for non-disruptive upgrades to a replicated state machine in a distributed computing environment | |
JP4920567B2 (en) | Equipment network system and data access control method | |
CN110413207A (en) | Reduce method, equipment and the program product of the data recovery time of storage system | |
US20230409386A1 (en) | Automatically orchestrating a computerized workflow | |
JP2004295272A (en) | Transaction control method | |
JPH07306795A (en) | Data base equivalent processor of duplex system computer | |
CN112596801B (en) | Transaction processing method, device, equipment, storage medium and database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||