CN105323271A - Cloud computing system, and processing method and apparatus thereof - Google Patents

Cloud computing system, and processing method and apparatus thereof

Info

Publication number
CN105323271A
Authority
CN
China
Prior art keywords
data
disk
node
cloud computing
computing system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410289531.7A
Other languages
Chinese (zh)
Other versions
CN105323271B (en)
Inventor
莫嫣
高洪
韩银俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201410289531.7A (patent CN105323271B)
Priority to PCT/CN2014/090398 (patent WO2015196692A1)
Publication of CN105323271A
Application granted
Publication of CN105323271B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications

Abstract

The present invention provides a cloud computing system, and a processing method and apparatus for the cloud computing system. The processing method comprises the following steps: receiving an operation request from a client to the cloud computing system; obtaining, according to the operation request, the data identifier to be operated on in the cloud computing system; looking up, according to a node disk state report of the cloud computing system, each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, together with the state of each such disk; and performing a corresponding operation according to the states of the disks that store the data corresponding to the data identifier in the nodes of the cloud computing system. The node disk state report includes the states of the disks in each node of the cloud computing system and the data identifiers corresponding to the data stored on those disks. The invention can improve the system's tolerance of disk faults.

Description

Cloud computing system, and processing method and apparatus thereof
Technical field
The present invention relates to the field of cloud computing technology, and in particular to a cloud computing system and to a processing method and apparatus for the cloud computing system.
Background
At present, cloud computing (Cloud Computing) is the product of the development and convergence of traditional computer and network technologies such as grid computing (Grid Computing), distributed computing (Distributed Computing), parallel computing (Parallel Computing), utility computing (Utility Computing), network storage (Network Storage Technologies), virtualization (Virtualization), and load balancing (Load Balance). It aims to integrate, over a network, multiple relatively low-cost computing entities into a single system with powerful computing capability. Distributed caching is one field within cloud computing; its role is to provide distributed storage of massive data and high-speed read/write access.
A distributed cache system is formed by a number of server nodes and clients connected to one another. The server nodes are responsible for storing data, and a client can write, read, update, and delete data on the servers. In general, data is not kept on a single server node (hereinafter "node"); instead, copies of the same data are kept on multiple nodes, which back each other up. The most common storage mode is the master/slave mode, in which one node acts as the master node (master) and the other nodes act as slave nodes (slave); the master role is obtained by election or another algorithm. To simplify the flow, data updates generally take place on the master node, and the slave nodes obtain data from the master node for synchronization. A data access may obtain data from the master node or from a slave node, depending on the consistency policy of that access.
In a distributed cache system, this data storage mode is generally classified by NRW according to the consistency and availability requirements, where N is the number of copies of the data, R is the number of data copies obtained in one data access request, and W is the minimum number of participating nodes for one data update request (that is, the number of nodes on which the data update must complete).
When a distributed cache system implements persistence, the data distributed to a server is saved on that server's disk. In practice, if the disk fails, the server can no longer provide read/write service. Because the distributed cache system keeps multiple copies of the data, the system can still provide read/write service from the copies on other nodes as long as the other servers are in a normal state.
If a distributed cache node has multiple disks mounted and only one or a few of them are damaged for some reason, the server as a whole can no longer provide service normally; as described above, the cluster remains available because the other servers are still usable. Suppose, however, that during this period a similar situation occurs on another server, so that this node cannot provide service normally either. The number of copies may then fail to satisfy the NRW policy, and the distributed cache cluster can no longer provide service at all. Typically, under the commonly used NRW of 3/2/2, when two nodes have failed and only one node is normal, neither read nor write operations can meet the minimum requirement of operating on two copies.
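The problem described above can be illustrated with a minimal sketch. It assumes the conventional behaviour in which a single failed disk takes the whole node out of service; the function name `can_serve` and the variable names are hypothetical and are not taken from the patent.

```python
# Hypothetical illustration of node-granularity fault handling under NRW = 3/2/2.
N, R, W = 3, 2, 2


def can_serve(healthy_nodes: int, required: int) -> bool:
    """A request is served only if enough healthy nodes hold a copy."""
    return healthy_nodes >= required


# Assumption: one failed disk makes the whole node unavailable.
healthy_after_one_failure = N - 1
healthy_after_two_failures = N - 2

one_failure_ok = (can_serve(healthy_after_one_failure, R)
                  and can_serve(healthy_after_one_failure, W))
two_failures_ok = (can_serve(healthy_after_two_failures, R)
                   and can_serve(healthy_after_two_failures, W))

print(one_failure_ok, two_failures_ok)
```

With one node out of service both reads and writes still reach two copies; with two nodes out of service neither does, which is the situation the invention sets out to avoid.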
Summary of the invention
The technical problem to be solved by the present invention is to provide a cloud computing system, and a processing method and apparatus for the cloud computing system, that improve the system's tolerance of disk failures.
To solve the above technical problem, embodiments of the present invention provide the following technical solutions:
In one aspect, a processing method for a cloud computing system is provided, comprising:
Receiving an operation request from a client to the cloud computing system;
Obtaining, according to the operation request, the data identifier to be operated on in the cloud computing system;
Looking up, according to the node disk state report of the cloud computing system, each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, and the state of each such disk; the node disk state report comprises: the states of the disks in each node of the cloud computing system, and the data identifiers corresponding to the data stored on the disks;
Performing a corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
The step of performing a corresponding operation according to the state of each disk comprises:
The operation request is an update request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined minimum number of participating nodes for one data update request of the cloud computing system, responding to the update request; otherwise, rejecting the update request; or
The operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined number of data copies obtained by one data access request of the cloud computing system, responding to the data access request; otherwise, rejecting the data access request.
The step of responding to the update request, when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined minimum number of participating nodes for one data update request, comprises:
When the operation request is an update request and the state of the disk of the master node that stores the data is normal, the master node of the cloud computing system updates the data on the master node's disk where the data resides; a slave node of the cloud computing system obtains the data to be synchronized from the master node, and the slave node updates the data on the slave node's disk where the data resides;
When the operation request is an update request and the state of the disk of the master node that stores the data is faulty, a first slave node of the cloud computing system updates the data on the first slave node's disk where the data resides; a second slave node of the cloud computing system obtains the data to be synchronized from the first slave node; the second slave node updates the data on the second slave node's disk where the data resides; the states of the disks of the first slave node and the second slave node that store the data are normal.
The step of responding to the data access request, when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined number of data copies obtained by one data access request, comprises:
When the operation request is a data access request and the state of the disk of the master node that stores the data is normal, obtaining a first copy of the data from the disk where the data resides on the master node of the cloud computing system, and obtaining a second copy of the data from the disk where the data resides on at least one second slave node of the cloud computing system; selecting the copy with the latest version from the first copy and the second copy; and sending the copy with the latest version to the client; the state of the disk of the second slave node that stores the data is normal;
When the operation request is a data access request and the state of the disk of the master node that stores the data is faulty, obtaining a second copy of the data from the disk where the data resides on at least one second slave node of the cloud computing system; selecting the copy with the latest version from the at least one second copy, and sending the copy with the latest version to the client; the state of the disk of the second slave node that stores the data is normal.
Before the step of receiving the operation request from the client, the method further comprises:
Obtaining the node disk state report of the cloud computing system from the nodes.
In another aspect, a processing apparatus for a cloud computing system is provided, comprising:
A first receiving unit, which receives an operation request from a client to the cloud computing system;
An acquiring unit, which obtains, according to the operation request, the data identifier to be operated on in the cloud computing system;
A lookup unit, which looks up, according to the node disk state report of the cloud computing system, each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, and the state of each such disk; the node disk state report comprises: the states of the disks in each node of the cloud computing system, and the data identifiers corresponding to the data stored on the disks;
An operating unit, which performs a corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
The operating unit comprises:
A first response subunit: the operation request is an update request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined minimum number of participating nodes for one data update request of the cloud computing system, it responds to the update request;
A first rejection subunit: when the number of disks in the cloud computing system that store the data and are in a normal state is less than the predetermined minimum number of participating nodes for one data update request of the cloud computing system, it rejects the update request;
A second response subunit: the operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined number of data copies obtained by one data access request of the cloud computing system, it responds to the data access request;
A second rejection subunit: when the number of disks in the cloud computing system that store the data and are in a normal state is less than the predetermined number of data copies obtained by one data access request of the cloud computing system, it rejects the data access request.
The apparatus further comprises:
A second receiving unit, which receives the node disk state report of the cloud computing system from the nodes.
In another aspect, a cloud computing system is provided, comprising: a client, a processing apparatus, nodes, and the disks corresponding to the nodes;
The processing apparatus receives an operation request from the client to the cloud computing system; obtains, according to the operation request, the data identifier to be operated on in the cloud computing system; looks up, according to the node disk state report of the cloud computing system, each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, and the state of each such disk; the node disk state report comprises: the states of the disks in each node of the cloud computing system, and the data identifiers corresponding to the data stored on the disks; and performs a corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
The node sends the node disk state report to the processing apparatus.
The beneficial effects of the above technical solutions of the present invention are as follows:
The present invention is directed at distributed cache systems. When a disk fails, the available resources can be fully used to assemble a set of copies that meets the consistency and availability requirements, which improves the availability of the system as far as possible and improves the system's tolerance of faults.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the processing method for a cloud computing system according to the present invention;
Fig. 2 is a schematic structural diagram of the processing apparatus for a cloud computing system according to the present invention;
Fig. 3 is a schematic structural diagram of a cloud computing system according to the present invention;
Fig. 4 and Fig. 5 are schematic structural diagrams of application scenarios of a cloud computing system according to the present invention.
Detailed description
To make the technical problem to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the processing method for a cloud computing system according to the present invention comprises:
Step 11: receive an operation request from a client to the cloud computing system; the operation request may be a data update request, a data access request, or the like.
Step 12: according to the operation request, obtain the data identifier to be operated on in the cloud computing system; for example, if the operation request is to update copy 1 in Fig. 4, then copy 1 is the data identifier.
Step 13: according to the node disk state report of the cloud computing system, look up each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, and the state of each such disk; the node disk state report comprises: the states of the disks in each node of the cloud computing system, and the data identifiers corresponding to the data stored on the disks. The state of a disk is either normal or faulty. In Fig. 4, the disk state report of node A is: (node A: disk I, copy 1, faulty; disk II, copy 2, normal; disk III, copy 3, normal).
Step 14: perform a corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
Before step 14, the method further comprises:
Step 10: obtain the node disk state report of the cloud computing system from the nodes. A node sends the report when it detects that a disk storing a piece of data has failed or is failing; alternatively, it sends the report on request.
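A minimal sketch of what a node disk state report and the lookup of step 13 might look like is given below. The record layout, the class name `DiskEntry`, and the function `find_disks` are assumptions made for illustration; the patent only specifies that the report carries disk states and the identifiers of the data stored on each disk. The sample values reproduce the node A report of Fig. 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskEntry:
    """One line of a node disk state report (hypothetical layout)."""
    node: str
    disk: str
    data_id: str
    healthy: bool


# Report of node A as given for Fig. 4: disk I is faulty, disks II and III are normal.
REPORT_NODE_A = [
    DiskEntry("A", "I", "copy 1", False),
    DiskEntry("A", "II", "copy 2", True),
    DiskEntry("A", "III", "copy 3", True),
]


def find_disks(report, data_id):
    """Return (node, disk, healthy) for every disk that stores data_id."""
    return [(e.node, e.disk, e.healthy) for e in report if e.data_id == data_id]


print(find_disks(REPORT_NODE_A, "copy 1"))
```

In a full system the reports of all nodes would be merged before the lookup, so that the result lists one disk per node holding the data.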
Step 14 comprises:
The operation request is an update request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined minimum number of participating nodes for one data update request of the cloud computing system, responding to the update request; otherwise, rejecting the update request; or
The operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined number of data copies obtained by one data access request of the cloud computing system, responding to the data access request; otherwise, rejecting the data access request.
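The respond-or-reject rule of step 14 can be sketched as a single comparison against W for updates and against R for accesses. The function name `decide` and the string labels are hypothetical; only the thresholds come from the text.

```python
def decide(operation: str, healthy_disks: int, r: int, w: int) -> str:
    """Return "respond" or "reject" for one request.

    healthy_disks is the number of disks that store the requested data
    and are in a normal state, as found in the node disk state report.
    """
    if operation == "update":
        required = w  # minimum participating nodes for one update
    elif operation == "access":
        required = r  # copies obtained by one data access
    else:
        raise ValueError(f"unsupported operation: {operation}")
    return "respond" if healthy_disks >= required else "reject"


print(decide("update", healthy_disks=2, r=2, w=2))
print(decide("access", healthy_disks=1, r=2, w=2))
```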
Specifically:
When the operation request is an update request and the state of the disk of the master node that stores the data is normal, the master node of the cloud computing system updates the data on the master node's disk where the data resides; a slave node of the cloud computing system obtains the data to be synchronized from the master node, and the slave node updates the data on the slave node's disk where the data resides;
When the operation request is an update request and the state of the disk of the master node that stores the data is faulty, a first slave node of the cloud computing system updates the data on the first slave node's disk where the data resides; a second slave node of the cloud computing system obtains the data to be synchronized from the first slave node; the second slave node updates the data on the second slave node's disk where the data resides; the states of the disks of the first slave node and the second slave node that store the data are normal.
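The two update cases above can be sketched as a choice of which node applies the update first and which nodes synchronize from it. This is only an illustration: the function `plan_update` is hypothetical, and taking the first healthy slave in list order as the "first slave node" is an assumption, since the text does not say how that node is chosen.

```python
def plan_update(master, slaves, disk_ok):
    """Return (writer, followers) for one update.

    disk_ok maps a node name to True when that node's disk holding the
    data is in a normal state.
    """
    healthy_slaves = [s for s in slaves if disk_ok[s]]
    if disk_ok[master]:
        # Normal case: the master updates first, slaves synchronize from it.
        return master, healthy_slaves
    if not healthy_slaves:
        raise RuntimeError("no healthy disk holds this data")
    # Master disk faulty: the first healthy slave updates first and the
    # remaining healthy slaves synchronize from it.
    return healthy_slaves[0], healthy_slaves[1:]


print(plan_update("B", ["A", "C"], {"A": True, "B": True, "C": True}))
print(plan_update("B", ["A", "C"], {"A": True, "B": False, "C": True}))
```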
When the operation request is a data access request and the state of the disk of the master node that stores the data is normal, obtaining a first copy of the data from the disk where the data resides on the master node of the cloud computing system, and obtaining a second copy of the data from the disk where the data resides on at least one second slave node of the cloud computing system (there may also be two or three such nodes, set according to actual conditions); selecting the copy with the latest version from the first copy and the second copy; and sending the copy with the latest version to the client; the state of the disk of the second slave node that stores the data is normal;
When the operation request is a data access request and the state of the disk of the master node that stores the data is faulty, obtaining a second copy of the data from the disk where the data resides on at least one second slave node of the cloud computing system; selecting the copy with the latest version from the at least one second copy, and sending the copy with the latest version to the client; the state of the disk of the second slave node that stores the data is normal.
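The access cases above can be sketched as follows. The use of an integer version number to identify the latest copy, the function name `read_latest`, and the preference for the master as the first source are assumptions for illustration; the text only requires that the latest-version copy among those obtained is returned.

```python
from typing import Dict, Optional, Tuple


def read_latest(copies: Dict[str, Tuple[int, str]],
                disk_ok: Dict[str, bool],
                master: str,
                r: int) -> Optional[str]:
    """Read r copies from healthy disks and return the newest value.

    copies maps node name -> (version, value). Returns None when fewer
    than r healthy disks hold the data, i.e. the request is rejected.
    """
    sources = [n for n in copies if disk_ok.get(n, False)]
    # Use the master's copy first when its disk is healthy.
    sources.sort(key=lambda n: n != master)
    if len(sources) < r:
        return None
    chosen = sources[:r]
    _, value = max((copies[n] for n in chosen), key=lambda c: c[0])
    return value


copies = {"A": (4, "new"), "B": (4, "new"), "C": (3, "old")}
print(read_latest(copies, {"A": True, "B": False, "C": True}, master="B", r=2))
```

In this example the master's disk is faulty, so the two slave copies are read and the one with the higher version is returned.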
For example, Fig. 5 shows a distributed cache storage system made up of 3 nodes. Each piece of data in this storage system has three copies, and data is updated and accessed in 3/2/2 mode. The number of copies accessed by a read request, as specified by the cloud computing system, is 2. When one of the disks holding the data fails, update and data access requests can still be answered; when two of them fail, the operation request cannot be answered.
In the present invention, when a disk failure occurs on a node, or even when disks fail on several nodes at the same time, the system can guarantee consistency and availability as long as the number of copies on the remaining usable disks in the cluster satisfies the NRW policy. Service for all data may even be unaffected, and the situation in which the system cannot provide service at all does not arise; in other words, service is provided as far as possible.
Of course, continuing to provide service while some disks are faulty raises the problem of restoring data after the disks recover. This can be handled by the data recovery function of the distributed cache, that is, by obtaining copy data from other nodes for repair.
As shown in Fig. 2, the processing apparatus for a cloud computing system according to the present invention comprises:
A first receiving unit 21, which receives an operation request from a client to the cloud computing system;
An acquiring unit 22, which obtains, according to the operation request, the data identifier to be operated on in the cloud computing system;
A lookup unit 23, which looks up, according to the node disk state report of the cloud computing system, each disk that stores the data corresponding to the data identifier in each node of the cloud computing system, and the state of each such disk; the node disk state report comprises: the states of the disks in each node of the cloud computing system, and the data identifiers corresponding to the data stored on the disks;
An operating unit 24, which performs a corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
The operating unit 24 comprises:
A first response subunit: the operation request is an update request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined minimum number of participating nodes for one data update request of the cloud computing system, it responds to the update request;
A first rejection subunit: when the number of disks in the cloud computing system that store the data and are in a normal state is less than the predetermined minimum number of participating nodes for one data update request of the cloud computing system, it rejects the update request;
A second response subunit: the operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in a normal state is greater than or equal to the predetermined number of data copies obtained by one data access request of the cloud computing system, it responds to the data access request;
A second rejection subunit: when the number of disks in the cloud computing system that store the data and are in a normal state is less than the predetermined number of data copies obtained by one data access request of the cloud computing system, it rejects the data access request.
The apparatus further comprises:
A second receiving unit 25, which receives the node disk state report of the cloud computing system from the nodes.
As shown in Fig. 3, a cloud computing system according to the present invention comprises: a client 31, a processing apparatus 32, nodes 33, and the disks 34 corresponding to the nodes 33;
The processing apparatus 32 receives an operation request from the client 31 to the cloud computing system; obtains, according to the operation request, the data identifier to be operated on in the cloud computing system; looks up, according to the node disk state report of the cloud computing system, the disks that store the data corresponding to the data identifier in each node 33 of the cloud computing system, and the state of each such disk 34; the node disk state report comprises: the states of the disks in each node 33 of the cloud computing system, and the data identifiers corresponding to the data stored on the disks; and performs a corresponding operation according to the state of each disk 34 that stores the data corresponding to the data identifier in each node of the cloud computing system.
The node 33 sends the node disk state report to the processing apparatus 32.
Two application scenarios of the present invention are described below.
The first application scenario describes a method for achieving availability when disks fail in a cloud computing distributed cache system with multiple disk paths.
Preliminary step: the client is connected to multiple server nodes in the distributed cache system, and the server nodes are connected to one another and operate normally. Each server has several disks for persisting data, and different data shards are persisted on different disks. The number of data copies is N, the number of copies accessed by a read request is R, and the minimum number of copies that a write request must update successfully is W. The maximum single-failure tolerance of the system is O (meaning that requests on O nodes are allowed to fail; for a single point of failure, O = 1; O < W), and the consistency requirement is W + R > N.
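The constraints stated in the preliminary step can be checked with a few lines of code. The sketch below encodes W + R > N and O < W as given in the text; the additional bounds requiring R and W to lie between 1 and N are an assumption added for completeness, and the function name `valid_nrw` is hypothetical.

```python
def valid_nrw(n: int, r: int, w: int, o: int) -> bool:
    """Check the parameter constraints stated for the first scenario."""
    within_bounds = 1 <= r <= n and 1 <= w <= n  # assumption, not in the text
    consistent = w + r > n                       # consistency requirement
    tolerable = o < w                            # fault tolerance bound
    return within_bounds and consistent and tolerable


print(valid_nrw(n=3, r=2, w=2, o=1))
print(valid_nrw(n=3, r=1, w=1, o=1))
```

The 3/2/2 configuration with O = 1 satisfies both constraints, whereas 3/1/1 does not, because a read of one copy is not guaranteed to overlap a write of one copy.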
Step A: under normal conditions, all disks on every node work normally, and each piece of data has N copies in the system. When the client initiates a data update request, the master updates the data on the disk where the data resides; the slaves synchronize the data from the master and update the data on the slave disks where the data resides; after the data update has completed successfully on W nodes, a data-update-success message is returned to the client;
When the client initiates a data access request, the request is processed by the master/slaves; after the requested data copies have been obtained from the disks where the data resides on R nodes, the latest copy is selected from these R data copies and returned to the client.
Step B: when node A starts up, it finds that a certain disk is faulty and cannot be accessed, while its other disks are still normal; alternatively, while node A is running, it finds that accesses to a certain disk fail repeatedly and judges that this disk is faulty. Node A does not switch to a node-failure state; it continues to provide read/write service and records the faulty disk together with the identifiers of the data copies stored on that disk.
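The node-side behaviour of step B can be sketched as a small class that counts consecutive access failures per disk and marks only that disk as faulty. The class name `CacheNode`, its methods, and the threshold of three consecutive failures are all assumptions; the text only says that repeated access failures lead the node to judge the disk faulty while the node itself stays in service.

```python
class CacheNode:
    """Hypothetical node that tracks disk health instead of failing whole."""

    def __init__(self, name, disks, max_failures=3):
        self.name = name
        # disk name -> identifiers of the data copies stored on it
        self.disks = {disk: set(ids) for disk, ids in disks.items()}
        self.max_failures = max_failures  # assumed threshold
        self.failures = {disk: 0 for disk in self.disks}
        self.faulty = set()

    def record_access(self, disk, ok):
        """Count consecutive failures; mark the disk faulty at the threshold."""
        if ok:
            self.failures[disk] = 0
            return
        self.failures[disk] += 1
        if self.failures[disk] >= self.max_failures:
            self.faulty.add(disk)

    def faulty_data_ids(self):
        """Identifiers of the data copies stored on faulty disks."""
        ids = set()
        for disk in self.faulty:
            ids |= self.disks[disk]
        return ids

    def can_serve(self, data_id):
        """True if a healthy disk on this node holds the data."""
        return any(data_id in ids
                   for disk, ids in self.disks.items()
                   if disk not in self.faulty)


node_a = CacheNode("A", {"I": ["copy 1"], "II": ["copy 2"], "III": ["copy 3"]})
for _ in range(3):
    node_a.record_access("I", ok=False)
print(node_a.faulty_data_ids(), node_a.can_serve("copy 2"))
```

The set returned by `faulty_data_ids` corresponds to the information the node records in step B and would include in its disk state report.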
Step C: when the client initiates a data update request and the data happens to reside on the faulty disk of node A described in step B, node A directly returns failure when the data is updated on the nodes; after the data update has completed successfully on W nodes (node A is not among these W nodes), a data-update-success message is returned to the client;
When the client initiates a data access request, node A directly returns failure and the request is processed by the master/slaves; after the requested data copies have been obtained from the disks where the data resides on R nodes (node A is not among these R nodes), the latest copy is selected from these R data copies and returned to the client.
Step D: when the client initiates data update and access requests and the data does not reside on the faulty disk of node A described in step B, the processing is the same as in step A.
Step E: while node B is running, accesses to a certain disk fail repeatedly and node B judges that this disk is faulty. Node B does not switch to a node-failure state; it continues to provide read/write service and records the faulty disk together with the identifiers of the data copies stored on that disk.
Assume that the copies stored on the faulty disk of node B and those stored on the faulty disk of node A do not overlap. Continue with the next step.
Step F: when the client initiates data update and access requests and the data happens to reside on the faulty disk of node B described in step E, then, by the above assumption, the data does not reside on the faulty disk of node A described in step B. When the data is updated on the nodes, node B directly returns failure; after the data update has completed successfully on W nodes (node B is not among these W nodes), a data-update-success message is returned to the client;
When the client initiates a data access request, node B directly returns failure and the request is processed by the master/slaves; after the requested data copies have been obtained from the disks where the data resides on R nodes (node B is not among these R nodes), the latest copy is selected from these R data copies and returned to the client.
Step G: when the client initiates a data update request and the data happens to reside on the faulty disk of node A described in step B, then, by the above assumption, the data does not reside on the faulty disk of node B described in step E. When the data is updated and accessed on the nodes, the processing is the same as in step C, and the data can be updated and accessed normally.
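Steps B to G can be illustrated with a small three-node layout in which the faulty disks of node A and node B hold different copies. The layout and the names `cluster` and `healthy_copies` are hypothetical; the sketch contrasts per-data availability with the number of nodes that have no faulty disk at all.

```python
R = W = 2  # 3/2/2 mode

# node -> data identifier -> True if the disk holding that copy is normal.
# Assumed layout: node A's faulty disk holds copy 1, node B's holds copy 2.
cluster = {
    "A": {"copy 1": False, "copy 2": True, "copy 3": True},
    "B": {"copy 1": True, "copy 2": False, "copy 3": True},
    "C": {"copy 1": True, "copy 2": True, "copy 3": True},
}


def healthy_copies(cluster, data_id):
    """Number of nodes whose disk holding data_id is in a normal state."""
    return sum(1 for disks in cluster.values() if disks.get(data_id, False))


# Disk-granularity view: every piece of data still has two usable copies.
servable = {d: healthy_copies(cluster, d) >= max(R, W)
            for d in ("copy 1", "copy 2", "copy 3")}

# Node-granularity view: only one node has no faulty disk at all.
fully_healthy_nodes = sum(all(disks.values()) for disks in cluster.values())

print(servable, fully_healthy_nodes)
```

Under node-granularity handling only node C would remain in service and no request could reach two copies; under the disk-granularity handling described in the steps above, all three pieces of data remain readable and writable.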
In the present invention, when a disk failure occurs on a node, or even when disks fail on several nodes at the same time, the system can guarantee consistency and availability as long as the number of copies on the remaining usable disks in the cluster satisfies the NRW policy. Service for all data may even be unaffected, and the situation in which the system cannot provide service at all does not arise; in other words, service is provided as far as possible.
Of course, continuing to provide service while some disks are faulty raises the problem of restoring data after the disks recover. This can be handled by the data recovery function of the distributed cache, that is, by obtaining copy data from other nodes for repair.
The present invention provides a method for improving the availability of a distributed cache system when multiple disks fail. With consistency unchanged, the availability of the system is enhanced, which improves the application experience.
The second application scenario is described below with reference to Fig. 4 and Fig. 5.
Specifically, taking a master/slave storage system in 322 mode as an example, the availability scheme is described in detail for the case where a single node has a disk failure and the case where several nodes have disk failures at the same time.
The distributed cache system consists of server nodes and clients. For a given piece of data, one master node is responsible for processing the update and access requests of clients, and several slave nodes synchronize the data of the master and accept data access requests from clients (a slave does not process data update requests).
Environment: a distributed cache storage system consisting of 3 nodes. Every piece of data in the storage system has three copies, and data are updated and accessed in 322 mode (three copies, with W=2 and R=2 as used in the steps below).
The present invention includes the following steps:
Step 1: Initial normal phase. The system accepts client requests. Suppose the data are located in copy 1 (equivalent to the data identifier described above) on disk I of node A, copy 1 on disk I of node B, and copy 1 on disk III of node C. To simplify the description, assume that copy 1 on node B is the master and the copies on the other two nodes are slaves. Copy 2 on node A is the master, and the copies on the other two nodes are slaves. Copy 3 on node A is the master, and the copies on the other two nodes are slaves.
Step 2: When the client initiates a data update request, the master on node B updates copy 1 on disk I; the slaves synchronize the data from the master and update the disks holding the data on the slaves. After the update has completed successfully on W=2 nodes, a data-update-success message is returned to the client. Because all disks are normal, all copies are in fact updated successfully. When the client initiates a data access request, all three nodes process the request; after copies of the requested data have been obtained from the disks holding the data on R=2 nodes, the data are returned to the client. In fact, the copies on all nodes are read successfully.
Step 3: As shown in Fig. 4, suppose disk I on node A is damaged, making copy 1 on node A unavailable. When the data of a client update request are those held in copy 1 on node A, the master on node B updates copy 1 on disk I; the slave on node C synchronizes the data from the master and updates the copy on disk III of node C. At this point, after the update has completed successfully on W=2 nodes, a data-update-success message is returned to the client;
When the data of a client data access request are those held in copy 1 on node A, node A directly returns failure; after the data have been obtained from copy 1 on node B and node C (satisfying R=2), they are returned to the client.
Step 4: In the situation of Step 3, when the client's update and access requests concern copy 2 or copy 3 on node A, the copies on all three nodes are available, so the processing flow is the same as in Step 2.
Step 5: As shown in Fig. 5, disk II on node B is then damaged, making copy 3 on node B unavailable. When the data of the client's update and access requests are those held in copy 1 on node A, the copies on node B and node C are both available and the NRW policy is satisfied, so the processing flow is the same as in Step 3.
Step 6: In the situation of Step 5, when the client's update and access requests concern copy 2 on node A, copy 2 is available on all three nodes, so the processing flow is the same as in Step 2.
Step 7: In the situation of Step 5, when the data of a client update request are those held in copy 3 on node A, copy 3 on node B is damaged and copy 3 on node C is available. The master on node A updates copy 3 on disk III; the slave on node C synchronizes the data from the master and updates copy 3 on disk II of node C. After the update has completed successfully on W=2 nodes, a data-update-success message is returned to the client;
When the data of a client data access request are those held in copy 3 on node A, node B directly returns failure; after the data have been obtained from copy 3 on node A and node C (satisfying R=2), they are returned to the client.
As can be seen from the above, even when both node A and node B have disk failures, the distributed cache cluster can still provide read and write service for all data as long as the copies on the damaged disks do not overlap.
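The update and access flow of Steps 2 to 7 can be sketched in a few lines. This is an illustrative simplification, not the implementation of the invention: the `Replica` class and the `update` and `read` functions are assumptions introduced here, versions are plain integers, and when the master's disk is faulty the first healthy slave is assumed to take over the write.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Replica:
    """One copy of a piece of data on one node's disk (hypothetical model)."""
    healthy: bool = True
    version: int = 0
    value: Optional[str] = None


def update(replicas: Dict[str, Replica], master: str, value: str, w: int = 2) -> bool:
    healthy = [name for name, rep in replicas.items() if rep.healthy]
    if len(healthy) < w:
        return False  # fewer than W healthy copies: refuse the update
    # The master writes first; if its disk is faulty, a healthy slave writes instead.
    writer = master if replicas[master].healthy else healthy[0]
    new_version = max(replicas[name].version for name in healthy) + 1
    replicas[writer].version = new_version
    replicas[writer].value = value
    # The remaining healthy copies synchronize from the writer.
    for name in healthy:
        if name != writer:
            replicas[name].version = replicas[writer].version
            replicas[name].value = replicas[writer].value
    return True


def read(replicas: Dict[str, Replica], r: int = 2) -> Optional[str]:
    healthy = [rep for rep in replicas.values() if rep.healthy]
    if len(healthy) < r:
        return None  # fewer than R healthy copies: refuse the read
    # Among R copies, the one with the highest version is returned.
    return max(healthy[:r], key=lambda rep: rep.version).value


# Step 7: copy 3 is healthy on nodes A (master) and C, damaged on node B.
copy3 = {"A": Replica(), "B": Replica(healthy=False), "C": Replica()}
ok = update(copy3, master="A", value="hello")
print(ok, read(copy3))
```

In this sketch a refused read and a read of never-written data both yield `None`; a real system would distinguish the two cases with an explicit error.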
In the above application scenario there are two faulty nodes, but each node actually has only a partial disk failure. In the more optimistic case, where the damaged disks do not hold copies of the same data, the available disks of the whole system still hold at least two copies of every piece of data, so the conditions for providing all services normally are fully met. Even if the damaged disks happen to hold copies of the same data, the data available on the other disks still satisfy consistency and availability, and read and write service can be provided for them; read and write access is unavailable only for the data whose copies were damaged at the same time.
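The two cases above can be checked against a node disk status report. The sketch below is an assumption-laden illustration: the report layout (node, then disk, then state and stored data identifiers) and the function names are hypothetical, and the placement of copy 2 on disks is not given in the scenario and is filled in here only so that every disk holds one copy.

```python
from typing import Dict, Set, Tuple

# report[node][disk] = (state, data identifiers stored on that disk)
Report = Dict[str, Dict[str, Tuple[str, Set[str]]]]


def healthy_copies(report: Report, data_id: str) -> int:
    """Count the disks in the normal state that store the given data."""
    return sum(
        1
        for disks in report.values()
        for state, data_ids in disks.values()
        if state == "normal" and data_id in data_ids
    )


def serviceable(report: Report, quorum: int = 2) -> Dict[str, bool]:
    """For each data identifier, tell whether R = W = quorum can still be met."""
    all_ids: Set[str] = set()
    for disks in report.values():
        for _state, data_ids in disks.values():
            all_ids |= data_ids
    return {d: healthy_copies(report, d) >= quorum for d in sorted(all_ids)}


# Fig. 5: disk I of node A and disk II of node B are damaged (different data).
fig5: Report = {
    "A": {"I": ("fault", {"copy1"}), "II": ("normal", {"copy2"}), "III": ("normal", {"copy3"})},
    "B": {"I": ("normal", {"copy1"}), "II": ("fault", {"copy3"}), "III": ("normal", {"copy2"})},
    "C": {"I": ("normal", {"copy2"}), "II": ("normal", {"copy3"}), "III": ("normal", {"copy1"})},
}
print(serviceable(fig5))

# Pessimistic case: the damaged disks on A and B both hold copy 1.
overlap: Report = {
    "A": fig5["A"],
    "B": {"I": ("fault", {"copy1"}), "II": ("normal", {"copy3"}), "III": ("normal", {"copy2"})},
    "C": fig5["C"],
}
print(serviceable(overlap))
```

In the first report every piece of data keeps two healthy copies; in the second only the data whose copies were damaged on both nodes falls below the quorum.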
The beneficial effects of the present invention are as follows:
The present invention is directed at distributed cache systems. When disks fail, it makes full use of the available resources and assembles a set of copies that meets the consistency and availability requirements, improving the availability of the system and its tolerance of faults as far as possible. That is, for distributed cache systems in the field of cloud computing, a disk and data management mechanism is provided so that, even when some disks of a node fail, the data on the available disks can still be used as far as possible and the ability to provide service is retained. The server side can thus provide a consistent and available storage service with fewer disk or data resources.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims (10)

1. A processing method for a cloud computing system, characterized by comprising:
Receiving an operation request from a client to the cloud computing system;
Obtaining, according to the operation request, the data identifier to be operated on in the cloud computing system;
Looking up, according to a node disk status report of the cloud computing system, each disk in each node of the cloud computing system that stores the data corresponding to the data identifier, and the state of each such disk; the node disk status report comprises: the state of the disks in each node of the cloud computing system and the data identifiers corresponding to the data stored on the disks;
Performing the corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
2. The method according to claim 1, characterized in that the step of performing the corresponding operation according to the state of each disk comprises:
The operation request is an update request; when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to a predetermined minimum number of nodes participating in a data update request of the cloud computing system, responding to the update request; otherwise, rejecting the update request; or
The operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to a predetermined number of data copies to be obtained for a data access request of the cloud computing system, responding to the data access request; otherwise, rejecting the data access request.
3. The method according to claim 2, characterized in that the step of responding to the update request when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to the predetermined minimum number of nodes participating in a data update request of the cloud computing system comprises:
When the operation request is an update request and the state of the disk storing the data on the master node is normal, the master node of the cloud computing system updates the disk holding the data on the master node; a slave node of the cloud computing system obtains the data to be synchronized from the master node, and the slave node updates the disk holding the data on the slave node;
When the operation request is an update request and the state of the disk storing the data on the master node is faulty, a first slave node of the cloud computing system updates the disk holding the data on the first slave node; a second slave node of the cloud computing system obtains the data to be synchronized from the first slave node; the second slave node updates the disk holding the data on the second slave node; the state of the disks storing the data on the first slave node and the second slave node is normal.
4. The method according to claim 2, characterized in that the step of responding to the data access request when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to the predetermined number of data copies to be obtained for a data access request of the cloud computing system comprises:
When the operation request is a data access request and the state of the disk storing the data on the master node is normal, obtaining a first copy of the data from the disk holding the data on the master node of the cloud computing system, and obtaining a second copy of the data from the disk holding the data on at least one slave node of the cloud computing system; choosing the copy of the latest version from the first copy and the second copy; and sending the copy of the latest version to the client; the state of the disk storing the data on the slave node is normal;
When the operation request is a data access request and the state of the disk storing the data on the master node is faulty, obtaining a second copy of the data from the disk holding the data on at least one slave node of the cloud computing system; choosing the copy of the latest version from the at least one second copy, and sending the copy of the latest version to the client; the state of the disk storing the data on the slave node is normal.
5. The method according to claim 1, characterized in that, before the step of receiving the operation request of the client, the method further comprises:
Obtaining the node disk status report of the cloud computing system from the nodes.
6. A processing apparatus for a cloud computing system, characterized by comprising:
A first receiving unit, which receives an operation request from a client to the cloud computing system;
An obtaining unit, which obtains, according to the operation request, the data identifier to be operated on in the cloud computing system;
A lookup unit, which looks up, according to a node disk status report of the cloud computing system, each disk in each node of the cloud computing system that stores the data corresponding to the data identifier, and the state of each such disk; the node disk status report comprises: the state of the disks in each node of the cloud computing system and the data identifiers corresponding to the data stored on the disks;
An operating unit, which performs the corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
7. The apparatus according to claim 6, characterized in that the operating unit comprises:
A first response subunit, for the case where the operation request is an update request; when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to a predetermined minimum number of nodes participating in a data update request of the cloud computing system, it responds to the update request;
A first rejection subunit, which rejects the update request when the number of disks in the cloud computing system that store the data and are in the normal state is less than the predetermined minimum number of nodes participating in a data update request of the cloud computing system;
A second response subunit, for the case where the operation request is a data access request; when the number of disks in the cloud computing system that store the data and are in the normal state is greater than or equal to a predetermined number of data copies to be obtained for a data access request of the cloud computing system, it responds to the data access request;
A second rejection subunit, which rejects the data access request when the number of disks in the cloud computing system that store the data and are in the normal state is less than the predetermined number of data copies to be obtained for a data access request of the cloud computing system.
8. The apparatus according to claim 6, characterized by further comprising:
A second receiving unit, which receives the node disk status report of the cloud computing system from the nodes.
9. A cloud computing system, characterized by comprising: a client, a processing apparatus, nodes, and the disks corresponding to the nodes;
The processing apparatus receives an operation request from the client to the cloud computing system; obtains, according to the operation request, the data identifier to be operated on in the cloud computing system; looks up, according to a node disk status report of the cloud computing system, the disks in each node of the cloud computing system that store the data corresponding to the data identifier, and the state of each such disk; the node disk status report comprises: the state of the disks in each node of the cloud computing system and the data identifiers corresponding to the data stored on the disks; and performs the corresponding operation according to the state of each disk that stores the data corresponding to the data identifier in each node of the cloud computing system.
10. The system according to claim 9, characterized in that the node sends the node disk status report to the processing apparatus.
CN201410289531.7A 2014-06-24 2014-06-24 Cloud computing system and processing method and device thereof Active CN105323271B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410289531.7A CN105323271B (en) 2014-06-24 2014-06-24 Cloud computing system and processing method and device thereof
PCT/CN2014/090398 WO2015196692A1 (en) 2014-06-24 2014-11-05 Cloud computing system and processing method and apparatus for cloud computing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410289531.7A CN105323271B (en) 2014-06-24 2014-06-24 Cloud computing system and processing method and device thereof

Publications (2)

Publication Number Publication Date
CN105323271A (en) 2016-02-10
CN105323271B (en) 2020-04-24

Family

ID=54936632

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410289531.7A Active CN105323271B (en) 2014-06-24 2014-06-24 Cloud computing system and processing method and device thereof

Country Status (2)

Country Link
CN (1) CN105323271B (en)
WO (1) WO2015196692A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103747072A (en) * 2013-12-30 2014-04-23 乐视网信息技术(北京)股份有限公司 Data reading and writing method and application server
CN103763155A (en) * 2014-01-24 2014-04-30 国家电网公司 Multi-service heartbeat monitoring method for distributed type cloud storage system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102202087B (en) * 2011-04-25 2015-04-01 中兴通讯股份有限公司 Method for identifying storage equipment and system thereof
CN103257977B (en) * 2012-02-21 2017-07-04 阿里巴巴集团控股有限公司 Obtain the method and device of identification number

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
高洪等: "云计算分布式缓存技术及其在物联网中的应用", 《中兴通讯技术》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108206768A (en) * 2016-12-20 2018-06-26 阿里巴巴集团控股有限公司 Cluster monitoring and switching method and device
CN108173672A (en) * 2017-12-04 2018-06-15 华为技术有限公司 The method and apparatus for detecting failure
CN108173672B (en) * 2017-12-04 2021-06-08 华为技术有限公司 Method and device for detecting fault
CN110321225A (en) * 2019-07-08 2019-10-11 腾讯科技(深圳)有限公司 Load-balancing method, meta data server and computer readable storage medium
CN110321225B (en) * 2019-07-08 2021-04-30 腾讯科技(深圳)有限公司 Load balancing method, metadata server and computer readable storage medium
CN113485648A (en) * 2021-07-14 2021-10-08 华能吉林发电有限公司 Storage resource control system based on cloud platform

Also Published As

Publication number Publication date
WO2015196692A1 (en) 2015-12-30
CN105323271B (en) 2020-04-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant