WO2014101108A1 - Cache method, node and computer-readable medium for a distributed storage system - Google Patents

Cache method, node and computer-readable medium for a distributed storage system

Info

Publication number
WO2014101108A1
Authority
WO
WIPO (PCT)
Prior art keywords
lock
client node
owner
striped data
node
Application number
PCT/CN2012/087842
Other languages
English (en)
French (fr)
Inventor
郭洪星 (Guo Hongxing)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2012/087842 (WO2014101108A1)
Priority to CA2896123A (granted as CA2896123C)
Priority to EP12891167.4A (granted as EP2830284B1)
Priority to CN201280003290.4A (granted as CN103392167B)
Priority to AU2012398211A (granted as AU2012398211B2)
Priority to JP2015514321A (granted as JP6301318B2)
Publication of WO2014101108A1
Priority to US14/509,471 (granted as US9424204B2)

Classifications

    • H04L 67/568: Storing data temporarily at an intermediate stage, e.g. caching (H: Electricity; H04L: Transmission of digital information; H04L 67/00: Network arrangements or protocols for supporting network services or applications; H04L 67/50: Network services; H04L 67/56: Provisioning of proxy services)
    • G06F 12/1466: Key-lock mechanism (G: Physics; G06F: Electric digital data processing; G06F 12/00: Accessing, addressing or allocating within memory systems or architectures; G06F 12/14: Protection against unauthorised use of memory; G06F 12/1458: Checking the subject access rights)
    • G06F 11/1076: Parity data used in redundant arrays of independent storages, e.g. in RAID systems (G06F 11/00: Error detection; error correction; monitoring; G06F 11/07: Responding to the occurrence of a fault; G06F 11/08: Error detection or correction by redundancy in data representation; G06F 11/10: Adding special bits or symbols, e.g. parity check)
    • H04L 65/40: Support for services or applications (H04L 65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication)
    • H04L 67/1097: Protocols for distributed storage of data in networks, e.g. network file system [NFS], storage area networks [SAN] or network attached storage [NAS] (H04L 67/01: Protocols; H04L 67/10: Protocols in which an application is distributed across nodes in the network)
    • G06F 2211/1028: Distributed RAID systems with parity (indexing scheme relating to G06F 11/1076)
    • G06F 2212/1044: Space efficiency improvement (indexing scheme: resource optimization)
    • G06F 2212/154: Networked environment (indexing scheme: use in a specific computing environment)

Description

  • Embodiments of the present invention relate to storage technologies, and in particular to a cache method, a node, and a computer-readable medium for a distributed storage system.

Background Art
  • In a distributed storage system, multiple node devices are connected to form a cluster, and each node device has a data storage function. All node devices are connected through a front-end network and a back-end network. The front-end network is used for request and data interaction between user services and the distributed storage system; the back-end network is used for request and data interaction between the node devices within the distributed storage system.
  • In a distributed storage system, user data is striped to obtain stripes, and the strips of each stripe are distributed across the hard disks of different node devices. When accessing user data, an application server first sends an access request to a node device through the front-end network. That node device reads the strips of the user data from the other node devices over the back-end network, restores the strips to user data using a Redundant Array of Independent Disks (RAID) algorithm or an erasure code algorithm, and returns the user data to the application server through the front-end network.
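  • For illustration, the following minimal sketch shows how data can be split into strips plus a parity strip and how a lost strip is rebuilt, assuming a single-parity XOR scheme in the RAID-5 style; the patent itself does not prescribe a particular RAID or erasure-code layout.

```python
# Minimal RAID-5-style striping sketch (illustrative assumption; the patent
# covers RAID and erasure-code algorithms generally, not this layout).

def stripe(data: bytes, n_strips: int) -> list:
    """Split data into n_strips equal strips plus one XOR parity strip."""
    strip_len = -(-len(data) // n_strips)              # ceiling division
    padded = data.ljust(strip_len * n_strips, b"\x00")
    strips = [padded[i * strip_len:(i + 1) * strip_len] for i in range(n_strips)]
    parity = bytes(strip_len)                          # all-zero accumulator
    for s in strips:
        parity = bytes(x ^ y for x, y in zip(parity, s))
    return strips + [parity]                           # last strip is parity

def rebuild(strips: list) -> list:
    """Recover the single missing strip (marked None) by XOR of the rest."""
    missing = strips.index(None)
    strip_len = len(next(s for s in strips if s is not None))
    acc = bytes(strip_len)
    for s in strips:
        if s is not None:
            acc = bytes(x ^ y for x, y in zip(acc, s))
    strips[missing] = acc
    return strips

pieces = stripe(b"user data to be striped", n_strips=4)
lost = pieces[2]
pieces[2] = None                                       # simulate a failed node
assert rebuild(pieces)[2] == lost
```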
  • One caching method is as follows: each node device caches its local hotspot data blocks in its own cache. When a node device needs to obtain the data blocks that make up a piece of stripe data, it fetches them from the caches of the node devices; any block that cannot be found in a node's cache is read from that node's hard disk. The obtained blocks are then assembled, reconstructed, and redundancy-checked to yield the stripe data.
  • Another caching method is as follows: each node device caches, in its own cache, the hotspot files identified by its local statistics. When a node device needs stripe data, it first looks in its own cache. If the stripe data is not there, the node queries the caches of the other node devices in turn; if it is in none of the caches, it is read from the hard disks of the node devices.
  • The data caching technology adopted in existing distributed storage systems is one of the above two methods or a combination of both. In either case, each node device independently determines, from its access statistics, the hot content among the content stored on its hard disk and caches that content. Because every node device performs this caching independently, the same content may be cached on different node devices, which lowers the cache utilization of the node devices.

Summary of the Invention
  • A first aspect of the present invention provides a caching method for a distributed storage system, to address the deficiencies of the prior art and improve the cache utilization of node devices.
  • Another aspect of the present invention provides a stripe data owner server node and a lock client node, likewise to address the deficiencies of the prior art and improve the cache utilization of node devices.
  • Still another aspect of the present invention provides a computer-readable medium for the same purpose.
  • The first aspect of the present invention provides a caching method for a distributed storage system, including: a stripe data owner server node receives a lock notification for stripe data from a lock client node and examines the lock notification;
  • when the lock notification is a first read lock notification or a write lock notification, the stripe data owner server node records the lock client node as the owner of the stripe data and returns to the lock client node a response message indicating that the owner of the stripe data is that lock client node, so that the lock client node caches the stripe data;
  • when the lock notification is a non-first read lock notification, the stripe data owner server node returns to the lock client node a response message containing the owner information of the stripe data, so that the lock client node reads the stripe data from the cache of the owner of the stripe data.
  • The aspect above further provides an implementation in which the method further includes:
  • when the lock notification is a read lock notification, the stripe data owner server node searches the recorded attribute information of stripe data for the attribute information of this stripe data; if it is not found, the read lock notification is determined to be the first read lock notification. A sketch of this decision logic is given below.
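  • The following minimal sketch shows this owner-side decision logic; the class and field names (StripeOwnerServer, LockNotice, attrs) are assumptions for illustration, not the patent's implementation.

```python
# Sketch of the stripe data owner server node's handling of lock notifications
# (illustrative; the patent specifies the behavior, not this implementation).

from dataclasses import dataclass

@dataclass
class LockNotice:
    stripe_id: str
    client_id: str
    kind: str                       # "read" or "write"

class StripeOwnerServer:
    def __init__(self):
        self.attrs = {}             # stripe_id -> {"owner": client_id}

    def on_lock_notice(self, n: LockNotice) -> dict:
        rec = self.attrs.get(n.stripe_id)
        first_read = n.kind == "read" and rec is None   # no attribute record yet
        if n.kind == "write" or first_read:
            # First read lock or a write lock: the requester becomes the owner
            # and will cache the stripe data itself.
            self.attrs[n.stripe_id] = {"owner": n.client_id}
            return {"owner": n.client_id}
        # Non-first read lock: point the requester at the recorded owner.
        return {"owner": rec["owner"]}
```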
  • The method may further include: the stripe data owner server node receives, from the lock client node, a request message to change the owner of the stripe data to another lock client node;
  • the stripe data owner server node changes the owner of the stripe data to the other lock client node;
  • the stripe data owner server node returns an owner-change success response message for the stripe data to the lock client node, so that the lock client node deletes the locally cached stripe data and the other lock client node caches the stripe data.
  • The aspect above and any possible implementation manner further provide an implementation in which, when the stripe data owner server node is integrated into the lock server node device:
  • the read lock notification is carried in a read lock request;
  • the write lock notification is carried in a write lock request;
  • returning the response message indicating that the owner of the stripe data is the lock client node further includes returning a lock success response message to the lock client node;
  • returning the response message containing the owner information of the stripe data further includes returning a lock success response message to the lock client node.
  • Another aspect of the present invention provides a caching method for a distributed storage system, including: a lock client node sends a read lock notification or a write lock notification for stripe data to a stripe data owner server node;
  • when the lock client node receives a response message indicating that the owner of the stripe data is the lock client node itself, the lock client node caches the stripe data;
  • when the lock client node receives an identifier (ID) of the owner of the stripe data returned by the stripe data owner server node, the lock client node compares the owner's ID with its own ID; when the two differ, the lock client node reads the stripe data from the cache of the lock client node corresponding to the owner's ID, as sketched below.
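  • On the client side, the ID comparison and the two resulting paths might look like the following sketch, which reuses the StripeOwnerServer and LockNotice names from the sketch above; the peer registry and the construct callback stand in for the back-end network messaging and are assumptions.

```python
# Sketch of the lock client node's read path (illustrative; reuses the
# StripeOwnerServer/LockNotice sketch above).

class LockClientNode:
    peers = {}                      # node_id -> LockClientNode (stand-in registry)

    def __init__(self, node_id, owner_server):
        self.node_id = node_id
        self.owner_server = owner_server
        self.cache = {}             # stripe_id -> stripe data
        LockClientNode.peers[node_id] = self

    def read_stripe(self, stripe_id, construct):
        resp = self.owner_server.on_lock_notice(
            LockNotice(stripe_id, self.node_id, "read"))
        owner_id = resp["owner"]
        if owner_id == self.node_id:
            # Owner ID equals our own ID: we are the owner, so construct the
            # stripe data (if not yet cached) and cache it locally.
            self.cache.setdefault(stripe_id, construct(stripe_id))
            return self.cache[stripe_id]
        # IDs differ: read from the owner's cache (stands in for the
        # back-end network read request).
        return LockClientNode.peers[owner_id].cache[stripe_id]

server = StripeOwnerServer()
a, b = LockClientNode("A", server), LockClientNode("B", server)
build = lambda sid: f"stripe<{sid}>"
assert a.read_stripe("s1", build) == "stripe<s1>"   # A becomes owner and caches
assert b.read_stripe("s1", build) == "stripe<s1>"   # B reads from A's cache
```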
  • The aspect above and any possible implementation manner further provide an implementation in which, after the lock client node sends the read lock notification or write lock notification for the stripe data, and before it receives the response message indicating that the owner of the stripe data is the lock client node itself, the method further includes:
  • the stripe data owner server node records the lock client node as the owner of the stripe data upon receiving the first read lock notification or a write lock notification for the stripe data.
  • If the lock client node sends a write lock notification for the stripe data to the stripe data owner server node, then after the lock client node caches the stripe data it sends a lock downgrade request to the lock server node, so that the lock server node modifies its record to show the lock client node holding a read lock on the stripe data.
  • The method may further include: the lock client node receives a read request or a write request for the stripe data from an application server;
  • the lock client node locally searches for a read lock or a write lock on the stripe data; if one is found, it determines the owner of the stripe data from the read lock or write lock and reads or writes the stripe data at the owner;
  • if none is found, it performs the step of sending a read lock notification or a write lock notification for the stripe data to the stripe data owner server node.
  • The method may further include: the lock client node sends to the stripe data owner server node a request message to change the owner of the stripe data to a target lock client node, so that the stripe data owner server node changes the owner of the stripe data to the target lock client node;
  • the lock client node receives the owner-change success response message for the stripe data returned by the stripe data owner server node, deletes the locally cached stripe data, and sends the stripe data to the target lock client node, so that the target lock client node caches the stripe data.
  • Reading the stripe data specifically includes:
  • the lock client node sends a read request for the stripe data to the lock client node corresponding to the owner's ID; that lock client node searches its locally cached data for the stripe data and, if it is found, returns the stripe data to the requesting lock client node; otherwise it reads each strip of the stripe data from the lock client nodes of the distributed storage system, constructs the stripe data, and returns it to the requesting lock client node.
  • Another aspect of the present invention provides a stripe data owner server node, including: a receiving unit, configured to receive a lock notification for stripe data from a lock client node; and a judging unit, configured to examine the lock notification;
  • a recording unit, configured to record the lock client node as the owner of the stripe data when the judging unit determines that the lock notification is a first read lock notification or a write lock notification;
  • a sending unit, configured to: when the judging unit determines that the lock notification is a first read lock notification or a write lock notification, return to the lock client node a response message indicating that the owner of the stripe data is that lock client node, so that the lock client node caches the stripe data; and, when the judging unit determines that the lock notification is a non-first read lock notification, return to the lock client node a response message containing the owner information of the stripe data, so that the lock client node reads the stripe data from the cache of the owner of the stripe data.
  • The judging unit is specifically configured to: when the lock notification is a read lock notification, search the recorded attribute information of stripe data for the attribute information of this stripe data, and if it is not found, determine that the read lock notification is the first read lock notification.
  • The receiving unit is further configured to receive, from the lock client node, a request message to change the owner of the stripe data to another lock client node;
  • the recording unit is further configured to change the owner of the stripe data to the other lock client node;
  • the sending unit is further configured to return an owner-change success response message for the stripe data to the lock client node, so that the lock client node deletes the locally cached stripe data and the other lock client node caches the stripe data.
  • The read lock notification may be carried in a read lock request, and the write lock notification may be carried in a write lock request;
  • in that case the sending unit is further configured to return a lock success response message to the lock client node.
  • Another aspect of the present invention provides a lock client node, including:
  • a sending unit, configured to send a read lock notification or a write lock notification for stripe data to the stripe data owner server node;
  • a receiving unit, configured to receive, from the stripe data owner server node, either a response message indicating that the owner of the stripe data is this lock client node, or the ID of the owner of the stripe data;
  • a comparing unit, configured to compare the ID of the owner of the stripe data with the node's own ID and, when the two differ, activate the read/write unit;
  • a cache unit, configured to cache the stripe data when the receiving unit receives the response message indicating that the owner of the stripe data is this lock client node;
  • a read/write unit, configured to read the stripe data from the cache of the lock client node corresponding to the ID of the owner of the stripe data.
  • The sending unit is further configured to send a lock downgrade request to the lock server node, so that the lock server node modifies its record to show this lock client node holding a read lock on the stripe data.
  • The lock client node may further include a search unit;
  • the receiving unit is further configured to receive a read request or a write request for the stripe data from an application server;
  • the search unit is configured to locally search for a read lock or a write lock on the stripe data; if one is found, it determines the owner of the stripe data from the lock and activates the read/write unit; if none is found, it activates the sending unit;
  • the read/write unit is further configured to read or write the stripe data at the owner of the stripe data.
  • The cache unit is further configured to: when the deletion rate of the cache in a unit time is greater than or equal to a preset proportion of the total cache amount of the lock client node, control the sending unit to send to the stripe data owner server node a request message to change the owner of the stripe data to a target lock client node; and, according to the owner-change success response message received by the receiving unit, delete the locally cached stripe data and control the sending unit to send the stripe data to the target lock client node;
  • the sending unit is further configured to send, under the control of the cache unit, the request message to change the owner of the stripe data to the target lock client node, so that the stripe data owner server node changes the owner of the stripe data to the target lock client node; and to send, under the control of the cache unit, the stripe data to the target lock client node, so that the target lock client node caches the stripe data;
  • the receiving unit is further configured to receive the owner-change success response message for the stripe data returned by the stripe data owner server node.
  • The read/write unit is specifically configured to send a read request for the stripe data to the lock client node corresponding to the ID of the owner of the stripe data, so that that lock client node searches its locally cached data for the stripe data and, if it is found, returns the stripe data to the read/write unit; otherwise it reads each strip of the stripe data from the lock client nodes of the distributed storage system, constructs the stripe data, and returns it to the read/write unit.
  • Another aspect of the present invention provides a stripe data owner server node, including: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface communicate via the bus;
  • the memory is configured to store execution instructions, and the communication interface is configured to communicate with a first lock client node and a second lock client node; when the stripe data owner server node runs, the processor executes the execution instructions stored in the memory, causing the stripe data owner server node to perform the caching method of the distributed storage system described above.
  • Another aspect of the present invention provides a lock client node, including: a processor, a memory, a communication interface, and a bus, where the processor, the memory, and the communication interface communicate via the bus; the memory is configured to store execution instructions, and the communication interface is configured to communicate with the stripe data owner server node and other lock client nodes; when the lock client node runs, the processor executes the execution instructions stored in the memory, causing the lock client node to perform the caching method of the distributed storage system described above.
  • Another aspect of the present invention provides a computer-readable medium containing computer-executable instructions for causing a stripe data owner server node to perform any of the methods above.
  • Another aspect of the present invention provides a computer-readable medium containing computer-executable instructions for causing a lock client node to perform any of the methods above.
  • It can be seen from the above technical solutions that, when the stripe data owner server node receives the first read lock request or a write lock request for stripe data, the lock client node that initiated the request is recorded as the owner of the stripe data and caches the stripe data; when the stripe data owner server node receives a non-first read lock request for the stripe data, it informs the requesting lock client node of the recorded owner, and the requesting lock client node reads the stripe data from the owner's cache.
  • Because the stripe data owner server node records the owner of each piece of stripe data and feeds it back, the same stripe data is cached only once in the entire distributed storage system, at its owner, and every lock client node can read the stripe data from that owner. This avoids caching the same stripe data on different node devices and improves the cache utilization of the node devices.
Brief Description of the Drawings
  • FIG. 1 is a schematic structural diagram of a distributed storage system according to Embodiment 1 of the present invention;
  • FIG. 2a is a flowchart of a caching method for a distributed storage system according to Embodiment 2 of the present invention;
  • FIG. 2b is a schematic structural diagram of the attribute information of N pieces of stripe data according to Embodiment 2 of the present invention;
  • FIG. 3 is a flowchart of a caching method for a distributed storage system according to Embodiment 3 of the present invention;
  • FIG. 4 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 4 of the present invention;
  • FIG. 5 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 5 of the present invention;
  • FIG. 6 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 6 of the present invention;
  • FIG. 7 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 7 of the present invention;
  • FIG. 8 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 8 of the present invention;
  • FIG. 9 is a schematic structural diagram of a stripe data owner server node according to Embodiment 9 of the present invention;
  • FIG. 10 is a schematic structural diagram of a lock client node according to Embodiment 10 of the present invention;
  • FIG. 11 is a schematic structural diagram of a stripe data owner server node according to Embodiment 11 of the present invention;
  • FIG. 12 is a schematic structural diagram of a lock client node according to Embodiment 12 of the present invention.
Detailed Description
  • In the embodiments of the present invention, the cache process for accessing data that has already been stored on the hard disks of the node devices of the distributed storage system is taken as an example.
  • FIG. 1 is a schematic structural diagram of a distributed storage system according to Embodiment 1 of the present invention. The caching methods of the following embodiments of the present invention can all be applied to the distributed storage system of Embodiment 1.
  • A distributed storage system in the prior art includes a plurality of lock client nodes and a plurality of lock server nodes. The distributed storage system of Embodiment 1 of the present invention adds one or more stripe data owner server nodes to such a system. A single operation involves multiple lock client nodes, one lock server node, and one stripe data owner server node.
  • As shown in FIG. 1, multiple lock client nodes, one lock server node, and one stripe data owner server node of the distributed storage system are drawn; the remaining lock server nodes and stripe data owner server nodes are not shown.
  • In FIG. 1 the lock client node, the lock server node, and the stripe data owner server node are drawn as separate devices by way of example. In practice, a lock client node may be set in every node device, the lock server node set in one of the node devices serves as the lock server node for the operation, and the stripe data owner server node set in one of the node devices serves as the stripe data owner server node for the operation.
  • The stripe data owner server node and the lock server node can be set separately and perform their respective functions independently; alternatively, the stripe data owner server node can be combined with the lock server node, that is, an existing lock server node is modified so that, in addition to the operations it already performs, it also performs the operations of the stripe data owner server node proposed by the embodiments of the present invention.
  • FIG. 2a is a flowchart of a caching method for a distributed storage system according to Embodiment 2 of the present invention. As shown in FIG. 2a, the method includes the following process.
  • Step 101: The stripe data owner server node receives a lock notification for stripe data from a lock client node and examines the lock notification.
  • Step 102: When the lock notification is a first read lock notification or a write lock notification, the stripe data owner server node records the lock client node as the owner of the stripe data and returns to the lock client node a response message indicating that the owner of the stripe data is that lock client node, so that the lock client node caches the stripe data.
  • Take as an example the case where the stripe data owner server node receives, from a first lock client node, the first read lock notification or a write lock notification for the stripe data. The stripe data owner server node records the first lock client node as the owner of the stripe data and returns to the first lock client node a response message indicating that the owner of the stripe data is the first lock client node, so that the first lock client node caches the stripe data.
  • After obtaining the read lock authorization or write lock authorization from the lock server node, the first lock client node sends a lock notification to the owner server node. The notification carries the identifier of the stripe data to which the lock corresponds; when the lock is a read lock it carries the read lock identifier, and when the lock is a write lock it carries the write lock identifier. Through this lock notification, the owner server node learns that the first lock client node has obtained a read lock or a write lock on the stripe data.
  • The attribute information of stripe data is recorded in the stripe data owner server node. The owner server node receives the lock notification sent by the first lock client node and searches the recorded attribute information according to the identifier of the stripe data carried in the notification. If the lock notification carries the read lock identifier and no read lock for the stripe data is found in the recorded attribute information, the owner server node concludes that it has received the first read lock notification for the stripe data.
  • FIG. 2b is a schematic structural diagram of the attribute information of N pieces of stripe data in Embodiment 2 of the present invention. The attribute information of each piece of stripe data records the identifier (ID) of the stripe data and, correspondingly, the IDs of the lock client nodes currently holding locks on the stripe data, the type of each lock, and the owner of the stripe data.
  • The lock type indicates whether a current lock on the stripe data is a read lock or a write lock. A lock client node holding a read lock on stripe data can read the stripe data, and a lock client node holding a write lock on stripe data can write or modify the stripe data. A write lock is of a higher level than a read lock: when a lock client node holds a read lock on stripe data, other lock client nodes may also hold read locks on it; when a lock client node holds a write lock on stripe data, no other lock client node may hold a read lock or a write lock on it.
  • In FIG. 2b, the lock client nodes currently holding locks on each piece of stripe data are recorded in its attribute information. It can be understood that the embodiment of the present invention places no restriction on which lock client nodes currently hold locks on a piece of stripe data. A possible shape for such a record, together with the lock compatibility rule, is sketched below.
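  • The attribute record of FIG. 2b can be pictured as the following structure; the field names are assumptions, and the may_grant check encodes the shared-read, exclusive-write rule just described.

```python
# Sketch of a per-stripe attribute record (field names assumed; FIG. 2b
# specifies the recorded content, not a concrete layout).

from dataclasses import dataclass, field

@dataclass
class StripeAttr:
    stripe_id: str
    owner_id: str | None = None                   # node that caches the stripe
    holders: dict = field(default_factory=dict)   # client_id -> "read"|"write"

    def may_grant(self, client_id: str, kind: str) -> bool:
        """Read locks are shared; a write lock excludes every other lock."""
        others = {c: k for c, k in self.holders.items() if c != client_id}
        if kind == "read":
            return all(k == "read" for k in others.values())
        return not others                         # write needs no other holder

attr = StripeAttr("s1", owner_id="A", holders={"A": "read"})
assert attr.may_grant("B", "read")                # read locks coexist
assert not attr.may_grant("B", "write")           # write must recall/wait
```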
  • Step 103: When the lock notification is a non-first read lock notification, the stripe data owner server node returns to the lock client node a response message containing the owner information of the stripe data, so that the lock client node reads the stripe data from the cache of the owner of the stripe data.
  • Take as an example the case where the stripe data owner server node receives a non-first read lock notification for the stripe data from a second lock client node. The stripe data owner server node returns to the second lock client node a response message indicating that the owner of the stripe data is the first lock client node, so that the second lock client node reads the stripe data from the cache of the first lock client node.
  • In this embodiment, when the stripe data owner server node receives the first read lock notification or a write lock notification for stripe data, the lock client node that initiated the notification is recorded as the owner of the stripe data and caches the stripe data; when the stripe data owner server node receives a non-first read lock notification for the stripe data, it informs the notifying lock client node of the recorded owner, and that lock client node reads the stripe data from the owner's cache. In this way each lock client node can read the stripe data from the owner, the phenomenon of the same stripe data being cached on different node devices is avoided, and the cache utilization of the node devices is improved.
  • Further, the method may also include: when the lock notification is a read lock notification, the stripe data owner server node searches the recorded attribute information of stripe data for the attribute information of this stripe data; if it is not found, the read lock notification is determined to be the first read lock notification.
  • Further, the method may also include: the stripe data owner server node receives, from the first lock client node, a request message to change the owner of the stripe data to a third lock client node; the stripe data owner server node changes the owner of the stripe data to the third lock client node and returns an owner-change success response message for the stripe data to the first lock client node, so that the first lock client node deletes the locally cached stripe data and the third lock client node caches the stripe data.
  • In this way the ownership of the stripe data is transferred to the third lock client node, implementing dynamic replacement of the owner of the stripe data. This reduces the cache load of the first lock client node and better balances the load of the stripe data cache.
  • When the stripe data owner server node is integrated into the lock server node device, the read lock notification may be a read lock request and the write lock notification may be a write lock request.
  • In that case, returning to the first lock client node the response message indicating that the owner of the stripe data is the first lock client node further includes returning a lock success response message to the first lock client node; and returning to the second lock client node the response message indicating that the owner of the stripe data is the first lock client node further includes returning a lock success response message to the second lock client node.
  • Because the lock notification is carried by the read lock request or write lock request sent to the lock server node device, there is no need to send a separate notification in addition to the read lock request or write lock request. This reduces the signaling between system devices and improves the read and write efficiency of the stripe data.
  • FIG. 3 is a flowchart of a caching method for a distributed storage system according to Embodiment 3 of the present invention. As shown in FIG. 3, the method includes the following process.
  • Step 201: A lock client node sends a read lock notification or a write lock notification for stripe data to the stripe data owner server node.
  • Step 202: When the lock client node receives a response message from the stripe data owner server node indicating that the owner of the stripe data is the lock client node itself, the lock client node caches the stripe data.
  • Step 203: When the lock client node receives the ID of the owner of the stripe data returned by the stripe data owner server node, the lock client node compares the owner's ID with its own ID; when the two differ, the lock client node reads the stripe data from the cache of the lock client node corresponding to the owner's ID.
  • Further, after the lock client node sends the read lock notification or write lock notification for the stripe data to the stripe data owner server node, and before it receives the response message indicating that the owner of the stripe data is the lock client node itself, the method further includes: the stripe data owner server node records the lock client node as the owner of the stripe data upon receiving the first read lock notification or a write lock notification for the stripe data.
  • Further, the method may also include: if the lock client node sends a write lock notification to the stripe data owner server node, then after the lock client node caches the stripe data it sends a lock downgrade request to the lock server node, so that the lock server node modifies its record to show the lock client node holding a read lock on the stripe data.
  • Ordinarily, after a lock client node holding a write lock finishes caching the stripe data, if it keeps holding the write lock, any other lock client applying for a lock must first have the write lock recalled. In the embodiment of the present invention, the lock client node holding the write lock proactively sends a downgrade request to the lock server after caching the stripe data, demoting the write lock to a read lock. When another lock client later applies for a read lock, no recall operation is needed, which saves time at the start of subsequent read lock operations and improves cache processing efficiency. A sketch of this write-then-downgrade sequence follows.
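  • The write-then-downgrade sequence can be condensed as follows; the LockServer class and helper names are hypothetical, and recall handling is omitted for brevity.

```python
# Sketch of the proactive lock downgrade after a write (illustrative).

from types import SimpleNamespace

class LockServer:
    def __init__(self):
        self.locks = {}   # stripe_id -> {client_id: "read" | "write"}

    def grant(self, stripe_id, client_id, kind):
        # Recall of conflicting locks is omitted in this sketch.
        self.locks.setdefault(stripe_id, {})[client_id] = kind

    def downgrade(self, stripe_id, client_id):
        if self.locks.get(stripe_id, {}).get(client_id) == "write":
            self.locks[stripe_id][client_id] = "read"

def write_stripe(client, lock_server, stripe_id, data):
    lock_server.grant(stripe_id, client.node_id, "write")
    client.cache[stripe_id] = data    # the writer becomes owner and caches
    # Proactive downgrade: later read lock requests need no recall round-trip.
    lock_server.downgrade(stripe_id, client.node_id)

client = SimpleNamespace(node_id="B", cache={})
ls = LockServer()
write_stripe(client, ls, "s1", b"new data")
assert ls.locks["s1"]["B"] == "read"  # already a read lock for future readers
```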
  • Further, the method may also include: the lock client node receives a read request or a write request for the stripe data from the application server.
  • The lock client node locally searches for a read lock or a write lock on the stripe data. If one is found, the owner of the stripe data is determined from the read lock or write lock, and the stripe data is read or written at the owner. If none is found, the step of sending a read lock notification or a write lock notification for the stripe data to the stripe data owner server node is performed.
  • Further, the method may also include: when the deletion rate of the lock client node's cache in a unit time is greater than or equal to a preset proportion of the total cache amount of the lock client node, the lock client node sends to the stripe data owner server node a request message to change the owner of the stripe data to a target lock client node, so that the stripe data owner server node changes the owner of the stripe data to the target lock client node.
  • The lock client node receives the owner-change success response message for the stripe data returned by the stripe data owner server node, deletes the locally cached stripe data, and sends the stripe data to the target lock client node, so that the target lock client node caches the stripe data.
  • This remedies the reduced cache efficiency caused by frequent cache deletion on a lock client node, keeps cache usage balanced across the lock client nodes of the whole system, and improves cache usage efficiency.
  • The lock client node reading the stripe data from the lock client node corresponding to the owner's ID specifically includes: the lock client node sends a read request for the stripe data to the lock client node corresponding to the owner's ID; that node searches its locally cached data for the stripe data and, if it is found, returns the stripe data to the requesting lock client node; otherwise it reads each strip of the stripe data from the lock client nodes of the distributed storage system, constructs the stripe data, and returns it to the requesting lock client node. The owner-side handling is sketched below.
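  • Owner-side handling of a peer's read request, as described in the paragraph above, reduces to a cache lookup with a reconstruction fallback; in this sketch the strip-reading and construction step is delegated to a caller-supplied function, which is an assumption.

```python
# Sketch of the owner node serving a peer's read request for stripe data
# (illustrative; reconstruction is delegated to a supplied function).

def serve_stripe_read(owner_cache: dict, stripe_id: str, reconstruct):
    data = owner_cache.get(stripe_id)
    if data is None:
        # Cache miss (e.g. the stripe was evicted): re-read every strip from
        # the lock client nodes, rebuild the stripe, and cache it again.
        data = reconstruct(stripe_id)
        owner_cache[stripe_id] = data
    return data

cache = {}
rebuilt = serve_stripe_read(cache, "s1", lambda sid: f"stripe<{sid}>")
assert rebuilt == "stripe<s1>" and cache["s1"] == rebuilt   # hit next time
```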
  • In this embodiment, when a lock client node issues the first read lock notification or a write lock notification for a piece of stripe data, that lock client node caches the stripe data, and the stripe data owner server node records it as the owner of the stripe data; when a lock client node issues a non-first read lock notification for the stripe data, the stripe data owner server node informs it of the recorded owner, and it reads the stripe data from the owner's cache.
  • Through the recording and feedback of the stripe data owner by the stripe data owner server node, the same stripe data is cached only once in the entire distributed storage system, at its owner, and each lock client node can read the stripe data from the owner. This avoids caching the same stripe data on different node devices and improves the cache utilization of the node devices.
  • Embodiments 2 and 3 of the present invention introduced the caching method for the case where the owner server node and the lock server node are set separately. As noted above, the owner server node and the lock server node may also be combined, with the same node performing the operations of both. Specifically, an existing lock server node may be modified to perform, in addition to its existing operations, the operations of the owner server node proposed by the embodiments of the present invention.
  • The following embodiments take this combined case as an example, that is, the improved lock server node performs both the operations of the existing lock server node and the operations of the owner server node proposed by the embodiments of the present invention, to further describe the caching method proposed by the present invention.
  • The lock server nodes described in Embodiments 4 to 8 of the present invention are such improved lock server nodes, and the lock requests sent by the lock client nodes to the lock server node serve as the lock notifications: when the lock server node receives a read lock request, it performs the operations of the existing lock server node and also treats the read lock request as a read lock notification; when it receives a write lock request, it performs the operations of the existing lock server node and also treats the write lock request as a write lock notification.
  • FIG. 4 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 4 of the present invention. This embodiment covers the case where a read lock operation on a piece of stripe data occurs for the first time in the distributed storage system, or occurs for the first time after the stripe data has been deleted. As shown in FIG. 4, the method includes the following process.
  • Step 301: The first lock client node receives a read request for the stripe data from the application server.
  • After receiving the read request, the first lock client node locally searches for a read lock on the stripe data. If one is found, it reads the stripe data from the lock client node indicated in the read lock as the owner of the stripe data; otherwise it proceeds to step 302. This embodiment takes the not-found case as an example.
  • Step 302: The first lock client node sends a read lock request for the stripe data to the lock server node.
  • Step 303: The lock server node records the first lock client node as the owner of the stripe data.
  • After receiving the stripe lock request, the lock server node first checks whether the attribute information of the stripe data exists in its records. If it does not, the lock server node creates a record of the attribute information of the stripe data; otherwise it checks the information of the lock-holding client nodes in the attribute information of the stripe data.
  • This embodiment takes the case where the attribute information does not exist as an example: there is no record for the stripe data, so this is the first application for it in the system (or the first application after deletion). The lock server node adds the information of the first lock client node, which applied for the lock, to the record, records the first lock client node as the owner of the stripe data, and records the owner ID corresponding to the stripe data as the ID of the first lock client node.
  • Step 304: The lock server node returns to the first lock client node a response message indicating that the owner of the stripe data is the first lock client node.
  • Specifically, the lock server node returns a lock success response message to the first lock client node, indicating in the message that the owner of the stripe data is the first lock client node. The first lock client node then records the owner ID of the stripe data as its own ID.
  • Step 305: The first lock client node caches the stripe data.
  • Finding from the information returned by the lock server node that it is itself the owner of the stripe data, the first lock client node applies for cache space from the local global unified cache, reads the strips of the stripe data from the other lock client nodes, constructs the stripe data in the local global unified cache, and returns the stripe data to the application server.
  • The first lock client node reads the strips of the stripe data from the other lock client nodes and constructs the stripe data from those strips. After construction, only the stripe data itself is cached in the first lock client node; the redundancy data of the stripe data is not cached, which further improves the cache utilization of the nodes in the distributed storage system. Whenever a lock client node caches stripe data, it can cache it in this manner.
  • FIG. 5 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 5 of the present invention. This embodiment takes as an example a second lock client node subsequently reading the same stripe data. As shown in FIG. 5, the method includes the following process.
  • Step 401: The second lock client node receives a read request for the stripe data from the application server.
  • After receiving the read request, the second lock client node locally searches for a read lock on the stripe data. If one is found, it reads the stripe data from the lock client node indicated in the read lock as the owner of the stripe data; otherwise it proceeds to step 402. This embodiment takes the not-found case as an example.
  • Step 402: The second lock client node sends a read lock request for the stripe data to the lock server node.
  • After receiving the stripe lock request, the lock server node first checks whether the attribute information of the stripe data exists in its records. If it does not, the lock server node creates a record of the attribute information; otherwise it checks the information of the lock-holding client nodes in the attribute information. This embodiment takes the case where the attribute information exists as an example: from the attribute information, the lock server node learns that the owner of the stripe data is the first lock client node. The lock server node adds the ID of the second lock client node, which applied for the lock, to the attribute information of the stripe data, and sets the owner flag corresponding to the ID of the second lock client node to the preset value that indicates a non-owner.
  • Step 403: The lock server node returns a lock success response message to the second lock client node, indicating in the message that the owner of the stripe data is the first lock client node.
  • The second lock client node then reads the stripe data from the cache of the first lock client node, which specifically includes the following steps.
  • Step 404: The second lock client node sends a read request for the stripe data to the first lock client node.
  • According to the information returned by the lock server node, the second lock client node records the owner ID of the stripe data as the ID of the first lock client node. Knowing that it is not the owner of the stripe data, the second lock client node generates a stripe data read request and sends it to the first lock client node through the back-end network.
  • Step 405: The first lock client node returns the stripe data to the second lock client node.
  • The first lock client node obtains the stripe data from its local global unified cache and returns it directly to the second lock client node.
  • Step 406: The second lock client node returns the stripe data to the application server.
  • The second lock client node receives the read data response of the first lock client node and sends the stripe data to the application server.
  • FIG. 6 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 6 of the present invention. This embodiment takes as an example the second lock client node writing the stripe data. As shown in FIG. 6, the method includes the following process.
  • Step 501: The second lock client node receives a write request for the stripe data from the application server.
  • After receiving the write request, the second lock client node locally searches for a write lock on the stripe data. If one is found, it writes the stripe data at the lock client node indicated in the write lock as the owner of the stripe data. If the second lock client node finds the write lock locally, then before this moment the lock server node had already granted the write lock on the stripe data to the second lock client node; since the lock server node records the owner of the stripe data as the second lock client node before granting it the write lock, the owner indicated in the write lock held by the second lock client node is the second lock client node itself. If the second lock client node does not find a write lock on the stripe data locally, step 502 is performed. This embodiment takes the not-found case as an example.
  • Step 502: The second lock client node sends a write lock request for the stripe data to the lock server node.
  • Step 503: The lock server node sends a lock recall request for the stripe data to the first lock client node.
  • After receiving the stripe lock request, the lock server node first checks whether the attribute information of the stripe data exists in its records. If it does not, the lock server node creates a record of the attribute information; otherwise it checks the information of the lock-holding client nodes in the attribute information. This embodiment takes the case where the attribute information exists as an example: from the attribute information, the lock server node learns which lock client nodes hold read locks or write locks on the stripe data, generates a request to recall each such lock, and sends it to the corresponding lock client node. Taking the first lock client node holding a read lock on the stripe data as an example, the lock server node generates a request to recall the read lock held by the first lock client node and sends it to the first lock client node.
  • Step 504: The first lock client node returns a lock recall success response to the lock server node.
  • The first lock client node first checks whether the lock is still in use. If it is not in use, the node directly returns the lock recall success response message to the lock server node; if it is in use, the node waits until the lock is released and then returns the lock recall success response message. If the first lock client node is the owner of the stripe data, it first deletes the stripe data from its global unified cache before sending the lock recall success response message.
  • Step 505: The lock server node records the second lock client node as the owner of the stripe data.
  • After receiving the lock recall success response message, the lock server node records, in the attribute information of the stripe data, the second lock client node that applied for the write lock, and records the second lock client node as the owner of the stripe data.
  • Step 506: The lock server node returns a stripe write lock success response message to the second lock client node.
  • Step 507: The second lock client node caches the stripe data.
  • After receiving the stripe write lock success response message from the lock server node, the second lock client node applies for stripe cache space from its local global unified cache, stores the stripe data received in the application server's write request into the local global unified cache, and then writes the strips of the stripe data to the corresponding lock client nodes.
  • Step 508: The second lock client node sends a lock downgrade request to the lock server node.
  • The second lock client node generates a lock downgrade request and sends it to the lock server node; the request indicates that the write lock on the stripe data is to be demoted to a read lock.
  • Step 509: The lock server node modifies its record to show the second lock client node holding a read lock on the stripe data.
  • In the attribute information of the stripe data, the lock type corresponding to the second lock client node is changed from "write lock" to "read lock".
  • FIG. 7 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 7 of the present invention. This embodiment takes as an example the second lock client node reading the stripe data from the first lock client node. As shown in FIG. 7, the method includes the following process.
  • Step 601: The second lock client node sends a read request for the stripe data to the first lock client node.
  • Here the first lock client node is the owner of the stripe data, and the second lock client node learned that the first lock client node is the owner while reading the stripe data. The second lock client node sends the first lock client node a read request for the stripe data, asking to read the stripe data cached on the first lock client node.
  • Step 602: The first lock client node searches its cached resources for the stripe data.
  • The first lock client node first searches its cached resources for the stripe data. If it is found, the stripe data is returned directly to the second lock client node; if it is not found, for example because the first lock client node has already deleted the stripe data from its cache, the process continues with step 603. Embodiment 7 of the present invention takes the not-found case as an example.
  • Step 603: The first lock client node reads each strip of the stripe data from the lock client nodes of the distributed storage system and constructs the stripe data.
  • The first lock client node actively initiates read data requests to the lock client nodes of the distributed storage system, reads each strip of the stripe data from them, and reconstructs the stripe data in its own global cache.
  • Step 604: The first lock client node returns the stripe data to the second lock client node.
  • FIG. 8 is a signaling flowchart of a caching method for a distributed storage system according to Embodiment 8 of the present invention. This embodiment takes as an example the dynamic migration of cached stripe data between lock client nodes.
  • The basic principle is to cache stripe data on the lock client node that first accesses it. However, when the stripe data accessed by the application server through different lock client nodes ends up cached on a small number of lock client nodes, cache eviction on those caching nodes becomes abnormally frequent, which seriously reduces the value of the cache. Stripe data with a high access frequency in the cache is called a cache hotspot. Therefore, to keep cache usage balanced across the storage services of the whole system, hotspots must be dynamically migrated among the caches of the lock client nodes.
  • As shown in FIG. 8, the method includes the following process.
  • Step 701: The first lock client node determines whether to start dynamic migration.
  • Each lock client node determines whether a cache hotspot exists in its own cache and, if so, starts dynamic migration. The method for detecting whether a cache hotspot exists in the global unified cache of the distributed storage system is as follows: the deletion rate of each lock client node's cache in a unit time is checked periodically; when the deletion rate of a lock client node's cache in the unit time exceeds a preset proportion of that node's total cache amount, the node is defined as having a cache hotspot, and a dynamic hotspot migration operation is required, in which the lock client node actively migrates cached data to other lock client nodes. For example, the preset proportion may be 20%.
  • the first lock client node determines whether dynamic migration is enabled. When the cache rate of the first lock client node in the unit time is greater than or equal to a preset ratio of the cache total of the lock client node, the first lock The client node determines that dynamic migration is enabled. In the embodiment of the present invention, the preset ratio of the cache rate of the first lock client node in the unit time is greater than or equal to the preset cache ratio of the lock client node.
  • An implementation manner includes: setting a monitor connected to each lock client node, and each lock client node periodically reports to the monitor a deletion rate of the lock client node cache in a unit time.
  • The monitor periodically collects the cache usage statistics of each lock client node and pushes cache hotspot information to every lock client node. The cache hotspot information includes the IDs of the lock client nodes that meet the cache hotspot condition; each lock client node checks whether the hotspot information contains its own ID to determine whether it holds a cache hotspot, and if so, it starts the cache dynamic migration task.
  • The execution period of the cache dynamic migration task is the sum of the heartbeat period at which a lock client node reports its deletion rate to the monitor and the period at which the monitor collects the deletion rates of the lock client nodes. Migration ends when the current period finishes executing and the next round of cache hotspot information no longer includes the local lock client node's ID. A sketch of such a monitor follows below.
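A minimal sketch of the monitor-based implementation, assuming each node heartbeats its deletion statistics and the monitor pushes the set of hotspot node IDs once per collection period. `Monitor` and its methods are hypothetical names, not interfaces from the patent.

```python
from typing import Dict, Set, Tuple


class Monitor:
    def __init__(self, preset_ratio: float = 0.20):
        self.preset_ratio = preset_ratio
        self.reports: Dict[str, Tuple[int, int]] = {}  # node ID -> (deleted, total)

    def report(self, node_id: str, deleted: int, total: int) -> None:
        # Heartbeat from a lock client node: per-unit-time deletion statistics.
        self.reports[node_id] = (deleted, total)

    def collect_hotspots(self) -> Set[str]:
        # Run once per collection period; the returned ID set is pushed to
        # every lock client node. A node starts a migration task when its own
        # ID appears here, and stops once a later push no longer contains it.
        return {nid for nid, (deleted, total) in self.reports.items()
                if deleted >= self.preset_ratio * total}
```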
  • Step 702 The second lock client node sends a stripe data read request to the first lock client node.
  • the owner of the stripe data is taken as the first lock client node.
  • Step 702 is an optional step.
  • In the first implementation, if the second lock client node has already successfully applied for the stripe read lock when step 701 completes, step 702 is performed first (the second lock client node sends a read request for the stripe data to the first lock client node), and then step 703 is performed.
  • In the second implementation, if the second lock client node has not successfully applied for the stripe read lock when step 701 completes, step 702 is skipped and step 703 is executed directly.
  • Step 703 The first lock client node sends a change owner request message to the lock server node.
  • In this step, the first lock client node sends a change owner request message to the lock server node. The message includes the ID of the stripe data and the ID of the target lock client node, and indicates a request to change the owner of the stripe data to the target lock client node. That is, the first lock client node sends the lock server node a request message for changing the owner of the stripe data to the target lock client node.
  • In this embodiment of the present invention, the second lock client node is taken as the target lock client node as an example.
  • In the first implementation, the first lock client node generates the change owner request message after receiving the stripe data read request sent by the second lock client node.
  • In the second implementation, the first lock client node actively generates the change owner request message.
  • Step 704 The lock server node changes the owner of the stripe data to a target lock client node.
  • In this step, after receiving the change owner request message from the first lock client node, the lock server node modifies the owner information of the stripe data and changes the owner of the stripe data to the target lock client node.
  • Step 705 The lock server node returns the owner change success response message of the stripe data to the first lock client node.
  • Step 706 The first lock client node sends the stripe data to the second lock client node.
  • In this step, if the foregoing process includes step 702, the first lock client node also returns a read data success response message to the second lock client node.
  • Step 707 The first lock client node deletes the locally cached stripe data.
  • In this step, after receiving the owner change success response message for the stripe data, the first lock client node actively deletes the stripe data cached on the first lock client node.
  • Step 708 The second lock client node caches the stripe data.
  • In this step, if the foregoing process includes step 702, the second lock client node, after receiving the read data success response message, caches the stripe data into its local global cache and at the same time returns a response message to the application server. If step 702 is not included in the foregoing process, the first lock client node actively pushes the stripe data into the local global cache of the second lock client node. The whole migration sequence is sketched below.
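Putting steps 703-708 together, the following sketch models the migration under simple assumptions (synchronous messaging, an always-successful lock server). `Node`, `LockServer`, and `migrate_stripe` are illustrative names; the sketch covers both the variant with step 702 and the active-push variant without it.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    node_id: str
    cache: Dict[str, bytes] = field(default_factory=dict)

    def receive_read_response(self, stripe_id: str, data: bytes) -> None:
        self.cache[stripe_id] = data   # step 708: target caches the stripe


class LockServer:
    def __init__(self):
        self.owners: Dict[str, str] = {}

    def change_owner(self, stripe_id: str, new_owner: str) -> bool:
        # Steps 704/705: re-record the owner and answer with success.
        self.owners[stripe_id] = new_owner
        return True


def migrate_stripe(first: Node, second: Node, server: LockServer,
                   stripe_id: str, had_read_request: bool) -> None:
    # Steps 703-705: ask the lock server to re-record the owner.
    if not server.change_owner(stripe_id, new_owner=second.node_id):
        return
    data = first.cache[stripe_id]
    # Step 706: hand the stripe to the target node, either as a read
    # response (when step 702 happened) or as an active push.
    if had_read_request:
        second.receive_read_response(stripe_id, data)
    else:
        second.cache[stripe_id] = data
    # Step 707: the old owner drops its local copy only after the
    # owner-change success response.
    del first.cache[stripe_id]
```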
  • FIG. 9 is a schematic structural diagram of a stripe data owner server node according to Embodiment 9 of the present invention.
  • The stripe data owner server node includes at least: a receiving unit 91, a recording unit 92, and a sending unit 93.
  • the receiving unit 91 is configured to receive a lock notification for the stripe data from the lock client node.
  • A determining unit 94 is configured to judge the lock notification.
  • the recording unit 92 is configured to record the lock client node as the owner of the stripe data when the determining unit determines that the lock notification is a first read lock notification or a write lock notification.
  • The sending unit 93 is configured to: when the determining unit judges that the lock notification is a first read lock notification or a write lock notification, return to the lock client node a response message indicating that the owner of the stripe data is the lock client node, so that the lock client node caches the stripe data; and when the determining unit judges that the lock notification is a non-first read lock notification, return to the lock client node a response message containing the owner information of the stripe data, so that the lock client node reads the stripe data from the cache of the owner of the stripe data.
  • The determining unit 94 is specifically configured to: when the lock notification is a read lock notification, search the recorded attribute information of stripe data for the attribute information of this stripe data; if it is not found, determine that the read lock notification is a first read lock notification.
  • the receiving unit 91 is further configured to receive a request message from the lock client node to change an owner of the stripe data to another lock client node.
  • the recording unit 92 is further configured to change the owner of the stripe data to the another lock client node.
  • The sending unit 93 is further configured to return an owner change success response message of the stripe data to the lock client node, so that the lock client node deletes the locally cached stripe data and the other lock client node caches the stripe data.
  • When the stripe data owner server node is integrated in a lock server node device, the read lock notification includes a read lock request and the write lock notification includes a write lock request.
  • Correspondingly, the sending unit 93 is further configured to return a lock success response message to the lock client node.
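One way to picture how the receiving, determining, recording, and sending units cooperate is the single-class sketch below. The message shapes and the `on_lock_notification`/`on_change_owner` names are assumptions for illustration, not the patent's interfaces.

```python
from typing import Dict


class StripeOwnerServer:
    def __init__(self):
        self.owners: Dict[str, str] = {}  # stripe ID -> owner node ID

    def on_lock_notification(self, stripe_id: str, node_id: str,
                             kind: str) -> Dict[str, str]:
        # Determining unit: a read lock notification with no recorded
        # attribute information counts as a first read lock notification.
        first_read = (kind == "read" and stripe_id not in self.owners)
        if kind == "write" or first_read:
            # Recording unit: the notifying node becomes the owner.
            self.owners[stripe_id] = node_id
            # Sending unit: tell the node it is the owner, so it caches.
            return {"owner": node_id, "you_are_owner": "yes"}
        # Non-first read: return the recorded owner so the requester reads
        # the stripe from the owner's cache.
        return {"owner": self.owners[stripe_id], "you_are_owner": "no"}

    def on_change_owner(self, stripe_id: str, target_id: str) -> bool:
        self.owners[stripe_id] = target_id
        return True  # owner-change success response
```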
  • The stripe data owner server node of Embodiment 9 of the present invention can be used to perform the caching methods described in Embodiment 2 to Embodiment 8 of the present invention. For the specific implementation process and technical effects, reference may be made to Embodiment 2 to Embodiment 8 of the present invention; details are not described herein again.
  • FIG. 10 is a schematic structural diagram of a lock client node according to Embodiment 10 of the present invention.
  • The lock client node includes at least: a sending unit 1001, a receiving unit 1002, a comparing unit 1003, a cache unit 1004, and a read/write unit 1005.
  • the sending unit 1001 is configured to send a read lock notification or a write lock notification to the stripe data owner server node.
  • The receiving unit 1002 is configured to receive, from the stripe data owner server node, a response message indicating that the owner of the stripe data is the lock client node, or to receive the ID of the owner of the stripe data returned by the stripe data owner server node.
  • The comparing unit 1003 is configured to compare the ID of the owner of the stripe data with its own ID; when the two are different, the read/write unit 1005 is turned on.
  • The cache unit 1004 is configured to cache the stripe data when the receiving unit receives the response message, returned by the stripe data owner server node, indicating that the owner of the stripe data is the lock client node.
  • the read/write unit 1005 is configured to read the stripe data from a cache of a lock client node corresponding to an ID of an owner of the stripe data.
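The units of Embodiment 10 can likewise be pictured as methods on one class. This hedged sketch reuses the `StripeOwnerServer` sketch above; `build_stripe` stands in for reading the strips from the peer nodes and reconstructing the stripe, and all names are assumptions.

```python
from typing import Callable, Dict


class LockClientNode:
    def __init__(self, node_id: str, owner_server: "StripeOwnerServer",
                 peers: Dict[str, "LockClientNode"]):
        self.node_id = node_id
        self.owner_server = owner_server   # stripe data owner server node
        self.peers = peers                 # other lock client nodes, by ID
        self.cache: Dict[str, bytes] = {}  # local global unified cache

    # Sending unit 1001 + receiving unit 1002.
    def notify_lock(self, stripe_id: str, kind: str) -> Dict[str, str]:
        return self.owner_server.on_lock_notification(stripe_id,
                                                      self.node_id, kind)

    # Comparing unit 1003 + cache unit 1004 + read/write unit 1005.
    def read(self, stripe_id: str,
             build_stripe: Callable[[str], bytes]) -> bytes:
        reply = self.notify_lock(stripe_id, "read")
        if reply["owner"] == self.node_id:
            # First read: this node is the owner, so it builds and caches.
            self.cache[stripe_id] = build_stripe(stripe_id)
            return self.cache[stripe_id]
        # Non-first read: fetch from the owner's cache.
        return self.peers[reply["owner"]].serve(stripe_id, build_stripe)

    def serve(self, stripe_id: str,
              build_stripe: Callable[[str], bytes]) -> bytes:
        data = self.cache.get(stripe_id)
        if data is None:  # evicted meanwhile: rebuild, as in Embodiment 7
            data = self.cache[stripe_id] = build_stripe(stripe_id)
        return data
```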
  • The sending unit 1001 is further configured to send a lock downgrade request to the lock server node, so that the lock server node modifies the record to indicate that the lock client node holds a read lock on the stripe data.
  • The lock client node may further include a searching unit 1006.
  • the receiving unit 1002 is further configured to receive a read request or a write request for the stripe data from the application server.
  • The searching unit 1006 is configured to locally search for a read lock or a write lock on the stripe data; if one is found, the owner of the stripe data is determined according to the read lock or write lock of the stripe data and the read/write unit 1005 is turned on; if none is found, the sending unit 1001 is turned on.
  • The read/write unit 1005 is further configured to read or write the stripe data at the owner of the stripe data.
  • The cache unit 1004 is further configured to: when the deletion rate per unit time is greater than or equal to a preset proportion of the total cache amount of the lock client node, control the sending unit 1001 to send, to the stripe data owner server node, a request message for changing the owner of the stripe data to a target lock client node; and is further configured to delete the locally cached stripe data according to the owner change success response message of the stripe data received by the receiving unit 1002, and to control the sending unit 1001 to send the stripe data to the target lock client node.
  • the sending unit 1001 is further configured to send, according to the control of the cache unit 1004, a request message for changing the owner of the stripe data to a target lock client node to the stripe data owner server node, So that the stripe data owner server node changes the owner of the stripe data to a target lock client node.
  • the sending unit 1001 is further configured to send the stripe data to the target lock client node according to the control of the cache unit 1004, so that the target lock client node caches the stripe data.
  • the receiving unit 1002 is further configured to receive an owner change success response message of the stripe data returned by the stripe data owner server node.
  • The read/write unit 1005 is specifically configured to send a request for reading the stripe data to the lock client node corresponding to the ID of the owner of the stripe data, so that that lock client node searches its locally cached data for the stripe data and, if it is found, returns the stripe data to the read/write unit 1005; otherwise, that node reads each strip of the stripe data from the lock client nodes of the distributed storage system, constructs the stripe data, and returns it to the read/write unit 1005.
  • The lock client node of Embodiment 10 of the present invention may be used to perform the caching methods described in Embodiment 2 to Embodiment 8 of the present invention. For the specific implementation process and technical effects, reference may be made to Embodiment 2 to Embodiment 8 of the present invention; details are not described herein again.
  • FIG. 11 is a block diagram showing the structure of a stripe data owner server node in the eleventh embodiment of the present invention.
  • The stripe data owner server node includes at least: a processor 1101, a memory 1102, a communication interface 1103, and a bus 1104.
  • the processor 1101, the memory 1102, and the communication interface 1103 communicate through the bus 1104.
  • The memory 1102 is used to store a program. Specifically, the program may include program code, and the program code includes computer execution instructions.
  • the memory 1102 can be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
  • The communication interface 1103 is configured to communicate with the first lock client node and the second lock client node.
  • The processor 1101 is configured to execute the execution instructions stored in the memory 1102, and may be a single-core or multi-core central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the processor 1101 runs a program to execute the following instructions:
  • the stripe data owner server node receives a lock notification for the stripe data from the lock client node, and determines the lock notification;
  • When the lock notification is a first read lock notification or a write lock notification, the stripe data owner server node records the lock client node as the owner of the stripe data and returns to the lock client node a response message indicating that the owner of the stripe data is the lock client node, so that the lock client node caches the stripe data;
  • When the lock notification is a non-first read lock notification, the stripe data owner server node returns to the lock client node a response message containing the owner information of the stripe data, so that the lock client node reads the stripe data from the cache of the owner of the stripe data.
  • The stripe data owner server node of Embodiment 11 of the present invention can be used to perform the caching methods described in Embodiment 2 to Embodiment 8 of the present invention. For the specific implementation process and technical effects, reference may be made to Embodiment 2 to Embodiment 8 of the present invention; details are not described herein again.
  • FIG. 12 is a schematic structural diagram of a lock client node according to Embodiment 12 of the present invention.
  • the lock client node includes at least: a processor 1201, a memory 1202, a communication interface 1203, and a bus 1204.
  • the processor 1201, the memory 1202, and the communication interface 1203 communicate through the bus 1204.
  • The memory 1202 is used to store a program. Specifically, the program may include program code, and the program code includes computer execution instructions.
  • the memory 1202 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.
  • The processor 1201 is configured to execute the execution instructions stored in the memory 1202, and may be a single-core or multi-core central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the communication interface 1203 is for communicating with a stripe data owner server node and other lock client nodes.
  • The processor 1201 runs the program to execute the following instructions: send a read lock notification or a write lock notification for the stripe data to the stripe data owner server node;
  • When the lock client node receives, from the stripe data owner server node, a response message indicating that the owner of the stripe data is the lock client node, the lock client node caches the stripe data;
  • When the ID of the owner of the stripe data returned by the stripe data owner server node is received, the ID of the owner of the stripe data is compared with the node's own ID; when the two are different, the stripe data is read from the cache of the lock client node corresponding to the ID of the owner of the stripe data. A hypothetical end-to-end run of this sequence follows below.
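As a usage example, here is a hypothetical end-to-end run of this instruction sequence, reusing the illustrative `StripeOwnerServer` and `LockClientNode` sketches above; the assertions restate what those sketches guarantee: the stripe is cached exactly once, at its owner.

```python
def build_stripe(stripe_id: str) -> bytes:
    # Stands in for reading every strip from the peer lock client nodes and
    # reconstructing the stripe (RAID/erasure-code rebuild is elided).
    return f"payload-of-{stripe_id}".encode()


server = StripeOwnerServer()
nodes: dict = {}
node_a = LockClientNode("A", server, nodes)
node_b = LockClientNode("B", server, nodes)
nodes.update({"A": node_a, "B": node_b})

node_a.read("stripe-1", build_stripe)  # first read lock: A becomes owner, caches
node_b.read("stripe-1", build_stripe)  # non-first read: B reads from A's cache
assert server.owners["stripe-1"] == "A"
assert "stripe-1" in node_a.cache and "stripe-1" not in node_b.cache
```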
  • The lock client node of Embodiment 12 of the present invention can be used to perform the caching methods described in Embodiment 2 to Embodiment 8 of the present invention. For the specific implementation process and technical effects, reference may be made to Embodiment 2 to Embodiment 8 of the present invention; details are not described herein again.
  • Embodiment 13 of the present invention provides a computer readable medium, comprising computer execution instructions, where the computer execution instructions are used to cause a stripe data owner server node to perform the method described in Embodiment 2 of the present invention.
  • a fourteenth embodiment of the present invention provides a computer readable medium, comprising computer execution instructions, wherein the computer execution instructions are used to cause a lock client node to perform the method according to Embodiment 3 of the present invention.
  • Computer readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one location to another.
  • a storage medium may be any available media that can be accessed by a computer.
  • Computer readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • In addition, any connection may properly be termed a computer readable medium.
  • For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of the medium.
  • Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer readable media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a caching method for a distributed storage system, nodes, and a computer readable medium. A lock client node sends a lock notification for stripe data; if the notification is a first read lock notification or a write lock notification, the stripe data owner server node records the lock client node as the owner of the stripe data, and the lock client node caches the stripe data; if the notification is a non-first read lock notification, the lock client node reads the stripe data from the cache of the lock client node that is the owner of the stripe data.

Description

分布式存储系统的緩存方法、 节点和计算机可读介质 技术领域
本发明实施例涉及存储技术, 尤其涉及一种分布式存储系统的緩存方 法、 节点和计算机可读介质。 背景技术
在分布式存储系统中, 将多个节点设备相连形成集群, 各个节点设备均 具有数据存储功能。 全部节点设备通过前端网络(Front-End Network )和后 端网络(Back-End Network )分别连接。 前端网络用于用户业务与分布式存 储系统之间进行请求与数据交互, 后端网络用于分布式存储系统内部各个节 点设备之间进行请求与数据交互。
在分布式存储系统中, 将用户数据进行条带化得到一个个分条 ( Stripe ) , 然后将分条中的各个条带(Strip )数据分发存储到不同的节点 设备的硬盘中。 在访问用户数据时, 应用服务器首先通过前端网络将访问 请求发送到一个节点设备上, 然后该节点设备通过后端网络将用户数据所 在的条带数据从其它节点设备读到本节点设备,采用磁盘阵列(Redundant Array of Independent Disks, 简称 RAID ) 算法或者消除码 (简称 Erasure Code ) 算法将条带数据还原成用户数据后通过前端网络返回给应用服务 器。
在上述访问用户数据的过程中, 采用緩存技术。 其中, 一种緩存方法 是: 每个节点设备的緩存中緩存本节点设备上的热点数据块。 当某一节点 设备需要获取分条数据时, 该节点设备需要从各个节点设备的緩存中获取 构成该分条数据的数据块, 如果无法在节点设备的緩存中获取需要的数据 块, 则还需要访问节点设备的硬盘, 从硬盘中获取该数据块, 对上述获取 到的数据块进行汇总、 重建和冗余检验, 获得该分条数据。 另一种緩存方 法是: 每个节点设备的緩存中緩存本节点设备统计出来的热点文件分条数 据。 在节点设备需要获取分条数据时, 该节点设备先从自身緩存中获取该 分条数据, 如果在本节点设备的緩存中无法获取需要访问的分条数据, 则 还需要依次从其它节点设备的緩存中获取, 当其他节点设备的緩存中获取 不到, 则需要从节点设备的硬盘中获取。
目前, 在分布式存储系统中采用的数据緩存技术是上述两种緩存方法 或上述两种緩存方法的组合。 采用现有的分布式存储系统緩存方法, 每个 节点设备根据访问统计, 确定自身硬盘存储的内容中的热点内容, 在緩存 中緩存该热点内容, 由于每个节点设备独立进行上述緩存, 因此存在不同 的节点设备上緩存相同内容的现象, 造成节点设备的緩存利用率低。 发明内容
本发明的第一个方面是提供一种分布式存储系统的緩存方法, 用以解 决现有技术中的缺陷, 提高节点设备的緩存利用率。
本发明的另一个方面是提供一种分条数据所有者服务器节点和锁客 户端节点, 用以解决现有技术中的缺陷, 提高节点设备的緩存利用率。
本发明的又一个方面是提供一种计算机可读介质, 用以解决现有技术 中的缺陷, 提高节点设备的緩存利用率。
本发明的第一个方面是提供一种分布式存储系统的緩存方法, 包括: 分条数据所有者服务器节点接收到来自锁客户端节点的对于分条数 据的锁通知, 对所述锁通知进行判断;
当所述锁通知为首次读锁通知或写锁通知时, 所述分条数据所有者服 务器节点将所述锁客户端节点记录为所述分条数据的所有者, 向所述锁客 户端节点返回所述分条数据的所有者为所述锁客户端节点的响应消息, 以 使所述锁客户端节点緩存所述分条数据;
当所述锁通知为非首次读锁通知时, 所述分条数据所有者服务器节点 向所述锁客户端节点返回包含所述分条数据的所有者信息的响应消息, 以 使所述锁客户端节点从所述分条数据的所有者的緩存中读取所述分条数 据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述分条数据所有者服务器节点接收到来自锁客户端节点的对于分 条数据的锁通知之后, 还包括:
当所述锁通知为读锁通知时, 所述分条数据所有者服务器节点在记录 的分条数据的属性信息中查找所述分条数据的属性信息, 若未查找到, 则 确定所述读锁通知为首次读锁通知。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述向所述锁客户端节点返回所述分条数据的所有者为所述锁客户端节 点的响应消息之后还包括:
所述分条数据所有者服务器节点接收来自所述锁客户端节点的将所 述分条数据的所有者更改为另一锁客户端节点的请求消息;
所述分条数据所有者服务器节点将所述分条数据的所有者更改为所 述另一锁客户端节点;
所述分条数据所有者服务器节点向所述锁客户端节点返回分条数据 的所有者更改成功响应消息, 以使所述锁客户端节点删除本地緩存的所述 分条数据, 并使所述另一锁客户端节点緩存所述分条数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 当所述分条数据所有者服务器节点集成在锁服务器节点设备中时,
所述读锁通知包括读锁请求;
所述写锁通知包括写锁请求;
所述向所述锁客户端节点返回所述分条数据的所有者为所述锁客户 端节点的响应消息还包括: 向所述锁客户端节点返回加锁成功响应消息; 所述向所述锁客户端节点返回包含所述分条数据的所有者信息的响 应消息还包括: 向所述锁客户端节点返回加锁成功响应消息。
本发明的另一个方面是提供一种分布式存储系统的緩存方法, 包括: 锁客户端节点向分条数据所有者服务器节点发送对于分条数据的读 锁通知或写锁通知;
当所述锁客户端节点接收到所述分条数据所有者服务器节点返回所 述分条数据的所有者为所述锁客户端节点的响应消息, 所述锁客户端节点 緩存所述分条数据;
当所述锁客户端节点接收所述分条数据所有者服务器节点返回的所 述分条数据的所有者的标识 ID,所述锁客户端节点将所述分条数据的所有 者的 ID和自身的 ID进行比较, 当两者不同时, 所述锁客户端节点从所述 分条数据的所有者的 ID对应的锁客户端节点的緩存中读取所述分条数据。 如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 在所述锁客户端节点向分条数据所有者服务器节点发送对于分条数据的 读锁通知或写锁通知之后, 所述锁客户端节点接收到所述分条数据所有者 服务器节点返回所述分条数据的所有者为所述锁客户端节点的响应消息 之前, 还包括:
所述分条数据所有者服务器节点在接收到对于分条数据的首次读锁 通知或写锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 还包括:
如果锁客户端节点向分条数据所有者服务器节点发送的是对于分条 数据的写锁通知, 所述锁客户端节点緩存所述分条数据之后, 向所述锁服 务器节点发送锁降级请求, 以使所述锁服务器节点将记录修改为所述锁客 户端节点持有对所述分条数据的读锁。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述锁客户端节点向分条数据所有者服务器节点发送对于分条数据的读 锁通知或写锁通知之前, 还包括:
所述锁客户端节点接收来自应用服务器的对于所述分条数据的读请 求或写请求;
所述锁客户端节点在本地查找对于所述分条数据的读锁或写锁; 如果查找到, 根据所述分条数据的读锁或写锁, 确定所述分条数据的 所有者, 在所述分条数据的所有者读取或写入所述分条数据;
如果没有查找到, 执行所述锁客户端节点向分条数据所有者服务器节 点发送对于分条数据的读锁通知或写锁通知的步骤。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 还包括:
当所述锁客户端节点的緩存在单位时间内的删除率大于或等于所述 锁客户端节点的緩存总量的预设比例时, 所述锁客户端节点向所述分条数 据所有者服务器节点发送将所述分条数据的所有者更改为目标锁客户端 节点的请求消息, 以使所述分条数据所有者服务器节点将所述分条数据的 所有者更改为目标锁客户端节点; 所述锁客户端节点接收所述分条数据所有者服务器节点返回的分条 数据的所有者更改成功响应消息, 删除本地緩存的所述分条数据, 并向所 述目标锁客户端节点发送所述分条数据, 以使所述目标锁客户端节点緩存 所述分条数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述锁客户端节点从所述分条数据的所有者的 ID对应的锁客户端节点的 緩存中读取所述分条数据具体包括:
所述锁客户端节点向所述分条数据的所有者的 ID对应的锁客户端节 点发送读取分条数据的请求, 以使所述分条数据的所有者的 ID对应的锁 客户端节点在本地緩存的数据中查找所述分条数据, 如果查找到, 向所述 锁客户端节点返回所述分条数据, 否则, 从所述分布式存储系统的各个锁 客户端节点读取所述分条数据的各个条带数据, 并构建成所述分条数据 后, 返回给所述锁客户端节点。
本发明的另一个方面是提供一种分条数据所有者服务器节点, 包括: 接收单元, 用于接收来自锁客户端节点的对于分条数据的锁通知; 判断单元, 用于对所述锁通知进行判断;
记录单元, 用于当所述判断单元判断所述锁通知为首次读锁通知或写 锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者;
发送单元, 用于当所述接收判断单元判断所述锁通知为首次读锁通知 或写锁通知时, 向所述锁客户端节点返回所述分条数据的所有者为所述锁 客户端节点的响应消息, 以使所述锁客户端节点緩存所述分条数据; 当所 述接收判断单元判断所述锁通知为非首次读锁通知时, 向所述锁客户端节 点返回包含所述分条数据的所有者信息的响应消息, 以使所述锁客户端节 点从所述分条数据的所有者的緩存中读取所述分条数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 还包括:
所述判断单元, 具体用于在所述锁通知为读锁通知时, 在记录的分条 数据的属性信息中查找所述分条数据的属性信息, 若未查找到, 则确定所 述读锁通知为首次读锁通知。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述接收单元, 还用于接收来自所述锁客户端节点的将所述分条数据 的所有者更改为另一锁客户端节点的请求消息;
所述记录单元, 还用于将所述分条数据的所有者更改为所述另一锁客 户端节点;
所述发送单元, 还用于向所述锁客户端节点返回分条数据的所有者更 改成功响应消息, 以使所述锁客户端节点删除本地緩存的所述分条数据, 并使所述另一锁客户端节点緩存所述分条数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 当所述分条数据所有者服务器节点集成在锁服务器节点设备中时, 所述读锁通知包括读锁请求;
所述写锁通知包括写锁请求;
所述发送单元, 还用于向所述锁客户端节点返回加锁成功响应消息。 本发明的另一个方面是提供一种锁客户端节点, 包括:
发送单元, 用于向分条数据所有者服务器节点发送对于分条数据的读 锁通知或写锁通知;
接收单元, 用于接收所述分条数据所有者服务器节点返回所述分条数 据的所有者为所述锁客户端节点的响应消息, 或接收所述分条数据所有者 服务器节点返回的所述分条数据的所有者的标识 ID;
比较单元,用于将所述分条数据的所有者的 ID和自身的 ID进行比较; 当两者不同时, 开启读写单元;
所述緩存单元, 用于当所述接收单元接收到所述分条数据所有者服务 器节点返回所述分条数据的所有者为所述锁客户端节点的响应消息时, 緩 存所述分条数据;
所述读写单元, 用于从所述分条数据的所有者的 ID对应的锁客户端 节点的緩存中读取所述分条数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述发送单元, 还用于向所述锁服务器节点发送锁降级请求, 以使所 述锁服务器节点将记录修改为所述锁客户端节点持有对所述分条数据的 读锁。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 还包括: 查找单元;
所述接收单元, 还用于接收来自应用服务器的对于所述分条数据的读 请求或写请求;
所述查找单元, 用于在本地查找对于所述分条数据的读锁或写锁; 如 果查找到,根据所述分条数据的读锁或写锁,确定所述分条数据的所有者, 开启所述读写单元; 如果没有查找到, 开启所述发送单元;
所述读写单元, 还用于在所述分条数据的所有者读取或写入所述分条 数据。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述緩存单元, 还用于在单位时间内的删除率大于或等于所述锁客户 端节点的緩存总量的预设比例时, 控制所述发送单元向所述分条数据所有 者服务器节点发送将所述分条数据的所有者更改为目标锁客户端节点的 请求消息; 还用于根据接收单元接收的分条数据的所有者更改成功响应消 息, 删除本地緩存的所述分条数据, 控制所述发送单元向所述目标锁客户 端节点发送所述分条数据;
所述发送单元, 还用于根据所述緩存单元的控制向所述分条数据所有 者服务器节点发送将所述分条数据的所有者更改为目标锁客户端节点的 请求消息, 以使所述分条数据所有者服务器节点将所述分条数据的所有者 更改为目标锁客户端节点; 还用于根据所述緩存单元的控制向所述目标锁 客户端节点发送所述分条数据, 以使所述目标锁客户端节点緩存所述分条 数据;
所述接收单元, 还用于接收所述分条数据所有者服务器节点返回的分 条数据的所有者更改成功响应消息。
如上所述的方面和任一可能的实现方式, 进一步提供一种实现方式, 所述读写单元, 具体用于向所述分条数据的所有者的 ID对应的锁客 户端节点发送读取分条数据的请求, 以使所述分条数据的所有者的 ID对 应的锁客户端节点在本地緩存的数据中查找所述分条数据, 如果查找到, 向所述读写单元返回所述分条数据, 否则, 从所述分布式存储系统的各个 锁客户端节点读取所述分条数据的各个条带数据, 并构建成所述分条数据 后, 返回给所述读写单元。 本发明的另一个方面是提供一种分条数据所有者服务器节点, 包括: 处理器、 存储器、 通信接口和总线; 所述处理器、 所述存储器和所述通信 接口通过所述总线通信; 所述存储器用于存储执行指令, 所述通信接口用 于与第一锁客户端节点和第二锁客户端节点通信; 当所述分条数据所有者 服务器节点运行时, 所述处理器执行所述存储器存储的所述执行指令, 以 使所述分条数据所有者服务器节点执行如上任一所述的分布式存储系统 的緩存方法。
本发明的另一个方面是提供一种锁客户端节点, 包括: 处理器、 存储 器、 通信接口和总线; 所述处理器、 所述存储器和所述通信接口通过所述 总线通信; 所述存储器用于存放程序存储执行指令; 所述通信接口用于与 分条数据所有者服务器节点和其他锁客户端节点通信; 当所述锁客户端节 点运行时, 所述处理器执行所述存储器存储的所述执行指令, 以使所述分 条数据所有者服务器节点执行如上任一所述的分布式存储系统的緩存方 法。
本发明的另一个方面是提供一种计算机可读介质, 包含计算机执行指 令, 所述计算机执行指令用于使分条数据所有者服务器节点执行如上任一 所述的方法。
本发明的另一个方面是提供一种计算机可读介质, 包含计算机执行指 令, 所述计算机执行指令用于使锁客户端节点执行如上任一所述的方法。
由上述发明内容可见, 分条数据所有者服务器节点在接收到对于某个 分条数据的首次读锁请求或写锁请求时, 将发起该请求的锁客户端节点记 录为该分条数据的所有者, 由该锁客户端节点緩存该分条数据; 锁分条数 据所有者服务器节点在接收到对于该分条数据的非首次读锁请求时, 根据 上述记录向发起该请求的锁客户端节点告知该分条数据的所有者, 发起请 求的锁客户端节点从所有者的緩存中读取分条数据。 通过锁分条数据所有 者服务器节点对分条数据所有者的记录和反馈, 使得同一分条数据在整个 分布式存储系统中只在其该分条数据的所有者緩存一次, 各个锁客户端节 点均可以从该分条数据的所有者读取该分条数据, 从而避免了不同的节点 设备上緩存相同的分条数据的现象, 提高了节点设备的緩存利用率。 附图说明 为了更清楚地说明本发明实施例或现有技术中的技术方案, 下面将对实 施例或现有技术描述中所需要使用的附图作简单地介绍, 显而易见地, 下面 描述中的附图仅仅是本发明的一些实施例, 对于本领域普通技术人员来讲, 在不付出创造性劳动的前提下, 还可以根据这些附图获得其他的附图。
图 1为本发明实施例一的分布式存储系统的结构示意图;
图 2a为本发明实施例二的分布式存储系统的緩存方法的流程图; 图 2b为本发明实施例二中 N个分条数据的属性信息的结构示意图; 图 3为本发明实施例三的分布式存储系统的緩存方法的流程图; 图 4为本发明实施例四的分布式存储系统的緩存方法的信令流程图; 图 5为本发明实施例五的分布式存储系统的緩存方法的信令流程图; 图 6为本发明实施例六的分布式存储系统的緩存方法的信令流程图; 图 7为本发明实施例七的分布式存储系统的緩存方法的信令流程图; 图 8为本发明实施例八的分布式存储系统的緩存方法的信令流程图; 图 9为本发明实施例九的分条数据所有者服务器节点的结构示意图; 图 10为本发明实施例十的锁客户端节点的结构示意图;
图 11 为本发明实施例十一的分条数据所有者服务器节点的结构示意 图;
图 12为本发明实施例十二的锁客户端节点的结构示意图。
具体实施方式 下面将结合本发明实施例中的附图, 对本发明实施例中的技术方案进 行清楚、 完整地描述,显然, 所描述的实施例仅仅是本发明一部分实施例, 而不是全部的实施例。 基于本发明中的实施例, 本领域普通技术人员在没 有做出创造性劳动前提下所获得的所有其他实施例, 都属于本发明保护的 范围。
以下本发明各个实施例中,均以在分布式存储系统的各个节点设备的硬 盘中存储数据之后对该数据进行访问的緩存过程为例。
图 1为本发明实施例一的分布式存储系统的结构示意图。 本发明下述各 个实施例的緩存方法, 均可应用于本发明实施例一的分布式存储系统中。 现 有技术中的分布式存储系统包括多个锁客户端节点和多个锁服务器节点。 本发明实施例一的分布式存储系统, 在上述现有技术中的分布式存储系统 的基础上, 增加一个或多个分条数据所有者服务器节点。 对于一次操作过 程, 涉及其中的多个锁客户端节点、 一个锁服务器节点和一个分条数据所 有者服务器节点。 如图 1所示, 在图 1中示出分布式存储系统中的多个锁 客户端节点、 一个锁服务器节点和一个分条数据所有者服务器节点, 其余 锁服务器节点和分条数据所有者服务器节点未示出。 在本发明各个实施例 中, 以锁客户端节点、 锁服务器节点和分条数据所有者服务器节点分别设 置为例, 在实际应用中, 可以在每个节点设备中均同时设置一个锁客户端 节点、 一个锁服务器节点和一个分条数据所有者服务器节点, 在每次操作 过程中, 以其中一个节点设备中设置的锁服务器节点作为该次操作的锁服 务器节点, 以其中一个节点设备中设置的分条数据所有者服务器节点作为 该次操作的分条数据所有者服务器节点。
在实际应用中, 分条数据所有者服务器节点与锁服务器节点可以分别 设置, 分别执行各自的功能; 也可以将分条数据所有者服务器节点与锁服 务器节点结合, 即, 对现有的锁服务器节点进行改进, 使其在现有的锁服 务器节点执行的操作的基础上, 增加本发明实施例提出的分条数据所有者 服务器节点执行的操作。
以下通过本发明实施例二和实施例三, 对分条数据所有者服务器节点 与锁服务器节点分别设置情况下的緩存方法进行介绍。
图 2为本发明实施例二的分布式存储系统的緩存方法的流程图。 如图 2 所示, 该方法包括以下过程。
步骤 101 : 分条数据所有者服务器节点接收到来自锁客户端节点的对 于分条数据的锁通知, 对所述锁通知进行判断。
步骤 102: 当所述锁通知为首次读锁通知或写锁通知时, 分条数据所 有者服务器节点将所述锁客户端节点记录为所述分条数据的所有者, 向所 述锁客户端节点返回所述分条数据的所有者为所述锁客户端节点的响应 消息, 以使所述锁客户端节点緩存所述分条数据。
在本步骤中, 以分条数据所有者服务器节点在接收到来自第一锁客户 端节点的对分条数据的首次读锁通知或对所述分条数据的写锁通知为例, 分条数据所有者服务器节点将所述第一锁客户端节点记录为所述分条数 据的所有者, 向所述第一锁客户端节点返回所述分条数据的所有者为所述 第一锁客户端节点的响应消息, 以使所述第一锁客户端节点緩存所述分条 数据。 第一锁客户端节点在从锁服务器节点获得读锁授权或写锁授权后, 向所有者服务器节点发送锁通知, 在该通知中携带锁对应的分条数据的标 识, 并在该锁为读锁时携带读锁标识, 在该锁为写锁时携带写锁标识, 通 过该锁通知, 向所有者服务器节点告知第一锁客户端已获得对该分条数据 的读锁或写锁。 在分条数据所有者服务器节点中记录分条数据的属性信 息。 所有者服务器节点接收第一锁客户端节点发送的锁通知, 根据锁通知 中携带的分条数据的标识查找记录的分条数据的属性信息, 如果锁通知中 携带读锁标识并且在记录的分条数据的属性信息中未查找到该分条数据 对应的读锁, 则所有者服务器节点确认收到对该分条数据的首次读锁通 知。
图 2b为本发明实施例二中 N个分条数据的属性信息的结构示意图。如 图 2b所示, 每个分条数据的属性信息中记录该分条数据的标识( identity, 简称 ID ) , 并且对应地记录当前持有该分条数据的锁的锁客户端节点的 ID、 锁类型、 分条数据的所有者的 ID。 其中, 锁类型用于指示当前该分 条数据的锁为读锁或写锁。 持有对某个分条数据的读锁的锁客户端节点可 以读取该分条数据, 持有对某个分条数据的写锁的锁客户端节点可以写入 或修改该分条数据。 写锁的级别高于读锁的级别。 当某个锁客户端节点持 有对某个分条数据的读锁时, 其它锁客户端节点也可以持有对该分条数据 的读锁; 当某个锁客户端节点持有对某个分条数据的写锁时, 不允许其它 锁客户端节点持有对该分条数据的读锁或写锁。需要说明的是,作为举例, 图 2b中示出的分条数据的属性信息中均记录 3个当前持有该分条数据的 锁的锁客户端节点, 可以理解, 本发明的实施例对于当前持有某一分条数 据的锁的锁客户端节点不加限制。
步骤 103: 当所述锁通知为非首次读锁通知时, 所述分条数据所有者 服务器节点向所述锁客户端节点返回包含所述分条数据的所有者信息的 响应消息, 以使所述锁客户端节点从所述分条数据的所有者的緩存中读取 所述分条数据。 在本步骤中, 以分条数据所有者服务器节点在接收到来自第二锁客户 端节点的对于分条数据的非首次读锁通知为例, 分条数据所有者服务器节 点向所述第二锁客户端节点返回所述分条数据的所有者为所述第一锁客 户端节点的响应消息, 以使所述第二锁客户端节点从所述第一锁客户端节 点的緩存中读取所述分条数据。
上述实施例中, 分条数据所有者服务器节点在接收到对于某个分条数 据的首次读锁通知或写锁通知时, 将发起该通知的锁客户端节点记录为该 分条数据的所有者, 由该锁客户端节点緩存该分条数据; 分条数据所有者 服务器节点在接收到对于该分条数据的非首次读锁通知时, 根据上述记录 向发起该通知的锁客户端节点告知该分条数据的所有者, 发起通知的锁客 户端节点从所有者的緩存中读取分条数据。 通过分条数据所有者服务器节 点对分条数据所有者的记录和反馈, 使得同一分条数据在整个分布式存储 系统中只在其所有者对应的锁客户端节点中緩存一次, 各个锁客户端节点 均可以从所有者读取该分条数据, 从而避免了不同的节点设备上緩存相同 的分条数据的现象, 提高了节点设备的緩存利用率。
在上述技术方案的基础上, 进一步地,在步骤 101之后,还可以包括: 当所述锁通知为读锁通知时, 所述分条数据所有者服务器节点在记录的分 条数据的属性信息中查找所述分条数据的属性信息, 若未查找到, 则确定 所述读锁通知为首次读锁通知。
在上述技术方案的基础上, 进一步地, 当分条数据所有者服务器节点 向第一锁客户端节点返回分条数据的所有者为所述第一锁客户端节点的 响应消息之后还包括: 所述分条数据所有者服务器节点接收来自所述第一 锁客户端节点的将所述分条数据的所有者更改为第三锁客户端节点的请 求消息。 所述分条数据所有者服务器节点将所述分条数据的所有者更改为 第三锁客户端节点。 所述分条数据所有者服务器节点向所述第一锁客户端 节点返回分条数据的所有者更改成功响应消息, 以使所述第一锁客户端节 点删除本地緩存的所述分条数据, 并使所述第三锁客户端节点緩存所述分 条数据。
采用上述实现方式, 当第一锁客户端节点内存不足而无法存储该分条 数据时, 通过将该分条数据的所有者转移给第三锁客户端节点, 实现分条 数据的所有者的动态更换, 减轻了第一锁客户端节点的緩存负载, 从而更 好地实现了分条数据緩存的负载均衡。
在上述技术方案的基础上, 进一步地, 当所述分条数据所有者服务器 节点集成在锁服务器节点设备中时, 所述读锁通知可以为读锁请求, 所述 写锁通知可以为写锁请求。 相应地, 向所述第一锁客户端节点返回所述分 条数据的所有者为所述第一锁客户端节点的响应消息还包括: 向所述第一 锁客户端节点返回加锁成功响应消息。 所述向所述第二锁客户端节点返回 所述分条数据的所有者为所述第一锁客户端节点的响应消息还包括: 向所 述第二锁客户端节点返回加锁成功响应消息。
通过将分条数据所有者服务器节点集成在锁服务器节点设备中, 在锁 客户端节点向分条数据所有者服务器节点发送读锁通知或者写锁通知时, 顺便携带了读锁请求或写锁请求发送给锁服务器节点设备, 这样避免了重 新发送读锁请求或写锁请求给锁服务器节点设备, 减少了系统设备之间的 信令, 提高了分条数据的读写效率。
图 3为本发明实施例三的分布式存储系统的緩存方法的流程图。 如图 3 所示, 该方法包括以下过程。
步骤 201: 锁客户端节点向分条数据所有者服务器节点发送对于分条 数据的读锁通知或写锁通知。
步骤 202: 当所述锁客户端节点接收到所述分条数据所有者服务器节 点返回所述分条数据的所有者为所述锁客户端节点的响应消息, 所述锁客 户端节点緩存所述分条数据。
步骤 203: 当所述锁客户端节点接收所述分条数据所有者服务器节点 返回的所述分条数据的所有者的 ID,所述锁客户端节点对所述分条数据的 所有者的 ID和自身的 ID进行比较, 当两者不同时, 所述锁客户端节点从 所述分条数据的所有者的 ID对应的锁客户端节点的緩存中读取所述分条 数据。
在上述技术方案的基础上, 进一步地, 所述锁客户端节点向分条数据 所有者服务器节点发送对于分条数据的读锁通知或写锁通知之后, 在所述 锁客户端节点接收到所述分条数据所有者服务器节点返回所述分条数据 的所有者为所述锁客户端节点的响应消息之前, 还包括: 所述分条数据所 有者服务器节点在接收到对于分条数据的首次读锁通知或对于所述分条 数据的写锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者。
在上述技术方案的基础上, 进一步地, 还包括: 如果锁客户端节点向 分条数据所有者服务器节点发送的是对于分条数据的写锁通知, 所述锁客 户端节点緩存所述分条数据之后, 向所述锁服务器节点发送锁降级请求, 以使所述锁服务器节点将记录修改为所述锁客户端节点持有对所述分条 数据的读锁。 持有写锁的锁客户端节点在将该分条数据緩存到自身的操作 完成后, 如果仍一直持有写锁, 则其它锁客户端申请读锁时, 需要召回该 写锁, 在本发明实施例中, 持有写锁的锁客户端节点在将该分条数据緩存 到自身的操作完成后主动向锁服务器端发送降级请求, 将写锁降级为读 锁, 从而在其它锁客户端申请读锁时不必再次进行召回操作, 能够节省后 续发起读锁操作的时间, 提高緩存处理效率。
在上述技术方案的基础上, 进一步地, 所述锁客户端节点向分条数据 所有者服务器节点发送对于分条数据的读锁通知或写锁通知之前, 还包 括: 所述锁客户端节点接收来自应用服务器的对于所述分条数据的读请求 或写请求。 所述锁客户端节点在本地查找对于所述分条数据的读锁或写 锁。 如果查找到, 根据所述分条数据的读锁或写锁, 确定所述分条数据的 所有者, 在所述分条数据的所有者读取或写入所述分条数据。 如果没有查 找到, 执行所述锁客户端节点向分条数据所有者服务器节点发送对于分条 数据的读锁通知或写锁通知的步骤。
在上述技术方案的基础上, 进一步地, 还包括: 当所述锁客户端节点 的緩存在单位时间内的删除率大于或等于所述锁客户端节点的緩存总量 的预设比例时, 所述锁客户端节点向所述分条数据所有者服务器节点发送 将所述分条数据的所有者更改为目标锁客户端节点的请求消息, 以使所述 分条数据所有者服务器节点将所述分条数据的所有者更改为目标锁客户 端节点。 所述锁客户端节点接收所述分条数据所有者服务器节点返回的分 条数据的所有者更改成功响应消息, 删除本地緩存的所述分条数据, 并向 所述目标锁客户端节点发送所述分条数据, 以使所述目标锁客户端节点緩 存所述分条数据。 通过在各个锁客户端节点的緩存中动态迁移分条数据, 避免了由于锁客户端节点上的緩存删除频繁导致的緩存使用效率降低的 缺陷, 能够在整个系统内维护每个锁客户端节点上的緩存使用均衡, 提高 緩存的使用效率。
在上述技术方案的基础上, 进一步地, 所述锁客户端节点从所述分条 数据的所有者的 ID对应的锁客户端节点读取所述分条数据具体包括: 所 述锁客户端节点向所述分条数据的所有者的 ID对应的锁客户端节点发送 读取分条数据的请求, 以使所述分条数据的所有者的 ID对应的锁客户端 节点在本地緩存的数据中查找所述分条数据, 如果查找到, 向所述锁客户 端节点返回所述分条数据, 否则, 从所述分布式存储系统的各个锁客户端 节点读取所述分条数据的各个条带数据, 并构建成所述分条数据后, 返回 给所述锁客户端节点。
在本发明实施例三中, 当锁客户端节点对于某个分条数据发出首次读 锁通知或写锁通知时, 由该锁客户端节点緩存该分条数据, 分条数据所有 者服务器节点将该锁客户端节点记录为该分条数据的所有者; 当锁客户端 节点对于该分条数据发出非首次读锁通知时, 分条数据所有者服务器节点 根据上述记录向该锁客户端节点告知该分条数据的所有者, 该锁客户端节 点从所有者的緩存中读取分条数据。 通过分条数据所有者服务器节点对分 条数据所有者的记录和反馈, 使得同一分条数据在整个分布式存储系统中 只在其所有者緩存一次, 各个锁客户端节点均可以从所有者读取该分条数 据, 从而避免了不同的节点设备上緩存相同的分条数据的现象, 提高了节 点设备的緩存利用率。
以上通过本发明实施例二和实施例三, 对所有者服务器节点与锁服务 器节点分别设置情况下的緩存方法进行介绍。 在实际应用中, 还可以将所 有者服务器节点与锁服务器节点结合, 采用同一个节点执行所有者服务器 节点与锁服务器节点的操作。例如,可以对现有的锁服务器节点进行改进, 使其在现有的锁服务器节点执行的操作的基础上, 增加本发明实施例提出 的所有者服务器节点执行的操作。 接下来, 在本发明实施例四至实施例八 中, 以采用同一个节点执行所有者服务器节点与锁服务器节点的操作的情 况为例, 即, 采用进行了上述改进后锁服务器节点, 执行现有的锁服务器 节点的操作和本发明实施例提出的所有者服务器节点的操作, 进一步对本 发明提出的緩存方法进行说明。 在本发明实施例四至实施例八中记载的锁 服务器节点即为上述改进后锁服务器节点, 以各个锁客户端节点向锁服务 器节点发送的锁请求作为锁通知。 例如, 如果锁客户端节点向锁服务器节 点发送读锁请求,则锁服务器节点执行现有的锁服务器节点的操作,并且, 锁服务器节点还以该读锁请求作为读锁通知, 执行本发明提出的所有者服 务器节点执行的操作。 同理, 如果锁客户端节点向锁服务器节点发送写锁 请求, 则锁服务器节点执行现有的锁服务器节点的操作, 并且, 锁服务器 节点还以该写锁请求作为写锁通知, 执行本发明提出的所有者服务器节点 执行的操作。
图 4为本发明实施例四的分布式存储系统的緩存方法的信令流程图。在 本发明实施例四中, 以分布式存储系统中首次出现对于某个分条数据的读 锁操作或者在该分条数据删除后首次出现对于该分条数据的读锁操作为 例。 如图 4所示, 该方法包括以下过程。
步骤 301 : 第一锁客户端节点接收来自应用服务器的对于所述分条数 据的读请求。
在本步骤中, 第一锁客户端节点接收来自应用服务器的对于所述分条 数据的读请求后, 第一锁客户端节点在本地查找对于所述分条数据的读 锁, 如果查找到, 在所述分条数据的读锁中指示的作为所述分条数据的所 有者的锁客户端节点读取所述分条数据。 否则, 执行步骤 302。 在本发明 实施例中, 以未查找到为例。
步骤 302: 第一锁客户端节点向锁服务器节点发送对于分条数据的读 锁请求。
步骤 303 : 锁服务器节点将第一锁客户端节点记录为分条数据的所有 者。
在本步骤中, 锁服务器节点在收到申请分条锁请求后, 首先检查记录 中是否存在该分条数据的属性信息, 如果不存在, 则生成该分条数据的属 性信息的记录, 否则, 检查该分条数据的属性信息中持有锁的锁客户端节 点的信息。 在本发明实施例四中, 以不存在该分条数据的属性信息为例。 如果不存在该分条数据的相关记录, 说明是系统首次申请或被删除后首次 申请该分条数据, 锁服务器节点将本次申请锁的第一锁客户端节点的相关 信息加入到记录中, 将第一锁客户端节点记录为分条数据的所有者, 将第 一锁客户端节点对应的所有者的 ID记录为第一锁客户端节点的 ID。
步骤 304: 锁服务器节点向第一锁客户端节点返回分条数据的所有者 为第一锁客户端节点的响应消息。
在本步骤中, 锁服务器节点向第一锁客户端节点返回加锁成功响应消 息, 在该消息中向第一锁客户端节点返回分条数据的所有者为第一锁客户 端节点, 同时第一锁客户端节点在自身记录该分条数据的所有者 ID为第 一锁客户端节点 ID。
步骤 305: 第一锁客户端节点緩存分条数据。
在本步骤中, 第一锁客户端节点根据锁服务器节点返回的信息, 发现 第一锁客户端节点自身就是该分条数据的所有者, 第一锁客户端节点从本 地全局统一緩存中申请緩存空间, 从其它锁客户端节点读取该分条数据的 条带数据, 在本地的全局统一緩存中构建该分条数据, 并向应用服务器返 回该分条数据。 其中, 第一锁客户端节点从其它锁客户端节点读取该分条 数据的条带数据, 根据该条带数据构建该分条数据, 在构建完成后, 可以 获得该条带数据以及该条带数据的冗余数据, 在本发明实施例四中, 在第 一锁客户端节点中仅緩存该分条数据, 而不緩存该分条数据的冗余数据, 从 而进一步提高了分布式存储系统中节点的緩存利用率。 在本发明其它各个实 施例中, 某个锁客户端节点緩存分条数据时, 均可以采用此方式进行緩存。
图 5为本发明实施例五的分布式存储系统的緩存方法的信令流程图。在 本发明实施例五中, 以分布式存储系统中的第一锁客户端节点緩存过某分 条数据后, 第二锁客户端节点再次读该分条数据为例。 如图 5所示, 该方 法包括以下过程。
步骤 401 : 第二锁客户端节点接收来自应用服务器的对于所述分条数 据的读请求。
在本步骤中, 第二锁客户端节点在本地查找对于所述分条数据的读 锁, 如果查找到, 在所述分条数据的读锁中指示的作为所述分条数据的所 有者的锁客户端节点读取所述分条数据。 否则, 执行步骤 402。 在本发明 实施例中, 以未查找到为例。
步骤 402: 第二锁客户端节点向锁服务器节点发送对于分条数据的读 锁请求。 步骤 403: 锁服务器节点向所述第二锁客户端节点返回所述分条数据 的所有者为所述第一锁客户端节点的响应消息。
在本步骤中, 锁服务器节点在收到申请分条锁请求后, 首先检查记录 中是否存在该分条数据的属性信息, 如果不存在, 则生成该分条数据的属 性信息的记录, 否则, 检查该分条数据的属性信息中持有锁的锁客户端节 点的信息。 在本发明实施例五中, 以存在为例。 如果存在该分条数据的属 性信息, 根据该属性信息能够获知该分条数据的所有者, 以该分条数据的 所有者为第一锁客户端节点为例, 锁服务器节点将本次申请锁的第二锁客 户端节点的 ID加入到该分条数据的属性信息中, 将第二锁客户端节点的 ID对应的所有者标志设置为表示非所有者的预设值。锁服务器节点向第二 锁客户端节点返回加锁成功响应消息, 在该消息中向所述第二锁客户端节 点返回所述分条数据的所有者为所述第一锁客户端节点。
在步骤 403之后, 所述第二锁客户端节点从所述第一锁客户端节点的 緩存中读取所述分条数据, 具体包括下述步骤。
步骤 404: 第二锁客户端节点向第一锁客户端节点发送分条数据读请 求。
在本步骤中, 第二锁客户端节点根据锁服务器节点返回的信息, 在自 身记录该分条数据的所有者 ID为第一锁客户端节点 ID , 第二锁客户端节 点获知自己不是该分条数据的所有者, 则生成分条数据读请求并将该请求 通过后端网络发送到第一锁客户端节点上。
步骤 405: 第一锁客户端节点向第二锁客户端节点返回分条数据。 在本步骤中, 第一锁客户端节点收到第二锁客户端节点的读分条数据 请求后, 从本地全局统一緩存获取该分条数据, 并直接返回给第二锁客户 端节点。
步骤 406: 第二锁客户端节点向应用服务器返回分条数据。
在本步骤中, 第二锁客户端节点收到第一锁客户端节点的读数据响应 后发送给应用服务器。
图 6为本发明实施例六的分布式存储系统的緩存方法的信令流程图。在 本发明实施例六中, 以分布式存储系统中在第一锁客户端节点緩存某个分 条数据后, 第二锁客户端节点写该分条数据的操作为例。 如图 6所示, 该 方法包括以下过程。
步骤 501 : 第二锁客户端节点接收来自应用服务器的对于所述分条数 据的写请求。
在本步骤中, 当第二锁客户端节点收到应用服务器的对于分条数据的 写请求时, 第二锁客户端节点在本地查找对于所述分条数据的写锁, 如果 查找到, 在所述分条数据的写锁中指示的作为所述分条数据的所有者的锁 客户端节点写入所述分条数据。 对于第二锁客户端节点在本地查找到对于 所述分条数据的写锁的情况, 说明在此时刻之前, 锁服务器节点已经将所 述分条数据的写锁授予第二锁客户端节点, 根据写锁的授予原则, 锁服务 器节点会在向第二锁客户端节点授予写锁之前, 将该分条数据的所有者记 录为第二锁客户端节点, 因此, 在第二锁客户端节点在本地查找到对于所 述分条数据的写锁的情况下, 分条数据的写锁中指示的作为所述分条数据 的所有者的锁客户端节点为第二锁客户端节点自身。 如果第二锁客户端节 点在本地未查找对于所述分条数据的写锁, 则执行步骤 502。 在本发明实 施例中, 以未查找到为例。
步骤 502: 第二锁客户端节点向锁服务器节点发送对于分条数据的写 锁请求。
步骤 503 : 锁服务器节点向第一锁客户端节点发送对于分条数据的锁 召回请求。
在本步骤中, 锁服务器节点在收到申请分条锁请求后, 首先检查记录 中是否存在该分条数据的属性信息, 如果不存在, 则生成该分条数据的属 性信息的记录, 否则, 检查该分条数据的属性信息中持有锁的锁客户端节 点的信息。 在本发明实施例六中, 以存在所述分条数据的属性信息为例。 如果存在该分条数据的属性信息, 根据该属性信息能够获知持有该分条数 据的读锁或写锁的锁客户端节点, 锁服务器节点生成召回这些锁客户端节 点上持有的该分条数据锁的请求, 并发送到相应的锁客户端节点上。 以第 一锁客户端节点持有该分条数据的读锁为例, 锁服务器节点生成召回第一 锁客户端节点持有的该分条数据的读锁的请求, 并发送到第一锁客户端节 点。
步骤 504: 第一锁客户端节点向锁服务器节点返回锁召回成功响应消 在本步骤中, 第一锁客户端节点收到锁召回请求后, 首先检查该锁是 否还在使用。 如果不在使用, 则直接向锁服务器节点返回锁召回成功响应 消息; 如果在使用, 则等待释放锁后, 向锁服务器节点返回锁召回成功响 应消息。 如果第一锁客户端节点为该分条数据的所有者, 则在发送锁召回 成功响应消息之前, 首先将第一锁客户端节点上的全局统一緩存的分条数 据从全局统一緩存中删除掉。
步骤 505 : 锁服务器节点将第二锁客户端节点记录为分条数据的所有 者。
在本步骤中, 锁服务器节点收到锁召回成功响应消息后, 在分条数据 属性信息中记录申请该分条数据写锁的第二锁客户端节点, 将第二锁客户 端节点记录为分条数据的所有者。
步骤 506: 锁服务器节点向第二锁客户端节点返回加分条写锁成功响 应消息。
步骤 507: 第二锁客户端节点緩存分条数据。
在本步骤中, 第二锁客户端节点收到锁服务器节点的加分条写锁成功 响应消息后, 从本地的全局统一緩存中申请分条緩存空间, 然后从应用服 务器接收需要写入的该分条数据并存储到本地的全局统一緩存中, 然后, 将该分条数据的各个条带数据写到对应的锁客户端节点中。
步骤 508: 第二锁客户端节点向锁服务器节点发送锁降级请求。
在本步骤中, 第二锁客户端节点写数据成功后, 生成锁降级请求并将 该请求发送到锁服务器节点上, 该请求表示将对于该分条数据的写锁降级 为读锁。
步骤 509: 锁服务器节点将记录修改为第二锁客户端节点持有对分条 数据的读锁。
在本步骤中, 锁服务器节点收到锁降级请求后, 将第二锁客户端节点 对应的锁类型由"写锁 "修改为 "读锁"。
图 7为本发明实施例七的分布式存储系统的緩存方法的信令流程图。在 本发明实施例七中,以作为所有者的第一锁客户端节点由于緩存不足导致数 据被删除时, 第二锁客户端节点从第一锁客户端节点读取分条数据的操作 为例。 如图 7所示, 该方法包括以下过程。
步骤 601 : 第二锁客户端节点向第一锁客户端节点发送分条数据的读 请求。
在本步骤中, 以第一锁客户端节点作为某分条数据的所有者, 第二锁 客户端节点在读该分条数据的过程中, 获知第一锁客户端节点为所有者 后, 向第一锁客户端节点发送分条数据的读请求, 请求读取从第一锁客户 端节点緩存的分条数据。
步骤 602: 第一锁客户端节点在緩存的资源中查找分条数据。
在本步骤中, 第一锁客户端节点首先在緩存的资源中查找该分条数 据。 如果能够查找到, 则直接向第二锁客户端节点返回该分条数据; 如果 未查找到, 说明第一锁客户端节点的数据发生删除现象, 该分条数据已被 删除, 则继续执行步骤 603。 在本发明实施例七中, 以未查找到为例。
步骤 603 : 第一锁客户端节点从分布式存储系统的各个锁客户端节点 读取分条数据的各个条带数据并构建成分条数据。
在本步骤中, 如果第一锁客户端节点在自己的緩存中没有查找到分条 数据, 则由第一锁客户端节点主动向分布式存储系统的各个锁客户端节点 发起读数据请求, 从各个锁客户端节点读取分条数据的各个条带数据, 并 在第一锁客户端节点自身的全局緩存中重构该分条数据。
步骤 604: 第一锁客户端节点向第二锁客户端节点返回分条数据。 图 8为本发明实施例八的分布式存储系统的緩存方法的信令流程图。在 本发明实施例八中, 以在锁客户端节点之间动态迁移緩存的分条数据为 例。 在本发明实施例中, 在各个锁客户端节点的全局统一緩存中建立緩存 时, 基本原则是将分条数据緩存在首先访问该分条数据的锁客户端节点 上, 但是, 当应用服务器通过不同的锁客户端节点访问的分条数据集中緩 存在少数的锁客户端节点上时, 这些用于緩存的锁客户端节点上的緩存删 除会变得异常频繁, 严重影响緩存的价值。 将緩存中访问频率高的分条数 据称为緩存热点, 因此要想在整个系统内维护每个存储服务上的緩存使用 均衡, 必须实现热点在各个锁客户端节, ^的緩存中动态迁移。
如图 8所示, 该方法包括以下过程。
步骤 701 : 第一锁客户端节点判断是否开启动态迁移。 在本步骤中, 全局统一緩存动态热点迁移在具体实施时, 各个锁客户 端节点判断自身的緩存中是否存在緩存热点,如果存在,则开启动态迁移。 分布式存储系统中全局统一緩存中是否存在緩存热点检测方法包括: 定时 检测每个锁客户端节点的緩存在单位时间内的删除率。 当某个锁客户端节 点的緩存在单位时间内的删除率超过本锁客户端节点緩存总量的预设比 例时, 则定义为该锁客户端节点緩存存在热点, 需要进行动态热点迁移操 作, 由该锁客户端节点主动将该緩存动态迁移到其它锁客户端节点上。 例 如, 上述预设比例可以为 20 %。
第一锁客户端节点判断是否开启动态迁移, 当第一锁客户端节点的緩 存在单位时间内的删除率大于或等于所述锁客户端节点的緩存总量的预 设比例时, 第一锁客户端节点判断开启动态迁移。 在本发明实施例中, 以 第一锁客户端节点的緩存在单位时间内的删除率大于或等于所述锁客户 端节点的緩存总量的预设比例为例。
其中, 一种实现方式包括: 设置与各个锁客户端节点连接的监控器, 每个锁客户端节点向监控器定时上报单位时间内该锁客户端节点緩存的 删除率。 监控器定时统计各个锁客户端节点上的緩存使用情况, 向各个锁 客户端节点推送緩存热点信息, 该緩存热点信息中包括符合緩存热点条件 的锁客户端节点的 ID ,各个锁客户端节点根据緩存热点信息中是否包含本 锁客户端节点的 ID来判断自身是否属于緩存热点, 如果是, 则启动緩存 动态迁移任务。 该緩存动态迁移任务执行周期为锁客户端节点向监控器报 告删除率的心跳周期与监控器获取各个锁客户端节点的删除率的周期之 和, 迁移的结束条件为本次周期执行结束并且下次緩存热点信息中不包括 本锁客户端节点的 ID。
步骤 702: 第二锁客户端节点向第一锁客户端节点发送分条数据读请 求。
在本步骤中, 以分条数据的所有者为第一锁客户端节点为例。
步骤 702为可选步骤。在实现方式一中,如果在步骤 701执行完毕时, 第二锁客户端节点已经成功申请分条读锁, 则先执行步骤 702, 第二锁客 户端节点向第一锁客户端节点发送分条数据的读请求, 然后在执行步骤 703。 在实现方式二中, 如果在步骤 701 执行完毕时, 第二锁客户端节点 未成功申请分条读锁, 则不必执行步骤 702, 直接执行步骤 703。
步骤 703: 第一锁客户端节点向所述锁服务器节点发送更改所有者请 求消息。
在本步骤中, 第一锁客户端节点向所述锁服务器节点发送更改所有者 请求消息, 该消息中包括分条数据的 ID以及目标锁客户端节点的 ID, 该 消息表示请求将所述分条数据的所有者更改为目标锁客户端节点。 即, 第 一锁客户端节点向所述锁服务器节点发送将所述分条数据的所有者更改 为目标锁客户端节点的请求消息。 在本发明实施例中, 以目标锁客户端节 点为第二目标锁客户端节点为例。
在实现方式一中, 第一锁客户端节点在收到第二锁客户端节点发送的 分条数据读请求后, 生成上述更改所有者请求。 在实现方式二中, 第一锁 客户端节点主动生成上述更改所有者请求消息。
步骤 704: 锁服务器节点将所述分条数据的所有者更改为目标锁客户 端节点。
在本步骤中, 锁服务器节点收到来自第一锁客户端节点的更改所有者 请求消息后, 修改该分条数据的所有者信息, 将所述分条数据的所有者更 改为目标锁客户端节点。
步骤 705: 锁服务器节点向第一锁客户端节点返回分条数据的所有者 更改成功响应消息。
步骤 706:第一锁客户端节点向第二锁客户端节点发送所述分条数据。 在本步骤中, 如果上述过程中包括步骤 702, 则在本步骤中, 第一锁 客户端节点还向第二锁客户端节点返回读数据成功响应消息。
步骤 707: 第一锁客户端节点删除本地緩存的所述分条数据。
在本步骤中, 第一锁客户端节点收到分条数据的所有者更改成功响应 消息后, 主动删除掉第一锁客户端节点中緩存的该分条数据。
步骤 708: 第二锁客户端节点緩存所述分条数据。
在本步骤中, 如果上述过程中包括步骤 702, 第二锁客户端节点收到 读数据成功响应消息后, 将分条数据緩存到第二锁客户端节点的本地全局 緩存中, 同时向应用服务器返回响应消息。 如果上述过程中不包括步骤 702, 第一锁客户端节点将分条数据主动推送到第二锁客户端节点的本地 全局緩存中。
图 9为本发明实施例九的分条数据所有者服务器节点的结构示意图。如 图 9所示, 该分条数据所有者服务器节点至少包括: 接收单元 91、 记录单 元 92和发送单元 93。
其中,接收单元 91 , 用于接收来自锁客户端节点的对于分条数据的锁 通知;
判断单元 94 , 用于对所述锁通知进行判断;
记录单元 92 ,用于当所述判断单元判断所述锁通知为首次读锁通知或 写锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者。
发送单元 93 ,用于当所述判断单元判断在所述锁通知为首次读锁通知 或写锁通知时, 向所述锁客户端节点返回所述分条数据的所有者为所述锁 客户端节点的响应消息, 以使所述锁客户端节点緩存所述分条数据; 当所 述接收判断单元判断所述锁通知为非首次读锁通知时, 向所述锁客户端节 点返回包含所述分条数据的所有者信息的响应消息, 以使所述锁客户端节 点从所述分条数据的所有者的緩存中读取所述分条数据。
在上述技术方案的基础上, 进一步地: 所述判断单元 94 , 具体用于在 所述锁通知为读锁通知时, 在记录的分条数据的属性信息中查找所述分条 数据的属性信息, 若未查找到, 则确定所述读锁通知为首次读锁通知。
在上述技术方案的基础上, 进一步地, 所述接收单元 91还用于接收 来自所述锁客户端节点的将所述分条数据的所有者更改为另一锁客户端 节点的请求消息。 相应地, 所述记录单元 92还用于将所述分条数据的所 有者更改为所述另一锁客户端节点。 相应地, 所述发送单元 93还用于向 所述锁客户端节点返回分条数据的所有者更改成功响应消息, 以使所述锁 客户端节点删除本地緩存的所述分条数据, 并使所述另一锁客户端节点緩 存所述分条数据。
在上述技术方案的基础上, 进一步地, 当所述分条数据所有者服务器 节点集成在锁服务器节点设备中时, 所述读锁通知包括读锁请求, 所述写 锁通知包括写锁请求。 相应地, 所述发送单元 93还用于向所述锁客户端 节点返回加锁成功响应消息。
本发明实施例九的分条数据所有者服务器节点可以用于执行本发明 实施例二至本发明实施例八所述的緩存方法, 其具体实现过程和技术效果 可以参照本发明实施例二至本发明实施例八, 此处不再赘述。
图 10为本发明实施例十的锁客户端节点的结构示意图。 如图 10所示, 该锁客户端节点至少包括:发送单元 1001、接收单元 1002、比较单元 1003、 緩存单元 1004、 读写单元 1005。
其中, 发送单元 1001用于向分条数据所有者服务器节点发送对于分 条数据的读锁通知或写锁通知。
接收单元 1002用于接收所述分条数据所有者服务器节点返回所述分 条数据的所有者为所述锁客户端节点的响应消息, 或接收所述分条数据所 有者服务器节点返回的所述分条数据的所有者的 ID。
比较单元 1003用于将所述分条数据的所有者的 ID和自身的 ID进行 比较; 当两者不同时, 开启读写单元 1005。
所述緩存单元 1004用于当所述接收单元接收到所述分条数据所有者 服务器节点返回所述分条数据的所有者为所述锁客户端节点的响应消息 时, 緩存所述分条数据。
所述读写单元 1005用于从所述分条数据的所有者的 ID对应的锁客户 端节点的緩存中读取所述分条数据。
在上述技术方案的基础上, 进一步地, 所述发送单元 1001还用于向 所述锁服务器节点发送锁降级请求, 以使所述锁服务器节点将记录修改为 所述锁客户端节点持有对所述分条数据的读锁。
在上述技术方案的基础上, 进一步地, 还可以包括: 查找单元 1006。 相应地, 接收单元 1002还用于接收来自应用服务器的对于所述分条数据 的读请求或写请求。 相应地, 查找单元 1006用于在本地查找对于所述分 条数据的读锁或写锁; 如果查找到, 根据所述分条数据的读锁或写锁, 确 定所述分条数据的所有者, 开启所述读写单元 1005; 如果没有查找到, 开 启所述发送单元 1001。 相应地, 所述读写单元 1005还用于在所述分条数 据的所有者读取或写入所述分条数据。
在上述技术方案的基础上, 进一步地, 所述緩存单元 1004还用于在 单位时间内的删除率大于或等于所述锁客户端节点的緩存总量的预设比 例时, 控制所述发送单元 1001向所述分条数据所有者服务器节点发送将 所述分条数据的所有者更改为目标锁客户端节点的请求消息, 还用于根据 接收单元 1002接收的分条数据的所有者更改成功响应消息, 删除本地緩 存的所述分条数据, 控制所述发送单元 1001向所述目标锁客户端节点发 送所述分条数据。 相应地, 所述发送单元 1001还用于根据所述緩存单元 1004的控制向所述分条数据所有者服务器节点发送将所述分条数据的所 有者更改为目标锁客户端节点的请求消息, 以使所述分条数据所有者服务 器节点将所述分条数据的所有者更改为目标锁客户端节点。 所述发送单元 1001还用于根据所述緩存单元 1004的控制向所述目标锁客户端节点发送 所述分条数据, 以使所述目标锁客户端节点緩存所述分条数据。 相应地, 所述接收单元 1002还用于接收所述分条数据所有者服务器节点返回的分 条数据的所有者更改成功响应消息。
在上述技术方案的基础上, 进一步地, 所述读写单元 1005具体用于 向所述分条数据的所有者的 ID对应的锁客户端节点发送读取分条数据的 请求, 以使所述分条数据的所有者的 ID对应的锁客户端节点在本地緩存 的数据中查找所述分条数据, 如果查找到, 向所述读写单元 1005返回所 述分条数据, 否则, 从所述分布式存储系统的各个锁客户端节点读取所述 分条数据的各个条带数据, 并构建成所述分条数据后, 返回给所述读写单 元 1005。
本发明实施例十的锁客户端节点可以用于执行本发明实施例二至本 发明实施例八所述的緩存方法, 其具体实现过程和技术效果可以参照本发 明实施例二至本发明实施例八, 此处不再赘述。
图 11为本发明实施例十一的分条数据所有者服务器节点的结构示意 图。 如图 11所示, 该分条数据所有者服务器节点至少包括: 处理器 1101、 存储器 1 102、 通信接口 1103和总线 1104。 其中, 所述处理器 1101、 所述 存储器 1 102和所述通信接口 1103通过所述总线 1104通信。
所述存储器 1102用于存放程序。 具体的, 程序中可以包括程序代码, 所述程序代码包括计算机执行指令。 所述存储器 1102可以为高速 RAM存储 器, 也可以为非易失性存储器(non-volatile memory ) , 例如至少一个磁盘存 储器。
所述通信接口 1103用于与第一锁客户端节点和第二锁客户端节点通 信。
所述处理器 1101用于执行所述存储器 1102存储的执行指令, 可能为 单核或多核中央处理单元( Central Processing Unit, CPU ) , 或者为特定集成 电路(Application Specific Integrated Circuit, ASIC ) , 或者为被配置成实施 本发明实施例的一个或多个集成电路。
当分条数据所有者服务器节点运行时, 处理器 1101运行程序, 以执行 以下指令:
分条数据所有者服务器节点接收到来自锁客户端节点的对于分条数 据的锁通知, 对所述锁通知进行判断;
当所述锁通知为首次读锁通知或写锁通知时, 所述分条数据所有者服 务器节点将所述锁客户端节点记录为所述分条数据的所有者, 向所述锁客 户端节点返回所述分条数据的所有者为所述锁客户端节点的响应消息, 以 使所述锁客户端节点緩存所述分条数据;
当所述锁通知为非首次读锁通知时, 所述分条数据所有者服务器节点 向所述锁客户端节点返回包含所述分条数据的所有者信息的响应消息, 以 使所述锁客户端节点从所述分条数据的所有者的緩存中读取所述分条数 据。
本发明实施例十一的分条数据所有者服务器节点可以用于执行本发 明实施例二至本发明实施例八所述的緩存方法, 其具体实现过程和技术效 果可以参照本发明实施例二至本发明实施例八, 此处不再赘述。
图 12为本发明实施例十二的锁客户端节点的结构示意图。如图 12所示, 该锁客户端节点至少包括: 处理器 1201、 存储器 1202、 通信接口 1203和 总线 1204。其中,所述处理器 1201、所述存储器 1202和所述通信接口 1203 通过所述总线 1204通信。
所述存储器 1202用于存放程序。 具体的, 程序中可以包括程序代码, 所述程序代码包括计算机执行指令。 所述存储器 1202可以为高速 RAM存储 器, 也可以为非易失性存储器(non-volatile memory ) , 例如至少一个磁盘存 储器。
所述处理器 1201用于执行所述存储器 1202存储的执行指令, 可能为 单核或多核中央处理单元( Central Processing Unit, CPU ) , 或者为特定集成 电路 ( Application Specific Integrated Circuit, ASIC ) , 或者为被配置成实施 本发明实施例的一个或多个集成电路。
所述通信接口 1203 用于与分条数据所有者服务器节点和其它锁客户 端节点通信。
当锁客户端节点运行时, 处理器 1201运行程序, 以执行以下指令: 向分条数据所有者服务器节点发送对于分条数据的读锁通知或写锁 通知;
当所述锁客户端节点接收到所述分条数据所有者服务器节点返回所 述分条数据的所有者为所述锁客户端节点的响应消息, 所述锁客户端节点 緩存所述分条数据;
当接收所述分条数据所有者服务器节点返回的所述分条数据的所有 者的标识 ID, 所述分条数据的所有者的 ID和自身的 ID进行比较, 当两 者不同时, 从所述分条数据的所有者的 ID对应的锁客户端节点的緩存中 读取所述分条数据。
本发明实施例十二的锁客户端节点可以用于执行本发明实施例二至 本发明实施例八所述的緩存方法, 其具体实现过程和技术效果可以参照本 发明实施例二至本发明实施例八, 此处不再赘述。
本发明实施例十三提供一种计算机可读介质, 包含计算机执行指令, 所述计算机执行指令用于使分条数据所有者服务器节点本发明实施例二 所述的方法。
本发明实施例十四提供一种计算机可读介质, 包含计算机执行指令, 所述计算机执行指令用于使锁客户端节点执行本发明实施例三所述的方 法。 需要说明的是: 对于前述的各方法实施例, 为了简单描述, 故将其都表 述为一系列的动作组合, 但是本领域技术人员应该知悉, 本发明并不受所描 述的动作顺序的限制, 因为依据本发明, 某些步骤可以采用其他顺序或者同 时进行。 其次, 本领域技术人员也应该知悉, 说明书中所描述的实施例均属 于优选实施例, 所涉及的动作和模块并不一定是本发明所必须的。
在上述实施例中, 对各个实施例的描述都各有侧重, 某个实施例中没有 详述的部分, 可以参见其他实施例的相关描述。
通过以上的实施方式的描述, 所属领域的技术人员可以清楚地了解到 本发明可以用硬件实现, 或固件实现, 或它们的组合方式来实现。 当使用 软件实现时, 可以将上述功能存储在计算机可读介质中或作为计算机可读 介质上的一个或多个指令或代码进行传输。 计算机可读介质包括计算机存 储介质和通信介质, 其中通信介质包括便于从一个地方向另一个地方传送 计算机程序的任何介质。 存储介质可以是计算机能够存取的任何可用介 质。以此为例但不限于:计算机可读介质可以包括 RAM、 ROM, EEPROM、 CD-ROM或其他光盘存储、磁盘存储介质或者其他磁存储设备、 或者能够 用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计 算机存取的任何其他介质。 此外。 任何连接可以适当的成为计算机可读介 质。例如,如果软件是使用同轴电缆、光纤光缆、双绞线、数字用户线( DSL ) 或者诸如红外线、 无线电和微波之类的无线技术从网站、 服务器或者其他 远程源传输的, 那么同轴电缆、 光纤光缆、双绞线、 DSL或者诸如红外线、 无线和微波之类的无线技术包括在所属介质的定影中。 如本发明所使用 的, 盘 (Disk ) 和碟(disc ) 包括压缩光碟(CD ) 、 激光碟、 光碟、 数字 通用光碟(DVD ) 、 软盘和蓝光光碟, 其中盘通常磁性的复制数据, 而碟 则用激光来光学的复制数据。 上面的组合也应当包括在计算机可读介质的 保护范围之内。
最后应说明的是: 以上实施例仅用以说明本发明的技术方案, 而非对其 限制; 尽管参照前述实施例对本发明进行了详细的说明, 本领域的普通技术 人员应当理解: 其依然可以对前述各实施例所记载的技术方案进行修改, 或 者对其中部分技术特征进行等同替换; 而这些修改或者替换, 并不使相应技 术方案的本质脱离本发明各实施例技术方案的精神和范围。

Claims

权 利 要 求 书
1、 一种分布式存储系统的緩存方法, 其特征在于, 包括:
分条数据所有者服务器节点接收到来自锁客户端节点的对于某分条 数据的锁通知, 对所述锁通知进行判断;
当所述锁通知为首次读锁通知或写锁通知时, 所述分条数据所有者服 务器节点将所述锁客户端节点记录为所述分条数据的所有者, 向所述锁客 户端节点返回所述分条数据的所有者为所述锁客户端节点的响应消息, 以 使所述锁客户端节点緩存所述分条数据;
当所述锁通知为非首次读锁通知时, 所述分条数据所有者服务器节点 向所述锁客户端节点返回包含所述分条数据的所有者信息的响应消息, 以 使所述锁客户端节点从所述分条数据的所有者的緩存中读取所述分条数 据。
2、 根据权利要求 1所述的方法, 其特征在于, 所述分条数据所有者 服务器节点接收到来自锁客户端节点的对于分条数据的锁通知之后, 还包 括:
当所述锁通知为读锁通知时, 所述分条数据所有者服务器节点在记录 的分条数据的属性信息中查找所述分条数据的属性信息, 若未查找到, 则 确定所述读锁通知为首次读锁通知。
3、 根据权利要求 1或 2所述的方法, 其特征在于, 所述向所述客户 端节点锁客户端节点返回所述分条数据的所有者为所述锁客户端节点的 响应消息之后还包括:
所述分条数据所有者服务器节点接收来自所述锁客户端节点的将所 述分条数据的所有者更改为另一锁客户端节点的请求消息;
所述分条数据所有者服务器节点将所述分条数据的所有者更改为所 述另一锁客户端节点;
所述分条数据所有者服务器节点向所述锁客户端节点返回分条数据 的所有者更改成功响应消息, 以使所述锁客户端节点删除本地緩存的所述 分条数据, 并使所述另一锁客户端节点緩存所述分条数据。
4、 根据权利要求 1至 3中任意一项所述的方法, 其特征在于, 当所 述分条数据所有者服务器节点集成在锁服务器节点设备中时, 所述读锁通知包括读锁请求;
所述写锁通知包括写锁请求;
所述向所述锁客户端节点返回所述分条数据的所有者为所述锁客户 端节点的响应消息还包括: 向所述锁客户端节点返回加锁成功响应消息; 所述向所述锁客户端节点返回包含所述分条数据的所有者信息的响 应消息还包括: 向所述锁客户端节点返回加锁成功响应消息。
5、 一种分布式存储系统的緩存方法, 其特征在于, 包括:
锁客户端节点向分条数据所有者服务器节点发送对于分条数据的读 锁通知或写锁通知;
当所述锁客户端节点接收到所述分条数据所有者服务器节点返回所 述分条数据的所有者为所述锁客户端节点的响应消息, 所述锁客户端节点 緩存所述分条数据;
当所述锁客户端节点接收所述分条数据所有者服务器节点返回的所 述分条数据的所有者的标识 ID,所述锁客户端节点将所述分条数据的所有 者的 ID和自身的 ID进行比较, 当两者不同时, 所述锁客户端节点从所述 分条数据的所有者的 ID对应的锁客户端节点的緩存中读取所述分条数据。
6、 根据权利要求 5所述的方法, 其特征在于, 在所述锁客户端节点 向分条数据所有者服务器节点发送对于分条数据的读锁通知或写锁通知 之后, 所述锁客户端节点接收到所述分条数据所有者服务器节点返回所述 分条数据的所有者为所述锁客户端节点的响应消息之前, 还包括:
所述分条数据所有者服务器节点在接收到对于分条数据的首次读锁 通知或写锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者。
7、 根据权利要求 5或 6所述的方法, 其特征在于, 还包括: 如果锁客户端节点向分条数据所有者服务器节点发送的是对于分条 数据的写锁通知, 所述锁客户端节点緩存所述分条数据之后, 向所述锁服 务器节点发送锁降级请求, 以使所述锁服务器节点将记录修改为所述锁客 户端节点持有对所述分条数据的读锁。
8、 根据权利要求 5至 7中任意一项所述的方法, 其特征在于, 所述 锁客户端节点向分条数据所有者服务器节点发送对于分条数据的读锁通 知或写锁通知之前, 还包括: 所述锁客户端节点接收来自应用服务器的对于所述分条数据的读请 求或写请求;
所述锁客户端节点在本地查找对于所述分条数据的读锁或写锁; 如果查找到, 根据所述分条数据的读锁或写锁, 确定所述分条数据的 所有者, 在所述分条数据的所有者读取或写入所述分条数据;
如果没有查找到, 执行所述锁客户端节点向分条数据所有者服务器节 点发送对于分条数据的读锁通知或写锁通知的步骤。
9、 根据权利要求 5至 8中任意一项所述的方法, 其特征在于, 还包 括:
当所述锁客户端节点的緩存在单位时间内的删除率大于或等于所述 锁客户端节点的緩存总量的预设比例时, 所述锁客户端节点向所述分条数 据所有者服务器节点发送将所述分条数据的所有者更改为目标锁客户端 节点的请求消息, 以使所述分条数据所有者服务器节点将所述分条数据的 所有者更改为目标锁客户端节点;
所述锁客户端节点接收所述分条数据所有者服务器节点返回的分条 数据的所有者更改成功响应消息, 删除本地緩存的所述分条数据, 并向所 述目标锁客户端节点发送所述分条数据, 以使所述目标锁客户端节点緩存 所述分条数据。
10、 根据权利要求 5至 9中任意一项所述的方法, 其特征在于, 所述 锁客户端节点从所述分条数据的所有者的 ID对应的锁客户端节点的緩存 中读取所述分条数据具体包括:
所述锁客户端节点向所述分条数据的所有者的 ID对应的锁客户端节 点发送读取分条数据的请求, 以使所述分条数据的所有者的 ID对应的锁 客户端节点在本地緩存的数据中查找所述分条数据, 如果查找到, 向所述 锁客户端节点返回所述分条数据, 否则, 从所述分布式存储系统的各个锁 客户端节点读取所述分条数据的各个条带数据, 并构建成所述分条数据 后, 返回给所述锁客户端节点。
11、 一种分条数据所有者服务器节点, 其特征在于, 包括:
接收单元, 用于接收来自锁客户端节点的对于某分条数据的锁通知; 判断单元, 用于对所述锁通知进行判断; 记录单元, 用于当所述判断单元判断所述锁通知为首次读锁通知或写 锁通知时, 将所述锁客户端节点记录为所述分条数据的所有者;
发送单元, 用于当所述接收判断单元判断在所述锁通知为首次读锁通 知或写锁通知时, 向所述锁客户端节点返回所述分条数据的所有者为所述 锁客户端节点的响应消息, 以使所述锁客户端节点緩存所述分条数据; 当 所述接收判断单元判断所述锁通知为非首次读锁通知时, 向所述锁客户端 节点返回包含所述分条数据的所有者信息的响应消息, 以使所述锁客户端 节点从所述分条数据的所有者的緩存中读取所述分条数据。
12、 根据权利要求 1 1所述的节点, 其特征在于, 所述判断单元, 具 体用于在所述锁通知为读锁通知时, 在记录的分条数据的属性信息中查找 所述分条数据的属性信息, 若未查找到, 则确定所述读锁通知为首次读锁 通知。
13、 根据权利要求 1 1或 12所述的节点, 其特征在于,
所述接收单元, 还用于接收来自所述锁客户端节点的将所述分条数据 的所有者更改为另一锁客户端节点的请求消息;
所述记录单元, 还用于将所述分条数据的所有者更改为所述另一锁客 户端节点;
所述发送单元, 还用于向所述锁客户端节点返回分条数据的所有者更 改成功响应消息, 以使所述锁客户端节点删除本地緩存的所述分条数据, 并使所述另一锁客户端节点緩存所述分条数据。
14、 根据权利要求 1 1至 13中任意一项所述的节点, 其特征在于, 当所述分条数据所有者服务器节点集成在锁服务器节点设备中时, 所述读锁通知包括读锁请求;
所述写锁通知包括写锁请求;
所述发送单元, 还用于向所述锁客户端节点返回加锁成功响应消息。
15、 一种锁客户端节点, 其特征在于, 包括:
发送单元, 用于向分条数据所有者服务器节点发送对于分条数据的读 锁通知或写锁通知;
接收单元, 用于接收所述分条数据所有者服务器节点返回所述分条数 据的所有者为所述锁客户端节点的响应消息, 或接收所述分条数据所有者 服务器节点返回的所述分条数据的所有者的标识 ID;
比较单元,用于将所述分条数据的所有者的 ID和自身的 ID进行比较; 当两者不同时, 开启读写单元;
所述緩存单元, 用于当所述接收单元接收到所述分条数据所有者服务 器节点返回所述分条数据的所有者为所述锁客户端节点的响应消息时, 緩 存所述分条数据;
所述读写单元, 用于从所述分条数据的所有者的 ID对应的锁客户端 节点的緩存中读取所述分条数据。
16、 根据权利要求 15所述的节点, 其特征在于,
所述发送单元, 还用于向所述锁服务器节点发送锁降级请求, 以使所 述锁服务器节点将记录修改为所述锁客户端节点持有对所述分条数据的 读锁。
17、 根据权利要求 15或 16所述的节点, 其特征在于, 还包括: 查找 单元;
所述接收单元, 还用于接收来自应用服务器的对于所述分条数据的读 请求或写请求;
所述查找单元, 用于在本地查找对于所述分条数据的读锁或写锁; 如 果查找到,根据所述分条数据的读锁或写锁,确定所述分条数据的所有者, 开启所述读写单元; 如果没有查找到, 开启所述发送单元;
所述读写单元, 还用于在所述分条数据的所有者读取或写入所述分条 数据。
18、 根据权利要求 15至 17中任意一项所述的节点, 其特征在于, 所述緩存单元, 还用于在单位时间内的删除率大于或等于所述锁客户 端节点的緩存总量的预设比例时, 控制所述发送单元向所述分条数据所有 者服务器节点发送将所述分条数据的所有者更改为目标锁客户端节点的 请求消息; 还用于根据接收单元接收的分条数据的所有者更改成功响应消 息, 删除本地緩存的所述分条数据, 控制所述发送单元向所述目标锁客户 端节点发送所述分条数据;
所述发送单元, 还用于根据所述緩存单元的控制向所述分条数据所有 者服务器节点发送将所述分条数据的所有者更改为目标锁客户端节点的 请求消息, 以使所述分条数据所有者服务器节点将所述分条数据的所有者 更改为目标锁客户端节点; 还用于根据所述緩存单元的控制向所述目标锁 客户端节点发送所述分条数据, 以使所述目标锁客户端节点緩存所述分条 数据;
所述接收单元, 还用于接收所述分条数据所有者服务器节点返回的分 条数据的所有者更改成功响应消息。
19、 根据权利要求 15至 18中任意一项所述的节点, 其特征在于, 所述读写单元, 具体用于向所述分条数据的所有者的 ID对应的锁客 户端节点发送读取分条数据的请求, 以使所述分条数据的所有者的 ID对 应的锁客户端节点在本地緩存的数据中查找所述分条数据, 如果查找到, 向所述读写单元返回所述分条数据, 否则, 从所述分布式存储系统的各个 锁客户端节点读取所述分条数据的各个条带数据, 并构建成所述分条数据 后, 返回给所述读写单元。
20、 一种分条数据所有者服务器节点, 其特征在于, 包括: 处理器、 存储器、 通信接口和总线; 所述处理器、 所述存储器和所述通信接口通过 所述总线通信; 所述存储器用于存储执行指令, 所述通信接口用于与第一 锁客户端节点和第二锁客户端节点通信; 当所述分条数据所有者服务器节 点运行时, 所述处理器执行所述存储器存储的所述执行指令, 以使所述分 条数据所有者服务器节点执行如权利要求 1-4中任一所述的分布式存储系 统的緩存方法。
21、 一种锁客户端节点, 其特征在于, 包括: 处理器、 存储器、 通信 接口和总线;所述处理器、所述存储器和所述通信接口通过所述总线通信; 所述存储器用于存放程序存储执行指令; 所述通信接口用于与分条数据所 有者服务器节点和其他锁客户端节点通信; 当所述锁客户端节点运行时, 所述处理器执行所述存储器存储的所述执行指令, 以使所述分条数据所有 者服务器节点执行如权利要求 5-10中任一所述的分布式存储系统的緩存 方法。
22、 一种计算机可读介质, 包含计算机执行指令, 所述计算机执行指 令用于使分条数据所有者服务器节点执行权利要求 1至 4任一所述的方 法。
23、 一种计算机可读介质, 包含计算机执行指令, 所述计算机执行指 令用于使锁客户端节点执行权利要求 5至 10任一所述的方法。
PCT/CN2012/087842 2012-12-28 2012-12-28 分布式存储系统的缓存方法、节点和计算机可读介质 WO2014101108A1 (zh)

Priority Applications (7)

Application Number Priority Date Filing Date Title
PCT/CN2012/087842 WO2014101108A1 (zh) 2012-12-28 2012-12-28 分布式存储系统的缓存方法、节点和计算机可读介质
CA2896123A CA2896123C (en) 2012-12-28 2012-12-28 Caching method for distributed storage system, a lock server node, and a lock client node
EP12891167.4A EP2830284B1 (en) 2012-12-28 2012-12-28 Caching method for distributed storage system, node and computer readable medium
CN201280003290.4A CN103392167B (zh) 2012-12-28 2012-12-28 分布式存储系统的缓存方法、节点
AU2012398211A AU2012398211B2 (en) 2012-12-28 2012-12-28 Caching method for distributed storage system, a lock server node, and a lock client node
JP2015514321A JP6301318B2 (ja) 2012-12-28 2012-12-28 分散ストレージシステムのためのキャッシュ処理方法、ノード及びコンピュータ可読媒体
US14/509,471 US9424204B2 (en) 2012-12-28 2014-10-08 Caching method for distributed storage system, a lock server node, and a lock client node

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/087842 WO2014101108A1 (zh) 2012-12-28 2012-12-28 分布式存储系统的缓存方法、节点和计算机可读介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/509,471 Continuation US9424204B2 (en) 2012-12-28 2014-10-08 Caching method for distributed storage system, a lock server node, and a lock client node

Publications (1)

Publication Number Publication Date
WO2014101108A1 true WO2014101108A1 (zh) 2014-07-03

Family

ID=49535837

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/087842 WO2014101108A1 (zh) 2012-12-28 2012-12-28 分布式存储系统的缓存方法、节点和计算机可读介质

Country Status (7)

Country Link
US (1) US9424204B2 (zh)
EP (1) EP2830284B1 (zh)
JP (1) JP6301318B2 (zh)
CN (1) CN103392167B (zh)
AU (1) AU2012398211B2 (zh)
CA (1) CA2896123C (zh)
WO (1) WO2014101108A1 (zh)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103559319B (zh) * 2013-11-21 2017-07-07 华为技术有限公司 分布式集群文件系统的缓存同步方法和设备
CN104113587B (zh) * 2014-06-23 2017-08-01 华中科技大学 一种分布式文件系统客户端元数据缓存优化方法
CN104536916B (zh) * 2014-12-18 2018-04-10 华为技术有限公司 一种多核系统的仲裁方法及多核系统
CN107844268B (zh) * 2015-06-04 2021-09-14 华为技术有限公司 一种数据分发方法、数据存储方法、相关装置以及系统
US20170097887A1 (en) * 2015-10-02 2017-04-06 Netapp, Inc. Storage Controller Cache Having Reserved Parity Area
CN105183670B (zh) * 2015-10-27 2018-11-27 北京百度网讯科技有限公司 用于分布式缓存系统的数据处理方法和装置
CA2963365C (en) 2015-12-31 2019-02-05 Lei Chen Data write method, apparatus, and system
CN105573682B (zh) * 2016-02-25 2018-10-30 浪潮(北京)电子信息产业有限公司 一种san存储系统及其数据读写方法
CN107239474B (zh) * 2016-03-29 2021-05-04 创新先进技术有限公司 一种数据记录方法及装置
CN106156334B (zh) * 2016-07-06 2019-11-22 益佳科技(北京)有限责任公司 内存数据处理设备及内存数据处理方法
US10191854B1 (en) * 2016-12-06 2019-01-29 Levyx, Inc. Embedded resilient distributed dataset systems and methods
CN106850856A (zh) * 2017-03-28 2017-06-13 南京卓盛云信息科技有限公司 一种分布式存储系统及其同步缓存方法
CN107239235B (zh) * 2017-06-02 2020-07-24 苏州浪潮智能科技有限公司 一种多控多活raid同步方法及系统
CN107330061B (zh) * 2017-06-29 2021-02-02 苏州浪潮智能科技有限公司 一种基于分布式存储的文件删除方法及装置
CN107608626B (zh) * 2017-08-16 2020-05-19 华中科技大学 一种基于ssd raid阵列的多级缓存及缓存方法
CN107623722A (zh) * 2017-08-21 2018-01-23 云宏信息科技股份有限公司 一种远端数据缓存方法、电子设备及存储介质
US10685010B2 (en) 2017-09-11 2020-06-16 Amazon Technologies, Inc. Shared volumes in distributed RAID over shared multi-queue storage devices
US10365980B1 (en) * 2017-10-31 2019-07-30 EMC IP Holding Company LLC Storage system with selectable cached and cacheless modes of operation for distributed storage virtualization
US10474545B1 (en) * 2017-10-31 2019-11-12 EMC IP Holding Company LLC Storage system with distributed input-output sequencing
US11209997B2 (en) 2017-11-22 2021-12-28 Blackberry Limited Method and system for low latency data management
US10831670B2 (en) * 2017-11-22 2020-11-10 Blackberry Limited Method and system for low latency data management
CN110413217B (zh) * 2018-04-28 2023-08-11 伊姆西Ip控股有限责任公司 管理存储系统的方法、设备和计算机程序产品
CN110347516B (zh) * 2019-06-27 2023-03-24 河北科技大学 一种面向细粒度读写锁的软件自动重构方法及装置
CN110442558B (zh) * 2019-07-30 2023-12-29 深信服科技股份有限公司 数据处理方法、分片服务器、存储介质及装置
US20210232442A1 (en) * 2020-01-29 2021-07-29 International Business Machines Corporation Moveable distributed synchronization objects
CN111651464B (zh) * 2020-04-15 2024-02-23 北京皮尔布莱尼软件有限公司 数据处理方法、系统及计算设备
US20230146076A1 (en) * 2021-11-08 2023-05-11 Rubrik, Inc. Backing file system with cloud object store
CN114860167A (zh) * 2022-04-29 2022-08-05 重庆紫光华山智安科技有限公司 数据存储方法、装置、电子设备及存储介质
WO2024026784A1 (zh) * 2022-08-04 2024-02-08 华为技术有限公司 事务处理方法、装置、节点及计算机可读存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030093524A1 (en) * 2001-11-13 2003-05-15 Microsoft Corporation Method and system for locking resources in a distributed environment
CN101706802A (zh) * 2009-11-24 2010-05-12 成都市华为赛门铁克科技有限公司 一种数据写入、修改及恢复的方法、装置及服务器
CN102387204A (zh) * 2011-10-21 2012-03-21 中国科学院计算技术研究所 维护集群缓存一致性的方法及系统

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3596021B2 (ja) * 1994-03-04 2004-12-02 三菱電機株式会社 データリンク情報制御方式
US7200623B2 (en) * 1998-11-24 2007-04-03 Oracle International Corp. Methods to perform disk writes in a distributed shared disk system needing consistency across failures
US6446237B1 (en) * 1998-08-04 2002-09-03 International Business Machines Corporation Updating and reading data and parity blocks in a shared disk system
US6148414A (en) * 1998-09-24 2000-11-14 Seek Systems, Inc. Methods and systems for implementing shared disk array management functions
US6490615B1 (en) * 1998-11-20 2002-12-03 International Business Machines Corporation Scalable cache
US6742135B1 (en) * 2000-11-07 2004-05-25 At&T Corp. Fault-tolerant match-and-set locking mechanism for multiprocessor systems
JP2002251313A (ja) * 2001-02-23 2002-09-06 Fujitsu Ltd キャッシュサーバ及び分散キャッシュサーバシステム
US6757790B2 (en) * 2002-02-19 2004-06-29 Emc Corporation Distributed, scalable data storage facility with cache memory
US7200715B2 (en) * 2002-03-21 2007-04-03 Network Appliance, Inc. Method for writing contiguous arrays of stripes in a RAID storage system using mapped block writes
US6990560B2 (en) * 2003-01-16 2006-01-24 International Business Machines Corporation Task synchronization mechanism and method
US8543781B2 (en) * 2004-02-06 2013-09-24 Vmware, Inc. Hybrid locking using network and on-disk based schemes
EP1782244A4 (en) * 2004-07-07 2010-01-20 Emc Corp SYSTEMS AND METHODS FOR IMPLEMENTING DISTRIBUTED CACHED MEMORY COHERENCE
JP5408140B2 (ja) 2008-10-23 2014-02-05 富士通株式会社 認証システム、認証サーバおよび認証方法
US8103838B2 (en) * 2009-01-08 2012-01-24 Oracle America, Inc. System and method for transactional locking using reader-lists
US8156368B2 (en) * 2010-02-22 2012-04-10 International Business Machines Corporation Rebuilding lost data in a distributed redundancy data storage system
US8103904B2 (en) * 2010-02-22 2012-01-24 International Business Machines Corporation Read-other protocol for maintaining parity coherency in a write-back distributed redundancy data storage system
JP5661355B2 (ja) * 2010-07-09 2015-01-28 株式会社野村総合研究所 分散キャッシュシステム
US9383932B2 (en) * 2013-12-27 2016-07-05 Intel Corporation Data coherency model and protocol at cluster level

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030093524A1 (en) * 2001-11-13 2003-05-15 Microsoft Corporation Method and system for locking resources in a distributed environment
CN101706802A (zh) * 2009-11-24 2010-05-12 成都市华为赛门铁克科技有限公司 一种数据写入、修改及恢复的方法、装置及服务器
CN102387204A (zh) * 2011-10-21 2012-03-21 中国科学院计算技术研究所 维护集群缓存一致性的方法及系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2830284A4 *

Also Published As

Publication number Publication date
CA2896123A1 (en) 2014-07-03
AU2012398211B2 (en) 2016-12-08
US9424204B2 (en) 2016-08-23
JP2015525392A (ja) 2015-09-03
CN103392167A (zh) 2013-11-13
CA2896123C (en) 2018-02-13
EP2830284B1 (en) 2017-03-22
AU2012398211A1 (en) 2015-07-30
EP2830284A4 (en) 2015-05-20
US20150026417A1 (en) 2015-01-22
EP2830284A1 (en) 2015-01-28
CN103392167B (zh) 2016-08-03
JP6301318B2 (ja) 2018-03-28

Similar Documents

Publication Publication Date Title
WO2014101108A1 (zh) 分布式存储系统的缓存方法、节点和计算机可读介质
US11243922B2 (en) Method, apparatus, and storage medium for migrating data node in database cluster
US9830101B2 (en) Managing data storage in a set of storage systems using usage counters
CN109783438B (zh) 基于librados的分布式NFS系统及其构建方法
CN102523279B (zh) 一种分布式文件系统及其热点文件存取方法
US10387380B2 (en) Apparatus and method for information processing
CN108363641B (zh) 一种主备机数据传递方法、控制节点以及数据库系统
CN105549905A (zh) 一种多虚拟机访问分布式对象存储系统的方法
WO2018068626A1 (zh) 一种磁盘锁的管理方法、装置和系统
JP2004199420A (ja) 計算機システム、磁気ディスク装置、および、ディスクキャッシュ制御方法
WO2016066108A1 (zh) 路由访问方法、路由访问系统及用户终端
CN109302448B (zh) 一种数据处理方法及装置
WO2012126229A1 (zh) 一种分布式缓存系统数据存取的方法及装置
US11321021B2 (en) Method and apparatus of managing mapping relationship between storage identifier and start address of queue of storage device corresponding to the storage identifier
WO2013091167A1 (zh) 日志存储方法及系统
CN109254958A (zh) 分布式数据读写方法、设备及系统
CN102510390B (zh) 利用硬盘温度自检测指导数据迁移的方法和装置
US20230205638A1 (en) Active-active storage system and data processing method thereof
WO2022083267A1 (zh) 数据处理方法、装置、计算节点以及计算机可读存储介质
WO2023273803A1 (zh) 一种认证方法、装置和存储系统
JP2004246702A (ja) 計算機システム、計算機装置、計算機システムにおけるデータアクセス方法及びプログラム
JP2010277342A (ja) 管理プログラム、管理装置および管理方法
US20220182384A1 (en) Multi-protocol lock manager for distributed lock management
CN109947704A (zh) 一种锁类型切换方法、装置及集群文件系统
CN115878584A (zh) 一种数据访问方法、存储系统及存储节点

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12891167; Country of ref document: EP; Kind code of ref document: A1)

REEP Request for entry into the european phase (Ref document number: 2012891167; Country of ref document: EP)

WWE Wipo information: entry into national phase (Ref document number: 2012891167; Country of ref document: EP)

ENP Entry into the national phase (Ref document number: 2015514321; Country of ref document: JP; Kind code of ref document: A)

ENP Entry into the national phase (Ref document number: 2896123; Country of ref document: CA)

NENP Non-entry into the national phase (Ref country code: DE)

ENP Entry into the national phase (Ref document number: 2012398211; Country of ref document: AU; Date of ref document: 20121228; Kind code of ref document: A)