WO2013005812A1 - Distributed positioning device and distributed positioning method - Google Patents


Info

Publication number
WO2013005812A1
Authority
WO
WIPO (PCT)
Prior art keywords
data item
identification information
node
algorithm
relocation
Prior art date
Application number
PCT/JP2012/067245
Other languages
French (fr)
Japanese (ja)
Inventor
徳寿 伊賀
純明 榮
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date
Filing date
Publication date
Application filed by NEC Corporation (日本電気株式会社)
Publication of WO2013005812A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval of structured data; Database structures therefor; File system structures therefor, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • the present invention relates to a distributed arrangement device that distributes data to a plurality of nodes, a distributed arrangement method, and a program therefor.
  • a distributed placement algorithm is an algorithm that determines how data items are placed.
  • a data item ID is identification information that identifies a data item.
  • the distributed key-value store is a method in which a key-value store that stores data items in a combination of a key and a value is extended so as to be shared and stored by a plurality of nodes.
  • the apparatus that executes the distributed key-value store calculates a hash value by using a hash function for each of a key used as data item identification information and node identification information (for example, IP (Internet Protocol) address).
  • the apparatus determines a data item and a node that shares storage of the data item based on the calculated hash value.
  • an example of an algorithm that determines the nodes sharing the storage of data items in this way is the consistent hashing method. There is also round robin, in which a plurality of data items are sequentially allocated to a plurality of nodes.
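The two kinds of algorithms mentioned here can be sketched in Python. This is an illustrative sketch only, not the patent's implementation: the function names and node addresses are hypothetical, and MD5 truncated to 16 bits stands in for whatever hash function a real system would use.

```python
import hashlib

def h16(s: str) -> int:
    """Map a key or node identifier onto a 16-bit hash ring (stand-in hash)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 0x10000

def consistent_hash_node(key: str, node_ids: list) -> str:
    """Consistent hashing: the node at the first ring position at or after the key's."""
    ring = sorted((h16(n), n) for n in node_ids)
    k = h16(key)
    for point, node in ring:
        if k <= point:
            return node
    return ring[0][1]  # wrap around to the first node on the ring

def round_robin(data_item_ids: list, node_ids: list) -> dict:
    """Round robin: allocate data items to nodes in turn."""
    return {d: node_ids[i % len(node_ids)] for i, d in enumerate(data_item_ids)}

nodes = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # hypothetical node IP addresses
owner = consistent_hash_node("DATA1", nodes)     # always the same specific node
placement = round_robin(["DATA1", "DATA2", "DATA3", "DATA4"], nodes)
```

With consistent hashing, adding or removing a node moves only the keys adjacent to it on the ring, which is one reason it is a common choice for distributed key-value stores.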
  • Patent Document 1 discloses a technique in which each storage device constituting a distributed storage system determines a range to be shared within the identifier space of the distributed storage system, and stores data chunks having corresponding identifiers. The storage node described in Patent Document 1 determines its shared range in the identifier space of the distributed storage system as follows.
  • the space width determination unit of the storage node determines the relative space width between the self node and the adjacent storage node based on the weight between the self node and the adjacent storage node.
  • the space allocation control unit of the storage node divides the space between the own node and the adjacent storage node, and determines a range to be shared by the own node.
  • the space between the own node and the adjacent storage node is a space whose end points are values obtained by applying a hash function to the address of the own node and the address of the adjacent storage node.
  • the storage node described in Patent Document 1 stores data chunks having corresponding identifiers as follows.
  • the storage node receives an input / output request including a file ID (for example, a value obtained by applying a hash function to the file name of the target file).
  • the storage node checks whether or not the file ID falls within the range shared by the own node. If applicable, the storage node accesses the file storage unit of its own node and performs the requested input / output processing.
  • the technique described in Patent Document 1 has a problem that it is difficult to dynamically change a distributed arrangement algorithm during operation of a distributed system. That is, the technique described in Patent Document 1 cannot dynamically change the distributed arrangement algorithm even when there is another distributed data arrangement algorithm that is more suitable for the system.
  • the reason the distributed arrangement algorithm cannot be changed is that, in the technique described in Patent Document 1, changing the distributed arrangement algorithm changes the data access logic, which makes it impossible to access data that was arranged before the change.
  • An object of the present invention is to provide a distributed arrangement apparatus, a distributed arrangement method, and a program therefor that can solve the above-described problems.
  • the distributed placement device of the present invention includes: generation means for generating and outputting relocation information that includes one or more pieces of relocation data item identification information, which is data item identification information identifying a relocation target data item stored in a node; relocation reading means for reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms, each of which derives specific node identification information among a plurality of pieces of node identification information corresponding to the plurality of nodes; relocation writing means for writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms; and relocation erasure means for erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
  • in the distributed placement method of the present invention, a computer generates relocation information including one or more pieces of relocation data item identification information, which is data item identification information identifying a relocation target data item stored in a node; reads the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms; and writes the data item to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms.
  • the program of the present invention causes a computer to execute: a process of generating and outputting relocation information including one or more pieces of relocation data item identification information, which is data item identification information identifying a relocation target data item stored in a node; a process of reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms, each of which derives specific node identification information among a plurality of pieces of node identification information corresponding to the plurality of nodes; a process of writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms; and a process of erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
  • the effect of the present invention is that the distributed arrangement algorithm can be dynamically changed even during the operation of the distributed system.
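The relocation cycle these means describe (read via the first algorithm, write via the second, then erase from the first node) can be sketched in Python. All names and the in-memory node model are hypothetical; this is a sketch of the idea, not the patented implementation.

```python
# Hypothetical in-memory model: node ID -> {data item ID: value}
nodes = {"n0": {"DATA1": "v1"}, "n1": {}, "n2": {}}

def pre_change_algorithm(item_id: str) -> str:
    """First algorithm: derives the first node identification information."""
    return "n0"  # assumed fixed placement before the change

def post_change_algorithm(item_id: str) -> str:
    """Second algorithm: derives the second node identification information."""
    return "n" + str(sum(item_id.encode()) % 3)

def relocate(item_id: str) -> None:
    src = pre_change_algorithm(item_id)
    dst = post_change_algorithm(item_id)
    value = nodes[src][item_id]      # relocation reading means
    nodes[dst][item_id] = value      # relocation writing means
    if dst != src:
        del nodes[src][item_id]      # relocation erasure means

relocate("DATA1")  # the item moves; data placed under the old algorithm stays reachable
```

Because each data item is moved item by item, the old and the new algorithm can coexist during relocation, which is what allows the change to happen while the distributed system keeps running.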
  • FIG. 1 is a block diagram illustrating a configuration of a distributed access node according to the first embodiment.
  • FIG. 2 is a block diagram showing the configuration of the distributed system in the first to fourth embodiments.
  • FIG. 3 is a diagram illustrating an example of access history information in the first to fourth embodiments.
  • FIG. 4 is a diagram illustrating an example of rearrangement information in the first to fourth embodiments.
  • FIG. 5 is a diagram illustrating an example of a distributed arrangement algorithm table in the first to fourth embodiments.
  • FIG. 6 is a diagram showing a hardware configuration of the distributed access node and its peripheral devices according to the first to fourth embodiments.
  • FIG. 7 is a flowchart showing a part of the operation when an input / output request is received in the normal state in the first embodiment.
  • FIG. 8 is a flowchart showing a part of the operation when the input / output request is received in the normal state in the first embodiment.
  • FIG. 9 is a flowchart showing a part of the operation of changing the distributed arrangement algorithm in the first embodiment and rearranging the data items.
  • FIG. 10 is a flowchart showing a part of the operation of changing the distributed arrangement algorithm in the first embodiment and rearranging the data items.
  • FIG. 11 is a flowchart showing a part of the operation when an input / output request is received in the rearrangement state in the first embodiment.
  • FIG. 12 is a flowchart illustrating a part of the operation when an input / output request is received in the rearrangement state according to the first embodiment.
  • FIG. 13 is a flowchart illustrating a part of the operation when an input / output request is received in the rearrangement state according to the first embodiment.
  • FIG. 14 is a block diagram illustrating a configuration of a distributed access node according to the second embodiment.
  • FIG. 15 is a diagram illustrating an example of rearrangement node correspondence information in the second and third embodiments.
  • FIG. 16 is a diagram illustrating an example of user determination information according to the second embodiment.
  • FIG. 17 is a flowchart illustrating an operation for calculating a load state according to the second embodiment.
  • FIG. 18 is a flowchart showing the operation of the effect notification unit in the second embodiment.
  • FIG. 19 is a block diagram illustrating a configuration of a distributed access node according to the third embodiment.
  • FIG. 20 is a flowchart illustrating an operation for determining a distributed arrangement algorithm according to the third embodiment.
  • FIG. 21 is a block diagram illustrating a configuration of a distributed access node according to the fourth embodiment.
  • FIG. 22 is a block diagram illustrating an example of a non-volatile storage medium on which a program is recorded.
  • FIG. 1 is a block diagram showing the configuration of the distributed access node 30 according to the first embodiment of the present invention.
  • the distributed access node 30 includes an access location determination unit 3010, an algorithm change unit 3020, a relocation execution unit 3030, an access history collection unit 3040, a relocation information generation unit 3050, a load monitoring unit 3060, and input / output execution. Part 3031.
  • FIG. 2 is a block diagram showing a configuration of the distributed system 100 including the distributed access node 30.
  • the distributed system 100 includes the client nodes 10, the distributed access node 30, the distributed system network 40, the server storage nodes 50, and the storages 60.
  • the client node 10 is connected to the distributed access node 30 via the network 20.
  • in FIG. 2 there are three client nodes 10, three server storage nodes 50, and three storages 60, but any number of client nodes 10, server storage nodes 50, and storages 60 may be used.
  • the distributed system 100 generally operates as follows.
  • the client node 10 using the distributed system transmits a data item read / write request including at least the data item ID to the distributed access node 30 via the network 20.
  • the data item ID is also called data item identification information.
  • a request for reading and writing data items is also called an input / output request.
  • when receiving the input / output request, the distributed access node 30 uniquely determines the data item storage location (server storage node 50) using a specific distributed arrangement algorithm. Subsequently, based on the received input / output request, the distributed access node 30 transmits an access request to the server storage node 50 that is the determined storage location via the distributed system network 40. Based on the received access request, the server storage node 50 reads or writes data items from / to the storage 60 that it manages. As described above, the distributed access node 30 uniquely determines the storage location of the data item using a specific distributed arrangement algorithm.
  • the specific distributed arrangement algorithm may be an arbitrary distributed arrangement algorithm. Therefore, a specific distributed arrangement on the distributed system network 40 is arbitrary.
  • the specific distributed arrangement is how to determine the arrangement of the server storage nodes 50.
  • a specific distributed arrangement may be an arrangement that is multiplexed, hierarchized or virtualized between the server storage nodes 50.
  • the server storage node 50 may have a cache (not shown) in order to reduce a delay when accessing the storage 60.
  • the server storage node 50 may read or write data items to / from the cache if there is an access request target data item in the cache.
  • the server storage node 50 may control the storage 60 having an arbitrary configuration using an arbitrary file system (not shown).
  • the algorithm changing unit 3020 sets the distributed placement algorithm in the access location determining unit 3010 when the distributed access node 30 is initially set and when a designation of a distributed placement algorithm is received. For example, the algorithm changing unit 3020 sets a predetermined distributed arrangement algorithm in the access location determining unit 3010 when the distributed access node 30 is initially set. For example, the algorithm changing unit 3020 receives a designation of a distributed arrangement algorithm from an operator via an input unit (not shown), and sets the distributed arrangement algorithm corresponding to this designation as the predetermined distributed arrangement algorithm. Alternatively, the algorithm changing unit 3020 may receive the designation of a distributed arrangement algorithm from an external server (not shown) or another module (not shown), and set the distributed arrangement algorithm corresponding to this designation as the predetermined distributed arrangement algorithm.
  • the algorithm changing unit 3020 holds, for example, a distributed arrangement program corresponding to each distributed arrangement algorithm in a storage unit (not shown). The algorithm changing unit 3020 then downloads, from among the held distributed arrangement programs, the one corresponding to the predetermined distributed arrangement algorithm and sets it in the access location determination unit 3010. Alternatively, the algorithm changing unit 3020 may notify the access location determining unit 3010 of the identification information of the distributed allocation algorithm to be used (hereinafter referred to as the distributed allocation algorithm ID), thereby setting the distributed allocation algorithm in the access location determining unit 3010.
  • the access location determination unit 3010 may be operable with a plurality of types of distributed arrangement algorithms that uniquely determine the server storage node 50 to be accessed.
  • the access location determination unit 3010 operates with the distributed placement algorithm corresponding to the notified distributed placement algorithm ID among the plurality of distributed placement algorithms, and uniquely determines the server storage node 50 to be accessed.
  • the algorithm changing unit 3020 may acquire and hold the distributed arrangement program by means not shown (for example, means for acquiring a program from a server not shown via the network 20).
  • the access history collection unit 3040 collects the data item IDs of the data items that have been requested (accessed) from the client node 10 as access history information.
  • FIG. 3 is a diagram illustrating an example of the access history information 400. Referring to FIG. 3, the access history information 400 includes a data item ID 401.
  • the access history collection unit 3040 may collect, for example, the data item IDs of the data items for which access to the server storage node 50 has been requested, as the access history information 400. The access history collection unit 3040 may also collect other information as access history information instead of the data item ID 401. In this case, the other information (for example, the hash value of the data item) is information that allows each distributed arrangement algorithm to identify a specific data item as the same data item. That is, the access history information includes at least some data item identification information that identifies a data item stored in the server storage node 50. Further, the access history collection unit 3040 may collect arbitrary information other than the data item identification information as access history information.
  • the arbitrary information is, for example, the time when the input / output request is made, the time when the access request is made, the type of input / output request (for example, reading or writing), the size of the data item, and the corresponding distributed arrangement algorithm ID.
  • the rearrangement information generation unit 3050 generates and outputs rearrangement information including one or more rearrangement data item IDs (also referred to as rearrangement data item identification information).
  • the rearrangement data item ID is the data item ID 401 of the data item to be rearranged stored in the server storage node 50.
  • FIG. 4 is a diagram illustrating an example of the rearrangement information 410. Referring to FIG. 4, the rearrangement information 410 includes a rearrangement data item ID 411.
  • the rearrangement information generation unit 3050 counts the number of occurrences of each data item ID 401 included in the access history information 400, and generates the rearrangement information 410 in which the data item IDs 401 are arranged as rearrangement data item IDs 411 in descending order of the counted number.
  • the rearrangement information generation unit 3050 may generate rearrangement information 410 in which data item IDs included in access history information are arranged as rearrangement data item IDs 411 in descending order of the number of accesses per unit time.
  • the rearrangement information generation unit 3050 may generate rearrangement information 410 in which, for example, data item IDs included in the access history information are arranged as rearrangement data item IDs 411 in the order of recent access.
  • the access history information includes at least one of the time when the input / output request was made and the time when the access request was made, in correspondence with the data item identification information. Furthermore, the rearrangement information generation unit 3050 may generate the rearrangement information 410 in which the rearrangement data item IDs 411 are arranged in an order determined based on, for example, the type of input / output request, the size of the data item, or the corresponding distributed arrangement algorithm ID. In this case, the access history information includes at least the corresponding information. The rearrangement information generation unit 3050 may also acquire the data item IDs of the data items stored in the server storage node 50 from the server storage node 50, and use the acquired data item IDs as the rearrangement data item IDs 411.
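The basic ordering described above, arranging data item IDs as relocation data item IDs with the most accessed first, can be sketched in a few lines of Python. The helper name and the sample history are assumptions for illustration.

```python
from collections import Counter

def generate_relocation_info(access_history: list) -> list:
    """Arrange data item IDs as relocation data item IDs, most accessed first."""
    return [item_id for item_id, _ in Counter(access_history).most_common()]

# hypothetical access history 400 (one entry per access)
history = ["DATA1", "DATA2", "DATA1", "DATA3", "DATA1", "DATA2"]
relocation_info = generate_relocation_info(history)  # ['DATA1', 'DATA2', 'DATA3']
```

Ordering by access frequency means the most frequently used data items are relocated to nodes chosen by the new algorithm first, so the benefit of the new placement shows up early in the relocation.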
  • the distributed access node 30 may not include the access history collection unit 3040.
  • the rearrangement execution unit 3030 acquires one rearrangement data item ID 411 from the rearrangement information 410. Subsequently, the rearrangement execution unit 3030 outputs to the access location determination unit 3010 a rearrangement read request, including the acquired rearrangement data item ID 411, that requests reading of the data item specified by the rearrangement data item ID 411. Further, the rearrangement execution unit 3030 outputs to the access location determination unit 3010 a rearrangement write request that requests writing of the data item read in response to the rearrangement read request. The rearrangement write request may include the corresponding rearrangement data item ID 411.
  • the rearrangement execution unit 3030 outputs, to the access location determination unit 3010, a post-relocation erasure request that requests erasure of the data item from the server storage node 50 from which the data item was read based on the relocation read request.
  • the post-relocation erasure request may include a corresponding rearrangement data item ID 411.
  • the input / output execution unit 3031 receives an input / output request including at least a data item ID and information indicating whether the request is a read request or a write request.
  • the input / output execution unit 3031 executes the following processes (1) to (4) in response to the input / output request.
  • when the received input / output request is a request for reading a data item and the rearrangement information 410 includes a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input / output execution unit 3031 performs the following processing. First, the input / output execution unit 3031 outputs a rearrangement read request including the rearrangement data item ID 411 to the access location determination unit 3010. Next, the input / output execution unit 3031 outputs a relocation write request to the access location determination unit 3010. Next, the input / output execution unit 3031 outputs a post-relocation erasure request to the access location determination unit 3010.
  • the input / output execution unit 3031 transmits an input / output response including the data item read in response to the relocation read request to the client node 10 that issued the input / output request.
  • when the received input / output request is a request for reading a data item and the rearrangement information 410 does not include a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input / output execution unit 3031 performs the following processing.
  • the input / output execution unit 3031 outputs to the access location determination unit 3010 a normal read request, including the data item ID, that requests reading of the data item specified by the data item ID.
  • the input / output execution unit 3031 transmits an input / output response including the data item read in response to the normal read request to the client node 10 that issued the input / output request.
  • when the received input / output request is a data item write request and the relocation information 410 includes a relocation data item ID 411 having the same value as the data item ID of that data item, the input / output execution unit 3031 performs the following processing. Note that writing in this case overwrites the existing data item.
  • the input / output execution unit 3031 outputs a rearrangement write request for requesting writing of a data item corresponding to the received input / output request to the access location determination unit 3010, including the corresponding rearrangement data item ID 411.
  • the input / output execution unit 3031 outputs a post-relocation erasure request to the access location determination unit 3010.
  • the post-relocation erasure request includes the corresponding rearrangement data item ID 411 and requests that the data item specified by the rearrangement data item ID 411 be erased from the server storage node 50 derived by the pre-change algorithm described later.
  • when the received input / output request is a data item write request and the rearrangement information 410 does not include a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input / output execution unit 3031 performs the following processing.
  • the input / output execution unit 3031 outputs to the access location determination unit 3010 a normal write request, including the data item ID, that requests writing of the data item specified by the data item ID.
  • the processing in the input / output execution unit 3031 is as described above.
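The read path just described can be condensed into a Python sketch: when a requested item is still pending relocation, serving the read also relocates it. The dictionary node model and the bookkeeping step that removes a relocated ID from the relocation information are assumptions for illustration, not details from the source.

```python
# Hypothetical in-memory model: node ID -> {data item ID: value}
store = {"n0": {"DATA1": "v1"}, "n1": {"DATA2": "v2"}}
relocation_info = ["DATA1"]            # relocation data item IDs 411

pre_change = lambda item_id: "n0"      # first algorithm (assumed placement)
post_change = lambda item_id: "n1"     # second algorithm (assumed placement)

def handle_read(item_id: str) -> str:
    if item_id in relocation_info:
        # relocation read, relocation write, then post-relocation erasure
        value = store[pre_change(item_id)].pop(item_id)
        store[post_change(item_id)][item_id] = value
        relocation_info.remove(item_id)  # bookkeeping assumed, not in the source
        return value
    # normal read: derive the node with the post-change algorithm
    return store[post_change(item_id)][item_id]

first = handle_read("DATA1")   # relocated on the fly while being served
second = handle_read("DATA2")  # ordinary read
```

Piggybacking relocation on ordinary reads and writes is what lets the system answer every request correctly while items are still spread across nodes chosen by two different algorithms.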
  • when the algorithm changing unit 3020 sets a distributed placement algorithm for the first time, the access location determining unit 3010 holds it as the post-change algorithm (second algorithm).
  • when the algorithm change unit 3020 newly sets a distributed placement algorithm while a post-change algorithm is already held, the access location determination unit 3010 operates as follows. First, the access location determination unit 3010 sets the already-held post-change algorithm as the pre-change algorithm (first algorithm). Subsequently, the access location determination unit 3010 holds the newly set distributed arrangement algorithm as the new post-change algorithm. For example, the access location determination unit 3010 receives (is set with) a program that realizes the distributed arrangement algorithm from the algorithm change unit 3020, and holds the program, together with its program name, as information specifying the post-change algorithm.
  • FIG. 5 is a diagram illustrating an example of the distributed arrangement algorithm table 420.
  • the distributed arrangement algorithm table 420 holds a post-change program name 421 that realizes the post-change algorithm and a pre-change program name 422 that realizes the pre-change algorithm.
  • the access location determination unit 3010 clears the pre-change program name 422 with “0” and deletes the corresponding distributed allocation program.
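The handling of the distributed arrangement algorithm table 420 described above, where a newly set algorithm demotes the held post-change algorithm to the pre-change slot and the pre-change entry is cleared with "0" after relocation, might be sketched as follows. The class and method names are hypothetical.

```python
class AlgorithmTable:
    """Sketch of the distributed arrangement algorithm table 420."""
    def __init__(self):
        self.post_change = None  # post-change program name 421
        self.pre_change = None   # pre-change program name 422

    def set_algorithm(self, program_name: str) -> None:
        if self.post_change is not None:
            # the held post-change algorithm becomes the pre-change algorithm
            self.pre_change = self.post_change
        self.post_change = program_name

    def finish_relocation(self) -> None:
        # clear the pre-change program name with "0" once relocation is done
        self.pre_change = "0"

table = AlgorithmTable()
table.set_algorithm("modulo_v1")  # initial setting: held as the post-change algorithm
table.set_algorithm("crc16_v2")   # modulo_v1 is now the pre-change algorithm
```

Keeping both program names side by side is what lets the unit route relocation reads with the old algorithm and relocation writes with the new one during the transition.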
  • when receiving a rearrangement read request, the access location determination unit 3010 uses the pre-change algorithm to derive the node identification information (first node identification information) to which the access request corresponding to the rearrangement read request is to be transmitted.
  • the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 (first node) corresponding to the derived node identification information.
  • when receiving a rearrangement write request, the access location determination unit 3010 uses the post-change algorithm to derive the node identification information (second node identification information) to which the access request corresponding to the rearrangement write request is to be transmitted.
  • the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 (second node) corresponding to the derived node identification information.
  • when receiving a post-relocation erasure request, the access location determination unit 3010 uses the pre-change algorithm to derive the node identification information (first node identification information) to which the access request corresponding to the post-relocation erasure request is to be transmitted. Subsequently, the access location determination unit 3010 erases the data item specified by the relocation data item ID 411 included in the post-relocation erasure request from the server storage node 50 (first node) corresponding to the derived node identification information.
  • when receiving a normal read request, the access location determination unit 3010 derives, using the post-change algorithm, the node identification information to which the access request corresponding to the normal read request is to be transmitted.
  • the access location determination unit 3010 reads the data item specified by the data item ID included in the normal read request from the server storage node 50 corresponding to the derived node identification information. In addition, when receiving a normal write request, the access location determination unit 3010 derives, using the post-change algorithm, the node identification information to which the access request corresponding to the normal write request is to be transmitted. Subsequently, the access location determination unit 3010 writes the data item specified by the data item ID included in the normal write request to the server storage node 50 corresponding to the derived node identification information.
  • the distributed arrangement algorithm may be, for example, the following algorithm.
  • for example, the access location determination unit 3010 assigns serial numbers "0", "1", and "2" to the node identification information of the three server storage nodes 50 shown in FIG. 2.
  • the access location determination unit 3010 operates as follows in the first to fourth procedures for deriving node identification information corresponding to the data item ID in the distributed placement algorithm.
  • the access location determination unit 3010 converts the data item ID included in the input / output request into a numerical value.
  • the access location determination unit 3010 converts the character code of the data item ID into a hexadecimal number.
  • the access location determination unit 3010 calculates the remainder obtained by dividing the numerical value of the data item ID included in the input / output request by "3" (the number of server storage nodes 50).
  • the access location determination unit 3010 determines the calculated remainder as the serial number of the server storage node 50 that stores the data item specified by the data item ID.
  • the access location determination unit 3010 derives node identification information corresponding to the determined serial number.
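The four procedures above can be condensed into a few lines of Python. Treating the concatenated hexadecimal character codes of the ID as the numeric value is one possible reading of the first two procedures; the names are hypothetical.

```python
NUM_NODES = 3  # number of server storage nodes 50

def node_serial(data_item_id: str) -> int:
    # First and second procedures: character codes of the data item ID,
    # read as one hexadecimal number (one possible reading of the text)
    numeric = int(data_item_id.encode().hex(), 16)
    # Third procedure: remainder of division by the number of nodes
    return numeric % NUM_NODES

serial = node_serial("DATA1")  # the fourth procedure would map this serial
                               # number back to node identification information
```

Modulo placement is simple and uniform, but changing the number of nodes changes almost every remainder, which is exactly the situation where the relocation mechanism of this invention is needed.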
  • the distributed arrangement algorithm is, for example, the following algorithm.
  • the access location determination unit 3010 associates, with the node identification information of the server storage node 50, a value that can be taken by the hash value calculated by the hash function specified by the distributed placement algorithm in the procedure of initial setting of the distributed placement algorithm.
  • the hash function is, for example, CRC (Cyclic Redundancy Check) 16.
  • the access location determination unit 3010 associates each of the three server storage nodes 50 shown in FIG. 1 with a range of hash values. That is, for example, the access location determination unit 3010 associates # 0000 to # 5555, # 5556 to #AAAA, and #AAAB to #FFFF with the node identification information of the first to third server storage nodes 50, respectively, as shared hash value ranges.
  • the access location determination unit 3010 operates as follows in the first to fourth procedures for deriving node identification information corresponding to the data item ID of the distributed placement algorithm. First, in the first procedure, the access location determination unit 3010 calculates the hash value of the data item ID included in the input / output request using the hash function. When the data item ID is “DATA1” and the hash function is CRC16, the access location determination unit 3010 calculates, for example, # 0C9E. Next, in the second procedure, the access location determination unit 3010 determines in which shared hash value range the hash value of the data item ID is included. For example, the access location determination unit 3010 determines that # 0C9E is included in # 0000 to # 5555.
  • the access location determination unit 3010 derives node identification information corresponding to the determined shared hash value range.
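The hash-range placement above can be sketched as follows, again as a hypothetical illustration that is not part of the patent disclosure. Python's `binascii.crc_hqx` implements the CRC-CCITT (XModem) 16-bit variant; the patent's example value # 0C9E for "DATA1" may correspond to a different CRC-16 polynomial, so the exact hash values are not asserted to match.

```python
import binascii

# Shared hash value ranges for the first to third server storage nodes 50
# (node identification strings are hypothetical).
RANGES = [
    (0x0000, 0x5555, "node-1"),
    (0x5556, 0xAAAA, "node-2"),
    (0xAAAB, 0xFFFF, "node-3"),
]


def derive_node_id_hash(data_item_id: str) -> str:
    # First procedure: calculate a 16-bit hash value of the data item ID.
    h = binascii.crc_hqx(data_item_id.encode("ascii"), 0)
    # Second procedure: determine which shared range contains the hash value.
    for low, high, node_id in RANGES:
        if low <= h <= high:
            # Third and fourth procedures: derive the node identification
            # information corresponding to the determined range.
            return node_id
    raise AssertionError("unreachable: the ranges cover 0x0000 to 0xFFFF")
```

Because the three ranges partition the whole 16-bit space, every data item ID maps to exactly one server storage node 50.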
  • the load monitoring unit 3060 monitors the load state of the distributed access node 30 and outputs the monitoring result.
  • the load state is, for example, a usage rate of a CPU (Central Processing Unit, not shown) or a usage rate of a memory (not shown) of the distributed access node 30.
  • the configuration in which the relocation execution unit 3030 outputs a relocation read request and the data item specified by the relocation data item ID 411 included in the relocation read request is read from the server storage node 50 corresponding to the node identification information derived by the access location determination unit 3010 using the pre-change algorithm is also referred to as relocation reading means.
  • the configuration in which the rearrangement execution unit 3030 outputs a rearrangement write request and the access location determination unit 3010 writes the data item specified by the rearrangement data item ID 411 included in the rearrangement write request to the server storage node 50 corresponding to the node identification information derived using the post-change algorithm is also referred to as rearrangement writing means.
  • the configuration in which the rearrangement execution unit 3030 outputs a post-relocation deletion request and the data item specified by the rearrangement data item ID 411 included in the post-relocation deletion request is deleted from the server storage node 50 corresponding to the node identification information derived by the access location determination unit 3010 using the pre-change algorithm is also referred to as rearrangement erasure means.
  • FIG. 6 is a diagram illustrating a hardware configuration of the distributed access node 30 and its peripheral devices in the present embodiment.
  • the distributed access node 30 includes a CPU 3070, a storage unit 3071, a storage device 3072, an input unit 3073, an output unit 3074, and a communication unit 3075.
  • the CPU 3070 controls the overall operation of the distributed access node 30 according to the present embodiment by operating an operating system (not shown). Further, the CPU 3070 reads a program and data into the storage unit 3071 from, for example, a nonvolatile recording medium attached to the storage device 3072.
  • the CPU 3070 executes various processes as each unit shown in FIG. 1 according to the read program and based on the read data. Note that the CPU 3070 may download a program and data to the storage unit 3071 from an external computer (not shown) connected to a communication network (not shown).
  • the storage unit 3071 stores programs and data.
  • the storage device 3072 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk, or a semiconductor memory, and includes a nonvolatile storage medium.
  • the storage device 3072 records the program so that it can be read by a computer. Further, the storage device 3072 may record data so as to be readable by a computer.
  • the input unit 3073 is realized by, for example, a mouse, a keyboard, a built-in key button, and the like, and is used for an input operation.
  • the input unit 3073 is not limited to a mouse, a keyboard, and a built-in key button, but may be a touch panel, an accelerometer, a gyro sensor, a camera, or the like.
  • the output unit 3074 is realized by a display, for example, and is used to check the output.
  • the communication unit 3075 realizes an interface between the distributed access node 30 and the network 20.
  • the communication unit 3075 is included as part of the access location determination unit 3010, the algorithm change unit 3020, and the input / output execution unit 3031. This completes the description of each hardware component of the distributed access node 30.
  • the functional unit block shown in FIG. 1 is realized by the hardware configuration shown in FIG.
  • the means for realizing each unit included in the distributed access node 30 is not limited to the above.
  • the distributed access node 30 may be realized by a single physically integrated device, or by two or more physically separated devices connected by wire or wirelessly.
  • a recording medium (or storage medium) 3077 in which the above-described program is recorded may be supplied to the distributed access node 30, and the distributed access node 30 may read and execute the program stored in the recording medium. That is, this embodiment includes an embodiment of a recording medium that transitorily or non-transitorily stores a program executed by the distributed access node 30.
  • the normal state is a state in which a distributed placement algorithm is set by the algorithm changing unit 3020 when the distributed access node 30 is initially set, and a new distributed placement algorithm has not been set yet.
  • alternatively, the normal state is a state in which a new distributed arrangement algorithm has been set by the algorithm changing unit 3020, the rearrangement process using the new distributed arrangement algorithm has been completed, and no further new distributed arrangement algorithm has been set yet.
  • in other words, the normal state is a state in which no rearrangement data item ID 411 is included in the rearrangement information 410, no pre-change algorithm exists, and the rearrangement process is not being executed.
  • the state in which the rearrangement process is being executed is hereinafter referred to as a rearrangement state.
  • the distributed access node 30 starts this operation when receiving an input / output request transmitted by any of the client nodes 10 to read / write data items.
  • the input / output execution unit 3031 confirms whether or not it is in the normal state (A101). For example, the input / output execution unit 3031 checks whether the rearrangement information 410 contains no rearrangement data item ID 411 (normal state) or contains at least one (rearrangement state).
  • alternatively, the input / output execution unit 3031 may refer to the distributed arrangement algorithm table 420 and confirm whether both the pre-change algorithm and the post-change algorithm exist (rearrangement state) or only the post-change algorithm exists (normal state). If it is not in the normal state (rearrangement state) (NO in A101), the process proceeds to step C103. In the normal state (YES in A101), the input / output execution unit 3031 determines whether the received input / output request is an input / output request requesting reading or an input / output request requesting writing (step A102). If the input / output request requests reading (YES in step A102), the process proceeds to A103.
  • if the input / output request requests writing (NO in step A102), the process proceeds to A110.
  • the input / output execution unit 3031 outputs a normal read request including the data item ID included in the received input / output request to the access location determination unit 3010 (step A103).
  • the access location determination unit 3010 derives the node identification information of the server storage node 50 that outputs the access request corresponding to the received normal read request using the post-change algorithm (step A104).
  • the access location determination unit 3010 outputs an access request corresponding to the received normal read request to the server storage node 50 corresponding to the derived node identification information (step A105).
  • the input / output execution unit 3031 receives the data item corresponding to the access request output by the access location determination unit 3010 (step A106). Next, the input / output execution unit 3031 transmits an input / output response including the read data item to the client node 10 that issued the input / output request (step A107). Then, the process proceeds to step A120. In step A110, the input / output execution unit 3031 outputs a normal write request including the data item ID included in the received input / output request to the access location determination unit 3010 (step A110). Next, the access location determination unit 3010 derives node identification information of the server storage node 50 that outputs an access request corresponding to the received normal write request using the post-change distributed arrangement algorithm (step A111).
  • the access location determination unit 3010 outputs an access request corresponding to the received normal write request to the server storage node 50 corresponding to the derived node identification information (step A112). Then, the process proceeds to step A120.
  • the access history collection unit 3040 records the data item ID corresponding to the access request output to the access history information 400. (Step A120). Then, the process ends.
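The normal-state flow of steps A101 to A120 can be sketched as follows, modeling the server storage nodes 50 as in-memory dictionaries. All names are hypothetical; this is an illustrative sketch, not part of the patent disclosure.

```python
class DistributedAccessNode:
    """Minimal model of the normal-state read/write flow (FIG. 7)."""

    def __init__(self, nodes, derive_node_id):
        self.nodes = nodes                # node_id -> {data_item_id: value}
        self.derive = derive_node_id      # post-change placement algorithm
        self.access_history = []          # access history information 400
        self.relocation_info = set()      # pending relocation data item IDs 411

    def handle_request(self, data_item_id, value=None):
        # Step A101: normal state iff no relocation data item ID 411 remains.
        assert not self.relocation_info, "relocation state: see FIGS. 11-13"
        # Steps A104 / A111: derive node identification information with the
        # post-change algorithm.
        node_id = self.derive(data_item_id)
        if value is None:                 # steps A102-A107: read request
            result = self.nodes[node_id].get(data_item_id)
        else:                             # steps A110-A112: write request
            self.nodes[node_id][data_item_id] = value
            result = None
        # Step A120: record the data item ID in the access history.
        self.access_history.append(data_item_id)
        return result
```

A write followed by a read of the same data item ID is served by the same derived node, and both accesses are recorded in the access history.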
  • FIG. 9 and FIG. 10 are flowcharts showing the operation of changing the distributed placement algorithm and rearranging the data items in the present embodiment.
  • the distributed access node 30 starts this operation when receiving the designation of the newly set distributed arrangement algorithm from the operator via the input unit 3073 shown in FIG. 6.
  • the algorithm changing unit 3020 sets the distributed location algorithm corresponding to the designation of the distributed location algorithm newly received from the operator in the access location determining unit 3010 (step B101). Specifically, the algorithm changing unit 3020 downloads the distributed arrangement program corresponding to the received distributed arrangement algorithm designation to the access location determining unit 3010. At the same time, the algorithm changing unit 3020 outputs the program name of the distributed arrangement program to the access location determining unit 3010. Next, the access location determination unit 3010 receives the setting of the distributed arrangement algorithm.
  • the access location determination unit 3010 overwrites the pre-change program name 422 with the post-change program name 421, and writes the received program name into the post-change program name 421 (step B102).
  • the rearrangement information generation unit 3050 refers to the access history information 400 as shown in FIG. 3 and generates rearrangement information 410 as shown in FIG. 4 (step B103). Note that the rearrangement information generation unit 3050 may delete the referenced data item IDs 401 from the access history information 400. In such a case, after the rearrangement information 410 is generated, each data item ID 401 included in the access history information 400 identifies a data item stored in the server storage node 50 determined using the newly received distributed arrangement algorithm.
  • the load monitoring unit 3060 confirms that the load is not high (step B104). For example, the load monitoring unit 3060 confirms that the CPU is not in a high load state based on whether or not the CPU usage rate of the distributed access node 30 exceeds a predetermined threshold (for example, 50%). Further, the load monitoring unit 3060 confirms that the load is not in a high load state based on the result of inquiring the server storage node 50 about the load state, or on both the result and the CPU usage rate of the distributed access node 30. You may do it. If the load is high (YES in step B104), the process ends.
  • the rearrangement execution unit 3030 acquires one rearrangement data item ID 411 from the head of the rearrangement information 410 (step B105).
  • next, the rearrangement execution unit 3030 outputs a rearrangement read request, including the acquired rearrangement data item ID 411, for requesting reading of the data item specified by the rearrangement data item ID 411 to the access location determination unit 3010 (step B106).
  • the access location determination unit 3010 receives the rearrangement read request and derives node identification information for transmitting an access request corresponding to the rearrangement read request using the pre-change algorithm (step B107).
  • the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 corresponding to the derived node identification information (step B108). Specifically, the access location determination unit 3010 transmits an access request including the relocation data item ID 411 to the server storage node 50 corresponding to the derived node identification information. Next, the access location determination unit 3010 receives the data item transmitted from the server storage node 50 in response to the output access request. Next, the rearrangement execution unit 3030 outputs a rearrangement write request for requesting writing of the data item read in response to the rearrangement read request to the access location determination unit 3010 (step B109).
  • the access location determination unit 3010 receives the rearrangement write request and derives node identification information for transmitting the access request corresponding to the rearrangement write request using the changed algorithm (step B110). Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 corresponding to the derived node identification information (step B111). Next, the rearrangement execution unit 3030 sends an erase request after rearrangement requesting that the data item is erased from the server storage node 50 that has read the data item based on the rearrangement read request, to the access location determination unit 3010. (Step B112).
  • the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting an access request corresponding to the post-relocation erasure request using the pre-change algorithm (step B113). Subsequently, the access location determination unit 3010 deletes the data item specified by the relocation data item ID 411 included in the post-relocation deletion request from the server storage node 50 corresponding to the derived node identification information (step B114). . Next, the rearrangement execution unit 3030 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step B115). Next, the rearrangement execution unit 3030 checks whether or not the rearrangement data item ID 411 exists in the rearrangement information 410 (step B116).
  • if it exists (YES in step B116), the process returns to step B105. If it does not exist (NO in step B116), the access location determination unit 3010 clears the pre-change program name 422 in the distributed arrangement algorithm table 420 to "0" (step B117). Then, the process ends. Note that the process of generating the rearrangement information 410 from the access history information 400 in step B103 is desirably performed in parallel with, for example, steps B101, B102, and B104. Further, when the frequency of input / output requests from the client node 10 is high, the processing of step B103 may be performed at regular intervals independently of the processing of the flowcharts shown in FIGS. 9 and 10.
  • the referenced data item ID 401 among the data item IDs 401 included in the access history information 400 may be sequentially deleted.
  • the above is the description of the operation of changing the distributed arrangement algorithm and rearranging the data items.
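The relocation loop of steps B105 to B117 (read with the pre-change algorithm, write with the post-change algorithm, erase the original copy, and remove the ID from the relocation information) can be sketched as follows. Names are hypothetical and the load monitoring of step B104 is omitted; this is not part of the patent disclosure.

```python
def relocate_all(nodes, relocation_info, pre_change, post_change):
    """nodes: node_id -> {data_item_id: value};
    relocation_info: list of relocation data item IDs 411;
    pre_change / post_change: data_item_id -> node_id."""
    while relocation_info:                    # step B116: any ID left?
        item_id = relocation_info[0]          # step B105: take one ID
        src = pre_change(item_id)             # steps B107-B108: read from the
        value = nodes[src][item_id]           # pre-change location
        dst = post_change(item_id)            # steps B110-B111: write to the
        nodes[dst][item_id] = value           # post-change location
        if src != dst:                        # steps B113-B114: erase the
            del nodes[src][item_id]           # original copy
        relocation_info.pop(0)                # step B115: drop the ID 411
    # step B117: no ID remains, so the pre-change algorithm can be discarded
```

Relocating one item at a time keeps every data item available throughout: an item is either still at its pre-change node or already at its post-change node.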
  • FIGS. 11, 12 and 13 are flowcharts showing the operation when an input / output request is received in the rearrangement state (NO in step A101 in FIG. 7).
  • first, when receiving an input / output request in the rearrangement state (NO in step A101 in FIG. 7), the input / output execution unit 3031 confirms whether or not the data item ID included in the received input / output request matches any of the rearrangement data item IDs 411 included in the rearrangement information 410 (step C103).
  • next, the input / output execution unit 3031 determines whether the received input / output request is an input / output request requesting reading or an input / output request requesting writing (step C104). If the input / output request requests reading (YES in step C104), the process proceeds to C105. If the input / output request requests writing (NO in step C104), the process proceeds to C120.
  • the input / output execution unit 3031 outputs a rearrangement read request including the rearrangement data item ID 411 that identifies the data item to be read to the access location determination unit 3010 (step C105).
  • the access location determination unit 3010 receives the rearrangement read request and derives node identification information for transmitting an access request corresponding to the rearrangement read request using the pre-change algorithm (step C106). Subsequently, the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 corresponding to the derived node identification information (step C107).
  • the input / output execution unit 3031 outputs a rearrangement write request for requesting writing of the data item read in response to the rearrangement read request to the access location determination unit 3010 (step C108).
  • the access location determination unit 3010 receives the rearrangement write request and derives node identification information for transmitting an access request corresponding to the rearrangement write request using the changed algorithm (step C109). Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 corresponding to the derived node identification information (step C110).
  • the input / output execution unit 3031 outputs a post-relocation erasure request for erasing the data item from the server storage node 50 that has read the data item based on the rearrangement read request to the access location determination unit 3010 (step C111).
  • the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting an access request corresponding to the post-relocation erasure request using the pre-change algorithm (step C112).
  • the access location determination unit 3010 deletes the data item specified by the relocation data item ID 411 included in the post-relocation deletion request from the server storage node 50 corresponding to the derived node identification information (step C113). .
  • the input / output execution unit 3031 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step C114).
  • the input / output execution unit 3031 transmits an input / output response including the data item read in response to the rearrangement read request to the client node 10 that issued the input / output request (step C115).
  • the process proceeds to C130.
  • the input / output execution unit 3031 outputs a rearrangement write request for requesting writing of a data item corresponding to the received input / output request, including the corresponding rearrangement data item ID 411, to the access location determination unit 3010 (step C120).
  • the access location determination unit 3010 receives the rearrangement write request and derives node identification information for transmitting the access request corresponding to the rearrangement write request using the post-change algorithm (step C121). Subsequently, the access location determination unit 3010 writes the data item specified by the rearrangement data item ID 411 included in the rearrangement write request to the server storage node 50 corresponding to the derived node identification information (step C122). Next, the input / output execution unit 3031 outputs, to the access location determination unit 3010, a post-relocation deletion request that requests the data item to be deleted from the server storage node 50 derived by the pre-change algorithm (step C123).
  • the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting the access request corresponding to the post-relocation erasure request using the pre-change algorithm (step C124). Subsequently, the access location determination unit 3010 deletes the data item specified by the relocation data item ID 411 included in the post-relocation deletion request from the server storage node 50 corresponding to the derived node identification information (step C125). . Next, the input / output execution unit 3031 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step C126).
  • the corresponding rearrangement data item ID 411 is the rearrangement data item ID 411 included in the post-relocation erasure request.
  • in step C130, the access history collection unit 3040 records the data item ID corresponding to the output access request in the access history information 400.
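Serving a read in the rearrangement state (steps C103 to C115) amounts to relocating the requested data item on access before answering the client. A hypothetical sketch, not part of the patent disclosure:

```python
def read_during_relocation(nodes, relocation_info, pre_change, post_change,
                           data_item_id):
    """nodes: node_id -> {data_item_id: value};
    relocation_info: set of relocation data item IDs 411 still pending."""
    if data_item_id in relocation_info:        # step C103: still pending?
        src = pre_change(data_item_id)         # steps C106-C107: read from
        value = nodes[src][data_item_id]       # the pre-change location
        dst = post_change(data_item_id)        # steps C109-C110: write to
        nodes[dst][data_item_id] = value       # the post-change location
        if src != dst:                         # steps C112-C113: erase the
            del nodes[src][data_item_id]       # original copy
        relocation_info.discard(data_item_id)  # step C114: drop the ID 411
        return value                           # step C115: answer the client
    # already relocated: read with the post-change algorithm (normal path)
    return nodes[post_change(data_item_id)][data_item_id]
```

After the first access, the data item has been moved, so subsequent reads follow the normal post-change path.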
  • the distributed access node 30 may use three or more distributed arrangement algorithms.
  • the access history information may include a corresponding distributed arrangement algorithm ID in addition to the data item ID.
  • the access location determination unit 3010 may select the pre-change algorithm used when processing the relocation read request and the post-relocation erasure request with reference to the distributed allocation algorithm ID.
  • in a predetermined specific situation (for example, when the CPU usage rate is higher than a predetermined value), the distributed access node 30 may perform relocation only on the data item corresponding to the data item ID included in the input / output request.
  • the first effect of the present embodiment described above is that the distributed arrangement algorithm can be dynamically changed even during the operation of the distributed system. This is because the following configuration is included. That is, first, the rearrangement information generation unit 3050 generates rearrangement information 410. Second, the rearrangement execution unit 3030 outputs a rearrangement read request, a rearrangement write request, and a post-relocation erasure request based on the rearrangement information 410.
  • third, the access location determination unit 3010 reads the corresponding data item from the server storage node 50 derived using the pre-change algorithm in response to the relocation read request. Fourth, the access location determination unit 3010 writes the data item to the server storage node 50 derived using the post-change algorithm in response to the relocation write request. Fifth, the access location determination unit 3010 deletes the data item included in the server storage node 50 derived using the pre-change algorithm in response to the post-relocation erasure request.
  • the “derived server storage node 50” here means “the server storage node 50 corresponding to the derived node identification information”.
  • in addition, the input / output execution unit 3031 selectively outputs either a rearrangement read request, a rearrangement write request, and a post-relocation erasure request based on the rearrangement information 410, or a normal read request and a normal write request.
  • the access location determination unit 3010 reads the corresponding data item from the server storage node 50 derived using the post-change algorithm in response to the normal read request.
  • the access location determination unit 3010 writes the data item to the server storage node 50 derived using the post-change algorithm in response to the normal write request.
  • the “derived server storage node 50” here means “the server storage node 50 corresponding to the derived node identification information”.
  • the third effect of the present embodiment described above is that a change to a third distributed arrangement algorithm can be started before the change from a first distributed arrangement algorithm to a second distributed arrangement algorithm is completed.
  • the reason is that the access history information includes the distributed placement algorithm ID, and the access location determination unit 3010 refers to the distributed placement algorithm ID to select the pre-change algorithm used when processing the rearrangement read request and the post-relocation erasure request.
  • the fourth effect of the present embodiment described above is that the data item relocation process can be executed efficiently even when the distributed access node 30 or one or more of the server storage nodes 50 is in a high load state.
  • FIG. 14 is a block diagram showing a configuration of the distributed access node 32 according to the present embodiment.
  • the distributed access node 32 according to the present embodiment further includes an effect prediction unit 3080 and an effect notification unit 3081 compared to the distributed access node 30 of the first embodiment.
  • the distributed access node 32 includes an access location determination unit 3210 instead of the access location determination unit 3010 as compared to the distributed access node 30.
  • the components of the distributed access node 32 in hardware units are the same as the components of the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG. 6.
  • the effect prediction unit 3080 calculates an appropriateness of arrangement that indicates the appropriateness of the arrangement of data items corresponding to the candidate algorithm.
  • the candidate algorithm is one of a plurality of distributed arrangement algorithms that are replaced with the current changed algorithm (the currently used distributed arrangement algorithm) as a new changed algorithm.
  • the candidate algorithm includes the current modified algorithm itself.
  • the effect prediction unit 3080 selects a candidate algorithm, outputs it to the access location determination unit 3210, receives the rearrangement node correspondence information as a response, and calculates the placement appropriateness based on the received rearrangement node correspondence information.
  • the rearrangement node correspondence information 430 includes a rearrangement data item ID 411 and node identification information 432 in association with each other.
  • the effect prediction unit 3080 calculates the distribution degree (for example, standard deviation) of the data items based on the number of data items stored in each server storage node 50, and sets this as the appropriateness of arrangement. For example, the effect prediction unit 3080 refers to the access history information 400 to calculate the access frequency of each data item. Then, the effect prediction unit 3080 further calculates the access frequency of each server storage node 50 based on this access frequency. Then, the effect prediction unit 3080 may use the calculated access frequency as the placement appropriateness.
  • the effect prediction unit 3080 may score (numerize) the access frequency of each data item and the number and size of the data items stored in each server storage node 50. Then, the effect prediction unit 3080 may calculate the total value for each server storage node 50 to obtain the placement appropriateness. In such a case, the access history information 400 may include the size of the data item.
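The placement appropriateness of the effect prediction unit 3080 can be sketched, under the standard-deviation interpretation described above, as follows; the function name and the input shape (a list of pairs mirroring the rearrangement node correspondence information 430) are hypothetical.

```python
from collections import Counter
from statistics import pstdev


def placement_appropriateness(relocation_node_info, all_node_ids):
    """relocation_node_info: list of (relocation_data_item_id, node_id) pairs,
    as in the rearrangement node correspondence information 430.
    Returns the population standard deviation of the number of data items per
    server storage node; a lower value indicates a more even placement."""
    counts = Counter(node_id for _, node_id in relocation_node_info)
    per_node = [counts.get(n, 0) for n in all_node_ids]
    return pstdev(per_node)
```

The effect prediction unit can compute this value for each candidate algorithm and report the results, via the effect notification unit 3081, as the placement appropriateness 442 in the user determination information 440.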
  • the effect notification unit 3081 generates user determination information including the distributed placement algorithm and the corresponding placement appropriateness based on the placement appropriateness predicted by the effect prediction unit 3080. Next, the effect notification unit 3081 displays the generated user determination information on a display (not shown) via, for example, the output unit 3074 in FIG.
  • FIG. 16 is a diagram illustrating an example of the user determination information 440.
  • the user determination information 440 includes a distributed arrangement algorithm ID 441 and an arrangement appropriateness level 442.
  • the access location determination unit 3210 includes the following functions in addition to the functions of the access location determination unit 3010 of the first embodiment. First, the access location determination unit 3210 uses the candidate algorithm to derive the node identification information 432 of the server storage node 50 in which the data item corresponding to each relocation data item ID 411 included in the relocation information 410 is stored. Next, the access location determination unit 3210 outputs rearrangement node correspondence information 430 in which the derived node identification information 432 is associated with each rearrangement data item ID 411 of the rearrangement information 410. Next, the operation of the present embodiment will be described in detail with reference to the drawings.
  • the rearrangement information generation unit 3050 generates the rearrangement information 410 with reference to the access history information 400 (step D101).
  • the effect prediction unit 3080 outputs the current post-change algorithm as a candidate algorithm to the access location determination unit 3210 (step D102).
  • the access location determination unit 3210 uses the received candidate algorithm to derive, for each data item corresponding to a relocation data item ID 411 included in the relocation information 410, the node identification information 432 of the server storage node 50 in which it is stored (step D103).
  • the access location determination unit 3210 generates rearrangement node correspondence information 430 in which the derived node identification information 432 is associated with each rearrangement data item ID 411 of the rearrangement information 410, and outputs it to the effect prediction unit 3080 (step D104).
  • the effect prediction unit 3080 calculates the placement appropriateness corresponding to the candidate algorithm based on the received rearrangement node correspondence information 430, and outputs it to the effect notification unit 3081 together with the corresponding distributed placement algorithm ID (step D105).
  • the effect prediction unit 3080 determines whether or not there is a distributed arrangement algorithm still to be selected as a candidate algorithm from among the plurality of distributed arrangement algorithms that can be set as the post-change algorithm by the algorithm change unit 3020 (step D106). If there is a distributed arrangement algorithm to be selected (YES in step D106), the process proceeds to step D107. If there is no distributed arrangement algorithm to be selected (NO in step D106), the process proceeds to step E101 in FIG. 18. The effect prediction unit 3080 may also terminate the selection of candidate algorithms if, for example, the ratio between the placement appropriateness calculated using the current post-change algorithm as a candidate algorithm and the placement appropriateness output in step D105 exceeds a predetermined value.
  • the effect prediction unit 3080 then selects one candidate algorithm and outputs it to the access location determination unit 3210 (step D107). Then, the process returns to step D103.
  • the effect prediction unit 3080 selects, for example, the distributed arrangement algorithms held by the algorithm change unit 3020 as candidate algorithms in order. Alternatively, the effect prediction unit 3080 may select, as a candidate algorithm, a distributed arrangement algorithm input by the operator from the input unit 3073 shown in FIG. 6.
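The candidate-evaluation loop of steps D103 to D105 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, each candidate algorithm is modeled as a function mapping a data item ID and a node list to a node, and the placement appropriateness is assumed here to be the standard deviation of per-node item counts (lower meaning better balance).

```python
from statistics import pstdev

def placement_appropriateness(item_ids, nodes, algorithm):
    # Derive the storing node for each relocation data item ID with the
    # candidate algorithm (as in step D103), then score the balance as the
    # population standard deviation of per-node item counts (step D105).
    counts = {node: 0 for node in nodes}
    for item_id in item_ids:
        counts[algorithm(item_id, nodes)] += 1
    return pstdev(counts.values())

def evaluate_candidates(item_ids, nodes, candidates):
    # Score every candidate algorithm, looping as in steps D103-D107.
    # candidates: distributed placement algorithm ID -> algorithm function.
    return {alg_id: placement_appropriateness(item_ids, nodes, alg)
            for alg_id, alg in candidates.items()}
```

A perfectly balanced candidate scores 0.0, while a candidate that sends every item to one node scores worst, so the scores can be compared directly as in step D106.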
  • FIG. 18 is a flowchart showing the operation of the effect notification unit 3081 in this embodiment. First, the effect notification unit 3081 generates user determination information 440 based on the received placement appropriateness and the corresponding distributed placement algorithm ID (step E101).
  • the effect notification unit 3081 displays the generated user determination information 440 on a display (not shown) via, for example, the output unit 3074 in FIG. 6 (step E102).
  • the operator can confirm the displayed user determination information 440 and select an appropriate distributed arrangement algorithm via the input unit 3073 shown in FIG. 6.
  • the distributed access node 32 designates the selected distributed arrangement algorithm, executes the operations shown in FIGS. 9 and 10, and rearranges the data items.
  • the first effect of the present embodiment described above is that a prediction can be obtained of how appropriate the arrangement of data items will be when the distributed arrangement algorithm is changed.
  • FIG. 19 is a block diagram showing the configuration of the distributed access node 33 according to this embodiment. Referring to FIG. 19,
  • the distributed access node 33 further includes an effect prediction unit 3080, an algorithm determination unit 3090, and an appropriateness monitoring unit 3091, as compared to the distributed access node 30 of the first embodiment.
  • the distributed access node 33 includes an access location determination unit 3210 instead of the access location determination unit 3010 as compared to the distributed access node 30.
  • the components of the distributed access node 33 in hardware units are the same as those of the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG. 6.
  • the effect prediction unit 3080 is equivalent to the effect prediction unit 3080 shown in FIG. 14.
  • the access location determination unit 3210 is equivalent to the access location determination unit 3210 shown in FIG. 14.
  • the appropriateness monitoring unit 3091 outputs the current post-change algorithm as a candidate algorithm to the access location determination unit 3210 at a predetermined time (for example, every hour).
  • the appropriateness monitoring unit 3091 may output the current post-change algorithm to the access location determination unit 3210 as a candidate algorithm when receiving an operator instruction from the input unit 3073 in FIG. 6.
  • the appropriateness monitoring unit 3091 receives the rearrangement node correspondence information 430 as a response, and calculates the placement appropriateness based on the received rearrangement node correspondence information 430.
  • the appropriateness monitoring unit 3091 outputs the calculated placement appropriateness to the algorithm determination unit 3090 when it exceeds a predetermined threshold value (for example, when the standard deviation representing the degree of distribution, used as the placement appropriateness, exceeds a predetermined value).
  • FIG. 20 is a flowchart showing an operation for determining a distributed arrangement algorithm in the present embodiment.
  • the distributed access node 33 starts the operation of FIG. 20 when a time-out means (not shown) generates a timeout.
  • the rearrangement information generation unit 3050 generates the rearrangement information 410 with reference to the access history information 400 (Step F101).
  • the appropriateness monitoring unit 3091 outputs the current changed algorithm as a candidate algorithm to the access location determination unit 3210 (step F102).
  • the access location determination unit 3210 uses the received candidate algorithm to derive, for the data items corresponding to the rearrangement data item IDs 411 included in the rearrangement information 410, the node identification information 432 of the server storage nodes 50 in which they are stored (step F103).
  • the access location determination unit 3210 outputs, to the appropriateness monitoring unit 3091, the relocation node correspondence information 430 in which the node identification information 432 derived for each of the relocation data item IDs 411 of the relocation information 410 is associated (step F104).
  • the appropriateness monitoring unit 3091 calculates the placement appropriateness based on the received rearrangement node correspondence information 430 (step F105). Next, the appropriateness monitoring unit 3091 determines whether or not the calculated placement appropriateness exceeds a predetermined threshold (step F106). If the threshold is not exceeded (NO in step F106), the process ends. If the placement appropriateness exceeds a predetermined threshold (YES in Step F106), the appropriateness monitoring unit 3091 outputs the calculated placement appropriateness to the algorithm determining unit 3090 (Step F107).
  • the predetermined threshold is, for example, “2” as the value of the ratio between the number of data items stored in any one server storage node 50 and the number of data items stored in another server storage node 50.
  • alternatively, the predetermined threshold may be, for example, “2” as the value of the ratio between the access frequency of any one server storage node 50 and the access frequency of another server storage node 50.
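The threshold check of step F106, with the ratio examples above, can be sketched as follows. This is an illustrative sketch with a hypothetical function name; it assumes the per-node values are item counts or access frequencies and that the ratio compared against the threshold is the maximum over the minimum:

```python
def exceeds_threshold(per_node_values, threshold=2.0):
    # per_node_values: node identification -> item count or access frequency.
    # The placement is judged inappropriate when the ratio between the
    # largest and smallest per-node value exceeds the threshold, e.g. "2".
    values = list(per_node_values.values())
    if min(values) == 0:
        # Any node holding items while another holds none is maximally skewed.
        return max(values) > 0
    return max(values) / min(values) > threshold
```

When this check returns true, the monitoring flow proceeds to step F107; otherwise the process ends as in the NO branch of step F106.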
  • the effect prediction unit 3080 calculates the placement appropriateness corresponding to each of the plurality of distributed placement algorithms that can be set as the post-change algorithm by the algorithm change unit 3020, and outputs them to the algorithm determination unit 3090 (step F108).
  • the operation of step F108 can be easily understood with reference to steps D103 to D107 of FIG. 17.
  • the algorithm determination unit 3090 selects an appropriate distributed placement algorithm based on the placement appropriateness levels corresponding to the received plurality of distributed placement algorithms (step F109).
  • the algorithm determination unit 3090 selects, for example, the distributed placement algorithm that has the best placement appropriateness among the plurality of distributed placement algorithms and whose placement appropriateness is better than that corresponding to the current post-change algorithm. Next, the algorithm determination unit 3090 outputs the designation of the selected distributed arrangement algorithm to the algorithm change unit 3020 (step F110). Then, the process ends.
  • the effect of the present embodiment described above is that a distributed arrangement algorithm more suitable than the one currently in use can be selected, and the data items can be rearranged autonomously using the selected distributed arrangement algorithm.
  • FIG. 21 is a block diagram showing a configuration of the distributed access node 34 according to the present embodiment.
  • the distributed access node 34 according to the present embodiment includes a rearrangement information generation unit 3050, a rearrangement reading unit 3410, a rearrangement writing unit 3420, and a rearrangement erasing unit 3430.
  • the distributed access node 30 may be replaced with the distributed access node 34 of the present embodiment.
  • the components of the distributed access node 34 in hardware units are the same as the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG.
  • the rearrangement information generation unit 3050 generates and outputs rearrangement information 410 including one or more rearrangement data item IDs 411 as illustrated in FIG.
  • the relocation data item ID 411 is a data item ID of a data item to be relocated among the data items stored in the server storage node 50.
  • the rearrangement reading unit 3410 derives node identification information using a first algorithm (pre-change algorithm) among a plurality of algorithms for deriving specific node identification information among the plurality of node identification information.
  • the relocation reading unit 3410 reads the data item specified by the relocation data item ID included in the relocation information 410 from the server storage node 50 corresponding to the derived node identification information.
  • the relocation writing unit 3420 writes the data item specified by the relocation data item ID to the server storage node 50 corresponding to the node identification information derived using a second algorithm (post-change algorithm) of the plurality of algorithms.
  • the rearrangement deleting unit 3430 deletes the data item specified by the rearrangement data item ID from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm.
  • the rearrangement information generation unit 3050 first generates the rearrangement information 410. Second, the relocation reading unit 3410 reads the data item specified by the relocation data item ID included in the relocation information 410 from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm. Third, the relocation writing unit 3420 writes the data item specified by the relocation data item ID to the server storage node 50 corresponding to the node identification information derived using the post-change algorithm. Fourth, the rearrangement deleting unit 3430 deletes the data item specified by the rearrangement data item ID from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm.
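The relocation sequence described above (read with the pre-change algorithm, write with the post-change algorithm, then erase from the old location) can be sketched as follows. This is an illustrative sketch, not the patent's implementation: node stores are modeled as in-memory dictionaries, and each algorithm is a function from a data item ID and node list to a node identification.

```python
def relocate(relocation_ids, stores, pre_change, post_change, nodes):
    # stores: node identification -> {data item ID: data item}.
    # pre_change / post_change: placement algorithms (first and second).
    for item_id in relocation_ids:
        src = pre_change(item_id, nodes)   # node under the pre-change algorithm
        dst = post_change(item_id, nodes)  # node under the post-change algorithm
        if item_id not in stores[src]:
            continue                        # nothing to relocate for this ID
        data = stores[src][item_id]         # relocation read
        stores[dst][item_id] = data         # relocation write
        if src != dst:
            del stores[src][item_id]        # relocation erase
```

Note the ordering: the item is written to its new node before it is erased from the old one, so the data item is never absent from every node during relocation.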
  • each component described in each of the above embodiments does not necessarily have to be individually independent.
  • a plurality of components may be realized as one module, or one component may be realized as a plurality of modules.
  • Each component may be configured such that one component is a part of another component, or a part of one component overlaps a part of another component.
  • a plurality of operations are described in order in the form of a flowchart, but the described order does not limit the order in which the plurality of operations are executed. For this reason, when each embodiment is implemented, the order of the plurality of operations can be changed within a range that does not hinder the contents.
  • a plurality of operations are not limited to being executed at different timings. For example, another operation may occur during the execution of a certain operation, or the execution timing of a certain operation and another operation may partially or entirely overlap.
  • a certain operation is described as a trigger for another operation, but the description does not limit all relationships between the certain operation and the other operations.
  • the distributed arrangement apparatus, the distributed arrangement method, and the program therefor according to the present invention can be applied to uses such as data center operation and management. Further, they can be applied to uses such as the operation and management of appliance products constructed from, for example, a plurality of racks, computers, and storages.


Abstract

Provided is a distributed positioning device with which it is possible to dynamically change a distributed positioning algorithm even when operating a distributed system. The distributed positioning device comprises: a repositioning information generation means for generating repositioning information which includes one or more instances of repositioning data item identification information which is information which identifies a data item to be repositioned which is stored in a node; a repositioning read-in means for reading in a data item which is specified with the repositioning data item identification information from a first node which is determined using a first algorithm; a repositioning write means for writing the data item to a second node which is determined using a second algorithm; and a repositioning deletion means for deleting the data item from the first node.

Description

Distributed arrangement apparatus and distributed arrangement method
 The present invention relates to a distributed arrangement device that distributes data to a plurality of nodes, a distributed arrangement method, and a program therefor.
A technique is known in which, in a distributed system composed of nodes such as servers and storages connected via a network, a plurality of data items (chunks of data each assigned an identifier) are distributed to a plurality of nodes according to a specific algorithm.
This particular algorithm (hereinafter referred to as a distributed placement algorithm) is an algorithm that determines how data items are placed. There are various types of distributed placement algorithms. Also, there are various methods for assigning data item identification information (hereinafter referred to as data item ID (Identification Data)) for determining which data item is placed in which node.
For example, as a method for determining the arrangement corresponding to the distributed arrangement algorithm, there is a method in which a table stored in the relational database is divided into data items for each row or column and distributed to a plurality of nodes.
The distributed key-value store is a method in which a key-value store that stores data items in a combination of a key and a value is extended so as to be shared and stored by a plurality of nodes. The apparatus that executes the distributed key-value store calculates a hash value by using a hash function for each of a key used as data item identification information and node identification information (for example, IP (Internet Protocol) address). Next, the apparatus determines a data item and a node that shares storage of the data item based on the calculated hash value. As an algorithm for determining the nodes that share the storage of data items in this way, for example, there is a consistent hash method.
Furthermore, there is a round robin in which a plurality of data items are sequentially allocated to a plurality of nodes.
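As a rough illustration of the two placement families described above, the following sketch (hypothetical names, not from the patent) shows a minimal consistent-hash lookup, where keys and node addresses are hashed onto a ring and each key is assigned to the first node at or after its hash position, together with a round-robin assignment:

```python
import hashlib

def _hash(value):
    # Derive a stable integer hash from a string (a key or a node address).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def consistent_hash_node(key, nodes):
    # Place each node on a ring by its hashed address, then pick the
    # first node clockwise from the key's hash position.
    ring = sorted((_hash(node), node) for node in nodes)
    h = _hash(key)
    for node_hash, node in ring:
        if h <= node_hash:
            return node
    return ring[0][1]  # wrap around the ring

def round_robin_nodes(item_ids, nodes):
    # Allocate data items to nodes in turn.
    return {item: nodes[i % len(nodes)] for i, item in enumerate(item_ids)}
```

With consistent hashing the mapping depends only on the hashes, so a given key always resolves to the same node for a fixed node set; round robin instead depends on the order in which items arrive.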
Patent Document 1 discloses a technique in which each storage device constituting a distributed storage system determines a range to be shared within the identifier space of the distributed storage system, and stores a data chunk having a corresponding identifier.
The storage node described in Patent Literature 1 determines a shared range in the identifier space of the distributed storage system as follows. First, the space width determination unit of the storage node determines the relative space width between the self node and the adjacent storage node based on the weight between the self node and the adjacent storage node. Next, the space allocation control unit of the storage node divides the space between the own node and the adjacent storage node, and determines a range to be shared by the own node. The space between the own node and the adjacent storage node is a space whose end points are values obtained by applying a hash function to the address of the own node and the address of the adjacent storage node.
The storage node described in Patent Document 1 stores data chunks having corresponding identifiers as follows. First, the storage node receives an input / output request including a file ID (for example, a value obtained by applying a hash function to the file name of the target file). Next, the storage node checks whether or not the file ID falls within the range shared by the own node. If applicable, the storage node accesses the file storage unit of its own node and performs the requested input / output processing.
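The range check performed by the storage node of Patent Document 1 can be sketched as follows, assuming (for illustration only) a 2^32-wide identifier space, SHA-1 as the hash function producing file IDs, and a half-open, possibly wrapping range shared by each node:

```python
import hashlib

SPACE = 2 ** 32  # size of the identifier space (assumed for illustration)

def file_id(name):
    # A file ID obtained by applying a hash function to the file name.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % SPACE

def in_charge(fid, low, high):
    # True if fid falls in the half-open range [low, high) shared by this
    # storage node, allowing the range to wrap around the identifier space.
    if low <= high:
        return low <= fid < high
    return fid >= low or fid < high  # wrapped range
```

A node receiving an input/output request would compute the file ID, call a check like `in_charge`, and perform the requested processing only when the ID falls within its own shared range.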
JP 2004-252663 A
However, the technique described in Patent Document 1 has a problem that it is difficult to dynamically change a distributed arrangement algorithm during operation of a distributed system. That is, the technique described in Patent Document 1 cannot dynamically change the distributed arrangement algorithm even when there is another distributed data arrangement algorithm that is more suitable for the system.
The reason the distributed arrangement algorithm cannot be changed is that, owing to the change in data access logic accompanying a change of the distributed arrangement algorithm, the technique described in Patent Document 1 can no longer handle access to data arranged before the change.
An object of the present invention is to provide a distributed arrangement apparatus, a distributed arrangement method, and a program therefor that can solve the above-described problems.
The distributed placement device of the present invention includes: relocation information generation means for generating and outputting relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a relocation target data item stored in a node; relocation reading means for reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms for deriving specific node identification information among a plurality of pieces of node identification information corresponding to the plurality of nodes; relocation writing means for writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm of the plurality of algorithms; and relocation erasing means for erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
In the distributed placement method of the present invention, a computer generates and outputs relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a relocation target data item stored in a node; reads the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms for deriving specific node identification information among a plurality of pieces of node identification information corresponding to the plurality of nodes; writes the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm of the plurality of algorithms; and erases the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
The program of the present invention causes a computer to execute: a process of generating and outputting relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a relocation target data item stored in a node; a process of reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms for deriving specific node identification information among a plurality of pieces of node identification information corresponding to the plurality of nodes; a process of writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm of the plurality of algorithms; and a process of erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
The effect of the present invention is that the distributed arrangement algorithm can be dynamically changed even during operation of the distributed system.
FIG. 1 is a block diagram illustrating a configuration of a distributed access node according to the first embodiment. FIG. 2 is a block diagram showing the configuration of the distributed system in the first to fourth embodiments. FIG. 3 is a diagram illustrating an example of access history information in the first to fourth embodiments. FIG. 4 is a diagram illustrating an example of rearrangement information in the first to fourth embodiments. FIG. 5 is a diagram illustrating an example of a distributed arrangement algorithm table in the first to fourth embodiments. FIG. 6 is a diagram showing a hardware configuration of the distributed access node and its peripheral devices according to the first to fourth embodiments. FIG. 7 is a flowchart showing a part of the operation when an input/output request is received in the normal state in the first embodiment. FIG. 8 is a flowchart showing another part of the operation when an input/output request is received in the normal state in the first embodiment. FIG. 9 is a flowchart showing a part of the operation of changing the distributed arrangement algorithm and rearranging data items in the first embodiment. FIG. 10 is a flowchart showing another part of the operation of changing the distributed arrangement algorithm and rearranging data items in the first embodiment. FIG. 11 is a flowchart showing a part of the operation when an input/output request is received in the rearrangement state in the first embodiment. FIG. 12 is a flowchart illustrating another part of the operation when an input/output request is received in the rearrangement state in the first embodiment. FIG. 13 is a flowchart illustrating a further part of the operation when an input/output request is received in the rearrangement state in the first embodiment. FIG. 14 is a block diagram illustrating a configuration of a distributed access node according to the second embodiment. FIG. 15 is a diagram illustrating an example of rearrangement node correspondence information in the second and third embodiments. FIG. 16 is a diagram illustrating an example of user determination information according to the second embodiment. FIG. 17 is a flowchart illustrating an operation for calculating a load state according to the second embodiment. FIG. 18 is a flowchart showing the operation of the effect notification unit in the second embodiment. FIG. 19 is a block diagram illustrating a configuration of a distributed access node according to the third embodiment. FIG. 20 is a flowchart illustrating an operation for determining a distributed arrangement algorithm according to the third embodiment. FIG. 21 is a block diagram illustrating a configuration of a distributed access node according to the fourth embodiment. FIG. 22 is a block diagram illustrating an example of a non-volatile storage medium on which a program is recorded.
 Next, embodiments of the present invention will be described in detail with reference to the drawings. In each drawing and in each embodiment described in the specification, components having similar functions are given the same reference signs.
 [First Embodiment]
 FIG. 1 is a block diagram showing the configuration of the distributed access node 30 according to the first embodiment of the present invention. Referring to FIG. 1, the distributed access node 30 includes an access location determination unit 3010, an algorithm change unit 3020, a relocation execution unit 3030, an access history collection unit 3040, a relocation information generation unit 3050, a load monitoring unit 3060, and an input/output execution unit 3031.
 FIG. 2 is a block diagram showing the configuration of the distributed system 100 including the distributed access node 30. Referring to FIG. 2, the distributed system 100 is composed of client nodes 10, the distributed access node 30, a distributed system network 40, server storage nodes 50, and storages 60.
 The client nodes 10 are connected to the distributed access node 30 via a network 20.
 In FIG. 2, there are three each of the client nodes 10, the server storage nodes 50, and the storages 60, but there may be any number of client nodes 10, server storage nodes 50, and storages 60.
 In the configuration shown in FIG. 2, the distributed system 100 operates roughly as follows.
 A client node 10 using the distributed system sends a read/write request for a data item, including at least a data item ID, to the distributed access node 30 via the network 20. The data item ID is also called data item identification information. A read/write request for a data item is also called an input/output request.
 On receiving an input/output request, the distributed access node 30 uniquely determines the storage location (server storage node 50) of the data item using a specific distributed placement algorithm. The distributed access node 30 then sends an access request, based on the received input/output request, to the server storage node 50 that is the determined storage location, via the distributed system network 40. Based on the received access request, the server storage node 50 reads or writes the data item on the storage 60 that it manages.
 As described above, the distributed access node 30 uniquely determines the storage location of a data item using a specific distributed placement algorithm.
 The specific distributed placement algorithm may be any distributed placement algorithm. Accordingly, the concrete distributed arrangement on the distributed system network 40 is arbitrary. Here, the concrete distributed arrangement refers to how the arrangement of the server storage nodes 50 is determined. For example, the concrete distributed arrangement may be one in which the server storage nodes 50 are multiplexed, layered, or virtualized.
 The server storage node 50 may also have a cache (not shown) to mitigate the delay in accessing the storage 60. If the data item targeted by an access request is in the cache, the server storage node 50 may read or write the data item on the cache.
 The server storage node 50 may also control a storage 60 of any configuration using any file system (not shown).
 Next, each component of the distributed access node 30 will be described in detail. The components shown in FIG. 1 are components in functional units, not components in hardware units.
 The algorithm change unit 3020 sets a distributed placement algorithm in the access location determination unit 3010 at initial setup of the distributed access node 30 and when it receives a designation of a distributed placement algorithm.
 For example, at initial setup of the distributed access node 30, the algorithm change unit 3020 sets a predetermined distributed placement algorithm in the access location determination unit 3010. For example, the algorithm change unit 3020 receives a designation of a distributed placement algorithm from an operator via an input unit (not shown), and uses the distributed placement algorithm corresponding to this designation as the predetermined distributed placement algorithm. The algorithm change unit 3020 may also receive a designation of a distributed placement algorithm from, for example, an external server (not shown) or another module (not shown), and use the distributed placement algorithm corresponding to this designation as the predetermined distributed placement algorithm.
 The algorithm change unit 3020 holds, for example, a distributed placement program corresponding to each distributed placement algorithm in a storage means (not shown). The algorithm change unit 3020 then downloads, to the access location determination unit 3010, the distributed placement program corresponding to the predetermined distributed placement algorithm among the held distributed placement programs, and sets it.
 The algorithm change unit 3020 may instead set the distributed placement algorithm in the access location determination unit 3010 by notifying the access location determination unit 3010 of the identification information of the distributed placement algorithm to be used (hereinafter called the distributed placement algorithm ID). In this case, the access location determination unit 3010 may be capable of operating with a plurality of types of distributed placement algorithms, each of which uniquely determines the server storage node 50 to be accessed. The access location determination unit 3010 then operates with the distributed placement algorithm corresponding to the notified distributed placement algorithm ID among the plurality of distributed placement algorithms, and uniquely determines the server storage node 50 to be accessed.
 The algorithm change unit 3020 may also acquire and hold distributed placement programs by a means not shown (for example, a means for acquiring a program from a server (not shown) via the network 20).
 アクセス履歴収集部3040は、クライアントノード10から入出力要求のあった(アクセスされた)データアイテムの、データアイテムIDをアクセス履歴情報として収集する。
 図3は、アクセス履歴情報400の例を示す図である。図3を参照すると、アクセス履歴情報400は、データアイテムID401を含む。
 尚、アクセス履歴収集部3040は、例えば、サーバストレージノード50へアクセス要求を行ったデータアイテムのデータアイテムIDを、アクセス履歴情報400として収集するようにしてもよい。
 また、アクセス履歴収集部3040は、データアイテムID401に替えて、他の情報をアクセス履歴情報として収集してもよい。この場合、他の情報(例えば、データアイテムのハッシュ値)は、各分散配置アルゴリズムが特定のデータアイテムを同一のデータアイテムとして識別可能な情報である。即ち、アクセス履歴情報は、サーバストレージノード50に格納されたデータアイテムを特定する、何らかのデータアイテムの識別情報を少なくとも含む。
 更に、アクセス履歴収集部3040は、データアイテムの識別情報以外の任意の情報をアクセス履歴情報として併せて収集してもよい。任意の情報は、例えば、入出力要求のあった時刻、アクセス要求を行った時刻、入出力要求の種類(例えば、読み込み、書き出し)、データアイテムのサイズ及び対応する分散配置アルゴリズムIDである。
 再配置情報生成部3050は、再配置データアイテムID(再配置データアイテム識別情報とも呼ばれる)を1以上含む再配置情報を生成し、出力する。再配置データアイテムIDは、サーバストレージノード50に格納された、再配置対象のデータアイテムのデータアイテムID401である。
 図4は、再配置情報410の例を示す図である。図4を参照すると、再配置情報410は、再配置データアイテムID411を含む。
 再配置情報生成部3050は、アクセス履歴情報400に含まれるデータアイテムID401それぞれの個数を計数し、計数した個数の値が大きい順に、データアイテムID401を再配置データアイテムID411として並べた再配置情報410を生成する。
 再配置情報生成部3050は、例えば、アクセス履歴情報に含まれるデータアイテムIDを、単位時間当たりのアクセス数が多い順に、再配置データアイテムID411として並べた再配置情報410を生成してもよい。また、再配置情報生成部3050は、例えば、アクセス履歴情報に含まれるデータアイテムIDを、最近アクセスのあった順に、再配置データアイテムID411として並べた再配置情報410を生成してもよい。これらの場合、アクセス履歴情報は、データアイテム識別情報に対応する、入出力要求のあった時刻及びアクセス要求を行った時刻のいずれかを少なくとも含む。
 更に、再配置情報生成部3050は、例えば、入出力要求の種類、データアイテムのサイズ及び対応する分散配置アルゴリズムIDの内の任意のものに基づいて決定した順に、再配置データアイテムID411を並べた再配置情報410を生成してもよい。この場合、アクセス履歴情報は、それぞれに対応する情報を少なくとも含む。
 尚、再配置情報生成部3050は、例えば、サーバストレージノード50からサーバストレージノード50に格納しているデータアイテムのデータアイテムIDを取得し、取得したデータアイテムIDを再配置データアイテムID411としてもよい。この場合、分散アクセスノード30は、アクセス履歴収集部3040を含まなくてもよい。
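 尚、アクセス数の多い順に再配置データアイテムID411を並べた再配置情報410の生成は、例えば次のようなPythonのスケッチで表せる。関数名やデータ構造は本実施形態には現れない説明用の仮定である。

```python
from collections import Counter

def generate_relocation_info(access_history):
    """アクセス履歴情報(データアイテムIDの列)から、含まれている個数の
    大きい順にデータアイテムIDを並べた再配置情報(IDのリスト)を生成する。"""
    counts = Counter(access_history)
    # most_common() は計数した個数の大きい順に (ID, 個数) を返す
    return [item_id for item_id, _ in counts.most_common()]

history = ["DATA1", "DATA2", "DATA1", "DATA3", "DATA1", "DATA2"]
print(generate_relocation_info(history))  # ['DATA1', 'DATA2', 'DATA3']
```

 単位時間当たりのアクセス数順や最近アクセスのあった順とする場合は、アクセス履歴に時刻を含め、計数や整列の基準を差し替えればよい。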
 再配置実行部3030は、再配置情報410から、再配置データアイテムID411を1つ取得する。続けて、再配置実行部3030は、取得した再配置データアイテムID411を含み、その再配置データアイテムID411で特定されるデータアイテムの読み込みを要求する再配置読込要求を、アクセス場所決定部3010に出力する。
 また、再配置実行部3030は、再配置読込要求に対応して読み取ったデータアイテムの書き出しを要求する再配置書出要求を、アクセス場所決定部3010に出力する。尚、再配置書出要求は、対応する再配置データアイテムID411を含んでいてよい。
 また、再配置実行部3030は、再配置読込要求に基づいてデータアイテムを読み取ったサーバストレージノード50から、そのデータアイテムを消去することを要求する再配置後消去要求を、アクセス場所決定部3010に出力する。尚、再配置後消去要求は、対応する再配置データアイテムID411を含んでいてよい。
 入出力実行部3031は、データアイテムIDと読み込みの要求及び書き出しの要求のいずれであるかの情報とを少なくとも含む、入出力要求を受信する。
 入出力実行部3031は、入出力要求に応じて、次の(1)~(4)の処理を実行する。
 (1)入出力実行部3031は、受信した入出力要求がデータアイテムの読み込みの要求であって、かつ、そのデータアイテムのデータアイテムIDと同じ値の再配置データアイテムID411が再配置情報410に含まれている場合、以下の処理を行う。
 まず、入出力実行部3031は、その再配置データアイテムID411を含む、再配置読込要求をアクセス場所決定部3010に出力する。
 次に、入出力実行部3031は、再配置書出要求をアクセス場所決定部3010に出力する。
 次に、入出力実行部3031は、再配置後消去要求を、アクセス場所決定部3010に出力する。
 次に、入出力実行部3031は、再配置読込要求に対応して読み取ったデータアイテムを含む入出力応答を、入出力要求の発行元のクライアントノード10に送信する。
 (2)入出力実行部3031は、受信した入出力要求がデータアイテムの読み込みの要求であり、そのデータアイテムのデータアイテムIDと同じ値の再配置データアイテムID411が再配置情報410に含まれていない場合、以下の処理を行う。
 入出力実行部3031は、そのデータアイテムIDを含み、そのデータアイテムIDで特定されるデータアイテムの読み込みを要求する通常読込要求をアクセス場所決定部3010に出力する。
 次に、入出力実行部3031は、通常読込要求に対応して読み取ったデータアイテムを含む入出力応答を、入出力要求の発行元のクライアントノード10に送信する。
 (3)入出力実行部3031は、受信した入出力要求がデータアイテムの書き出しの要求であって、かつ、そのデータアイテムのデータアイテムIDと同じ値の再配置データアイテムID411が再配置情報410に含まれている場合、以下の処理を行う。尚、データアイテムのデータアイテムIDと同じ値の再配置データアイテムID411が再配置情報410に含まれている場合の書き出しは、即ち、データアイテムを上書きする書き出しである。
 まず、入出力実行部3031は、対応する再配置データアイテムID411を含み、受信した入出力要求に対応するデータアイテムの書き出しを要求する再配置書出要求を、アクセス場所決定部3010に出力する。
 次に、入出力実行部3031は、再配置後消去要求を、アクセス場所決定部3010に出力する。再配置後消去要求は、対応する再配置データアイテムID411を含み、後述する変更前アルゴリズムで導出されたサーバストレージノード50から、その再配置データアイテムID411で特定されるデータアイテムを消去することを要求するものである。
 (4)入出力実行部3031は、受信した入出力要求がデータアイテムの書き出しの要求であり、そのデータアイテムのデータアイテムIDと同じ値の再配置データアイテムID411が再配置情報410に含まれていない場合、以下の処理を行う。
 入出力実行部3031は、そのデータアイテムIDを含み、そのデータアイテムIDで特定されるデータアイテムの書き出しを要求する通常書出要求をアクセス場所決定部3010に出力する。
 入出力実行部3031における処理は、以上の通りである。
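 尚、上記(1)~(4)の場合分けは、例えば次のようなPythonのスケッチとして整理できる。storageや各アルゴリズムを表す関数は、本実施形態のノード間の要求の授受を1プロセス内の辞書操作に置き換えた説明用の仮定である。

```python
def handle_io_request(request, relocation_ids, old_algo, new_algo, storage):
    """入出力要求の処理(1)~(4)のスケッチ。
    request は ("read", データアイテムID, None) または ("write", データアイテムID, 値)。
    old_algo / new_algo はデータアイテムIDからノード識別情報を導出する関数
    (変更前アルゴリズム / 変更後アルゴリズム)。storage は {ノード識別情報: {ID: 値}}。"""
    op, item_id, value = request
    if item_id in relocation_ids:  # 再配置データアイテムIDに一致する場合 (1)(3)
        if op == "read":  # (1) 再配置読込 → 再配置書出 → 再配置後消去 → 応答
            data = storage[old_algo(item_id)].pop(item_id)
            storage[new_algo(item_id)][item_id] = data
            relocation_ids.discard(item_id)
            return data
        # (3) 上書きする書き出し: 再配置書出 → 変更前のノードから消去
        storage[new_algo(item_id)][item_id] = value
        storage[old_algo(item_id)].pop(item_id, None)
        relocation_ids.discard(item_id)
        return None
    # (2)(4) 通常読込・通常書出: 変更後アルゴリズムのみを用いる
    node = new_algo(item_id)
    if op == "read":
        return storage[node][item_id]
    storage[node][item_id] = value
    return None

# 使用例: 変更前アルゴリズムはノード0、変更後アルゴリズムはノード1に "A" を配置すると仮定
storage = {0: {"A": "v1"}, 1: {}}
relocation_ids = {"A"}
result = handle_io_request(("read", "A", None), relocation_ids,
                           lambda i: 0, lambda i: 1, storage)
```

 この使用例では、読み込みの応答と同時にデータアイテム "A" がノード0からノード1へ移動し、以後の入出力は変更後アルゴリズムだけで処理される。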
 アクセス場所決定部3010は、アルゴリズム変更部3020によって分散配置アルゴリズムを設定された場合、これを変更後アルゴリズム(第2のアルゴリズム)として保持する。
 また、アクセス場所決定部3010は、既に変更後アルゴリズムを保持している状態において、アルゴリズム変更部3020によって分散配置アルゴリズムを新たに設定された場合、以下のように動作する。まず、アクセス場所決定部3010は、既に保持している変更後アルゴリズムを変更前アルゴリズム(第1のアルゴリズム)とする。続けて、アクセス場所決定部3010は、新たに設定された分散配置アルゴリズムを新たな変更後アルゴリズムとして保持する。
 アクセス場所決定部3010は、例えば、分散配置アルゴリズムを実現するプログラムをアルゴリズム変更部3020から受け取り(設定され)、そのプログラム名を変更後アルゴリズム及び変更前アルゴリズムを特定する情報として保持する。図5は、分散配置アルゴリズムテーブル420の例を示す図である。図5を参照すると、分散配置アルゴリズムテーブル420は、変更後アルゴリズムを実現する変更後プログラム名421と変更前アルゴリズムを実現する変更前プログラム名422とを保持する。
 アクセス場所決定部3010は、変更後アルゴリズムを使用した再配置の処理が完了した場合、変更前プログラム名422を「0」でクリアし、対応する分散配置プログラムを削除する。
 また、アクセス場所決定部3010は、再配置読込要求を受け取った場合、変更前アルゴリズムを用いて、再配置読込要求に対応するアクセス要求を送信するノード識別情報(第1のノード識別情報に対応)を導出する。続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50(第1のノード)から、再配置読込要求に含まれる再配置データアイテムID411で特定されるデータアイテムを読み込む。
 また、アクセス場所決定部3010は、再配置書出要求を受け取った場合、変更後アルゴリズムを用いて、再配置書出要求に対応するアクセス要求を送信するノード識別情報(第2のノード識別情報に対応)を導出する。続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50(第2のノード)に、再配置書出要求に含まれる再配置データアイテムID411で特定されるデータアイテムを書き出す。
 また、アクセス場所決定部3010は、再配置後消去要求を受け取った場合、変更前アルゴリズムを用いて、再配置後消去要求に対応するアクセス要求を送信するノード識別情報(第1のノード識別情報に対応)を導出する。続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50(第1のノード)から、再配置後消去要求に含まれる再配置データアイテムID411で特定されるデータアイテムを消去する。
 アクセス場所決定部3010は、通常読込要求を受け取った場合、変更後アルゴリズムを用いて、通常読込要求に対応するアクセス要求を送信するノード識別情報を導出する。続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、通常読込要求に含まれるデータアイテムIDで特定されるデータアイテムを読み込む。
 また、アクセス場所決定部3010は、通常書出要求を受け取った場合、変更後アルゴリズムを用い、通常書出要求に対応するアクセス要求を送信するノード識別情報を導出する。続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、通常書出要求に含まれるデータアイテムIDで特定されるデータアイテムを書き出す。
 分散配置アルゴリズムは、例えば以下のアルゴリズムであってもよい。
 アクセス場所決定部3010は、分散配置アルゴリズムにおける初期設定の手順において、例えば図2に示す3台のサーバストレージノード50の、ノード識別情報それぞれに、1対1で対応する連続番号「0」、「1」、「2」を付与する。
 アクセス場所決定部3010は、分散配置アルゴリズムにおけるデータアイテムIDに対応するノード識別情報を導出する第1の手順から第4の手順において、以下のように動作する。
 まず、第1の手順において、アクセス場所決定部3010は、入出力要求に含まれるデータアイテムIDを数値に変換する。例えば、アクセス場所決定部3010は、データアイテムIDの文字コードを16進数に変換する。
 次に、第2の手順において、アクセス場所決定部3010は、第1の手順で変換した数値を「3(サーバストレージノード50の台数)」で割った剰余を算出する。
 次に、第3の手順において、アクセス場所決定部3010は、算出した剰余を、そのデータアイテムIDで特定されるデータアイテムを格納するサーバストレージノード50の連続番号と決定する。
 次に、第4の手順において、アクセス場所決定部3010は、決定した連続番号に対応するノード識別情報を導出する。
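 尚、上記の第1の手順から第4の手順は、例えば次のようなPythonのスケッチとして表せる。関数名やノード識別情報の表現(文字列)は、説明用の仮定である。

```python
def modulo_placement(data_item_id, node_ids):
    """剰余による分散配置アルゴリズムのスケッチ。
    node_ids はノード識別情報のリスト(連続番号 0, 1, 2, ... に対応)。"""
    # 第1の手順: データアイテムIDを数値に変換(ここでは文字コード列を1つの整数とみなす)
    numeric = int.from_bytes(data_item_id.encode("utf-8"), "big")
    # 第2・第3の手順: ノード台数で割った剰余を、格納先の連続番号とする
    seq = numeric % len(node_ids)
    # 第4の手順: 連続番号に対応するノード識別情報を導出
    return node_ids[seq]

nodes = ["node-0", "node-1", "node-2"]
# 同じデータアイテムIDは常に同じノード識別情報に決定される
assert modulo_placement("DATA1", nodes) == modulo_placement("DATA1", nodes)
```

 データアイテムIDから決定的に数値が得られる限り、数値への変換方法自体は任意である。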
 また、分散配置アルゴリズムは、例えば以下のアルゴリズムである。
 アクセス場所決定部3010は、分散配置アルゴリズムの初期設定の手順において、その分散配置アルゴリズムが指定するハッシュ関数で算出されるハッシュ値がとりえる値を、サーバストレージノード50のノード識別情報それぞれに、関連付ける。ハッシュ関数は、例えば、CRC(Cyclic Redundancy Check)16である。具体的には、分散配置アルゴリズムが指定するハッシュ関数が、例えば、CRC16である場合、そのハッシュ関数で算出されるハッシュ値がとりえる値は、#0000(#は後に続く4つの文字が16進数の数値を表すことを意味する)から#FFFFである。アクセス場所決定部3010は例えば、図2に示す3台のサーバストレージノード50それぞれに、そのハッシュ値がとりえる値を3分割して、関連付ける。即ち、アクセス場所決定部3010は、例えば、1台目乃至3台目のサーバストレージノード50のノード識別情報それぞれに、#0000~#5555、#5556~#AAAA及び#AAAB~#FFFFを、分担ハッシュ値として関連付ける。
 アクセス場所決定部3010は、分散配置アルゴリズムのデータアイテムIDに対応するノード識別情報を導出する第1の手順から第3の手順において、以下のように動作する。
 まず、第1の手順において、アクセス場所決定部3010は、入出力要求に含まれるデータアイテムIDのハッシュ値を、そのハッシュ関数で算出する。例えば、データアイテムIDが「DATA1」であり、ハッシュ関数がCRC16である場合、アクセス場所決定部3010は、#0C9Eを算出する。
 次に、第2の手順において、アクセス場所決定部3010は、データアイテムIDのハッシュ値がいずれの分担ハッシュ値の範囲に含まれるかを判定する。アクセス場所決定部3010は、例えば、#0C9Eが#0000~#5555に含まれることを判定する。
 次に、第3の手順において、アクセス場所決定部3010は、判定した分担ハッシュ値の範囲に対応するノード識別情報を導出する。
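 尚、ハッシュ値の分担範囲に基づく上記の手順は、例えば次のようなPythonのスケッチとして表せる。CRC16としてPython標準のbinascii.crc_hqx(CRC-CCITT系の一実装)を用いているが、本実施形態のCRC16と同一の実装とは限らず、分担範囲の等分の仕方も本文の端点(#5555、#AAAA等)と厳密に一致するとは限らない説明用の仮定である。

```python
from binascii import crc_hqx  # 16ビットCRC(CRC-CCITT)の一実装

def hash_range_placement(data_item_id, node_ids):
    """ハッシュ値の分担範囲による分散配置アルゴリズムのスケッチ。
    とりえるハッシュ値 #0000~#FFFF をノード台数でほぼ等分し、
    データアイテムIDのハッシュ値が属する範囲のノード識別情報を返す。"""
    # 第1の手順: データアイテムIDのハッシュ値を算出
    h = crc_hqx(data_item_id.encode("utf-8"), 0)
    # 初期設定に相当: 分担ハッシュ値の範囲の幅(切り上げで等分)
    span = -(-(0xFFFF + 1) // len(node_ids))
    # 第2・第3の手順: ハッシュ値が属する範囲を判定し、対応するノード識別情報を導出
    return node_ids[h // span]

nodes = ["node-0", "node-1", "node-2"]
assert hash_range_placement("DATA1", nodes) in nodes
```

 ノード台数が変わらない限り、同じデータアイテムIDは常に同じ分担範囲、即ち同じノードに決定される。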
 負荷監視部3060は、分散アクセスノード30の負荷状態を監視し、その監視結果を出力する。負荷状態は、例えば、分散アクセスノード30のCPU(Central Processing Unit、不図示)の使用率やメモリ(不図示)の使用率である。
 尚、再配置実行部3030が再配置読込要求を出力することと、アクセス場所決定部3010が変更前アルゴリズムを用いて導出したノード識別情報に対応するサーバストレージノード50に、再配置読込要求に含まれる再配置データアイテムID411で特定されるデータアイテムを読み込むこととは、併せて再配置読込手段とも呼ばれる。
 また、再配置実行部3030が再配置書出要求を出力することと、アクセス場所決定部3010が変更後アルゴリズムを用いて導出したノード識別情報に対応するサーバストレージノード50に、再配置書出要求に含まれる再配置データアイテムID411で特定されるデータアイテムを書き出すこととは、併せて再配置書出手段とも呼ばれる。
 また、再配置実行部3030が再配置後消去要求を出力することと、アクセス場所決定部3010が変更前アルゴリズムを用いて導出したノード識別情報に対応するサーバストレージノード50から、再配置後消去要求に含まれる再配置データアイテムID411で特定されるデータアイテムを消去することとは、併せて再配置消去手段とも呼ばれる。
 以上が、分散アクセスノード30の機能単位の各構成要素についての説明である。
 次に、分散アクセスノード30のハードウェア単位の構成要素について説明する。
 図6は、本実施形態における分散アクセスノード30とその周辺装置のハードウェア構成を示す図である。図6に示されるように、分散アクセスノード30は、CPU3070、記憶部3071、記憶装置3072、入力部3073、出力部3074及び通信部3075を含む。CPU3070は、オペレーティングシステム(不図示)を動作させて、本実施形態に係る分散アクセスノード30の、全体の動作を制御する。また、CPU3070は、例えば記憶装置3072に装着された不揮発性の記録媒体から、記憶部3071にプログラムやデータを読み込む。図22は、プログラムを記録した不揮発性記憶媒体3077の例を示すブロック図である。そして、CPU3070は、読み込んだプログラムに従って、また読み込んだデータに基づいて、図1に示す各部として各種の処理を実行する。
 尚、CPU3070は、通信網(不図示)に接続されている外部コンピュータ(不図示)から、記憶部3071にプログラムやデータをダウンロードするようにしてもよい。
 記憶部3071は、プログラムやデータを記憶する。
 記憶装置3072は、例えば、光ディスク、フレキシブルディスク、磁気光ディスク、外付けハードディスク及び半導体メモリであって、不揮発性の記憶媒体を含む。記憶装置3072は、プログラムをコンピュータ読み取り可能に記録する。また、記憶装置3072は、データをコンピュータ読み取り可能に記録してもよい。
 入力部3073は、例えばマウスやキーボード、内蔵のキーボタンなどで実現され、入力操作に用いられる。入力部3073は、マウスやキーボード、内蔵のキーボタンに限らず、例えばタッチパネル、加速度計、ジャイロセンサ、カメラなどでもよい。
 出力部3074は、例えばディスプレイで実現され、出力を確認するために用いられる。
 通信部3075は、ネットワーク20及び分散システムネットワーク40とのインタフェースを実現する。通信部3075は、アクセス場所決定部3010、アルゴリズム変更部3020、入出力実行部3031の一部として含まれる。
 以上が、分散アクセスノード30のハードウェア単位の各構成要素についての説明である。
 以上説明したように、図1に示す機能単位のブロックは、図6に示すハードウェア構成によって実現される。ただし、分散アクセスノード30が備える各部の実現手段は、上記に限定されない。すなわち、分散アクセスノード30は、物理的に結合した一つの装置により実現されてもよいし、物理的に分離した2つ以上の装置を有線または無線で接続し、これら複数の装置により実現されてもよい。
 また、前述のプログラムを記録した記録媒体(または記憶媒体)3077が分散アクセスノード30に供給され、分散アクセスノード30は、記録媒体に格納されたプログラムを読み込み、実行してもよい。すなわち、本実施形態は、分散アクセスノード30が実行するプログラムを、一時的にまたは非一時的に、記憶する記録媒体の実施形態を含む。
 次に、本実施形態の動作について図面を参照して詳細に説明する。
 図7及び図8は、本実施形態における、通常状態で入出力要求を受信した場合の動作を示すフローチャートである。
 尚、通常状態は、分散アクセスノード30の初期設定時にアルゴリズム変更部3020によって分散配置アルゴリズムを設定され、新たな分散配置アルゴリズムは未だ設定されていない状態である。通常状態は、また、アルゴリズム変更部3020によって新たな分散配置アルゴリズムを設定され、新たな分散配置アルゴリズムを用いた再配置の処理が完了し、更なる新たな分散配置アルゴリズムは未だ設定されていない状態である。従って、通常状態は、再配置情報410に再配置データアイテムID411が1つも含まれていない状態であり、変更前アルゴリズムは存在しない状態であり、再配置の処理を実行していない状態である。また、再配置の処理を実行している状態を、以後、再配置状態と呼ぶ。
 分散アクセスノード30は、クライアントノード10のいずれかがデータアイテムを読み書きするために送信した、入出力要求を受信したことを契機に、本動作を開始する。
 まず、入出力実行部3031は、通常状態であるか否かを確認する(ステップA101)。例えば、入出力実行部3031は、再配置情報410に再配置データアイテムID411が1つも含まれていない状態(通常状態)であるか否か(再配置状態)を確認する。入出力実行部3031は、例えば、分散配置アルゴリズムテーブル420を参照して、変更前アルゴリズムと変更後アルゴリズムとが存在している状態(再配置状態)か、変更後アルゴリズムだけが存在している状態(通常状態)かを確認するようにしてもよい。
 通常状態でない(再配置状態である)場合(ステップA101でNO)、処理は後述する図11のステップC103へ進む。
 通常状態である場合(ステップA101でYES)、入出力実行部3031は、受信した入出力要求が読み込みを要求する入出力要求であるか、書き出しを要求する入出力要求であるかを判定する(ステップA102)。読み込みを要求する入出力要求である場合(ステップA102でYES)、処理はステップA103へ進む。書き出しを要求する入出力要求である場合(ステップA102でNO)、処理はステップA110へ進む。
 A103において、入出力実行部3031は、受信した入出力要求に含まれるデータアイテムIDを含む、通常読込要求をアクセス場所決定部3010に出力する(ステップA103)。
 次に、アクセス場所決定部3010は、変更後アルゴリズムを用いて、受信した通常読込要求に対応する、アクセス要求を出力するサーバストレージノード50のノード識別情報を導出する(ステップA104)。
 次に、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、受信した通常読込要求に対応するアクセス要求を出力する(ステップA105)。
 次に、入出力実行部3031は、アクセス場所決定部3010が出力したアクセス要求に対応するデータアイテムを受け取る(ステップA106)。
 次に、入出力実行部3031は、読み取ったデータアイテムを含む入出力応答を、入出力要求の発行元のクライアントノード10に送信する(ステップA107)。そして、処理はステップA120へ進む。
 ステップA110において、入出力実行部3031は、受信した入出力要求に含まれるデータアイテムIDを含む、通常書出要求をアクセス場所決定部3010に出力する(ステップA110)。
 次に、アクセス場所決定部3010は、変更後分散配置アルゴリズムを用いて、受信した通常書出要求に対応するアクセス要求を出力するサーバストレージノード50のノード識別情報を導出する(ステップA111)。
 次に、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、受信した通常書出要求に対応するアクセス要求を出力する(ステップA112)。そして、処理はステップA120へ進む。
 ステップA120において、アクセス履歴収集部3040は、アクセス履歴情報400に、出力したアクセス要求に対応するデータアイテムIDを記録する(ステップA120)。そして、処理は終了する。
 以上が、通常状態で入出力要求を受信した場合の分散アクセスノード30の動作の説明である。
 図9及び図10は、本実施形態における、分散配置アルゴリズムを変更し、データアイテムを再配置する動作を示すフローチャートである。
 分散アクセスノード30は、図6に示す入力部3073を経由してオペレータから、新たに設定する分散配置アルゴリズムの指定を受け取ったことを契機に、本動作を開始する。
 アルゴリズム変更部3020は、オペレータから新たに受け取った分散配置アルゴリズムの指定に対応する分散配置アルゴリズムをアクセス場所決定部3010に設定する(ステップB101)。具体的には、アルゴリズム変更部3020は、受け取った分散配置アルゴリズムの指定に対応する分散配置プログラムをアクセス場所決定部3010にダウンロードする。同時に、アルゴリズム変更部3020は、その分散配置プログラムのプログラム名をアクセス場所決定部3010に出力する。
 次に、アクセス場所決定部3010は、分散配置アルゴリズムの設定を受け付ける。具体的には、アクセス場所決定部3010は、プログラム名を受け取ると、変更後プログラム名421を変更前プログラム名422に上書きし、受け取ったプログラム名を変更後プログラム名421に書き込む(ステップB102)。
 次に、再配置情報生成部3050は、図3に示すようなアクセス履歴情報400を参照し、図4に示すような再配置情報410を生成する(ステップB103)。
 尚、再配置情報生成部3050は、アクセス履歴情報400に含まれるデータアイテムID401の内、参照済みのデータアイテムID401を削除するようにしてもよい。こうした場合、再配置情報410が生成された後に、アクセス履歴情報400に含まれるデータアイテムID401は、新たに受け取った分散配置アルゴリズムを用いて決定されたサーバストレージノード50に格納されたデータアイテムを特定する。従って、更に新たな分散配置アルゴリズムへ変更して、データアイテムを再配置する処理を実行することが可能となる。
 次に、負荷監視部3060は、高負荷状態でないことを確認する(ステップB104)。例えば、負荷監視部3060は、分散アクセスノード30のCPU使用率が、予め定められた閾値(例えば、50%)を越えていないか否かに基づいて高負荷状態でないことを確認する。また、負荷監視部3060は、サーバストレージノード50に負荷の状態を問い合わせた結果に基づいて、或いはその結果と分散アクセスノード30のCPU使用率の両方に基づいて、高負荷状態でないことを確認するようにしてもよい。高負荷状態である場合(ステップB104でYES)、処理は終了する。
 高負荷状態でない場合(ステップB104でNO)、再配置実行部3030は、再配置情報410の先頭から、再配置データアイテムID411を1つ取得する(ステップB105)。
 次に、再配置実行部3030は、取得した再配置データアイテムID411を含み、その再配置データアイテムID411で特定されるデータアイテムの読み込みを要求する再配置読込要求を、アクセス場所決定部3010に出力する(ステップB106)。
 次に、アクセス場所決定部3010は、再配置読込要求を受け取り、変更前アルゴリズムを用いて、再配置読込要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップB107)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、再配置読込要求に含まれる再配置データアイテムID411で特定されるデータアイテムを読み込む(ステップB108)。具体的には、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、再配置データアイテムID411を含むアクセス要求を送信する。次に、アクセス場所決定部3010は、出力したアクセス要求に応答して、サーバストレージノード50から送信されたデータアイテムを受け取る。
 次に、再配置実行部3030は、再配置読込要求に対応して読み取ったデータアイテムの書き出しを要求する再配置書出要求を、アクセス場所決定部3010に出力する(ステップB109)。
 次に、アクセス場所決定部3010は、再配置書出要求を受け取り、変更後アルゴリズムを用いて、再配置書出要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップB110)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、再配置書出要求に含まれる再配置データアイテムID411で特定されるデータアイテムを書き出す(ステップB111)。
 次に、再配置実行部3030は、再配置読込要求に基づいてデータアイテムを読み取ったサーバストレージノード50から、そのデータアイテムを消去することを要求する再配置後消去要求を、アクセス場所決定部3010に出力する(ステップB112)。
 次に、アクセス場所決定部3010は、再配置後消去要求を受け取り、変更前アルゴリズムを用いて、再配置後消去要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップB113)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、再配置後消去要求に含まれる再配置データアイテムID411で特定されるデータアイテムを消去する(ステップB114)。
 次に、再配置実行部3030は、再配置情報410から、対応する再配置データアイテムID411を削除する(ステップB115)。
 次に、再配置実行部3030は、再配置情報410に、再配置データアイテムID411が存在するか否かを確認する(ステップB116)。存在する場合(ステップB116でYES)、処理はステップB105へ戻る。
 存在しない場合(ステップB116でNO)、アクセス場所決定部3010は、分散配置アルゴリズムテーブル420の変更前プログラム名422を「0」でクリアする(ステップB117)。そして、処理は終了する。
 尚、ステップB103のアクセス履歴情報400から再配置情報410を生成する処理は、例えば、ステップB101、ステップB102及びステップB104と並行して動作させることが望ましい。また、クライアントノード10からの入出力要求の頻度が高い場合、ステップB103の処理を、例えば、図9に示すフローチャートの処理とは独立して、一定時間ごとに行うようにしてもよい。こうした場合、アクセス履歴情報400に含まれるデータアイテムID401の内、参照済みのデータアイテムID401は、順次削除されるようにしてもよい。
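 尚、ステップB105~B117の再配置の流れは、例えば次のようなPythonのスケッチで表せる。storage等の表現は説明用の仮定であり、変更前後で格納先が変わらないデータアイテムは移動不要として省略している。

```python
def relocate_all(relocation_info, old_algo, new_algo, storage):
    """分散配置アルゴリズム変更後の再配置処理(ステップB105~B117)のスケッチ。
    relocation_info は再配置データアイテムIDのリスト、
    storage は {ノード識別情報: {データアイテムID: 値}}。"""
    while relocation_info:                    # B116: 再配置データアイテムIDが残っているか
        item_id = relocation_info.pop(0)      # B105/B115: 先頭から1つ取得し、再配置情報から削除
        src = old_algo(item_id)               # B107: 変更前アルゴリズムでノード識別情報を導出
        dst = new_algo(item_id)               # B110: 変更後アルゴリズムでノード識別情報を導出
        if src == dst:
            continue                          # 格納先が変わらない場合は移動不要(説明上の簡略化)
        storage[dst][item_id] = storage[src][item_id]  # B108/B111: 再配置読込・再配置書出
        del storage[src][item_id]             # B113/B114: 再配置後消去

storage = {0: {"A": 1, "B": 2}, 1: {}}
relocate_all(["A", "B"], lambda i: 0, lambda i: 1 if i == "A" else 0, storage)
print(storage)  # {0: {'B': 2}, 1: {'A': 1}}
```

 実際の本実施形態では、各ステップの間に高負荷状態の確認(ステップB104)や入出力要求の処理が挟まり得る。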
 以上が、分散配置アルゴリズムを変更し、データアイテムを再配置する動作の説明である。
 図11、図12及び図13は、再配置状態で入出力要求を受信した場合(図7のステップA101でNO)の動作を示すフローチャートである。
 入出力実行部3031は、再配置状態で入出力要求を受信した場合(図7のステップA101でNO)、受信した入出力要求に含まれるデータアイテムIDが、再配置情報410に含まれる再配置データアイテムID411のいずれかと一致するか否かを確認する(ステップC103)。一致する場合(ステップC103でYES)、処理はステップC104へ進む。一致しない場合(ステップC103でNO)、処理は図7のステップA102へ進む。
 ステップC104において、入出力実行部3031は、受信した入出力要求が、読み込みを要求する入出力要求であるか、書き出しを要求する入出力要求であるかを判定する(ステップC104)。読み込みを要求する入出力要求である場合(ステップC104でYES)、処理はC105へ進む。書き出しを要求する入出力要求である場合(ステップC104でNO)、処理はC120へ進む。
 ステップC105において、入出力実行部3031は、読み込むデータアイテムを特定する再配置データアイテムID411を含む再配置読込要求を、アクセス場所決定部3010に出力する(ステップC105)。
 次に、アクセス場所決定部3010は、再配置読込要求を受け取り、変更前アルゴリズムを用いて、再配置読込要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップC106)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、再配置読込要求に含まれる再配置データアイテムID411で特定されるデータアイテムを読み込む(ステップC107)。
 次に、入出力実行部3031は、再配置読込要求に対応して読み取ったデータアイテムの書き出しを要求する再配置書出要求を、アクセス場所決定部3010に出力する(ステップC108)。
 次に、アクセス場所決定部3010は、再配置書出要求を受け取り、変更後アルゴリズムを用いて、再配置書出要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップC109)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、再配置書出要求に含まれる再配置データアイテムID411で特定されるデータアイテムを書き出す(ステップC110)。
 次に、入出力実行部3031は、再配置読込要求に基づいてデータアイテムを読み取ったサーバストレージノード50から、そのデータアイテムを消去することを要求する再配置後消去要求を、アクセス場所決定部3010に出力する(ステップC111)。
 次に、アクセス場所決定部3010は、再配置後消去要求を受け取り、変更前アルゴリズムを用いて、再配置後消去要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップC112)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、再配置後消去要求に含まれる再配置データアイテムID411で特定されるデータアイテムを消去する(ステップC113)。
 次に、入出力実行部3031は、再配置情報410から、対応する再配置データアイテムID411を削除する(ステップC114)。
 次に、入出力実行部3031は、再配置読込要求に対応して読み取ったデータアイテムを含む入出力応答を、入出力要求の発行元のクライアントノード10に送信する(ステップC115)。そして、処理はC130へ進む。
 ステップC120において、入出力実行部3031は、対応する再配置データアイテムID411を含み、受信した入出力要求に対応するデータアイテムの書き出しを要求する再配置書出要求を、アクセス場所決定部3010に出力する(ステップC120)。
 次に、アクセス場所決定部3010は、再配置書出要求を受け取り、変更後アルゴリズムを用いて、再配置書出要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップC121)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50に、再配置書出要求に含まれる再配置データアイテムID411で特定されるデータアイテムを書き出す(ステップC122)。
 次に、入出力実行部3031は、変更前アルゴリズムで導出されたサーバストレージノード50から、そのデータアイテムを消去することを要求する再配置後消去要求を、アクセス場所決定部3010に出力する(ステップC123)。
 次に、アクセス場所決定部3010は、再配置後消去要求を受け取り、変更前アルゴリズムを用いて、再配置後消去要求に対応するアクセス要求を送信するノード識別情報を導出する(ステップC124)。
 続けて、アクセス場所決定部3010は、導出したノード識別情報に対応するサーバストレージノード50から、再配置後消去要求に含まれる再配置データアイテムID411で特定されるデータアイテムを消去する(ステップC125)。
 次に、入出力実行部3031は、再配置情報410から、対応する再配置データアイテムID411を削除する(ステップC126)。ここで、対応する再配置データアイテムID411は、再配置後消去要求に含まれる再配置データアイテムID411である。そして、処理はC130へ進む。
 ステップC130において、アクセス履歴収集部3040は、アクセス履歴情報400に、出力したアクセス要求に対応するデータアイテムIDを記録する(ステップC130)。
 以上が、再配置状態で入出力要求を受信した場合の動作の説明である。
 尚、分散アクセスノード30は、3以上の分散配置アルゴリズムを用いるようにしてもよい。この場合、アクセス履歴情報は、データアイテムIDに加えて、対応する分散配置アルゴリズムIDを含むようにしてもよい。加えて、アクセス場所決定部3010は、分散配置アルゴリズムIDを参照して、再配置読込要求及び再配置後消去要求の処理時に用いる変更前アルゴリズムを選択するようにしてもよい。
 また、分散アクセスノード30は、予め定められた特定の状況(例えば、CPU使用率が所定の値より高い)において、入出力要求に含まれるデータアイテムIDに対応するデータアイテムについてのみ再配置を実行するようにしてもよい。
 上述した本実施形態における第1の効果は、分散システムの運用中においても、分散配置アルゴリズムをダイナミックに変更することが可能になる点である。
 その理由は、以下のような構成を含むからである。即ち、第1に再配置情報生成部3050が、再配置情報410を生成する。第2に、再配置実行部3030が再配置情報410に基づいて、再配置読込要求と再配置書出要求と再配置後消去要求とを出力する。第3に、アクセス場所決定部3010が、再配置読込要求に応じて、変更前アルゴリズムを用いて導出したサーバストレージノード50から対応するデータアイテムを読み出す。第4に、アクセス場所決定部3010が、再配置書出要求に応じて、変更後アルゴリズムを用いて導出したサーバストレージノード50にそのデータアイテムを書き出す。第5に、アクセス場所決定部3010が、再配置後消去要求に応じて、変更前アルゴリズムを用いて導出したサーバストレージノード50に含まれる、そのデータアイテムを消去する。尚、ここでの「導出したサーバストレージノード50」とは、「導出したノード識別情報に対応するサーバストレージノード50」を意味する。
 上述した本実施形態における第2の効果は、分散配置アルゴリズムを変更する処理と、入出力要求の処理とを、並行して効率よく処理することが可能になる点である。
 その理由は、以下のような構成を含むからである。即ち、第1に入出力実行部3031が、再配置情報410に基づいて、再配置読込要求と再配置書出要求と再配置後消去要求とを、または、通常読込要求と通常書出要求とを、選択的に出力する。第2に、アクセス場所決定部3010が、通常読込要求に応じて、変更後アルゴリズムを用いて導出したサーバストレージノード50から対応するデータアイテムを読み出す。第3に、アクセス場所決定部3010が、通常書出要求に応じて、変更後アルゴリズムを用いて導出したサーバストレージノード50にそのデータアイテムを書き出す。尚、ここでの「導出したサーバストレージノード50」とは、「導出したノード識別情報に対応するサーバストレージノード50」を意味する。
 上述した本実施形態における第3の効果は、第1の分散配置アルゴリズムから第2の分散配置アルゴリズムへ変更が完了する前に、更に第3の分散配置アルゴリズムへの変更を開始することが可能になる点である。
 その理由は、アクセス履歴情報に分散配置アルゴリズムIDを含み、アクセス場所決定部3010は、分散配置アルゴリズムIDを参照して、再配置読込要求及び再配置後消去要求の処理時に用いる変更前アルゴリズムを選択するようにしたからである。
 上述した本実施形態における第4の効果は、分散アクセスノード30及びサーバストレージノード50のいずれかが高負荷状態であっても、データアイテムの再配置処理を効率よく実行することが可能になる点である。
 その理由は、分散アクセスノード30は、予め定められた特定の状況において、入出力要求に含まれるデータアイテムIDに対応するデータアイテムについてのみ再配置を実行するようにしたからである。
 [第2の実施形態]
 次に、本発明の第2の実施形態について図面を参照して詳細に説明する。
 図14は、本実施形態に係る分散アクセスノード32の構成を示すブロック図である。図14を参照すると、本実施形態に係る分散アクセスノード32は、第1の実施形態の分散アクセスノード30と比べ、効果予測部3080及び効果通知部3081を更に含む。また、分散アクセスノード32は、分散アクセスノード30と比べて、アクセス場所決定部3010に替えてアクセス場所決定部3210を含む。
 尚、分散アクセスノード32のハードウェア単位の構成要素は、図6に示す分散アクセスノード30及びその周辺装置のハードウェア構成の構成要素と同等である。
 効果予測部3080は、候補アルゴリズムに対応する、データアイテムの配置の適切度を示す配置適切度を算出する。候補アルゴリズムとは、現在の変更後アルゴリズム(現在使用している分散配置アルゴリズム)に替えて新たな変更後アルゴリズムとする、複数の分散配置アルゴリズムの内のいずれかである。そして、候補アルゴリズムは、現在の変更後アルゴリズム自身を含む。効果予測部3080は、候補アルゴリズムを選定し、アクセス場所決定部3210に出力し、その応答として再配置ノード対応情報を受け取り、受け取った再配置ノード対応情報に基づいて配置適切度を算出する。
 図15は、再配置ノード対応情報430の例を示す図である。図15を参照すると、再配置ノード対応情報430は、再配置データアイテムID411とノード識別情報432を対応付けて含む。
 効果予測部3080は、各サーバストレージノード50に格納されるデータアイテムの数に基づいて、データアイテムの散布度(例えば、標準偏差)を算出し、これを配置適切度とする。
 効果予測部3080は、例えば、アクセス履歴情報400を参照してデータアイテムそれぞれのアクセス頻度を算出する。そして、効果予測部3080は、このアクセス頻度に基づいて各サーバストレージノード50のアクセス頻度を更に算出する。そして、効果予測部3080は、算出したアクセス頻度を配置適切度としてもよい。
 また、効果予測部3080は、データアイテムそれぞれのアクセス頻度、各サーバストレージノード50に格納されるデータアイテムの数とサイズをスコア化(数値化)してもよい。そして、効果予測部3080は、サーバストレージノード50ごとに合計値を算出して配置適切度としてもよい。こうした場合、アクセス履歴情報400は、データアイテムのサイズを含むようにしてもよい。
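 尚、散布度(標準偏差)を配置適切度とする算出は、例えば次のようなPythonのスケッチで表せる。関数名は説明用の仮定であり、この定義では値が小さいほど配置が均等であることを示す。

```python
from statistics import pstdev

def placement_suitability(relocation_node_map, node_ids):
    """再配置ノード対応情報(再配置データアイテムID -> ノード識別情報)から、
    各ノードに格納されるデータアイテム数の散布度(母標準偏差)を
    配置適切度として算出するスケッチ。値が小さいほど配置が均等。"""
    counts = {node: 0 for node in node_ids}
    for node in relocation_node_map.values():
        counts[node] += 1
    return pstdev(counts.values())

nodes = ["node-0", "node-1", "node-2"]
balanced = {"DATA1": "node-0", "DATA2": "node-1", "DATA3": "node-2"}
skewed = {"DATA1": "node-0", "DATA2": "node-0", "DATA3": "node-0"}
# 均等な配置ほど配置適切度(散布度)は小さい
assert placement_suitability(balanced, nodes) < placement_suitability(skewed, nodes)
```

 アクセス頻度やデータアイテムのサイズをスコア化する場合も、counts の値を該当するスコアの合計に置き換えれば同様に算出できる。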
 効果通知部3081は、効果予測部3080が予測した配置適切度に基づいて、分散配置アルゴリズムと対応する配置適切度とを含む利用者判断情報を生成する。次に、効果通知部3081は、生成した利用者判断情報を、例えば、図6の出力部3074を介してディスプレイ(不図示)に表示させる。
 図16は、利用者判断情報440の例を示す図である。図16を参照すると、利用者判断情報440は、分散配置アルゴリズムID441と配置適切度442とを含む。
 アクセス場所決定部3210は、図1のアクセス場所決定部3010の機能に加え、以下の機能を含む。まず、アクセス場所決定部3210は、候補アルゴリズムを用いて、再配置情報410に含まれる再配置データアイテムID411に対応するデータアイテムについて、それらが格納されるサーバストレージノード50のノード識別情報432を導出する。次に、アクセス場所決定部3210は、再配置情報410の再配置データアイテムID411それぞれに導出したノード識別情報432を対応付けた再配置ノード対応情報430を、出力する。
 次に、本実施形態の動作について、図面を参照して詳細に説明する。
 図17は、本実施形態における、配置適切度を算出する動作を示すフローチャートである。
 最初に、再配置情報生成部3050は、アクセス履歴情報400を参照して再配置情報410を生成する(ステップD101)。
 次に、効果予測部3080は、現在の変更後アルゴリズムを、候補アルゴリズムとして、アクセス場所決定部3210に出力する(ステップD102)。
 次に、アクセス場所決定部3210は、受け取った候補アルゴリズムを用いて、再配置情報410に含まれる再配置データアイテムID411に対応するデータアイテムについて、それらが格納されるサーバストレージノード50のノード識別情報432を導出する(ステップD103)。
 次に、アクセス場所決定部3210は、再配置情報410の再配置データアイテムID411それぞれに導出したノード識別情報432を対応付けた再配置ノード対応情報430を生成し、効果予測部3080に出力する(ステップD104)。
 次に、効果予測部3080は、受け取った再配置ノード対応情報430に基づいて、候補アルゴリズムに対応する配置適切度を算出し、対応する分散配置アルゴリズムIDと共に効果通知部3081へ出力する(ステップD105)。
 次に、効果予測部3080は、アルゴリズム変更部3020が変更後アルゴリズムとして設定可能な、複数の分散配置アルゴリズムの内から、候補アルゴリズムとして選択する分散配置アルゴリズムが存在するか否かを判定する(ステップD106)。選択する分散配置アルゴリズムが存在する場合(ステップD106でYES)、処理はステップD107へ進む。選択する分散配置アルゴリズムが存在しない場合(ステップD106でNO)、処理は後述する図18のステップE101へ進む。また、効果予測部3080は、例えば、現在の変更後アルゴリズムを候補アルゴリズムとして算出した配置適切度と、ステップD105で出力した配置適切度との比が予め定められた値を超えた場合、候補アルゴリズムの選択を終了するようにしてもよい。
 ステップD107において、効果予測部3080は、候補アルゴリズムを1つ選択し、アクセス場所決定部3210に出力する(ステップD107)。そして、処理はステップD103へ戻る。尚、効果予測部3080は、例えば、アルゴリズム変更部3020が保持する分散配置アルゴリズムを、順番に、候補アルゴリズムとして選択する。また、効果予測部3080は、例えば、図6に示す入力部3073からオペレータが入力した分散配置アルゴリズムを候補アルゴリズムとして選択するようにしてもよい。
 図18は、本実施形態における、効果通知部3081の動作を示すフローチャートである。
 まず、効果通知部3081は、受け取った配置適切度と対応する分散配置アルゴリズムIDとに基づいて、利用者判断情報440を生成する(ステップE101)。
 次に、効果通知部3081は、生成した利用者判断情報440を、例えば、図6の出力部3074を介してディスプレイ(不図示)に表示させる(ステップE102)。
 オペレータは、表示された利用者判断情報440を確認し、図6の入力部3073によって適切な分散配置アルゴリズムを選択することができる。そして、分散アクセスノード32は、選択した分散配置アルゴリズムを指定して、前述した図9、図10に示す動作を実行し、データアイテムを再配置させる。
 上述した本実施形態における第1の効果は、分散配置アルゴリズムを変更した場合の、データアイテムの配置がどの程度適切になるかの予測を得ることが可能になる点である。
 その理由は、アクセス場所決定部3010が再配置ノード対応情報430を生成し、効果予測部3080が、再配置ノード対応情報430に基づいて、候補アルゴリズムに対応する配置適切度を算出するようにしたからである。
 上述した本実施形態における第2の効果は、オペレータが適切な分散配置アルゴリズムを選択することを、可能にする点である。
 その理由は、効果通知部3081が、配置適切度と対応する分散配置アルゴリズムIDとに基づいて、利用者判断情報440を生成し、表示させるようにしたからである。
 [第3の実施形態]
 次に、本発明の第3の実施形態について図面を参照して詳細に説明する。
 図19は、本実施形態に係る分散アクセスノード33の構成を示すブロック図である。図19を参照すると、本実施形態に係る分散アクセスノード33は、第1の実施形態の分散アクセスノード30と比べ、効果予測部3080とアルゴリズム決定部3090と適切度監視部3091とを更に含む。また、分散アクセスノード33は、分散アクセスノード30と比べて、アクセス場所決定部3010に替えてアクセス場所決定部3210を含む。
 尚、分散アクセスノード33のハードウェア単位の構成要素は、図6に示す分散アクセスノード30及びその周辺装置のハードウェア構成の構成要素と同等である。
 効果予測部3080は、図14に示す効果予測部3080と同等である。
 アクセス場所決定部3210は、図14に示すアクセス場所決定部3210と同等である。
 適切度監視部3091は、予め定められた時期(例えば、1時間毎)に、現在の変更後アルゴリズムを候補アルゴリズムとしてアクセス場所決定部3210に出力する。尚、適切度監視部3091は、図6の入力部3073からオペレータの指示を受けた場合に、現在の変更後アルゴリズムを候補アルゴリズムとしてアクセス場所決定部3210に出力するようにしてもよい。
 次に、適切度監視部3091は、その応答として再配置ノード対応情報430を受け取り、受け取った再配置ノード対応情報430に基づいて配置適切度を算出する。次に、適切度監視部3091は、算出した配置適切度が予め定められた閾値を越えた場合(例えば、配置適切度とした散布度を表す標準偏差が、所定の値を超えた場合)、アルゴリズム決定部3090にその配置適切度を出力する。
 アルゴリズム決定部3090は、効果予測部3080が出力する配置適切度に基づいて、適切な分散配置アルゴリズムを選択する。次に、アルゴリズム決定部3090は、選択した分散配置アルゴリズムの指定をアルゴリズム変更部3020に出力する。
 次に、本実施形態の動作について、図面を参照して詳細に説明する。
 図20は、本実施形態における分散配置アルゴリズムを決定する動作を示すフローチャートである。
 分散アクセスノード33は、計時手段(不図示)のタイムアウト発生を契機に、図20の動作を開始する。
 最初に、再配置情報生成部3050は、アクセス履歴情報400を参照して再配置情報410を生成する(ステップF101)。
 次に、適切度監視部3091は、現在の変更後アルゴリズムを候補アルゴリズムとしてアクセス場所決定部3210に出力する(ステップF102)。
 次に、アクセス場所決定部3210は、受け取った候補アルゴリズムを用いて、再配置情報410に含まれる再配置データアイテムID411に対応するデータアイテムについて、それらが格納されるサーバストレージノード50のノード識別情報432を導出する(ステップF103)。
 次に、アクセス場所決定部3210は、再配置情報410の再配置データアイテムID411それぞれに導出したノード識別情報432を対応付けた再配置ノード対応情報430を、適切度監視部3091に出力する(ステップF104)。
 次に、適切度監視部3091は、受け取った再配置ノード対応情報430に基づいて配置適切度を算出する(ステップF105)。
 次に、適切度監視部3091は、算出した配置適切度が予め定められた閾値を越えたか否かを判定する(ステップF106)。閾値を越えていない場合(ステップF106でNO)、処理は終了する。
 配置適切度が予め定められた閾値を越えた場合(ステップF106でYES)、適切度監視部3091は、算出した配置適切度を、アルゴリズム決定部3090に出力する(ステップF107)。
 ここで、予め定められた閾値は、例えば、サーバストレージノード50のいずれかのデータアイテム数と、他のサーバストレージノード50のデータアイテム数との比の値で、「2」である。或いは、予め定められた閾値は、例えば、サーバストレージノード50のいずれかへのアクセス頻度と、他のサーバストレージノード50のアクセス頻度との比の値で、「2」であってもよい。
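 尚、このデータアイテム数(或いはアクセス頻度)の比による閾値判定は、例えば次のようなPythonのスケッチで表せる。各ノードの値が1以上であることを前提とした説明用の仮定である。

```python
def exceeds_threshold(per_node_counts, threshold=2.0):
    """いずれかのノードのデータアイテム数(或いはアクセス頻度)と、
    他のノードの値との比が閾値を越えたか否かを判定するスケッチ。"""
    return max(per_node_counts) / min(per_node_counts) > threshold

# 最大25・最小9の比は約2.8であり、閾値「2」を越える
assert exceeds_threshold([10, 9, 25]) is True
```

 比が閾値を越えた場合に、より適切な分散配置アルゴリズムの選択(ステップF108以降)へ進むことになる。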
 次に、効果予測部3080は、アルゴリズム変更部3020が変更後アルゴリズムとして設定可能な、複数の分散配置アルゴリズムに対応する配置適切度を算出し、アルゴリズム決定部3090に出力する(ステップF108)。尚、ステップF108の動作は、図17のステップD103~ステップD107を参照して、容易に理解可能であるため、詳細な説明を省略する。
 次に、アルゴリズム決定部3090は、受け取った複数の分散配置アルゴリズムに対応する配置適切度に基づいて、適切な分散配置アルゴリズムを選択する(ステップF109)。アルゴリズム決定部3090は、例えば、配置適切度が現在の変更後アルゴリズムに対応する配置適切度よりよく、かつ、複数の分散配置アルゴリズムに対応する配置適切度の内で最もよい、分散配置アルゴリズムを選択する。
 次に、アルゴリズム決定部3090は、選択した分散配置アルゴリズムの指定をアルゴリズム変更部3020に出力する(ステップF110)。そして、処理は終了する。
 上述した本実施形態における効果は、現在利用している分散配置アルゴリズムよりも適切な分散配置アルゴリズムを選択し、選択した分散配置アルゴリズムを用いた、データアイテムの再配置を自律的に実施することを可能にできる点である。
 その理由は、適切度監視部3091が現在の配置適切度を監視し、この監視結果に対応して、アルゴリズム決定部3090が適切な分散配置アルゴリズムを選択し、その分散配置アルゴリズムの指定をアルゴリズム変更部3020へ通知するようにしたからである。
 [第4の実施形態]
 次に、本発明の第4の実施形態について図面を参照して詳細に説明する。
 図21は、本実施形態に係る分散アクセスノード34の構成を示すブロック図である。図21を参照すると、本実施形態の分散アクセスノード34は、再配置情報生成部3050、再配置読込部3410、再配置書出部3420及び再配置消去部3430を含む。
 図2に示す分散システム100において、分散アクセスノード30は、本実施形態の分散アクセスノード34に置き換えられてもよい。
 尚、分散アクセスノード34のハードウェア単位の構成要素は、図6に示す分散アクセスノード30とその周辺装置のハードウェア構成と同等である。
 再配置情報生成部3050は、図4に示すような、再配置データアイテムID411を1以上含む、再配置情報410を生成し、出力する。尚、再配置データアイテムID411は、サーバストレージノード50に格納されたデータアイテムの内、再配置対象のデータアイテムのデータアイテムIDである。
 再配置読込部3410は、複数のノード識別情報の内の特定のノード識別情報を導出する複数のアルゴリズムの内、第1のアルゴリズム(変更前アルゴリズム)を用いてノード識別情報を導出する。次に、再配置読込部3410は、導出したノード識別情報に対応するサーバストレージノード50から、再配置情報410に含まれる再配置データアイテムIDで特定されるデータアイテムを読み込む。
 再配置書出部3420は、複数のアルゴリズムの内の第2のアルゴリズム(変更後アルゴリズム)を用いて導出したノード識別情報に対応するサーバストレージノード50に、再配置データアイテムIDで特定されるデータアイテムを書き出す。
 再配置消去部3430は、変更前アルゴリズムを用いて導出されたノード識別情報に対応するサーバストレージノード50から、再配置データアイテムIDで特定されるデータアイテムを消去する。
 上述した本実施形態における効果は、分散システムの運用中においても、分散配置アルゴリズムをダイナミックに変更することが可能になる点である。
 その理由は、以下のような構成を含むからである。即ち、第1に再配置情報生成部3050が、再配置情報を生成する。第2に、再配置読込部3410が、変更前アルゴリズムを用いて導出したノード識別情報に対応するサーバストレージノード50から、再配置情報410に含まれる再配置データアイテムIDで特定されるデータアイテムを読み込む。第3に、再配置書出部3420が、変更後アルゴリズムを用いて導出したノード識別情報に対応するサーバストレージノード50に、再配置データアイテムIDで特定されるデータアイテムを書き出す。第4に、再配置消去部3430が、変更前アルゴリズムを用いて導出されたノード識別情報に対応するサーバストレージノード50から、再配置データアイテムIDで特定されるデータアイテムを消去する。
 以上、各実施の形態を参照して本発明を説明したが、本発明は上記実施の形態に限定されるものではない。本発明の構成や詳細には、本発明のスコープ内で当業者が理解しえる様々な変更をすることができる。
 例えば、以上の各実施形態で説明した各構成要素は、必ずしも個々に独立した存在である必要はない。例えば、各構成要素は、複数の構成要素が1個のモジュールとして実現されたり、一つの構成要素が複数のモジュールで実現されたりしてもよい。また、各構成要素は、ある構成要素が他の構成要素の一部であったり、ある構成要素の一部と他の構成要素の一部とが重複していたり、といったような構成であってもよい。
 また、以上説明した各実施形態では、複数の動作をフローチャートの形式で順番に記載してあるが、その記載の順番は複数の動作を実行する順番を限定するものではない。このため、各実施形態を実施するときには、その複数の動作の順番は内容的に支障しない範囲で変更することができる。
 更に、以上説明した各実施形態では、複数の動作は個々に相違するタイミングで実行されることに限定されない。例えば、ある動作の実行中に他の動作が発生したり、ある動作と他の動作との実行タイミングが部分的に乃至全部において重複していたりしていてもよい。
 更に、以上説明した各実施形態では、ある動作が他の動作の契機になるように記載しているが、その記載はある動作と他の動作の全ての関係を限定するものではない。このため、各実施形態を実施するときには、その複数の動作の関係は内容的に支障のない範囲で変更することができる。また各構成要素の各動作の具体的な記載は、各構成要素の各動作を限定するものではない。このため、各構成要素の具体的な各動作は、各実施形態を実施する上で機能的、性能的、その他の特性に対して支障をきたさない範囲内で変更されて良い。
 この出願は、2011年7月1日に出願された日本出願特願2011−147519を基礎とする優先権を主張し、その開示の全てをここに取り込む。
Next, embodiments of the present invention will be described in detail with reference to the drawings. In the drawings and the description of each embodiment, components having the same function are denoted by the same reference signs.
[First Embodiment]
FIG. 1 is a block diagram showing the configuration of the distributed access node 30 according to the first embodiment of the present invention. Referring to FIG. 1, the distributed access node 30 includes an access location determination unit 3010, an algorithm change unit 3020, a relocation execution unit 3030, an access history collection unit 3040, a relocation information generation unit 3050, a load monitoring unit 3060, and an input/output execution unit 3031.
FIG. 2 is a block diagram showing a configuration of the distributed system 100 including the distributed access node 30. Referring to FIG. 2, the distributed system 100 includes a client node 10, a distributed access node 30, a distributed system network 40, a server storage 50 and a storage 60.
The client node 10 is connected to the distributed access node 30 via the network 20.
In FIG. 2, there are three client nodes 10, server storage nodes 50, and storages 60, but any number of client nodes 10, server storage nodes 50, and storages 60 may be used.
In the configuration shown in FIG. 2, the distributed system 10 generally operates as follows.
The client node 10 using the distributed system transmits a data item read / write request including at least the data item ID to the distributed access node 30 via the network 20. The data item ID is also called data item identification information. A request for reading and writing data items is also called an input / output request.
When receiving the input / output request, the distributed access node 30 uniquely determines the data item storage location (server storage node 50) using a specific distributed arrangement algorithm. Subsequently, the distributed access node 30 transmits an access request to the server storage node 50 that is the determined storage location via the distributed system network 40 based on the received input / output request. Based on the received access request, the server storage node 50 reads or writes data items from / to the storage 60 managed by the server storage node 50.
As described above, the distributed access node 30 uniquely determines the storage location of the data item using a specific distributed arrangement algorithm.
The specific distributed arrangement algorithm may be an arbitrary distributed arrangement algorithm. Therefore, a specific distributed arrangement on the distributed system network 40 is arbitrary. Here, the specific distributed arrangement is how to determine the arrangement of the server storage nodes 50. For example, a specific distributed arrangement may be an arrangement that is multiplexed, hierarchized or virtualized between the server storage nodes 50.
Further, the server storage node 50 may have a cache (not shown) in order to reduce a delay when accessing the storage 60. The server storage node 50 may read or write data items to / from the cache if there is an access request target data item in the cache.
The server storage node 50 may control the storage 60 having an arbitrary configuration using an arbitrary file system (not shown).
Next, each component of the distributed access node 30 will be described in detail. Note that the components shown in FIG. 1 are not hardware components but functional units.
The algorithm changing unit 3020 sets the distributed placement algorithm in the access location determining unit 3010 when the distributed access node 30 is initially set and when the designation of the distributed placement algorithm is received.
For example, the algorithm changing unit 3020 sets a predetermined distributed arrangement algorithm in the access location determining unit 3010 when the distributed access node 30 is initially set. For example, the algorithm changing unit 3020 receives a designation of a distributed arrangement algorithm from an operator via an input unit (not shown), and sets a distributed arrangement algorithm corresponding to this designation as a predetermined distributed arrangement algorithm. For example, the algorithm changing unit 3020 receives designation of a distributed arrangement algorithm from an external server (not shown) or another module (not shown), and sets the distributed arrangement algorithm corresponding to this designation as a predetermined distributed arrangement algorithm. Also good.
The algorithm changing unit 3020 holds, for example, a distributed arrangement program corresponding to each distributed arrangement algorithm in a storage unit (not shown). Then, the algorithm changing unit 3020 downloads and sets a distributed arrangement program corresponding to a predetermined distributed arrangement algorithm to the access location determination unit 3010 among the distributed arrangement programs held.
The algorithm changing unit 3020 notifies the access location determining unit 3010 of identification information (hereinafter referred to as distributed allocation algorithm ID) of the distributed allocation algorithm to be used, and sets the distributed allocation algorithm in the access location determining unit 3010. Also good. In this case, the access location determination unit 3010 may be operable with a plurality of types of distributed arrangement algorithms that uniquely determine the server storage node 50 to be accessed. Then, the access location determination unit 3010 operates with the distributed placement algorithm corresponding to the notified distributed placement algorithm ID among the plurality of distributed placement algorithms, and uniquely determines the server storage node 50 to be accessed.
Note that the algorithm changing unit 3020 may acquire and hold the distributed arrangement program by means not shown (for example, means for acquiring a program from a server not shown via the network 20).
The access history collection unit 3040 collects the data item IDs of the data items that have been requested (accessed) from the client node 10 as access history information.
FIG. 3 is a diagram illustrating an example of the access history information 400. Referring to FIG. 3, the access history information 400 includes a data item ID 401.
Note that the access history collection unit 3040 may collect, for example, the data item ID of the data item that has requested access to the server storage node 50 as the access history information 400.
Further, the access history collection unit 3040 may collect other information as access history information instead of the data item ID 401. In this case, the other information (for example, the hash value of the data item) is information that allows each distributed arrangement algorithm to identify a specific data item as the same data item. That is, the access history information includes at least identification information of some data item that identifies the data item stored in the server storage node 50.
Further, the access history collection unit 3040 may also collect, as access history information, arbitrary information other than the data item identification information. The arbitrary information is, for example, the time at which the input/output request was made, the time at which the access request was made, the type of the input/output request (for example, read or write), the size of the data item, or the corresponding distributed placement algorithm ID.
The rearrangement information generation unit 3050 generates and outputs rearrangement information including one or more rearrangement data item IDs (also referred to as rearrangement data item identification information). A rearrangement data item ID is the data item ID 401 of a data item stored in the server storage node 50 that is to be relocated.
FIG. 4 is a diagram illustrating an example of the rearrangement information 410. Referring to FIG. 4, the rearrangement information 410 includes a rearrangement data item ID 411.
The rearrangement information generation unit 3050 counts the occurrences of each data item ID 401 included in the access history information 400 and generates rearrangement information 410 in which the data item IDs 401 are arranged as rearrangement data item IDs 411 in descending order of their counts.
For example, the rearrangement information generation unit 3050 may generate rearrangement information 410 in which data item IDs included in access history information are arranged as rearrangement data item IDs 411 in descending order of the number of accesses per unit time. The rearrangement information generation unit 3050 may generate rearrangement information 410 in which, for example, data item IDs included in the access history information are arranged as rearrangement data item IDs 411 in the order of recent access. In these cases, the access history information includes at least one of the time when the input / output request is made and the time when the access request is made, corresponding to the data item identification information.
Furthermore, the rearrangement information generation unit 3050 may generate rearrangement information 410 in which the rearrangement data item IDs 411 are arranged in an order determined based on, for example, any of the type of input/output request, the size of the data item, and the corresponding distributed placement algorithm ID. In this case, the access history information includes at least the corresponding information.
The rearrangement information generation unit 3050 may, for example, acquire the data item IDs of the data items stored in the server storage node 50 from the server storage node 50, and may use the acquired data item IDs as the rearrangement data item IDs 411. In this case, the distributed access node 30 need not include the access history collection unit 3040.
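The counting and ordering performed by the rearrangement information generation unit 3050 can be sketched as follows (a minimal illustration in Python; the function name and data shapes are our assumptions, not part of the specification):

```python
from collections import Counter

def generate_relocation_info(access_history):
    """Count how often each data item ID appears in the access history
    and return the IDs ordered from most to least accessed, as in the
    rearrangement information 410."""
    counts = Counter(access_history)
    # most_common() yields (id, count) pairs in descending order of count
    return [item_id for item_id, _ in counts.most_common()]
```

Given the history `["DATA1", "DATA2", "DATA1", "DATA3", "DATA1", "DATA2"]`, this sketch yields `["DATA1", "DATA2", "DATA3"]`.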
The rearrangement execution unit 3030 acquires one rearrangement data item ID 411 from the rearrangement information 410. Subsequently, the rearrangement execution unit 3030 outputs to the access location determination unit 3010 a rearrangement read request that includes the acquired rearrangement data item ID 411 and requests reading of the data item specified by that ID.
Further, the rearrangement execution unit 3030 outputs a rearrangement write request for requesting writing of the data item read in response to the rearrangement read request to the access location determination unit 3010. The rearrangement writing request may include the corresponding rearrangement data item ID 411.
In addition, the rearrangement execution unit 3030 outputs to the access location determination unit 3010 a post-relocation erasure request that requests erasure of the data item from the server storage node 50 from which the data item was read in response to the relocation read request. The post-relocation erasure request may include the corresponding rearrangement data item ID 411.
The input / output execution unit 3031 receives an input / output request including at least a data item ID and information indicating whether the request is a read request or a write request.
The input / output execution unit 3031 executes the following processes (1) to (4) in response to the input / output request.
(1) When the received input/output request is a request to read a data item and the rearrangement information 410 includes a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input/output execution unit 3031 performs the following processing.
First, the input / output execution unit 3031 outputs a rearrangement read request including the rearrangement data item ID 411 to the access location determination unit 3010.
Next, the input / output execution unit 3031 outputs a relocation writing request to the access location determination unit 3010.
Next, the input / output execution unit 3031 outputs a post-relocation erasure request to the access location determination unit 3010.
Next, the input / output execution unit 3031 transmits an input / output response including the data item read in response to the relocation read request to the client node 10 that issued the input / output request.
(2) When the received input/output request is a request to read a data item and the rearrangement information 410 does not include a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input/output execution unit 3031 performs the following processing.
The input/output execution unit 3031 outputs to the access location determination unit 3010 a normal read request that includes the data item ID and requests reading of the data item specified by that data item ID.
Next, the input / output execution unit 3031 transmits an input / output response including the data item read in response to the normal read request to the client node 10 that issued the input / output request.
(3) When the received input/output request is a request to write a data item and the rearrangement information 410 includes a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input/output execution unit 3031 performs the following processing. Note that a write issued when the rearrangement information 410 contains such a rearrangement data item ID 411 is a write that overwrites the data item.
First, the input / output execution unit 3031 outputs a rearrangement write request for requesting writing of a data item corresponding to the received input / output request to the access location determination unit 3010, including the corresponding rearrangement data item ID 411.
Next, the input/output execution unit 3031 outputs a post-relocation erasure request to the access location determination unit 3010. The post-relocation erasure request includes the corresponding rearrangement data item ID 411 and requests that the data item specified by that ID be erased from the server storage node 50 derived by the pre-change algorithm described later.
(4) When the received input/output request is a request to write a data item and the rearrangement information 410 does not include a rearrangement data item ID 411 having the same value as the data item ID of that data item, the input/output execution unit 3031 performs the following processing.
The input / output execution unit 3031 outputs a normal write request for requesting writing of the data item specified by the data item ID to the access location determination unit 3010, including the data item ID.
The processing in the input / output execution unit 3031 is as described above.
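The four cases handled by the input/output execution unit 3031 can be summarized in a small dispatch sketch (Python; the request-name strings are shorthand labels of ours for the requests described above):

```python
def handle_io_request(item_id, is_read, relocation_ids):
    """Return the sequence of internal requests the input/output
    execution unit issues for one input/output request, per cases
    (1)-(4): an item still listed in the relocation information is
    served via relocation read/write plus a post-relocation erasure;
    otherwise a normal request is issued."""
    if item_id in relocation_ids:
        if is_read:
            # case (1): relocate the item while serving the read
            return ["relocation_read", "relocation_write", "post_relocation_erase"]
        # case (3): the overwrite itself places the item at the new location
        return ["relocation_write", "post_relocation_erase"]
    if is_read:
        return ["normal_read"]   # case (2)
    return ["normal_write"]      # case (4)
```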
When the distributed placement algorithm is set by the algorithm changing unit 3020, the access location determining unit 3010 holds this as the changed algorithm (second algorithm).
In addition, the access location determination unit 3010 operates as follows when the distributed placement algorithm is newly set by the algorithm change unit 3020 in a state where the changed algorithm is already held. First, the access location determination unit 3010 sets the post-change algorithm already held as the pre-change algorithm (first algorithm). Subsequently, the access location determination unit 3010 holds the newly set distributed arrangement algorithm as a new changed algorithm.
For example, the access location determination unit 3010 receives (is set with) a program realizing the distributed placement algorithm from the algorithm change unit 3020, and holds the program, together with its program name, as the post-change algorithm and the information identifying it. FIG. 5 is a diagram illustrating an example of the distributed arrangement algorithm table 420. Referring to FIG. 5, the distributed arrangement algorithm table 420 holds a post-change program name 421 identifying the program that realizes the post-change algorithm and a pre-change program name 422 identifying the program that realizes the pre-change algorithm.
When the relocation processing using the post-change algorithm is completed, the access location determination unit 3010 clears the pre-change program name 422 to "0" and deletes the corresponding distributed placement program.
In addition, when the access location determination unit 3010 receives a relocation read request, it derives, using the pre-change algorithm, the node identification information (corresponding to first node identification information) of the node to which the access request corresponding to the relocation read request is to be transmitted. Subsequently, the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 (first node) corresponding to the derived node identification information.
In addition, when the access location determination unit 3010 receives a relocation write request, it derives, using the post-change algorithm, the node identification information (second node identification information) of the node to which the access request corresponding to the relocation write request is to be transmitted. Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 (second node) corresponding to the derived node identification information.
In addition, when the access location determination unit 3010 receives a post-relocation erasure request, it derives, using the pre-change algorithm, the node identification information (first node identification information) of the node to which the access request corresponding to the post-relocation erasure request is to be transmitted. Subsequently, the access location determination unit 3010 erases the data item specified by the relocation data item ID 411 included in the post-relocation erasure request from the server storage node 50 (first node) corresponding to the derived node identification information.
When receiving a normal read request, the access location determination unit 3010 derives, using the post-change algorithm, the node identification information of the node to which the access request corresponding to the normal read request is to be transmitted. Subsequently, the access location determination unit 3010 reads the data item specified by the data item ID included in the normal read request from the server storage node 50 corresponding to the derived node identification information.
In addition, when receiving a normal write request, the access location determination unit 3010 derives, using the post-change algorithm, the node identification information of the node to which the access request corresponding to the normal write request is to be transmitted. Subsequently, the access location determination unit 3010 writes the data item specified by the data item ID included in the normal write request to the server storage node 50 corresponding to the derived node identification information.
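Which algorithm the access location determination unit 3010 applies to each request type can be tabulated as follows (a sketch; the string labels are ours, and each distributed placement algorithm is modeled as a function from a data item ID to node identification information):

```python
# Relocation reads and post-relocation erasures target the old location
# (pre-change algorithm); relocation writes and normal reads/writes
# target the new location (post-change algorithm).
ALGORITHM_FOR_REQUEST = {
    "relocation_read": "pre",
    "post_relocation_erase": "pre",
    "relocation_write": "post",
    "normal_read": "post",
    "normal_write": "post",
}

def derive_node(request_type, item_id, pre_change, post_change):
    """Derive the node identification information for one request."""
    algorithm = pre_change if ALGORITHM_FOR_REQUEST[request_type] == "pre" else post_change
    return algorithm(item_id)
```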
The distributed arrangement algorithm may be, for example, the following algorithm.
In the initial setting procedure of this distributed placement algorithm, the access location determination unit 3010 assigns, for example, serial numbers "0", "1", and "2" to the node identification information of the three server storage nodes 50 shown in FIG. 1.
The access location determination unit 3010 operates as follows in the first to fourth procedures for deriving node identification information corresponding to the data item ID in the distributed placement algorithm.
First, in the first procedure, the access location determination unit 3010 converts the data item ID included in the input / output request into a numerical value. For example, the access location determination unit 3010 converts the character code of the data item ID into a hexadecimal number.
Next, in the second procedure, the access location determination unit 3010 calculates the remainder obtained by dividing the numerical value obtained from the data item ID by "3 (the number of server storage nodes 50)".
Next, in the third procedure, the access location determination unit 3010 determines the calculated remainder as the serial number of the server storage node 50 that stores the data item specified by the data item ID.
Next, in the fourth procedure, the access location determination unit 3010 derives node identification information corresponding to the determined serial number.
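Under the assumption that "converting the character code into a hexadecimal number" means interpreting the ID's character codes as one integer, the four procedures above can be sketched as:

```python
def modulo_placement(item_id, node_ids):
    """Derive the node for a data item ID by the modulo algorithm:
    (1) convert the ID's character codes to a number, (2) take the
    remainder modulo the node count, (3) treat the remainder as the
    node's serial number, (4) map it to node identification info."""
    numeric = int.from_bytes(item_id.encode("ascii"), "big")   # procedure 1
    serial = numeric % len(node_ids)                           # procedures 2-3
    return node_ids[serial]                                    # procedure 4

nodes = ["node-0", "node-1", "node-2"]
print(modulo_placement("DATA1", nodes))  # -> node-1
```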
The distributed placement algorithm may also be, for example, the following algorithm.
In the initial setting procedure of this distributed placement algorithm, the access location determination unit 3010 associates the values that can be taken by the hash value calculated by the hash function specified by the algorithm with the node identification information of the server storage nodes 50. The hash function is, for example, CRC (Cyclic Redundancy Check) 16. Specifically, when the specified hash function is CRC16, the values that the hash value can take range from #0000 to #FFFF (the four characters following "#" are hexadecimal digits). The access location determination unit 3010 divides this range among, for example, the three server storage nodes 50 shown in FIG. 1. That is, the access location determination unit 3010 associates #0000 to #5555, #5556 to #AAAA, and #AAAB to #FFFF with the node identification information of the first to third server storage nodes 50, respectively, as their shares of the hash values.
The access location determination unit 3010 operates as follows in the first to third procedures for deriving the node identification information corresponding to a data item ID in this distributed placement algorithm.
First, in the first procedure, the access location determination unit 3010 calculates the hash value of the data item ID included in the input/output request using the hash function. When the data item ID is "DATA1" and the hash function is CRC16, the access location determination unit 3010 calculates, for example, #0C9E.
Next, in the second procedure, the access location determination unit 3010 determines which node's share of hash values includes the hash value of the data item ID. For example, the access location determination unit 3010 determines that #0C9E is included in #0000 to #5555.
Next, in the third procedure, the access location determination unit 3010 derives node identification information corresponding to the determined shared hash value range.
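The hash-share lookup can be sketched as follows. Note that Python's standard library offers `binascii.crc_hqx` (CRC-16-CCITT) rather than whichever CRC16 variant the text assumes, so concrete hash values (such as #0C9E) may differ; the share boundaries follow the example above.

```python
import binascii
from bisect import bisect_left

# Upper bound of each node's share of hash values, per the example:
# node 0: #0000-#5555, node 1: #5556-#AAAA, node 2: #AAAB-#FFFF.
SHARE_UPPER_BOUNDS = [0x5555, 0xAAAA, 0xFFFF]

def node_index_for_hash(h):
    """Procedure 2: find which share range contains the hash value."""
    return bisect_left(SHARE_UPPER_BOUNDS, h)

def hash_range_placement(item_id, node_ids):
    """Procedures 1-3: hash the data item ID with a 16-bit CRC and map
    the hash value to the node owning the containing share range."""
    h = binascii.crc_hqx(item_id.encode("ascii"), 0)  # value in 0..0xFFFF
    return node_ids[node_index_for_hash(h)]
```

For instance, a hash value of #0C9E falls in #0000 to #5555 and thus selects the first node.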
The load monitoring unit 3060 monitors the load state of the distributed access node 30 and outputs the monitoring result. The load state is, for example, a usage rate of a CPU (Central Processing Unit, not shown) or a usage rate of a memory (not shown) of the distributed access node 30.
The function by which the relocation execution unit 3030 outputs a relocation read request and the data item specified by the relocation data item ID 411 included in that request is read from the server storage node 50 corresponding to the node identification information derived by the access location determination unit 3010 using the pre-change algorithm is also referred to as the relocation reading means.
Similarly, the function by which the relocation execution unit 3030 outputs a relocation write request and the data item specified by the relocation data item ID 411 included in that request is written to the server storage node 50 corresponding to the node identification information derived by the access location determination unit 3010 using the post-change algorithm is also referred to as the relocation writing means.
Further, the function by which the relocation execution unit 3030 outputs a post-relocation erasure request and the data item specified by the relocation data item ID 411 included in that request is erased from the server storage node 50 corresponding to the node identification information derived by the access location determination unit 3010 using the pre-change algorithm is also referred to as the relocation erasure means.
This completes the description of each component of the functional unit of the distributed access node 30.
Next, components in units of hardware of the distributed access node 30 will be described.
FIG. 6 is a diagram illustrating a hardware configuration of the distributed access node 30 and its peripheral devices in the present embodiment. As illustrated in FIG. 6, the distributed access node 30 includes a CPU 3070, a storage unit 3071, a storage device 3072, an input unit 3073, an output unit 3074, and a communication unit 3075. The CPU 3070 controls the overall operation of the distributed access node 30 according to the present embodiment by operating an operating system (not shown). Further, the CPU 3070 reads a program and data into the storage unit 3071 from, for example, a nonvolatile recording medium attached to the storage device 3072. FIG. 22 is a block diagram illustrating an example of a nonvolatile storage medium 3077 in which a program is recorded. The CPU 3070 executes various processes as each unit shown in FIG. 1 according to the read program and based on the read data.
Note that the CPU 3070 may download a program and data to the storage unit 3071 from an external computer (not shown) connected to a communication network (not shown).
The storage unit 3071 stores programs and data.
The storage device 3072 is, for example, an optical disk, a flexible disk, a magneto-optical disk, an external hard disk, or a semiconductor memory, and includes a nonvolatile storage medium. The storage device 3072 records the program so that it can be read by a computer. The storage device 3072 may also record data so as to be readable by a computer.
The input unit 3073 is realized by, for example, a mouse, a keyboard, a built-in key button, and the like, and is used for an input operation. The input unit 3073 is not limited to a mouse, a keyboard, and a built-in key button, but may be a touch panel, an accelerometer, a gyro sensor, a camera, or the like.
The output unit 3074 is realized by a display, for example, and is used to check the output.
The communication unit 3075 realizes the interface between the distributed access node 30 and the network 20. The communication unit 3075 is included as part of the access location determination unit 3010, the algorithm change unit 3020, and the input/output execution unit 3031.
This completes the description of each hardware component of the distributed access node 30.
As described above, the functional blocks shown in FIG. 1 are realized by the hardware configuration shown in FIG. 6. However, the means for realizing each unit included in the distributed access node 30 is not limited to the above. In other words, the distributed access node 30 may be realized by a single physically integrated device, or by two or more physically separated devices connected by wire or wirelessly.
Further, a recording medium (or storage medium) 3077 in which the above-described program is recorded may be supplied to the distributed access node 30, and the distributed access node 30 may read and execute the program stored in the recording medium. That is, this embodiment includes an embodiment of a recording medium that transitorily or non-transitorily stores the program executed by the distributed access node 30.
Next, the operation of this embodiment will be described in detail with reference to the drawings.
FIGS. 7 and 8 are flowcharts showing the operation when an input/output request is received in the normal state in the present embodiment.
The normal state is either a state in which a distributed placement algorithm was set by the algorithm changing unit 3020 at the initial setup of the distributed access node 30 and no new distributed placement algorithm has been set since, or a state in which a new distributed placement algorithm was set by the algorithm changing unit 3020, the relocation process using that algorithm has completed, and no further new distributed placement algorithm has yet been set. In the normal state, therefore, the rearrangement information 410 contains no rearrangement data item IDs 411, there is no pre-change algorithm, and no relocation process is executing. The state in which the relocation process is executing is hereinafter referred to as the relocation state.
The distributed access node 30 starts this operation when receiving an input / output request transmitted by any of the client nodes 10 to read / write data items.
First, the input/output execution unit 3031 confirms whether or not the node is in the normal state (A101). For example, the input/output execution unit 3031 checks whether the rearrangement information 410 contains no rearrangement data item ID 411 (normal state) or contains one or more (relocation state). Alternatively, the input/output execution unit 3031 may refer to the distributed arrangement algorithm table 420 and confirm whether both the pre-change algorithm and the post-change algorithm exist (relocation state) or only the post-change algorithm exists (normal state).
If the node is not in the normal state (relocation state; NO in A101), the process proceeds to step C103 in FIG. 11.
If the node is in the normal state (YES in A101), the input/output execution unit 3031 determines whether the received input/output request requests reading or writing (step A102). If the input/output request requests reading (YES in step A102), the process proceeds to A103. If the input/output request requests writing (NO in step A102), the process proceeds to A110.
In A103, the input / output execution unit 3031 outputs a normal read request including the data item ID included in the received input / output request to the access location determination unit 3010 (step A103).
Next, the access location determination unit 3010 derives the node identification information of the server storage node 50 that outputs the access request corresponding to the received normal read request using the post-change algorithm (step A104).
Next, the access location determination unit 3010 outputs an access request corresponding to the received normal read request to the server storage node 50 corresponding to the derived node identification information (step A105).
Next, the input / output execution unit 3031 receives the data item corresponding to the access request output by the access location determination unit 3010 (step A106).
Next, the input / output execution unit 3031 transmits an input / output response including the read data item to the client node 10 that issued the input / output request (step A107). Then, the process proceeds to step A120.
In step A110, the input / output execution unit 3031 outputs a normal write request including the data item ID included in the received input / output request to the access location determination unit 3010 (step A110).
Next, the access location determination unit 3010 derives node identification information of the server storage node 50 that outputs an access request corresponding to the received normal write request using the post-change distributed arrangement algorithm (step A111).
Next, the access location determination unit 3010 outputs an access request corresponding to the received normal write request to the server storage node 50 corresponding to the derived node identification information (step A112). Then, the process proceeds to step A120.
In step A120, the access history collection unit 3040 records the data item ID corresponding to the output access request in the access history information 400 (step A120). Then, the process ends.
The above is the description of the operation of the distributed access node 30 when an input / output request is received in the normal state.
FIG. 9 and FIG. 10 are flowcharts showing the operation of changing the distributed placement algorithm and rearranging the data items in the present embodiment.
The distributed access node 30 starts this operation when receiving the designation of the newly set distributed arrangement algorithm from the operator via the input unit 3073 shown in FIG.
The algorithm changing unit 3020 sets the distributed location algorithm corresponding to the designation of the distributed location algorithm newly received from the operator in the access location determining unit 3010 (step B101). Specifically, the algorithm changing unit 3020 downloads the distributed arrangement program corresponding to the received distributed arrangement algorithm designation to the access location determining unit 3010. At the same time, the algorithm changing unit 3020 outputs the program name of the distributed arrangement program to the access location determining unit 3010.
Next, the access location determination unit 3010 receives the setting of the distributed placement algorithm. Specifically, upon receiving the program name, the access location determination unit 3010 copies the current post-change program name 421 into the pre-change program name 422, and writes the received program name into the post-change program name 421 (step B102).
Next, the rearrangement information generation unit 3050 refers to the access history information 400 as shown in FIG. 3 and generates rearrangement information 410 as shown in FIG. 4 (step B103).
Note that the rearrangement information generation unit 3050 may delete the referenced data item IDs 401 from the access history information 400. In that case, any data item ID 401 added to the access history information 400 after the rearrangement information 410 is generated identifies a data item stored in the server storage node 50 determined using the newly received distributed placement algorithm. Therefore, the process of relocating data items for a further change to a new distributed placement algorithm can be executed.
Next, the load monitoring unit 3060 confirms that the node is not in a high load state (step B104). For example, the load monitoring unit 3060 determines whether the CPU is in a high load state based on whether the CPU usage rate of the distributed access node 30 exceeds a predetermined threshold (for example, 50%). The load monitoring unit 3060 may also make this determination based on the result of inquiring of the server storage node 50 about its load state, or on both that result and the CPU usage rate of the distributed access node 30. If the node is in a high load state (YES in step B104), the process ends.
When not in a high load state (NO in step B104), the rearrangement execution unit 3030 acquires one rearrangement data item ID 411 from the head of the rearrangement information 410 (step B105).
Next, the rearrangement execution unit 3030 outputs a rearrangement read request for requesting reading of the data item specified by the rearrangement data item ID 411 to the access location determination unit 3010, including the acquired rearrangement data item ID 411. (Step B106).
Next, the access location determination unit 3010 receives the rearrangement read request and derives node identification information for transmitting an access request corresponding to the rearrangement read request using the pre-change algorithm (step B107).
Subsequently, the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 corresponding to the derived node identification information (step B108). Specifically, the access location determination unit 3010 transmits an access request including the relocation data item ID 411 to the server storage node 50 corresponding to the derived node identification information. Next, the access location determination unit 3010 receives the data item transmitted from the server storage node 50 in response to the output access request.
Next, the rearrangement execution unit 3030 outputs a rearrangement write request for requesting writing of the data item read in response to the rearrangement read request to the access location determination unit 3010 (step B109).
Next, the access location determination unit 3010 receives the rearrangement write request and derives node identification information for transmitting the access request corresponding to the rearrangement write request using the changed algorithm (step B110).
Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 corresponding to the derived node identification information (step B111).
Next, the rearrangement execution unit 3030 outputs to the access location determination unit 3010 a post-relocation erasure request requesting that the data item be erased from the server storage node 50 from which it was read in response to the relocation read request (step B112).
Next, the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting an access request corresponding to the post-relocation erasure request using the pre-change algorithm (step B113).
Subsequently, the access location determination unit 3010 erases the data item specified by the relocation data item ID 411 included in the post-relocation erasure request from the server storage node 50 corresponding to the derived node identification information (step B114).
Next, the rearrangement execution unit 3030 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step B115).
Next, the rearrangement execution unit 3030 checks whether or not the rearrangement data item ID 411 exists in the rearrangement information 410 (step B116). If it exists (YES in step B116), the process returns to step B105.
If it does not exist (NO in step B116), the access location determination unit 3010 clears the pre-change program name 422 in the distributed arrangement algorithm table 420 to “0” (step B117). Then, the process ends.
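The relocation loop of steps B105 to B117 can be sketched as follows, modeling each server storage node as a dict and each distributed placement algorithm as a function from a data item ID to a node index (all names are ours; the same-node guard is an addition not spelled out in the flowchart):

```python
def relocate_all(relocation_ids, pre_change, post_change, nodes):
    """Move every listed data item from the node derived by the
    pre-change algorithm to the node derived by the post-change one."""
    while relocation_ids:                    # B116: repeat while IDs remain
        item_id = relocation_ids[0]          # B105: take one ID from the head
        src = nodes[pre_change(item_id)]     # B107: derive the old location
        dst = nodes[post_change(item_id)]    # B110: derive the new location
        value = src[item_id]                 # B108: relocation read
        dst[item_id] = value                 # B111: relocation write
        if src is not dst:                   # guard: both algorithms may agree
            del src[item_id]                 # B114: post-relocation erasure
        relocation_ids.pop(0)                # B115: delete the relocation ID
    # B117: the list is empty; the pre-change algorithm can now be cleared
```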
Note that the process of generating the rearrangement information 410 from the access history information 400 in step B103 is desirably performed in parallel with, for example, steps B101, B102, and B104. Further, when the frequency of input/output requests from the client nodes 10 is high, the processing of step B103 may, for example, be performed at regular intervals independently of the processing of the flowcharts shown in FIGS. 9 and 10. In such a case, the referenced data item IDs 401 among the data item IDs 401 included in the access history information 400 may be deleted sequentially.
The above is the description of the operation of changing the distributed arrangement algorithm and rearranging the data items.
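As a non-authoritative illustration, the relocation operation described above (read each listed data item via the pre-change algorithm, write it via the post-change algorithm, then erase the old copy) can be sketched as follows. A simple sum-of-bytes modulo placement stands in for the distributed arrangement algorithms, and all function and variable names are assumptions, not elements of the claimed apparatus.

```python
# Illustrative sketch of the relocation loop (cf. steps B105-B116).
# place() stands in for a distributed arrangement algorithm: it derives
# node identification information (here, a list index) from a data item ID.

def place(item_id: str, node_count: int) -> int:
    return sum(item_id.encode()) % node_count

def relocate(nodes: list, relocation_info: list, pre_nodes: int, post_nodes: int) -> None:
    """Move every data item named in relocation_info from the node derived
    by the pre-change algorithm to the node derived by the post-change
    algorithm, erasing the old copy and the processed ID as it goes."""
    while relocation_info:
        item_id = relocation_info[0]
        src = place(item_id, pre_nodes)    # relocation read request (pre-change algorithm)
        dst = place(item_id, post_nodes)   # relocation write request (post-change algorithm)
        value = nodes[src][item_id]        # read from the old server storage node
        nodes[dst][item_id] = value        # write to the new server storage node
        if src != dst:
            del nodes[src][item_id]        # post-relocation erasure request
        relocation_info.pop(0)             # delete the processed relocation data item ID
```

When relocation_info becomes empty, the loop ends, mirroring the check of step B116.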
FIGS. 11, 12 and 13 are flowcharts showing the operation when an input/output request is received in the rearrangement state (YES in step A101 in FIG. 7).
When the input/output execution unit 3031 receives an input/output request in the rearrangement state (YES in step A101 in FIG. 7), it checks whether the data item ID included in the received input/output request matches any of the rearrangement data item IDs 411 included in the rearrangement information 410 (step C103). If it matches (YES in step C103), the process proceeds to step C104. If it does not match (NO in step C103), the process proceeds to step A102 in FIG. 7.
In step C104, the input/output execution unit 3031 determines whether the received input/output request requests reading or writing (step C104). If the input/output request requests reading (YES in step C104), the process proceeds to step C105. If the input/output request requests writing (NO in step C104), the process proceeds to step C120.
In step C105, the input / output execution unit 3031 outputs a rearrangement read request including the rearrangement data item ID 411 that identifies the data item to be read to the access location determination unit 3010 (step C105).
Next, the access location determination unit 3010 receives the rearrangement read request and derives node identification information for transmitting an access request corresponding to the rearrangement read request using the pre-change algorithm (step C106).
Subsequently, the access location determination unit 3010 reads the data item specified by the relocation data item ID 411 included in the relocation read request from the server storage node 50 corresponding to the derived node identification information (step C107).
Next, the input / output execution unit 3031 outputs a rearrangement write request for requesting writing of the data item read in response to the rearrangement read request to the access location determination unit 3010 (step C108).
Next, the access location determination unit 3010 receives the rearrangement write request and derives node identification information for transmitting an access request corresponding to the rearrangement write request using the changed algorithm (step C109).
Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 corresponding to the derived node identification information (step C110).
Next, the input/output execution unit 3031 outputs a post-relocation erasure request, which requests that the data item be erased from the server storage node 50 from which it was read in response to the rearrangement read request, to the access location determination unit 3010 (step C111).
Next, the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting an access request corresponding to the post-relocation erasure request using the pre-change algorithm (step C112).
Subsequently, the access location determination unit 3010 deletes the data item specified by the relocation data item ID 411 included in the post-relocation erasure request from the server storage node 50 corresponding to the derived node identification information (step C113).
Next, the input / output execution unit 3031 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step C114).
Next, the input / output execution unit 3031 transmits an input / output response including the data item read in response to the rearrangement read request to the client node 10 that issued the input / output request (step C115). Then, the process proceeds to C130.
In step C120, the input/output execution unit 3031 outputs a relocation write request, which includes the corresponding relocation data item ID 411 and requests writing of the data item corresponding to the received input/output request, to the access location determination unit 3010 (step C120).
Next, the access location determination unit 3010 receives the relocation write request and derives node identification information for transmitting an access request corresponding to the relocation write request using the post-change algorithm (step C121).
Subsequently, the access location determination unit 3010 writes the data item specified by the relocation data item ID 411 included in the relocation write request to the server storage node 50 corresponding to the derived node identification information (step C122).
Next, the input/output execution unit 3031 outputs, to the access location determination unit 3010, a post-relocation erasure request that requests the data item to be erased from the server storage node 50 derived by the pre-change algorithm (step C123).
Next, the access location determination unit 3010 receives the post-relocation erasure request and derives node identification information for transmitting the access request corresponding to the post-relocation erasure request using the pre-change algorithm (step C124).
Subsequently, the access location determination unit 3010 deletes the data item specified by the relocation data item ID 411 included in the post-relocation erasure request from the server storage node 50 corresponding to the derived node identification information (step C125).
Next, the input / output execution unit 3031 deletes the corresponding rearrangement data item ID 411 from the rearrangement information 410 (step C126). Here, the corresponding rearrangement data item ID 411 is the rearrangement data item ID 411 included in the post-relocation erasure request. Then, the process proceeds to C130.
In step C130, the access history collection unit 3040 records the data item ID corresponding to the output access request in the access history information 400 (step C130).
The above is the description of the operation when the input / output request is received in the rearrangement state.
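The branching just described, relocating an item on the fly when it is read and redirecting a write of a pending item to the post-change node, can be illustrated with the following self-contained sketch. A hypothetical sum-of-bytes modulo placement stands in for the two distributed arrangement algorithms; all names are illustrative, not the patented implementation.

```python
# Illustrative sketch of I/O handling in the rearrangement state
# (cf. steps C103-C130): a read of a pending item relocates it on the fly;
# a write of a pending item goes straight to the post-change node and the
# stale copy on the pre-change node is erased.

def place(item_id: str, node_count: int) -> int:
    return sum(item_id.encode()) % node_count  # stand-in placement algorithm

def handle_request(nodes, pending, item_id, pre_nodes, post_nodes, write_value=None):
    """Serve one input/output request while `pending` still holds
    relocation data item IDs that have not been moved yet."""
    if item_id not in pending:                 # step C103: not pending -> normal path
        node = nodes[place(item_id, post_nodes)]
        if write_value is None:
            return node[item_id]
        node[item_id] = write_value
        return write_value
    src = place(item_id, pre_nodes)
    dst = place(item_id, post_nodes)
    if write_value is None:                    # steps C105-C115: relocating read
        value = nodes[src][item_id]
    else:                                      # steps C120-C126: redirected write
        value = write_value
    nodes[dst][item_id] = value                # write to the post-change node
    if src != dst:
        nodes[src].pop(item_id, None)          # post-relocation erasure
    pending.discard(item_id)                   # steps C114/C126
    return value
```

Once `pending` is empty, every request follows the normal path, which corresponds to the state after the rearrangement is complete.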
The distributed access node 30 may use three or more distributed arrangement algorithms. In this case, the access history information may include a corresponding distributed arrangement algorithm ID in addition to the data item ID. In addition, the access location determination unit 3010 may select the pre-change algorithm used when processing the relocation read request and the post-relocation erasure request with reference to the distributed allocation algorithm ID.
In addition, in a predetermined specific situation (for example, when the CPU usage rate is higher than a predetermined value), the distributed access node 30 may perform relocation only on the data item corresponding to the data item ID included in an input/output request.
The first effect of the present embodiment described above is that the distributed arrangement algorithm can be dynamically changed even during the operation of the distributed system.
This is because the following configuration is included. That is, first, the rearrangement information generation unit 3050 generates the rearrangement information 410. Second, the rearrangement execution unit 3030 outputs a rearrangement read request, a rearrangement write request, and a post-relocation erasure request based on the rearrangement information 410. Third, in response to the rearrangement read request, the access location determination unit 3010 reads the corresponding data item from the server storage node 50 derived using the pre-change algorithm. Fourth, in response to the rearrangement write request, the access location determination unit 3010 writes the data item to the server storage node 50 derived using the post-change algorithm. Fifth, in response to the post-relocation erasure request, the access location determination unit 3010 erases the data item from the server storage node 50 derived using the pre-change algorithm. The “derived server storage node 50” here means “the server storage node 50 corresponding to the derived node identification information”.
The second effect of the present embodiment described above is that the process of changing the distributed arrangement algorithm and the process of the input / output request can be efficiently processed in parallel.
This is because the following configuration is included. That is, first, the input/output execution unit 3031 selectively outputs either a rearrangement read request, a rearrangement write request, and a post-relocation erasure request based on the rearrangement information 410, or a normal read request or a normal write request. Second, in response to the normal read request, the access location determination unit 3010 reads the corresponding data item from the server storage node 50 derived using the post-change algorithm. Third, in response to the normal write request, the access location determination unit 3010 writes the data item to the server storage node 50 derived using the post-change algorithm. The “derived server storage node 50” here means “the server storage node 50 corresponding to the derived node identification information”.
The third effect of the present embodiment described above is that a change to a third distributed arrangement algorithm can be started before the change from the first distributed arrangement algorithm to the second distributed arrangement algorithm is completed.
The reason is that the access history information includes the distributed placement algorithm ID, and the access location determination unit 3010 refers to the distributed placement algorithm ID to select the pre-change algorithm used when processing the rearrangement read request and the post-relocation erasure request.
The fourth effect of the present embodiment described above is that data item relocation processing can be executed efficiently even when one or more of the distributed access nodes 30 and the server storage nodes 50 are in a high load state.
The reason is that the distributed access node 30 executes the rearrangement only for the data item corresponding to the data item ID included in the input / output request in a predetermined specific situation.
[Second Embodiment]
Next, a second embodiment of the present invention will be described in detail with reference to the drawings.
FIG. 14 is a block diagram showing a configuration of the distributed access node 32 according to the present embodiment. Referring to FIG. 14, the distributed access node 32 according to the present embodiment further includes an effect prediction unit 3080 and an effect notification unit 3081 compared to the distributed access node 30 of the first embodiment. Also, the distributed access node 32 includes an access location determination unit 3210 instead of the access location determination unit 3010 as compared to the distributed access node 30.
The components of the distributed access node 32 in hardware units are the same as the components of the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG.
The effect prediction unit 3080 calculates a placement appropriateness that indicates how appropriate the arrangement of data items corresponding to a candidate algorithm is. The candidate algorithm is one of the plurality of distributed arrangement algorithms that may replace the current post-change algorithm (the distributed arrangement algorithm currently in use) as the new post-change algorithm; the candidate algorithms include the current post-change algorithm itself. The effect prediction unit 3080 selects a candidate algorithm, outputs it to the access location determination unit 3210, receives the rearrangement node correspondence information as a response, and calculates the placement appropriateness based on the received rearrangement node correspondence information.
FIG. 15 is a diagram showing an example of the rearrangement node correspondence information 430. Referring to FIG. 15, the rearrangement node correspondence information 430 includes rearrangement data item IDs 411 and node identification information 432 in association with each other.
The effect prediction unit 3080 calculates the distribution degree (for example, standard deviation) of the data items based on the number of data items stored in each server storage node 50, and sets this as the appropriateness of arrangement.
For example, the effect prediction unit 3080 refers to the access history information 400 to calculate the access frequency of each data item. Then, the effect prediction unit 3080 further calculates the access frequency of each server storage node 50 based on this access frequency. Then, the effect prediction unit 3080 may use the calculated access frequency as the placement appropriateness.
The effect prediction unit 3080 may score (numerize) the access frequency of each data item and the number and size of the data items stored in each server storage node 50. Then, the effect prediction unit 3080 may calculate the total value for each server storage node 50 to obtain the placement appropriateness. In such a case, the access history information 400 may include the size of the data item.
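As one concrete reading of the distribution degree mentioned above, the following sketch scores a placement by the standard deviation of per-node data item counts (lower means more even). The embodiment leaves the exact metric open, so this formula and all names are assumptions offered for illustration only.

```python
import math

def placement_appropriateness(items_per_node: dict) -> float:
    """Standard deviation of the number of data items stored on each
    server storage node; 0.0 means a perfectly even placement. To weight
    by access frequency or data item size, as the embodiment suggests,
    sum those scores per node instead of counting items."""
    counts = list(items_per_node.values())
    mean = sum(counts) / len(counts)
    return math.sqrt(sum((c - mean) ** 2 for c in counts) / len(counts))
```

Under this convention a smaller value indicates a better placement, so comparing the value for a candidate algorithm against the value for the current algorithm is a direct comparison of predicted evenness.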
The effect notification unit 3081 generates user determination information including the distributed placement algorithm and the corresponding placement appropriateness based on the placement appropriateness predicted by the effect prediction unit 3080. Next, the effect notification unit 3081 displays the generated user determination information on a display (not shown) via, for example, the output unit 3074 in FIG.
FIG. 16 is a diagram illustrating an example of the user determination information 440. Referring to FIG. 16, the user determination information 440 includes a distributed arrangement algorithm ID 441 and an arrangement appropriateness level 442.
The access location determination unit 3210 has the following functions in addition to the functions of the access location determination unit 3010 of the first embodiment. First, the access location determination unit 3210 uses the candidate algorithm to derive the node identification information 432 of the server storage node 50 in which the data item corresponding to each relocation data item ID 411 included in the relocation information 410 is stored. Next, the access location determination unit 3210 outputs the rearrangement node correspondence information 430, in which the derived node identification information 432 is associated with each rearrangement data item ID 411 of the rearrangement information 410.
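The derivation of rearrangement node correspondence information lends itself to a short sketch: apply the candidate algorithm to every relocation data item ID and record the resulting node identification information. The callable placement algorithm and all names here are assumptions for illustration.

```python
def build_correspondence(relocation_ids, candidate_algorithm, node_count):
    """Sketch of rearrangement node correspondence information 430: each
    relocation data item ID 411 mapped to the node identification
    information 432 derived by the candidate algorithm."""
    return {item_id: candidate_algorithm(item_id, node_count)
            for item_id in relocation_ids}
```

The effect prediction unit can then compute a placement appropriateness from the per-node counts in this mapping.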
Next, the operation of the present embodiment will be described in detail with reference to the drawings.
FIG. 17 is a flowchart showing the operation for calculating the placement appropriateness in this embodiment.
First, the rearrangement information generation unit 3050 generates the rearrangement information 410 with reference to the access history information 400 (step D101).
Next, the effect prediction unit 3080 outputs the current post-change algorithm as a candidate algorithm to the access location determination unit 3210 (step D102).
Next, the access location determination unit 3210 uses the received candidate algorithm to derive the node identification information 432 of the server storage node 50 in which the data item corresponding to each relocation data item ID 411 included in the relocation information 410 is stored (step D103).
Next, the access location determination unit 3210 generates the rearrangement node correspondence information 430, in which the derived node identification information 432 is associated with each rearrangement data item ID 411 of the rearrangement information 410, and outputs it to the effect prediction unit 3080 (step D104).
Next, the effect prediction unit 3080 calculates the placement appropriateness corresponding to the candidate algorithm based on the received rearrangement node correspondence information 430, and outputs it to the effect notification unit 3081 together with the corresponding distributed placement algorithm ID (step D105).
Next, the effect prediction unit 3080 determines whether there remains a distributed arrangement algorithm to be selected as a candidate algorithm from among the plurality of distributed arrangement algorithms that the algorithm change unit 3020 can set as the post-change algorithm (step D106). If there is a distributed arrangement algorithm to be selected (YES in step D106), the process proceeds to step D107. If there is none (NO in step D106), the process proceeds to step E101 in FIG. 18. Note that the effect prediction unit 3080 may terminate the selection of candidate algorithms when, for example, the ratio between the placement appropriateness calculated with the current post-change algorithm as the candidate algorithm and the placement appropriateness output in step D105 exceeds a predetermined value.
In step D107, the effect prediction unit 3080 selects one candidate algorithm and outputs it to the access location determination unit 3210 (step D107). Then, the process returns to step D103. The effect prediction unit 3080 selects, for example, the distributed arrangement algorithms held by the algorithm change unit 3020 as candidate algorithms in order. Alternatively, the effect prediction unit 3080 may select, as a candidate algorithm, a distributed arrangement algorithm input by the operator from the input unit 3073 shown in FIG. 6.

FIG. 18 is a flowchart showing the operation of the effect notification unit 3081 in this embodiment.
First, the effect notification unit 3081 generates user determination information 440 based on the received placement appropriateness and the corresponding distributed placement algorithm ID (step E101).
Next, the effect notification unit 3081 displays the generated user determination information 440 on a display (not shown) via, for example, the output unit 3074 in FIG. 6 (step E102).
The operator can confirm the displayed user determination information 440 and select an appropriate distributed arrangement algorithm via the input unit 3073 shown in FIG. 6. Then, the distributed access node 32 designates the selected distributed arrangement algorithm, executes the operations shown in FIGS. 9 and 10, and rearranges the data items.
The first effect in the present embodiment described above is that it is possible to obtain a prediction of how appropriate the arrangement of data items is when the distributed arrangement algorithm is changed.
The reason is that the access location determination unit 3210 generates the rearrangement node correspondence information 430, and the effect prediction unit 3080 calculates the placement appropriateness corresponding to the candidate algorithm based on the rearrangement node correspondence information 430.
The second effect of the present embodiment described above is that the operator can select an appropriate distributed arrangement algorithm.
The reason is that the effect notification unit 3081 generates and displays the user determination information 440 based on the placement appropriateness and the corresponding distributed placement algorithm ID.
[Third Embodiment]
Next, a third embodiment of the present invention will be described in detail with reference to the drawings.
FIG. 19 is a block diagram showing the configuration of the distributed access node 33 according to this embodiment. Referring to FIG. 19, the distributed access node 33 according to the present embodiment further includes an effect prediction unit 3080, an algorithm determination unit 3090, and an appropriateness monitoring unit 3091, as compared to the distributed access node 30 of the first embodiment. Also, the distributed access node 33 includes an access location determination unit 3210 instead of the access location determination unit 3010 as compared to the distributed access node 30.
The components of the distributed access node 33 in hardware units are the same as the components of the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG.
The effect prediction unit 3080 is equivalent to the effect prediction unit 3080 shown in FIG.
The access location determination unit 3210 is equivalent to the access location determination unit 3210 shown in FIG.
The appropriateness monitoring unit 3091 outputs the current post-change algorithm as a candidate algorithm to the access location determination unit 3210 at predetermined times (for example, every hour). The appropriateness monitoring unit 3091 may also output the current post-change algorithm to the access location determination unit 3210 as a candidate algorithm when receiving an operator instruction from the input unit 3073 shown in FIG. 6.
Next, the appropriateness monitoring unit 3091 receives the rearrangement node correspondence information 430 as a response and calculates the placement appropriateness based on it. When the calculated placement appropriateness exceeds a predetermined threshold (for example, when the standard deviation representing the degree of distribution exceeds a predetermined value), the appropriateness monitoring unit 3091 outputs the placement appropriateness to the algorithm determination unit 3090.
The algorithm determination unit 3090 selects an appropriate distributed arrangement algorithm based on the arrangement appropriateness output from the effect prediction unit 3080. Next, the algorithm determination unit 3090 outputs the designation of the selected distributed arrangement algorithm to the algorithm change unit 3020.
Next, the operation of the present embodiment will be described in detail with reference to the drawings.
FIG. 20 is a flowchart showing an operation for determining a distributed arrangement algorithm in the present embodiment.
The distributed access node 33 starts the operation of FIG. 20 when a time-out means (not shown) generates a timeout.
First, the rearrangement information generation unit 3050 generates the rearrangement information 410 with reference to the access history information 400 (Step F101).
Next, the appropriateness monitoring unit 3091 outputs the current changed algorithm as a candidate algorithm to the access location determination unit 3210 (step F102).
Next, the access location determination unit 3210 uses the received candidate algorithm to derive the node identification information 432 of the server storage node 50 in which the data item corresponding to each rearrangement data item ID 411 included in the rearrangement information 410 is stored (step F103).
Next, the access location determination unit 3210 outputs the relocation node correspondence information 430, in which the derived node identification information 432 is associated with each relocation data item ID 411 of the relocation information 410, to the appropriateness monitoring unit 3091 (step F104).
Next, the appropriateness monitoring unit 3091 calculates the placement appropriateness based on the received rearrangement node correspondence information 430 (step F105).
Next, the appropriateness monitoring unit 3091 determines whether or not the calculated placement appropriateness exceeds a predetermined threshold (step F106). If the threshold is not exceeded (NO in step F106), the process ends.
If the placement appropriateness exceeds a predetermined threshold (YES in Step F106), the appropriateness monitoring unit 3091 outputs the calculated placement appropriateness to the algorithm determining unit 3090 (Step F107).
Here, the predetermined threshold is, for example, a value of “2” for the ratio between the number of data items in any one server storage node 50 and the number of data items in another server storage node 50. Alternatively, the predetermined threshold may be, for example, a value of “2” for the ratio between the access frequency of any one server storage node 50 and the access frequency of another server storage node 50.
Next, the effect prediction unit 3080 calculates the placement appropriateness corresponding to each of the plurality of distributed placement algorithms that the algorithm change unit 3020 can set as the post-change algorithm, and outputs them to the algorithm determination unit 3090 (step F108). The operation of step F108 can be easily understood with reference to steps D103 to D107 of FIG. 17.
Next, the algorithm determination unit 3090 selects an appropriate distributed placement algorithm based on the received placement appropriateness of each of the plurality of distributed placement algorithms (step F109). For example, the algorithm determination unit 3090 selects the distributed placement algorithm whose placement appropriateness is better than that of the current post-change algorithm and is the best among those corresponding to the plurality of distributed placement algorithms.
Next, the algorithm determination unit 3090 outputs the designation of the selected distributed arrangement algorithm to the algorithm change unit 3020 (step F110). Then, the process ends.
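Steps F106 through F110 can be summarized in the following sketch, which returns the ID of the algorithm to change to, or nothing when the current placement is still within the threshold or no candidate improves on it. The convention that a lower appropriateness value is better (e.g. a standard deviation of per-node item counts) and all names are assumptions, not the claimed method.

```python
# Illustrative sketch of the decision in steps F106-F110, assuming a
# placement-appropriateness score where lower is better.

def choose_algorithm(current_id: str, appropriateness: dict, threshold: float):
    """appropriateness maps each distributed arrangement algorithm ID to its
    (predicted) placement appropriateness. Returns the ID that should become
    the new post-change algorithm, or None when no change is warranted."""
    if appropriateness[current_id] <= threshold:    # step F106: still acceptable
        return None
    best_id = min(appropriateness, key=appropriateness.get)   # step F109
    if appropriateness[best_id] < appropriateness[current_id]:
        return best_id                              # step F110: designate it
    return None
```

Returning the selected ID corresponds to outputting the designation of the distributed arrangement algorithm to the algorithm change unit 3020.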
The effect of the present embodiment described above is that a more suitable distributed arrangement algorithm than the currently used distributed arrangement algorithm is selected, and data items are rearranged autonomously using the selected distributed arrangement algorithm. It is a point that can be made possible.
The reason is that the appropriateness monitoring unit 3091 monitors the current placement appropriateness, the algorithm determining unit 3090 selects an appropriate distributed placement algorithm in accordance with the monitoring result, and the algorithm is changed to specify the distributed placement algorithm. This is because the notification to the unit 3020 is made.
[Fourth Embodiment]
Next, a fourth embodiment of the present invention will be described in detail with reference to the drawings.
FIG. 21 is a block diagram showing the configuration of the distributed access node 34 according to the present embodiment. Referring to FIG. 21, the distributed access node 34 according to the present embodiment includes a rearrangement information generation unit 3050, a rearrangement reading unit 3410, a rearrangement writing unit 3420, and a rearrangement erasing unit 3430.
In the distributed system 100 illustrated in FIG. 2, the distributed access node 30 may be replaced with the distributed access node 34 of the present embodiment.
The components of the distributed access node 34 in hardware units are the same as the hardware configuration of the distributed access node 30 and its peripheral devices shown in FIG.
The rearrangement information generation unit 3050 generates and outputs rearrangement information 410 including one or more rearrangement data item IDs 411 as illustrated in FIG. The relocation data item ID 411 is a data item ID of a data item to be relocated among the data items stored in the server storage node 50.
The rearrangement reading unit 3410 derives node identification information using a first algorithm (the pre-change algorithm) among a plurality of algorithms, each of which derives specific node identification information from among the plurality of pieces of node identification information. Next, the rearrangement reading unit 3410 reads the data item specified by the relocation data item ID included in the relocation information 410 from the server storage node 50 corresponding to the derived node identification information.
The rearrangement writing unit 3420 writes the data item specified by the relocation data item ID to the server storage node 50 corresponding to the node identification information derived using a second algorithm (the post-change algorithm) among the plurality of algorithms.
The rearrangement erasing unit 3430 erases the data item specified by the rearrangement data item ID from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm.
The effect of this embodiment described above is that the distributed arrangement algorithm can be dynamically changed even during the operation of the distributed system.
This is because the following configuration is included. That is, first, the rearrangement information generation unit 3050 generates the rearrangement information. Second, the rearrangement reading unit 3410 reads the data item specified by the relocation data item ID included in the relocation information 410 from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm. Third, the rearrangement writing unit 3420 writes the data item specified by the relocation data item ID to the server storage node 50 corresponding to the node identification information derived using the post-change algorithm. Fourth, the rearrangement erasing unit 3430 erases the data item specified by the rearrangement data item ID from the server storage node 50 corresponding to the node identification information derived using the pre-change algorithm.
Although the present invention has been described with reference to each embodiment, the present invention is not limited to the above embodiment. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
For example, each component described in each of the above embodiments does not necessarily have to be individually independent. For example, a plurality of components may be realized as one module, or one component may be realized as a plurality of modules. Further, each component may be configured such that one component is a part of another component, or a part of one component overlaps a part of another component.
Further, in each of the embodiments described above, a plurality of operations are described in order in the form of a flowchart, but the described order does not limit the order in which the plurality of operations are executed. For this reason, when each embodiment is implemented, the order of the plurality of operations can be changed within a range that does not hinder the contents.
Furthermore, in each embodiment described above, a plurality of operations are not limited to being executed at different timings. For example, another operation may occur during the execution of a certain operation, or the execution timing of a certain operation and another operation may partially or entirely overlap.
Furthermore, in each of the embodiments described above, a certain operation is described as a trigger for another operation, but that description does not limit all relationships between those operations. For this reason, when each embodiment is implemented, the relationships among the plurality of operations can be changed within a range that does not hinder the contents. Likewise, the specific description of each operation of each component does not limit that operation. For this reason, each specific operation of each component may be changed, when implementing each embodiment, within a range that does not cause a problem with respect to its functional, performance, and other characteristics.
This application claims priority based on Japanese Patent Application No. 2011-147519, filed on July 1, 2011, the entire disclosure of which is incorporated herein.
The distributed placement apparatus, distributed placement method, and program therefor according to the present invention can be applied to uses such as the operation and management of a data center. They can also be applied to uses such as the operation and management of appliance products built from a plurality of racks, computers, and storages.
DESCRIPTION OF SYMBOLS
10 Client node
20 Network
30 Distributed access node
32 Distributed access node
33 Distributed access node
34 Distributed access node
40 Distributed system network
50 Server storage node
60 Storage
400 Access history information
401 Data item ID
410 Relocation information
411 Relocation data item ID
420 Distributed placement algorithm table
421 Post-change program name
422 Pre-change program name
430 Relocation node correspondence information
432 Node identification information
440 User judgment information
441 Distributed placement algorithm ID
442 Placement appropriateness
3010 Access location determination unit
3210 Access location determination unit
3020 Algorithm change unit
3030 Relocation execution unit
3031 Input/output execution unit
3040 Access history collection unit
3050 Relocation information generation unit
3060 Load monitoring unit
3070 CPU
3071 Storage unit
3072 Storage device
3073 Input unit
3074 Output unit
3075 Communication unit
3077 Recording medium
3080 Effect prediction unit
3081 Effect notification unit
3090 Algorithm determination unit
3091 Appropriateness monitoring unit
3410 Relocation reading unit
3420 Relocation writing unit
3430 Relocation erasing unit

Claims (14)

  1.  A distributed placement apparatus comprising:
     relocation information generation means for generating and outputting relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a data item to be relocated that is stored in a node;
     relocation reading means for reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms, each of which derives specific node identification information from among a plurality of pieces of node identification information corresponding to a plurality of the nodes;
     relocation writing means for writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms; and
     relocation erasing means for erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
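The read/write/erase flow of claim 1 can be illustrated with a minimal sketch. Everything here is a hypothetical illustration, not the claimed apparatus: the two mod-hash placement functions stand in for the "first" and "second" algorithms, and in-memory dicts stand in for the nodes.

```python
import hashlib

# Hypothetical cluster: node identification information -> node storage.
nodes = {f"node-{i}": {} for i in range(5)}

def _digest(item_id: str) -> int:
    return int(hashlib.md5(item_id.encode()).hexdigest(), 16)

# Two placement algorithms, each deriving one node ID from an item ID.
def algorithm_1(item_id: str) -> str:        # "first algorithm" (before change)
    return f"node-{_digest(item_id) % 3}"    # e.g. only 3 nodes were in use

def algorithm_2(item_id: str) -> str:        # "second algorithm" (after change)
    return f"node-{_digest(item_id) % 5}"    # now spreads over all 5 nodes

def relocate(relocation_info):
    """Read, write out, and erase each relocation data item, as in claim 1."""
    for item_id in relocation_info:
        first_node = algorithm_1(item_id)          # relocation reading means
        data = nodes[first_node][item_id]
        second_node = algorithm_2(item_id)         # relocation writing means
        nodes[second_node][item_id] = data
        if second_node != first_node:              # relocation erasing means
            del nodes[first_node][item_id]

# Store some items under the first algorithm, then relocate them.
for item_id in ("a", "b", "c"):
    nodes[algorithm_1(item_id)][item_id] = f"value-{item_id}"
relocate(["a", "b", "c"])
for item_id in ("a", "b", "c"):
    assert nodes[algorithm_2(item_id)][item_id] == f"value-{item_id}"
```

After relocation, every item is reachable by applying the second algorithm to its data item ID, and the first-algorithm copy has been erased unless both algorithms happen to derive the same node.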
  2.  The distributed placement apparatus according to claim 1, further comprising:
     means for requesting the relocation reading means to read the data item specified by the relocation data item identification information when a received input/output request including at least the data item identification information is a request to read a data item and the data item identification information of the data item is included in the relocation information as the relocation data item identification information; and
     means for requesting the relocation writing means to write the data item specified by the relocation data item identification information when the input/output request is a request to write a data item and the data item identification information of the data item is included in the relocation information as the relocation data item identification information.
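The request routing of claim 2 can be sketched as follows: while an item is listed in the relocation information, reads are served from the old (first-algorithm) node and writes land on the new (second-algorithm) node. The sketch simplifies the cluster to one old node and one new node; the names and stores are hypothetical.

```python
relocation_info = {"x", "y"}          # relocation data item IDs (illustrative)
old_store = {"x": 1, "y": 2, "z": 3}  # node derived by the first algorithm
new_store = {}                        # node derived by the second algorithm

def handle_request(op: str, item_id: str, value=None):
    """Route an input/output request as in claim 2 (simplified to one
    old node and one new node; a real system derives the node per item)."""
    relocating = item_id in relocation_info
    if op == "read":
        if relocating:
            # Served via the relocation reading means (first-algorithm node).
            return old_store[item_id]
        return new_store[item_id] if item_id in new_store else old_store[item_id]
    if op == "write":
        # Writes of a relocating item go via the relocation writing means,
        # i.e. directly to the second-algorithm node.
        store = new_store if relocating else old_store
        store[item_id] = value
        return value
    raise ValueError(f"unknown operation: {op}")

assert handle_request("read", "x") == 1    # relocating item read from old node
handle_request("write", "y", 20)           # relocating item written to new node
assert new_store["y"] == 20
```

The point of the routing is that a write issued mid-relocation is never lost: it goes straight to the destination node, so the later erase of the old copy cannot discard it.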
  3.  The distributed placement apparatus according to claim 1 or 2, further comprising load monitoring means for monitoring a load state of the distributed placement apparatus itself and outputting a monitoring result,
     wherein the relocation reading means reads the data item specified by the relocation data item identification information based on the monitoring result.
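The load gating of claim 3 can be sketched as follows: the relocation reading means consults the load monitor's result and only proceeds while the apparatus is lightly loaded. The CPU metric and the 0.8 threshold are illustrative assumptions, not part of the claim.

```python
def load_monitor(cpu_utilization: float):
    """Load monitoring means: report the apparatus's own load state."""
    return {"cpu": cpu_utilization}

def read_for_relocation(item_id, store, monitoring_result, cpu_threshold=0.8):
    """Relocation reading means (claim 3): read the relocation data item
    only when the monitored load allows it; otherwise defer the read."""
    if monitoring_result["cpu"] >= cpu_threshold:
        return None          # defer relocation under high load
    return store.get(item_id)

store = {"item-a": "value-a"}
assert read_for_relocation("item-a", store, load_monitor(0.2)) == "value-a"
assert read_for_relocation("item-a", store, load_monitor(0.95)) is None
```

Deferring the read (rather than the later write or erase) keeps relocation from competing with client traffic at its most expensive step.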
  4.  The distributed placement apparatus according to any one of claims 1 to 3, wherein the relocation information generation means generates the relocation information in which the pieces of relocation data item identification information are arranged in descending order of the number of accesses per unit time to the corresponding data items, based on access history information including at least the data item identification information and a time at which the data item specified by the data item identification information was accessed.
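The ordering of claim 4 can be sketched as follows: from access history records of (data item ID, access time), count accesses within the last unit-time window and emit the most frequently accessed item IDs first. The record format is a hypothetical illustration.

```python
from collections import Counter

def generate_relocation_info(access_history, now, unit_time):
    """Return relocation data item IDs ordered by accesses per unit time
    (most frequently accessed first), as in claim 4. Each history entry
    is a (data_item_id, access_time) pair; the format is illustrative."""
    window_start = now - unit_time
    counts = Counter(
        item_id for item_id, t in access_history if t >= window_start
    )
    return [item_id for item_id, _ in counts.most_common()]

history = [
    ("item-a", 100), ("item-b", 101), ("item-a", 102),
    ("item-a", 103), ("item-b", 104), ("item-c", 50),  # item-c is stale
]
info = generate_relocation_info(history, now=105, unit_time=10)
assert info == ["item-a", "item-b"]   # 3 accesses vs 2; item-c outside window
```

Placing hot items first means the items most likely to be accessed mid-migration spend the least time in the transitional state of claim 2.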
  5.  The distributed placement apparatus according to claim 4, wherein the access history information further includes identification information of the algorithm corresponding to the data item identification information, and
     the relocation reading means and the relocation erasing means select the first algorithm based on the identification information of the algorithm.
  6.  The distributed placement apparatus according to any one of claims 1 to 5, further comprising effect prediction means for calculating and outputting, for each of the plurality of algorithms, a placement appropriateness indicating how appropriate the placement of the data items would be under that algorithm.
  7.  The distributed placement apparatus according to claim 6, further comprising algorithm determination means for selecting the second algorithm based on the placement appropriateness.
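One way to realize claims 6 and 7 together can be sketched as follows: score each candidate algorithm by how evenly it would spread the data items, and have the algorithm determination means pick the highest score. Evenness is only one possible appropriateness measure, and the salted-hash placement is a hypothetical stand-in for the candidate algorithms.

```python
import hashlib
from collections import Counter

def node_of(algorithm_id: str, item_id: str, n_nodes: int) -> int:
    """Hypothetical placement: hash the item ID, salted per algorithm."""
    h = hashlib.md5(f"{algorithm_id}:{item_id}".encode()).hexdigest()
    return int(h, 16) % n_nodes

def placement_appropriateness(algorithm_id, item_ids, n_nodes):
    """Effect prediction means (claim 6): score an algorithm by the
    evenness of the placement it would produce (1.0 = perfectly even)."""
    load = Counter(node_of(algorithm_id, i, n_nodes) for i in item_ids)
    heaviest = max(load.values())
    ideal = len(item_ids) / n_nodes
    return ideal / heaviest

def choose_algorithm(algorithm_ids, item_ids, n_nodes):
    """Algorithm determination means (claim 7): select the second
    algorithm as the candidate with the highest appropriateness."""
    return max(
        algorithm_ids,
        key=lambda a: placement_appropriateness(a, item_ids, n_nodes),
    )

items = [f"item-{i}" for i in range(200)]
best = choose_algorithm(["alg-A", "alg-B", "alg-C"], items, n_nodes=4)
assert best in ("alg-A", "alg-B", "alg-C")
```

Because the score is computed without moving any data, the effect prediction means can evaluate every candidate before the relocation of claim 1 is triggered.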
  8.  The distributed placement apparatus according to claim 7, further comprising appropriateness monitoring means for monitoring the placement appropriateness corresponding to the second algorithm and notifying the algorithm determination means of a result of the monitoring.
  9.  A distributed placement method in which a computer:
     generates and outputs relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a data item to be relocated that is stored in a node;
     reads the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms, each of which derives specific node identification information from among a plurality of pieces of node identification information corresponding to a plurality of the nodes;
     writes the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms; and
     erases the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
  10.  The distributed placement method according to claim 9, wherein the computer:
     reads the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm, when a received input/output request including at least the data item identification information is a request to read a data item and the data item identification information of the data item is included in the relocation information as the relocation data item identification information; and
     writes the data item specified by the relocation data item identification information to the second node corresponding to the second node identification information derived using the second algorithm, when the input/output request is a request to write a data item and the data item identification information of the data item is included in the relocation information as the relocation data item identification information.
  11.  The distributed placement method according to claim 9 or 10, wherein the computer monitors a load state of the computer itself, and reads the data item specified by the relocation data item identification information based on the monitoring result.
  12.  The distributed placement method according to any one of claims 9 to 11, wherein the computer calculates and outputs, for each of the plurality of algorithms, a placement appropriateness indicating how appropriate the placement of the data items would be under that algorithm.
  13.  The distributed placement method according to claim 12, wherein the computer selects the second algorithm based on the placement appropriateness.
  14.  A non-volatile medium recording a program that causes a computer to execute:
     a process of generating and outputting relocation information including one or more pieces of relocation data item identification information, which is data item identification information for identifying a data item to be relocated that is stored in a node;
     a process of reading the data item specified by the relocation data item identification information included in the relocation information from a first node corresponding to first node identification information derived using a first algorithm among a plurality of algorithms, each of which derives specific node identification information from among a plurality of pieces of node identification information corresponding to a plurality of the nodes;
     a process of writing the data item specified by the relocation data item identification information to a second node corresponding to second node identification information derived using a second algorithm among the plurality of algorithms; and
     a process of erasing the data item specified by the relocation data item identification information from the first node corresponding to the first node identification information derived using the first algorithm.
PCT/JP2012/067245 2011-07-01 2012-06-29 Distributed positioning device and distributed positioning method WO2013005812A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2011147519 2011-07-01
JP2011-147519 2011-07-01

Publications (1)

Publication Number Publication Date
WO2013005812A1 true WO2013005812A1 (en) 2013-01-10

Family

ID=47437158

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2012/067245 WO2013005812A1 (en) 2011-07-01 2012-06-29 Distributed positioning device and distributed positioning method

Country Status (1)

Country Link
WO (1) WO2013005812A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014203376A (en) * 2013-04-09 2014-10-27 日本電信電話株式会社 Data processing request distribution device and data processing request distribution program

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003006021A (en) * 2001-06-27 2003-01-10 Hitachi Ltd Data base system, data base managing method and program
JP2008084244A (en) * 2006-09-29 2008-04-10 Alaxala Networks Corp Hash table creating method


Similar Documents

Publication Publication Date Title
US20190073372A1 (en) Creating Snapshots Of A Storage Volume In A Distributed Storage System
JP5931196B2 (en) Control method of cache memory provided in I / O node and plural calculation nodes
US10235047B2 (en) Memory management method, apparatus, and system
US20150381734A1 (en) Storage system and storage system control method
JP2013509658A (en) Allocation of storage memory based on future usage estimates
US20210004171A1 (en) I/o request processing method and device
US10552460B2 (en) Sensor data management apparatus, sensor data management method, and computer program product
JP2013045379A (en) Storage control method, information processing device and program
US10831371B2 (en) Quota controlled movement of data in a tiered storage system
JP2005322020A (en) Storage system, file access control program and file access control method
JP6798564B2 (en) Resource setting control device, resource setting control system, resource setting control method, and resource setting control program
JP2015191523A (en) Configuration management apparatus, configuration management system, and configuration management program
US9338057B2 (en) Techniques for searching data associated with devices in a heterogeneous data center
JP2004318540A (en) Performance information monitoring device, method and program
US10146783B2 (en) Using file element accesses to select file elements in a file system to defragment
CN105760391B (en) Method, data node, name node and system for dynamically redistributing data
JP2005157711A (en) Storage system learning access pattern
CN110457307B (en) Metadata management system, user cluster creation method, device, equipment and medium
JP2005149283A (en) Information processing system, control method therefor, and program
WO2013005812A1 (en) Distributed positioning device and distributed positioning method
JP6307962B2 (en) Information processing system, information processing method, and information processing program
JP2020031359A (en) Network management device, and network management method
US20140058717A1 (en) Simulation system for simulating i/o performance of volume and simulation method
JP2009157441A (en) Information processor, file rearrangement method, and program
JP6568232B2 (en) Computer system and device management method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12807158

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12807158

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP