WO2013171953A1 - Distributed data management device and distributed data operation device - Google Patents
Distributed data management device and distributed data operation device
- Publication number
- WO2013171953A1 (PCT/JP2013/001768)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- node
- tree
- data
- identifier
- value
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Definitions
- The present invention relates to a distributed management technique for data ordered by attribute value.
- Data structures for managing such data include associative arrays that obtain values from keys, key-value stores, maps, and storage engines. Some of these store keys ordered by their values, and others store keys unordered. In the unordered form, the storage destination of data is determined from the value obtained by hashing the key. In the ordered form, each storage destination holds, as range data, the range of values it is in charge of, and the storage location of data is determined from the range data and the key.
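The difference between the two forms can be sketched as follows. This is a minimal illustration, not the method of either cited document; the server names, the range start values, and both function names are assumptions introduced here.

```python
import bisect
import hashlib

SERVERS = ["s0", "s1", "s2"]      # hypothetical storage destinations
RANGE_STARTS = ["a", "h", "q"]    # hypothetical range data: s0 holds [a, h), s1 [h, q), s2 [q, ...)


def place_unordered(key: str) -> str:
    """Unordered form: choose the destination from a hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


def place_ordered(key: str) -> str:
    """Ordered form: choose the destination whose range contains the key."""
    index = bisect.bisect_right(RANGE_STARTS, key) - 1
    if index < 0:
        raise KeyError(f"{key!r} is below the first range start")
    return SERVERS[index]
```

In the ordered form, keys that are adjacent in sort order land on the same or neighbouring destinations, which is what makes range scans possible and also what makes the range data something that must be kept up to date.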
- Non-Patent Document 1 proposes an example of a range management method in an information system.
- A system called Bigtable includes tablet servers that store a plurality of data, a Bigtable master, a central server called Chubby, metadata tablet servers, and clients.
- Each tablet server stores data in a certain continuous range as a tablet.
- The Bigtable master uses a B+ tree to manage which tablet server stores each tablet, stores the subtrees constituting the B+ tree in a plurality of metadata tablet servers, and stores the subtree corresponding to the root in Chubby.
- When the tablet managed by a tablet server changes, the change is reported to the master.
- A client accessing Bigtable accesses Chubby to obtain the root of the B+ tree, obtains its subtrees from the metadata tablet servers, and caches them. While this cache is valid, the client can locally identify the tablet server corresponding to a key value. When the tablet that a tablet server is responsible for changes, the cache on the client becomes invalid, but the client detects this only when it accesses the tablet server corresponding to the key value, and it then queries the metadata tablet server for valid information.
- Non-Patent Document 2 proposes another example of the range management method.
- A system called “Baton” is composed of a plurality of P2P (Peer to Peer) nodes. Each node stores continuous range data. Each node has a link relationship with other nodes, and the link relationships form a balanced tree as a whole. Each node has a link to the node corresponding to its parent in the tree structure, links to the nodes corresponding to its children, and links to adjacent nodes in the same hierarchy. For links to adjacent nodes in the same hierarchy, the range of each link destination is also managed.
- When a certain node obtains an access request for a certain value, it determines which of the ranges handled by the adjacent nodes in the same hierarchy includes the value, and forwards the access request to the determined node. By continuing similar processing at the transfer destination node, the node holding the data corresponding to the value is found. The link relationship between nodes is changed step by step so as to maintain a balanced tree when a node is added or removed. Further, when the distribution of stored data becomes skewed among nodes, each node changes its value range and link relationship so that the data distribution becomes uniform.
- The above-described range management methods have the following problems.
- In the method of Non-Patent Document 1, the client detects a change in the value range of a data storage node only when it executes access to the data. That is, after the detection, the client acquires a new value range from the metadata server and re-executes the data access, so the communication delay is added directly to the data access time.
- One could consider adding, to the method of Non-Patent Document 1, a configuration in which the client periodically queries the metadata server for the value range.
- In that configuration, however, the metadata servers shared across the system receive requests from all clients at predetermined intervals. As the number of clients increases, the load on the metadata servers and the communication load of the system become high, and as a result the performance of the entire system deteriorates.
- In the method of Non-Patent Document 2, a data access request is transferred sequentially from one P2P node to another until the node storing the data to be accessed is found, so the data access time tends to be longer.
- In addition, because the link relationship between nodes is updated according to the value range for each attribute or the load on the node, the number of link relationships between P2P nodes grows as the number of attributes handled in the system increases. This raises the load required for management and updating and makes failures more likely.
- The present invention has been made in view of the circumstances described above, and provides a distributed data management technique that reduces data access time while suppressing load in a system that manages distributed data in attribute value order.
- A first aspect of the present invention relates to a distributed data management apparatus that implements a logical node.
- The target logical node realized by the distributed data management device according to the first aspect includes the following units.
- A node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the plurality of logical nodes in a finite identifier space having a ring structure.
- A data storage unit that stores at least one of a plurality of partial data.
- A link table that stores link information indicating a communicable relationship between the target logical node and another logical node, the link being established between the target logical node and a link destination logical node according to the relationship between the target node identifier and the other identifier in the identifier space.
- A range storage unit that stores, for each attribute, a range boundary value corresponding to the partial data stored in the data storage unit, the range boundary for each attribute being located between the target logical node and its adjacent logical node in the identifier space.
- A tree storage unit that stores, for each attribute, tree structure data composed of a plurality of tree nodes, each indicating a range used to specify the logical node that stores the partial data corresponding to an access request. The tree structure data has a root tree node including at least one entry formed from a pointer to a child tree node associated with the link destination logical node and a value indicating a range for selecting the pointer.
- A program for causing a computer to realize the target logical node described above may be used, as may a computer-readable recording medium on which such a program is recorded.
- This recording medium includes a non-transitory tangible medium.
- FIG. 1 is a diagram conceptually showing a configuration example of the distributed system in the first embodiment.
- FIG. 2 is a diagram conceptually showing a processing configuration example of the data server in the first embodiment.
- FIG. 3 is a diagram conceptually showing an example of the link relationship of logical nodes.
- FIG. 4 is a diagram conceptually showing the link relationship based on the node N (1) shown in FIG.
- FIG. 5 is a diagram conceptually showing the relationship between the ID ring and range information.
- FIG. 6A is a diagram conceptually showing an example of the tree structure data of the node N (1) based on the link examples of FIG. 3 and FIG. 4 in the management form 3.
- FIG. 6B is a diagram conceptually showing an example of the tree structure data of the node N (1) based on the link examples of FIG. 3 and FIG. 4 in the management form 6.
- FIG. A flowchart showing an operation example of the tree production
- FIG. 6 is a diagram illustrating a part of tree structure data generated at each node 11 in the first embodiment.
- FIG. A diagram conceptually showing an example of load distribution in Example 1.
- FIG. A diagram conceptually showing an example of the tree structure data of the node (980) after load distribution in Example 1.
- FIG. A diagram conceptually showing an example of the tree structure data after the version update of the node (413) in Example 1.
- The distributed data management device according to the present embodiment implements a target logical node, which is one of a plurality of logical nodes that store a plurality of partial data obtained by dividing data ordered in attribute value order, each partial data having a value range for each attribute.
- The target logical node in the present embodiment includes a node identifier storage unit that stores the target node identifier, a data storage unit that stores the partial data, a link table that stores link information between the target logical node and the link destination logical node, a range storage unit that stores a range boundary value for each attribute corresponding to the partial data stored in the data storage unit, and a tree storage unit that stores tree structure data for each attribute. The tree structure data has a root tree node including at least one entry that includes a pointer to a child tree node associated with the link destination logical node and a value indicating a range for selecting this pointer.
- The logical node is a software element such as a task, a process, or an instance, and is realized by a computer such as the distributed data management apparatus according to the present embodiment.
- The target node identifier is the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to a plurality of logical nodes in a finite identifier space having a ring structure. That is, each logical node is assigned a unique identifier (hereinafter also referred to as a node ID or ID).
- A finite identifier space having a ring structure means an identifier space defined such that the largest identifier is followed by the smallest identifier in the space.
- The target logical node establishes a link with another logical node (the link destination logical node) according to the relationship between the target node identifier and another identifier in the identifier space.
- Link establishment means that the target logical node can communicate with the link destination logical node, and is realized, for example, by each node holding the other's IP (Internet Protocol) address. Note that this embodiment does not limit the method of realizing the link.
- In this way, a topology based on the node ID is constructed among the plurality of logical nodes that manage data in a distributed manner.
- Each logical node manages range information corresponding to the partial data it stores, as with the above-described range storage unit.
- A range boundary for each attribute is located between logical nodes that are adjacent in the identifier space. That is, the value range for each attribute that each logical node is responsible for is determined in correspondence with the ring structure of the node ID space.
- The attribute value space for each attribute can thus be managed as having a cyclic order (ring structure).
- This can also be described as a configuration in which a range information view of the attribute value space is superimposed on the link topology of the node ID space. With this configuration, a range change can be handled by changing only the range information view superimposed on the node ID space, without changing the link topology of the node ID space.
- The range boundary value stored in the range storage unit may be the start point of the range of the partial data stored in the target logical node, the end point of that range, or a combination of the start point and the end point. Further, the range storage unit may store not only the range boundary value of the target logical node but also the range boundary value of the partial data stored in a node adjacent to the target logical node.
- The target logical node stores tree structure data in which the range of each logical node is reflected. By referring to the tree node entries stored in the tree structure data, a logical node that stores partial data including an arbitrary attribute value is specified.
- The root tree node entry included in the tree structure data includes a pointer to a child tree node associated with the link destination logical node.
- The tree structure data in the present embodiment reflects the link relationship between logical nodes on the topology constructed from the node IDs, so the target logical node can immediately identify the link destination logical node by searching the tree structure data.
- The target logical node does not need to query a specific server about value range changes, so load concentration on a specific server can be prevented.
- Further, a logical node can identify the logical node storing the desired range by referring to the tree structure data that it stores itself. As a result, it is possible to prevent an increase in data access time associated with the transfer of data access requests between nodes.
- FIG. 1 is a diagram conceptually illustrating a configuration example of a distributed system 1 in the first embodiment.
- The distributed system 1 in the first embodiment includes a plurality of data servers 10 and the like.
- The data servers 10 are connected via a network 9 so that they can communicate with each other.
- The data server 10 corresponds to the distributed data management device in the above-described embodiment.
- The data server 10 accesses data stored in the data server 10 and acquires desired data.
- The data server 10 is a so-called computer and includes, for example, a CPU (Central Processing Unit) 2, a memory 3, an input/output interface (I/F) 4, and the like, connected to each other via a bus 5.
- The memory 3 is a RAM (Random Access Memory), a ROM (Read Only Memory), a hard disk, a portable storage medium, or the like.
- The input/output I/F 4 is connected to a communication device 7 that communicates via the network 9 with another data server 10, a data operation client 50, another terminal, and the like.
- The input/output I/F 4 may also be connected to a user interface device such as a display device or an input device.
- The hardware configuration of the data server 10 is not limited.
- FIG. 2 is a diagram conceptually illustrating a processing configuration example of the data server 10 in the first embodiment.
- The data server 10 includes a data operation unit 12, a tree search unit 13, a tree generation unit 14, a tree update unit 15, a version comparison unit 16, one or more logical nodes 11, and the like.
- Each logical node includes a link generation unit 17, a node ID storage unit 18, a link table 19, a tree storage unit 20, a data access unit 21, a data storage unit 22, a value range storage unit 23, and the like.
- These processing units are realized by the CPU 2 executing a program stored in the memory 3.
- The program is installed from a portable recording medium such as a CD (Compact Disc) or a memory card, or from another computer on the network, via the input/output I/F 4, and is stored in the memory 3.
- The logical node 11 and the processing units 12 to 16 are shown as different software elements, but the processing units 12 to 16 may be provided for each logical node 11.
- The logical node 11 is also simply referred to as the node 11.
- The node ID storage unit 18 corresponds to the node identifier storage unit in the above-described embodiment. That is, the node ID storage unit 18 stores the node ID assigned to each node 11 within a finite ID space having a ring structure. For example, the hash value of the IP address of the corresponding server is used as the node ID.
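Deriving a node ID from an IP address might look like the following sketch. The text only states that a hash value of the IP address is used; the choice of SHA-1, the 16-bit ID space, and the name `node_id` are assumptions made for illustration.

```python
import hashlib

ID_BITS = 16  # assumed size of the finite ID space: IDs run from 0 to 2**16 - 1


def node_id(ip_address: str, bits: int = ID_BITS) -> int:
    """Map an IP address string to an ID in a finite space of 2**bits identifiers."""
    digest = hashlib.sha1(ip_address.encode("ascii")).digest()
    # Reducing modulo 2**bits keeps the ID inside the finite space.
    return int.from_bytes(digest, "big") % (2 ** bits)
```

With a small ID space two servers can hash to the same ID, so a real deployment would need a collision rule to keep IDs unique, which this sketch does not address.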
- The link table 19 stores the link relationship between the node 11 and other nodes, as described in the above embodiment.
- The link table 19 in the first embodiment stores, for each of the other nodes, a node ID, an IP address corresponding to the node, and the link relationship with the own node 11.
- The IP address corresponding to each node is the IP address used for communicating with that node, for example, the IP address of the data server 10 in which the node is realized.
- The link relationship stored in the link table 19 indicates whether or not the other node is linked to the own node 11 and, if it is linked, a number (link order or the like) for identifying the link.
- The link generation unit 17 builds the link relationship based on the distance in the ID space between the ID of the own node 11 and the ID of any other node in the distributed system 1, and reflects the result in the link table 19.
- The Chord algorithm, the Koorde algorithm, or the like is used for the construction of the link relationship (construction of the ID topology).
- The first embodiment does not limit the ID topology construction method as long as the topology is constructed to have a ring structure in a finite ID space.
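As one possible reading of Chord-style link construction, the sketch below picks, for each power of two, the first node at or after `own_id + 2**k` on the ring. It illustrates the general Chord finger rule only and is not taken from the patent; `successor_of` and `chord_links` are hypothetical names.

```python
import bisect


def successor_of(point: int, sorted_ids: list[int]) -> int:
    """First node ID at or after `point` on the ring, wrapping past the maximum ID."""
    i = bisect.bisect_left(sorted_ids, point)
    return sorted_ids[i % len(sorted_ids)]


def chord_links(own_id: int, node_ids: list[int], bits: int) -> list[int]:
    """Link destinations: successor of own_id + 2**k (mod 2**bits) for k = 0 .. bits-1."""
    ring = sorted(node_ids)
    links: list[int] = []
    for k in range(bits):
        target = successor_of((own_id + 2 ** k) % (2 ** bits), ring)
        # Skip the own node and destinations that were already chosen.
        if target != own_id and target not in links:
            links.append(target)
    return links
```

The distances double at each step, so a node keeps only a logarithmic number of links while still reaching any part of the ring in few hops.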
- FIG. 3 is a diagram conceptually illustrating an example of the link relationship of the logical nodes 11.
- FIG. 4 is a diagram conceptually showing the link relationship based on the node N (1) shown in FIG.
- The node N (1) is linked to the nodes N (3) and N (4), the node N (3) is linked to the nodes N (5), N (6) and N (7), and the node N (4) is linked to the nodes N (7) and N (8).
- The ID space has a ring structure.
- This ring structure is also referred to as an ID ring.
- The node 11 having the closest ID larger than the ID of the own node 11 is referred to as a successor node or suc node, and the node 11 having the closest ID smaller than the ID of the own node 11 is referred to as a predecessor node or pred node.
- The node 11 having the maximum node ID in the ID space has, as its suc node, the node 11 having the minimum node ID in the ID space. The node 11 having the minimum node ID in the ID space has, as its pred node, the node 11 having the maximum node ID in the ID space. Note that such a link relationship on the ID ring can be identified immediately from the node ID values, so it need not be stored in the link table 19. Alternatively, the relationship between adjacent nodes in the ID space may be stored in the link table 19 together with the link relationships described above.
- Hereinafter, a node 11 that is adjacent on the ID ring is referred to as a suc node or a pred node, and a node 11 that is not adjacent on the ID ring but has a link established is referred to as a link destination node.
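The suc and pred relationships, including the wraparound between the maximum and minimum IDs, can be computed directly from the node ID values. The following is a small sketch; the function name and the sample IDs in the usage note are hypothetical.

```python
def suc_and_pred(own_id: int, node_ids: list[int]) -> tuple[int, int]:
    """Return (suc node ID, pred node ID) of own_id on the ID ring."""
    ring = sorted(node_ids)
    i = ring.index(own_id)
    suc = ring[(i + 1) % len(ring)]   # the maximum ID wraps to the minimum ID
    pred = ring[(i - 1) % len(ring)]  # the minimum ID wraps to the maximum ID
    return suc, pred
```

For the IDs 10, 40, 75 and 120, the node with ID 120 has the suc node 10 and the pred node 75.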
- The distance from a certain node 11 to another node reached by following links is expressed in hops. That is, a node that is directly linked to a certain target node 11 is referred to as a link destination node of the first hop of the target node 11, and a node that is directly linked to a link destination node of the first hop is referred to as a link destination node of the second hop of the target node 11.
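The hop definition can be sketched by following links a fixed number of times. The `links` mapping in the usage note is a hypothetical link table, not the one in the figures, and the function name is an assumption.

```python
def link_destinations_at_hop(start: int, links: dict[int, list[int]], hop: int) -> set[int]:
    """Nodes reached from `start` by following exactly `hop` links."""
    frontier = {start}
    for _ in range(hop):
        # Replace each node in the frontier by the nodes it links to directly.
        frontier = {dst for node in frontier for dst in links.get(node, ())}
    return frontier
```

With `links = {1: [3, 4], 3: [5, 6], 4: [7, 8]}`, the first-hop link destinations of node 1 are nodes 3 and 4, and the second-hop link destinations are nodes 5, 6, 7 and 8.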
- The data storage unit 22 stores some of a plurality of partial data obtained by dividing data ordered in attribute value order.
- Data handled in the present embodiment is data composed of a plurality of rows (tuples), each including a plurality of columns (attributes).
- The stored attribute value may be a one-dimensional value obtained by processing a plurality of attribute values with a space-filling curve.
- The data may be a conditional expression instead of the data itself.
- The partial data stored in the data storage unit 22 has a value range for each attribute.
- The range storage unit 23 stores, as metadata, the boundary value of the range for each attribute shared with the other node with which the data is divided.
- The other node with which a given node divides a value range is the adjacent node whose ID is closest to the ID of the given node in the ID space.
- The range boundary values stored in the range storage unit 23 in each node can take various forms.
- Management form 1: the start point of the range is managed as the range boundary value.
- Management form 2: the end point of the range is managed as the range boundary value.
- Management form 3: both the start point and the end point are managed.
- Management form 4: both the start point and the end point of the range corresponding to the partial data stored in the Suc node are managed.
- Management form 5: both the start point and the end point of the range corresponding to the partial data stored in the Pred node are managed.
- Management form 6: both the start point and the end point of the ranges of the own node and the suc node are managed.
- Management form 7: both the start point and the end point of the ranges of the own node and the Pred node are managed.
- It is desirable to select a management form according to system requirements. For example, when the simplest configuration is desired, the management form 1 or 2 can be selected. In a system in which data consistency is a requirement, the management form 3, 6 or 7 can be selected. In such a system, range boundary values are synchronized between adjacent nodes. In a system where high fault tolerance is a requirement, the management form 4, 5, 6 or 7 can be selected, depending on which adjacent node (Suc node or Pred node) acts as a secondary node (backup node) of the own node. In such a system, the adjacent node (Suc node or Pred node) of the target node manages the range information of the target node. The present embodiment does not limit the form of range information managed in each node.
- FIG. 5 is a diagram conceptually showing the relationship between the ID ring and range information.
- The node N (0) stores the range boundary value (25) for the range boundary with the adjacent node N (1), and the node N (1) stores the range boundary value (32) for the range boundary with the adjacent node N (2).
- In this way, the range for each attribute assigned to each node is determined in correspondence with the ring structure of the ID space.
- Under the resulting cyclic order, the attribute value (200) is determined to be larger than the range boundary value (175) of the node N (6) and smaller than the range boundary value (3) of the node N (7).
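The cyclic comparison in the example above can be sketched as an interval test that allows the range to wrap past the largest attribute value. Treating the range as half-open (start included, end excluded) is an assumption of this sketch, as is the function name.

```python
def in_cyclic_range(value: int, start: int, end: int) -> bool:
    """True if `value` lies in the cyclic half-open interval [start, end)."""
    if start <= end:
        return start <= value < end
    # The interval wraps around: it covers values from `start` upward and values below `end`.
    return value >= start or value < end
```

With the boundary values 175 and 3 from the example, `in_cyclic_range(200, 175, 3)` is true, while a value such as 100 falls outside that range.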
- Such a cyclic order in the attribute value space is used, for example, by the tree search unit 13.
- The data access unit 21 receives a data access request addressed to the own node 11 and determines whether the attribute value or attribute value range included in the request falls within the value range handled by the own node 11, as stored in the value range storage unit 23. If it does not, the data access unit 21 returns an invalid response to the issuer of the request. If it does, the data access unit 21 permits arbitrary processing on the data storage unit 22 and returns the result to the issuer of the request.
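The range check performed before touching stored data can be sketched as follows. The class, its method names, the response tuples, and the half-open range convention are all assumptions for illustration; only the behaviour (reject out-of-range requests, otherwise operate on the stored data) comes from the text.

```python
class DataAccessUnit:
    """Hypothetical stand-in for the data access unit 21 of one node."""

    def __init__(self, range_start, range_end, rows):
        self.range_start = range_start  # start of the range handled by the own node
        self.range_end = range_end      # end of that range (exclusive, by assumption)
        self.rows = dict(rows)          # stand-in for the data storage unit 22

    def handles(self, value):
        """Cyclic containment test: the handled range may wrap past the maximum value."""
        if self.range_start <= self.range_end:
            return self.range_start <= value < self.range_end
        return value >= self.range_start or value < self.range_end

    def read(self, value):
        """Return ("invalid", None) when out of range, else ("ok", stored row or None)."""
        if not self.handles(value):
            return ("invalid", None)
        return ("ok", self.rows.get(value))
```

An "invalid" response is the signal that the requester's view of the ranges is out of date, which is distinct from an in-range key that simply has no stored row.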
- The tree storage unit 20 stores tree structure data for each attribute, composed of a plurality of tree nodes each indicating a value range, as described in the above embodiment.
- The tree structure data stored in the tree storage unit 20 in the first embodiment has a plurality of hierarchies, and has tree data for each hierarchy.
- A node that is an element of this tree structure data is referred to as a tree node.
- The levels within the tree data of each hierarchy included in this tree structure data are referred to as stages.
- A tree node is classified as a root tree node, a branch tree node, or a leaf tree node according to the stage in which it exists.
- The root tree node and the branch tree node include a pointer to a tree node one stage below (a child tree node), and the leaf tree node includes a pointer to data that can specify the logical node 11 that is a data storage destination.
- Each tree node has an entry that includes such a pointer and range information that specifies a range for selecting the pointer.
- In addition to this structure, which resembles an existing tree data structure for data search such as a B-tree, the tree node in the first embodiment has the following characteristic. Specifically, a range of attribute values that does not include all attribute values of the target attribute can be set in each root tree node of each tree data. Since a conventional tree data structure is composed of a single piece of tree data, a range of attribute values including all attribute values of the target attribute is set in its root tree node. In contrast, each root tree node in each hierarchy in the first embodiment covers a part of the range of all attribute values of the target attribute, and the root tree nodes of all hierarchies together cover the range of all attribute values of the target attribute.
- The tree data of the first hierarchy (hierarchy 0) is generated from the range information that the own node has in the range storage unit 23. Therefore, the structure of the tree data of hierarchy 0 corresponds to the management form of the range information described above. For example, the tree data of hierarchy 0 in the management form 6 is formed from a root tree node having the following three entries. In this case, the root tree node also serves as the leaf tree node.
- The first entry includes the start point of the value range handled by the own node (the end point of the value range handled by the Pred node of the own node) and a pointer to data that can identify the own node.
- The second entry includes the end point of the value range handled by the own node (the start point of the value range handled by the suc node of the own node) and a pointer to data that can identify the suc node of the own node.
- The third entry includes the end point of the value range handled by the suc node of the own node and no value (Null).
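The three-entry root tree node of hierarchy 0 in the management form 6 can be sketched as a sorted list of (boundary value, pointer) pairs. The `Entry` type, both function names, the use of a node ID as the pointer target, and the assumption that the covered range does not wrap around the ring are all choices made for this illustration.

```python
import bisect
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    boundary: int            # range boundary value used to select the pointer
    pointer: Optional[int]   # node specifying data (here a node ID), or None for Null


def build_hierarchy0_form6(own_id, own_start, own_end, suc_id, suc_end):
    """Root tree node with three entries: own node, suc node, and a Null terminator."""
    return [Entry(own_start, own_id), Entry(own_end, suc_id), Entry(suc_end, None)]


def lookup(entries, value):
    """Return the pointer of the entry whose range contains `value`, else None."""
    boundaries = [e.boundary for e in entries]
    i = bisect.bisect_right(boundaries, value) - 1
    if i < 0:
        return None  # value lies below the first boundary, outside this tree data
    return entries[i].pointer
```

For example, `build_hierarchy0_form6(1, 25, 32, 2, 40)` describes a node 1 that handles values from 25 up to 32 and a suc node 2 that handles values from 32 up to a hypothetical end point 40. A lookup of 31 yields node 1, a lookup of 32 yields node 2, and values outside both ranges yield None, meaning another hierarchy of the tree structure data has to be consulted.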
- Data that can specify a certain node is hereinafter referred to as node specifying data. For example, a node ID is used as the node specifying data.
- The tree data of the second hierarchy (hierarchy 1) is generated from the range information that each link destination node has in the range storage unit 23. Therefore, the structure of the tree data of hierarchy 1 also corresponds to the management form of the range information described above.
- The tree data of hierarchy 1 in the management form 4 has a plurality of entries as follows. In each entry, the end point of the range that each link destination node is responsible for and a pointer to the node specifying data of the suc node of that link destination node are set. In the last entry, the end point of the value range handled by the suc node of the last link destination node is set as the range boundary value, and no value (Null) is set for the pointer.
- The last link destination node is the link destination node having the maximum ID value among the plurality of link destination nodes.
- The tree data of hierarchy L (L is an integer of 2 or more) is formed from information on each link destination node of the L-th hop and has an L-stage configuration.
- The root tree node of the tree data of hierarchy L includes at least one entry that includes a pointer to a child tree node associated with a link destination node and a value indicating a value range for selecting the pointer.
- The form of associating the pointer to the child tree node with the link destination node is not limited, as long as the node specifying data (for example, the node ID) of the corresponding link destination node can be acquired from the pointer.
- The tree nodes below the root tree node of hierarchy L are generated from the tree data of hierarchy (L-1) in the link destination node. This generation method will be described later.
- FIG. 6A is a diagram conceptually illustrating an example of the tree structure data of the node N (1) based on the link examples of FIGS. 3 and 4 in the management form 3.
- FIG. 6B is a diagram conceptually illustrating an example of the tree structure data of the node N (1) based on the link examples of FIGS. 3 and 4 in the management form 6.
- the left triangles in FIGS. 6A and 6B indicate each layer, and the right side indicates tree data in each layer.
- the arrows in FIGS. 6A and 6B indicate a pointer, and the tip of the arrow indicates node specifying data.
- N(x).sV indicates the range start point of the node N(x), and N(x).eV indicates the range end point of the node N(x).
- the example of FIGS. 6A and 6B shows tree structure data stored by the node N (1), and this tree structure data has tree data of three layers. In FIGS. 6A and 6B, the portion where Null is set is not shown.
- in the tree data of the hierarchy 0, the value range start point (N(1).sV) of the own node, the value range end point (N(1).eV) of the own node, and a pointer to the node specifying data of the own node are set.
- the following data is set in each entry of the hierarchy 1 tree data.
- in the first entry, the range start point (N(3).sV) of the node N(3), which is a link destination node, is set as the range boundary value, and a pointer to the node specifying data of the node N(3) is set as the pointer.
- in the second entry, the range start point (N(4).sV) of the node N(4), which is a link destination node, is set as the range boundary value, and a pointer to the node specifying data of the node N(4) is set as the pointer.
- in the third entry, the range end point (N(4).eV) of the node N(4) is set as the range boundary value, and Null is set as the pointer.
- the tree data of the hierarchy 2 includes a root tree node and two leaf tree nodes.
- the root tree node includes two pointers to the child tree nodes (leaf tree nodes) corresponding to the nodes N(3) and N(4), which are the link destination nodes.
- the two leaf tree nodes cover the nodes N(5) to N(8), which are linked from the nodes N(3) and N(4) and are the second-hop destination nodes as seen from the node N(1).
- the leaf tree node indicated by the pointer associated with the node N(3) is generated from the tree data of the hierarchy 1 of the node N(3), and the leaf tree node indicated by the pointer associated with the node N(4) is generated from the tree data of the hierarchy 1 of the node N(4).
- an entry indicating a value range of the second hop nodes N (5) to N (8) is set in the root node of the hierarchy 2.
- the tree data of each hierarchy shown in FIG. 6B will be described.
- in addition to the configuration shown in FIG. 6A, a pointer to the node specifying data and an end point of the value range are set for the node N(2), which is the suc node of the own node N(1).
- in the tree data of the hierarchy 1, a pointer to the node specifying data and an end point of the value range are set for the node N(5), which is the suc node of the last link destination node N(4).
- in the tree data of the hierarchy 2, a pointer to the node specifying data and an end point of the value range are set for the node N(9), which is the suc node of the last second-hop link destination node N(8).
- the tree data of each hierarchy is generated in a form corresponding to the management form of the range information in each node.
- the tree data of the hierarchy 0 is generated from the range information that the own node has in the range storage unit 23.
- the tree data of the hierarchy 1 is generated from the range information that the link destination node has in the range storage unit 23.
- if the range information of a node is regarded as its tree data of the hierarchy 0, the tree data of the hierarchy L (L is 1 or more) is consistently generated from the tree data of the hierarchy (L-1) of the link destination node.
- the smaller the hierarchy number of the tree data, the higher the freshness of the range information reflected in the tree data.
- version information is added to each tree node of each tree data. This version information is updated according to the range change. This version information is used by the version comparison unit 16 or the like.
- the data operation unit 12 acquires a target attribute and an attribute value or a range of attribute values from the determination condition of operation target data obtained through input from an application program or a user interface device, detects the node 11 corresponding to the attribute value or the attribute value range, and executes the data access process for that node 11.
- the corresponding node 11 is acquired from the tree search unit 13 by passing the target attribute and the attribute value or attribute value range to the tree search unit 13. If the data access fails, the data operation unit 12 inquires the tree search unit 13 again about the data access destination node.
- when the tree search unit 13 acquires the target attribute and the attribute value or the attribute value range from the data operation unit 12, it acquires the tree structure data related to the specified attribute of an arbitrary node from the tree storage unit 20 and identifies, from that tree structure data, the node 11 that stores the partial data corresponding to the attribute value or the attribute value range.
- the tree search unit 13 requests the tree generation unit 14 to generate tree data when the node 11 cannot be specified or when the tree data of a certain hierarchy is missing from the tree structure data. The processing of the tree search unit 13 will be described in detail in the section of the operation example.
- the tree generation unit 14 generates the tree structure data as described above.
- to generate the tree data of the hierarchy L, the tree generation unit 14 acquires the communication address of the link destination node from the link table 19 of the target node 11 and, by accessing the link destination node using that communication address, acquires the tree data of the hierarchy (L-1) stored in the tree storage unit 20 of the link destination node.
- to generate the tree data of the hierarchy 1, the tree generation unit 14 may acquire the tree data of the hierarchy 0 stored in the tree storage unit 20 of the link destination node, or may acquire the range information stored in the range storage unit 23 of the link destination node. The processing of the tree generation unit 14 will be described in detail in the section of the operation example.
- each node 11 manages range information regarding other nodes by using the tree structure data stored in the tree storage unit 20. Therefore, a situation can occur in which a range change made in another node is not yet reflected in the tree structure data of a certain node 11. In such a situation, the tree search unit 13 may identify an inappropriate node 11, that is, a node 11 that does not store the partial data corresponding to the target attribute value or attribute value range.
- the tree update unit 15 checks, at an arbitrary timing, whether the tree structure data stored in each node 11 reflects the latest range information, and updates tree data that holds old range information with the latest range information. Specifically, for the tree data of each hierarchy of hierarchy 2 or higher, the tree update unit 15 sends, to the link destination node associated with a pointer included in the root tree node, a version confirmation request in which the version information of the child tree node indicated by that pointer is set. The tree update unit 15 then updates the tree data and the version information of the hierarchies whose versions differ by using the tree data and the version information included in the reply from the link destination node.
- the tree update unit 15 performs the version check and the update of the tree structure data periodically at a predetermined cycle, asynchronously with the data access process executed by the data operation unit 12. In this way, the time needed to propagate a range change to all the nodes can be shortened. As a result, cases in which the tree search unit 13 identifies an inappropriate node 11 can be reduced, and the resulting increase in data access time can be suppressed.
- when a node 11 receives a version confirmation request from another node, the version comparison unit 16 compares the version information included in the request with the version information of the tree data of the hierarchy (L-1) held by the node 11. The version comparison unit 16 returns the tree data of each hierarchy (L-1) whose version differs, together with the version information of each tree node, to the requesting node.
- in the following operation example, the management form 6, that is, the form in which both the start point and the end point of the ranges of the own node and the suc node are managed, is used.
- FIG. 7 is a flowchart showing an operation example of the tree generation unit 14 in the first embodiment.
- when the tree generation unit 14 receives a generation request in which an attribute and a hierarchy of an arbitrary node 11 (target node 11) are specified, it specifies the generation target attribute (S51) and performs the following operation according to the specified hierarchy (S52).
- when the designated hierarchy is 0, the tree generation unit 14 generates each entry of the root tree node in the tree data of the hierarchy 0 of the target node 11 based on the range information stored in the range storage unit 23. The tree generation unit 14 sets, in the first entry, the range start point of the target node 11 and a pointer to data that can identify the target node 11 (S53).
- the tree generation unit 14 generates a second entry of the root tree node in the tree data of the hierarchy 0 of the target node 11 (S54).
- the tree generation unit 14 sets the value range start point of the suc node of the target node 11 and a pointer to the node specifying data of the suc node of the target node 11 in the second entry.
- the tree generation unit 14 generates a third entry of the root tree node in the tree data of the hierarchy 0 of the target node 11 (S55).
- the tree generation unit 14 sets a value range end point of the suc node of the target node 11 and no value (Null) in the third entry.
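The steps (S53) to (S55) can be sketched as follows. This is a minimal illustration, not the implementation described in the source: the names `Entry`, `RangeInfo`, and `build_hierarchy0` are hypothetical, the node ID stands in for the pointer to node specifying data, and `None` stands in for Null.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    boundary: int            # range boundary value of the entry
    node_id: Optional[int]   # stand-in for the pointer to node specifying data; None means Null


@dataclass
class RangeInfo:
    """Hypothetical view of the range information held by the target node."""
    node_id: int
    start: int       # range start point of the target node
    suc_id: int
    suc_start: int   # range start point of the suc node
    suc_end: int     # range end point of the suc node


def build_hierarchy0(info: RangeInfo) -> List[Entry]:
    """Build the three entries of the hierarchy-0 root tree node."""
    return [
        Entry(info.start, info.node_id),     # first entry (S53)
        Entry(info.suc_start, info.suc_id),  # second entry (S54)
        Entry(info.suc_end, None),           # third entry with Null pointer (S55)
    ]


tree0 = build_hierarchy0(RangeInfo(node_id=1, start=100, suc_id=2, suc_start=200, suc_end=300))
```

The numeric ranges in the usage line are made-up values chosen only to show the shape of the result.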
- when the designated hierarchy is 1, the tree generation unit 14 first specifies the link destination node m of the target node 11 with reference to the link table 19 (S61). When there are a plurality of link destination nodes, one of them is specified as the link destination node m. The initial value of m is 1.
- the tree generation unit 14 generates the mth entry of the root tree node in the tree data of the hierarchy 1 (S62).
- the tree generation unit 14 sets the range start point of the link destination node m and a pointer to the node specifying data of the link destination node m in the m-th entry.
- the tree generation unit 14 acquires the communication address of the link destination node m from the link table 19, and acquires the range start point and the node specifying data from the link destination node m using the communication address.
- the tree generation unit 14 executes (S61) and (S62) described above for all link destination nodes (S66).
- if the link destination node m is the last link destination node (S63; YES), the tree generation unit 14 generates the (m + 1)-th entry (S64) and further generates the (m + 2)-th entry (S65). Specifically, the tree generation unit 14 sets the range end point of the link destination node m and the pointer to the node specifying data of the suc node of the link destination node m in the (m + 1)-th entry (S64). Further, the tree generation unit 14 sets the range end point of the suc node of the link destination node m in the (m + 2)-th entry (S65).
- the range end point of the link destination node m, the node specifying data of the suc node of the link destination node m, and the range end point of the suc node of the link destination node m are acquired from the link destination node m.
- the tree generation unit 14 sets no value (Null) to the pointer of the last entry.
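The hierarchy-1 generation steps (S61) to (S66) can be sketched as follows. This is an illustrative sketch under assumptions: `LinkInfo` and `build_hierarchy1` are hypothetical names, the values in `LinkInfo` are assumed to have already been fetched from each link destination node, and the list is assumed to be ordered so that its last element is the last link destination node.

```python
from typing import List, NamedTuple, Optional, Tuple


class LinkInfo(NamedTuple):
    """Hypothetical record of what one link destination node reports."""
    node_id: int
    start: int     # range start point of the link destination node
    end: int       # range end point of the link destination node
    suc_id: int    # node ID of its suc node
    suc_end: int   # range end point of its suc node


# (range boundary value, node ID or None for Null)
Entry = Tuple[int, Optional[int]]


def build_hierarchy1(links: List[LinkInfo]) -> List[Entry]:
    """Build the entries of the hierarchy-1 root tree node."""
    # One entry per link destination node m (S61, S62, repeated via S66).
    entries: List[Entry] = [(link.start, link.node_id) for link in links]
    if links:
        last = links[-1]
        entries.append((last.end, last.suc_id))  # (m + 1)-th entry (S64)
        entries.append((last.suc_end, None))     # (m + 2)-th entry, Null pointer (S65)
    return entries


links = [
    LinkInfo(node_id=3, start=300, end=400, suc_id=4, suc_end=500),
    LinkInfo(node_id=4, start=400, end=500, suc_id=5, suc_end=600),
]
tree1 = build_hierarchy1(links)
```

With two link destination nodes the result has four entries, which matches the shape described for hierarchy 1 in FIG. 6B (entries for N(3), N(4), the suc node N(5), and a closing Null entry).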
- when the designated hierarchy is L (L is 2 or more), the tree generation unit 14 specifies the link destination node m of the target node 11 (S56) in the same way as in (S61). Subsequently, the tree generation unit 14 requests the tree data of the hierarchy (L-1) from the link destination node m and obtains it in response (S57). At this time as well, the communication address of the link destination node m is acquired from the link table 19.
- the tree generation unit 14 generates, from the acquired tree data, the tree node specified by a certain pointer of the root tree node of the tree data of the hierarchy L and the tree nodes below it (S58). That is, the tree generation unit 14 generates the tree nodes of the second and subsequent stages based on the acquired tree data. When the acquired tree data consists of two or more stages of tree nodes, the tree generation unit 14 connects that multi-stage tree data below the root tree node.
- the tree generation unit 14 generates the mth entry of the root tree node of the tree data of the hierarchy L (S59).
- the tree generation unit 14 sets, in the m-th entry, the range boundary value set in the first entry of the root tree node of the acquired tree data and a pointer to the tree data generated from the root tree node of the acquired tree data. At this time, information that duplicates a tree under an entry other than the m-th entry may be deleted as appropriate.
- the tree generation unit 14 executes (S56), (S57), (S58), and (S59) described above for all link destination nodes (S60). After the entry for the last link destination node, the tree generation unit 14 adds to the root tree node of the tree data of the hierarchy L an entry in which the range end point set in the last entry of the root tree node of the acquired tree data is set and the pointer has no value (Null).
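The assembly of the hierarchy-L root tree node from the hierarchy (L-1) tree data of each link destination node can be sketched as follows. This is a sketch under assumptions: `TreeNode`, `Entry`, and `build_hierarchy_l` are hypothetical names, the acquired tree data is passed in as already-fetched objects, and the optional removal of duplicated information is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TreeNode:
    entries: List["Entry"] = field(default_factory=list)


@dataclass
class Entry:
    boundary: int
    # Node ID (stand-in for node specifying data), child tree node, or None (Null).
    pointer: Union[int, TreeNode, None]


def build_hierarchy_l(acquired: List[TreeNode]) -> TreeNode:
    """Build the hierarchy-L root from hierarchy-(L-1) tree data, one per link destination node."""
    root = TreeNode()
    for child in acquired:  # loop over link destination nodes (S56 to S60)
        # m-th entry: boundary of the child's first entry, pointer to the child (S58, S59).
        root.entries.append(Entry(child.entries[0].boundary, child))
    if acquired:
        # Closing entry: range end point of the last acquired tree, Null pointer.
        root.entries.append(Entry(acquired[-1].entries[-1].boundary, None))
    return root


child3 = TreeNode([Entry(500, 5), Entry(600, 6), Entry(700, None)])
child4 = TreeNode([Entry(700, 7), Entry(800, 8), Entry(900, None)])
root2 = build_hierarchy_l([child3, child4])
```

Reusing the acquired tree as the child subtree is a design shortcut of this sketch; it is what makes hierarchy L an L-stage structure when the acquired data has (L-1) stages.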
- in the above example, the generation process of the tree data of the hierarchy 1 and the generation process of the tree data of the hierarchy L (L ≥ 2) are distinguished from each other, but both may be generated by the same process.
- in that case, in the step (S52) it is determined whether the hierarchy is 0 or L (L ≥ 1), and if the hierarchy 1 is designated, the process (S56) and subsequent steps may be executed.
- FIG. 8 is a flowchart showing an operation example of the tree update unit 15 and the version comparison unit 16 in the first embodiment.
- Each node 11 causes the tree update unit 15 to operate as follows at a predetermined timing.
- the tree update unit 15 refers to the link table 19 to identify the link destination node m of the target node 11 (S70). When there are a plurality of link destination nodes, one of them is specified as the link destination node m. The initial value of m is 1.
- the tree update unit 15 acquires the version information of the tree node specified by the pointer associated with the link destination node m included in the root tree node for each of the hierarchies 2 and higher of the target node 11. (S71).
- when the child tree node specified by the corresponding pointer is a branch tree node, the version information of the child tree node and of all tree nodes under the child tree node may be acquired.
- the tree update unit 15 transmits to the link destination node m a version confirmation request that includes each version information acquired for each of the tiers 2 and higher (S72). Also at this time, as described above, the communication address of the link destination node m is acquired from the link table 19.
- when the link destination node m receives this version confirmation request (S81), it causes the version comparison unit 16 to operate as follows.
- the version comparison unit 16 compares the version information of each hierarchy n included in the version confirmation request with the version information of its own hierarchy (n-1), which is one level lower (S82). For example, the version information of the hierarchy 2 included in the version confirmation request is compared with the version information of the hierarchy 1 of the link destination node m, and the version information of the hierarchy 3 included in the version confirmation request is compared with the version information of the hierarchy 2 of the link destination node m.
- the version comparison unit 16 returns to the target node 11 a reply that contains the tree data (including the version information) of the hierarchies whose versions differ, together with the range information of the link destination node m (which corresponds to the tree data of the hierarchy 0 of the link destination node m) (S83).
- when the tree data of a hierarchy whose version differs is formed from a plurality of tree nodes, the data and version information relating to those tree nodes are returned.
- when the version of the hierarchy n in the request differs, the tree data and version information of the hierarchy (n-1) are set in the reply.
- when the target node 11 receives this reply (S73), it causes the tree update unit 15 to operate as follows. First, for the tree data of the hierarchy 1, the tree update unit 15 compares the range boundary value set in the entry corresponding to the link destination node m with the latest range information of the link destination node m included in the reply. If they differ, the tree update unit 15 updates the range boundary value of the entry with the latest range information and advances the version information of the tree data of the hierarchy 1 (S74).
- the tree update unit 15 refers to the reply and determines whether or not there is a hierarchy having a different version (S75).
- the tree update unit 15 updates the partial tree data corresponding to the link destination node m in the hierarchy with different versions of the target node 11 with the new tree data included in the reply (S76). For example, when the version of the hierarchy 2 is different, the tree update unit 15 determines the child tree specified by the pointer corresponding to the link destination node included in the root tree node in the tree data of the hierarchy 2 of the target node 11. The nodes and below are updated with the tree data included in the reply. At this time, the version information of each tree node below the corresponding child tree node is also updated. Further, the range boundary value of the entry corresponding to the link destination node of the root tree node of hierarchy 2 is also updated.
- the tree update unit 15 advances the version information of the root tree node of each hierarchy whose version differs (S77).
- the tree update unit 15 executes (S70) to (S77) described above for all link destination nodes (S78).
- alternatively, in (S71), the tree update unit 15 may acquire the version information of the tree node specified by the pointer associated with the link destination node m included in the root tree node for each of the hierarchies 1 and higher of the target node 11.
- in this case, the version comparison unit 16 does not have to return the range boundary value of the link destination node m, and the tree update unit 15 does not have to execute the step (S74).
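The version confirmation exchange of (S72), (S82), (S83), (S76), and (S77) can be sketched as follows. This is a deliberately simplified illustration: the function names are hypothetical, network transfer is replaced by direct function calls, each hierarchy carries a single version number (the source assigns version information to every tree node), and the separate handling of the range boundary value in (S74) is omitted.

```python
from typing import Any, Dict, Tuple

# hierarchy number -> (version, tree data)
Trees = Dict[int, Tuple[int, Any]]


def compare_versions(request: Dict[int, int], local_trees: Trees) -> Trees:
    """Link destination side: compare hierarchy n in the request with local hierarchy n-1 (S82)."""
    reply: Trees = {}
    for n, requester_version in request.items():
        version, tree = local_trees[n - 1]
        if version != requester_version:
            reply[n] = (version, tree)  # return only hierarchies whose version differs (S83)
    return reply


def apply_reply(subtrees: Trees, root_versions: Dict[int, int], reply: Trees) -> None:
    """Requesting side: replace stale subtrees and advance the root version."""
    for n, (version, tree) in reply.items():
        subtrees[n] = (version, tree)  # update the partial tree for this link destination (S76)
        root_versions[n] += 1          # advance the version of the hierarchy-n root (S77)


# Link destination node m holds hierarchy 1 at version 5 and hierarchy 2 at version 9.
local = {1: (5, "tree-1"), 2: (9, "tree-2-new")}
# The target node holds subtrees copied from m: hierarchy 2 at version 5, hierarchy 3 at version 8.
subtrees = {2: (5, "tree-1"), 3: (8, "tree-2-old")}
root_versions = {2: 1, 3: 1}

reply = compare_versions({n: v for n, (v, _) in subtrees.items()}, local)
apply_reply(subtrees, root_versions, reply)
```

Only the stale hierarchy is transferred, which is the point of sending version information first instead of the tree data itself.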
- FIG. 9 is a flowchart showing an operation example of the tree search unit 13 in the first embodiment.
- when the tree search unit 13 acquires the target attribute and the attribute value or the attribute value range from the data operation unit 12, it acquires the tree structure data related to the target attribute from the tree storage unit 20 of an arbitrary node 11 (hereinafter referred to as the target node 11) (S90).
- the tree search unit 13 sets an initial value 0 to the hierarchy L (S91).
- the hierarchy 0 is the lowest hierarchy, and the hierarchy L (1 or more) is an upper hierarchy.
- the tree search unit 13 determines whether or not tree data of the hierarchy L exists in the acquired tree structure data (S92). When the tree data of the hierarchy L does not exist (S92; NO), the tree search unit 13 requests the tree generation unit 14 to create the tree data of the hierarchy L of the target node 11 (S93). In response to this request, tree data of the hierarchy L of the target node 11 is generated as described above.
- the tree search unit 13 acquires the tree data of the hierarchy L (S94) and identifies, from the tree data of the hierarchy L, a node (destination node) having a value range corresponding to the attribute value or the attribute value range of the target attribute (S95). Details of the destination node specifying process will be described later with reference to FIG. 10.
- if the tree search unit 13 succeeds in specifying the destination node (S96; YES), it outputs information about the destination node (S97). Based on the output information, the data operation unit 12 transfers a data access request to the destination node.
- the hierarchy L may be incremented by 1 when the destination node cannot be identified for any of the virtual nodes in one hierarchy.
- FIG. 10 is a flowchart showing an example of the operation in which the tree search unit 13 specifies the destination node from the tree data of the hierarchy L in the first embodiment. That is, FIG. 10 shows the detailed operation of (S95) in FIG. 9.
- the tree search unit 13 specifies the root tree node of the tree data of the hierarchy L (S100).
- an initial value 0 (indicating the highest layer) is set for the layer L (S101).
- the tree search unit 13 identifies, from the specified tree node (here, the root tree node), the entry including the value range corresponding to the attribute value or the attribute value range of the target attribute, based on the circulation order in the attribute value space of the target attribute and using the range boundary value of the first entry of the specified tree node as a reference value (S102). As described above, this circulation order corresponds to the ring structure of the attribute value space corresponding to the ID ring.
- when the tree search unit 13 fails to specify the entry (S103; NO), it determines that specification of the destination node has failed (S108). On the other hand, when the tree search unit 13 succeeds in specifying the entry (S103; YES), it determines whether or not the pointer of the specified entry points to node specifying data (S104).
- when the pointer points to node specifying data (S105; YES), the tree search unit 13 acquires the node specifying data indicated by the pointer (S107). This process corresponds to identification of the destination node.
- the tree search unit 13 specifies the child tree node pointed to by the pointer when the node specifying data is not pointed (S105; NO), that is, when the pointer points to the child tree node (S106).
- the tree search unit 13 executes (S102) and subsequent steps for the identified tree node.
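The search of (S100) to (S108) can be sketched as follows for a single attribute value. This is an illustrative sketch, not the source's implementation: `TreeNode`, `Entry`, and `find_destination` are hypothetical names, the attribute value space is assumed to be the integers modulo `space`, and the circulation order is realised by measuring every boundary as an offset from the first entry's boundary.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class TreeNode:
    entries: List["Entry"]


@dataclass
class Entry:
    boundary: int
    # Node ID (stand-in for node specifying data), child tree node, or None (Null).
    pointer: Union[int, TreeNode, None]


def find_destination(root: TreeNode, value: int, space: int) -> Optional[int]:
    """Return the node ID responsible for value, or None when specification fails."""
    node = root  # start from the root tree node (S100)
    while True:
        ref = node.entries[0].boundary   # reference value for the circulation order
        target = (value - ref) % space
        hit = None
        # Entry i covers [boundary_i, boundary_{i+1}) in circulation order (S102).
        for cur, nxt in zip(node.entries, node.entries[1:]):
            lo = (cur.boundary - ref) % space
            hi = (nxt.boundary - ref) % space or space  # offset 0 means a full wrap
            if lo <= target < hi:
                hit = cur
                break
        if hit is None or hit.pointer is None:
            return None                  # specification failure (S108)
        if isinstance(hit.pointer, int):
            return hit.pointer           # node specifying data found (S107)
        node = hit.pointer               # descend to the child tree node (S106)


child3 = TreeNode([Entry(500, 5), Entry(600, 6), Entry(700, None)])
child4 = TreeNode([Entry(700, 7), Entry(800, 8), Entry(900, None)])
root = TreeNode([Entry(500, child3), Entry(700, child4), Entry(900, None)])
# A tree node whose ranges wrap around the end of the value space.
ring = TreeNode([Entry(900, 9), Entry(100, 1), Entry(200, None)])
```

For example, `find_destination(root, 650, 1000)` descends into the subtree for the first link destination node, while a value outside every range yields `None`, which corresponds to the case where the caller moves on to another hierarchy.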
- in response to a data access request related to the attribute value or the attribute value range of the target attribute, the node 11 uses the tree structure data stored in the tree storage unit 20 of the own node 11 to specify the destination node that stores the partial data targeted by the data access request.
- since the data access request can be transferred directly to the destination node specified by a certain node 11, an increase in the data access time caused by inter-node transfer of the data access request can be prevented.
- the tree structure data reflecting the range information possessed by each node 11 is automatically generated from the tree data acquired from the link destination node of each node 11.
- each tree node included in the tree structure data is assigned version information, and each node 11 uses this version information to check and update the version of its own tree structure data at a predetermined timing in the background of the data access process. At this time, each node 11 acquires the latest tree data reflecting the range change from the link destination node.
- in the first embodiment, since range information is exchanged between the node 11 and the link destination node, it is not necessary to provide a specific server group, such as a metadata server, that centrally manages the range information. Thus, load concentration on a specific server group can be prevented. In addition, when a specific server group is provided, it must be made robust and managed so that it does not become a single point of failure; the first embodiment reduces the effort of such countermeasures and of operation and maintenance. According to the first embodiment, even if the range information generated individually at each node 11 disappears for some reason, it can be restored by acquiring it again from the link destination node, which also reduces operation and maintenance effort.
- each node 11 operates to keep the range information (tree structure data) up-to-date autonomously and separately from the data access process.
- the tree structure data reflecting the range information is formed to have the same structure as the link relation of the node IDs, and the version confirmation and acquisition of the range information (tree structure data) is performed between each node 11 and the link destination nodes of that node 11.
- since the number of version confirmations (range change confirmations) performed per predetermined time equals the number of links of each node 11, the load of version confirmation can also be suppressed.
- each node 11 may update the tree structure data at the time of data access as well as performing version check every predetermined time.
- for example, suppose that the node ID corresponding to a certain key value is obtained by referring to the tree structure data and that, after the node is accessed, the access fails as invalid because the value range was already old.
- in that case, the tree structure can be updated with respect to the links corresponding to the path from the root tree node to the leaf tree node in the tree structure data.
- in the first embodiment, the method of constructing the link relationship based on the node ID is not limited.
- in the second embodiment, a new method is applied as the link relationship construction method, and system parameter adjustment processing related thereto is added.
- the configuration of the distributed system 1 in the second embodiment is the same as that in the first embodiment, and the processing of the data server 10 is different from that in the first embodiment.
- the data server 10 in the second embodiment will be described focusing on the contents different from the first embodiment, and the same contents as in the first embodiment will be omitted as appropriate.
- FIG. 11 is a diagram conceptually illustrating a processing configuration example of the data server 10 in the second embodiment.
- the data server 10 in the second embodiment further includes a parameter setting unit 31 in addition to the configuration of the first embodiment.
- the link generation unit 17 constructs a link relationship based on the node ID by a new algorithm (hereinafter referred to as extended Koorde), and reflects the content in the link table 19.
- in the well-known Koorde, each node 11 establishes a link with each of its Suc node, the Pred node corresponding to a value obtained by multiplying the ID of the node 11 by a parameter k (natural number), and the (k-1) Suc nodes following that Pred node.
- the number of link destination nodes of all the nodes 11 is determined by a fixed value k.
- in the well-known Koorde, a node of the de Bruijn graph whose ID does not actually exist as a node is treated as an absent node (imaginary node), and this absent node is managed by its Pred node.
- the present invention derives the distribution of the number of links necessary for hopping between the nodes of the de Bruijn graph, whereas the well-known method can be interpreted as taking only the expected value of that distribution and fixing the number of link destination nodes at k. Therefore, in the well-known Koorde, a node may have an order (the number of link destination nodes) larger than required or smaller than required, and when the order is insufficient, routing (transfer) to the suc node occurs. Independently of the way of linking, the well-known Koorde performs a shift operation on (log_2 k) bits corresponding to a k-ary digit, so k is limited to a power of 2.
- in the extended Koorde, each node 11 establishes a link with each of the following: a first link destination node, which is the Pred node for the value obtained by multiplying the own node ID by a parameter k (an integer of 2 or more); a second link destination node, which is the Pred node for the value obtained by multiplying the node ID of the suc node of the own node by the parameter k; and the nodes existing between the first link destination node and the second link destination node on the ID ring.
- the number (order) of link destination nodes determined for each node 11 has a probability distribution controlled by the parameter k: many nodes 11 have an order smaller than the parameter k, while some nodes 11 have an order greater than the parameter k.
- the expected value of the order is k + 1.
- for convenience of explanation, the expression "value obtained by multiplying ID by parameter k" means the remainder obtained by dividing (ID × k) by 2^b (the size of the ID space). The same simplified expression is used with this meaning elsewhere in this description.
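The link construction rule of the extended Koorde can be sketched as follows. This is a sketch under stated assumptions rather than the source's algorithm in full: the function names are hypothetical, the Pred node for a value x is taken to be the node with the largest ID not exceeding x (wrapping around the ring), all node IDs are assumed known in a sorted list, and the case where the multiplied interval covers the whole ring is ignored.

```python
from bisect import bisect_right
from typing import List


def pred_index(ids: List[int], x: int) -> int:
    """Index of the node with the largest ID <= x, wrapping to the last node (assumed Pred rule)."""
    return (bisect_right(ids, x) - 1) % len(ids)


def extended_koorde_links(ids: List[int], i: int, k: int, b: int) -> List[int]:
    """Link destination node IDs of the node ids[i] on an ID ring of size 2**b.

    ids must be sorted in ascending order; the suc node of ids[i] is ids[(i + 1) % len(ids)].
    """
    size = 1 << b
    n = len(ids)
    first = pred_index(ids, (ids[i] * k) % size)             # Pred of (own ID x k) mod 2^b
    second = pred_index(ids, (ids[(i + 1) % n] * k) % size)  # Pred of (suc ID x k) mod 2^b
    count = (second - first) % n + 1                         # nodes from first to second, inclusive
    return [ids[(first + j) % n] for j in range(count)]


ids = [1, 3, 6, 8, 11, 14]   # made-up node IDs on a ring of size 2**4
links_of_1 = extended_koorde_links(ids, 0, k=2, b=4)
links_of_6 = extended_koorde_links(ids, 2, k=2, b=4)
links_of_14 = extended_koorde_links(ids, 5, k=2, b=4)
```

In this small example the three nodes obtain 3, 2, and 3 link destination nodes respectively, which illustrates that the order varies from node to node instead of being fixed at k.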
- the tree update unit 15 holds the value of the polling interval T, and executes the version confirmation based on the polling interval T.
- the polling interval T may be a value that is inversely proportional to the order of each node, or may be the same value throughout the system. When inversely proportional, there is a merit that the polling load of each node becomes uniform. When the value is the same throughout the entire system, the time from when a range change occurs until it is reflected in the tree of each node can be bounded within a certain range.
- the parameter setting unit 31 sets at least one of the parameters k and T so as to satisfy the system requirements (constraints) on the maximum time wmax until a range change in a certain node 11 is propagated to all the nodes 11 in the distributed system 1, the unit time load ⁇ for version confirmation executed by each node 11, and the maximum tree height hmax related to the tree update.
- the system request is a minimization request such as setting an upper limit constraint that should be less than a certain value or reducing the load as much as possible.
- the maximum height hmax is a system requirement (constraint) that includes a one-hop communication delay and processing time (indicated as ⁇ 0) per one level of the tree.
- the delay in execution related to the update is (hmax ⁇ ⁇ 0) time, and ⁇ 0 is a predetermined value.
- the unit time load ⁇ indicates, for example, the number of version confirmations per second executed by each node.
- a delay in execution related to update is simply referred to as a delay in execution, and a communication delay and processing time of one hop per one level of the tree are expressed as an execution delay per one hop.
- T is the same for all the nodes 11 (the total number N of nodes), and that the constraint time wc for the maximum time wmax is given.
- T may be different for all nodes, and the polling interval T for each node is determined based on the constraint load ⁇ c with respect to the unit time load ⁇ .
- the parameter setting unit 31 updates the value held by the tree update unit 15 with the determined polling interval T.
- the parameter k is determined based on the tree height constraint hc with respect to the maximum tree height hmax.
- The constraint time wc, the constraint load ⁇ c, and the tree height constraint hc may be acquired from another device via the communication device 7, may be input by a user operating a user interface device connected to the input/output I/F 4, or may be acquired from a portable storage medium via the input/output I/F 4.
- The parameter setting unit 31 calculates a polling interval T that satisfies the following (Formula 1), using the acquired constraint time wc, the total number N of nodes 11, and the parameter k used by the link generation unit 17.
- Alternatively, the parameter setting unit 31 calculates a polling interval T that satisfies the following (Formula 2), using the acquired constraint load ⁇ c and the number (order) D of link destination nodes of the node acquired by the link generation unit 17.
- the parameter setting unit 31 sets the parameter k according to the following (Equation 3) using the acquired tree height constraint hc. Note that the tree height constraint hc may be calculated from the execution time delay ( ⁇ 0) per one hop described above and the constraints on the execution time delay.
- the polling interval T is the same value over the entire node, or when different for each node, it is assumed that the polling interval T is a value obtained from the distribution of the polling interval set in the above (Equation 2).
- the expected value is (1 + k) / ⁇ c.
- the unit time load ⁇ and the maximum time wmax vary according to the polling interval T of each node 11 and the parameter k.
- the unit time load ⁇ can be expressed as in the following (formula 4).
- the expected value of the time w required until the change of the range reaches all the nodes 11 is expressed by the following (formula 5).
- the maximum time wmax can be expressed as shown below (Formula 6).
- The right side of the above (Formula 5) is derived as follows. When the version checks executed by the nodes 11 are performed independently of one another, each at a constant interval T, and there are (log_k N) stages, the time w required for propagation of the range change follows an Irwin-Hall distribution, that is, the distribution of the sum of (log_k N) random variables each following the uniform distribution U(0, T). The right side of the above (Formula 5) is the expected value of this time w, and the above (Formula 6) is obtained as the maximum value of this distribution.
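The derivation above can be checked numerically. The following is a minimal sketch under stated assumptions, not the implementation of the parameter setting unit 31: the number of stages is taken as the smallest integer n with k^n >= N, each stage contributes an independent U(0, T) wait, and all function names are hypothetical. Under these assumptions the sum has mean n·T/2 and never exceeds n·T, which matches the description of the expected value and the maximum of the distribution.

```python
import random


def stages(n_nodes: int, k: int) -> int:
    """Number of propagation stages, assumed here to be ceil(log_k N)."""
    h, reach = 0, 1
    while reach < n_nodes:
        reach *= k
        h += 1
    return h


def expected_propagation_time(n_nodes: int, k: int, t: float) -> float:
    """Mean of a sum of n independent U(0, T) variables: n * T / 2."""
    return stages(n_nodes, k) * t / 2.0


def max_propagation_time(n_nodes: int, k: int, t: float) -> float:
    """Upper end of the support of that sum: n * T."""
    return stages(n_nodes, k) * t


def simulate(n_nodes: int, k: int, t: float, trials: int = 20000, seed: int = 0):
    """Monte Carlo estimate of (mean, observed maximum) of the propagation time."""
    rng = random.Random(seed)
    n = stages(n_nodes, k)
    samples = [sum(rng.uniform(0.0, t) for _ in range(n)) for _ in range(trials)]
    return sum(samples) / trials, max(samples)
```

Given a constraint time wc, one way to use this (again an assumption, since the formulas themselves are not reproduced here) is to pick the largest T for which `max_propagation_time` stays at or below wc.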
- the link generation unit 17 of each node 11 determines the link destination node of each node 11 by the extended Koorde algorithm.
- the number (order) of link destination nodes in each node 11 is determined probabilistically.
- the stochastic element resulting from the node ID being generated by hash does not appear in the number of links (constant) of each node 11, and between the Suc node and other nodes.
- the number of links of each node 11 becomes probabilistic.
- a single data server 10 has a large number of links 11 and the number of links. Therefore, there is a difference in the load of checking the version of the tree structure data in each node 11.
- the version check load of the data server 10 can be made substantially constant.
- the number of links from one node to other nodes (the order, in particular the output order) is constant, but the number of links to one node from other nodes (the input order) is stochastic and varies.
- this variation exists, but the variation is smaller than that of the well-known Koorde, and the load is easily distributed. If the output order varies, it is possible to equalize the load by considering other factors such as the polling interval T as shown in the second parameter setting method described above.
- the input order which is the “number of links”, to have a small degree of variation in the order as in the extended Koorde.
- Tree data of hierarchy 1 or higher is generated according to the link relationship determined by the extended Koorde. Therefore, the suc node is excluded from the transmission destinations of the version confirmation request, which is sent only to the link destination nodes, and management of the tree structure data reflecting the range information is simplified. That is, a tree structure can be built in which the paths from each node to the node group h hops ahead are balanced.
- According to the polling interval T set in this way, the version of the tree structure data is confirmed, that is, the range change is checked, at each node 11. Therefore, according to the second embodiment, it is possible to prevent an unexpected load from being generated by the version confirmation process of the range information in each node 11.
- In the embodiments described above, each data server 10 that implements a node 11 storing partial data executes the above-described processing. However, such processing may also be executed in a device that does not store partial data and does not implement a node 11.
- the distributed system 1 in the third embodiment further includes a data operation client 50 as an apparatus that does not store partial data and does not realize the node 11.
- the distributed system 1 of the third embodiment will be described focusing on the contents different from the first embodiment, and the same contents as those of the first embodiment will be omitted as appropriate.
- FIG. 12 is a diagram conceptually illustrating a configuration example of the distributed system 1 in the third embodiment.
- the distributed system 1 in the third embodiment further includes a plurality of data operation clients (hereinafter simply referred to as clients) 50 in addition to the configuration of the first embodiment.
- the client 50 is communicably connected to the data server 10 via the network 9. Similar to the data server 10, the client 50 accesses data stored in the data server 10 in response to a request from an application or another terminal, and acquires desired data.
- the hardware configuration of the client 50 is the same as that of the data server 10, and the present embodiment does not limit the hardware configuration of the client 50.
- FIG. 13 is a diagram conceptually illustrating a processing configuration example of the data operation client 50 in the third embodiment.
- the client 50 includes a data operation unit 12, a tree search unit 13, a tree generation unit 14, a tree update unit 15, a version comparison unit 16, a link generation unit 17, a link table 19, a tree storage unit 20, and the like.
- Each of these processing units is basically the same as in the first embodiment.
- The link generation unit 17 constructs a link relationship between the client 50 and any one or more nodes 11 among the plurality of nodes 11 realized by the plurality of data servers 10, and reflects this link relationship in the link table 19. Since every node holds the value ranges of all the other nodes, the client 50 may have a link relationship with only one arbitrary node. Since the client 50 does not participate in the ID space constructed by the node IDs of the nodes 11, that is, it is not a target that is spontaneously accessed from each node 11, the method of constructing the link relationship in the client 50 is not limited at all.
- the tree structure data stored in the tree storage unit 20 does not have hierarchy 0 tree data because the client 50 does not have an adjacent node on the ID ring.
- the tree data of each hierarchy above hierarchy 1 is the same as in the first embodiment.
- the tree generation unit 14 is the same as that of the first embodiment except that the tree data of the hierarchy 0 is not generated.
- the same processing as in the first embodiment is performed in the client 50 that does not store the partial data and does not realize the node 11. Therefore, even when the client 50 in the third embodiment acquires a data access request, the same operations and effects as in the first embodiment can be achieved.
- the finite range of the ID space of the node ID is [0, 1024), and nine nodes 11 are realized in the distributed system 1.
- In the following, a node having a certain ID value is written as node (ID value).
- FIG. 14 is a diagram conceptually illustrating the relationship between the ID ring and the range information in the first embodiment.
- the node (70) has a range of (10, 25]
- the node (803) has a range of (175, 255] and [0, 3].
- the link generation unit 17 of each node 11 determines the link destination node of the own node 11 by the extended Koorde algorithm.
- the parameter k used in the extended Koorde algorithm is set to 2.
- FIG. 15 is a diagram illustrating a part of the link relationship generated in the first embodiment.
- The node (413) establishes links with the node (803) as the above-described first link destination node, the node (70) as the above-described second link destination node, and the node (980), which exists between these nodes on the ID ring, as the above-described third link destination node.
- the node (803) is a Pred node for a value (826) obtained by multiplying the ID (413) of the node (413) by the parameter k (2).
- the node (803) has a link with each node from the node (551) to the node (803).
- the node (551) establishes a link with the node (70) and the node (129).
- the node (70) establishes a link with the node (129) and the node (250).
- the node (551) can therefore reach the nodes from the node (70) to the node (250) within two hops.
- Such a link relationship is generated by the link generation unit 17 of each node 11 and stored in the link table 19 of each node 11.
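The link relationships in this example follow from a compact rule, which can be sketched as below. This is an illustrative sketch, not the link generation unit 17 itself. It assumes that the "Pred node for a value" is the node with the largest ID not exceeding the value (wrapping around the ring), that k times an ID is taken modulo the ID space size 1024, and that all function names are hypothetical. Only the seven node IDs named in this example are used; the other two of the nine nodes are not identified in this section, so results for nodes whose links depend on them (such as the node (551)) will differ from the text.

```python
from bisect import bisect_right

ID_SPACE = 1024
# The seven node IDs that appear in the example; two further IDs are unknown.
NODE_IDS = sorted([70, 129, 250, 413, 551, 803, 980])


def pred(value, ids=NODE_IDS, space=ID_SPACE):
    """Node with the largest ID <= value on the ring (wraps to the last node)."""
    value %= space
    count_le = bisect_right(ids, value)  # how many IDs are <= value
    return ids[count_le - 1]             # index -1 wraps around the ring


def succ(node_id, ids=NODE_IDS):
    """Next node clockwise on the ID ring."""
    i = ids.index(node_id)
    return ids[(i + 1) % len(ids)]


def link_destinations(node_id, k, ids=NODE_IDS, space=ID_SPACE):
    """First link: pred(k*m). Second link: pred(k*suc(m)). Third: nodes between."""
    first = pred(k * node_id, ids, space)
    second = pred(k * succ(node_id, ids), ids, space)
    links = [first]
    i = ids.index(first)
    while ids[i] != second:              # walk clockwise from first to second
        i = (i + 1) % len(ids)
        links.append(ids[i])
    return links
```

For k = 2, `link_destinations(413, 2)` yields the nodes 803, 980, and 70, and `link_destinations(70, 2)` yields the nodes 129 and 250, in agreement with the example.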
- The operations of the tree search unit 13 and the tree generation unit 14 in the first example of the first embodiment will be described with reference to the flowcharts of FIGS.
- The data operation unit 12 instructs the tree search unit 13 to specify the node 11 storing the partial data including the attribute value (35), with the node (413) as the target node.
- The tree search unit 13 acquires the attribute value (35) of the target attribute (S90), sets the initial value 0 to the hierarchy L (S91), and determines whether or not tree data of hierarchy 0 exists (S92).
- Since the tree data of hierarchy 0 does not exist (S92; NO), the tree search unit 13 requests the tree generation unit 14 to generate the tree data of hierarchy 0 of the node (413) (S93).
- Here, management form 6, that is, the form in which both the start point and the end point of the ranges of the own node and the Suc node are managed, is used. Therefore, at this time, the node (413) stores in the range storage unit 23 the start point (53) and end point (67) of the range of its own node (413), and the start point (67) and end point (138) of the range of the suc node (551).
- Based on the range start point (53) and range end point (67) of the own node (413) and the range start point (67) and range end point (138) of the Suc node (551), the tree generation unit 14 generates the following tree data of hierarchy 0 (S53).
- First entry: range boundary value (53), pointer related to the node (413)
- Second entry: range boundary value (67), pointer related to the node (551)
- Third entry: range boundary value (138), Null
- The tree search unit 13 tries to specify a destination node having a value range including the attribute value (35) from the tree data of hierarchy 0 (S95). However, since the value range indicated by the tree data of hierarchy 0 is (53, 138], the attribute value (35) is not included. Therefore, the tree search unit 13 cannot specify the destination node in the tree data of hierarchy 0 (S96; NO), and raises the search target hierarchy L by one level (S98).
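The hierarchy 0 lookup just described can be sketched as follows. This is a minimal illustration, assuming that each entry covers the half-open interval from its own boundary value (exclusive) to the next entry's boundary value (inclusive), and that this particular tree data does not wrap around the attribute value space. The type alias and function name are hypothetical.

```python
from typing import List, Optional, Tuple

# (range boundary value, node ID the pointer refers to, or None for Null)
Entry = Tuple[int, Optional[int]]

# Hierarchy 0 tree data of the node (413), as listed in the example.
LEVEL0_NODE_413: List[Entry] = [(53, 413), (67, 551), (138, None)]


def find_destination(entries: List[Entry], value: int) -> Optional[int]:
    """Return the node whose range (low, high] contains value, else None."""
    for (low, node), (high, _unused) in zip(entries, entries[1:]):
        if low < value <= high:
            return node
    return None  # not covered by this hierarchy: the caller moves one level up
```

With this data, the attribute value 35 is not found (the range is (53, 138]), while a value such as 60 resolves to the node (413) and 100 resolves to the node (551).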
- the tree search unit 13 requests generation of the tree data of the layer 1 of the node (413) from the tree generation unit 14 because the tree data of the layer 1 does not exist (S92; NO) (S93).
- the tree generation unit 14 includes a range start point (175) of the link destination node (803), a range start point (3) of the link destination node (980), a range start point (10) and a range end point (25) of the link destination node (70). Based on the range end point (32) of the suc node (129) of the link destination node (70), the following hierarchical level 1 tree data is generated (S62).
- The tree search unit 13 tries to specify a destination node having a value range including the attribute value (35) from the tree data of hierarchy 1 (S95). However, since the value range indicated by the tree data of hierarchy 1 is (175, 32], the attribute value (35) is not included. Therefore, the tree search unit 13 cannot specify the destination node in the tree data of hierarchy 1 (S96; NO), and raises the search target hierarchy L by one more level (S98).
- Since the tree data of hierarchy 2 does not exist (S92; NO), the tree search unit 13 requests the tree generation unit 14 to generate the tree data of hierarchy 2 of the node (413) (S93).
- The tree generation unit 14 acquires the tree data of hierarchy 1 from the link destination nodes (803, 980, and 70) (S57), and generates the tree data of hierarchy 2 based on the acquired tree data (S58, S59). Thereby, the root tree node of the tree data of hierarchy 2 is generated as follows.
- Second entry Range boundary value (175), pointer to child tree node (corresponding to link destination node (980))
- Third entry Range boundary value (25), pointer to child tree node (corresponding to link destination node (70))
- the attribute value (53) is included.
- (67, 67] is regarded as (67, 255] ∪ [0, 67], and the attribute value (35) is included in this.
- The tree search unit 13 specifies the entry whose range is (32, 53], traces the pointer of that entry, and specifies the pointer to the node specifying data of the node (250) at the tree node one level below the root tree node (S107).
- The data operation unit 12 acquires the communication address of the node (250) based on the node specifying data of the node (250) acquired from the tree search unit 13, and uses that communication address to perform data access processing for the attribute value (35) on the node (250).
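The inclusion checks used throughout this search treat the attribute value space as cyclic. A minimal sketch of such a check is given below. It assumes half-open intervals of the form (start, end] and treats an interval whose start equals its end as covering the entire space, as in the (67, 67] case above. The function name is hypothetical.

```python
def in_cyclic_range(value: int, start: int, end: int) -> bool:
    """True if value lies in the cyclic half-open interval (start, end]."""
    if start < end:
        # Ordinary interval that does not wrap around.
        return start < value <= end
    # Wrapping interval: (start, max] together with [min, end].
    # When start == end, this covers every value in the space.
    return value > start or value <= end
```

Applied to the example, the attribute value 35 lies outside (53, 138] and outside (175, 32], but inside (32, 53], which is the range of the entry that leads to the node (250).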
- the tree structure data generated in this way is shown in FIG.
- FIG. 16 is a diagram illustrating a part of the tree structure data generated at each node 11 in the first embodiment.
- FIG. 17 is a diagram conceptually illustrating an example of load distribution in the first embodiment.
- the node (129) stores the partial data having the range (25, 38)
- the node (250) stores the partial data having the range (38, 53).
- The load of a data node may be the data storage amount or the data access frequency, or the load per logical node 11 may be equalized using the load per data server 10 as an index.
- A case where the processing target of the tree update unit 15 and the version comparison unit 16 is the node (413) is taken as an example. At this time, it is assumed that the node (413) has tree structure data as shown in FIG.
- the tree update unit 15 performs the following version check process at a predetermined timing.
- The tree update unit 15 recognizes the link destination nodes (803, 980, 70) of the node (413) by referring to the link table 19 of the node (413), and specifies one link destination node (803) among them (S70). Subsequently, the tree update unit 15 acquires the version information (version 1 (ver. 1)) of the tree node specified by the pointer corresponding to the link destination node (803) set in the root tree node of hierarchy 2 in the tree structure data of the node (413) (S71). The tree update unit 15 transmits a version confirmation request including this version information (version 1) of hierarchy 2 to the link destination node (803) (S72).
- When the link destination node (803) receives the version confirmation request (S81), the version comparison unit 16 compares the hierarchy 2 version information (version 1) included in the request with the hierarchy 1 version information (version 1) of the link destination node (803) (S82). Here, the versions are the same. Therefore, the version comparison unit 16 sends a reply including the range information held by the link destination node (803) (S83).
- According to the range information management form (management form 6) of each node in the first embodiment, the range start point (175) of the link destination node (803) and the range start point (3) and range end point (10) of the suc node (980) of the link destination node (803) are returned as the range information held by the link destination node (803).
- When receiving the reply (S73), the tree update unit 15 compares the current range boundary values (175, 3, and 10) set in the first to third entries, which correspond to the range information of the link destination node (803) in the tree data of hierarchy 1 of the node (413), with the latest range information (175, 3, and 10) of the link destination node (803) included in the reply. Here, both are the same, and it is determined that the range has not been changed.
- the tree update unit 15 determines whether or not there is a hierarchy having a different version by referring to the reply (S75).
- the tree update unit 15 identifies the next link destination node (980) (S70).
- the node (980) has tree structure data as shown in FIG.
- FIG. 18 is a diagram conceptually illustrating an example of tree structure data of the node (980) after load distribution in the first embodiment. That is, since the range boundary value of the link destination node (129) of the node (980) has been changed, the version information of the tree data of hierarchy 1 of the node (980) has been incremented to 2.
- The tree update unit 15 similarly performs the process (S71) for the link destination node (980), and transmits a version confirmation request including the version information (version 1) of hierarchy 2 to the link destination node (980) (S72).
- When the link destination node (980) receives the version confirmation request (S81), the version comparison unit 16 compares the hierarchy 2 version information (version 1) included in the request with the hierarchy 1 version information (version 2) of the link destination node (980) (S82). Here, since the versions are different, the version comparison unit 16 sends a reply including the tree data (including version information) of hierarchy 1 of the link destination node (980) and the range information (3, 10, and 25) held by the link destination node (980) (S83).
- Upon receiving the reply (S73), the tree update unit 15 compares the current range boundary values (3, 10, and 25) set in the second to fourth entries, which correspond to the range information of the link destination node (980) in the tree data of hierarchy 1 of the node (413), with the latest range information (3, 10, and 25) held by the link destination node (980) included in the reply. Here, both are the same, and it is determined that the range has not been changed.
- The tree update unit 15 determines that the version of hierarchy 2 is different (S75; YES), and updates the tree node identified by the pointer corresponding to the link destination node (980) included in the root tree node of hierarchy 2 with the tree data of hierarchy 1 of the link destination node (980) included in the reply (S76).
- the version information of the tree node specified by the pointer becomes 2.
- The update may be a copy of the tree data included in the reply, or the range boundary value information in the tree data included in the reply may be omitted where it duplicates the range boundary value set in the root tree node.
- The tree update unit 15 sets the range boundary value (175) of the top entry of the acquired tree data of hierarchy 1 of the link destination node (980) as the range boundary value of the entry that includes the pointer associated with the link destination node (980) in the tree data of hierarchy 2 of the node (413), and increments the version information of the root tree node of hierarchy 2 (S77). Subsequently, the tree update unit 15 specifies the next link destination node (70) (S70). Thereafter, the same version confirmation process is executed for the link destination node (70).
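The request and reply exchange in steps S70 to S83 can be summarized in a short sketch. This is an illustration under assumptions, not the patent's implementation: the data classes, field names, and function names are hypothetical, the boundary lists in a usage example would be placeholders, and network transport is replaced by direct function calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TreeNode:
    version: int
    boundaries: List[int]  # range boundary values of the entries


@dataclass
class Reply:
    range_info: List[int]                 # always returned
    tree_data: Optional[TreeNode] = None  # set only when the versions differ


def handle_version_check(requested_version: int,
                         local_level1: TreeNode,
                         range_info: List[int]) -> Reply:
    """Link destination side (S81 to S83): compare versions and build the reply."""
    if requested_version == local_level1.version:
        return Reply(range_info=list(range_info))
    copy = TreeNode(local_level1.version, list(local_level1.boundaries))
    return Reply(range_info=list(range_info), tree_data=copy)


def apply_reply(children: Dict[int, TreeNode], link_id: int, reply: Reply) -> bool:
    """Requesting side (S73 to S76): replace the cached child tree node if needed.

    Returns True when the child was updated, in which case the caller would
    also increment the version of its hierarchy 2 root tree node (S77).
    """
    if reply.tree_data is None:
        return False
    children[link_id] = reply.tree_data
    return True
```

In the walk-through, the check against the node (803) compares version 1 with version 1 and changes nothing, while the check against the node (980) compares version 1 with version 2, so the cached child tree node is replaced and its version becomes 2.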
- FIG. 19 is a diagram conceptually illustrating an example of the tree structure data after the version update of the node (413) in the first embodiment.
- the version information (Ver. 3) of the root tree node and the version information of the child tree node specified by the pointer associated with the link destination node (980) ( Ver.2), the version information (Ver.2) and the range boundary value (38) of the child tree node specified by the pointer associated with the link destination node (70) are updated.
- the nodes other than the node (413) that have a link to the node (980) are similarly processed, and new range information is acquired.
- there are duplicate entries (three entries including range boundary values (25, 38 and 53)) in the tree data acquired from the link destination node (980) and the tree data acquired from the link destination node (70).
- the duplicate entry is excluded from the child tree node specified by the pointer associated with the previous link destination node (980).
- The tree nodes other than the root tree node of the tree data of hierarchy 2 or higher may be a copy of the tree data included in the reply, or may be modified from that tree data in order to eliminate redundancy.
- In this example, the range boundary value was changed in an entry that is deleted because of redundancy, but the version information of the child tree node specified by the pointer associated with the link destination node (980) was still incremented. However, the version information need not be incremented for updates to entries that are deleted due to redundancy (the part that overlaps with the node's own suc).
- As Example 2, the extended Koorde and the parameters used in the above-described second embodiment will be described.
- the extended Koorde input order (indegree), output order (outdegree), and the height of the constructed tree (or the number of hops) depend strongly on a random variable related to the distance of each node to the Suc node.
- the number of virtual servers (the number of logical nodes 11 per physical server (data server 10)) is v, and the total number of logical nodes 11 is N
- the random variable corresponding to the distance to the adjacent node follows the geometric distribution
- the sum of the virtual server portion v follows a negative binomial distribution NB (v, p).
- the number of nodes included in a certain range x follows a binomial distribution B (x, p).
- p is N / 2^b.
- the output order ( ⁇ OUT ) and the input order ( ⁇ IN ) are given by (Expression X1) and (Expression X2) below.
- the distribution of the upper limit of the highest tree height hmax can be easily obtained at each node.
- hmax is the minimum height h that satisfies r_h > 2^b, where r_h is the width of the ID range covered by the node group h hops ahead.
- When r is the distance between the node and its suc node, r_h is wider than r·k^h, so the minimum h that satisfies r·k^h > 2^b gives an upper limit of hmax.
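The bound on the tree height can be computed with a short loop. The sketch below assumes one reading of the condition in the text, namely that the garbled expression denotes r·k^h > 2^b, with r the ID distance from a node to its suc node, k the extended Koorde parameter, and b the bit width of the ID space. The function name is hypothetical.

```python
def height_upper_bound(r: int, k: int, b: int) -> int:
    """Smallest h with r * k**h > 2**b, using exact integer arithmetic."""
    if r <= 0 or k < 2 or b < 0:
        raise ValueError("expected r > 0, k >= 2, b >= 0")
    h = 0
    covered = r
    while covered <= 2 ** b:  # keep widening until the whole ID space is exceeded
        covered *= k
        h += 1
    return h
```

For instance, with b = 10 (an ID space of 1024), k = 2, and r = 138 (the distance from the node (413) to the node (551) in Example 1), three hops suffice because 138 · 2^3 = 1104 exceeds 1024.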
- hc is the tree height constraint hc as described above. With these conditions as constraints, the Lagrangian function of the following (formula X4) is minimized.
- the above (formula X4) is partially differentiated by the polling interval T, the following (formula X5) is obtained.
- condition (i) 0 (denoted as condition (i)).
- condition (i) corresponds to the case where hc> 0.78 log e N is satisfied.
- the polling interval T is set by the following (formula X9).
- Each node 11 may link with at least nodes from the pred node (km) to the pred node (km + k ⁇ ) using the given tree height constraint hc2 when constructing the link.
- ⁇ is expressed by the following (formula X10).
- Version information is compared on the link destination node side of the target node (S82 in FIG. 8), and tree data of a hierarchy having a different version is returned from the link destination node to the target node (S83 in FIG. 8).
- all the tree data after the hierarchy 2 may be sent from the link destination node to the target node, and the version information may be compared on the target node side.
- the above-mentioned extended Koorde algorithm is effective even when applied in a manner other than that shown in the above-described embodiments and examples.
- the extended Koorde algorithm may be applied to a DHT (Distributed Hash Table) in a data structure in which attribute values are not ordered.
- a distributed data management device for realizing at least one target logical node among a plurality of logical nodes storing a plurality of partial data into which data is divided,
- the target logical node is A node identifier storage unit that stores, as a target node identifier, an identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the plurality of logical nodes within a finite identifier space having a ring structure;
- a data storage unit for storing at least one of the plurality of partial data; Link information indicating a communicable relationship between the target logical node and another logical node, a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or smaller than the value And the first link destination logical node having the identifier closest to the value, the value obtained by multiplying the identifier of the successor logical node larger than the target node identifier and the identifier nearest by the parameter k,
- the computer that implements the target logical node Link information indicating a communicable relationship between the target logical node and another logical node, a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or smaller than the value And the first link destination logical node having the identifier closest to the value, the value obtained by multiplying the identifier of the successor logical node larger than the target node identifier and the identifier nearest by the parameter k, or the value
- a second link destination logical node that is small and has the nearest identifier of the value, and at least an identifier between the identifier of the first link destination logical node and the identifier of the second link destination logical node in the identifier space
- Generating link information including a plurality of links between one third link destination logical node and the target logical node; Distributed data management method.
- the target logical node is A node identifier storage unit that stores, as a target node identifier, an identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the plurality of logical nodes within a finite identifier space having a ring structure;
- a data storage unit for storing at least one of the plurality of partial data;
- Link information indicating a communicable relationship between the target logical node and another logical node, and the target logical node established according to a relationship between the target node identifier in the identifier space;
- a link table for storing link information between link destination logical nodes;
- the target logical node is Tree generation for acquiring tree data from the link destination logical node associated with the pointer included in the root tree node, and generating at least one tree node lower than the root tree node from the acquired tree data
- the distributed data management device according to appendix 1, further comprising a unit.
- the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy has tree data, and the first hierarchy tree data is a value range stored in the range storage unit by the link destination logical node.
- An entry corresponding to the information, the tree data of the hierarchy L higher than the first hierarchy (L is 2 or more) includes the root tree node;
- the tree generation unit acquires tree data of the hierarchy (L-1) stored in the link destination logical node from the link destination logical node associated with the pointer included in the root tree node, and acquires the tree data
- a partial tree data corresponding to the link destination logical node in the tree data of the hierarchy L is generated from the tree data of the hierarchy (L-1)
- the distributed data management device according to appendix 2.
- Each tree node constituting each tree data stored in the tree storage unit includes version information
- the target logical node is A version confirmation request in which version information of a child tree node indicated by the pointer is set is transmitted to the link destination logical node associated with the pointer included in the root tree node of the hierarchy L, and the version A tree update unit for updating the version information of each tree node and each tree node with the tree data and version information included in the reply from the link destination logical node in response to the confirmation request;
- the version confirmation request is received from another logical node, and the version information regarding the hierarchy L included in the version confirmation request is compared with the version information of the tree data of the hierarchy (L-1) possessed by the target logical node.
- a version comparison unit for returning tree data of a different version (L-1) together with version information to the other logical node;
- the distributed data management device according to appendix 3, further comprising:
- the target logical node is The range boundary value indicated by the first entry of the tree node included in the tree structure data related to the search target attribute is set as a reference value in the attribute value space of the search target attribute, and from the reference value to the maximum value in the attribute value space
- the tree structure by the inclusion determination based on the cyclic order of the attribute value space, including a case where an arbitrary value between is smaller than an arbitrary value between the minimum value in the attribute value space and the reference value
- the distributed data management device according to appendix 4, further comprising a tree search unit that identifies an entry including a search target attribute value in the range from the data.
- a value range that does not include all attribute values in the attribute value space is set for each root tree node of each hierarchy, When the tree search unit cannot identify an entry including the search target attribute value in the range from the tree data of a certain hierarchy, the tree search unit tries to search tree data of the hierarchy one level higher, and If the tree data of the hierarchy does not exist, the tree generation unit is requested to generate the tree data of the hierarchy one level above.
- the distributed data management apparatus according to appendix 5.
- the link information stored in the link table includes a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or a first link destination having an identifier smaller than the value and closest to the value A logical node, a value obtained by multiplying the identifier of a successor logical node having an identifier larger than the target node identifier and the nearest one by the parameter k, or a second link having an identifier smaller than the value and the nearest identifier of the value A destination logical node, at least one third link destination logical node having an identifier between the identifier of the first link destination logical node and the identifier of the second link destination logical node in the identifier space, and the target logic Including multiple links between nodes,
- the distributed data management device according to any one of appendices 1 to 6.
- the link table includes a value obtained from the target logical node by multiplying the target node identifier by a parameter k (k is a natural number), or a first link destination logical having an identifier smaller than the value and closest to the value.
- a plurality of links to a logical node and at least one third linked logical node having an identifier between the identifier of the first linked logical node and the identifier of the second linked logical node in the identifier space Store link information,
- the tree update unit transmits the version confirmation request at a polling interval T
- the target logical node is A system constraint time wc with respect to a maximum time until a change in the range in at least one of the plurality of logical nodes is transmitted to all of the plurality of logical nodes, or a unit in which each logical node transmits the version check request
- the system constraint load λc with respect to the time load is acquired, the acquired system constrain
- Appendix 9 The distributed data management device according to appendix 7 or 8, wherein the parameter k is set to 4.
- a distributed data operating device that stores partial data corresponding to an access request and identifies the target logical node realized by the distributed data management device according to attachment 6 as a destination of the access request, A link table storing link information capable of communicating with a plurality of link destination logical nodes including the target logical node;
- the tree storage The tree update unit;
- the tree search unit A distributed data manipulation device comprising:
- the target logical node is A node identifier storage unit that stores, as a target node identifier, an identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the plurality of logical nodes within a finite identifier space having a ring structure;
- a data storage unit for storing at least one of the plurality of partial data; Link information indicating a communicable relationship between the target logical node and another logical node, and the target logical node established according to a relationship between the target node identifier in the identifier space;
- a link table for storing link information between link destination logical nodes; A range boundary value for each attribute corresponding to the partial data stored in the data storage unit
- the target logical node acquires tree data from the link destination logical node associated with the pointer included in the root tree node, and at least one tree lower than the root tree node from the acquired tree data
- the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy has tree data, and the first hierarchy tree data is a value range stored in the range storage unit by the link destination logical node.
- An entry corresponding to the information, the tree data of the hierarchy L higher than the first hierarchy (L is 2 or more) includes the root tree node;
- the tree generation unit acquires tree data of the hierarchy (L-1) stored in the link destination logical node from the link destination logical node associated with the pointer included in the root tree node, and acquires the tree data
- a partial tree data corresponding to the link destination logical node in the tree data of the hierarchy L is generated from the tree data of the hierarchy (L-1)
- the program according to appendix 12.
- Each tree node constituting each tree data stored in the tree storage unit includes version information
- the target logical node is A version confirmation request in which version information of a child tree node indicated by the pointer is set is transmitted to the link destination logical node associated with the pointer included in the root tree node of the hierarchy L, and the version A tree update unit for updating the version information of each tree node and each tree node with the tree data and version information included in the reply from the link destination logical node in response to the confirmation request;
- the version confirmation request is received from another logical node, and the version information regarding the hierarchy L included in the version confirmation request is compared with the version information of the tree data of the hierarchy (L-1) possessed by the target logical node.
- a version comparison unit for returning tree data of a different version (L-1) together with version information to the other logical node;
- the target logical node is The range boundary value indicated by the first entry of the tree node included in the tree structure data related to the search target attribute is set as a reference value in the attribute value space of the search target attribute, and from the reference value to the maximum value in the attribute value space
- the tree structure by the inclusion determination based on the cyclic order of the attribute value space, including a case where an arbitrary value between is smaller than an arbitrary value between the minimum value in the attribute value space and the reference value 15.
- a value range that does not include all attribute values in the attribute value space is set for each root tree node of each hierarchy, When the tree search unit cannot identify an entry including the search target attribute value in the range from the tree data of a certain hierarchy, the tree search unit tries to search tree data of the hierarchy one level higher, and If the tree data of the hierarchy does not exist, the tree generation unit is requested to generate the tree data of the hierarchy one level above.
- the link information stored in the link table includes a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or a first link destination having an identifier smaller than the value and closest to the value A logical node, a value obtained by multiplying the identifier of a successor logical node having an identifier larger than the target node identifier and the nearest one by the parameter k, or a second link having an identifier smaller than the value and the nearest identifier of the value A destination logical node, at least one third link destination logical node having an identifier between the identifier of the first link destination logical node and the identifier of the second link destination logical node in the identifier space, and the target logic Including multiple links between nodes, The program according to any one of appendices 11 to 16.
- the link table includes a value obtained from the target logical node by multiplying the target node identifier by a parameter k (k is a natural number), or a first link destination logical having an identifier smaller than the value and closest to the value.
- a plurality of links to a logical node and at least one third linked logical node having an identifier between the identifier of the first linked logical node and the identifier of the second linked logical node in the identifier space Store link information,
- the tree update unit transmits the version confirmation request at a polling interval T
- the target logical node is A system constraint time wc with respect to a maximum time until a change in the range in at least one of the plurality of logical nodes is transmitted to all of the plurality of logical nodes, or a unit in which each logical node transmits the version check request
- the system constraint load λc with respect to the time load is acquired, the acquired system constrain
- a distributed data management device for realizing at least one target logical node among a plurality of logical nodes storing a plurality of partial data into which data is divided,
- the target logical node is A node identifier storage unit that stores, as a target node identifier, an identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the plurality of logical nodes within a finite identifier space having a ring structure;
- a data storage unit for storing at least one of the plurality of partial data; Link information indicating a communicable relationship between the target logical node and another logical node, a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or smaller than the value And the first link destination logical node having the identifier closest to the value, the value obtained by multiplying the identifier of the successor logical node larger than the target node identifier and the identifier nearest by the parameter k
- the computer that implements the target logical node Link information indicating a communicable relationship between the target logical node and another logical node, a value obtained by multiplying the target node identifier by a parameter k (k is a natural number), or smaller than the value And the first link destination logical node having the identifier closest to the value, the value obtained by multiplying the identifier of the successor logical node larger than the target node identifier and the identifier nearest by the parameter k, or the value
- a second link destination logical node that is small and has the nearest identifier of the value, and at least an identifier between the identifier of the first link destination logical node and the identifier of the second link destination logical node in the identifier space
- Generating link information including a plurality of links between one third link destination logical node and the target logical node; Distributed data management method.
- Appendix 22 A recording medium for recording the program according to any one of appendices 11 to 19 in a computer-readable manner.
Abstract
Description
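For example, in the method of Non-Patent Document 1, when the value range of a node storing data changes, the client's data access time becomes longer. This is because the client detects a change in the value range of a data storage node only when it actually accesses the data. That is, after detecting the change, the client must acquire the new value range from the metadata server and re-execute the data access, so the resulting communication delay is added directly to the data access time.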
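[First Embodiment]
[System Configuration]
FIG. 1 is a diagram conceptually showing a configuration example of the distributed system 1 in the first embodiment. The distributed system 1 in the first embodiment has a plurality of data servers 10 and the like. The data servers 10 are communicably connected to one another via a network 9. The data server 10 corresponds to the distributed data management device in the embodiment described above. In response to a request from an application or another terminal, the data server 10 accesses the data stored in the data servers 10 and acquires the desired data.
FIG. 2 is a diagram conceptually showing a processing configuration example of the data server 10 in the first embodiment. As shown in FIG. 2, the data server 10 has a data operation unit 12, a tree search unit 13, a tree generation unit 14, a tree update unit 15, a version comparison unit 16, one or more logical nodes 11, and the like. Each logical node has a link generation unit 17, a node ID storage unit 18, a link table 19, a tree storage unit 20, a data access unit 21, a data storage unit 22, a value range storage unit 23, and the like.
Each processing unit of each node 11 is described below.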
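The tree data of hierarchy 0 contains the value range start point (N(1).sV) and the value range end point (N(1).eV) of the own node, as well as a pointer to the node-specific data of the own node.
An operation example of the data server 10 in the first embodiment is described below with reference to FIGS. 7 to 10. In the following description, management form 6 above is used as the form in which each node manages value range information; that is, both the start point and the end point of the value ranges of the own node and of the Suc node are managed.
As described above, in the first embodiment, in response to a data access request concerning an attribute value or a range of attribute values of a target attribute, the node 11 uses the tree structure data stored in its own tree storage unit 20 to identify the destination node that stores the partial data targeted by the request. According to the first embodiment, the data access request can therefore be forwarded directly to the destination node identified by a given node 11, which prevents the data access time from increasing because of node-to-node forwarding of the request.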
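In the first embodiment, the method of building link relationships based on node IDs was not limited. In the second embodiment, a new method is applied as the method of building the link relationships, and processing for adjusting the related system parameters is added. The configuration of the distributed system 1 in the second embodiment is the same as in the first embodiment; only the processing of the data server 10 differs. The data server 10 in the second embodiment is described below with a focus on what differs from the first embodiment, and content that is the same as in the first embodiment is omitted as appropriate.
FIG. 11 is a diagram conceptually showing a processing configuration example of the data server 10 in the second embodiment. In addition to the configuration of the first embodiment, the data server 10 in the second embodiment further has a parameter setting unit 31.
In the second embodiment, the link generation unit 17 of each node 11 determines the link destination nodes of that node 11 by the extended Koorde algorithm. In extended Koorde, as described above, the number of link destination nodes (the degree) of each node 11 is determined probabilistically. In the well-known Koorde, the probabilistic element arising from generating node IDs by hashing does not appear in the number of links of each node 11 (which is constant) but in the usage ratio between the Suc node and the other nodes. In extended Koorde, by contrast, the number of links of each node 11 itself becomes probabilistic.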
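The link selection of extended Koorde can be sketched as follows, based on the wording of the claims: the first link destination holds k times the own node ID (or the closest smaller ID), the second holds k times the Suc node ID (or the closest smaller ID), and every node between them is a third link destination. This is only an illustrative sketch. The function names, the reduction of the products modulo the size of the identifier space, and the sample ring of size 1024 are assumptions made here and are not taken from the publication.

```python
from bisect import bisect_right
from typing import List


def predecessor_or_equal(sorted_ids: List[int], value: int) -> int:
    """Return the node ID equal to `value`, or the closest smaller one.

    If every ID is larger than `value`, index -1 wraps around the ring
    to the largest ID.
    """
    return sorted_ids[bisect_right(sorted_ids, value) - 1]


def successor(sorted_ids: List[int], node_id: int) -> int:
    """Return the closest ID larger than `node_id`, wrapping around the ring."""
    return sorted_ids[bisect_right(sorted_ids, node_id) % len(sorted_ids)]


def link_destinations(sorted_ids: List[int], node_id: int, k: int, space: int) -> List[int]:
    """Hypothetical helper: list the link destinations of `node_id`.

    Assumption: products are reduced modulo `space` (size of the ID ring).
    """
    start = (k * node_id) % space
    # Length of the ID interval between k * own ID and k * Suc ID.
    span = k * ((successor(sorted_ids, node_id) - node_id) % space)
    first = predecessor_or_equal(sorted_ids, start)
    # Nodes inside the interval, in ring order; the last one is the second
    # link destination, the ones before it are third link destinations.
    inside = sorted(
        (i for i in sorted_ids if (i - start) % space <= span),
        key=lambda i: (i - start) % space,
    )
    return [first] + [i for i in inside if i != first]


ring = [70, 129, 413, 551, 803, 980]          # sample node IDs, sorted
print(link_destinations(ring, 413, 4, 1024))  # five link destinations
print(link_destinations(ring, 70, 4, 1024))   # two link destinations
```

With the same parameter k, the two sample nodes end up with different numbers of links because the gap to their Suc node differs, which is the sense in which the degree is probabilistic.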
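In each of the embodiments described above, the processing described above was executed by each data server 10 that realizes a node 11 storing partial data, but it may also be executed by a device that stores no partial data and realizes no node 11. The distributed system 1 in the third embodiment further has a data operation client 50 as such a device. The distributed system 1 of the third embodiment is described below with a focus on what differs from the first embodiment, and content that is the same as in the first embodiment is omitted as appropriate.
FIG. 12 is a diagram conceptually showing a configuration example of the distributed system 1 in the third embodiment. In addition to the configuration of the first embodiment, the distributed system 1 in the third embodiment further has a plurality of data operation clients (hereinafter also simply called clients) 50. The client 50 is communicably connected to the data servers 10 via the network 9. Like the data server 10, the client 50 accesses the data stored in the data servers 10 in response to a request from an application or another terminal and acquires the desired data. The hardware configuration of the client 50 is the same as that of the data server 10, and this embodiment does not limit the hardware configuration of the client 50.
FIG. 13 is a diagram conceptually showing a processing configuration example of the data operation client 50 in the third embodiment. As shown in FIG. 13, the client 50 has a data operation unit 12, a tree search unit 13, a tree generation unit 14, a tree update unit 15, a version comparison unit 16, a link generation unit 17, a link table 19, a tree storage unit 20, and the like. These processing units are basically the same as in the first embodiment.
In this way, in the third embodiment, the same processing as in the first embodiment is performed in the client 50, which stores no partial data and realizes no node 11. Therefore, even when the client 50 in the third embodiment receives a data access request, the same operation and effects as in the first embodiment can be obtained.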
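First entry: range boundary value (53), pointer to node (413)
Second entry: range boundary value (67), pointer to node (551)
Third entry: range boundary value (138), Null
First entry: range boundary value (175), pointer to node (803)
Second entry: range boundary value (3), pointer to node (980)
Third entry: range boundary value (10), pointer to node (70)
Fourth entry: range boundary value (25), pointer to node (129)
Fifth entry: range boundary value (32), Null
First entry: range boundary value (67), pointer to child tree node (corresponding to link destination node (803))
Second entry: range boundary value (175), pointer to child tree node (corresponding to link destination node (980))
Third entry: range boundary value (25), pointer to child tree node (corresponding to link destination node (70))
Fourth entry: range boundary value (67), Null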
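The five-entry tree node listed above (boundary values 175, 3, 10, 25, 32) wraps around the end of the attribute value space, so a plain numeric comparison cannot select the right entry. The following sketch illustrates the inclusion determination based on cyclic order described in the claims: the boundary value of the first entry is taken as the reference value, and values at or above it are treated as smaller than values below it. The function names are hypothetical, and two points are assumptions made here: each entry is taken to cover the half-open range from its own boundary value up to the next entry's boundary value, and the node is assumed not to cover the whole value space.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (range boundary value, pointer to node or None for the closing entry)
Entry = Tuple[int, Optional[int]]


def cyclic_key(value: int, reference: int) -> Tuple[int, int]:
    """Order values cyclically, starting at `reference`.

    Values in [reference, max] sort before values in [min, reference).
    """
    return (0, value) if value >= reference else (1, value)


def find_entry(entries: List[Entry], value: int) -> Optional[int]:
    """Return the pointer of the entry whose range contains `value`, else None."""
    reference = entries[0][0]
    keys = [cyclic_key(boundary, reference) for boundary, _ in entries]
    index = bisect_right(keys, cyclic_key(value, reference)) - 1
    if index >= len(entries) - 1:
        return None  # at or beyond the closing boundary: not covered here
    return entries[index][1]


node = [(175, 803), (3, 980), (10, 70), (25, 129), (32, None)]
print(find_entry(node, 200))  # falls in the range starting at 175
print(find_entry(node, 5))    # falls in the range starting at 3, after the wrap
print(find_entry(node, 100))  # outside the ranges of this tree node
```

A value that this tree node does not cover yields None, which is the situation in which a search would move on to the tree data of a higher hierarchy.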
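In the version confirmation processing of the embodiments and modifications described above, the version information is compared on the link destination node side of the target node (S82 in FIG. 8), and the tree data of the hierarchies whose versions differ is returned from the link destination node to the target node (S83 in FIG. 8). This configuration may be changed so that all tree data of hierarchy 2 and above is sent from the link destination node to the target node and the version information is compared on the target node side.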
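The comparison on the link destination node side (S82 and S83 in FIG. 8) can be illustrated with a small sketch. Following the claims, the version that the requesting node holds for hierarchy L is compared with the version of the hierarchy (L-1) tree data held by the link destination node, and only tree data whose version differs is returned. The function name, the dictionary-based message layout and the integer version numbers are assumptions made for illustration, not the format used in the publication.

```python
from typing import Any, Dict, Tuple


def compare_versions(
    requested: Dict[int, int],
    local: Dict[int, Tuple[int, Any]],
) -> Dict[int, Tuple[int, Any]]:
    """Hypothetical handler for a version confirmation request.

    requested: hierarchy L -> version held by the requesting node
    local:     hierarchy   -> (version, tree data) held by this node
    Returns hierarchy L -> (current version, tree data) for stale entries only.
    """
    reply: Dict[int, Tuple[int, Any]] = {}
    for level, requester_version in requested.items():
        held = local.get(level - 1)  # hierarchy L maps to local hierarchy (L-1)
        if held is None:
            continue  # this node has no tree data for that hierarchy
        version, tree = held
        if version != requester_version:
            reply[level] = (version, tree)
    return reply


local_trees = {1: (3, "tree-1"), 2: (7, "tree-2")}
print(compare_versions({2: 3, 3: 5}, local_trees))  # only hierarchy 3 is stale
```

In the alternative mentioned above, the link destination node would skip this comparison and send all of its tree data, and the same comparison would run on the target node instead.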
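The extended Koorde algorithm described above is also effective when applied to forms other than those shown in the embodiments and examples above. For example, the extended Koorde algorithm may be applied to a DHT (Distributed Hash Table) for a data structure in which attribute values are not ordered. In that case, the following aspects are conceivable.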
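A distributed data management device in which the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data; and
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
A distributed data management method that generates link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.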
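A distributed data management device that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data, the partial data being obtained by dividing data ordered by attribute value and each having a value range for each attribute,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data;
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information being between the target logical node and link destination logical nodes and established according to the relationship with the target node identifier in the identifier space;
a value range storage unit that stores a range boundary value for each attribute corresponding to the partial data stored in the data storage unit, the range boundary for each attribute being located, in the identifier space, between the target logical node and a logical node adjacent to the target logical node; and
a tree storage unit that stores tree structure data for each attribute, the tree structure data being composed of a plurality of tree nodes each indicating a value range and serving to identify the logical node that stores the partial data corresponding to an access request, the tree structure data having a root tree node that includes at least one entry formed from a pointer to a child tree node associated with a link destination logical node and a value indicating the value range used to select the pointer.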
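The distributed data management device according to appendix 1, wherein the target logical node further comprises a tree generation unit that acquires tree data from the link destination logical node associated with the pointer included in the root tree node and generates, from the acquired tree data, at least one tree node below the root tree node.
The distributed data management device according to appendix 2,
wherein the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy having tree data, the tree data of the first hierarchy having entries corresponding to the value range information that the link destination logical nodes store in their value range storage units, and the tree data of a hierarchy L above the first hierarchy (L being 2 or more) including the root tree node, and
wherein the tree generation unit acquires, from the link destination logical node associated with the pointer included in the root tree node, the tree data of hierarchy (L-1) stored in that link destination logical node, and generates, from the acquired tree data of hierarchy (L-1), the partial tree data corresponding to that link destination logical node within the tree data of hierarchy L.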
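The distributed data management device according to appendix 3,
wherein each tree node constituting each tree data stored in the tree storage unit includes version information, and
wherein the target logical node further comprises:
a tree update unit that transmits, to the link destination logical node associated with the pointer included in the root tree node of hierarchy L, a version confirmation request in which the version information of the child tree node indicated by the pointer is set, and that updates each tree node and its version information with the tree data and version information included in the reply from the link destination logical node to the version confirmation request; and
a version comparison unit that receives the version confirmation request from another logical node, compares the version information on hierarchy L included in the version confirmation request with the version information of the tree data of hierarchy (L-1) held by the target logical node, and returns the tree data of hierarchy (L-1) whose version differs, together with its version information, to the other logical node.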
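The distributed data management device according to appendix 4, wherein the target logical node further comprises a tree search unit that identifies, from the tree structure data, an entry whose value range contains the attribute value to be searched for, by an inclusion determination based on the cyclic order of the attribute value space, in which the range boundary value indicated by the first entry of a tree node included in the tree structure data for the search target attribute is taken as a reference value in the attribute value space of that attribute, and which covers the case where any value between the reference value and the maximum value of the attribute value space is treated as smaller than any value between the minimum value of the attribute value space and the reference value.
The distributed data management device according to appendix 5,
wherein a value range that does not cover all attribute values in the attribute value space is set in the root tree node of each hierarchy, and
wherein, when the tree search unit cannot identify, from the tree data of a given hierarchy, an entry whose value range contains the search target attribute value, the tree search unit attempts to search the tree data of the hierarchy one level above and, when the tree data of the hierarchy one level above does not exist, requests the tree generation unit to generate it.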
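The fallback from one hierarchy to the next can be sketched as a simple loop. The sketch below is illustrative only: `search_upward`, the `lookup` and `generate` callbacks that stand in for the tree search unit and the tree generation unit, and the `max_level` bound are hypothetical names and assumptions introduced here.

```python
from typing import Callable, Dict, Optional, Tuple, TypeVar

Tree = TypeVar("Tree")


def search_upward(
    trees: Dict[int, Tree],
    value: int,
    lookup: Callable[[Tree, int], Optional[int]],
    generate: Callable[[int], Tree],
    start_level: int,
    max_level: int,
) -> Optional[Tuple[int, int]]:
    """Search hierarchy by hierarchy until an entry covers `value`.

    Returns (hierarchy, pointer) for the first hit, or None if no hierarchy
    up to `max_level` (an assumed safety bound) covers the value.
    """
    for level in range(start_level, max_level + 1):
        if level not in trees:
            # Tree data of this hierarchy does not exist yet: ask for it.
            trees[level] = generate(level)
        pointer = lookup(trees[level], value)
        if pointer is not None:
            return level, pointer
    return None


# Toy tree data: each hierarchy is a dict from attribute value to node ID.
held = {1: {10: 70}}
result = search_upward(
    held,
    20,
    lookup=lambda tree, v: tree.get(v),
    generate=lambda level: {20: 129},
    start_level=1,
    max_level=3,
)
print(result)  # hit in the generated hierarchy 2
```

Generating higher hierarchies only on a miss keeps the tree data held by a node limited to the hierarchies that its searches actually need.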
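The distributed data management device according to any one of appendices 1 to 6, wherein the link information stored in the link table includes a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
The distributed data management device according to any one of appendices 4 to 6,
wherein the link table stores link information including a plurality of links from the target logical node to: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space,
wherein the tree update unit transmits the version confirmation request at a polling interval T, and
wherein the target logical node further comprises a parameter setting unit that acquires a system constraint time wc on the maximum time until a change of the value range in at least one of the plurality of logical nodes is propagated to all of the plurality of logical nodes, or a system constraint load λc on the load per unit time with which each logical node transmits the version confirmation request, and that calculates the polling interval T by applying the acquired system constraint time wc or the acquired system constraint load λc, together with the total number N of logical nodes or the number D of link destination nodes of the target logical node, to (Formula 1) or (Formula 2) below.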
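The distributed data management device according to appendix 7 or 8, wherein the parameter k is set to 4.
A distributed data operation device that identifies, as the destination of an access request, the target logical node that stores the partial data corresponding to the access request and is realized by the distributed data management device according to appendix 6, the distributed data operation device comprising:
a link table that stores link information enabling communication with a plurality of link destination logical nodes including the target logical node;
the tree storage unit;
the tree update unit; and
the tree search unit.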
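A program that causes a computer to realize at least one target logical node among a plurality of logical nodes storing a plurality of partial data, the partial data being obtained by dividing data ordered by attribute value and each having a value range for each attribute,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data;
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information being between the target logical node and link destination logical nodes and established according to the relationship with the target node identifier in the identifier space;
a value range storage unit that stores a range boundary value for each attribute corresponding to the partial data stored in the data storage unit, the range boundary for each attribute being located, in the identifier space, between the target logical node and a logical node adjacent to the target logical node; and
a tree storage unit that stores tree structure data for each attribute, the tree structure data being composed of a plurality of tree nodes each indicating a value range and serving to identify the logical node that stores the partial data corresponding to an access request, the tree structure data having a root tree node that includes at least one entry formed from a pointer to a child tree node associated with a link destination logical node and a value indicating the value range used to select the pointer.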
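The program according to appendix 11, wherein the target logical node further comprises a tree generation unit that acquires tree data from the link destination logical node associated with the pointer included in the root tree node and generates, from the acquired tree data, at least one tree node below the root tree node.
The program according to appendix 12,
wherein the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy having tree data, the tree data of the first hierarchy having entries corresponding to the value range information that the link destination logical nodes store in their value range storage units, and the tree data of a hierarchy L above the first hierarchy (L being 2 or more) including the root tree node, and
wherein the tree generation unit acquires, from the link destination logical node associated with the pointer included in the root tree node, the tree data of hierarchy (L-1) stored in that link destination logical node, and generates, from the acquired tree data of hierarchy (L-1), the partial tree data corresponding to that link destination logical node within the tree data of hierarchy L.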
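The program according to appendix 13,
wherein each tree node constituting each tree data stored in the tree storage unit includes version information, and
wherein the target logical node further comprises:
a tree update unit that transmits, to the link destination logical node associated with the pointer included in the root tree node of hierarchy L, a version confirmation request in which the version information of the child tree node indicated by the pointer is set, and that updates each tree node and its version information with the tree data and version information included in the reply from the link destination logical node to the version confirmation request; and
a version comparison unit that receives the version confirmation request from another logical node, compares the version information on hierarchy L included in the version confirmation request with the version information of the tree data of hierarchy (L-1) held by the target logical node, and returns the tree data of hierarchy (L-1) whose version differs, together with its version information, to the other logical node.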
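The program according to appendix 14, wherein the target logical node further comprises a tree search unit that identifies, from the tree structure data, an entry whose value range contains the attribute value to be searched for, by an inclusion determination based on the cyclic order of the attribute value space, in which the range boundary value indicated by the first entry of a tree node included in the tree structure data for the search target attribute is taken as a reference value in the attribute value space of that attribute, and which covers the case where any value between the reference value and the maximum value of the attribute value space is treated as smaller than any value between the minimum value of the attribute value space and the reference value.
The program according to appendix 15,
wherein a value range that does not cover all attribute values in the attribute value space is set in the root tree node of each hierarchy, and
wherein, when the tree search unit cannot identify, from the tree data of a given hierarchy, an entry whose value range contains the search target attribute value, the tree search unit attempts to search the tree data of the hierarchy one level above and, when the tree data of the hierarchy one level above does not exist, requests the tree generation unit to generate it.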
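The program according to any one of appendices 11 to 16, wherein the link information stored in the link table includes a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
The program according to appendix 14,
wherein the link table stores link information including a plurality of links from the target logical node to: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space,
wherein the tree update unit transmits the version confirmation request at a polling interval T, and
wherein the target logical node further comprises a parameter setting unit that acquires a system constraint time wc on the maximum time until a change of the value range in at least one of the plurality of logical nodes is propagated to all of the plurality of logical nodes, or a system constraint load λc on the load per unit time with which each logical node transmits the version confirmation request, and that calculates the polling interval T by applying the acquired system constraint time wc or the acquired system constraint load λc, together with the total number N of logical nodes or the number D of link destination nodes of the target logical node, to (Formula 1) or (Formula 2) below.
The program according to appendix 17 or 18, wherein the parameter k is set to 4.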
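A distributed data management device that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data obtained by dividing data,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data; and
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
A distributed data management method performed by a computer that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data obtained by dividing data, the target logical node having a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure, and a data storage unit that stores at least one of the plurality of partial data,
the method comprising generating link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
A recording medium on which the program according to any one of appendices 11 to 19 is recorded in a computer-readable manner.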
Claims (21)
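- A distributed data management device that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data, the partial data being obtained by dividing data ordered by attribute value and each having a value range for each attribute,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data;
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information being between the target logical node and link destination logical nodes and established according to the relationship with the target node identifier in the identifier space;
a value range storage unit that stores a range boundary value for each attribute corresponding to the partial data stored in the data storage unit, the range boundary for each attribute being located, in the identifier space, between the target logical node and a logical node adjacent to the target logical node; and
a tree storage unit that stores tree structure data for each attribute, the tree structure data being composed of a plurality of tree nodes each indicating a value range and serving to identify the logical node that stores the partial data corresponding to an access request, the tree structure data having a root tree node that includes at least one entry formed from a pointer to a child tree node associated with a link destination logical node and a value indicating the value range used to select the pointer.
- The distributed data management device according to claim 1, wherein the target logical node further comprises a tree generation unit that acquires tree data from the link destination logical node associated with the pointer included in the root tree node and generates, from the acquired tree data, at least one tree node below the root tree node.
- The distributed data management device according to claim 2,
wherein the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy having tree data, the tree data of the first hierarchy having entries corresponding to the value range information that the link destination logical nodes store in their value range storage units, and the tree data of a hierarchy L above the first hierarchy (L being 2 or more) including the root tree node, and
wherein the tree generation unit acquires, from the link destination logical node associated with the pointer included in the root tree node, the tree data of hierarchy (L-1) stored in that link destination logical node, and generates, from the acquired tree data of hierarchy (L-1), the partial tree data corresponding to that link destination logical node within the tree data of hierarchy L.
- The distributed data management device according to claim 3,
wherein each tree node constituting each tree data stored in the tree storage unit includes version information, and
wherein the target logical node further comprises:
a tree update unit that transmits, to the link destination logical node associated with the pointer included in the root tree node of hierarchy L, a version confirmation request in which the version information of the child tree node indicated by the pointer is set, and that updates each tree node and its version information with the tree data and version information included in the reply from the link destination logical node to the version confirmation request; and
a version comparison unit that receives the version confirmation request from another logical node, compares the version information on hierarchy L included in the version confirmation request with the version information of the tree data of hierarchy (L-1) held by the target logical node, and returns the tree data of hierarchy (L-1) whose version differs, together with its version information, to the other logical node.
- The distributed data management device according to claim 4, wherein the target logical node further comprises a tree search unit that identifies, from the tree structure data, an entry whose value range contains the attribute value to be searched for, by an inclusion determination based on the cyclic order of the attribute value space, in which the range boundary value indicated by the first entry of a tree node included in the tree structure data for the search target attribute is taken as a reference value in the attribute value space of that attribute, and which covers the case where any value between the reference value and the maximum value of the attribute value space is treated as smaller than any value between the minimum value of the attribute value space and the reference value.
- The distributed data management device according to claim 5,
wherein a value range that does not cover all attribute values in the attribute value space is set in the root tree node of each hierarchy, and
wherein, when the tree search unit cannot identify, from the tree data of a given hierarchy, an entry whose value range contains the search target attribute value, the tree search unit attempts to search the tree data of the hierarchy one level above and, when the tree data of the hierarchy one level above does not exist, requests the tree generation unit to generate it.
- The distributed data management device according to any one of claims 1 to 6, wherein the link information stored in the link table includes a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
- The distributed data management device according to any one of claims 4 to 6,
wherein the link table stores link information including a plurality of links from the target logical node to: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space,
wherein the tree update unit transmits the version confirmation request at a polling interval T, and
wherein the target logical node further comprises a parameter setting unit that acquires a system constraint time wc on the maximum time until a change of the value range in at least one of the plurality of logical nodes is propagated to all of the plurality of logical nodes, or a system constraint load λc on the load per unit time with which each logical node transmits the version confirmation request, and that calculates the polling interval T by applying the acquired system constraint time wc or the acquired system constraint load λc, together with the total number N of logical nodes or the number D of link destination nodes of the target logical node, to (Formula 1) or (Formula 2) below.
- The distributed data management device according to claim 7 or 8, wherein the parameter k is set to 4.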
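- A distributed data operation device that identifies, as the destination of an access request, the target logical node that stores the partial data corresponding to the access request and is realized by the distributed data management device according to claim 6, the distributed data operation device comprising:
a link table that stores link information enabling communication with a plurality of link destination logical nodes including the target logical node;
the tree storage unit;
the tree update unit; and
the tree search unit.
- A program that causes a computer to realize at least one target logical node among a plurality of logical nodes storing a plurality of partial data, the partial data being obtained by dividing data ordered by attribute value and each having a value range for each attribute,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data;
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information being between the target logical node and link destination logical nodes and established according to the relationship with the target node identifier in the identifier space;
a value range storage unit that stores a range boundary value for each attribute corresponding to the partial data stored in the data storage unit, the range boundary for each attribute being located, in the identifier space, between the target logical node and a logical node adjacent to the target logical node; and
a tree storage unit that stores tree structure data for each attribute, the tree structure data being composed of a plurality of tree nodes each indicating a value range and serving to identify the logical node that stores the partial data corresponding to an access request, the tree structure data having a root tree node that includes at least one entry formed from a pointer to a child tree node associated with a link destination logical node and a value indicating the value range used to select the pointer.
- The program according to claim 11, wherein the target logical node further comprises a tree generation unit that acquires tree data from the link destination logical node associated with the pointer included in the root tree node and generates, from the acquired tree data, at least one tree node below the root tree node.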
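- The program according to claim 12,
wherein the tree structure data stored in the tree storage unit has a plurality of hierarchies, each hierarchy having tree data, the tree data of the first hierarchy having entries corresponding to the value range information that the link destination logical nodes store in their value range storage units, and the tree data of a hierarchy L above the first hierarchy (L being 2 or more) including the root tree node, and
wherein the tree generation unit acquires, from the link destination logical node associated with the pointer included in the root tree node, the tree data of hierarchy (L-1) stored in that link destination logical node, and generates, from the acquired tree data of hierarchy (L-1), the partial tree data corresponding to that link destination logical node within the tree data of hierarchy L.
- The program according to claim 13,
wherein each tree node constituting each tree data stored in the tree storage unit includes version information, and
wherein the target logical node further comprises:
a tree update unit that transmits, to the link destination logical node associated with the pointer included in the root tree node of hierarchy L, a version confirmation request in which the version information of the child tree node indicated by the pointer is set, and that updates each tree node and its version information with the tree data and version information included in the reply from the link destination logical node to the version confirmation request; and
a version comparison unit that receives the version confirmation request from another logical node, compares the version information on hierarchy L included in the version confirmation request with the version information of the tree data of hierarchy (L-1) held by the target logical node, and returns the tree data of hierarchy (L-1) whose version differs, together with its version information, to the other logical node.
- The program according to claim 14, wherein the target logical node further comprises a tree search unit that identifies, from the tree structure data, an entry whose value range contains the attribute value to be searched for, by an inclusion determination based on the cyclic order of the attribute value space, in which the range boundary value indicated by the first entry of a tree node included in the tree structure data for the search target attribute is taken as a reference value in the attribute value space of that attribute, and which covers the case where any value between the reference value and the maximum value of the attribute value space is treated as smaller than any value between the minimum value of the attribute value space and the reference value.
- The program according to claim 15,
wherein a value range that does not cover all attribute values in the attribute value space is set in the root tree node of each hierarchy, and
wherein, when the tree search unit cannot identify, from the tree data of a given hierarchy, an entry whose value range contains the search target attribute value, the tree search unit attempts to search the tree data of the hierarchy one level above and, when the tree data of the hierarchy one level above does not exist, requests the tree generation unit to generate it.
- The program according to any one of claims 11 to 16, wherein the link information stored in the link table includes a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
- The program according to claim 14,
wherein the link table stores link information including a plurality of links from the target logical node to: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space,
wherein the tree update unit transmits the version confirmation request at a polling interval T, and
wherein the target logical node further comprises a parameter setting unit that acquires a system constraint time wc on the maximum time until a change of the value range in at least one of the plurality of logical nodes is propagated to all of the plurality of logical nodes, or a system constraint load λc on the load per unit time with which each logical node transmits the version confirmation request, and that calculates the polling interval T by applying the acquired system constraint time wc or the acquired system constraint load λc, together with the total number N of logical nodes or the number D of link destination nodes of the target logical node, to (Formula 1) or (Formula 2) below.
- The program according to claim 17 or 18, wherein the parameter k is set to 4.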
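- A distributed data management device that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data obtained by dividing data,
wherein the target logical node comprises:
a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure;
a data storage unit that stores at least one of the plurality of partial data; and
a link table that stores link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.
- A distributed data management method performed by a computer that realizes at least one target logical node among a plurality of logical nodes storing a plurality of partial data obtained by dividing data, the target logical node having a node identifier storage unit that stores, as a target node identifier, the identifier assigned to the target logical node among a plurality of identifiers uniquely assigned to the respective logical nodes within a finite identifier space having a ring structure, and a data storage unit that stores at least one of the plurality of partial data,
the method comprising generating link information indicating a communicable relationship between the target logical node and other logical nodes, the link information including a plurality of links between the target logical node and: a first link destination logical node that holds the value obtained by multiplying the target node identifier by a parameter k (k being a natural number) or that has the identifier smaller than and closest to that value; a second link destination logical node that holds the value obtained by multiplying by the parameter k the identifier of the successor logical node, whose identifier is larger than and closest to the target node identifier, or that has the identifier smaller than and closest to that value; and at least one third link destination logical node whose identifier lies between the identifier of the first link destination logical node and that of the second link destination logical node in the identifier space.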
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014515468A JP5967195B2 (ja) | 2012-05-15 | 2013-03-15 | 分散データ管理装置及び分散データ操作装置 |
US14/400,056 US10073857B2 (en) | 2012-05-15 | 2013-03-15 | Distributed data management device and distributed data operation device |
EP13790415.7A EP2851803A4 (en) | 2012-05-15 | 2013-03-15 | DISTRIBUTED DATA MANAGEMENT DEVICE AND DISTRIBUTED DATA OPERATION DEVICE |
CN201380037661.5A CN104487951B (zh) | 2012-05-15 | 2013-03-15 | 分布式数据管理设备和分布式数据操作设备 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012-111189 | 2012-05-15 | ||
JP2012111189 | 2012-05-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013171953A1 true WO2013171953A1 (ja) | 2013-11-21 |
Family
ID=49583386
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/001768 WO2013171953A1 (ja) | 2012-05-15 | 2013-03-15 | 分散データ管理装置及び分散データ操作装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US10073857B2 (ja) |
EP (1) | EP2851803A4 (ja) |
JP (1) | JP5967195B2 (ja) |
CN (1) | CN104487951B (ja) |
WO (1) | WO2013171953A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021060635A (ja) * | 2019-10-02 | 2021-04-15 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5850054B2 (ja) * | 2011-08-01 | 2016-02-03 | 日本電気株式会社 | 分散処理管理サーバ、分散システム、及び分散処理管理方法 |
CN106484321A (zh) * | 2016-09-08 | 2017-03-08 | 华为数字技术(成都)有限公司 | 一种数据存储方法及数据中心 |
CN107229429B (zh) * | 2017-06-27 | 2020-06-16 | 苏州浪潮智能科技有限公司 | 一种存储空间管理方法及装置 |
CN107665241B (zh) * | 2017-09-07 | 2020-09-29 | 北京京东尚科信息技术有限公司 | 一种实时数据多维度去重方法和装置 |
CN108536447B (zh) * | 2018-04-11 | 2021-07-16 | 上海掌门科技有限公司 | 运维管理方法 |
US11119679B2 (en) * | 2019-08-02 | 2021-09-14 | Micron Technology, Inc. | Storing data based on a probability of a data graph |
US11474866B2 (en) * | 2019-09-11 | 2022-10-18 | International Business Machines Corporation | Tree style memory zone traversal |
US11507541B2 (en) * | 2020-01-21 | 2022-11-22 | Microsoft Technology Licensing, Llc | Method to model server-client sync conflicts using version trees |
CN113032401B (zh) * | 2021-03-31 | 2023-09-08 | 合安科技技术有限公司 | 基于异形结构树的大数据处理方法、装置及相关设备 |
US11494366B1 (en) * | 2021-05-25 | 2022-11-08 | Oracle International Corporation | Change data capture on no-master data stores |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
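JP2005094264A (ja) * | 2003-09-16 | 2005-04-07 | Nomura Research Institute Ltd | Method for requesting participation in a logical network, participation acceptance method, message transmission method, participation request program, participation acceptance program, message transmission program, participation request device, participation acceptance device, and message transmission device |
JP2005244630A (ja) * | 2004-02-26 | 2005-09-08 | Nomura Research Institute Ltd | Multicast routing information transmission method, multicast message transmission method, multicast routing information transmission program, and multicast message transmission program |
JP2008234563A (ja) * | 2007-03-23 | 2008-10-02 | Nec Corp | Overlay management device, overlay management system, overlay management method, and overlay management program |
JP2008262507A (ja) * | 2007-04-13 | 2008-10-30 | Nec Corp | Data search device, data search system, data search method, and data search program |
JP2009508410A (ja) * | 2005-09-08 | 2009-02-26 | パナソニック株式会社 | Parallel execution of peer-to-peer overlay communication using multi-destination routing |
JP2010509692A (ja) * | 2006-11-14 | 2010-03-25 | シーメンス アクチエンゲゼルシヤフト | Method for load balancing in a peer-to-peer overlay network |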
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2117846C (en) * | 1993-10-20 | 2001-02-20 | Allen Reiter | Computer method and storage structure for storing and accessing multidimensional data |
CN100401667C (zh) * | 2000-06-21 | 2008-07-09 | 索尼公司 | Information recording/reproducing apparatus and method |
US7519574B2 (en) * | 2003-08-25 | 2009-04-14 | International Business Machines Corporation | Associating information related to components in structured documents stored in their native format in a database |
EP1764710A4 (en) | 2004-06-03 | 2009-03-18 | Turbo Data Lab Inc | LAYOUT GENERATION PROCESS, INFORMATION PROCESSING DEVICE AND PROGRAM |
WO2006080268A1 (ja) * | 2005-01-25 | 2006-08-03 | Turbo Data Laboratories Inc. | Tree search, aggregation, and sort method, information processing device, and tree search, aggregation, and sort program |
WO2007024918A2 (en) | 2005-08-23 | 2007-03-01 | Matsushita Electric Industrial Co., Ltd. | System and method for service discovery in a computer network using dynamic proxy and data dissemination |
US8250116B2 (en) * | 2008-12-31 | 2012-08-21 | Unisys Corporation | KStore data simulator directives and values processor process and files |
2013
- 2013-03-15 WO PCT/JP2013/001768 patent/WO2013171953A1/ja active Application Filing
- 2013-03-15 US US14/400,056 patent/US10073857B2/en active Active
- 2013-03-15 EP EP13790415.7A patent/EP2851803A4/en not_active Withdrawn
- 2013-03-15 CN CN201380037661.5A patent/CN104487951B/zh active Active
- 2013-03-15 JP JP2014515468A patent/JP5967195B2/ja active Active
Non-Patent Citations (4)
Title |
---|
FAY CHANG; JEFFREY DEAN; SANJAY GHEMAWAT; WILSON C. HSIEH; DEBORAH A. WALLACH; MIKE BURROWS; TUSHAR CHANDRA; ANDREW FIKES; ROBERT E. GRUBER: "Bigtable: A Distributed Storage System for Structured Data", SYMPOSIUM ON OPERATING SYSTEMS DESIGN, 6 November 2006 (2006-11-06) |
H.V. JAGADISH; BENG CHIN OOI; QUANG HIEU VU: "BATON: A Balanced Tree Structure for Peer-to-Peer Networks", VERY LARGE DATA BASES, 30 August 2005 (2005-08-30) |
See also references of EP2851803A4 |
SHINJI NAKADAI ET AL.: "PLATON: Multi-dimensional Range Query System for Data Sharing in Distributed System", SYMPOSIUM ON MULTIMEDIA, DISTRIBUTED, COOPERATIVE AND MOBILE SYSTEMS (DICOMO2007) RONBUNSHU, IPSJ SYMPOSIUM SERIES, vol. 2007, no. L, 4 July 2007 (2007-07-04), pages 173 - 184, XP008174995 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
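JP2021060635A (ja) * | 2019-10-02 | 2021-04-15 | ヤフー株式会社 | Information processing device, information processing method, and information processing program |
JP7239433B2 (ja) | 2019-10-02 | 2023-03-14 | ヤフー株式会社 | Information processing device, information processing method, and information processing program |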
Also Published As
Publication number | Publication date |
---|---|
US20150120649A1 (en) | 2015-04-30 |
EP2851803A4 (en) | 2016-01-13 |
CN104487951A (zh) | 2015-04-01 |
CN104487951B (zh) | 2017-09-22 |
JPWO2013171953A1 (ja) | 2016-01-12 |
US10073857B2 (en) | 2018-09-11 |
JP5967195B2 (ja) | 2016-08-10 |
EP2851803A1 (en) | 2015-03-25 |
Similar Documents
Publication | Title |
---|---|
JP5967195B2 (ja) | Distributed data management device and distributed data operation device |
US8166074B2 (en) | Index data structure for a peer-to-peer network |
JP6119421B2 (ja) | Database, control unit, method, and system for storing encoded triples |
Wang et al. | Indexing multi-dimensional data in a cloud system |
JP6094487B2 (ja) | Information system, management device, data processing method, data structure, program, and recording medium |
JP5090450B2 (ja) | Method, program, and computer-readable medium for updating replicated data stored at a plurality of nodes organized in a hierarchy and linked via a network |
US20100161657A1 (en) | Metadata server and metadata management method |
US8296420B2 (en) | Method and apparatus for constructing a DHT-based global namespace |
CN103455531B (zh) | Parallel indexing method supporting real-time biased queries over high-dimensional data |
CN105357247B (zh) | Range lookup method for multi-dimensional attribute cloud resources based on a hierarchical cloud peer-to-peer network |
Choi et al. | Dynamic hybrid replication effectively combining tree and grid topology |
Kumar et al. | M-Grid: a distributed framework for multidimensional indexing and querying of location based data |
JP7202558B1 (ja) | Method and system for accessing digital objects in a human-cyber-physical fusion environment |
JP2008234563A (ja) | Overlay management device, overlay management system, overlay management method, and overlay management program |
Aebeloe et al. | Decentralized indexing over a network of RDF peers |
Hassanzadeh-Nazarabadi et al. | Laras: Locality aware replication algorithm for the skip graph |
US20060209717A1 (en) | Distributed storing of network position information for nodes |
Qi et al. | A balanced strategy to improve data invulnerability in structured P2P system |
JP6182861B2 (ja) | Information processing device, information processing terminal, information search program, and information search method |
Tran | Data storage for social networks: a socially aware approach |
Gao et al. | Indexing multi-dimensional data in modular data centers |
Antoine et al. | Dealing with skewed data in structured overlays using variable hash functions |
CN117440003A (zh) | Decentralized distributed storage method and system |
Malkov et al. | An overlay network for distributed exact and range search in one-dimensional space |
KR20240041205A (ko) | Key-value database system using decentralized, addressable P2P storage as a file system, and method and computer program therefor |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13790415; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 2013790415; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 14400056; Country of ref document: US |
ENP | Entry into the national phase | Ref document number: 2014515468; Country of ref document: JP; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |