WO2022091204A1 - Data analysis processing device, data analysis processing method, and program - Google Patents

Info

Publication number
WO2022091204A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
multidimensional
range
storage area
cube
Prior art date
Application number
PCT/JP2020/040213
Other languages
French (fr)
Japanese (ja)
Inventor
哲 八木
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to JP2022558636A (patent JP7464142B2)
Priority to PCT/JP2020/040213 (patent WO2022091204A1)
Publication of WO2022091204A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models

Definitions

  • One aspect of the present invention relates to a data analysis processing apparatus, a data analysis processing method, and a program.
  • Real-world events change temporally, spatially, or both. In other words, an event arises, disappears, or undergoes a state transition.
  • the data that embodies the event can be mapped to a multidimensional cube, as it is called in data analysis technology.
  • the data analysis processing device executes an online analytical processing (OLAP) operation on the multidimensional cube to analyze the data.
  • the data analysis processing apparatus uses, for example, a method as disclosed in Non-Patent Document 1.
  • the data analysis processing device executes an OLAP operation on a certain multidimensional cube
  • the argument instructed by the client is used as an argument of the OLAP operation.
  • The data analysis processing device can use a relational database to execute OLAP operations. Therefore, when it performs an OLAP operation on a certain multidimensional cube and uses the data constituting another multidimensional cube as an argument of the OLAP operation, it searches / operates the data constituting the certain multidimensional cube using the data constituting the other multidimensional cube as a key.
  • For this search / operation, the means for speeding up the relational database can be used. For example, a speed-up means as disclosed in Non-Patent Document 2 can be used.
  • Up to two items of the data of each dimension / the data representing each characteristic constituting the multidimensional cube are classified by value range based on one of a list of one-dimensional value ranges, a list of names, or a hash function common among the multidimensional cubes, and are stored and managed in the storage area corresponding to the single value range to which the data belongs.
  • The range of search / operation is thereby limited to the corresponding storage area, and when a plurality of searches / operations are executed at the same time, conflicts in the storage area to be searched / operated are further avoided.
  • The means can be used only in a limited range. That is, the method is applicable when each of the data of each dimension / the data representing each characteristic constituting the multidimensional cube is one-dimensional data, but it cannot be applied when each of these data is multidimensional data. Further, when the data classified by value range belongs to a plurality of value ranges, it is not possible to avoid conflicts in the storage area to be searched / operated and thereby promote the speedup. Specifically, the following applies when the conventional data analysis processing device performs an OLAP operation on a certain multidimensional cube and uses the data constituting another multidimensional cube as an argument of the OLAP operation.
  • The means for speeding up the relational database can be used.
  • The range in which the speedup is possible, however, was limited.
  • The speedup applies when the data is classified by value range based on one of a list of one-dimensional value ranges, a list of names, or a hash function common among the multidimensional cubes, and the classified data belongs to a single value range.
  • In that case, the range to be searched / operated is limited to the storage areas corresponding to the same value range of both multidimensional cubes.
  • When multiple searches / operations are performed, the speed can be increased further by avoiding conflicts in the storage area to be searched / operated.
  • The data cannot be classified by a multidimensional value range common among the multidimensional cubes, and when the data classified by value range belongs to a plurality of value ranges, it cannot be stored and managed in duplicate in the storage area corresponding to each value range.
  • the present invention has been made by paying attention to the above circumstances, and is intended to provide a technique capable of executing OLAP operations on a multidimensional cube at high speed.
  • the data analysis processing apparatus includes a multidimensional database, an OLAP operation execution unit, and a multidimensional database management unit.
  • the multidimensional database stores data embodying a real-world event in a multidimensional cube constructed for each subject in association with the identifier of the event.
  • The OLAP operation execution unit executes an OLAP (Online Analytical Processing) operation on a multidimensional cube in response to a request from a client. Further, when the OLAP operation execution unit executes an OLAP operation on a certain multidimensional cube, it uses, as the argument of the OLAP operation, at least one of the arguments instructed by the client or the data constituting another multidimensional cube.
  • The multidimensional database management unit manages time-dimensional data, spatial-dimensional data, multiple types of unique-dimensional data, and data representing multiple types of characteristics in a multidimensional cube. If each of the data constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies the multidimensional data by a multidimensional value range common among the multidimensional cubes. More specifically, if each of the data of each dimension / the data representing each characteristic constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies it by the multidimensional value range common among the multidimensional cubes. When the data classified by value range belongs to a single value range, the multidimensional database management unit stores and manages the data in the storage area corresponding to that value range.
  • When the data classified by value range belongs to multiple value ranges, the multidimensional database management unit stores and manages the data entity or a reference to the data in the storage area corresponding to each value range. In addition, when searching / operating the data constituting a multidimensional cube using the data constituting another multidimensional cube as a key, the multidimensional database management unit uses the value range used for classification as an index. When executing a single search / operation, it limits the range to be searched / operated to the storage area corresponding to the same value range of both multidimensional cubes and the storage areas corresponding to value ranges near that value range. When multiple searches / operations are executed in parallel, it further avoids conflicts in the storage area to be searched / operated.
  • FIG. 1 is a functional block diagram showing an example of a data analysis processing apparatus according to the present invention.
  • FIG. 2 is a diagram for explaining a data storage state in the multidimensional database 16.
  • FIG. 3 is a diagram showing an example of value ranges wide enough to contain the widest data or the main data.
  • FIG. 4 is a diagram showing an example of a storage area corresponding to a hierarchy of a range in which a higher range includes a lower adjacent range.
  • FIG. 5 is a sequence diagram for explaining an example of the operation of the data analysis processing device 10.
  • FIG. 6 is a flowchart showing an example of the processing procedure of the multidimensional database management unit 15.
  • FIG. 7 is a diagram for explaining an example of processing for limiting the search / operation range in the storage area by the multidimensional database management unit 15.
  • FIG. 8 is a diagram for explaining another example of the process of limiting the search / operation range in the storage area by the multidimensional database management unit 15.
  • FIG. 9 is a diagram for explaining an example of an operation of avoiding a conflict in a storage area searched / operated by the multidimensional database management unit 15.
  • FIG. 10 is a diagram for explaining another example of the operation of avoiding the conflict of the storage area searched / operated by the multidimensional database management unit 15.
  • FIG. 11 is a diagram for explaining an example of a process in which the multidimensional database management unit 15 selects a hierarchy of a range.
  • FIG. 12 is a schematic diagram for explaining an example of an operation of suppressing redundant processing when a range corresponding to a plurality of storage areas is selected.
  • FIG. 13 is a diagram showing an example of tabular data representing the situation shown in FIG.
  • FIG. 14 is a block diagram showing an example of the hardware configuration of the data analysis processing apparatus according to the present invention.
  • FIG. 1 is a functional block diagram showing an example of a data analysis processing apparatus according to the present invention.
  • the data analysis processing device 10 includes an OLAP operation execution unit 11, a multidimensional database management unit 15, and a multidimensional database 16.
  • the multidimensional database 16 stores data embodying an event in the real world in a multidimensional cube in association with an event identifier for identifying an event that is an information source of the data.
  • Multidimensional cubes are constructed by subject.
  • the accumulated data includes time-dimensional data, spatial-dimensional data, a plurality of types of unique-dimensional data, and data representing a plurality of types of characteristics.
  • There are multiple types of subject-dependent data in the unique dimensions.
  • The characteristic data is identified by the time-dimensional, spatial-dimensional, and unique-dimensional data.
  • When each of the data of each dimension / the data representing each characteristic constituting the multidimensional cube is multidimensional data, the multidimensional database 16 classifies the multidimensional data by a multidimensional value range common among the multidimensional cubes. Then, when the data classified by value range belongs to a single value range, the multidimensional database 16 stores the data in the storage area corresponding to that value range. Further, when the data classified by value range belongs to a plurality of value ranges, the multidimensional database 16 stores the data entity or a reference to it in duplicate in the storage area corresponding to each value range.
  • FIG. 2 is a diagram for explaining the data accumulation state in the multidimensional database 16.
  • data a to c which are two-dimensional data representing features and the like
  • value ranges 1 to 4 which are two-dimensional value ranges representing areas and the like
  • Data a to c are classified into range 1, data b is also classified into range 2, and data c is also classified into range 3.
  • The data a belongs to range 1.
  • The data b belongs to ranges 1 and 2.
  • The data c belongs to ranges 1 and 3.
  • The main body of the data entity is stored in the storage area corresponding to the range with which the data overlaps most widely, and a duplicate of the entity or a reference to the main body is stored in the storage areas corresponding to the other ranges.
  • the reference is, for example, the address of the data stored in the storage.
  • The body of an entity accumulated in a storage area can be distinguished from a duplicate of the entity or a reference to the body of the entity by, for example, partitioning the storage area, marking the stored data, or creating an index.
  • A duplicate of the entity and a reference to the body of the entity accumulated in a storage area can be changed, arbitrarily or according to a criterion, from a duplicate to a reference or from a reference to a duplicate.
  • The range is set to, for example, a size that can contain the widest data or a size that can contain the main data. By doing so, the number of ranges to which the data belongs can be limited to the number of adjacent ranges at most.
  • the multidimensional database 16 classifies the multidimensional data in the multidimensional range, and when the data classified in the range belongs to a single range, the multidimensional database 16 stores the data in the storage area corresponding to the range.
  • the multidimensional database 16 duplicately stores the data entity or the reference in the storage area corresponding to each range.
  • * represents the substance (main body) of the data
  • ** represents the duplication of the substance of the data / the reference to the body of the substance.
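The storage rule described for FIG. 2 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the 2x2 grid of unit-square value ranges, the box coordinates for data a to c, and the names `VALUE_RANGES`, `overlap_area`, and `store` are all assumptions made for the example. The body is placed in the value range with the widest overlap and references are placed in the other overlapping ranges.

```python
from collections import defaultdict

# Hypothetical layout: four unit-square value ranges numbered 1..4.
# Each box is (xmin, ymin, xmax, ymax). The coordinates are assumptions.
VALUE_RANGES = {
    1: (0.0, 0.0, 1.0, 1.0),
    2: (1.0, 0.0, 2.0, 1.0),
    3: (0.0, 1.0, 1.0, 2.0),
    4: (1.0, 1.0, 2.0, 2.0),
}

def overlap_area(a, b):
    """Area of the intersection of two axis-aligned boxes (0.0 if disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def store(storage, data_id, box):
    """Store the body in the widest-overlap range and references elsewhere."""
    overlaps = {rid: overlap_area(box, rbox) for rid, rbox in VALUE_RANGES.items()}
    overlaps = {rid: area for rid, area in overlaps.items() if area > 0}
    if not overlaps:
        raise ValueError(f"{data_id} lies outside every value range")
    main = max(overlaps, key=overlaps.get)
    for rid in overlaps:
        if rid == main:
            storage[rid][data_id] = ("body", box)   # "*" in FIG. 2
        else:
            storage[rid][data_id] = ("ref", main)   # "**" in FIG. 2
    return main

storage = defaultdict(dict)
store(storage, "a", (0.2, 0.2, 0.6, 0.6))  # inside range 1 only
store(storage, "b", (0.5, 0.3, 1.2, 0.7))  # spans ranges 1 and 2
store(storage, "c", (0.3, 0.6, 0.8, 1.2))  # spans ranges 1 and 3
```

With these assumed coordinates, the storage area for range 1 holds the bodies of a, b, and c, while the areas for ranges 2 and 3 hold references to b and c. Storing a duplicate instead of a reference would only change the tuple written in the `else` branch.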
  • FIG. 3 is a diagram showing an example of value ranges wide enough to contain the widest data or the main data.
  • The data, including the data already accumulated, is re-accumulated according to the new range.
  • a hierarchy of the range in which the upper range includes the lower adjacent range is constructed, and the hierarchy of the range to be used is selected according to the situation.
  • When a hierarchy level of the range that corresponds to a plurality of storage areas is selected, the multidimensional database 16 does not use the data stored in duplicate in the plurality of storage areas.
  • FIG. 4 is a diagram showing an example of a storage area corresponding to the hierarchy of the range in which the upper range includes the lower adjacent range.
  • the OLAP operation execution unit 11 executes an OLAP operation on multidimensional data according to the OLAP operation received from the client 20 and the arguments. That is, the OLAP operation execution unit 11 instructs the multidimensional database management unit 15 to perform an OLAP operation on the multidimensional data. Further, when the OLAP operation execution unit 11 receives the result of the instructed operation from the multidimensional database management unit 15, the OLAP operation execution unit 11 transmits the operation result to the client 20.
  • In response to the instruction of the OLAP operation execution unit 11, the multidimensional database management unit 15 refers, as index information, to the information on the value ranges used for classifying the data of each dimension / the data representing each characteristic constituting the multidimensional cube, and specifies the storage area to be searched / operated based on the referenced index information. Further, the multidimensional database management unit 15 searches / operates the data constituting the multidimensional cube in parallel, with the value range corresponding to a storage area as the processing unit. Then, when the search / operation of all the storage areas to be searched / operated is completed, the multidimensional database management unit 15 aggregates the search / operation results and returns the operation result to the OLAP operation execution unit 11. Further, the multidimensional database management unit 15 manages the multidimensional database 16 so that data is accumulated and used in it as described above.
  • FIG. 5 is a sequence diagram for explaining an example of the operation of the data analysis processing device 10.
  • When the OLAP operation execution unit 11 receives an OLAP operation and an argument from the client 20, it instructs the multidimensional database management unit 15 to operate the multidimensional data accordingly.
  • In response to the operation instruction for the multidimensional data, the multidimensional database management unit 15 refers, as index information, to the information on the value ranges used for classifying the data of each dimension / the data representing each characteristic constituting the multidimensional cube, and specifies the storage area to be searched / operated based on the referenced index information. The multidimensional database management unit 15 searches / operates the data constituting the multidimensional cube in parallel, with the value range corresponding to a storage area as the processing unit (“PARALLELL” surrounded by the broken line in FIG. 5).
  • The multidimensional database management unit 15 repeats this until the search / operation of all the storage areas to be searched / operated is completed (“LOOP” surrounded by the broken line in FIG. 5). When the search / operation is completed, it aggregates the search / operation results and returns the operation result to the OLAP operation execution unit 11.
  • the OLAP operation execution unit 11 repeats the instruction to the multidimensional database management unit 15 according to the received OLAP operation and the contents of the argument ("LOOP" surrounded by the broken line in FIG. 5).
  • When the OLAP operation execution unit 11 acquires the final operation result corresponding to the OLAP operation and the contents of the argument, it returns the operation result of the OLAP operation to the client 20.
  • FIG. 6 is a flowchart showing an example of the processing procedure of the multidimensional database management unit 15.
  • the multidimensional database management unit 15 waits for the reception of the operation instruction of the multidimensional data from the OLAP operation execution unit 11 (step S11).
  • the multidimensional database management unit 15 refers to the information in the range used for classifying the data of each dimension constituting the multidimensional cube / the data representing each characteristic as index information (step S12).
  • The multidimensional database management unit 15 specifies the storage area to be searched / operated based on the referenced index information (step S13), and searches / operates the data constituting the multidimensional cube in parallel, with the value range corresponding to a storage area as the processing unit (steps S141 to S14N). This process is repeated until it is determined in step S15 that the search / operation of all the storage areas to be searched / operated has been completed.
  • When specifying the storage area, the multidimensional database management unit 15 limits the search / operation range to the storage area corresponding to the same range of both multidimensional cubes and the storage areas corresponding to ranges near that range. Further, when a plurality of searches / operations are executed in parallel, the multidimensional database management unit 15 further avoids a conflict in the storage area to be searched / operated. Then, the multidimensional database management unit 15 aggregates the search / operation results (step S16).
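The flow of steps S12 to S16 can be sketched as below. This is an illustrative outline only: the `STORAGE` dictionary, the record format, and the function names `search_area` and `execute` are hypothetical, and a thread pool is used merely as one convenient way to run one task per storage area.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index: value range id -> records held in the matching storage area.
STORAGE = {
    1: [("a", 10), ("b", 20)],
    2: [("b", 20)],  # "b" belongs to ranges 1 and 2, so it appears twice
    3: [("c", 30)],
}

def search_area(range_id, predicate):
    """Search one storage area (one processing unit, steps S141 to S14N)."""
    return [rec for rec in STORAGE[range_id] if predicate(rec)]

def execute(target_ranges, predicate, max_workers=4):
    # S12/S13: the value ranges act as the index; keep only areas that exist.
    areas = [r for r in target_ranges if r in STORAGE]
    # S14/S15: one task per storage area; the list is complete once all finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(lambda r: search_area(r, predicate), areas))
    # S16: aggregate, collapsing results found in more than one storage area.
    merged = {}
    for part in partials:
        for key, value in part:
            merged[key] = value
    return merged

result = execute([1, 2, 4], lambda rec: rec[1] >= 20)
```

Here range 4 has no storage area and is skipped, and the record for "b", found in both areas 1 and 2, appears once in the aggregated result.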
  • When executing an OLAP operation on a certain multidimensional cube in response to an operation instruction for the multidimensional data, the multidimensional database management unit 15 may use the data constituting another multidimensional cube as an argument of the OLAP operation.
  • In that case, the data constituting the certain multidimensional cube is searched / operated using the data constituting the other multidimensional cube as a key. That is, the multidimensional database management unit 15 executes a single search / operation using, as an index, the value range used for classifying the data of each dimension / the data representing each characteristic constituting the multidimensional cube.
  • In doing so, the multidimensional database management unit 15 limits the search / operation range to the storage area corresponding to the same range of both multidimensional cubes and the storage areas corresponding to ranges in the vicinity of that range. Further, when a plurality of searches / operations are executed in parallel, the multidimensional database management unit 15 further avoids a conflict in the storage area to be searched / operated.
  • FIG. 7 is a diagram for explaining an example of processing for limiting the search / operation range in the storage area by the multidimensional database management unit 15.
  • When the multidimensional database management unit 15 searches / operates the data constituting the multidimensional cube 1 using the data constituting the multidimensional cube 0 as a key, the data included in or overlapping the data that is classified into the value ranges 01, 02, and 04 and stored and managed in the corresponding storage areas 01, 02, and 04 is classified into the value ranges 11, 12, and 14, respectively, and stored and managed in the corresponding storage areas 11, 12, and 14.
  • FIG. 8 is a diagram for explaining another example of the process of limiting the search / operation range in the storage area by the multidimensional database management unit 15.
  • When the multidimensional database management unit 15 searches / operates the data constituting the multidimensional cube 1 using the data constituting the multidimensional cube 0 as a key, the data in the vicinity, represented by the dotted circle around the center of gravity of the data classified into the range 01 and stored and managed in the storage area corresponding to the range 01, is classified into the range 11 and into the ranges 12, 14, and 15, which lie within the radius of the dotted circle from the range 11.
  • The range to be searched / operated can therefore be limited to the pairs of the area 01 and the areas 11, 12, 14, and 15, which are the storage areas corresponding to those ranges. The same applies to the data classified into the other ranges and stored and managed in the storage areas corresponding to those ranges.
  • In this way, when the multidimensional database management unit 15 specifies the storage area to be searched / operated based on the referenced index information, the range to be searched / operated is limited to the storage area corresponding to the same range of both multidimensional cubes and the storage areas corresponding to ranges in the vicinity of that range.
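The limitation of the search / operation range can be illustrated with a small helper. The sketch assumes a 3x3 grid of value ranges numbered 1 to 9 in row-major order, which is inferred from the range numbers quoted in the text and is not stated there, and it approximates the dotted circle by a reach counted in grid cells. The names `candidate_ranges` and `search_pairs` are hypothetical.

```python
def candidate_ranges(range_no, cols=3, rows=3, reach=1):
    """Return the same-numbered range plus ranges within `reach` cells of it."""
    row, col = divmod(range_no - 1, cols)
    found = []
    for r in range(max(0, row - reach), min(rows, row + reach + 1)):
        for c in range(max(0, col - reach), min(cols, col + reach + 1)):
            found.append(r * cols + c + 1)
    return found

def search_pairs(key_range, reach=1):
    """Pair a cube-0 storage area with the cube-1 storage areas to search."""
    return [(f"0{key_range}", f"1{n}") for n in candidate_ranges(key_range, reach=reach)]

same_only = search_pairs(1, reach=0)   # same value range only, as in FIG. 7
with_nearby = search_pairs(1)          # same and nearby value ranges, as in FIG. 8
```

Under these assumptions, the key range 01 is paired only with area 11 when the reach is zero, and with areas 11, 12, 14, and 15 when neighbouring ranges are included.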
  • FIG. 9 is a diagram for explaining an example of an operation of avoiding a conflict in the storage area to be searched / operated by the multidimensional database management unit 15. This will be described in association with the schematic diagram of FIG. 7. As shown in FIG. 9, when the data constituting the multidimensional cube 1 is searched / operated using the data constituting the multidimensional cube 0 as a key, a conflict in the storage area to be searched / operated can be avoided by searching / operating the data constituting the multidimensional cube in parallel, with the set of areas 01 and 11, the set of areas 02 and 12, and the set of areas 04 and 14, which are the storage areas corresponding to the same value range of both multidimensional cubes, as units.
  • FIG. 10 is a diagram for explaining another example of the operation of avoiding the conflict of the storage area searched / operated by the multidimensional database management unit 15. This will be described in association with the schematic diagram of FIG.
  • As in FIG., the data in the vicinity, represented by the dotted circle around the center of gravity of the data classified into the value range 01 and stored and managed in the storage area corresponding to the value range 01, is classified into the value range 11 and into the value ranges 12, 14, and 15, which lie within the radius of the dotted circle from the value range 11.
  • Likewise, the data in the vicinity of the data classified into the value range 04 is classified into the value range 14 and into the value ranges 11, 12, 15, 17, and 18, which lie within the radius of the dotted circle from the value range 14, and is stored and managed in the corresponding storage areas 11, 12, 15, 17, and 18.
  • The set of areas 01 and 15, 14, 12, 11 and the set of areas 04 and 18, 17, 15, 14 are the storage areas corresponding to the same value range of both multidimensional cubes and the storage areas in the vicinity of that value range. When the data constituting the multidimensional cube is searched / operated in parallel with these sets as units, the region 15 for the data in the region 01.
  • the reference destination to the main body of the data entity and the main body of the relevant data entity are in the same storage area. Therefore, when the main body of any of the data stored in the storage area is searched / operated, the conflict of the storage area to be searched / operated cannot be avoided. On the other hand, when the reference to the main body of any of the stored data is searched / operated in the storage area, the conflict of the storage area to be searched / operated can be avoided. Further, if the reference to the main body of the entity is accumulated instead of accumulating the copy of the entity, the required amount of the storage area can be suppressed.
  • In this way, when the multidimensional database management unit 15 searches / operates the data constituting the multidimensional cube in parallel based on the referenced index information, with the range corresponding to a storage area as the processing unit, it further avoids conflicts in the storage area to be searched / operated.
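Whether a set of parallel processing units can conflict may be checked by looking for storage areas that appear in more than one unit. The sketch below is only an illustration of that check, using the area sets listed in the text for the FIG. 9 and FIG. 10 examples; the function name `conflicting_areas` and the dictionary layout are assumptions, and the text does not describe how the device itself detects conflicts.

```python
from itertools import combinations

def conflicting_areas(units):
    """Map each pair of processing units to the storage areas they share."""
    conflicts = {}
    for (name_a, areas_a), (name_b, areas_b) in combinations(units.items(), 2):
        shared = set(areas_a) & set(areas_b)
        if shared:
            conflicts[(name_a, name_b)] = sorted(shared)
    return conflicts

# Units built from the same value range only (FIG. 9 example).
same_range_units = {"01": ["01", "11"], "02": ["02", "12"], "04": ["04", "14"]}

# Units that also include nearby value ranges (area sets as listed for FIG. 10).
neighbour_units = {
    "01": ["01", "11", "12", "14", "15"],
    "04": ["04", "14", "15", "17", "18"],
}

no_conflicts = conflicting_areas(same_range_units)
shared_areas = conflicting_areas(neighbour_units)
```

The same-range units share no storage area, so they can run in parallel without conflict. The neighbour-based units share areas 14 and 15, which is the situation where holding references rather than bodies in the shared areas matters.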
  • the storage area to which the data does not belong is excluded from the processing target in the first place.
  • When data belongs to multiple ranges, the entity or reference is stored and managed in duplicate in the storage area corresponding to each range, so the same data may be searched / operated in multiple sets of storage areas. If the same result is obtained as a consequence, the duplicated results are aggregated.
  • FIG. 11 is a diagram for explaining an example of a process in which the multidimensional database management unit 15 selects a range hierarchy.
  • When the multidimensional database management unit 15 identifies the storage area to be searched / operated based on the referenced index information and searches / operates the data constituting the multidimensional cube in parallel, with the storage area corresponding to a value range as a unit, it builds a hierarchy of the ranges used for classifying the data of each dimension / the data representing each characteristic constituting the multidimensional cube, in which an upper range includes the lower adjacent ranges, and selects the hierarchy level to be used as the processing unit of search / operation according to the situation.
  • When the selection is made according to the values of the stored data, the level of the range that can contain the widest data or the level of the range that can contain the main data is selected, so that the number of ranges to which the data belongs is limited to the number of adjacent ranges at most.
  • The range that can contain the widest data and the range that can contain the main data are obtained by identifying, each time data is accumulated, the level of the range that can contain that data, and then calculating the level of the maximum range and the level of the most frequent range. For example, since the data a and b cannot be contained in a level 2 range but can be contained in a level 1 range, the level 1 range is selected.
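The data-driven selection of a hierarchy level can be sketched as follows. The geometry is an assumption made for illustration: a two-dimensional extent of 8 x 8 units split into 1, 2, and 8 cells per axis, which reproduces the 1, 4, and 64 ranges quoted in the text for levels 0, 1, and 2. The function names and the example data sizes are hypothetical.

```python
from collections import Counter

# Hypothetical 2-D extent and cells per axis at each level (1, 4, 64 ranges).
EXTENT = 8.0
CELLS_PER_AXIS = {0: 1, 1: 2, 2: 8}

def fitting_level(width, height):
    """Deepest level whose value range is large enough to contain the data."""
    fits = [lv for lv, n in CELLS_PER_AXIS.items() if max(width, height) <= EXTENT / n]
    return max(fits) if fits else 0

def level_for_widest(sizes):
    """Level of the maximum range: every datum fits in a range of this level."""
    return min(fitting_level(w, h) for w, h in sizes)

def level_for_main(sizes):
    """Level of the most frequent range: the bulk of the data fits here."""
    counts = Counter(fitting_level(w, h) for w, h in sizes)
    return counts.most_common(1)[0][0]

data_ab = [(1.5, 1.0), (2.0, 2.0)]              # too wide for level 2 ranges
mixed = [(0.5, 0.5), (0.5, 0.5), (3.0, 3.0)]    # mostly small, one wide datum
```

With these assumed sizes, data a and b lead to level 1, matching the example in the text. For the mixed set, the widest-data criterion gives level 1 while the main-data criterion gives level 2.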
  • When the selection is made according to the degree of parallelism that can be executed, it is based on the number of available CPU cores and the status of other processing, so that the processing capacity is maximized. For example, if the level 2 ranges are selected, the 64 storage areas correspond to 64 ranges, and 64 is the upper limit of the degree of parallelism that can be executed. If the level 1 ranges are selected, the 64 storage areas are aggregated into four, corresponding to four ranges, and 4 is the upper limit of the degree of parallelism that can be executed. If the level 0 range is selected, the 64 storage areas are aggregated into one, corresponding to one range, and 1 is the upper limit of the degree of parallelism that can be executed.
  • The degree of parallelism that can be executed is larger than the number of CPU cores when I / O waits are taken into consideration, and smaller than the number of CPU cores when the execution of other processes is taken into consideration. Therefore, the degree of parallelism that can be executed is calculated based on information set in advance and information acquired from the OS (Operating System). For example, if the number of CPU cores is 4, level 1, whose number of ranges is closest to the number of CPU cores, is selected.
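The parallelism-driven selection can be sketched in a few lines. The range counts per level come from the example in the text; the parameters `io_wait_factor` and `busy_cores` are hypothetical stand-ins for the "information set in advance" and the information obtained from the OS, and the formula combining them is an assumption.

```python
import os

# Number of value ranges per hierarchy level, as in the example in the text.
RANGES_PER_LEVEL = {0: 1, 1: 4, 2: 64}

def executable_parallelism(cpu_cores=None, io_wait_factor=1.0, busy_cores=0):
    """Estimate the usable degree of parallelism (assumed formula)."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return max(1, round(cores * io_wait_factor) - busy_cores)

def select_level(parallelism):
    """Pick the level whose number of ranges is closest to the parallelism."""
    return min(
        RANGES_PER_LEVEL,
        key=lambda lv: (abs(RANGES_PER_LEVEL[lv] - parallelism), lv),
    )

level_for_four_cores = select_level(executable_parallelism(cpu_cores=4))
```

For four CPU cores with the default factors, level 1 is chosen, as in the example in the text. Ties are broken toward the coarser level, which is a design choice of this sketch rather than something the text specifies.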
  • FIGS. 12 and 13 are diagrams for explaining an example of processing for suppressing redundant processing by the multidimensional database management unit 15.
  • Consider the case where the multidimensional database management unit 15 selects a hierarchy level corresponding to a plurality of storage areas as the hierarchy level used as the search / operation processing unit.
  • In this case, redundant processing can be suppressed by not using the data that is stored and managed in duplicate in the plurality of storage areas.
  • When data belongs to multiple ranges, the entity or reference is stored and managed in duplicate in the storage area corresponding to each range, so the same data may be searched / operated in multiple sets of storage areas. If the same result is obtained as a consequence, the duplicated results need to be aggregated.
  • the multidimensional database management unit 15 suppresses this redundant processing.
  • In the illustrated example, for the level 2 ranges included in a level 1 range, the data a is classified into the range 2 of level 2 and stored and managed in the corresponding storage area 2, and the data b is classified into the ranges 2, 3, 6, and 7 of level 2 and stored and managed in the corresponding storage areas 2, 3, 6, and 7. It is also shown that the ranges 1 to 16 of level 2 are included in the range 3 of level 1, and the ranges 1 to 4 of level 1 are included in the range 1 of level 0.
  • FIG. 13 is an example of tabular data representing the situation shown in FIG. 12. As in FIG. 11, when the level 1 ranges are selected as the level of the hierarchy used as the processing unit for search/operation, the multidimensional database management unit 15 reads and processes data in order from each storage area corresponding to a level 2 range included in the level 1 range. For example, when data a is read from the storage area corresponding to range 2 of level 2, searching the tabular data of FIG. 13 identifies that the data is stored only in the storage area corresponding to range 2 of level 2. Therefore, in order to suppress redundant processing, the multidimensional database management unit 15 searches/operates on the storage area corresponding to range 2 of level 2 of the paired multidimensional cube.
  • the multidimensional database management unit 15 searches/operates on the storage areas corresponding to ranges 2, 3, 6, and 7 of level 2 of the paired multidimensional cube. Further, in order to suppress redundant processing, the multidimensional database management unit 15 marks data b as processed in the tabular data of FIG. 13 and does not read data b from the storage areas corresponding to ranges 3, 6, and 7 of level 2.
  • if the main body of an entity, copies of the entity, and references to the main body have been accumulated in the storage areas corresponding to the hierarchy, the copies and the references can be deleted and the deletion reflected in the tabular data of FIG. 13. After the deletion, the storage areas and the tabular data can also be returned to their state before the deletion.
  • FIG. 14 is a block diagram showing an example of the hardware configuration of the data analysis processing apparatus according to the present invention.
  • the data analysis processing device 10 includes a processor 12, a storage 200 for storing a multidimensional database 16, an interface unit 13, and a memory 14. That is, the data analysis processing device 10 is a computer, and is realized as, for example, a personal computer, a server computer, or the like.
  • the interface unit 13 is connected to the network 100 and receives access from the client 20 connected to the network 100.
  • the storage 200 is a non-volatile storage medium (block device) such as an HDD (Hard Disk Drive) or SSD (Solid State Drive).
  • the storage 200 stores a multidimensional database 16 in a predetermined storage area in addition to basic programs such as an OS (Operating System) and a device driver, and a program for realizing the functions of the data analysis processing device 10.
  • the memory 14 in FIG. 14 is, for example, a RAM (Random Access Memory), and stores a program 14a loaded from the storage 200 and various data 14b.
  • the processor 12 in FIG. 14 is an arithmetic unit such as a Central Processing Unit (CPU) or a Micro Processing Unit (MPU), and its function is realized by a program loaded in the memory 14.
  • the processor 12 includes an OLAP operation execution unit 11 and a multidimensional database management unit 15 as processing functions related to the embodiment.
  • the OLAP operation execution unit 11, the multidimensional database management unit 15, and the time-series alignment unit 17 are processing functions realized by the processor 12 executing the instructions included in the program 14a. That is, the data analysis processing device 10 of the present invention can also be realized by a computer and a program. In addition to recording and distributing the program on a recording medium such as an optical medium, it is also possible to provide the program through a network.
  • the OLAP operation execution unit 11 and the multidimensional database management unit 15 may be realized in various other forms, including integrated circuits such as an ASIC (Application Specific Integrated Circuit) and an FPGA (Field-Programmable Gate Array), in place of or in addition to the processor 12.
  • the processor 12 can receive the OLAP operation and the argument from the client 20 via the interface unit 13, and can send the operation result to the client 20.
  • the multidimensional database management unit 15 classifies the data by ranges shared among the multidimensional cubes. Further, when the data classified by range belongs to a single range, the multidimensional database management unit 15 stores the data in the storage area corresponding to that range; when the data belongs to a plurality of ranges, it stores the entity or a reference in duplicate in the storage area corresponding to each range.
  • the range information used to classify the data to be operated that constitutes the multidimensional cube is used as index information.
  • the scope of search/operation is limited to the storage area corresponding to the same range of both multidimensional cubes and the storage areas corresponding to ranges near that range. Further, when a plurality of searches/operations are executed at the same time, conflicts over the storage areas to be searched/operated on are also avoided.
  • the multidimensional database management unit 15 uses data constituting another multidimensional cube as an argument of the OLAP operation.
  • the multidimensional database management unit 15 is the data of each dimension constituting the multidimensional cube.
  • the hierarchy of the value range in which the upper value range includes the lower adjacent value range is constructed.
  • the multidimensional database management unit 15 selects the level of the range hierarchy to serve as the processing unit for search/operation according to conditions such as the values of the accumulated data and the executable degree of parallelism. Further, when it selects a level whose ranges correspond to a plurality of storage areas, the multidimensional database management unit 15 does not use the data stored and managed in duplicate in those storage areas.
  • According to the embodiment, it is possible to speed up the process of searching/operating on the data constituting another multidimensional cube by using the data constituting the multidimensional cube as a key. That is, according to the embodiment, it becomes possible to provide a data analysis processing device, a data analysis processing method, and a program capable of executing OLAP operations on a multidimensional cube at high speed. More specifically, according to the embodiment, when data constituting another multidimensional cube is used as an argument of an OLAP operation, the data constituting one multidimensional cube and the data constituting another multidimensional cube are used.
  • the present invention is not limited to the above-described embodiment as it is, and at the implementation stage, the components can be modified and embodied within a range that does not deviate from the gist thereof.
  • various inventions can be formed by an appropriate combination of the plurality of components disclosed in the above-described embodiment. For example, some components may be removed from all the components shown in the embodiments. In addition, components from different embodiments may be combined as appropriate.
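The hierarchy-level selection described in the definitions above (64, 4, or 1 ranges at levels 2, 1, and 0) can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function names, the `io_wait_factor` and `other_load_factor` parameters, the multiplicative formula, and the tie-breaking rule are all hypothetical.

```python
import os

# Number of ranges at each hierarchy level, taken from the 64/4/1 example in the text.
RANGES_PER_LEVEL = {0: 1, 1: 4, 2: 64}


def executable_parallelism(cpu_cores, io_wait_factor=1.0, other_load_factor=1.0):
    """Estimate how many tasks can usefully run at once.

    Hypothetical formula: a factor above 1 models I/O waits, a factor below 1
    models CPU time taken by other processes.
    """
    return max(1, round(cpu_cores * io_wait_factor * other_load_factor))


def select_level(parallelism, ranges_per_level=RANGES_PER_LEVEL):
    """Pick the level whose range count is closest to the target parallelism.

    Ties go to the coarser (lower-numbered) level, which is an assumption.
    """
    return min(ranges_per_level,
               key=lambda level: (abs(ranges_per_level[level] - parallelism), level))


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    print(select_level(executable_parallelism(cores)))
```

With four CPU cores and both factors left at 1.0, the sketch selects level 1, matching the example in the text.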

Abstract

A data analysis processing device according to one aspect of the present invention comprises: a multidimensional database; an OLAP operation execution unit; and a multidimensional database management unit. The multidimensional database stores data embodying a real-world event in a multidimensional cube constructed for each subject in association with the identifier of the event. The OLAP operation execution unit executes an Online Analytical Processing (OLAP) operation on a multidimensional cube in response to a request from a client. The multidimensional database management unit manages, in the multidimensional cube, time-dimensional data, space-dimensional data, multiple types of unique dimensional data, and data representing multiple types of characteristics. When each of the data constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies the multidimensional data in a multidimensional value range common among the multidimensional cubes.

Description

Data analysis processing device, data analysis processing method, and program
One aspect of the present invention relates to a data analysis processing device, a data analysis processing method, and a program.
Real-world events change over time, in space, or both. That is, events come into being, disappear, or change state. The data that embodies an event can be mapped to what data analysis technology calls a multidimensional cube. A data analysis processing device executes Online Analytical Processing (OLAP) operations on the multidimensional cube to analyze the data. The data analysis processing device uses, for example, a method such as that disclosed in Non-Patent Document 1.
When the data analysis processing device executes an OLAP operation on a multidimensional cube, it uses the arguments specified by the client as the arguments of the OLAP operation. The data analysis processing device can also use a relational database to execute OLAP operations. Therefore, when newly executing an OLAP operation on one multidimensional cube and attempting to use the data constituting another multidimensional cube as an argument of the OLAP operation, the device can use the means for speeding up a relational database to search/operate on the data constituting the one multidimensional cube with the data constituting the other multidimensional cube as a key. For example, a speed-up means such as that disclosed in Non-Patent Document 2 can be used.
Up to two items among the data of each dimension and the data representing each characteristic that constitute a multidimensional cube are classified by ranges based on a list of one-dimensional value ranges, a list of names, or a hash function common among the multidimensional cubes, and are stored and managed in the storage area corresponding to the only range to which the data belongs.
The ranges used to classify the data of each dimension and the data representing each characteristic that constitute the multidimensional cube are used as an index. When a single search/operation is executed, the scope of the search/operation is thereby limited to the storage areas corresponding to the same range of both multidimensional cubes; when a plurality of searches/operations are executed simultaneously, conflicts over the storage areas to be searched/operated on are additionally avoided.
In conventional data analysis processing devices, even where means for speeding up a relational database were available, they could be used only within a limited scope. That is, a method applicable when each of the data of each dimension and the data representing each characteristic constituting a multidimensional cube is one-dimensional data cannot be applied when each of those data is multidimensional data. Moreover, when data classified by range belongs to a plurality of ranges, it is not possible to avoid conflicts over the storage areas to be searched/operated on and thereby increase speed.
Specifically, when newly executing an OLAP operation on one multidimensional cube and attempting to use the data constituting another multidimensional cube as an argument of the OLAP operation, a conventional data analysis processing device can use the means for speeding up a relational database to search/operate on the data constituting the one multidimensional cube with the data constituting the other multidimensional cube as a key. However, the scope in which speed-up was possible was limited.
For example, when each of the data of each dimension and the data representing each characteristic constituting a multidimensional cube is one-dimensional data, a conventional data analysis processing device could increase speed as follows. It classifies up to two items of the data by ranges based on a list of one-dimensional value ranges, a list of names, or a hash function common among the multidimensional cubes. When the classified data belongs to a single range, it stores and manages the data in the storage area corresponding to the only range to which the data belongs. When executing a single search/operation, it limits the scope of the search/operation to the storage areas corresponding to the same range of both multidimensional cubes, and when executing a plurality of searches/operations simultaneously, it additionally avoids conflicts over the storage areas to be searched/operated on.
However, when each of the data of each dimension and the data representing each characteristic constituting a multidimensional cube is multidimensional data, the data cannot be classified by multidimensional ranges common among the multidimensional cubes, and when data classified by range belongs to a plurality of ranges, it cannot be stored and managed in duplicate in the storage area corresponding to each range. Consequently, in those cases it was not possible to increase speed by limiting the scope of a single search/operation or, when a plurality of searches/operations are executed simultaneously, by additionally avoiding conflicts over the storage areas to be searched/operated on.
The present invention has been made in view of the above circumstances, and is intended to provide a technique capable of executing OLAP operations on a multidimensional cube at high speed.
The data analysis processing device according to one aspect of the present invention includes a multidimensional database, an OLAP operation execution unit, and a multidimensional database management unit. The multidimensional database stores data embodying a real-world event in a multidimensional cube constructed for each subject, in association with the identifier of the event. The OLAP operation execution unit executes an OLAP (Online Analytical Processing) operation on a multidimensional cube in response to a request from a client.
Further, when executing an OLAP operation on a multidimensional cube, the OLAP operation execution unit uses, as arguments of the OLAP operation, at least one of the arguments specified by the client and the data constituting another multidimensional cube.
The multidimensional database management unit manages, in the multidimensional cube, time-dimension data, space-dimension data, multiple types of unique-dimension data, and data representing multiple types of characteristics. If each of the data constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies the multidimensional data by multidimensional ranges common among the multidimensional cubes.
More specifically, if each of the data of each dimension and the data representing each characteristic constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies it by multidimensional ranges common among the multidimensional cubes. When the data classified by range belongs to a single range, the multidimensional database management unit stores and manages the data in the storage area corresponding to that range. When the data classified by range belongs to a plurality of ranges, the multidimensional database management unit stores and manages the entity of the data, or a reference to the data, in duplicate in the storage area corresponding to each range.
In addition, when searching/operating on the data constituting one multidimensional cube with the data constituting another multidimensional cube as a key, the multidimensional database management unit uses the ranges used for classification as an index. When a single search/operation is executed, the scope of the search/operation is thereby limited to the storage areas corresponding to the same range of both multidimensional cubes and the storage areas corresponding to ranges near that range; when a plurality of searches/operations are executed in parallel, conflicts over the storage areas to be searched/operated on are additionally avoided.
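Limiting a search to the same range and its nearby ranges can be sketched as below. This is a minimal sketch under assumptions the text does not state: ranges are laid out on a two-dimensional grid addressed by `(row, column)`, "near" means within `radius` cells in each direction, and the function name `candidate_ranges` is hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column) of a range on an assumed 2-D grid


def candidate_ranges(key_cell: Cell, rows: int, cols: int, radius: int = 1) -> List[Cell]:
    """Return the range holding the key data plus its neighbours.

    Only the storage areas for these ranges would be searched in the other
    cube; every other storage area is skipped. Cells outside the grid are
    clipped away.
    """
    r0, c0 = key_cell
    return [(r, c)
            for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))]


# A key in the middle of an 8 x 8 grid touches 9 of the 64 storage areas.
middle = candidate_ranges((3, 3), rows=8, cols=8)
# A key in a corner touches only 4.
corner = candidate_ranges((0, 0), rows=8, cols=8)
```

Setting `radius=0` restricts the search to the same range only, which corresponds to the conventional single-range case described earlier.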
According to one aspect of the present invention, it is possible to provide a technique capable of executing OLAP operations on a multidimensional cube at high speed.
FIG. 1 is a functional block diagram showing an example of a data analysis processing device according to the present invention.
FIG. 2 is a diagram for explaining how data is stored in the multidimensional database 16.
FIG. 3 is a diagram showing an example of ranges wide enough to contain the widest data or the main data.
FIG. 4 is a diagram showing an example of a hierarchy of ranges, in which an upper range contains the adjacent lower ranges, and the corresponding storage areas.
FIG. 5 is a sequence diagram for explaining an example of the operation of the data analysis processing device 10.
FIG. 6 is a flowchart showing an example of the processing procedure of the multidimensional database management unit 15.
FIG. 7 is a diagram for explaining an example of processing by which the multidimensional database management unit 15 limits the scope of search/operation in the storage areas.
FIG. 8 is a diagram for explaining another example of processing by which the multidimensional database management unit 15 limits the scope of search/operation in the storage areas.
FIG. 9 is a diagram for explaining an example of an operation by which the multidimensional database management unit 15 avoids conflicts over the storage areas to be searched/operated on.
FIG. 10 is a diagram for explaining another example of an operation by which the multidimensional database management unit 15 avoids conflicts over the storage areas to be searched/operated on.
FIG. 11 is a diagram for explaining an example of processing by which the multidimensional database management unit 15 selects a level of the range hierarchy.
FIG. 12 is a schematic diagram for explaining an example of an operation that suppresses redundant processing when ranges corresponding to a plurality of storage areas are selected.
FIG. 13 is a diagram showing an example of tabular data representing the situation shown in FIG. 12.
FIG. 14 is a block diagram showing an example of the hardware configuration of the data analysis processing device according to the present invention.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(Configuration)
FIG. 1 is a functional block diagram showing an example of a data analysis processing device according to the present invention. The data analysis processing device 10 includes an OLAP operation execution unit 11, a multidimensional database management unit 15, and a multidimensional database 16.
The multidimensional database 16 stores data embodying a real-world event in a multidimensional cube, in association with an event identifier that identifies the event that is the source of the data. A multidimensional cube is constructed for each subject. The stored data includes time-dimension data, space-dimension data, multiple types of unique-dimension data, and data representing multiple types of characteristics. Unique-dimension data has multiple types that depend on the subject. Data representing a characteristic is identified by the time-dimension, space-dimension, and unique-dimension data. Data representing characteristics also has multiple types that depend on the subject.
When each of the data of each dimension and the data representing each characteristic constituting the multidimensional cube is multidimensional data, the multidimensional database 16 classifies the multidimensional data by multidimensional ranges common among the multidimensional cubes. When the data classified by range belongs to a single range, the multidimensional database 16 stores the data in the storage area corresponding to that range. When the data classified by range belongs to a plurality of ranges, the multidimensional database 16 stores the entity of the data, or a reference to it, in duplicate in the storage area corresponding to each range.
FIG. 2 is a diagram for explaining how data is stored in the multidimensional database 16. In FIG. 2, when data a to c, which are two-dimensional data representing features and the like, are classified into ranges 1 to 4, which are two-dimensional ranges representing areas and the like, data a to c are classified into range 1, data b into range 2, and data c into range 3. That is, data a belongs to range 1, data b belongs to ranges 1 and 2, and data c belongs to ranges 1 and 3.
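The classification in FIG. 2 can be reproduced with a small sketch. The coordinates below are invented for illustration, since the figure itself is not available here; only the resulting memberships (a in range 1, b in ranges 1 and 2, c in ranges 1 and 3) come from the text. The names `overlaps` and `classify` and the rectangle representation are likewise assumptions.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two rectangles share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def classify(data: Dict[str, Box], ranges: Dict[int, Box]) -> Dict[str, List[int]]:
    """Map each data item to the ids of every range it overlaps."""
    return {name: [rid for rid, rbox in sorted(ranges.items()) if overlaps(box, rbox)]
            for name, box in data.items()}


# Hypothetical layout: four 10 x 10 ranges arranged as a 2 x 2 grid.
RANGES = {1: (0, 0, 10, 10), 2: (10, 0, 20, 10),
          3: (0, 10, 10, 20), 4: (10, 10, 20, 20)}
# Hypothetical features: a lies inside range 1, b straddles ranges 1 and 2,
# c straddles ranges 1 and 3.
DATA = {"a": (2, 2, 4, 4), "b": (8, 3, 12, 5), "c": (3, 8, 5, 12)}

membership = classify(DATA, RANGES)
```

An item whose list has more than one entry is the case in which the entity or a reference must be stored in several storage areas.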
For data belonging to multiple ranges, for example, the main body of the data entity is stored in the storage area corresponding to the range with which the data overlaps most widely, and a copy of the entity, or a reference to the main body of the entity, is stored in the storage areas corresponding to the other ranges. A reference is, for example, the address of the data stored in the storage.
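One way to read the placement rule above is sketched below: the main body goes to the range with the largest overlap area, and the remaining ranges receive a reference. The data layout (a dictionary per storage area, a `("ref", range_id)` tuple standing in for an address), the tie-breaking rule, and the function names are assumptions made for illustration.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlap_area(a: Box, b: Box) -> float:
    """Area shared by two rectangles, or 0.0 if they do not overlap."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def place(name: str, box: Box, ranges: Dict[int, Box],
          storage: Dict[int, Dict[str, tuple]]) -> int:
    """Store the body in the widest-overlap range and references elsewhere."""
    hits = {rid: overlap_area(box, rbox) for rid, rbox in ranges.items()}
    hits = {rid: area for rid, area in hits.items() if area > 0}
    if not hits:
        raise ValueError(f"{name} lies outside every range")
    primary = max(sorted(hits), key=hits.get)  # lowest id wins ties (assumption)
    for rid in hits:
        storage[rid][name] = ("body", box) if rid == primary else ("ref", primary)
    return primary


# Hypothetical 2 x 2 grid of ranges and one feature straddling ranges 1 and 2.
RANGES = {1: (0, 0, 10, 10), 2: (10, 0, 20, 10),
          3: (0, 10, 10, 20), 4: (10, 10, 20, 20)}
storage = {rid: {} for rid in RANGES}
primary = place("b", (7, 3, 12, 5), RANGES, storage)
```

Here the feature overlaps range 1 by 6 units of area and range 2 by 4, so range 1 holds the body and range 2 holds a reference to it.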
The main body of an entity stored in a storage area can be distinguished from a copy of the entity or a reference to the main body by, for example, partitioning the storage area, marking the stored data, or creating an index. A copy of an entity and a reference to the main body of an entity stored in a storage area can be changed, arbitrarily or according to a criterion, from a copy to a reference or from a reference to a copy.
When a copy of a data entity is accessed, only the storage area that stores the copy is accessed. Therefore, even if the copy and the main body of the data entity are accessed at the same time, the accessed storage areas do not conflict.
When a reference to the main body of a data entity is accessed, the storage area that stores the main body is accessed via the storage area that stores the reference. Therefore, if the reference and the main body of the data entity are accessed at the same time, the accessed storage areas may conflict.
Here, the width of a range is set to, for example, a width that can contain the widest data or a width that can contain the main data. In this way, the number of ranges to which a piece of data belongs can be limited to at most the number of adjacent ranges.
In this way, the multidimensional database 16 classifies multidimensional data by multidimensional ranges and, when the data classified by range belongs to a single range, stores the data in the storage area corresponding to that range. When the data classified by range belongs to a plurality of ranges, the multidimensional database 16 stores the entity of the data, or a reference to it, in duplicate in the storage area corresponding to each range.
In FIG. 2, * represents the entity (main body) of the data, and ** represents a copy of the entity or a reference to the main body of the entity.
FIG. 3 is a diagram showing an example of ranges wide enough to contain the widest data or the main data. When the width of the ranges is changed in the multidimensional database 16, the data, including data already stored, is re-stored to fit the new range width, triggered for example by the storage of new data. In addition, for the multidimensional database 16, a hierarchy of ranges in which an upper range contains the adjacent lower ranges is constructed, for example, and the level of the hierarchy to be used is selected according to the situation. When a level whose ranges correspond to a plurality of storage areas is selected for the multidimensional database 16, the data stored in duplicate in the plurality of storage areas is not used.
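Skipping data that is stored in duplicate can be sketched with a "processed" mark, in the spirit of the tabular data of FIG. 13 described elsewhere in this document. The example reuses that description's placement (data a in storage area 2, data b in storage areas 2, 3, 6, and 7); the dictionary layout, the `handle` callback, and the function name are assumptions.

```python
from typing import Callable, Dict, List, Set


def process_without_duplicates(storage_areas: Dict[int, List[str]],
                               location_table: Dict[str, List[int]],
                               handle: Callable[[str, List[int]], None]) -> Set[str]:
    """Visit storage areas in order and handle each data item exactly once.

    location_table plays the role of the tabular data: for every data item it
    lists all storage areas that hold the item or a reference to it.
    """
    processed: Set[str] = set()
    for area_id in sorted(storage_areas):
        for data_id in storage_areas[area_id]:
            if data_id in processed:
                continue  # already handled when read from an earlier storage area
            handle(data_id, location_table[data_id])
            processed.add(data_id)  # the "processed" mark
    return processed


# Level 2 storage areas that make up one level 1 range in this example.
STORAGE_AREAS = {2: ["a", "b"], 3: ["b"], 6: ["b"], 7: ["b"]}
LOCATION_TABLE = {"a": [2], "b": [2, 3, 6, 7]}

calls = []
done = process_without_duplicates(
    STORAGE_AREAS, LOCATION_TABLE,
    lambda data_id, areas: calls.append((data_id, areas)))
```

Data b is handled once, together with the full list of storage areas to search in the paired cube, and is not read again from storage areas 3, 6, and 7.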
FIG. 4 is a diagram showing an example of a hierarchy of ranges, in which an upper range contains the adjacent lower ranges, and the corresponding storage areas.
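A hierarchy of this kind can be sketched as nested square grids. The sketch assumes the 64/4/1 example used elsewhere in this document (an 8 x 8 grid at level 2, 2 x 2 at level 1, a single range at level 0) and addresses ranges by zero-based `(row, column)`; the numbering used in the actual figure is not reproduced, and the function names are hypothetical.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # zero-based (row, column) of a range within its level

# Ranges per side at each level: 1, 4, and 64 ranges in total (assumed square layout).
GRID_SIDE = {0: 1, 1: 2, 2: 8}


def ancestor(cell: Cell, level: int, target_level: int) -> Cell:
    """Return the coarser range at target_level that contains the given range."""
    factor = GRID_SIDE[level] // GRID_SIDE[target_level]
    return (cell[0] // factor, cell[1] // factor)


def descendants(cell: Cell, level: int, target_level: int) -> List[Cell]:
    """Return the finer ranges at target_level contained in the given range."""
    factor = GRID_SIDE[target_level] // GRID_SIDE[level]
    return [(cell[0] * factor + dr, cell[1] * factor + dc)
            for dr in range(factor) for dc in range(factor)]


# Each level 1 range aggregates 16 level 2 ranges (and their storage areas).
level2_in_range = descendants((1, 0), level=1, target_level=2)
```

Selecting a coarser level then amounts to grouping the level 2 storage areas by their ancestor at that level.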
The OLAP operation execution unit 11 executes an OLAP operation on multidimensional data according to the OLAP operation and the arguments received from the client 20. That is, the OLAP operation execution unit 11 instructs the multidimensional database management unit 15 to perform the OLAP operation on the multidimensional data. On receiving the result of the instructed operation from the multidimensional database management unit 15, the OLAP operation execution unit 11 transmits the result to the client 20.
In response to an instruction from the OLAP operation execution unit 11, the multidimensional database management unit 15 refers, as index information, to the information on the value ranges used to classify the data of each dimension, or the data representing each characteristic, that constitutes the multidimensional cube, and identifies the storage areas to be searched or operated on based on that index information. The multidimensional database management unit 15 then searches or operates on the data constituting the multidimensional cube concurrently, using the value range corresponding to each storage area as the unit of processing. Once all target storage areas have been searched or operated on, it aggregates the results and returns the operation result to the OLAP operation execution unit 11. It also manages the multidimensional database 16 so that data is stored and used in the manner described above.
(Operation)
Next, the processing operation of the data analysis processing device configured as described above will be described.
FIG. 5 is a sequence diagram for explaining an example of the operation of the data analysis processing device 10. In FIG. 5, when the OLAP operation execution unit 11 receives an OLAP operation and arguments from the client 20, it instructs the multidimensional database management unit 15 to operate on the multidimensional data accordingly.
In response to the operation instruction, the multidimensional database management unit 15 refers, as index information, to the information on the value ranges used to classify the data of each dimension, or the data representing each characteristic, that constitutes the multidimensional cube, and identifies the storage areas to be searched or operated on based on that index information. The multidimensional database management unit 15 searches or operates on the data constituting the multidimensional cube concurrently, using the value range corresponding to each storage area as the unit of processing (the dashed box labeled "PARALLEL" in FIG. 5).
The multidimensional database management unit 15 repeats this until all target storage areas have been searched or operated on (the dashed box labeled "LOOP" in FIG. 5). When finished, it aggregates the results and returns the operation result to the OLAP operation execution unit 11.
The OLAP operation execution unit 11 repeats its instructions to the multidimensional database management unit 15 as required by the received OLAP operation and arguments (the dashed box labeled "LOOP" in FIG. 5). When the OLAP operation execution unit 11 has obtained the final result corresponding to the OLAP operation and arguments, it returns that result to the client 20.
Next, the operation of the multidimensional database management unit 15 will be described in detail.
FIG. 6 is a flowchart showing an example of the processing procedure of the multidimensional database management unit 15. In FIG. 6, the multidimensional database management unit 15 waits to receive an operation instruction for multidimensional data from the OLAP operation execution unit 11 (step S11). On receiving an operation instruction, the multidimensional database management unit 15 refers, as index information, to the information on the value ranges used to classify the data of each dimension, or the data representing each characteristic, that constitutes the multidimensional cube (step S12).
Next, the multidimensional database management unit 15 identifies the storage areas to be searched or operated on based on the index information (step S13), and searches or operates on the data constituting the multidimensional cube concurrently, using the value range corresponding to each storage area as the unit of processing (steps S141 to S14N). This processing is repeated until it is determined in step S15 that all target storage areas have been searched or operated on.
At this point, when executing a single search/operation, the multidimensional database management unit 15 limits the scope to the storage areas corresponding to the same value range in both multidimensional cubes and the storage areas corresponding to value ranges neighboring that same value range. When executing multiple searches/operations concurrently, it additionally avoids contention for the storage areas being searched or operated on. The multidimensional database management unit 15 then aggregates the results (step S16).
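The flow of steps S12 to S16 can be pictured with a minimal Python sketch. Everything named here (the `storage_areas` dictionary, `search_area`, `execute`, and the use of a thread pool) is an illustrative assumption and not something the specification prescribes; storage areas are modeled as plain lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index: value-range id -> storage area (modeled as a list of records).
storage_areas = {
    "01": [3, 8, 15],
    "02": [4, 23],
    "04": [42],
    "05": [],  # an area with no data is skipped
}

def search_area(records, predicate):
    """Search one storage area (one of the parallel steps S141..S14N)."""
    return [r for r in records if predicate(r)]

def execute(index, predicate, max_workers=4):
    # S12/S13: consult the index and keep only the non-empty storage areas.
    targets = [area for area in index.values() if area]
    # S141..S14N: process the storage areas concurrently, one task per area.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial = list(pool.map(lambda a: search_area(a, predicate), targets))
    # S16: aggregate the partial results into one result.
    return sorted(x for part in partial for x in part)

result = execute(storage_areas, lambda r: r > 5)
```

A thread pool is only one way to obtain the concurrency; a process pool or separate worker nodes would fit the same structure.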
In this way, when an OLAP operation is executed on one multidimensional cube in response to an operation instruction and the data constituting another multidimensional cube is used as an argument of that OLAP operation, the multidimensional database management unit 15 searches or operates on the data constituting the first multidimensional cube using the data constituting the other multidimensional cube as the key.
That is, by using as an index the value ranges used to classify the data of each dimension, or the data representing each characteristic, that constitutes the multidimensional cube, the multidimensional database management unit 15, when executing a single search/operation, limits the scope to the storage areas corresponding to the same value range in both multidimensional cubes and the storage areas corresponding to value ranges neighboring that same value range. When executing multiple searches/operations concurrently, it additionally avoids contention for the storage areas being searched or operated on.
FIG. 7 is a diagram for explaining an example of the processing by which the multidimensional database management unit 15 limits the scope of the search/operation in the storage areas. As shown in FIG. 7, suppose the multidimensional database management unit 15 searches or operates on the data constituting multidimensional cube 1 using the data constituting multidimensional cube 0 as the key. The data that is contained in, or overlaps, the data classified into value ranges 01, 02, and 04 and stored and managed in the corresponding storage areas 01, 02, and 04 is the data classified into value ranges 11, 12, and 14, respectively, and stored and managed in the corresponding storage areas 11, 12, and 14. The scope of the search/operation can therefore be limited to the storage areas corresponding to the same value range in both multidimensional cubes, namely the pair of areas 01 and 11, the pair of areas 02 and 12, and the pair of areas 04 and 14.
FIG. 8 is a diagram for explaining another example of the processing by which the multidimensional database management unit 15 limits the scope of the search/operation in the storage areas. As shown in FIG. 8, suppose the multidimensional database management unit 15 searches or operates on the data constituting multidimensional cube 1 using the data constituting multidimensional cube 0 as the key. The data lying in the neighborhood, drawn as a dotted circle, of the centroid of the data classified into value range 01 and stored and managed in the corresponding storage area is the data classified into value range 11 and into value ranges 12, 14, and 15, which lie within the radius of the dotted circle from value range 11, and stored and managed in the corresponding storage areas 11, 12, 14, and 15. The scope of the search/operation can therefore be limited to the set consisting of area 01 and areas 11, 12, 14, and 15, that is, the storage areas corresponding to the same value range in both multidimensional cubes and to the value ranges neighboring it. The same applies to data classified into other value ranges and stored and managed in the corresponding storage areas.
In this way, when identifying the storage areas to be searched or operated on based on the index information, the multidimensional database management unit 15 limits the scope to the storage areas corresponding to the same value range in both multidimensional cubes and the storage areas corresponding to value ranges neighboring that same value range.
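The neighborhood restriction of FIGS. 7 and 8 can be sketched as follows. The sketch assumes, purely for illustration, that the value ranges of each cube form a 3 x 3 grid numbered 1 to 9 row by row, with the cube number as a prefix (so "14" is value range 4 of cube 1), and it approximates the circular neighborhood by a one-cell radius on that grid. The function names are hypothetical.

```python
def neighbour_ranges(cell, cols=3, rows=3, radius=1):
    """Return the cell itself plus every grid cell within `radius` cells of it.

    Cells are numbered 1..rows*cols in row-major order (an assumed layout).
    """
    r0, c0 = divmod(cell - 1, cols)
    found = []
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            found.append(r * cols + c + 1)
    return found

def candidate_areas(key_cell, target_cube="1"):
    """Storage areas of the target cube that must be searched for one key range."""
    return [f"{target_cube}{n}" for n in neighbour_ranges(key_cell)]

areas_for_01 = candidate_areas(1)  # key data in value range 01 of cube 0
areas_for_04 = candidate_areas(4)  # key data in value range 04 of cube 0
```

Under this assumed layout, the key range 01 maps to areas 11, 12, 14, and 15, which matches the set named for FIG. 8; every other area of cube 1 is left untouched.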
FIG. 9 is a diagram for explaining an example of the operation by which the multidimensional database management unit 15 avoids contention for the storage areas being searched or operated on, and is described with reference to the schematic diagram of FIG. 7. As shown in FIG. 9, when the data constituting multidimensional cube 1 is searched or operated on using the data constituting multidimensional cube 0 as the key, contention can be avoided by searching or operating on the data constituting the multidimensional cubes concurrently in units of the storage areas corresponding to the same value range in both cubes, namely the pair of areas 01 and 11, the pair of areas 02 and 12, and the pair of areas 04 and 14. This is because the data that is contained in, or overlaps, the data classified into value ranges 01, 02, and 04 and stored and managed in the corresponding storage areas 01, 02, and 04 is the data classified into value ranges 11, 12, and 14, respectively, and stored and managed in the corresponding storage areas 11, 12, and 14.
FIG. 10 is a diagram for explaining another example of the operation by which the multidimensional database management unit 15 avoids contention for the storage areas being searched or operated on, and is described with reference to the schematic diagram of FIG. 8. In FIG. 10, the data constituting multidimensional cube 1 is searched or operated on using the data constituting multidimensional cube 0 as the key. As in FIG. 8, the data lying in the neighborhood, drawn as a dotted circle, of the centroid of the data classified into value range 01 and stored and managed in the corresponding storage area is the data classified into value range 11 and into value ranges 12, 14, and 15, which lie within the radius of the dotted circle from value range 11, and stored and managed in the corresponding storage areas 11, 12, 14, and 15. Likewise, the data lying in the neighborhood, drawn as a dash-dot circle, of the centroid of the data classified into value range 04 and stored and managed in the corresponding storage area is the data classified into value range 14 and into value ranges 11, 12, 15, 17, and 18, which lie within the radius of that circle from value range 14, and stored and managed in the corresponding storage areas 11, 12, 15, 17, and 18. The units of processing are therefore the set of area 01 with areas 15, 14, 12, and 11 and the set of area 04 with areas 18, 17, 15, and 14, that is, the storage areas corresponding to the same value range in both multidimensional cubes and to the value ranges neighboring it. When the data constituting the multidimensional cubes is searched or operated on concurrently in these units, contention for the storage areas can be avoided by aligning the order of access: areas 15, 14, 12, 11 for the data in area 01, and areas 18, 17, 15, 14, 12, 11 for the data in area 04. The same applies to data classified into other value ranges and stored and managed in the corresponding storage areas.
Note that the data constituting the multidimensional cubes is searched or operated on concurrently in units of the set of area 01 with areas 15, 14, 12, and 11 and the set of area 04 with areas 18, 17, 15, and 14, which are the storage areas corresponding to the same value range in both multidimensional cubes and to the value ranges neighboring it. The same applies to data classified into other value ranges and stored and managed in the corresponding storage areas.
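The idea of aligning the access order across concurrent tasks can be sketched in Python. This is an illustration under assumptions, not the specification's mechanism: the per-area `threading.Lock` objects, the `process` function, and the choice of descending order of area id as the common order are all hypothetical, and each task here holds only one area lock at a time.

```python
import threading

# One lock per storage area of cube 1 (hypothetical stand-in for area access).
locks = {area: threading.Lock() for area in ("11", "12", "14", "15", "17", "18")}
visit_log = []
log_lock = threading.Lock()

def process(key_area, targets):
    # Every task walks its target areas in the same global (descending) order.
    for area in sorted(targets, reverse=True):
        with locks[area]:          # exclusive access to one storage area
            with log_lock:
                visit_log.append((key_area, area))

tasks = {
    "01": ["11", "12", "14", "15"],
    "04": ["11", "12", "14", "15", "17", "18"],
}
threads = [threading.Thread(target=process, args=item) for item in tasks.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()

order_01 = [area for key, area in visit_log if key == "01"]
order_04 = [area for key, area in visit_log if key == "04"]
```

Because both tasks visit shared areas in the same relative order, neither can be waiting on an area that the other will only release after acquiring one the first already holds.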
As shown in FIGS. 9 and 10, when replicas of a data entity are stored in the storage areas, a replica and the entity itself reside in different storage areas, so contention for the storage areas being searched or operated on can be avoided completely.
On the other hand, when references to the body of a data entity are stored in the storage areas, the target of the reference and the body of the entity are in the same storage area. Consequently, while the body of any data entity stored in that storage area is being searched or operated on, contention for the storage area cannot be avoided. Conversely, while only a reference to the body of a data entity stored in that storage area is being searched or operated on, contention can be avoided. Storing references to the body instead of replicas of the entity also reduces the amount of storage required.
In this way, when the multidimensional database management unit 15 searches or operates on the data constituting the multidimensional cubes concurrently, using the value range corresponding to each storage area as the unit of processing based on the index information, it additionally avoids contention for the storage areas being searched or operated on.
In the description of FIGS. 7 to 10, storage areas to which no data belongs are excluded from processing in the first place. When a data item belongs to multiple value ranges, its entity or a reference to it is stored and managed redundantly in the storage area corresponding to each of those value ranges, so the same data item may be searched or operated on in more than one set of storage areas. When this yields identical results, the duplicate results are merged during aggregation.
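Merging duplicate results during aggregation can be sketched as below. The record shape, a `(record_id, value)` tuple, and the function name `aggregate` are assumptions made only for this example.

```python
def aggregate(partial_results):
    """Merge per-area results, keeping the first occurrence of each record id."""
    seen = set()
    merged = []
    for part in partial_results:
        for record_id, value in part:
            if record_id not in seen:   # the same record may come from several areas
                seen.add(record_id)
                merged.append((record_id, value))
    return merged

# Record "b" belongs to two value ranges, so two storage areas report it.
merged = aggregate([[("a", 1), ("b", 2)], [("b", 2), ("c", 3)]])
```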
FIG. 11 is a diagram for explaining an example of the processing by which the multidimensional database management unit 15 selects a level of the value range hierarchy. In FIG. 11, consider the case where the multidimensional database management unit 15 identifies the storage areas to be searched or operated on based on the index information and searches or operates on the data constituting the multidimensional cube concurrently, in units of the storage areas corresponding to the value ranges. In this case, for the value ranges used to classify the data of each dimension, or the data representing each characteristic, that constitutes the multidimensional cube, the multidimensional database management unit 15 constructs in advance a hierarchy in which an upper value range contains the adjacent lower value ranges, and selects, according to the situation, the level of the hierarchy to use as the unit of processing for the search/operation.
For example, when the selection is based on the values of the stored data, the level whose value ranges are wide enough to contain the widest data, or wide enough to contain the main data, is selected. This limits the number of value ranges to which a data item belongs to at most the number of adjacent value ranges.
The value range width that can contain the widest data, or the main data, is obtained by determining, each time data is stored, the level of the value range wide enough to contain that data, and then computing the level with the largest value range or the most frequent level. For example, because data a and b cannot be contained in a level 2 value range but can be contained in a level 1 value range, the level 1 value ranges are selected.
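One way to picture this selection is the following sketch. It assumes, for illustration only, a square two-dimensional domain of side 8 divided into 1 x 1, 2 x 2, and 8 x 8 value ranges at levels 0, 1, and 2, and data items described by bounding boxes `(x0, y0, x1, y1)`; these numbers and names are not taken from the specification.

```python
from collections import Counter

def deepest_containing_level(box, domain=8.0, cells_per_side=(1, 2, 8)):
    """Return the deepest level at which the box fits inside a single value range."""
    x0, y0, x1, y1 = box
    for level in reversed(range(len(cells_per_side))):
        size = domain / cells_per_side[level]
        same_column = int(x0 // size) == int(x1 // size)
        same_row = int(y0 // size) == int(y1 // size)
        if same_column and same_row:
            return level
    return 0  # level 0 covers the whole domain

boxes = [
    (1.2, 0.2, 1.8, 0.9),  # fits in one level 2 value range
    (1.2, 0.5, 2.6, 1.4),  # straddles level 2 value ranges, fits at level 1
    (5.1, 5.1, 5.9, 5.9),
    (6.2, 6.2, 6.7, 6.8),
]
levels = [deepest_containing_level(b) for b in boxes]

# Level wide enough for the widest item: the shallowest level any item needs.
level_for_widest = min(levels)
# Level wide enough for the main data: the most frequent level.
level_for_main = Counter(levels).most_common(1)[0][0]
```

The per-item level would be recorded at storage time, so that the two summary values can be maintained incrementally rather than recomputed.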
As another example, when the selection is based on the achievable degree of parallelism, the level is selected on the basis of the number of available CPU cores and the state of other processing, so that the processing capacity is used to the fullest. For example, if the level 2 value ranges are selected, the 64 storage areas correspond to 64 value ranges, and 64 is the upper limit of the achievable degree of parallelism. If the level 1 value ranges are selected, the 64 storage areas are grouped into four, corresponding to four value ranges, and 4 is the upper limit. If the level 0 value range is selected, the 64 storage areas are grouped into one, corresponding to a single value range, and 1 is the upper limit.
The achievable degree of parallelism is greater than the number of CPU cores when factors such as I/O wait are taken into account, and smaller than the number of CPU cores when factors such as the execution of other processes are taken into account. For this reason, the achievable degree of parallelism is calculated from preconfigured information or from information obtained from the OS (Operating System). For example, if the number of CPU cores is 4, the level 1 value ranges, whose count is closest to the number of CPU cores, are selected.
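The level selection by parallelism can be sketched as below, using the 1, 4, and 64 value range counts from the example above. The function names, the tie-breaking rule (prefer the shallower level), and the fallback to `os.cpu_count()` when nothing is configured are assumptions for illustration.

```python
import os

# Number of value ranges at each level, as in the 64-storage-area example.
LEVEL_RANGE_COUNTS = {0: 1, 1: 4, 2: 64}

def runnable_parallelism(configured=None):
    """Use a preconfigured value if given, else the core count reported by the OS."""
    return configured if configured is not None else (os.cpu_count() or 1)

def select_level(parallelism, counts=LEVEL_RANGE_COUNTS):
    """Pick the level whose value range count is closest to the parallelism.

    Ties are broken in favour of the shallower level (an assumed policy).
    """
    return min(counts, key=lambda lv: (abs(counts[lv] - parallelism), lv))

chosen = select_level(runnable_parallelism(configured=4))
```

A configured value above the core count would model I/O wait, and one below it would model competing processes, following the reasoning in the preceding paragraph.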
FIGS. 12 and 13 are diagrams for explaining an example of the processing by which the multidimensional database management unit 15 suppresses redundant processing. Consider the case where, as when the level 1 value ranges are selected in FIG. 11, the multidimensional database management unit 15 selects a level whose value ranges each correspond to multiple storage areas as the unit of processing for the search/operation. In this case, redundant processing can be suppressed by not using the data stored and managed redundantly across those storage areas. When a data item belongs to multiple value ranges, its entity or a reference to it is stored and managed redundantly in the storage area corresponding to each value range, so the same data item may be searched or operated on in more than one set of storage areas. As a result, identical results would have to be merged. The multidimensional database management unit 15 suppresses this redundant processing.
FIG. 12 shows, for the case where the level 1 value ranges are selected as the unit of processing for the search/operation as in FIG. 11, the level 2 value ranges contained in a level 1 value range. Data a is classified into level 2 value range 2 and stored and managed in the corresponding storage area 2. Data b is classified into level 2 value ranges 2, 3, 6, and 7 and stored and managed in the corresponding storage areas 2, 3, 6, and 7. Level 2 value ranges 1 to 16 are contained in level 1 value range 3, and level 1 value ranges 1 to 4 are contained in level 0 value range 1.
FIG. 13 is an example of tabular data representing the situation shown in FIG. 12. When the level 1 value ranges are selected as the unit of processing for the search/operation as in FIG. 11, the multidimensional database management unit 15 reads and processes data in order from each storage area corresponding to a level 2 value range contained in the level 1 value range. For example, when data a is read from the storage area corresponding to level 2 value range 2, looking it up in the tabular data of FIG. 13 shows that it is stored only in the storage area corresponding to level 2 value range 2. To suppress redundant processing, the multidimensional database management unit 15 therefore makes the storage area corresponding to level 2 value range 2 of the paired multidimensional cube the target of the search/operation.
Similarly, when data b is read from the storage area corresponding to level 2 value range 2, looking it up in the tabular data of FIG. 13 shows that it is also stored in the storage areas corresponding to level 2 value ranges 3, 6, and 7. The multidimensional database management unit 15 therefore makes the storage areas corresponding to level 2 value ranges 2, 3, 6, and 7 of the paired multidimensional cube the target of the search/operation. To suppress redundant processing, the multidimensional database management unit 15 also marks data b as processed in the tabular data of FIG. 13 and does not read data b from the storage areas corresponding to level 2 value ranges 3, 6, and 7. Note that, in preparation for selecting a level whose value ranges correspond to multiple storage areas, it is possible at any time, when the storage areas for that level hold the body of an entity together with replicas of the entity or references to its body, to delete the replicas and references and reflect the deletion in the tabular data of FIG. 13. It is also possible, after such a deletion, to restore the storage areas and the tabular data of FIG. 13 to their state before the deletion.
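The processed-marking step can be sketched as follows, using the placement of data a and b from FIG. 12. The `placement` dictionary stands in for the tabular data of FIG. 13; its shape, the `areas` dictionary, and the function name are assumptions. The sketch skips an already processed item when it meets it again, which models not reading it from the later areas.

```python
# Stand-in for the table of FIG. 13: data item -> level 2 value ranges holding it.
placement = {"a": [2], "b": [2, 3, 6, 7]}

# Contents of the level 2 storage areas inside one selected level 1 value range.
areas = {2: ["a", "b"], 3: ["b"], 6: ["b"], 7: ["b"]}

def process_level1_range(level2_ranges, areas, placement):
    """Visit each level 2 area in order, handling every data item exactly once."""
    processed = set()   # the "processed" marks kept alongside the table
    work = []
    for rng in level2_ranges:
        for item in areas.get(rng, []):
            if item in processed:
                continue            # duplicate copy in a later area is skipped
            processed.add(item)
            # Target areas in the paired cube come from the table lookup.
            work.append((item, placement[item]))
    return work

work = process_level1_range([2, 3, 6, 7], areas, placement)
```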
FIG. 14 is a block diagram showing an example of the hardware configuration of the data analysis processing device according to the present invention. In FIG. 14, the data analysis processing device 10 includes a processor 12, a storage 200 that stores the multidimensional database 16, an interface unit 13, and a memory 14. That is, the data analysis processing device 10 is a computer, and is realized as, for example, a personal computer or a server computer.
The interface unit 13 is connected to the network 100 and accepts access from the client 20 connected to the network 100.
The storage 200 is a non-volatile storage medium (block device) such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive). In addition to basic programs such as the OS (Operating System) and device drivers, and the programs that realize the functions of the data analysis processing device 10, the storage 200 stores the multidimensional database 16 in a predetermined storage area.
The memory 14 in FIG. 14 is, for example, a RAM (Random Access Memory), and stores the program 14a loaded from the storage 200 and various data 14b.
The processor 12 in FIG. 14 is an arithmetic unit such as a Central Processing Unit (CPU) or a Micro Processing Unit (MPU), and realizes its functions by means of the program loaded into the memory 14.
The processor 12 includes the OLAP operation execution unit 11 and the multidimensional database management unit 15 as processing functions according to the embodiment. The OLAP operation execution unit 11, the multidimensional database management unit 15, and the time-series alignment unit 17 are processing functions realized by the processor 12 executing the instructions contained in the program 14a. That is, the data analysis processing device 10 of the present invention can also be realized by a computer and a program. The program can be recorded on a recording medium such as optical media and distributed, or it can be provided over a network.
Note that the OLAP operation execution unit 11 and the multidimensional database management unit 15 may be realized, instead of or in addition to the processor 12, in various other forms including integrated circuits such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field-Programmable Gate Array).
Via the interface unit 13, the processor 12 can receive OLAP operations and arguments from the client 20 and send operation results to the client 20.
 (Effects)
 As described above, in the embodiment, when each of the data constituting a multidimensional cube is multidimensional data, the multidimensional database management unit 15 classifies that data by multidimensional value ranges that are common among the multidimensional cubes. Further, when data classified by value range belongs to a single value range, the multidimensional database management unit 15 stores the data in the storage area corresponding to that value range; when the classified data belongs to a plurality of value ranges, it stores the entity of the data, or a reference to it, redundantly in the storage area corresponding to each of those value ranges.
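 As a non-limiting illustration, the classification and storage described above can be sketched as follows. The sketch assumes that each datum is an axis-aligned box given as one (low, high) interval per dimension, and that the common value ranges form a uniform grid of width `CELL`. The names `CELL`, `ranges_for`, `store`, and the dictionary used as storage are hypothetical and do not appear in the embodiment.

```python
from collections import defaultdict
from itertools import product

CELL = 10  # assumed width of one value range along every dimension


def ranges_for(box):
    """Return every grid cell (value range) that a box overlaps.

    `box` is a tuple of (low, high) intervals, one per dimension,
    with low <= high. A point is a box whose low equals its high.
    """
    per_dim = [range(int(lo // CELL), int(hi // CELL) + 1) for lo, hi in box]
    return list(product(*per_dim))


def store(storage, event_id, box):
    """Store a reference to the datum in the area of each range it belongs to."""
    cells = ranges_for(box)
    for cell in cells:
        # The reference is duplicated when the datum spans several ranges.
        storage[cell].append(event_id)
    return cells


storage = defaultdict(list)  # value range -> storage area (hypothetical layout)
store(storage, "e1", ((3, 4), (12, 15)))  # belongs to a single value range
store(storage, "e2", ((8, 12), (5, 5)))   # straddles two value ranges
```

 In this sketch the event "e1" is stored once, while "e2" is stored in both of the storage areas whose value ranges it overlaps. Storing a reference rather than the entity is only one of the two options the embodiment allows.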
 In addition, the value range information used to classify the operation-target data constituting the multidimensional cubes is used as index information. As a result, when a single search/operation is executed, the scope of the search/operation is limited to the storage areas corresponding to the same value range in both multidimensional cubes and the storage areas corresponding to value ranges in the vicinity of that value range in both multidimensional cubes. Further, when a plurality of searches/operations are executed simultaneously, contention for the storage areas being searched/operated on is also avoided.
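 As a non-limiting illustration, the use of value ranges as an index can be sketched as follows. The sketch assumes that value ranges are integer grid cells, that "vicinity" means the cells adjacent to a cell in every dimension, and that contention is avoided by grouping searches whose storage areas do not overlap. The names `neighbourhood`, `candidates`, and `conflict_free_batches`, and the batching policy itself, are hypothetical.

```python
from itertools import product


def neighbourhood(cell):
    """The cell itself plus every adjacent cell (offsets -1, 0, +1 per dimension)."""
    return [tuple(c + d for c, d in zip(cell, delta))
            for delta in product((-1, 0, 1), repeat=len(cell))]


def candidates(key_cell, target_storage):
    """Collect references from the target cube, visiting only nearby storage areas."""
    found = set()
    for cell in neighbourhood(key_cell):
        found.update(target_storage.get(cell, ()))
    return found


def conflict_free_batches(key_cells):
    """Group key cells so that no two searches in a batch touch the same area."""
    batches = []  # each entry: (areas already claimed, cells in the batch)
    for cell in key_cells:
        areas = set(neighbourhood(cell))
        for claimed, members in batches:
            if claimed.isdisjoint(areas):
                claimed |= areas  # in-place update of the batch's claimed areas
                members.append(cell)
                break
        else:
            batches.append((areas, [cell]))
    return [members for _, members in batches]


# Hypothetical target cube: value range -> references held in that storage area.
target = {(0, 0): ["a"], (1, 0): ["b"], (5, 5): ["c"]}
print(candidates((0, 0), target))
print(conflict_free_batches([(0, 0), (1, 0), (5, 5)]))
```

 Searches in the same batch could run in parallel without touching a common storage area, while searches in different batches would run one after another. Other scheduling policies would serve the same purpose.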
 In this way, even when each of the data of each dimension and the data representing each characteristic constituting the multidimensional cube is multidimensional data, or when data classified by value range belongs to a plurality of value ranges, the scope of the search/operation can be limited when a single search/operation is executed, and contention for the storage areas being searched/operated on can additionally be avoided when a plurality of searches/operations are executed simultaneously.
 Therefore, according to the embodiment, processing can be sped up even when each of the data of each dimension and the data representing each characteristic constituting the multidimensional cube is multidimensional data, or when data classified by value range belongs to a plurality of value ranges.
 Further, when executing an OLAP operation on a certain multidimensional cube, the multidimensional database management unit 15 uses data constituting another multidimensional cube as an argument of the OLAP operation. In this case, when data constituting the one multidimensional cube is searched/operated on using data constituting the other multidimensional cube as a key, the multidimensional database management unit 15 constructs in advance, for the value ranges used to classify the data of each dimension and the data representing each characteristic constituting the multidimensional cubes, a hierarchy of value ranges in which an upper value range contains adjacent lower value ranges. The multidimensional database management unit 15 also selects the level of the value range hierarchy to be used as the processing unit of the search/operation according to the situation, such as the values of the stored data and the degree of parallelism that can be executed. Further, when the multidimensional database management unit 15 selects a level of the hierarchy that corresponds to a plurality of storage areas, it does not use the data that is redundantly stored and managed in the plurality of storage areas.
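 As a non-limiting illustration, the hierarchy of value ranges and the selection of a level can be sketched as follows. The sketch assumes integer grid cells in which each upper value range covers two adjacent lower value ranges per dimension, and assumes one possible selection policy: the finest level whose number of processing units does not exceed the number of available workers. The names `parent`, `at_level`, and `choose_level`, and the policy, are hypothetical.

```python
def parent(cell):
    """The upper value range that contains this cell and its adjacent siblings."""
    return tuple(c // 2 for c in cell)


def at_level(cell, level):
    """Climb `level` steps up the hierarchy from a level-0 cell."""
    for _ in range(level):
        cell = parent(cell)
    return cell


def choose_level(occupied_cells, workers, max_level=8):
    """Pick the finest level whose processing units fit the available workers.

    This is an assumed policy; the embodiment only states that the level is
    chosen according to the stored data and the executable parallelism.
    """
    for level in range(max_level + 1):
        units = {at_level(c, level) for c in occupied_cells}
        if len(units) <= workers:
            return level, units
    return max_level, units


occupied = [(0, 0), (1, 0), (2, 3), (3, 3)]  # level-0 cells that hold data
print(choose_level(occupied, workers=2))
```

 With two workers, the four occupied level-0 cells collapse into two level-1 processing units, each of which corresponds to more than one storage area.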
 Even when a level of the value range hierarchy that corresponds to a plurality of storage areas is selected in this way, data belonging to a plurality of value ranges has its entity or reference stored and managed redundantly in the storage area corresponding to each of those value ranges, so the same data may be searched/operated on in more than one set of storage areas. When the same result is obtained more than once, the duplicate results need to be aggregated, but redundant processing can be suppressed within each processing unit of the search/operation.
 Therefore, even when a level of the value range hierarchy that corresponds to a plurality of storage areas is selected, redundant processing can be suppressed within each processing unit of the search/operation, and processing can be sped up.
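 As a non-limiting illustration, the suppression of redundant processing within a processing unit and the aggregation of duplicate results across processing units can be sketched as follows. The sketch assumes that storage areas hold references, that a processing unit is a list of storage areas, and that a search is a predicate applied to each reference. The names `search_unit` and `merge` are hypothetical.

```python
def search_unit(storage, cells, predicate):
    """Search one processing unit that spans several storage areas.

    A reference stored in more than one area of the unit is examined once.
    """
    seen = set()
    hits = []
    for cell in cells:
        for ref in storage.get(cell, ()):
            if ref in seen:
                continue  # duplicate copy inside the unit: skip redundant work
            seen.add(ref)
            if predicate(ref):
                hits.append(ref)
    return hits


def merge(results_per_unit):
    """Aggregate hits from several units, keeping each reference once."""
    merged = []
    for hits in results_per_unit:
        for ref in hits:
            if ref not in merged:
                merged.append(ref)
    return merged


# "e2" and "e3" each belong to two value ranges, so they are stored twice.
storage = {(0, 0): ["e1", "e2"], (1, 0): ["e2", "e3"], (2, 0): ["e3"]}
unit_a = search_unit(storage, [(0, 0), (1, 0)], lambda ref: True)
unit_b = search_unit(storage, [(2, 0), (3, 0)], lambda ref: True)
print(merge([unit_a, unit_b]))
```

 Here "e2" is duplicated inside one processing unit and is examined only once there, whereas "e3" is found by both units and is reduced to a single result by the final aggregation.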
 Accordingly, the embodiment can speed up the process of searching/operating on data constituting one multidimensional cube using data constituting another multidimensional cube as a key. That is, according to the embodiment, it is possible to provide a data analysis processing device, a data analysis processing method, and a program capable of executing OLAP operations on a multidimensional cube at high speed. More specifically, the embodiment addresses the case where data constituting another multidimensional cube is used as an argument of an OLAP operation and data constituting one multidimensional cube is searched/operated on using the data constituting the other multidimensional cube as a key. In that case, even when each of the data of each dimension and the data representing each characteristic constituting the multidimensional cube is multidimensional data, or when data classified by value range belongs to a plurality of value ranges, the scope of the search/operation is limited when a single search/operation is executed, and contention for the storage areas being searched/operated on is additionally avoided when a plurality of searches/operations are executed simultaneously. The embodiment thereby provides a technique capable of executing OLAP operations on a multidimensional cube at high speed.
 That is, the present invention is not limited to the above embodiment as it is; at the implementation stage, the components can be modified and embodied without departing from the gist of the invention. In addition, various inventions can be formed by appropriately combining the plurality of components disclosed in the above embodiment. For example, some components may be removed from all the components shown in the embodiment. Further, components from different embodiments may be combined as appropriate.
 10 ... Data analysis processing device
 11 ... OLAP operation execution unit
 12 ... Processor
 13 ... Interface unit
 14 ... Memory
 14a ... Program
 14b ... Data
 15 ... Multidimensional database management unit
 16 ... Multidimensional database
 17 ... Time-series alignment unit
 20 ... Client
 100 ... Network
 200 ... Storage

Claims (8)

  1.  A data analysis processing device comprising:
     a multidimensional database that stores, in a multidimensional cube constructed for each subject, data embodying real-world events in association with identifiers of the events;
     an OLAP operation execution unit that executes an OLAP (Online Analytical Processing) operation on the multidimensional cube in response to a request from a client; and
     a multidimensional database management unit that manages, in the multidimensional cube, time-dimension data, spatial-dimension data, a plurality of types of unique-dimension data, and data representing a plurality of types of characteristics,
     wherein, if each of the data constituting the multidimensional cube is multidimensional data, the multidimensional database management unit classifies the multidimensional data by multidimensional value ranges common among the multidimensional cubes.
  2.  The data analysis processing device according to claim 1, wherein, when the classified data belongs to a single value range, the multidimensional database management unit stores the data in the storage area corresponding to that value range.
  3.  The data analysis processing device according to claim 1, wherein, when the classified data belongs to a plurality of value ranges, the multidimensional database management unit stores the entity of the data, or a reference to the data, redundantly in the storage area corresponding to each of those value ranges.
  4.  The data analysis processing device according to claim 1, wherein the OLAP operation execution unit uses, as an argument of the OLAP operation, at least one of an argument specified by the client and data constituting another multidimensional cube.
  5.  The data analysis processing device according to claim 1, wherein, when searching/operating on data constituting one multidimensional cube using data constituting another multidimensional cube as a key, the multidimensional database management unit uses the value ranges used for the classification as an index, thereby, when a single search/operation is executed, limiting the scope of the search/operation to the storage areas corresponding to the same value range in both multidimensional cubes and the storage areas corresponding to value ranges in the vicinity of that value range in both multidimensional cubes, and, when a plurality of searches/operations are executed concurrently, additionally avoiding contention for the storage areas being searched/operated on.
  6.  The data analysis processing device according to claim 5, wherein the multidimensional database management unit constructs a hierarchy of value ranges in which an upper value range contains adjacent lower value ranges, selects the level of the hierarchy to be used as the processing unit of the search/operation according to the situation, and, when a level of the hierarchy that corresponds to a plurality of storage areas is selected, does not use the data that is redundantly stored and managed in the plurality of storage areas.
  7.  A data analysis processing method comprising:
     a process in which a processor of a computer stores, in a multidimensional database, data embodying real-world events in association with identifiers of the events, in a multidimensional cube constructed for each subject;
     a process in which the processor executes an OLAP (Online Analytical Processing) operation on the multidimensional cube in response to a request from a client;
     a process in which the processor manages, in the multidimensional cube, time-dimension data, spatial-dimension data, a plurality of types of unique-dimension data, and data representing a plurality of types of characteristics; and
     a process of classifying, if each of the data constituting the multidimensional cube is multidimensional data, the multidimensional data by multidimensional value ranges common among the multidimensional cubes.
  8.  A program that causes a processor of a computer to function as the data analysis processing device according to any one of claims 1 to 6.
PCT/JP2020/040213 2020-10-27 2020-10-27 Data analysis processing device, data analysis processing method, and program WO2022091204A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022558636A JP7464142B2 (en) 2020-10-27 2020-10-27 DATA ANALYSIS PROCESSING APPARATUS, DATA ANALYSIS PROCESSING METHOD, AND PROGRAM
PCT/JP2020/040213 WO2022091204A1 (en) 2020-10-27 2020-10-27 Data analysis processing device, data analysis processing method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/040213 WO2022091204A1 (en) 2020-10-27 2020-10-27 Data analysis processing device, data analysis processing method, and program

Publications (1)

Publication Number Publication Date
WO2022091204A1 true WO2022091204A1 (en) 2022-05-05

Family

ID=81382206

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/040213 WO2022091204A1 (en) 2020-10-27 2020-10-27 Data analysis processing device, data analysis processing method, and program

Country Status (2)

Country Link
JP (1) JP7464142B2 (en)
WO (1) WO2022091204A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007502466A (en) * 2003-08-12 2007-02-08 オラクル・インターナショナル・コーポレイション Systems and methods for mutual attribute analysis and manipulation in online analytical processing (OLAP) and multi-dimensional planning applications by dimension splitting
US20070150862A1 (en) * 2005-11-07 2007-06-28 Business Objects, S.A. Apparatus and method for defining report parts
JP2016518646A (en) * 2013-03-15 2016-06-23 デシジョン, インク. System, apparatus, and method for generating contextual objects mapped to data measurements by dimensional data
JP2018136963A (en) * 2014-11-19 2018-08-30 株式会社インフォメックス Data retrieval device, data retrieval method, data retrieval program, and recording medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAGI SATORU: "A concept of a multidimensional data analysis system for real-world phenomena", IPSJ SIG TECHNICAL REPORT, vol. 2019-DBS-169, no. 14, 10 September 2019 (2019-09-10), pages 1 - 6, XP055938138, ISSN: 2188-871X *

Also Published As

Publication number Publication date
JP7464142B2 (en) 2024-04-09
JPWO2022091204A1 (en) 2022-05-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20959722

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022558636

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20959722

Country of ref document: EP

Kind code of ref document: A1