WO2016157707A1 - 表操作システム、方法およびプログラム - Google Patents
表操作システム、方法およびプログラム Download PDFInfo
- Publication number
- WO2016157707A1 WO2016157707A1 PCT/JP2016/001011 JP2016001011W WO2016157707A1 WO 2016157707 A1 WO2016157707 A1 WO 2016157707A1 JP 2016001011 W JP2016001011 W JP 2016001011W WO 2016157707 A1 WO2016157707 A1 WO 2016157707A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- column
- value
- key
- attribute value
- customer
- Prior art date
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/211—Schema design and management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Definitions
- the present invention relates to a table operation system, a table operation method, and a table operation program for operating a table in a predetermined database.
- FIG. 16 is a schematic diagram showing various tables included in the database.
- FIG. 16 shows a purchase table, a customer master, a product master, and a store master.
- a customer ID indicates the attribute value of the customer (age in the example shown in FIG. 16) for each customer ID.
- the product master indicates the attribute value of the product (product category in the example shown in FIG. 16) for each product ID.
- the store master indicates the attribute of the store (in the example shown in FIG. 16, the district where the store exists) for each store ID.
- the BI tool (not shown) extracts information satisfying the conditions specified by the analyst from the database illustrated in FIG.
- an analyst wants to know the number of purchases that a person in their twenties purchased a jacket in Kawasaki.
- the analyst specifies the condition “20s” for the customer, the condition “jacket” for the product, and the condition “Kawasaki” for the store.
- the BI tool narrows down the customer ID of the customer corresponding to “20th generation” from the customer master, narrows down the product ID of the product corresponding to “jacket” from the product master, and changes from the store master to “Kawasaki”.
- the store ID of the corresponding store is narrowed down.
- BI tool specifies the line containing the combination of those customer ID, goods ID, and shop ID from a purchase table.
- the rows A and C in the purchase table are specified.
- the BI tool calculates the sum of the number of purchases in the row and presents it to the analyst.
- purchase time information is not included in the purchase table, but the analyst may also request display of changes in the number of purchases for each period (for example, every month) including the purchase time in the purchase table. Conceivable.
- the BI tool calculates “the number of purchases that a person in their twenties purchased a jacket in Kawasaki” for each month as described above, and displays the graph illustrated in FIG. FIG. 17 illustrates a monthly change in “the number of purchases that a person in their twenties purchased a jacket in Kawasaki”. The analyst can make a decision by confirming the visualized information as illustrated in FIG.
- Patent Document 1 describes an analysis environment in which a relational database server and an OLAP (Online Analytical Processing) server are connected by a network.
- OLAP Online Analytical Processing
- the analyst can specify only information indicated as an attribute in each master (for example, the customer master, the product master, and the store master illustrated in FIG. 16) as a condition. And it is not always appropriate to specify such conditions. For example, in the above example, the analyst designates the condition “20s” for the customer, designates the condition “jacket” for the product, and designates the condition “Kawasaki” for the store. When such a condition is specified, information that can be used as a reference for decision making may not be presented.
- An object of the present invention is to provide a table operation system, a table operation method, and a table operation program that can solve the technical problem of changing the configuration of a table so that conditions for obtaining information that can be easily analyzed can be specified.
- the table manipulation system specifies at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, specifies a relation value column, and associates the relation value column with the specification.
- Specified reception means that receives the specification of the type of statistical model to be specified, and individual tables that are the reference destinations of each specified foreign key column, and for each key value combination of the primary key columns of the specified individual table Based on the data acquisition means for acquiring the aggregate value of the relation value of the relation value column, the aggregate value acquired for each key value combination of the primary key column of each table, and the statistical model of the specified type Grouping means for grouping the key values of the primary key column of each table, and identification information of the group to which the key value of the primary key column of the table belongs for each table Characterized in that it comprises an attribute value column addition unit for adding the attribute value string to an attribute value.
- the table operation method specifies at least two foreign key columns in a table having two or more foreign key columns and one or more relationship value columns, specifies a relationship value column, It accepts the specification of the type of statistical model to be associated, identifies each table that is the reference destination of each specified foreign key column, and relates each key value combination of the primary key column of each identified table Acquires the aggregate value of the relation value of the value column, and based on the aggregate value acquired for each key value combination of the primary key column of each table and the statistical model of the specified type, the primary key of each table
- the key values of the column are grouped, and an attribute value column having an attribute value as identification information of the group to which the key value of the primary key column of the table belongs is added to each table.
- the table operation program provides a computer with designation of at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, designation of a relation value column, and the relationship.
- Specified reception process that accepts the specification of the type of statistical model associated with the value column, identifies the individual table that is the reference destination of each specified foreign key column, and the key value of the primary key column of the identified individual table
- Data acquisition process for acquiring the aggregate value of the relation value of the relation value column for each combination, the aggregate value acquired for each key value combination of the primary key column of each table, and the statistical model of the specified type Grouping process for grouping the key values of the primary key column of each table based on the table, and the key value of the primary key column of the table for each table Characterized in that to execute the attribute value column addition processing to add an attribute value string to an attribute value identification information for the group to which it belongs.
- the technical effect of the present invention is that a table configuration can be changed so that conditions for obtaining information that can be easily analyzed can be specified.
- schema setting information Information specifying the configuration of individual tables included in the relational database and the reference relationship between the tables is referred to as schema setting information.
- the schema setting information is created in advance by an analyst. Then, individual tables included in the relational database are created according to the schema setting information.
- FIG. 1 is an explanatory diagram showing an example of a relational database (hereinafter referred to as RDB).
- RDB relational database
- FIG. 1 shows a case where the RDB includes a purchase table, a customer master, a product master, and a store master as tables.
- the purchase table represents purchase records of individual purchases.
- the customer ID, the product ID, the store ID, the number of purchases, and the purchase amount are associated with each other in the purchase table. Therefore, each row of the purchase table shown in FIG. 1 indicates who purchased what products, how many, and how much at which store.
- Customer master represents individual customer attributes.
- the customer ID, age, and gender are associated with each other in the customer master. Accordingly, each row of the customer master shown in FIG. 1 indicates the age and sex for each customer.
- the product master represents the attributes of individual products.
- the product ID is associated with the product category in the product master. Therefore, each row of the product master shown in FIG. 1 indicates a product category for each product.
- the store master represents the attributes of individual stores.
- the store ID is associated with the district where the store exists in the store master. Accordingly, each row of the store master shown in FIG. 1 indicates the existing district for each store.
- the purchase table corresponds to a fact table.
- the customer master, the product master, and the store master correspond to the master table.
- the customer ID in the purchase table is the reference source when referring to the customer master
- the product ID in the purchase table is the reference source when referring to the product master
- the store ID in the purchase table is when referring to the store master. It becomes the reference source of.
- Such a reference relationship and the structure of each table are defined by schema setting information created in advance by an analyst.
- FIG. 1 illustrates an RDB in which a reference relationship is defined as a star schema, but the RDB reference relationship may be defined as a snowflake schema.
- Each table has columns (see Fig. 1). Columns can be classified into key columns and value columns.
- the key columns can be classified into foreign key columns and primary key columns.
- the specific value of the key stored in the key column is referred to as the key value.
- the foreign key column is a column that stores the foreign key.
- the foreign key is a key that becomes a reference source when referring to another table.
- the “customer ID” column, the “product ID” column, and the “store ID” column of the purchase table correspond to the foreign key columns.
- the primary key column is a column that stores the primary key.
- the primary key is a key that becomes a reference destination when referenced from another table.
- the “customer ID” column of the customer master, the “product ID” column of the product master, and the “store ID” column of the store master correspond to the primary keys.
- the value column can be classified into a relationship value column and an attribute value column.
- the relationship value column is a column that stores relationship values.
- the relationship value is a value representing a relationship between a plurality of key values. Accordingly, the relationship value is associated with a plurality of key values in the table.
- the “plurality of key values” described here may be all foreign keys, or may include a foreign key and a primary key. In the example shown in FIG. 1, the “number of purchases” column and the “purchase amount” column in the purchase table correspond to the relationship value columns.
- Attribute value column is a column that stores attribute values.
- the attribute value is a value representing the attribute of one primary key. Therefore, the attribute value is associated with one primary key in the table.
- the “age” column and the “sex” column of the customer master, the “product category” column of the product master, and the “district” column of the store master correspond to the attribute value columns.
- the table can be classified into, for example, three types from the first table to the third table shown below.
- the first table is a table including two or more foreign key columns.
- An example of the first table is a fact table.
- the purchase table corresponds to the first table.
- the second table is a table including only one primary key column as a key column.
- An example of the second table is a master table, for example.
- the customer master, the product master, and the store master correspond to the second table.
- the third table is a table having one primary key column and one or more foreign key columns.
- the third table is used, for example, in a snowflake schema.
- An example of the third table is shown in FIG.
- a column described as “FK” is a foreign key column
- a column described as “PK” is a primary key column.
- the table 91 shown in FIG. 2 includes one primary key column (“customer ID” column) and one foreign key column (“company ID” column). Therefore, the table 91 shown in FIG. 2 corresponds to the third table.
- FIG. 2 shows a case where the table 91 includes one foreign key column, the number of foreign key columns included in the third table may be two or more.
- a table having a plurality of key columns may have a relation value column. Accordingly, the first table and the third table may have a relationship value column.
- the purchase table shown in FIG. 1 corresponds to the first table, and has a “number of purchases” column and a “purchase amount” column as relation value columns.
- the table 91 illustrated in FIG. 2 corresponds to the third table, and has a “year of service” column as a relationship value column.
- a table having a plurality of key columns may not have a relationship value column.
- a table with a primary key column can have an attribute value column.
- the second table and the third table may have attribute value columns.
- Each of the customer master, the product master, and the store master shown in FIG. 1 corresponds to the second table, and has an attribute value column.
- the table having the primary key column may not have the attribute value column.
- the table 91 illustrated in FIG. 2 does not have an attribute value string.
- each column included in each table defined by the schema setting information is not limited to the example shown in FIG. That is, the structure of the table defined by the schema setting information created in advance by the analyst is not limited to the example shown in FIG. Further, the table reference relationship defined by the schema setting information is not limited to the example shown in FIG. Moreover, in FIG. 1, although the purchase table, the goods master, the customer master, and the store master were illustrated, the content which each table represents is not specifically limited.
- FIG. FIG. 3 is a block diagram illustrating a configuration example of the table operation system according to the first embodiment of this invention.
- the table operation system 1 according to the first embodiment includes a database management unit 2 (hereinafter simply referred to as a DB management unit 2), a designation receiving unit 3, a data acquisition unit 4, a grouping unit 5, and an attribute value sequence. And an additional unit 6.
- the DB management unit 2 is a database management system (DBMS) that manages an RDB (relational database) 7.
- DBMS database management system
- RDB relational database
- a reference relationship between tables included in the RDB 7 is defined by schema setting information created in advance by an analyst.
- the configuration of each table included in the RDB 7 is also defined by the schema setting information.
- the RDB 7 in which the configuration of each table and the table reference relationship are defined by the schema setting information includes a table having two or more foreign key columns and one or more relationship value columns.
- the RDB 7 also includes individual tables that are the reference destinations of these foreign key columns.
- the RDB 7 in which the configuration of each table and the reference relationship between the tables are defined by the schema setting information is the RDB illustrated in FIG.
- the purchase table shown in FIG. 1 corresponds to “a table having two or more foreign key columns and one or more relationship value columns”.
- each of the customer master, the product master, and the store master shown in FIG. 1 corresponds to “an individual table that is a reference destination of each foreign key column”.
- the table operation system 1 performs a process of adding an attribute value column to a part of these tables. Therefore, an attribute value string not defined in the schema setting information is added to some tables.
- the designation receiving unit 3 designates at least two foreign key columns in a table having two or more foreign key columns and one or more relational value columns (in this example, a purchase table) as an operator of the table operation system 1 (hereinafter referred to as the following). , Simply referred to as the operator.). Further, the designation receiving unit 3 also accepts designation of one relation value sequence in the table and designation of a statistical model type associated with the relation value series. Examples of the types of statistical models include Poisson distribution and Gaussian distribution, but are not limited to these distributions. When the operator specifies the “number of purchases” column of the purchase table as the relation value column, the operator may specify, for example, a Poisson distribution as the type of statistical model associated with the “number of purchases” column. Further, when the “purchase price” column of the purchase table is specified as the relation value column, the operator may specify, for example, a Gaussian distribution as the type of statistical model associated with the “purchase price” column.
- the designation receiving unit 3 displays, for example, a GUI (Graphical User Interface) on the display device (not shown in FIG. 3) for the operator to input the designation of the foreign key sequence, the relation value sequence, and the statistical model type.
- GUI Graphic User Interface
- the specification of the foreign key column, the relation value column, and the type of the statistical model may be received via the GUI.
- the designation receiving unit 3 receives the designation of information via the GUI will be described as an example.
- the designation receiving unit 3 will be described by taking as an example a case where the designation of the “customer ID” column and the “product ID” column among the foreign key columns of the purchase table shown in FIG. 1 is accepted. Further, the designation receiving unit 3 receives the designation of the “number of purchases” column in the relation value column of the purchase table shown in FIG. 1 and the case of accepting the designation of the Poisson distribution as the statistical model type. explain.
- the types of foreign key columns, relationship value columns, and statistical models specified by the operator are not limited to the above example. Further, the number of designated foreign key columns may be three or more.
- the data acquisition unit 4 specifies each table that is a reference destination of each foreign key designated by the operator. Then, the data acquisition unit 4 acquires, from the RDB 7, the total value of the relationship values of the specified relationship value column for each combination of key values of the primary key columns of each identified table.
- the processing of the data acquisition unit 4 will be described using a specific example.
- a “customer ID” column and a “product ID” column are designated as foreign keys.
- the data acquisition unit 4 specifies the customer master as a table to be referred to in the “customer ID” column.
- the data acquisition unit 4 specifies a product master as a table to be referred to in the “product ID” column.
- the data acquisition unit 4 calculates, for each combination of the key value of the primary key column of the customer master and the key value of the primary key column of the product master, the aggregate value of the relationship values of the specified relationship value column, Acquired from a table including a value column (in this example, a purchase table).
- the “number of purchases” column is designated as the relationship value column. Therefore, the data acquisition unit 4 acquires the total value of the number of purchases from the purchase table for each combination of the key value of the primary key column of the customer master and the key value of the primary key column of the product master.
- the primary key column of the customer master is the “customer ID” column in the customer master
- the primary key column of the product master is the “product ID” column in the product master.
- the data acquisition unit 4 searches for a row in the purchase table that matches the combination of the customer ID “183” and the product ID “50”, and returns a total value of the number of purchases in each row to the DB management unit. 2 is requested.
- FIG. 1 illustrates the example illustrated in FIG.
- the rows in the purchase table that match the combination of the customer ID “183” and the product ID “50” are the row B and the row E. Therefore, the DB management unit 2 calculates the total value “7” of the value “3” of the number of purchases in the row B and the value “4” of the number of purchases in the row E, and uses the calculation result as the total value of the number of purchases. Return to the data acquisition unit 4. As a result, the data acquisition unit 4 acquires the total value of the number of purchases in the combination of the customer ID “183” and the product ID “50”.
- the combination of the customer ID “183” and the product ID “50” is taken as an example, and the case where the data acquisition unit 4 acquires the total number of purchases is shown. Similarly, the data acquisition unit 4 acquires the total number of purchases for all combinations of the customer ID value and the product ID.
- the data acquisition unit 4 calculates the number of purchases for each combination of the key value of the primary key column of the customer master (customer ID in the customer master) and the key value of the primary key column of the product master (product ID in the product master).
- An aggregate value of the number of purchases for each combination may be acquired by generating an SQL requesting the aggregate value and inputting the SQL to the DB management unit 2.
- the grouping unit 5 uses the aggregate value acquired by the data acquisition unit 4 for each combination of primary key column key values of each table that is the reference destination of each specified foreign key, and the statistical model of the specified type. Based on that, group the key values in the primary key column of that individual table.
- the processing of the grouping unit 5 will be described using a specific example.
- the data acquisition unit 4 shows a case where the total value of the number of purchases is acquired for each combination of the customer ID value and the product ID.
- the Poisson distribution is specified as the type of the statistical model.
- the grouping unit 5 groups the customer ID in the customer master and the product ID in the product master based on the total number of purchases for each combination of the customer ID value and the product ID and the Poisson distribution. To do.
- a group of customer IDs is simply referred to as a customer ID group.
- a group of product IDs is simply referred to as a product ID group.
- the grouping unit 5 is configured such that each customer ID belongs to only one customer ID group and each product ID belongs to only one product ID group.
- a case where customer IDs and product IDs are grouped will be described as an example. Note that, in this way, defining a group so that one key value belongs to only one group is called clustering.
- the customer ID is represented by the symbol “c”. Further, the product ID is represented by a symbol “i”.
- a total value of relation values corresponding to the combination of the customer ID “c” and the product ID “i” (in this example, the total value of the number of purchases) is described as a total value x c, i .
- x c, i 7.
- FIG. 4 illustrates a state in which customer IDs and product IDs before grouping are arranged in order.
- FIG. 4 shows a state in which customer IDs are arranged in order in the horizontal axis direction and product IDs are arranged in order in the vertical axis direction.
- the total value xc, i of the number of purchases corresponding to the combination of customer ID and goods ID is shown in figure.
- x 3 and 5 shown in FIG. 4 are total values of the number of purchases corresponding to the combination of the customer ID “3” and the product ID “5”.
- FIG. 5 is an explanatory diagram schematically showing examples of customer ID groups and product ID groups determined by the grouping unit 5.
- the grouping unit 5 determines a plurality of customer ID groups and product ID groups. However, in FIG. 5, only the customer ID group with ID “9” and the product ID group with ID “4” are illustrated for the sake of simplicity.
- the number of customer ID groups and the number of product ID groups may be set to fixed values, respectively, or may not be limited to fixed values.
- the number of customer ID groups is K C
- the ID of each customer ID group is 1 to K C.
- the number of product ID group is a K I number
- ID of each product ID group is assumed to be a 1 ⁇ K I.
- the customer ID group referred to as the customer ID group "k". This also applies to the product ID group.
- customer IDs and product IDs belonging to a group are shown in parentheses. For example, customer IDs “1” and “3” belong to the customer ID group “9”. Product ID “2”, “5”, etc. belong to the product ID group “4”.
- a combination of one customer ID group and one product ID group corresponds to a customer ID belonging to the customer ID group and an aggregate value x c, i corresponding to a combination of product IDs belonging to the product ID group.
- x 1 , 2 , x 3 , 5 and the like correspond to combinations of the customer ID group “9” and the product ID group “4”.
- FIG. 5 is a modification of FIG. 4 so that customer IDs belonging to the same customer ID group are continuously arranged and product IDs belonging to the same product ID group are continuously arranged. it can.
- the customer ID group to which the customer ID “c” belongs is denoted as z C c .
- z C c 1
- the product ID group to which the product ID “i” belongs is denoted as z I i .
- z I i the product ID group to which the product ID “i” belongs.
- z I 2 4
- z I i 4
- z I i may be represented by a vector in which only the element corresponding to the ID of the product ID group is 1 and the other elements are 0.
- p (x c, i ) the probability that the total value x c, i occurs under the parameters of the statistical model is denoted as p (x c, i ).
- p (x c, i ) is expressed as in the following formula (1).
- ⁇ (z C c , z I i ) is defined as the relationship between the customer ID group “z C c ” to which the customer ID “c” belongs and the product ID group “z I i ” to which the product ID “i” belongs.
- This is a distribution parameter of the total number of purchases corresponding to the combination.
- This distribution is a distribution designated by the operator, and in this example is a Poisson distribution. That is, in this example, ⁇ (z C c , z I i ) is a Poisson distribution parameter corresponding to a combination of “z C c ” and “z I i ”.
- P (x c, i ) is the probability that x c, i will occur under such a Poisson distribution parameter.
- the distribution parameter may be a parameter corresponding to the type of distribution. For example, the parameters of the Gaussian distribution are mean and variance.
- the grouping unit 5 uses the following formula (2).
- Equation (2) is a distribution parameter ( ⁇ (z C c ) corresponding to each customer ID group to which each customer ID belongs, each product ID group to which each product ID belongs, and a combination of the customer ID group and the product ID group. , Z I i )).
- S c is a set of customer IDs
- S i is a set of product IDs.
- the grouping unit 5 determines each customer ID group and each product ID group using the likelihood calculated by Expression (2). At this time, the grouping unit 5 also determines a distribution parameter ( ⁇ (z C c , z I i )) corresponding to the combination of the customer ID group and the product ID group for each combination.
- the grouping unit 5 updates z C c , z I i , ⁇ (z C c , z I i ) in equation (2) so that the likelihood calculated in equation (2) increases.
- the distribution parameters corresponding to each customer ID group, each product ID group, and the combination of the customer ID group and the product ID group may be determined.
- the grouping unit 5 may use EM (Expectation-Maximization).
- the method of maximizing the likelihood is an example, and the grouping unit 5 uses a method other than the method of maximizing the likelihood (for example, variational Bayes method, Gibbs sampling method, etc.)
- Each customer ID group, each product ID group, etc. may be determined.
- each customer ID in the customer master can be associated with the ID of the customer ID group to which the customer ID belongs as an attribute value.
- each product ID in the product master can be associated with the ID of the product ID group to which the product ID belongs as an attribute value.
- the attribute value string adding unit 6 adds an attribute value string having, as an attribute value, the ID of the group to which the key value of the primary key column of the table belongs for each table that is a reference destination of each designated foreign key. .
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the customer ID group to which the customer ID belongs to the customer master.
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the product ID group to which the product ID belongs to the product master.
- FIG. 6 shows an example of a customer master and a product master in which an attribute value string is newly added.
- the purchase table and the store master shown in FIG. 1 are not shown.
- the case where the “customer ID” column and the “product ID” column among the foreign key columns are specified has been described as an example. Therefore, an attribute value column is newly added to the customer master and the product master. If the “store ID” column is included in the designated foreign key column, an attribute value column is newly added to the store master as a reference destination of the “store ID” column.
- the attribute values of the attribute value sequence to be added are also represented by numbers.
- the ID of the customer ID group or the ID of the product ID group may be represented by a name that is easy for an analyst to understand, and the attribute value of the attribute value string to be added may be such a name that is easy to understand. This also applies to a second embodiment described later.
- the DB management unit 2, the designation reception unit 3, the data acquisition unit 4, the grouping unit 5, and the attribute value string addition unit 6 are realized by a CPU of a computer that operates according to a program, for example.
- the CPU reads a table operation program from a program recording medium such as a computer program storage device (not shown in FIG. 3), and in accordance with the table operation program, the DB management unit 2, the designation receiving unit 3, and the data acquisition unit 4
- the grouping unit 5 and the attribute value string adding unit 6 may be operated.
- the DB management unit 2, the specification receiving unit 3, the data acquisition unit 4, the grouping unit 5, and the attribute value sequence adding unit 6 may be realized by different hardware. These points are the same in the second embodiment described later.
- the table operation system 1 may have a configuration in which two or more physically separated devices are connected by wire or wirelessly. This also applies to a second embodiment described later.
- FIG. 7 is a flowchart illustrating an example of processing progress according to the first embodiment of this invention.
- the designation receiving unit 3 receives designation of at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, and designates one relation value column in the table and its relation value
- the designation of the type of statistical model associated with the column is accepted (step S1).
- the designation accepting unit 3 displays a GUI for the user to input these items on a display device (not shown in FIG. 3), and accepts designation of each item described above via the GUI. A specific example of the GUI will be described later.
- the “customer ID” column and the “product ID” column of the purchase table shown in FIG. 1 are designated as the foreign key columns, and the “number of purchases” column of the purchase table is designated as the relation value column. Take the case as an example. In addition, a case where a Poisson distribution is designated as the statistical model type is taken as an example.
- the data acquisition unit 4 identifies each table that is a reference destination of the foreign key column designated in step S1.
- the data acquisition unit 4 specifies a customer master and a product master (see FIG. 1), which are reference destinations of the “customer ID” column and the “product ID” column of the purchase table.
- the data acquisition part 4 acquires the total value of the relationship value of the designated relationship value row
- the data acquisition unit 4 counts the number of purchases for each combination of each customer ID stored in the “customer ID” column of the customer master and each product ID stored in the “product ID” column of the product master.
- the SQL requesting the total value is generated, and the SQL is input to the DB management unit 2.
- the DB management unit 2 searches the purchase table (see FIG. 1) for a line that matches the combination of the customer ID and the product ID, and calculates the total number of purchases in the searched line. Executed for each combination of customer ID and product ID.
- the DB management unit 2 returns the total number of purchases calculated for each combination of customer ID and product ID to the data acquisition unit 4.
- the data acquisition unit 4 acquires the total value of the number of purchases for each combination.
- the grouping unit 5 groups the key values of the primary key column of each table specified in step S2 (step S3).
- the grouping unit 5 has a customer ID group (z C c ) to which each customer ID belongs, and a product ID group (to which each product ID belongs) (z C c ) so that the likelihood calculated by Expression (2) increases.
- z I i a customer ID group
- z I i a product ID group (to which each product ID belongs)
- the Poisson distribution parameter ⁇ (z C c , z I i ) corresponding to the combination of “z C c ” and “z I i ” are updated, and each customer ID group, each product
- the parameters of the Poisson distribution according to the ID group and the combination of the customer ID group and the product ID group are determined.
- the attribute value string adding unit 6 adds, for each table specified in step S2, an attribute value string whose attribute is the ID of the group to which the key value of the primary key string of the table belongs (step S4).
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the customer ID group to which the customer ID belongs to the customer master.
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the product ID group to which the product ID belongs to the product master.
- a new attribute value sequence is added to the customer master and the product master.
- the attribute value string that is not defined in the schema setting information is added to the table that is the reference destination of the foreign key string specified in step S1. Therefore, the analyst can include the attribute value of the added attribute value string (the ID of the group to which the primary key belongs) in the condition when causing the BI tool to present information that matches the desired condition. For example, when an analyst wants to know the number of purchases of a customer belonging to the customer ID group “5” and a product belonging to the merchandise ID group “11”, the analyst selects “customer ID group“ 5 ”” for the customer. The condition “product ID group“ 11 ”” can be specified for the product.
- the grouping unit 5 groups the key values of the primary key column so that the likelihood of the group or parameter increases. Each group obtained as a result cannot be grasped by the analyst from the RDB 7 created according to the schema setting information.
- the present invention adds such a group ID as a new attribute value to the reference table. Therefore, according to the present invention, it becomes possible for the analyst to specify conditions that the analyst cannot notice from the configuration of the table defined by the schema setting information. As a result, information that is easier to analyze can be extracted from the RDB 7 by the BI tool and presented to the analyst.
- the data acquisition part 4 should just acquire the total value of a relationship value for every combination of customer ID in a customer master, goods ID in a goods master, and shop ID in a shop master.
- the grouping part 5 should just divide each into group regarding customer ID, goods ID, and shop ID.
- the grouping unit 5 updates the parameters of each group to which each key to be grouped belongs, the distribution (statistical model of the specified type) according to the combination of each group, What is necessary is just to determine the parameter of the distribution according to each group and the combination of each group so that the likelihood of the parameter of the distribution according to the combination of groups may be increased.
- the grouping unit 5 includes a customer ID group to which each customer ID of the customer master belongs, a product ID group to which each product ID of the product master belongs, a store ID group to which each store ID of the store master belongs, And the parameter of distribution according to the combination of a customer ID group, a product ID group, and a store ID group is updated, and the likelihood is increased so that each product ID group, each customer ID group, each store ID Group and distribution parameters may be established.
- the attribute value string adding unit 6 may add a new attribute value string to the customer master, the product master, and the store master.
- FIG. 3 The table operation system according to the second embodiment of the present invention can be represented by the block diagram shown in FIG. 3 similarly to the table operation system according to the first embodiment. Therefore, the second embodiment will be described with reference to FIG. Will be explained. Explanation of matters similar to those in the first embodiment will be omitted as appropriate.
- the table operation system 1 includes a DB management unit 2, a designation receiving unit 3, a data acquisition unit 4, a grouping unit 5, and an attribute value string adding unit 6 (see FIG. 3).
- the DB management unit 2 is the same as the DB management unit 2 in the first embodiment. Also in the second embodiment, description will be made assuming that the RDB 7 in which the configuration of each table and the reference relationship between the tables are defined by the schema setting information is the RDB illustrated in FIG.
- the designation receiving unit 3 receives designation of at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, and designates one relation value column in the table and its relation value
- the specification of the type of statistical model associated with the column is accepted. This operation is the same as in the first embodiment.
- the “customer ID” column and the “product ID” column of the purchase table shown in FIG. 1 are designated as the foreign key columns, and the “number of purchases” of the purchase table as the relationship value column. ”Column is specified as an example.
- a case where a Poisson distribution is designated as the statistical model type is taken as an example.
- the designation receiving unit 3 further accepts designation of an attribute value sequence possessed by a table that is a reference destination of the designated foreign key column, and the type of the statistical model associated with the attribute value sequence.
- the specification is also accepted.
- the number of attribute value strings specified in the second embodiment is one or more. There are a plurality of tables to which the specified foreign key column is referred, but there may be a table in which no attribute value column is specified. Also, the type of statistical model is designated for each designated attribute value string.
- the “customer ID” column and the “product ID” column of the purchase table are designated as foreign key columns.
- the table that is the reference destination of the “customer ID” column is the customer master
- the table that is the reference destination of the “product ID” column is the product master (see FIG. 1).
- the customer master has an “age” column and a “sex” column as attribute value columns
- the product master has a “product category” column as attribute value columns (see FIG. 1).
- the designation receiving unit 3 receives designation of one or more attribute value columns from among the “age” column, the “sex” column, and the “product category” column.
- the designation receiving unit 3 receives the designation of one attribute value sequence
- the attribute value column is an “age” column
- the operator may specify the type of the statistical model according to the attribute value string to be specified.
- the operator may specify the Gaussian distribution as the type of the statistical model associated with the “age” column.
- a Gaussian distribution is specified as the type of the statistical model associated with the “age” column will be described as an example.
- the operator when the operator specifies the “gender” column as the attribute value column, the operator may specify the Bernoulli distribution. Further, for example, when the “product category” column is specified as the attribute value column, the operator may specify a multinomial distribution.
- the data acquisition unit 4 specifies each table that is a reference destination of each foreign key designated by the operator. Then, the data acquisition unit 4 acquires, from the RDB 7, the total value of the relationship values of the specified relationship value column for each combination of key values of the primary key columns of each identified table. This operation is the same as in the first embodiment.
- the data acquisition unit 4 further acquires the attribute value of the designated attribute value column for each key value of the primary key column of the table having the attribute value column.
- an “age” column is designated as the attribute value column.
- the table having the “age” column is the customer master, and the primary key column of the customer master is the “customer ID” column (see FIG. 1). Therefore, in this case, the data acquisition unit 4 acquires the attribute value of the “age” column for each customer ID included in the customer master.
- the data acquisition unit 4 generates an SQL requesting the attribute value of the specified attribute value column for each key value of the primary key column of the table having the attribute value column, and inputs the SQL to the DB management unit 2 Thus, an attribute value may be acquired for each key value.
- the grouping unit 5 is a statistical model of the type associated with the aggregate value acquired for each key value combination of the primary key column of each table that is the reference destination of each designated foreign key column and the relationship value column Based on the attribute value obtained for each key value of the primary key column of the table having the specified attribute value column and the statistical model of the type associated with the attribute value column. Group key values in primary key columns. In other words, the grouping unit 5 groups the key values of the primary key string using the attribute values of the designated attribute value string.
- a c be the age corresponding to the customer ID “c”.
- the age corresponding to the customer ID "c” is the probability a a c
- the probability of age a c occurs under the parameters of the statistical model referred to as p (a c).
- p (a c ) is expressed as in the following formula (3).
- ⁇ (z C c ) is a parameter of the age distribution corresponding to the customer ID group “z C c ” to which the customer ID “c” belongs.
- This distribution is a statistical model of the type associated with the attribute value sequence. This type is specified by the operator. In this example, this distribution is a Gaussian distribution.
- the grouping unit 5 uses the following formula (4).
- Expression (4) is a distribution parameter ( ⁇ (z C c ,) according to each customer ID group to which each customer ID belongs, each product ID group to which each product ID belongs, and a combination of the customer ID group and the product ID group. z I i )) and the likelihood of the combination of distribution parameters ( ⁇ (z C c )) according to customer ID groups.
- the grouping unit 5 determines each customer ID group and each product ID group using the likelihood calculated by Expression (4). At this time, the grouping unit 5 also determines a distribution parameter ( ⁇ (z C c , z I i )) corresponding to the combination of the customer ID group and the product ID group for each combination, and also according to each customer ID group. The distribution parameter ( ⁇ (z C c )) is also determined for each customer ID group.
- the grouping unit 5 sets z C c , z I i , ⁇ (z C c , z I i ), ⁇ (z in equation (4) so that the likelihood calculated in equation (4) increases.
- C c is updated, and the distribution parameters according to each customer ID group, each product ID group, the combination of the customer ID group and the product ID group, and the distribution parameters according to the customer ID group are determined. That's fine.
- the operation in which the grouping unit 5 updates the groups and parameters so as to increase the likelihood and determines the groups and parameters is the same as the operation in the first embodiment.
- each customer ID in the customer master can be associated with the ID of the customer ID group to which the customer ID belongs as an attribute value.
- each product ID in the product master can be associated with the ID of the product ID group to which the product ID belongs as an attribute value.
- the attribute value string adding unit 6 adds an attribute value string having, as an attribute value, the ID of the group to which the key value of the primary key column of the table belongs for each table that is a reference destination of each designated foreign key. .
- the operation of the attribute value string adding unit 6 in the second embodiment is the same as the operation of the attribute value string adding unit 6 in the first embodiment.
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the customer ID group to which the customer ID belongs to the customer master.
- the attribute value string adding unit 6 adds an attribute value string whose attribute value is the ID of the product ID group to which the product ID belongs to the product master.
- FIG. 8 is a flowchart showing an example of processing progress of the second embodiment of the present invention.
- the designation receiving unit 3 receives designation of at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, and designates one relation value column in the table and its relation value
- the specification of the type of statistical model associated with the column is accepted.
- the designation accepting unit 3 accepts designation of an attribute value sequence possessed by a table that is a reference destination of the foreign key sequence and designation of a type of statistical model associated with the attribute value sequence (step S11).
- the designation accepting unit 3 displays a GUI for the user to input these items on a display device (not shown in FIG. 3), and accepts designation of each item described above via the GUI. A specific example of the GUI will be described later.
- the “customer ID” column and the “product ID” column of the purchase table shown in FIG. 1 are designated as the foreign key columns, and the “number of purchases” column of the purchase table is designated as the relation value column.
- the relation value column Take the case as an example.
- a case where a Poisson distribution is designated as a type of a statistical model associated with a relation value string is taken as an example.
- a case where an “age” column is designated as the attribute value sequence and a Gaussian distribution is designated as the type of the statistical model associated with the attribute value sequence is taken as an example.
- the data acquisition unit 4 specifies each table that is a reference destination of the foreign key column designated in step S11.
- the data acquisition unit 4 specifies a customer master and a product master (see FIG. 1), which are reference destinations of the “customer ID” column and the “product ID” column of the purchase table.
- the data acquisition part 4 acquires the total value of the relationship value of the designated relationship value row
- step S12 the data acquisition unit 4 acquires the attribute value of the designated attribute value column for each key of the primary key column of the table having the attribute value column (step S12).
- the table having the “age” column designated as the attribute value column is the customer master (see FIG. 1). Therefore, in this example, the data acquisition unit 4 acquires the attribute value of the “age” column for each customer ID stored in the “customer ID” column of the customer master.
- the data acquisition unit 4 generates an SQL requesting the attribute value of the “age” column for each customer ID stored in the “customer ID” column of the customer master, and inputs the SQL to the DB management unit 2.
- the DB management unit 2 executes processing for extracting the attribute value of the “age” column for each customer ID stored in the “customer ID” column of the customer master, and extracts for each customer ID.
- the attribute value of the “age” column is returned to the data acquisition unit 4.
- the data acquisition unit 4 acquires an attribute value for each customer ID.
- the grouping unit 5 groups the key values in the primary key column of each table (customer master and product master) specified in step S12 (step S13).
- the grouping unit 5 has a customer ID group (z C c ) to which each customer ID belongs, and a product ID group (to which each product ID belongs) (z C c ) so that the likelihood calculated by Expression (4) increases.
- ⁇ (z C c ) is updated, each customer ID group, each product ID group, a Poisson distribution parameter according to the combination of the customer ID group and the product ID group, and a Gaussian distribution parameter according to the customer ID group Confirm each.
- step S14 the attribute value string adding unit 6 adds, for each table specified in step S12, an attribute value string whose attribute is the ID of the group to which the key value of the primary key string of the table belongs (step S14).
- the process of step S14 is the same as the process of step S4 in the first embodiment.
- a new attribute value sequence is added to the customer master and the product master.
- the same effect as in the first embodiment can be obtained.
- the grouping when grouping the key values of the primary key column of the table that is the reference destination of the designated foreign key, the grouping is performed in consideration of the designated attribute value column. Therefore, the accuracy of grouping the key values of the primary key sequence is improved. As a result, the analyst can more appropriately specify conditions for obtaining information that can be easily analyzed.
- the “age” column of the customer master is designated as the attribute value column
- the “age” column is an example of an attribute value column, and an attribute value column other than the “age” column may be designated.
- the number of attribute value strings specified by the operator may be two or more.
- the grouping unit 5 may perform grouping in step S13 using a likelihood calculation formula corresponding to the specified attribute value sequence.
- the first embodiment and the second embodiment can use a common GUI.
- the screen used as the GUI includes, for example, a table name display column 51, a selected table name display column 52, a column name display column 53, a first selected column name display column 54, a reference table name display column 55, 56, a second selected column name display field 57, a type selection field 58, and buttons 61 to 64.
- the selected table name display field 52, the column name display field 53, the first selected column name display field 54, the reference table name display fields 55 and 56, the second selected column name display field 57, and the type selection field 58 are: It is blank in the initial state.
- the table name display column 51 is a column for displaying the table name of each table.
- the designation receiving unit 3 displays the table name of each table defined in the schema setting information in the table name display field 51.
- 9 and 10 exemplify a state in which the designation receiving unit 3 displays three table names such as “order ⁇ ” in the table name display field 51.
- the selected table name display column 52 is a column for displaying the table name selected by the operator. For example, when a click operation is performed on one of the table names displayed in the table name display field 51, the designation receiving unit 3 determines that the table name has been selected, and selects the table name. This is displayed in the selection table name display field 52.
- FIG. 9 and FIG. 10 illustrate a state where the designation receiving unit 3 displays the table name “order” in the selected table name display column 52 when the operator clicks “order” in the table name display column 51. ing.
- the column name display column 53 is a column that displays the column name of each column included in the table indicated by the selected table name.
- 9 and 10 exemplify a state in which the designation receiving unit 3 displays the column names of the respective columns included in the table having the table name “order” in the column name display field 53.
- FIG. 9 and FIG. 10 the symbol “PK” is added to the column name of the primary key column, and the symbol “FK” is added to the column name of the foreign key column.
- the first selected column name display column 54 is a column that displays the column name of the selected foreign key column.
- the foreign key column whose column name is displayed in the first selected column name display field 54 is a foreign key column designated by the operator.
- the foreign key column selection operation (in other words, the designation operation) is, for example, an operation of clicking the column name of the foreign key column displayed in the column name display field 53 and then clicking the button 61.
- the designation receiving unit 3 displays the column name of the foreign key column in the first selected column name display column 54 and deletes the column name from the column name display column 53.
- the operator performs the above selection operation on at least two foreign key columns, and the designation receiving unit 3 determines that those foreign key columns are designated.
- FIG. 10 illustrates a state in which the foreign key column having the column name “customer_cd” and the foreign key column having the column name “item_cd” are designated by the operator.
- the operator clicks the column name of the foreign key column to be undesignated among the column names displayed in the first selected column name display field 54, Subsequently, the button 62 may be clicked.
- the designation receiving unit 3 deletes the column name of the foreign key column from the first selected column name display column 54, and the column name is again displayed in the column name display column 53. indicate.
- the designation receiving unit 3 also obtains the table name of the table that is the reference destination of the foreign key column (that is, the foreign key column designated by the operator) whose column name is displayed in the first selected column name display field 54. Displayed in the reference table name display fields 55 and 56.
- the table name “customer” of the table that is the reference destination of the foreign key column with the column name “customer_cd” and the table name “item” of the table that is the reference destination of the foreign key column of the column name “item_cd” are shown.
- the state displayed in the reference table name display fields 55 and 56 is illustrated. The operator can grasp the table that becomes the reference destination of the designated foreign key column by looking at the reference destination table name display fields 55 and 56.
- the second selected column name display column 57 is a column for displaying the column name of the selected relation value column or the column name of the selected attribute value column.
- the value string whose column name is displayed in the second selected column name display field 57 is a relation value string or an attribute value string designated by the operator.
- the relation value column selection operation (in other words, the designation operation) is, for example, an operation of clicking the column name of the relation value column displayed in the column name display field 53 and then clicking the button 63.
- the designation receiving unit 3 displays the column name of the relation value column in the second selected column name display column 57 and deletes the column name from the column name display column 53.
- the operator performs the above selection operation on one relation value string, and the designation receiving unit 3 determines that the relation value string has been designated.
- the type selection column 58 is a column for selecting the type of statistical model corresponding to the value column indicated by the column name displayed in the second selected column name display column 57.
- the type selection column 58 is a pull-down menu is illustrated.
- the operator may select the type of the statistical model corresponding to the value column indicated by the column name displayed in the second selected column name display column 57 by operating the type selection column 58.
- the designation receiving unit 3 determines that the statistical model type corresponding to the value column indicated by the column name displayed in the second selected column name display field 57 is designated.
- FIG. 10 illustrates a case where a relation value column having a column name “quantity” is designated, and a Poisson distribution is designated as the type of statistical model corresponding to the relation value column.
- the attribute value string selection operation and the statistical model type selection operation corresponding to the attribute value string are the same as the relation value column selection operation and the statistical model type selection operation corresponding to the relation value column.
- the operator may click the column name of the attribute value column displayed in the column name display field 53 and click the button 63.
- the designation receiving unit 3 displays the column name of the attribute value column in the second selected column name display column 57 and deletes the column name from the column name display column 53.
- the operator performs the above operation on the attribute value string to be selected, and the designation receiving unit 3 determines that the attribute value string has been designated.
- the operator selects the statistical model type corresponding to the attribute value column by operating the type selection column 58 in a state where the specified attribute value column is displayed in the second selection column name display column 57. do it.
- this operation it is determined that the type of statistical model corresponding to the attribute value column displayed in the second selected column name display field 57 is selected.
- the table operation system 1 performs the attribute value sequence. And statistical models need not be used.
- the operator When the operator cancels the designation of the relation value column or the attribute value column, the operator clicks the column name of the relation value column or the attribute value column displayed in the second selected column name display field 57. Subsequently, the button 64 may be clicked. When such an operation is performed, the designation receiving unit 3 deletes the column name of the relation value column or the column name of the attribute value column from the second selected column name display field 57, and the column name is again displayed. Are displayed in the column name display field 53.
- the designation receiving unit 3 can accept the designation of various items by the GUI as described above.
- the GUI is not limited to the GUI illustrated in FIGS. 9 and 10.
- the designation receiving unit 3 may receive the designation of various items without using the GUI.
- the grouping unit 5 may group the key values by allowing one key value to belong to a plurality of groups.
- the grouping unit 5 is configured so that each customer ID belongs to one or more customer ID groups and each product ID belongs to one or more product ID groups.
- the customer ID group and the product ID group may be determined.
- the key column may include a plurality of columns.
- the key value may be represented by a vector.
- FIG. 11 is an explanatory diagram illustrating an example in which the key column includes a plurality of columns.
- FIG. 11 illustrates a key string for uniquely identifying individual employees belonging to the group company.
- the key column illustrated in FIG. 11 includes a “company ID” column and an “in-house employee ID” column.
- a plurality of companies forming a group company may use the same in-house employee ID. For example, as shown in FIG. 11, it is assumed that the company “1001” and the company “1002” have different employees having the same in-house employee ID “12345”.
- individual employees belonging to the group company cannot be uniquely identified only by the in-house employee ID.
- individual employees belonging to the group company can be uniquely identified.
- different employees can be uniquely identified by a key value of (1001, 12345) and a key value of (1002, 12345).
- FIG. 11 illustrates the case where the key column includes two columns, the key column may include three or more columns.
- the relation value column may include a plurality of columns.
- the relationship value sequence includes a plurality of columns
- the relationship value may be represented by a vector.
- FIG. 12 is an explanatory diagram illustrating an example in the case where the relationship value sequence includes a plurality of columns.
- FIG. 12 exemplifies a relation value string that stores the number of purchases.
- the relational value column for storing the purchase quantity includes a “daytime purchase quantity” column and a “night purchase quantity” column is illustrated.
- the number of purchases can be represented by a vector (12, 5).
- the number of purchases can be represented by a vector of (7, 10).
- FIG. 12 illustrates the case where the relationship value column includes two columns, the relationship value column may include three or more columns.
- FIG. 13 is an explanatory diagram showing an example of a table in which the relation value string is not explicitly shown.
- the table shown in FIG. 13 includes a “customer ID” column and a “company ID” column as key columns.
- the “company ID” column stores the ID of the company where the person specified by the customer ID is working.
- the relationship value sequence is not explicitly shown. However, in FIG. 13, it can be considered that there is a relationship value string that stores a relationship value of “whether there is an employment relationship between the customer and the company”.
- the relationship value “whether there is an employment relationship between the customer and the company” is one of two values (in this example, “0” and “1”).
- the value of this relationship value is “1”
- the value of this relationship value is “1”.
- the relationship value “existence of employment relationship between customer and company” is “1”. Therefore, in all the rows illustrated in FIG. 13, it can be considered that there is a relationship value string in which “1” is stored as the relationship value “whether there is an employment relationship between the customer and the company”.
- FIG. 14 is a schematic block diagram showing a configuration example of a computer according to each embodiment of the present invention.
- the computer 1000 includes a CPU 1001, a main storage device 1002, an auxiliary storage device 1003, an interface 1004, a display device 1005, and an input device 1006.
- the table operation system of each embodiment is implemented in the computer 1000.
- the operation of the table operation system is stored in the auxiliary storage device 1003 in the form of a program (table operation program).
- the CPU 1001 reads out the program from the auxiliary storage device 1003, develops it in the main storage device 1002, and executes the above processing according to the program.
- the auxiliary storage device 1003 is an example of a tangible medium that is not temporary.
- Other examples of the non-temporary tangible medium include a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, and a semiconductor memory connected via the interface 1004.
- this program is distributed to the computer 1000 via a communication line, the computer 1000 that has received the distribution may develop the program in the main storage device 1002 and execute the above processing.
- the program may be for realizing a part of the above-described processing.
- the program may be a differential program that realizes the above-described processing in combination with another program already stored in the auxiliary storage device 1003.
- FIG. 15 is a block diagram showing an outline of the present invention.
- the table operation system of the present invention includes a designation receiving unit 71, a data obtaining unit 72, a grouping unit 73, and an attribute value string adding unit 74.
- the designation receiving means 71 (for example, the designation receiving unit 3) specifies at least two foreign key columns in a table having two or more foreign key columns and one or more relation value columns, specifies a relation value column, The specification of the type of statistical model associated with the relation value string is accepted.
- the data acquisition unit 72 (for example, the data acquisition unit 4) specifies each table that is a reference destination of each designated foreign key column, and for each combination of key values of the primary key column of each specified table, Acquires the aggregate value of relation values in the relation value column.
- the grouping unit 73 (for example, the grouping unit 5) uses the aggregated value acquired for each key value combination of the primary key column of each table and the statistical model of the specified type to determine the primary of each table. Group the key values in the key column into groups.
- the attribute value string adding unit 74 (for example, the attribute value string adding unit 6) adds, for each table, an attribute value string whose attribute value is the identification information of the group to which the key value of the primary key column of the table belongs. .
- the configuration of the table can be changed so that conditions for obtaining information that can be easily analyzed can be specified.
- the designation accepting means 71 accepts designation of an attribute value string possessed by a table that is a reference destination of the designated foreign key string, and designation of a type of statistical model associated with the attribute value string, and a data obtaining means 72
- the attribute value of the designated attribute value column is obtained for each key value of the primary key column of the table having the attribute value column, and the grouping means 73 makes individual reference destinations of the designated foreign key columns. It is acquired for each key value of the primary key column of the table with the aggregate value acquired for each key value combination of the primary key column of the table, the statistical model of the type associated with the relationship value column, and the attribute value column
- the key value of the primary key column of each table may be grouped based on the attribute value and the statistical model of the type associated with the attribute value column.
- the designation receiving means 71 displays a table name display column (for example, the table name display column 51) for displaying the table name of each table, and a column for displaying the column name of each column included in the table indicated by the selected table name.
- Name display field for example, column name display field 53
- a selected column name display field for example, second selected column name
- a type selection column for example, a type selection column 58
- the specification of at least two foreign key columns is accepted and the column names displayed in the column name display column are accepted.
- the column name of the relation value column is selected from and the type of statistical model is selected in the type selection column.
- the specification of the relation value column and the specification of the type of statistical model associated with the relation value column are accepted, and the column name of the attribute value column is selected from the column names displayed in the column name display column.
- a configuration may be adopted in which designation of an attribute value sequence and designation of a statistical model type associated with the attribute value sequence are received by being selected and selecting a statistical model type in the type selection column.
- grouping means 73 may be configured to group key values so that one key value belongs to only one group.
- the grouping means 73 may be configured to group key values by allowing one key value to belong to a plurality of groups.
- the present invention is preferably applied to a table operation system that operates a table in a predetermined database.
- Table operation system Database management part (DB management part) 3 designation reception part 4 data acquisition part 5 grouping part 6 attribute value string addition part
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
図3は、本発明の第1の実施形態の表操作システムの構成例を示すブロック図である。第1の実施形態の表操作システム1は、データベース管理部2(以下、単にDB管理部2と記す。)と、指定受付部3と、データ取得部4と、グルーピング部5と、属性値列追加部6とを備える。
本発明の第2の実施形態の表操作システムは、第1の実施形態の表操作システムと同様に、図3に示すブロック図で表すことができるので、図3を用いて第2の実施形態を説明する。第1の実施形態と同様の事項については、適宜説明を省略する。
2 データベース管理部(DB管理部)
3 指定受付部
4 データ取得部
5 グルーピング部
6 属性値列追加部
Claims (7)
- 2以上のフォーリンキー列と1以上の関係値列とを持つ表における少なくとも2つのフォーリンキー列の指定と、関係値列の指定と、当該関係値列に対応付けられる統計モデルの種別の指定とを受け付ける指定受付手段と、
指定された各フォーリンキー列の参照先となる個々の表を特定し、特定した個々の表のプライマリキー列のキー値の組み合わせ毎に、前記関係値列の関係値の集計値を取得するデータ取得手段と、
前記個々の表のプライマリキー列のキー値の組み合わせ毎に取得された集計値と、指定された種別の統計モデルとに基づいて、前記個々の表のプライマリキー列のキー値をそれぞれグループ分けするグルーピング手段と、
前記個々の表毎に、当該表のプライマリキー列のキー値が属するグループの識別情報を属性値とする属性値列を追加する属性値列追加手段とを備える
ことを特徴とする表操作システム。 - 指定受付手段は、
指定されたフォーリンキー列の参照先となる表が持つ属性値列の指定と、当該属性値列に対応付けられる統計モデルの種別の指定とを受け付け、
データ取得手段は、
指定された属性値列の属性値を、当該属性値列を持つ表のプライマリキー列のキー値毎に取得し、
グルーピング手段は、
指定された各フォーリンキー列の参照先となる個々の表のプライマリキー列のキー値の組み合わせ毎に取得された集計値と、関係値列に対応付けられた種別の統計モデルと、前記属性値列を持つ表のプライマリキー列のキー値毎に取得された属性値と、前記属性値列に対応付けられた種別の統計モデルとに基づいて、前記個々の表のプライマリキー列のキー値をそれぞれグループ分けする
請求項1に記載の表操作システム。 - 指定受付手段は、
各表の表名を表示する表名表示欄と、
選択された表名が示す表が持つ各列の列名を表示する列名表示欄と、
選択された関係値列の列名または選択された属性値列の列名を表示する選択列名表示欄と、
前記選択列名表示欄に表示された列名に対応する統計モデルの種別を選択するための種別選択欄とを含む画面を表示し、
前記列名表示欄に表示された列名の中から少なくとも2つのフォーリンキー列の列名が選択されることによって、前記少なくとも2つのフォーリンキー列の指定を受け付け、
前記列名表示欄に表示された列名の中から関係値列の列名が選択され、かつ、前記種別選択欄で統計モデルの種別が選択されることによって、前記関係値列の指定と、当該関係値列に対応付けられる統計モデルの種別の指定を受け付け、
前記列名表示欄に表示された列名の中から属性値列の列名が選択され、かつ、前記種別選択欄で統計モデルの種別が選択されることによって、前記属性値列の指定と、当該属性値列に対応付けられる統計モデルの種別の指定を受け付ける
請求項2に記載の表操作システム。 - グルーピング手段は、1つのキー値が1つのグループのみに属するようにキー値をグループ分けする
請求項1から請求項3のうちのいずれか1項に記載の表操作システム。 - グルーピング手段は、1つのキー値が複数のグループに属することを許容してキー値をグループ分けする
請求項1から請求項3のうちのいずれか1項に記載の表操作システム。 - 2以上のフォーリンキー列と1以上の関係値列とを持つ表における少なくとも2つのフォーリンキー列の指定と、関係値列の指定と、当該関係値列に対応付けられる統計モデルの種別の指定とを受け付け、
指定された各フォーリンキー列の参照先となる個々の表を特定し、特定した個々の表のプライマリキー列のキー値の組み合わせ毎に、前記関係値列の関係値の集計値を取得し、
前記個々の表のプライマリキー列のキー値の組み合わせ毎に取得した集計値と、指定された種別の統計モデルとに基づいて、前記個々の表のプライマリキー列のキー値をそれぞれグループ分けし、
前記個々の表毎に、当該表のプライマリキー列のキー値が属するグループの識別情報を属性値とする属性値列を追加する
ことを特徴とする表操作方法。 - コンピュータに、
2以上のフォーリンキー列と1以上の関係値列とを持つ表における少なくとも2つのフォーリンキー列の指定と、関係値列の指定と、当該関係値列に対応付けられる統計モデルの種別の指定とを受け付ける指定受付処理、
指定された各フォーリンキー列の参照先となる個々の表を特定し、特定した個々の表のプライマリキー列のキー値の組み合わせ毎に、前記関係値列の関係値の集計値を取得するデータ取得処理、
前記個々の表のプライマリキー列のキー値の組み合わせ毎に取得された集計値と、指定された種別の統計モデルとに基づいて、前記個々の表のプライマリキー列のキー値をそれぞれグループ分けするグルーピング処理、および、
前記個々の表毎に、当該表のプライマリキー列のキー値が属するグループの識別情報を属性値とする属性値列を追加する属性値列追加処理
を実行させるための表操作プログラム。
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017509208A JPWO2016157707A1 (ja) | 2015-03-30 | 2016-02-25 | 表操作システム、方法およびプログラム |
US15/562,542 US10698874B2 (en) | 2015-03-30 | 2016-02-25 | System, method, and program for business intelligence using table operations in a relational database |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015069370 | 2015-03-30 | ||
JP2015-069370 | 2015-03-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016157707A1 true WO2016157707A1 (ja) | 2016-10-06 |
Family
ID=57006912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/001011 WO2016157707A1 (ja) | 2015-03-30 | 2016-02-25 | 表操作システム、方法およびプログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US10698874B2 (ja) |
JP (1) | JPWO2016157707A1 (ja) |
WO (1) | WO2016157707A1 (ja) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11481364B2 (en) * | 2019-12-16 | 2022-10-25 | Alibaba Group Holding Limited | Row-column formatting for relational databases |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001216307A (ja) * | 2000-01-31 | 2001-08-10 | Teijin Ltd | リレーショナルデータベース管理システム及びそれを記憶した記憶媒体 |
US20030018620A1 (en) * | 2001-06-26 | 2003-01-23 | International Business Machines Corporation | Storing data mining clustering results in a relational database for querying and reporting |
JP2015026188A (ja) * | 2013-07-25 | 2015-02-05 | 株式会社日立製作所 | データベース分析装置及び方法 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4154264B2 (ja) | 2003-03-24 | 2008-09-24 | 株式会社日立情報システムズ | 多次元分析システム |
US7103588B2 (en) * | 2003-05-05 | 2006-09-05 | International Business Machines Corporation | Range-clustered tables in a database management system |
US7596584B2 (en) * | 2007-04-25 | 2009-09-29 | Microsoft Corporation | Predicate based group management |
US20120109875A1 (en) * | 2010-10-31 | 2012-05-03 | Microsoft Corporation | Organization of data mart using clustered key |
US20120254173A1 (en) * | 2011-03-31 | 2012-10-04 | Goetz Graefe | Grouping data |
US9235633B2 (en) * | 2011-12-06 | 2016-01-12 | International Business Machines Corporation | Processing data in a data warehouse |
US9449068B2 (en) * | 2012-06-13 | 2016-09-20 | Oracle International Corporation | Information retrieval and navigation using a semantic layer and dynamic objects |
US20150205855A1 (en) * | 2012-08-03 | 2015-07-23 | Nec Corporation | Product management method, product management device, product management system, and program |
US8856126B2 (en) * | 2012-10-10 | 2014-10-07 | Oracle Financial Services Software Limited | Simplifying grouping of data items stored in a database |
WO2016136147A1 (ja) * | 2015-02-25 | 2016-09-01 | 日本電気株式会社 | グルーピングシステムおよび推薦商品決定システム |
JP6414363B2 (ja) * | 2016-03-16 | 2018-10-31 | 日本電気株式会社 | 予測システム、方法およびプログラム |
-
2016
- 2016-02-25 JP JP2017509208A patent/JPWO2016157707A1/ja active Pending
- 2016-02-25 US US15/562,542 patent/US10698874B2/en active Active
- 2016-02-25 WO PCT/JP2016/001011 patent/WO2016157707A1/ja active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001216307A (ja) * | 2000-01-31 | 2001-08-10 | Teijin Ltd | リレーショナルデータベース管理システム及びそれを記憶した記憶媒体 |
US20030018620A1 (en) * | 2001-06-26 | 2003-01-23 | International Business Machines Corporation | Storing data mining clustering results in a relational database for querying and reporting |
JP2015026188A (ja) * | 2013-07-25 | 2015-02-05 | 株式会社日立製作所 | データベース分析装置及び方法 |
Also Published As
Publication number | Publication date |
---|---|
US20180107699A1 (en) | 2018-04-19 |
JPWO2016157707A1 (ja) | 2018-02-01 |
US10698874B2 (en) | 2020-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6843882B2 (ja) | 履歴ログからの学習と、etlツール内のデータアセットに関するデータベースオペレーションの推奨 | |
US11734233B2 (en) | Method for classifying an unmanaged dataset | |
EP3171282A1 (en) | Data retrieval apparatus, program and recording medium | |
US20170039233A1 (en) | Sankey diagram graphical user interface customization | |
US20170039577A1 (en) | Generating metadata and visuals related to mined data habits | |
KR102213627B1 (ko) | 분석 소프트웨어 관리 시스템 및 분석 소프트웨어 관리 방법 | |
JP6810745B2 (ja) | ターゲットクラスタリング手法を利用して、属性タイプが混合した顧客をセグメント化するためのシステムおよび方法 | |
JPWO2015049797A1 (ja) | データ管理方法、データ管理装置及び記憶媒体 | |
US11308102B2 (en) | Data catalog automatic generation system and data catalog automatic generation method | |
US20190042639A1 (en) | Technology for generating a model in response to user selection of data | |
JP6696568B2 (ja) | アイテム推奨方法、アイテム推奨プログラムおよびアイテム推奨装置 | |
JP6242540B1 (ja) | データ変換システム及びデータ変換方法 | |
WO2018185899A1 (ja) | ライブラリ検索装置、ライブラリ検索システム、及びライブラリ検索方法 | |
US7992126B2 (en) | Apparatus and method for quantitatively measuring the balance within a balanced scorecard | |
WO2016157707A1 (ja) | 表操作システム、方法およびプログラム | |
JP5771476B2 (ja) | データ管理システム及びデータ管理方法 | |
Girsang et al. | Decision support system using data warehouse for hotel reservation system | |
WO2009006028A2 (en) | Explaining changes in measures thru data mining | |
US8856126B2 (en) | Simplifying grouping of data items stored in a database | |
JP2018124930A (ja) | 業者検索システム、業者検索方法及び業者検索プログラム | |
JP5444282B2 (ja) | データ整形システム、方法、及びプログラム | |
Ying et al. | Research on E-commerce Data Mining and Managing Model in The Process of Farmers' Welfare Growth | |
US12056149B1 (en) | Visual analysis platform utilizing dynamic group data elements in custom calculations | |
JP2012118612A (ja) | マーケティング提案支援システム | |
Trabelsi et al. | Employing Data and Process Mining Techniques for Redundancy Detection and Analystics in Business Processes. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16771601 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2017509208 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15562542 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 16771601 Country of ref document: EP Kind code of ref document: A1 |