WO2022254607A1 - Information processing device, difference extraction method, and non-temporary computer-readable medium - Google Patents

Information processing device, difference extraction method, and non-temporary computer-readable medium

Publication number
WO2022254607A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
difference
learning
analysis
case
Application number
PCT/JP2021/020987
Other languages
French (fr)
Japanese (ja)
Inventor
瑞 蒋
Original Assignee
日本電気株式会社
Application filed by 日本電気株式会社
Priority to PCT/JP2021/020987
Priority to JP2023525238A
Publication of WO2022254607A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • the present disclosure relates to an information processing device, a difference extraction method, and a non-transitory computer-readable medium.
  • AI Artificial Intelligence
  • Patent Literature 1 discloses a system that evaluates performance by comparing candidate algorithms used for machine learning in order to create a learning model.
  • the design pattern information includes, for example, algorithms, objective variables, explanatory variables, hyperparameters, data used for learning and verification, and the like. Therefore, the person in charge of analysis determines whether or not there is a difference in each type of information, and analyzes how the information with the difference affects the prediction accuracy of the learning model. However, depending on the skill level of the person in charge of analysis, the information with the difference may be overlooked, or the degree of influence of the information with the difference on the learning model may not be determined.
  • One object of the present disclosure is to solve the above problems and to provide an information processing device, a difference extraction method, and a non-transitory computer-readable medium capable of efficiently evaluating a learning model.
  • The information processing device according to the present disclosure comprises: analysis means for extracting a difference between a first explanatory variable included in first case information, which indicates information about the design pattern of a first learning model, and a second explanatory variable included in second case information, which indicates information about the design pattern of a second learning model; calculation means for calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and output means for outputting the extraction result of the analysis means and the calculation result of the calculation means.
  • The difference extraction method according to the present disclosure includes: extracting a difference between a first explanatory variable included in first case information, which indicates information about the design pattern of a first learning model, and a second explanatory variable included in second case information, which indicates information about the design pattern of a second learning model; calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and outputting the extraction result and the calculation result.
  • The non-transitory computer-readable medium according to the present disclosure stores a program that causes an information processing device to execute a difference extraction method.
  • The difference extraction method includes: extracting a difference between a first explanatory variable included in first case information, which indicates information about the design pattern of a first learning model, and a second explanatory variable included in second case information, which indicates information about the design pattern of a second learning model; calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and outputting the extraction result and the calculation result.
  • According to the present disclosure, it is possible to provide an information processing device, a difference extraction method, and a non-transitory computer-readable medium capable of efficiently evaluating a learning model.
  • FIG. 1 is a block diagram showing a configuration example of an information processing apparatus according to a first embodiment;
  • FIG. 2 is a diagram showing a configuration example of an information processing apparatus according to a second embodiment.
  • FIG. 4 is a diagram showing information held by an information analysis unit; It is a table which shows the process target data relevant to an analysis process. It is a figure which shows an example of a determination table.
  • 9 is a flow chart showing an operation example of the information processing apparatus according to the second embodiment; 9 is a flow chart showing an operation example of the information processing apparatus according to the second embodiment; It is a figure which shows an example of an extraction result. It is a figure which shows an example of an extraction result. It is a figure which shows an example of an extraction result. It is a figure which shows an example of an extraction result.
  • FIG. 10 is a diagram for explaining processing for narrowing down difference display;
  • FIG. 10 is a diagram for explaining processing for narrowing down difference display;
  • design patterns that indicate patterns for designing learning models are referred to as "cases.”
  • case is defined as a term that can also include design information for creating, validating, and evaluating analytical models.
  • Design information can include the specification of the AI engine, the specification of the data for learning, the data for verification, and the data for evaluation, the specification of hyperparameters and data division conditions, and the specification of parameters other than hyperparameters used to execute the AI engine. Furthermore, the design information may include the source code of the AI engine execution program, and the like.
  • When the first learning model is created based on a first design pattern, the first design pattern is referred to as the first case, and the information about the first design pattern (the information used for the first design pattern) is referred to as the first case information.
  • a learning model may be referred to as an analysis model.
  • FIG. 1 is a block diagram illustrating a configuration example of an information processing apparatus according to a first embodiment;
  • the information processing device 1 may be a personal computer or a server.
  • the information processing device 1 includes an analysis unit 2 , a calculation unit 3 and an output unit 4 .
  • The analysis unit 2 extracts a difference between the first explanatory variable included in the first case information, which indicates information about the design pattern of the first learning model, and the second explanatory variable included in the second case information, which indicates information about the design pattern of the second learning model.
  • the analysis unit 2 may acquire the first case information and the second case information from a storage device (not shown) holding the first case information and the second case information. Alternatively, the analysis unit 2 may acquire the first case information and the second case information by inputting them from an input device (not shown).
  • the storage device and the input device may be provided inside or outside the information processing device 1, respectively.
  • the analysis unit 2 extracts the difference between the first explanatory variable and the second explanatory variable from the acquired first case information and second case information.
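The difference extraction described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that each case stores its explanatory variables as a list of names and that a "difference" means a name present in only one of the two lists. The function name `extract_variable_difference` and the sample lists are hypothetical.

```python
def extract_variable_difference(first_vars, second_vars):
    """Compare two explanatory-variable name lists (assumed representation).

    Returns the names found only in the first case, only in the second
    case, and in both, each as a sorted list.
    """
    first_set, second_set = set(first_vars), set(second_vars)
    return {
        "only_in_first": sorted(first_set - second_set),
        "only_in_second": sorted(second_set - first_set),
        "common": sorted(first_set & second_set),
    }


# Hypothetical variable lists for two cases of a power demand forecast.
first_case_vars = ["temperature", "precipitation"]
second_case_vars = ["temperature", "precipitation", "Actual (10,000 kW)_2 days ago"]

diff = extract_variable_difference(first_case_vars, second_case_vars)
```

Comparing by name is only one possible definition of a difference; a fuller comparison might also look at data types or values.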
  • When the difference is extracted, the calculation unit 3 calculates a first correlation coefficient between the first explanatory variable and the first objective variable included in the first case information, and calculates a second correlation coefficient between the second explanatory variable and the second objective variable included in the second case information.
  • the first correlation coefficient is an index value indicating the relationship between the first explanatory variable and the first objective variable.
  • The calculation unit 3 may calculate the first correlation coefficient by dividing the covariance between the first explanatory variable and the first objective variable by the product of the standard deviation of the first explanatory variable and the standard deviation of the first objective variable.
  • the second correlation coefficient is an index value indicating the relationship between the second explanatory variable and the second objective variable.
  • The calculation unit 3 may calculate the second correlation coefficient by dividing the covariance between the second explanatory variable and the second objective variable by the product of the standard deviation of the second explanatory variable and the standard deviation of the second objective variable.
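The calculation described above (covariance divided by the product of the two standard deviations) can be sketched in plain Python. This is an illustrative sketch under stated assumptions: the function name `correlation_coefficient` and the sample values are hypothetical, and population (divide-by-n) statistics are used, which yields the same ratio as sample statistics.

```python
import math


def correlation_coefficient(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (std_x * std_y)


# Hypothetical data: an explanatory variable and an objective variable.
temperature = [10.0, 15.0, 20.0, 25.0]
demand = [300.0, 350.0, 400.0, 450.0]

r_first = correlation_coefficient(temperature, demand)
r_reversed = correlation_coefficient(temperature, list(reversed(demand)))
```

A value near 1 or -1 indicates a strong linear relationship between the explanatory variable and the objective variable, and a value near 0 indicates a weak one.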
  • the output unit 4 outputs the extraction result of the analysis unit 2 and the calculation result of the calculation unit 3.
  • the output unit 4 may output the extraction result of the analysis unit 2 and the calculation result of the calculation unit 3 to an output device (not shown) provided inside or outside the information processing apparatus 1 .
  • As described above, the information processing apparatus 1 uses the analysis unit 2 to extract the difference between the explanatory variables included in the two pieces of case information.
  • The information processing apparatus 1 uses the calculation unit 3 to calculate the correlation between each explanatory variable and the corresponding objective variable.
  • the information processing device 1 outputs the extraction result of the analysis unit 2 and the calculation result of the calculation unit 3 .
  • a correlation coefficient is an index value that indicates the relationship between an explanatory variable and an objective variable. Therefore, by checking the results output from the information processing apparatus 1, the person in charge of analysis who analyzes each piece of case information can grasp the difference between the explanatory variables and the explanatory variables that affect the objective variable.
  • Since the learning model is a model that predicts the objective variable based on the explanatory variables, the correlation coefficient can also be regarded as an index value of the influence of the explanatory variables on the learning model. Therefore, by checking the results output from the information processing apparatus 1, the person in charge of analysis can grasp the difference between the explanatory variables and also grasp the explanatory variables that affect the learning model. Consequently, the information processing apparatus 1 according to the first embodiment makes it possible to evaluate the learning model efficiently regardless of the skill level of the person in charge of analysis.
  • FIG. 2 is a diagram illustrating a configuration example of an information processing apparatus according to a second embodiment
  • An information processing device 100 corresponds to the information processing device 1 according to the first embodiment.
  • the information processing device 100 is a device that analyzes an analysis model that is a machine-learned learning model.
  • Information processing apparatus 100 may be a personal computer or a server.
  • the information processing device 100 includes a repository 10 , a processing device 20 , an input device 30 and an output device 40 . Note that in the following description, the learning model analyzed by the information processing apparatus 100 is described as an analysis model.
  • the repository 10 is a storage device that stores (holds) case information analyzed by the information processing device 100 and various types of information related to the case information.
  • the repository 10 may be, for example, the NEC Advanced Analytics Platform Modeler (AAPF Modeler).
  • the repository 10 has an information holding unit 11 .
  • The information holding unit 11 receives from the information input unit 21 provided in the processing device 20, and holds, the various types of information that the information input unit 21 has received.
  • the information holding unit 11 may be called a storage unit.
  • FIG. 3 is a diagram showing information held by an information analysis unit. As shown in FIG. 3, the information holding unit 11 holds analysis summary information, case information, analysis model information, evaluation record information, and assignment information.
  • Analysis summary information is created for each analysis purpose to be addressed with an analysis model, which is a learning model. For example, if a user (person in charge of analysis) who uses an analysis model wants to forecast power demand and sets power demand forecasting as the purpose of analysis, analysis summary information with "power demand forecasting" as the purpose of analysis is created. Likewise, when a user wants to make a sales forecast, which differs from a power demand forecast, and sets the sales forecast as the purpose of analysis, analysis summary information with "sales forecast" as the purpose of analysis is created. Analysis summary information includes an analysis summary name, an analysis purpose, a prediction purpose, and a target accuracy index value.
  • the name of the analysis summary is set in the analysis summary name.
  • the purpose of analysis is set with the purpose of creating an analysis model. Using the above example, the analysis purpose is set to, for example, "power demand forecast” or "sales forecast.”
  • The prediction purpose is set to the type of analysis performed by machine learning.
  • Types of analysis performed in machine learning include, for example, regression analysis and classification in supervised learning. Therefore, information that can specify the type of analysis, such as regression analysis or classification in supervised learning, is set for the prediction purpose.
  • the target accuracy index value is set with an accuracy index value that is the target of the prediction accuracy of the analysis model created based on the analysis summary information.
  • the information about the target accuracy index is set with information indicating the index value of the prediction accuracy target of the analysis model created from a plurality of cases based on the analysis summary information.
  • Items related to the accuracy index include, for example, the following items.
  • For example, the mean absolute percentage error is set to XX%.
  • the case information is information about cases (design patterns) for creating an analysis model based on the analysis summary information.
  • An analysis model with high prediction accuracy is created according to the analysis purpose, prediction purpose, and target accuracy index value included in the analysis summary information. It is generally difficult to create an analysis model with high prediction accuracy from only one design, so multiple designs are tried in order to create a highly accurate analysis model. Therefore, a plurality of pieces of case information are created from one piece of analysis summary information.
  • Since the analysis summary information is information that bundles a plurality of pieces of case information, the information holding unit 11 holds, for example, the analysis summary information and the case information in a hierarchical manner. In other words, the information holding unit 11 holds the case information so that it is stored one level below the analysis summary information.
  • Case information includes case names, learning candidate data, AI engine algorithms, hyperparameters, objective variables, explanatory variables, and corresponding tasks.
  • a case name is set to identify a case for designing an analysis model.
  • a set of data that may be used to create an analysis model is set in learning candidate data.
  • the learning candidate data is set with a plurality of variable names that can be used as objective variables and explanatory variables, and data such as numerical values for each variable.
  • the learning candidate data may include variables that are not used as objective variables and explanatory variables.
  • the AI engine algorithm is set with the AI engine name and the name of the algorithm used by the AI engine.
  • AI engine is a general term for AI that performs analysis based on a specific algorithm classification.
  • An AI engine refers to a system that realizes analysis processing such as prediction and discrimination by generating an analysis model using machine learning technology according to a predetermined data analysis method.
  • the AI engine is, for example, a commercial software program or a software program provided as open source.
  • AI engines include, for example, scikit-learn and PyTorch.
  • In the objective variable, the variable name (objective variable name) of the information to be predicted by the analysis model (the data to be predicted) and its data type are set.
  • The data type of the objective variable is a label that indicates the type of value of the objective variable and is used for classification. Examples of data types include categorical types and numeric types. For example, if the purpose of analysis is "electricity demand forecast", the objective variable is set to "result (10,000 kW)", the objective variable name for the actual electric power value related to electric power demand, together with the data type of the objective variable.
  • explanatory variables are multiple variables used when the analysis model makes predictions, and variable names (explanatory variable names) that are assumed to affect the objective variable are set.
  • In the explanatory variables, all explanatory variable names are set, for example, in the form of a variable list.
  • For example, when "temperature", "precipitation", and the actual electric power value from two days earlier are used to forecast the electric power demand, which is the objective variable, variable names such as "temperature", "precipitation", and "Actual (10,000 kW)_2 days ago" are set in list format as the variable list.
  • The problem to be solved is information related to the problem information described later, and the problem addressed by each case is set in it. For example, when evaluating an analysis model created from a certain case, if it is found that the data related to "temperature" included in the learning candidate data is insufficient, the problem "data related to 'temperature' is insufficient" is set in the problem information. If a newly considered case is based on learning candidate data to which data on "temperature" has been added, the problem "data related to 'temperature' is insufficient" is set as the problem to be solved in the case information for that case.
  • The information holding unit 11 hierarchically holds analysis summary information, case information, and analysis model information. Specifically, the information holding unit 11 holds the analysis summary information, the case information, and the analysis model information such that the case information is stored in the hierarchy one level below the analysis summary information, and the analysis model information is stored in the hierarchy one level below the case information. Therefore, the analysis summary information, the case information, and the analysis model information are held by the information holding unit 11 so that the corresponding information can be specified by tracing the held hierarchy.
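The three-level hierarchy described above can be sketched with nested data classes. This is only an illustration of the containment relationship; the class names, the reduced set of fields, and the helper `find_model_path` are hypothetical and do not reflect how the repository 10 actually stores data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AnalysisModelInfo:
    name: str


@dataclass
class CaseInfo:
    name: str
    objective_variable: str
    explanatory_variables: List[str]
    # Analysis model information sits one level below the case information.
    models: List[AnalysisModelInfo] = field(default_factory=list)


@dataclass
class AnalysisSummaryInfo:
    name: str
    analysis_purpose: str
    # Case information sits one level below the analysis summary information.
    cases: List[CaseInfo] = field(default_factory=list)


def find_model_path(summary: AnalysisSummaryInfo, model_name: str) -> Optional[Tuple[str, str, str]]:
    """Trace the hierarchy to find which case and summary a model belongs to."""
    for case in summary.cases:
        for model in case.models:
            if model.name == model_name:
                return (summary.name, case.name, model.name)
    return None


# Hypothetical example data.
summary = AnalysisSummaryInfo(
    name="summary-1",
    analysis_purpose="power demand forecast",
    cases=[
        CaseInfo("case-1", "result (10,000 kW)", ["temperature"],
                 models=[AnalysisModelInfo("model-1a")]),
        CaseInfo("case-2", "result (10,000 kW)", ["temperature", "precipitation"],
                 models=[AnalysisModelInfo("model-2a")]),
    ],
)

path = find_model_path(summary, "model-2a")
```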
  • the analysis model information includes an analysis model name, forecast/actual log, explanatory variable column correspondence map, learning data, evaluation data, model qualitative information, and accuracy index value.
  • the name of the analytical model is set in the analytical model name.
  • a value predicted by the analysis model and an actual value are set in the forecast/actual log.
  • the forecast/actual log may be held in the information holding unit 11 in a file format.
  • Information that determines how to process the data used in the analysis model is set in the explanatory variable column correspondence map.
  • column correspondence information before and after data processing is set in the explanatory variable column correspondence map.
  • The explanatory variable column correspondence map contains information on the input column in which the input explanatory variable is set, information indicating what kind of processing is to be performed on the input column, and information on the output column in which the processed variable is set. Examples of processing include binary expansion, which expands one column into a plurality of columns, and standardization, which standardizes one column and outputs it.
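The two processing examples named above can be sketched as follows. This is a hedged illustration: the map layout (a list of `input`/`process` entries), the generated output column names, and the `weather` column are all assumptions made for the example and are not taken from the source.

```python
import statistics


def binary_expand(column_name, values):
    """Expand one categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return {
        f"{column_name}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


def standardize(column_name, values):
    """Rescale one numeric column to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize a constant column")
    return {f"{column_name}_std": [(v - mean) / std for v in values]}


PROCESSORS = {"binary_expansion": binary_expand, "standardization": standardize}


def apply_column_map(column_map, data):
    """Apply each map entry to its input column and collect the output columns."""
    output = {}
    for entry in column_map:
        processor = PROCESSORS[entry["process"]]
        output.update(processor(entry["input"], data[entry["input"]]))
    return output


# Hypothetical correspondence map and input data.
column_map = [
    {"input": "weather", "process": "binary_expansion"},
    {"input": "temperature", "process": "standardization"},
]
data = {"weather": ["sunny", "rain", "sunny"], "temperature": [10.0, 20.0, 30.0]}

processed = apply_column_map(column_map, data)
```

The keys of `processed` correspond to the variable names after processing, which is why the variable names in the learning data can differ from the original explanatory variable names.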
  • a set of multiple data used to create an analysis model is set in the learning data. Variable names and numerical data of each variable are set in the learning data. All variable names are set in the learning data, for example, in the form of a variable list. Since the explanatory variables may be processed by the explanatory variable column correspondence map when creating the analysis model, the variable names of the learning data are the variable names after the explanatory variables have been processed. Note that when the explanatory variables are not processed, the variable names of the learning data are the explanatory variable names.
  • a set of data used to evaluate the analysis model is set in the evaluation data.
  • a variable name and numerical data of each variable are set in the evaluation data. All variable names are set in the evaluation data, for example, in the form of a variable list. Since the explanatory variables may be processed by the explanatory variable column correspondence map when creating the analysis model, the variable names of the evaluation data are the variable names after the explanatory variables have been processed. Note that when the explanatory variables are not processed, the variable names of the evaluation data are the explanatory variable names.
  • the model qualitative information contains information on the grounds for the analysis model to derive the predicted value.
  • the model qualitative information includes the regression equation and the regression coefficients included in the regression equation.
  • a decision tree represents a predictive model for deriving a conclusion about the target value of a certain item from the observed results for that item.
  • the hierarchy of decision trees represents the hierarchy of the tree structure of decision trees.
  • internal nodes correspond to variables, and branches to child nodes indicate possible values of the variable.
  • a leaf of the decision tree represents the predicted value of the objective variable for the variable value represented by the path from the root.
  • the number of data samples in the leaves of the decision tree is the number of data records expected in each leaf.
  • Decision tree hierarchy information includes the number of decision tree layers, relationship information between layers, decision conditions for each branch of the decision tree, the number of learning data samples for each leaf of the decision tree, and the number of evaluation data samples for each leaf of the decision tree. including.
  • the relationship information between hierarchies is information indicating the connection of each leaf.
  • the number of learning data samples for each leaf of the decision tree is the number of learning data records predicted for each leaf of the decision tree.
  • the number of evaluation data samples for each leaf of the decision tree is the number of records of evaluation data predicted for each leaf of the decision tree.
  • the accuracy index value is the accuracy of the learning result, which is the output result of the analysis model after inputting the learning data into the analysis model, and the accuracy of the prediction result, which is the output result of the analysis model after inputting the evaluation data into the analysis model.
  • In the accuracy index value, items related to the accuracy index and the value of each item are set.
  • the items related to the accuracy index are the items listed in the description of the target accuracy index value.
  • the value of each item is calculated from the forecast/actual log.
  • In the following description, the accuracy index value indicating the accuracy of the learning result is described as the accuracy index value based on the learning data, and the accuracy index value indicating the accuracy of the prediction result is described as the accuracy index value based on the evaluation data.
  • the accuracy index value based on the learning data and the accuracy index value based on the evaluation data are set as the accuracy index value. Any one of the values may be set.
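As one example of calculating an accuracy index item from the forecast/actual log, the mean absolute percentage error mentioned earlier can be sketched as below. The log layout (two parallel lists) and the function name are assumptions; the source does not specify the file format of the log.

```python
def mean_absolute_percentage_error(actuals, forecasts):
    """Average of |actual - forecast| / |actual|, expressed as a percentage."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two equally long, non-empty sequences")
    if any(a == 0 for a in actuals):
        raise ValueError("percentage error is undefined when an actual value is 0")
    total = sum(abs((a - f) / a) for a, f in zip(actuals, forecasts))
    return 100.0 * total / len(actuals)


# Hypothetical forecast/actual log entries.
actual_values = [100.0, 200.0]
forecast_values = [110.0, 180.0]

mape = mean_absolute_percentage_error(actual_values, forecast_values)
```

Running the same function on the learning data log and on the evaluation data log would give the two accuracy index values distinguished above.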
  • the evaluation record information is information related to records when evaluation target case information and analysis model information are evaluated.
  • the evaluation record information includes an evaluation record name, an evaluation target, an accuracy index, and an evaluation/opinion.
  • the name of the evaluation record is set in the evaluation record name.
  • Information specifying a case related to the analysis model to be evaluated is set in the evaluation target.
  • the accuracy index is set with the accuracy index value of the actual value with respect to the predicted value of the analysis model.
  • the accuracy index is an index that can be arbitrarily set by the user who performs the evaluation, and may be set based on the forecast/actual log. In the evaluation/opinion, the opinion of the user who evaluates the analysis model and case to be evaluated is set.
  • The assignment information is set with information related to assignments identified from the evaluation record information. For example, when evaluating an analysis model created from a certain case, if it is found that the data related to "temperature" included in the learning candidate data is insufficient, the information "data related to 'temperature' is insufficient" is set in the problem information.
  • the task information includes the task name, task content, occurrence evaluation result name, source case, task response case, and presence/absence of case effect.
  • The name of the assignment is set in the assignment name. If the task information is information about the task "insufficient data about temperature", information such as "insufficient data about temperature" is set in the task name, for example.
  • The specific content of the task is set in the task content. If the task information is information about the task "the data about 'temperature' is insufficient", information such as "the data about 'temperature' included in the learning candidate data is insufficient" is set in the task content, for example.
  • the occurrence evaluation result name is set to the evaluation record name included in the evaluation record information in which the issue was found.
  • Information specifying a case in which a problem has been identified is set in the source case.
  • Information specifying a case set as an evaluation target included in the evaluation record information in which the problem was found is set in the source case.
  • Information that identifies the case corresponding to the issue is set in the issue-related case. For example, when a new case is created for an issue, that case is set as the issue-handling case.
  • In the presence/absence of case effect, the judgment result of whether or not each new case created for the problem solves the problem is set. Assume that two new cases are created for the problem, the first case does not solve the problem, and the second case solves the problem. In this case, as the information about the presence or absence of case effect, information indicating that the problem has not been solved is set for the first case, and information indicating that the problem has been solved is set for the second case.
  • the processing device 20 functions as a control section that performs various controls on data input from the input device 30 . Also, the processing device 20 analyzes the analysis summary information, the case information, and the analysis model information using various types of information held by the repository 10 and outputs the analysis results to the output device 40 . The processing device 20 performs operations on external systems.
  • the processing device 20 includes an information input section 21 , an information analysis section 22 , a calculation section 23 , an output section 24 and an external system control section 25 .
  • The information input unit 21 receives, from the input device 30, the various types of information to be held by the information holding unit 11 of the repository 10.
  • Information input unit 21 inputs the received information to information holding unit 11 .
  • the information input unit 21 receives, through the input device 30 , information about an analysis target to be analyzed and an analysis model of a comparison target to be compared with the analysis target, input by the user to the input device 30 .
  • the information input unit 21 outputs to the information analysis unit 22 information about the analysis model to be analyzed and the analysis model to be compared.
  • the information input unit 21 may receive, via the input device 30, the information on the case of the analysis target to be analyzed and the comparison target case to be compared with the analysis target, input by the user into the input device 30. Alternatively, if the user wants to compare all the analysis models, the information input unit 21 does not need to receive feedback regarding the analysis target and comparison target analysis models.
  • the information input unit 21 receives, from the user via the input device 30, information as to whether or not the analysis processing, which is performed by the information analysis unit 22 and the calculation unit 23 and will be described later, is to be stopped in the middle. 22.
  • The information input unit 21 receives, from the user via the input device 30, output conditions for outputting the extraction result extracted by the information analysis unit 22 and the calculation result calculated by the calculation unit 23, and outputs the output conditions to the output unit 24.
  • The information input unit 21 receives, from the user via the input device 30, an indication of whether to output each of the extraction result and the calculation result to the output device 40, and outputs that indication to the output unit 24.
  • the information input unit 21 receives output items to be output to the output device 40 from the user via the input device 30 and outputs them to the output unit 24 .
  • the information analysis unit 22 corresponds to the analysis unit 2 in the first embodiment.
  • Among the various types of information held by the information holding unit 11 of the repository 10, the information analysis unit 22 uses the information on the analysis model to be analyzed and the analysis model to be compared, as input to the information input unit 21, and executes an analysis process that compares the two analysis models.
  • The information analysis unit 22 performs the analysis process by comparing the learning candidate data, AI engine algorithms, hyperparameters, objective variables, explanatory variables, learning data, evaluation data, model qualitative information, and accuracy index values, which are enclosed by dotted lines in the figure. Details of the analysis processing will be described later.
  • the calculator 23 corresponds to the calculator 3 in the first embodiment.
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable in the analysis process.
  • a correlation coefficient is an index value that indicates the relationship between an explanatory variable and an objective variable.
  • the calculation unit 23 may calculate the correlation coefficient by dividing the covariance between the explanatory variable and the objective variable by the product of the standard deviation of the explanatory variable and the standard deviation of the objective variable.
  • the calculator 23 outputs the calculated correlation coefficient to the information analyzer 22 .
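  • The correlation coefficient described above (covariance divided by the product of the two standard deviations) can be sketched as follows. This is a minimal illustration, not the implementation of the calculation unit 23; the function name `pearson_correlation` and the use of population (not sample) statistics are assumptions made for the example.

```python
import math


def pearson_correlation(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance between the explanatory and the objective variable.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Population standard deviations of each variable.
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (std_x * std_y)
```

  • Because the same divisor is used for the covariance and for both variances, choosing population or sample statistics does not change the resulting coefficient.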
  • the calculation unit 23 calculates hash values of learning candidate data in the analysis process.
  • the calculation unit 23 outputs the calculated hash value of the learning candidate data to the information analysis unit 22 .
  • the calculation unit 23 calculates basic statistics of learning candidate data, learning data, and evaluation data in the analysis process.
  • the calculator 23 calculates a basic statistic according to the data type of the objective variable.
  • Basic statistics include, for example, number of elements, arithmetic mean, standard deviation, minimum value, quarter quantile, median, and three quarter quantile.
  • the basic statistic calculated by the calculator 23 is not limited to the above, and may be configured to calculate an arbitrarily set statistic.
  • the calculation unit 23 outputs the calculated basic statistics to the information analysis unit 22 .
  • FIG. 4 is a table showing processing target data related to analysis processing.
  • FIG. 4 is a table showing the relationship among the processing target data that the information analysis unit 22 handles in the analysis process, the extraction information from which a difference is extracted for that data, and the additional extraction/calculation information that is additionally extracted or calculated when there is a difference in the extraction information.
  • the information analysis unit 22 sequentially processes the processing target data shown in FIG. 4 from the top.
  • the information analysis unit 22 identifies analysis model information to be analyzed based on the information about the analysis model to be analyzed, which is input from the information input unit 21 .
  • the information analysis unit 22 identifies analysis model information to be compared based on the information about the analysis model to be compared, which is input from the information input unit 21 .
  • the information analysis unit 22 extracts the difference between the objective variable included in the case information corresponding to the analysis model information to be analyzed and the objective variable included in the case information corresponding to the analysis model information to be compared.
  • The information analysis unit 22 identifies the case information corresponding to the analysis model information to be analyzed and the case information corresponding to the analysis model information to be compared, based on the hierarchical relationship between the analysis model information and the case information in the information holding unit 11.
  • The information analysis unit 22 extracts the difference between the objective variable name set for the objective variable included in the case information corresponding to the analysis model information to be analyzed and the objective variable name set for the objective variable included in the case information corresponding to the analysis model information to be compared.
  • In the following description, the case information corresponding to the analysis model information to be analyzed may be referred to as the case information to be analyzed, and the case information corresponding to the analysis model information to be compared may be referred to as the case information to be compared.
  • When a difference between the objective variable included in the case information to be analyzed and the objective variable included in the case information to be compared is extracted, the information analysis unit 22 extracts the difference in data type, which is the additional extraction/calculation information.
  • the information analysis unit 22 extracts the difference between the AI engine algorithm included in the case information to be analyzed and the AI engine algorithm included in the case information to be compared. Specifically, the information analysis unit 22 extracts the difference between the AI engine name included in the case information to be analyzed and the AI engine name included in the case information to be compared. The information analysis unit 22 also extracts the difference between the algorithm name included in the case information to be analyzed and the algorithm name included in the case information to be compared.
  • the information analysis unit 22 extracts the difference between the hyperparameters included in the case information to be analyzed and the hyperparameters included in the case information to be compared. Specifically, when the AI engine name and the algorithm name match, the information analysis unit 22 extracts the difference between the hyperparameters included in the case information to be analyzed and the hyperparameters included in the case information to be compared.
  • the calculation unit 23 calculates the hash value of the learning candidate data included in the case information to be analyzed, and calculates the hash value of the learning candidate data included in the case information to be compared.
  • the information analysis unit 22 extracts the difference between the hash value of the learning candidate data included in the case information to be analyzed and the hash value of the learning candidate data included in the case information to be compared.
  • a hash value is a fixed-length value obtained from learning candidate data by, for example, a hash function. If there is a difference in the hash value, it can be seen that the learning candidate data included in the case information to be analyzed is different from the learning candidate data included in the case information to be compared. Therefore, the information analysis unit 22 extracts the difference between the hash value of the learning candidate data included in the case information to be analyzed and the hash value of the learning candidate data included in the case information to be compared.
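  • The hash comparison of learning candidate data can be sketched as follows. The text does not name a hash function or a serialization, so SHA-256 and the comma-joined row encoding are assumptions chosen for illustration, as are the function names `dataset_hash` and `candidate_data_differs`.

```python
import hashlib


def dataset_hash(rows):
    """Return a fixed-length SHA-256 hex digest for an iterable of rows."""
    digest = hashlib.sha256()
    for row in rows:
        # Assumed serialization: values joined by commas, one row per line.
        line = ",".join(str(value) for value in row) + "\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


def candidate_data_differs(analysis_rows, comparison_rows):
    """True when the two data sets produce different hash values."""
    return dataset_hash(analysis_rows) != dataset_hash(comparison_rows)
```

  • Comparing two fixed-length digests is cheap, so a check of this kind can decide whether the more expensive per-variable statistics need to be computed at all.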
  • When there is a difference in the hash values, the information analysis unit 22 extracts the additional extraction/calculation information associated with the learning candidate data.
  • The information analysis unit 22 extracts the difference between the basic statistics according to the data type of the objective variable, which are the additional extraction/calculation information associated with the learning candidate data in the figure. To that end, the calculation unit 23 calculates the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared.
  • the calculator 23 calculates a basic statistic for each variable included in the learning candidate data.
  • Basic statistics include, for example, number of elements, arithmetic mean, standard deviation, minimum value, quarter quantile, median, and three quarter quantile.
  • the calculation unit 23 calculates the value of each item included in the basic statistics for each variable.
  • the information analysis unit 22 extracts the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared.
  • The information analysis unit 22 extracts, for each variable included in the learning candidate data and for each item included in the basic statistics, the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared.
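  • The per-variable, per-item comparison of basic statistics can be sketched as follows. This is an illustrative sketch using Python's standard `statistics` module; the dictionary-of-lists data layout, the key names, the population standard deviation, the inclusive quantile method, and the sign convention (comparison value minus analysis value) are all assumptions.

```python
import statistics


def basic_statistics(values):
    """Compute the basic statistic items listed in the text for one numeric variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def statistics_difference(analysis_data, comparison_data):
    """For each shared variable and each item: comparison value minus analysis value."""
    diff = {}
    for name in analysis_data.keys() & comparison_data.keys():
        a = basic_statistics(analysis_data[name])
        c = basic_statistics(comparison_data[name])
        diff[name] = {item: c[item] - a[item] for item in a}
    return diff
```

  • Variables present in only one of the two data sets are skipped here; a fuller version would report them separately.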
  • the information analysis unit 22 extracts the difference between the explanatory variables included in the case information to be analyzed and the explanatory variables included in the case information to be compared.
  • The information analysis unit 22 extracts the difference between the explanatory variables included in the case information to be analyzed and those included in the case information to be compared by comparing the variable lists in which the explanatory variable names are set. Specifically, the information analysis unit 22 determines which explanatory variable names in the variable lists match (overlap) and which do not match between the case information to be analyzed and the case information to be compared, and extracts the difference.
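  • The comparison of explanatory variable name lists amounts to set operations, as in the following sketch. The function name `compare_variable_lists` and the result keys are hypothetical names introduced for illustration.

```python
def compare_variable_lists(analysis_vars, comparison_vars):
    """Split explanatory variable names into matching and non-matching groups."""
    a, c = set(analysis_vars), set(comparison_vars)
    return {
        "matching": sorted(a & c),            # names present in both cases
        "only_in_analysis": sorted(a - c),    # names only in the case to be analyzed
        "only_in_comparison": sorted(c - a),  # names only in the case to be compared
    }
```

  • The two non-matching groups together form the difference; these are the variables for which the correlation coefficient and the weighting are of most interest.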
  • the correlation coefficient with the objective variable and the weighting are extracted as additional extraction/calculation information associated with the explanatory variables.
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable using learning candidate data included in the case information to be analyzed.
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable using the learning candidate data included in the case information to be compared.
  • the calculation unit 23 calculates at least the correlation coefficient between the explanatory variable having a difference among the explanatory variables included in the case information to be analyzed and the case information to be compared, and the objective variable.
  • the information analysis unit 22 extracts the weighting that indicates the degree of weighting of the explanatory variables in the analysis model from the model qualitative information included in the analysis model information to be analyzed.
  • the information analysis unit 22 extracts the weighting indicating the degree of weighting of the explanatory variables in the analysis model from the model qualitative information included in the analysis model information to be compared.
  • the information analysis unit 22 extracts at least the weighting of explanatory variables having a difference among the explanatory variables included in the case information to be analyzed and the case information to be compared.
  • the weighting is a quantification of the importance of the input value, so it may also be referred to as a weighting factor.
  • The information analysis unit 22 extracts, as weights, the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be analyzed and in the model qualitative information included in the analysis model information to be compared.
  • the information analysis unit 22 extracts regression coefficients corresponding to explanatory variables of the regression equations as weights (weighting coefficients).
  • weighting coefficients may be extracted as weighting.
  • The calculation unit 23 calculates the basic statistics of the learning data included in the analysis model information to be analyzed and the basic statistics of the learning data included in the analysis model information to be compared. Specifically, the calculation unit 23 calculates the basic statistics of each variable set in the learning data included in the analysis model information to be analyzed and of each variable set in the learning data included in the analysis model information to be compared. Since the basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the 1/4 quantile, the median, and the 3/4 quantile, the calculation unit 23 calculates a value for each variable and for each basic statistic item.
  • The information analysis unit 22 extracts the difference between the variables set in the learning data included in the analysis model information to be analyzed and the variables set in the learning data included in the analysis model information to be compared. Further, based on the result calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each basic statistic item, the difference in basic statistics between the learning data included in the analysis model information to be analyzed and the learning data included in the analysis model information to be compared.
  • The calculation unit 23 calculates the basic statistics of the evaluation data included in the analysis model information to be analyzed and the basic statistics of the evaluation data included in the analysis model information to be compared. Specifically, the calculation unit 23 calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be analyzed and of each variable set in the evaluation data included in the analysis model information to be compared. Since the basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the 1/4 quantile, the median, and the 3/4 quantile, the calculation unit 23 calculates a value for each variable and for each basic statistic item.
  • The information analysis unit 22 extracts the difference between the variables set in the evaluation data included in the analysis model information to be analyzed and the variables set in the evaluation data included in the analysis model information to be compared. Further, based on the result calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each basic statistic item, the difference in basic statistics between the evaluation data included in the analysis model information to be analyzed and the evaluation data included in the analysis model information to be compared.
  • the information analysis unit 22 extracts the difference between the model qualitative information included in the analysis model information to be analyzed and the model qualitative information included in the analysis model information to be compared.
  • the information analysis unit 22 extracts the difference of regression coefficients, which are weights included in the regression formula.
  • The information analysis unit 22 uses the regression coefficients of the regression equations as weights and extracts the difference in weighting. In other words, the information analysis unit 22 extracts the difference in weighting for explanatory variables with different weights even when there is no difference between the explanatory variables included in the case information to be analyzed and the case information to be compared.
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable with a difference in weighting (weighting coefficient) and the objective variable.
  • The calculation unit 23 calculates the correlation coefficient between the objective variable and each explanatory variable with a difference in weighting, once using the learning candidate data included in the case information to be analyzed and once using the learning candidate data included in the case information to be compared.
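  • Extracting weighting differences for explanatory variables shared by both models can be sketched as follows. The mapping from variable name to regression coefficient, the function name `weighting_differences`, the optional `tolerance` parameter, and the sign convention (comparison minus analysis) are assumptions made for the example.

```python
def weighting_differences(analysis_weights, comparison_weights, tolerance=0.0):
    """Return {variable: comparison weight - analysis weight} where the weights differ.

    Only explanatory variables present in both models are considered, which
    mirrors the case where the variable lists match but the weights do not.
    """
    diff = {}
    for name in analysis_weights.keys() & comparison_weights.keys():
        delta = comparison_weights[name] - analysis_weights[name]
        if abs(delta) > tolerance:
            diff[name] = delta
    return diff
```

  • The keys of the returned dictionary identify the explanatory variables for which a correlation coefficient with the objective variable would then be calculated.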
  • the information analysis unit 22 extracts differences in the hierarchical information of the decision tree. Specifically, when decision tree hierarchy information is set in the analysis model information to be analyzed and the analysis model information to be compared, the information analysis unit 22 extracts the difference in the hierarchy information of the decision trees.
  • the decision tree hierarchy information includes the number of levels of the decision tree, decision conditions for each branch of the decision tree, the number of learning data samples for each leaf of the decision tree, and the number of evaluation data samples for each leaf of the decision tree. Therefore, the information analysis unit 22 extracts the difference for each piece of hierarchical information of the decision tree.
  • The information analysis unit 22 extracts the difference between the accuracy index value included in the analysis model information to be analyzed and the accuracy index value included in the analysis model information to be compared. Specifically, the information analysis unit 22 extracts the differences in the accuracy index value based on the learning data and in the accuracy index value based on the evaluation data, both of which are set in the accuracy index values included in the analysis model information to be analyzed and the analysis model information to be compared. In addition, since at least one item related to the accuracy index is set as the accuracy index value, the information analysis unit 22 extracts, for each item related to the accuracy index, the difference in the accuracy index value based on the learning data and the difference in the accuracy index value based on the evaluation data.
  • The information analysis unit 22 judges the superiority or inferiority of the analysis model based on the difference in the accuracy index value based on the learning data, the difference in the accuracy index value based on the evaluation data, the target accuracy index value related to the analysis model information to be analyzed and the analysis model information to be compared, and a predetermined determination condition. Specifically, the information analysis unit 22 determines the superiority or inferiority of the analysis model based on the difference in the accuracy index value based on the learning data, the difference in the accuracy index value based on the evaluation data, the target accuracy index value included in the analysis summary information, and the determination table. That is, the information analysis unit 22 determines whether the prediction accuracy of the analysis model to be compared has improved or deteriorated based on the analysis model to be analyzed.
  • FIG. 5 is a diagram showing an example of a determination table.
  • In the determination table, a list of accuracy index value items, a determination condition for performance improvement, and a determination condition for performance deterioration are set in order from the left.
  • the information analysis unit 22 acquires from the information holding unit 11 of the repository 10 an item indicating the accuracy index set as the target accuracy index value from the analysis outline information including the model information to be analyzed and the analysis model information to be compared.
  • the information analysis unit 22 extracts the difference of the item that matches the item indicating the acquired accuracy index, with respect to the difference of the accuracy index value based on the learning data and the difference of the accuracy index value based on the evaluation data.
  • the information analysis unit 22 searches the determination table for the extracted items, and compares the determination conditions for performance improvement and the determination conditions for performance deterioration set in the determination table with the difference of the extracted items.
  • the information analysis unit 22 determines that the accuracy of the analytical model to be analyzed has improved compared to the analytical model to be compared when the difference for the extracted item satisfies the performance improvement criteria. The information analysis unit 22 determines that the accuracy of the analytical model to be analyzed is lower than that of the analytical model to be compared when the difference for the extracted item satisfies the performance deterioration determination condition.
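  • A determination table lookup of this kind can be sketched as follows. The actual items and conditions of FIG. 5 are not reproduced in this text, so the table contents below (`accuracy`, `rmse`), the definition of the difference as analysis value minus comparison value, and the function name `judge` are purely hypothetical.

```python
# Hypothetical determination table:
# item -> (condition for improvement, condition for deterioration),
# each condition applied to diff = analysis value - comparison value.
DETERMINATION_TABLE = {
    "accuracy": (lambda d: d > 0, lambda d: d < 0),  # assumed: higher is better
    "rmse": (lambda d: d < 0, lambda d: d > 0),      # assumed: lower is better
}


def judge(item, analysis_value, comparison_value, table=DETERMINATION_TABLE):
    """Judge the analysis model against the comparison model for one index item."""
    improved, deteriorated = table[item]
    diff = analysis_value - comparison_value
    if improved(diff):
        return "improved"
    if deteriorated(diff):
        return "deteriorated"
    return "unchanged"
```

  • Keeping the conditions in a table rather than in code branches lets each accuracy index item carry its own direction, since a larger value is better for some indices and worse for others.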
  • the output section 24 corresponds to the output section 4 in the first embodiment.
  • the output unit 24 outputs the extraction result of the information analysis unit 22 and the calculation result of the calculation unit 23 to the output device 40 .
  • the output unit 24 receives output conditions and output items from the information input unit 21 .
  • Of the extraction results of the information analysis unit 22 and the calculation results of the calculation unit 23, the output unit 24 outputs to the output device 40 those that correspond to the output items and satisfy the output conditions.
  • the external system control unit 25 controls the execution of the AI engine provided outside the information processing device 100 .
  • the input device 30 functions as an input unit.
  • the input device 30 may be, for example, a keyboard, mouse, touch panel, or the like.
  • the input device 30 outputs the inputted information to the information input unit 21 .
  • the input device 30 outputs the information to the information input unit 21 .
  • the input device 30 receives information from the user as to whether or not to stop the analysis processing performed by the information analysis unit 22 and the calculation unit 23, and outputs the information to the information input unit 21.
  • the input device 30 receives output conditions for outputting the extraction results extracted by the information analysis unit 22 and the calculation results calculated by the calculation unit 23 from the user and outputs them to the information input unit 21 .
  • the input device 30 receives output items to be output to the output device 40 from the user and outputs them to the information input unit 21 .
  • the output device 40 functions as an output unit.
  • the output device 40 is configured to include, for example, a display.
  • the output device 40 displays the result calculated by the processing device 20 to the user.
  • the output device 40 displays the output items, extraction results, and calculation results output from the output unit 24 on the display. Note that the output device 40 may output the output items, extraction results, and calculation results output from the output unit 24 to a file.
  • FIGS. 6 and 7 are flowcharts showing an operation example of the information processing apparatus according to the second embodiment.
  • the information analysis unit 22 specifies the analytical model information to be analyzed and the analytical model information to be compared.
  • the information analysis unit 22 also specifies analysis target case information corresponding to analysis target analysis model information and comparison target case information corresponding to comparison target analysis model information.
  • The information analysis unit 22 extracts the difference between the objective variable name set for the objective variable included in the case information to be analyzed and the objective variable name set for the objective variable included in the case information, other than the case information to be analyzed, acquired from the information holding unit 11 (step S1).
  • the information analysis unit 22 determines whether or not there is a difference in objective variable names (step S2). In other words, the information analysis unit 22 determines whether the difference in the objective variable name has been extracted. If there is a difference (YES in step S2), the information analysis unit 22 executes step S3. If there is no difference (NO in step S2), the information analysis unit 22 executes step S5.
  • In step S3, the information analysis unit 22 determines whether there is a difference in the data types of the objective variables included in the case information to be analyzed and the case information to be compared (step S3). If there is a difference in the data type (YES in step S3), the information input unit 21 asks the user via the input device 30 whether to stop extracting the difference, in order to determine whether to stop the subsequent processing (step S4). If there is a difference in the data types, the purpose of analysis may be different and a meaningful comparison may not be possible. Therefore, the information input unit 21 confirms with the user whether to execute the subsequent processing.
  • In step S3, if there is no difference in data type (NO in step S3), the information analysis unit 22 executes step S5. Even if the objective variable names are different, if the data types are the same, it can be determined that the analysis purposes match, so the information analysis unit 22 executes the subsequent processing.
  • In step S4, when the information input unit 21 receives, via the input device 30, information indicating that the user will stop extracting the difference (YES in step S4), the information processing device 100 executes step S8.
  • Otherwise (NO in step S4), the information analysis unit 22 executes step S5.
  • In step S5, the information analysis unit 22 extracts the difference between the AI engine algorithm included in the case information to be analyzed and the AI engine algorithm included in the case information to be compared (step S5).
  • the information analysis unit 22 extracts the difference between the AI engine name included in the case information to be analyzed and the AI engine name included in the case information to be compared.
  • the information analysis unit 22 also extracts the difference between the algorithm name included in the case information to be analyzed and the algorithm name included in the case information to be compared.
  • the information analysis unit 22 determines whether there is a difference between the AI engine algorithm included in the case information to be analyzed and the AI engine algorithm included in the case information to be compared (step S6). In other words, the information analysis unit 22 determines whether the AI engine algorithm difference has been extracted. If there is a difference (YES in step S6), the information input unit 21 confirms with the user via the input device 30 whether to stop extracting the difference in order to determine whether to stop subsequent processing (step S7). If there is no difference (NO in step S6), the information analysis unit 22 executes step S9.
  • In step S7, when the information input unit 21 receives, via the input device 30, information indicating that the user will stop extracting the difference (YES in step S7), the information processing apparatus 100 executes step S8.
  • Otherwise (NO in step S7), the information analysis unit 22 executes step S10.
  • In step S8, the output unit 24 outputs the differences extracted in steps S1 and S5 to the output device 40 and displays them on the screen of the output device 40 (step S8).
  • After step S8, the information processing apparatus 100 ends the process.
  • In step S9, the information analysis unit 22 extracts the difference between the hyperparameters included in the case information to be analyzed and the hyperparameters included in the case information to be compared (step S9). Since the same AI engine is used for the case information to be analyzed and the case information to be compared, the information analysis unit 22 extracts differences in hyperparameters.
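  • The branching of steps S1 to S9 can be sketched as follows. This is an illustrative reading of the flow, not the claimed implementation: the dictionary keys, the `confirm_stop` callback standing in for the user confirmation via the input device 30, and the assumption that step S10 follows step S9 are all hypothetical.

```python
def run_steps_s1_to_s9(analysis_case, comparison_case, confirm_stop):
    """Return (differences, next_step) for the branch structure of steps S1 to S9.

    confirm_stop is a hypothetical callback that asks the user whether to stop
    and returns True (stop) or False (continue).
    """
    differences = {}

    def record(key):
        # Store the pair of values when they differ and report whether they did.
        a, c = analysis_case[key], comparison_case[key]
        if a != c:
            differences[key] = (a, c)
        return a != c

    # Steps S1/S2: objective variable name. Step S3: data type. Step S4: ask the user.
    if record("objective_name") and record("objective_type"):
        if confirm_stop("objective variable data types differ"):
            return differences, "S8"

    # Steps S5/S6: AI engine name and algorithm name.
    engine_differs = record("ai_engine")
    algorithm_differs = record("algorithm")
    if engine_differs or algorithm_differs:
        # Step S7: stopping leads to output (S8); otherwise skip ahead to S10.
        if confirm_stop("AI engine algorithms differ"):
            return differences, "S8"
        return differences, "S10"

    # Step S9: same engine and algorithm, so compare hyperparameters.
    a, c = analysis_case["hyperparameters"], comparison_case["hyperparameters"]
    changed = {k: (a.get(k), c.get(k)) for k in a.keys() | c.keys() if a.get(k) != c.get(k)}
    if changed:
        differences["hyperparameters"] = changed
    return differences, "S10"
```

  • Returning the name of the next step keeps the sketch testable without modeling the output device or the later analysis steps.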
  • FIG. 8 is a diagram showing an example of an extraction result.
  • the information analysis unit 22 holds the extraction results in, for example, a table format. Difference extraction items, cases to be analyzed, cases to be compared, and differences are set in the table in which the extraction results are set.
  • case name included in the case information to be compared corresponding to the analysis model to be compared is set.
  • FIG. 8 shows that case information to be compared is case 2 .
  • the values included in the case information to be compared are set for the items for which differences are extracted in steps S1, S5 and S9.
  • Information indicating whether or not there was a difference in the difference extraction items for which differences were extracted in steps S1, S5, and S9 is set in the difference column.
  • When a numerical value is set for an item, such as a hyperparameter, and there is a difference, the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared is set in the column that indicates the difference.
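  • Building rows of such an extraction result table can be sketched as follows. The row keys, the `"none"` and `"differs"` markers for non-numeric items, and the function name `build_extraction_rows` are assumptions; only the rule that a numeric difference is the comparison value minus the analysis value comes from the description.

```python
from numbers import Number


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


def build_extraction_rows(analysis_case, comparison_case, items):
    """Build one table row per difference extraction item."""
    rows = []
    for item in items:
        a, c = analysis_case.get(item), comparison_case.get(item)
        if a == c:
            difference = "none"
        elif _is_number(a) and _is_number(c):
            # Numeric items: value of the compared case minus value of the analyzed case.
            difference = c - a
        else:
            difference = "differs"
        rows.append({"item": item, "analysis": a, "comparison": c, "difference": difference})
    return rows
```

  • A list of dictionaries maps directly onto a table with one column per key, which matches the table format described for the extraction results.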
  • In step S10, the calculation unit 23 calculates hash values of the learning candidate data included in the case information to be analyzed and the case information to be compared, and the information analysis unit 22 extracts the difference between the calculated hash values (step S10).
  • the information analysis unit 22 determines whether there is a difference in hash values (step S11). If there is a difference (YES in step S11), the calculator 23 executes step S12. If there is a difference in hash values, it can be determined that the learning candidate data are different, so the information processing apparatus 100 analyzes the learning candidate data. If there is no difference (NO in step S11), the calculator 23 executes step S20.
  • In step S12, the calculation unit 23 calculates the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared (step S12).
  • the calculator 23 calculates a basic statistic for each variable included in the learning candidate data.
  • Basic statistics include, for example, number of elements, arithmetic mean, standard deviation, minimum value, quarter quantile, median, and three quarter quantile.
  • the calculation unit 23 calculates the value of each item included in the basic statistics for each variable.
  • the information analysis unit 22 extracts the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared (step S13).
  • For each variable included in the learning candidate data and each item included in the basic statistics, the information analysis unit 22 extracts the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared.
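  • The per-variable, per-item difference of steps S12 and S13 might look like the sketch below. It assumes each data set is a mapping from variable name to a list of numeric values, uses Python's standard `statistics` module, and follows the sign convention used elsewhere in this description (value of the case to be compared minus value of the case to be analyzed). The function names and the quartile method are assumptions.

```python
import statistics


def basic_statistics(values):
    """Compute the basic statistic items listed in the description for one variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def statistics_difference(analyzed, compared):
    """Return {variable: {item: compared - analyzed}} for variables present in both."""
    result = {}
    for name in sorted(analyzed.keys() & compared.keys()):
        stats_analyzed = basic_statistics(analyzed[name])
        stats_compared = basic_statistics(compared[name])
        result[name] = {
            item: stats_compared[item] - stats_analyzed[item]
            for item in stats_analyzed
        }
    return result


analyzed_data = {"x": [1, 2, 3, 4, 5]}
compared_data = {"x": [2, 3, 4, 5, 6]}
stat_diff = statistics_difference(analyzed_data, compared_data)
print(stat_diff["x"]["mean"])
```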
  • The calculation unit 23 uses the learning candidate data included in the case information to be analyzed to calculate the correlation coefficient between the explanatory variable and the objective variable, and likewise uses the learning candidate data included in the case information to be compared to calculate the correlation coefficient between the explanatory variable and the objective variable (step S14). Note that after the calculation unit 23 calculates the correlation coefficients, the information analysis unit 22 may extract the difference between the correlation coefficient calculated using the learning candidate data included in the case information to be analyzed and the correlation coefficient calculated using the learning candidate data included in the case information to be compared.
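  • A sketch of step S14 follows. The description only says "correlation coefficient"; the Pearson coefficient is used here as an assumption, and the data layout (a mapping from column name to a list of values) and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns None if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return None
    return cov / (norm_x * norm_y)


def correlation_with_objective(data, objective, explanatory_names):
    """Correlation of each explanatory variable with the objective variable."""
    return {name: pearson(data[name], data[objective]) for name in explanatory_names}


analyzed_case = {"y": [1, 2, 3], "x": [1, 2, 3]}
compared_case = {"y": [1, 2, 3], "x": [3, 2, 1]}

corr_analyzed = correlation_with_objective(analyzed_case, "y", ["x"])
corr_compared = correlation_with_objective(compared_case, "y", ["x"])

# Optional difference extraction: compared minus analyzed.
corr_diff = {name: corr_compared[name] - corr_analyzed[name] for name in corr_analyzed}
print(corr_diff)
```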
  • the information analysis unit 22 extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be analyzed, and extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information for comparison (step S15).
  • The information analysis unit 22 extracts, as weighting, the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be analyzed, and likewise extracts, as weighting, the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be compared. Note that when the analysis model is created by heterogeneous mixture learning using a plurality of prediction formulas, the information analysis unit 22 regards each of the plurality of prediction formulas as a regression formula and treats the coefficient of each variable of the prediction formula as a regression coefficient.
  • the information analysis unit 22 may extract the difference between the weighting extracted from the analytical model information to be analyzed and the weighting extracted from the analytical model information to be compared.
  • the weighting and correlation coefficient values are the basis for judging whether changes in the explanatory variables have an impact on the prediction accuracy of the analysis model. Therefore, in step S14, the calculation unit 23 calculates the correlation coefficient, and in step S15, the information analysis unit 22 extracts weighting.
  • FIGS. 9A and 9B are diagrams showing examples of extraction results.
  • FIGS. 9A and 9B show the extraction results extracted by the information analysis unit 22, divided into two diagrams.
  • The information analysis unit 22 holds the contents shown in FIGS. 9A and 9B as the extraction results of steps S12 to S15.
  • As in FIG. 8, the information analysis unit 22 holds the extraction results in, for example, a table format. Difference extraction items, objective variable difference results, and explanatory variable difference results are set in the table in which the extraction results are set.
  • Each item is set in the difference extraction items. For example, since the basic statistics of the learning candidate data include a plurality of items, each item is set in its own row so that the difference for each item can be understood. Also, regarding the weighting extracted from the model qualitative information, if the analysis model is created by heterogeneous mixture learning using a plurality of prediction formulas, each prediction formula is set in one row.
  • In the objective variable difference results, an objective variable name indicating which variable is the objective variable, a case name indicating the case information to be analyzed, a case name indicating the case information to be compared, and the difference are set. If there is a difference in an item for which a numerical value is set, the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared is set in the column indicating the difference.
  • the area in which the difference result of the explanatory variables is set includes the area in which the difference result of each explanatory variable is set so that the difference result of each explanatory variable can be understood.
  • In each of these areas, an explanatory variable name indicating which explanatory variable it is, a case name indicating the case information to be analyzed, a case name indicating the case information to be compared, and the difference are set. If there is a difference in an item for which a numerical value is set, the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared is set in the column indicating the difference.
  • In step S16, the information analysis unit 22 extracts the difference between the explanatory variables included in the case information to be analyzed and the explanatory variables included in the case information to be compared (step S16).
  • The information analysis unit 22 compares the variable lists in which the explanatory variable names are set, thereby extracting the difference between the explanatory variables included in the case information to be analyzed and those included in the case information to be compared. Specifically, the information analysis unit 22 identifies, in the variable lists, the explanatory variable names that match (overlap) and those that do not match between the case information to be analyzed and the case information to be compared, and extracts the difference.
  • The information analysis unit 22 determines whether there is a difference in the explanatory variables (step S17). In other words, the information analysis unit 22 determines whether a difference in explanatory variables has been extracted. If there is a difference (YES in step S17), the calculation unit 23 executes step S18. If there is no difference (NO in step S17), the calculation unit 23 executes step S20.
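  • The variable-list comparison of steps S16 and S17 amounts to a set comparison. The sketch below assumes the variable lists are plain lists of explanatory variable names; the function names and the example variable names are hypothetical.

```python
def compare_explanatory_variables(analyzed_names, compared_names):
    """Split explanatory variable names into matching and non-matching groups."""
    analyzed = set(analyzed_names)
    compared = set(compared_names)
    return {
        "common": sorted(analyzed & compared),
        "only_analyzed": sorted(analyzed - compared),
        "only_compared": sorted(compared - analyzed),
    }


def has_variable_difference(comparison):
    """Step S17: a difference exists if any name appears in only one case."""
    return bool(comparison["only_analyzed"] or comparison["only_compared"])


comparison = compare_explanatory_variables(
    ["temperature", "price", "weekday"],
    ["temperature", "price", "holiday"],
)
print(comparison)
print(has_variable_difference(comparison))
```

  • The "only_analyzed" and "only_compared" groups correspond to the explanatory variables that were deleted or added between the two cases, which are the ones examined further in steps S18 and S19.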
  • In step S18, the calculation unit 23 uses the learning candidate data included in the case information to be analyzed and in the case information to be compared to calculate the correlation coefficient between the explanatory variable and the objective variable (step S18).
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable using learning candidate data included in the case information to be analyzed.
  • the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable using the learning candidate data included in the case information to be compared.
  • If the correlation coefficient has already been calculated in step S14, the correlation coefficient calculated in step S14 may be used.
  • the information analysis unit 22 extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be analyzed, and extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information for comparison (step S19).
  • Through steps S18 and S19, the user can grasp the correlation coefficients and weighting values of the deleted or added explanatory variables and judge whether those explanatory variables have affected the prediction accuracy of the analysis model.
  • If the information analysis unit 22 has already extracted the weighting in step S15, the weighting extracted in step S15 may be used.
  • FIG. 10 is a diagram showing an example of the extraction result.
  • the information analysis unit 22 holds the extraction results in, for example, a table format.
  • In the table in which the extraction results are set, explanatory variable names, information about the explanatory variables included in the case information to be analyzed, and information about the explanatory variables included in the case information to be compared are set.
  • The names of the explanatory variables included in the case information to be analyzed and of the explanatory variables included in the case information to be compared are set in the explanatory variable name column, one explanatory variable name per row.
  • a case name that indicates the case information to be analyzed is set in the information area related to explanatory variables included in the case information to be analyzed.
  • In that area, information is set indicating whether each explanatory variable exists in the case information to be analyzed, in a form that also makes it possible to ascertain whether the variable exists in the case information to be compared.
  • A hatched circle indicates that the explanatory variable is included not only in the case information to be analyzed but also in the case information to be compared.
  • A circle without hatching indicates that the explanatory variable is included in only one of the case information to be analyzed and the case information to be compared, that is, that there is a difference between the two.
  • a case name indicating the case information to be compared is set in the area of the information related to the explanatory variables included in the case information to be compared.
  • In that area, information is set indicating whether each explanatory variable exists in the case information to be compared, in a form that also makes it possible to ascertain whether the variable exists in the case information to be analyzed. If a circle without hatching is set in the "presence" column of the information about the explanatory variables included in the case information to be compared, it indicates that the explanatory variable exists only in the case information to be compared. Note that when the analysis model is created by heterogeneous mixture learning using a plurality of prediction formulas, the coefficients of each prediction formula may be set as one column.
  • In step S20, the calculation unit 23 calculates the basic statistics of the learning data and the evaluation data included in the analysis model information to be analyzed, and calculates the basic statistics of the learning data and the evaluation data included in the analysis model information to be compared (step S20).
  • The calculation unit 23 calculates the basic statistics of each variable set in the learning data included in the analysis model information to be analyzed, and calculates the basic statistics of each variable set in the learning data included in the analysis model information to be compared.
  • The calculation unit 23 calculates the basic statistics for each variable and for each basic statistic item.
  • The calculation unit 23 calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be analyzed, and calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be compared.
  • The calculation unit 23 calculates the basic statistics for each variable and for each basic statistic item.
  • The information analysis unit 22 extracts the differences in the basic statistics and in the variables of the learning data and the evaluation data based on the calculation results of the calculation unit 23 (step S21). Based on the results calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and each basic statistic item, the difference in basic statistics between the learning data included in the analysis model information to be analyzed and the learning data included in the analysis model information to be compared. The information analysis unit 22 also extracts the difference between the variables set in the learning data included in the analysis model information to be analyzed and the variables set in the learning data included in the analysis model information to be compared.
  • Similarly, the information analysis unit 22 calculates, for each variable and each basic statistic item, the difference in basic statistics between the evaluation data included in the analysis model information to be analyzed and the evaluation data included in the analysis model information to be compared. The information analysis unit 22 also extracts the difference between the variables set in the evaluation data included in the analysis model information to be analyzed and the variables set in the evaluation data included in the analysis model information to be compared.
  • the information analysis unit 22 determines whether there is model qualitative information in the analysis model information to be analyzed, and determines whether there is model qualitative information in the analysis model information to be compared (step S22).
  • In step S22, if there is no model qualitative information (YES in step S22), the information analysis unit 22 executes step S27. If there is model qualitative information (NO in step S22), the information analysis unit 22 extracts the difference in the weighting of each explanatory variable (step S23). When regression coefficients of regression equations are set in the analysis model information to be analyzed and the analysis model information to be compared, the information analysis unit 22 uses the regression coefficients of the regression equations as weighting and extracts the difference in weighting.
  • In step S24, the information analysis unit 22 determines whether there is a weighting difference (step S24). If there is a weighting difference (YES in step S24), the calculation unit 23 calculates the correlation coefficient between the explanatory variable and the objective variable (step S25). On the other hand, if there is no weighting difference (NO in step S24), the information analysis unit 22 executes step S26.
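  • Steps S23 and S24 can be sketched as follows, assuming the regression coefficients of each model are available as a mapping from explanatory variable name to coefficient. Restricting the comparison to variables present in both models, the tolerance parameter, and the function names are assumptions made for this illustration.

```python
def weighting_difference(analyzed_coefficients, compared_coefficients):
    """Step S23: per-variable weighting difference (compared minus analyzed)."""
    shared = analyzed_coefficients.keys() & compared_coefficients.keys()
    return {
        name: compared_coefficients[name] - analyzed_coefficients[name]
        for name in sorted(shared)
    }


def has_weighting_difference(differences, tolerance=0.0):
    """Step S24: True if any weighting differs by more than the tolerance."""
    return any(abs(value) > tolerance for value in differences.values())


analyzed_weights = {"temperature": 0.5, "price": -1.2}
compared_weights = {"temperature": 0.75, "price": -1.2}

weight_diff = weighting_difference(analyzed_weights, compared_weights)
if has_weighting_difference(weight_diff):
    print("weighting differs: calculate correlation coefficients (step S25)")
else:
    print("no weighting difference: go to step S26")
```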
  • the information analysis unit 22 extracts the difference in the hierarchical information of the decision tree (step S26).
  • When decision tree hierarchy information is set in the analysis model information to be analyzed and the analysis model information to be compared, the information analysis unit 22 extracts the difference in the hierarchy information of the decision trees.
  • the decision tree hierarchy information includes the number of levels of the decision tree, decision conditions for each branch of the decision tree, the number of learning data samples for each leaf of the decision tree, and the number of evaluation data samples for each leaf of the decision tree.
  • the information analysis unit 22 extracts a difference for each piece of hierarchical information of the decision tree.
  • FIGS. 11 and 12 are diagrams showing examples of extraction results.
  • FIG. 11 is a diagram for explaining differences in the decision conditions of each branch of the decision tree among the hierarchical information of the decision tree.
  • the information analysis unit 22 uses arrows to indicate the decision conditions of each branch of the decision tree for each of the cases to be analyzed and the cases to be compared so that the relationships between the branches can be understood.
  • the information analysis unit 22 finds a difference part based on the relation line of the decision condition of each branch of the decision tree.
  • the information analysis unit 22 holds the difference between the decision conditions of each branch of the decision tree so that the difference can be found.
  • FIG. 12 is a diagram for explaining the difference between the number of learning data samples for each leaf of the decision tree and the number of evaluation data samples for each leaf of the decision tree among the hierarchical information of the decision tree.
  • the information analysis unit 22 holds the extraction results in, for example, a table format. Information indicating the leaves of the decision tree and the number of samples are set in the table in which the extraction results are set.
  • In the information indicating the leaves of the decision tree, each row contains the predicted value of the final objective variable, which corresponds to a leaf of the decision tree.
  • The area where the number of samples is set includes an area in which information about the number of learning data samples classified into each leaf is set and an area in which information about the number of evaluation data samples classified into each leaf is set.
  • In the area for the learning data, the number of learning data samples classified into each leaf is set.
  • In the area for the evaluation data, the number of evaluation data samples classified into each leaf is set.
  • In the difference column, the difference calculated by the information analysis unit 22 for the number of learning data samples and the number of evaluation data samples is set for each leaf of the decision tree.
  • For example, the information analysis unit 22 calculates the difference by subtracting the number of learning data samples specified from the case information to be analyzed from the number of learning data samples specified from the case information to be compared, and sets the calculated value in the difference column.
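  • The per-leaf difference shown in FIG. 12 could be computed along the lines of the sketch below. It assumes that each case provides a mapping from a leaf identifier (for example, the predicted value at that leaf) to a sample count, follows the sign convention used elsewhere in this description (case to be compared minus case to be analyzed), and treats a leaf that is missing from one case as having zero samples. The leaf labels and function name are hypothetical.

```python
def leaf_sample_difference(analyzed_counts, compared_counts):
    """Return {leaf: compared count - analyzed count} over all leaves of both trees."""
    leaves = sorted(set(analyzed_counts) | set(compared_counts))
    return {
        leaf: compared_counts.get(leaf, 0) - analyzed_counts.get(leaf, 0)
        for leaf in leaves
    }


# Number of learning data samples classified into each leaf (illustrative values).
learning_counts_analyzed = {"leaf_A": 120, "leaf_B": 80}
learning_counts_compared = {"leaf_A": 100, "leaf_B": 90, "leaf_C": 10}

leaf_diff = leaf_sample_difference(learning_counts_analyzed, learning_counts_compared)
print(leaf_diff)
```

  • The same function can be applied to the evaluation data sample counts to fill the corresponding difference column.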
  • In step S27, the information analysis unit 22 extracts the difference between the accuracy index value included in the analysis model information to be analyzed and the accuracy index value included in the analysis model information to be compared (step S27).
  • The information analysis unit 22 extracts the differences in the accuracy index value based on the learning data and in the accuracy index value based on the evaluation data, both of which are set in the accuracy index values included in the analysis model information to be analyzed and the analysis model information to be compared.
  • The information analysis unit 22 extracts these differences for each item related to the accuracy index.
  • FIG. 13 is a diagram illustrating an example of an extraction result.
  • the information analysis unit 22 holds the extraction results in, for example, a table format. Difference extraction items, cases to be analyzed, cases to be compared, and differences are set in the table in which the extraction results are set.
  • In the difference extraction items, information indicating that the difference extraction item is an accuracy index value and each item related to the accuracy index are set.
  • Each item related to the accuracy index includes a row in which an accuracy index value based on learning data is set and a row in which an accuracy index value based on evaluation data is set.
  • In the cases to be analyzed, the case name included in the case information to be analyzed corresponding to the analysis model to be analyzed is set.
  • FIG. 13 shows that the case information to be analyzed is case 1.
  • an accuracy index value based on the learning data and an accuracy index value based on the evaluation data are set for each item related to the accuracy index.
  • In the cases to be compared, the case name included in the case information to be compared corresponding to the analysis model to be compared is set.
  • FIG. 13 shows that the case information to be compared is case 2.
  • an accuracy index value based on the learning data and an accuracy index value based on the evaluation data are set for each item related to the accuracy index.
  • a value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared is set in the column indicating the difference.
  • In step S28, the information analysis unit 22 determines the superiority or inferiority of the performance of the analysis model based on the difference in the accuracy index value corresponding to the target accuracy index value (step S28). The information analysis unit 22 makes this determination based on the differences in the accuracy index value based on the learning data and in the accuracy index value based on the evaluation data, the target accuracy index value included in the analysis summary information, and the determination table shown in the drawings.
  • The information analysis unit 22 acquires, from the information holding unit 11 of the repository 10, the item indicating the accuracy index set as the target accuracy index value in the analysis outline information that includes the analysis model information to be analyzed and the analysis model information to be compared.
  • the information analysis unit 22 extracts the difference of the item that matches the item indicating the acquired accuracy index, with respect to the difference of the accuracy index value based on the learning data and the difference of the accuracy index value based on the evaluation data.
  • the information analysis unit 22 searches the determination table for the extracted items, and compares the determination conditions for performance improvement and the determination conditions for performance deterioration set in the determination table with the difference of the extracted items.
  • the information analysis unit 22 determines that the accuracy of the analytical model to be analyzed has improved compared to the analytical model to be compared when the difference for the extracted item satisfies the criteria for performance improvement. The information analysis unit 22 determines that the accuracy of the analytical model to be analyzed is lower than that of the analytical model to be compared when the difference for the extracted item satisfies the performance deterioration determination condition. If there is no difference between the items of the learning accuracy index value and the prediction accuracy index value and the difference is 0 (zero), no determination is made for the corresponding item.
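  • The determination in step S28 can be illustrated with the sketch below. The actual contents of the determination table are not reproduced in this section, so the two table entries here (an error-type index where smaller is better and a score-type index where larger is better) are purely hypothetical, as are the item names and the function name. The difference is taken as the value of the case to be compared minus the value of the case to be analyzed, as in the extraction result table.

```python
# Hypothetical determination table: for each accuracy index item, a condition on the
# difference (compared minus analyzed) that signals improvement or deterioration
# of the analysis model to be analyzed.
DETERMINATION_TABLE = {
    "RMSE": {"improved": lambda d: d > 0, "degraded": lambda d: d < 0},
    "R2": {"improved": lambda d: d < 0, "degraded": lambda d: d > 0},
}


def judge_performance(item, difference):
    """Return 'improved', 'degraded', or None when no determination is made."""
    if difference == 0:
        return None  # no difference: no determination for this item
    conditions = DETERMINATION_TABLE[item]
    if conditions["improved"](difference):
        return "improved"
    if conditions["degraded"](difference):
        return "degraded"
    return None


# Case 1 (analyzed) has RMSE 10.0, case 2 (compared) has RMSE 12.0.
rmse_result = judge_performance("RMSE", 12.0 - 10.0)
# Case 1 (analyzed) has R2 0.85, case 2 (compared) has R2 0.80.
r2_result = judge_performance("R2", 0.80 - 0.85)
print(rmse_result, r2_result)
```

  • Keeping the conditions in a table keyed by accuracy index item means that only the item set as the target accuracy index value needs to be looked up, which mirrors the search of the determination table described above.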
  • the information analysis unit 22 outputs the difference extracted up to step S28 to the output device 40 via the output unit 24 (step S29).
  • the information input unit 21 confirms with the user via the input device 30 whether or not to narrow down the difference display output to the output device 40 (step S30).
  • In step S30, when the information input unit 21 receives, via the input device 30, information indicating that the user will narrow down the difference display (YES in step S30), the information processing apparatus 100 executes step S31. On the other hand, when the information input unit 21 receives, via the input device 30, information indicating that the user will not narrow down the difference display (NO in step S30), the output unit 24 executes step S34.
  • In step S31, the information input unit 21 inputs the display item selection information selected by the user via the input device 30 (step S31). In other words, the information input unit 21 inputs the output items to be finally output to the output device 40.
  • the information input unit 21 inputs, via the input device 30, output conditions for narrowing down the difference display output to the output device 40 (step S32).
  • the information input unit 21 inputs output conditions for determining items to be finally output to the output device 40 for each of the objective variable and the explanatory variable via the input device 30 .
  • the information input unit 21 sets the determination condition for whether or not to display the basic statistics of the learning candidate data, the correlation coefficient with the objective variable, and the weighting difference for each explanatory variable of the model qualitative information.
  • the content input by the user is input to the input device 30 for each explanatory variable.
  • the output unit 24 determines output items, extraction results, and calculation results that satisfy the display item selection information and output conditions (step S33). In other words, the output unit 24 determines the output items based on the display item selection information, the output items satisfying the output conditions, the extraction results, and the calculation results. The output unit 24 determines whether or not to display the difference between the learning candidate data, explanatory variables, and model qualitative information on the screen for each of the objective variable and the explanatory variable.
  • FIGS. 14A and 14B are diagrams for explaining the process of narrowing down the difference display.
  • FIGS. 14A and 14B correspond to FIGS. 9A and 9B, respectively, with a screen display column added as the rightmost column of FIGS. 9A and 9B.
  • In addition, a screen display area is added at the bottom.
  • In the screen display area, a display availability determination condition row and a display row are added.
  • In step S31, the output unit 24 displays, on the output device 40, the screen display columns and the screen display rows of FIGS. 14A and 14B in a blank state.
  • the user inputs display selection information on the output item to the input device 30 by selecting the output item to be finally displayed on the screen.
  • In step S32, the user inputs into the input device 30 the output conditions for determining the extraction results and calculation results to be finally displayed on the screen.
  • the output condition is determined by the user inputting the display propriety determination condition in the screen display area shown in FIG. 14B.
  • the information input unit 21 inputs the display propriety determination condition input by the user via the input device 30 as an output condition.
  • In step S33, the output unit 24 determines the output items, extraction results, and calculation results that satisfy the display item selection information and the output conditions.
  • the output unit 24 determines to display the output item included in the display item selection information on the screen.
  • the output unit 24 also determines whether the output conditions are satisfied for each of the objective variable and explanatory variable, and determines the extraction results and calculation results to be displayed on the screen.
  • For example, an output condition of whether the absolute value of the difference in the arithmetic mean is 100 or more, or whether data exists in only one of case 1 and case 2, is input as the display availability determination condition.
  • The output unit 24 determines whether the input output conditions are satisfied for each of the objective variable and the explanatory variables.
  • In the illustrated example, when an explanatory variable satisfies the output condition, the output unit 24 determines to display that explanatory variable on the screen.
  • The output unit 24 sets the determined contents in the display row of the screen display area.
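  • The example output condition above (absolute difference of the arithmetic mean of 100 or more, or data present in only one of case 1 and case 2) can be expressed as a small predicate. In this sketch a missing variable is represented by `None`; that representation, the function name, and the sample variable names and values are assumptions.

```python
def should_display(mean_case1, mean_case2, threshold=100):
    """Decide whether a variable is shown on the screen under the example condition."""
    missing_case1 = mean_case1 is None
    missing_case2 = mean_case2 is None
    if missing_case1 != missing_case2:
        return True  # data exists in only one of case 1 and case 2
    if missing_case1 and missing_case2:
        return False  # nothing to compare
    return abs(mean_case2 - mean_case1) >= threshold


# Arithmetic means per variable for case 1 and case 2 (illustrative values).
means = {
    "sales": (1200.0, 1350.0),
    "temperature": (18.2, 19.0),
    "holiday_flag": (None, 0.1),
}

display_row = {name: should_display(m1, m2) for name, (m1, m2) in means.items()}
print(display_row)
```

  • The resulting mapping plays the role of the display row: each variable is marked as displayed or not according to the condition entered by the user.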
  • In step S34, the output unit 24 outputs the difference by displaying it on the screen of the output device 40 (step S34).
  • If the difference display is not narrowed down in step S30, the output unit 24 outputs to the output device 40 so as to maintain the difference displayed in step S29.
  • If the difference display is narrowed down in step S30, the output unit 24 outputs to the output device 40 the output items and the differences determined to be displayed on the screen in step S33.
  • As described above, the information processing apparatus 100 extracts the differences in various types of information regarding the analysis model to be analyzed and the analysis model to be compared. Therefore, using the information processing apparatus 100 standardizes the extraction of case differences and, by clarifying the points to be evaluated, makes it possible to evaluate the prediction accuracy of the analysis model in a short time and in a uniform manner regardless of skill level. As a result, analysis models can be created efficiently and their prediction accuracy can be improved.
  • The person in charge of analysis can confirm the overall improvement status based on the difference in the accuracy index values extracted by the information processing apparatus 100, and can immediately check the factors behind it from the difference in the explanatory variables and the difference in the AI engine/algorithm. Also, regarding the degree of impact on improvement, the person in charge of analysis can consider the basic statistics of the learning candidate data, changes in data trends such as the correlation coefficients between the explanatory variables and the objective variable, the weighting of regression formulas, and the conditional expressions used in decision trees, and can determine whether or not to adopt them. In this way, even if the person in charge of analysis is inexperienced, the information processing apparatus 100 outputs information corresponding to the points to be confirmed, so that the evaluation of the prediction accuracy of the analysis model can be made more efficient. Therefore, according to the information processing apparatus 100 according to the second embodiment, it is possible to efficiently evaluate the learning model regardless of the skill level of the person in charge of analysis.
  • The technique disclosed in Patent Literature 1 extracts the difference in accuracy index values when the AI engine and algorithm are changed for comparison targets that have the same objective variable, learning candidate data, explanatory variables, learning data, and evaluation data. That is, the technique disclosed in Patent Literature 1 determines the influence of changes in the AI engine and algorithm on the prediction accuracy of the analysis model.
  • the information processing apparatus 100 extracts the difference even when the objective variable, learning candidate data, explanatory variable, learning data, and evaluation data are not the same. Therefore, according to the information processing apparatus 100 according to the second embodiment, it is possible to grasp information that affects the prediction accuracy of the analysis model, which contributes to creating an analysis model with high prediction accuracy.
  • FIG. 15 is a diagram illustrating a hardware configuration example of an information processing apparatus according to the present disclosure.
  • the information processing device 1 and the like include a processor 1201 and a memory 1202 .
  • the processor 1201 reads software (computer program) from the memory 1202 and executes it to perform the processing of the information processing apparatus 1 and the like described using the flowcharts in the above-described embodiments.
  • the processor 1201 may be, for example, a microprocessor, MPU (Micro Processing Unit), or CPU (Central Processing Unit).
  • Processor 1201 may include multiple processors.
  • the memory 1202 is composed of a combination of volatile memory and non-volatile memory.
  • Memory 1202 may include storage remotely located from processor 1201 .
  • processor 1201 may access memory 1202 via an I/O (Input/Output) interface (not shown).
  • memory 1202 is used to store software modules.
  • the processor 1201 reads these software modules from the memory 1202 and executes them, thereby performing the processing of the information processing apparatus 1 and the like described in the above embodiments.
  • Each of the one or more processors included in the information processing apparatus 1 and the like executes one or more programs containing instructions for causing a computer to execute the algorithm described with reference to the drawings.
  • the program includes instructions (or software code) that, when read into a computer, cause the computer to perform one or more of the functions described in the embodiments.
  • the program may be stored in a non-transitory computer-readable medium or tangible storage medium.
  • computer readable media or tangible storage media may include random-access memory (RAM), read-only memory (ROM), flash memory, solid-state drives (SSD) or other memory technology, CD-ROM, digital versatile disc (DVD), Blu-ray disc or other optical disc storage, and magnetic cassette, magnetic tape, magnetic disc storage or other magnetic storage devices.
  • the program may be transmitted on a transitory computer-readable medium or communication medium.
  • transitory computer readable media or communication media include electrical, optical, acoustic, or other forms of propagated signals.
  • (Appendix 1) An information processing device comprising: an analysis means for extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model; a calculation means for calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and an output means for outputting the extraction result of the analysis means and the calculation result of the calculation means.
  • (Appendix 2) The analysis means, when it extracts a difference between the first explanatory variable and the second explanatory variable, extracts a difference between a first weighting factor indicating the degree of weighting of the first explanatory variable in the first learning model and a second weighting factor indicating the degree of weighting of the second explanatory variable in the second learning model.
  • (Appendix 3) The information processing apparatus according to appendix 2, wherein the analysis means extracts a regression coefficient of a first regression equation as the first weighting factor when the first learning model is expressed by the first regression equation, and extracts a regression coefficient of a second regression equation as the second weighting factor when the second learning model is expressed by the second regression equation.
  • (Appendix 4) The information processing device according to appendix 2 or 3, wherein the calculating means calculates, for a first explanatory variable having a difference between the first weighting factor and the second weighting factor, a third correlation coefficient with the first objective variable, and calculates, for a second explanatory variable having a difference between the first weighting factor and the second weighting factor, a fourth correlation coefficient with the second objective variable.
  • (Appendix 5) The information processing apparatus according to any one of appendices 1 to 4, wherein the calculation means calculates a first basic statistic of first learning candidate data included in the first case information and a second basic statistic of second learning candidate data included in the second case information, and the analysis means extracts a difference between the first basic statistic and the second basic statistic.
  • (Appendix 6) The information processing device according to any one of appendices 1 to 5, wherein the calculation means calculates a third basic statistic of first learning data used to create the first learning model and a fourth basic statistic of second learning data used to create the second learning model, and the analysis means extracts a difference between the third basic statistic and the fourth basic statistic and a difference between the variables included in the first learning data and the variables included in the second learning data.
  • (Appendix 7) The information processing device according to any one of appendices 1 to 6, wherein the calculating means calculates a fifth basic statistic of first evaluation data used to evaluate the first learning model and a sixth basic statistic of second evaluation data used to evaluate the second learning model, and the analysis means extracts a difference between the fifth basic statistic and the sixth basic statistic and a difference between the variables included in the first evaluation data and the variables included in the second evaluation data.
  • (Appendix 8) The information processing device according to any one of appendices 1 to 7, wherein, when the first learning model is represented by a first decision tree and the second learning model is represented by a second decision tree, the analysis means extracts a difference between hierarchy information of the first decision tree and hierarchy information of the second decision tree.
  • (Appendix 9) The information processing apparatus according to appendix 8, wherein the hierarchy information includes the number of layers of the decision tree, the relation information between the layers, the decision condition of each branch of the decision tree, the number of learning data samples of each leaf of the decision tree, and the number of evaluation data samples of each leaf of the decision tree.
  • (Appendix 10) The information processing apparatus according to any one of appendices 1 to 9, wherein the analysis means extracts a difference between a first accuracy index value indicating the accuracy of at least one of the learning result and the prediction result of the first learning model and a second accuracy index value indicating the accuracy of at least one of the learning result and the prediction result of the second learning model.
  • (Appendix 11) The information processing apparatus according to appendix 10, wherein the analysis means determines whether or not the prediction accuracy of the first learning model is improved over the prediction accuracy of the second learning model based on the difference between the first accuracy index value and the second accuracy index value, a target accuracy index value related to the first learning model and the second learning model, and a predetermined determination condition.
  • (Appendix 12) The information processing device according to any one of appendices 1 to 11, further comprising input means for inputting output items and output conditions, wherein the output means outputs, out of the extraction result and the calculation result, the results that correspond to the output items and satisfy the output conditions.
  • (Appendix 13) The information processing device according to any one of appendices 1 to 12, wherein the analysis means extracts a difference between a first AI (Artificial Intelligence) engine included in the first case information and a second AI engine included in the second case information.
  • Appendix 14 14.
  • (Appendix 15) The information processing device according to appendix 14, wherein, if the first AI engine matches the second AI engine and the first learning algorithm matches the second learning algorithm, the analysis means extracts a difference between a first hyperparameter included in the first case information and a second hyperparameter included in the second case information.
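  • (Appendix 16) A difference extraction method including: extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model; calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and outputting the extracted extraction result and the calculated calculation result.
  • (Appendix 17) A non-transitory computer-readable medium storing a program that causes an information processing device to execute a difference extraction method, the difference extraction method including: extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model; calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and outputting the extracted extraction result and the calculated calculation result.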
  • Reference Signs List: 1, 100 information processing device; 2 analysis unit; 3 calculation unit; 4 output unit; 10 repository; 11 information holding unit; 20 processing unit; 21 information input unit; 22 information analysis unit; 23 calculation unit; 24 output unit; 25 external system control unit; 30 input device; 40 output device; 1201 processor; 1202 memory

Abstract

Provided is an information processing device (1) capable of efficiently evaluating a learning model. The information processing device (1) is provided with: an analysis unit (2) for extracting the difference between a first explanatory variable included in first case information indicating information that pertains to a design pattern of a first learning model and a second explanatory variable included in second case information indicating information that pertains to a design pattern of a second learning model; a calculation unit (3) that, when the difference is extracted, calculates a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and calculates a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and, an output unit (4) for outputting the extraction result of the analysis unit (2) and the calculation result of the calculation unit (3).

Description

Information processing device, difference extraction method, and non-transitory computer-readable medium
The present disclosure relates to an information processing device, a difference extraction method, and a non-transitory computer-readable medium.
The use of AI (Artificial Intelligence) is in demand in many fields and systems. When AI is used, a learning model is evaluated based on differences in past design information and learning models, and its prediction accuracy is judged to be good or bad. For example, Patent Literature 1 discloses a system that evaluates performance by comparing candidate algorithms used for machine learning in order to create a learning model.
Patent Literature 1: Japanese Unexamined Patent Application Publication No. 2017-004509
In general, creating a good learning model requires analyzing existing learning models and checking the information used in design patterns that have already been tried, so that new design patterns can be considered. Design pattern information includes, for example, algorithms, objective variables, explanatory variables, hyperparameters, and the data used for learning and verification. The person in charge of analysis therefore determines whether each type of information differs and, for information that does differ, analyzes how it affects the prediction accuracy of the learning model. However, depending on the skill level of the person in charge of analysis, differing information may be overlooked, or the degree to which it influences the learning model may not be determined. To make up for differences in skill level, measures such as a final check by an expert are taken. Such measures, however, consume the expert's working time and incur cost, and so increase the burden on the expert.
One object of the present disclosure, made to solve the above problem, is to provide an information processing device, a difference extraction method, and a non-transitory computer-readable medium capable of efficiently evaluating a learning model.
An information processing device according to the present disclosure includes:
an analysis means for extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
a calculation means for calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
an output means for outputting the extraction result of the analysis means and the calculation result of the calculation means.
A difference extraction method according to the present disclosure includes:
extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
outputting the extracted extraction result and the calculated calculation result.
A non-transitory computer-readable medium according to the present disclosure is
a non-transitory computer-readable medium storing a program that causes an information processing device to execute a difference extraction method,
the difference extraction method including:
extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
outputting the extracted extraction result and the calculated calculation result.
According to the present disclosure, it is possible to provide an information processing device, a difference extraction method, and a non-transitory computer-readable medium capable of efficiently evaluating a learning model.
The drawings show, in order: a block diagram of a configuration example of the information processing device according to the first embodiment; a configuration example of the information processing device according to the second embodiment; the information held by the information analysis unit; a table of processing target data related to analysis processing; an example of a determination table; two flowcharts of an operation example of the information processing device according to the second embodiment; seven examples of extraction results; two diagrams explaining the processing for narrowing down the difference display; and a hardware configuration example of the information processing device and the like.
Embodiments of the present disclosure are described below with reference to the drawings. The following description and drawings are omitted and simplified as appropriate for clarity. In the drawings, the same elements are denoted by the same reference signs, and redundant description is omitted where it is not needed.
First, the terms used in this disclosure are explained. In the present disclosure, a design pattern, meaning a pattern for designing a learning model, is referred to as a "case". A "case" is also defined as a term that can include the design information used to create, verify, and evaluate an analysis model. The design information can include the specification of the AI engine, the specification of learning data, verification data, and evaluation data, the specification of hyperparameters and data division conditions, and the specification of parameters other than hyperparameters that were used to run the AI engine. The design information can further include the source code of the AI engine execution program and the like. For example, when a first learning model is created based on a first design pattern, the first design pattern is referred to as the first case, and the information about the first design pattern (the information used for the first design pattern) is referred to as the first case information. In the present disclosure, a learning model may also be referred to as an analysis model.
(First embodiment)
A configuration example of the information processing device 1 according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a block diagram showing a configuration example of the information processing device according to the first embodiment. The information processing device 1 may be a personal computer or a server. The information processing device 1 includes an analysis unit 2, a calculation unit 3, and an output unit 4.
The analysis unit 2 extracts the difference between a first explanatory variable included in first case information, which indicates information about the design pattern of a first learning model, and a second explanatory variable included in second case information, which indicates information about the design pattern of a second learning model. The analysis unit 2 may acquire the first case information and the second case information from a storage device (not shown) that holds them. Alternatively, the analysis unit 2 may acquire the first case information and the second case information as input from an input device (not shown). The storage device and the input device may each be provided inside or outside the information processing device 1. The analysis unit 2 extracts the difference between the first explanatory variable and the second explanatory variable from the acquired first case information and second case information.
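As an illustration of the extraction performed by the analysis unit 2, the following is a minimal Python sketch that compares the explanatory-variable names held in two pieces of case information. The function name, the dictionary layout of the case information, and the variable names are hypothetical; the disclosure does not prescribe a data format, and treating the difference as a set difference of variable names is an assumption made only for this sketch.

```python
def extract_explanatory_variable_diff(first_vars, second_vars):
    """Return the explanatory variables that differ between two cases.

    Assumption: each case lists its explanatory variables by name, and
    "difference" means names present in only one of the two cases.
    """
    first, second = set(first_vars), set(second_vars)
    return {
        "only_in_first": sorted(first - second),
        "only_in_second": sorted(second - first),
        "common": sorted(first & second),
    }


# Hypothetical case information for two design patterns.
first_case = {"explanatory_variables": ["temperature", "humidity", "day_of_week"]}
second_case = {"explanatory_variables": ["temperature", "day_of_week", "holiday_flag"]}

diff = extract_explanatory_variable_diff(
    first_case["explanatory_variables"],
    second_case["explanatory_variables"],
)
print(diff)
```

In this sketch a non-empty `only_in_first` or `only_in_second` list would correspond to the situation in which a difference is extracted and the correlation coefficients are then calculated.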
When the analysis unit 2 extracts a difference, the calculation unit 3 calculates a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information. The first correlation coefficient is an index value indicating the relationship between the first explanatory variable and the first objective variable. The calculation unit 3 may calculate the first correlation coefficient by dividing the covariance between the first explanatory variable and the first objective variable by the product of the standard deviation of the first explanatory variable and the standard deviation of the first objective variable. Likewise, the second correlation coefficient is an index value indicating the relationship between the second explanatory variable and the second objective variable. As with the first correlation coefficient, the calculation unit 3 may calculate the second correlation coefficient by dividing the covariance between the second explanatory variable and the second objective variable by the product of the standard deviation of the second explanatory variable and the standard deviation of the second objective variable.
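The calculation described above (covariance divided by the product of the two standard deviations) can be sketched in Python as follows. The function name and the sample values are hypothetical, and the use of population (divide-by-n) covariance and standard deviation is an assumption; the disclosure does not state which normalization is used, and the choice cancels out in the ratio.

```python
import math


def correlation_coefficient(explanatory, objective):
    """Covariance of the two series divided by the product of their
    standard deviations (population forms are assumed here)."""
    n = len(explanatory)
    if n != len(objective) or n < 2:
        raise ValueError("both series need the same length, at least 2")
    mean_x = sum(explanatory) / n
    mean_y = sum(objective) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(explanatory, objective)
    ) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in explanatory) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in objective) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (std_x * std_y)


# Hypothetical values of one explanatory variable and the objective variable.
temperature = [1.0, 2.0, 3.0, 4.0]
demand = [2.0, 4.0, 6.0, 8.0]
print(correlation_coefficient(temperature, demand))
```

A value close to 1 or -1 suggests that the explanatory variable is strongly related to the objective variable, which is the information the person in charge of analysis is meant to read from the output.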
The output unit 4 outputs the extraction result of the analysis unit 2 and the calculation result of the calculation unit 3. The output unit 4 may output these results to an output device (not shown) provided inside or outside the information processing device 1.
As described above, in the information processing device 1, the analysis unit 2 extracts the difference between the explanatory variables included in two pieces of case information, and, when such a difference is extracted, the calculation unit 3 calculates, for each piece of case information, the correlation coefficient between the explanatory variable and the objective variable. The information processing device 1 outputs the extraction result of the analysis unit 2 and the calculation result of the calculation unit 3. A correlation coefficient is an index value that indicates the relationship between an explanatory variable and an objective variable. Therefore, by checking the results output from the information processing device 1, the person in charge of analysis who analyzes each piece of case information can grasp both the difference between the explanatory variables and the explanatory variables that affect the objective variable. Because a learning model predicts the objective variable from the explanatory variables, the correlation coefficient can also be regarded as an index of how strongly an explanatory variable influences the learning model. The person in charge of analysis can therefore also grasp the explanatory variables that affect the learning model. Consequently, the information processing device 1 according to the first embodiment makes it possible to evaluate a learning model efficiently regardless of the skill level of the person in charge of analysis.
(Second embodiment)
Next, a second embodiment is described. The second embodiment is a more specific form of the first embodiment.
<Configuration example of information processing device>
A configuration example of the information processing device 100 according to the second embodiment is described with reference to FIG. 2. FIG. 2 is a diagram showing a configuration example of the information processing device according to the second embodiment. The information processing device 100 corresponds to the information processing device 1 according to the first embodiment. The information processing device 100 is a device that analyzes an analysis model, which is a machine-learned learning model. The information processing device 100 may be a personal computer or a server. The information processing device 100 includes a repository 10, a processing device 20, an input device 30, and an output device 40. In the following description, the learning model analyzed by the information processing device 100 is referred to as an analysis model.
The repository 10 is a storage device that stores (holds) the case information analyzed by the information processing device 100 and various types of information related to the case information. The repository 10 may be, for example, the NEC Advanced Analytics Platform Modeler (AAPF Modeler). The repository 10 includes an information holding unit 11.
The information holding unit 11 receives and holds the various types of information received by the information input unit 21 of the processing device 20. The information holding unit 11 may also be called a storage unit.
The various types of information held (accumulated) by the information holding unit 11 are described here with reference to FIG. 3. FIG. 3 is a diagram showing information held by the information analysis unit. As shown in FIG. 3, the information holding unit 11 holds analysis summary information, case information, analysis model information, evaluation record information, and issue information.
Analysis summary information is created for each analysis purpose that is to be analyzed with an analysis model, which is a learning model. For example, if a user of the analysis model (the person in charge of analysis) wants to forecast power demand and sets power demand forecasting as the analysis purpose, analysis summary information with "power demand forecast" as its analysis purpose is created. Likewise, if the user wants to make a sales forecast, which differs from a power demand forecast, and sets sales forecasting as the analysis purpose, analysis summary information with "sales forecast" as its analysis purpose is created. The analysis summary information includes an analysis summary name, an analysis purpose, a prediction purpose, and a target accuracy index value.
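The four fields of the analysis summary information can be pictured as a small record. The following Python sketch is only an illustration: the class names, field names, and every sample value are hypothetical, and the target value of 5.0 is a placeholder (the text itself only gives "XX%").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetAccuracy:
    """Accuracy index item plus its target value (hypothetical layout)."""
    metric: str   # e.g. "MAPE"
    value: float  # placeholder number, not taken from the disclosure


@dataclass(frozen=True)
class AnalysisSummary:
    """One record per analysis purpose, mirroring the four listed fields."""
    name: str                # analysis summary name
    analysis_purpose: str    # e.g. "power demand forecast"
    prediction_purpose: str  # type of analysis, e.g. "regression"
    target_accuracy: TargetAccuracy


summary = AnalysisSummary(
    name="power-demand-summary",
    analysis_purpose="power demand forecast",
    prediction_purpose="regression",
    target_accuracy=TargetAccuracy(metric="MAPE", value=5.0),
)
print(summary)
```

Keeping the accuracy item and its numerical value together in one nested record reflects the statement that the target accuracy index value is set as an item paired with a number.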
The analysis summary name is set to the name of the analysis summary.
The analysis purpose is set to the purpose for which the analysis model is created. In the examples above, the analysis purpose is set to, for example, "power demand forecast" or "sales forecast".
The prediction purpose is set to the type of analysis performed by machine learning. Types of analysis performed by machine learning include, for example, regression analysis and class analysis in supervised learning. The prediction purpose is therefore set to information that can identify the type of analysis, such as regression analysis or class analysis in supervised learning.
The target accuracy index value is set to the accuracy index value targeted for the prediction accuracy of the analysis model created based on the analysis summary information. In other words, the target accuracy index value holds information indicating the target index value for the prediction accuracy of the analysis models created from the plurality of cases based on the analysis summary information. The target accuracy index value consists of an accuracy index item, such as those listed below, and a numerical value for that item. For example, the target accuracy index value is set as "mean absolute percentage error: XX%."
<Items related to accuracy index>
・Mean absolute error (MAE)
・Mean squared error (MSE)
・Root mean squared error (RMSE)
・Mean absolute percentage error (MAPE)
・Root mean squared percentage error (RMSPE)
・Coefficient of determination (CoD)
・AUC (Area Under Curve): an index value indicating the area under the ROC (Receiver Operating Characteristic) curve
・PR-AUC (PR-Area Under Curve): an index value indicating the area under the PR (Precision-Recall) curve
・TP (true positive): the number of samples correctly predicted as positive
・FP (false positive): the number of samples incorrectly predicted as positive
・TN (true negative): the number of samples correctly predicted as negative
・FN (false negative): the number of samples incorrectly predicted as negative
・Accuracy
・Precision
・Recall
・Specificity
・False positive rate (FPR)
・False negative rate (FNR)
・F-measure
・Matthews correlation coefficient (MCC)
・Logloss (logarithmic loss)
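As an illustration of how a few of the listed items could be computed from pairs of actual and predicted values, the following is a minimal Python sketch. The function names `regression_metrics` and `classification_metrics` and the dictionary layout are assumptions made for this example; the disclosure itself does not prescribe an implementation.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, MSE, RMSE and MAPE for numeric predictions.

    Hypothetical helper; assumes no actual value is zero (needed for MAPE).
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


def classification_metrics(actual, predicted):
    """Compute TP/FP/TN/FN and derived rates for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    def ratio(num, den):
        # Return None when the rate is undefined instead of dividing by zero.
        return num / den if den else None

    return {
        "TP": tp, "FP": fp, "TN": tn, "FN": fn,
        "Accuracy": ratio(tp + tn, len(pairs)),
        "Precision": ratio(tp, tp + fp),
        "Recall": ratio(tp, tp + fn),
        "Specificity": ratio(tn, tn + fp),
    }
```

A target such as "MAPE: XX%" could then be checked by comparing the returned `"MAPE"` entry with the configured threshold.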
The case information is information about a case (design pattern) for creating an analysis model based on the analysis summary information. Once a piece of analysis summary information has been created, an analysis model with high prediction accuracy is to be created in accordance with the analysis purpose, prediction purpose, and target accuracy index value it contains. Because it is generally difficult to create an analysis model with high prediction accuracy in a single design pass, multiple cases are created through trial and error over multiple design passes, and the resulting analysis models are evaluated to arrive at an analysis model with high prediction accuracy. A plurality of pieces of case information are therefore created from one piece of analysis summary information. That is, the analysis summary information bundles a plurality of pieces of case information, and the information holding unit 11 holds, for example, the analysis summary information and the case information hierarchically. In other words, the information holding unit 11 holds the case information one level below the analysis summary information. The analysis summary information and the case information are thus held by the information holding unit 11 so that corresponding information can be identified by traversing the hierarchy. The case information includes a case name, learning candidate data, an AI engine/algorithm, hyperparameters, an objective variable, explanatory variables, and an addressed issue.
The case name is set to a name identifying the case used to design the analysis model.
The learning candidate data is set to a collection of data that may be used to create the analysis model. Specifically, the learning candidate data contains a plurality of variable names that can be used as the objective variable and explanatory variables, together with data such as the numerical values of each variable. The learning candidate data may also include variables that are not used as the objective variable or as explanatory variables.
The AI engine/algorithm is set to the name of the AI engine and the name of the algorithm used by the AI engine. "AI engine" is a general term for AI that performs analysis based on a specific class of algorithms. An AI engine is a system that carries out analysis processing such as prediction and discrimination by generating an analysis model using machine learning technology in accordance with a predetermined data analysis method. An AI engine is, for example, a commercial software program or a software program provided as open source. Examples of AI engines include scikit-learn and PyTorch.
The objective variable is set to the variable name (objective variable name) of the information to be predicted by the analysis model (the prediction target data) and its data type. The data type of the objective variable indicates the kind of values the objective variable takes and is a label used for classification. Examples of data types include categorical and numeric types. For example, if the analysis purpose is "power demand forecast," the objective variable is set to "Actual (10,000 kW)," the objective variable name for the actual power value related to power demand, and to the data type of that objective variable.
The explanatory variables are a plurality of variables that the analysis model uses when making predictions; the variable names (explanatory variable names) of variables assumed to affect the objective variable are set here. All explanatory variable names are set, for example, in the form of a variable list. For example, if the analysis purpose is "power demand forecast," the variable names used to forecast the objective variable (power demand), such as "temperature," "precipitation," and "Actual (10,000 kW)_2 days ago" (the actual power value two days earlier), are set as a variable list.
The addressed issue is information related to the issue information described later, and is set to the issue that each case is intended to resolve. For example, suppose that evaluating an analysis model created from a certain case reveals that the data on "temperature" in the learning candidate data is insufficient. In that situation, the issue "data on 'temperature' is insufficient" is set in the issue information. If a newly considered case is based on learning candidate data to which data on "temperature" has been added, the addressed issue in the case information of that case is set to "data on 'temperature' is insufficient."
The analysis model information is set to information about an analysis model created from one piece of case information. Because a plurality of analysis models may be created from one case, at least one piece of analysis model information is associated with each piece of case information. The information holding unit 11 holds the analysis summary information, the case information, and the analysis model information hierarchically. Specifically, the information holding unit 11 holds them so that the case information is stored one level below the analysis summary information and the analysis model information is stored one level below the case information. The analysis summary information, the case information, and the analysis model information are thus held by the information holding unit 11 so that corresponding information can be identified by traversing the hierarchy. The analysis model information includes an analysis model name, a forecast/actual log, an explanatory variable column correspondence map, learning data, evaluation data, model qualitative information, and accuracy index values.
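The three-level hierarchy described above (analysis summary information, then case information, then analysis model information) can be pictured with a small Python sketch. The class names, field names, and the `find_parents` helper are hypothetical and chosen only for illustration; the actual storage layout of the information holding unit 11 is not specified beyond its being hierarchical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AnalysisModelInfo:
    name: str


@dataclass
class CaseInfo:
    name: str
    # One case may yield several analysis models.
    models: List[AnalysisModelInfo] = field(default_factory=list)


@dataclass
class AnalysisSummaryInfo:
    name: str
    analysis_purpose: str
    # One analysis summary bundles several cases.
    cases: List[CaseInfo] = field(default_factory=list)


def find_parents(
    summaries: List[AnalysisSummaryInfo], model_name: str
) -> Optional[Tuple[AnalysisSummaryInfo, CaseInfo]]:
    """Traverse the hierarchy to find the summary and case that own a model."""
    for summary in summaries:
        for case in summary.cases:
            if any(model.name == model_name for model in case.models):
                return summary, case
    return None


# Illustrative data only.
example_summary = AnalysisSummaryInfo(
    name="demand-analysis",
    analysis_purpose="power demand forecast",
    cases=[
        CaseInfo("case-1", [AnalysisModelInfo("model-1a")]),
        CaseInfo("case-2", [AnalysisModelInfo("model-2a"), AnalysisModelInfo("model-2b")]),
    ],
)
```

Looking up `"model-2b"` with `find_parents([example_summary], "model-2b")` returns the owning summary and `case-2`, which mirrors how the case information for a given analysis model is identified by following the hierarchy.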
The analysis model name is set to the name of the analysis model.
The forecast/actual log is set to the values predicted by the analysis model and the corresponding actual values. The forecast/actual log may be held in the information holding unit 11 as a file.
The explanatory variable column correspondence map is set to information that determines how the data used for the analysis model is processed. Specifically, it holds the correspondence between columns before and after data processing. More specifically, it holds information on the input column in which an input explanatory variable is set, information indicating what processing is applied to the input column, and information on the output column in which the processed variable is set. Examples of processing include binary expansion, which expands one column into a plurality of columns, and standardization, which standardizes one column and outputs it.
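To make the idea of a column correspondence map concrete, the sketch below represents each entry as an input column, a processing type, and the output columns, and applies it to a few rows. The entry keys (`input`, `process`, `outputs`, `categories`), the processing labels, and the use of the population standard deviation are all assumptions for this example rather than details taken from the disclosure.

```python
from statistics import mean, pstdev

# Hypothetical map: one entry per input column.
column_map = [
    {
        "input": "weather",
        "process": "binary_expansion",
        "categories": ["sunny", "rainy"],
        "outputs": ["weather_sunny", "weather_rainy"],
    },
    {
        "input": "temperature",
        "process": "standardization",
        "outputs": ["temperature_std"],
    },
]


def apply_column_map(rows, column_map):
    """Return new rows whose columns are the output columns named in the map."""
    out_rows = [{} for _ in rows]
    for entry in column_map:
        values = [row[entry["input"]] for row in rows]
        if entry["process"] == "binary_expansion":
            # One 0/1 output column per category of the input column.
            for category, out_col in zip(entry["categories"], entry["outputs"]):
                for out, value in zip(out_rows, values):
                    out[out_col] = 1 if value == category else 0
        elif entry["process"] == "standardization":
            mu, sigma = mean(values), pstdev(values)
            out_col = entry["outputs"][0]
            for out, value in zip(out_rows, values):
                # A constant column has no spread; map it to 0.0.
                out[out_col] = (value - mu) / sigma if sigma else 0.0
        else:
            raise ValueError(f"unknown process: {entry['process']}")
    return out_rows
```

The output column names produced this way are the "variable names after processing" that appear in the learning data and evaluation data.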
The learning data is set to the collection of data used to create the analysis model. It contains variable names and the numerical data of each variable. All variable names are set, for example, in the form of a variable list. Because explanatory variables may be processed according to the explanatory variable column correspondence map when the analysis model is created, the variable names in the learning data are the names of the variables after processing. If an explanatory variable is not processed, its variable name in the learning data is the explanatory variable name itself.
The evaluation data is set to the collection of data used to evaluate the analysis model. It contains variable names and the numerical data of each variable. All variable names are set, for example, in the form of a variable list. Because explanatory variables may be processed according to the explanatory variable column correspondence map when the analysis model is created, the variable names in the evaluation data are the names of the variables after processing. If an explanatory variable is not processed, its variable name in the evaluation data is the explanatory variable name itself.
The model qualitative information is set to information on the basis on which the analysis model derives its predicted values. For example, when the analysis model is expressed as a regression equation, the model qualitative information is set to the regression equation, the regression coefficients it contains, and the like.
When the analysis model is expressed as a decision tree, for example, the model qualitative information is set to the hierarchy information of the decision tree. A decision tree is a predictive model for deriving a conclusion about the target value of an item from observations of that item. The hierarchy of a decision tree is the hierarchy of its tree structure. In a decision tree, each internal node corresponds to a variable, and each branch to a child node indicates a value that the variable can take. A leaf of the decision tree represents the predicted value of the objective variable for the variable values represented by the path from the root. The number of data samples of a leaf is the number of data records predicted at that leaf. The hierarchy information of the decision tree includes the number of levels of the decision tree, relationship information between levels, the decision condition of each branch, the number of learning data samples at each leaf, and the number of evaluation data samples at each leaf. The relationship information between levels indicates how the leaves are connected. The number of learning data samples at a leaf is the number of learning data records predicted at that leaf. The number of evaluation data samples at a leaf is the number of evaluation data records predicted at that leaf.
The accuracy index value is set to accuracy index values indicating the accuracy of the learning result, which is the output of the analysis model when the learning data is input to it, and of the prediction result, which is the output of the analysis model when the evaluation data is input to it. The accuracy index value consists of accuracy index items and the value of each item. The accuracy index items are those listed in the description of the target accuracy index value. The value of each item is calculated from the forecast/actual log. In the following description, the accuracy index value indicating the accuracy of the learning result may be referred to as the accuracy index value based on the learning data, and the accuracy index value indicating the accuracy of the prediction result may be referred to as the accuracy index value based on the evaluation data. In this embodiment, both the accuracy index value based on the learning data and the accuracy index value based on the evaluation data are set as the accuracy index value, but only one of them may be set.
The evaluation record information is information on the record made when the case information and analysis model information under evaluation were evaluated. The evaluation record information includes an evaluation record name, an evaluation target, an accuracy index, and an evaluation/opinion.
The evaluation record name is set to the name of the evaluation record.
The evaluation target is set to information identifying the case associated with the analysis model under evaluation.
The accuracy index is set to the accuracy index value of the actual values relative to the values predicted by the analysis model. The accuracy index can be set freely by the user performing the evaluation, and may be set based on the forecast/actual log.
The evaluation/opinion is set to the opinion of the user performing the evaluation regarding the analysis model and case under evaluation.
The issue information is set to information about an issue identified from the evaluation record information. For example, suppose that evaluating an analysis model created from a certain case reveals that the data on "temperature" in the learning candidate data is insufficient. In that situation, information about the issue "data on 'temperature' is insufficient" is set in the issue information. The issue information includes an issue name, issue content, an originating evaluation result name, an originating case, an issue-handling case, and case effectiveness.
The issue name is set to the name of the issue. If the issue information concerns the issue "data on 'temperature' is insufficient," the issue name is set to, for example, "Insufficient temperature data."
The issue content is set to the specific content of the issue. If the issue information concerns the issue "data on 'temperature' is insufficient," the issue content is set to, for example, "The data on 'temperature' included in the learning candidate data is insufficient."
The originating evaluation result name is set to the evaluation record name included in the evaluation record information in which the issue was identified.
The originating case is set to information identifying the case in which the issue was identified, namely the case set as the evaluation target in the evaluation record information in which the issue was identified.
The issue-handling case is set to information identifying a case that addresses the issue. For example, when a new case is created in response to the issue, that case is set as the issue-handling case.
The case effectiveness is set to the result of judging, for each new case created in response to the issue, whether that case resolves the issue. Suppose that two new cases are created for an issue, and that the first case does not resolve the issue while the second case does. In this situation, information indicating that the issue is not resolved is set for the first case, and information indicating that the issue is resolved is set for the second case.
Returning to FIG. 2, a configuration example of the processing device 20 will be described. The processing device 20 functions as a control unit that performs various kinds of control on data input from the input device 30. The processing device 20 also analyzes the analysis summary information, case information, and analysis model information using the various types of information held by the repository 10, and outputs the analysis results to the output device 40. The processing device 20 performs operations on external systems. The processing device 20 includes an information input unit 21, an information analysis unit 22, a calculation unit 23, an output unit 24, and an external system control unit 25.
The information input unit 21 receives, from the input device 30, the various types of information to be held by the information holding unit 11 of the repository 10, and inputs the received information to the information holding unit 11. The information input unit 21 also receives, via the input device 30, information entered by the user on the analysis model to be analyzed and on the analysis model to be compared with it. The information input unit 21 outputs the information on the analysis models to be analyzed and compared to the information analysis unit 22.
The information input unit 21 may instead receive, via the input device 30, information entered by the user on the case to be analyzed and the case to be compared with it. Alternatively, if the user wants to compare all analysis models with one another, the information input unit 21 need not receive information on the analysis models to be analyzed and compared.
The information input unit 21 also accepts from the user, via the input device 30, information on whether to stop partway through the analysis processing (described later) performed by the information analysis unit 22 and the calculation unit 23, and outputs it to the information analysis unit 22. The information input unit 21 accepts from the user, via the input device 30, output conditions for outputting the extraction results extracted by the information analysis unit 22 and the calculation results calculated by the calculation unit 23 in the analysis processing, and outputs them to the output unit 24. The information input unit 21 accepts from the user, via the input device 30, whether each of the extraction results and calculation results is to be output to the output device 40, and outputs this to the output unit 24. In other words, the information input unit 21 accepts from the user, via the input device 30, the output items to be output to the output device 40, and outputs them to the output unit 24.
The information analysis unit 22 corresponds to the analysis unit 2 in the first embodiment. Among the various types of information held by the information holding unit 11 of the repository 10, the information analysis unit 22 uses the information on the analysis model to be analyzed and the analysis model to be compared, as input from the information input unit 21, to execute analysis processing that compares the two analysis models. Specifically, the information analysis unit 22 executes the analysis processing by comparing the items enclosed by the dotted line in FIG. 3: the learning candidate data, AI engine/algorithm, hyperparameters, objective variable, explanatory variables, learning data, evaluation data, model qualitative information, and accuracy index values. Details of the analysis processing will be described later.
The calculation unit 23 corresponds to the calculation unit 3 in the first embodiment. In the analysis processing, the calculation unit 23 calculates the correlation coefficient between an explanatory variable and the objective variable. The correlation coefficient is an index value indicating the relationship between the explanatory variable and the objective variable. The calculation unit 23 may calculate the correlation coefficient by dividing the covariance of the explanatory variable and the objective variable by the product of the standard deviation of the explanatory variable and the standard deviation of the objective variable. The calculation unit 23 outputs the calculated correlation coefficient to the information analysis unit 22. In the analysis processing, the calculation unit 23 also calculates the hash value of the learning candidate data and outputs the calculated hash value to the information analysis unit 22.
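The correlation coefficient described above (covariance divided by the product of the two standard deviations) can be sketched as follows. The function name `correlation_coefficient` is hypothetical, and the sketch assumes population (divide-by-n) covariance and standard deviations; because the same normalization is used in the numerator and denominator, the result matches the usual Pearson coefficient.

```python
import math


def correlation_coefficient(explanatory, objective):
    """Covariance of the two variables divided by the product of their standard deviations."""
    n = len(explanatory)
    if n != len(objective) or n < 2:
        raise ValueError("both variables need the same length (at least 2)")
    mean_x = sum(explanatory) / n
    mean_y = sum(objective) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(explanatory, objective)
    ) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in explanatory) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in objective) / n)
    if std_x == 0 or std_y == 0:
        # The coefficient is undefined when either variable is constant.
        raise ValueError("correlation is undefined for a constant variable")
    return covariance / (std_x * std_y)
```

A value near 1 or -1 suggests a strong linear relationship between the explanatory variable and the objective variable, while a value near 0 suggests a weak one.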
In the analysis processing, the calculation unit 23 calculates basic statistics of the learning candidate data, the learning data, and the evaluation data. The calculation unit 23 calculates basic statistics appropriate to the data type of the objective variable. The basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the first quartile, the median, and the third quartile. The basic statistics calculated by the calculation unit 23 are not limited to these, and the calculation unit 23 may be configured to calculate any specified statistics. The calculation unit 23 outputs the calculated basic statistics to the information analysis unit 22.
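For a numeric variable, the basic statistics listed above could be computed with Python's standard `statistics` module, as in the sketch below. The function name `basic_statistics`, the choice of the sample standard deviation, and the "inclusive" quartile method are assumptions for this example; the disclosure does not state which definitions are used, and statistics for categorical variables are not covered here.

```python
import statistics


def basic_statistics(values):
    """Basic statistics of one numeric variable (requires at least two values)."""
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumption)
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }
```

Computing the same dictionary for a variable in two data sets (for example, the learning data of the analysis target and of the comparison target) gives a compact way to see how the underlying data changed.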
<Analysis processing>
The analysis processing performed by the information analysis unit 22 and the calculation unit 23 will now be described with reference to FIG. 4. FIG. 4 is a table showing the processing target data related to the analysis processing.
First, FIG. 4 will be described. FIG. 4 is a table showing the relationship among the processing target data that the information analysis unit 22 processes in the analysis processing, the extraction information indicating the information from which differences are extracted for the processing target data, and the additional extraction/calculation information that is additionally extracted when the extraction information contains a difference. The information analysis unit 22 processes the processing target data shown in FIG. 4 in order from the top.
Next, the details of the analysis processing will be described.
The information analysis unit 22 identifies the analysis model information to be analyzed based on the information on the analysis model to be analyzed input from the information input unit 21. The information analysis unit 22 identifies the analysis model information to be compared based on the information on the analysis model to be compared input from the information input unit 21.
 情報分析部22は、分析対象の分析モデル情報に対応するケース情報に含まれる目的変数と、比較対象の分析モデル情報に対応するケース情報に含まれる目的変数との差分を抽出する。情報分析部22は、情報保持部11の分析モデル情報と、ケース情報との階層関係に基づいて、分析対象の分析モデル情報に対応するケース情報と、比較対象の分析モデル情報に対応するケース情報とを特定する。情報分析部22は、分析対象の分析モデル情報に対応するケース情報に含まれる目的変数に設定された目的変数名と、比較対象の分析モデル情報に対応するケース情報に含まれる目的変数に設定された目的変数名との差分を抽出する。なお、以降の説明では、分析対象の分析モデル情報に対応するケース情報を、分析対象のケース情報とし、比較対象の分析モデル情報に対応するケース情報を、比較対象のケース情報として記載することがある。 The information analysis unit 22 extracts the difference between the objective variable included in the case information corresponding to the analysis model information to be analyzed and the objective variable included in the case information corresponding to the analysis model information to be compared. The information analysis unit 22 generates case information corresponding to the analysis model information to be analyzed and case information corresponding to the analysis model information to be compared based on the hierarchical relationship between the analysis model information in the information holding unit 11 and the case information. to identify The information analysis unit 22 sets the objective variable name set to the objective variable included in the case information corresponding to the analysis model information to be analyzed and the objective variable name set to the objective variable included in the case information corresponding to the analysis model information to be compared. Extract the difference from the target variable name. In the following explanation, the case information corresponding to the analysis model information to be analyzed can be described as the case information to be analyzed, and the case information corresponding to the analysis model information to be compared can be described as the case information to be compared. be.
When a difference is extracted between the objective variable included in the case information to be analyzed and the objective variable included in the case information to be compared, the information analysis unit 22 extracts the difference in data type, which is the additional extraction/calculation information associated with the objective variable in FIG. 4.
The information analysis unit 22 extracts the difference between the AI engine and algorithm included in the case information to be analyzed and the AI engine and algorithm included in the case information to be compared. Specifically, the information analysis unit 22 extracts the difference between the AI engine name included in the case information to be analyzed and the AI engine name included in the case information to be compared. The information analysis unit 22 also extracts the difference between the algorithm name included in the case information to be analyzed and the algorithm name included in the case information to be compared.
The information analysis unit 22 extracts the difference between the hyperparameters included in the case information to be analyzed and the hyperparameters included in the case information to be compared. Specifically, the information analysis unit 22 extracts this difference when the AI engine names match and the algorithm names match.
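The gating described above, in which hyperparameters are compared only when the AI engine name and algorithm name match, can be sketched as follows. This is a minimal illustration, not the disclosed implementation; the dictionary keys (`ai_engine`, `algorithm`, `hyperparameters`) and the function name are assumptions introduced here.

```python
def diff_hyperparameters(target_case, compared_case):
    """Return {name: (target value, compared value)} for differing hyperparameters.

    Returns None when the AI engine name or algorithm name differs, because
    the hyperparameters are then not compared. Key names are hypothetical.
    """
    same_engine = target_case["ai_engine"] == compared_case["ai_engine"]
    same_algorithm = target_case["algorithm"] == compared_case["algorithm"]
    if not (same_engine and same_algorithm):
        return None

    a = target_case["hyperparameters"]
    b = compared_case["hyperparameters"]
    # A hyperparameter present on only one side is reported with None on the other.
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }
```

A hyperparameter that exists in only one case is treated as a difference here; whether the device reports it that way is not stated in the text.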
The calculation unit 23 calculates the hash value of the learning candidate data included in the case information to be analyzed and the hash value of the learning candidate data included in the case information to be compared. The information analysis unit 22 extracts the difference between the two hash values. A hash value is a fixed-length value obtained from the learning candidate data by, for example, a hash function. If the hash values differ, the learning candidate data included in the case information to be analyzed is known to differ from the learning candidate data included in the case information to be compared. For this reason, the information analysis unit 22 extracts the difference between the hash values.
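As a rough sketch of the hash comparison, the learning candidate data of each case can be reduced to a fixed-length digest and the digests compared. The text does not name a hash function; SHA-256 from Python's standard `hashlib` is an assumption made only for illustration, as are the function names.

```python
import hashlib


def data_hash(raw: bytes) -> str:
    """Fixed-length digest of the learning candidate data (SHA-256 is an assumed choice)."""
    return hashlib.sha256(raw).hexdigest()


def candidate_data_differs(target_bytes: bytes, compared_bytes: bytes) -> bool:
    """True when the two data sets produce different hash values."""
    return data_hash(target_bytes) != data_hash(compared_bytes)
```

Comparing digests avoids a value-by-value comparison of the data; only when the digests differ does the more expensive comparison of basic statistics become necessary.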
When a difference is extracted between the hash value of the learning candidate data included in the case information to be analyzed and the hash value of the learning candidate data included in the case information to be compared, the information analysis unit 22 extracts the additional extraction/calculation information associated with the learning candidate data. That is, the information analysis unit 22 extracts the difference in the basic statistics according to the data type of the objective variable, which is the additional extraction/calculation information associated with the learning candidate data in FIG. 4. To this end, the calculation unit 23 calculates the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared. The calculation unit 23 calculates the basic statistics for each variable included in the learning candidate data. The basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the first quartile, the median, and the third quartile. For each variable, the calculation unit 23 calculates the value of each item included in the basic statistics.
The information analysis unit 22 extracts the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared. The information analysis unit 22 extracts this difference for each variable included in the learning candidate data and for each item included in the basic statistics.
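The per-variable, per-item comparison of basic statistics might look like the following sketch, which uses only Python's standard `statistics` module. The sample standard deviation and the "inclusive" quartile method are assumptions; the text does not specify either. The function names and the dictionary layout (variable name mapped to a list of numbers) are likewise hypothetical.

```python
import statistics


def basic_statistics(values):
    """Basic statistics of one numeric variable (needs at least two values)."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def diff_basic_statistics(target, compared):
    """For each variable present on both sides, return target minus compared per item."""
    result = {}
    for name in sorted(target.keys() & compared.keys()):
        a = basic_statistics(target[name])
        b = basic_statistics(compared[name])
        result[name] = {item: a[item] - b[item] for item in a}
    return result
```

Variables present in only one data set are skipped in this sketch; they would surface instead in the comparison of variable names.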
The information analysis unit 22 extracts the difference between the explanatory variables included in the case information to be analyzed and the explanatory variables included in the case information to be compared. The information analysis unit 22 does so by comparing the variable lists in which the explanatory variable names are set. In the variable lists, the information analysis unit 22 distinguishes the explanatory variable names that match (overlap) between the case information to be analyzed and the case information to be compared from those that do not match, and thereby extracts the difference in explanatory variable names.
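Separating matching from non-matching explanatory variable names amounts to set operations on the two variable lists. A minimal sketch, with a hypothetical function name and result keys:

```python
def diff_explanatory_variables(target_names, compared_names):
    """Split explanatory variable names into matching and non-matching groups."""
    target, compared = set(target_names), set(compared_names)
    return {
        "common": sorted(target & compared),
        "only_in_target": sorted(target - compared),
        "only_in_compared": sorted(compared - target),
    }
```

For example, comparing `["age", "price", "region"]` with `["age", "price", "season"]` yields `age` and `price` as common names, `region` only in the case to be analyzed, and `season` only in the case to be compared.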
When there is a difference in the explanatory variables, the correlation coefficient with the objective variable and the weighting, which are the additional extraction/calculation information associated with the explanatory variables in FIG. 4, are extracted. Specifically, the calculation unit 23 calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be analyzed. The calculation unit 23 likewise calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be compared. Among the explanatory variables included in the case information to be analyzed and the case information to be compared, the calculation unit 23 calculates the correlation coefficient with the objective variable at least for the explanatory variables that differ.
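The correlation step can be illustrated as below. The text does not say which correlation coefficient is used; Pearson's coefficient is assumed here, and the function names and the column-oriented data layout (variable name mapped to a list of numbers) are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlations_with_objective(variable_names, candidate_data, objective_name):
    """Correlation of each named explanatory variable with the objective variable.

    Names that are absent from this data set are skipped.
    """
    objective = candidate_data[objective_name]
    return {
        name: pearson(candidate_data[name], objective)
        for name in variable_names
        if name in candidate_data
    }
```

The function would be called once with the learning candidate data of the case to be analyzed and once with that of the case to be compared, passing the differing explanatory variable names each time.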
The information analysis unit 22 extracts, from the model qualitative information included in the analysis model information to be analyzed, the weighting that indicates the degree of weight of each explanatory variable in the analysis model. The information analysis unit 22 likewise extracts the weighting from the model qualitative information included in the analysis model information to be compared. Among the explanatory variables included in the case information to be analyzed and the case information to be compared, the information analysis unit 22 extracts the weighting at least for the explanatory variables that differ. Because the weighting quantifies the importance of an input value, it may also be referred to as a weighting coefficient.
Specifically, the information analysis unit 22 extracts, as the weighting, the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be analyzed and the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be compared. In other words, when the analysis model to be analyzed and the analysis model to be compared are represented by regression equations, the information analysis unit 22 extracts the regression coefficient corresponding to each explanatory variable of the regression equation as the weighting (weighting coefficient). When the analysis model is created by heterogeneous mixture learning, in which learning uses a plurality of prediction formulas, each of the prediction formulas may be regarded as a regression equation and the coefficient of each explanatory variable of each prediction formula may be regarded as a regression coefficient, and that regression coefficient may be extracted as the weighting.
The calculation unit 23 calculates the basic statistics of the learning data included in the analysis model information to be analyzed and the basic statistics of the learning data included in the analysis model information to be compared. Specifically, the calculation unit 23 calculates the basic statistics of each variable set in the learning data included in the analysis model information to be analyzed and the basic statistics of each variable set in the learning data included in the analysis model information to be compared. Because the basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the first quartile, the median, and the third quartile, the calculation unit 23 calculates the basic statistics for each variable and for each item of the basic statistics.
The information analysis unit 22 extracts the difference between the variables set in the learning data included in the analysis model information to be analyzed and the variables set in the learning data included in the analysis model information to be compared. In addition, based on the results calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each item of the basic statistics, the difference in the basic statistics between the learning data included in the analysis model information to be analyzed and the learning data included in the analysis model information to be compared.
The calculation unit 23 calculates the basic statistics of the evaluation data included in the analysis model information to be analyzed and the basic statistics of the evaluation data included in the analysis model information to be compared. Specifically, the calculation unit 23 calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be analyzed and the basic statistics of each variable set in the evaluation data included in the analysis model information to be compared. Because the basic statistics include, for example, the number of elements, the arithmetic mean, the standard deviation, the minimum value, the first quartile, the median, and the third quartile, the calculation unit 23 calculates the basic statistics for each variable and for each item of the basic statistics.
The information analysis unit 22 extracts the difference between the variables set in the evaluation data included in the analysis model information to be analyzed and the variables set in the evaluation data included in the analysis model information to be compared. In addition, based on the results calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each item of the basic statistics, the difference in the basic statistics between the evaluation data included in the analysis model information to be analyzed and the evaluation data included in the analysis model information to be compared.
The information analysis unit 22 extracts the difference between the model qualitative information included in the analysis model information to be analyzed and the model qualitative information included in the analysis model information to be compared. When the analysis model is represented by a regression equation, the information analysis unit 22 extracts the difference in the regression coefficients, which are the weightings included in the regression equation. Specifically, when regression coefficients of a regression equation are set in the analysis model information to be analyzed and the analysis model information to be compared, the information analysis unit 22 treats the regression coefficients as the weightings and extracts the difference in the weightings. That is, even when there is no difference between the explanatory variables included in the case information to be analyzed and those included in the case information to be compared, the information analysis unit 22 extracts the difference in weighting for any explanatory variable whose weighting differs.
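Treating regression coefficients as weightings, the comparison can be sketched as a per-variable difference over two coefficient dictionaries. The function name, the `tol` tolerance parameter, and the tuple layout of the result are assumptions for illustration only.

```python
def diff_weights(target_coefs, compared_coefs, tol=0.0):
    """Return {variable: (target, compared, target - compared)} where weights differ.

    A variable present in only one model is reported with None for the missing
    coefficient and for the difference. `tol` is a hypothetical tolerance below
    which two coefficients are treated as equal.
    """
    diffs = {}
    for name in sorted(target_coefs.keys() | compared_coefs.keys()):
        a = target_coefs.get(name)
        b = compared_coefs.get(name)
        if a is None or b is None:
            diffs[name] = (a, b, None)
        elif abs(a - b) > tol:
            diffs[name] = (a, b, a - b)
    return diffs
```

Because the loop runs over the union of variable names, a variable shared by both models still appears in the result when its coefficient has changed, which matches the point that weighting differences are extracted even when the explanatory variables themselves do not differ.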
When there is a difference in weighting, the calculation unit 23 calculates the correlation coefficient with the objective variable for each explanatory variable whose weighting (weighting coefficient) differs. The calculation unit 23 calculates this correlation coefficient using the learning candidate data included in the case information to be analyzed and also using the learning candidate data included in the case information to be compared.
When the analysis model is represented by a decision tree, the information analysis unit 22 extracts the difference in the hierarchy information of the decision tree. Specifically, when decision tree hierarchy information is set in the analysis model information to be analyzed and the analysis model information to be compared, the information analysis unit 22 extracts the difference in the hierarchy information. The decision tree hierarchy information includes the number of levels of the decision tree, the decision condition of each branch, the number of learning data samples at each leaf, and the number of evaluation data samples at each leaf. The information analysis unit 22 therefore extracts the difference for each of these items of hierarchy information.
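One way to picture the item-by-item comparison of decision tree hierarchy information is shown below. How the device actually stores this information is not described; the dictionary keys (`levels`, `branch_conditions`, `leaf_learning_samples`, `leaf_evaluation_samples`) are hypothetical stand-ins for the four items listed above.

```python
# Hypothetical keys, one per item of decision tree hierarchy information.
HIERARCHY_ITEMS = (
    "levels",                   # number of levels of the tree
    "branch_conditions",        # decision condition of each branch
    "leaf_learning_samples",    # learning data sample count per leaf
    "leaf_evaluation_samples",  # evaluation data sample count per leaf
)


def diff_tree_hierarchy(target_tree, compared_tree):
    """Return {item: (target value, compared value)} for each differing item."""
    return {
        item: (target_tree.get(item), compared_tree.get(item))
        for item in HIERARCHY_ITEMS
        if target_tree.get(item) != compared_tree.get(item)
    }
```

This sketch only reports which items differ as a whole; a finer comparison (for example, branch by branch) would need a node identification scheme that the text does not specify.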
The information analysis unit 22 extracts the difference between the accuracy index value included in the analysis model information to be analyzed and the accuracy index value included in the analysis model information to be compared. Specifically, the information analysis unit 22 extracts the difference in the accuracy index value based on the learning data and the difference in the accuracy index value based on the evaluation data, both of which are set in the accuracy index values included in the analysis model information to be analyzed and the analysis model information to be compared. Because at least one item related to the accuracy index is set in the accuracy index value, the information analysis unit 22 extracts these differences for each item related to the accuracy index.
The information analysis unit 22 judges the superiority or inferiority of the analysis models based on the difference in the accuracy index value based on the learning data, the difference in the accuracy index value based on the evaluation data, the target accuracy index value related to the analysis model information to be analyzed and the analysis model information to be compared, and a predetermined determination condition. Specifically, the information analysis unit 22 makes this judgment based on those differences, the target accuracy index value included in the analysis summary information, and a determination table. That is, the information analysis unit 22 determines whether the prediction accuracy of the analysis model to be compared has improved or deteriorated relative to the analysis model to be analyzed.
Here, an example of the determination table is described with reference to FIG. 5, together with the determination of improvement and deterioration in prediction accuracy performed by the information analysis unit 22. FIG. 5 is a diagram showing an example of the determination table. In the determination table, a list of accuracy index value items, a determination condition for performance improvement, and a determination condition for performance deterioration are set in order from the left.
The information analysis unit 22 acquires, from the analysis summary information that is held in the information holding unit 11 of the repository 10 and that contains the analysis model information to be analyzed and the analysis model information to be compared, the item indicating the accuracy index set in the target accuracy index value. From the difference in the accuracy index value based on the learning data and the difference in the accuracy index value based on the evaluation data, the information analysis unit 22 extracts the difference for the item that matches the acquired item. The information analysis unit 22 searches the determination table for the extracted item and compares the difference for that item with the determination condition for performance improvement and the determination condition for performance deterioration set in the determination table. When the difference for the extracted item satisfies the determination condition for performance improvement, the information analysis unit 22 determines that the analysis model to be analyzed is more accurate than the analysis model to be compared. When the difference satisfies the determination condition for performance deterioration, the information analysis unit 22 determines that the analysis model to be analyzed is less accurate than the analysis model to be compared.
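The table lookup can be sketched as follows. The actual contents of the determination table in FIG. 5 are not reproduced in this text, so the items (`rmse`, `accuracy`) and their conditions below are purely hypothetical placeholders, as is the convention that the difference is the value for the model to be analyzed minus the value for the model to be compared.

```python
# Hypothetical determination table: item -> (improvement condition, deterioration condition).
# Each condition takes the difference (analyzed value minus compared value).
DETERMINATION_TABLE = {
    "rmse": (lambda d: d < 0, lambda d: d > 0),      # lower error is better
    "accuracy": (lambda d: d > 0, lambda d: d < 0),  # higher accuracy is better
}


def judge(item, analyzed_value, compared_value, table=DETERMINATION_TABLE):
    """Judge the model to be analyzed against the model to be compared for one item."""
    improved, deteriorated = table[item]
    diff = analyzed_value - compared_value
    if improved(diff):
        return "improved"
    if deteriorated(diff):
        return "deteriorated"
    return "unchanged"
```

In use, `item` would be the accuracy index named by the target accuracy index value, and the function would be applied to the values based on the learning data and to those based on the evaluation data.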
Returning to FIG. 2, the output unit 24 is described. The output unit 24 corresponds to the output unit 4 in the first embodiment. The output unit 24 outputs the extraction results of the information analysis unit 22 and the calculation results of the calculation unit 23 to the output device 40. The output unit 24 receives the output conditions and the output items from the information input unit 21. Of the extraction results of the information analysis unit 22 and the calculation results of the calculation unit 23, the output unit 24 outputs to the output device 40 the output items, extraction results, and calculation results that satisfy the output items and the output conditions.
The external system control unit 25 controls the execution of an AI engine provided outside the information processing device 100.
The input device 30 functions as an input unit. The input device 30 may be, for example, a keyboard, a mouse, or a touch panel. When the user inputs to the input device 30 the various kinds of information to be held by the information holding unit 11 of the repository 10, the input device 30 outputs the input information to the information input unit 21. When the user inputs to the input device 30 the analysis model to be analyzed and the analysis model to be compared that the information analysis unit 22 is to analyze, the input device 30 outputs that information to the information input unit 21.
The input device 30 receives from the user information indicating whether the analysis processing performed by the information analysis unit 22 and the calculation unit 23 is to be stopped partway, and outputs that information to the information input unit 21. The input device 30 receives from the user the output conditions for outputting the extraction results of the information analysis unit 22 and the calculation results of the calculation unit 23, and outputs them to the information input unit 21. The input device 30 also receives from the user the output items to be output to the output device 40, and outputs them to the information input unit 21.
The output device 40 functions as an output unit. The output device 40 is configured to include, for example, a display. The output device 40 displays to the user the results computed by the processing device 20. The output device 40 displays on the display the output items, extraction results, and calculation results output from the output unit 24. The output device 40 may also output the output items, extraction results, and calculation results output from the output unit 24 to a file.
<Operation example of the information processing device>
Next, an operation example of the information processing device 100 is described with reference to FIGS. 6 and 7, together with specific examples of the extraction results extracted by the information analysis unit 22. FIGS. 6 and 7 are flowcharts showing an operation example of the information processing device according to the second embodiment. The description assumes that the user has designated the analysis model to be analyzed and the analysis model to be compared, and that the information analysis unit 22 has identified the analysis model information to be analyzed and the analysis model information to be compared. It also assumes that the information analysis unit 22 has identified the case information to be analyzed, which corresponds to the analysis model information to be analyzed, and the case information to be compared, which corresponds to the analysis model information to be compared.
The information analysis unit 22 extracts the difference between the objective variable name set in the objective variable included in the case information to be analyzed and the objective variable name set in the objective variable included in case information, other than the case information to be analyzed, acquired from the information holding unit 11 (step S1).
The information analysis unit 22 determines whether there is a difference in the objective variable names (step S2). In other words, the information analysis unit 22 determines whether a difference in the objective variable names has been extracted.
If there is a difference (YES in step S2), the information analysis unit 22 executes step S3.
If there is no difference (NO in step S2), the information analysis unit 22 executes step S5.
In step S3, the information analysis unit 22 determines whether there is a difference in the data types of the objective variables included in the case information to be analyzed and the case information to be compared (step S3).
If there is a difference in the data types (YES in step S3), the information input unit 21 asks the user, via the input device 30, whether to stop the difference extraction, in order to decide whether the subsequent processing is to be stopped (step S4). When the data types differ, the purposes of the analyses may differ and a meaningful comparison may not be possible, so the information input unit 21 confirms with the user whether the subsequent processing is to be executed.
On the other hand, if there is no difference in the data types (NO in step S3), the information analysis unit 22 executes step S5. Even if the objective variable names differ, the purposes of the analyses can be judged to match when the data types are the same, so the information analysis unit 22 executes the subsequent processing.
In step S4, when the information input unit 21 receives, via the input device 30, information indicating that the user will stop the difference extraction (YES in step S4), the information processing device 100 executes step S8.
On the other hand, when the information input unit 21 receives, via the input device 30, information indicating that the user will continue the difference extraction (NO in step S4), the information analysis unit 22 executes step S5.
In step S5, the information analysis unit 22 extracts the difference between the AI engine and algorithm included in the case information to be analyzed and the AI engine and algorithm included in the case information to be compared (step S5). The information analysis unit 22 extracts the difference between the AI engine name included in the case information to be analyzed and the AI engine name included in the case information to be compared. The information analysis unit 22 also extracts the difference between the algorithm name included in the case information to be analyzed and the algorithm name included in the case information to be compared.
The information analysis unit 22 determines whether there is a difference between the AI engine and algorithm included in the case information to be analyzed and the AI engine and algorithm included in the case information to be compared (step S6). In other words, the information analysis unit 22 determines whether a difference in the AI engine and algorithm has been extracted.
If there is a difference (YES in step S6), the information input unit 21 asks the user, via the input device 30, whether to stop the difference extraction, in order to decide whether the subsequent processing is to be stopped (step S7).
If there is no difference (NO in step S6), the information analysis unit 22 executes step S9.
In step S7, when the information input unit 21 receives, via the input device 30, information indicating that the user will stop the difference extraction (YES in step S7), the information processing device 100 executes step S8.
On the other hand, when the information input unit 21 receives, via the input device 30, information indicating that the user will continue the difference extraction (NO in step S7), the information analysis unit 22 executes step S10.
In step S8, the output unit 24 outputs the differences extracted in steps S1 and S5 to the output device 40, which displays them on its screen (step S8). After executing step S8, the information processing device 100 ends the processing.
In step S9, the information analysis unit 22 extracts the difference between the hyperparameters included in the case information to be analyzed and the hyperparameters included in the case information to be compared (step S9). Because the case information to be analyzed and the case information to be compared use the same AI engine, the information analysis unit 22 extracts the difference in hyperparameters.
Here, a specific example of the results extracted by the information analysis unit 22 in steps S1 to S9 is described with reference to FIG. 8. FIG. 8 is a diagram showing an example of the extraction results. As shown in FIG. 8, the information analysis unit 22 holds the extraction results in, for example, tabular form. The table holding the extraction results contains the difference extraction items, the case to be analyzed, the case to be compared, and the difference.
In the difference extraction item column, each row is set to an item for which a difference was extracted in steps S1, S5, and S9.
In the column for the case to be analyzed, the case name included in the case information to be analyzed, which corresponds to the analysis model to be analyzed, is set. FIG. 8 shows that the case information to be analyzed is case 1. In each row of this column, the value included in the case information to be analyzed is set for the item for which a difference was extracted in steps S1, S5, and S9.
In the column for the case to be compared, the case name included in the case information to be compared, which corresponds to the analysis model to be compared, is set. FIG. 8 shows that the case information to be compared is case 2. In each row of this column, the value included in the case information to be compared is set for the item for which a difference was extracted in steps S1, S5, and S9.
In the difference column, information indicating whether or not there was a difference is set for each difference extraction item examined in steps S1, S5, and S9. For an item that takes a numerical value, such as a hyperparameter, when there is a difference, the difference column is set to the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared.
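The table of FIG. 8 can be illustrated with a minimal sketch. The function name `extract_item_differences`, the item keys, the example values, and the strings "none" and "present" are all hypothetical; the only points taken from the description are the four columns and the rule that a numerical difference is the value of the case to be compared minus the value of the case to be analyzed.

```python
from numbers import Number


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


def extract_item_differences(analysis_case, comparison_case):
    """Build rows of (item, analysis value, comparison value, difference)."""
    # Keep the analysis-case item order, then append items found only
    # in the comparison case.
    items = list(analysis_case) + [k for k in comparison_case if k not in analysis_case]
    rows = []
    for item in items:
        a = analysis_case.get(item)
        b = comparison_case.get(item)
        if a == b:
            diff = "none"
        elif _is_number(a) and _is_number(b):
            diff = b - a  # comparison value minus analysis value
        else:
            diff = "present"
        rows.append((item, a, b, diff))
    return rows


# Hypothetical case information for case 1 (analyzed) and case 2 (compared).
case1 = {"ai_engine": "EngineA", "algorithm": "linear regression", "max_iterations": 100}
case2 = {"ai_engine": "EngineA", "algorithm": "linear regression", "max_iterations": 150}
for row in extract_item_differences(case1, case2):
    print(row)
```

In this sketch, non-numerical items only record whether a difference exists, while numerical items such as hyperparameters also record its signed size.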
Returning to FIG. 6, the description of the operation example of the information processing device 100 continues.
In step S10, the calculation unit 23 calculates hash values of the learning candidate data included in the case information to be analyzed and in the case information to be compared, and the information analysis unit 22 extracts the difference between the calculated hash values (step S10).
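As a rough illustration of step S10, the sketch below hashes two data sets and checks whether the hash values differ. The description does not name a hash function or a serialization format; SHA-256 and the comma-joined row encoding are assumptions made only for this example, as are the function names.

```python
import hashlib


def dataset_hash(rows):
    """Return a SHA-256 hex digest over the rows of a data set."""
    digest = hashlib.sha256()
    for row in rows:
        # Serialize each row deterministically before hashing.
        digest.update(",".join(str(value) for value in row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def hashes_differ(analysis_rows, comparison_rows):
    """True when the two data sets produce different hash values."""
    return dataset_hash(analysis_rows) != dataset_hash(comparison_rows)


# Hypothetical learning candidate data for the two cases.
analysis_data = [(1, 20.5), (2, 21.0)]
comparison_data = [(1, 20.5), (2, 22.0)]
print(hashes_differ(analysis_data, comparison_data))
```

Comparing digests avoids a row-by-row comparison when the data sets are identical, which is the case in which the later data analysis is skipped.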
The information analysis unit 22 determines whether there is a difference in the hash values (step S11).
If there is a difference (YES in step S11), the calculation unit 23 executes step S12. When the hash values differ, the learning candidate data can be judged to be different, so the information processing device 100 analyzes the learning candidate data.
If there is no difference (NO in step S11), the calculation unit 23 executes step S20.
In step S12, the calculation unit 23 calculates the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared (step S12). The calculation unit 23 calculates the basic statistics for each variable included in the learning candidate data. The basic statistics include, for example, the number of elements, arithmetic mean, standard deviation, minimum value, first quartile, median, and third quartile. For each variable, the calculation unit 23 calculates the value of each item included in these basic statistics.
The information analysis unit 22 extracts the difference between the basic statistics of the learning candidate data included in the case information to be analyzed and the basic statistics of the learning candidate data included in the case information to be compared (step S13). For each variable included in the learning candidate data and each item included in the basic statistics, the information analysis unit 22 extracts the difference between the basic statistic of the learning candidate data included in the case information to be analyzed and the basic statistic of the learning candidate data included in the case information to be compared.
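Steps S12 and S13 can be sketched for a single variable as follows. This is only an illustration: the function names are hypothetical, and the choice of the sample standard deviation and of the "inclusive" quartile method are assumptions, since the description does not specify either. The sign convention (compared case minus analyzed case) follows the convention stated for the difference column.

```python
import statistics


def basic_statistics(values):
    """Basic statistics of one variable (needs at least two values)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumption)
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def statistics_difference(analysis_values, comparison_values):
    """Per-item difference: comparison statistic minus analysis statistic."""
    a = basic_statistics(analysis_values)
    b = basic_statistics(comparison_values)
    return {name: b[name] - a[name] for name in a}


# Hypothetical values of one variable in the two learning candidate data sets.
print(statistics_difference([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
```

Applying `statistics_difference` to every variable yields one row per statistic and one column group per variable, similar in layout to the tables described for FIGS. 9A and 9B.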
The calculation unit 23 calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be analyzed, and likewise calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be compared (step S14). After the calculation unit 23 calculates the correlation coefficients, the information analysis unit 22 may extract the difference between the correlation coefficient calculated using the learning candidate data included in the case information to be analyzed and the correlation coefficient calculated using the learning candidate data included in the case information to be compared.
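A minimal sketch of step S14 follows. The description does not say which correlation coefficient is used; the Pearson coefficient is assumed here, and the function names, variable names, and data are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_variable(data, objective):
    """Correlation of every explanatory variable with the objective variable."""
    target = data[objective]
    return {
        name: pearson_correlation(column, target)
        for name, column in data.items()
        if name != objective
    }


# Hypothetical learning candidate data: two explanatory variables and objective "y".
candidate_data = {"x1": [1, 2, 3, 4], "x2": [4, 3, 2, 1], "y": [2, 4, 6, 8]}
print(correlation_by_variable(candidate_data, "y"))
```

Running the same function on the data of both cases and subtracting the results per variable gives the optional correlation-coefficient difference mentioned above.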
The information analysis unit 22 extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be analyzed, and extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be compared (step S15). The information analysis unit 22 extracts, as the weighting, the regression coefficient of each explanatory variable set in the analysis model information to be analyzed, and extracts, as the weighting, the regression coefficient of each explanatory variable set in the model qualitative information included in the analysis model information to be compared. When the analysis model has been created by heterogeneous mixture learning using a plurality of prediction formulas, the information analysis unit 22 may regard each of the prediction formulas as a regression formula, regard the coefficient of each variable of the prediction formula as a regression coefficient, and extract that coefficient as the weighting. After extracting the weightings, the information analysis unit 22 may extract the difference between the weighting extracted from the analysis model information to be analyzed and the weighting extracted from the analysis model information to be compared.
The weighting and correlation coefficient values serve as a basis for judging whether a change in an explanatory variable affects the prediction accuracy of the analysis model. For this reason, the calculation unit 23 calculates the correlation coefficients in step S14, and the information analysis unit 22 extracts the weightings in step S15.
Here, an example of the results extracted by the information analysis unit 22 in steps S12 to S15 is described with reference to FIGS. 9A and 9B. FIGS. 9A and 9B are diagrams showing an example of the extraction results. FIGS. 9A and 9B show the extraction results of the information analysis unit 22 split across two figures; the information analysis unit 22 holds the contents of FIGS. 9A and 9B as the extraction results of steps S12 to S15. As shown in FIGS. 9A and 9B, the information analysis unit 22 holds the extraction results in, for example, tabular form, as in FIG. 8. The table holding the extraction results contains the difference extraction items, the difference results for the objective variable, and the difference results for the explanatory variables.
In the difference extraction item column, for example, the following are set in order from the top: the result of checking whether the explanatory variable exists in the case information to be analyzed and in the case information to be compared, followed by each item for which the information analysis unit 22 extracted a difference in steps S12 to S15. For example, because the basic statistics of the learning candidate data comprise a plurality of items, each item is set in its own row so that the difference for each item can be seen. Likewise, for the weightings extracted from the model qualitative information, when the analysis model has been created by heterogeneous mixture learning using a plurality of prediction formulas, each prediction formula is set in its own row so that the difference in the coefficients of each prediction formula can be seen.
In the area for the difference results of the objective variable, for example, the objective variable name identifying the objective variable is set, together with the case name of the case information to be analyzed, the case name of the case information to be compared, and the difference. For an item that takes a numerical value, when there is a difference, the difference column is set to the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared.
The area for the difference results of the explanatory variables includes a separate area for each explanatory variable so that the difference result of each explanatory variable can be seen. In the area for each explanatory variable, for example, the explanatory variable name identifying the explanatory variable is set, together with the case name of the case information to be analyzed, the case name of the case information to be compared, and the difference. For an item that takes a numerical value, when there is a difference, the difference column is set to the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared.
Returning to FIG. 6, the description of the operation example of the information processing device 100 continues.
In step S16, the information analysis unit 22 extracts the difference between the explanatory variables included in the case information to be analyzed and the explanatory variables included in the case information to be compared (step S16). The information analysis unit 22 extracts this difference by comparing the variable lists in which the explanatory variable names are set. In the variable lists, the information analysis unit 22 identifies the explanatory variable names that match (are common to) the case information to be analyzed and the case information to be compared, and those that do not match, and thereby extracts the difference in explanatory variable names.
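The comparison of variable lists in step S16 amounts to a set comparison of explanatory variable names. The sketch below shows one way to express it; the function name, dictionary keys, and variable names are hypothetical.

```python
def compare_variable_lists(analysis_vars, comparison_vars):
    """Split explanatory variable names into common and case-specific groups."""
    a, b = set(analysis_vars), set(comparison_vars)
    return {
        "common": sorted(a & b),              # present in both cases
        "only_in_analysis": sorted(a - b),    # missing from the compared case
        "only_in_comparison": sorted(b - a),  # missing from the analyzed case
    }


def has_variable_difference(result):
    """Counterpart of the check in step S17: any name found in only one case."""
    return bool(result["only_in_analysis"] or result["only_in_comparison"])


# Hypothetical variable lists for the two cases.
result = compare_variable_lists(
    ["temperature", "humidity", "weekday"],
    ["temperature", "humidity", "rainfall"],
)
print(result, has_variable_difference(result))
```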
The information analysis unit 22 determines whether there is a difference in the explanatory variables (step S17). In other words, the information analysis unit 22 determines whether a difference in explanatory variables has been extracted.
If there is a difference (YES in step S17), the calculation unit 23 executes step S18.
If there is no difference (NO in step S17), the calculation unit 23 executes step S20.
In step S18, the calculation unit 23 calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be analyzed and in the case information to be compared (step S18). The calculation unit 23 calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be analyzed. The calculation unit 23 also calculates the correlation coefficient between each explanatory variable and the objective variable using the learning candidate data included in the case information to be compared. Because the calculation unit 23 has already calculated the correlation coefficients between the explanatory variables and the objective variable in step S14, the correlation coefficients calculated in step S14 may be used.
The information analysis unit 22 extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be analyzed, and extracts the weighting for each explanatory variable from the model qualitative information included in the analysis model information to be compared (step S19). By executing steps S18 and S19, the user can grasp the correlation coefficients and weighting values of explanatory variables that were deleted or added, and can judge whether those explanatory variables affected the prediction accuracy of the analysis model. Because the information analysis unit 22 has already extracted the weightings in step S15, the weightings extracted in step S15 may be used.
Here, an example of the results extracted by the information analysis unit 22 in steps S16 to S19 is described with reference to FIG. 10. FIG. 10 is a diagram showing an example of the extraction results. As shown in FIG. 10, the information analysis unit 22 holds the extraction results in, for example, tabular form, as in FIG. 8. The table holding the extraction results contains the explanatory variable names, information on the explanatory variables included in the case information to be analyzed, and information on the explanatory variables included in the case information to be compared.
In the explanatory variable name column, the names of the explanatory variables included in the case information to be analyzed and of the explanatory variables included in the case information to be compared are set, one explanatory variable name per row.
In the area for information on the explanatory variables included in the case information to be analyzed, the case name of the case information to be analyzed is set. This area also includes a column indicating whether the explanatory variable of each row exists in the case information to be analyzed, a column in which the correlation coefficient with the objective function is set, and a column in which the weighting is set.
In the column indicating whether the explanatory variable exists in the case information to be analyzed, information is set that also makes it possible to tell whether the variable exists in the case information to be compared. In the example shown in FIG. 10, a hatched circle indicates that the explanatory variable is included not only in the case information to be analyzed but also in the case information to be compared. A circle without hatching indicates that the explanatory variable is included in only one of the case information to be analyzed and the case information to be compared, so that there is a difference between the two. In other words, if a circle without hatching is set in the "presence" column of the information on the explanatory variables included in the case information to be analyzed, the explanatory variable in question exists only in the case information to be analyzed. When the analysis model has been created by heterogeneous mixture learning using a plurality of prediction formulas, the coefficients of each prediction formula may be set as one column.
In the area for information on the explanatory variables included in the case information to be compared, the case name of the case information to be compared is set. This area also includes a column indicating whether the explanatory variable of each row exists in the case information to be compared, a column in which the correlation coefficient with the objective function is set, and a column in which the weighting is set.
In the column indicating whether the explanatory variable exists in the case information to be compared, information is set that also makes it possible to tell whether the variable exists in the case information to be analyzed. If a circle without hatching is set in the "presence" column of the information on the explanatory variables included in the case information to be compared, the explanatory variable in question exists only in the case information to be compared. When the analysis model has been created by heterogeneous mixture learning using a plurality of prediction formulas, the coefficients of each prediction formula may be set as one column.
Next, the description of the operation example of the information processing device 100 continues with reference to FIG. 7.
In step S20, the calculation unit 23 calculates the basic statistics of the learning data and the evaluation data included in the analysis model information to be analyzed, and calculates the basic statistics of the learning data and the evaluation data included in the analysis model information to be compared (step S20). The calculation unit 23 calculates the basic statistics of each variable set in the learning data included in the analysis model information to be analyzed, and calculates the basic statistics of each variable set in the learning data included in the analysis model information to be compared. The calculation unit 23 likewise calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be analyzed, and calculates the basic statistics of each variable set in the evaluation data included in the analysis model information to be compared. In both cases, the calculation unit 23 calculates the basic statistics for each variable and for each basic statistic item.
Based on the calculation results of the calculation unit 23, the information analysis unit 22 extracts the differences in the basic statistics of the learning data and the evaluation data, and the differences in the variables (step S21). Based on the results calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each basic statistic item, the difference in basic statistics between the learning data included in the analysis model information to be analyzed and the learning data included in the analysis model information to be compared. The information analysis unit 22 extracts the difference between the variables set in the learning data included in the analysis model information to be analyzed and the variables set in the learning data included in the analysis model information to be compared. Similarly, based on the results calculated by the calculation unit 23, the information analysis unit 22 calculates, for each variable and for each basic statistic item, the difference in basic statistics between the evaluation data included in the analysis model information to be analyzed and the evaluation data included in the analysis model information to be compared. The information analysis unit 22 also extracts the difference between the variables set in the evaluation data included in the analysis model information to be analyzed and the variables set in the evaluation data included in the analysis model information to be compared.
Next, the information analysis unit 22 determines whether model qualitative information is absent from the analysis model information to be analyzed, and whether model qualitative information is absent from the analysis model information to be compared (step S22).
If there is no model qualitative information (YES in step S22), the information analysis unit 22 executes step S27.
If there is model qualitative information (NO in step S22), the information analysis unit 22 extracts the difference in the weighting of each explanatory variable (step S23). When regression coefficients of a regression formula are set in the analysis model information to be analyzed and in the analysis model information to be compared, the information analysis unit 22 treats the regression coefficients as the weightings and extracts the difference in the weightings.
In step S24, the information analysis unit 22 determines whether there is a difference in the weightings (step S24).
If there is a difference in the weightings (YES in step S24), the calculation unit 23 calculates, for the explanatory variables, the correlation coefficient with the objective variable (step S25).
On the other hand, if there is no difference in the weightings (NO in step S24), the information analysis unit 22 executes step S26.
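Steps S23 and S24 can be sketched as below, treating the regression coefficient of each explanatory variable as its weighting. The function names, the `tolerance` parameter, and the coefficient values are hypothetical. Restricting the comparison to variables present in both models, and using the "compared minus analyzed" sign convention of the difference columns, are assumptions made for this example.

```python
def weight_differences(analysis_weights, comparison_weights):
    """Weighting difference per explanatory variable shared by both models."""
    shared = analysis_weights.keys() & comparison_weights.keys()
    return {
        name: comparison_weights[name] - analysis_weights[name]
        for name in sorted(shared)
    }


def has_weight_difference(differences, tolerance=0.0):
    """Counterpart of step S24: True if any weighting differs beyond tolerance."""
    return any(abs(value) > tolerance for value in differences.values())


# Hypothetical regression coefficients taken from the model qualitative information.
analysis_coefficients = {"x1": 0.5, "x2": 2.0}
comparison_coefficients = {"x1": 0.75, "x2": 2.0}
differences = weight_differences(analysis_coefficients, comparison_coefficients)
print(differences, has_weight_difference(differences))
```

A nonzero `tolerance` could be used to ignore floating-point noise in the coefficients; the description itself does not mention such a threshold.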
In step S26, the information analysis unit 22 extracts the difference in the hierarchy information of the decision tree (step S26). When decision tree hierarchy information is set in the analysis model information to be analyzed and in the analysis model information to be compared, the information analysis unit 22 extracts the difference in that hierarchy information. The decision tree hierarchy information includes the number of levels of the decision tree, the decision condition of each branch of the decision tree, the number of learning data samples at each leaf of the decision tree, and the number of evaluation data samples at each leaf of the decision tree. The information analysis unit 22 extracts a difference for each of these pieces of hierarchy information.
Here, an example of the results extracted by the information analysis unit 22 in step S26 is described with reference to FIGS. 11 and 12. FIGS. 11 and 12 are diagrams showing an example of the extraction results.
FIG. 11 is a diagram for explaining the differences in the decision conditions of the branches of the decision tree, which form part of the hierarchy information of the decision tree. As shown in FIG. 11, for each of the case to be analyzed and the case to be compared, the information analysis unit 22 sets the decision condition of each branch in the table holding the extraction results, using arrows so that the relationships between the branches can be seen. The information analysis unit 22 identifies the differing locations based on the relation lines between the decision conditions of the branches. The information analysis unit 22 holds the differences in the decision conditions of the branches so that the differing locations can be identified.
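One possible way to locate differing branch conditions is to represent each decision tree as nested dictionaries and compare the root-to-branch condition paths. This is only a sketch: the nested-dictionary representation, the function names, and the example conditions are hypothetical and are not taken from FIG. 11.

```python
def branch_paths(node, prefix=()):
    """Yield every sequence of decision conditions from the root to a branch."""
    for condition, child in node.get("branches", {}).items():
        path = prefix + (condition,)
        yield path
        yield from branch_paths(child, path)


def tree_depth(node):
    """Number of levels in the tree, counting the root as one level."""
    children = node.get("branches", {}).values()
    return 1 + max((tree_depth(child) for child in children), default=0)


def condition_differences(analysis_tree, comparison_tree):
    """Condition paths found in only one tree, plus the difference in depth."""
    a = set(branch_paths(analysis_tree))
    b = set(branch_paths(comparison_tree))
    return {
        "only_in_analysis": sorted(a - b),
        "only_in_comparison": sorted(b - a),
        "depth_difference": tree_depth(comparison_tree) - tree_depth(analysis_tree),
    }


# Hypothetical trees that differ only in the threshold applied to x2.
tree1 = {"branches": {"x1 <= 5": {"branches": {"x2 <= 1": {}, "x2 > 1": {}}}, "x1 > 5": {}}}
tree2 = {"branches": {"x1 <= 5": {"branches": {"x2 <= 3": {}, "x2 > 3": {}}}, "x1 > 5": {}}}
print(condition_differences(tree1, tree2))
```

Because each path keeps its parent conditions, a reported difference also shows where in the tree the differing branch sits.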
FIG. 12 is a diagram for explaining the differences in the number of learning data samples and the number of evaluation data samples at each leaf of the decision tree, which form part of the hierarchy information of the decision tree. As shown in FIG. 12, the information analysis unit 22 holds the extraction results in, for example, tabular form. The table holding the extraction results contains information indicating the leaves of the decision tree and the numbers of samples.
 決定木の葉を示す情報の列には、各行に決定木の葉である最終的な目的変数の予測値が設定される。 In the column of information indicating the leaves of the decision tree, each row contains the predicted value of the final objective variable, which is the leaf of the decision tree.
 サンプル数が設定される領域には、学習データが各葉に分類されるデータサンプル数に関する情報が設定される領域と、評価データが各葉に分類されるデータサンプル数に関する情報が設定される領域と、が設定される。 The area where the number of samples is set includes an area where information about the number of data samples for which the training data is classified into each leaf is set, and an area for which information about the number of data samples for which the evaluation data is classified into each leaf is set. and are set.
 学習データが各葉に分類されるデータサンプル数に関する情報が設定される領域には、分析対象のケースの場合のデータサンプル数と、比較対象のケースの場合のデータサンプル数と、これらの差分とが設定される。 The area where information about the number of data samples for which the training data is classified into each leaf is set. is set.
 評価データが各葉に分類されるデータサンプル数に関する情報が設定される領域には、分析対象のケースの場合のデータサンプル数と、比較対象のケースの場合のデータサンプル数と、これらの差分とが設定される。差分の列には、情報分析部22が、決定木の各葉について、学習データサンプル数及び決定木の各葉の評価データサンプル数のそれぞれについて算出した差分が設定される。情報分析部22は、比較対象のケース情報から特定される学習データサンプル数から比較対象のケース情報から特定される学習データサンプル数を減算することで差分を算出し、差分の列に算出された値を設定する。 The area in which information about the number of data samples for which the evaluation data is classified into each leaf is set. is set. In the difference column, the difference calculated by the information analysis unit 22 for the number of learning data samples and the number of evaluation data samples for each leaf of the decision tree is set for each leaf of the decision tree. The information analysis unit 22 calculates a difference by subtracting the number of learning data samples specified from the case information to be compared from the number of learning data samples specified from the case information to be compared, and calculates the difference in the difference column. set the value.
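The per-leaf table of FIG. 12 might be built along the following lines. This is a minimal sketch, assuming that the difference is the comparison-case count minus the analysis-case count (the convention stated for the accuracy index values of FIG. 13) and that a leaf missing from one case counts as zero samples. The function name, the dictionary layout, and the sample counts are hypothetical.

```python
def leaf_sample_diff(analysis_counts, comparison_counts):
    """Per-leaf difference: comparison-case count minus analysis-case count."""
    rows = []
    for leaf in sorted(set(analysis_counts) | set(comparison_counts)):
        a = analysis_counts.get(leaf, 0)    # assumption: a missing leaf means 0 samples
        c = comparison_counts.get(leaf, 0)
        rows.append({"leaf": leaf, "analysis": a, "comparison": c, "diff": c - a})
    return rows


# Hypothetical learning-data sample counts, keyed by each leaf's predicted value.
learning_case1 = {2800: 40, 3000: 35, 3500: 25}
learning_case2 = {2800: 38, 3000: 30, 3500: 32}

learning_rows = leaf_sample_diff(learning_case1, learning_case2)
for row in learning_rows:
    print(row)
```

The same function can be applied unchanged to the evaluation data sample counts to fill the second region of the table.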
Returning to FIG. 7, the description of the operation example of the information processing apparatus 100 continues.
In step S27, the information analysis unit 22 extracts the difference between the accuracy index value included in the analysis model information to be analyzed and the accuracy index value included in the analysis model information to be compared. Specifically, the information analysis unit 22 extracts the differences in the accuracy index value based on the learning data and in the accuracy index value based on the evaluation data, both of which are set in the accuracy index values of the two sets of analysis model information. Because at least one accuracy index item is set in the accuracy index value, the information analysis unit 22 extracts these differences for each accuracy index item.
An example of the extraction result obtained by the information analysis unit 22 in step S27 is described with reference to FIG. 13. FIG. 13 is a diagram showing an example of the extraction result. As shown in FIG. 13, the information analysis unit 22 holds the extraction result in, for example, a table format. The table that holds the extraction result contains the difference extraction item, the case to be analyzed, the case to be compared, and the difference.
The difference extraction item column indicates that the extraction item is the accuracy index value, and lists each accuracy index item that expresses the accuracy index value. Each accuracy index item has a row for the accuracy index value based on the learning data and a row for the accuracy index value based on the evaluation data.
The column for the case to be analyzed holds the case name included in the case information to be analyzed, which corresponds to the analysis model to be analyzed. In FIG. 13, the case information to be analyzed is Case 1. For each accuracy index item, this column also holds the accuracy index value based on the learning data and the accuracy index value based on the evaluation data.
The column for the case to be compared holds the case name included in the case information to be compared, which corresponds to the analysis model to be compared. In FIG. 13, the case information to be compared is Case 2. For each accuracy index item, this column also holds the accuracy index value based on the learning data and the accuracy index value based on the evaluation data.
The difference column holds the value obtained by subtracting the value of the case to be analyzed from the value of the case to be compared.
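The table of FIG. 13 could be assembled roughly as shown below. This is a sketch under assumptions: the metric names `RMSE` and `MAE`, the dictionary layout, and all numeric values are hypothetical and are not taken from the publication. Only the direction of the subtraction (comparison case minus analysis case) follows the text.

```python
def accuracy_diff_table(analysis, comparison):
    """Rows of (item, data kind, analysis value, comparison value, difference)."""
    rows = []
    for metric in analysis:
        for split in ("learning", "evaluation"):
            a = analysis[metric][split]
            c = comparison[metric][split]
            rows.append((metric, split, a, c, c - a))  # comparison minus analysis
    return rows


# Hypothetical accuracy index values for Case 1 (analysis) and Case 2 (comparison).
case1 = {"RMSE": {"learning": 120.5, "evaluation": 150.0},
         "MAE": {"learning": 90.0, "evaluation": 110.25}}
case2 = {"RMSE": {"learning": 130.5, "evaluation": 170.0},
         "MAE": {"learning": 95.5, "evaluation": 110.25}}

accuracy_rows = accuracy_diff_table(case1, case2)
for row in accuracy_rows:
    print(row)
```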
Returning to FIG. 7, the description of the operation example of the information processing apparatus 100 continues.
In step S28, the information analysis unit 22 judges whether the performance of the analysis model is superior or inferior, based on the difference in the accuracy index value that corresponds to the target accuracy index value. The judgment is made based on the differences in the accuracy index values based on the learning data and the evaluation data, the target accuracy index value included in the analysis summary information, and the determination table shown in FIG. 5.
The information analysis unit 22 acquires, from the information holding unit 11 of the repository 10, the item indicating the accuracy index set as the target accuracy index value in the analysis summary information that contains the analysis model information to be analyzed and the analysis model information to be compared. From the differences in the accuracy index values based on the learning data and on the evaluation data, the information analysis unit 22 extracts the differences for the item that matches the acquired accuracy index item. The information analysis unit 22 then looks up the extracted item in the determination table and compares the difference for that item against the performance improvement condition and the performance deterioration condition set in the table.
When the difference for the extracted item satisfies the performance improvement condition, the information analysis unit 22 determines that the accuracy of the analysis model to be analyzed has improved over that of the analysis model to be compared. When the difference satisfies the performance deterioration condition, the information analysis unit 22 determines that the accuracy of the analysis model to be analyzed has deteriorated relative to that of the analysis model to be compared. If the learning accuracy index value and the prediction accuracy index value show no difference for an item, that is, the difference is 0 (zero), no judgment is made for that item.
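The judgment in step S28 can be sketched as a table lookup. The contents of the determination table of FIG. 5 are not reproduced in this section, so the table below is purely hypothetical: it assumes that the difference is the comparison-case value minus the analysis-case value, that a smaller `RMSE` is better, and that a larger `R2` is better. The names `JUDGMENT_TABLE` and `judge` are likewise illustrative.

```python
# Hypothetical determination table: one improvement and one deterioration
# condition per accuracy index item, applied to (comparison - analysis).
JUDGMENT_TABLE = {
    "RMSE": {"improved": lambda d: d > 0, "degraded": lambda d: d < 0},
    "R2": {"improved": lambda d: d < 0, "degraded": lambda d: d > 0},
}


def judge(metric, diff):
    """Return 'improved', 'degraded', or None when no judgment is made."""
    if diff == 0:  # no difference: the item is not judged
        return None
    rule = JUDGMENT_TABLE[metric]
    if rule["improved"](diff):
        return "improved"
    if rule["degraded"](diff):
        return "degraded"
    return None


print(judge("RMSE", 20.0))  # comparison error is larger, so the analysis model improved
print(judge("R2", 0.1))     # comparison R2 is larger, so the analysis model degraded
print(judge("RMSE", 0))     # zero difference, so no judgment
```

Keeping the conditions in a table rather than in branching code mirrors the text, where the conditions are looked up per accuracy index item.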
The information analysis unit 22 outputs the differences extracted up to step S28 to the output device 40 via the output unit 24 (step S29).
The information input unit 21 asks the user, via the input device 30, whether to narrow down the difference display output to the output device 40 (step S30).
When the information input unit 21 receives, via the input device 30, information indicating that the user will narrow down the difference table (YES in step S30), the information processing apparatus 100 executes step S31.
On the other hand, when the information input unit 21 receives, via the input device 30, information indicating that the user will not narrow down the difference table (NO in step S30), the output unit 24 executes step S34.
In step S31, the information input unit 21 receives, via the input device 30, the display item selection information selected by the user. In other words, the information input unit 21 receives the output items to be finally output to the output device 40.
The information input unit 21 receives, via the input device 30, the output conditions for narrowing down the difference display output to the output device 40 (step S32). For each of the objective variable and the explanatory variables, the information input unit 21 receives the output conditions for determining the items to be finally output to the output device 40. Specifically, for each objective variable and explanatory variable, the information input unit 21 receives the conditions that the user entered on the input device 30 for deciding whether to display the differences in the basic statistics of the learning candidate data, in the correlation coefficient with the objective variable, and in the per-explanatory-variable weighting of the model qualitative information.
The output unit 24 determines the output items, extraction results, and calculation results that satisfy the display item selection information and the output conditions (step S33). In other words, the output unit 24 determines the output items based on the display item selection information, together with the output items, extraction results, and calculation results that satisfy the output conditions. For each of the objective variable and the explanatory variables, the output unit 24 determines whether to display on the screen the differences in the learning candidate data, the explanatory variables, and the model qualitative information.
Steps S31 to S33 are now described again with reference to FIGS. 14A and 14B. FIGS. 14A and 14B are diagrams for explaining the process of narrowing down the difference display.
FIGS. 14A and 14B correspond to FIGS. 9A and 9B, respectively, with a "screen display" column added as the rightmost column. In FIG. 14B, a screen display area is also added at the bottom. This screen display area contains a row for the display determination condition and a row for the display result.
In step S31, the output unit 24 displays FIGS. 14A and 14B on the output device 40 with the screen display column and the screen display rows left blank. By selecting the output items to be finally displayed on the screen, the user enters display item selection information for those output items on the input device 30.
In step S32, the user enters on the input device 30 the output conditions for determining the extraction results and calculation results to be finally displayed on the screen. The output conditions are determined when the user enters the display determination condition in the screen display area shown in FIG. 14B. The information input unit 21 receives the display determination condition entered by the user, via the input device 30, as the output condition.
In step S33, the output unit 24 determines the output items, extraction results, and calculation results that satisfy the display item selection information and the output conditions. The output unit 24 determines that the output items included in the display item selection information are to be displayed on the screen. The output unit 24 also determines, for each of the objective variable and the explanatory variables, whether the output conditions are satisfied, and thereby decides which extraction results and calculation results to display. In the example shown in FIG. 14B, the display determination condition is that the absolute value of the difference in the arithmetic mean is 100 or more, or that data exists in only one of Case 1 and Case 2. The output unit 24 determines, for each of the objective variable and the explanatory variables, whether this output condition is satisfied. In the example shown in FIGS. 14A and 14B, the explanatory variable "actual (10,000 kW)_7 days ago" satisfies the condition, so the output unit 24 decides to display that explanatory variable on the screen. The output unit 24 sets the decision in the display row of the screen display area.
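The display determination condition of FIG. 14B (absolute difference of the arithmetic means of 100 or more, or data present in only one of the two cases) can be expressed as a small predicate. The following is a sketch under assumptions: a missing arithmetic mean is represented by `None`, and the variable names and all numeric values are hypothetical rather than taken from FIGS. 14A and 14B.

```python
def should_display(mean_case1, mean_case2, threshold=100):
    """True if only one case has data, or |mean difference| >= threshold."""
    only_one_has_data = (mean_case1 is None) != (mean_case2 is None)
    if only_one_has_data:
        return True
    if mean_case1 is None:  # neither case has data: nothing to compare
        return False
    return abs(mean_case2 - mean_case1) >= threshold


# Hypothetical arithmetic means as (Case 1, Case 2); None means no data.
means = {
    "objective": (3200.0, 3250.0),
    "temperature": (18.5, 19.0),
    "actual_7_days_ago": (None, 3180.0),
    "actual_1_day_ago": (3100.0, 3300.0),
}

shown = [name for name, (m1, m2) in means.items() if should_display(m1, m2)]
print(shown)
```

In this sample, `actual_7_days_ago` is selected because it exists in only one case, and `actual_1_day_ago` is selected because its means differ by 200.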
Returning to FIG. 7, the description of the operation example of the information processing apparatus 100 continues.
In step S34, the output unit 24 outputs the differences by displaying them on the screen of the output device 40. If the difference display is not narrowed down in step S30, the output unit 24 outputs to the output device 40 so that the differences displayed in step S29 are kept as they are. If the difference display is narrowed down in step S30, the output unit 24 outputs to the output device 40 the output items and differences that were selected for screen display in step S33.
As described above, the information processing apparatus 100 extracts the differences in various kinds of information about the analysis model to be analyzed and the analysis model to be compared. Using the information processing apparatus 100 therefore standardizes the extraction of differences between cases and clarifies the points to be evaluated, so that the prediction accuracy of an analysis model can be evaluated in a shorter time and in a uniform way that does not depend on skill level. As a result, the creation of analysis models and the improvement of their prediction accuracy become more efficient.
Specifically, the person in charge of analysis can check the overall improvement status from the differences in the accuracy index values extracted by the information processing apparatus 100, and can quickly identify the factors behind the improvement by checking the differences in the explanatory variables and in the AI engine and algorithm. The person in charge of analysis can also assess the degree of influence on the improvement from changes in data trends, such as the basic statistics of the learning candidate data and the correlation coefficients between the explanatory variables and the objective variable, from the weighting in the regression equation, and from whether particular conditional expressions are adopted in the decision tree. Even if the person in charge of analysis is inexperienced, the information processing apparatus 100 outputs the information that corresponds to the points to be checked, which makes the evaluation of the prediction accuracy of the analysis model more efficient. Therefore, the information processing apparatus 100 according to the second embodiment makes it possible to evaluate a learning model efficiently regardless of the skill level of the person in charge of analysis.
In the technique disclosed in Patent Literature 1 described above, the difference in accuracy index values is extracted when the AI engine and algorithm are changed for comparison targets that share the same objective variable, learning candidate data, explanatory variables, learning data, and evaluation data. That technique thus determines how a change of AI engine and algorithm affects the prediction accuracy of the analysis model. In contrast, the information processing apparatus 100 extracts differences even when the objective variable, learning candidate data, explanatory variables, learning data, and evaluation data are not the same. Therefore, the information processing apparatus 100 according to the second embodiment makes it possible to grasp the information that affects the prediction accuracy of the analysis model, which contributes to creating an analysis model with high prediction accuracy.
(Other embodiments)
The information processing apparatuses 1 and 100 described in the above embodiments (hereinafter referred to as the information processing apparatus 1 and the like) may have the following hardware configuration. FIG. 15 is a diagram showing a hardware configuration example of the information processing apparatus according to the present disclosure.
Referring to FIG. 15, the information processing apparatus 1 and the like include a processor 1201 and a memory 1202. The processor 1201 reads software (a computer program) from the memory 1202 and executes it, thereby performing the processing of the information processing apparatus 1 and the like described with the flowcharts in the above embodiments. The processor 1201 may be, for example, a microprocessor, an MPU (Micro Processing Unit), or a CPU (Central Processing Unit). The processor 1201 may include a plurality of processors.
The memory 1202 is configured as a combination of volatile memory and non-volatile memory. The memory 1202 may include storage located away from the processor 1201. In this case, the processor 1201 may access the memory 1202 via an I/O (Input/Output) interface (not shown).
In the example of FIG. 15, the memory 1202 is used to store a group of software modules. The processor 1201 reads these software modules from the memory 1202 and executes them, thereby performing the processing of the information processing apparatus 1 and the like described in the above embodiments.
As described with reference to FIG. 15, each of the one or more processors included in the information processing apparatus 1 and the like executes one or more programs containing instructions for causing a computer to perform the algorithms described with reference to the drawings.
In the above example, the program includes instructions (or software code) that, when loaded into a computer, cause the computer to perform one or more of the functions described in the embodiments. The program may be stored in a non-transitory computer-readable medium or a tangible storage medium. By way of example, and not limitation, computer-readable media or tangible storage media include random-access memory (RAM), read-only memory (ROM), flash memory, solid-state drives (SSD) or other memory technologies, CD-ROM, digital versatile discs (DVD), Blu-ray (registered trademark) discs or other optical disc storage, magnetic cassettes, magnetic tape, and magnetic disk storage or other magnetic storage devices. The program may be transmitted on a transitory computer-readable medium or a communication medium. By way of example, and not limitation, transitory computer-readable media or communication media include electrical, optical, acoustic, or other forms of propagated signals.
The present disclosure is not limited to the above embodiments and can be modified as appropriate without departing from its spirit. The present disclosure may also be implemented by combining the embodiments as appropriate.
Part or all of the above embodiments can also be described as in the following appendices, but are not limited to them.
(Appendix 1)
An information processing device comprising:
analysis means for extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
calculation means for calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
output means for outputting an extraction result of the analysis means and a calculation result of the calculation means.
(Appendix 2)
The information processing device according to Appendix 1, wherein, when the difference between the first explanatory variable and the second explanatory variable is extracted, the analysis means extracts a first weighting coefficient indicating the degree of weighting of the first explanatory variable in the first learning model and a second weighting coefficient indicating the degree of weighting of the second explanatory variable in the second learning model.
(Appendix 3)
The information processing device according to Appendix 2, wherein the analysis means extracts a regression coefficient of a first regression equation as the first weighting coefficient when the first learning model is represented by the first regression equation, and extracts a regression coefficient of a second regression equation as the second weighting coefficient when the second learning model is represented by the second regression equation.
(Appendix 4)
The information processing device according to Appendix 2 or 3, wherein the calculation means calculates a third correlation coefficient between the first objective variable and a first explanatory variable for which the first weighting coefficient and the second weighting coefficient differ, and calculates a fourth correlation coefficient between the second objective variable and a second explanatory variable for which the first weighting coefficient and the second weighting coefficient differ.
(Appendix 5)
The information processing device according to any one of Appendices 1 to 4, wherein the calculation means calculates a first basic statistic of first learning candidate data included in the first case information and a second basic statistic of second learning candidate data included in the second case information, and
the analysis means extracts a difference between the first basic statistic and the second basic statistic.
(Appendix 6)
The information processing device according to any one of Appendices 1 to 5, wherein the calculation means calculates a third basic statistic of first learning data used to create the first learning model and a fourth basic statistic of second learning data used to create the second learning model, and
the analysis means extracts a difference between the third basic statistic and the fourth basic statistic, and a difference between the variables included in the first learning data and the variables included in the second learning data.
(Appendix 7)
The information processing device according to any one of Appendices 1 to 6, wherein the calculation means calculates a fifth basic statistic of first evaluation data used to evaluate the first learning model and a sixth basic statistic of second evaluation data used to evaluate the second learning model, and
the analysis means extracts a difference between the fifth basic statistic and the sixth basic statistic, and a difference between the variables included in the first evaluation data and the variables included in the second evaluation data.
(Appendix 8)
When the first learning model is represented by a first decision tree and when the second learning model is represented by a second decision tree, the analysis means performs 8. The information processing device according to any one of appendices 1 to 7, wherein a difference between hierarchical information and hierarchical information of the second decision tree is extracted.
(Appendix 9)
The information processing device according to Appendix 8, wherein the hierarchy information includes the number of layers of the decision tree, relation information between the layers, the decision condition of each branch of the decision tree, the number of learning data samples at each leaf of the decision tree, and the number of evaluation data samples at each leaf of the decision tree.
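As a non-limiting illustration, the extraction of a difference between the hierarchy information of two decision trees (Appendices 8 and 9) could be sketched as follows. The dictionary layout, the key names such as `num_layers` and `branch_conditions`, and the function name `diff_hierarchy` are hypothetical and are introduced only for this sketch.

```python
def diff_hierarchy(first, second):
    """Compare two hierarchy-information dicts and return only the differing items."""
    differences = {}
    for key in sorted(set(first) | set(second)):
        a, b = first.get(key), second.get(key)
        if isinstance(a, dict) and isinstance(b, dict):
            # Compare nested items (per branch or per leaf) one by one.
            nested = {
                k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b))
                if a.get(k) != b.get(k)
            }
            if nested:
                differences[key] = nested
        elif a != b:
            differences[key] = (a, b)
    return differences


# Hypothetical hierarchy information of the first and second decision trees.
first_tree = {
    "num_layers": 2,
    "branch_conditions": {"root": "temperature <= 20"},
    "leaf_training_samples": {"leaf_1": 60, "leaf_2": 40},
}
second_tree = {
    "num_layers": 2,
    "branch_conditions": {"root": "temperature <= 25"},
    "leaf_training_samples": {"leaf_1": 70, "leaf_2": 30},
}
tree_diff = diff_hierarchy(first_tree, second_tree)
```

Here the number of layers is identical and therefore not reported, while the changed decision condition and the changed sample counts at each leaf are returned as pairs of (first, second) values.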
(Appendix 10)
The information processing device according to any one of Appendices 1 to 9, wherein the analysis means extracts a difference between a first accuracy index value indicating the accuracy of at least one of a learning result and a prediction result of the first learning model, and a second accuracy index value indicating the accuracy of at least one of a learning result and a prediction result of the second learning model.
(Appendix 11)
The information processing device according to Appendix 10, wherein the analysis means determines whether the prediction accuracy of the first learning model has improved over the prediction accuracy of the second learning model, based on the difference between the first accuracy index value and the second accuracy index value, a target accuracy index value related to the first learning model and the second learning model, and a predetermined determination condition.
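As a non-limiting illustration, the determination described in Appendix 11 could be sketched as follows. The concrete determination condition is not specified here, so this sketch assumes an accuracy index for which larger values are better and assumes a hypothetical condition in which the difference must exceed `min_gain`; the function name `accuracy_improved` is likewise an assumption.

```python
def accuracy_improved(first_value, second_value, target_value, min_gain=0.0):
    """Judge improvement of the first model over the second (larger index = better)."""
    difference = first_value - second_value
    return {
        "difference": difference,
        # Assumed determination condition: the gain must exceed min_gain.
        "improved": difference > min_gain,
        # Whether the first model reaches the target accuracy index value.
        "meets_target": first_value >= target_value,
    }


# Hypothetical accuracy index values (for example, correct answer rates).
result = accuracy_improved(0.90, 0.85, target_value=0.88, min_gain=0.01)
regressed = accuracy_improved(0.80, 0.85, target_value=0.88)
```

For an error-type index in which smaller values are better, the sign of the comparison would need to be reversed.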
(Appendix 12)
The information processing device according to any one of Appendices 1 to 11, further comprising input means for inputting output items and output conditions,
wherein the output means outputs, from among the extraction result and the calculation result, the output items, extraction results, and calculation results that satisfy the input output items and output conditions.
(Appendix 13)
The information processing device according to any one of Appendices 1 to 12, wherein the analysis means extracts a difference between a first AI (Artificial Intelligence) engine included in the first case information and a second AI engine included in the second case information.
(Appendix 14)
The information processing device according to Appendix 13, wherein the analysis means extracts a difference between a first learning algorithm included in the first case information and a second learning algorithm included in the second case information.
(Appendix 15)
The information processing device according to Appendix 14, wherein, when the first AI engine matches the second AI engine and the first learning algorithm matches the second learning algorithm, the analysis means extracts a difference between a first hyperparameter included in the first case information and a second hyperparameter included in the second case information.
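As a non-limiting illustration, the staged comparison of Appendices 13 to 15 could be sketched as follows: the AI engine and the learning algorithm are compared first, and the hyperparameters are compared only when both match. The dictionary keys `ai_engine`, `learning_algorithm`, and `hyperparameters`, the example values, and the function name `diff_design` are hypothetical.

```python
def diff_design(first_case, second_case):
    """Extract engine/algorithm differences, then hyperparameter differences if both match."""
    result = {}
    for key in ("ai_engine", "learning_algorithm"):
        if first_case[key] != second_case[key]:
            result[key] = (first_case[key], second_case[key])
    if not result:
        # Engine and algorithm match, so the hyperparameters are comparable.
        a = first_case["hyperparameters"]
        b = second_case["hyperparameters"]
        changed = {
            name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)
        }
        if changed:
            result["hyperparameters"] = changed
    return result


# Hypothetical case information for two learning models.
first_case = {
    "ai_engine": "engine_a",
    "learning_algorithm": "gradient_boosting",
    "hyperparameters": {"max_depth": 3, "learning_rate": 0.1},
}
second_case = {
    "ai_engine": "engine_a",
    "learning_algorithm": "gradient_boosting",
    "hyperparameters": {"max_depth": 5, "learning_rate": 0.1},
}
design_diff = diff_design(first_case, second_case)
other_case = dict(second_case, learning_algorithm="linear_regression")
algorithm_diff = diff_design(first_case, other_case)
```

Skipping the hyperparameter comparison when the engine or algorithm differs reflects the condition stated in Appendix 15, since hyperparameters of different algorithms are generally not directly comparable.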
(Appendix 16)
A difference extraction method including:
extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
when the difference is extracted, calculating a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and calculating a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
outputting the extraction result and the calculation result.
(Appendix 17)
A non-transitory computer-readable medium storing a program that causes an information processing device to execute a difference extraction method,
the difference extraction method including:
extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
when the difference is extracted, calculating a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and calculating a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
outputting the extraction result and the calculation result.
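As a non-limiting illustration, the difference extraction method of Appendices 16 and 17 could be sketched as follows. This sketch assumes that a "difference" means an explanatory variable present in only one of the two cases, and it uses the Pearson correlation coefficient; the data layout, the helper names `pearson` and `extract_and_correlate`, and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def extract_and_correlate(first_case, second_case):
    """Extract differing explanatory variables, then correlate each with its objective."""
    first_vars = set(first_case["explanatory"])
    second_vars = set(second_case["explanatory"])
    only_first = sorted(first_vars - second_vars)
    only_second = sorted(second_vars - first_vars)
    correlations = {}
    # Correlations are calculated only when a difference has been extracted.
    for name in only_first:
        correlations[("first", name)] = pearson(
            first_case["explanatory"][name], first_case["objective"]
        )
    for name in only_second:
        correlations[("second", name)] = pearson(
            second_case["explanatory"][name], second_case["objective"]
        )
    return {
        "only_first": only_first,
        "only_second": only_second,
        "correlations": correlations,
    }


# Hypothetical case information: explanatory variable columns and objective variable.
first_case = {
    "explanatory": {"temp": [1, 2, 3, 4], "humidity": [2, 1, 4, 3]},
    "objective": [2, 4, 6, 8],
}
second_case = {
    "explanatory": {"temp": [1, 2, 3, 4], "day": [4, 3, 2, 1]},
    "objective": [1, 2, 3, 4],
}
report = extract_and_correlate(first_case, second_case)
```

The returned dictionary corresponds to the output step: it carries both the extraction result (the differing variables) and the calculation result (their correlation coefficients with the objective variable of the respective case).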
Reference Signs List
 1, 100 information processing device
 2 analysis unit
 3 calculation unit
 4 output unit
 10 repository
 11 information holding unit
 20 processing device
 21 information input unit
 22 information analysis unit
 23 calculation unit
 24 output unit
 25 external system control unit
 30 input device
 40 output device
 1201 processor
 1202 memory

Claims (17)

  1.  An information processing device comprising:
     analysis means for extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
     calculation means for calculating, when the difference is extracted, a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
     output means for outputting the extraction result of the analysis means and the calculation result of the calculation means.
  2.  The information processing device according to claim 1, wherein, when the analysis means extracts a difference between the first explanatory variable and the second explanatory variable, the analysis means extracts a first weighting coefficient indicating the degree of weighting of the first explanatory variable in the first learning model and a second weighting coefficient indicating the degree of weighting of the second explanatory variable in the second learning model.
  3.  The information processing device according to claim 2, wherein the analysis means extracts a regression coefficient of a first regression equation as the first weighting coefficient when the first learning model is represented by the first regression equation, and extracts a regression coefficient of a second regression equation as the second weighting coefficient when the second learning model is represented by the second regression equation.
  4.  The information processing device according to claim 2 or 3, wherein the calculation means calculates, for a first explanatory variable for which the first weighting coefficient and the second weighting coefficient differ, a third correlation coefficient with the first objective variable, and calculates, for a second explanatory variable for which the first weighting coefficient and the second weighting coefficient differ, a fourth correlation coefficient with the second objective variable.
  5.  The information processing device according to any one of claims 1 to 4, wherein the calculation means calculates a first basic statistic of first learning candidate data included in the first case information and a second basic statistic of second learning candidate data included in the second case information, and
     the analysis means extracts a difference between the first basic statistic and the second basic statistic.
  6.  The information processing device according to any one of claims 1 to 5, wherein the calculation means calculates a third basic statistic of first learning data used to create the first learning model and a fourth basic statistic of second learning data used to create the second learning model, and
     the analysis means extracts a difference between the third basic statistic and the fourth basic statistic, and a difference between the variables included in the first learning data and the variables included in the second learning data.
  7.  The information processing device according to any one of claims 1 to 6, wherein the calculation means calculates a fifth basic statistic of first evaluation data used to evaluate the first learning model and a sixth basic statistic of second evaluation data used to evaluate the second learning model, and
     the analysis means extracts a difference between the fifth basic statistic and the sixth basic statistic, and a difference between the variables included in the first evaluation data and the variables included in the second evaluation data.
  8.  The information processing device according to any one of claims 1 to 7, wherein, when the first learning model is represented by a first decision tree and the second learning model is represented by a second decision tree, the analysis means extracts a difference between hierarchy information of the first decision tree and hierarchy information of the second decision tree.
  9.  The information processing device according to claim 8, wherein the hierarchy information includes the number of layers of the decision tree, relation information between the layers, the decision condition of each branch of the decision tree, the number of learning data samples at each leaf of the decision tree, and the number of evaluation data samples at each leaf of the decision tree.
  10.  The information processing device according to any one of claims 1 to 9, wherein the analysis means extracts a difference between a first accuracy index value indicating the accuracy of at least one of a learning result and a prediction result of the first learning model, and a second accuracy index value indicating the accuracy of at least one of a learning result and a prediction result of the second learning model.
  11.  The information processing device according to claim 10, wherein the analysis means determines whether the prediction accuracy of the first learning model has improved over the prediction accuracy of the second learning model, based on the difference between the first accuracy index value and the second accuracy index value, a target accuracy index value related to the first learning model and the second learning model, and a predetermined determination condition.
  12.  The information processing device according to any one of claims 1 to 11, further comprising input means for inputting output items and output conditions,
     wherein the output means outputs, from among the extraction result and the calculation result, the output items, extraction results, and calculation results that satisfy the input output items and output conditions.
  13.  The information processing device according to any one of claims 1 to 12, wherein the analysis means extracts a difference between a first AI (Artificial Intelligence) engine included in the first case information and a second AI engine included in the second case information.
  14.  The information processing device according to claim 13, wherein the analysis means extracts a difference between a first learning algorithm included in the first case information and a second learning algorithm included in the second case information.
  15.  The information processing device according to claim 14, wherein, when the first AI engine matches the second AI engine and the first learning algorithm matches the second learning algorithm, the analysis means extracts a difference between a first hyperparameter included in the first case information and a second hyperparameter included in the second case information.
  16.  A difference extraction method including:
     extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
     when the difference is extracted, calculating a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and calculating a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
     outputting the extraction result and the calculation result.
  17.  A non-transitory computer-readable medium storing a program that causes an information processing device to execute a difference extraction method,
     the difference extraction method including:
     extracting a difference between a first explanatory variable included in first case information indicating information about a design pattern of a first learning model and a second explanatory variable included in second case information indicating information about a design pattern of a second learning model;
     when the difference is extracted, calculating a first correlation coefficient between the first explanatory variable and a first objective variable included in the first case information, and calculating a second correlation coefficient between the second explanatory variable and a second objective variable included in the second case information; and
     outputting the extraction result and the calculation result.
PCT/JP2021/020987 2021-06-02 2021-06-02 Information processing device, difference extraction method, and non-temporary computer-readable medium WO2022254607A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/020987 WO2022254607A1 (en) 2021-06-02 2021-06-02 Information processing device, difference extraction method, and non-temporary computer-readable medium
JP2023525238A JPWO2022254607A5 (en) 2021-06-02 Information processing device, difference extraction method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/020987 WO2022254607A1 (en) 2021-06-02 2021-06-02 Information processing device, difference extraction method, and non-temporary computer-readable medium

Publications (1)

Publication Number Publication Date
WO2022254607A1 true WO2022254607A1 (en) 2022-12-08

Family

ID=84322843

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/020987 WO2022254607A1 (en) 2021-06-02 2021-06-02 Information processing device, difference extraction method, and non-temporary computer-readable medium

Country Status (1)

Country Link
WO (1) WO2022254607A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011227920A (en) * 2011-06-29 2011-11-10 Nomura Research Institute Ltd Marketing support system
WO2017168460A1 (en) * 2016-03-29 2017-10-05 日本電気株式会社 Information processing system, information processing method, and information processing program
JP2020052514A (en) * 2018-09-25 2020-04-02 日本電気株式会社 Ai (artificial intelligence) execution support device, method, and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KAWASAKI KENTO: "Structuring Non-Structured Data by Transformer", JSAI TECHNICAL REPORT. 121ST SIG-KBS, 20 November 2020 (2020-11-20), pages 16 - 21, XP093012644, [retrieved on 20230110], DOI: 10.11517/jsaikbs.121.0_04 *

Also Published As

Publication number Publication date
JPWO2022254607A1 (en) 2022-12-08

Similar Documents

Publication Publication Date Title
CN108875784B (en) Method and system for data-based optimization of performance metrics in industry
US10839314B2 (en) Automated system for development and deployment of heterogeneous predictive models
US9047559B2 (en) Computer-implemented systems and methods for testing large scale automatic forecast combinations
US20190251458A1 (en) System and method for particle swarm optimization and quantile regression based rule mining for regression techniques
US20060184460A1 (en) Automated learning system
US20180082185A1 (en) Predictive model updating system, predictive model updating method, and predictive model updating program
JP7069029B2 (en) Automatic prediction system, automatic prediction method and automatic prediction program
US11481692B2 (en) Machine learning program verification apparatus and machine learning program verification method
CN111738331A (en) User classification method and device, computer-readable storage medium and electronic device
US11995667B2 (en) Systems and methods for business analytics model scoring and selection
US20220366315A1 (en) Feature selection for model training
US20210201179A1 (en) Method and system for designing a prediction model
KR102406375B1 (en) An electronic device including evaluation operation of originated technology
WO2022254607A1 (en) Information processing device, difference extraction method, and non-temporary computer-readable medium
Zhang et al. Predicting consistent clone change
JP2012181739A (en) Man-hour estimation device, man-hour estimation method, and man-hour estimation program
CN115619539A (en) Pre-loan risk evaluation method and device
JPWO2018235841A1 (en) Graph structure analysis device, graph structure analysis method, and program
KR20230052010A (en) Demand forecasting method using ai-based model selector algorithm
AU2020201689A1 (en) Cognitive forecasting
CA3160715A1 (en) Systems and methods for business analytics model scoring and selection
WO2023275971A1 (en) Information processing device, information processing method, and non-transitory computer-readable medium
Sousa et al. Applying Machine Learning to Estimate the Effort and Duration of Individual Tasks in Software Projects
EP4310736A1 (en) Method and system of generating causal structure
CN111260191B (en) Test bed maturity quantization method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21944111

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023525238

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21944111

Country of ref document: EP

Kind code of ref document: A1