WO2023144998A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2023144998A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
machine learning
executed
priority
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/003310
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
悦子 市原
純明 榮
貴史 小梨
裕樹 多賀戸
佑嗣 小林
淳 西岡
昌尚 棗田
純 児玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP2023576515A (JP7750311B2)
Priority to PCT/JP2022/003310 (WO2023144998A1)
Publication of WO2023144998A1
Anticipated expiration
Legal status: Ceased

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • The present invention relates to an information processing device, an information processing method, and a program.
  • Patent Document 1 discloses a parameter adjustment device that can perform grid searches efficiently. This parameter adjustment device extracts parameter combination patterns that bring the accuracy of the model within an allowable range, and uses the highest-ranked combination patterns when performing the next analysis.
  • However, because Patent Document 1 ranks the combinations themselves, it is difficult to identify which of the plurality of parameters contributes to improving the analysis accuracy of the model.
  • One aspect of the present invention has been made in view of the above problem, and an example of its purpose is to provide a technology that can support efficient model generation by machine learning.
  • An information processing apparatus includes: first acquisition means for acquiring target data, which is data related to target machine learning; second acquisition means for acquiring executed data, which is data related to a plurality of executed machine learnings; identifying means for identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; priority calculation means for calculating a priority for each of a plurality of parameters related to the one or more executed machine learnings identified by the identifying means, and assigning the calculated priority to the parameter; and generating means for generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • An information processing system includes: first acquisition means for acquiring target data, which is data related to target machine learning; second acquisition means for acquiring executed data, which is data related to a plurality of executed machine learnings; identifying means for identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; priority calculation means for calculating a priority for each of a plurality of parameters related to the one or more executed machine learnings identified by the identifying means, and assigning the calculated priority to the parameter; and generating means for generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • An information processing method includes, by one or more processors: acquiring target data, which is data related to target machine learning; acquiring executed data, which is data related to a plurality of executed machine learnings; identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; calculating a priority for each of a plurality of parameters related to the identified one or more executed machine learnings, and assigning the calculated priority to the parameter; and generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • A program causes a computer to execute: a first acquisition process of acquiring target data, which is data related to target machine learning; a second acquisition process of acquiring executed data, which is data related to a plurality of executed machine learnings; an identification process of identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; a priority calculation process of calculating a priority for each of a plurality of parameters related to the identified one or more executed machine learnings, and assigning the calculated priority to the parameter; and a generation process of generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • FIG. 1 is a block diagram showing the configuration of an information processing device according to exemplary embodiment 1 of the present invention.
  • FIG. 2 is a flow diagram showing the flow of an information processing method according to exemplary embodiment 1.
  • FIG. 3 is a block diagram showing the configuration of an information processing apparatus according to exemplary embodiment 2 of the present invention.
  • FIG. 4 is a flow chart showing the flow of information processing related to the information processing apparatus according to exemplary embodiment 2.
  • FIG. 5 is an example of data input to an analysis device and data output from the analysis device.
  • FIG. 6 is a diagram showing an example of the identification process performed by the identification unit.
  • FIG. 10 is a flow chart showing the flow of an information processing method S2 according to exemplary embodiment 2;
  • FIG. 10 is a block diagram showing the configuration of an information processing system according to exemplary Embodiment 3 of the present invention;
  • FIG. 12 is a flow chart showing an example of a generation process S3 according to exemplary embodiment 4 of the present invention;
  • FIG. 11 is an example of data for output in which the axes of hyperparameters are arranged in order of priority, according to exemplary embodiment 4;
  • An example in which part of the input data of the identified machine learning is included in the output data.
  • An example of output data in which the thickness of the lines indicating combinations of multiple hyperparameters is changed according to value.
  • An example of output data when the user selects three hyperparameters.
  • FIG. 1 is a configuration diagram for realizing an information processing apparatus by software
  • FIG. 1 is a block diagram showing the configuration of an information processing device 1.
  • The information processing apparatus 1 includes a first acquisition unit 11, a second acquisition unit 12, an identification unit 13, a priority calculation unit 14, and a generation unit 15.
  • The first acquisition unit 11, the second acquisition unit 12, the identification unit 13, the priority calculation unit 14, and the generation unit 15 are one form of the first acquisition means, the second acquisition means, the identifying means, the priority calculation means, and the generating means, respectively.
  • The information processing apparatus 1 is an apparatus for efficiently setting parameters of a machine learning model, particularly parameters preset by a user, when machine learning is to be performed on new data.
  • The first acquisition unit 11 acquires target data, which is data related to target machine learning.
  • Machine learning (hereinafter also referred to as “machine learning processing”) is, as an example, learning in which a machine learning device that executes a machine learning algorithm trains a model for regression, classification, prediction, or the like.
  • Machine learning includes machine learning algorithms.
  • Data related to machine learning includes, for example, the type of machine learning device, the types and values of parameters set by the user, the data to be processed by the algorithm (data input to the machine learning device), and the final or intermediate data calculated by the algorithm (data output from the machine learning device).
  • The target data includes, for example, the type of machine learning device that generates a new model, the data input to the machine learning device, the parameter values set by the user, and the data output from the machine learning device.
  • The machine learning device may be a device using any of various open-source libraries (scikit-learn, TensorFlow, etc.), or a machine learning device created independently by the user.
  • Machine learning devices, also referred to as analysis devices, may be of different types.
  • Setting certain parameters and executing machine learning is also referred to as “analyzing”.
  • A high-performance model can be generated by repeating the analysis multiple times with different parameters.
  • The second acquisition unit 12 acquires executed data, which is data related to a plurality of machine learnings that have been executed.
  • The “executed data” includes, for example, the type of machine learning device used in past analyses, the numerical values of the parameters that were set, the data input to the machine learning device, and the data output from the machine learning device.
  • The data output from the machine learning device also includes an index, which will be described later.
  • The identification unit 13 uses the target data and the executed data to identify one or more executed machine learnings similar to the target machine learning.
  • “Similar” means having something in common. For example, machine learning that uses the same type of machine learning device, machine learning that uses similar types of input data, and machine learning that has a similar analysis project name can be said to be similar machine learning.
  • For example, the identification unit 13 can use a model that evaluates the similarity of sentences to identify, from among the executed machine learnings, those whose similarity to the target data is equal to or greater than a predetermined threshold.
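  • The threshold-based identification described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which similarity model is used, so Python's standard `difflib.SequenceMatcher` is swapped in as a simple string-similarity stand-in, and the record fields (`analyzer`, `input_file`, `model_id`), the threshold of 0.6, and the sample records are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, a, b).ratio()


def describe(run: dict) -> str:
    """Join the descriptive fields of a run into one comparable string."""
    return f"{run['analyzer']} {run['input_file']} {run['model_id']}"


def identify_similar(target: dict, executed: list, threshold: float = 0.6) -> list:
    """Keep executed runs whose description is at least `threshold` similar to the target."""
    target_text = describe(target)
    return [run for run in executed if similarity(target_text, describe(run)) >= threshold]


# Hypothetical records; the field names and values are assumptions for illustration.
target = {"analyzer": "ANL1", "input_file": "X.csv", "model_id": "TDA"}
executed = [
    {"analyzer": "ANL1", "input_file": "X.csv", "model_id": "TDA_2"},
    {"analyzer": "OTHER", "input_file": "images.zip", "model_id": "VISION_9"},
]

similar_ids = [run["model_id"] for run in identify_similar(target, executed)]
print(similar_ids)
```

  • A character-level ratio is a crude proxy for sentence similarity; a learned text-similarity model could be substituted for `similarity` without changing the thresholding logic.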
  • The priority calculation unit 14 calculates the priority of each of a plurality of parameters related to the one or more executed machine learnings identified by the identification unit 13, and assigns the calculated priority to the parameter.
  • Here, priority refers to the order in which parameters should be selected as being highly effective in improving machine learning. That is, the priority calculation unit 14 calculates and assigns a higher priority to a parameter with a greater effect on learning.
  • The generation unit 15 generates output data that includes identification information of a plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • “Output data” is, for example, data to be output to and displayed on a display device. In that case, the generation unit 15 generates data for displaying the priority on the display device in an identifiable manner.
  • Using the ranges of parameter values and the priority of each parameter displayed on the display device as a guide, and also drawing on personal experience and knowledge, the user can decide which parameters to set to which values when running a new machine learning analysis. This makes it possible to generate a well-performing model more efficiently than by setting parameter values at random.
  • As described above, the information processing apparatus 1 includes: the first acquisition unit 11 that acquires target data, which is data related to target machine learning; the second acquisition unit 12 that acquires executed data, which is data related to a plurality of executed machine learnings; the identification unit 13 that identifies, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; the priority calculation unit 14 that calculates the priority of each of a plurality of parameters related to the one or more executed machine learnings identified by the identification unit 13, and assigns the calculated priority to the parameter; and the generation unit 15 that generates output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • Therefore, the information processing apparatus 1 can support more efficient model generation by machine learning.
  • FIG. 2 is a flow diagram showing the flow of the information processing method S1.
  • The information processing method S1 includes the following steps. In step S11, one or more processors (for example, the first acquisition unit 11) acquire target data, which is data related to target machine learning.
  • The target data is as described above.
  • Next, one or more processors acquire executed data, which is data relating to a plurality of executed machine learnings.
  • The executed data is as described above.
  • Next, one or more processors identify, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning. Examples of similar machine learning are given above.
  • Next, one or more processors (for example, the priority calculation unit 14) calculate the priority of each of a plurality of parameters related to the identified one or more executed machine learnings, and assign the calculated priority to the parameter.
  • Next, one or more processors (for example, the generation unit 15) generate output data that includes identification information of the plurality of parameters and ranges of values of the parameters, and that includes the priority in an identifiable manner. Examples of output data are as described above.
  • In the information processing method S1, one or more processors thus acquire target data, which is data relating to target machine learning; acquire executed data, which is data relating to a plurality of executed machine learnings; identify, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; calculate the priority of each of a plurality of parameters related to the identified one or more executed machine learnings and assign the calculated priority to the parameter; and generate output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner.
  • FIG. 3 is a block diagram showing the configuration of an information processing apparatus 1A according to exemplary embodiment 2.
  • The information processing apparatus 1A includes a control unit 10, a memory 17, a communication unit 18, and a database 19.
  • The control unit 10 includes a first acquisition unit 11, a second acquisition unit 12, an identification unit 13, a priority calculation unit 14, a generation unit 15, and an extraction unit 16.
  • The functions of the first acquisition unit 11, the second acquisition unit 12, the identification unit 13, the priority calculation unit 14, and the generation unit 15 are as described in the first exemplary embodiment.
  • Part or all of the first acquisition unit 11, the second acquisition unit 12, the identification unit 13, the priority calculation unit 14, the generation unit 15, the extraction unit 16, the memory 17, the communication unit 18, and the database 19 may be distributed across different locations. Part or all of these units and the database 19 may also be arranged on the cloud.
  • The information processing device 1A is an information processing device for efficiently setting parameters of a machine learning model when machine learning is to be performed on new data.
  • Machine learning model parameters include at least hyperparameters, which are not subject to updating by learning.
  • The hyperparameters are parameters preset by the user.
  • The hyperparameters include, for example, the learning rate, the number of epochs (the number of times the learning of all the data has been repeated), the batch size (the number of divisions of all the data), the window size (the size of the portion extracted from all the data), and the number of hidden units in an LSTM (long short-term memory) architecture.
  • The parameters of the machine learning model may also include parameters that are updated by learning, such as weighting factors, and loss parameters.
  • The extraction unit 16 extracts, from the one or more executed machine learnings identified by the identification unit 13, one or more machine learnings associated with parameters that satisfy a predetermined condition.
  • The extraction unit 16 is one form of the extraction means described in the claims.
  • The priority calculation unit 14 calculates the priority of each of the plurality of parameters related to the one or more machine learnings extracted by the extraction unit 16, and assigns the calculated priority to the parameter.
  • A “predetermined condition” is, for example, a condition that an index indicating the degree of learning (performance) of the machine learning device has reached a level set by the user.
  • A parameter satisfies a predetermined condition when, for example, a loss parameter (described later) that indicates the performance of the machine learning device has reached a level set by the user.
  • The memory 17 includes, for example, a ROM (Read Only Memory) and a RAM (Random Access Memory), and stores one or more programs in the ROM. These programs realize the functions of the units of the control unit 10.
  • The database 19 records the target data acquired by the first acquisition unit 11, the executed data acquired by the second acquisition unit 12, and the like.
  • The information processing device 1A is configured to be able to exchange information with the analysis result database 30 via the information communication network N.
  • The analysis result database 30 stores executed data.
  • The information processing device 1A may include a display unit 20.
  • The output data generated by the generation unit 15 is output to and displayed on the display unit 20.
  • The display unit 20 may be, for example, a display.
  • The first acquisition unit 11, the second acquisition unit 12, the identification unit 13, the priority calculation unit 14, the generation unit 15, the extraction unit 16, the memory 17, the communication unit 18, and the database 19 do not have to be grouped in one place as a single information processing device 1A. In other words, some or all of them may be distributed across different locations, or arranged on the cloud.
  • FIG. 4 is a flow chart showing the flow of information processing related to the information processing apparatus 1A.
  • The analysis result database 30 stores executed data. Specifically, the analysis data, hyperparameters, and other inputs used for an analysis are input to the analysis device, and the output data from the analysis device is recorded in the analysis result database 30 together with those inputs (step S20).
  • The first acquisition unit 11 acquires the target data input to the information processing device 1A and records it in the database 19 (step S21: first acquisition process).
  • The target data may be stored in the analysis result database 30.
  • In that case, the first acquisition unit 11 acquires the target data from the analysis result database 30 via the communication unit 18 and records it in the database 19.
  • The second acquisition unit 12 acquires the executed data from the analysis result database 30 via the communication unit 18 and records it in the database 19 (step S22: second acquisition process).
  • The second acquisition unit 12 may directly acquire the executed data held by the analysis device.
  • The first acquisition process and the second acquisition process may be executed in parallel.
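  • Running the two acquisition processes in parallel could look like the following minimal sketch. It assumes a thread pool from Python's standard `concurrent.futures` module; the functions `fetch_target_data` and `fetch_executed_data` are hypothetical stand-ins for reading from the analysis result database 30, and the records they return are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_target_data() -> dict:
    """Hypothetical stand-in for the first acquisition process (step S21)."""
    return {"Model_id": "TDA", "Input File": "X.csv"}


def fetch_executed_data() -> list:
    """Hypothetical stand-in for the second acquisition process (step S22)."""
    return [
        {"Model_id": "TDA_2", "Input File": "X.csv"},
        {"Model_id": "TDA_3", "Input File": "X.csv"},
    ]


# Submit both acquisitions at once and wait for both results.
with ThreadPoolExecutor(max_workers=2) as pool:
    target_future = pool.submit(fetch_target_data)
    executed_future = pool.submit(fetch_executed_data)
    target = target_future.result()
    executed = executed_future.result()

print(target["Model_id"], len(executed))
```

  • Threads suit this case because the acquisitions are I/O-bound (database or network reads); the identification process only starts once both results are available.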
  • The identification unit 13 identifies one or more executed machine learnings similar to the target machine learning using the target data and the executed data recorded in the database 19 (step S23: identification process).
  • The extraction unit 16 extracts, from the identified executed machine learnings, one or more machine learnings associated with a parameter that satisfies a predetermined condition (step S24: extraction process).
  • For example, the extraction unit 16 extracts, from the one or more executed machine learnings identified by the identification unit 13, one or more machine learnings that output a loss parameter equal to or less than a predetermined threshold.
  • A loss parameter refers to a loss value that is the output of a loss function included in the learning algorithm.
  • The loss value is an index that indicates the performance of the model.
  • A loss function is a function that evaluates the difference between the output value and the teacher data (correct data) in supervised learning.
  • Examples of loss functions include, but are not limited to, mean squared error, mean absolute error, root mean squared error, Huber loss, and Poisson loss.
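  • For reference, four of the loss functions named above can be written out directly from their standard definitions. The sketch below uses plain Python lists; the sample teacher data and predictions, and the Huber threshold `delta=1.0`, are assumptions chosen only to make the arithmetic easy to follow.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for errors up to delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        error = abs(t - p)
        if error <= delta:
            total += 0.5 * error ** 2
        else:
            total += delta * (error - 0.5 * delta)
    return total / len(y_true)


# Hypothetical teacher data and model outputs: only the last value is off, by 2.
teacher = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(mean_squared_error(teacher, predicted))       # 1.0
print(mean_absolute_error(teacher, predicted))      # 0.5
print(root_mean_squared_error(teacher, predicted))  # 1.0
print(huber_loss(teacher, predicted))               # 0.375
```

  • The single outlier is penalized quadratically by the squared-error losses but only linearly by the Huber loss, which is why the Huber value is the smallest here.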
  • The priority calculation unit 14 uses the loss parameter output by each of the machine learnings extracted by the extraction unit 16, together with the parameters other than the loss parameter, to assign a priority to each parameter other than the loss parameter according to its degree of contribution to improving the loss parameter (step S25: priority calculation process).
  • Parameters other than loss parameters are hyperparameters in this exemplary embodiment.
  • The contribution is, for example, the amount of decrease in the loss value when the hyperparameter is changed.
  • The change range of a hyperparameter in the executed data (the difference between its maximum and minimum values) varies greatly depending on the type of hyperparameter. It can therefore be difficult to compare, across hyperparameters, how much changing each one reduces the loss value.
  • Therefore, the priority calculation unit 14 may calculate the contribution by comparing the amount of decrease in the loss value when each hyperparameter is changed by, for example, its entire change range, 1/2 of that range, or 1/4 of that range, and may assign priorities accordingly. This is because the change range of a hyperparameter in the executed data can be regarded as the range that, as a result of past trials, was judged likely to improve the loss parameter.
  • The priority calculation unit 14 may also identify contributions according to the results of comparing the loss parameters output by the machine learnings and comparing the parameters other than the loss parameters (hyperparameters) used for those machine learnings. For example, the loss parameters are compared to extract two machine learnings with different loss parameter values. Then, the hyperparameters whose values differ between the two machine learnings are identified. The same comparison is performed for other pairs of machine learnings. For each hyperparameter whose change is estimated to have reduced the loss parameter, the priority calculation unit 14 calculates a contribution according to the amount of decrease in the loss parameter. The priority calculation unit 14 may then assign priorities according to the contributions.
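  • One way to picture the pairwise comparison described above is the sketch below. It rests on simplifying assumptions that are not in the text: only pairs of runs that differ in exactly one hyperparameter are used, the loss difference is attributed entirely to that hyperparameter, and the difference is scaled linearly to the hyperparameter's full change range so that contributions are comparable. The run records and their values are hypothetical.

```python
from itertools import combinations

HYPERPARAMETERS = ["Learning_rate", "Training_epochs", "Batch_size"]


def pairwise_contributions(runs, names):
    """Estimate each hyperparameter's loss change per full change range."""
    ranges = {n: max(r[n] for r in runs) - min(r[n] for r in runs) for n in names}
    contributions = {n: 0.0 for n in names}
    for run_a, run_b in combinations(runs, 2):
        changed = [n for n in names if run_a[n] != run_b[n]]
        loss_change = abs(run_a["loss"] - run_b["loss"])
        # Assumption: skip pairs where several hyperparameters changed at once.
        if len(changed) != 1 or loss_change == 0:
            continue
        name = changed[0]
        fraction_of_range = abs(run_a[name] - run_b[name]) / ranges[name]
        # Scale the observed change to what a full-range change would give.
        contributions[name] = max(contributions[name], loss_change / fraction_of_range)
    return contributions


# Hypothetical executed runs (values chosen for easy arithmetic).
runs = [
    {"Learning_rate": 0.1, "Training_epochs": 10, "Batch_size": 32, "loss": 1.75},
    {"Learning_rate": 0.01, "Training_epochs": 10, "Batch_size": 32, "loss": 0.75},
    {"Learning_rate": 0.01, "Training_epochs": 50, "Batch_size": 32, "loss": 0.25},
    {"Learning_rate": 0.01, "Training_epochs": 50, "Batch_size": 128, "loss": 0.125},
    {"Learning_rate": 0.01, "Training_epochs": 30, "Batch_size": 32, "loss": 0.5},
]

contributions = pairwise_contributions(runs, HYPERPARAMETERS)
ranked = sorted(HYPERPARAMETERS, key=lambda n: contributions[n], reverse=True)
priority = {name: rank for rank, name in enumerate(ranked, start=1)}
print(contributions)
print(priority)
```

  • With this data the learning rate receives priority 1, the number of epochs priority 2, and the batch size priority 3. Pairs in which several hyperparameters change together carry information too, but separating their effects needs a joint model rather than this one-at-a-time comparison.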
  • The number of hyperparameters that the priority calculation unit 14 estimates to be effective in reducing the loss parameter is not limited to one, and may be plural.
  • The priority calculation unit 14 may identify a contribution for each of a plurality of hyperparameters, and may assign a higher priority to a hyperparameter with a higher contribution.
  • The generation unit 15 generates output data that includes identification information of a plurality of parameters and ranges of values of the plurality of parameters, and that includes the priority in an identifiable manner (step S26: generation process).
  • The information processing device 1A may output the generated output data to the display unit 20 (step S27: output process).
  • FIG. 5 shows an example of data input to the analyzer ANL1 and data output from the analyzer ANL1.
  • The input shown in FIG. 5 is part of the data that the user wants the model to newly learn (analyze). As shown in the Model_id column of the "hyperparameter identification information", the name of the machine learning analysis project is TDA.
  • An example of using this analysis project TDA as target data will be described below.
  • In FIG. 5, the input data (X.csv) and input data (Y.csv), which are the data to be analyzed, and the hyperparameters are shown as the input data.
  • Learning_rate is the learning rate.
  • Training_epochs is the number of epochs.
  • Batch_size is the batch size.
  • Input File is the name of the input data mentioned above.
  • Model_id is the name of an individual analysis in one analysis project.
  • The output data includes execution time, analyzer name (type), loss (loss parameter), Learning_rate, Training_epochs, Batch_size, Input File, and Model_id.
  • Of these, the data newly output in response to the input data is the execution time, the name (type) of the analyzer, and the loss; the rest is the input data output as is.
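  • The relationship between the input and output records can be sketched as a plain dictionary merge. The field names `Learning_rate`, `Training_epochs`, `Batch_size`, `Input File`, and `Model_id` come from the description above; the keys `Execution_time` and `Analyzer`, the helper `build_output_record`, and every value shown are assumptions for illustration.

```python
def build_output_record(input_record, execution_time, analyzer, loss):
    """Add the three analysis results and echo the input fields unchanged."""
    return {
        "Execution_time": execution_time,  # hypothetical key name
        "Analyzer": analyzer,              # hypothetical key name
        "loss": loss,
        **input_record,
    }


# Hypothetical input for analysis project TDA.
input_record = {
    "Learning_rate": 0.01,
    "Training_epochs": 50,
    "Batch_size": 32,
    "Input File": "X.csv, Y.csv",
    "Model_id": "TDA",
}

output_record = build_output_record(input_record, "2022-01-28T10:00:00", "ANL1", 1.8)
print(list(output_record))
```

  • Keeping the inputs inside the output record means a single row in the analysis result database is enough to reproduce or compare an analysis later.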
  • The output data is recorded in the analysis result database 30.
  • The first acquisition unit 11 acquires this output data recorded in the analysis result database 30 and records it in the database 19 as target data.
  • The second acquisition unit 12 also acquires the executed data recorded in the analysis result database 30 and records it in the database 19.
  • The executed data acquired by the second acquisition unit 12 may be only analysis output data.
  • In this way, the first acquisition unit 11 acquires target data, which is data related to target machine learning (an analysis project), from the analysis result database 30, as an example, and records it in the database 19.
  • The second acquisition unit 12 likewise acquires executed data, which is data relating to a plurality of executed machine learnings, from the analysis result database 30 and records it in the database 19.
  • FIG. 6 is a diagram illustrating an example of identification processing executed by the identification unit 13.
  • Table 601 shows part of the target data acquired by the first acquisition unit 11 and part of the executed data acquired by the second acquisition unit 12; each row records the data of one analysis.
  • The types of data shown in table 601 are, from the left, the execution time when the analysis was started, the name (type) of the analyzer, the loss value obtained by the analysis, the learning rate (Learning_rate1), the number of epochs (Training_epochs), the batch size (Batch_size), the input data name (Input File), and the analysis project name (Model_id). Note that the types of hyperparameters are not limited to these.
  • The data in the first row, whose Model_id is TDA, is the target data, that is, the data of the target analysis project.
  • Table 602 shows the data of executed machine learnings similar to the target machine learning, identified by the identification unit 13 using the target data and the executed data.
  • The identification unit 13 identifies machine learnings whose input data names and analysis project names are similar to those of TDA.
  • In this example, the analysis projects TDA_2, TDA_3, TDA_4, TDA_5, and TDA_6, whose analyzer names and input data names are similar to those of the target TDA, are identified. Note that when a related analysis project is associated with a certain analysis project, a similar analysis project may be identified by referring to the related analysis project.
  • In this way, the identification unit 13 uses the target data and the executed data to identify one or more executed machine learnings (the executed data shown in table 602) similar to the target machine learning.
  • FIG. 7 is a diagram showing the result of the priority calculation processing executed by the extraction unit 16 and the priority calculation unit 14.
  • Table 701 shows TDA_2, TDA_3, and TDA_4, which the extraction unit 16 extracted from the analysis projects identified by the identification unit 13.
  • The extraction unit 16 refers to the user-specified condition that the loss is less than 2.0, and extracts the analysis projects where loss < 2.0. In this way, the extraction unit 16 extracts, from the one or more executed machine learnings identified by the identification unit 13, one or more machine learnings associated with parameters that satisfy a predetermined condition.
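  • The extraction step amounts to a filter on the loss value. In the sketch below the project names come from the description, while the loss values are hypothetical, chosen so that TDA_2, TDA_3, and TDA_4 pass the condition of a loss below 2.0 and TDA_5 and TDA_6 do not.

```python
def extract_runs(identified, max_loss):
    """Keep only the runs whose loss is strictly below the user-set level."""
    return [run for run in identified if run["loss"] < max_loss]


# Hypothetical loss values for the identified analysis projects.
identified = [
    {"Model_id": "TDA_2", "loss": 1.2},
    {"Model_id": "TDA_3", "loss": 0.9},
    {"Model_id": "TDA_4", "loss": 1.7},
    {"Model_id": "TDA_5", "loss": 2.4},
    {"Model_id": "TDA_6", "loss": 3.1},
]

extracted_ids = [run["Model_id"] for run in extract_runs(identified, max_loss=2.0)]
print(extracted_ids)
```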
  • The priority calculation unit 14 refers to TDA_2, TDA_3, and TDA_4, calculates an index indicating the priority of each hyperparameter, and assigns the priorities. As an example, the priority calculation unit 14 identifies each hyperparameter's degree of contribution to decreasing the loss value according to the comparison of the loss values and the comparison of the hyperparameters. A priority is then given to each hyperparameter according to its degree of contribution to improving the loss value.
  • For example, the priority calculation unit 14 may set up simultaneous equations in which each hyperparameter is a variable and the sum of the variables, each multiplied by a coefficient, equals the loss value. The coefficients may then be regarded as contributions, and priorities may be assigned in order of the coefficients.
  • Alternatively, since the loss value may become smaller when a hyperparameter is made smaller, the absolute value of the difference in each hyperparameter between two analyses may be used as a variable, and simultaneous equations may be set up in which the sum of these variables, each multiplied by a coefficient, equals the difference in the loss values.
  • Furthermore, the absolute value of the hyperparameter difference between two analysis results may be normalized by the difference between the maximum and minimum values of that hyperparameter.
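  • The simultaneous-equation approach with normalized absolute differences might be sketched as follows. Several choices here are assumptions rather than part of the description: each equation compares one run against a common baseline run, the system is square (three hyperparameters, three pairs) so it can be solved exactly with a small Gaussian elimination routine, and the run values are hypothetical.

```python
HYPERPARAMETERS = ["Learning_rate", "Training_epochs", "Batch_size"]


def normalized_abs_diff(run_a, run_b, name, runs):
    """Absolute difference of one hyperparameter as a fraction of its range."""
    values = [run[name] for run in runs]
    value_range = max(values) - min(values)
    if value_range == 0:
        return 0.0
    return abs(run_a[name] - run_b[name]) / value_range


def solve_linear_system(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    rows = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("system is singular; choose different run pairs")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution


# Hypothetical executed runs; the first one serves as the baseline.
runs = [
    {"Learning_rate": 0.1, "Training_epochs": 10, "Batch_size": 32, "loss": 1.75},
    {"Learning_rate": 0.01, "Training_epochs": 10, "Batch_size": 32, "loss": 0.75},
    {"Learning_rate": 0.01, "Training_epochs": 50, "Batch_size": 32, "loss": 0.25},
    {"Learning_rate": 0.1, "Training_epochs": 50, "Batch_size": 128, "loss": 1.0},
]
pairs = [(runs[0], other) for other in runs[1:]]

# One equation per pair: sum(coefficient * normalized |difference|) = |loss difference|.
matrix = [[normalized_abs_diff(a, b, n, runs) for n in HYPERPARAMETERS] for a, b in pairs]
rhs = [abs(a["loss"] - b["loss"]) for a, b in pairs]

coefficients = solve_linear_system(matrix, rhs)
contribution = dict(zip(HYPERPARAMETERS, coefficients))
ranked = sorted(HYPERPARAMETERS, key=lambda n: abs(contribution[n]), reverse=True)
priority = {name: rank for rank, name in enumerate(ranked, start=1)}
print(contribution)
print(priority)
```

  • For this data the coefficients come out as 1.0, 0.5, and 0.25, so the learning rate ranks first. With more run pairs than hyperparameters the system becomes overdetermined, and a least-squares fit would be the natural replacement for the exact solve.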
  • a table 702 is a table in which the priority calculation unit 14 assigns priorities to hyperparameters. As shown in table 702, the priority calculation unit 14 assigns the highest priority (priority) 1 to the learning rate. Next, the priority calculation unit 14 assigns priority 2 to the number of epochs and assigns priority 3 to the batch size.
  • the table 702 indicates that the priority calculation unit 14 estimates that changing the learning rate is most effective for changing the loss value. Specifically, it can be understood from Tables 701 and 702 that reducing the learning rate has the greatest contribution to improving (reducing) the loss value. It can be seen that then increasing the number of epochs is the second largest and increasing the batch size is the third largest.
  • the priority calculation unit 14 calculates the priority of each of a plurality of parameters related to one or more machine learning extracted by the extraction unit 16, and assigns the calculated priority to the parameter.
  • The user can refer to table 702 to determine which hyperparameter should be adjusted to reduce the loss value.
  • Table 702 also displays the minimum and maximum values of prioritized hyperparameters. By referring to this, the user can determine to what extent each hyperparameter should be set.
  • the generation unit 15 can generate output data for displaying the tables 601, 602, 701, and 702 described above.
  • the generation unit 15 may generate output data that can display the effects of hyperparameters in a more user-friendly manner. A specific example of the output data generated by the generation unit 15 will be described later.
  • The priority calculation unit 14 may not only assign priorities to a plurality of hyperparameters but also calculate optimum values or optimum ranges by narrowing the range of each hyperparameter.
  • The information processing apparatus 1A according to this exemplary embodiment includes, in addition to the configuration of the information processing apparatus 1 according to the first exemplary embodiment, the extraction unit 16, which extracts, from the one or more executed machine learnings identified by the identification unit 13, one or more machine learnings associated with parameters satisfying a predetermined condition. The priority calculation unit 14 assigns a priority to each of the plurality of parameters related to the one or more machine learnings extracted by the extraction unit 16. Therefore, in addition to the effects of the information processing apparatus 1 according to the first exemplary embodiment, the information processing apparatus 1A allows the user to determine which types of hyperparameters should be adjusted to improve the loss value and how those hyperparameters should be set.
  • FIG. 8 is a flow chart showing the flow of the information processing method S2.
  • The information processing method S2 includes the following steps. In step S21, one or more processors (for example, the first acquisition unit 11) acquire target data, which is data related to the target machine learning.
  • Next, one or more processors acquire executed data, which is data related to a plurality of executed machine learnings.
  • Next, one or more processors identify, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning.
  • Next, one or more processors extract, from among the identified executed machine learnings, one or more machine learnings associated with parameters that satisfy a predetermined condition.
  • The predetermined condition is, for example, that the loss value is smaller than a threshold.
  • Next, one or more processors calculate the priority of each of a plurality of parameters related to the extracted one or more machine learnings, and assign the calculated priority to each parameter.
  • Finally, one or more processors (for example, the generation unit 15) generate output data that includes identification information of the plurality of parameters and the ranges of values of the plurality of parameters, and that includes the priorities in an identifiable manner.
  • In this way, the type of each parameter, its value range, and its priority are output in an identifiable manner. Therefore, according to the information processing method S2 of this exemplary embodiment, in addition to the effects of the information processing method S1 according to the first exemplary embodiment, the user can determine which types of hyperparameters should be adjusted to improve the loss value and how those hyperparameters should be set.
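As a rough picture of what such output data might contain, the sketch below assembles one record per hyperparameter, holding its identification information, value range, and priority, and serializes the records as JSON. The record layout, the `build_output_data` function, and the numeric ranges are assumptions made for illustration. Only the priority order (learning rate, number of epochs, batch size) follows table 702.

```python
import json


def build_output_data(priorities, value_ranges):
    """Return JSON text listing each parameter with its range, ordered by priority."""
    records = [
        {
            "parameter": name,
            "priority": priorities[name],
            "min": value_ranges[name][0],
            "max": value_ranges[name][1],
        }
        for name in sorted(priorities, key=priorities.get)
    ]
    return json.dumps(records, indent=2)


# Priorities as in table 702; the ranges are hypothetical.
priorities = {"epochs": 2, "batch_size": 3, "learning_rate": 1}
value_ranges = {
    "learning_rate": (0.01, 0.09),
    "epochs": (100, 500),
    "batch_size": (1000, 2000),
}
print(build_output_data(priorities, value_ranges))
```

Keeping the priority as an explicit field, rather than relying only on list order, lets a display component reorder the axes later without losing the priority information.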
  • FIG. 9 is a block diagram showing the configuration of the information processing system 2 according to this exemplary embodiment.
  • the information processing system 2 includes a first acquisition unit 11 , a second acquisition unit 12 , an identification unit 13 , a priority calculation unit 14 and a generation unit 15 .
  • the function of each of these units is as described in the first exemplary embodiment.
  • the first acquisition unit 11, the second acquisition unit 12, the identification unit 13, the priority calculation unit 14, and the generation unit 15 are connected to each other via the information communication network N so as to be able to communicate with each other.
  • each part of the information processing system 2 can communicate information with the database 19 and the analysis result database 30 via the information communication network N.
  • the functions of the database 19 and analysis result database 30 are as described in the second exemplary embodiment.
  • Each part of the information processing system 2, the database 19, and the analysis result database 30 may be distributed in part or in whole. Further, each part of the information processing system 2, the database 19, and the analysis result database 30 may be partially or wholly arranged on the cloud.
  • each part of the information processing system 2 is connected to each other via the information communication network N so as to be able to communicate with each other. Further, each part of the information processing system 2 can communicate information with the database 19 and the analysis result database 30 via the information communication network N.
  • In addition, each part of the information processing system 2, the database 19, and the analysis result database 30 may be partially or wholly arranged on the cloud. Therefore, according to the information processing system 2 of this exemplary embodiment, in addition to the effects of the information processing apparatus 1 according to the first exemplary embodiment, some or all of the units of the information processing system 2, the database 19, and the analysis result database 30 can be distributed at arbitrary locations.
  • FIG. 10 is a flowchart showing the output data generation process S3 executed by the generation unit 15. Note that the content of the generation process shown in FIG. 10 is an example, and the generation unit 15 may perform various other generation processes.
  • The generation process S3 includes the following steps. In step S31, one or more processors (for example, the generation unit 15) calculate the minimum and maximum values of each hyperparameter of the identified machine learnings and store them in a database.
  • Here, the identified machine learnings are the executed machine learnings identified by the identification unit 13.
  • the database may for example be the database 19 described in the second exemplary embodiment.
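Step S31 can be sketched as follows. The description does not say how the database 19 is organized, so this example simply assumes an in-memory SQLite table named `hyperparameter_range`. The function names, the table layout, and the run values are all hypothetical.

```python
import sqlite3


def hyperparameter_ranges(runs):
    """Return {name: (minimum, maximum)} over all identified runs."""
    ranges = {}
    for run in runs:
        for name, value in run.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def store_ranges(connection, ranges):
    """Persist the ranges in a (hypothetical) hyperparameter_range table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS hyperparameter_range "
        "(name TEXT PRIMARY KEY, min_value REAL, max_value REAL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO hyperparameter_range VALUES (?, ?, ?)",
        [(name, low, high) for name, (low, high) in ranges.items()],
    )
    connection.commit()


# Illustrative hyperparameter values of three identified runs.
identified_runs = [
    {"learning_rate": 0.01, "epochs": 500, "batch_size": 2000},
    {"learning_rate": 0.05, "epochs": 300, "batch_size": 1500},
    {"learning_rate": 0.09, "epochs": 100, "batch_size": 1000},
]
ranges = hyperparameter_ranges(identified_runs)
connection = sqlite3.connect(":memory:")
store_ranges(connection, ranges)
print(connection.execute(
    "SELECT name, min_value, max_value FROM hyperparameter_range ORDER BY name"
).fetchall())
```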
  • one or more processors (eg, the generation unit 15) generate a parallel coordinate graph drawn using the maximum value, minimum value, and priority of each hyperparameter.
  • a parallel coordinate graph is a graph showing a numerical range (coordinate range) of one hyperparameter on a one-dimensional axis, and is a graph in which a plurality of hyperparameter axes are arranged in parallel.
  • a parallel coordinate graph can be generated using known drawing software.
  • In step S33, one or more processors (for example, the generation unit 15) generate drawing data by extracting a portion of the input data of the identified machine learning. By viewing part of the input data, the user can confirm whether the type of the input data, its numerical values, and so on are similar to those of the target data.
  • Step S33 is not essential. When it is executed, the input data of the identified machine learning is preferably recorded in the database 19 in advance.
  • step S34 various output data are generated based on the user's designation.
  • the generation unit 15 may generate output data in which the axes of the hyperparameters are arranged in a predetermined order or arranged in an order specified by the user.
  • the predetermined order is, for example, the order of priority, the order of the number of times the parameter values have been changed, or the like.
  • the order of priority may be set as a default in the absence of user designation.
  • the output data includes identification information of hyperparameters and their value ranges.
  • output data may be generated by connecting the coordinates of the maximum value and the minimum value of the range displayed on each axis selected by the user with a line. Also, as a default, output data may be generated by connecting the coordinates of the maximum and minimum values of the range displayed on each axis with a line.
  • the coordinates of the maximum and minimum values can be obtained using known coordinate calculation software.
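The axis ordering and the two range lines can be sketched as below. The sketch assumes that each axis has display bounds (`axis_bounds`) wider than the value range and that coordinates are expressed as (axis index, normalized height). These conventions, the function names, and the numbers are illustrative assumptions, not details from the description.

```python
def order_axes(priorities, user_order=None):
    """Use the user-specified order if given; otherwise default to priority order."""
    if user_order is not None:
        return list(user_order)
    return sorted(priorities, key=priorities.get)


def range_polylines(axis_names, value_ranges, axis_bounds):
    """Return the (x, y) points of the line through the maxima and the line through the minima."""
    upper, lower = [], []
    for x, name in enumerate(axis_names):
        axis_low, axis_high = axis_bounds[name]
        low, high = value_ranges[name]
        scale = axis_high - axis_low
        # y is the position along the axis, normalized to the 0..1 interval.
        lower.append((x, (low - axis_low) / scale))
        upper.append((x, (high - axis_low) / scale))
    return upper, lower


# Hypothetical priorities, value ranges, and axis display bounds.
priorities = {"epochs": 2, "learning_rate": 1, "batch_size": 3}
value_ranges = {"learning_rate": (0.01, 0.09), "epochs": (100, 500), "batch_size": (1000, 2000)}
axis_bounds = {"learning_rate": (0.0, 0.1), "epochs": (0, 1000), "batch_size": (0, 4000)}

axes = order_axes(priorities)
upper, lower = range_polylines(axes, value_ranges, axis_bounds)
print(axes)
print(upper)
print(lower)
```

Passing `user_order` covers the case in which the user specifies the arrangement, while omitting it falls back to the priority order described as the default.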
  • FIG. 11 is an example of a parallel coordinate graph (output data) in which the hyperparameter axes are arranged in order of priority.
  • The output data shown in FIG. 11 includes hyperparameter identification information and hyperparameter value ranges arranged in order of priority. Specifically, the numerical range of each hyperparameter is displayed on a one-dimensional vertical axis together with its identification information, and the vertical axes of the hyperparameters are arranged from the left in descending order of priority. The hyperparameters shown include window_size (the size used for extracting partial data) and lstm_hidden_dim (the number of hidden units in an LSTM (long short-term memory) architecture).
  • Note that this exemplary embodiment describes output data that includes window_size and lstm_hidden_dim.
  • the range of maximum and minimum values are indicated by rectangles along with their numerical values.
  • the coordinates of the maximum values and the coordinates of the minimum values of adjacent hyperparameters are connected by lines.
  • FIG. 12 is an example in which part of the specified machine learning input data is included in the output data.
  • the user can confirm whether or not the type of the input data, the numerical value of the input data, and the like are similar to the target data.
  • the generation unit 15 may generate output data in which the display mode of the line connecting the coordinates of the maximum value or the minimum value is changed according to the value.
  • the generation unit 15 may generate output data that displays a combination of values of a plurality of parameters by changing the display mode according to the value of the combination.
  • FIG. 13 shows, as an example, output data in which the thickness of lines indicating combinations of a plurality of hyperparameters is changed according to value.
  • the value is, as an example, the magnitude of the effect of reducing the loss value. That is, the generation unit 15 determines that a combination of hyperparameter numerical values having a relatively large effect of reducing the loss value is more valuable than a combination having a relatively small effect of reducing the loss value. Then, the generation unit 15 generates output data that displays combinations of numerical values of hyperparameters having relatively high value with thicker lines.
  • the aspect to be changed according to the value is not limited to the thickness of the line.
  • the line color may be changed, and the line type may be changed.
  • the relative value may be indicated by a very thick line, a thick line, and a thin line in the order of large, medium, and small.
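A minimal sketch of mapping value to line thickness follows. It assumes that the value of a combination is its loss reduction relative to a baseline loss and that the three tiers are split at one third and two thirds of the best value. The tier thresholds, the line widths, and the sample losses are all hypothetical choices.

```python
# Hypothetical line widths for the large, medium, and small tiers.
LINE_WIDTHS = {"large": 4.0, "medium": 2.5, "small": 1.0}


def value_tier(value, best_value):
    """Classify a value as large, medium, or small relative to the best value."""
    if best_value <= 0:
        return "small"
    ratio = value / best_value
    if ratio >= 2 / 3:
        return "large"
    if ratio >= 1 / 3:
        return "medium"
    return "small"


def line_widths(combination_losses, baseline_loss):
    """Map each hyperparameter combination to a line width by its loss reduction."""
    values = {combo: baseline_loss - loss for combo, loss in combination_losses.items()}
    best = max(values.values())
    return {combo: LINE_WIDTHS[value_tier(value, best)] for combo, value in values.items()}


# Keys are (epochs, batch_size) combinations; the losses are illustrative.
combination_losses = {(500, 2000): 0.6, (300, 1500): 1.2, (100, 1000): 1.9}
print(line_widths(combination_losses, baseline_loss=2.0))
```

The same tier label could equally select a line color or line type instead of a width, in line with the alternatives mentioned above.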
  • FIG. 13 shows output data when the user selects the number of epochs and batch size.
  • The combination of 500 epochs and a batch size of 2000 is displayed with a thick line. In other words, this combination of numerical values is highly effective in reducing the loss value.
  • When the range is indicated only by connecting the maximum values and the minimum values with lines, the user cannot determine which combinations within that range have a large effect of reducing the loss value. By contrast, with the display shown in FIG. 13, the user can determine how to combine multiple hyperparameters.
  • Fig. 14 is an example of output data when the user selects the number of epochs, batch size, and window_size.
  • the combination in which the number of epochs is 500, the batch size is 2000, and the window_size is 10 is displayed in bold. In other words, this numerical combination of these hyperparameters is highly effective in reducing the loss value.
  • FIG. 15 is an example of output data when the user selects the number of epochs and window_size, which are not adjacent to each other.
  • In this way, the user may select hyperparameters that are not adjacent.
  • In this case as well, high-value combinations (combinations that are highly effective in reducing the loss value) are connected by thick lines. Here, the combination of 500 epochs and a window_size of 10 is more valuable than the other combinations, so it is connected by a thick line in the output data.
  • the generation unit 15 may generate output data that displays the range of values of a plurality of parameters selected from among the plurality of parameters.
  • FIG. 16 is a modification of the output data shown in FIG. 15, and is an example of output data displaying a plurality of hyperparameters selected by the user and their numerical ranges. Furthermore, as shown in the figure, the generation unit 15 may generate output data in which combinations of relatively high values among the selected combinations of numerical values of the hyperparameters are connected by thick lines.
  • the expansion/non-expansion button shown in the upper left of the figure is a button for selecting whether or not to display only the hyperparameter selected by the user and the range of its numerical values.
  • the generation unit 15 may generate output data including the identification information of the hyperparameters arranged in the sequence specified by the user and the range of the values of the parameters.
  • FIG. 17 is an example of output data including hyperparameter identification information and parameter value ranges when the user specifies the order of priority. As shown in the figure, a selection button is displayed on the upper left of the display screen with which the user can specify conditions for the sorting order of the hyperparameters. Note that FIG. 17 shows an example in which the user selects "priority order", selects the number of epochs and window_size, and highlights them with a thick line.
  • the generation unit 15 generates output data in which axes (hyperparameters) having a predetermined relationship specified by the user are aligned adjacent to each other, as shown in step S344.
  • The predetermined relationship is, for example, a proportional or inversely proportional relationship. In other words, when the relatively high-value combinations of numerical values of two hyperparameters have the specified relationship, such as proportionality or inverse proportionality, the generation unit 15 may generate output data in which those hyperparameters are arranged adjacent to each other.
  • FIG. 18 is an example of output data that allows the user to specify a predetermined relationship between combinations of highly valuable hyperparameter values.
  • a selection button is displayed on the upper left of the display screen with which the user can specify the relationship that the combination of numerical values of the hyperparameters has.
  • The figure shows the state in which "proportional" is selected, but "inversely proportional" or the like can also be selected.
  • "Inversely proportional" refers to a relationship in which, within a combination, one value is relatively large and the other is relatively small.
  • The thick line indicates that the relatively high-value combinations of the number of epochs and the batch size have a proportional relationship (a combination of a small number of epochs and a small batch size has relatively high value).
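One simple way to test for such a relationship is to look at the sign of the covariance between the two hyperparameter values across the high-value combinations. This heuristic is an assumption, since the description does not state how the relationship is determined. The helper that moves one axis next to another, and the sample values, are likewise hypothetical.

```python
def relationship(pairs):
    """Classify (x, y) value pairs by the sign of their covariance."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if covariance > 0:
        return "proportional"
    if covariance < 0:
        return "inversely proportional"
    return "none"


def place_adjacent(axis_names, first, second):
    """Return a new axis order in which `second` directly follows `first`."""
    reordered = [name for name in axis_names if name != second]
    reordered.insert(reordered.index(first) + 1, second)
    return reordered


# High-value (epochs, batch_size) combinations; the values are illustrative.
high_value_pairs = [(100, 500), (200, 1000), (300, 1500)]
print(relationship(high_value_pairs))

axes = ["learning_rate", "epochs", "batch_size", "window_size"]
print(place_adjacent(axes, "epochs", "window_size"))
```

A covariance sign only captures a monotonic tendency. A stricter check, such as a correlation threshold, would be needed to avoid labeling weakly related axes.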
  • the output data generated by the generation unit 15 is generated in various forms.
  • the user can visually see what numerical ranges of which hyperparameters are effective in improving the performance of the model. Therefore, it is possible to support efficient model generation by machine learning.
  • Some or all of the functions of the information processing apparatuses 1 and 1A and the information processing system 2 may be realized by hardware such as an integrated circuit (IC chip), or may be realized by software.
  • the information processing device 1 and the like are implemented by, for example, a computer that executes instructions of a program that is software that implements each function.
  • An example of such a computer (hereinafter referred to as computer C) is shown in FIG.
  • Computer C comprises at least one processor C1 and at least one memory C2.
  • a program P for operating the computer C as the information processing apparatus 1 or the like is recorded in the memory C2.
  • the processor C1 reads the program P from the memory C2 and executes it, thereby realizing each function of the information processing apparatus 1 and the like.
  • As the processor C1, for example, a CPU (Central Processing Unit), GPU (Graphics Processing Unit), DSP (Digital Signal Processor), MPU (Micro Processing Unit), FPU (Floating point number Processing Unit), PPU (Physics Processing Unit), a microcontroller, or a combination thereof can be used.
  • As the memory C2, for example, a flash memory, an HDD (Hard Disk Drive), an SSD (Solid State Drive), or a combination thereof can be used.
  • the computer C may further include a RAM (Random Access Memory) for expanding the program P during execution and temporarily storing various data.
  • Computer C may further include a communication interface for sending and receiving data to and from other devices.
  • Computer C may further include an input/output interface for connecting input/output devices such as a keyboard, mouse, display, and printer.
  • The program P can be recorded on a non-transitory tangible recording medium M that is readable by the computer C.
  • As the recording medium M, for example, a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used.
  • the computer C can acquire the program P via such a recording medium M.
  • the program P can be transmitted via a transmission medium.
  • As the transmission medium, for example, a communication network or broadcast waves can be used.
  • Computer C can also obtain program P via such a transmission medium.
  • Appendix 2 The information processing apparatus according to appendix 1, further comprising extraction means for extracting, from among the one or more executed machine learnings identified by the identification means, one or more machine learnings associated with parameters satisfying a predetermined condition, wherein the priority calculation means calculates the priority of each of a plurality of parameters related to the one or more machine learnings extracted by the extraction means and assigns the calculated priority to each parameter.
  • Because machine learnings associated with parameters that satisfy a predetermined condition are extracted from among the identified executed machine learnings, model generation can be supported more efficiently.
  • the priority calculation means uses the loss parameter output by each of the plurality of machine learning extracted by the extraction means and the parameter other than the loss parameter to determine the loss parameter among the parameters other than the loss parameter. 3.
  • parameters are prioritized according to their degree of contribution to the improvement of loss parameters, so it is possible to more efficiently support the efficiency of model generation.
  • Appendix 6 The information processing apparatus according to any one of appendices 1 to 5, wherein the output data includes identification information of the parameters arranged in the priority order and ranges of values of the parameters.
  • Appendix 7 The information processing apparatus according to any one of appendices 1 to 5, wherein the output data includes the identification information of the parameters arranged in an arrangement order specified by a user and the range of values of the parameters.
  • the user can recognize the effects of the parameters from multiple viewpoints, so it is possible to more efficiently support the efficiency of model generation.
  • Appendix 8 The information processing apparatus according to appendix 6 or 7, wherein the output data is data for displaying a range of values of a plurality of parameters selected from the plurality of parameters.
  • the user can visually recognize the relationship between multiple parameters, so it is possible to more efficiently support the efficiency of model generation.
  • (Appendix 9) The information processing apparatus according to any one of appendices 6 to 8, wherein the output data is data for displaying combinations of the values of the plurality of parameters with a display mode that changes according to the value of each combination.
  • the user can visually recognize a combination of highly valuable parameter values, so that it is possible to more efficiently support the efficiency of model generation.
  • Appendix 10 The information processing apparatus according to any one of appendices 6 to 9, further comprising display means for displaying the output data.
  • An information processing system comprising: a generation means for generating output data including the priority in an identifiable manner.
  • An information processing method comprising, by one or more processors: acquiring target data, which is data related to target machine learning; acquiring executed data, which is data related to a plurality of executed machine learnings; identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; calculating the priority of each of a plurality of parameters related to the identified one or more executed machine learnings and assigning the calculated priority to each parameter; and generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priorities in an identifiable manner.
  • A program that causes a computer to execute: a first acquisition process of acquiring target data, which is data related to target machine learning; a second acquisition process of acquiring executed data, which is data related to a plurality of executed machine learnings; an identification process of identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; a priority calculation process of calculating the priority of each of a plurality of parameters related to the identified one or more executed machine learnings and assigning the calculated priority to each parameter; and a generation process of generating output data that includes identification information of the plurality of parameters and ranges of values of the plurality of parameters, and that includes the priorities in an identifiable manner.
  • An information processing apparatus comprising at least one processor, wherein the processor executes: a first acquisition process of acquiring target data, which is data related to target machine learning; a second acquisition process of acquiring executed data, which is data related to a plurality of executed machine learnings; an identification process of identifying, using the target data and the executed data, one or more executed machine learnings similar to the target machine learning; an assignment process of calculating the priority of each of a plurality of parameters related to the identified one or more executed machine learnings and assigning the calculated priority to each parameter; and a generation process of generating output data that includes the priorities in an identifiable manner.
  • The information processing apparatus may further include a memory, and the memory may store a program for causing the processor to execute the first acquisition process, the second acquisition process, the identification process, the assignment process, and the generation process. This program may also be recorded on a computer-readable non-transitory tangible recording medium.
  • Reference Signs List
  • 1, 1A Information processing device
  • 2 Information processing system
  • 11 First acquisition unit
  • 12 Second acquisition unit
  • 13 Identification unit
  • 14 Priority calculation unit
  • 15 Generation unit
  • 16 Extraction unit
  • 17 Memory
  • 18 Communication unit
  • 19 Database
  • 20 Display unit
  • 30 Analysis result database

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
PCT/JP2022/003310 2022-01-28 2022-01-28 情報処理装置、情報処理方法及びプログラム Ceased WO2023144998A1 (ja)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2023576515A JP7750311B2 (ja) 2022-01-28 2022-01-28 情報処理装置、情報処理システム、情報処理方法及びプログラム
PCT/JP2022/003310 WO2023144998A1 (ja) 2022-01-28 2022-01-28 情報処理装置、情報処理方法及びプログラム

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/003310 WO2023144998A1 (ja) 2022-01-28 2022-01-28 情報処理装置、情報処理方法及びプログラム

Publications (1)

Publication Number Publication Date
WO2023144998A1 true WO2023144998A1 (ja) 2023-08-03

Family

ID=87470924

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/003310 Ceased WO2023144998A1 (ja) 2022-01-28 2022-01-28 情報処理装置、情報処理方法及びプログラム

Country Status (2)

Country Link
JP (1) JP7750311B2
WO (1) WO2023144998A1

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2025062563A1 (ja) * 2023-09-21 2025-03-27 日本電信電話株式会社 処理装置、処理方法およびプログラム
CN120296717A (zh) * 2025-04-24 2025-07-11 智企安信息技术(常州)有限公司 基于机器学习的工业生产区域数字化数据传输系统及方法

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021533450A (ja) * 2018-08-15 2021-12-02 セールスフォース ドット コム インコーポレイティッド 機械学習のためのハイパーパラメータの識別および適用

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
UENO, YOSUKE: "Investigation of Learning Model Selection Method for Efficient Transfer Learning in Image Recognition", IEICE TECHNICAL REPORT, vol. 117, no. 278, 7 November 2017 (2017-11-07), pages 13 - 18, XP009548118 *

Also Published As

Publication number Publication date
JPWO2023144998A1 2023-08-03
JP7750311B2 (ja) 2025-10-07

Similar Documents

Publication Publication Date Title
US11216741B2 (en) Analysis apparatus, analysis method, and non-transitory computer readable medium
US10324453B2 (en) Space for materials selection
US10515083B2 (en) Event analysis apparatus, an event analysis system, an event analysis method, and an event analysis program
US20090297044A1 (en) Image processing apparatus, method of image processing, processing apparatus, method of processing, and recording medium
JP6187977B2 (ja) 解析装置、解析方法及びプログラム
WO2023144998A1 (ja) 情報処理装置、情報処理方法及びプログラム
US10884841B2 (en) Information processing device, information processing method, and recording medium
RU2006117318A (ru) Устройство анализа характеристик документа для документа, который должен исследоваться
JP2018528511A (ja) 生産システムにおける出力効率の最適化
JPWO2021245733A5 (ja) 脳画像解析装置、制御方法、及びプログラム
US20210390623A1 (en) Data analysis method and data analysis device
JP7655328B2 (ja) 情報処理システム、情報処理方法、及びコンピュータプログラム
JP5687122B2 (ja) ソフトウェア評価装置、ソフトウェア評価方法およびシステム評価装置
WO2020039790A1 (ja) 情報処理装置、情報処理方法及びプログラム
CN113792205A (zh) 一种确定模型特征分箱方案的方法及装置
CN114037137A (zh) 对象预测方法、系统及介质
JP7380699B2 (ja) 分析装置及びプログラム
JP2020187511A (ja) 情報処理プログラム、情報処理方法、及び情報処理装置
JP2016024713A (ja) パラメータ選定方法、パラメータ選定プログラム及びパラメータ選定装置
WO2016006101A1 (ja) シミュレーションシステム、及び、シミュレーション方法
JP7345744B2 (ja) データ処理装置
JP4918868B2 (ja) 入力値選定プログラム、入力値選定方法および入力値選定装置
JP7579765B2 (ja) 解析装置、解析方法及び解析プログラム
JP2022171165A (ja) 分析プログラム、分析方法及び分析装置
JP2021131578A (ja) 情報表示システム、情報表示方法及び情報表示プログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923862

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023576515

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22923862

Country of ref document: EP

Kind code of ref document: A1