WO2020250843A1 - Hyperparameter tuning method, program trial system, and computer program - Google Patents

Info

Publication number
WO2020250843A1
WO2020250843A1 PCT/JP2020/022428 JP2020022428W WO2020250843A1 WO 2020250843 A1 WO2020250843 A1 WO 2020250843A1 JP 2020022428 W JP2020022428 W JP 2020022428W WO 2020250843 A1 WO2020250843 A1 WO 2020250843A1
Authority
WO
WIPO (PCT)
Prior art keywords
program
hyperparameter
parameter
value
description data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/022428
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
正太郎 佐野
利彦 柳瀬
健 太田
拓哉 秋葉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Preferred Networks Inc
Original Assignee
Preferred Networks Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Preferred Networks Inc
Priority to DE112020002822.4T (DE112020002822T5)
Priority to JP2021526074A (JP7303299B2)
Publication of WO2020250843A1
Priority to US17/643,661 (US12430147B2)
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4494 Execution paradigms, e.g. implementations of programming paradigms data driven
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn

Definitions

  • This disclosure relates to hyperparameter tuning methods, program trial systems and computer programs.
  • hyperparameters such as learning rate, batch size, number of learning iterations, number of neural network layers, and number of channels. Specifically, the values of various hyperparameters are set, and the machine learning model is trained under the set hyperparameter values.
  • hyperparameters of deep learning models are either adjusted manually by the user or adjusted by executing a hyperparameter setting program that the user has written in a particular programming language such as Python.
  • the conventional hyperparameter setting program (for example, refer to https://github.com/Epistimio/orion) requires code to be written directly in the target program (machine learning model) for which hyperparameters are to be set, which was not always convenient.
  • the present disclosure provides a method for tuning hyperparameters with improved convenience.
  • one aspect of the present disclosure relates to a hyperparameter tuning method in which one or more processors acquire a program execution instruction including parameter description data described in a command line interface, and the one or more processors set a hyperparameter value of a program to be optimized according to the parameter description data and acquire an evaluation value of the hyperparameter value.
  • a hyperparameter setting device for a trial target model such as a deep learning model
  • a user inputs a program execution instruction including parameter description data for setting hyperparameters for a user program to be tried in a command line interface such as a command prompt or console displayed on a user terminal
  • the hyper parameter setting device sets the hyper parameter value in the user program according to the input parameter description data.
  • a user transmits a program execution instruction including parameter description data of a hyper parameter from a command line interface.
  • the hyperparameter setting device sets the hyperparameter value of the user program based on the program execution instruction including the acquired parameter description data.
  • the user can search for hyperparameters without writing a program or file in the user program for setting hyperparameters. Further, for example, it is possible to search for hyperparameters suitable for the described user program to be tried regardless of the type of programming language.
  • the hyperparameter refers to a parameter that controls the behavior of the machine learning algorithm.
  • for example, when the machine learning model is a deep learning model, it refers to parameters such as the learning rate, batch size, and number of learning iterations that cannot be optimized by the gradient method.
  • tuning of hyperparameters includes evaluation of hyperparameter values, improvement (selecting a more preferable value), optimization, and the like.
  • the evaluation of the hyperparameter value means that the user program is executed using the value of a certain hyperparameter and the evaluation value is acquired.
  • the program trial system 10 has a hyper parameter setting device 100, a program trial device 200, and a database 250, and is connected to the user terminal 50.
  • for example, when the program trial system 10 acquires a program execution instruction including parameter description data together with the user program to be tried (for example, a deep learning model) from the user terminal 50, it tries the user program according to the acquired parameter description data and program execution instruction.
  • the program trial system 10 determines the value of the next hyper parameter based on the trial result, and repeats the trial until the end condition is satisfied.
  • the program trial system 10 determines the optimum hyperparameter value for the user program based on the trial results based on the values of various hyperparameters. For example, when the user program is a deep learning model, the deep learning model trained by the optimum hyperparameter values is used for subsequent classification, prediction, and the like.
  • “optimization” means to approach the optimum state, and is not limited to strictly the optimum value or state. Similarly, “optimal” is not limited to being strictly optimal.
  • the program trial system 10 is composed of three physically separated devices: a hyperparameter setting device 100, a program trial device 200, and a database 250. However, the present disclosure is not limited to this, and the system may be realized by a single device that provides the functions of the hyperparameter setting device 100, the program trial device 200, and the database 250. Further, the program trial system 10 according to the present disclosure may be realized by one or more processors and one or more memories.
  • the user terminal 50 is, for example, a calculation device such as a personal computer or a tablet.
  • the user uses the user terminal 50 to create and store a user program that describes a trial target model, such as a machine learning model or deep learning model to be trained, and transmits the user program to the hyperparameter setting device 100 of the program trial system 10.
  • the user terminal 50 provides the hyper parameter setting device 100 with a program execution instruction including parameter description data for designating the hyper parameters to be set in the user program.
  • the user inputs a command line for starting the hyperparameter setting device 100 on a command line interface such as a command prompt or console displayed on the user terminal 50, and enters, as options of the command line, a program execution instruction containing parameter description data for setting various hyperparameters.
  • the hyperparameter setting program is started by inputting "optunash run -n 100 ---" in the command line interface on the user terminal 50.
  • this input specifies the execution of "optunash" and sets the number of trials of the user program to 100.
  • the user inputs a program execution instruction including parameter description data.
  • the user inputs the user program "nn-train” to be tried.
  • the user specifies, for example, the learning rate "--learning_rate" and the number of layers "--num_layers" as the parameter description data.
  • the parameter description data may define the hyperparameter setting range by the distribution identifier and the range specified value.
  • "uniform" is a distribution identifier indicating that the distribution is uniform, and the range specification values "0.01" and "0.1" indicate a setting range from "0.01" to "0.1".
  • "--learning_rate ⁇ uniform, 0.01, 0.1 ⁇" specifies that the learning rate is selected by a uniform distribution in the setting range from "0.01" to "0.1".
  • the distribution identifier is not limited to these. For example, the distribution identifier "loguniform" indicating a logarithmic uniform distribution, the distribution identifier "discrete_uniform" indicating a uniform distribution selected in the specified quantization step, the distribution identifier "categorical" specifying the elements to be selected from, and the like may be used.
  • the parameter description data that defines the setting range of the hyperparameter by the distribution identifier and the range specification values is input via the command line interface on the user terminal 50 and is transmitted to the hyperparameter setting device 100 by, for example, pressing the return key.
  • the hyperparameter setting device 100 has an interface unit 110 and a hyperparameter setting unit 120.
  • the interface unit 110 acquires a program execution instruction including parameter description data for setting hyper parameters of the program described in the command line interface. As shown in FIG. 1, for example, a program execution instruction including a user program and parameter description data transmitted from the user terminal 50 to the hyper parameter setting device 100 is received by the interface unit 110. The interface unit 110 passes the received user program and the program execution instruction including the parameter description data to the program trial device 200 and the hyperparameter setting unit 120, respectively.
  • when the interface unit 110 acquires a program execution instruction including parameter description data of various hyperparameters input to the command line interface on the user terminal 50, the interface unit 110 converts the parameter description data in the acquired program execution instruction into a data format that can be processed by the hyperparameter setting unit 120, and passes the converted parameter description data to the hyperparameter setting unit 120.
  • the interface unit 110 parses the program execution instruction including the acquired parameter description data and extracts a specific pattern part such as " ⁇ character string ⁇ ".
  • the interface unit 110 refers to a correspondence table held in advance, identifies the parameter request corresponding to the extracted distribution identifier, and sets the extracted range specification values as arguments of that parameter request.
  • the hyperparameter setting unit 120 sets the hyperparameter value in the program according to the parameter description data. Specifically, as shown in FIG. 1, when the parameter description data is acquired from the interface unit 110, the hyperparameter setting unit 120 determines the value of the hyperparameter for the user program according to the acquired parameter description data and returns the determined value to the interface unit 110.
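The parsing and table lookup described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the delimiter characters did not survive in this text, so the `{{ ... }}` delimiter is an assumption, as are the names `ParameterRequest`, `CORRESPONDENCE`, and `parse_instruction`, the placeholder request names in the table, and the rule that the hyperparameter name is taken from the preceding option. Only numeric range values are handled.

```python
import re
from typing import NamedTuple

# Assumed delimiter: "{{ ... }}" is used here purely for illustration.
PATTERN = re.compile(r"\{\{(.*?)\}\}")

# Correspondence table from distribution identifier to parameter request.
# The request names are placeholders, not a confirmed API.
CORRESPONDENCE = {
    "uniform": "suggest_uniform",
    "loguniform": "suggest_loguniform",
    "int": "suggest_int",
}


class ParameterRequest(NamedTuple):
    name: str      # hyperparameter name, e.g. "learning_rate"
    request: str   # request looked up in the correspondence table
    args: tuple    # range specification values


def parse_instruction(tokens):
    """Extract parameter requests from an argv-style program execution instruction."""
    requests = []
    for i, token in enumerate(tokens):
        match = PATTERN.fullmatch(token)
        if match is None:
            continue  # ordinary token, passed through unchanged
        fields = [field.strip() for field in match.group(1).split(",")]
        identifier, raw_args = fields[0], fields[1:]
        if identifier not in CORRESPONDENCE:
            raise ValueError(f"unknown distribution identifier: {identifier}")
        # Assumption: the hyperparameter name is the preceding "--option" token.
        name = tokens[i - 1].lstrip("-") if i > 0 else f"param{i}"
        args = tuple(float(a) for a in raw_args)
        requests.append(ParameterRequest(name, CORRESPONDENCE[identifier], args))
    return requests


example = parse_instruction(["nn-train", "--learning_rate", "{{uniform, 0.01, 0.1}}"])
```

Keeping the table as plain data means a new distribution identifier can be supported by adding one entry rather than changing the parser.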
  • the hyperparameter value may be selected according to the distribution identifier and range specification value of the parameter request acquired from the interface unit 110.
  • for example, the hyperparameter setting unit 120 may set hyperparameter values randomly according to a uniform distribution within the range of "0.01" to "0.1". Further, the hyperparameter values may be set by using Bayesian optimization, Gaussian assumptions, software for hyperparameter optimization, or the like. Further, the hyperparameter setting unit 120 may store the set hyperparameter values in the database 250.
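As a rough illustration of selecting a value from a distribution identifier and its range specification values, the sketch below uses plain random sampling. The function name `sample` and the argument layout for each identifier (for example `(low, high, step)` for "discrete_uniform") are assumptions made for this example. Bayesian optimization and the other methods mentioned above are not shown.

```python
import math
import random


def sample(identifier, args, rng):
    """Draw one hyperparameter value; the argument layouts are assumed, not specified."""
    if identifier == "uniform":
        low, high = args
        return rng.uniform(low, high)
    if identifier == "loguniform":
        low, high = args
        # Uniform in log space, then mapped back.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    if identifier == "discrete_uniform":
        low, high, step = args
        n_steps = int(round((high - low) / step))
        return low + step * rng.randint(0, n_steps)
    if identifier == "int":
        low, high = args
        return rng.randint(int(low), int(high))
    if identifier == "categorical":
        return rng.choice(list(args))
    raise ValueError(f"unknown distribution identifier: {identifier}")


rng = random.Random(0)  # seeded only so that runs are repeatable
learning_rate = sample("uniform", (0.01, 0.1), rng)
```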
  • the interface unit 110 sends the acquired hyper-parameter value to the program trial device 200 and executes the user program according to the hyper-parameter value.
  • the interface unit 110 instructs the program trial device 200 to train the deep learning model according to the value of the hyper parameter set by the hyper parameter setting unit 120.
  • the program trial device 200 trains the deep learning model under the instructed hyperparameter value by using, for example, training data prepared in advance, and returns to the interface unit 110 a trial result including evaluation values such as the accuracy, execution time, and degree of progress of the trained deep learning model.
  • the trial result may be obtained from standard output (stdout) or as a file.
  • the interface unit 110 acquires evaluation values such as the accuracy, execution time, and progress of the trained deep learning model from the trial result acquired as standard output or a file.
  • the evaluation value may be a value obtained by reading the last line of the input string and converting it to floating point.
  • the extraction of the evaluation value is not limited to this, and instead of reading the last line, it may be acquired by using a pattern match such as a regular expression. Alternatively, the evaluation value pattern may be specified by an argument.
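Both extraction strategies can be sketched in a few lines. The function names and the sample output format below are hypothetical. Skipping trailing blank lines before reading the last line is a choice made for this sketch, not something stated above.

```python
import re


def evaluation_from_last_line(output: str) -> float:
    """Read the last non-empty line of the trial output and convert it to float."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("trial output is empty")
    return float(lines[-1])


def evaluation_from_pattern(output: str, pattern: str) -> float:
    """Extract the evaluation value with a regular expression with one capture group."""
    match = re.search(pattern, output)
    if match is None:
        raise ValueError("evaluation value pattern not found")
    return float(match.group(1))


# Hypothetical trial output.
stdout = "epoch 1 done\nepoch 2 done\n0.93\n"
last_line_value = evaluation_from_last_line(stdout)
pattern_value = evaluation_from_pattern("loss=0.25 acc=0.9", r"acc=([0-9.]+)")
```

Passing the pattern as an argument corresponds to letting the user specify the evaluation value pattern when invoking the tool.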
  • the interface unit 110 passes the acquired trial result to the hyperparameter setting unit 120.
  • the hyperparameter setting unit 120 stores the acquired trial result in the database 250 in association with the applied hyperparameter value. After that, the hyper parameter setting unit 120 sets the next value of the hyper parameter according to a uniform distribution within the setting range of “0.01” to “0.1”, transmits it to the interface unit 110, and resets it.
  • the user program is executed in the program trial device 200 for a specified number of trials as described above under the value of the hyperparameter.
  • the program trial device 200 is, for example, a calculation device such as a server that executes a user program to be tried, such as a deep learning model.
  • the program trial device 200 is shown as a calculation device different from the hyperparameter setting device 100.
  • the present disclosure is not limited to this, and for example, the hyperparameter setting device 100 and the program trial device 200 may be realized by the same computing device.
  • the database 250 stores the trial results of the user program to be tried under the hyperparameter values set by the hyperparameter setting unit 120. For example, after executing the user program with the specified number of trials, the user acquires the trial results including the evaluation values stored in the database 250, and determines the optimum hyperparameter value from the trial results with the values of various hyperparameters. can do.
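A minimal sketch of storing trial results together with the applied hyperparameter values, and of reading back the best one, is shown below. The choice of SQLite, the table layout, and the function names are assumptions for illustration only. The source does not say how the database 250 is implemented.

```python
import json
import sqlite3


def open_database(path=":memory:"):
    """Open (or create) a trial database; the schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trials ("
        "id INTEGER PRIMARY KEY, params TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def store_trial(conn, params, value):
    """Store an evaluation value in association with the applied hyperparameter values."""
    conn.execute(
        "INSERT INTO trials (params, value) VALUES (?, ?)",
        (json.dumps(params), value),
    )
    conn.commit()


def best_trial(conn, maximize=True):
    """Return (params, value) of the best stored trial."""
    order = "DESC" if maximize else "ASC"
    row = conn.execute(
        f"SELECT params, value FROM trials ORDER BY value {order} LIMIT 1"
    ).fetchone()
    if row is None:
        raise LookupError("no trials stored")
    return json.loads(row[0]), row[1]


conn = open_database()
store_trial(conn, {"learning_rate": 0.03}, 0.91)
store_trial(conn, {"learning_rate": 0.08}, 0.87)
best_params, best_value = best_trial(conn)
```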
  • FIG. 3 is a sequence diagram showing a hyperparameter setting process according to an embodiment of the present disclosure.
  • the hyper parameter setting process may be started by starting the hyper parameter setting application provided to the user terminal 50 by the hyper parameter setting device 100.
  • in step S101, the user terminal 50 provides the interface unit 110 with a user program to be tried (for example, a deep learning model to be trained) and a program execution instruction including parameter description data of the hyperparameters applied to the trial of the user program.
  • the user uses the user terminal 50 to transmit the user program to be tried to the hyperparameter setting device 100, and also transmits, via the command line interface of the user terminal 50, a program execution instruction including parameter description data that specifies the hyperparameters to be used in the trial of the user program.
  • step S102 the interface unit 110 generates a parameter request based on the parameter description data acquired from the user terminal 50, and transmits the generated parameter request to the hyper parameter setting unit 120.
  • in step S103, the hyperparameter setting unit 120 determines the value of the hyperparameter for the trial of the user program based on the acquired parameter request. For example, when the acquired parameter request is "trial.suggest_uniform('learning_rate', 0.01, 0.1)", the hyperparameter setting unit 120 selects a learning rate value according to a uniform distribution within the range from "0.01" to "0.1".
  • step S104 the hyperparameter setting unit 120 returns the determined hyperparameter value to the interface unit 110.
  • in step S105, the interface unit 110 generates a program execution instruction in which the parameter description data is replaced with the set hyperparameter value, sends the user program to be tried and the generated program execution instruction to the program trial device 200, and instructs the program trial device 200 to try the user program under the hyperparameter value.
  • step S106 the program trial device 200 executes the user program under the acquired hyperparameter values according to the trial instruction from the interface unit 110.
  • the program trial device 200 trains the deep learning model under the hyperparameter value set by the hyperparameter setting unit 120, and evaluates the accuracy, execution time, progress, etc. of the deep learning model trained under that value.
  • step S107 the program trial device 200 returns a trial result including evaluation results such as accuracy, execution time, and progress to the interface unit 110.
  • step S108 the interface unit 110 extracts the evaluation result from the acquired trial result and provides the extracted evaluation result to the hyperparameter setting unit 120.
  • step S109 the hyper parameter setting unit 120 stores the acquired trial result in the database 250, and sets the value of the hyper parameter to be applied to the next trial when the specified number of trials has not been reached. After that, steps S104 to S109 are repeated until the specified number of trials is reached.
  • step S110 the interface unit 110 provides the user terminal 50 with the trial results acquired so far as the final result.
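The sequence from determining a value through returning the final results can be condensed into a single loop, sketched below under several assumptions. `USER_PROGRAM` is a stand-in for a real user program such as "nn-train", and the score it prints is made up. The names `run_trials` and `USER_PROGRAM` are hypothetical. An in-memory list stands in for the database 250, and the value is sampled uniformly rather than by any more elaborate method.

```python
import random
import subprocess
import sys

# Stand-in user program: reads the learning rate from its arguments and
# prints a made-up score on its last output line.
USER_PROGRAM = [
    sys.executable,
    "-c",
    "import sys; lr = float(sys.argv[2]); print(1.0 - abs(lr - 0.05))",
]


def run_trials(n_trials, low, high, seed=0):
    rng = random.Random(seed)
    results = []  # stands in for the database 250
    for _ in range(n_trials):
        # S103-S104: determine the hyperparameter value.
        lr = rng.uniform(low, high)
        # S105: replace the parameter description data with the concrete value.
        command = USER_PROGRAM + ["--learning_rate", repr(lr)]
        # S106-S107: execute the user program and collect its standard output.
        completed = subprocess.run(command, capture_output=True, text=True, check=True)
        # S108: extract the evaluation value from the last output line.
        value = float(completed.stdout.strip().splitlines()[-1])
        # S109: store the result with the applied hyperparameter value.
        results.append((lr, value))
    # S110: report everything gathered so far as the final result.
    return results


results = run_trials(3, 0.01, 0.1)
```

Because the hyperparameter value reaches the user program only through its command line, the user program can be written in any language, which matches the convenience argument made earlier in the text.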
  • the hyperparameter value is set by the Define-by-run method. That is, when a plurality of types of hyperparameters are set, the values of other hyperparameters are set depending on the value of a hyperparameter that has already been set.
  • the user specifies the number of layers by the distribution identifier "int” and the setting range from “1” to “8".
  • the user specifies the number of hidden layers by the distribution identifier "int” and the setting range from “1” to "1024 / n_layers”. That is, the setting range of the number of hidden layers is set depending on the value of the number of layers "n_layers" that has already been set.
  • the value of the number of layers "n_layers" is determined first, following the order specified in the command line interface, and the pair of the hyperparameter "n_layers" and the determined value is saved.
  • the setting range of the number of hidden layers "n_hiden” is set.
  • the setting range of the number of hidden layers "n_1024" is determined. For example, in the specific example shown in FIG. 4, when “n_layers” is set to “2”, a pair of “n_layers” and “2” is saved and used to determine “n_hiden”. Further, when “n_layers” is set to "7”, a pair of "n_layers” and “7” is saved and used to determine "n_hiden”. That is, the value of the hyper parameter can be set by the Define-by-run method in which the value of a certain hyper parameter is set depending on the setting value of another hyper parameter.
  • n_layers is 2 and n_hiden is 16
  • the hyperparameter setting unit 120 determines and saves hyperparameter values in the order described in the acquired parameter description data, and determines the values of other hyperparameters depending on the hyperparameter values that have already been determined and saved.
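The order-dependent behaviour described above can be sketched as follows: each hyperparameter is determined in the order given, saved, and made available when the next range is evaluated. The helper names `evaluate` and `suggest_in_order`, the small arithmetic evaluator, and the truncation of bounds to integers are assumptions for this illustration. Only the identifiers "n_layers" and "n_hiden" and the range "1024 / n_layers" come from the text.

```python
import ast
import operator
import random

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expr, known):
    """Evaluate a small arithmetic expression that may refer to saved hyperparameters."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return known[node.id]  # value saved by an earlier suggestion
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr.strip(), mode="eval"))


def suggest_in_order(descriptions, rng):
    """Determine integer hyperparameters one by one, saving each (name, value) pair."""
    known = {}
    for name, low_expr, high_expr in descriptions:
        low = int(evaluate(low_expr, known))    # assumption: bounds are truncated
        high = int(evaluate(high_expr, known))
        known[name] = rng.randint(low, high)
    return known


descriptions = [
    ("n_layers", "1", "8"),
    ("n_hiden", "1", "1024 / n_layers"),  # range depends on the saved n_layers
]
values = suggest_in_order(descriptions, random.Random(0))
```

Parsing the range expression with `ast` instead of calling `eval` keeps the evaluator limited to numbers, saved names, and the four basic operators.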
  • the hyperparameter setting device according to another embodiment of the present disclosure will be described with reference to FIG.
  • the user program is executed for the number of trials specified by the user, but in this embodiment, the training of the user program is terminated at an early stage based on the intermediate evaluation of the trial results. That is, the trial of the user program is terminated based on the intermediate trial result of the user program tried under the set hyperparameter value.
  • FIG. 5 is a sequence diagram showing hyperparameter setting processing according to another embodiment of the present disclosure.
  • in step S201, the user terminal 50 provides the interface unit 110 with a user program to be tried (for example, a deep learning model to be trained) and parameter description data of hyperparameters applied to the trial of the user program.
  • step S202 the interface unit 110 generates a parameter request based on the parameter description data acquired from the user terminal 50, and transmits the generated parameter request to the hyper parameter setting unit 120.
  • step S203 the hyperparameter setting unit 120 determines the value of the hyperparameter for the trial of the user program based on the acquired parameter request.
  • step S204 the hyperparameter setting unit 120 returns the determined hyperparameter value to the interface unit 110.
  • in step S205, the interface unit 110 transmits the user program to be tried and the acquired hyperparameter value to the program trial device 200, and instructs the program trial device 200 to try the user program under the hyperparameter value.
  • step S206 the program trial device 200 executes the user program under the acquired hyperparameter values according to the trial instruction from the interface unit 110.
  • the program trial device 200 trains the deep learning model through a plurality of epochs executed under the hyperparameter value set by the hyperparameter setting unit 120, for example.
  • the evaluation values such as the accuracy, execution time, and progress of the deep learning model trained under the value of the hyper parameter are output as the intermediate trial result.
  • step S207 the program trial device 200 returns the evaluation result in each epoch to the interface unit 110 as an intermediate trial result.
  • step S208 the interface unit 110 provides the acquired intermediate trial result to the hyperparameter setting unit 120.
  • in step S209, the hyperparameter setting unit 120 evaluates (interim evaluation) the intermediate trial result of each epoch, acquired during execution of the user program before normal training ends, and, based on the intermediate trial result, may terminate execution of the user program earlier than usual.
  • the hyper parameter setting unit 120 may determine whether to end further training of the user program based on an intermediate evaluation value consisting of one or more intermediate trial results of accuracy, execution time, and progress.
  • the intermediate trial result for the intermediate evaluation used for this early termination determination and the trial result for the final result reported to the user may be different.
  • the hyper parameter setting unit 120 may end the execution of further training based on the progress, and the interface unit 110 may report the execution time to the user terminal 50 as the final result.
  • step S210 the hyper parameter setting unit 120 sends an early termination instruction to the program trial device 200 and the interface unit 110 to end the execution of the user program. Then, in step S211 if the specified number of trials has not been reached, the value of the hyperparameter to be applied to the next trial is set.
  • the information related to the intermediate evaluation may be arbitrarily stored in the database 250.
  • if the trial is not terminated early, the program trial device 200 continues the execution of the user program, returns to step S207, and provides the interface unit 110 with the intermediate trial result of the next epoch. The above-mentioned processing is then repeated until early termination is triggered or a termination condition, such as completion of a predetermined number of epochs, is satisfied.
  • step S211 the hyper parameter setting unit 120 stores the acquired trial result in the database 250, and sets the value of the hyper parameter to be applied to the next trial when the specified number of trials has not been reached.
  • steps S204 to S211 are repeated until the specified number of trials is reached.
  • step S212 the interface unit 110 provides the user terminal 50 with the trial results acquired so far as the final result.
  • the hyperparameter setting device 100 acquires an intermediate evaluation value by monitoring the standard output of the program trial device 200 during the execution of the user program. Every time the standard output of the program trial device 200 is updated for an epoch, the hyperparameter setting device 100 executes, for example, a predetermined intermediate evaluation extraction function to extract intermediate evaluation values such as accuracy and MSE. Then, the hyperparameter setting device 100 determines whether to terminate the execution of the user program early based on the acquired intermediate evaluation value. For example, the execution of the user program may be stopped when the extracted intermediate evaluation value is equal to or less than the average of the intermediate evaluation values of trials run with other hyperparameter values.
  • the hyperparameter setting device 100 compares the intermediate evaluation value at the 5th second with the intermediate evaluation values of other trials and, for example, if the intermediate evaluation value at the 5th second is lower than the average of the intermediate evaluation values of the other trials, the trial may be stopped in the middle.
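The monitoring and comparison just described might look like the sketch below. The output line format ("accuracy=..."), the function names, and the use of a strict "lower than the average" comparison are assumptions made for the example. A higher intermediate value is treated as better.

```python
import re
from statistics import mean


def extract_intermediate(line, pattern=r"accuracy=([0-9.]+)"):
    """Intermediate evaluation extraction: pull a value from one line of standard output."""
    match = re.search(pattern, line)
    return float(match.group(1)) if match else None


def should_stop_early(current, others):
    """Stop when the current value is lower than the average of the other trials' values."""
    if not others:
        return False  # nothing to compare against yet
    return current < mean(others)


# Hypothetical output line from a running trial and values from two other trials.
value = extract_intermediate("epoch 3 accuracy=0.40")
stop = should_stop_early(value, [0.5, 1.0])
```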
  • the hyper parameter setting device 100 and the program trial device 200 may be realized by the program trial system 10 composed of a single device.
  • the program trial system 10 may execute the hyperparameter setting process according to the flowchart as shown in FIG.
  • step S301 the program trial system 10 acquires the user program and parameter description data from the user terminal 50. That is, the program trial system 10 acquires a user program to be tried (for example, a deep learning model to be trained) and parameter description data of hyperparameters applied to the trial of the user program.
  • step S302 the program trial system 10 sets hyperparameters according to the hyperparameter description data.
  • step S303 the program trial system 10 executes the user program according to the set hyperparameters. Specifically, the program trial system 10 repeatedly executes the user program by the number of epochs.
  • step S304 the program trial system 10 acquires an intermediate evaluation value for each epoch.
  • in step S305, the program trial system 10 determines whether to end further repetition of the user program under the hyperparameter value set in step S302 based on the acquired intermediate evaluation value. For example, when the acquired intermediate evaluation value is less than the average of the evaluation values obtained with other hyperparameter values that have already been executed (S305: YES), the program trial system 10 stops further repetition of the user program under the set hyperparameter value and proceeds to step S307. On the other hand, when the acquired intermediate evaluation value is equal to or greater than the average of the evaluation values obtained with other hyperparameter values that have already been executed (S305: NO), the program trial system 10 proceeds to step S306 without terminating the user program early.
  • step S306 the program trial system 10 determines whether or not the predetermined maximum number of epochs has been exceeded. If the maximum number of epochs is not exceeded (S306: NO), the program trial system 10 returns to step S304 and acquires an intermediate evaluation value of the next epoch according to the set hyperparameter value. On the other hand, when the maximum number of epochs is exceeded (S306: YES), the program trial system 10 proceeds to step S307.
  • step S307 the program trial system 10 determines whether the user program has been executed for all possible values of the hyperparameters.
  • if so, the program trial system 10 ends the hyperparameter setting process.
  • otherwise, the program trial system 10 sets the next hyperparameter value in step S302 and repeatedly executes the user program for that value by the number of epochs.
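The flow of steps S302 to S307 can be sketched as a nested loop. This is an illustration under stated assumptions: `tune` and `toy_epoch` are hypothetical names, the toy scoring function is made up, and each intermediate value is compared with the average of what earlier hyperparameter values achieved at the same epoch, which is one possible reading of the comparison in step S305.

```python
from statistics import mean


def tune(candidates, train_one_epoch, max_epochs):
    """Try each candidate value, stopping a trial early when it falls below average."""
    history = {}  # hyperparameter value -> intermediate values per epoch
    for value in candidates:                       # S302 / S307
        scores = []
        for epoch in range(max_epochs):            # S303 / S306
            score = train_one_epoch(value, epoch)  # S304
            scores.append(score)
            # Values that earlier candidates reached at this same epoch.
            peers = [h[epoch] for h in history.values() if len(h) > epoch]
            if peers and score < mean(peers):      # S305: YES, stop early
                break
        history[value] = scores
    return history


def toy_epoch(lr, epoch):
    """Made-up stand-in for one training epoch; the score grows with each epoch."""
    return (1.0 - 10 * abs(lr - 0.05)) * (epoch + 1)


history = tune([0.05, 0.09, 0.04], toy_epoch, max_epochs=3)
```

In this toy run the first candidate always completes, because there is nothing to compare it with, while later candidates are cut short as soon as they fall below the running average.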
  • the interface unit 110 acquires parameter description data as shown in FIG. 7 from the user terminal 50.
  • two identical parameter description data are input via the command line interface on the user terminal 50.
  • the interface unit 110 sends parameter requests corresponding to these two parameter description data to the hyperparameter setting unit 120, and the hyperparameter setting unit 120 sets a hyperparameter value for each parameter description data.
  • the interface unit 110 gives the set values of the two hyper parameters to the program trial device 200, and instructs the program trial device 200 to try the user program in parallel under the values of each hyper parameter.
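Running the user program in parallel under several hyperparameter values can be sketched with a thread pool. The function name `try_in_parallel`, the uniform sampling, and the callable `user_program` (standing in for a dispatch to the program trial device 200) are assumptions for this example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def try_in_parallel(user_program, n_parallel, low, high, seed=0):
    """Set one value per parallel slot, then run the trials concurrently."""
    rng = random.Random(seed)
    values = [rng.uniform(low, high) for _ in range(n_parallel)]
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        # pool.map keeps results in the same order as the input values.
        scores = list(pool.map(user_program, values))
    return list(zip(values, scores))


# Hypothetical user program: a callable returning a made-up score.
parallel_results = try_in_parallel(lambda lr: 1.0 - lr, 2, 0.01, 0.1)
```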
  • Each function may be a circuit composed of an analog circuit, a digital circuit, or a mixed analog/digital circuit. Further, a control circuit for controlling each function may be provided. Each circuit may be implemented by an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or the like.
  • The hyperparameter setting device may be configured by hardware, or may be configured by software and executed by a CPU (Central Processing Unit) or the like through information processing of the software.
  • A program that realizes the hyperparameter setting device, or at least a part of its functions, may be stored in a storage medium, read by a computer, and executed.
  • The storage medium is not limited to removable media such as magnetic disks (for example, flexible disks) and optical disks (for example, CD-ROMs and DVD-ROMs), and may be a fixed-type storage medium such as a hard disk device or an SSD (Solid State Drive) that uses memory. That is, information processing by software may be concretely implemented using hardware resources. Further, the processing by software may be implemented in a circuit such as an FPGA and executed by hardware. The job may be executed using an accelerator such as a GPU (Graphics Processing Unit), for example.
  • A computer can be used as the device of the above embodiment by reading dedicated software stored in a computer-readable storage medium.
  • The type of storage medium is not particularly limited.
  • The computer can be used as the device of the above embodiment. In this way, information processing by software is concretely implemented using hardware resources.
  • FIG. 8 is a block diagram showing an example of the hardware configuration in the embodiment of the present disclosure.
  • The hyperparameter setting device 100 includes a processor 101, a main storage device 102, an auxiliary storage device 103, a network interface 104, and a device interface 105, and can be realized as a computer device in which these are connected via a bus 106.
  • Although the hyperparameter setting device 100 shown in FIG. 8 includes one of each component, it may include a plurality of the same components. Further, although one hyperparameter setting device 100 is shown, software may be installed on a plurality of computer devices, and each of the plurality of hyperparameter setting devices 100 may execute a different part of the processing of the software. In this case, each of the plurality of hyperparameter setting devices 100 may communicate via the network interface 104 or the like.
  • The processor 101 is an electronic circuit (processing circuitry) including a control unit and an arithmetic unit of the hyperparameter setting device 100.
  • The processor 101 performs arithmetic processing based on data and programs input from each device in the internal configuration of the hyperparameter setting device 100, and outputs arithmetic results and control signals to each device.
  • The processor 101 controls each component constituting the hyperparameter setting device 100 by executing an OS (Operating System), applications, and the like of the hyperparameter setting device 100.
  • The processor 101 is not particularly limited as long as it can perform the above processing.
  • The hyperparameter setting device 100 and each component thereof are realized by the processor 101.
  • The processing circuitry may refer to one or more electric circuits arranged on one chip, or to one or more electric circuits arranged on two or more chips or devices. When a plurality of electronic circuits are used, each electronic circuit may communicate by wire or wirelessly.
  • The main storage device 102 is a storage device that stores instructions executed by the processor 101, various data, and the like, and the information stored in the main storage device 102 is read directly by the processor 101.
  • The auxiliary storage device 103 is a storage device other than the main storage device 102. Note that these storage devices mean arbitrary electronic components capable of storing electronic information, and may be memory or storage. Further, memory includes volatile memory and non-volatile memory, and either may be used.
  • A memory for storing various data in the hyperparameter setting device 100 may be realized, for example, by the main storage device 102 or the auxiliary storage device 103. For example, at least a part of the memory may be implemented in the main storage device 102 or the auxiliary storage device 103.
  • When an accelerator is provided, at least a part of the above-mentioned memory may be implemented in the memory provided in the accelerator.
  • The network interface 104 is an interface for connecting to the communication network 108 wirelessly or by wire. A network interface conforming to an existing communication standard may be used as the network interface 104. The network interface 104 may exchange information with the external device 109A, which is communicatively connected via the communication network 108.
  • The external device 109A includes, for example, a camera, a motion capture device, an output destination device, an external sensor, an input source device, and the like. Further, the external device 109A may be a device having some of the functions of the components of the hyperparameter setting device 100. The hyperparameter setting device 100 may then receive a part of its processing results via the communication network 108, as with a cloud service.
  • The device interface 105 is an interface such as USB (Universal Serial Bus) that connects directly to the external device 109B.
  • The external device 109B may be an external storage medium or a storage device.
  • The memory may be realized by the external device 109B.
  • The external device 109B may be an output device.
  • The output device may be, for example, a display device for displaying an image, a device for outputting audio, or the like.
  • Examples of the output device include an LCD (Liquid Crystal Display), a CRT (Cathode Ray Tube), a PDP (Plasma Display Panel), an organic EL (ElectroLuminescence) display, and a speaker, but the output device is not limited to these.
  • The external device 109B may be an input device.
  • The input device includes devices such as a keyboard, a mouse, a touch panel, and a microphone, and passes the information input through these devices to the hyperparameter setting device 100.
  • The signal from the input device is output to the processor 101.
  • The interface unit 110, the hyperparameter setting unit 120, and the like of the hyperparameter setting device 100 in the present embodiment may be realized by the processor 101.
  • The memory of the hyperparameter setting device 100 may be realized by the main storage device 102 or the auxiliary storage device 103.
  • The hyperparameter setting device 100 may be equipped with one or more memories.
  • The expression "at least one of a, b and c" covers not only the combinations a, b, c, ab, ac, bc, and abc, but also combinations that include the same element more than once, such as aa, abb, and aabbbc.
  • It also covers configurations that include elements other than a, b, and c, such as the combination abcd.
  • Similarly, the expression "a, b or c" covers not only the combinations a, b, c, ab, ac, bc, and abc, but also combinations that include the same element more than once, such as aa, abb, and aabbbcc. It also covers configurations that include elements other than a, b, and c, such as the combination abcd.
  • 10 Program trial system; 50 User terminal; 100 Hyperparameter setting device; 110 Interface unit; 120 Hyperparameter setting unit; 101 Processor; 102 Main storage device; 103 Auxiliary storage device; 104 Network interface; 105 Device interface; 108 Communication network; 109 External device; 200 Program trial device; 250 Database

PCT/JP2020/022428 2019-06-12 2020-06-05 ハイパーパラメタチューニング方法、プログラム試行システム及びコンピュータプログラム Ceased WO2020250843A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
DE112020002822.4T DE112020002822T5 (de) 2019-06-12 2020-06-05 Hyperparameteroptimierungsverfahren, Programmversuchssystem und Computerprogramm
JP2021526074A JP7303299B2 (ja) 2019-06-12 2020-06-05 ハイパーパラメタチューニング方法、プログラム試行システム及びコンピュータプログラム
US17/643,661 US12430147B2 (en) 2019-06-12 2021-12-10 Hyperparameter tuning method, program trial system, and computer program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-109537 2019-06-12

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/643,661 Continuation US12430147B2 (en) 2019-06-12 2021-12-10 Hyperparameter tuning method, program trial system, and computer program

Publications (1)

Publication Number Publication Date
WO2020250843A1 (ja) 2020-12-17

Family

ID=73781187

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/022428 Ceased WO2020250843A1 (ja) 2019-06-12 2020-06-05 ハイパーパラメタチューニング方法、プログラム試行システム及びコンピュータプログラム

Country Status (4)

Country Link
US (1) US12430147B2
JP (1) JP7303299B2
DE (1) DE112020002822T5
WO (1) WO2020250843A1

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220245350A1 (en) * 2021-02-03 2022-08-04 Cambium Assessment, Inc. Framework and interface for machines
KR102925434B1 (ko) * 2023-01-17 2026-02-11 삼성전자주식회사 신경망의 학습을 위한 하이퍼 파라미터들의 탐색 방법 및 장치

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013242604A (ja) * 2010-09-10 2013-12-05 Fixstars Multi Core Labo Corp 実行モジュール最適化装置、実行モジュール最適化方法、およびプログラム
JP2019003408A (ja) * 2017-06-15 2019-01-10 株式会社日立製作所 ハイパーパラメータの評価方法、計算機及びプログラム
CN109816116A (zh) * 2019-01-17 2019-05-28 腾讯科技(深圳)有限公司 机器学习模型中超参数的优化方法及装置

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6620422B2 (ja) 2015-05-22 2019-12-18 富士通株式会社 設定方法、設定プログラム、及び設定装置
US10565498B1 (en) * 2017-02-28 2020-02-18 Amazon Technologies, Inc. Deep neural network-based relationship analysis with multi-feature token model
EP4700664A3 (en) * 2017-05-17 2026-04-22 Intel Corporation Systems and methods implementing an intelligent optimization platform
US20200097853A1 (en) * 2017-06-02 2020-03-26 Google Llc Systems and Methods for Black Box Optimization
JP6974712B2 (ja) 2017-10-24 2021-12-01 富士通株式会社 探索方法、探索装置および探索プログラム
US11270217B2 (en) * 2017-11-17 2022-03-08 Intel Corporation Systems and methods implementing an intelligent machine learning tuning system providing multiple tuned hyperparameter solutions
US20190236487A1 (en) * 2018-01-30 2019-08-01 Microsoft Technology Licensing, Llc Machine learning hyperparameter tuning tool
JP6892424B2 (ja) 2018-10-09 2021-06-23 株式会社Preferred Networks ハイパーパラメータチューニング方法、装置及びプログラム
JP6882356B2 (ja) 2019-02-28 2021-06-02 キヤノン株式会社 像加熱装置

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "@koshian2", QIITA, 13 May 2019 (2019-05-13), pages 1 - 16, XP055771861, Retrieved from the Internet <URL:https://qiita.com/koshian2/items/1c0f781d244a6046b83e> [retrieved on 20200715] *
TAKUYA AKIBA: "Optuna Preferred Networks Research & Development Blog", OPTUNA PREFERRED NETWORKS RESEARCH & DEVELOPMENT BLOG, 3 December 2018 (2018-12-03), pages 1 - 7, XP055771774, Retrieved from the Internet <URL:https://tech.preferred.jp/ja/blog/optuna-release> [retrieved on 20200715] *
TEREDESAI ANKUR, KUMAR VIPIN, LI YING, ROSALES RÓMER, TERZI EVIMARIA, KARYPIS GEORGE, AKIBA TAKUYA, SANO SHOTARO, YANASE TOSHIHIKO: "Optuna: A Next-generation Hyperparameter Optimization Framework", KDD 2019 APPLIED DATA SCIENCE TRACK, 26 July 2019 (2019-07-26), pages 2623 - 2631, XP058466352, Retrieved from the Internet <URL:https://arxiv.org/abs/1907.10902> [retrieved on 20200715] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220335327A1 (en) * 2021-04-19 2022-10-20 Rohde & Schwarz Gmbh & Co. Kg Electronic device and method of setting processing parameters
US12026594B2 (en) * 2021-04-19 2024-07-02 Rohde & Schwarz Gmbh & Co. Kg Electronic device and method of setting processing parameters

Also Published As

Publication number Publication date
JP7303299B2 (ja) 2023-07-04
US12430147B2 (en) 2025-09-30
DE112020002822T5 (de) 2022-03-03
JPWO2020250843A1 2020-12-17
US20220100531A1 (en) 2022-03-31

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20823075. Country of ref document: EP. Kind code of ref document: A1.
ENP Entry into the national phase. Ref document number: 2021526074. Country of ref document: JP. Kind code of ref document: A.
122 Ep: PCT application non-entry in European phase. Ref document number: 20823075. Country of ref document: EP. Kind code of ref document: A1.