KR20230012790A

KR20230012790A - Method and apparatus for function optimization

Info

Publication number: KR20230012790A
Application number: KR1020210093496A
Authority: KR
Inventors: 이원종; 조형헌
Original assignee: 서울대학교산학협력단
Priority date: 2021-07-16
Filing date: 2021-07-16
Publication date: 2023-01-26
Also published as: KR102559605B1; WO2023287239A1

Abstract

According to an embodiment of the present invention, provided are a function optimization method and device for performing optimization on an unknown function by applying a Bayesian optimization model in an evolutionary algorithm. According to diversified Bayesian optimization, robustness of optimization is improved and accuracy is enhanced. The function optimization method includes the steps of: a first step of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process; a second step of generating M child candidates from the K parent candidates by a genetic operation; a third step of determining expected fitness of the M child candidates by a second optimization process for the objective function, and determining a final candidate for an optimal solution of the objective function based on the expected fitness; and a fourth step of evaluating the fitness of the final candidate for the optimal solution.

Description

Function optimization method and device {METHOD AND APPARATUS FOR FUNCTION OPTIMIZATION}

The present invention relates to a method and apparatus for function optimization, and more particularly to a method and apparatus for function optimization that use Bayesian optimization within an evolutionary algorithm.

The following is provided only as background information related to embodiments of the present invention, and what is described does not necessarily constitute prior art.

Existing optimization methods have the drawback that each must be developed to fit the characteristics of a specific problem.

To address this, meta-heuristic algorithms have been developed. Even when they do not find the optimal solution, these methods aim for generally good performance on most problems without being strongly tied to problem-specific information.

An evolutionary algorithm (EA) is a nature-inspired meta-heuristic algorithm. It takes a randomly chosen set of K candidates as its population and converges toward the optimal solution by evolving some of the candidates and eliminating others.
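The evolve-and-eliminate cycle described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the bit-string encoding, the `one_max` toy fitness, the one-point crossover, and the replace-the-worst rule are all hypothetical choices made for the example and are not taken from the patent.

```python
import random

def one_max(bits):
    """Toy fitness: number of 1 bits (a stand-in for a real objective)."""
    return sum(bits)

def evolve(k=8, n_bits=12, generations=40, seed=0):
    rng = random.Random(seed)
    # Randomly chosen initial population of K candidates.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(k)]
    for _ in range(generations):
        # Evolve: recombine two parents (one-point crossover), then flip one bit.
        p1, p2 = rng.sample(population, 2)
        cut = rng.randrange(1, n_bits)
        child = p1[:cut] + p2[cut:]
        child[rng.randrange(n_bits)] ^= 1
        # Eliminate: the child replaces the worst member if it is at least as fit.
        worst = min(range(k), key=lambda j: one_max(population[j]))
        if one_max(child) >= one_max(population[worst]):
            population[worst] = child
    return population

final_population = evolve()
best_fitness = max(one_max(c) for c in final_population)
```

Because a member is replaced only by a child that is at least as fit as the current worst, the best fitness in the population never decreases from one generation to the next.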

A genetic algorithm (GA), one type of evolutionary algorithm, has been shown to be effective by schema theory under the Building Block Hypothesis (BBH). In practice, however, its performance can vary greatly depending on how the genetic representation and the genetic operators are implemented, as well as on design parameters such as the population size K.

However, when K is chosen per problem, if K is too small the initial population may fail to satisfy the BBH and be unable to converge to the optimal solution, and if the initial population is too large the evolution process may proceed slowly (the cold-start problem).

Meanwhile, the background art described above is technical information that the inventors possessed for deriving the present invention or acquired in the course of deriving it, and it is not necessarily known art disclosed to the general public before the filing of the present invention.

An object of the present invention is to provide a method and apparatus for function optimization that use Bayesian optimization within an evolutionary algorithm.

The objects of the present invention are not limited to the one mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be appreciated that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

A function optimization method according to an embodiment of the present invention may include: a first step of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process for the objective function; a second step of generating M child candidates from the K parent candidates by a genetic operation; a third step of determining the expected fitness of the M child candidates by a second optimization process for the objective function, and determining a final candidate for the optimal solution of the objective function based on the expected fitness; and a fourth step of evaluating the fitness of the final candidate for the optimal solution.

A function optimization apparatus according to an embodiment of the present invention includes at least one processor, and the processor may be configured to perform: a first operation of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process for the objective function; a second operation of generating M child candidates from the K parent candidates by a genetic operation; a third operation of determining the expected fitness of the M child candidates by a second optimization process for the objective function, and determining a final candidate for the optimal solution of the objective function based on the expected fitness; and a fourth operation of evaluating the fitness of the final candidate for the optimal solution.

Aspects, features, and advantages other than those described above will become apparent from the following drawings, claims, and detailed description of the invention.

According to an embodiment, applying diversified Bayesian optimization enables more robust optimization of an unknown function.

According to an embodiment, optimization performance is improved through the cooperation of two Bayesian optimizations.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description below.

FIG. 1 is a diagram schematically illustrating function optimization according to an embodiment.
FIG. 2 is a block diagram of a function optimization apparatus according to an embodiment.
FIG. 3 is a flowchart of a function optimization method according to an embodiment.
FIG. 4A is an exemplary result graph of function optimization using a diversified Bayesian optimization scheme according to an embodiment.
FIG. 4B is an exemplary result graph of function optimization by an evolutionary algorithm using two Bayesian optimization modules according to an embodiment.
FIG. 5 is exemplary pseudocode of a function optimization process according to an embodiment.

Hereinafter, the present invention is described in more detail with reference to the drawings. The present invention may be embodied in many different forms and is not limited to the embodiments described herein. In the following embodiments, parts not directly related to the description are omitted in order to explain the present invention clearly, but this does not mean that the omitted components are unnecessary when implementing a device or system to which the spirit of the present invention is applied. In addition, the same reference numerals are used for the same or similar components throughout the specification.

In the following description, terms such as first and second may be used to describe various components, but the components should not be limited by these terms, which are used only to distinguish one component from another. Also, in the following description, singular expressions include plural expressions unless the context clearly indicates otherwise.

In the following description, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

The present invention is described in detail below with reference to the drawings.

FIG. 1 is a diagram schematically illustrating function optimization according to an embodiment.

The function optimization apparatus 100 according to the embodiment performs optimization on an unknown function $f$. The function optimization apparatus 100 estimates an optimal solution of the unknown function by applying Bayesian optimization (BO) to part of an evolutionary algorithm (EA).

Here, an unknown function refers to any black-box function whose evaluation results may contain noise and whose internal mechanism, such as derivative information, is difficult to know apart from the output values for given inputs. Function optimization refers to the process of searching for the optimal solution of an unknown function or for an approximation of it.

The objective function targeted by the function optimization method and apparatus according to the embodiment is an unknown function that is non-convex and expensive to evaluate (a non-convex expensive objective function).

Black-box optimization, including the optimization of unknown functions, is being actively studied in the field of deep learning. For example, neural architecture search (NAS) and hyper-parameter optimization (HPO) are representative problems that are expensive to optimize.

Optimization of unknown functions, that is, black-box functions, is treated as important for solving deep neural network (DNN) problems and the associated hyper-parameter optimization (HPO) and neural architecture search (NAS) problems.

For DNNs, the HPO problem, broadly construed, includes the NAS problem. Such an optimization problem can be expressed as Equation 1 below.

[Equation 1]

$$x^* = \arg\max_{x \in \mathcal{X}} f(x)$$

Here, $x$ can be decomposed into $x = (x_A, x_{\bar{A}})$, where $x_A$ is the set of architecture variables, $x_{\bar{A}}$ is the set of non-architecture variables, $\mathcal{X}$ is the optimization space (that is, the search space), and $f$ is the objective function to be maximized.

That is, the function optimization technique according to the embodiment can determine the optimal value $x^*$ that maximizes the unknown function $f$ over the elements $x$ of the set $\mathcal{X}$.
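As a concrete reading of this formulation, the sketch below finds the maximizer over a tiny mixed search space by brute force. Everything in it is a hypothetical illustration: the architecture choices, the learning-rate grid, and the scores returned by `toy_objective` are invented for the example, and exhaustive search is used only because the toy space is small enough to enumerate, which is exactly what an expensive objective rules out in practice.

```python
import itertools

# Hypothetical toy search space (illustrative names and values only).
ARCH_CHOICES = ["conv3x3", "conv5x5", "maxpool"]   # categorical architecture variable
LEARNING_RATES = [0.001, 0.01, 0.1]                # discretized non-architecture variable

def toy_objective(arch, lr):
    """Stand-in for the expensive black-box function; returns a made-up score."""
    arch_score = {"conv3x3": 0.90, "conv5x5": 0.85, "maxpool": 0.60}[arch]
    lr_penalty = abs(lr - 0.01) * 2.0
    return arch_score - lr_penalty

def argmax_over_space():
    """Evaluate every point of the (tiny) search space and return the maximizer."""
    space = itertools.product(ARCH_CHOICES, LEARNING_RATES)
    return max(space, key=lambda x: toy_objective(*x))

best = argmax_over_space()
```

With these made-up scores the maximizer is `("conv3x3", 0.01)`. Each candidate is a pair of one architecture variable and one non-architecture variable, mirroring the decomposition of the input described in the text.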

Some research in the HPO field has focused on a fixed architecture $x_A$ and has optimized the non-architecture variables $x_{\bar{A}}$. For example, $x_{\bar{A}}$ includes, but is not limited to, continuous training-related variables such as the learning rate, and may also include categorical variables.

Meanwhile, NAS research has focused on optimizing the architecture variables $x_A$. In contrast to general NAS, recent NAS research has focused on restricting the search to a specific, narrow architecture search space that is carefully chosen based on prior knowledge obtained from earlier NAS studies, and on developing specialized methods for setting new best performance within that space. In such research, all of the architecture variables are categorical, and the non-architecture variables are fixed to predetermined values.

Carrying out HPO and NAS requires at least the following three elements.

1. Search space: Early HPO research considered general search spaces, whereas recent work has considered narrow search spaces restricted by knowledge gained through manual architecture design or by hardware constraints such as those of mobile platforms.

2. Search method: Search methods include random search, evolutionary algorithms, and Bayesian optimization. For NAS, specialized methods based on reinforcement learning (RL) have been developed, and gradient-based optimization is also used.

3. Evaluation: Various techniques have been developed for functions with high evaluation cost. Examples include early termination (also known as partial training), warm-starting DNNs through weight sharing, and the one-shot approach, in which only subgraphs of a supernet are considered. Another major line of work reduces evaluation cost through performance estimation.

Meanwhile, evolutionary algorithms (EA) and Bayesian optimization (BO), which are applicable to every type of variable $x$, including the architecture variables $x_A$ and the non-architecture variables $x_{\bar{A}}$, have become the most popular search methods of the past decades.

An evolutionary algorithm (EA) is a model-free method that is little affected by the shape of the underlying fitness landscape, and it therefore performs optimization well across a wide range of problems.

Bayesian optimization (BO) is highly effective for problems with high evaluation cost because it sequentially learns patterns in the observed data through a relatively inexpensive surrogate model.

The function optimization method and apparatus according to the embodiment propose an effective function optimization technique, B²EA, for general HPO tasks, including NAS.

The technique is the result of research that designed and developed the correspondence between each module of the evolutionary algorithm (EA), used as the search method for finding the optimal solution in the search space, and the three elements of HPO and NAS described above.

The function optimization method and apparatus according to the embodiment are a derivative-free optimization technique and provide a useful way to approximate the optimal solution of an arbitrary black-box function under a limited budget.

Specifically, the function optimization technique according to the embodiment can be widely used for NP-hard problems, for which finding the optimal solution is known to be very difficult, and for engineering problems in which evaluation is extremely costly, such as predicting drilling locations. It can also be used to design the architecture or fine-tune the hyper-parameters that must be determined before training a deep learning model that requires large-scale computing resources.

Evolutionary algorithms are commonly used for combinatorial optimization problems, and Bayesian optimization is frequently used for continuous optimization problems. Evolutionary algorithms have been preferred for optimizing functions with low evaluation cost, and Bayesian optimization for functions with high evaluation cost.

The function optimization method and apparatus proposed by the present invention are applicable to combinatorial optimization problems, continuous optimization problems, and mixed-type optimization problems alike.

The function optimization method and apparatus according to the embodiment can be expected to improve optimization performance by using Bayesian optimization in modules of the evolutionary algorithm workflow, such as survivor selection and offspring evaluation.

The function optimization method and apparatus according to the embodiment remove the difficulty, when applying a Bayesian optimization algorithm, of having to choose the design parameters of a Bayesian model suited to the problem from prior knowledge, and they can improve robustness by adopting a technique that diversifies the Bayesian model.

In addition, the function optimization method and apparatus according to the embodiment can be used in any case where an arbitrary unknown function that is very costly to evaluate experimentally is to be optimized automatically without human intervention.

Furthermore, in experimental examples of the present invention, the technique was applied to hyper-parameter optimization (HPO) and neural architecture search (NAS) problems in deep learning, which has recently drawn attention for achieving high predictive performance on a variety of problems, and it showed superior performance compared with state-of-the-art algorithms.

Meanwhile, Bayesian optimization in the present invention is not limited to optimization methods that explicitly define prior and posterior distributions over the objective function. In the function optimization technique according to the embodiment, Bayesian optimization should be understood in the broadest sense, encompassing sequential model-based global optimization strategies.

FIG. 2 is a block diagram of a function optimization apparatus according to an embodiment.

The function optimization apparatus 100 according to the embodiment includes a processor 110.

The processor 110 is a kind of central processing unit and may execute one or more instructions stored in the memory 120 to carry out the function optimization method according to the embodiment. The processor 110 may include any type of device capable of processing data.

The processor 110 may refer, for example, to a data processing device embedded in hardware that has circuitry physically structured to perform functions expressed as code or instructions contained in a program. Examples of such a hardware-embedded data processing device include, but are not limited to, a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), and a graphics processing unit (GPU). The processor 110 may include one or more processors. The processor 110 may include at least one core.

The function optimization apparatus 100 according to the embodiment may further include a memory 120.

The memory 120 may store instructions and the like with which the function optimization apparatus 100 according to the embodiment executes the function optimization method. The memory 120 may store an executable program that generates and executes one or more instructions implementing the function optimization technique according to the embodiment.

The processor 110 may execute the function optimization method according to the embodiment based on the program and instructions stored in the memory 120.

The memory 120 may include internal memory and/or external memory, and may include volatile memory such as DRAM, SRAM, or SDRAM; non-volatile memory such as one time programmable ROM (OTPROM), PROM, EPROM, EEPROM, mask ROM, flash ROM, NAND flash memory, or NOR flash memory; a flash drive such as an SSD, a compact flash (CF) card, an SD card, a Micro-SD card, a Mini-SD card, an xD card, or a memory stick; or a storage device such as an HDD. The memory 120 may include, but is not limited to, magnetic storage media or flash storage media.

The processor 110 may be configured to perform: a first operation of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process for the objective function; a second operation of generating M child candidates from the K parent candidates by a genetic operation; a third operation of determining the expected fitness of the M child candidates by a second optimization process for the objective function and determining a final candidate for the optimal solution of the objective function based on the determined expected fitness; and a fourth operation of evaluating the fitness of the final candidate for the optimal solution of the objective function.

Here, the first, second, third, and fourth operations correspond respectively to the first, second, third, and fourth steps, which are described later with reference to FIG. 3.

In one example, the first optimization process may include computation of a first Bayesian optimization model for the objective function. In one example, the second optimization process may include computation of a second Bayesian optimization model for the objective function.

The first Bayesian optimization model and the second Bayesian optimization model each correspond to a model selected from a set of Bayesian optimization models by a diversified Bayesian optimization technique.

In the first operation, the processor 110 may be configured to generate, by a first surrogate function, a mathematical model of the objective function based on the search history for the optimal solution over the full search space of the objective function, and to determine, by a first acquisition function, the K parent candidates based on the generated mathematical model.

In the first operation, the processor 110 may be configured to select, from a set of Bayesian optimization models, the Bayesian optimization model that defines the first surrogate function and the first acquisition function. For this selection, the processor 110 may use a round-robin scheme or a uniform random scheme, but is not limited to these and may use various other selection schemes.
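The two selection schemes named above can be sketched as simple generators. This is an illustrative sketch only: the contents of `BO_MODELS` (surrogate and acquisition labels such as "GP" and "EI") are hypothetical placeholders, since the set of models is not listed in this passage, and the helper names are assumptions.

```python
import itertools
import random

# Hypothetical pool of Bayesian optimization models, each written as a
# (surrogate, acquisition) pair; the labels are placeholders for illustration.
BO_MODELS = [("GP", "EI"), ("GP", "UCB"), ("RF", "EI")]

def round_robin_selector(models):
    """Yield the models in a fixed cyclic order, one per iteration."""
    return itertools.cycle(models)

def uniform_random_selector(models, seed=None):
    """Yield models drawn independently and uniformly at random."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(models)

rr = round_robin_selector(BO_MODELS)
first_four = [next(rr) for _ in range(4)]       # wraps around after the third model

ur = uniform_random_selector(BO_MODELS, seed=0)
random_draws = [next(ur) for _ in range(4)]
```

Either selector can be advanced once per optimization iteration, so that no single surrogate and acquisition pairing has to be chosen in advance from prior knowledge.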

In the third operation, the processor 110 may be configured to generate, by a second surrogate function, a mathematical model of the objective function based on the search history for the optimal solution in the search space of the objective function defined by the M child candidates, and to determine, by a second acquisition function, the final candidate based on that mathematical model.

In the third operation, the processor 110 may be configured to select, from a set of Bayesian optimization models, the Bayesian optimization model that defines the second surrogate function and the second acquisition function. For this selection, the processor 110 may use a round-robin scheme or a uniform random scheme, but is not limited to these and may use various other selection schemes.

In the fourth operation, the processor 110 may be configured to add the final candidate to the search history for the optimal solution.

The processor 110 may be configured to repeat the first to fourth operations depending on whether a predetermined termination condition is satisfied.

Hereinafter, a function optimization method according to an embodiment is described in detail with reference to FIG. 3.

FIG. 3 is a flowchart of a function optimization method according to an embodiment.

The function optimization method according to the embodiment is performed by the function optimization apparatus 100 described above with reference to FIG. 2.

The function optimization method according to the embodiment improves on a metaheuristic algorithm. Building on the four modules of an evolutionary algorithm (Module Init, Module A, Module B, and Module C, described below with reference to FIG. 3) and their workflow, it introduces a Bayesian optimization technique into each module to improve optimization performance.

Here, Bayesian optimization refers collectively to sequential model-based global optimization methods. Instead of directly evaluating a real function that is expensive to evaluate, Bayesian optimization uses a Bayesian model to learn a surrogate function that is cheap to evaluate, and selects and evaluates only candidates predicted to perform sufficiently well, thereby quickly approaching the optimal solution.

In FIG. 3, the initialization module (Module Init) is associated with the initialization step (S0), Module A with the first step (S1), and Module B with the second step (S2).

Module C is divided into two submodules, Module C1 and Module C2. Module C1 is associated with the third step (S3), and Module C2 with the fourth step (S4).

Here, each module is a software module and may include one or more instructions that cause the processor 110 to perform the steps or operations associated with that module.

The function optimization method according to the embodiment starts at the initialization step (S0).

In step S0, the processor 110 of the function optimization apparatus 100 described with reference to FIG. 2 selects an initial candidate set (C⁰) for the optimal solution of the objective function.

Here, a candidate set is a set whose elements are pairs of a candidate individual and that candidate's fitness value. The initial candidate set (C⁰) may include at least two such pairs. The fitness value of each candidate in the initial candidate set (C⁰) is set to an initial value.

For example, two initial candidates may be selected for the initial candidate set (C⁰), and the fitness value of each initial candidate may be initialized to -∞. The two initial candidates may be chosen at random.

In step S0, the processor 110 initializes the search history (H), in which candidates whose fitness evaluation has been completed in step S4, described below, are stored. For example, the search history (H) is initialized to the empty set.
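
The initialization described above can be sketched as follows. This is a minimal illustration only: the function names, the two-dimensional unit-square search space, and the use of Python lists for the candidate set and the search history are assumptions made for the example and do not come from the specification.

```python
import math
import random

def init_state(sample_candidate, num_initial=2, seed=0):
    """Step S0 sketch: build C0 with fitness -inf and an empty search history."""
    rng = random.Random(seed)
    # Candidate set: (candidate, fitness) pairs; fitness starts at -inf.
    c0 = [(sample_candidate(rng), -math.inf) for _ in range(num_initial)]
    # Search history: evaluated (candidate, fitness) pairs are added in step S4.
    history = []
    return c0, history

def sample_unit_square(rng):
    # Hypothetical search space [0, 1]^2, used only for illustration.
    return (rng.random(), rng.random())

c0, history = init_state(sample_unit_square)
```

Setting the initial fitness to -∞ means any candidate that is actually evaluated later compares as better than an unevaluated one.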

The following equation represents the operation performed in step S0.

A candidate set is generated in each iteration of steps S1, S2, S3, and S4, described below. For example, the n-th candidate set (Cⁿ) is generated in the n-th iteration, where n denotes the generation of the evolution.

Following step S0, the processor 110 performs steps S4 and S5 on the initial candidate set (C⁰).

That is, once the initialization of step S0 is complete, the fitness of the candidates in the initial candidate set (C⁰) is evaluated in step S4, and step S5 checks whether their fitness values satisfy the termination requirement in order to decide whether to end the optimization. If the termination requirement is not satisfied, the process proceeds to step S1 and n, which denotes the generation, is incremented.

In step S0, the present invention uses Bayesian optimization to reduce the performance impact of how the K value is set for the problem and of the sampling method used to select the initial candidates.

Steps S4 and S5 are described in detail below, after steps S1 to S3.

The function optimization method according to the embodiment includes: a first step (S1) of determining K parent candidates for the optimal solution of an unknown objective function by a first optimization process for the objective function; a second step (S2) of generating M child candidates from the K parent candidates by a genetic operation; a third step (S3) of determining the expected fitness of the M child candidates by a second optimization process for the objective function and determining a final candidate for the optimal solution of the objective function based on the determined expected fitness; and a fourth step (S4) of evaluating the fitness of the final candidate for the optimal solution of the objective function.

In the first step (S1), the processor 110 may determine K parent candidates for the optimal solution of the objective function by the first optimization process for the unknown objective function.

The first step (S1) is a parent selection process: from the union of the search history (H) and the candidate set, K candidates, that is, parents, are selected to pass good genes on to the candidates of the next generation. In other words, the first step (S1) dynamically generates a set of K parent candidates at each iteration.

Unlike a conventional evolutionary algorithm, the function optimization method according to the embodiment does not lose performance even when the search history (H) or the candidate set contains few elements, and can still select parents that pass good genes on to the candidates of the next generation. This is because the first and second optimization processes, described below, cooperate to find near-optimal offspring while adaptively controlling the search space in every generation.

In one example, the first optimization process may include computing a first Bayesian optimization model for the objective function. That is, in the first step (S1), the processor 110 may determine the K parent candidates by computing the first Bayesian optimization model for the objective function.

Here, the first Bayesian optimization model may include a model selected from a set of Bayesian optimization models by diversified Bayesian optimization.

Diversified Bayesian optimization means selecting, at each iteration of the function optimization method according to the embodiment, one Bayesian optimization model from a set whose elements are a plurality of Bayesian optimization models. Diversified Bayesian optimization can be highly accurate even when the cardinality of the search history (|H|) is small, which makes it effective at improving performance when optimizing functions with a high evaluation cost.

In one example, the processor 110 may select one Bayesian optimization model from the set of Bayesian optimization models in a round-robin or uniform random manner; various other selection schemes may also be used.

Rather than fixing a single choice for the design factors that determine Bayesian optimization performance on a given problem, such as the mathematical model and the type of acquisition function (acquisition utility function), the present invention adopts a diversification strategy that uses multiple combinations of them to increase the diversity of candidates.

By applying the diversification strategy wherever Bayesian optimization is used as a module, the present invention can reduce model bias and enhance the cooperation effect obtained through history sharing between models.

To determine the K parent candidates, the first step (S1) may include generating, by a first surrogate function, a mathematical model of the objective function based on the search history (H) for the optimal solution in the full search space of the objective function, and determining, by a first acquisition function, the K parent candidates based on the generated mathematical model.

The processor 110 may generate, by the first surrogate function, a mathematical model of the objective function based on the search history (H) for the optimal solution in the full search space of the objective function.

Here, the mathematical model of the objective function includes a probability distribution model or a score-based model.

In other words, the first surrogate function may provide a mathematical model for estimating the optimal solution of the objective function, based on the search history (H) for the optimal solution in the full search space of the objective function.

For example, the first surrogate function may be a probability distribution function that generates a probability distribution model for the optimal solution of the objective function. As another example, the first surrogate function may be a score-based function that generates a score-based model for the optimal solution of the objective function. Examples of the first surrogate function include a Gaussian process, a random forest, and a neural-network-based performance predictor (neural predictor).

Subsequently, the processor 110 may determine, by the first acquisition function, the K parent candidates based on the generated mathematical model. Examples of the first acquisition function include Probability of Improvement (PI), Expected Improvement (EI), and Upper Confidence Bound (UCB).
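
For reference, the three acquisition functions named above are commonly written as follows for a maximization problem. The notation is an assumption introduced here and is not taken from the specification: μ(x) and σ(x) denote the predictive mean and standard deviation of the surrogate model, f⁺ the best fitness in the search history, Φ and φ the standard normal distribution and density functions, and ξ ≥ 0 and κ ≥ 0 exploration parameters.

```latex
% Textbook forms, valid where \sigma(x) > 0; notation assumed, not from the source.
\begin{align}
  z(x) &= \frac{\mu(x) - f^{+} - \xi}{\sigma(x)}, \\
  \mathrm{PI}(x) &= \Phi\bigl(z(x)\bigr), \\
  \mathrm{EI}(x) &= \bigl(\mu(x) - f^{+} - \xi\bigr)\,\Phi\bigl(z(x)\bigr)
                    + \sigma(x)\,\varphi\bigl(z(x)\bigr), \\
  \mathrm{UCB}(x) &= \mu(x) + \kappa\,\sigma(x).
\end{align}
```

PI and EI reward candidates likely to beat the best evaluated fitness, whereas UCB trades predicted fitness against uncertainty through κ. Because the three rank candidates differently, they are natural members of a diversified model set.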

For example, the processor 110 may use the first acquisition function to select, from the mathematical model generated by the first surrogate function, the top K parent candidates in descending order of probability or score.
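
A top-K selection of this kind might look like the sketch below. To stay dependency-free it uses a deliberately simple stand-in surrogate (a kernel-weighted mean with a distance-based uncertainty), not a Gaussian process or random forest, together with a UCB-style score. All names, the one-dimensional search space, the example history, and the parameters `length` and `kappa` are assumptions made for illustration.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential similarity between two points (tuples of floats)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * length ** 2))

def surrogate(history, x):
    """Stand-in surrogate (NOT a Gaussian process): kernel-weighted mean of the
    observed fitness, plus an uncertainty that grows away from observed points.
    Requires a non-empty history."""
    weights = [rbf(x, xi) for xi, _ in history]
    total = sum(weights)
    if total < 1e-12:
        mean = sum(f for _, f in history) / len(history)
    else:
        mean = sum(w * f for w, (_, f) in zip(weights, history)) / total
    uncertainty = 1.0 - max(weights)
    return mean, uncertainty

def ucb(mean, uncertainty, kappa=1.0):
    """UCB-style score: predicted fitness plus an exploration bonus."""
    return mean + kappa * uncertainty

def select_parents(history, pool, k):
    """Rank a pool of candidates by acquisition score and keep the top K."""
    ranked = sorted(pool, key=lambda x: ucb(*surrogate(history, x)), reverse=True)
    return ranked[:k]

rng = random.Random(0)
# Hypothetical search history of (candidate, fitness) pairs on [0, 1].
history = [((0.1,), 0.2), ((0.5,), 0.9), ((0.9,), 0.3)]
pool = [(rng.random(),) for _ in range(200)]
parents = select_parents(history, pool, k=4)
```

The pool is sampled from the whole search space rather than from existing candidates only, so a parent may be a point that has never been evaluated.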

Additionally, the first step (S1) may further include selecting, by the processor 110, from a set of Bayesian optimization models, the Bayesian optimization model that defines the first surrogate function and the first acquisition function.

This serves to diversify the Bayesian optimization model used in the first optimization process. That is, the first Bayesian optimization model is a diversified Bayesian optimization model.

This selection may be performed, for example, before generating the mathematical model by the first surrogate function and determining the K parent candidates by the first acquisition function, as described above.

In one example, the processor 110 may select, from a set of Bayesian optimization models, the Bayesian optimization model that defines the first surrogate function and the first acquisition function. Here, the set of Bayesian optimization models is a set whose elements are a plurality of Bayesian optimization models, each based on a combination of one of a plurality of surrogate functions and one of a plurality of acquisition functions.

For example, the processor 110 may select the Bayesian optimization model that defines the first surrogate function and the first acquisition function from the set of Bayesian optimization models in a round-robin or uniform random manner.
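
The two selection schemes can be sketched as follows. The model set is built as the Cartesian product of surrogate and acquisition labels; the labels are placeholder strings standing in for real implementations, and the function names are assumptions for this example.

```python
import itertools
import random

# Placeholder labels for the surrogate and acquisition functions named in the text.
SURROGATES = ["GP", "RF", "NeuralPredictor"]
ACQUISITIONS = ["PI", "EI", "UCB"]

# One Bayesian optimization model per (surrogate, acquisition) combination.
MODEL_SET = list(itertools.product(SURROGATES, ACQUISITIONS))

def round_robin(models):
    """Return an iterator that yields the models in a fixed cyclic order."""
    return itertools.cycle(models)

def uniform_random(models, rng):
    """Pick one model, each with equal probability."""
    return rng.choice(models)

picker = round_robin(MODEL_SET)
first_three = [next(picker) for _ in range(3)]

rng = random.Random(0)
random_pick = uniform_random(MODEL_SET, rng)
```

Round-robin guarantees that every combination is used equally often over a long run, whereas uniform random selection only achieves this in expectation.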

In the first step (S1), because Bayesian optimization allows the expected performance of all possible candidates, not only those in P and C, to be evaluated during parent selection, the present invention can select parents with good genes independently of the genes present in P and C.

Equation 3 below represents the operation performed in the first step (S1).

In the second step (S2), the processor 110 may generate M child candidates from the K parent candidates determined in the first step (S1) by a genetic operation. For example, M is a natural number, and K is a natural number greater than or equal to M.

The second step (S2) is an offspring generation process: evolutionary operators such as mutation and crossover are applied to the parent individuals selected in the first step (S1) to generate a new candidate set C consisting of M individuals.

In one example, the genetic operation may include at least one of mutation, crossover, and conservation. Here, conservation means applying no genetic operation.

Equation 4 below represents the operation in the second step (S2).

For example, the conservation operation may randomly select M child candidates from the K parent candidates. The mutation operation may apply a mutation to one randomly chosen parameter value of a parent candidate. The crossover operation may randomly select parents from the K parent candidates (for example, as unique pairs that do not duplicate one another) and apply crossover to generate M offspring. In crossover, each parameter value of an offspring may be inherited by selecting one of the parents' values for that parameter with equal probability.
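
The three operations described above can be sketched as follows, assuming that a candidate is a dictionary of discrete hyperparameter values. The parameter names and value lists in `CHOICES` are hypothetical and chosen only for illustration.

```python
import random

def conserve(parents, m, rng):
    """Conservation: pick M of the K parents unchanged (requires M <= K)."""
    return rng.sample(parents, m)

def mutate(parent, choices, rng):
    """Mutation: re-draw one randomly chosen parameter of a parent.
    The re-drawn value may happen to equal the old one."""
    child = dict(parent)
    key = rng.choice(sorted(child))
    child[key] = rng.choice(choices[key])
    return child

def crossover(p1, p2, rng):
    """Crossover: each parameter is inherited from either parent with equal probability."""
    return {key: rng.choice((p1[key], p2[key])) for key in p1}

# Hypothetical discrete search space.
CHOICES = {"lr": [1e-3, 1e-2, 1e-1], "layers": [2, 4, 8], "dropout": [0.0, 0.25, 0.5]}

rng = random.Random(0)
parents = [{k: rng.choice(v) for k, v in CHOICES.items()} for _ in range(4)]

kept = conserve(parents, 2, rng)
mutant = mutate(parents[0], CHOICES, rng)
child = crossover(parents[0], parents[1], rng)
```

Conservation as written needs M ≤ K, which matches the statement that K is greater than or equal to M.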

In the second step (S2), thanks to the Bayesian optimization applied in the first step (S1) and the third step (S3), the present invention can offset the degradation of the candidate set caused by a poorly chosen evolutionary operator. Conversely, the error caused by the bias of the probabilistic model in Bayesian optimization can be reduced by the perturbation introduced through the evolutionary operators.

In the third step (S3), the processor 110 may determine the expected fitness of the M child candidates generated in the second step (S2) by a second optimization process for the objective function, and determine a final candidate for the optimal solution of the objective function based on the determined expected fitness.

In one example, the second optimization process may include computing a second Bayesian optimization model for the objective function. Here, the second Bayesian optimization model may include a model selected from a set of Bayesian optimization models by diversified Bayesian optimization.

To this end, the third step (S3) may further include selecting, from a set of Bayesian optimization models, the Bayesian optimization model that defines the second surrogate function and the second acquisition function.

For example, the processor 110 may select the Bayesian optimization model that defines the second surrogate function and the second acquisition function from the set of Bayesian optimization models in a round-robin or uniform random manner.

For example, the processor 110 may select the second surrogate function from the set of Bayesian optimization models excluding the first surrogate function. That is, within the same generation, the first surrogate function and the second surrogate function may differ from each other.

In the third step (S3), the processor 110 may perform a second Bayesian optimization on the objective function to determine the expected fitness of the M child candidates generated in the second step (S2), and determine the final candidate for the optimal solution of the objective function based on the determined expected fitness.

The third step (S3) may include generating, by the second surrogate function, a mathematical model of the objective function based on the search history (H) for the optimal solution of the objective function, in a search space of the objective function based on the M child candidates generated in step S2, and determining, by the second acquisition function, the final candidate based on the generated mathematical model.

That is, the processor 110 may determine the expected fitness of the M child candidates based on the mathematical model generated by the second surrogate function, and determine the final candidate from among the M child candidates by the second acquisition function.

The search space in the third step (S3) includes the M child candidates generated in the second step (S2). This reduces the size of the search space in the third step (S3) and lowers the computational cost.

In the third step (S3), the mathematical model includes a probability distribution model or a score-based model.

In other words, the second surrogate function may provide a mathematical model for estimating the optimal solution of the objective function, based on the search history (H) for the optimal solution of the objective function in the entire search space of the objective function.

For example, the second surrogate function may be a probability distribution function that generates a probability distribution model for the optimal solution of the objective function. As another example, the second surrogate function may be a score-based function that generates a score-based model for the optimal solution of the objective function. Examples of the second surrogate function include a Gaussian process, a random forest, and a neural-network-based performance predictor (neural predictor).

Subsequently, the processor 110 may determine, by the second acquisition function, the final candidate based on the generated mathematical model. Examples of the second acquisition function include Probability of Improvement (PI), Expected Improvement (EI), and Upper Confidence Bound (UCB).

For example, the processor 110 may use the second acquisition function to select, as the final candidate, the child candidate with the highest probability or the highest score according to the mathematical model generated by the second surrogate function.

Using Bayesian optimization in the third step (S3) greatly reduces the number of randomly chosen candidates that must undergo actual fitness evaluation, so optimization performance improves faster than with existing methods (i.e., warm start).

In the third step (S3), before the expensive fitness evaluation of every individual in the newly generated child candidate set C, the present invention uses Bayesian optimization to pre-evaluate expected fitness with a cheap surrogate function. Limiting actual fitness evaluation to the few candidates expected to perform well reduces the high evaluation cost.

Equation 5 below represents the operation of the third step (S3).

In the fourth step (S4), the processor 110 may evaluate the fitness of the final candidate for the optimal solution of the objective function.

In the fourth step (S4), the processor 110 evaluates the fitness of the final candidate selected in the third step (S3) instead of evaluating the fitness of all M child candidates. This mitigates the fitness evaluation cost for functions that are expensive to evaluate.

In other words, in the third step (S3), the second Bayesian optimization model is used to estimate the true function values of the M child candidates at low cost, and the child candidate with the best estimate is selected as the final candidate. Then, in the fourth step (S4), the true function value of the final candidate is evaluated.
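
The division of labor between the third and fourth steps can be illustrated with a small sketch. The objective, the surrogate prediction, and the list of children are all hypothetical stand-ins; the point is only that the expensive function is called once rather than M times.

```python
def pick_final_candidate(children, predict):
    """Step S3 sketch: rank the children by cheap predicted fitness, keep the best."""
    return max(children, key=predict)

calls = {"true": 0}

def true_fitness(x):
    # Stand-in for an expensive objective (for example, training a network).
    calls["true"] += 1
    return -(x - 0.7) ** 2

def cheap_predict(x):
    # Hypothetical surrogate prediction; deliberately imperfect.
    return -(x - 0.65) ** 2

children = [0.1, 0.3, 0.6, 0.8, 0.95]               # M = 5 child candidates
final = pick_final_candidate(children, cheap_predict)
fitness = true_fitness(final)                        # step S4: one expensive call
```

Here the surrogate ranks 0.6 highest, so only that child is evaluated with the true function, and `calls["true"]` ends at 1 instead of 5.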

In one example, the fourth step (S4) may include adding the final candidate determined in step S3 to the search history (H) for the optimal solution of the objective function, as shown in Equation 6 below.

The search history (H) is thereby maintained as a set containing the final candidate of each generation.

Additionally, the function optimization method according to the embodiment may further include repeating the first step (S1) to the fourth step (S4) depending on whether a predetermined termination condition is satisfied. The usual termination conditions of evolutionary algorithms may be used. For example, the termination condition includes finding a solution that exceeds a preset target performance, or reaching the maximum number of generations, that is, the maximum number of iterations of the evolutionary process.

If the termination condition is not satisfied, the generation n is incremented by 1 and the first step (S1) to the fourth step (S4) are repeated.
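
Putting the steps together, the overall loop might be sketched as below. This is an illustration under strong simplifying assumptions, not the patented implementation: the objective is a toy one-dimensional function, each "model" is a stand-in scorer parameterized by a hypothetical (length scale, exploration weight) pair rather than a real surrogate and acquisition function, and only mutation is used as the genetic operation.

```python
import itertools
import math
import random

def objective(x):
    # Hypothetical expensive objective to maximize on [0, 1]; maximum at x = 0.7.
    return -(x - 0.7) ** 2

def make_scorer(history, length, kappa):
    """Stand-in for a surrogate plus acquisition pair (NOT a real GP or EI)."""
    def score(x):
        w = [math.exp(-((x - xi) ** 2) / (2 * length ** 2)) for xi, _ in history]
        total = sum(w)
        mean = (sum(wi * f for wi, (_, f) in zip(w, history)) / total
                if total > 1e-12 else 0.0)
        return mean + kappa * (1.0 - max(w))
    return score

def optimize(k=4, m=4, max_generations=30, target=-1e-4, seed=0):
    rng = random.Random(seed)
    # Hypothetical model set, visited in round-robin order.
    models = itertools.cycle([(0.1, 0.5), (0.3, 0.1), (0.2, 1.0)])
    # S0 followed by S4: evaluate two random initial candidates and store them.
    history = [(x, objective(x)) for x in (rng.random(), rng.random())]
    for n in range(1, max_generations + 1):  # n is the generation counter
        # S5: stop once the best evaluated fitness reaches the target.
        if max(f for _, f in history) >= target:
            break
        # S1: top-K parents from a random sample of the full search space.
        score1 = make_scorer(history, *next(models))
        pool = [rng.random() for _ in range(100)]
        parents = sorted(pool, key=score1, reverse=True)[:k]
        # S2: mutation only, clipped to [0, 1].
        children = [min(1.0, max(0.0, rng.choice(parents) + rng.gauss(0, 0.05)))
                    for _ in range(m)]
        # S3: a second, different model ranks only the M children.
        score2 = make_scorer(history, *next(models))
        final = max(children, key=score2)
        # S4: one true evaluation per generation, added to the history.
        history.append((final, objective(final)))
    return max(history, key=lambda pair: pair[1]), len(history)

best, evaluations = optimize()
```

Because two consecutive models are drawn from the cycle within a generation, the first and second scorers always differ, and the number of expensive evaluations is bounded by the two initial candidates plus one per generation.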

The correspondence between the function optimization method according to the embodiment and the three elements of HPO and NAS described above with reference to FIG. 1 is as follows.

In FIG. 3, Module A controls the search space, and Module C evaluates the selected candidates.

Based on this correspondence, in the embodiment the first optimization process, based on the first Bayesian optimization model, is placed in Module A so that it performs adaptive search space control.

Likewise, the second optimization process, based on the second Bayesian optimization model, is placed in Module C and designed to support performance estimation of many candidate configurations.

According to the embodiment, integrating two Bayesian optimization modules into an evolutionary algorithm can greatly improve performance on both HPO problems and NAS problems in deep learning.

When the embodiment was evaluated on 14 HPO and NAS problems in deep learning, the probability of successfully reaching a difficult optimization target averaged only 40.5% for Regularized Evolution, the best-performing existing evolutionary algorithm, whereas the embodiment using mutation as the evolutionary operator averaged 96.5%, a marked improvement.

FIG. 4A is an exemplary result graph of function optimization using the diversified Bayesian optimization scheme according to an embodiment.

In FIG. 4A, (a) shows, by way of example, a search history in which four candidates have been evaluated so far; (b) shows the optimal value for x_5 selected by GP-UCB (Gaussian Process-Upper Confidence Bound); (c) shows the optimal value for x_5 selected by RF-EI (Random Forest-Expected Improvement); and (d) shows the optimal value for x_5 selected by GP-EI (Gaussian Process-Expected Improvement).

In (b), GP-UCB selects the next candidate (x_5) at a position very close to the previously tried candidates (x_3 and x_4) and, as a result, becomes stuck near them. In (c), RF-EI fares better than in (b) but selects a point near l_1, which is a local optimum. In (d), GP-EI has a different inductive bias and selects a point near g, which is the global optimum.

This shows that the function optimization technique based on diversified Bayesian optimization according to the embodiment relies on multiple models, and is therefore more robust than single-model Bayesian optimization and able to escape from a local optimum.
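As a hedged illustration of why different acquisition functions can disagree on the next candidate, the following Python sketch computes the standard UCB and EI scores from a surrogate's predicted mean and standard deviation. The function names, the exploration weight `kappa`, and the two example points are assumptions made for illustration; they are not taken from FIG. 4A.

```python
import math


def ucb(mean, std, kappa=2.0):
    """Upper Confidence Bound: rewards a high predicted mean and high uncertainty."""
    return mean + kappa * std


def expected_improvement(mean, std, best_so_far):
    """Expected Improvement over best_so_far for a maximization problem."""
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


# Two hypothetical candidates: A is a safe bet, B is uncertain.
best = 0.85
a = {"mean": 0.9, "std": 0.05}
b = {"mean": 0.6, "std": 0.3}

# UCB prefers the uncertain point B, while EI prefers the safer point A.
ucb_prefers_b = ucb(b["mean"], b["std"]) > ucb(a["mean"], a["std"])
ei_prefers_a = (expected_improvement(a["mean"], a["std"], best)
                > expected_improvement(b["mean"], b["std"], best))
```

With these assumed numbers the two criteria rank the candidates in opposite order, which is the kind of disagreement that a set of diversified models can exploit.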

FIG. 4B is an exemplary result graph of function optimization by an evolutionary algorithm that uses two Bayesian optimization modules according to an embodiment.

In FIG. 4B, (a) shows the search history; (b) shows the optimal value for x^next selected by RF-UCB (Random Forest-Upper Confidence Bound); (c) shows the optimal value for x^next selected by GP-EI (Gaussian Process-Expected Improvement); and (d) shows the optimal value for x^next selected by cooperation.

Compared with the optimal values determined by the individual Bayesian optimization models in (b) and (c), the decision made by the cooperation of the two Bayesian models in (d) selects a better candidate for x^next.

The cooperation in (d) is achieved by a first Bayesian optimization model that restricts the search space to the top 10 points (top M = 10) and a second Bayesian optimization model that selects one optimal point from among these 10 points.

Because the two Bayesian optimization models have different model characteristics, they complement each other and reach a compromise in selecting the final candidate.
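The cooperation described for (d) can be sketched as a two-stage selection: one scoring function shortlists the top M points and a second scoring function picks a single point from that shortlist. The following is a minimal sketch under stated assumptions; `cooperate`, `score_first`, and `score_second` are hypothetical names, and the toy scoring functions stand in for the acquisition values of two real Bayesian optimization models.

```python
def cooperate(candidates, score_first, score_second, top_m=10):
    """Shortlist the top_m candidates under the first score, then pick the
    best of the shortlist under the second score."""
    shortlist = sorted(candidates, key=score_first, reverse=True)[:top_m]
    return max(shortlist, key=score_second)


# Toy stand-ins for two models with different preferences (assumptions):
# the first model likes points near 30.2, the second likes large values.
candidates = list(range(100))
chosen = cooperate(
    candidates,
    score_first=lambda x: -abs(x - 30.2),
    score_second=lambda x: x,
)
```

In this toy setting the first score keeps the points 26 through 35, and the second score then selects 35, so neither model decides alone.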

FIG. 5 is exemplary pseudocode of a function optimization process according to an embodiment.

The function optimization process according to the embodiment takes as input an unknown objective function (f), a set of diversified Bayesian optimization models, and the full search space, and executes the steps of the function optimization method described above with reference to FIG. 3 according to the exemplary pseudocode of FIG. 5.

Lines 1 to 4 correspond to the initialization step (S0) of FIG. 3. For example, two initial candidates are selected from the full search space, each initial candidate is evaluated, and the results are added to the search history. This shows that the process can start with only two candidates instead of K randomly selected candidates.

The loop at Line 5 repeats the process of Lines 6 to 19 below until the exit criteria of Line 17 are satisfied.

Lines 6 to 8 correspond to the first step (S1) of FIG. 3. The first Bayesian optimization model is set from the set of diversified Bayesian models, for example in a round-robin manner using a modulo (mod) operation. Based on the set first Bayesian optimization model, K parent candidates are selected from the full search space.

Here, the K parent candidates are selected from a surrogate function that is trained on the search history using the first hat function, which denotes an arbitrary Bayesian optimization algorithm.
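The round-robin choice of the first model and the selection of K parents can be sketched as follows. This is only an illustration under assumptions: `pick_round_robin` and `select_parents` are hypothetical helper names, the models are represented by label strings, and `acquisition` is a placeholder for whatever score the chosen model assigns to each point of the search space.

```python
def pick_round_robin(models, iteration):
    """Cycle through the model set using the modulo of the iteration index."""
    return models[iteration % len(models)]


def select_parents(search_space, acquisition, k):
    """Return the k points of the search space with the highest acquisition score."""
    return sorted(search_space, key=acquisition, reverse=True)[:k]


# Hypothetical model set and a toy acquisition score peaking at x = 4.
models = ["GP-EI", "GP-UCB", "RF-EI"]
first_model = pick_round_robin(models, iteration=4)
parents = select_parents(list(range(10)), lambda x: -(x - 4) ** 2, k=3)
```

Sorting the whole space is affordable here only because the toy space is small; a real search space would typically be sampled first.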

Lines 9 to 11 correspond to the second step (S2) of FIG. 3. An evolutionary operation of conservation (nothing), mutation, or crossover is applied to the K parent candidates to generate M child candidates.

Here, the effect on optimization performance is evaluated for the cases in which each of the two evolutionary operators, mutation or crossover, is applied and for the case in which no evolutionary operator is applied (that is, conservation).
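The three options named above can be illustrated on real-valued candidate vectors. The sketch below is an assumption-laden example, not the operators of FIG. 5: the function names, the Gaussian single-coordinate mutation, the uniform crossover, and the way parents are drawn at random to fill M children are all choices made for illustration.

```python
import random


def mutate(parent, rng, scale=0.1):
    """Perturb one randomly chosen coordinate with Gaussian noise."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, scale)
    return child


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: take each coordinate from one of the two parents."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def conserve(parent):
    """'Nothing' operator: the child is an unchanged copy of the parent."""
    return list(parent)


def make_children(parents, m, operator, rng):
    """Generate m children from the parents with the named operator."""
    children = []
    while len(children) < m:
        if operator == "mutation":
            children.append(mutate(rng.choice(parents), rng))
        elif operator == "crossover":
            a, b = rng.sample(parents, 2)  # needs at least two parents
            children.append(crossover(a, b, rng))
        else:
            children.append(conserve(rng.choice(parents)))
    return children


rng = random.Random(0)
parents = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
children = make_children(parents, m=5, operator="crossover", rng=rng)
```

Each operator returns a new list, so the parent candidates are never modified in place.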

Lines 12 to 14 correspond to the third step (S3) of FIG. 3. The second Bayesian optimization model is set from the set of diversified Bayesian models, for example at random. Based on the set second Bayesian optimization model, the final candidate is selected from among the M child candidates. Here, the second hat function is used to select, as the final candidate, the child that is expected to have the best fitness evaluation performance among the M child candidates.

Line 15 corresponds to the fourth step (S4) of FIG. 3. The selected final candidate (x_n) is evaluated to determine the function value (y_n) of the objective function (f). Here, limiting the expensive fitness evaluation to a single final candidate can improve optimization performance.

Lines 16 and 17 correspond to step S5 of FIG. 3. If the exit criteria are satisfied, the process stops. If the exit criteria are not satisfied, the final candidate and its evaluation result (x_n, y_n) are added to the search history at Lines 18 and 19.
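The walk-through of Lines 1 to 19 can be put together as a minimal runnable sketch. It rests on several loud assumptions: `make_model` is a toy stand-in (nearest-neighbor prediction plus a distance bonus), not a Gaussian process or random forest; the search space is a one-dimensional list of floats; mutation is the only operator; and `optimize`, `budget`, and `target` are hypothetical names for illustration.

```python
import random


def make_model(kappa):
    """Toy stand-in for one BO model (NOT a real GP or random forest).

    Predicts the value of the nearest evaluated point and adds a bonus
    proportional to the distance from it, loosely mimicking UCB.
    """
    def score(x, history):
        dist, y = min((abs(x - xi), yi) for xi, yi in history)
        return y + kappa * dist
    return score


def optimize(f, models, space, k=5, m=10, budget=30, target=None, seed=0):
    rng = random.Random(seed)
    lo, hi = min(space), max(space)
    # Lines 1-4 (S0): evaluate two initial candidates, start the search history.
    history = [(x, f(x)) for x in rng.sample(space, 2)]
    best = max(history, key=lambda p: p[1])
    for n in range(budget):  # Line 5
        # Lines 6-8 (S1): round-robin first model picks K parents.
        first = models[n % len(models)]
        parents = sorted(space, key=lambda x: first(x, history), reverse=True)[:k]
        # Lines 9-11 (S2): mutation yields M children, clipped to the bounds.
        children = [min(max(rng.choice(parents) + rng.gauss(0.0, 0.05), lo), hi)
                    for _ in range(m)]
        # Lines 12-14 (S3): randomly chosen second model picks the final candidate.
        second = rng.choice(models)
        x_n = max(children, key=lambda x: second(x, history))
        # Line 15 (S4): the only expensive evaluation in this iteration.
        y_n = f(x_n)
        if y_n > best[1]:
            best = (x_n, y_n)
        # Lines 16-17 (S5): stop when the exit criterion is met.
        if target is not None and y_n >= target:
            break
        # Lines 18-19: otherwise record the result in the search history.
        history.append((x_n, y_n))
    return best


# Toy maximization problem with its optimum at x = 0.7 (an assumption).
space = [i / 200 for i in range(201)]
objective = lambda x: -(x - 0.7) ** 2
models = [make_model(0.5), make_model(2.0), make_model(5.0)]
best_x, best_y = optimize(objective, models, space)
```

Only one call to `f` is made per iteration, which mirrors the point that the expensive evaluation is limited to a single final candidate.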

According to the embodiment, dynamic search space control by the first diversified Bayesian optimization model and cheap function output estimation by the second diversified Bayesian optimization model are possible.

In particular, the cooperation of the two BO models can (i) prevent the failure of a single BO model and (ii) upgrade a generic EA into a robust and efficient method that can be effective for a wide range of complex black-box optimization problems.

The present invention can be used as a solution not only to the HPO and NAS problems in deep learning on which the embodiment was validated, but also to optimization problems for most black-box functions across industry. For example, it is applicable to tuning complex mechanical devices such as engines, to semiconductor device and circuit design, and to A/B testing for evaluating user response. In particular, improved performance and a lower total experimental cost can be expected both in application fields where conventional Bayesian optimization or evolutionary algorithms are already used and in application fields where experiments are very expensive.

The method according to an embodiment of the present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium includes all types of recording devices that store data readable by a computer system. Examples of the computer-readable medium include a Hard Disk Drive (HDD), a Solid State Disk (SSD), a Silicon Disk Drive (SDD), ROM, RAM, CD-ROM, magnetic tape, a floppy disk, and an optical data storage device.

The above description of the embodiments of the present invention is for illustrative purposes, and those of ordinary skill in the art to which the present invention pertains will understand that the embodiments can be easily modified into other specific forms without changing the technical spirit or essential features of the present invention. Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and likewise, components described as distributed may be implemented in a combined form.

The scope of the present invention is defined by the claims below rather than by the above detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100: function optimization device
110: processor
120: memory

Claims

1. A function optimization method comprising:
a first step of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process for the objective function;
a second step of generating M child candidates from the K parent candidates by an evolutionary operation;
a third step of determining expected fitness of the M child candidates by a second optimization process for the objective function, and determining a final candidate for the optimal solution of the objective function based on the expected fitness; and
a fourth step of evaluating the fitness of the final candidate for the optimal solution.

2. The function optimization method of claim 1,
wherein the first optimization process includes computation of a first Bayesian optimization model for the objective function, and
wherein the second optimization process includes computation of a second Bayesian optimization model for the objective function.

3. The function optimization method of claim 2,
wherein the first Bayesian optimization model and the second Bayesian optimization model are models selected from a set of Bayesian optimization models by a diversified Bayesian optimization technique.

4. The function optimization method of claim 1,
wherein the first step includes:
generating, by a first surrogate function, a mathematical model for the objective function based on a search history for the optimal solution in the full search space of the objective function; and
determining, by a first acquisition function, the K parent candidates based on the mathematical model.

5. The function optimization method of claim 4,
wherein the first step further includes:
selecting, from a set of Bayesian optimization models, a Bayesian optimization model that defines the first surrogate function and the first acquisition function.

6. The function optimization method of claim 1,
wherein the evolutionary operation includes at least one of mutation, crossover, and conservation.

7. The function optimization method of claim 1,
wherein the third step includes:
generating, by a second surrogate function, a mathematical model for the objective function based on a search history for the optimal solution in a search space of the objective function that is based on the M child candidates; and
determining, by a second acquisition function, the final candidate based on the mathematical model.

8. The function optimization method of claim 7,
wherein the third step further includes:
selecting, from a set of Bayesian optimization models, a Bayesian optimization model that defines the second surrogate function and the second acquisition function.

9. The function optimization method of claim 1,
wherein the fourth step further includes:
adding the final candidate to the search history for the optimal solution.

10. The function optimization method of claim 1, further comprising:
repeating the first step through the fourth step depending on whether a predetermined end condition is satisfied.

11. A function optimization device comprising:
at least one processor,
wherein the processor is configured to perform:
a first operation of determining K parent candidates for an optimal solution of an unknown objective function by a first optimization process;
a second operation of generating M child candidates from the K parent candidates by an evolutionary operation;
a third operation of determining expected fitness of the M child candidates by a second optimization process for the objective function, and determining a final candidate for the optimal solution of the objective function based on the expected fitness; and
a fourth operation of evaluating the fitness of the final candidate for the optimal solution.

12. The function optimization device of claim 11,
wherein the first optimization process includes computation of a first Bayesian optimization model for the objective function, and
wherein the second optimization process includes computation of a second Bayesian optimization model for the objective function.

13. The function optimization device of claim 12,
wherein the first Bayesian optimization model and the second Bayesian optimization model are models selected from a set of Bayesian optimization models by a diversified Bayesian optimization technique.

14. The function optimization device of claim 11,
wherein the processor is configured to, in the first operation:
generate, by a first surrogate function, a mathematical model for the objective function based on a search history for the optimal solution in a full search space of the objective function; and
determine, by a first acquisition function, the K parent candidates based on the mathematical model.

15. The function optimization device of claim 14,
wherein the processor is configured to, in the first operation:
select, from a set of Bayesian optimization models, a Bayesian optimization model that defines the first surrogate function and the first acquisition function.

16. The function optimization device of claim 11,
wherein the processor is configured to, in the third operation:
generate, by a second surrogate function, a mathematical model for the objective function based on a search history for the optimal solution in a search space of the objective function that is based on the M child candidates; and
determine, by a second acquisition function, the final candidate based on the mathematical model.

17. The function optimization device of claim 16,
wherein the processor is configured to, in the third operation:
select, from a set of Bayesian optimization models, a Bayesian optimization model that defines the second surrogate function and the second acquisition function.

18. The function optimization device of claim 11,
wherein the processor is configured to, in the fourth operation:
add the final candidate to the search history for the optimal solution.

19. The function optimization device of claim 11,
wherein the processor is configured to repeat the first operation through the fourth operation depending on whether a predetermined end condition is satisfied.

20. A computer program stored in a non-transitory storage medium, the computer program including at least one instruction that causes a processor to execute the function optimization method according to any one of claims 1 to 10.