KR102153540B1

KR102153540B1 - Method and apparatus for micro simulation parameter calibration using machine learning in agent based simulation

Info

Publication number: KR102153540B1
Application number: KR1020180146622A
Authority: KR
Inventors: Dongjun Kim (김동준); Il-Chul Moon (문일철); Tae-Sub Yun (윤태섭)
Original assignee: Korea Advanced Institute of Science and Technology (한국과학기술원)
Priority date: 2018-11-23
Filing date: 2018-11-23
Publication date: 2020-09-21
Also published as: KR20200061173A; WO2020105776A1

Abstract

A computer-implemented method is provided for calibrating micro-simulation parameters using machine learning in an agent-based simulation. The method may include: (a) generating validation data by preprocessing assimilation target data, which is the target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format; (b) performing agent-based modeling and simulation (ABMS), using as ABM input data an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data; (c) performing micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and performing cluster analysis; (d) analyzing an error by comparing the validation data with the agent macro data; (e) when the error exceeds a predetermined reference value, (e-1) calibrating the parameters for each of the at least one cluster so as to reduce the error and setting a calibrated parameter set for each of the at least one cluster, (e-2) setting the calibrated initial parameter set for the characteristics of the plurality of agents and the scenario data as ABM input data, performing agent-based modeling and simulation on the ABM input data to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data, and (e-3) performing steps (d) and (e); and (f) when the error is less than or equal to the predetermined reference value, determining the calibrated parameters to be the final parameters.

Description

Method and apparatus for micro simulation parameter calibration using machine learning in agent based simulation

The present disclosure relates to an agent-based simulation method and, more specifically, to a method for calibrating micro-simulation parameters using machine learning in agent-based simulation.

It is a chronic problem of agent-based simulation that its results differ from observations of the system being modeled. For a simulation model with the generative property of resolving this problem through parameter calibration to be used widely, the reliability of the model must be ensured. To increase that reliability, the model parameters can be calibrated based on the generated simulation results. Calibrating simulation parameters through machine learning, that is, through data-driven statistical models, is key to increasing the reliability of a simulation model.

Prior work on parameter calibration has generally produced methodologies applicable only to specific simulation scenarios. A more general calibration methodology uses a genetic algorithm. However, that approach is limited in that it can be applied only when a social network is modeled in the simulation model. In addition, it learns the parameters that determine strategy based on the agents' utility, and agent utility is very difficult to model.

As a calibration methodology for more general simulation models, a dynamic parameter calibration methodology using machine learning ("Data-Driven Automatic Calibration for Validation of Agent-Based Social Simulations") has been introduced. It learns dynamic parameters in the direction that directly improves the fit of the simulation, and it has the advantage of being applicable to all simulation models. For convenience, the parameters calibrated by that methodology are called macro parameters here; a macro parameter is a general parameter that applies in common to all agents. The methodology detects regimes in the dynamic macro parameters using a hidden Markov model and then estimates the macro parameters for each regime. In other words, it is a calibration methodology for dynamic macro parameters that change over time. However, the factors that determine an agent's strategy include both common factors and factors that differ for each individual agent. A calibration methodology is therefore needed for the micro parameters that determine the individual agents' factors.

[1] B. Park and H. Qi, “Development and evaluation of a procedure for the calibration of simulation models,” Transportation Research Record: Journal of the Transportation Research Board (TRB), 2005.
[2] T. Toledo, M. Ben-Akiva, D. Darda, M. Jha, and H. Koutsopoulos, “Calibration of microscopic traffic simulation models with aggregate data,” Transportation Research Record: Journal of the Transportation Research Board (TRB), 2004.
[3] J. Hourdakis, P. Michalopoulos, and J. Kottommannil, “Practical procedure for calibrating microscopic traffic simulation models,” Transportation Research Record: Journal of the Transportation Research Board (TRB), 2003.
[4] V. Nannen and A. E. Eiben, “A method for parameter calibration and relevance estimation in evolutionary algorithms,” In 8th annual conference on Genetic and evolutionary computation (ACM), 2006.
[5] Il-Chul Moon, Dongjun Kim, Tae-Sub Yun, Jang Won Bae, Dong-oh Kang, and Euihyun Paik, “Data-Driven Automatic Calibration for Validation of Agent-Based Social Simulations,” In IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2018.

There is a need for a method that can increase the reliability of a simulation model.

According to one aspect of the present disclosure, a computer-implemented method is provided for calibrating micro-simulation parameters using machine learning in agent-based simulation. The method may include: (a) generating validation data by preprocessing assimilation target data, which is the target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format; (b) performing agent-based modeling and simulation (ABMS), using as ABM input data an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data; (c) performing micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and performing cluster analysis; (d) analyzing an error by comparing the validation data with the agent macro data; (e) when the error exceeds a predetermined reference value, (e-1) calibrating the parameters for each of the at least one cluster so as to reduce the error and setting a calibrated parameter set for each of the at least one cluster, (e-2) setting the calibrated initial parameter set for the characteristics of the plurality of agents and the scenario data as ABM input data, performing agent-based modeling and simulation on the ABM input data to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data, and (e-3) performing steps (d) and (e); and (f) when the error is less than or equal to the predetermined reference value, determining the calibrated parameters to be the final parameters.

According to another aspect of the present disclosure, a computer-implemented method is provided for calibrating micro-simulation parameters using machine learning in agent-based simulation. The method may include: (a) generating validation data by preprocessing assimilation target data, which is the target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format; (b) performing agent-based modeling and simulation (ABMS), using as ABM input data an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data; (c) performing micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and performing cluster analysis; (d) analyzing an error by comparing the validation data with the agent macro data; and (e) when the error exceeds a predetermined reference value, estimating micro parameters.

In one embodiment, the cluster analysis of the method described above may apply a technique that clusters through latent representations, such as a variational autoencoder (VAE) combined with a Dirichlet process mixture model (DPMM).

According to yet another aspect of the present disclosure, a computer-readable recording medium containing one or more instructions may be provided, wherein the one or more instructions, when executed by a computer, cause the computer to perform any one of the methods described above.

According to yet another aspect of the present disclosure, an apparatus for calibrating micro-simulation parameters using machine learning in agent-based simulation may be provided. The apparatus may include: a validation data generation module configured to generate validation data by preprocessing assimilation target data, which is the target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format; a modeling module configured to perform agent-based modeling and simulation (ABMS), using as ABM input data an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and to aggregate the agent micro data to obtain agent macro data; a cluster analysis module configured to perform micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and to perform cluster analysis; an error analysis module configured to analyze an error by comparing the validation data with the agent macro data; and a micro parameter estimation module configured to calibrate the parameters for each of the at least one cluster so as to reduce the error and to set a calibrated parameter set for each of the at least one cluster.

According to the present disclosure, selecting parameters through micro-simulation parameter calibration, rather than using parameters chosen by heuristic techniques, can increase the reliability of a simulation model.

According to the present disclosure, micro parameters that yield a high goodness of fit can be found by combining several statistical models: a probabilistic autoencoder and a Dirichlet process mixture model for the cluster analysis, and Bayesian optimization to obtain the micro parameters that maximize the fit.

FIG. 1 is a diagram illustrating a method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.
FIG. 2 is a flowchart illustrating a method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.
FIG. 3 is a visualization of the result of analyzing the latent representation of each cluster using a probabilistic autoencoder according to an embodiment of the present disclosure.
FIG. 4 is a graph of the goodness of fit of a simulation that uses the method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.
FIG. 5 is a graph of apartment sales transaction volume in the Seoul metropolitan area obtained with parameters calibrated by the method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.
FIG. 6 is a graph of apartment sales transaction volume in the Seoul metropolitan area obtained with conventional ad-hoc parameters.

The advantages and features of the present disclosure, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. However, the present disclosure is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure is complete and so that those of ordinary skill in the art to which the present disclosure pertains are fully informed of the scope of the invention; the present disclosure is defined only by the scope of the claims.

The terms used in this specification are used only to describe particular embodiments and are not intended to limit the present disclosure. For example, an element expressed in the singular should be understood to include a plurality of elements unless the context clearly indicates the singular only. In addition, terms such as "include" or "have" in this specification are intended only to indicate the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification; their use does not exclude the presence or possible addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof. Further, in the embodiments described in this specification, a "module" or "unit" may mean a functional part that performs at least one function or operation.

In addition, unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present disclosure belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this specification.

Hereinafter, embodiments of the present disclosure are described in more detail with reference to the accompanying drawings. In the following description, detailed explanations of well-known functions or configurations are omitted where they might unnecessarily obscure the subject matter of the present disclosure.

FIG. 1 is a diagram illustrating a method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.

In one embodiment of the present disclosure, a micro parameter is a parameter that determines each agent's strategy and differs from agent to agent. However, calibrating the parameter values of every individual agent can amount to overfitting in the machine-learning sense. Therefore, in one embodiment of the present disclosure, a micro parameter is defined as a parameter of an agent cluster, and micro parameter calibration is defined as calibrating the parameters assigned to each cluster obtained through agent cluster analysis.

In one embodiment of the present disclosure, the assimilation target data may mean the target data to be estimated through simulation.

In one embodiment of the present disclosure, the scenario data is data for reflecting external variables in the simulation. For example, in a housing simulation model that predicts how housing prices and transaction volumes will move, the scenario data may be housing policy variables such as the loan-to-value (LTV) and debt-to-income (DTI) ratios.

Preprocessing of assimilation target data (110)

The assimilation target data is the target data to be estimated through simulation; real-world data can be preprocessed and converted into a data format that makes error analysis possible. In one embodiment of the present disclosure, to analyze a particular phenomenon through simulation, data on that phenomenon can be preprocessed and converted into the same format as the simulation results. That is, the real-world data arising under the given parameters can be preprocessed so that it is comparable with the simulation results. This makes it possible to compare real-world data and simulation data directly and, through that comparison, to find simulation parameters that reproduce the real-world data more closely.

As shown in FIG. 1, preprocessing the assimilation target data yields the graph (120) of validation data (real-world data).
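As one illustration of the preprocessing step, the sketch below converts raw transaction records into a monthly count series that has the same shape as a simulated monthly series. The record format (ISO date strings) and the function name `to_monthly_series` are assumptions made for this example; the disclosure does not specify the raw data format.

```python
from collections import Counter

def to_monthly_series(transaction_dates, months):
    """Count transactions per month, in the order given by `months`.

    transaction_dates: iterable of "YYYY-MM-DD" strings (assumed format).
    months: list of "YYYY-MM" labels defining the time axis of the series.
    """
    counts = Counter(date[:7] for date in transaction_dates)
    # Months with no transactions are reported as 0 so the series is aligned
    # step by step with the simulation output.
    return [counts.get(month, 0) for month in months]

# Hypothetical raw records and time axis.
raw_dates = ["2018-01-03", "2018-01-20", "2018-03-11"]
axis = ["2018-01", "2018-02", "2018-03"]
validation_series = to_monthly_series(raw_dates, axis)
```

Aligning both series on one explicit time axis is what allows a step-by-step error measure to be computed afterwards.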

Agent-based modeling and simulation (130)

Agent-based modeling and simulation (ABMS) can represent the fine-grained behaviors and interactions of agents that conventional simulation based on system-level modeling cannot express, and it is used as a modeling methodology in a variety of fields such as stock markets, consumer markets, and epidemic spread prediction. In one embodiment of the present disclosure, the agent-based model (ABM) in the ABMS (130) shown in FIG. 1 is modeled on the premise of a multi-agent system, and each agent is assumed to act optimally.

In one embodiment of the present disclosure, the agent micro data (140) expresses the results simulated by the ABMS (130) as micro data for each agent.

In one embodiment of the present disclosure, the agent macro data graph (150) is obtained by aggregating the agent micro data (140), which is the output of the simulation run by the ABMS (130), and plotting it as agent macro data. The X axis of the graph (150) represents time and the Y axis represents the simulation result.
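The aggregation from micro data to macro data can be sketched as follows. This is a minimal illustration that assumes aggregation is a plain sum over agents at each time step; the disclosure does not fix the aggregation function, and the tuple layout and the name `aggregate_micro_to_macro` are hypothetical.

```python
from collections import defaultdict

def aggregate_micro_to_macro(micro_records):
    """Sum a per-agent quantity over all agents for each time step.

    micro_records: iterable of (agent_id, time_step, value) tuples.
    Returns a list of (time_step, total) pairs sorted by time step.
    """
    totals = defaultdict(float)
    for _agent_id, time_step, value in micro_records:
        totals[time_step] += value
    return sorted(totals.items())

# Hypothetical micro data: number of purchases by each agent per month.
micro = [
    ("a1", 0, 1.0), ("a2", 0, 0.0), ("a3", 0, 1.0),
    ("a1", 1, 0.0), ("a2", 1, 1.0), ("a3", 1, 0.0),
]
macro = aggregate_micro_to_macro(micro)
```

The resulting pairs correspond to the points of a time-versus-result curve such as graph (150).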

Simulation error analysis (160)

Next, in one embodiment of the present disclosure, the graph (120) and the graph (150) can be compared to obtain a statistically meaningful error. In one embodiment, the error may be computed as the root mean square error (RMSE) or the mean absolute percentage error (MAPE), but the present disclosure is not limited to these measures.
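The two error measures named above can be computed as in the following sketch. The function names and the sample numbers are illustrative only; both functions assume two series of equal length on the same time axis, and MAPE additionally assumes that no observed value is zero.

```python
import math

def rmse(observed, simulated):
    """Root mean square error between two equal-length series."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))

def mape(observed, simulated):
    """Mean absolute percentage error, in percent (observed values nonzero)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    ratios = [abs((o - s) / o) for o, s in zip(observed, simulated)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical validation series and simulated macro series.
validation = [100.0, 200.0, 400.0]
simulated = [110.0, 180.0, 400.0]
error_rmse = rmse(validation, simulated)
error_mape = mape(validation, simulated)
```

RMSE is expressed in the units of the data and weights large deviations more heavily, while MAPE is scale-free, so the choice depends on whether absolute or relative deviations matter more for the target series.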

Micro cluster analysis (170)

In one embodiment of the present disclosure, the latent representation of each agent is learned with a variational autoencoder (VAE), and cluster analysis is then performed using the latent representation as the new input. In one embodiment, the cluster analysis may use a Dirichlet process mixture model (DPMM), but it is not limited to this.

In one embodiment of the present disclosure, for a housing model with 10,000 agents, each agent has 250-dimensional data, and a 10-dimensional latent representation can be obtained for each of the 10,000 agents. That is, the 250-dimensional data (the agent's savings, income, loans, and so on) can be compressed to 10 dimensions. Next, the 10,000 ten-dimensional vectors are clustered with a clustering algorithm, for example a Dirichlet process mixture model, to determine which cluster each of the 10,000 agents belongs to.
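The clustering step can be illustrated with a deliberately simplified stand-in. The sketch below is not a VAE or a Dirichlet process mixture model: it takes latent vectors as given (as if an encoder had already produced them) and clusters them with DP-means, a hard-assignment algorithm that, like a DPMM, does not fix the number of clusters in advance. The `penalty` value and the two-dimensional sample vectors are assumptions chosen for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(column) / len(points) for column in zip(*points)]

def dp_means(points, penalty, max_iter=100):
    """Cluster points; open a new cluster when a point is far from all centers.

    A point whose squared distance to its nearest center exceeds `penalty`
    becomes the center of a new cluster.
    Returns (labels, centers) with empty clusters removed.
    """
    centers = [mean(points)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, point in enumerate(points):
            dists = [squared_distance(point, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
            if dists[best] > penalty:
                centers.append(list(point))
                best = len(centers) - 1
            if labels[i] != best:
                labels[i] = best
                changed = True
        # Move each non-empty cluster's center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = mean(members)
        if not changed:
            break
    # Drop clusters that ended up empty and renumber the rest from 0.
    used = sorted(set(labels))
    remap = {old: new for new, old in enumerate(used)}
    return [remap[label] for label in labels], [centers[k] for k in used]

# Hypothetical latent vectors for six agents (2-D here instead of 10-D).
latent = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centers = dp_means(latent, penalty=4.0)
```

With these sample vectors the two well-separated groups end up in two clusters; a full implementation would replace both the fixed latent vectors and DP-means with the learned encoder and the mixture model named in the text.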

For example, an individual may buy or sell a house for many reasons. A person may buy a house after considering the overall economic situation, because of personal circumstances such as needing a home at that particular time, or in order to rent it out and receive monthly income. Likewise, a person may sell a house after considering the overall economic situation or because a lump sum of money is needed. In other words, buying and selling a house depends on each individual's investment tendencies. The criteria by which individuals decide on an action can differ, and these individual decision criteria can be grouped into clusters and analyzed as micro parameters.

In one embodiment of the present disclosure, once the cluster analysis has been performed the first time, it need not be performed again during the calibration iterations.

Micro parameter estimation (180)

Next, in one embodiment of the present disclosure, parameters are assigned to the agents belonging to each cluster. Agents in the same cluster are assigned the same parameters. The assigned parameters become the input parameters of the next simulation run.
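The assignment described above amounts to a lookup from agent to cluster to parameter set, as in this small sketch. The agent identifiers, the cluster labels, and the parameter name `purchase_propensity` are hypothetical and serve only to show the mapping.

```python
def assign_parameters(cluster_of_agent, cluster_params):
    """Give every agent the parameter set of the cluster it belongs to."""
    return {
        agent: cluster_params[cluster]
        for agent, cluster in cluster_of_agent.items()
    }

# Hypothetical clustering result and per-cluster micro parameters.
cluster_of_agent = {"a1": 0, "a2": 1, "a3": 0}
cluster_params = {
    0: {"purchase_propensity": 0.2},
    1: {"purchase_propensity": 0.7},
}
agent_params = assign_parameters(cluster_of_agent, cluster_params)
```

Because the number of clusters is much smaller than the number of agents, only a few values are calibrated, which is how the cluster-level definition of micro parameters limits the overfitting risk mentioned earlier.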

In one embodiment of the present disclosure, the micro parameter estimation (180) may use Bayesian optimization with a Gaussian process, but the present invention is not limited to this.

In an embodiment of the present disclosure, the calibration iteration may be repeated until a micro parameter set is found for which the simulation result graph (150) best matches the real data (120). In an embodiment of the present disclosure, when the error obtained by the simulation error analysis (160) is less than or equal to a predetermined reference value, the calibrated micro parameter set may be regarded as the final parameters.
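The stopping rule of the calibration iteration can be sketched as a simple loop. This is only an illustration under stated assumptions: `simulate` and `propose` are hypothetical callables standing in for the agent-based simulation and the parameter estimation step, and the mean absolute error is an assumed error measure that the source does not specify.

```python
from typing import Callable, List, Tuple


def calibrate(
    simulate: Callable[[float], List[float]],
    propose: Callable[[List[Tuple[float, float]]], float],
    real_data: List[float],
    initial_param: float,
    threshold: float,
    max_iterations: int = 50,
) -> Tuple[float, float, int]:
    """Repeat simulate -> compare -> re-estimate until the error is small enough."""
    history: List[Tuple[float, float]] = []
    param = initial_param
    for iteration in range(1, max_iterations + 1):
        result = simulate(param)
        # Assumed error measure: mean absolute error against the real data.
        error = sum(abs(s - r) for s, r in zip(result, real_data)) / len(real_data)
        history.append((param, error))
        if error <= threshold:
            return param, error, iteration  # accepted as the final parameter
        param = propose(history)
    # Budget exhausted: fall back to the best parameter seen so far.
    best_param, best_error = min(history, key=lambda pair: pair[1])
    return best_param, best_error, max_iterations


# Toy usage: the "simulation" scales a fixed profile by the parameter.
real = [0.5, 1.0, 1.5]
final_param, final_error, n_iter = calibrate(
    simulate=lambda p: [p * v for v in (1.0, 2.0, 3.0)],
    propose=lambda history: history[-1][0] - 0.1,  # naive fixed-step proposal
    real_data=real,
    initial_param=1.0,
    threshold=0.05,
)
```

In the disclosed method the proposal step is the micro parameter estimation (180) rather than the naive fixed step used here.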

FIG. 2 is a flowchart illustrating a method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.

First, in step S210, an initial parameter set is set. In its notation, k denotes the index of the micro parameter to be calibrated, and 0 indicates that this parameter set is the initial parameter set. When the simulation is run with the initial parameter set, the micro-level characteristics of each agent take result values over time, where t denotes time and att denotes the characteristic (attribute).

Next, in step S220, the simulation may be run repeatedly with a new parameter set. In an embodiment of the present disclosure, to repeat the simulation with the new parameter set, the parameter set may be replaced by the time-dependent result values of the per-agent micro-level characteristics. Then, in step S230, a latent representation may be obtained through a dimension reduction technique that takes these time-dependent per-agent result values as input. In an embodiment of the present disclosure, a variational autoencoder (VAE), which is a machine learning methodology, may be used as the dimension reduction technique, but the technique is not limited thereto.

Next, in step S240, clusters may be formed using the latent representation obtained for each agent as input. Once the clusters are formed, the cluster to which each agent belongs is known.

In an embodiment of the present disclosure, the clusters may be formed using a Dirichlet process mixture model. The Dirichlet process mixture model is a nonparametric cluster analysis methodology. Given data, it determines which cluster each data point belongs to according to the Chinese restaurant rule (commonly known as the Chinese restaurant process). Under this rule, when a new data point arrives, the probability that it joins an existing cluster is proportional to the number of data points already in that cluster, and the probability that it forms a new cluster is proportional to the model variable γ. The prior distribution generated by the Chinese restaurant rule is multiplied by the likelihood that the new data point belongs to each cluster to obtain the posterior distribution, and the cluster of the data point is then determined by sampling.
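The Chinese restaurant rule and the prior-times-likelihood step described above can be sketched as follows. The function names and the example likelihood values are illustrative assumptions; in a full Dirichlet process mixture model the likelihoods would come from the per-cluster component distributions.

```python
import random
from typing import List


def crp_prior(cluster_sizes: List[int], gamma: float) -> List[float]:
    """Prior assignment probabilities under the Chinese restaurant rule.

    Entry j (for an existing cluster) is proportional to the cluster size;
    the last entry, for a new cluster, is proportional to gamma.
    """
    total = sum(cluster_sizes) + gamma
    return [n / total for n in cluster_sizes] + [gamma / total]


def crp_posterior(
    cluster_sizes: List[int], gamma: float, likelihoods: List[float]
) -> List[float]:
    """Multiply the prior by per-cluster likelihoods and normalise."""
    prior = crp_prior(cluster_sizes, gamma)
    unnormalised = [p * l for p, l in zip(prior, likelihoods)]
    norm = sum(unnormalised)
    return [u / norm for u in unnormalised]


def sample_cluster(posterior: List[float], rng: random.Random) -> int:
    """Draw a cluster index; the last index means 'open a new cluster'."""
    return rng.choices(range(len(posterior)), weights=posterior, k=1)[0]


# Two existing clusters with 3 and 1 members, gamma = 1.
prior = crp_prior([3, 1], gamma=1.0)
posterior = crp_posterior([3, 1], gamma=1.0, likelihoods=[0.1, 0.5, 0.2])
choice = sample_cluster(posterior, random.Random(0))
```

Here the prior favours the larger cluster, but a higher likelihood under the second cluster shifts the posterior toward it.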

In another embodiment of the present disclosure, a Gaussian mixture model, which is a parametric cluster analysis method, may be used. With a Gaussian mixture model, because the number of agent clusters may change when the simulation model changes, an appropriate number of clusters must be specified for each simulation model.

As shown in FIG. 3, described later, agents located close to one another can be seen to belong to the same cluster.

In step S250, a parameter set may be set for each cluster. In its notation, c denotes the cluster, k denotes the index of the micro parameter to be calibrated, and i denotes the number of the parameter calibration iteration.

In step S260, simulation results may be obtained from the parameter set. That is, when the simulation is run with the per-cluster parameter set from step S250 as input, the simulation result used for computing the matching degree can be obtained, where t denotes time.

In step S270, Bayesian optimization may be performed using the simulation matching degree. In an embodiment of the present disclosure, the matching degree is computed by comparing the simulation result with validation data, and Bayesian optimization is then performed on the basis of the matching degree to estimate the calibrated micro parameter set that maximizes it. Here, the validation data is data obtained by preprocessing real data so that error analysis is possible.
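The source does not state how the matching degree is computed. As a purely illustrative assumption, the sketch below defines it as one minus the mean absolute percentage error between a simulated macro series and the validation series, so that a perfect match scores 1.0; the function name is hypothetical.

```python
from typing import Sequence


def matching_degree(simulated: Sequence[float], validation: Sequence[float]) -> float:
    """Assumed metric: 1 - mean absolute percentage error (higher is better)."""
    if len(simulated) != len(validation) or not validation:
        raise ValueError("series must be non-empty and of equal length")
    if any(v == 0 for v in validation):
        raise ValueError("validation values must be non-zero for a percentage error")
    mape = sum(abs(s - v) / abs(v) for s, v in zip(simulated, validation)) / len(validation)
    return 1.0 - mape


# Toy usage: the second simulated value is 10% too high.
score = matching_degree([100.0, 110.0], [100.0, 100.0])
```

Any other measure that increases as the simulation approaches the validation data could be substituted without changing the optimization step that follows.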

In an embodiment of the present disclosure, Bayesian optimization is a methodology that, given a function taking micro parameters as input and returning the matching degree as output, finds the micro parameter value with the maximum matching degree. Given existing data that pairs micro parameter values with their matching degrees, Bayesian optimization fits a Gaussian process to the data to estimate the shape of the function y=f(x) and, based on that shape, estimates the micro parameter value most likely to yield the maximum. The simulation is then run with the estimated micro parameter value to obtain a new matching degree, the new data point is added to the existing data, and the process is repeated.

Given the data, the micro parameter value with the maximum matching degree can be found with the Expected Improvement (EI) acquisition function. For any parameter x, the value of EI is obtained by taking the probability that f(x) exceeds the maximum matching degree currently estimated by the Gaussian process and summing it weighted by the amount of the excess. Bayesian optimization sets the micro parameter x that maximizes the acquisition function as the new parameter.

[Equation 1]
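The procedure described above (fit a Gaussian process to the observed pairs, score candidates with Expected Improvement, evaluate the best candidate, repeat) can be sketched in one dimension. This is a minimal sketch under stated assumptions, not the patented implementation: the zero-mean prior, the RBF kernel and its length scale, the candidate grid, and all function names are illustrative choices, and the closed-form EI used here is the standard textbook expression, which may differ in notation from Equation 1.

```python
import math
from typing import Callable, List, Tuple


def rbf(a: float, b: float, length: float = 0.2) -> float:
    """Squared-exponential kernel (assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def gp_posterior(xs: List[float], ys: List[float], x_new: float,
                 noise: float = 1e-6) -> Tuple[float, float]:
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(w * s for w, s in zip(solve(k, ys), k_star))
    var = rbf(x_new, x_new) - sum(w * s for w, s in zip(solve(k, k_star), k_star))
    return mean, math.sqrt(max(var, 0.0))


def expected_improvement(mean: float, std: float, best: float) -> float:
    """Closed-form EI for maximisation under a Gaussian posterior."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def bayesian_optimize(f: Callable[[float], float], grid: List[float],
                      initial: List[float], iterations: int) -> Tuple[float, float]:
    """Return the best (x, f(x)) found after the given number of BO steps."""
    xs = list(initial)
    ys = [f(x) for x in xs]
    for _ in range(iterations):
        best = max(ys)
        candidates = [x for x in grid if x not in xs]
        if not candidates:
            break
        x_next = max(candidates,
                     key=lambda x: expected_improvement(*gp_posterior(xs, ys, x), best))
        xs.append(x_next)       # add the new data point to the existing data
        ys.append(f(x_next))    # here f stands in for "simulate and score"
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]


# Toy usage: f is a stand-in for the matching degree as a function of one parameter.
grid = [i / 20 for i in range(21)]
best_x, best_y = bayesian_optimize(lambda x: -(x - 0.3) ** 2, grid, [0.0, 1.0], 8)
```

In the calibration setting, each call to `f` would correspond to one simulation run followed by a matching-degree computation, which is why keeping the number of evaluations small matters.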

In step S280, the newly estimated micro parameter set (the calibrated parameter set) is used as input and the above steps are repeated, so that the maximum matching degree can finally be obtained.

FIG. 3 is a diagram visualizing the result of analyzing the latent representations by cluster using a variational autoencoder according to an embodiment of the present disclosure.

Here, the variational autoencoder is a machine learning methodology that obtains a latent representation of data. Latent representations generally have the drawback of not being interpretable, but research has shown that, for large amounts of high-dimensional data, performing cluster analysis on the latent representation is more effective.

The variational autoencoder is an autoencoder that assumes that the prior distribution followed by the latent representations in the latent space is a standard Gaussian distribution. The encoder outputs the mean and variance of the latent representation of an input x, and the latent representation of x is obtained by sampling from the Gaussian distribution with that mean and variance. The reparameterization trick can be used to solve the problem that the sampling step would otherwise make the encoder network non-differentiable.
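The reparameterization trick can be illustrated without a neural network: instead of sampling z directly from N(mean, variance), noise is drawn from a standard Gaussian and then shifted and scaled, so that z becomes a deterministic, differentiable function of the encoder outputs. The function name and the example values below are assumptions for illustration.

```python
import math
import random
from typing import List


def sample_latent(mean: List[float], variance: List[float],
                  rng: random.Random) -> List[float]:
    """z = mean + sqrt(variance) * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, variance)]


rng = random.Random(0)
z = sample_latent([0.5, -1.0], [0.04, 0.25], rng)  # one 2-dimensional latent sample
```

Because the randomness lives entirely in `eps`, gradients with respect to the mean and variance can flow through the sampled value during training.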

The variational autoencoder may be trained by gradient descent using the evidence lower bound (ELBO), which is a lower bound on the log-likelihood of the data.

[Equation 2]
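For reference, the evidence lower bound of a variational autoencoder with a standard Gaussian prior is commonly written as follows. This is the standard textbook form, given here under the assumption that Equation 2 expresses the same bound; the symbols (encoder parameters φ, decoder parameters θ, latent dimension J) are conventional and may differ from the notation of the original equation.

```latex
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right),
  \qquad p(z) = \mathcal{N}(0, I)

% Closed-form KL term for a diagonal Gaussian encoder
% q_\phi(z \mid x) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2)):
D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
  = \frac{1}{2} \sum_{j=1}^{J} \left( \mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1 \right)
```

Maximizing the bound is equivalent to minimizing its negative by gradient descent, which matches the training procedure described above.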

Training yields the latent representations of the data, and plotting them in two dimensions with the t-SNE visualization method gives FIG. 3. As shown in FIG. 3, data in the same cluster have similar latent representation values. FIG. 3 was produced by extracting a 10-dimensional latent representation from 240-dimensional data and then compressing it into two dimensions with t-SNE for visualization.

Simulation model

The agent-based simulation according to an embodiment of the present disclosure may be a housing simulation model. Here, the housing simulation model is a model for analyzing housing market fluctuations caused by policy changes. With the housing simulation model, the price index and liquidity of the housing market can be analyzed from a macro perspective, and the changes in household finances caused by housing policy can be analyzed from a micro perspective.

In an embodiment of the present disclosure, a Korean housing simulation model may be used. For example, Korean housing market indicators and the Korean government's economic policies may be used as the environment and policy variables of the model, and the household finance and welfare data of Statistics Korea may be used as the micro household data.

In an embodiment of the present disclosure, the structure of the housing market model may include agents, interactions, an environment, and houses. In one embodiment, there are three types of model agents: households, external housing suppliers, and real estate brokers. A household uses savings and loan services and decides, as needed, to buy, sell, or rent a house. An external housing supplier is not a household but an actor that encompasses participants directly involved in the housing market, such as the government, companies, and builders, and it is responsible for supplying new housing. A real estate broker receives from households the information on houses offered for sale or lease, registers it in the market, and provides market information when a household requests it.

In one embodiment, the interactions between model agents may include house sale transactions or lease contracts between households or with an external supplier.

In one embodiment, the model environment may include the government's housing policy, the macroeconomic environment, inter-regional population inflow and outflow, the supply of new housing, and the like.

In one embodiment, a model house has characteristics such as market value and housing type, and characteristics such as its owner and resident may change according to household decisions.

In one embodiment, the housing market model may proceed by sequentially repeating an update process, a list process, and a buy process. In the update process, the characteristics of households and houses are updated, and new households are added and new housing is supplied. The list process is the process in which households decide whether to list the houses they own on the market; when listing, they determine the transaction type and the asking price. Listing may be done through a broker. In the buy process, households move in through a purchase or lease contract, or purchase additional houses. In this process, households decide whether to participate in the market and where to live, and choose their preferred housing type. Households receive information on houses that meet their conditions from a real estate broker, and then select and contract a house from among them in consideration of their budget.
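The update, list, and buy cycle can be sketched as a toy loop. Everything below is a hypothetical simplification for illustration: the class and field names, the income-based budget update, the rule that a seller lists at the current price, and the cheapest-affordable-house choice are assumptions, and leases, brokers, and new supply are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class House:
    price: float
    owner: Optional[int] = None  # household id, or None if unowned
    listed: bool = False


@dataclass
class Household:
    hid: int
    budget: float
    income: float
    wants_to_sell: bool = False


def update_process(households: List[Household]) -> None:
    """Update household state; here only income is added to the budget."""
    for h in households:
        h.budget += h.income


def list_process(households: List[Household], houses: List[House]) -> None:
    """Put unowned houses and houses of willing sellers on the market."""
    sellers = {h.hid for h in households if h.wants_to_sell}
    for house in houses:
        if house.owner is None or house.owner in sellers:
            house.listed = True


def buy_process(households: List[Household], houses: List[House]) -> None:
    """Each household without a house buys the cheapest listed house it can afford."""
    by_id = {h.hid: h for h in households}
    owners = {house.owner for house in houses if house.owner is not None}
    for h in households:
        if h.hid in owners:
            continue
        affordable = [x for x in houses if x.listed and x.price <= h.budget]
        if not affordable:
            continue
        chosen = min(affordable, key=lambda x: x.price)
        h.budget -= chosen.price
        if chosen.owner is not None:  # pay the previous owner
            by_id[chosen.owner].budget += chosen.price
            by_id[chosen.owner].wants_to_sell = False
        chosen.owner, chosen.listed = h.hid, False
        owners.add(h.hid)


def run(households: List[Household], houses: List[House], steps: int) -> None:
    """Repeat the three processes in order for the given number of time steps."""
    for _ in range(steps):
        update_process(households)
        list_process(households, houses)
        buy_process(households, houses)


buyer = Household(hid=0, budget=50.0, income=10.0)
seller = Household(hid=1, budget=0.0, income=5.0, wants_to_sell=True)
market = [House(price=55.0, owner=1)]
run([buyer, seller], market, steps=1)
```

After one step the buyer can afford the listed house, so ownership transfers and the price moves from the buyer's budget to the seller's.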

Experimental results

An experiment was conducted by applying 10,000 agents to the above simulation model.

The per-agent characteristics that form the micro results of the simulation are seven in total: region of residence, savings, income, loan amount, type of house occupied, occupancy type of the house occupied, and number of houses owned. Table 1 shows the ten micro household characteristics obtained for each agent after preprocessing the occupancy type with one-hot encoding.

[Table 1] Micro household characteristics generated for each agent
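Going from seven characteristics to ten by one-hot encoding a single characteristic implies that the occupancy type has four categories. The sketch below shows such an encoding; the four category names, the dictionary keys, and the use of numeric codes for region and housing type are hypothetical, since the contents of Table 1 are not reproduced in this text.

```python
from typing import Dict, List, Union

# Hypothetical category names; the source only implies that there are four.
OCCUPANCY_TYPES = ["owned", "jeonse", "monthly_rent", "other"]

# The six characteristics kept as single numeric columns (assumed key names).
NUMERIC_KEYS = ["region", "savings", "income", "loans", "housing_type", "houses_owned"]


def encode_agent(record: Dict[str, Union[float, str]]) -> List[float]:
    """Return 6 numeric columns followed by a 4-way one-hot occupancy vector."""
    if record["occupancy_type"] not in OCCUPANCY_TYPES:
        raise ValueError(f"unknown occupancy type: {record['occupancy_type']!r}")
    one_hot = [1.0 if record["occupancy_type"] == t else 0.0 for t in OCCUPANCY_TYPES]
    numeric = [float(record[k]) for k in NUMERIC_KEYS]
    return numeric + one_hot


features = encode_agent({
    "region": 1, "savings": 3000.0, "income": 400.0, "loans": 1200.0,
    "housing_type": 2, "houses_owned": 0, "occupancy_type": "jeonse",
})
```

The resulting vector has ten entries per agent per time step, matching the count of micro household characteristics stated above.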

Here, the matching degree of the simulation is computed from the macro results, namely the price indices and transaction volumes of apartment sales and of jeonse and monthly rentals in the Seoul metropolitan area. Performing cluster analysis on the simulation micro results, which exist for each agent for every simulation time step, gives the result shown in FIG. 3.
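Because the micro results exist for every agent at every time step, each agent's series has to be turned into one fixed-length vector before dimension reduction. A minimal sketch follows, assuming a time-major flattening; the figure of 24 time steps is an assumption that would be consistent with ten characteristics and the 240-dimensional input mentioned for FIG. 3, but the source does not state the number of steps.

```python
from typing import List


def flatten_agent_series(series: List[List[float]]) -> List[float]:
    """Flatten series[t][a] (time step t, characteristic a) into one vector."""
    if not series:
        raise ValueError("series must contain at least one time step")
    width = len(series[0])
    if any(len(row) != width for row in series):
        raise ValueError("every time step must have the same number of characteristics")
    return [value for row in series for value in row]


# Assumed shape: 24 time steps x 10 characteristics for one agent.
example_series = [[float(t * 10 + a) for a in range(10)] for t in range(24)]
vector = flatten_agent_series(example_series)
```

One such vector per agent would then serve as the input to the dimension reduction step.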

FIG. 4 is a graph of the simulation matching degree obtained with the method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure.

The simulation matching degree results are shown in FIG. 4. Graph (410) of FIG. 4 is the matching degree obtained with a heuristic technique, that is, with an ad-hoc parameter set chosen by a person, and graph (420) is the matching degree obtained as a result of micro parameter calibration. The result in FIG. 4 was obtained by dividing the agents into four clusters and performing micro parameter calibration. The calibrated micro parameter is the willingness to pay, which is the maximum fraction of a household's currently available budget that it will spend on purchasing a house. Calibration may also be performed on a plurality of micro parameters. As shown in FIG. 4, compared with the heuristic technique, micro parameter calibration can improve the matching degree by 4.2% with a fast calibration of about 10 iterations.

FIG. 5 is a graph of the apartment sales transaction volume in the Seoul metropolitan area obtained using parameters calibrated by the method of calibrating micro-simulation parameters using machine learning according to an embodiment of the present disclosure. Graph (510) in FIG. 5 shows the simulation results, and graph (520) shows the validation data.

FIG. 6 is a graph of the apartment sales transaction volume in the Seoul metropolitan area obtained using the existing ad-hoc parameters. Graph (610) in FIG. 6 shows the simulation results, and graph (620) shows the validation data. Comparing FIG. 5 with FIG. 6, when the experiment was run with the existing parameters (ad-hoc parameters tuned by a person, FIG. 6), the simulation results were below the validation data overall, whereas the simulation results obtained by using the parameters calibrated through machine learning as simulation input parameters reproduce the validation data more closely.

The present disclosure presents a methodology for calibrating micro-simulation parameters and uses a plurality of statistical models to secure the reliability of the simulation model. For example, the cluster analysis uses a variational autoencoder, the most widely used of the deep generative models, together with a Dirichlet process mixture model, which is a nonparametric clustering model, and Bayesian optimization is used to obtain the micro parameters that yield the maximum matching degree. However, the present disclosure is not limited thereto, and those skilled in the art will be able to use various modifications.

Although the method has been described through specific embodiments, the method can also be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which data readable by a computer system is stored. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include media implemented in the form of carrier waves (for example, transmission over the Internet). In addition, the computer-readable recording medium may be distributed over computer systems connected by a network, so that computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the above embodiments can be easily inferred by programmers in the technical field to which the present disclosure belongs.

In the embodiments disclosed herein, the arrangement of the illustrated components may vary depending on the environment in which the invention is implemented or on its requirements. For example, some components may be omitted, or several components may be integrated and implemented as one. The arrangement order and connections of some components may also be changed.

Although various embodiments of the present disclosure have been illustrated and described above, the present disclosure is not limited to the specific embodiments described. The embodiments may be modified in various ways by those of ordinary skill in the art to which the invention pertains without departing from the gist of the present disclosure as claimed in the appended claims, and such modified embodiments should not be understood separately from the technical spirit or scope of the present disclosure. Accordingly, the technical scope of the present disclosure should be determined only by the appended claims.

110: Preprocessing of assimilation target data
120: Graph of real data
130: ABM parameters and structure
140: Two-dimensional diagram
150: Simulation result graph
160: Simulation error analysis
170: Micro cluster analysis
180: Micro parameter estimation

Claims

A method of calibrating micro-simulation parameters using machine learning in an agent-based simulation, the method being performed by a computer and comprising:
(a) generating validation data by preprocessing assimilation target data, which is target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format;
(b) performing agent-based modeling and simulation (ABMS) using, as ABM input data, an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data;
(c) performing micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and performing cluster analysis;
(d) analyzing an error by comparing the validation data with the agent macro data;
(e) when the error exceeds a predetermined reference value,
(e-1) calibrating the parameters for each of the at least one cluster so as to reduce the error, and setting a calibrated parameter set for each of the at least one cluster;
(e-2) setting the calibrated initial parameter set for the characteristics of the plurality of agents and the scenario data as ABM input data, performing agent-based modeling and simulation on the ABM input data to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data; and
(e-3) performing steps (d) and (e); and
(f) when the error is less than or equal to the predetermined reference value, determining the calibrated parameters as final parameters.

A method of calibrating micro-simulation parameters using machine learning in an agent-based simulation, the method being performed by a computer and comprising:
(a) generating validation data by preprocessing assimilation target data, which is target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format;
(b) performing agent-based modeling and simulation (ABMS) using, as ABM input data, an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and aggregating the agent micro data to obtain agent macro data;
(c) performing micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and performing cluster analysis;
(d) analyzing an error by comparing the validation data with the agent macro data; and
(e) when the error exceeds a predetermined reference value, estimating micro parameters.

The method of claim 1,
wherein the step of performing the cluster analysis comprises learning a latent representation of each agent through a variational autoencoder (VAE) and then performing cluster analysis using the latent representation as a new input value.

A computer-readable recording medium containing one or more instructions,
wherein the one or more instructions, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 3.

A micro-simulation parameter calibration device using machine learning in an agent-based simulation, the device comprising:
a validation data generation module configured to generate validation data by preprocessing assimilation target data, which is target data to be estimated through simulation and includes characteristic data of a plurality of agents, and converting it into a predetermined data format;
a modeling module configured to perform agent-based modeling and simulation (ABMS) using, as ABM input data, an initial parameter set for the characteristics of the plurality of agents and scenario data for reflecting external variables in the simulation, to obtain agent micro data for each of the plurality of agents, and to aggregate the agent micro data to obtain agent macro data;
a cluster analysis module configured to perform micro cluster analysis on the micro data of each of the plurality of agents to generate at least one cluster and to perform cluster analysis;
an error analysis module configured to analyze an error by comparing the validation data with the agent macro data; and
a micro parameter estimation module configured to calibrate the parameters for each of the at least one cluster so as to reduce the error, and to set a calibrated parameter set for each of the at least one cluster.