KR102698899B1 - Method, system and non-transitory computer-readable recording medium for providing a reward function for reinforcement learning

Info

Publication number: KR102698899B1
Application number: KR1020210159242A
Authority: KR
Inventors: 이경엽; 이일규; 서종관; 김명준
Original assignee: 스페이스워크 주식회사
Priority date: 2021-11-18
Filing date: 2021-11-18
Publication date: 2024-08-27
Also published as: KR20230072717A

Abstract

The present invention relates to a method, a system, and a non-transitory computer-readable recording medium for providing a reward function for reinforcement learning.
According to one aspect of the present invention, there is provided a method for providing a reward function for reinforcement learning, comprising: calculating a quantitative value regarding the site of an architectural design plan generated by a reinforcement learning model and a qualitative value regarding the shape of the architectural design plan, respectively; and determining a terminal reward, calculated with reference to the quantitative value and the qualitative value, as a reward function for evaluating the architectural design plan.

Description

METHOD, SYSTEM AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM FOR PROVIDING A REWARD FUNCTION FOR REINFORCEMENT LEARNING

The present invention relates to a method, a system, and a non-transitory computer-readable recording medium for providing a reward function for reinforcement learning.

As society develops, the demands placed on architecture keep growing: not only the function, structure, and beauty of buildings, but also economic efficiency, public value, and environmental considerations. To resolve these varied requirements optimally, architects need specialized knowledge and seasoned experience, which in turn increases the cost and time required to train experts.

Artificial intelligence technology can be used to compensate for this problem and to derive logical, systematic design plans quickly. In the era of the Fourth Industrial Revolution, artificial intelligence has become a core technology in many fields because it solves a given problem by finding patterns through learning from given data rather than relying on explicit coding.

Architectural problems likewise call for automating the process of deriving an optimal design through computer algorithms. However, because the architectural environment has a high degree of freedom and complexity, no technology at a commercially viable level exists yet, despite strong market demand.

The prior art merely provides a guide model, built from actual measurement data on an electronic map, so that a building satisfying the regulations can be modeled easily. It is limited in that the building itself is not designed through an automated design process.

Accordingly, the present inventors propose a reinforcement learning-based architectural design method that derives an architectural design plan through an automated process.

Korean Laid-open Patent Publication No. 10-2021-0062876 (June 1, 2021) discloses a method of providing a service that supports the planning and design of buildings.

An object of the present invention is to solve all of the above-mentioned problems of the prior art.

Another object of the present invention is to derive an optimal architectural design plan through the processing and parameterization of architectural design information, by providing a reward function for reinforcement learning.

A representative configuration of the present invention for achieving the above objects is as follows.

According to one aspect of the present invention, there is provided a method for providing a reward function for reinforcement learning, comprising: calculating a quantitative value regarding the site of an architectural design plan generated by a reinforcement learning model and a qualitative value regarding the shape of the architectural design plan, respectively; and determining a terminal reward, calculated with reference to the quantitative value and the qualitative value, as a reward function for evaluating the architectural design plan.
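The two steps of this aspect can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class `DesignEvaluation`, the weighted-sum combination, the weights, the [0, 1] score ranges, and the choice of zero reward at non-terminal steps are all assumptions made only for illustration, since the text states merely that the terminal reward is calculated "with reference to" the two values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DesignEvaluation:
    """Scores for one finished design plan (both assumed to lie in [0, 1])."""
    quantitative_value: float  # site-related score (hypothetical scale)
    qualitative_value: float   # shape-related score (hypothetical scale)


def terminal_reward(ev: DesignEvaluation,
                    w_quant: float = 0.5,
                    w_qual: float = 0.5) -> float:
    """Assumed combination rule: a weighted sum of the two values."""
    return w_quant * ev.quantitative_value + w_qual * ev.qualitative_value


def episode_rewards(num_steps: int, ev: DesignEvaluation) -> List[float]:
    """Assumed reward schedule: zero until the design is complete,
    then the terminal reward at the final step."""
    if num_steps < 1:
        raise ValueError("an episode needs at least one step")
    return [0.0] * (num_steps - 1) + [terminal_reward(ev)]
```

For example, `episode_rewards(4, DesignEvaluation(1.0, 0.5))` yields three zero rewards followed by the terminal reward for the completed plan.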

According to another aspect of the present invention, there is provided a system for providing a reward function for reinforcement learning, comprising an information processing unit that calculates a quantitative value regarding the site of an architectural design plan generated by a reinforcement learning model and a qualitative value regarding the shape of the architectural design plan, respectively, and determines a terminal reward, calculated with reference to the quantitative value and the qualitative value, as a reward function for evaluating the architectural design plan.

According to yet another aspect of the present invention, there is provided a method for providing a function for architectural design, comprising: calculating a quantitative value regarding the site of an architectural design plan generated by an architectural design model and a qualitative value regarding the shape of the architectural design plan, respectively; and determining a plurality of parameters for deriving an architectural design plan by maximizing an objective function or a fitness function corresponding to a terminal reward calculated with reference to the quantitative value and the qualitative value.

According to yet another aspect of the present invention, there is provided a system for providing a function for architectural design, comprising an information processing unit that calculates a quantitative value regarding the site of an architectural design plan generated by an architectural design model and a qualitative value regarding the shape of the architectural design plan, respectively, and determines a plurality of parameters for deriving an architectural design plan by maximizing an objective function or a fitness function corresponding to a terminal reward calculated with reference to the quantitative value and the qualitative value.

In addition, there are further provided other methods and other systems for implementing the present invention, as well as a non-transitory computer-readable recording medium on which a computer program for executing the methods is recorded.

According to the present invention, an optimal architectural design plan can be derived by an automated, reinforcement learning-based process that uses the processing and parameterization of architectural design information.

In addition, the present invention uses a reinforcement learning model trained on an undetermined set of states and set of actions, which are specified by determining a search space based on a plurality of parameters corresponding to the processed architectural design information. Deep reinforcement learning can therefore be performed with only the basic data needed for building, without separately collecting data on example buildings.

FIG. 1 is a diagram schematically illustrating the configuration of an entire system for architectural design according to one embodiment of the present invention.
FIG. 2 is a diagram exemplarily showing the internal configuration of an architectural design system according to one embodiment of the present invention.
FIG. 3 is a diagram schematically illustrating a reinforcement learning algorithm according to one embodiment of the present invention.
FIGS. 4 and 5 are diagrams exemplarily showing a process of deriving an architectural design plan through an architectural design system according to one embodiment of the present invention.

The detailed description of the present invention set forth below refers to the accompanying drawings, which illustrate, by way of example, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the present invention, while different from each other, are not necessarily mutually exclusive. For example, specific shapes, structures, and characteristics described herein may be modified from one embodiment to another without departing from the spirit and scope of the invention. It should also be understood that the positions or arrangements of individual components within each embodiment may be changed without departing from the spirit and scope of the invention. Accordingly, the detailed description set forth below is not to be taken in a limiting sense, and the scope of the present invention is to be taken to encompass the scope of the claims and all equivalents thereof. Like reference numerals in the drawings denote the same or similar elements throughout the several aspects.

Hereinafter, various preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings so that a person having ordinary skill in the art to which the present invention pertains can easily practice the present invention.

Configuration of the entire system

FIG. 1 is a diagram schematically illustrating the configuration of an entire system for architectural design according to one embodiment of the present invention.

As illustrated in FIG. 1, the entire system according to one embodiment of the present invention may include a communication network (100), a device (200), and an architectural design system (300).

First, according to one embodiment of the present invention, the communication network (100) may be configured regardless of communication mode, whether wired or wireless, and may be composed of various networks such as a local area network (LAN), a metropolitan area network (MAN), and a wide area network (WAN). Preferably, the communication network (100) referred to in this specification may be the well-known Internet or World Wide Web (WWW). However, the communication network (100) need not be limited thereto and may include, at least in part, a well-known wired or wireless data communication network, a well-known telephone network, or a well-known wired or wireless television network.

For example, the communication network (100) may be a wireless data communication network that implements, at least in part, a conventional communication method such as WiFi communication, WiFi-Direct communication, Long Term Evolution (LTE) communication, Bluetooth communication (including Bluetooth Low Energy (BLE)), infrared communication, or ultrasonic communication. As another example, the communication network (100) may be an optical communication network that implements, at least in part, a conventional communication method such as LiFi (Light Fidelity).

Next, according to one embodiment of the present invention, the device (200) is a digital device that includes a function for connecting to, and then communicating with, the architectural design system (300) described later via the communication network (100). Any portable digital device that has memory means and a microprocessor providing computing capability, such as a computer, a laptop, a smartphone, or a tablet PC, may be adopted as the user device (200) according to the present invention.

Meanwhile, the device (200) according to one embodiment of the present invention may include an application that supports architectural design according to the present invention. Such an application may be downloaded from an external application distribution server (not shown). The nature of this program module may be generally similar to that of the information preprocessing unit (310), the information processing unit (320), the communication unit (330), and the control unit (340) of the architectural design system (300) described later. Here, at least a part of the application may, as necessary, be replaced with a hardware device or firmware device capable of performing substantially the same or an equivalent function.

Next, the architectural design system (300) according to one embodiment of the present invention may process architectural design information to suit the architectural environment, and may input learning data, which includes a plurality of parameters corresponding to the processed architectural design information, into a reinforcement learning model to determine a plurality of parameters for deriving an architectural design plan. The reinforcement learning model may be trained on an undetermined set of states and set of actions, which are specified by determining a search space based on the plurality of parameters corresponding to the processed architectural design information.

In addition, the architectural design system (300) according to one embodiment of the present invention may provide the generated architectural design plan to the device (200) via the communication network (100).

The configuration and functions of the architectural design system (300) according to the present invention are described in detail below. Although the architectural design system (300) has been described as above, this description is exemplary, and it is obvious to those skilled in the art that at least some of the functions or components required for the architectural design system (300) may, as needed, be realized within the device (200) or a server (not shown), or be included in an external system (not shown).

Configuration of the architectural design system

The following describes the internal configuration of the architectural design system (300), which performs functions important to implementing the present invention, and the function of each of its components.

FIG. 2 is a diagram exemplarily showing the internal configuration of an architectural design system according to one embodiment of the present invention.

As illustrated in FIG. 2, the architectural design system (300) according to one embodiment of the present invention may include an information preprocessing unit (310), an information processing unit (320), a communication unit (330), and a control unit (340). According to one embodiment of the present invention, at least some of the information preprocessing unit (310), the information processing unit (320), the communication unit (330), and the control unit (340) may be program modules that communicate with an external system (not shown). These program modules may be included in the architectural design system (300) in the form of an operating system, application program modules, and other program modules, and may be physically stored on various well-known storage devices. These program modules may also be stored in a remote storage device capable of communicating with the architectural design system (300). Meanwhile, these program modules encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform specific tasks described later or implement specific abstract data types according to the present invention.

First, according to one embodiment of the present invention, the information preprocessing unit (310) may process architectural design information to suit the architectural environment.

According to one embodiment of the present invention, the architectural design information may include architectural geometric information and building regulation information regarding the architectural environment.

According to one embodiment of the present invention, the architectural geometric information may include information on at least one of the dimensions, shape, or relative position associated with the architectural environment, and may be expressed as information on at least one of a point, a line, a surface, a figure, or a space. The architectural geometric information may include information on at least one of the location, lot number, land category, and area of the land (for example, parcel information data and cadastral map data).

In addition, according to one embodiment of the present invention, the building regulation information may include information on the regulations that apply when a building is constructed on the building site in question. The building regulation information may include standards regarding at least one of the site, structure, equipment, and use of a building. For example, it may include information on the Building Act, the Parking Lot Act, the Housing Act, district unit plans, and local government ordinances.

According to one embodiment of the present invention, the information preprocessing unit (310) may refer to a server providing an open API (open application program interface), a publicly known database, or the like in order to acquire architectural design information including the architectural geometric information and the building regulation information. For example, by analyzing data acquired from the databases of various public institution sites, such as the National Spatial Information System (nsdi.go.kr), it may acquire information on buildings, regulations, finance, land, roads, the environment, and so on.

According to one embodiment of the present invention, the information preprocessing unit (310) may generate the processed architectural design information based on a regulation generation algorithm. The regulation generation algorithm may generate the processed architectural design information by considering not only the architectural geometric information and the building regulation information, but also environmental information for the area in which the building site is located (for example, the solar elevation angle by time of day, the range of view as affected by surrounding buildings and other surroundings, climate, and weather).

Specifically, according to one embodiment of the present invention, the information preprocessing unit (310) may generate parcel boundary line information from cadastral map data based on the regulation generation algorithm. The information preprocessing unit (310) may then generate site boundary line information based on the parcel boundary line information and Building Act data (for example, building line setbacks and corner cut-offs). Next, it may generate site regulation line information based on the site boundary line information, the district unit plan (for example, the required open-space clearance within the site), and local government ordinances (for example, the clearance from the building limit line). Finally, it may generate floor-by-floor regulation line information based on the site regulation line information and environmental information (for example, the slant-line restriction for securing the right to sunlight).
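The chain described above (parcel boundary, then site boundary, then site regulation line, then floor-by-floor regulation lines) can be sketched in Python under strong simplifying assumptions. The sketch treats the parcel as an axis-aligned rectangle with north along +y, models every setback as a uniform inward offset, and models the sunlight restriction as an extra northern setback that grows linearly above a free height. The names `Rect`, `inset`, and `floor_regulation_lines`, and all parameters, are hypothetical; actual regulation values and polygon geometry are not taken from the source.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; +y is assumed to point north."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def _validated(r: Rect) -> Rect:
    # Reject results where the offsets leave no buildable area.
    if r.xmin >= r.xmax or r.ymin >= r.ymax:
        raise ValueError("offsets leave no buildable area")
    return r


def inset(r: Rect, d: float) -> Rect:
    """Shrink the rectangle by distance d on every side."""
    return _validated(Rect(r.xmin + d, r.ymin + d, r.xmax - d, r.ymax - d))


def floor_regulation_lines(parcel: Rect, line_setback: float, open_space: float,
                           floors: int, floor_height: float,
                           free_height: float, slope: float) -> List[Rect]:
    """Return one buildable rectangle per floor (index 0 is the first floor)."""
    site_boundary = inset(parcel, line_setback)          # Building Act setbacks
    site_regulation = inset(site_boundary, open_space)   # plan / ordinance clearances
    lines = []
    for k in range(1, floors + 1):
        top = k * floor_height
        # Assumed sunlight rule: extra northern setback above free_height.
        extra = max(0.0, top - free_height) * slope
        lines.append(_validated(
            site_regulation._replace(ymax=site_regulation.ymax - extra)))
    return lines
```

With a 20 x 30 parcel, two 1-unit offsets, 3-unit floor heights, a free height of 6, and a slope of 0.5, the first two floors share the site regulation rectangle and the third floor is pulled back 1.5 units from the north edge.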

In addition, the processed architectural design information of the present invention may include architectural geometric information that satisfies the standards contained in the building regulation information. For example, the processed architectural design information may include information on the surrounding parcel lines, parcel boundary lines, site regulation lines, and floor-by-floor regulation lines appropriate to the architectural environment.

Next, according to one embodiment of the present invention, the information processing unit (320) may determine a plurality of parameters for deriving an architectural design plan by inputting learning data, which includes a plurality of parameters corresponding to the architectural design information processed by the information preprocessing unit (310), into a reinforcement learning model.

According to one embodiment of the present invention, the plurality of parameters corresponding to the processed architectural design information may be estimated so as to restrict, on a geometric basis, the space available to the architectural elements. The plurality of parameters are parameters corresponding to the processed architectural design information, and their weights may be updated through the reinforcement learning model. The plurality of parameters may also include parameters for estimating the architectural elements appropriate to the architectural environment. The architectural elements may include a design outline and a floor-by-floor outline. The design outline may include information on site area, building area, building use, building scale, excluded area, road setback area, building structure, building height, total floor area, building coverage ratio, floor area ratio, parking facilities, septic tank facilities, landscaping area, and the like. The floor-by-floor outline indicates the use and area of each floor and may include information on neighborhood living facilities, single-family houses, multi-family houses, multi-household houses, and the like. The design outline and floor-by-floor outline described above are merely examples, and the present invention is not limited thereto.
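Two of the design-outline quantities listed above have standard definitions: the building coverage ratio is the building area divided by the site area, and the floor area ratio is the total floor area divided by the site area. The Python sketch below computes both and checks them against limits. The function names and the limit values in the usage note are hypothetical, and the sketch ignores the statutory exclusions (for example, areas that a given code does not count toward total floor area).

```python
from typing import Sequence


def building_coverage_ratio(building_area: float, site_area: float) -> float:
    """Building area (footprint) divided by site area."""
    if site_area <= 0:
        raise ValueError("site_area must be positive")
    return building_area / site_area


def floor_area_ratio(floor_areas: Sequence[float], site_area: float) -> float:
    """Sum of the counted floor areas divided by site area (no exclusions)."""
    if site_area <= 0:
        raise ValueError("site_area must be positive")
    return sum(floor_areas) / site_area


def within_limits(building_area: float, floor_areas: Sequence[float],
                  site_area: float, max_bcr: float, max_far: float) -> bool:
    """True when both ratios respect the (hypothetical) limits."""
    return (building_coverage_ratio(building_area, site_area) <= max_bcr
            and floor_area_ratio(floor_areas, site_area) <= max_far)
```

For a 200 m² site with a 100 m² footprint and floors of 100, 100, and 80 m², the coverage ratio is 0.5 and the floor area ratio is 1.4, so `within_limits(100, [100, 100, 80], 200, 0.6, 2.0)` holds under those illustrative limits.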

Specifically, an architectural element according to one embodiment of the present invention may include at least one of a mass, a core, a corridor, a unit, a room, a balcony, and a parking lot. For example, the mass may refer to a building area or a site area, the core may refer to a building or spatial structure, the unit may refer to a partition of space, and the room may refer to a separated space within a unit. The architectural elements described above are merely examples, and the present invention is not limited thereto.

According to one embodiment of the present invention, the plurality of parameters corresponding to the processed architectural design information may be estimated based on the correlation between architectural elements. According to one embodiment of the present invention, the plurality of parameters may be estimated based on the temporal correlation between architectural elements. Because the architectural elements may differ with the type of building, the correlation between architectural elements may differ as well. The correlation between architectural elements may also be based on a designer's expertise and empirical knowledge.

Specifically, according to one embodiment of the present invention, the plurality of parameters corresponding to the processed architectural design information are associated with the set of actions for generating architectural elements, and the state set of a second architectural element generation step, which follows a first architectural element generation step, may include the state set of the first architectural element generation step.
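The nesting of state sets across generation steps can be shown with a short Python sketch. The generation order in `ELEMENT_ORDER`, the representation of a state as a dictionary, and the representation of an action as a tuple of parameters are assumptions for illustration only; the source does not fix them.

```python
from typing import Dict, List, Tuple

# Hypothetical generation order for a subset of the architectural elements.
ELEMENT_ORDER = ("mass", "core", "corridor", "unit")

Action = Tuple[float, ...]
State = Dict[str, Action]


def generate_stepwise(actions: Dict[str, Action]) -> List[State]:
    """Apply one action per element, in order, and return the state after
    each step. Every later state contains all entries of the earlier ones."""
    state: State = {}
    history: List[State] = []
    for element in ELEMENT_ORDER:
        # Build a new dict so that earlier states in `history` stay unchanged.
        state = {**state, element: actions[element]}
        history.append(state)
    return history
```

After the second step, the state holds both the mass and the core parameters, so an agent choosing the corridor acts with the earlier decisions in view.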

According to one embodiment of the present invention, the information processing unit (320) may restrict the search space of the architectural problem on a geometric basis in order to train the reinforcement learning model. For example, the information processing unit (320) may use geometry to restrict the physical space in which architectural elements can be placed, based on the plurality of parameters corresponding to the processed architectural design information.

Specifically, according to one embodiment of the present invention, the information processing unit (320) may map the search space determined for reinforcement learning to integers or rational numbers. That is, the information processing unit (320) may perform an embedding that represents the infinite set of the abstract architectural design space as a subset that can be expressed numerically. In addition, the information processing unit (320) may generate an action set (action space) for reinforcement learning by embedding the architectural elements.
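The mapping of a continuous placement region to rational numbers can be illustrated with a minimal sketch. The function name `build_action_space`, the rectangular region, and the uniform grid step are all assumptions made for illustration; the specification does not state how the embedding is actually constructed.

```python
from fractions import Fraction
from itertools import product


def build_action_space(x_min, x_max, y_min, y_max, step):
    """Enumerate candidate positions on a rational grid inside a rectangle.

    Hypothetical helper: a continuous (infinite) placement region is replaced
    by a finite set of rational coordinates that an agent can choose from.
    """
    step = Fraction(step)
    nx = int((Fraction(x_max) - Fraction(x_min)) / step) + 1
    ny = int((Fraction(y_max) - Fraction(y_min)) / step) + 1
    xs = [Fraction(x_min) + i * step for i in range(nx)]
    ys = [Fraction(y_min) + j * step for j in range(ny)]
    return list(product(xs, ys))


# A 2 x 1 region sampled every 1/2 unit (illustrative numbers only).
actions = build_action_space(0, 2, 0, 1, Fraction(1, 2))
```

A finer step enlarges the action set, so the step size trades resolution of the design against the size of the problem the agent has to learn.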

In addition, according to one embodiment of the present invention, the information processing unit (320) may train a reinforcement learning model, and the reinforcement learning model may be trained on an undetermined state set and action set that are specified by determining the search space based on the plurality of parameters corresponding to the processed architectural design information. The reinforcement learning model is described in detail below with reference to FIG. 3.

Meanwhile, according to one embodiment of the present invention, the information processing unit (320) may use an architectural environment parameterization algorithm to set the plurality of parameters corresponding to the processed architectural design information. The architectural environment parameterization algorithm may use a metaheuristic algorithm, which may include at least one of a genetic algorithm (GA), simulated annealing (SA), and tabu search (TS).

Meanwhile, according to another embodiment of the present invention, the architectural design system (300) may perform architectural design using a learning algorithm other than reinforcement learning. For example, the architectural design system (300) may perform architectural design using an evolution strategy algorithm or a genetic algorithm and, in the process, set the plurality of parameters based on an objective function or a fitness function.

Specifically, the architectural design system (300) according to another embodiment of the present invention may use the objective function or the fitness function to calculate a terminal reward value for a final architectural design plan (which includes the plurality of parameters) and determine an optimal architectural design plan based on that terminal reward value. The objective function or fitness function used here corresponds to the reward function described above, and the terminal reward value may be determined by maximizing the objective function or the fitness function.
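As a rough illustration of maximizing a fitness function over a parameter vector, the following sketch uses a simple elitist, evolution-strategy-style search. The function `evolve`, the Gaussian mutation, and the toy fitness function are assumptions for illustration only; the specification does not describe the concrete evolutionary procedure or the form of the fitness function.

```python
import random


def evolve(fitness, initial, sigma=0.1, offspring=8, generations=50, seed=0):
    """Elitist search: keep the best parameter vector, mutate it, and accept
    a child only when its fitness is strictly higher (hypothetical sketch)."""
    rng = random.Random(seed)
    best, best_fit = list(initial), fitness(initial)
    for _ in range(generations):
        for _ in range(offspring):
            # Gaussian mutation of every parameter of the current best design.
            child = [p + rng.gauss(0.0, sigma) for p in best]
            child_fit = fitness(child)
            if child_fit > best_fit:
                best, best_fit = child, child_fit
    return best, best_fit


def toy_fitness(params):
    """Stand-in for the terminal reward of a design; peaks at all-ones."""
    return -sum((p - 1.0) ** 2 for p in params)


best_params, best_value = evolve(toy_fitness, [0.0, 0.0])
```

Because a child replaces the current best only when it scores higher, the returned fitness can never fall below that of the initial parameters.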

Next, the communication unit (330) according to one embodiment of the present invention may perform a function that enables data to be transmitted to and received from the information refinement unit (310) and the information processing unit (320).

Finally, the control unit (340) according to one embodiment of the present invention may control the flow of data among the information refinement unit (310), the information processing unit (320), and the communication unit (330). That is, the control unit (340) may control the flow of data to and from the outside of the architectural design system (300), or the flow of data between the components of the architectural design system (300), so that the information refinement unit (310), the information processing unit (320), and the communication unit (330) each perform their own functions.

FIG. 3 is a diagram schematically illustrating a reinforcement learning algorithm according to one embodiment of the present invention.

According to one embodiment of the present invention, the information processing unit (320) may use a reinforcement learning method to train a reinforcement learning model on the undetermined state set and action set, and the reinforcement learning model may include an artificial neural network.

Reinforcement learning is a method in which an artificial neural network model selects actions and is trained based on the rewards given for the selected actions. The reward given to the artificial neural network model during reinforcement learning may be a reward accumulated over the results of several actions. Through training, reinforcement learning produces an artificial neural network model that maximizes the reward, or return, taking into account the various states and the rewards that follow from actions.

According to one embodiment of the present invention, the term reinforcement learning model may be used interchangeably with agent, the entity that decides actions. The term environment may be used as the counterpart of the agent.

According to one embodiment of the present invention, the environment (420) may provide at least one agent (410) with the state at a specific time, on which the agent (410) can base its choice of action. The agent (410) may then determine an action based on the state obtained from the environment (420). When the agent (410) passes the determined action to the environment (420), the agent (410) may receive from the environment (420) a reward (430) based on that action, together with the next state. Through this interaction, the agent (410) learns a policy that maximizes the accumulated reward (430) in the given environment (420). A policy may mean the set of probabilities that the agent (410) takes a specific action in a specific state.
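The interaction between agent and environment can be sketched as a plain loop. The class `ToyDesignEnv`, its three-step episode, and the random policy are hypothetical stand-ins chosen for illustration; they are not the environment or the policy of the specification. The sketch only shows the exchange of state, action, and reward, with the reward given once at the end of the episode.

```python
import random


class ToyDesignEnv:
    """Hypothetical environment: the state is the tuple of actions so far."""

    def __init__(self, steps=3):
        self.steps = steps
        self.state = []

    def reset(self):
        self.state = []
        return tuple(self.state)

    def step(self, action):
        # Each new state contains everything decided in earlier steps.
        self.state.append(action)
        done = len(self.state) == self.steps
        # Reward is withheld until the episode ends (terminal reward only).
        reward = float(sum(self.state)) if done else 0.0
        return tuple(self.state), reward, done


def run_episode(env, policy):
    """Run one episode and return the final state and the accumulated reward."""
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(state)                   # agent decides from the state
        state, reward, done = env.step(action)   # environment responds
        total += reward
    return state, total


rng = random.Random(0)
final_state, episode_return = run_episode(
    ToyDesignEnv(), lambda state: rng.choice([0, 1, 2])
)
```

A learning agent would replace the random policy with one whose action probabilities are adjusted to increase the accumulated reward.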

Specifically, according to one embodiment of the present invention, the environment (420) may include a state set corresponding to the architectural design information. The state set may be specified when the agent (410) sets the parameters corresponding to the processed architectural design information. The parameters may also correspond to the action set for generating architectural elements.

In addition, according to one embodiment of the present invention, in the environment (420) the state set of a second architectural element generation step, performed after a first architectural element generation step, may include the state set of the first architectural element generation step. For example, based on the experience replay memory of an architectural element generation episode, the environment (420) may include, at each later step, the state set accumulated up to that step. The agent (410) may determine its action based on a state that contains the states up to the previous point in time. In this way, deep reinforcement learning can be performed using only the basic data required for architecture, without collecting additional data on example buildings.

In addition, according to one embodiment of the present invention, the reinforcement learning model may be trained according to a policy-based learning algorithm, a value-based learning algorithm, or an actor-critic algorithm. A value-based learning algorithm learns by choosing, in each state, the action with the highest value according to a value function; DQN (Deep Q-Network) is one example. A policy-based learning algorithm learns by choosing actions based on the final return and a policy function, without a value function; policy gradient methods are one example. An actor-critic algorithm learns in such a way that the policy function chooses an action and the value function evaluates it.

According to one embodiment of the present invention, the reinforcement learning model may be trained with those states in the undetermined state set that do not correspond to the processed architectural design information replaced by default values. Specifically, when the state set is specified based on the plurality of parameters corresponding to the processed architectural design information, the reinforcement learning model may be trained with every state in the undetermined state set other than the specified states replaced by a default value.

According to one embodiment of the present invention, a mask may be applied to those actions in the action set that were not applied to the architectural environment, so that updates for the masked actions are blocked during training. Specifically, updating for actions that were not applied to the environment (420) can act as noise in training. Applying a stop gradient to such actions, based on a mask generated in advance, therefore allows the parameters to be updated selectively.
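The effect of the mask can be shown with a minimal sketch that assumes one parameter and one gradient entry per action. Multiplying the gradient by a 0/1 mask is used here as a stand-in for the stop-gradient operator of an automatic differentiation framework; the function `masked_update` and the numbers are hypothetical.

```python
def masked_update(params, grads, mask, lr=0.1):
    """Gradient-ascent step that skips every entry whose mask value is 0.

    Assumption: zeroing the gradient with a precomputed mask has the same
    effect on the update as applying a stop gradient to the unused actions.
    """
    return [p + lr * g * m for p, g, m in zip(params, grads, mask)]


params = [0.5, -0.2, 1.0]
grads = [1.0, 1.0, 1.0]
mask = [1, 0, 1]  # the second action was not applied to the environment

updated = masked_update(params, grads, mask)
```

Only the first and third parameters move; the parameter tied to the unused action keeps its previous value.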

According to one embodiment of the present invention, the reinforcement learning model may be trained based on a reward function determined for evaluating architectural design plans. The reward function may include a terminal reward value calculated with reference to a quantitative value regarding the site or building area of the architectural design plan generated by the reinforcement learning model and a qualitative value regarding the shape of the architectural design plan. The terminal reward value may be computed from a combination (for example, an algebraic operation) of the quantitative value and the qualitative value calculated for each architectural design plan.

According to one embodiment of the present invention, the quantitative value may be computed based on the price per unit of site or building area and the actual usable area of each dwelling unit. Specifically, the quantitative value may be computed by multiplying the price per unit of site or building area by the actual usable area of each dwelling unit and summing the products over all dwelling units.
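The multiply-and-sum computation can be written as a short sketch. The function name, the pairing of a price with each dwelling unit, and the sample numbers are assumptions for illustration; currency and area units are left abstract.

```python
def quantitative_value(units):
    """Sum of (price per unit area) x (actual usable area) over dwelling units.

    `units` is a list of (price_per_area, usable_area) pairs, one per unit.
    """
    return sum(price * area for price, area in units)


# Two hypothetical dwelling units: (price per unit area, usable area).
units = [(10.0, 30.0), (12.0, 25.0)]
value = quantitative_value(units)
```

With these sample numbers each unit contributes 300, so the quantitative value is 600.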

According to one embodiment of the present invention, the qualitative value may be computed based on penalties defined by preset criteria, and more specifically on quantified penalties. A quantified penalty may be computed using a negative coefficient preset for each architectural and design element, and the better a design meets the preset criteria, the smaller the absolute value of the qualitative value may become. The negative coefficients may be preset based on the experience and knowledge of architectural design experts.

According to one embodiment of the present invention, the penalties for the preset criteria may include at least one of a unit shape penalty (unit penalty), a mass shape penalty (mass penalty), and a penalty for noncompliance with regulations (constraint penalty).

According to one embodiment of the present invention, the unit shape penalty may be computed based on at least one of a space partition penalty for making the shape of a dwelling unit square or rectangular, an aspect ratio penalty for keeping the unit shape balanced, and an openness penalty for maximizing the space of the unit shape. The unit shape penalty relates to how space is partitioned and may be computed based on a criterion that allows as many dwelling units as possible to fit within a limited space. The unit shape penalty may be computed as the sum of the space partition penalty, the aspect ratio penalty, and the openness penalty.

According to one embodiment of the present invention, the mass shape penalty may be computed based on the difference in area between floors, in order to align the upper and lower floors of the building. For example, the upper and lower floors being aligned may mean that the area of the lower floor is larger than the area of the upper floor.
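One possible reading of this penalty is sketched below: a floor is penalized only by the amount its area exceeds the area of the floor directly beneath it. The function `mass_penalty`, the default coefficient of -1.0, and the one-sided form of the difference are assumptions; the specification only states that the penalty is based on the difference in floor areas.

```python
def mass_penalty(floor_areas, coeff=-1.0):
    """Penalty for upper floors that are larger than the floor below them.

    `floor_areas` is ordered from the lowest floor upward. `coeff` is a
    preset negative coefficient (hypothetical value).
    """
    overhang = sum(
        max(0.0, upper - lower)
        for lower, upper in zip(floor_areas, floor_areas[1:])
    )
    return coeff * overhang


aligned = mass_penalty([100.0, 90.0, 80.0])       # floors shrink upward
misaligned = mass_penalty([100.0, 120.0, 110.0])  # second floor overhangs
```

The aligned building receives no penalty, while the overhanging second floor is penalized in proportion to its 20 units of excess area.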

According to one embodiment of the present invention, the penalty for noncompliance with regulations may be computed based on at least one of a building coverage ratio penalty, a floor area ratio penalty, a gross floor area penalty, and a parking space count penalty. For example, it may be computed as the sum of these four penalties.

Specifically, according to one embodiment of the present invention, the building coverage ratio penalty may be computed based on the difference between the statutory building coverage ratio and the building coverage ratio of the designed building. For example, if the building coverage ratio of the designed building is greater than the statutory ratio, the penalty may be computed by multiplying that difference by a preset negative coefficient for the building coverage ratio. If the building coverage ratio of the designed building is smaller than the statutory ratio, the design complies with the law, so the building coverage ratio penalty is 0 and does not contribute to the penalty for noncompliance with regulations.

In addition, according to one embodiment of the present invention, the floor area ratio penalty may be computed based on the difference between the statutory floor area ratio and the floor area ratio of the designed building. For example, if the floor area ratio of the designed building is greater than the statutory ratio, the penalty may be computed by multiplying that difference by a preset negative coefficient for the floor area ratio. If the floor area ratio of the designed building is smaller than the statutory ratio, the design complies with the law, so the floor area ratio penalty is 0 and does not contribute to the penalty for noncompliance with regulations.

In addition, according to one embodiment of the present invention, the gross floor area penalty may be computed based on the difference between the statutory gross floor area and the gross floor area of the designed building. For example, if the gross floor area of the designed building is greater than the statutory gross floor area, the penalty may be computed by multiplying that difference by a preset negative coefficient for the gross floor area. If the gross floor area of the designed building is smaller than the statutory gross floor area, the design complies with the law, so the gross floor area penalty is 0 and does not contribute to the penalty for noncompliance with regulations.

In addition, according to one embodiment of the present invention, the parking space count penalty may be computed based on the difference between the statutory number of parking spaces and the number of parking spaces of the designed building. For example, if the designed building has fewer parking spaces than the statutory number, the penalty may be computed by multiplying that difference by a preset negative coefficient for parking spaces. If the designed building has more parking spaces than the statutory number, the design complies with the law, so the parking space count penalty is 0 and does not contribute to the penalty for noncompliance with regulations.
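The four regulation penalties share one pattern: a negative coefficient multiplied by the amount of violation, and zero when the design complies. A minimal sketch of that pattern follows, assuming the four penalties are summed. The dictionary keys, coefficients, and sample figures are hypothetical, and ratios are expressed in percent.

```python
def excess_penalty(designed, legal_maximum, coeff):
    """Penalty when the designed value exceeds a legal maximum, else zero."""
    return coeff * max(0.0, designed - legal_maximum)


def shortfall_penalty(designed, legal_minimum, coeff):
    """Penalty when the designed value falls short of a legal minimum."""
    return coeff * max(0.0, legal_minimum - designed)


def constraint_penalty(design, law, coeffs):
    """Sum of coverage, floor-area-ratio, gross-area and parking penalties."""
    return (
        excess_penalty(design["coverage"], law["coverage"], coeffs["coverage"])
        + excess_penalty(design["far"], law["far"], coeffs["far"])
        + excess_penalty(
            design["gross_area"], law["gross_area"], coeffs["gross_area"]
        )
        + shortfall_penalty(design["parking"], law["parking"], coeffs["parking"])
    )


# Hypothetical figures: coverage is 5 points over, parking is 2 spaces short.
design = {"coverage": 65, "far": 250, "gross_area": 900, "parking": 8}
law = {"coverage": 60, "far": 250, "gross_area": 1000, "parking": 10}
coeffs = {"coverage": -2.0, "far": -1.0, "gross_area": -0.5, "parking": -3.0}

penalty = constraint_penalty(design, law, coeffs)
```

Here only the coverage ratio and the parking count violate their limits, contributing -10 and -6 for a total of -16. The parking term is the only one measured against a minimum rather than a maximum.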

FIGS. 4 and 5 are diagrams illustrating, by way of example, a process of deriving an architectural design plan through the architectural design system according to one embodiment of the present invention.

As illustrated in FIG. 4, according to one embodiment of the present invention, the architectural design system (300) may produce an architectural design plan based on the architectural environment parameterization algorithm. Based on the processed architectural design information (surrounding lot lines, lot boundary lines, site regulation lines, and per-floor regulation lines), the architectural design system (300) may generate architectural elements (mass, core, parking stalls, corridors, dwelling units, and rooms) through the plurality of parameters corresponding to that information. The architectural design system (300) may then produce a building summary based on the generated architectural elements. The plurality of parameters may be estimated based on correlations between architectural elements.

As illustrated in FIG. 5, according to one embodiment of the present invention, the architectural design system (300) may generate a core based on the correlation between the mass and the core. Specifically, between the lower-floor mass (510) and the upper-floor mass (520) generated by the architectural design system (300), the action set over which the parameters of the core can be learned may be confined to the search space of the upper-floor mass (520). The rectangle (530) drawn with a thin dotted line is an example of a space in which the core can be placed. However, because that space is continuous, the number of possible core placements is effectively infinite. The architectural design system (300) may therefore use a geometric operation to limit the search space to the rectangle (540) drawn with a thick dotted line. For example, the geometric operation may be performed so that as many cores as possible can be included within the upper-floor mass (520). The state set and the action set may be specified based on the search space determined in this way, and the architectural design system (300) may then train the reinforcement learning model.

The embodiments of the present invention described above may be implemented in the form of program instructions executable by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. A hardware device may be replaced by one or more software modules in order to perform the processing according to the present invention, and vice versa.

Although the present invention has been described above with reference to specific details such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and a person of ordinary skill in the technical field to which the present invention pertains may make various modifications and changes based on this description.

Therefore, the spirit of the present invention should not be confined to the embodiments described above, and not only the claims set out below but also everything equal or equivalent to those claims falls within the scope of the spirit of the present invention.

100: Communication network
200: Device
300: Architectural design system
310: Information refinement unit
320: Information processing unit
330: Communication unit
340: Control unit
410: Agent
420: Environment
430: Reward
510: Lower-floor mass
520: Upper-floor mass
530: Search space of the core
540: Search space of the core determined by a geometric operation

Claims

A method for providing a reward function for reinforcement learning,
A step of calculating, respectively, a quantitative value regarding the site of an architectural design plan generated by a reinforcement learning model and a qualitative value regarding the shape of the architectural design plan, and
Including a step of determining a terminal reward value, calculated with reference to the quantitative value and the qualitative value, as a reward function for evaluating the architectural design plan,
Wherein the reinforcement learning model is trained on an undetermined state set and action set that are specified by determining a search space based on a plurality of parameters corresponding to processed architectural design information, and
The reinforcement learning model is trained according to a policy-based learning algorithm or a value-based learning algorithm.
method.

In claim 1,
The quantitative value is calculated based on the price per unit of site or building area and the actual usable area of each dwelling unit.
method.

In claim 1,
The qualitative value is calculated based on a penalty according to preset criteria.
method.

In claim 3,
The penalty for the preset criteria includes at least one of a unit shape penalty (unit penalty), a mass shape penalty (mass penalty), and a penalty for noncompliance with regulations (constraint penalty).
method.

In claim 4,
The unit shape penalty is calculated based on at least one of a space partition penalty for making the shape of a dwelling unit square or rectangular, an aspect ratio penalty for keeping the unit shape balanced, and an openness penalty for maximizing the space of the unit shape.
method.

In claim 4,
The mass shape penalty is calculated based on the difference in area between floors, for the alignment of the upper and lower floors of the building.
method.

In claim 4,
The penalty for noncompliance with regulations is calculated based on at least one of a building coverage ratio penalty, a floor area ratio penalty, a gross floor area penalty, and a parking space count penalty.
method.

A method for providing a function for architectural design,
A step of calculating, respectively, a quantitative value regarding the site of an architectural design plan generated by an architectural design model and a qualitative value regarding the shape of the architectural design plan, and
Including a step of determining a plurality of parameters for deriving an architectural design plan by maximizing an objective function or a fitness function corresponding to a terminal reward value calculated with reference to the quantitative value and the qualitative value,
Wherein the architectural design model is trained on an undetermined state set and action set that are specified by determining a search space based on a plurality of parameters corresponding to processed architectural design information, and
The architectural design model is trained according to a policy-based learning algorithm or a value-based learning algorithm.
method.

A non-transitory computer-readable recording medium recording a computer program for executing the method according to claim 1.

A system for providing a reward function for reinforcement learning, the system comprising:
an information processing unit configured to calculate a quantitative value regarding a site of an architectural design plan generated by a reinforcement learning model and a qualitative value regarding a shape of the architectural design plan, respectively, and
to determine a terminal reward value calculated with reference to the quantitative value and the qualitative value as a reward function for evaluating the architectural design plan,
wherein the reinforcement learning model is trained on a set of undetermined states and a set of actions, which are specified by determining a search space based on a plurality of parameters corresponding to processed architectural design information, and
wherein the reinforcement learning model is trained according to a policy-based learning algorithm or a value-based learning algorithm.

The system of claim 10,
wherein the quantitative value is calculated based on a price per unit area of land or building and the actual usable area of a dwelling unit.
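One simple reading of this claim is that the quantitative value is a monetary estimate: a price per unit area multiplied by the usable area of the dwelling units in the design. The sketch below assumes exactly that; the function name, the summation over units, and the single flat price are illustrative assumptions rather than the formula used in the patent.

```python
def quantitative_value(price_per_area, unit_usable_areas):
    """Hypothetical quantitative value: price per unit area multiplied by
    the total usable area of all dwelling units in the design plan."""
    if price_per_area < 0:
        raise ValueError("price_per_area must be non-negative")
    return price_per_area * sum(unit_usable_areas)
```

For example, two units of 30 and 20 square meters at a price of 10 per square meter would be valued at 500 under this assumption.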

The system of claim 10,
wherein the qualitative value is calculated based on a penalty according to preset criteria.
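To make the relationship between penalties, the qualitative value, and the terminal reward concrete, the sketch below treats the qualitative value as a negative weighted sum of penalties and adds it to the quantitative value. Both the additive combination and the per-penalty weights are assumptions; the claims state only that the terminal reward is calculated with reference to the two values.

```python
def qualitative_value(penalties, weights=None):
    """Hypothetical qualitative value: negative weighted sum of penalties.

    penalties maps a criterion name to a non-negative penalty;
    weights maps a criterion name to its weight (default 1.0).
    """
    weights = weights or {}
    return -sum(
        weights.get(name, 1.0) * value
        for name, value in penalties.items()
    )


def terminal_reward(quantitative, penalties, weights=None):
    """Assumed combination: quantitative value plus qualitative value,
    evaluated once for a completed design plan."""
    return quantitative + qualitative_value(penalties, weights)
```

A design with a quantitative value of 10.0 and a single regulation penalty of 2.0 weighted by 3.0 would receive a terminal reward of 4.0 under these assumptions.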

The system of claim 12,
wherein the penalty according to the preset criteria includes at least one of a unit shape penalty, a mass shape penalty, and a regulation noncompliance penalty.

The system of claim 13,
wherein the unit shape penalty is calculated based on at least one of a space partition penalty for making the shape of a dwelling unit square or rectangular, an aspect ratio penalty for balancing the proportions of the unit shape, and an opening penalty for maximizing the space of the unit shape.

The system of claim 13,
wherein the mass shape penalty is calculated based on a difference in floor area between floors, so as to align the upper and lower floors of the building.

The system of claim 13,
wherein the regulation noncompliance penalty is calculated based on at least one of a building coverage ratio penalty, a floor area ratio penalty, a gross floor area penalty, and a legal parking space penalty.

A system for providing a function for architectural design, the system comprising:
an information processing unit configured to calculate a quantitative value regarding a site of an architectural design plan generated by an architectural design model and a qualitative value regarding a shape of the architectural design plan, respectively, and
to determine a plurality of parameters for deriving the architectural design plan by maximizing an objective function or a fitness function corresponding to a terminal reward value calculated with reference to the quantitative value and the qualitative value,
wherein the architectural design model is trained on a set of undetermined states and a set of actions, which are specified by determining a search space based on a plurality of parameters corresponding to processed architectural design information, and
wherein the architectural design model is trained according to a policy-based learning algorithm or a value-based learning algorithm.