KR20200012379A

KR20200012379A - Image-based indoor position detection apparatus and detection method

Info

Publication number: KR20200012379A
Application number: KR1020180087618A
Authority: KR
Inventors: 김형관; 하인해
Original assignee: 연세대학교 산학협력단
Priority date: 2018-07-27
Filing date: 2018-07-27
Publication date: 2020-02-05
Also published as: KR102090779B1

Abstract

The present invention relates to an apparatus and method for image-based indoor position detection using BIM information and a deep learning network. According to an embodiment of the present invention, the image-based indoor position detection apparatus comprises: a server communication unit that receives, through wired or wireless communication, an image captured by a camera embedded in a user terminal; a server storage unit that stores a building information modeling database (BIM DB) in which indoor images of the interior of at least one building are stored; and a server control unit that extracts a layer containing structure information from the image received from the user terminal, matches the structure information of the extracted layer against the indoor images stored in the BIM DB to extract an indoor image whose similarity is equal to or greater than a reference value, and detects the current position of the user using the extracted indoor image.

Description

Image-based indoor position detection apparatus and detection method

The present invention relates to an image-based indoor position detection apparatus and detection method, and more particularly to an image-based indoor position detection apparatus and detection method that use BIM information and a deep learning network.

Determining the position and orientation of an object indoors is mainly used for augmented reality, indoor route guidance, and tracking of objects or people. Unlike outdoors, where global navigation satellite system (GNSS) signals are mainly used for positioning, GNSS signals are difficult to receive indoors. Various studies on indoor positioning have therefore been conducted.

Indoor positioning mainly relies on radio-signal-based methods, but these methods require preparatory work in the indoor space where positioning is needed, and they incur recurring costs because the number of sensors grows with the floor area. In addition, because the position of an object is determined from signals, it is difficult to determine the orientation of the object.

Image-based indoor positioning methods estimate the position of an object by comparing an indoor photograph taken at the time of positioning with indoor image data constructed in advance. Photographs must be taken throughout the interior, not only to construct the image data but also to generate an indoor map. This approach, too, requires cumbersome preparatory work for indoor positioning.

Building information modeling (BIM) refers to an integrated building modeling process that spans design, construction, and facility maintenance after completion. The time and cost benefits of BIM have been demonstrated, and its use is steadily increasing. Because BIM enables integrated management of building information and makes it easy to extract drawings from any point, it is also useful for facility maintenance.

BIM is also used together with sensors for indoor positioning. It is mainly used to supply indoor structural information that can bound the sensor error range, for visualization, and to provide integrated information. However, this approach still carries the limitations of sensor-based indoor positioning.

Korean Registered Patent No. 10-1415016

An object of the present invention is to provide an image-based indoor position detection apparatus and detection method using BIM information and a deep learning network that can be implemented without using sensors or producing a separate 3D model.

An image-based indoor position detection apparatus according to an embodiment of the present invention includes:

a server communication unit that receives, through wired or wireless communication, an image captured by a camera embedded in a user terminal; a server storage unit that stores a building information modeling database (BIM DB) in which indoor images of the interior of at least one building are stored; and a server control unit that extracts a layer containing structure information from the image received from the user terminal, matches the structure information of the extracted layer against one or more indoor images stored in the BIM DB to extract an indoor image whose similarity is equal to or greater than a reference value, and detects the current position of the user using the extracted indoor image.

In the image-based indoor position detection apparatus according to an embodiment of the present invention, the BIM DB may store image layers obtained by running a deep learning network on the indoor images of the interior of the at least one building.

In the image-based indoor position detection apparatus according to an embodiment of the present invention, the deep learning network may be a convolutional neural network (CNN) including at least one of VGG-16 and VGG-19.

In the image-based indoor position detection apparatus according to an embodiment of the present invention, the BIM DB stores image layers obtained by running a convolutional neural network on the indoor images of the interior of the at least one building, and the server control unit may include: a layer extraction module that extracts a layer containing structure information from the image received from the user terminal; an image matching module that matches the structure information of the extracted layer against the structure information of the image layers stored in the BIM DB to extract an image layer whose similarity is equal to or greater than a reference value; and a position detection module that detects the current position of the user using the extracted image layer.

In the image-based indoor position detection apparatus according to an embodiment of the present invention, the structure information may be the shape of an image feature map included in an image layer obtained by running a convolutional neural network on the received image.

In the image-based indoor position detection apparatus according to an embodiment of the present invention, the image layer may be the fourth pooling layer obtained by running the convolutional neural network.

An image-based indoor position detection apparatus according to another embodiment of the present invention includes:

a communication unit that receives, from an external server, BIM data for indoor images of the interior of at least one building; and a control unit that extracts structure information included in an image captured by a built-in camera, matches the extracted structure information against one or more indoor images included in the received BIM data to extract an indoor image whose similarity is equal to or greater than a reference value, and detects the current position of the user using the extracted indoor image.

In the image-based indoor position detection apparatus according to another embodiment of the present invention, the BIM data may be image layers obtained by running a deep learning network on the indoor images of the interior of the at least one building.

In the image-based indoor position detection apparatus according to another embodiment of the present invention, the deep learning network may be a convolutional neural network including at least one of VGG-16 and VGG-19.

In the image-based indoor position detection apparatus according to another embodiment of the present invention, the BIM data are image layers obtained by running a convolutional neural network on the indoor images of the interior of the at least one building, and the control unit may include: a layer extraction module that extracts structure information included in the image captured by the built-in camera; an image matching module that matches the extracted structure information against the structure information of the image layers included in the BIM data to extract an image layer whose similarity is equal to or greater than a reference value; and a position detection module that detects the current position of the user using the extracted image layer.

In the image-based indoor position detection apparatus according to another embodiment of the present invention, the structure information may be the shape of an image feature map included in an image layer obtained by running a convolutional neural network on the received image.

In the image-based indoor position detection apparatus according to another embodiment of the present invention, the image layer may be the fourth pooling layer obtained by running the convolutional neural network.

An image-based indoor position detection method according to an embodiment of the present invention includes:

forming and storing a BIM DB in which indoor image layers of the interior of at least one building are stored; receiving, through wired or wireless communication, an image captured by a camera; extracting structure information from the received image; matching the extracted structure information against the structure information of the image layers stored in the BIM DB to extract an image layer whose similarity is equal to or greater than a reference value; and detecting the current position of the user using the extracted image layer.

In the image-based indoor position detection method according to an embodiment of the present invention, the BIM DB is a database storing image layers obtained by running a convolutional neural network on the indoor images of the interior of the at least one building, and the structure information may be the shape of an image feature map included in an image layer obtained by running a convolutional neural network on the received image.

In the image-based indoor position detection method according to an embodiment of the present invention, the image layer may be the fourth pooling layer obtained by running the convolutional neural network.

An image-based indoor position detection method according to another embodiment of the present invention includes:

receiving BIM data for at least one building from an external server; extracting structure information included in an image captured by a built-in camera; matching the extracted structure information against one or more indoor images included in the received BIM data to extract an indoor image whose similarity is equal to or greater than a reference value; and detecting the current position of the user using the extracted image layer.

In the image-based indoor position detection method according to another embodiment of the present invention, the BIM data are image layers obtained by running a convolutional neural network on the indoor images of the interior of the at least one building, and the structure information may be the shape of an image feature map included in an image layer obtained by running a convolutional neural network on the received image.

In the image-based indoor position detection method according to another embodiment of the present invention, the image layer may be the fourth pooling layer obtained by running the convolutional neural network.

An embodiment of the present invention may be implemented by a computer-readable recording medium on which is recorded a program that is executed by a computer to perform the steps of: forming and storing a BIM DB in which indoor image layers of the interior of at least one building are stored; receiving, through wired or wireless communication, an image captured by a camera; extracting structure information from the received image; matching the extracted structure information against the structure information of the image layers stored in the BIM DB to extract an image layer whose similarity is equal to or greater than a reference value; and detecting the current position of the user using the extracted image layer.

Another embodiment of the present invention may be implemented by a computer-readable recording medium on which is recorded a program that is executed by a computer to perform the steps of: receiving BIM data for at least one building from an external server; extracting structure information included in an image captured by a built-in camera; matching the extracted structure information against one or more indoor images included in the received BIM data to extract an indoor image whose similarity is equal to or greater than a reference value; and detecting the current position of the user using the extracted image layer.

Other specific details of implementations according to various aspects of the present invention are included in the detailed description below.

According to the image-based indoor position detection apparatus and detection method according to embodiments of the present invention,

the current position within a building can be determined through image extraction and matching that use features obtained from a pre-trained deep learning network (specifically a CNN, and more specifically a VGG network).

Because each BIM image stored in the BIM DB is stored together with its indoor position and orientation information, when the indoor image captured by the camera is matched with a BIM image acquired at the same point and in the same direction, the indoor position of the user who captured the indoor image can be determined accurately.

In addition, because the image-based indoor position detection apparatus and detection method according to embodiments of the present invention make use of already-produced BIM modeling data, no separate 3D model needs to be produced for indoor positioning.

In addition, unlike existing vision-based localization methods, the laborious work of photographing locations throughout the interior in advance to build a DB is not required.

FIG. 1 is a block diagram of an image-based indoor position detection system according to an embodiment of the present invention.
FIG. 2 is a block diagram of a server control unit, which is one component of an image-based indoor position detection apparatus according to an embodiment of the present invention.
FIG. 3A is a diagram illustrating the process of constructing the BIM DB stored in a server storage unit, which is one component of an image-based indoor position detection apparatus according to an embodiment of the present invention.
FIG. 3B is a diagram illustrating the multilayer structure of a convolutional neural network (CNN).
FIG. 4 is a diagram illustrating the overall operation of an image-based indoor position detection apparatus according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram for describing the process of matching an indoor image stored in the BIM DB with an indoor image captured by a user terminal in an image-based indoor position detection apparatus according to an embodiment of the present invention.
FIG. 6 is a flowchart of an image-based indoor position detection method according to an embodiment of the present invention.
FIG. 7 is a control flowchart of an image-based indoor position detection method according to an embodiment of the present invention.
FIG. 8 is a block diagram of an image-based indoor position detection system according to another embodiment of the present invention.
FIG. 9 is a block diagram of an image-based indoor position detection apparatus according to another embodiment of the present invention.
FIG. 10 is a control flowchart of an image-based indoor position detection method according to another embodiment of the present invention.

The present invention can be modified in various ways and can have various embodiments; specific embodiments are illustrated and described in detail in the detailed description. However, this is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope.

The terms used in the present invention are used only to describe particular embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In the present invention, terms such as 'comprise' or 'have' are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof. Hereinafter, an image-based indoor position detection apparatus and detection method according to embodiments of the present invention are described with reference to the drawings.

FIG. 1 is a block diagram of an image-based indoor position detection system according to an embodiment of the present invention, and FIG. 2 is a block diagram of a server control unit, which is one component of an image-based indoor position detection apparatus according to an embodiment of the present invention.

As shown in FIG. 1, an image-based indoor position detection system according to an embodiment of the present invention includes a user terminal 100 and an indoor position detection server 200. Here, the image-based indoor position detection apparatus according to an embodiment of the present invention may be the indoor position detection server 200. In another embodiment, the image-based indoor position detection apparatus may instead be the user terminal 100, as described later. In the present embodiment, the indoor position detection server 200 is described as the image-based indoor position detection apparatus.

The indoor position detection server 200 includes a server communication unit 210, a server storage unit 220, and a server control unit 230.

The server communication unit 210 is a communication means that interworks with the user terminal 100 via a wired or wireless communication network and transmits and receives various data. The server communication unit 210 receives an image captured by the camera embedded in the user terminal 100, and transmits the current position information of the user detected by the server control unit 230 to the user terminal 100.

The server storage unit 220 stores various data necessary for operating the indoor position detection server 200. The server storage unit 220 may also store a program that runs a deep learning network. Here, the deep learning network may be a convolutional neural network (CNN). More specifically, the convolutional neural network may be at least one of VGG-16 and VGG-19. In the following description, the VGG-16 network is used as an example of the deep learning network and is referred to simply as the VGG network. Of course, the deep learning networks applicable to the present invention are not limited to this.

FIG. 3A is a diagram illustrating the process of constructing the BIM DB stored in the server storage unit, which is one component of an image-based indoor position detection apparatus according to an embodiment of the present invention.

The server storage unit 220 may extract (222) indoor images of a building from BIM modeling data 221 used to model the building virtually, and may store the image layers 224a to 224e obtained by supplying the extracted indoor images as the input 223 of the VGG network. The image layers 224a to 224e extracted through the VGG network V may be organized into a building information modeling database (BIM DB) and stored in the server storage unit 220. Each of the extracted image layers 224a to 224e includes an image feature map. In particular, the server storage unit 220 may store the fourth pooling layer 224d (pooling layer 4) among the image layers extracted through the VGG network V.

Alternatively, the VGG network may be run separately on an external system to extract the image layers, and the server storage unit 220 may organize the extracted image layers into the BIM DB and store them.

That is, image layers containing the image feature maps obtained as the output of each layer of the convolutional neural network (CNN) are stored in the server storage unit 220 as the BIM DB. Each image feature map is vectorized and used for image similarity evaluation, which is computed as a cosine distance.
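As a minimal sketch of the similarity evaluation described above, the following Python code flattens a feature map into a vector and computes the cosine distance between two such vectors. The names `flatten_feature_map` and `cosine_distance`, the nested-list layout (height x width x channels), and the sample values are assumptions made for illustration; the specification does not prescribe a particular implementation.

```python
import math
from typing import List, Sequence

# Assumed layout for illustration: height x width x channels.
FeatureMap = Sequence[Sequence[Sequence[float]]]


def flatten_feature_map(feature_map: FeatureMap) -> List[float]:
    """Vectorize a feature map by concatenating its values in row-major order."""
    return [value for row in feature_map for cell in row for value in cell]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(angle between a and b); 0.0 means identical direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Two tiny 1 x 2 x 2 feature maps (hypothetical values).
photo_map = [[[1.0, 0.0], [2.0, 0.0]]]
bim_map = [[[2.0, 0.0], [4.0, 0.0]]]
distance = cosine_distance(flatten_feature_map(photo_map),
                           flatten_feature_map(bim_map))
```

Because the cosine distance depends only on the direction of the vectors, a uniform scaling of one feature map (as in the sample values) leaves the distance at approximately zero.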

The indoor images of the building included in the BIM modeling data carry indoor position and orientation information, and this position and orientation information is also carried by the image layers 224a to 224e extracted through the VGG network V. Therefore, when the image most similar to the image captured by the user terminal 100 inside the building is found in the BIM DB, the position of the user can be detected from it.
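The lookup described above can be sketched as follows, under stated assumptions. The record type `BimEntry`, the function `locate`, the coordinate and heading fields, and the default reference value of 0.8 are all hypothetical; the specification only states that an entry with similarity at or above a reference value is selected and that its stored position and orientation give the user's position.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class BimEntry:
    """One BIM DB record (hypothetical schema): features plus the pose they came from."""
    features: Sequence[float]
    position: Tuple[float, float, float]  # x, y, z in building coordinates (assumed)
    heading_deg: float                    # viewing direction in degrees (assumed)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def locate(query: Sequence[float], db: List[BimEntry],
           reference_value: float = 0.8) -> Optional[BimEntry]:
    """Return the most similar entry if it reaches reference_value, otherwise None."""
    best = max(db, key=lambda e: cosine_similarity(query, e.features), default=None)
    if best is None or cosine_similarity(query, best.features) < reference_value:
        return None
    return best


db = [
    BimEntry([1.0, 0.0, 0.0], (2.0, 5.0, 0.0), 90.0),
    BimEntry([0.0, 1.0, 1.0], (8.0, 1.0, 3.0), 180.0),
]
match = locate([0.1, 0.9, 1.0], db)      # close to the second entry
no_match = locate([1.0, 1.0, 0.0], db)   # below the reference value for both entries
```

Returning `None` when no entry reaches the reference value keeps a weak match from being reported as a position.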

A convolutional neural network (CNN) consists of a number of layers and learns to extract meaningful features of an image in order to perform a target task. The layers that make up a CNN fall broadly into three types: convolutional layers, pooling layers, and fully-connected layers.

A CNN receives a two-dimensional raw image as its initial input, and each layer receives the output of the previous layer as its input. In a convolutional layer, filters of the size specified for that layer exist in a number equal to the depth of the layer, and each filter moves over the input at intervals of the stride, performing a discrete convolution with the input.
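To make the sliding-filter operation concrete, the following sketch applies a single filter to a small single-channel input with a configurable stride and no padding. The function name `conv2d` and the sample values are illustrative assumptions. Note that, as is common in CNN implementations, the sketch computes a cross-correlation (the kernel is not flipped), which differs from a strict discrete convolution only by a flip of the kernel.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix, stride: int = 1) -> Matrix:
    """Slide kernel over image (no padding) and sum the elementwise products."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride  # window origin in the input
            row.append(sum(
                image[top + di][left + dj] * kernel[di][dj]
                for di in range(k_h) for dj in range(k_w)
            ))
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
kernel = [[1, 0], [0, -1]]  # responds to differences along the diagonal
feature_map = conv2d(image, kernel, stride=2)
```

With a 4x4 input, a 2x2 filter, and a stride of 2, the output is 2x2; with a stride of 1 the same filter would yield a 3x3 output.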

Because array data such as images are more strongly correlated the closer the elements are to one another, this operation extracts the salient characteristics within the receptive field, that is, the portion of the input to which the filter is applied.

Therefore, the output of a convolutional layer can be regarded as the features extracted at each position of the input image. Once the convolutional layer has found the features of each part of the image in this way, the pooling layer merges semantically similar features into one, making the performance of the network invariant to distortion and translation of the image. In addition, because the dimensions of the feature map are reduced, the region of the input image represented by a unit area of the feature map becomes larger than before.
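The pooling step can be illustrated with a small max pooling sketch. The function name `max_pool2d` and the sample feature maps are assumptions for illustration. The second input moves one strong activation by a single pixel inside its pooling window, and the pooled output is unchanged, which is the kind of small-shift tolerance described above.

```python
from typing import List

Matrix = List[List[float]]


def max_pool2d(feature_map: Matrix, size: int = 2, stride: int = 2) -> Matrix:
    """Keep the maximum of each size x size window, moving by stride."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size) for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


original = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
# Same map, but the activation "4" is shifted up by one pixel within its window.
shifted = [
    [4, 3, 2, 0],
    [1, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
pooled_original = max_pool2d(original)
pooled_shifted = max_pool2d(shifted)
```

Each 2x2 pooling step halves the height and width of the feature map, so one pooled value summarizes a 2x2 block of the map that fed it.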

When this feature map is fed into a new convolutional layer, features are extracted from a larger region of the original image even if a filter of the same size as in the previous convolutional layer is applied. Therefore, as convolutional and pooling layers are repeated, features are extracted from progressively larger regions of the original image, so that toward the later part of the network a global descriptor that observes the whole image is obtained.
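The growth of the region seen by each layer can be worked out with the usual receptive-field recurrence. The sketch below assumes the standard VGG-16 configuration (3×3 convolutions with stride 1, 2×2 max pooling with stride 2, and 2, 2, 3, 3, 3 convolutions per block); these details are not stated in the text and the function name is hypothetical.

```python
# Number of 3x3 convolutions before each max pooling in standard VGG-16 (assumed).
VGG16_BLOCKS = [2, 2, 3, 3, 3]


def pooled_receptive_fields(blocks, conv_k=3, pool_k=2, pool_s=2):
    """Receptive field (in input pixels) of one unit after each pooling layer."""
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between neighbouring units
    fields = []
    for n_convs in blocks:
        for _ in range(n_convs):
            rf += (conv_k - 1) * jump  # stride-1 convolution widens the field
        rf += (pool_k - 1) * jump      # pooling window widens it further
        jump *= pool_s                 # stride-2 pooling doubles the spacing
        fields.append(rf)
    return fields


fields = pooled_receptive_fields(VGG16_BLOCKS)
for index, size in enumerate(fields, start=1):
    print(f"pooling layer {index}: {size}x{size} input pixels")
```

Under these assumptions, one unit of the fourth pooling layer sees a 100×100 patch of the input and one unit of the fifth sees 212×212, which matches the progression from local to global descriptors described above.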

In the present invention, the user's current position is estimated using feature maps, which are collections of the image layers output by each layer of a pre-trained convolutional neural network (pre-trained CNN).

A CNN trained for object classification shows excellent results when applied to other kinds of image retrieval. Because the CNN has learned, during training, filters that can extract the main features of an input image, it can extract meaningful features even when a different kind of image is input. Therefore, in the present invention, in order to match indoor images with already constructed BIM images, which are visually similar but differ in image type, it is preferable to use a convolutional neural network among the various deep learning networks.

FIG. 3B shows the VGG-16 network structure, in which a group of several convolutional layers followed by max pooling is repeated five times and three fully-connected layers are located at the end of the network. The output of the fully-connected layers mixes all the features of the image and loses positional information, so it is not used as a feature for position detection in the present invention; instead, the five pooling layers produced by max pooling are used as features for position detection. The numbers shown in FIG. 3 are not reference numerals; they indicate image sizes.
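To make the sizes of the five pooling outputs concrete, the following sketch computes them for a 224×224 input. It assumes the standard VGG-16 channel widths (64, 128, 256, 512, 512) and padded 3×3 convolutions that preserve spatial size; neither detail is stated in the text, and the function name is hypothetical.

```python
# Output channels of the five convolutional groups in standard VGG-16 (assumed).
VGG16_POOL_CHANNELS = [64, 128, 256, 512, 512]


def pooling_output_shapes(input_size=224, channels=VGG16_POOL_CHANNELS):
    """Return (height, width, channels) after each of the five max poolings."""
    shapes = []
    size = input_size
    for c in channels:
        # Padded 3x3 convolutions keep the size; each 2x2 max pool halves it.
        size //= 2
        shapes.append((size, size, c))
    return shapes


shapes = pooling_output_shapes()
for index, (h, w, c) in enumerate(shapes, start=1):
    print(f"pooling layer {index}: {h}x{w}x{c} -> {h * w * c} values when flattened")
```

Under these assumptions the fourth pooling layer yields a 14×14×512 feature map (100,352 values) and the fifth a 7×7×512 map (25,088 values), so the fifth layer is a considerably more condensed summary of the image.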

In the present invention, the data for image retrieval consist of BIM images rendered at designated indoor positions so as to resemble the user's field of view. From the BIM model, pedestrian views inside the building, that is, indoors, can be extracted in the form of two-dimensional images. The directions of the pedestrian views to be extracted are set in advance in various ways and applied identically at each position. In addition, the user's view in the BIM model is rendered according to the angle-of-view information of the camera the user will use, to build the BIM modeling images, which are stored as the BIM DB.
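One way to read "directions set in advance and applied identically at each position" is as a Cartesian product of render positions and preset headings. The sketch below enumerates such viewpoints; the function name, the coordinate convention, and the example values are all hypothetical, and the actual rendering of the BIM model is outside its scope.

```python
from itertools import product


def enumerate_viewpoints(positions, headings_deg):
    """Pair every render position with every preset viewing direction."""
    return [
        {"position": position, "heading_deg": heading}
        for position, heading in product(positions, headings_deg)
    ]


# Hypothetical positions along a corridor (metres) and four preset headings.
positions = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
headings_deg = [0, 90, 180, 270]

viewpoints = enumerate_viewpoints(positions, headings_deg)
print(len(viewpoints), "views to render")
```

Each resulting entry would correspond to one rendered BIM image, tagged with the position and direction from which it was rendered.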

So that the user's current position can be presented when a match with the photograph taken by the user is found, each BIM modeling image carries the indoor position and direction at which it was rendered. Doors, single-color walls, and similar elements, which can be extracted identically at other positions, are not suitable for providing position information through image retrieval and may therefore be removed from the BIM modeling images.

The server control unit 230 controls the overall functions of the indoor location detection server 200. The server control unit 230 extracts a layer including structure information from the image received from the user terminal 100, matches the structure information of the extracted layer against one or more image layers stored in the BIM DB to extract an image layer having a similarity equal to or greater than a reference value, and detects the user's current position using the extracted image layer. Here, the structure information may mean the shape of the image feature map included in the layer. That is, the matching image layer is extracted by comparing whether the overall shapes of the image feature maps are similar.

To this end, the server control unit 230 includes a layer extraction module 231, an image matching module 232, and a position detection module 233.

The layer extraction module 231 extracts a layer including structure information from the image received from the user terminal 100, and the image matching module 232 matches the structure information of the extracted layer against the structure information included in one or more image layers stored in the BIM DB to extract an image layer having a similarity equal to or greater than the reference value. The similarity may be computed through a cosine similarity evaluation that uses the cosine distance between the vectorized images. The cosine distance here refers to the cosine of the angle between two vectors; it can be used to evaluate the similarity between vectors and has the advantage of being naturally normalized by the norms.
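The cosine measure described above can be sketched in a few lines of plain Python. The function name and the zero-vector convention are assumptions; the feature vectors stand in for flattened feature maps.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention: a zero vector matches nothing
    return dot / (norm_u * norm_v)


# Hypothetical flattened feature vectors.
query = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]  # same pattern, twice the magnitude
orthogonal = [3.0, 0.0, -1.0]

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

Because the dot product is divided by both norms, two feature maps with the same pattern but different overall magnitude still score as maximally similar, which is the normalization advantage noted above.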

The position detection module 233 detects the user's current position by extracting the position/direction information included in the extracted image layer.

These functions of the server control unit 230 may be performed using a VGG network, which is a kind of convolutional neural network. That is, the VGG network performs the functions of the layer extraction module 231 and the image matching module 232, and when an image layer is extracted as a result of running the VGG network, the position detection module 233 extracts the position/direction information included in that image layer and transmits it to the user terminal 100 through the server communication unit 210.

Here, the image layer having a similarity equal to or greater than the reference value may be the fourth pooling layer 224d among the image layers 224a to 224e extracted through the VGG network V. The inventors of the present invention have confirmed that using the fourth pooling layer 224d yields highly accurate matching results. This is described later.

FIG. 4 is a diagram illustrating the overall operation of the image-based indoor position detection apparatus according to an embodiment of the present invention.

When an image captured by the camera built into the user terminal 100 is received, the image is resized to a size suitable as VGG network input (for example, 224×224) and input to the VGG network V.

The server control unit 230 passes the input image through several convolutional layers and max pooling operations and obtains a plurality of pooling layers 234a to 234e as a result. The features for matching are obtained in the form of feature maps, which are the outputs of each layer except the fully-connected layers located at the end of the network.

Through experiments on the various pooling layers, the inventors of the present invention confirmed that, among the image layers obtained through the pre-trained VGG network V and stored in the BIM DB, the "image layer having a similarity equal to or greater than the reference value" (that is, the image layer effective for position detection) is the fourth pooling layer (224d, 234d). When image matching was performed with features extracted from the fourth pooling layer, more accurate results were obtained than with the other layers.

This is described with reference to FIG. 5. FIG. 5 is an exemplary diagram for explaining the process of matching an indoor image stored in the BIM DB with an indoor image captured by the user terminal in the image-based indoor position detection apparatus according to an embodiment of the present invention.

FIG. 5 visualizes each of the image layers 224a to 224e and 234a to 234e in order to examine and compare the features extracted from them. It is an example that visualizes the features obtained at each pooling layer when an indoor photograph and a BIM image acquired at the same point and in the same direction are each passed through the VGG-16 network.

As its layers deepen, a CNN is trained to extract features from progressively wider regions of the image. That is, the network starts by extracting local descriptors and gradually comes to extract global descriptors.

Looking at the features visualized for each image layer in FIG. 5, it can be seen that the outputs of the later pooling layers represent structural characteristics (structure information) rather than fine details of the image.

In the outputs of the earlier image layers, the color arrangement differs because of fine details such as the grid pattern of tiles, nameplates, and illuminance. However, because the present invention performs image matching through image retrieval on two images of different types, the overall structure of the images needs to be compared rather than their detailed features.

Indeed, the experimental results confirmed that the features from the fourth pooling layer, which extracts a relatively global descriptor within the VGG network, are the most suitable for cross-domain indoor image matching.

Referring to FIG. 5, the color arrangements of the visualized features of the two matching images are most similar in the output of the fourth pooling layer. The fifth pooling layer extracts features over a wider region than the fourth pooling layer. However, its image matching and retrieval accuracy is lower, and the inventors of the present invention presume that the fifth pooling layer condenses the image information so much that it works against matching. This shows that, even for cross-domain image matching, using a global descriptor that covers an ever wider region is not always beneficial.

Next, an image-based indoor position detection method according to an embodiment of the present invention is described with reference to FIGS. 6 and 7. FIG. 6 is a flowchart of the image-based indoor position detection method according to an embodiment of the present invention, and FIG. 7 is a control flow diagram of the same method.

As shown in FIGS. 6 and 7, the BIM DB is first formed and stored (S110). That is, indoor images of the building are extracted (222) from the BIM modeling data 221 for virtually modeling the building, the extracted indoor images are input (223) to the VGG network, and the resulting image layers 224a to 224e are stored. Because using the fourth pooling layer 224d among the extracted image layers 224a to 224e yields highly accurate matching results, the BIM DB may store only the fourth pooling layer 224d extracted from each indoor image. The BIM DB is stored in the server storage unit 220 of the indoor location detection server 200.

Next, a query image is received from the user terminal 100 (S120). When a user wants to determine his or her position inside a building, the user photographs the interior at that position with the camera built into the user terminal 100, and the captured image (query image) is transmitted to the indoor location detection server 200 through wired or wireless communication.

Next, the indoor location detection server 200 that has received the query image runs the VGG network with the query image as input (S130), extracts features from the query image, and matches the extracted features against the features of the indoor images stored in the BIM DB (S140). Here, the features used for matching are the structure information in the query image and the structure information of the indoor images stored in the BIM DB. The structure information may mean the shape of the image feature map included in a layer; that is, the overall shapes of the image feature maps are compared for similarity.

The server control unit 230 passes the input query image through several convolutional layers and max pooling operations and obtains a plurality of pooling layers 234a to 234e as a result. The VGG network, pre-trained in the course of forming the BIM DB, extracts for matching a layer including structure information from the query image, and matches the structure information of the extracted layer against the image layers of the indoor images stored in the BIM DB to extract an image layer having a similarity equal to or greater than the reference value. According to the experiments on the various pooling layers, the image layer having a similarity equal to or greater than the reference value is preferably the fourth pooling layer.

Next, when an indoor image similar to the query image is determined through matching, the position/direction information included in the corresponding indoor image layer is extracted to detect the user's current position (S150). The detected position information is then transmitted to the user terminal 100.
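Steps S140 and S150 can be sketched as a thresholded nearest-match search over stored records, each of which pairs a feature vector with the position and direction at which its BIM image was rendered. Everything named here (`BimRecord`, `locate`, the field layout, the threshold of 0.8, and the toy three-element feature vectors) is a hypothetical illustration; the text does not specify a record format or a reference value, and real features would be flattened pooling-layer outputs.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class BimRecord:
    features: Sequence[float]             # flattened feature map of one BIM image
    position: Tuple[float, float, float]  # assumed layout: x, y, floor
    heading_deg: float                    # direction in which the view was rendered


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


def locate(query_features, bim_db, threshold=0.8) -> Optional[BimRecord]:
    """Return the most similar record at or above the threshold, else None."""
    best, best_score = None, threshold
    for record in bim_db:
        score = cosine_similarity(query_features, record.features)
        if score >= best_score:
            best, best_score = record, score
    return best


bim_db = [
    BimRecord([1.0, 0.0, 0.2], (3.0, 12.5, 2.0), 90.0),
    BimRecord([0.0, 1.0, 0.1], (8.0, 4.0, 2.0), 180.0),
]

match = locate([0.9, 0.1, 0.25], bim_db)
if match is not None:
    print("position:", match.position, "heading:", match.heading_deg)

no_match = locate([0.0, 0.0, 1.0], bim_db)  # nothing reaches the threshold
```

Returning `None` when no record reaches the threshold keeps the sketch from reporting a position for a query image that resembles nothing in the database.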

Next, an image-based indoor position detection apparatus according to another embodiment of the present invention is described with reference to FIGS. 8 and 9. FIG. 8 is a block diagram of an image-based indoor position detection system according to another embodiment of the present invention, and FIG. 9 is a block diagram of an image-based indoor position detection apparatus according to another embodiment of the present invention.

As shown in FIG. 8, the image-based indoor position detection system according to another embodiment of the present invention includes a user terminal 100 and a BIM DB server 201. Here, the image-based indoor position detection apparatus according to this embodiment may be the user terminal 100.

As in the embodiment described above, the BIM DB server 201 stores, as a BIM DB, the image layers 224a to 224e obtained by extracting (222) indoor images of the building from the BIM modeling data 221 for virtually modeling the building and inputting (223) the extracted indoor images to the VGG network. The BIM DB server 201 may store, as the BIM DB, image layers that include the image feature maps obtained as the outputs of each layer of the convolutional neural network (CNN).

Unlike the embodiment described above, however, the BIM DB server 201 does not receive the image captured by the user terminal 100 and detect the user's current position. Instead, the user terminal 100 receives the BIM data for the specific building in which position determination is needed, and detects the user's current position using the image layers obtained by running the deep learning network on the user terminal 100.

That is, in the embodiment described above the indoor location detection server 200 detects the user's current position, whereas in this embodiment the user terminal 100 detects it.

To this end, as shown in FIG. 9, the user terminal 100 includes a communication unit 110, an input unit 120, an output unit 130, a storage unit 140, and a control unit 150.

The communication unit 110 supports the transmission and reception of various information to and from the BIM DB server 201 over a wired or wireless communication network N. The communication unit 110 requests the BIM DB server 201 to transmit BIM data and receives the BIM data transmitted from the BIM DB server 201.

The input unit 120 passes to the control unit 150 various information input by the user, such as numeric and character information, as well as signals input in connection with setting various functions and controlling the functions of the user terminal 100. The input unit 120 may include key input means such as a keyboard or keypad, touch input means such as a touch sensor or touch pad, voice input means, and gesture input means comprising at least one of a gyro sensor, a geomagnetic sensor, an acceleration sensor, a proximity sensor, and a camera.

The output unit 130 outputs the results of input through the input unit 120 and outputs the user's current position detected by the control unit 150.

The storage unit 140 stores the various data, programs, and applications needed to operate the user terminal. The storage unit 140 may also store a program that runs the deep learning network. Here, the deep learning network may be a convolutional neural network (CNN); more specifically, the convolutional neural network may be at least one of VGG-16 and VGG-19. The storage unit 140 also stores the BIM data transmitted from the BIM DB server 201.

The control unit 150 extracts a layer including structure information from an indoor image captured by the camera built into the user terminal 100, matches the structure information of the extracted layer against one or more image layers included in the BIM data transmitted from the BIM DB server 201 to extract an image layer having a similarity equal to or greater than a reference value, and detects the user's current position using the extracted image layer.

To this end, the control unit 150 includes a layer extraction module 151, an image matching module 152, and a position detection module 153.

The layer extraction module 151 extracts a layer including structure information from the indoor image captured by the camera, and the image matching module 152 matches the structure information of the extracted layer against one or more image layers included in the BIM data transmitted from the BIM DB server 201 to extract an image layer having a similarity equal to or greater than the reference value. The similarity may be computed through a cosine similarity evaluation that uses the cosine distance between the vectorized images.

The position detection module 153 detects the user's current position by extracting the position/direction information included in the extracted image layer.

These functions of the control unit 150 may be performed using a VGG network, which is a kind of convolutional neural network. That is, the VGG network performs the functions of the layer extraction module 151 and the image matching module 152, and when an image layer is extracted as a result of running the VGG network, the position detection module 153 extracts the position/direction information included in that image layer and outputs it through the output unit 130.

Next, an image-based indoor position detection method according to another embodiment of the present invention is described with reference to FIG. 10. FIG. 10 is a control flow diagram of the image-based indoor position detection method according to another embodiment of the present invention.

As shown in FIG. 10, the BIM DB is first formed and stored in the BIM DB server 201 (S210). The process of forming the BIM DB is as described above.

Next, the user terminal 100 requests the BIM DB server 201 to transmit the BIM data for the specific building in which position determination is needed (S220).

If the user knows information that identifies the building, such as its name or address, the user can enter and send that information to request the BIM data for the specific building. If the user does not know any information that identifies the building, the BIM data for the specific building can be requested using the GPS built into the user terminal 100. Because GPS signals are not received inside the building, the BIM data may be requested by transmitting the last received GPS information to the BIM DB server 201.
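The text does not say how the server maps the last GPS fix to a building. One plausible approach, sketched below purely as an assumption, is to pick the registered building whose reference coordinate is closest to the fix, using the haversine great-circle distance. The function names, the building registry, and all coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_building(last_fix, buildings):
    """Return the name of the building closest to the last GPS fix."""
    lat, lon = last_fix
    return min(buildings, key=lambda name: haversine_m(lat, lon, *buildings[name]))


# Hypothetical registry: building name -> (latitude, longitude) of its entrance.
buildings = {
    "building_a": (10.0000, 20.0000),
    "building_b": (10.0100, 20.0000),
}
last_fix = (10.0010, 20.0002)  # made-up last fix before the signal was lost

selected = nearest_building(last_fix, buildings)
print("request BIM data for:", selected)
```

A real deployment would likely also need to handle fixes that are far from every registered building, for example by rejecting matches beyond some distance.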

Next, the BIM DB server 201 transmits the BIM data of the building corresponding to the request of step S220 to the user terminal 100 (S230). The transmitted BIM data may be image layers that include the image feature maps obtained as the outputs of each layer of the convolutional neural network (CNN).

Next, after the BIM data has been received, the VGG network is run with an indoor image (query image) captured by the camera built into the user terminal 100 as input (S240), features are extracted from the indoor image (query image), and the extracted features are matched against the features of the indoor images included in the BIM data received from the BIM DB server 201 (S250). Here, the features used for matching are the structure information in the indoor image (query image) and the structure information of the indoor images included in the BIM data.

Next, when an indoor image similar to the query image is determined through matching, the position/direction information included in the corresponding indoor image layer is extracted to detect the user's current position (S260). The detected position information is then output through the output unit 130.

With the image-based indoor position detection apparatus and detection method according to the embodiments of the present invention described above, the current position inside a building can be determined through image retrieval and matching using features obtained from a pre-trained deep learning network (specifically a CNN, and more specifically a VGG network). Because each BIM image stored in the BIM DB is stored together with its indoor position and direction, when the indoor image captured by the camera is matched with the BIM image acquired at the same point and in the same direction, the indoor position of the user who captured the image can be determined accurately.

In addition, because the image-based indoor position detection apparatus and detection method according to the embodiments of the present invention use BIM modeling data that has already been produced, no separate 3D model needs to be built for indoor positioning. Moreover, unlike existing vision-based localization methods, they do not involve the laborious work of photographing locations throughout the interior in advance to build a DB.

An embodiment of the present invention has been described above. A person of ordinary skill in the art may modify and change the present invention in various ways by adding, changing, or deleting components without departing from the spirit of the invention as set forth in the claims, and such modifications also fall within the scope of the present invention.

100: user terminal 110: communication unit
120: input unit 130: output unit
140: storage unit 150: control unit
151: layer extraction module 152: image matching module
153: position detection module
200: indoor position detection server 210: server communication unit
220: server storage unit 230: server control unit
231: layer extraction module 232: image matching module
233: position detection module
201: BIM DB server

Claims

An image-based indoor position detection device comprising:
a server communication unit that receives, through wired or wireless communication, an image captured by a camera embedded in a user terminal;
a server storage unit that stores a building information modeling database (BIM DB) in which indoor images of the interior of at least one building are stored; and
a server control unit that extracts a layer including structure information from the image received from the user terminal, extracts an indoor image whose similarity is equal to or greater than a reference value by matching the structure information of the extracted layer against one or more of the indoor images stored in the BIM DB, and detects the current position of the user using the extracted indoor image.

The device according to claim 1, wherein the BIM DB
stores image layers obtained by applying a deep learning network to the indoor images of the at least one building.

The device according to claim 2, wherein the deep learning network
is a convolutional neural network (CNN) including at least one of VGG-16 and VGG-19.

The device according to claim 1, wherein
the BIM DB stores image layers obtained by applying a convolutional neural network to the indoor images of the at least one building, and
the server control unit comprises:
a layer extraction module that extracts a layer including structure information from the image received from the user terminal;
an image matching module that extracts an image layer whose similarity is equal to or greater than a reference value by matching the structure information of the extracted layer against the structure information of the image layers stored in the BIM DB; and
a position detection module that detects the current position of the user using the extracted image layer.

The device according to claim 4, wherein the structure information
is the shape of an image feature map included in an image layer obtained by applying a convolutional neural network to the received image.

The device according to claim 5, wherein the image layer
is a fourth pooling layer obtained by applying the convolutional neural network.

An image-based indoor position detection device comprising:
a communication unit that receives, from an external server, BIM data on indoor images of at least one building; and
a control unit that extracts structure information included in an image captured by a built-in camera, extracts an indoor image whose similarity is equal to or greater than a reference value by matching the extracted structure information against one or more of the indoor images included in the received BIM data, and detects the current position of the user using the extracted indoor image.

The device according to claim 7, wherein the BIM data
comprise image layers obtained by applying a deep learning network to the indoor images of the at least one building.

The device according to claim 8, wherein the deep learning network
is a convolutional neural network (CNN) including at least one of VGG-16 and VGG-19.

The device according to claim 7, wherein
the BIM data are image layers obtained by applying a convolutional neural network to indoor images of the interior of the at least one building, and
the control unit comprises:
a layer extraction module that extracts structure information included in an image captured by the built-in camera;
an image matching module that extracts an image layer whose similarity is equal to or greater than a reference value by matching the extracted structure information against the structure information of the image layers included in the BIM data; and
a position detection module that detects the current position of the user using the extracted image layer.

The device according to claim 10, wherein the structure information
is the shape of an image feature map included in an image layer obtained by applying a convolutional neural network to the received image.

The device according to claim 11, wherein the image layer
is a fourth pooling layer obtained by applying the convolutional neural network.

An image-based indoor position detection method comprising:
forming and storing a BIM DB in which indoor image layers of the interior of at least one building are stored;
receiving, through wired or wireless communication, an image captured by a camera;
extracting structure information from the received image;
extracting an image layer whose similarity is equal to or greater than a reference value by matching the extracted structure information against the structure information of the image layers stored in the BIM DB; and
detecting a current position of a user using the extracted image layer.

The method according to claim 13, wherein
the BIM DB is a DB storing image layers obtained by applying a convolutional neural network to indoor images of the at least one building, and
the structure information is the shape of an image feature map included in an image layer obtained by applying a convolutional neural network to the received image.

The method according to claim 14, wherein the image layer
is a fourth pooling layer obtained by applying the convolutional neural network.

An image-based indoor position detection method comprising:
receiving BIM data for at least one building from an external server;
extracting structure information included in an image captured by a built-in camera;
extracting an indoor image whose similarity is equal to or greater than a reference value by matching the extracted structure information against at least one indoor image included in the received BIM data; and
detecting a current position of a user using the extracted image layer.

The method according to claim 16, wherein
the BIM data are image layers obtained by applying a convolutional neural network to indoor images of the interior of the at least one building, and
the structure information is the shape of an image feature map included in an image layer obtained by applying a convolutional neural network to the received image.

The method according to claim 17, wherein the image layer
is a fourth pooling layer obtained by applying the convolutional neural network.

A computer-readable recording medium having recorded thereon a program that, when executed by a computer, performs:
forming and storing a BIM DB in which indoor image layers of the interior of at least one building are stored;
receiving, through wired or wireless communication, an image captured by a camera;
extracting structure information from the received image;
extracting an image layer whose similarity is equal to or greater than a reference value by matching the extracted structure information against the structure information of the image layers stored in the BIM DB; and
detecting a current position of a user using the extracted image layer.

A computer-readable recording medium having recorded thereon a program that, when executed by a computer, performs:
receiving BIM data for at least one building from an external server;
extracting structure information included in an image captured by a built-in camera;
extracting an indoor image whose similarity is equal to or greater than a reference value by matching the extracted structure information against at least one indoor image included in the received BIM data; and
detecting a current position of a user using the extracted image layer.