KR102605429B1

KR102605429B1 - Method for display augmented reality image

Info

Publication number: KR102605429B1
Application number: KR1020220048998A
Authority: KR
Inventors: 신준범; 김서규; 김우현; 최정헌
Original assignee: 주식회사 큐에스
Priority date: 2022-04-20
Filing date: 2022-04-20
Publication date: 2023-11-23
Also published as: KR20230149594A

Abstract

The present invention relates to an augmented reality image display method that takes into account the orientation and size of a target object in space. The method determines the size, orientation, and position of the target object based on depth information of the target object and displays an augmented reality image accordingly.

Description

Method for display augmented reality image

The present invention relates to an augmented reality image display method that takes into account the orientation and size of a target object in space. The method determines the size, orientation, and position of the target object based on depth information of the target object and displays an augmented reality image accordingly.

Maintaining a structure or machine requires years of experience, and even skilled workers need a long time to become familiar with a newly introduced structure or machine. Mechanics have typically carried an equipment maintenance manual with them at all times while performing maintenance work.

Conventional two-dimensional manuals let users check each part only through component descriptions and two-dimensional drawings. Users therefore could not view a detailed description of a part, nor disassemble, assemble, rotate, enlarge, or reduce a detailed configuration.

To solve this problem, a method of providing a maintenance manual terminal based on live-action video was devised. According to prior art document 1, the terminal photographs the area to be serviced with a camera and transmits the footage to a maintenance guide management server over wireless communication. The server analyzes the received footage and sends the corresponding maintenance guide video back to the terminal over wireless communication, so that a maintenance guide rendered as live-action video can be received in real time for the area to be serviced.

However, such conventional object recognition techniques have relied mainly on two-dimensional images and merely display related information, so the user still has to make additional effort based on the information provided.

1. Korean Patent Registration No. 10-1195446 (Title of invention: Augmented reality-based maintenance guide providing terminal and method of providing maintenance guide using the same)

Accordingly, the present invention aims to recognize an object in an image acquired by a terminal and to display information about the recognized object on a real-time image.

The present invention also aims to display information related to the recognized object as augmented reality within the real-time image, taking into account the size and orientation of the object within the image.

The present invention further aims to provide an image with more detailed information by replacing part of the real-time image.

An augmented reality image display method that takes into account the orientation and size of a target object in space, according to an embodiment of the present invention, may include: acquiring a first image including color information of the target object and a second image including depth information of the target object, the first image and the second image being images of the same first scene; setting a first area corresponding to the target object in the first image; setting a second area corresponding to the first area in the second image; loading a reference model using identification information of the target object generated from the first image; determining the size, orientation, and position of the target object in the second image based on a result of comparing the second area with the reference model; and displaying, on the first image, a third image representing information related to the target object with reference to at least one of the size, orientation, and position of the target object.
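The step of setting a second area that corresponds to the first area can be illustrated with a minimal sketch. It assumes, beyond what the text states, that the first and second images cover the same field of view and differ only in resolution, so a bounding box can be carried over by scaling its coordinates. The `Box` type and the `map_box` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned area in pixel coordinates (right and bottom exclusive)."""
    x0: int
    y0: int
    x1: int
    y1: int


def map_box(box: Box, src_size: tuple[int, int], dst_size: tuple[int, int]) -> Box:
    """Map an area of the first (color) image onto the second (depth) image.

    Assumption: both images show the same scene with the same field of view,
    so only the resolution differs. Sizes are given as (width, height).
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Integer arithmetic avoids floating-point rounding surprises.
    # The top-left corner is rounded down and the bottom-right corner is
    # rounded up, so the mapped area never cuts off part of the object.
    x0 = box.x0 * dst_w // src_w
    y0 = box.y0 * dst_h // src_h
    x1 = min(dst_w, -(-box.x1 * dst_w // src_w))
    y1 = min(dst_h, -(-box.y1 * dst_h // src_h))
    return Box(x0, y0, x1, y1)


first_area = Box(400, 300, 1200, 900)  # area found in a 1920x1080 color image
second_area = map_box(first_area, (1920, 1080), (640, 360))
```

If the two sensors are not aligned in this way, the mapping would additionally need the calibration between them, which this sketch does not model.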

Setting the first area may include determining, from the first image, the first area corresponding to the target object and the identification information of the target object using a trained artificial neural network. The artificial neural network may be a neural network trained to output, for an input image, the area corresponding to an object within the input image and identification information of that object.

The reference model is a model including the three-dimensional shape of the target object. Determining the size, orientation, and position of the target object may include determining the size, orientation, and position of the target object in the first scene by comparing the three-dimensional shape derived from the depth information in the second area with the three-dimensional shape of the target object included in the reference model. Here, the size may include the ratio of the size of the target object in the first scene to a predetermined reference size of the target object, the orientation may include the degree of rotation of the target object with respect to at least one reference direction, and the position may include the position, in the first scene, of a predetermined reference point set for the target object.
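The text does not specify how the two three-dimensional shapes are compared. As one possible illustration, the sketch below uses the closed-form least-squares similarity alignment of Umeyama (a swapped-in technique, not taken from the patent) to recover a scale, a rotation, and a translation. It assumes that point correspondences between the reference model and the scene are already known, which a real system would have to establish first. The function name `estimate_similarity` is hypothetical.

```python
import numpy as np


def estimate_similarity(model_pts: np.ndarray, scene_pts: np.ndarray):
    """Estimate s, R, t such that scene ≈ s * R @ model + t.

    Assumption: row i of model_pts corresponds to row i of scene_pts.
    Both arrays have shape (n, 3).
    """
    n = model_pts.shape[0]
    mu_m = model_pts.mean(axis=0)
    mu_s = scene_pts.mean(axis=0)
    m_c = model_pts - mu_m
    s_c = scene_pts - mu_s

    # Cross-covariance between the centered scene and model points.
    cov = s_c.T @ m_c / n
    u, d, vt = np.linalg.svd(cov)

    # Guard against a reflection so that R is a proper rotation.
    sign = np.ones(3)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        sign[-1] = -1.0

    rotation = u @ np.diag(sign) @ vt
    var_m = (m_c ** 2).sum() / n
    scale = float((d * sign).sum() / var_m)
    translation = mu_s - scale * rotation @ mu_m
    return scale, rotation, translation


# Synthetic check: the "scene" is the model at 0.7x size, rotated 30 degrees
# about the z axis and shifted.
rng = np.random.default_rng(0)
model = rng.normal(size=(50, 3))
angle = np.deg2rad(30.0)
true_rotation = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
scene = 0.7 * model @ true_rotation.T + np.array([0.2, -0.1, 1.5])
scale, rotation, translation = estimate_similarity(model, scene)
```

In the terms used above, `scale` plays the role of the size ratio, `rotation` the orientation, and `scale * rotation @ p + translation` gives the scene position of a reference point `p` defined on the model.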

Displaying the third image may include displaying a third image in which at least one part constituting the target object is arranged adjacent to the target object, taking into account the size, orientation, and position of the target object.

Displaying the third image may further include determining the display position of the at least one part according to at least one of the orientation of the target object and the coupling direction and coupling order of the at least one part.
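One way to picture this placement step is to spread the parts along the coupling direction, in coupling order, with a gap that grows with the object's size. The sketch below is an assumption-laden illustration: the function `part_positions`, the `base_spacing` value, and the part names are all hypothetical, and the coupling axis is taken to be already rotated by the object's orientation.

```python
def part_positions(origin, axis, scale, part_names, base_spacing=0.1):
    """Return {part name: (x, y, z)} with parts spread along `axis`.

    origin       : reference point of the target object in the scene
    axis         : coupling direction after applying the object's orientation
    scale        : object size relative to its reference size
    part_names   : parts in coupling order (first = closest to the object)
    base_spacing : gap between parts at reference size (hypothetical value)
    """
    norm = sum(c * c for c in axis) ** 0.5
    if norm == 0:
        raise ValueError("axis must be non-zero")
    unit = [c / norm for c in axis]
    step = base_spacing * scale
    return {
        name: tuple(o + u * step * (k + 1) for o, u in zip(origin, unit))
        for k, name in enumerate(part_names)
    }


# Three hypothetical parts placed along the -z axis of an object at 2x size.
positions = part_positions(
    origin=(0.0, 0.0, 2.0),
    axis=(0.0, 0.0, -2.0),
    scale=2.0,
    part_names=["cap", "nut", "disc"],
)
```

Because the step is multiplied by `scale`, the exploded layout keeps the same proportions relative to the object however large it appears in the scene.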

After determining the size, orientation, and position of the target object, the augmented reality image display method according to an embodiment of the present invention may further include generating a fourth image from the reference model and displaying the first image with at least a portion of it replaced by the fourth image.

Replacing and displaying with the fourth image may include generating the fourth image, which is a projection image, from the reference model in consideration of the size and orientation of the target object. The projection image may be a two-dimensional image of the reference model scaled to the determined size of the target object and rotated according to the determined orientation.
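The replacement of part of the first image by the fourth image can be sketched as a simple paste operation. The example below represents images as lists of rows and treats a `None` pixel in the fourth image as transparent; both choices, and the name `replace_region`, are assumptions made for illustration only.

```python
def replace_region(first_image, fourth_image, top, left):
    """Return a copy of first_image with fourth_image pasted at (top, left).

    Images are lists of rows of pixel values. A `None` pixel in fourth_image
    is treated as transparent (an assumption of this sketch), so only the
    pixels covered by the projected model are replaced.
    """
    result = [row[:] for row in first_image]
    height, width = len(result), len(result[0])
    for dy, src_row in enumerate(fourth_image):
        for dx, pixel in enumerate(src_row):
            y, x = top + dy, left + dx
            if pixel is not None and 0 <= y < height and 0 <= x < width:
                result[y][x] = pixel
    return result


frame = [[0] * 4 for _ in range(3)]   # stand-in for the first image
patch = [[9, None], [9, 9]]           # stand-in for the fourth image
composited = replace_region(frame, patch, top=1, left=2)
```

The original frame is left untouched, so the live image remains available if the overlay has to be redrawn for the next frame.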

According to the present invention, an object in an acquired image can be recognized, and information about the recognized object can be displayed on a real-time image.

In addition, information related to the recognized object can be displayed as augmented reality within the real-time image, taking into account the size and orientation of the object within the image.

Furthermore, an image with more detailed information can be provided by replacing part of the real-time image.

FIG. 1 is a diagram schematically showing the configuration of an image providing system according to an embodiment of the present invention.
FIG. 2 is a diagram schematically showing the configuration of an image display device 100 according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining an exemplary structure of an artificial neural network.
FIG. 4 is a diagram for explaining how the server 300 trains an artificial neural network 520 using a plurality of training data 510 according to an embodiment of the present invention.
FIG. 5 is a diagram showing input data and output data of the trained artificial neural network 520.
FIG. 6 is a diagram for explaining the process by which the image display device 100 acquires the first image 410 and the second image 420 according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining the process by which the image display device 100 sets the second area 421 according to an embodiment of the present invention.
FIG. 8 is a diagram for explaining the process by which the image display device 100 determines the size, orientation, and position of a target object according to an embodiment of the present invention.
FIG. 9 is a diagram showing an exemplary third image 710 displayed on the first image 410.
FIG. 10 is a diagram showing an image in which the area corresponding to the wheel in the first image 410 is replaced with the fourth image 720.
FIG. 11 is a flowchart for explaining an augmented reality image display method performed by the image display device 100 according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention; the present invention is defined only by the scope of the claims.

Although terms such as first and second are used to describe various components, these components are not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may also be a second component within the technical spirit of the present invention.

In the following embodiments, terms such as "include" or "have" mean that the features or components described in the specification are present, and do not preclude the possibility that one or more other features or components may be added.

In the drawings, the sizes of components may be exaggerated or reduced for convenience of explanation. For example, the size and shape of each component shown in the drawings are chosen arbitrarily for convenience of explanation, so the present invention is not necessarily limited to what is shown.

Like reference numerals refer to like elements throughout the specification.

The features of the various embodiments of the present invention can be partially or fully coupled or combined with each other and, as those skilled in the art will fully understand, can be technically linked and operated in various ways. The embodiments may be implemented independently of each other or together in association.

Hereinafter, the augmented reality image display method of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram schematically showing the configuration of an image providing system according to an embodiment of the present invention.

The image providing system according to an embodiment of the present invention can provide an augmented reality image that takes into account the orientation and size of a target object in space. For example, when a vehicle wheel is the target object, the image providing system can provide an augmented reality image in which the parts constituting the wheel are arranged and displayed along the axial direction of the wheel, taking into account the size, orientation, and position of the wheel.

The image providing system according to an embodiment of the present invention can also replace the image of the target object with a projection image generated from a pre-stored reference model. For example, when a vehicle wheel is the target object as in the example above, the system can generate a projection image from the reference model in consideration of the size and orientation of the wheel, and provide the generated image in place of at least a portion of the image of the target object.

In the present invention, a 'target object' may refer to an object that is the subject of an augmented reality image. The target object may be an entire distinct object or a part of one. For example, the target object may be the entire vehicle shown in FIG. 1 or a wheel of the vehicle. However, this is only an example, and the spirit of the present invention is not limited thereto.

In the present invention, the 'orientation' of a target object may refer to the degree of rotation of the target object with respect to at least one reference direction. For example, the orientation may refer to the rotation angle of the target object about each of the three axes that define three-dimensional space. However, this is only an example, and various coordinate systems may be used to express orientation in the present invention.

In the present invention, the 'size' of a target object may refer to the size of the target object in space relative to a reference size preset for the target object. For example, the size of the target object may be a value such as 0.7 or 1.4, expressed relative to the preset reference size.
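A relative size such as 0.7 can be illustrated as a ratio of spreads between two point sets. The sketch below uses the root-mean-square distance from the centroid as the size measure, which is an assumption of this example rather than the measure defined by the patent; it is convenient because it does not change when the object is rotated or moved. The names `rms_spread` and `relative_size` are hypothetical.

```python
import math


def rms_spread(points):
    """Root-mean-square distance of 3D points from their centroid."""
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in points) / n)


def relative_size(scene_points, reference_points):
    """Size of the observed object relative to the reference size.

    Assumption: both point sets sample the same shape in a comparable way.
    """
    return rms_spread(scene_points) / rms_spread(reference_points)


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# The same shape observed at 0.7x size and shifted within the scene.
observed = [(0.7 * x + 2.0, 0.7 * y - 1.0, 0.7 * z + 5.0) for x, y, z in reference]
size = relative_size(observed, reference)
```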

In the present invention, the 'position' of a target object may refer to the position, in a specific scene (or specific space), of a predetermined reference point set for the target object. For example, the position of the target object may be expressed as two-dimensional coordinates within a specific scene.

In the present invention, a 'reference model' is a three-dimensional model created in advance for a target object, and may refer to data including information about the three-dimensional shape and/or appearance of the object. For example, when the target object is a wheel, the reference model may be data including three-dimensional shape information of the wheel.

In the present invention, a 'projection image' may refer to a two-dimensional image of the reference model at a specific size and rotated according to a specific orientation.
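A projection image in this sense can be sketched by rotating and scaling the model's vertices and then dropping the depth coordinate. The example below assumes an orthographic projection and a rotation order of x, then y, then z; neither is stated in the text, and a perspective projection would be an equally valid choice. The function names are hypothetical.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians): R = Rz @ Ry @ Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def project(vertices, scale, angles):
    """Rotate and scale model vertices, then project them onto the x-y plane.

    Assumption: orthographic projection, i.e. the z coordinate is discarded.
    """
    r = rotation_matrix(*angles)
    projected = []
    for v in vertices:
        x = scale * sum(r[0][k] * v[k] for k in range(3))
        y = scale * sum(r[1][k] * v[k] for k in range(3))
        projected.append((x, y))
    return projected


# A vertex on the x axis, shown at 2x size after a 90-degree turn about z,
# lands on the y axis of the image plane.
points_2d = project([(1.0, 0.0, 0.0)], scale=2.0, angles=(0.0, 0.0, math.pi / 2))
```

Rasterizing the projected vertices into pixels is a separate step that this sketch leaves out.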

In the present invention, a 'first image' may refer to an image including color information of the target object or of the environment in which the target object is located. For example, the first image may be an RGB image generated by an image acquisition device that mainly records light at wavelengths in the visible band.

In the present invention, a 'second image' may refer to an image including depth (or distance) information of the target object or of the environment in which the target object is located. For example, the second image may be an image generated from distance information acquired by a LiDAR.
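To compare a depth image with a three-dimensional model, the depth values first have to be turned into three-dimensional points. The sketch below does this with a pinhole camera model; the intrinsic parameters `fx`, `fy`, `cx`, `cy` and the convention that a depth of 0 means "no measurement" are assumptions of this example, not details given in the text.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Convert a depth image into a list of 3D points (x, y, z).

    depth  : rows of depth values in metres; 0 marks a missing measurement
    fx, fy : focal lengths in pixels (assumed pinhole camera model)
    cx, cy : principal point in pixels
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


depth_image = [
    [0.0, 2.0],
    [2.0, 4.0],
]
cloud = depth_to_points(depth_image, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Restricting the loops to the pixels of the second area would yield only the points that belong to the target object.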

In the present invention, an 'artificial neural network' is one trained for a predetermined purpose, and may refer to one trained by a machine learning or deep learning technique. The artificial neural network is described later with reference to FIGS. 3 to 5.

As shown in FIG. 1, the image providing system according to an embodiment of the present invention may include an image display device 100, a target object 200, and a server 300.

The image display device 100 according to an embodiment of the present invention can provide an augmented reality image that takes into account the orientation and size of the target object 200 in space. The image display device 100 can also replace the image of the target object 200 with a projection image generated from a pre-stored reference model.

FIG. 2 is a diagram schematically showing the configuration of the image display device 100 according to an embodiment of the present invention.

The image display device 100 according to an embodiment of the present invention may include a first image acquisition unit 110, a second image acquisition unit 120, a control unit 130, a communication unit 140, a memory 150, and a display unit 160.

The first image acquisition unit 110 according to an embodiment of the present invention may acquire a first image of the target object 200. For example, the first image acquisition unit 110 may acquire an RGB image including color information of the target object 200.

The second image acquisition unit 120 according to an embodiment of the present invention may acquire a second image of the target object 200. For example, the second image acquisition unit 120 may acquire a depth image including depth (distance) information of the target object 200.

The control unit 130 according to an embodiment of the present invention may handle the series of processes for providing an augmented reality image of the target object 200. The control unit 130 may include any type of device capable of processing data, such as a processor. Here, a 'processor' may refer to a data processing device embedded in hardware that has physically structured circuitry for performing functions expressed as code or instructions included in a program. Examples of such hardware-embedded data processing devices include a microprocessor, a central processing unit (CPU), a processor core, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA), but the scope of the present invention is not limited thereto.

The communication unit 140 according to an embodiment of the present invention may refer to hardware and/or software that enables the image display device 100 to exchange data with other network devices such as the server 300. For example, the communication unit 140 may receive a reference model from the server 300 and store it in the memory 150. However, this is only an exemplary function, and the function of the communication unit 140 is not limited thereto.

The memory 150 according to an embodiment of the present invention temporarily or permanently stores data, instructions, programs, program code, or combinations thereof processed by the control unit 130. For example, the memory 150 may temporarily and/or permanently store data related to a plurality of reference models. The memory 150 may include magnetic storage media or flash storage media, but the scope of the present invention is not limited thereto.

The display unit 160 according to an embodiment of the present invention may display a real-time image and/or an augmented reality image of the target object 200. The display unit 160 may be, for example, any one of a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display panel (PDP), a light-emitting diode (LED) display, and an organic light-emitting diode (OLED) display, but the spirit of the present invention is not limited thereto.

For convenience of explanation, FIG. 1 shows the image display device 100 in the form of a tablet PC, but the form of the image display device 100 is not limited to this. For example, the image display device 100 may be implemented in a wearable form that can be attached to the body (for example, glasses or a watch).

In the present invention, the target object 200 may refer to an object that is the subject of an augmented reality image, as described above. The target object may be an entire distinct object or a part of one. For example, the target object may be the entire vehicle shown in FIG. 1 or a wheel of the vehicle. However, this is only an example, and the spirit of the present invention is not limited thereto.

In the present invention, the server 300 may be a device that provides data to the image display device 100 described above. For example, the server 300 may provide the image display device 100 with data related to the artificial neural network used to generate identification information of the target object (for example, trained coefficients and weights). The server 300 may also provide the image display device 100 with data related to a plurality of reference models. However, these data are only examples, and the spirit of the present invention is not limited thereto.

In the following, the structure and training process of the artificial neural network are described first, and the process by which the image display device 100 provides an augmented reality image is described afterwards.

도 3은 인공 신경망의 예시적인 구조를 설명하기 위한 도면이다.Figure 3 is a diagram for explaining an exemplary structure of an artificial neural network.

본 발명의 일 실시예에서, 인공 신경망은 서버(300)에 의해 학습되어 영상 표시 장치(100)로 전송될 수 있다. 이하에서는 인공 신경망이 서버(300)에 의해 학습됨을 전제로 설명한다.In one embodiment of the present invention, the artificial neural network may be learned by the server 300 and transmitted to the video display device 100. Hereinafter, the description will be made on the premise that the artificial neural network is learned by the server 300.

본 발명의 일 실시예에 따른 인공 신경망은 도 3에 도시된 바와 같은 합성 곱 신경망(CNN: Convolutional Neural Network) 모델에 따른 인공 신경망일 수 있다. 이때 CNN 모델은 복수의 연산 레이어(Convolutional Layer, Pooling Layer)를 번갈아 수행하여 최종적으로는 입력 데이터의 특징을 추출하는 데 사용되는 계층 모델일 수 있다. 이때 본 발명의 일 실시예에 따른 서버(300)는 학습 데이터를 지도학습(Supervised Learning) 기법에 따라 처리하여 인공 신경망 모델을 구축하거나 학습시킬 수 있다.The artificial neural network according to an embodiment of the present invention may be an artificial neural network based on a convolutional neural network (CNN) model as shown in FIG. 3. At this time, the CNN model may be a hierarchical model used to ultimately extract features of input data by alternately performing a plurality of computational layers (Convolutional Layer, Pooling Layer). At this time, the server 300 according to an embodiment of the present invention can build or learn an artificial neural network model by processing the learning data according to a supervised learning technique.

본 발명의 일 실시예에 따른 서버(300)는 입력 데이터의 특징 값을 추출하기 위한 컨볼루션 레이어(Convolution layer), 추출된 특징 값을 결합하여 특징 맵을 구성하는 풀링 레이어(pooling layer)를 생성할 수 있다. The server 300 according to an embodiment of the present invention generates a convolution layer for extracting feature values of input data and a pooling layer for forming a feature map by combining the extracted feature values. can do.

또한 본 발명의 일 실시예에 따른 서버(300)는 생성된 특징 맵을 결합하여, 입력 데이터가 복수의 항목 각각에 해당할 확률을 결정할 준비를 하는 풀리 커넥티드 레이어(Fully Connected Layer)를 생성할 수 있다. In addition, the server 300 according to an embodiment of the present invention combines the generated feature maps to generate a fully connected layer that prepares to determine the probability that the input data corresponds to each of a plurality of items. You can.

Finally, the server 300 may compute an output layer containing the output corresponding to the input data.
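
The layer sequence described above (convolution, pooling, fully connected, output) can be sketched as a minimal forward pass. This is only an illustration under stated assumptions: the specification does not name the pooling type, activation, or output function, so max pooling and softmax are chosen here for concreteness, and all function names are hypothetical.

```python
import math

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) to extract feature values."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(feature_map, ph, pw):
    """Combine feature values over non-overlapping ph x pw blocks (max pooling is an assumption)."""
    out_h = len(feature_map) // ph
    out_w = len(feature_map[0]) // pw
    return [[max(feature_map[r * ph + i][c * pw + j]
                 for i in range(ph) for j in range(pw))
             for c in range(out_w)]
            for r in range(out_h)]

def fully_connected(vector, weights, biases):
    """One score per item: weighted sum of all pooled features plus a bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn scores into probabilities (softmax is an assumption, not from the specification)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, pool_shape, fc_weights, fc_biases):
    """Convolution -> pooling -> fully connected -> output probabilities."""
    pooled = max_pool(conv2d_valid(image, kernel), *pool_shape)
    flat = [v for row in pooled for v in row]
    return softmax(fully_connected(flat, fc_weights, fc_biases))
```

A trained network would use many kernels and learned weights; the single kernel here only shows how data flows between the layers.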

In the example shown in FIG. 3, the input data is divided into blocks of 5X7 form, unit blocks of 5X3 form are used to generate the convolution layer, and unit blocks of 1X4 or 1X2 form are used to generate the pooling layers; however, this is illustrative and the spirit of the present invention is not limited thereto.

The division size of the input data, the size of the unit block used in the convolution layer, the number of pooling layers, the size of the unit block of each pooling layer, and the like may be items included in a parameter set representing the training conditions of the artificial neural network. In other words, the parameter set may include parameters (i.e., structural parameters) for determining the above items.

Accordingly, the structure of the artificial neural network may change as the parameter set is changed and/or adjusted, and the training result may therefore differ even when the same training data is used.
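
To illustrate how a parameter set fixes the network structure, the sketch below derives the feature-map shapes from the structural parameters. It assumes stride-1 convolution without padding and non-overlapping pooling, neither of which is stated in the specification; `StructureParams` and `feature_shapes` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class StructureParams:
    input_block: Tuple[int, int]              # (rows, cols) of one input block
    conv_block: Tuple[int, int]               # (rows, cols) of the convolution unit block
    pool_blocks: Tuple[Tuple[int, int], ...]  # one (rows, cols) unit block per pooling layer

def feature_shapes(params: StructureParams) -> List[Tuple[int, int]]:
    """Return the feature-map shape after the convolution and after each pooling layer."""
    rows = params.input_block[0] - params.conv_block[0] + 1
    cols = params.input_block[1] - params.conv_block[1] + 1
    if rows < 1 or cols < 1:
        raise ValueError("convolution block is larger than the input block")
    shapes = [(rows, cols)]
    for pool_rows, pool_cols in params.pool_blocks:
        rows, cols = rows // pool_rows, cols // pool_cols
        if rows < 1 or cols < 1:
            raise ValueError("pooling block is larger than the feature map")
        shapes.append((rows, cols))
    return shapes
```

With the block sizes from FIG. 3, `feature_shapes(StructureParams((5, 7), (5, 3), ((1, 2),)))` gives a different structure than the same input with a 1X4 pooling block, which is why the same training data can yield different results.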

Meanwhile, such an artificial neural network may be stored in the memory (not shown) of the server 300 in the form of the coefficients of at least one node constituting the artificial neural network, the weights of the nodes, and the coefficients of functions defining the relationships between the plurality of layers constituting the artificial neural network. In addition, the data stored in the server 300 (data related to the artificial neural network) may be transmitted to the image display device 100 and used in the process by which the image display device 100 displays an augmented reality image. This is described in detail below.

FIG. 4 is a diagram illustrating a method by which the server 300 according to an embodiment of the present invention trains an artificial neural network 520 using a plurality of training data 510.

The artificial neural network 520 according to an embodiment of the present invention may refer to a neural network that has learned (or is learning) the correlation between the input data and the output data included in each of the plurality of training data 510. The input data and output data included in each of the plurality of training data 510 may be determined in various ways depending on the purpose and/or use of the system.

Accordingly, the artificial neural network 520 according to an embodiment of the present invention may refer to a neural network that has been trained (or is being trained) to output, in response to input data, the output data corresponding to it.

For example, when the artificial neural network 520 is used to recognize an object in an image, each of the plurality of training data 510 may include an input image, the position of the object within that image, the size of the object within that image, and the class of the object.

For example, the first training data 511 may include an image 511A containing the object to be recognized, the position 511B of the object within the image 511A, the size 511C of the object within the image 511A, and the class 511D of the object. Similarly, the second training data 512 and the third training data 513 may each include the above items. However, the above items are illustrative and the spirit of the present invention is not limited thereto.
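
One way to picture a single training data item is as a record holding the four items listed above. The sketch below is illustrative only: the field names are hypothetical, and it assumes the position is the center of the object in pixel coordinates, which the specification states for the network output but not explicitly for the training data.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrainingSample:
    image: List[List[int]]         # input image (cf. 511A), here a 2D grid of pixel values
    position: Tuple[float, float]  # object position in the image (cf. 511B), assumed to be the center (x, y)
    size: Tuple[float, float]      # object size in the image (cf. 511C) as (width, height)
    object_class: str              # object class (cf. 511D)

def is_consistent(sample: TrainingSample) -> bool:
    """Check that the labeled region has positive size and lies inside the image."""
    height, width = len(sample.image), len(sample.image[0])
    x, y = sample.position
    w, h = sample.size
    return (w > 0 and h > 0
            and x - w / 2 >= 0 and y - h / 2 >= 0
            and x + w / 2 <= width and y + h / 2 <= height)
```

A check of this kind could be run over the training data before supervised training to catch labels that fall outside their images.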

FIG. 5 is a diagram showing the input data and output data of the trained artificial neural network 520.

As described above, the artificial neural network 520 may refer to a neural network that has been trained (or is being trained) to output, in response to input data, the output data corresponding to it.

When an image 531 including a target object is input, the artificial neural network 520 according to an embodiment of the present invention may output, as output data 532, the area corresponding to the target object within the image and the identification information of the object. Here, the 'area' may be output in the form of the position of the target object within the image and the size of the target object within the image. For example, as shown in FIG. 5, the area may be output in the form of center coordinates (X, Y), a horizontal size (W), and a vertical size (H). However, this is illustrative and the spirit of the present invention is not limited thereto.
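
For cropping or drawing, an area given as center coordinates (X, Y) with sizes (W, H) is often converted to corner coordinates. The helper below is a small sketch of that conversion; the function name and the clamping to the image bounds are assumptions added for illustration.

```python
import math

def region_to_corners(x, y, w, h, image_width, image_height):
    """Convert a center/size region to (left, top, right, bottom) pixel bounds,
    clamped to the image. Right and bottom are exclusive."""
    left = max(0, math.floor(x - w / 2))
    top = max(0, math.floor(y - h / 2))
    right = min(image_width, math.ceil(x + w / 2))
    bottom = min(image_height, math.ceil(y + h / 2))
    return left, top, right, bottom
```

For example, a region centered at (50, 40) with W = 20 and H = 10 maps to the bounds (40, 35, 60, 45).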

The following description focuses on the process by which the image display device 100 displays an augmented reality image, assuming that the artificial neural network 520 has been trained according to the above process and transmitted from the server 300 to the image display device 100. For convenience of explanation, the target object is assumed to be a 'wheel' of a vehicle.

FIG. 6 is a diagram illustrating a process in which the image display device 100 according to an embodiment of the present invention acquires the first image 410 and the second image 420.

The image display device 100 according to an embodiment of the present invention may acquire a first image 410 including color information of the target object and a second image 420 including depth information of the target object. Here, the first image 410 and the second image 420 may be images captured of the same first scene 400.

In other words, the first image 410 may be an image that collects color information of the first scene 400, and the second image 420 may be an image that collects depth information of the first scene 400. In the present invention, the capture range of the first image 410 and that of the second image 420 may be identical, or one capture range may contain the other.

The image display device 100 according to an embodiment of the present invention may set a first area corresponding to the target object within the first image 410.

The image display device 100 according to an embodiment of the present invention may set the first area using the artificial neural network 520 described with reference to FIGS. 3 to 5. For example, the image display device 100 may input the first image 410 to the artificial neural network 520 and obtain, as its output, the first area (position and size) corresponding to the wheel and the identification information (class 'wheel').

When a plurality of target objects are identified in the first image 410, the image display device 100 according to an optional embodiment of the present invention may generate an area and identification information for each of the plurality of target objects and present the plurality of target objects to the user in a selectable form. The user may select at least one of the presented target objects, and the process described below may proceed accordingly.

The image display device 100 according to an embodiment of the present invention may set, in the second image 420, a second area corresponding to the first area.

FIG. 7 is a diagram illustrating a process in which the image display device 100 according to an embodiment of the present invention sets the second area 421.

For convenience of explanation, it is assumed that the target object is a wheel and that, as shown in the output of the artificial neural network 520 in FIG. 5, the area containing the wheel has been set as the first area according to the process described above.

Under this assumption, the image display device 100 according to an embodiment of the present invention may set a second area 421 corresponding to the first area in the second image 420. The second area 421 thus set may include only the depth information of the area corresponding to the first area on the first image 410. That is, the first area and the second area 421 may correspond to partial images that hold color information and depth information, respectively, for the same part of the first scene 400.
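
Setting the second area can be sketched as mapping the bounds of the first area into the depth image and cropping. The sketch assumes the two images cover the same field of view and differ only in resolution, which is one of the cases the description allows; when one capture range contains the other, an additional offset would be needed. All names are hypothetical.

```python
import math

def map_region(box, color_size, depth_size):
    """Scale (left, top, right, bottom) from color-image pixels to depth-image pixels.
    Sizes are (width, height). Assumes both images cover the same field of view."""
    left, top, right, bottom = box
    sx = depth_size[0] / color_size[0]
    sy = depth_size[1] / color_size[1]
    return (math.floor(left * sx), math.floor(top * sy),
            math.ceil(right * sx), math.ceil(bottom * sy))

def crop(depth, box):
    """Keep only the depth values inside the box (right and bottom exclusive)."""
    left, top, right, bottom = box
    return [row[left:right] for row in depth[top:bottom]]
```

The cropped grid then holds depth information only for the part of the scene covered by the first area.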

The image display device 100 according to an embodiment of the present invention may load a reference model using the identification information of the target object generated according to the process described above. For example, the image display device 100 may load the reference model of the wheel from the memory 150 or from the server 300.
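
A possible loading strategy is to look in local memory first and fall back to the server. The specification only says that either source may be used; the lookup order and the caching of the fetched model below are assumptions, and `load_reference_model` and `fetch_from_server` are hypothetical names.

```python
def load_reference_model(class_id, local_store, fetch_from_server):
    """Return the reference model for class_id.

    local_store: dict-like store standing in for the device memory.
    fetch_from_server: callable taking class_id and returning a model.
    """
    model = local_store.get(class_id)
    if model is None:
        model = fetch_from_server(class_id)
        local_store[class_id] = model  # keep a local copy for later frames
    return model
```

Keeping a local copy avoids a server round trip on every frame when the process is repeated as the scene changes.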

The image display device 100 according to an embodiment of the present invention may determine the size, orientation, and position of the target object in the second image 420 based on a result of comparing the second area 421, set on the second image 420 according to the process described above, with the reference model.

FIG. 8 is a diagram illustrating a process in which the image display device 100 according to an embodiment of the present invention determines the size, orientation, and position of the target object.

As described above, the reference model 610 may be a model including the three-dimensional shape of the target object. The second area 421 may be an area including depth information of the target object.

Accordingly, the image display device 100 according to an embodiment of the present invention may determine the size, orientation, and position 620 of the target object in the first scene 400 by comparing the three-dimensional shape derived from the depth information in the second area 421 with the three-dimensional shape of the target object included in the reference model 610.
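
The specification does not say how the three-dimensional shape is derived from depth or how the comparison is carried out. As one hedged illustration, the sketch below back-projects depth pixels into 3D points with a pinhole camera model (the intrinsics `fx`, `fy`, `cx`, `cy` are assumed inputs) and compares a rotation-invariant extent of the observed points with that of the reference model to obtain a rough size ratio. Orientation estimation would need a registration method and is not shown.

```python
import math

def back_project(depth, fx, fy, cx, cy):
    """Turn a depth grid into 3D points (x, y, z); pixels with depth <= 0 are skipped."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

def centroid(points):
    """Mean of the points along each axis."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def max_radius(points):
    """Distance from the centroid to the farthest point; unchanged by rotation."""
    c = centroid(points)
    return max(math.dist(p, c) for p in points)

def size_ratio(observed_points, reference_points):
    """Rough ratio of the observed object size to the reference model size."""
    return max_radius(observed_points) / max_radius(reference_points)
```

Because a depth image sees only the visible surface, a ratio computed this way is an approximation; it is meant to show the kind of quantity involved, not the method of the invention.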

Here, the size of the target object may include the ratio of the size of the target object in the first scene 400 to a predetermined reference size. For example, the size of the target object may include the ratio of the size of the wheel in the first scene 400 to the reference size of the wheel.

The orientation may include the degree of rotation of the target object with respect to at least one reference direction. For example, the orientation may include the degree of rotation of the wheel with respect to each of the X, Y, and Z axis directions.

The position may include the position, in the first scene 400, of a predetermined reference point set for the target object. For example, the position may include the position coordinates, in the first scene (or the first image 410), of the center point of the outer surface of the wheel.
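
The three quantities above (size ratio, rotation about each axis, position of a reference point) can be held in one record and applied to a point of the reference model. This is a sketch under assumptions: angles are in degrees, rotations are applied about X, then Y, then Z, and the names `Pose` and `apply_pose` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Pose:
    scale: float                               # ratio to the reference size
    rotation_deg: Tuple[float, float, float]   # rotation about the X, Y, and Z axes
    position: Tuple[float, float, float]       # position of the reference point in the scene

def rotate_xyz(point, rotation_deg):
    """Rotate a point about X, then Y, then Z (this order is an assumption)."""
    a, b, c = (math.radians(d) for d in rotation_deg)
    x, y, z = point
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)
    return x, y, z

def apply_pose(point, pose):
    """Scale, rotate, then translate a reference-model point into the scene."""
    scaled = tuple(pose.scale * coord for coord in point)
    rotated = rotate_xyz(scaled, pose.rotation_deg)
    return tuple(r + p for r, p in zip(rotated, pose.position))
```

For instance, with a scale of 2, a 90 degree rotation about Z, and a position of (1, 0, 0), the model point (1, 0, 0) lands at approximately (1, 2, 0).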

The image display device 100 according to an embodiment of the present invention may display, on the first image 410, a third image representing information related to the target object with reference to at least one of the size, orientation, and position 620 of the target object determined according to the process described above.

FIG. 9 is a diagram showing an exemplary third image 710 displayed on the first image 410.

The image display device 100 according to an embodiment of the present invention may display a third image 710 in which at least one part constituting the target object is arranged adjacent to the target object, in consideration of the size, orientation, and position of the target object. Here, the image display device 100 may determine the display position of the at least one part according to at least one of the orientation of the target object and the coupling direction and coupling order of the at least one part of the target object.

For example, as shown in FIG. 9, the image display device 100 may display, as the third image 710 for the target object (the wheel), an image in which the parts constituting the wheel are arranged along the axial direction of the wheel in consideration of their coupling order and direction.
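
Arranging parts along the wheel axis can be sketched as stepping away from the reference point in the axis direction, one step per part in coupling order. The spacing rule (a fixed spacing multiplied by the size ratio) and the function name are assumptions for illustration; the part names in the usage note are hypothetical.

```python
import math

def layout_parts(part_names, origin, axis, spacing, scale):
    """Place parts along the axis, starting from origin, in the given coupling order.

    part_names: names ordered by coupling order.
    origin: (x, y, z) of the target object's reference point.
    axis: direction of the wheel axis in the scene (need not be unit length).
    spacing: gap between parts at reference size; scale: size ratio of the object.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("axis must be a non-zero vector")
    unit = tuple(a / norm for a in axis)
    step = spacing * scale
    return [(name, tuple(o + u * step * (index + 1) for o, u in zip(origin, unit)))
            for index, name in enumerate(part_names)]
```

Calling `layout_parts(["hub cap", "lug nut"], (0, 0, 0), (0, 0, 2), 0.1, 2.0)` places the two parts 0.2 and 0.4 units from the wheel along its axis, so the arrangement follows the wheel as its orientation changes.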

In this way, the present invention does not simply display an image relating to the target object adjacent to the target object; it displays the image in consideration of the size, orientation, and position of the target object, so that more precise information can be provided.

Meanwhile, the image display device 100 according to an embodiment of the present invention may generate a fourth image from the reference model and display the first image 410 with at least a portion of it replaced by the fourth image.

FIG. 10 is a diagram showing an image in which the area corresponding to the wheel in the first image 410 has been replaced with the fourth image 720.

The image display device 100 according to an embodiment of the present invention may generate the fourth image 720, which is a projection image, from the reference model in consideration of the size and orientation of the target object. Here, the projection image may be a two-dimensional image of the reference model in a state in which it is scaled to the determined size of the target object and rotated according to the determined orientation.
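
Generating a projection image can be sketched as scaling and rotating the model vertices, placing them in front of the camera, and projecting them onto the image plane. The pinhole projection with intrinsics `fx`, `fy`, `cx`, `cy`, the X-then-Y-then-Z rotation order, and the function names are assumptions; a real renderer would also rasterize surfaces rather than project vertices only.

```python
import math

def rotate_xyz(point, rotation_deg):
    """Rotate a point about X, then Y, then Z (angles in degrees; order is an assumption)."""
    a, b, c = (math.radians(d) for d in rotation_deg)
    x, y, z = point
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    x, y = x * math.cos(c) - y * math.sin(c), x * math.sin(c) + y * math.cos(c)
    return x, y, z

def project_model(vertices, scale, rotation_deg, position, fx, fy, cx, cy):
    """Return 2D pixel coordinates of the scaled, rotated, and translated model vertices.
    Vertices that end up at or behind the camera plane (z <= 0) are skipped."""
    pixels = []
    for vertex in vertices:
        x, y, z = rotate_xyz(tuple(scale * coord for coord in vertex), rotation_deg)
        x, y, z = x + position[0], y + position[1], z + position[2]
        if z > 0:
            pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels
```

The resulting pixel coordinates indicate where the projected model would cover the first image, which is the region a fourth image would replace.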

For example, as shown in FIG. 10, the image display device 100 may generate the fourth image 720, which is a projection image, from the reference model of the wheel and display it in place of the area representing the wheel in the first image 410.

In this way, the present invention can display an image that provides more detailed information about the identified target object.

The image display device 100 according to an embodiment of the present invention may repeat the above process as the scene changes. For example, as the user changes the capture angle and/or capture position of the image display device 100, the image display device 100 may generate and provide a third image for the target object based on the changed angle and/or position.

FIG. 11 is a flowchart illustrating an augmented reality image display method performed by the image display device 100 according to an embodiment of the present invention. The description below refers to FIGS. 1 to 10 together.

The image display device 100 according to an embodiment of the present invention may acquire a first image including color information of the target object and a second image including depth information of the target object. (S1110)

As described above, the first image 410 and the second image 420 may be images captured of the same first scene 400.

The image display device 100 according to an embodiment of the present invention may set a first area corresponding to the target object within the first image 410. (S1120)

The image display device 100 according to an embodiment of the present invention may set, in the second image 420, a second area corresponding to the first area. (S1130)

The image display device 100 according to an embodiment of the present invention may load a reference model using the identification information of the target object generated according to the process described above. (S1140) For example, the image display device 100 may load the reference model of the wheel from the memory 150 or from the server 300.

The image display device 100 according to an embodiment of the present invention may determine the size, orientation, and position of the target object in the second image 420 based on a result of comparing the second area 421, set on the second image 420 according to the process described above, with the reference model. (S1150)

The image display device 100 according to an embodiment of the present invention may generate a fourth image from the reference model and display the first image 410 with at least a portion of it replaced by the fourth image. (S1160)

The image display device 100 according to an embodiment of the present invention may display, on the first image 410, a third image representing information related to the target object with reference to at least one of the size, orientation, and position 620 of the target object determined according to the process described above. (S1170)

The image display device 100 according to an embodiment of the present invention may repeat steps S1110 to S1170 described above as the scene changes. For example, as the user changes the capture angle and/or capture position of the image display device 100, the image display device 100 may generate and provide a third image for the target object based on the changed angle and/or position.
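
The order of steps S1110 to S1170 can be summarized as a skeleton in which each step is supplied as a callable. This is a structural sketch only: every parameter name is hypothetical, and the bodies of the individual steps are left to the caller.

```python
def display_ar_frame(capture, detect, map_region, load_model,
                     estimate_pose, render_replacement, render_overlay):
    """Run one pass of the display method; call again whenever the scene changes."""
    color, depth = capture()                         # S1110: first (color) and second (depth) images
    first_area, class_id = detect(color)             # S1120: first area and identification information
    second_area = map_region(first_area, depth)      # S1130: second area in the depth image
    model = load_model(class_id)                     # S1140: reference model for the identified class
    pose = estimate_pose(second_area, model)         # S1150: size, orientation, and position
    frame = render_replacement(color, model, pose)   # S1160: replace part of the first image (fourth image)
    return render_overlay(frame, model, pose)        # S1170: add the third image
```

Passing the steps in as callables keeps the skeleton independent of any particular detector, depth sensor, or renderer.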

The embodiments of the present invention described above may be implemented in the form of a computer program that can be executed through various components on a computer, and such a computer program may be recorded on a computer-readable medium. The medium may include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Furthermore, the medium may include an intangible medium implemented in a form transmittable over a network; for example, it may be a medium implemented as software or an application that can be transmitted and distributed over a network.

Meanwhile, the computer program may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of computer programs include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity of the specification, descriptions of conventional electronic components, control systems, software, and other functional aspects of such systems may be omitted. In addition, the line connections or connecting members between components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example; in an actual device they may be represented as replaceable or additional functional, physical, or circuit connections. Also, unless a component is specifically described with terms such as “essential” or “important”, it may not be a necessary component for applying the present invention.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

[National research and development project that supported this invention]

[Ministry] Gyeongsangbuk-do

[Project management (specialized) agency] Gumi Electronics and Information Technology Institute

[Research program] Gyeongsangbuk-do 4th Industrial Revolution Core Technology Development Project

[Research project title] Development of 3D object recognition technology based on artificial intelligence mixed learning (RGBD+3D) for precise registration of real objects and augmented data

[Research project number] SF321006A

[Contribution rate] 1/1

[Performing organization] QS Co., Ltd.

[Research period] 2021.07.01 ~ 2022.06.30

100: Image display device
110: First image acquisition unit
120: Second image acquisition unit
130: Control unit
140: Communication unit
150: Memory
160: Display unit
200: Target object
300: Server

Claims

An augmented reality image display method that considers the orientation and size of a target object in space, the method comprising:
acquiring a first image including color information of the target object and a second image including depth information of the target object, wherein the first image and the second image are images captured of the same first scene;
setting a first area corresponding to the target object in the first image;
setting a second area corresponding to the first area in the second image;
loading a reference model using identification information of the target object generated from the first image;
determining the size, orientation, and position of the target object in the second image based on a result of comparing the second area with the reference model; and
displaying, on the first image, a third image representing information related to the target object with reference to at least one of the size, orientation, and position of the target object,
wherein the displaying of the third image comprises displaying a third image in which at least one part constituting the target object is arranged adjacent to the target object in consideration of the size, orientation, and position of the target object.

The method of claim 1,
wherein the setting of the first area comprises
determining, from the first image, the first area corresponding to the target object and the identification information of the target object using a trained artificial neural network,
and wherein the artificial neural network is
a neural network trained to output, for an input image, an area corresponding to an object and identification information of the object.

The method of claim 1,
wherein the reference model is a model including the three-dimensional shape of the target object,
wherein the determining of the size, orientation, and position of the target object comprises
determining the size, orientation, and position of the target object in the first scene by comparing a three-dimensional shape derived from the depth information in the second area with the three-dimensional shape of the target object included in the reference model,
wherein the size
includes a ratio of the size of the target object in the first scene to a predetermined reference size of the target object,
wherein the orientation
includes a degree of rotation of the target object with respect to at least one reference direction,
and wherein the position
includes the position, in the first scene, of a predetermined reference point set for the target object.

(Deleted)

The method of claim 1,
wherein the displaying of the third image further comprises
determining a display position of the at least one part according to at least one of the orientation of the target object and the coupling direction and coupling order of the at least one part.

The method of claim 1,
wherein the augmented reality image display method further comprises,
after the determining of the size, orientation, and position of the target object,
generating a fourth image from the reference model and displaying the first image with at least a portion of the first image replaced by the fourth image.

The method of claim 6,
wherein the replacing and displaying with the fourth image comprises
generating the fourth image, which is a projection image, from the reference model in consideration of the size and orientation of the target object,
and wherein the projection image is
a two-dimensional image of the reference model in a state in which the reference model is scaled to the determined size of the target object and rotated according to the orientation.