WO2023120770A1

WO2023120770A1 - Method and apparatus for interaction between cognitive mesh information generated in three-dimensional space and virtual objects

Info

Publication number: WO2023120770A1
Application number: PCT/KR2021/019669
Authority: WO
Inventors: 한상준
Original assignee: 엔센스코리아주식회사
Priority date: 2021-12-22
Filing date: 2021-12-22
Publication date: 2023-06-29
Also published as: KR20230095197A

Abstract

In order to implement augmented reality technology that synthesizes virtual objects in a certain space, the present invention relates to a method that creates pieces of three-dimensional cognitively segmented mesh information for an interaction between a virtual object and updates information on a position, direction, and state of the virtual object such that the virtual object interacts with a real space, thereby enhancing a user's sense of presence and manipulability. More specifically, three-dimensional space information is generated by estimating a geometric relationship by using continuous images obtained from a camera, and cognitive domain segmentation is performed on the images of the current space by using a convolutional neural network, and this is combined with the space information to generate segmented cognitive mesh information. The present invention relates to a method and an apparatus for implementing an interaction between cognitive mesh information generated in a three-dimensional space and virtual objects by continuously updating location information, direction information, and state information in which the virtual objects disposed in the three-dimensional space may physically exist, determining whether the virtual objects can actually exist at the current location by using the cognitive mesh information to cause the virtual objects and the cognitive mesh information to interact with each other, and comparing the positional relationship between the virtual objects and the 3-dimensional mesh information to represent occlusion.

Description

Method and device for interaction between cognitive mesh information generated in 3D space and virtual objects

본 발명은 가상의 객체를 실제 공간 위에 존재하는 것처럼 합성할 수 있는 실-가상 정합을 구현하기 위한 기술로서, 실제 공간의 형상으로부터 인지적으로 분할된 3차원의 매쉬 정보를 생성하여, 가상의 객체가 실제 공간에서 실재할 수 있는 위치 정보를 정확히 하고, 가상의 객체들과 실제 공간이 시각적, 물리적으로 상호작용하여 사용자로 하여금 가상의 객체가 실제 공간에 실재한다는 착각을 불러 일으켜 현존감을 높일 수 있는 실-가상 정합 콘텐츠를 제공하기 위한 방법 및 장치에 관한 것이다.The present invention is a technology for implementing real-virtual matching that can synthesize virtual objects as if they exist in a real space, and generates three-dimensional mesh information cognitively segmented from the shape of a real space to create a virtual object. Accurate location information that can exist in real space, and visual and physical interaction between virtual objects and real space gives users the illusion that virtual objects exist in real space, increasing the sense of presence. A method and apparatus for providing real-virtual matched content.

일반적으로 증강현실(Augmented Reality) 기술이란, 현실 공간 위에 가상의 물체를 합성하여 중첩하고 이를 보여주는 가상현실(Virtual Reality) 기술에서 파생된 기술 분야로써, 가상현실 기술보다 가상의 물체를 보여주는데 있어서 사용자로 하여금 가상의 물체가 실제로 현실 공간에 존재하는 것처럼 착각을 일으켜 현존감(Presence)을 높여줄 수 있다.In general, augmented reality technology is a technology field derived from virtual reality technology that synthesizes and overlaps virtual objects on a real space and shows them. It can increase the sense of presence by creating an illusion as if the virtual object actually exists in the real space.

첫번째 일례로서 뎁스 카메라를 이용하여 획득한 뎁스 영상으로부터 3D 포인트 클라우드 맵을 생성하고 3D 포인트 클라우드 맵으로 증강 콘텐츠가 투영될 실 공간의 객체를 추적하여, 프로젝터와 같은 디스플레이 장치를 이용해 실제 공간 상에 가상의 물체를 투영하여 직접 중첩시키고 사용자와의 인터랙션을 할 수 있는 방법이 있다.As a first example, a 3D point cloud map is created from a depth image obtained using a depth camera, and an object in a real space where augmented content is to be projected is tracked with the 3D point cloud map, and a display device such as a projector is used to create a virtual image in real space. There is a way to directly overlap objects by projecting them and interact with the user.

두번째 일례로서는 3차원의 현실 공간에서 사용자의 동작을 인식하여 이벤트를 발생시키기 위한 방법으로, 뎁스 카메라를 이용하여 획득한 뎁스 영상으로부터, 가상 공간 상에서의 대상 객체의 위치값을 연산하여 기준위치 데이터베이스와 비교하여 이벤트 실행 신호를 방생하는 방법이 있다.As a second example, as a method for generating an event by recognizing a user's motion in a three-dimensional real space, the position value of a target object in virtual space is calculated from a depth image obtained using a depth camera to obtain a reference position database and There is a way to compare and generate an event execution signal.

하지만 상기의 일례들의 방법에서는 카메라 또는 센서를 통해 얻은 3차원 공간정보만 가지고 있으므로, 이렇게 얻은 3D 포인트 클라우드 맵을 통하여 생성한 단편적 매쉬 정보는 공간과 사물이 분리되지 않으며, 또 공간에 해당하는 영역의 속성을 알 수 없는것과 같이 인지적 정보를 얻는 것에 한계가 있다.However, since the methods of the examples above have only 3D spatial information obtained through a camera or sensor, the fragmentary mesh information generated through the 3D point cloud map obtained in this way does not separate the space and the object, and the area corresponding to the space There is a limit to obtaining cognitive information, such as unknown properties.

또한 사용자는 실제의 공간에서 가상의 객체를 증강하기 위하여, 일반적으로는 사용자가 선택한 객체를 공간에 배치하고, 사물의 크기를 조절하거나 이동하는 등 사용자의 의도에 의하여만 증강현실을 구현하는 한계가 있다.In addition, in order to augment a virtual object in a real space, there is a limit to implementing augmented reality only according to the user's intention, such as placing an object selected by the user in the space, adjusting the size of the object, or moving it. there is.

위와 같은 한계는 증강현실을 구현함에 있어 콘텐츠를 제작하거나 시나리오를 제작하는데 실공간과의 상호작용이 무시되거나, 임의의 객체를 사전에 짜여진 각본에 의해서만 표출할 수 있으므로 현존감 높은 사용자 경험을 제공하는데 상당한 문제점이 있다.The above limitation is to provide a user experience with high presence because interaction with real space is ignored in creating content or scenarios in implementing augmented reality, or arbitrary objects can be expressed only according to a pre-planned script. There are significant problems.

또한 상기의 일례들은 모두 3차원의 실제 공간을 공간정보를 생성하기 위하여 스테레오 카메라 똔느 뎁스 카메라와 같은 심도(Depth)를 획득할 수 있는 장치가 수반되어야 한다. 때문에 심도를 얻을 수 있는 카메라가 내장되어 있지 않은 기기에서는 상기 일례들과 같은 기술을 구현할 수 없는 문제점이 있다.In addition, all of the above examples must be accompanied by a device capable of acquiring depth such as a stereo camera or a depth camera to generate spatial information in a 3D real space. Therefore, there is a problem in that the technology such as the examples described above cannot be implemented in a device that does not have a built-in camera capable of obtaining a depth of field.

이러한 한계를 극복하기 위하여, 실제 공간의 2차원 연속 이미지로부터 특징점을 추출하고 연속 이미지간의 위치 정합을 통해 3차원 공간의 위치를 추적하는 3D 포인트 클라우드 맵을 생성하는 방법이 등장하였다.In order to overcome these limitations, a method of generating a 3D point cloud map that tracks the position of a 3D space by extracting feature points from 2D continuous images of real space and matching the positions of the continuous images has emerged.

하지만 이러한 기술 역시 실제 공간의 3D 포인트 클라우드 맵을 구성한 후 가상의 객체를 증강하여 표현하는데 있어서 공간의 인지적 분할을 하지 않고, 가상의 객체를 실제 공간에 유지하기 위한 앵커(Anchor point)를 계산하는데 그쳐, 실제 공간과 가상의 객체가 상호작용하지 않아 사용자가 높은 현존감을 느끼게 할 수 없는 문제점이 있다.However, this technology also constructs a 3D point cloud map of the real space and then augments and expresses the virtual object, without cognitive division of the space, and calculating an anchor point to maintain the virtual object in the real space. However, since the real space and the virtual object do not interact, there is a problem in that the user cannot feel a high sense of presence.

본 발명은 임의의 한 공간에 가상 객체들을 합성하는 증강현실 기술을 구현하고, 가상의 객체와 실제 공간 간의 상호작용을 위하여, 먼저 실제 공간의 3차원 매쉬 정보를 생성하고, 인지적 영역분할을 통해 얻은 영역 정보를 3차원 공간에 투영하여 이를 다시 3차원 매쉬 정보의 영역 분할을 위해 사용하여 인지적 매쉬 정보를 생성한 후, 가상의 객체들이 인지적 매쉬 정보를 이용하여 물리적 특성을 적용한 객체의 위치 결정과 이동 모습을 구현하고, 가상의 객체가 실제 공간의 사물간의 위치 관계를 고려하여 객체의 가려짐을 구현함으로써, 사용자로 하여금 높은 현존감을 얻을 수 있게 하고 사용자와 가상의 객체들이 실제 공간에서 서로 상호작용 할 수 있는 방법을 제공하는데 그 목적이 있다.The present invention implements an augmented reality technology that synthesizes virtual objects in an arbitrary space, and first generates 3D mesh information of the real space for interaction between the virtual object and the real space, and through cognitive domain segmentation. The obtained area information is projected onto a 3D space and used again for area division of the 3D mesh information to generate cognitive mesh information, and then virtual objects use the cognitive mesh information to apply physical characteristics to the location of the object. By realizing the decision and movement and realizing the obscuring of objects by considering the positional relationship between virtual objects and objects in real space, users can obtain a high sense of presence, and users and virtual objects interact with each other in real space. Its purpose is to provide a method that can work.

본 발명에 앞서 종래의 선행기술에서는 적외선 ToF 카메라, RGB-D 방식의 뎁스 카메라, 스테레오카메라 등을 통해 취득한 깊이 정보를 기반으로 실제 공간을 인식하였으나, 본 발명은 단안 카메라로부터 연속하여 2차원 이미지를 취득하고, 서로 이웃한 이미지 사이의 위치관계를 이용하여 실제하는 공간이나 사물과 같은 대상체까지의 거리를 추정하는 방식으로 깊이 정보를 생성하고, 이 때 각 이미지에서 추출한 특징점 중 카메라까지의 거리를 올바르게 추정하였다고 판단한 특징점들을 모아 3차원 포인트 클라우드 맵으로 구성하는 SLAM(Simultaneous Localization and Mapping) 방법을 구현하여 공간을 해석한다.Prior to the present invention, in the prior art, a real space was recognized based on depth information acquired through an infrared ToF camera, an RGB-D depth camera, a stereo camera, etc. and create depth information by estimating the distance to an object such as a real space or object using the positional relationship between adjacent images, and at this time, among the feature points extracted from each image, the distance to the camera is correctly calculated. We analyze the space by implementing the SLAM (Simultaneous Localization and Mapping) method, which collects the feature points that are determined to be estimated and configures them into a 3D point cloud map.

본 발명은 해석된 공간 정보와, 취득 시각이 동일한 2차원 이미지를, 합성곱 신경망(Convolutional Neural Network)을 이용한 영역 분할 방법을 이용하여 인지적 영역 분할(Semantic Segmentation)을 수행하고, 그 결과로서 2차원의 인지적 영역 분할 정보를 상기 해석된 공간에 투영(Projection)하여, 상기 해석된 공간 정보로부터 인지적 매쉬 정보로 영역의 의미를 부여한다.The present invention performs semantic segmentation on the analyzed spatial information and the two-dimensional image at the same acquisition time using a region segmentation method using a convolutional neural network, and as a result, 2 Dimensional cognitive region segmentation information is projected onto the analyzed space, and the meaning of the region is given to cognitive mesh information from the analyzed spatial information.

인지적 매쉬 정보는 가상의 객체를 실제 공간에 배치함에 있어, 가상 객체가 실제 공간에 보다 자연스럽게 실재할 수 있도록 3차원 공간상의 앵커(Anker) 포인트를 계산하는데 활용하며, 매쉬 정보는 가상 객체가 실제 공간에 가려져 일부 또는 그 전체의 형상이 보이지 않아야 할 상황에 가상 객체를 실제와 같이 렌더링할 수 있도록 도움을 준다.Cognitive mesh information is used to calculate anchor points in 3D space so that virtual objects can more naturally exist in real space when arranging virtual objects in real space. It helps to render a virtual object in a realistic way in a situation where some or all of its shape should not be visible because it is hidden in space.

또한, 가상의 객체는 인지적 매쉬 정보를 이용하여 실제 공간과 가상의 객체들이 서로 상호작용 할 수 있도록 부가적인 정보를 제공한다. 예를 들어, 인지적 영역 분할을 통해 실제 공간의 테이블을 인식하였다고 한다면, 가상의 객체는 테이블의 위에 실재하는 것처럼 그 위치를 고정하여 표현할 수 있어야 할 것이다. 또는, 실제 공간의 화분을 인식하였다고 한다면, 가상의 객체는 화분의 주변에 어우러져 자연스럽게 보여줄 뿐 아니라 가상의 객체가 화분 뒤에 가려져 일부 또는 전체가 보이지 않음을 표현할 수 있어야 할 것이다. 또는, 실제 공간의 매쉬 정보가 수평면의 높낮이가 있는 경우, 가상 객체의 움직임을 구현할 때 현실 세계의 물리법칙을 적용하여 중력 방향으로 가상의 객체들이 이동하거나 바닥 또는 벽 또는 사물에 부딪혀 움직임의 방향이 바뀌는 것을 표현할 수 있어야 할 것이다.In addition, the virtual object provides additional information so that the real space and the virtual objects can interact with each other using cognitive mesh information. For example, if a table in a real space is recognized through cognitive domain segmentation, the virtual object should be able to express its position by fixing it as if it were real on top of the table. Alternatively, if a potted plant in a real space is recognized, the virtual object should be displayed naturally in harmony with the planter's surroundings, and it should be possible to express that the virtual object is hidden behind the pot and partially or entirely invisible. Alternatively, if the mesh information in the real space has a height of the horizontal plane, the virtual objects move in the direction of gravity by applying the physical laws of the real world when implementing the movement of virtual objects, or the direction of movement is changed by bumping into the floor, wall, or object. You should be able to express what is changing.

이와 같이 가상의 객체들이 실제의 공간에서 어떻게 상호작용 해야 하는지에 대한 표현을 하기 위해 실제 공간의 깊이를 추출하고, 3차원 포인트 클라우드 맵을 생성하고, 생성된 공간정보로부터 매쉬 정보를 생성하고, 2차원 이미지의 인지적 영역분할을 수행하고, 이를 상기 매쉬정보의 영역분할을 위해 3차원 투영하여 결합함으로써 사물이나 영역에 따라 분할된 인지적 매쉬정보를 생성하고, IMU센서를 통해 취득한 중력 방향을 결합하여 물리법칙을 적용할 수 있는 정보를 부가하고, 사용자의 요청 또는 사전에 설계된 시나리오로부터 객체의 생성 명령을 받아, 그 객체가 실제 공간에 존재할 수 있는 좌표를 찾아 3차원 공간에 위치하도록 하고, 카메라로부터 얻은 2차원 이미지 위에 가상 객체를 합성하여 디스플레이에 출력하는 증강현실을 구현하는 실제 공간과 가상의 객체들이 서로 상호작용을 구현한다.In this way, in order to express how virtual objects should interact in real space, the depth of the real space is extracted, a 3D point cloud map is created, mesh information is created from the generated spatial information, and 2 By performing cognitive domain segmentation of a dimensional image, 3D projection and combining for domain division of the mesh information, cognitive mesh information divided according to objects or areas is generated, and the gravity direction acquired through the IMU sensor is combined. to add information that can apply the laws of physics, receive an object creation command from a user's request or a pre-designed scenario, find coordinates where the object can exist in real space, and position it in a 3-dimensional space, The real space and virtual objects that implement augmented reality that synthesizes virtual objects on the two-dimensional image obtained from and outputs them to the display implement interaction with each other.

본 발명은 가상의 객체가 실제 공간에서 서로 상호작용 하기 위하여, 카메라와, IMU센서와, 디스플레이와, 상기 증강현실을 구현할 수 있는 방법인 알고리즘을 탑재할 수 있는 전자적 기록 매체와, 이를 실행하기 위한 프로세서로 구성된 장치로 구성한다.The present invention relates to an electronic recording medium capable of mounting a camera, an IMU sensor, a display, and an algorithm, which is a method for realizing the augmented reality, in order for virtual objects to interact with each other in a real space, and a method for executing the same. It consists of a device composed of processors.

상기 장치는, 증강현실을 구현할 수 있는 방법인 알고리즘을 탑재하였는데, 이는, 영상 취득 처리와, 공간정보 생성처리와, 매쉬정보 생성 처리와, 인지적 영역분할 처리와, 인지적 매쉬정보 생성 처리로 구성된 영상해석부로부터 실제 공간의 인지적 매쉬 정보를 생성하는 단계; 가상의 객체를 사용자의 입력 또는 사전에 설계된 시나리오를 통해 가상객체 생성명령을 처리하고, 가상객체의 위치를 계산하고, 가상 객체를 상기 획득 영상 위에 인지적 매쉬 정보를 결합하여 합성하는, 객체표현부로 구성된 객체 표현 단계로 이루어진 것에 특징이 있다.The device is equipped with an algorithm that can implement augmented reality, which includes image acquisition processing, spatial information generation processing, mesh information generation processing, cognitive area division processing, and cognitive mesh information generation processing. generating cognitive mesh information of the real space from the configured image analyzer; An object expression unit that processes a virtual object creation command through a user's input or a pre-designed scenario, calculates the location of a virtual object, and synthesizes the virtual object by combining cognitive mesh information on the acquired image. It is characterized by being composed of composed object expression steps.

상기와 같은 본 발명은 사용자에게 가상의 객체가 실제 공간에 물리적으로 실재한다고 착각을 일으켜 현존감을 높일 수 있는 방법으로, 가상의 객체가 실제 공간에서 서로 상호 작용할 수 있어 증강현실 콘텐츠를 제공하는 측면에서 몰입감 있는 시나리오를 구성하는 것이 가능하게 되었다.The present invention as described above is a method for enhancing a sense of presence by creating an illusion that a virtual object physically exists in a real space to a user, in terms of providing augmented reality content as virtual objects can interact with each other in a real space. It has become possible to construct immersive scenarios.

3차원 매쉬 정보를 인지적 영역 분할을 이용하여 실제 공간의 구성을 각 사물에 따라 분리할 수 있게 되어, 가상의 객체가 실제 공간의 사물과 상호작용하도록 표현하고, 실제 사물에 가상의 객체가 고정되거나, 부착되거나, 부딪혀 튕겨 나가거나, 쌓이거나, 이에 국한하지 않고 다양한 상호작용을 할 수 있다.It is possible to separate the composition of the real space according to each object by using the 3D mesh information by cognitive domain segmentation, so that the virtual object interacts with the object in the real space, and the virtual object is fixed to the real object. It is not limited to being, being attached, being bumped and bounced off, being piled up, or being not limited thereto, and can have various interactions.

이를 통해 어느 임의의 한 공간에서, 실제의 벽에 가상의 벽시계를 걸어두는 것을 표현하거나, 실제의 테이블 주변에 가상의 의자를 배치하고, 가상의 의자가 배치된 위치에 따라 실제의 테이블에 가려져 그 일부 또는 전체가 보이지 않도록 표현하는 것이 사전에 정교하게 설계되지 않아도 사용자의 목적에 따라 손쉬운 조작으로도 구현할 수 있어, 건설, 건축, 인테리어, 3차원 제품 디자인, 게임 등 다양한 분야의 증강현실 기술로 활용할 수 있다.Through this, it is expressed that a virtual wall clock is hung on a real wall in a certain space, or a virtual chair is placed around a real table, and the virtual chair is hidden by the real table according to the position where it is placed. It can be implemented with easy operation according to the user's purpose even if it is not elaborately designed in advance to make some or all invisible, so it can be used as augmented reality technology in various fields such as construction, architecture, interior, 3D product design, and games. can

단, 본 발명의 적용은 상기 기술한 활용 예에 국한하지 않고, 본 발명의 본질적인 내용의 변경없이 상기의 실시 예시와 다른 분야에도 본 발명을 적용할 수 있다.However, the application of the present invention is not limited to the above-described utilization examples, and the present invention can be applied to fields other than the above embodiments without changing the essential content of the present invention.

도 1은 본 발명을 구성하는 카메라, IMU 센서, 디스플레이로 구성된 장치와, 본 발명의 방법을 구현하기 위한 각 부 알고리즘의 구조를 나타낸 흐름도이다.1 is a flowchart showing the structure of a device composed of a camera, an IMU sensor, and a display constituting the present invention, and each sub-algorithm for implementing the method of the present invention.

도 2는 영상해석부의 처리 과정을 상세히 기술한 순서도이다.Figure 2 is a flow chart describing in detail the processing of the image analysis unit.

도 3은 객체표현부의 처리 과정을 상세히 기술한 순서도이다.Figure 3 is a flow chart describing in detail the processing process of the object expression unit.

도 4는 인지적 영역 분할을 수행하는 합성곱 신경망을 상세히 기술한 구조도이다.4 is a structural diagram detailing a convolutional neural network that performs cognitive domain segmentation.

본 발명을 첨부된 도면을 참조하여 상세히 설명하면 다음과 같다.The present invention will be described in detail with reference to the accompanying drawings.

도 1은 2차원의 컬러 영상을 취득할 수 있는 하나의 카메라(D001)와, 증강현실을 구현하기 위한 알고리즘으로 구성된 영상해석부(S100)와 객체표현부(S200)를 저장할 수 있는 전자적 기록 매체와 이를 실행할 수 있는 프로세서와, 인지적 매쉬 정보에 가속도와 자세에 대한 물리적 환경 정보를 부가하기 위한 IMU센서(D002)와, 실제 공간의 2차원 이미지 위에 가상의 객체를 합성한 영상을 재생할 수 있는 하나의 디스플레이(D003)를 가진 장치와 그 방법에 대한 흐름도이다.1 is an electronic recording medium capable of storing a camera D001 capable of obtaining a two-dimensional color image, an image analysis unit S100 composed of an algorithm for realizing augmented reality, and an object representation unit S200. And a processor that can execute it, an IMU sensor (D002) for adding physical environment information about acceleration and attitude to cognitive mesh information, and a video that can reproduce a synthesized image of a virtual object on a two-dimensional image of a real space It is a flow chart of a device with one display (D003) and its method.

상기 흐름도의 영상해석부(S100)를 구체적으로 기술하면, 카메라(D001)로부터 영상을 얻는 영상 취득 처리(S110)를 통해 정해진 시간 간격마다 일정하게 2차원 이미지를 취득하고, 이를 획득 영상(R001)으로 일시 기억한다. 획득 영상은, 직전에 취득한 2차원 이미지와 현재 취득한 2차원 이미지로 구분하여 기억하고, 이 두 이미지에서 각각 SIFT, SURF, ORB 또는 이와 유사한 이미지의 특징점을 찾아 해당 특징점의 설명자를 생성할 수 있는 알고리즘을 이용하여 특징정보를 추출하고, 시점이 다른 두 이미지 사이의 특징정보간의 유사도를 비교하여, 최근접 특징점의 쌍(Pair)을 구하고, 쌍을 이룬 특징점의 기하학적 관계를 이용하여 t-1 프레임으로부터 t 프레임의 상대적 위치를 구한다. 이 때 특징점의 쌍이 충분히 많은 일정 간격의 연속 프레임을 키프레임 (Key Frame)으로 등록하고, 기하학적 관계로부터 3차원 위치 정보를 구한 특징점들의 군집을 3차원 포인트 클라우드맵으로 생성하는 공간정보 생성 처리(S120) 단계를 실행한다. 다시 상기 3차원 포인트 클라우드맵은 각 특징점의 3차원 공간에서의 유클리디안 거리를 구하고, 모든 특징점은 자신을 제외한 다른 특징점 중 최근접쌍을 적어도 2개 이상 구하여, 3개로 쌍을 이룬 특징점들을 이어 삼각형의 연결로 이루어진 매쉬정보를 생성하는 매쉬정보 생성 처리(S130)를 실행한다.Specifically, the image analysis unit (S100) of the above flow chart acquires a two-dimensional image at regular intervals through the image acquisition process (S110) for obtaining an image from the camera (D001), and converts it into an acquired image (R001). Temporarily remember as The acquired image is divided into the two-dimensional image acquired just before and the two-dimensional image currently acquired and stored, and an algorithm that can find feature points of SIFT, SURF, ORB or similar images from these two images, respectively, and create descriptors of the feature points. Extracts feature information using , compares the similarity between feature information between two images with different viewpoints, obtains a pair of closest feature points, and uses the geometric relationship between paired feature points to obtain information from the t-1 frame Get the relative position of the t frame. At this time, spatial information generation processing of registering continuous frames at regular intervals having a sufficiently large number of feature point pairs as key frames and generating a 3D point cloud map as a cluster of feature points obtained from 3D location information from geometric relationships (S120). ) step. Again, the 3D point cloud map obtains the Euclidean distance of each feature point in the 3D space, and for all feature points, at least two or more nearest pairs among other feature points other than the feature point are obtained, and the three paired feature points are connected into a triangle. A mesh information generating process (S130) for generating mesh information made of connections is executed.

상기 영상 취득 처리(S110)에서 취득한 2차원 이미지는, 도 4와 같이 구성된 합성곱 신경망을 이용하여 인지적 영역분할 처리(S140)를 실행하여, 2차원의 인지적 영역분할 정보를 생성한다.The 2D image obtained in the image acquisition process (S110) is subjected to a cognitive domain segmentation process (S140) using the convolutional neural network configured as shown in FIG. 4 to generate 2D cognitive domain segmentation information.

상기 2차원의 인지적 영역분할 정보는 평면 이미지와 같이 취급할 수 있는데, 이때 Fast corner detector 와 같은 코너점 추출 알고리즘을 이용하여 추출한 각 모서리의 2차원 위치를, 상기 매쉬정보 생성 처리(S130)에서 생성한 매쉬정보가 존재하는 3차원 공간에 투영(Projection)하여 3차원 정보로 변환하고, 투영된 3차원 좌표 사이의 유클리디안 거리를 구하여 최근접 포인트끼리 서로 연결하여 3차원으로 투영된 인지적 영역분할 정보를 생성하고, 상기 매쉬정보 생성 처리(S130)에서 생성한 매쉬 정보와 3차원의 인지적 영역분할 정보가 가지는 각각의 삼각형간의 거리를 구하고, 서로 근접한 삼각형 중 3차원의 인지적 영역분할 정보 측의 삼각형 각 모서리 3차원 좌표를 매쉬 정보에 추가하여 실제 공간을 세부적으로 표현할 수 있도록 분할하고, 인지적 영역 분할 정보 중 사물의 범주에 대한 정보를 매쉬 정보의 각 삼각형에 부가하여 매쉬정보의 영역이 어떤 사물을 표현하는 것인지에 대해 의미를 가지도록 하고, IMU센서(D002)로 부터 취득한 장치의 이동에 대한 가속도와 장치의 자세에 대한 정보를 얻고, 이 때 중력이 작용하는 방향을 계산하여, 상기 인지적 매쉬정보(R002)에 부가하는, 인지적 매쉬정보 생성 처리(S150)를 실행한다.The 2D cognitive region segmentation information can be treated like a flat image. At this time, the 2D location of each corner extracted using a corner point extraction algorithm such as a fast corner detector is determined in the mesh information generation process (S130). The generated mesh information is projected onto a 3D space where it is converted into 3D information, and the Euclidean distance between the projected 3D coordinates is obtained, and the closest points are connected to each other to create a 3D projected cognitive function. Area segmentation information is generated, the distance between each triangle of the mesh information generated in the mesh information generation process (S130) and the 3D cognitive area segmentation information is obtained, and 3D cognitive area segmentation is performed among triangles adjacent to each other. The three-dimensional coordinates of each corner of the triangle on the information side are added to the mesh information to divide the real space so that it can be expressed in detail. Make the area have meaning about what kind of object it represents, obtain information about the device's posture and the acceleration of the device's movement obtained from the IMU sensor (D002), and calculate the direction in which gravity acts at this time, A cognitive mesh information generation process (S150) is executed, which is added to the cognitive mesh information (R002).

상기 객체표현부(S200)는, 사용자가 디스플레이(D003)의 임의의 위치를 터치하거나, 타이머 또는 시계를 이용한 객체 출연 정보에 대한 시나리오를 이용하거나, 가상의 객체와 상호작용하기 위한 조건에 의해서나, 또는 본 발명에서 기술한 방법에 국한하지 않고 객체의 생성을 명령하는 방법에 의해서, 가상객체 생성 명령을 수신하여 일시적 기억장치에 가상객체의 출현 여부와, 위치 정보, 방향 정보, 상태 정보를 만들어 기억하는 가상객체 생성명령 처리(S210)를 실행하고, 생성 명령을 통해 입력된 위치 정보에 가상의 객체를 배치한 후, 시간의 흐름에 따라 물리적인 위치의 변화값이 존재하는지 재계산하여 객체의 위치 정보 또는 방향 정보 또는 상태 정보를 갱신하는 가상객체 위치 계산 처리(S220)를 실행하고, 인지적 매쉬정보(R002)를 이용하여 가상객체가 실제 공간에서 실재하는 것처럼 느끼도록 객체의 물리적 효과 또는 가려짐 또는 본 발명에서 기술한 방법에 국한하지 않고 상호작용하는 장면을, 획득 영상(R001)과 합성하여 실제 공간에 가상의 객체가 존재하는 것과 같이 렌더링하는 가상객체 합성 처리(S230)를 실행하고, 이를 디스플레이(D003)에 출력한다.The object expression unit (S200), the user touches an arbitrary position of the display (D003), uses a scenario for object appearance information using a timer or clock, or conditions for interacting with a virtual object , or not limited to the method described in the present invention, by receiving a virtual object creation command by a method of instructing object creation, whether or not a virtual object appears in a temporary storage device, location information, direction information, and state information are created. After executing the stored virtual object creation command processing (S210), arranging the virtual object in the location information input through the creation command, and then recalculating whether there is a change value of the physical location over time, The virtual object position calculation process (S220) for updating the location information, direction information or state information is executed, and the physical effect or masking of the object is performed to make the virtual object feel as if it is real in the real space by using the cognitive mesh information (R002). Virtual object synthesis processing (S230) is performed to synthesize the scene of interaction with the acquired image (R001) and render it as if a virtual object exists in a real space, without being limited to the method described in the present invention. This is output to the display (D003).

Claims

In a device equipped with a camera, an IMU sensor, a display, and an electronic recording medium, the step of acquiring continuous images to analyze the 3D real space with the images acquired by the camera, the step of extracting and matching features from the continuous images, A spatial information generation step consisting of generating 3D spatial information by obtaining a geometric relationship with matched feature pairs from the obtained image, generating mesh information by connecting feature points of the 3D spatial information, and cognitive domain in the acquired image. Generating cognitive domain segmentation information by segmentation; combining the spatial information and cognitive domain segmentation information obtained in the above two steps to generate cognitive segmentation information; A method for generating cognitive mesh information by combining the attitude information of the device obtained using the IMU sensor with the cognitive mesh information

Calculating the coordinates of a space in which a virtual object can physically exist in order to synthesize images obtained in a real space, and the cognitive mesh information generated in claim 1 in order for a virtual object to interact in a real space. Updating the position information, direction information, and state information of the virtual object based on , storing the state of the virtual object to preserve the state, and synthesizing the virtual object on the image obtained in the real space. Virtual object synthesis method consisting of