KR102360172B1

KR102360172B1 - Method and apparatus for controlling interaction between user and 3d object

Info

Publication number: KR102360172B1
Application number: KR1020210083057A
Authority: KR
Inventors: 배원석
Original assignee: 배원석
Priority date: 2021-06-25
Filing date: 2021-06-25
Publication date: 2022-02-08

Abstract

According to various embodiments of the present invention, a 3D modeling providing server which controls interaction between a user using a user terminal and a 3D object includes a trigger event generating unit, a target object determining unit, a 3D object obtaining unit, a 3D object providing unit and an interaction control unit. The trigger event generating unit generates a trigger event based on at least one of an audio and an image frame output while a video acquired through an image providing server is played. The target object determining unit detects the trigger event generated while the video is being reproduced through a display of the user terminal, analyzes voice output from the video from a time when the trigger event occurs in response to detection, and determines a target object to be expressed as a 3D object based on an analysis. The 3D object obtaining unit acquires a 3D object corresponding to the target object determined through at least one 3D engine. The 3D object providing unit overlaps the video and the 3D object and provides it to the user terminal. The interaction control unit may obtain a user's input through the user terminal, and perform functions of enlarging, reducing, moving, and rotating the 3D object according to the user's input. Accordingly, it is possible to satisfy user's sensibility by displaying the 3D object in a timely manner in various environments such as a 2D environment.

Description

METHOD AND APPARATUS FOR CONTROLLING INTERACTION BETWEEN USER AND 3D OBJECT

The present invention relates to a method and apparatus for controlling interaction between a user and a 3D object, and more particularly to a method and apparatus for superimposing a 3D object on a video being played and controlling the displayed 3D object.

A graphical user interface (GUI) is a working environment in which a user exchanges information with an electronic device, such as a computer, a mobile terminal, or a TV, through graphics. Examples include an environment in which a user operates an electronic device by touching an icon on its display to perform a task, and an environment in which a user interacts with a TV by selecting an on-screen menu with a remote control.

With advances in digital technology, there are many ways to create virtual graphic shapes, such as VRML. Branching off from the most basic methods of creating graphic shapes through open source, such as OpenGL, various engines exist that improve visualization performance with logical operations closer to the real world, as in the commercially developed DirectX standard. More recently, various WebGL and Web3D algorithms that render three-dimensional content on mobile devices and the web have been developed as open-source and paid code, and performance has improved through rapidly advancing GPU technology alongside traditional CPU technology, so that large numbers of 3D objects can be implemented in a virtual space.

Whereas 2D objects are flat and static, 3D objects are three-dimensional and dynamic, so they convey information more visually and intuitively and can better satisfy the user's sensibility. For this reason, the GUI environment of many electronic devices is gradually shifting from 2D GUIs to 3D GUIs, and 3D objects are used to maximize visual effect in many fields, including technology and marketing.

However, although these developments have made it common to implement 3D objects in a 3D space, implementing 3D objects in a specific 2D environment that the user is running remains limited. In addition, interaction between the user and a 3D object has basically offered only functions such as resizing and rotation in response to the user's touch or mouse input. Because the user controls the 3D object through a user terminal, there is a growing need to control the 3D object based on the state of the user terminal or the relationship between the user and the user terminal.

An object of the present invention, made to solve the above problems, is to provide a method and apparatus for displaying a 3D object in a specific 2D environment (e.g., a 2D video) in response to a specific trigger event.

Another object of the present invention, made to solve the above problems, is to provide a method and apparatus for displaying and controlling a 3D object based on the state of the user terminal or the state of the user as determined through the user terminal.

According to various embodiments, a 3D modeling providing server that controls interaction between a user using a user terminal and a 3D object may include: a trigger event generating unit that generates a trigger event, which is a kind of command for displaying the 3D object; a target object determining unit that determines, in response to the trigger event, a target object to be expressed as the 3D object; a 3D object obtaining unit that generates the 3D object corresponding to the target object; a 3D object providing unit that provides the generated 3D object to the user through the user terminal; and an interaction control unit that controls interaction between the user and the 3D object based on the user's input. The trigger event generating unit generates the trigger event based on at least one of the audio and the image frames output while a video obtained through an image providing server is played. The target object determining unit detects the trigger event generated while the video is played on the display of the user terminal, analyzes, in response to the detection, the voice output from the video from the time the trigger event occurred, and determines the target object to be expressed as a 3D object based on the analysis. The 3D object obtaining unit obtains a 3D object corresponding to the determined target object through at least one 3D engine. The 3D object providing unit superimposes the 3D object on the video and provides the result to the user terminal. The interaction control unit may obtain the user's input through the user terminal and enlarge, reduce, move, and rotate the 3D object according to the user's input.

According to various embodiments, the target object determining unit may detect at least one of the voice output from the video obtained from the image providing server, the user's voice, and the user's gesture.

According to various embodiments, the 3D object providing unit may render the video being played in black and white while rendering the 3D object in color.
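The black-and-white treatment described above can be illustrated with a minimal Python sketch. The text does not say how the conversion is done; the luma weights (0.299, 0.587, 0.114), the list-of-rows frame layout, and the names `to_grayscale` and `composite` are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255


def to_grayscale(frame: List[List[Pixel]]) -> List[List[Pixel]]:
    """Convert every video pixel to gray using assumed luma weights."""
    gray_frame = []
    for row in frame:
        gray_row = []
        for r, g, b in row:
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            gray_row.append((y, y, y))
        gray_frame.append(gray_row)
    return gray_frame


def composite(frame: List[List[Pixel]],
              overlay: List[List[Optional[Pixel]]]) -> List[List[Pixel]]:
    """Gray out the video, then keep the 3D object's rendered pixels in color.

    `overlay` has the same size as `frame`; None marks a pixel that the
    3D object does not cover (hypothetical representation).
    """
    gray = to_grayscale(frame)
    return [
        [obj if obj is not None else bg for bg, obj in zip(gray_row, obj_row)]
        for gray_row, obj_row in zip(gray, overlay)
    ]
```

Only the video layer is desaturated, so the overlaid object stands out without any change to its own rendering.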

According to various embodiments, the 3D modeling providing server may further include an initial display characteristic determining unit. The initial display characteristic determining unit determines the initial characteristics of the 3D object when it is first displayed through the user terminal, namely its initial size, initial depth, initial position, initial direction, and basic motion, and the 3D object providing unit may provide the 3D object with these initial characteristics applied.

According to various embodiments, the initial display characteristic determining unit may set the surface of the display of the user terminal, on which the video is displayed, as a first plane, set a virtual point extending perpendicularly from the center of the first plane in a second direction opposite to the first direction that the first plane faces, and determine the initial depth of the 3D object between the first plane and the virtual point. The closer the user is to the user terminal, the deeper the initial depth of the 3D object may be.
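As a rough illustration of the depth rule above, the following Python sketch maps the user-to-terminal distance onto a depth between the first plane (depth 0) and the virtual point (depth `max_depth`). The linear mapping, the distance bounds `near_m` and `far_m`, and the function name are assumptions; the text only states that a closer user yields a deeper initial depth.

```python
def initial_depth(user_distance_m: float,
                  max_depth: float,
                  near_m: float = 0.2,
                  far_m: float = 1.0) -> float:
    """Return a depth in [0, max_depth]; a closer user gives a deeper object.

    0 is the display surface (first plane) and max_depth is the virtual
    point behind it. near_m and far_m are assumed calibration bounds.
    """
    t = (user_distance_m - near_m) / (far_m - near_m)
    t = max(0.0, min(1.0, t))  # clamp distances outside the assumed range
    return max_depth * (1.0 - t)
```

Any monotonically decreasing mapping would satisfy the stated relationship; the linear form is simply the easiest to inspect.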

According to various embodiments, the trigger event generating unit may take two temporally adjacent image frames among the image frames constituting the video, subtract the pixel values at corresponding positions to generate a difference image frame, determine that a scene change exists between the two image frames when the sum of the pixel values of the generated difference image frame is equal to or less than a preset threshold, and determine the time of the scene change as the trigger event.

According to various embodiments, the trigger event generating unit may detect at least one person in the video, determine whether the detected person's mouth shape changes while voice is output from the video, and determine the person whose mouth shape changes while the voice is output to be the speaker. When the average intensity or the average frequency of the voice output from the video during a specified time changes by a first change amount, and another person whose mouth shape changes while the voice is output is detected, the unit may determine that the speaker has changed and determine the time of the speaker change as the trigger event.

According to various embodiments, the target object determining unit may recognize a plurality of words in the voice output from the video using natural language processing (NLP), determine from the recognized words a topic for determining the target object, determine the user's field of interest based on the user's search history and video viewing history obtained through the user terminal, and determine the target object based on the determined topic and the determined field of interest.
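The topic-then-interest selection described above might be sketched as follows in Python. The text does not specify the NLP model or the scoring method, so keyword counting, the `topic_keywords` table, the tagged `catalog`, and both function names are hypothetical stand-ins.

```python
from collections import Counter
from typing import Dict, List, Set


def determine_topic(words: List[str],
                    topic_keywords: Dict[str, Set[str]]) -> str:
    """Pick the topic whose keywords occur most often in the recognized words."""
    counts = Counter(w.lower() for w in words)
    scores = {
        topic: sum(counts[k] for k in keywords)
        for topic, keywords in topic_keywords.items()
    }
    return max(scores, key=scores.get)


def determine_target(topic: str,
                     interests: Set[str],
                     catalog: Dict[str, Dict[str, Set[str]]]) -> str:
    """Within the topic, pick the candidate object sharing the most tags
    with the user's fields of interest (hypothetical tag matching)."""
    candidates = catalog[topic]
    return max(candidates, key=lambda name: len(candidates[name] & interests))
```

In this sketch the interests would be derived beforehand from the search and viewing history; that derivation is outside the example.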

According to various embodiments, the 3D object obtaining unit may determine the resolution of the video and generate the 3D object with the number of vertices adjusted according to that resolution. When the resolution of the video equals a preset reference resolution, the unit generates the 3D object with a reference number of vertices; when the resolution of the video is a first resolution lower than the reference resolution, the unit may generate the 3D object with a first number of vertices smaller than the reference number.
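A minimal Python sketch of the resolution-dependent vertex budget follows. The proportional scaling by pixel count, the default reference values, the lower bound, and the cap at the reference count for higher resolutions are all assumptions; the text only requires fewer vertices for a lower resolution.

```python
def vertex_budget(width: int,
                  height: int,
                  ref_width: int = 1920,
                  ref_height: int = 1080,
                  ref_vertices: int = 10_000,
                  min_vertices: int = 500) -> int:
    """Return how many vertices to use for a video of the given resolution."""
    ratio = (width * height) / (ref_width * ref_height)
    if ratio >= 1.0:
        # At (or, by assumption, above) the reference resolution,
        # use the reference number of vertices.
        return ref_vertices
    # Below the reference resolution, scale down but keep a usable minimum.
    return max(min_vertices, int(ref_vertices * ratio))
```

For example, under these assumed defaults a 1280x720 video gets fewer vertices than a 1920x1080 one.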

According to various embodiments, to determine the initial position, the initial display characteristic determining unit may divide the display area of the video into a plurality of regions, extract straight-line and curved components from each region, obtain the color values of the pixels in each region, and calculate the deviation of those color values. A region in which the proportion of pixels forming straight-line and curved components is equal to or less than a first proportion, and in which the calculated deviation is equal to or less than a first threshold deviation, may be determined as the initial position at which to display the 3D object.
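The region test above can be sketched in Python as follows. The text does not name a method for extracting line and curve components, so a simple neighbor-gradient threshold stands in for it here; the grayscale frame layout, the population standard deviation as the deviation measure, and all names and thresholds are assumptions.

```python
from statistics import pstdev
from typing import List, Tuple

Gray = List[List[int]]  # grayscale frame, values 0..255


def edge_ratio(region: Gray, grad_threshold: int = 30) -> float:
    """Share of pixels whose difference to the left/upper neighbor is large.

    This is a crude stand-in for line and curve extraction.
    """
    h, w = len(region), len(region[0])
    edges = 0
    for y in range(h):
        for x in range(w):
            gx = abs(region[y][x] - region[y][x - 1]) if x > 0 else 0
            gy = abs(region[y][x] - region[y - 1][x]) if y > 0 else 0
            if max(gx, gy) > grad_threshold:
                edges += 1
    return edges / (h * w)


def candidate_regions(frame: Gray, rows: int, cols: int,
                      max_edge_ratio: float,
                      max_deviation: float) -> List[Tuple[int, int]]:
    """Return (row, col) indices of grid regions that are both low in
    edge content and uniform in color."""
    rh, rw = len(frame) // rows, len(frame[0]) // cols
    found = []
    for r in range(rows):
        for c in range(cols):
            region = [row[c * rw:(c + 1) * rw]
                      for row in frame[r * rh:(r + 1) * rh]]
            values = [v for row in region for v in row]
            if (edge_ratio(region) <= max_edge_ratio
                    and pstdev(values) <= max_deviation):
                found.append((r, c))
    return found
```

A flat, uncluttered area such as sky or a plain wall passes both tests, which matches the intent of placing the object where it hides the least detail.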

According to various embodiments, the initial display characteristic determining unit may obtain the yaw, pitch, and roll angles of the user terminal through a 9-axis sensor comprising the terminal's acceleration sensor, gyro sensor, and geomagnetic sensor, determine a reference direction of the 3D object, and determine the initial direction so that the yaw, pitch, and roll angles of the 3D object measured from its reference direction correspond to the yaw, pitch, and roll angles of the user terminal. When at least one of the terminal's yaw, pitch, and roll angles exceeds a first threshold angle, the angle of the initial direction of the 3D object in that direction may be set to the first threshold angle.
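The orientation rule above amounts to copying the terminal's angles and clamping each one at the threshold. A minimal Python sketch, assuming angles in degrees measured from the object's reference direction, a symmetric limit for negative angles, and a hypothetical default threshold of 45 degrees:

```python
from typing import Tuple


def initial_orientation(yaw: float, pitch: float, roll: float,
                        max_angle: float = 45.0) -> Tuple[float, float, float]:
    """Mirror the terminal's yaw/pitch/roll, limiting each to +/- max_angle."""
    def clamp(angle: float) -> float:
        return max(-max_angle, min(max_angle, angle))

    return (clamp(yaw), clamp(pitch), clamp(roll))
```

Each axis is clamped independently, so a terminal tilted far in pitch still passes its yaw and roll through unchanged.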

According to various embodiments, to determine the basic motion of the 3D object, the initial display characteristic determining unit may crawl videos that contain the target object and determine, as the basic motion, the motion of the target object that is captured most often in the crawled videos.
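Once motions have been labelled in the crawled videos, choosing the basic motion is a frequency count. The Python sketch below assumes that labelling has already happened and that each video yields a list of motion labels; the crawling and motion recognition steps themselves are not shown, and the function name is hypothetical.

```python
from collections import Counter
from typing import List


def basic_motion(motions_per_video: List[List[str]]) -> str:
    """Return the motion label observed most often across all videos."""
    counts = Counter(m for video in motions_per_video for m in video)
    if not counts:
        raise ValueError("no motions were observed")
    return counts.most_common(1)[0][0]
```

For a target object such as a dog, labels like "run", "sit", and "bark" would be tallied and the most frequent one used as the default animation.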

According to various embodiments, when a plurality of 3D objects with different depth values are displayed on the video, the interaction control unit may set a reference distance between the user and the user terminal based on the size of the display of the user terminal, and determine one reference 3D object, among the plurality of 3D objects, that corresponds to the reference distance. When the distance between the user and the user terminal equals the reference distance, the unit places an identification mark on the reference 3D object; when the distance is shorter than the reference distance, it places the identification mark on a 3D object deeper than the reference 3D object; and when the distance is longer than the reference distance, it places the identification mark on a 3D object shallower than the reference 3D object. When a change in the user's hand motion is detected within a specified time after the identification mark is placed, the unit may determine the marked 3D object as the 3D object whose motion is to be controlled.

According to various embodiments, the interaction control unit may set the reference distance to be longer as the size of the display increases, and may detect the change in the hand motion based on the number of the user's fingerprints detected through the camera of the user terminal.
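The distance-based selection among several 3D objects can be sketched in Python as below. Several details are not given in the text and are assumed here: the reference distance grows linearly with the display diagonal, the reference object is the one with the middle depth, "deeper" and "shallower" mean the next object in depth order, and a small tolerance decides when the user is "at" the reference distance. Hand-motion detection is not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Obj3D:
    name: str
    depth: float  # larger means deeper behind the display surface


def reference_distance(diagonal_inches: float,
                       metres_per_inch: float = 0.04) -> float:
    """A larger display gives a longer reference distance (assumed linear)."""
    return diagonal_inches * metres_per_inch


def pick_marked_object(objects: List[Obj3D], user_distance: float,
                       ref_distance: float, tolerance: float = 0.05) -> Obj3D:
    """Choose which object receives the identification mark."""
    ordered = sorted(objects, key=lambda o: o.depth)
    i = len(ordered) // 2  # assumed choice of the reference 3D object
    if abs(user_distance - ref_distance) <= tolerance:
        return ordered[i]
    if user_distance < ref_distance:
        # User is closer: mark the next deeper object, if any.
        return ordered[min(i + 1, len(ordered) - 1)]
    # User is farther: mark the next shallower object, if any.
    return ordered[max(i - 1, 0)]
```

Moving the terminal toward or away from the face then acts as a selector that steps through the objects by depth.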

According to the various embodiments disclosed in this document, the user's sensibility can be satisfied by displaying a 3D object at the right moment in various environments, such as a 2D environment.

In addition, according to various embodiments, the 3D object can be controlled easily through various interactions between the user and the 3D object.

Various other effects that can be understood directly or indirectly from this document may also be provided.

FIG. 1 is a diagram illustrating a 3D modeling providing system according to an embodiment.
FIG. 2 is a diagram illustrating the components of the 3D modeling providing server of FIG. 1.
FIG. 3A is a diagram illustrating an example of determining an initial display characteristic (e.g., initial depth) of a 3D object through the initial display characteristic determining unit of FIG. 2.
FIG. 3B is a diagram illustrating an example of determining an initial display characteristic (e.g., initial position) of a 3D object through the initial display characteristic determining unit of FIG. 2.
FIG. 4 is a diagram illustrating an example in which a user interacts with a 3D object through the interaction control unit of FIG. 2.
FIG. 5 is a diagram illustrating an example in which a user interacts with a plurality of 3D objects through the interaction control unit of FIG. 2.
FIG. 6 is a diagram illustrating the hardware configuration of the 3D modeling providing server of FIG. 1.

Because the present invention may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. Like reference numerals are used for like elements throughout the drawings.

Terms such as first, second, A, and B may be used to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another. For example, without departing from the scope of the present invention, a first element may be referred to as a second element, and similarly a second element may be referred to as a first element. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.

When an element is described as being "connected" or "coupled" to another element, it may be directly connected or coupled to that other element, but it should be understood that intervening elements may also be present. In contrast, when an element is described as being "directly connected" or "directly coupled" to another element, it should be understood that no intervening elements are present.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "have" are intended to indicate the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this application.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating a 3D modeling providing system 10 according to an embodiment. Referring to FIG. 1, the 3D modeling providing system 10 may include a 3D modeling providing server 100, a user terminal 200, an image providing server 300, and the like.

The 3D modeling providing server 100 may generate or obtain a 3D object through at least one 3D engine. The at least one 3D engine may include a web-based 3D engine. A 3D object may include geometric characteristics (x, y, z), display characteristics (size, depth, position, direction), light source characteristics (position, type, and intensity of the light source, etc.), and other characteristics (color, surface reflection coefficient, transparency, gloss, etc.).
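The four groups of characteristics listed above map naturally onto a small record type. The following Python sketch is only one possible representation; every class name, field name, unit, and default value is an assumption and does not reflect the format of any particular 3D engine.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class DisplayProps:
    size: float = 1.0
    depth: float = 0.0
    position: Vec3 = (0.0, 0.0, 0.0)
    direction: Vec3 = (0.0, 0.0, 0.0)  # yaw, pitch, roll in degrees (assumed)


@dataclass
class LightProps:
    position: Vec3 = (0.0, 10.0, 0.0)
    kind: str = "point"
    intensity: float = 1.0


@dataclass
class SurfaceProps:
    color: Tuple[int, int, int] = (255, 255, 255)
    reflectance: float = 0.5
    transparency: float = 0.0
    gloss: float = 0.5


@dataclass
class Object3D:
    vertices: List[Vec3]  # geometric characteristics (x, y, z)
    display: DisplayProps = field(default_factory=DisplayProps)
    light: LightProps = field(default_factory=LightProps)
    surface: SurfaceProps = field(default_factory=SurfaceProps)
```

Keeping the display characteristics in their own record makes it straightforward to apply initial values such as depth and direction before the object is first shown.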

When a resource is needed to generate and provide a 3D object according to a user's input, the 3D modeling providing server 100 may obtain the resource using a graphics library. For example, the 3D modeling providing server 100 may fetch a 3D model needed to generate the 3D object from external or internal storage, or obtain other necessary resources. The 3D modeling providing server 100 may load and execute a predetermined script file needed to generate the 3D object from the 3D model.

The 3D modeling providing server 100 may determine a target object to be expressed in 3D, generate or obtain a 3D object corresponding to the determined target object, superimpose the 3D object on a video obtained through the image providing server 300, and provide the result to the user.

The user terminal 200 may visually present to the user, through at least one display device (e.g., a display), the 3D object provided by the 3D modeling providing server 100 and the video on which the 3D object is superimposed. The user terminal 200 may control the size, position, rotation, and the like of the 3D object based on interaction between the user and the 3D object, that is, according to the user's input.

The user terminal 200 may include a display on at least one surface, and may include a camera (e.g., an image sensor), a distance sensor, and the like.

The user terminal 200 may be any device capable of communication, such as a desktop computer, a laptop computer, a notebook, a smartphone, a tablet PC, a mobile phone, a smart watch, smart glasses, an e-book reader, a portable multimedia player (PMP), a portable game console, a navigation device, a digital camera, a digital multimedia broadcasting (DMB) player, a digital audio recorder, a digital audio player, a digital video recorder, a digital video player, or a personal digital assistant (PDA).

The image providing server 300 may include a specific web server that provides videos. The image providing server 300 is not limited thereto and may include a platform server that provides videos, for example a YouTube server, a Facebook server, a TikTok server, or an Instagram server. The image providing server 300 may provide videos stored on the server, streamed videos, and videos being recorded in real time, and the videos may be in various file formats such as .avi, .mp4, .mkv, .wmv, and .flv. Alternatively, the image providing server 300 may provide a Uniform Resource Locator (URL), which is an external address through which the video can be accessed.

FIG. 2 is a diagram illustrating the components of the 3D modeling providing server 100 of FIG. 1.

Referring to FIG. 2, the 3D modeling providing server 100 may include a trigger event generating unit 101, a target object determining unit 102, a 3D object obtaining unit 103, an initial display characteristic determining unit 104, a 3D object providing unit 105, and an interaction control unit 106.

The trigger event generating unit 101 may generate a trigger event, which is a kind of command for displaying a 3D object. The trigger event generating unit 101 may generate the trigger event based on the voice and gestures of the user of the user terminal 200 and on the voice output from the video being played on the user terminal 200. Hereinafter, a statement that one component of the 3D modeling providing server 100 displays a 3D object may mean that the 3D modeling providing server 100 displays the 3D object through the user terminal 200.

The trigger event generating unit 101 may generate a trigger event based on the voice output from the video played on the user terminal 200, or on the user's voice or gesture input through the user terminal 200 while the video is being played.

For example, the trigger event generating unit 101 may generate a trigger event at a point in time at which a scene change is determined to occur in the video. Specifically, for two temporally adjacent image frames among the image frames constituting the video, the trigger event generating unit 101 may generate a difference image frame by subtracting the pixel values at corresponding positions from each other. When the sum of the pixel values constituting the difference image frame is equal to or less than a preset threshold, the trigger event generating unit 101 may determine that a scene change has occurred between the two image frames and generate a trigger event.
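The frame-differencing test can be sketched in a few lines. This is only an illustration: frames are modeled as small grayscale grids, the function names are hypothetical, and absolute differences are used so that positive and negative changes do not cancel. The passage words the test as the sum being equal to or less than the threshold; because an abrupt cut normally produces a large frame-to-frame difference, the sketch assumes the opposite direction (sum at or above the threshold), which is an interpretation and not something the text states.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame as rows of pixel values (0-255)


def difference_frame(prev: Frame, curr: Frame) -> Frame:
    """Subtract pixel values at matching positions (absolute value taken)."""
    return [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def is_scene_change(prev: Frame, curr: Frame, threshold: int) -> bool:
    """Compare the summed difference frame against a preset threshold.

    Assumption: a sum at or above the threshold is treated as a scene change.
    """
    total = sum(sum(row) for row in difference_frame(prev, curr))
    return total >= threshold


dark = [[10, 10], [10, 10]]
almost_dark = [[12, 10], [10, 11]]
bright = [[200, 210], [190, 205]]

print(is_scene_change(dark, almost_dark, threshold=100))  # small difference
print(is_scene_change(dark, bright, threshold=100))       # large difference
```

In practice the threshold would have to be scaled with the frame size, since the sum grows with the number of pixels.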

As another example, the trigger event generating unit 101 may generate a trigger event at a point in time at which the speaker in the video being played is determined to change. Specifically, the trigger event generating unit 101 may detect at least one object (e.g., a person) in the video being played and determine whether the mouth shape of the detected object changes while a voice is output from the video. The trigger event generating unit 101 may determine an object (e.g., a person) whose mouth shape changes while the voice is output to be the speaker. When the average intensity or the average frequency of the voice output from the video over a specified time changes by a first change amount (e.g., 15%), and a different object (e.g., a person) whose mouth shape changes while the voice is output is detected, the trigger event generating unit 101 may determine that the speaker has changed and generate a trigger event.
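The two conditions for a speaker change can be combined as follows. This is a minimal sketch that assumes the audio statistics and the identity of the mouth-moving object have already been extracted for each time window; the `Window` type, its field names, and the reading of "changes by a first change amount" as "changes by at least that fraction" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Hypothetical summary of one specified-time window of the video."""
    mean_intensity: float      # average voice intensity
    mean_frequency_hz: float   # average voice frequency
    speaker_id: str            # object whose mouth shape changes during the voice


def relative_change(before: float, after: float) -> float:
    """Fractional change between two values (0.15 means 15%)."""
    if before == 0:
        return 0.0 if after == 0 else float("inf")
    return abs(after - before) / abs(before)


def speaker_changed(prev: Window, curr: Window, first_change: float = 0.15) -> bool:
    """True when the voice shifts enough AND a different object is mouthing it."""
    voice_shift = (
        relative_change(prev.mean_intensity, curr.mean_intensity) >= first_change
        or relative_change(prev.mean_frequency_hz, curr.mean_frequency_hz) >= first_change
    )
    return voice_shift and curr.speaker_id != prev.speaker_id


before = Window(mean_intensity=60.0, mean_frequency_hz=120.0, speaker_id="person_a")
after = Window(mean_intensity=62.0, mean_frequency_hz=210.0, speaker_id="person_b")
print(speaker_changed(before, after))
```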

The trigger event generating unit 101 does not generate a trigger event while the video is stopped.

The target object determining unit 102 may detect the trigger event generated by the trigger event generating unit 101 and determine a target object, that is, the object to be expressed as a 3D object, corresponding to the detected trigger event. For example, the target object determining unit 102 may determine a car as the target object.

The target object determining unit 102 may recognize a plurality of words from the voice output from the video within a preset time range relative to the time at which the trigger event occurred. For example, the target object determining unit 102 may recognize a plurality of words in the video within a preset time range from the time at which a scene change is determined to occur and/or the time at which the speaker is determined to change. The target object determining unit 102 may recognize the plurality of words from the output voice using natural language processing (NLP). NLP uses techniques such as natural language analysis, natural language understanding, and natural language generation. Depending on its depth, natural language analysis can be divided into four types: morphological analysis, syntactic analysis, semantic analysis, and pragmatic analysis. Various other methods may also be used.

The target object determining unit 102 may determine a topic for determining the target object from the plurality of recognized words. Alternatively, the target object determining unit 102 may determine the topic for determining the target object based on the category or the title of the video being played.

The target object determining unit 102 may determine a target object corresponding to the determined topic. The target object determining unit 102 may determine the target object by considering the determined topic together with the user's field of interest. The user's field of interest may be determined from the user's search history, video viewing history, and the like on the user terminal 200. For example, when the topic of the video being played is an introduction to the United Kingdom and the user's field of interest is sports, a soccer ball may be determined as the target object. As another example, when the topic of the video being played is an introduction to the United Kingdom and the user's field of interest is food, fish and chips may be determined as the target object.

In response to a user input of touching or clicking an object detected in the video being played, the target object determining unit 102 may determine the touched or clicked object as the target object.

The 3D object obtaining unit 103 may obtain a 3D object for the target object through a predetermined web-based 3D engine. The 3D object obtaining unit 103 may obtain a 3D object to which the initial display characteristics determined by the initial display characteristic determining unit 104 are applied.

To minimize any visual mismatch when the 3D object is displayed superimposed on the video being played, the 3D object obtaining unit 103 may obtain the 3D object based on the resolution of that video.

In an embodiment, when the resolution of the video is a preset reference resolution, the 3D object obtaining unit 103 may generate the 3D object by obtaining images of the target object having a preset reference pixel density. For example, at a first resolution higher than the reference resolution, the 3D object obtaining unit 103 may generate the 3D object by obtaining images of the target object having a first pixel density higher than the reference pixel density. At a second resolution lower than the reference resolution, the 3D object obtaining unit 103 may generate the 3D object by obtaining images of the target object having a second pixel density lower than the reference pixel density.

In an embodiment, the 3D object obtaining unit 103 may generate the 3D object by adjusting the number of vertices or the number of polygons included in the 3D object according to the resolution of the video. At the preset reference resolution of the video, the 3D object obtaining unit 103 may obtain a 3D object having a reference number of vertices. At a first resolution lower than the reference resolution, the 3D object obtaining unit 103 may obtain a 3D object having a first number of vertices smaller than the reference number, and at a second resolution higher than the reference resolution, it may obtain a 3D object having a second number of vertices greater than the reference number.
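One simple way to realize this relationship is to scale the vertex count with the video resolution. The sketch below assumes a linear scaling on frame height, a reference of 1080 lines, and a reference count of 20,000 vertices; none of these values or the scaling rule come from the text, which only requires fewer vertices below the reference resolution and more above it.

```python
def vertex_budget(
    video_height: int,
    reference_height: int = 1080,      # assumed reference resolution
    reference_vertices: int = 20_000,  # assumed reference vertex count
) -> int:
    """Return a vertex count that grows and shrinks with the video resolution."""
    scaled = reference_vertices * video_height / reference_height
    return max(1, round(scaled))


for height in (720, 1080, 2160):
    print(height, vertex_budget(height))
```

The same function could drive the polygon count or the pixel density of the source images, since the text describes all three as varying in the same direction as the resolution.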

The initial display characteristic determining unit 104 may determine the initial display characteristics of the obtained 3D object. The initial display characteristics may include the initial size, initial depth, initial position, initial direction, and basic motion of the 3D object when it is first displayed through a display device (e.g., a display) of the user terminal 200. The basic motion may mean the motion that accounts for the largest share of the motions of the 3D object.

The 3D object providing unit 105 may display the video being played and the obtained 3D object superimposed on each other. The 3D object providing unit 105 may control the 3D object to be displayed based on the initial display characteristics determined by the initial display characteristic determining unit 104.

When the 3D object is displayed, the 3D object providing unit 105 may render the video being played in black and white while rendering the 3D object in color, thereby making the 3D object stand out.

The interaction control unit 106 may obtain an input of the user interacting with the 3D object and identify the type of the obtained input. The interaction control unit 106 may perform the function corresponding to each identified input. For example, the interaction control unit 106 may move and rotate the 3D object displayed in one area of the user terminal 200.

FIG. 3A is a diagram illustrating an example of determining an initial display characteristic (e.g., an initial depth) of a 3D object through the initial display characteristic determining unit of FIG. 2. FIG. 3B is a diagram illustrating an example of determining an initial display characteristic (e.g., an initial position) of a 3D object through the initial display characteristic determining unit of FIG. 2.

Referring to FIG. 3A, the initial display characteristic determining unit 104 may set the surface of the display on which the video is displayed as a first plane 401, and may set a virtual point 405 that extends perpendicularly from the center 403 of the first plane in a second direction opposite to a first direction 411 that the first plane 401 faces. Based on the distance between the user and the user terminal 200, the initial display characteristic determining unit 104 may determine the initial depth of the 3D object within the distance between the first plane and the virtual point. The shorter the distance between the user and the user terminal 200, the deeper the initial depth of the 3D object may be, and the longer the distance between the user and the user terminal 200, the shallower the initial depth of the 3D object may be.
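The inverse relationship between viewing distance and initial depth can be illustrated with a clamped linear mapping. The linear form and the near and far distances (20 cm and 100 cm) are assumptions for this sketch; the text only states that a closer user yields a deeper initial depth within the span between the first plane and the virtual point.

```python
def initial_depth(
    user_distance_cm: float,
    max_depth: float,        # distance from the first plane to the virtual point
    near_cm: float = 20.0,   # assumed closest viewing distance
    far_cm: float = 100.0,   # assumed farthest viewing distance
) -> float:
    """Map the user-to-terminal distance to a depth in [0, max_depth]."""
    clamped = min(max(user_distance_cm, near_cm), far_cm)
    closeness = (far_cm - clamped) / (far_cm - near_cm)  # 1.0 when near, 0.0 when far
    return max_depth * closeness


for distance in (20.0, 60.0, 100.0):
    print(distance, initial_depth(distance, max_depth=10.0))
```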

In an embodiment, the initial display characteristic determining unit 104 may obtain the user's visual acuity information from the user terminal 200. The initial display characteristic determining unit 104 may receive the visual acuity information directly from the user through the user terminal 200, or may receive the results of the user's vision test from an external server. The initial display characteristic determining unit 104 may determine the initial size of the 3D object based on the obtained visual acuity information. The visual acuity information may include visual acuity while wearing glasses and visual acuity after surgery such as LASIK or LASEK. Based on the obtained visual acuity information, the initial display characteristic determining unit 104 may increase the initial size of the 3D object as the user's visual acuity is better.

In an embodiment, the initial display characteristic determining unit 104 may determine an initial position of the 3D object. To determine the initial position, the initial display characteristic determining unit 104 may identify a blank region in the video display area. To identify the blank region, the initial display characteristic determining unit 104 may divide the video display area into a plurality of regions. The initial display characteristic determining unit 104 may apply image processing (e.g., edge detection) to the image frames constituting the video over a preset time and extract straight-line components, curved components, and the like from each of the plurality of regions. In addition, the initial display characteristic determining unit 104 may obtain the color values of the pixels included in each of the divided regions and calculate the deviation of those color values for each region.

Among the plurality of regions, the initial display characteristic determining unit 104 may determine a region in which the proportion of pixels forming straight-line or curved components is equal to or less than a first proportion, and in which the calculated deviation value is equal to or less than a first threshold deviation value, to be a blank region, and may select it as the position at which to display the 3D object. A region with low deviation is judged not to be a region cluttered with various objects, and may therefore be determined to be a suitable position for displaying the 3D object.
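The blank-region test can be sketched for a single grayscale frame as follows. Several simplifications are assumed: one frame stands in for the frames over a preset time, grayscale values stand in for color values, the population standard deviation stands in for the deviation value, and a crude neighbor-difference test stands in for a real edge detector. All function names and thresholds are hypothetical.

```python
from statistics import pstdev
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame as rows of pixel values (0-255)


def edge_mask(frame: Frame, grad_threshold: int = 30) -> List[List[bool]]:
    """Mark a pixel as an edge when it differs sharply from its left or upper neighbor."""
    mask = []
    for y, row in enumerate(frame):
        mask_row = []
        for x, value in enumerate(row):
            left = abs(value - row[x - 1]) if x > 0 else 0
            up = abs(value - frame[y - 1][x]) if y > 0 else 0
            mask_row.append(max(left, up) >= grad_threshold)
        mask.append(mask_row)
    return mask


def blank_regions(
    frame: Frame, rows: int, cols: int, max_edge_ratio: float, max_deviation: float
) -> List[Tuple[int, int]]:
    """Return (row, col) indices of grid cells with few edge pixels and low deviation."""
    height, width = len(frame), len(frame[0])
    mask = edge_mask(frame)
    found = []
    for r in range(rows):
        for c in range(cols):
            ys = range(r * height // rows, (r + 1) * height // rows)
            xs = range(c * width // cols, (c + 1) * width // cols)
            pixels = [frame[y][x] for y in ys for x in xs]
            edges = sum(mask[y][x] for y in ys for x in xs)
            if edges / len(pixels) <= max_edge_ratio and pstdev(pixels) <= max_deviation:
                found.append((r, c))
    return found


# Left half is flat; right half is a high-contrast checkerboard.
frame = [
    [100, 100, 0, 255],
    [100, 100, 255, 0],
    [100, 100, 0, 255],
    [100, 100, 255, 0],
]
print(blank_regions(frame, rows=1, cols=2, max_edge_ratio=0.1, max_deviation=5.0))
```

A production version would more likely use a library edge detector and aggregate the statistics over several frames, as the passage describes.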

In an embodiment, referring to FIG. 3B, the initial display characteristic determining unit 104 may determine an initial direction of the 3D object. The initial display characteristic determining unit 104 may determine the angle of the user terminal 200 using at least one sensor of the user terminal 200 and, based on the determined angle, determine the initial direction in which the 3D object is displayed. Specifically, the initial display characteristic determining unit 104 may obtain the angles of the user terminal 200 in the yaw, pitch, and roll directions through a 9-axis sensor of the user terminal 200 that includes an acceleration sensor, a gyro sensor, and a geomagnetic sensor. The initial display characteristic determining unit 104 may determine the initial direction of the 3D object based on the yaw, pitch, and roll angles obtained from the user terminal 200.

Referring to FIG. 3B, the yaw direction may be understood as the direction of rotation about the z-axis, the roll direction as the direction of rotation about the x-axis, and the pitch direction as the direction of rotation about the y-axis. Here, the x-axis may be understood as the axis in the longitudinal direction of the user terminal 200 or the 3D object, the y-axis as the axis in the width direction of the user terminal 200 or the 3D object, and the z-axis as the axis in the direction of gravity. The x, y, and z axes may be mutually perpendicular.

The initial display characteristic determining unit 104 may determine a reference direction of the obtained 3D object. The reference direction of the 3D object may mean the direction of the 3D object when its yaw, pitch, and roll angles are all 0 degrees. In other words, the initial display characteristic determining unit 104 may determine the reference direction with respect to the front, rear, and side directions of the 3D object. The initial display characteristic determining unit 104 may determine the initial direction so that the yaw, pitch, and roll angles of the 3D object from its reference direction correspond to the yaw, pitch, and roll angles of the user terminal. For example, when the user watches the video with the user terminal 200 rotated by +20˚ in the roll direction and +40˚ in the yaw direction, the 3D object obtaining unit 103 may obtain a 3D object rotated from the reference direction by +20˚ in the roll direction and +40˚ in the yaw direction, and the 3D object providing unit 105 may display the obtained 3D object through the user terminal 200.

When the angle of the user terminal 200 in at least one of the yaw, pitch, and roll directions exceeds a first threshold angle, the initial display characteristic determining unit 104 may set the angle of the initial direction of the 3D object in that direction to the first threshold angle. For example, with a first threshold angle of +50˚, when the user watches the video with the user terminal 200 rotated by +130˚ in the roll direction and +10˚ in the yaw direction, the 3D object obtaining unit 103 may obtain a 3D object rotated from the reference direction by +50˚ in the roll direction and +10˚ in the yaw direction, and the 3D object providing unit 105 may display the obtained 3D object through the user terminal 200.
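The mapping from terminal angles to the initial direction, including the threshold cap, can be written compactly. The sketch below is illustrative: the function and key names are hypothetical, and only the upper limit is capped because the text does not say how large negative angles are handled.

```python
from typing import Dict, Optional


def initial_direction(
    yaw: float, pitch: float, roll: float, first_threshold: Optional[float] = None
) -> Dict[str, float]:
    """Mirror the terminal's yaw/pitch/roll, capping any angle above the threshold."""

    def cap(angle: float) -> float:
        if first_threshold is not None and angle > first_threshold:
            return first_threshold
        return angle

    return {"yaw": cap(yaw), "pitch": cap(pitch), "roll": cap(roll)}


# Terminal rolled +20 and yawed +40 degrees, no threshold applied.
print(initial_direction(yaw=40.0, pitch=0.0, roll=20.0))
# Terminal rolled +130 and yawed +10 degrees with a +50 degree threshold.
print(initial_direction(yaw=10.0, pitch=0.0, roll=130.0, first_threshold=50.0))
```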

In an embodiment, the initial display characteristic determining unit 104 may determine a basic motion of the 3D object. To determine the basic motion, the initial display characteristic determining unit 104 may crawl videos that include the target object corresponding to the 3D object.

The initial display characteristic determining unit 104 may analyze the motions of the target object in the crawled videos and determine the most frequently captured motion among them as the basic motion. For example, when the 3D object is a car, the initial display characteristic determining unit 104 may determine the rotation of the car's wheels as the basic motion. As another example, when the 3D object is a helicopter, the initial display characteristic determining unit 104 may determine the rotation of the helicopter's rotor blades as the basic motion.
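Once motions have been labeled in the crawled videos, picking the basic motion reduces to a frequency count. The sketch assumes the motion analysis has already produced one text label per observed motion; the labels and the function name are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def basic_motion(observed_motions: List[str]) -> Optional[str]:
    """Return the most frequently observed motion label, or None if nothing was observed."""
    if not observed_motions:
        return None
    return Counter(observed_motions).most_common(1)[0][0]


car_motions = ["wheels rotating", "door opening", "wheels rotating", "wipers moving"]
print(basic_motion(car_motions))
```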

The 3D object providing unit 105 may provide the user terminal 200 with the 3D object to which the basic motion determined by the initial display characteristic determining unit 104 is applied.

FIG. 4 is a diagram illustrating an example in which a user interacts with a 3D object through the interaction control unit 106 of FIG. 2. The operations described with reference to FIG. 4 may be applied identically or similarly to the operation of controlling one of the plurality of 3D objects in FIG. 5.

In an embodiment, in response to detecting the trigger event, the 3D object providing unit 105 may display the 3D object, with the initial display characteristics applied, through the user terminal 200. In response to a user input, the interaction control unit 106 may control the motion of the 3D object, including enlarging, reducing, moving, and rotating the 3D object. The 3D object providing unit 105 may provide the user terminal 200 with the 3D object as enlarged, reduced, moved, and/or rotated by the interaction control unit 106, and may display the provided 3D object through the display of the user terminal 200.

While the interaction control unit 106 controls the movement or rotation of the 3D object, the video may be temporarily paused.

In response to a user input of dragging the 3D object in a first direction while touching or clicking it, the interaction control unit 106 may move the 3D object in the first direction.

The interaction control unit 106 may increase or decrease the size of the 3D object in response to the user's scroll-up or scroll-down input. It may also increase or decrease the size of the 3D object in response to an input in which two of the user's fingers, while touching the display, move apart from or toward each other.
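For the two-finger case, a common way to turn the gesture into a size change is to use the ratio of the finger spacing after and before the gesture as the scale factor. The text does not specify a formula, so the ratio rule and the function name below are assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def pinch_scale(start_a: Point, start_b: Point, end_a: Point, end_b: Point) -> float:
    """Scale factor implied by two touches moving apart (>1) or together (<1)."""
    start_gap = math.dist(start_a, start_b)
    end_gap = math.dist(end_a, end_b)
    if start_gap == 0:
        return 1.0  # degenerate gesture: leave the size unchanged
    return end_gap / start_gap


# Fingers start 5 units apart and end 10 units apart.
print(pinch_scale((0, 0), (3, 4), (0, 0), (6, 8)))
```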

The interaction control unit 106 may set the touch input of the first of the user's two fingers as a pivot point. When the touch input of the second finger is a drag input that rotates by a certain angle about the pivot point, the interaction control unit 106 may rotate the 3D object by that angle.
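The angle swept by the second finger about the pivot can be computed from the start and end positions of the drag. This is a sketch with a hypothetical function name; the sign convention follows the coordinate system passed in, so with screen coordinates whose y-axis points down a positive result corresponds to a clockwise drag on screen.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rotation_angle(pivot: Point, drag_start: Point, drag_end: Point) -> float:
    """Signed angle in degrees, in [-180, 180), swept around the pivot by the drag."""
    start = math.atan2(drag_start[1] - pivot[1], drag_start[0] - pivot[0])
    end = math.atan2(drag_end[1] - pivot[1], drag_end[0] - pivot[0])
    delta = math.degrees(end - start)
    return (delta + 180.0) % 360.0 - 180.0  # take the shorter way around


# Second finger moves a quarter turn around a pivot at the origin.
print(rotation_angle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))
```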

To facilitate interaction, the interaction control unit 106 may provide the user, through the user terminal 200, with an option menu for selecting the motion to be controlled. For example, the interaction control unit 106 may display, on one side of the user terminal 200, a move button for moving the 3D object, a rotate button for rotating the 3D object, and the like.

The interaction control unit 106 may set various effects according to the number of touches or clicks by the user. When the user touches the 3D object once, the interaction control unit 106 may switch to a rotation mode in which the 3D object can be rotated. When the user touches the 3D object twice, the interaction control unit 106 may switch to a movement mode in which the 3D object can be moved.

When rotating the 3D object, the interaction control unit 106 may rotate it based on the angle of the user terminal 200 rather than on a user input. In other words, the interaction control unit 106 may rotate the 3D object by applying the initial-direction determination described with reference to FIG. 3B directly to the rotation control of the 3D object.

When the user touches or clicks the 3D object three times, the interaction control unit 106 may save the 3D object. The interaction control unit 106 may load the saved 3D object in another section of the same video or in a different video.
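The touch-count behavior described above (one touch for rotation mode, two for movement mode, three to save) amounts to a small lookup table. The enum and function names below are hypothetical, and returning `None` for other counts is an assumption, since the text does not define them.

```python
from enum import Enum
from typing import Optional


class TapAction(Enum):
    ROTATE_MODE = "rotate"   # one touch: switch to rotation mode
    MOVE_MODE = "move"       # two touches: switch to movement mode
    SAVE_OBJECT = "save"     # three touches or clicks: save the 3D object


TAP_ACTIONS = {1: TapAction.ROTATE_MODE, 2: TapAction.MOVE_MODE, 3: TapAction.SAVE_OBJECT}


def action_for_taps(tap_count: int) -> Optional[TapAction]:
    """Look up the effect assigned to a number of touches on the 3D object."""
    return TAP_ACTIONS.get(tap_count)


for count in (1, 2, 3, 4):
    print(count, action_for_taps(count))
```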

FIG. 5 is a diagram illustrating an example in which a user interacts with a plurality of 3D objects through the interaction control unit 106 of FIG. 2.

The 3D object providing unit 105 may display a plurality of 3D objects in one area of the display of the user terminal 200. When a plurality of 3D objects are displayed, the interaction control unit 106 may determine one of the plurality of 3D objects as the 3D object whose motion is to be controlled, and may carry out the interaction between the user and the determined 3D object.

In response to a user input of touching a 3D object or to the user's voice command, the interaction control unit 106 may determine one of the plurality of 3D objects as the 3D object whose motion is to be controlled.

The interaction control unit 106 may determine one of the plurality of 3D objects as the 3D object whose motion is to be controlled, based on the distance between the user and the user terminal 200 and on a change in the user's hand motion. The interaction control unit 106 may determine the distance between the user and the user terminal 200 through a distance sensor of the user terminal 200, and may identify the user's hand motion based on image data obtained through a camera (e.g., an image sensor) of the user terminal 200. The distance between the user and the user terminal 200 may be the distance from a specific part of the user (e.g., a hand) to the distance sensor of the user terminal 200.

The interaction control unit 106 may set a reference distance between the user and the user terminal 200. Because the appropriate viewing distance at which the user views the display of the user terminal 200 varies with the display size, the interaction control unit 106 may determine the reference distance based on the display size of the user terminal 200. For example, when the display of the user terminal 200 has a first size, the interaction control unit 106 may set the reference distance to a first reference distance, and when the display of the user terminal 200 has a second size larger than the first size, it may set the reference distance to a second reference distance longer than the first reference distance.

The interaction control unit 106 may determine, among 3D objects having different depth values, a reference 3D object corresponding to the set reference distance. The interaction control unit 106 may determine the depth value of each 3D object based on the depth value of the pixel corresponding to the center of gravity of that 3D object.

Depending on the distance between the user and the user terminal 200, the interaction control unit 106 may place an identification mark (e.g., a dotted-line box) on the 3D object corresponding to that distance. For example, the interaction control unit 106 may place an identification mark 21 on a first 3D object 01 when the distance between the user and the user terminal is the reference distance 11, place an identification mark on a second 3D object 03 when the distance between the user and the user terminal 200 is a second distance 13, and place an identification mark on a third 3D object 05 when the distance between the user and the user terminal is a third distance 15.
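Choosing which 3D object to mark can be sketched as a nearest-distance lookup. The concrete distances (30, 45, and 60 cm), the nearest-match rule, and the names are assumptions for illustration; the text only states that each distance corresponds to one 3D object.

```python
from typing import Dict


def object_to_mark(user_distance_cm: float, distance_to_object: Dict[float, str]) -> str:
    """Return the object whose assigned distance is closest to the measured distance."""
    nearest = min(distance_to_object, key=lambda d: abs(d - user_distance_cm))
    return distance_to_object[nearest]


# Hypothetical assignment: reference distance -> first object, and so on.
assigned = {30.0: "first 3D object", 45.0: "second 3D object", 60.0: "third 3D object"}

for measured in (31.0, 44.0, 58.0):
    print(measured, object_to_mark(measured, assigned))
```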

When an identification mark 21 is placed on one of the 3D objects (e.g., the first 3D object 01) and a change in the user's hand motion is detected within a specified time after the mark is placed, the interaction control unit 106 may determine the marked 3D object as the object whose motion is to be controlled. For example, when the interaction control unit 106 detects, through the user terminal 200, a change from a hand motion in which the user's hand is fully open to a hand motion in which at least one finger is folded, it may determine the selected 3D object as the 3D object to be controlled. As another example, when the interaction control unit 106 detects, through the user terminal 200, a change from a hand motion in which the user's palm faces the user terminal 200 to a hand motion in which the back of the user's hand faces the user terminal 200, it may determine the selected 3D object as the 3D object to be controlled.

The interaction control unit 106 may identify a hand motion based on the number of fingerprints detected through a camera (e.g., an image sensor) of the user terminal 200. For example, the interaction control unit 106 may recognize the hand as open when fingerprints of five fingers are detected, and may recognize one finger as folded when fingerprints of four fingers are detected. When the number of detected fingerprints changes within the specified time from a first number to a second number smaller than the first number, the interaction control unit 106 may determine that a change in the user's hand motion has been detected.
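As a non-limiting illustration, the fingerprint-count test for a hand-motion change may be sketched as follows. The function name `hand_motion_changed`, the sample format of (timestamp in seconds, fingerprint count), and the default window of 2.0 seconds are assumptions of this example; the fingerprint detection itself is assumed to be performed elsewhere and is not shown.

```python
from typing import Optional, Sequence, Tuple


def hand_motion_changed(
    samples: Sequence[Tuple[float, int]],
    marked_at_s: float,
    window_s: float = 2.0,
) -> bool:
    """Report whether the fingerprint count dropped within the specified time.

    `samples` must be ordered by timestamp. The first count observed inside
    the window is taken as the first number; any later, smaller count inside
    the same window is treated as the second number.
    """
    first_count: Optional[int] = None
    for timestamp_s, count in samples:
        if timestamp_s < marked_at_s or timestamp_s > marked_at_s + window_s:
            continue  # outside the specified time after the mark was placed
        if first_count is None:
            first_count = count
        elif count < first_count:
            return True
    return False


# Open hand (five fingerprints), then one finger folded (four fingerprints).
selected = hand_motion_changed([(0.0, 5), (0.5, 5), (1.0, 4)], marked_at_s=0.0)
```

A drop that occurs only after the window has elapsed, or an increase in the count, is not treated as a selection in this sketch.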

FIG. 6 is a diagram illustrating a hardware configuration of the 3D modeling providing server 100 according to FIG. 1.

Referring to FIG. 6, the 3D modeling providing server 100 may include at least one processor 110 and a memory storing instructions that direct the at least one processor 110 to perform at least one operation.

The at least one operation may include at least some of the operations or functions of the 3D modeling providing server 100 described above, and may be implemented in the form of instructions and performed by the processor 110.

Here, the at least one processor 110 may be a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which the methods according to embodiments of the present invention are performed. Each of the memory 120 and the storage device 160 may be configured as at least one of a volatile storage medium and a non-volatile storage medium. For example, the memory 120 may be one of a read-only memory (ROM) and a random access memory (RAM), and the storage device 160 may be a flash memory, a hard disk drive (HDD), a solid state drive (SSD), or any of various memory cards (e.g., a micro SD card).

In addition, the 3D modeling providing server 100 may include a transceiver 130 that performs communication through a wireless network. The 3D modeling providing server 100 may further include an input interface device 140, an output interface device 150, a storage device 160, and the like. The components included in the 3D modeling providing server 100 may be connected by a bus 170 to communicate with one another.

The methods according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software.

Examples of the computer-readable medium may include hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions may include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as at least one software module to perform the operations of the present invention, and vice versa.

In addition, the method or apparatus described above may be implemented with all or part of its configuration or functions combined, or may be implemented separately.

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that the present invention may be variously modified and changed without departing from the spirit and scope of the present invention as set forth in the claims below.

100: 3D modeling providing server 200: user terminal
300: video providing server

Claims

A 3D modeling providing server that controls interaction between a user using a user terminal and a 3D object, the server comprising:
a trigger event generating unit that generates a trigger event, which is a kind of command for displaying the 3D object;
a target object determining unit that determines, in response to the trigger event, a target object to be expressed as the 3D object;
a 3D object acquisition unit that generates the 3D object corresponding to the target object;
a 3D object providing unit that provides the generated 3D object to the user through the user terminal;
an interaction control unit that controls interaction between the user and the 3D object based on the user's input; and
an initial display characteristic determining unit that determines initial display characteristics of the 3D object,
wherein the trigger event generating unit generates the trigger event based on at least one of audio and an image frame output while a video acquired through a video providing server is being played,
generates a difference image frame by taking the difference between pixel values at corresponding positions in two temporally adjacent image frames among the image frames constituting the video,
determines that a scene change exists between the two image frames when the sum of the pixel values constituting the generated difference image frame is equal to or less than a preset threshold, and
determines the time of the scene change as the trigger event,
wherein the target object determining unit detects the trigger event generated while the video is played through the display of the user terminal,
analyzes, in response to the detection, the voice output from the video from the time the trigger event occurs, and determines the target object to be expressed as the 3D object based on the analysis,
wherein the 3D object acquisition unit acquires the 3D object corresponding to the determined target object through at least one 3D engine,
wherein the 3D object providing unit overlays the 3D object on the video and provides the result to the user terminal,
wherein the interaction control unit obtains the user's input through the user terminal and enlarges, reduces, moves, and rotates the 3D object according to the user's input, and
wherein the initial display characteristic determining unit
determines an initial size, an initial depth, an initial position, an initial direction, and a basic motion of the 3D object, which are the initial characteristics of the 3D object when the 3D object is first displayed through the user terminal,
divides the display area of the video into a plurality of regions and extracts a linear component and a curved component from each of the divided regions,
obtains color values of the pixels included in each of the plurality of divided regions and calculates a deviation value of the obtained color values, and
determines, as the initial position at which the 3D object is to be displayed, a region in which the proportion of pixels constituting the linear component and the curved component is equal to or less than a first proportion and the calculated deviation value is equal to or less than a first threshold deviation value.

The 3D modeling providing server of claim 1,
wherein the target object determining unit
detects at least one of the voice output from the video acquired from the video providing server, the user's voice, and the user's gesture.

The 3D modeling providing server of claim 2,
wherein the 3D object providing unit
renders the video being played in black and white and renders the 3D object in color.

The 3D modeling providing server of claim 3,
wherein the 3D object providing unit provides the 3D object to which the initial characteristics are applied.

The 3D modeling providing server of claim 4,
wherein the initial display characteristic determining unit
sets the surface of the display of the user terminal on which the video is displayed as a first surface,
sets a virtual point extending perpendicularly from the center of the first surface in a second direction opposite to a first direction that the first surface faces, and
determines the initial depth of the 3D object between the first surface and the virtual point,
wherein the initial depth of the 3D object is set deeper as the distance between the user and the user terminal decreases.