KR20230073869A

KR20230073869A - Method of providing user interface supporting interaction and electronic apparatus performing the same

Info

Publication number: KR20230073869A
Application number: KR1020210160709A
Authority: KR
Inventors: 볼로디미어 사빈; 올렉산드르 사포즈니크; 이고르 브도비첸코; 블라디슬라프 다이키; 김덕호; 정지원
Original assignee: 삼성전자주식회사
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication date: 2023-05-26
Also published as: WO2023090893A1

Abstract

According to an embodiment disclosed in this document, an electronic device that provides a user interface supporting interaction may acquire a plurality of images captured by a plurality of cameras, identify an image including a hand among the acquired images, obtain 3D pose information of the hand based on the identified image and a preset hand characteristic parameter (hand feature parameter), and provide interaction information corresponding to the obtained 3D pose information of the hand based on an application running on the electronic device.

Description

A method for providing a user interface supporting interaction and an electronic device performing the same

Embodiments disclosed in this document relate to a method for providing a user interface supporting interaction and an electronic device performing the same.

Various technologies are being developed that allow users to experience virtual reality (VR) and augmented reality (AR). Virtual reality is a system that builds a specific real-world environment or situation, or a virtual scenario, through computer modeling and lets the user interact within that virtual environment. Augmented reality is a system that provides virtual information about a space and a situation by superimposing virtual objects on the real world. In augmented reality, digitally reproduced images, or portions thereof, may be presented to the user in a manner in which they seem to be, or may be perceived as, real.

For a user to experience VR/AR more realistically, it is important to provide appropriate interaction in response to the user's input, as in the real world. Because accurately recognizing the user's input in VR/AR is a key part of this, various technologies for doing so are being developed.

The present disclosure relates to a method for providing a user interface that can accurately recognize a user input and appropriately perform the corresponding interaction, and to an electronic device performing the same.

A method by which an electronic device provides a user interface supporting interaction, according to an embodiment of the present disclosure, may include: acquiring a plurality of images captured by a plurality of cameras; identifying an image including a hand among the acquired images; obtaining 3D pose information of the hand based on the identified image and a preset hand characteristic parameter; and providing interaction information corresponding to the obtained 3D pose information of the hand, based on an application executed on the electronic device.

An electronic device providing a user interface supporting interaction, according to an embodiment of the present disclosure, includes: a sensing unit including a plurality of cameras; an output unit; and at least one processor. The at least one processor may acquire a plurality of images captured by the plurality of cameras, identify an image including a hand among the acquired images, obtain 3D pose information of the hand based on the identified image and a preset hand characteristic parameter, and provide interaction information corresponding to the obtained 3D pose information of the hand, based on an application executed on the electronic device.

A computer program product according to an embodiment of the present disclosure includes a recording medium storing a program that causes an electronic device to perform a method of providing a user interface supporting interaction. The stored program may cause the electronic device to perform: an operation of acquiring a plurality of images captured by a plurality of cameras; an operation of identifying an image including a hand among the acquired images; an operation of obtaining 3D pose information of the hand based on the identified image and a preset hand characteristic parameter; and an operation of providing interaction information corresponding to the obtained 3D pose information of the hand, based on an application executed on the electronic device.

FIG. 1 is a conceptual diagram illustrating a method by which an electronic device provides a user interface supporting interaction, according to an embodiment.
FIG. 2 is a diagram for explaining the fields of view (FoV) of a plurality of cameras.
FIG. 3 is a flowchart illustrating a method by which an electronic device provides a user interface supporting interaction, according to an embodiment.
FIG. 4 is a diagram for describing hand characteristic parameters according to an embodiment.
FIG. 5 is a flowchart illustrating a method by which an electronic device obtains 3D pose information of a hand based on whether a plurality of images including the hand have been identified, according to an embodiment.
FIG. 6 is a diagram for explaining a method by which an electronic device obtains 3D information of a hand as the hand moves, according to an embodiment.
FIG. 7 is a flowchart illustrating a method by which an electronic device obtains a hand characteristic parameter, according to an embodiment.
FIG. 8 is a diagram for explaining a method by which an electronic device calibrates 3D pose information of a hand, according to an embodiment.
FIGS. 9 and 10 are block diagrams of an electronic device according to an embodiment.
FIG. 11 is a block diagram illustrating a camera module according to various embodiments.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice them. The disclosure may, however, be embodied in many different forms and is not limited to the embodiments described herein. To describe the invention clearly, parts of the drawings that are irrelevant to the description are omitted, and similar parts are given similar reference numerals throughout the specification.

In the present disclosure, a "gesture" is a physical or vocal action or expression representing a user input, and may include, for example, a tap (pressing), a drag (moving while pressing), a pinch (spreading or narrowing two fingers), a press (long press), and a flick (fast scroll).

In the present disclosure, a "pose" is the form of a gesture at a fixed point in time, and a sequence of consecutive poses may constitute one gesture.

In the present disclosure, a "user profile" is information representing a user's identity and settings related to the user, and may include, for example, information about the user's age, gender, height, weight, or region.

Terms used in the specification may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

Hereinafter, the disclosure is described in detail with reference to the accompanying drawings.

FIG. 1 is a conceptual diagram illustrating a method by which an electronic device provides a user interface supporting interaction, according to an embodiment.

Referring to FIG. 1, the electronic device 100 may output at least one image superimposed on the real world so that the user 5 can experience an augmented reality (AR)/virtual reality (VR) environment. The electronic device 100 may also sense a user input on the output image. For example, when an image 10 including icons 12, 14, and 16 of a plurality of applications is output superimposed on the real world, the electronic device 100 may sense a user input selecting one of the icons 12, 14, and 16. The user input may include a hand gesture, but this is only an example, and the user input is not limited to hand gestures.

When a user input is sensed, the electronic device 100 may provide interaction information corresponding to the sensed user input. For example, when at least one image is acquired through a camera including an image sensor, the electronic device 100 may detect a hand in the acquired image. The electronic device 100 may identify a hand gesture based on 3D pose information of the detected hand. When the identified hand gesture is a "tap gesture", the electronic device 100 may provide, as interaction information, the execution screen of the application whose icon (for example, 12) corresponds to the position of the finger.

For interaction information corresponding to a user input to be provided appropriately and in line with the user's intention, the electronic device 100 needs to identify the sensed user input accurately. For example, when the user input is a tap gesture as in the embodiment above, the electronic device 100 needs to accurately identify the position of the user's finger based on the 3D pose information of the hand.

3D pose information of a hand can generally be obtained by a stereoscopic method. The stereoscopic method mimics the way humans perceive objects with two eyes: the electronic device 100 may obtain 3D pose information of the hand based on the shape, size, and depth information of the hand detected in each of the images acquired through two cameras having different FoVs.

When 3D pose information of the hand is obtained using the stereoscopic method, the absolute size and shape of the hand can be identified accurately, but the method applies only when the hand is detected in the region where the FoVs of the two cameras overlap. For example, when the hand of the user 5 is detected in a region where the FoVs of the two cameras do not overlap, that is, when the hand is detected only in the image captured by one camera, it is difficult to obtain 3D pose information of the hand by the stereoscopic method. Embodiments of the present disclosure are intended to provide a method of obtaining 3D pose information of the hand using a hand characteristic parameter even in a region where the FoVs of the two cameras do not overlap.

Meanwhile, the number of cameras included in electronic devices is increasing in order to provide more realistic AR/VR environments. In this case, because the electronic device must process images of the overlapping region for each camera, the amount of computation required to obtain 3D pose information of the hand may increase. Embodiments of the present disclosure can provide 3D pose information of the hand from a single image by using a hand characteristic parameter, and can thereby prevent this increase in computation.

In addition, an electronic device according to embodiments of the present disclosure can keep the accuracy of the 3D pose information of the hand at the level achieved when the 3D pose information is obtained from a plurality of images. To this end, when a hand is detected where the FoVs of a plurality of cameras overlap, the electronic device may update the hand characteristic parameter in consideration of criteria such as the position of the hand or the sharpness of the fingers of the hand detected in the image. The criteria by which the electronic device updates the hand characteristic parameter are described in more detail below with reference to FIG. 8.

FIG. 2 is a diagram for explaining the fields of view (FoV) of a plurality of cameras.

The FoV of a camera is the region corresponding to the image captured through the camera's lens, and its value may be determined by the sensor size and the lens magnification. The region covered by the FoV may also differ depending on where the camera is mounted on the electronic device 100, and the FoVs of the plurality of cameras provided in the electronic device 100 may differ from one another.

Referring to FIG. 2, the electronic device 100 may include two cameras, and the FoV 22 of the first camera and the FoV 24 of the second camera may differ from each other. An overlapping region may exist between the FoV 22 of the first camera and the FoV 24 of the second camera. In the present disclosure, the region in which the FoVs of different cameras overlap is referred to as the stereo region 26.
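As a rough illustration of the stereo region, the sketch below models each camera's horizontal FoV as an angular interval and computes the overlap. The `FoV` class, the function name `stereo_region`, and the angle values are hypothetical and do not come from the disclosure; a real FoV is a 3D frustum rather than a 1D interval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FoV:
    """Horizontal field of view as an angular interval in degrees (simplified)."""
    min_deg: float
    max_deg: float

    def contains(self, angle_deg: float) -> bool:
        return self.min_deg <= angle_deg <= self.max_deg


def stereo_region(first: FoV, second: FoV) -> Optional[FoV]:
    """Return the overlap of two FoVs, or None if they do not overlap."""
    lo = max(first.min_deg, second.min_deg)
    hi = min(first.max_deg, second.max_deg)
    return FoV(lo, hi) if lo < hi else None


# Hypothetical values: two cameras angled slightly apart.
fov_22 = FoV(-60.0, 20.0)
fov_24 = FoV(-20.0, 60.0)
region_26 = stereo_region(fov_22, fov_24)
```

A hand direction inside `region_26` is visible to both cameras, while a direction inside only one FoV is visible to a single camera.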

When the hand is located in the stereo region 26, the electronic device 100 may obtain accurate 3D pose information of the hand by applying the stereoscopic method to the images acquired through each camera. However, this method applies only within the stereo region 26. Moreover, even when the hand is located in the stereo region 26, continuously applying the stereoscopic method to obtain the 3D pose information of the hand may increase the amount of computation of the electronic device 100. In particular, when 3D pose information of a hand moving in real time must be obtained, the amount of computation of the electronic device 100 may increase sharply.

The electronic device 100 according to an embodiment of the present disclosure may set a hand characteristic parameter and, by using the set parameter, obtain 3D pose information of the hand from a single image including the hand. The electronic device 100 according to an embodiment may apply the stereoscopic method to determine the hand characteristic parameter only when the hand is located in the stereo region and satisfies a preset criterion. In this way, the electronic device can prevent the accuracy of the obtained 3D pose information of the hand from degrading.

FIG. 3 is a flowchart illustrating a method by which an electronic device provides a user interface supporting interaction, according to an embodiment.

In step S310, the electronic device may acquire a plurality of images captured by a plurality of cameras. The electronic device may include a plurality of cameras, and the FoVs of the plurality of cameras may differ from one another.

When an application that requires information about objects around the electronic device is executed, the electronic device may capture its surroundings using the plurality of cameras. Such applications may include a photography application, an application providing an AR/VR environment, and the like. However, these are only examples, and applications that require information about objects around the electronic device are not limited to them. As a result of capturing the surroundings with the plurality of cameras, the electronic device may obtain, from each camera, an image of the region corresponding to that camera's FoV.

In step S320, the electronic device may identify an image including a hand among the acquired images.

When the running application supports hand gestures, the electronic device may identify an image including a hand among the acquired images. However, this is only an example; the electronic device may also identify an image including a hand among the acquired images when it needs to set the user's hand characteristic parameter.

The electronic device according to an embodiment may determine the order in which to perform a hand detection operation, which identifies whether a hand is included, on each of the plurality of images. The electronic device may perform the hand detection operation on the images in the determined order and may stop the operation once a preset number of images including the hand have been obtained. For example, the electronic device may determine the order based on the probability that a hand will be detected in each image. This probability may be determined by predicting the movement of the hand from hand tracking information obtained by the electronic device.
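The ordering and early stop described above could be sketched as follows. This is only an illustration under stated assumptions: `find_hand_images`, the `detect_hand` callback, and the `needed` parameter are hypothetical names, and the actual detector and probability model are not specified in the disclosure.

```python
from typing import Callable, List, Sequence, TypeVar

Image = TypeVar("Image")


def find_hand_images(
    images: Sequence[Image],
    probabilities: Sequence[float],
    detect_hand: Callable[[Image], bool],
    needed: int = 1,
) -> List[int]:
    """Run hand detection in descending order of detection probability.

    Stops as soon as `needed` images containing a hand have been found and
    returns the indices of those images.
    """
    order = sorted(range(len(images)), key=lambda i: probabilities[i], reverse=True)
    found: List[int] = []
    for index in order:
        if len(found) >= needed:
            break  # enough hand images; skip the remaining detections
        if detect_hand(images[index]):
            found.append(index)
    return found
```

With `needed=1` the detector runs on the most likely image first, so the less likely images are often never processed.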

In step S330, the electronic device may obtain 3D pose information of the hand based on the identified image and a preset hand characteristic parameter.

When a single image including a hand is identified, the electronic device may obtain information about the shape, size, and the like of the hand detected in that image. The electronic device may then obtain 3D pose information of the hand from this information by using the preset hand characteristic parameter. Because it is difficult to obtain accurate 3D pose information of the hand from a single image alone, the electronic device may calibrate the shape, size, and the like of the hand detected in the image according to the preset hand characteristic parameter.
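One way a known hand length could resolve depth from a single image is the pinhole camera relation Z = f * L / l, where L is the real length, l its length in pixels, and f the focal length in pixels. The sketch below is an assumption for illustration, not the method of the disclosure; it also assumes the measured segment lies roughly parallel to the image plane, and the function name and values are hypothetical.

```python
import math
from typing import Tuple


def depth_from_known_length(
    focal_px: float,
    real_length_m: float,
    p1_px: Tuple[float, float],
    p2_px: Tuple[float, float],
) -> float:
    """Estimate depth (metres) of a segment of known real length.

    Assumes a pinhole camera and a segment roughly parallel to the image plane.
    """
    pixel_length = math.dist(p1_px, p2_px)
    if pixel_length == 0:
        raise ValueError("the two image points coincide")
    return focal_px * real_length_m / pixel_length


# Hypothetical values: 0.18 m wrist-to-fingertip length spanning 180 pixels.
depth_m = depth_from_known_length(500.0, 0.18, (100.0, 100.0), (100.0, 280.0))
```

For these hypothetical values the estimate is about 0.5 m; a hand that appears half as large in the image would be estimated at twice the distance.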

The hand characteristic parameter according to an embodiment may be set based on the user's profile. For example, when the user attempts to input a hand gesture on the electronic device for the first time, the electronic device may use the user's profile to set the hand characteristic parameter to an initial value. The user's profile may include information about the user's age, gender, height, weight, or region, but this is only an example, and the user's profile is not limited to these items. Because the size and shape of a user's hand may differ by gender, age, height, weight, or region, the electronic device may set the hand characteristic parameter to an initial value based on the user's profile. In an embodiment, hand characteristic parameter values for each gender, age, height, weight, or region may be stored in the electronic device in advance. According to another embodiment, the electronic device may obtain the initial value of the hand characteristic parameter corresponding to the user profile from an external server.
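A pre-stored table of initial values could look like the sketch below. The grouping keys, the age threshold, and every numeric value are placeholders invented for illustration; they are not measured data and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Placeholder lengths in metres, for illustration only (not measured data).
DEFAULT_PARAMS: Dict[str, float] = {"wrist_to_middle_tip": 0.18}
PRESETS: Dict[Tuple[str, str], Dict[str, float]] = {
    ("adult", "female"): {"wrist_to_middle_tip": 0.17},
    ("adult", "male"): {"wrist_to_middle_tip": 0.19},
    ("child", "any"): {"wrist_to_middle_tip": 0.13},
}


@dataclass(frozen=True)
class UserProfile:
    age: int
    gender: str


def initial_hand_params(profile: UserProfile) -> Dict[str, float]:
    """Look up initial hand characteristic parameters for a user profile."""
    if profile.age < 13:  # hypothetical age threshold
        key = ("child", "any")
    else:
        key = ("adult", profile.gender)
    # Fall back to a generic default when the profile matches no preset.
    return dict(PRESETS.get(key, DEFAULT_PARAMS))
```

The same lookup could equally be served by an external server, as the other embodiment describes.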

Because a hand characteristic parameter set from the user's profile is determined from average hand characteristics for each gender, age, height, weight, or region, it may not fully match the characteristics of the actual user's hand. Accordingly, when the hand is located in the stereo region and satisfies a preset criterion, the electronic device may update the hand characteristic parameter by applying the stereoscopic method to the plurality of images obtained by capturing the hand in the stereo region with different cameras.
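The gated update could be sketched as below. The quality score, its threshold, and the blending weight are assumptions introduced for illustration; the disclosure states only that the parameter is updated when the hand is in the stereo region and a preset criterion is met.

```python
from typing import Dict


def maybe_update_params(
    current: Dict[str, float],
    stereo_measured: Dict[str, float],
    in_stereo_region: bool,
    quality: float,
    min_quality: float = 0.8,  # hypothetical criterion threshold
    blend: float = 0.5,        # hypothetical weight given to the new measurement
) -> Dict[str, float]:
    """Return updated parameters, or an unchanged copy if the gate fails."""
    if not in_stereo_region or quality < min_quality:
        return dict(current)
    return {
        name: (1.0 - blend) * value + blend * stereo_measured.get(name, value)
        for name, value in current.items()
    }
```

Blending rather than overwriting is a design choice made here to damp measurement noise; replacing the value outright (`blend=1.0`) is equally consistent with the text.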

In step S340, the electronic device may provide interaction information corresponding to the obtained 3D pose information of the hand, based on the application being executed.

The electronic device may identify the user's hand gesture based on the 3D pose information of the hand. In each application on the electronic device that supports hand gestures, the information to be provided or the command to be executed by the application is preset for each hand gesture.

For example, when a map application is running on the electronic device and the hand gesture identified from the 3D pose information of the hand is a drag gesture, the electronic device may provide, as interaction information, a route from the place at the start point of the drag gesture to the place at its end point. As another example, when a menu application is running on the electronic device and the identified hand gesture is a tap gesture, the electronic device may provide, as interaction information, the execution screen of the application whose icon corresponds to the position at which the tap gesture was sensed in the menu application.
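The per-application gesture presets in the two examples above could be represented as a lookup table keyed by application and gesture. The handler names, the string results, and the example arguments below are hypothetical stand-ins for whatever the applications actually provide.

```python
from typing import Callable, Dict, Optional, Tuple


def route_between(start: str, end: str) -> str:
    """Stand-in for a map application's route lookup."""
    return f"route from {start} to {end}"


def launch_icon(icon: str) -> str:
    """Stand-in for showing an application's execution screen."""
    return f"execution screen of {icon}"


# (application, gesture) -> preset action
HANDLERS: Dict[Tuple[str, str], Callable[..., str]] = {
    ("map", "drag"): route_between,
    ("menu", "tap"): launch_icon,
}


def interaction_info(app: str, gesture: str, *args: str) -> Optional[str]:
    """Return the interaction information preset for this app and gesture."""
    handler = HANDLERS.get((app, gesture))
    return handler(*args) if handler else None
```

A gesture with no preset for the running application simply yields `None`, so the same hand pose can mean different things in different applications.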

FIG. 4 is a diagram for describing hand characteristic parameters according to an embodiment.

The hand characteristic parameter according to the present disclosure may include size information for each part of the hand, determined based on at least some of the points constituting the hand.

Referring to FIG. 4, the electronic device may specify some of the plurality of points constituting the hand in order to define the characteristics of the hand. For example, the electronic device may set, as hand characteristic parameters, information about the lengths from the wrist 405 to the thumb tip 418, the index finger tip 428, the middle finger tip 438, the ring finger tip 448, and the little finger tip 458.

According to another example, the electronic device may specify detailed points for each finger. For the thumb, for example, the thumb start point 412, thumb joint a 414, and thumb joint b 416 may be specified. For the little finger, the little finger start point 452, little finger joint a 454, and little finger joint b 456 may be specified.

The electronic device may set length information between the specified points as hand characteristic parameters. However, this is only an example, and the points specified for setting the hand characteristic parameters are not limited to those described above.
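Lengths between specified points can be computed directly from 3D keypoint coordinates. In the sketch below the keypoint names reuse the reference numerals of FIG. 4 for readability, but the dictionary layout, the function names, and the coordinates are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]

# Thumb keypoints in order from the wrist to the fingertip.
THUMB_CHAIN: List[str] = [
    "wrist_405",
    "thumb_start_412",
    "thumb_joint_a_414",
    "thumb_joint_b_416",
    "thumb_tip_418",
]


def segment_lengths(points: Dict[str, Point3D], chain: List[str]) -> Dict[str, float]:
    """Length of each segment between consecutive keypoints in the chain."""
    return {
        f"{a}->{b}": math.dist(points[a], points[b])
        for a, b in zip(chain, chain[1:])
    }


def wrist_to_tip(points: Dict[str, Point3D], chain: List[str]) -> float:
    """Straight-line length from the first to the last keypoint of the chain."""
    return math.dist(points[chain[0]], points[chain[-1]])
```

Either the per-segment lengths or the wrist-to-tip length could serve as the stored parameter, matching the two examples in the text.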

FIG. 5 is a flowchart illustrating a method by which an electronic device obtains 3D pose information of a hand based on whether a plurality of images including the hand have been identified, according to an embodiment.

In step S510, the electronic device may perform a hand detection operation on at least one of the plurality of images.

The electronic device may perform a hand detection operation, which identifies whether a hand is included, on at least one of the plurality of images. The electronic device according to an embodiment may determine the order of the images on which the hand detection operation is performed, based on the probability that a hand will be detected. The electronic device may perform the hand detection operation on the images in the determined order and may stop the operation once a preset number of images including the hand have been obtained.

As described above, the probability that a hand will be detected in each of the plurality of images may be determined based on hand tracking information. For example, the electronic device may sense the movement direction, speed, and the like of the hand, and then obtain tracking information representing the position of the hand over time based on the sensed values. When the electronic device determines that a hand that started moving at point a has stopped (for example, its speed is less than y) at point b, located between point a and point c, it may set the hand detection probability for the image capturing point c to 0. Also, based on the obtained hand tracking information, the electronic device may stop capturing with a camera, among a plurality of cameras having different FoVs, whose FoV covers an area in which the probability of detecting a hand is less than a preset value. For example, to reduce power consumption, the electronic device may stop capturing with the camera whose FoV includes point c.
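The probability-ordered detection with early stopping described above can be sketched as follows. This is only an illustration: the function name, the `count_hands` callback, and the rule of skipping images whose probability is 0 are assumptions, not details given in the description.

```python
def detect_hands_in_order(images, probabilities, count_hands, required_hands=1):
    """Check images in descending order of hand detection probability.

    Returns (index of the first image with enough hands or None,
             list of indices that were actually checked).
    """
    order = sorted(
        range(len(images)), key=lambda i: probabilities[i], reverse=True
    )
    checked = []
    for i in order:
        if probabilities[i] <= 0.0:
            break  # remaining images are assumed not to contain a hand
        checked.append(i)
        if count_hands(images[i]) >= required_hands:
            return i, checked  # stop early: detection goal reached
    return None, checked


# Hypothetical usage: string labels stand in for camera frames, and a
# lookup table stands in for a real hand detector.
images = ["left", "center", "right"]
probabilities = [0.2, 0.7, 0.0]  # e.g. derived from tracking information
hands_per_image = {"left": 1, "center": 0, "right": 0}

hit, checked = detect_hands_in_order(
    images, probabilities, lambda img: hands_per_image[img]
)
```

In this example the "right" image is never processed because its probability is 0, which mirrors the case of the image capturing point c.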

In step S520, the electronic device may determine whether a plurality of images including the hand have been identified.

In step S530, the electronic device may obtain depth information and size information of the hand based on the identified plurality of images.

For example, when the hand is located in a stereo area where the FoVs of different cameras overlap and is captured by those cameras, the electronic device may obtain a plurality of images including the hand. The electronic device may obtain accurate depth information and size information of the hand by applying a stereoscopic method to the plurality of images.
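The description does not spell out the stereoscopic computation. One common formulation, assuming two rectified pinhole cameras with the same focal length, recovers depth from disparity as Z = f·B/d and then converts a pixel extent into a metric size. The sketch below uses that textbook model; the function names and the numeric values are hypothetical.

```python
def stereo_depth(focal_px, baseline_m, x_left_px, x_right_px):
    """Depth of a point from its horizontal disparity (rectified pinhole pair)."""
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def metric_size(pixel_extent, depth_m, focal_px):
    """Convert an extent measured in pixels to meters at the given depth."""
    return pixel_extent * depth_m / focal_px


# Hypothetical values: 500 px focal length, 10 cm baseline, 100 px disparity.
depth = stereo_depth(focal_px=500.0, baseline_m=0.1, x_left_px=340.0, x_right_px=240.0)
hand_width = metric_size(pixel_extent=90.0, depth_m=depth, focal_px=500.0)
```

Because both depth and size come out in metric units, values obtained this way are suitable inputs for setting or updating the hand characteristic parameters.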

In step S540, the electronic device may update the hand characteristic parameters based on the obtained depth information and size information of the hand.

The electronic device according to an embodiment may update the hand characteristic parameters based on the obtained depth information and size information of the hand when the sharpness of the fingers of the hand detected in the identified plurality of images is equal to or greater than a preset criterion, or when the position of the detected hand is within a preset distance range from the center of the identified image. In either case, the electronic device may determine that accurate hand characteristic parameters are relatively likely to be obtained. However, this is only an example, and the conditions for updating the hand characteristic parameters are not limited to the above example.
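The update condition above can be expressed as a small predicate. In this sketch the sharpness is taken as an already computed score, since the description does not say how it is measured; the function name, argument names, and thresholds are hypothetical.

```python
import math


def should_update_parameters(sharpness, hand_center, image_size,
                             sharpness_threshold, max_center_distance):
    """Return True if either update condition from the description holds."""
    cx, cy = image_size[0] / 2, image_size[1] / 2
    distance = math.hypot(hand_center[0] - cx, hand_center[1] - cy)
    sharp_enough = sharpness >= sharpness_threshold
    near_center = distance <= max_center_distance
    return sharp_enough or near_center


# Hypothetical usage with a 640x480 image.
blurry_but_centered = should_update_parameters(
    sharpness=0.2, hand_center=(330, 250), image_size=(640, 480),
    sharpness_threshold=0.6, max_center_distance=50,
)
blurry_and_off_center = should_update_parameters(
    sharpness=0.2, hand_center=(600, 40), image_size=(640, 480),
    sharpness_threshold=0.6, max_center_distance=50,
)
```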

In step S550, when one image including the hand is identified after the update, the electronic device may obtain 3D pose information of the hand based on the identified image and the updated hand characteristic parameters. For example, the electronic device may obtain the 3D pose information of the hand by calibrating the shape or size of the hand identified from the one image according to the updated hand characteristic parameters.

In step S560, the electronic device may obtain 3D pose information of the hand based on the identified image and the preset hand characteristic parameters. When one image including the hand is identified, the electronic device may obtain information about the shape, size, and the like of the hand detected in that image. The electronic device may then obtain the 3D pose information of the hand from the obtained information using the preset hand characteristic parameters. Because it is difficult to obtain accurate 3D pose information of a hand from a single image, the electronic device may correct the shape, size, and the like of the hand detected in the single image according to the preset hand characteristic parameters.
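One way to see why a known hand dimension helps with a single image: under a pinhole camera model, a segment of known real length L that appears l pixels long lies at a depth of roughly Z = f·L/l, provided the segment is approximately parallel to the image plane. The sketch below illustrates only this idea. The formula, the function name, and the values are assumptions for illustration and are not the correction procedure of the description.

```python
def depth_from_known_length(focal_px, real_length_m, pixel_length):
    """Estimate depth from a segment of known metric length (pinhole model).

    Assumes the segment is roughly parallel to the image plane; a tilted
    segment appears shorter and would be placed too far away.
    """
    if pixel_length <= 0:
        raise ValueError("pixel_length must be positive")
    return focal_px * real_length_m / pixel_length


# Hypothetical values: a stored finger segment length of 8 cm that spans
# 80 px in an image from a camera with a 500 px focal length.
estimated_depth = depth_from_known_length(
    focal_px=500.0, real_length_m=0.08, pixel_length=80.0
)
```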

FIG. 6 is a diagram illustrating a method by which an electronic device according to an embodiment obtains 3D information of a hand as the hand moves.

Referring to FIG. 6, at one time point the user's hand 605 may be located in a first mono area 612, which is the part of the FoV of the first camera that does not overlap the FoV of any other camera. The electronic device may identify the shape, size, and the like of the user's hand 605 from one image captured by the first camera. In addition, the electronic device may calibrate the identified shape, size, and the like of the user's hand 605 using preset hand characteristic parameters. In this case, when the electronic device is recognizing the user's hand gesture for the first time, the hand characteristic parameters may be determined based on the user's profile.

In addition, at another time point the user's hand 605 may be located in a stereo area 614, where the FoV of the first camera and the FoV of the second camera overlap. When the user's hand 605 is located in the stereo area 614, the electronic device according to an embodiment may determine whether to obtain the 3D pose information of the hand by the stereoscopic method, based on at least one of the accuracy of the preset hand characteristic parameters or the number of preset hand characteristic parameters.

For example, when the number of preset hand characteristic parameters is equal to or greater than a threshold, the electronic device may obtain the 3D pose information of the hand using one image captured by either the first camera or the second camera together with the hand characteristic parameters. When the number of preset hand characteristic parameters is less than the threshold, the electronic device may obtain the 3D pose information of the hand by the stereoscopic method based on the images captured by the first camera and the second camera. Because the stereoscopic method yields absolute information about the size, shape, and the like of the hand, the electronic device may update the hand characteristic parameters based on that information.

According to another example, the electronic device may determine whether the accuracy of the interaction information provided in response to a previously input hand gesture is equal to or less than a preset criterion. When events in which the same hand gesture is input again as user feedback after the interaction information is provided, or in which a user input indicating that the interaction information is not appropriate is obtained, occur a certain number of times or more, the electronic device may determine that the accuracy of the interaction information is equal to or less than the preset criterion. In that case, the electronic device may determine that the accuracy of the hand characteristic parameters is also low. When the accuracy of the hand characteristic parameters does not satisfy a preset criterion, the electronic device may obtain the 3D pose information of the hand by the stereoscopic method based on the images captured by the first camera and the second camera. Because the stereoscopic method yields absolute information about the size, shape, and the like of the hand, the electronic device may update the hand characteristic parameters based on that information.
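The two examples above (too few parameters, or too many signs of inaccurate interaction) can be combined into one decision function. The sketch below is one possible reading: combining the two criteria with a logical OR, the `CalibrationState` structure, and the threshold names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CalibrationState:
    parameter_count: int           # number of hand characteristic parameters set so far
    negative_feedback_events: int  # repeated gestures or "not appropriate" inputs


def use_stereoscopic(state, min_parameters, max_negative_events):
    """Decide whether to fall back to the stereoscopic method in the stereo area."""
    too_few_parameters = state.parameter_count < min_parameters
    parameters_inaccurate = state.negative_feedback_events >= max_negative_events
    return too_few_parameters or parameters_inaccurate


# Hypothetical thresholds: at least 5 parameters, at most 2 tolerated events.
needs_stereo_few = use_stereoscopic(CalibrationState(2, 0), 5, 3)
needs_stereo_ok = use_stereoscopic(CalibrationState(8, 0), 5, 3)
needs_stereo_feedback = use_stereoscopic(CalibrationState(8, 3), 5, 3)
```

When the function returns False, a single image from either camera plus the stored parameters would be used, which avoids processing both images.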

In addition, at yet another time point the user's hand 605 may be located in a second mono area 616, which is the part of the FoV of the second camera that does not overlap the FoV of any other camera. The electronic device may identify the shape, size, and the like of the user's hand from one image captured by the second camera. In addition, the electronic device may calibrate the identified shape, size, and the like of the user's hand using preset hand characteristic parameters. In this case, if the hand characteristic parameters were updated at the time point when the hand was in the stereo area 614, the electronic device may determine the 3D pose information of the hand using the updated hand characteristic parameters and the one image captured by the second camera.

FIG. 7 is a flowchart illustrating a method by which an electronic device according to an embodiment obtains hand characteristic parameters.

In step S710, the electronic device may identify a plurality of images including the hand. For example, when the hand is located in the stereo area, the electronic device may obtain a plurality of images including the hand, each captured by a different camera.

In step S720, the electronic device may obtain hand characteristic parameters from the identified plurality of images when the sharpness of the fingers of the hand detected in those images is equal to or greater than a preset criterion, or when the position of the detected hand is within a preset distance range from the center of the identified image. To determine whether the hand characteristic parameters obtained from the plurality of images appropriately represent the characteristics of the hand, the electronic device according to an embodiment may use the sharpness of the hand, the position of the hand at the time of capture, and the like as criteria.

In step S730, the electronic device may identify whether the number of hand characteristic parameters set in the electronic device is equal to or greater than a threshold.

In step S740, when the number of set hand characteristic parameters is equal to or greater than the threshold, the electronic device may confirm the set hand characteristic parameters as the final hand characteristic parameters.

In step S750, the electronic device may store the obtained hand characteristic parameters. When the number of set hand characteristic parameters is less than the threshold, the electronic device according to an embodiment may determine that additional hand characteristic parameters need to be set. Accordingly, the electronic device may store the hand characteristic parameters obtained in step S720 described above.
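Steps S720 through S750 can be summarized as a small function that either finalizes the stored parameters or adds newly obtained ones. This is a sketch under assumptions: parameters are modeled as a name-to-length dictionary, the quality check of step S720 is passed in as a boolean, and ignoring an observation that fails the check is a choice the description leaves open.

```python
def handle_observation(stored, observed, quality_ok, threshold):
    """Return (parameters, finalized) after processing one stereo observation.

    stored    -- parameters already set on the device (name -> length)
    observed  -- parameters obtained from the current images (S720)
    """
    if not quality_ok:
        # S720 condition not met: assumed here to leave the state unchanged.
        return dict(stored), False
    if len(stored) >= threshold:
        # S730 -> S740: enough parameters, confirm them as final.
        return dict(stored), True
    # S730 -> S750: more parameters are needed, store the new ones.
    return {**stored, **observed}, False


# Hypothetical usage with a threshold of two parameters.
params, done = handle_observation(
    {"thumb_a": 0.04}, {"thumb_b": 0.03}, quality_ok=True, threshold=2
)
params_final, done_final = handle_observation(
    params, {"index_a": 0.045}, quality_ok=True, threshold=2
)
```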

FIG. 8 is a diagram illustrating a method by which an electronic device according to an embodiment calibrates 3D pose information of a hand.

Referring to FIG. 8, to provide an AR environment, the electronic device may output a keyboard image 810 superimposed on the real world. The user may generate a message 820 by entering the desired text on the keyboard image 810 with tap gestures. The electronic device according to an embodiment may obtain 3D pose information of the hand using one image obtained from one of the plurality of cameras and the hand characteristic parameters. In this case, if the text provided as the interaction information for the 3D pose information of the hand differs from the text intended by the user, the electronic device may determine that the accuracy of the currently set hand characteristic parameters is low. Accordingly, the electronic device may update the hand characteristic parameters and calibrate the 3D hand pose information.

For example, the electronic device may capture the user's hand using two or more of the plurality of cameras, and may update the user's hand characteristic parameters by applying the stereoscopic method to the captured images.

The electronic device may newly obtain 3D pose information of the hand using the updated hand characteristic parameters. The electronic device may generate a new message 825 based on the newly obtained 3D pose information of the hand.

FIGS. 9 and 10 are block diagrams of an electronic device according to an embodiment.

As shown in FIG. 9, the electronic device 100 according to an embodiment may include a sensing unit 110, a processor 130, and an output unit 150. However, not all of the illustrated components are essential. The electronic device 100 may be implemented with more components than those illustrated, or with fewer.

For example, as shown in FIG. 10, the electronic device 100 according to an embodiment of the present disclosure may further include a communication unit 160 and a memory 170 in addition to the sensing unit 110, the processor 130, and the output unit 150.

The above components are described in turn below.

The sensing unit 110 may sense a state of the electronic device 100 or a state around the electronic device 100 and transmit the sensed information to the processor 130. For example, the sensing unit 110 may capture the user's hand and transmit the captured image to the processor 130.

The sensing unit 110 may include, but is not limited to, an image sensor 111, a motion sensor 112, an acceleration sensor 113, a position sensor 114, an eye tracking sensor 115, a temperature/humidity sensor 116, a geomagnetic sensor 117, a gyroscope sensor 118, a microphone 119, and a wearing detection sensor 120. Because a person skilled in the art can intuitively infer the function of each sensor from its name, a detailed description is omitted.

The processor 130 typically controls the overall operation of the electronic device 100. For example, by executing programs stored in the memory 170, the processor 130 may control the sensing unit 110, the output unit 150, the communication unit 160, the memory 170, and the like.

The processor 130 may perform the operations of the electronic device described above with reference to FIGS. 1 to 8. The processor 130 according to an embodiment may obtain a plurality of images captured by a plurality of cameras. The processor 130 may also identify, among the obtained images, an image including a hand. The processor 130 may obtain 3D pose information of the hand based on the identified image and preset hand feature parameters. The processor 130 may provide interaction information corresponding to the obtained 3D pose information of the hand, based on an application running on the electronic device 100.

The output unit 150 outputs an audio signal, a video signal, or a vibration signal, and may include a display unit 151, a sound output unit 152, and the like.

The display unit 151 may display information processed by the electronic device 100. When the display unit 151 and a touch pad form a layered structure configured as a touch screen, the display unit 151 may be used as an input device as well as an output device.

The sound output unit 152 outputs audio data received from the communication unit 160 or stored in the memory 170. The sound output unit 152 also outputs sound signals related to functions performed by the electronic device 100 (for example, a call signal reception sound, a message reception sound, and a notification sound). The sound output unit 152 may include a speaker, a buzzer, and the like.

The communication unit 160 may include one or more components that enable communication between the electronic device 100 and an external device, or between the electronic device 100 and a server. For example, the communication unit 160 may include a short-range communication unit 161, a mobile communication unit 162, and a broadcast reception unit 163.

The short-range wireless communication unit 161 may include, but is not limited to, a Bluetooth communication unit, a BLE (Bluetooth Low Energy) communication unit, a near field communication (NFC) unit, a WLAN (Wi-Fi) communication unit, a Zigbee communication unit, an IrDA (Infrared Data Association) communication unit, a WFD (Wi-Fi Direct) communication unit, a UWB (ultra wideband) communication unit, and an Ant+ communication unit.

The mobile communication unit 162 transmits and receives radio signals to and from at least one of a base station, an external terminal, and a server on a mobile communication network. Here, the radio signals may include a voice call signal, a video call signal, or various types of data associated with transmitting and receiving text/multimedia messages.

The broadcast reception unit 163 receives broadcast signals and/or broadcast-related information from the outside through a broadcast channel. Broadcast channels may include satellite channels and terrestrial channels. Depending on the implementation, the electronic device 100 may not include the broadcast reception unit 163.

The memory 170 may store programs for the processing and control performed by the processor 130, and may store input/output data.

The memory 170 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), a magnetic memory, a magnetic disk, and an optical disk.

The programs stored in the memory 170 may be classified into a plurality of modules according to their functions, for example, a UI module 171, a notification module 172, and the like.

The UI module 171 may provide, for each application, a specialized UI, GUI, or the like that works with the electronic device 100.

The notification module 172 may generate a signal for notifying the occurrence of an event in the electronic device 100. Examples of events occurring in the electronic device 100 include reception of a call signal, reception of a message, input of a key signal, and a schedule notification.

FIG. 11 is a block diagram illustrating a camera module according to various embodiments.

The camera module 1100 according to an embodiment may correspond to the camera or the plurality of cameras described above with reference to FIGS. 1 to 8.

Referring to FIG. 11, the camera module 1100 may include a lens assembly 1110, a flash 1120, an image sensor 1130, an image stabilizer 1140, a memory 1150 (e.g., a buffer memory), or an image signal processor 1160. The lens assembly 1110 may collect light emitted from a subject to be photographed. The lens assembly 1110 may include one or more lenses. According to an embodiment, the camera module 1100 may include a plurality of lens assemblies 1110. In this case, the camera module 1100 may form, for example, a dual camera, a 360-degree camera, or a spherical camera. Some of the plurality of lens assemblies 1110 may have the same lens properties (e.g., angle of view, focal length, auto focus, f-number, or optical zoom), or at least one lens assembly may have one or more lens properties that differ from those of another lens assembly. The lens assembly 1110 may include, for example, a wide-angle lens or a telephoto lens.

The flash 1120 may emit light used to enhance the light emitted or reflected from a subject. According to an embodiment, the flash 1120 may include one or more light emitting diodes (e.g., a red-green-blue (RGB) LED, a white LED, an infrared LED, or an ultraviolet LED) or a xenon lamp.

The image sensor 1130 may obtain an image corresponding to a subject by converting light emitted or reflected from the subject and transmitted through the lens assembly 1110 into an electrical signal. According to an embodiment, the image sensor 1130 may include one image sensor selected from image sensors having different properties, such as an RGB sensor, a black and white (BW) sensor, an IR sensor, or a UV sensor; a plurality of image sensors having the same property; or a plurality of image sensors having different properties. Each image sensor included in the image sensor 1130 may be implemented using, for example, a charge-coupled device (CCD) sensor or a complementary metal oxide semiconductor (CMOS) sensor.

In response to movement of the camera module or of the electronic device 100 including it, the image stabilizer 1140 may move at least one lens included in the lens assembly 1110 or the image sensor 1130 in a specific direction, or may control the operating characteristics of the image sensor 1130 (e.g., adjust the read-out timing). This compensates for at least part of the negative effect of the movement on the image being captured.

The memory 1150 may at least temporarily store at least part of an image obtained through the image sensor 1130 for a subsequent image processing task. For example, when image acquisition is delayed because of the shutter, or when a plurality of images are obtained at high speed, the obtained original image (e.g., a Bayer-patterned image or a high-resolution image) may be stored in the memory 1150, and a corresponding copy image (e.g., a low-resolution image) may be previewed through the display unit 151. Thereafter, when a specified condition is satisfied (e.g., a user input or a system command), at least part of the original image stored in the memory 1150 may be obtained and processed by, for example, the image signal processor 1160. According to an embodiment, the memory 1150 may be configured as at least part of the memory 170, or as a separate memory operated independently of the memory 170.

The image signal processor 1160 may perform one or more image processing operations on an image obtained through the image sensor 1130 or an image stored in the memory 1150. The one or more image processing operations may include, for example, depth map generation, 3D modeling, panorama generation, feature point extraction, image synthesis, or image compensation (e.g., noise reduction, resolution adjustment, brightness adjustment, blurring, sharpening, or softening). According to an embodiment, the image signal processor 1160 may be configured as at least part of the processor 130, or as a separate processor operated independently of the processor 130.

According to an embodiment, the electronic device 100 may include a plurality of camera modules each having different properties or functions. In this case, for example, at least one of the plurality of camera modules may be a wide-angle camera and at least another may be a telephoto camera. Similarly, at least one of the plurality of camera modules may be a front camera and at least another may be a rear camera.

The electronic device according to the various embodiments disclosed in this document may be one of various types of devices. The electronic device may include, for example, a portable communication device (e.g., a smartphone), a computer device, a portable multimedia device, a portable medical device, a camera, a wearable device, or a home appliance. The electronic device according to an embodiment of this document is not limited to the devices described above.

The various embodiments of this document and the terms used therein are not intended to limit the technical features described in this document to particular embodiments, and should be understood to include various changes, equivalents, or replacements of the corresponding embodiment. In the description of the drawings, similar reference numerals may be used for similar or related elements. The singular form of a noun corresponding to an item may include one or more of the items, unless the relevant context clearly indicates otherwise. In this document, each of the phrases "A or B", "at least one of A and B", "at least one of A or B", "A, B, or C", "at least one of A, B, and C", and "at least one of A, B, or C" may include any one of the items listed together in the corresponding phrase, or all possible combinations thereof. Terms such as "1st" and "2nd", or "first" and "second", may be used simply to distinguish one component from another, and do not limit the components in other aspects (e.g., importance or order). When a (e.g., first) component is referred to as "coupled" or "connected" to another (e.g., second) component, with or without the term "functionally" or "communicatively", it means that the component may be connected to the other component directly (e.g., by wire), wirelessly, or via a third component.

The term "module" used in the various embodiments of this document may include a unit implemented in hardware, software, or firmware, and may be used interchangeably with terms such as logic, logic block, part, or circuit. A module may be an integrally formed part, or a minimum unit of the part, or a portion thereof, that performs one or more functions. For example, according to an embodiment, a module may be implemented in the form of an application-specific integrated circuit (ASIC).

The various embodiments of this document may be implemented as software (e.g., a program) including one or more instructions stored in a storage medium readable by a machine (e.g., the electronic device 100). For example, a processor (e.g., the processor 130) of the machine (e.g., the electronic device 100) may invoke at least one of the one or more instructions stored in the storage medium and execute it. This allows the machine to be operated to perform at least one function according to the at least one invoked instruction. The one or more instructions may include code generated by a compiler or code executable by an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium.

According to an embodiment, the method according to the various embodiments disclosed in this document may be included and provided in a computer program product. The computer program product may be traded as a commodity between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or be distributed online (e.g., downloaded or uploaded) through an application store (e.g., Play Store™) or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product may be at least temporarily stored, or temporarily generated, in a machine-readable storage medium such as the memory of a manufacturer's server, a server of the application store, or a relay server.

According to various embodiments, each component (e.g., a module or a program) of the above-described components may include a single entity or a plurality of entities, and some of the plurality of entities may be separately disposed in another component. According to various embodiments, one or more of the above-described components or operations may be omitted, or one or more other components or operations may be added. Alternatively or additionally, a plurality of components (e.g., modules or programs) may be integrated into a single component. In this case, the integrated component may perform one or more functions of each of the plurality of components in the same or a similar manner as they were performed by the corresponding one of the plurality of components before the integration. According to various embodiments, operations performed by a module, a program, or another component may be executed sequentially, in parallel, repeatedly, or heuristically; one or more of the operations may be executed in a different order or omitted; or one or more other operations may be added.

The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory storage medium" means only that the medium is a tangible device and does not include a signal (e.g., an electromagnetic wave); the term does not distinguish between a case in which data is stored semi-permanently in the storage medium and a case in which data is stored temporarily. For example, a "non-transitory storage medium" may include a buffer in which data is temporarily stored.

According to an embodiment, the method according to the various embodiments disclosed in this document may be included and provided in a computer program product. The computer program product may be traded as a commodity between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)), or be distributed online (e.g., downloaded or uploaded) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least part of the computer program product (e.g., a downloadable app) may be at least temporarily stored, or temporarily generated, in a machine-readable storage medium such as the memory of a manufacturer's server, a server of the application store, or a relay server.

Claims

A method for providing a user interface supporting interaction by an electronic device, comprising:
acquiring a plurality of images captured by a plurality of cameras;
identifying an image including a hand from among the plurality of acquired images;
obtaining 3D pose information of the hand based on the identified image and a preset hand feature parameter; and
providing interaction information corresponding to the obtained 3D pose information of the hand, based on an application executed in the electronic device.

The method of claim 1, further comprising:
obtaining user profile information including at least one of a user's gender, age, height, weight, and region; and
setting a value of the preset hand feature parameter to a value corresponding to the obtained user profile information.

The method of claim 1, further comprising:
when a plurality of images, obtained by photographing a hand located in a stereo area in which the fields of view (FoV) of the plurality of cameras overlap, are identified among the plurality of acquired images, obtaining depth information of the hand and size information of the hand based on the identified plurality of images;
updating the hand feature parameter based on the obtained depth information and the obtained size information of the hand; and
when one image including the hand is identified after the update, obtaining 3D pose information of the hand based on the identified one image and the updated hand feature parameter.
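As a hedged illustration of how depth and size could be related in this claim, the sketch below uses the standard pinhole stereo relations for a rectified camera pair: depth Z = f·B/d, physical size S = s·Z/f, and, once S is known, single-image depth Z = f·S/s. The rectified-pinhole assumption, the function names, and the numeric values in the usage note are illustrative assumptions; the claim itself does not prescribe a particular computation.

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (m) of a point seen by two rectified cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in the stereo area")
    return focal_px * baseline_m / disparity_px


def physical_size(pixel_len: float, depth_m: float, focal_px: float) -> float:
    """Physical length (m) of a segment spanning `pixel_len` pixels at depth Z."""
    return pixel_len * depth_m / focal_px


def mono_depth(focal_px: float, known_size_m: float, pixel_len: float) -> float:
    """Depth (m) from a single image once the physical size is known: Z = f * S / s."""
    if pixel_len <= 0:
        raise ValueError("pixel length must be positive")
    return focal_px * known_size_m / pixel_len
```

For example, with an assumed focal length of 500 px, a 0.1 m baseline, and a 100 px disparity, the hand is at 0.5 m; a hand width of 80 px then corresponds to 0.08 m. If the same hand later spans 100 px in a single image, the stored size gives a depth of 0.4 m.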

The method of claim 3, wherein updating the hand feature parameter comprises:
when the sharpness of the fingers of the hand detected in the identified plurality of images is equal to or greater than a predetermined criterion, or the position of the detected hand is within a predetermined distance range from the center of the identified image, updating the hand feature parameter based on the obtained depth information and the obtained size information of the hand.
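The update condition in this claim might be expressed as the predicate sketched below. The function name `should_update`, the scalar sharpness score, and the pixel-based distance threshold are assumptions for illustration; the claim does not state how sharpness is measured.

```python
import math
from typing import Tuple


def should_update(
    sharpness: float,
    hand_center: Tuple[float, float],
    image_size: Tuple[int, int],
    min_sharpness: float,
    max_center_dist: float,
) -> bool:
    """Return True if either claimed condition holds.

    Condition 1: finger sharpness meets the predetermined criterion.
    Condition 2: the hand lies within a predetermined distance of the image center.
    """
    center_x, center_y = image_size[0] / 2, image_size[1] / 2
    dist = math.hypot(hand_center[0] - center_x, hand_center[1] - center_y)
    return sharpness >= min_sharpness or dist <= max_center_dist
```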

The method of claim 3,
wherein, when the number of hand feature parameters set in the electronic device is greater than or equal to a threshold, the updating of the hand feature parameter is not performed.

The method of claim 3, further comprising:
identifying accuracy of the provided interaction information based on user input information received after the interaction information is provided,
wherein updating the hand feature parameter comprises:
when the accuracy of the provided interaction information does not satisfy a preset criterion, updating the hand feature parameter based on the obtained depth information and the obtained size information of the hand.

The method of claim 1, wherein the hand feature parameter
includes size information for each part of the hand, determined based on at least some of the points constituting the hand.
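One possible reading of "size information for each part of the hand" is the length of each segment between two hand points. The sketch below follows that reading; the point names, the 3D coordinates in metres, and the `part_sizes` helper are hypothetical and are not taken from the disclosure.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


def part_sizes(
    points: Dict[str, Point],
    parts: Dict[str, Tuple[str, str]],
) -> Dict[str, float]:
    """Map each part name to the Euclidean distance between its two end points."""
    return {
        name: math.dist(points[start], points[end])
        for name, (start, end) in parts.items()
    }
```

A call such as `part_sizes(points, {"palm": ("wrist", "index_base")})` would return the palm length under this interpretation.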

The method of claim 1, further comprising:
obtaining tracking information of the hand; and
obtaining, based on the obtained tracking information of the hand, probability information for the hand being detected in the plurality of images,
wherein identifying the image including the hand comprises
performing a hand detection operation on the plurality of images in an order determined based on the obtained probability information.

The method of claim 7, wherein the hand detection operation
performed in the determined order is stopped once a predetermined number of images including the hand have been acquired.
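The probability-ordered detection and early stop described in the two preceding claims could be sketched as below. This is only an illustration: the per-camera probability dictionary, the `contains_hand` detector callback, and the reading of the stop condition as "a predetermined number of images that include the hand" are assumptions.

```python
from typing import Callable, Dict, List


def detect_in_priority_order(
    images: Dict[str, object],
    probability: Dict[str, float],
    contains_hand: Callable[[object], bool],
    needed: int,
) -> List[str]:
    """Run detection on the most probable camera images first.

    Returns the camera ids whose images contain the hand, stopping as soon as
    `needed` such images have been found.
    """
    # Cameras without a probability estimate are checked last.
    order = sorted(images, key=lambda cam: probability.get(cam, 0.0), reverse=True)
    found: List[str] = []
    for cam in order:
        if contains_hand(images[cam]):
            found.append(cam)
            if len(found) >= needed:
                break  # skip detection on the remaining images
    return found
```

Checking likely images first means the detector runs on fewer images when the tracking-based probabilities are accurate.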

The method of claim 7, further comprising:
based on the obtained tracking information of the hand, stopping image capture by a camera, among the plurality of cameras having different FoVs, whose FoV covers an area in which the probability of detecting the hand is less than a predetermined value.

An electronic device providing a user interface supporting interaction, comprising:
a sensing unit including a plurality of cameras;
an output unit; and
at least one processor, wherein the at least one processor is configured to:
acquire a plurality of images captured by the plurality of cameras, identify an image including a hand among the plurality of acquired images, obtain 3D pose information of the hand based on the identified image and a preset hand feature parameter, and provide interaction information corresponding to the obtained 3D pose information of the hand based on an application executed in the electronic device.

The electronic device of claim 11, wherein the at least one processor is configured to:
obtain user profile information including at least one of a user's gender, age, height, weight, and region; and
set a value of the preset hand feature parameter to a value corresponding to the obtained user profile information.

The electronic device of claim 11, wherein the at least one processor is configured to:
when a plurality of images, obtained by photographing a hand located in a stereo area in which the fields of view (FoV) of the plurality of cameras overlap, are identified among the plurality of acquired images, obtain depth information of the hand and size information of the hand based on the identified plurality of images;
update the hand feature parameter based on the obtained depth information and the obtained size information of the hand; and
when one image including the hand is identified after the update, obtain 3D pose information of the hand based on the identified one image and the updated hand feature parameter.

The electronic device of claim 13, wherein the at least one processor is configured to:
when the sharpness of the fingers of the hand detected in the identified plurality of images is equal to or greater than a predetermined criterion, or the position of the detected hand is within a predetermined distance range from the center of the identified image, update the hand feature parameter based on the obtained depth information and the obtained size information of the hand.

The electronic device of claim 13,
wherein, when the number of hand feature parameters set in the electronic device is greater than or equal to a threshold, the updating of the hand feature parameter is not performed.

The electronic device of claim 13, wherein the at least one processor is configured to:
identify accuracy of the provided interaction information based on user input information received after the interaction information is provided; and
when the accuracy of the provided interaction information does not satisfy a predetermined criterion, update the hand feature parameter based on the obtained depth information and the obtained size information of the hand.

The electronic device of claim 11, wherein the hand feature parameter
includes size information for each part of the hand, determined based on at least some of the points constituting the hand.

The electronic device of claim 11, wherein the at least one processor is configured to:
obtain tracking information of the hand;
obtain, based on the obtained tracking information of the hand, probability information for the hand being detected in the plurality of images; and
in identifying the image including the hand,
perform a hand detection operation on the plurality of images in an order determined based on the obtained probability information.

The electronic device of claim 18, wherein the hand detection operation
performed in the determined order is stopped once a predetermined number of images including the hand have been acquired.

The electronic device of claim 18, wherein the at least one processor is configured to:
based on the obtained tracking information of the hand, stop image capture by a camera, among the plurality of cameras having different FoVs, whose FoV covers an area in which the probability of detecting the hand is less than a preset value.

A computer program product including a recording medium storing a program that causes an electronic device to perform a method of providing a user interface supporting interaction, the program performing:
an operation of acquiring a plurality of images captured by a plurality of cameras;
an operation of identifying an image including a hand among the plurality of acquired images;
an operation of obtaining 3D pose information of the hand based on the identified image and a preset hand feature parameter; and
an operation of providing interaction information corresponding to the obtained 3D pose information of the hand, based on an application executed in the electronic device.