KR20230000932A

KR20230000932A - Methods and devices for analyzing images

Info

Publication number: KR20230000932A
Application number: KR1020220001180A
Authority: KR
Inventors: 김천윤
Original assignee: 김천윤
Priority date: 2021-04-27
Filing date: 2022-01-04
Publication date: 2023-01-03
Also published as: KR102348852B1; KR20230000931A

Abstract

The present invention relates to a method and device for analyzing an image. According to one embodiment of the present invention, a method for detecting an object may include the steps of: extracting frame images and motion vectors from images, moving pictures, image information, and/or moving picture information; generating an integrated feature vector based on the frame image and the motion vector; and detecting an object included in the images, the moving pictures, the image information, and/or the moving picture information based on the integrated feature vector. It is possible to provide the method and device for effectively detecting an object included in the images, the moving pictures, the image information, and/or the moving picture information.

Description

Methods and devices for analyzing images

The present invention relates to a method for extracting an object from an image and to an apparatus therefor.

The present invention also relates to a method for extracting an object from content and providing information on the extracted object, and to an apparatus therefor. Specifically, it relates to a method and apparatus for providing a user with information on a specific object in content when the user wants that information.

With the recent surge in multimedia content and the spread of security equipment such as CCTV, image recognition technology for video such as broadcast content and recorded CCTV footage is becoming increasingly important.

In general, video-based image recognition techniques are based either on still images or on a plurality of consecutive frame images. A still-image-based technique divides a video into still images frame by frame and applies image-based analysis to each still image to detect and recognize objects. A technique based on a plurality of consecutive frame images models the motion characteristics of an object across those frames to detect a moving object or recognize a specific event.

However, high-speed recognition is limited by the complexity and heavy computation of models that represent time-series motion as multiple states or as a plurality of consecutive frame images. Furthermore, image recognition technology in the security field using CCTV and the like can separate and recognize a moving object in a video with a fixed background, but it is limited in detecting a specific object or in separating an object from the video of a moving camera or from a dynamic background.

In addition, with recent advances in information and communication technology, users are frequently exposed to content such as photos and videos. Users may want information on a specific object in the content, but the content itself does not provide it. Users may also want information on a specific object encountered in daily life, not only in content. Specifically, a user may want information on a specific object (for example, an electronic product, clothing, or a bag) encountered in media or in daily life.

However, to obtain information on a specific object, the user has to resort to an Internet search or the like, and may still fail to obtain information on a specific item because the user cannot enter accurate search keywords.

Accordingly, there is a growing need for a method, and an apparatus therefor, that accurately provides information on a specific object that a user wants.

Korean Patent Application Publication No. 10-2015-0005131 A (2015.01.14)
Korean Patent Application Publication No. 10-2014-0122292 A (2014.10.20)

An embodiment of the present invention may provide a method and apparatus that detect an object included in an image, a video, image information, and/or video information based on an integrated feature vector, thereby detecting the object effectively by taking into account both its static and dynamic characteristics.


An embodiment of the present invention may also provide a method and apparatus that detect an object at high speed with an effectively reduced amount of computation, by combining the simplicity and computational efficiency of still-image-based object detection with the high performance of object detection based on a plurality of consecutive frame images.

An embodiment of the present invention may also provide a method and apparatus that detect, with high accuracy, an object exhibiting a consistent motion pattern, by jointly using the image information of the object in a still image and the motion information of the object, such as whole or partial movement and deformation.

An embodiment of the present invention may also provide an object detection method that is robust to the blurring that can appear in an image, a video, image information, and/or video information capturing an object, by considering both the static characteristics of the object based on the frame image and its dynamic characteristics based on the motion vector.

An embodiment of the present invention may also provide an apparatus for providing information on a specific object in content.

An embodiment of the present invention may also provide a method of operating a server that provides information on a specific object in content.

An embodiment of the present invention proposes a method and apparatus for detecting an object based on a frame image and a motion vector.

An object detection method according to an embodiment of the present invention may include: extracting a frame image and a motion vector from an image, a video, image information, and/or video information; generating an integrated feature vector based on the frame image and the motion vector; and detecting an object included in the image, video, image information, and/or video information based on the integrated feature vector.

In the object detection method according to an embodiment of the present invention, generating the integrated feature vector may include: extracting statistical characteristics of the frame image as a first feature vector and statistical characteristics of the motion vector as a second feature vector; and generating the integrated feature vector by combining the first feature vector and the second feature vector.
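The combination step described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which statistics are used or how the two vectors are combined, so the sketch assumes mean and standard deviation of brightness for the first feature vector, mean displacement for the second, and plain concatenation as the combination. All function names are hypothetical.

```python
from statistics import mean, pstdev


def frame_feature(pixels):
    """First feature vector: brightness mean and standard deviation (assumed statistics)."""
    flat = [p for row in pixels for p in row]
    return [mean(flat), pstdev(flat)]


def motion_feature(vectors):
    """Second feature vector: mean horizontal and vertical displacement (assumed statistics)."""
    return [mean(dx for dx, _ in vectors), mean(dy for _, dy in vectors)]


def integrated_feature(pixels, vectors):
    """Combine the two feature vectors by concatenation (assumed combination rule)."""
    return frame_feature(pixels) + motion_feature(vectors)


feature = integrated_feature([[0, 2], [2, 0]], [(1, 0), (3, 2)])
```

Concatenation keeps the static and dynamic parts separable, so a downstream classifier can weight them independently; other combination rules would fit the same description.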

In the object detection method according to an embodiment of the present invention, extracting the first feature vector and the second feature vector may include dividing the frame image and the motion vector into a plurality of blocks, extracting the first feature vector based on the frame image included in a block, and extracting the second feature vector based on the motion vector included in a block.

In the object detection method according to an embodiment of the present invention, extracting the first feature vector and the second feature vector may include extracting the first feature vector based on the brightness gradient of pixels included in the frame image.
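One common way to turn brightness gradients into a feature vector is a magnitude-weighted histogram of gradient orientations. The sketch below assumes that approach; the text itself only says the feature is based on the brightness gradient, so the central-difference gradient, the bin count, and the function name `gradient_histogram` are all illustrative choices.

```python
import math


def gradient_histogram(img, bins=8):
    """Normalized, magnitude-weighted histogram of gradient orientations.

    img is a 2D list of brightness values; border pixels are skipped
    because the central difference needs both neighbours.
    """
    hist = [0.0] * bins
    height, width = len(img), len(img[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # horizontal central difference
            gy = img[y + 1][x] - img[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region, no orientation
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


# A horizontal brightness ramp: every interior gradient points along +x.
ramp_hist = gradient_histogram([[0, 1, 2, 3] for _ in range(4)])
```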

In the object detection method according to an embodiment of the present invention, extracting the first feature vector and the second feature vector may include extracting the first feature vector based on the brightness level of pixels included in the frame image.

In the object detection method according to an embodiment of the present invention, extracting the first feature vector and the second feature vector may include extracting the first feature vector based on the color of pixels included in the frame image.

In the object detection method according to an embodiment of the present invention, extracting the first feature vector and the second feature vector may include extracting the second feature vector based on the direction of the motion vector.
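A direction-based second feature vector could, for example, be a normalized histogram of motion vector directions. The sketch below assumes that form; the number of bins, the treatment of zero vectors, and the name `direction_histogram` are illustrative and not taken from the text.

```python
import math


def direction_histogram(vectors, bins=4):
    """Normalized histogram of the directions of (dx, dy) motion vectors."""
    hist = [0] * bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        hist[int(angle / (2 * math.pi) * bins) % bins] += 1
    count = sum(hist)
    return [c / count for c in hist] if count else [0.0] * bins


directions = direction_histogram([(1, 1), (2, 1), (-1, -1), (0, 0)])
```

With four bins, the two vectors pointing up and to the right fall in the first quadrant bin and the vector pointing down and to the left falls in the third, while the zero vector is ignored.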

In the object detection method according to an embodiment of the present invention, extracting the frame image and the motion vector may include dividing a reference frame corresponding to the frame image into a plurality of blocks, extracting a motion vector for each block to construct a motion vector map, and equalizing the size of the blocks constituting the motion vector map.
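The text does not specify how block sizes are equalized. One plausible reading, sketched here as an assumption, is to resample a map of variable-size blocks onto a uniform grid in which every cell inherits the motion vector of the block that covers it. The block tuple layout `(x, y, width, height, vector)` and the name `equalize_blocks` are hypothetical.

```python
def equalize_blocks(blocks, width, height, unit=4):
    """Resample variable-size blocks onto a uniform grid of unit x unit cells.

    Each cell takes the motion vector of the block covering it; cells not
    covered by any block keep the zero vector. Block positions and sizes
    are assumed to be multiples of unit.
    """
    cols, rows = width // unit, height // unit
    grid = [[(0, 0)] * cols for _ in range(rows)]
    for x, y, w, h, vector in blocks:
        for r in range(y // unit, (y + h) // unit):
            for c in range(x // unit, (x + w) // unit):
                grid[r][c] = vector
    return grid


# One 8x8 block and two 4x4 blocks covering a 12x8 frame.
uniform_map = equalize_blocks(
    [(0, 0, 8, 8, (1, 0)), (8, 0, 4, 4, (0, 2)), (8, 4, 4, 4, (3, 3))],
    width=12,
    height=8,
)
```

A uniform grid lets the motion vector map be divided into the same blocks as the frame image, which is what the block-wise feature extraction described earlier needs.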

In the object detection method according to an embodiment of the present invention, detecting an object included in the image, video, image information, and/or video information may include determining, based on the integrated feature vector, whether the frame image contains a detection target object.

In the object detection method according to an embodiment of the present invention, extracting the frame image and the motion vector may include extracting, during decoding, a motion vector contained in the image, video, image information, and/or video information, or extracting a motion vector based on a plurality of consecutive frame images included in the image, video, image information, and/or video information.
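For the second option, deriving a motion vector from consecutive frames, a standard technique is exhaustive block matching with a sum-of-absolute-differences (SAD) cost. The sketch below uses that technique as an assumption; the text does not name an estimation method, and the function name, block size, and search range are illustrative.

```python
def block_motion_vector(prev, curr, bx, by, size=2, search=1):
    """Estimate the motion vector of one block by exhaustive SAD matching.

    Returns the (dx, dy) offset from the block at (bx, by) in curr to its
    best match in prev, searching offsets within +/- search pixels.
    """
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue  # candidate block would leave the frame
            cost = sum(
                abs(curr[by + j][bx + i] - prev[y0 + j][x0 + i])
                for j in range(size)
                for i in range(size)
            )
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


# A bright 2x2 patch moves one pixel to the right between frames.
prev_frame = [[0, 0, 0, 0], [9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0]]
curr_frame = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]
vector = block_motion_vector(prev_frame, curr_frame, bx=1, by=1)
```

The first option in the paragraph, reusing motion vectors already stored in the compressed stream, avoids this search entirely, which is consistent with the stated goal of reducing computation.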

An object detection apparatus according to an embodiment of the present invention may include: an extractor that extracts a frame image and a motion vector from an image, a video, image information, and/or video information; a feature generator that generates an integrated feature vector based on the frame image and the motion vector; and an object detector that detects an object included in the image, video, image information, and/or video information based on the integrated feature vector.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may extract statistical characteristics of the frame image as a first feature vector, extract statistical characteristics of the motion vector as a second feature vector, and generate the integrated feature vector by combining the first feature vector and the second feature vector.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may divide the frame image and the motion vector into a plurality of blocks, extract the first feature vector based on the frame image included in a block, and extract the second feature vector based on the motion vector included in a block.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may extract the first feature vector based on the brightness gradient of pixels included in the frame image.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may extract the first feature vector based on the brightness level of pixels included in the frame image.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may extract the first feature vector based on the color of pixels included in the frame image.

In the object detection apparatus according to an embodiment of the present invention, the feature generator may extract the second feature vector based on the direction of the motion vector.

In the object detection apparatus according to an embodiment of the present invention, the detection of an object included in the image, video, image information, and/or video information may be performed by determining, based on the integrated feature vector, whether the frame image contains a detection target object.

An embodiment of the present invention also provides a user device for obtaining information on a specific object in content.

A user device according to an embodiment of the present invention includes a communication unit and a processor functionally connected to the communication unit. The processor obtains content including one or more objects; extracts a specific object from among the one or more objects of the content, the specific object being extracted based on learning information trained in advance using a deep neural network or a convolutional neural network; transmits identification information for identifying the specific content to an object information providing server; and receives one or more pieces of information on the specific object from the object information providing server and displays them. The one or more pieces of information on the specific object differ according to the type of the specific object and are displayed either overlapping the content or in a pop-up form.

The specific object may be extracted based on those pixels, among the pixels constituting the content, whose adjacent pixels have pixel values equal to or greater than a preset threshold.

The preset threshold may be the average of the pixel values of the pixels constituting the content.
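The thresholding rule in the two preceding paragraphs is terse, so the sketch below commits to one possible reading and labels it as an assumption: a pixel is selected when all of its in-bounds 4-neighbours have values at or above the threshold, and the threshold is the mean pixel value of the whole content. The neighbourhood definition and the name `object_mask` are illustrative.

```python
def object_mask(img):
    """Mark pixels whose 4-neighbours are all at or above the mean pixel value.

    img is a 2D list of pixel values. The 4-neighbourhood and the
    'all neighbours' condition are assumed interpretations.
    """
    height, width = len(img), len(img[0])
    threshold = sum(map(sum, img)) / (height * width)  # mean pixel value

    def neighbours(y, x):
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width:
                yield img[ny][nx]

    return [
        [all(v >= threshold for v in neighbours(y, x)) for x in range(width)]
        for y in range(height)
    ]


mask = object_mask([[9, 9, 9], [9, 9, 9], [0, 0, 0]])
```

Under this reading only the top row is selected in the example, because every pixel in the middle row has a dark neighbour below it. A different neighbourhood or an "any neighbour" condition would give a different mask.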

When the specific object is a person, the one or more pieces of information on the specific object may include at least one of the person's name, the date of birth, an item worn by the person, and a place where the item can be purchased. When the specific object is a thing, the information may include at least one of the model name, the date of manufacture, the price, and a place where the thing can be purchased.

The number of pieces of information on the specific object may be determined by the amount the user has paid to the object information providing server.
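The two rules above, fields that depend on the object type and a count that depends on the amount paid, can be sketched together. Everything concrete here is hypothetical: the field names, the `select_info` function, and in particular the pricing rule of one item per 1000 units paid, which the text does not define.

```python
# Field order per object type, following the lists in the text.
FIELDS = {
    "person": ["name", "date_of_birth", "worn_item", "item_store"],
    "thing": ["model_name", "manufacture_date", "price", "store"],
}


def select_info(object_type, record, paid_amount, price_per_item=1000):
    """Return the pieces of information to display for one object.

    The number of returned items is paid_amount // price_per_item,
    a hypothetical pricing rule used only for illustration.
    """
    allowed = paid_amount // price_per_item
    keys = [k for k in FIELDS[object_type] if k in record][:allowed]
    return {k: record[k] for k in keys}


info = select_info(
    "thing",
    {"model_name": "X1", "price": 500, "store": "shop"},
    paid_amount=2000,
)
```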

An embodiment of the present invention also proposes a method by which a device obtains information on a specific object in content, the method including, by the device: obtaining content including one or more objects; extracting a specific object from among the one or more objects of the content, the specific object being extracted based on learning information trained in advance using a deep neural network or a convolutional neural network; transmitting identification information for identifying the specific content to an object information providing server; and receiving one or more pieces of information on the specific object from the object information providing server and displaying them on a display. The one or more pieces of information on the specific object differ according to the type of the specific object and are displayed either overlapping the content or in a pop-up form.

An embodiment of the present invention also proposes a computer-readable recording medium on which a program for executing the method is recorded.

An embodiment of the present invention also proposes a program recorded on a computer-readable recording medium to execute the method.

An embodiment of the present invention detects an object included in an image, a video, image information, and/or video information based on an integrated feature vector, and can therefore detect the object effectively by taking into account both its static and dynamic characteristics.

An embodiment of the present invention can detect an object at high speed with an effectively reduced amount of computation, by combining the simplicity and computational efficiency of still-image-based object detection with the high performance of object detection based on a plurality of consecutive frame images.

An embodiment of the present invention can detect, with high accuracy, an object exhibiting a consistent motion pattern, by jointly using the image information of the object in a still image and the motion information of the object, such as whole or partial movement and deformation.

An embodiment of the present invention can provide an object detection method that is robust to the blurring that can appear in an image, a video, image information, and/or video information capturing an object, by considering both the static characteristics of the object based on the frame image and its dynamic characteristics based on the motion vector.

A further effect is that the user can be provided immediately with information on a specific object in the content that the user is interested in.

The effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by a person having ordinary skill in the art to which the present invention pertains.

The above and other aspects, features, and benefits of certain preferred embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings.
FIG. 1 is a diagram illustrating an object detection method according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining a process of generating an integrated feature vector according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of generating an integrated feature vector from an image, a video, image information, and/or video information according to an embodiment of the present invention.
FIG. 4 is a diagram showing a detailed configuration of an object detection apparatus according to an embodiment of the present invention.
FIG. 5 is a diagram showing the configuration of a system for providing information on a specific object in content according to an embodiment of the present invention.
FIG. 6 is a block diagram illustrating a user device according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a method of providing information on a specific object in content according to an embodiment of the present invention.
FIG. 8 is a diagram illustrating a method of providing information on a specific object in content according to an embodiment of the present invention.
FIG. 9 is a flowchart illustrating a method of providing information on a specific object in content according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

In describing the embodiments, technical content that is well known in the technical field to which the present invention pertains and is not directly related to the present invention is omitted. Omitting unnecessary description conveys the gist of the present invention more clearly without obscuring it.

For the same reason, some components in the accompanying drawings are exaggerated, omitted, or shown schematically. The size of each component does not fully reflect its actual size. In each drawing, identical or corresponding components are given the same reference number.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that a person having ordinary skill in the art to which the present invention pertains is fully informed of the scope of the invention; the present invention is defined only by the scope of the claims. Like reference numbers designate like elements throughout the specification.

It will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, so that the instructions, executed by the processor of the computer or other programmable data processing equipment, create means for performing the functions described in the flowchart block(s). These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory produce an article of manufacture containing instruction means that perform the functions described in the flowchart block(s). The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable data processing equipment to produce a computer-executed process, and the instructions executed on the computer or other programmable data processing equipment provide steps for performing the functions described in the flowchart block(s).

In addition, each block may represent a module, a segment, or a portion of code that includes one or more executable instructions for performing the specified logical function(s). It should also be noted that in some alternative implementations the functions mentioned in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in reverse order, depending on the corresponding function.

The term 'unit' ('~부') as used in this embodiment means software or a hardware component such as an FPGA (Field-Programmable Gate Array) or an ASIC (Application-Specific Integrated Circuit), and a 'unit' performs certain roles. However, a 'unit' is not limited to software or hardware. A 'unit' may be configured to reside in an addressable storage medium and may be configured to execute on one or more processors. Thus, as an example, a 'unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functions provided within components and 'units' may be combined into a smaller number of components and 'units', or further separated into additional components and 'units'. In addition, components and 'units' may be implemented to execute on one or more CPUs in a device or a secure multimedia card.

In describing the embodiments of the present invention in detail, an example of a specific system will be the main subject. However, the main subject matter claimed in this specification is also applicable to other communication systems and services having a similar technical background, within a range that does not depart significantly from the scope disclosed herein, as a person skilled in the art will be able to determine.

In the present invention, the object detection device, the object information providing server, and/or the user device detect an object from an image, a video, image information, and/or video information included in information obtained (and/or received) through an application (mobile app) and/or a website (web service) that implements an online platform providing a service (and/or function) according to an embodiment of the present invention, and/or provide information about the detected object to a user. The instructions (commands) for implementing these functions may take the form of a Chrome browser extension, a smartphone application, and/or software for augmented reality equipment such as Oculus.

In the present invention, the object detection device, the object information providing server, and/or the user device also detect an object from an image and/or a video acquired by a camera (and/or sensor) installed inside and/or outside a vehicle according to an embodiment of the present invention, and/or provide information about the detected object to the user.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an object detection method according to an embodiment of the present invention.

The object detection method according to an embodiment of the present invention may be performed by a processor (and/or control unit) provided in the object detection device 400 described below. The object detection device 400 is a device that detects an object included in an image and/or a video, and may be implemented as a software module, a hardware module, or a combination thereof.

The object detection device 400 may also be implemented as the object information providing server 520 and/or the user device 530 of FIG. 5. That is, the object detection device 400 may be installed in various computing devices and/or systems such as smartphones, tablet computers, laptop computers, desktop computers, televisions, wearable devices, security systems, and smart home systems. Furthermore, the features of the object detection device 400 according to an embodiment of the present invention, described with reference to FIGS. 1 to 9, may be implemented by the object information providing server 520 and the user device 530 individually, or by the two together.

In step S110, the object detection device 400 extracts a frame image and a motion vector from an image and/or a video. A static image may include, for example, at least one of a picture or a photograph with an extension such as jpg or PNG. An image may also include, for example, at least one of a moving (animated) picture or photograph with an extension such as GIF. A video may take various forms, such as a stream, a file, or a broadcast signal, with an extension such as avi or mp4.

The object detection device 400 extracts a frame image from the video. The object detection device may obtain a specific frame image by extracting the plurality of frame images included in the video.

The object detection device 400 extracts a motion vector from the video. As one example, the object detection device may extract the motion vectors contained in the video while decoding it. The motion vectors contained in the video may have been generated when the video was encoded.

As another example, the object detection device 400 may extract motion vectors from the video using a motion vector computation algorithm. Specifically, the object detection device 400 may compute the optical flow from a plurality of consecutive frame images extracted from the video, and may extract motion vectors based on the computed optical flow. In doing so, the object detection device 400 may generate a motion vector map by dividing a reference frame into a plurality of blocks and extracting a motion vector for each block. The reference frame corresponds to an image frame and denotes the frame from which the motion vectors are extracted.

The blocks that make up the motion vector map may not all have the same size. In that case, the object detection device 400 may resize the blocks of the motion vector map to the smallest block size among them, so that all blocks of the motion vector map have a uniform size.
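The block-size uniformization described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the specification does not say how a larger block is resized, so the code assumes that each larger block is split into sub-blocks of the smallest size and that every sub-block inherits the parent's motion vector. The function name `uniformize_motion_map` and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float]
# Hypothetical block layout: (x, y, width, height, motion_vector), in pixels.
Block = Tuple[int, int, int, int, Vector]


def uniformize_motion_map(blocks: List[Block]) -> Dict[Tuple[int, int], Vector]:
    """Re-tile a variable-block-size motion vector map onto the smallest block size.

    Assumption: block positions and sizes are multiples of the smallest size,
    as with typical codec partitions (for example 16x16 and 8x8).
    """
    smallest = min(min(w, h) for _, _, w, h, _ in blocks)
    grid: Dict[Tuple[int, int], Vector] = {}
    for x, y, w, h, mv in blocks:
        # Every smallest-size cell covered by this block inherits its vector.
        for cell_y in range(y, y + h, smallest):
            for cell_x in range(x, x + w, smallest):
                grid[(cell_x // smallest, cell_y // smallest)] = mv
    return grid
```

The result is keyed by (column, row) in units of the smallest block, which makes it straightforward to align the motion vector map with a frame image divided into blocks of the same size.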

In step S120, the object detection device 400 generates an integrated feature vector based on the frame image and the motion vector. The object detection device 400 may extract a first feature vector from the frame image and a second feature vector from the motion vector, and may generate the integrated feature vector based on the first feature vector and the second feature vector.

As one example, the object detection device 400 may divide the frame image and the motion vectors into a plurality of blocks, extract a first feature vector from the part of the frame image contained in a block, and extract a second feature vector from the motion vectors contained in that block. The object detection device may then generate the integrated feature vector for the block by combining the first feature vector and the second feature vector extracted from it.

The detailed process of generating the integrated feature vector is described below with reference to FIG. 2.

In step S130, the object detection device 400 detects an object included in the image and/or video based on the integrated feature vector. The object detection device may do so by determining, based on the integrated feature vector, whether a frame image contains a detection target object. The detection target object may be a moving object included in the image and/or video. It may occupy a partial region of the frame image, and may lie within a single block or span several of the divided blocks.
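One way to picture the per-block decision in step S130 is to score each block's integrated feature vector with a linear model. The sketch below uses a logistic score, which is an assumption made for illustration only: the specification does not state which classifier step S130 uses, and the names `block_score` and `detect_blocks`, the weights, the bias, and the 0.5 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

BlockKey = Tuple[int, int]


def block_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score in [0, 1] for one integrated feature vector."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    # Numerically stable sigmoid: avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def detect_blocks(
    features_by_block: Dict[BlockKey, Sequence[float]],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[BlockKey]:
    """Return the blocks whose score suggests they contain the target object."""
    return sorted(
        key
        for key, features in features_by_block.items()
        if block_score(features, weights, bias) >= threshold
    )
```

Because an object may span several blocks, a real detector would typically merge adjacent positive blocks into one region; that step is omitted here.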

The object detection device 400 may operate in different object detection modes according to a predetermined criterion, and may be set to (and/or operate in) any one of a first object detection mode, a second object detection mode, and a third object detection mode. For example, the object detection device 400 may generate image information and/or video information by combining a predetermined identifier with an image and/or video received from outside and/or stored in the object detection device 400. The predetermined identifier may be information indicating the time at which the image and/or video was created (and/or captured), the size of the image and/or video, information indicating the number of detection target objects included in the image and/or video, information indicating the number of branches of the RGB (red-green-blue) values included in the image and/or video, and the like.

The information indicating the number of detection target objects included in the image and/or video may be entered directly by the user on the device (and/or terminal, camera, etc.) that captured (and/or created) the image and/or video. Alternatively, it may be entered (and/or generated) by that device itself, which first uses a predetermined object extraction algorithm to determine only the approximate number of objects contained in the image and/or video.

The predetermined object extraction algorithm may include logistic regression, SVM (Support Vector Machine), Latent SVM (Latent Support Vector Machine), the deformable part model, HOG (Histogram of Oriented Gradients), Haar-like features, Co-occurrence HOG, LBP (Local Binary Pattern), FAST (Features from Accelerated Segment Test), and the like.

The information indicating the number of branches of the RGB (red-green-blue) values included in the image and/or video (referred to below as 'RGB branch count information') is determined as follows. For example, when the RGB values identified (and/or confirmed) in the image and/or video by the object detection device 400 (and/or its processor) are FFFFCC, FFCC99, FFCC33, FFCCFF, and FF66FF, the number of branches of the RGB values included in the image and/or video is 5. As another example, when the RGB values of the image and/or video are 33FFFF, 33CCCC, 339999, 336633, 0066FF, 777777, 666666, 330000, 330033, 660066, and 6600CC, the number of branches is 11.
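The two examples above are consistent with reading the number of branches as the count of distinct RGB values found in the image and/or video. The following sketch makes that reading explicit; it is an interpretation, not a definition taken from the specification, and the function name `count_rgb_branches` is hypothetical.

```python
from typing import Iterable


def count_rgb_branches(hex_values: Iterable[str]) -> int:
    """Count distinct RGB values given as six-digit hex strings.

    Assumption: 'number of branches' means the number of distinct values;
    case and surrounding whitespace are ignored.
    """
    return len({value.strip().upper() for value in hex_values})


first_example = ["FFFFCC", "FFCC99", "FFCC33", "FFCCFF", "FF66FF"]
second_example = [
    "33FFFF", "33CCCC", "339999", "336633", "0066FF", "777777",
    "666666", "330000", "330033", "660066", "6600CC",
]
print(count_rgb_branches(first_example))   # 5
print(count_rgb_branches(second_example))  # 11
```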

The way in which the object detection device 400 operates in different object detection modes according to the predetermined criterion is described next.

As one example, when the image information and/or video information indicates that the detection target is a single object and the RGB branch count information indicates a value higher than a predetermined threshold (for example, an integer value between 0 and 150), the object detection device 400 may operate in, or be set to, the first object detection mode. In the first object detection mode, the object detection device 400 may generate (and/or obtain) object information by detecting an object included in the image, video, image information, and/or video information using various recognizers such as logistic regression, SVM (Support Vector Machine), and Latent SVM (Latent Support Vector Machine).

As another example, when the image information and/or video information indicates that the detection target is a single object and the RGB branch count information indicates a value equal to or lower than the predetermined threshold (for example, an integer value between 0 and 150), the object detection device 400 may operate in, or be set to, the second object detection mode. In the second object detection mode, the object detection device 400 may replace the image part model of a deformable part model with a part model based on mixed image-motion features, thereby modeling an object that moves in a regular pattern and separating the moving object from the background. In this way, the object detection device may generate (and/or obtain) object information by detecting objects that exhibit regular motion, such as a rotating car wheel or the legs of a walking person.

As yet another example, when the image information and/or video information indicates that the detection target comprises a plurality of objects (in which case the comparison between the RGB branch count information and the predetermined threshold is not considered), the object detection device 400 may operate in, or be set to, the third object detection mode. In the third object detection mode, the object detection device 400 may obtain object information by extracting objects included in the image, video, image information, and/or video information. Using various object feature extraction algorithms such as HOG (Histogram of Oriented Gradients), Haar-like features, Co-occurrence HOG, LBP (Local Binary Pattern), and FAST (Features from Accelerated Segment Test), the object detection device 400 may obtain the contour of an object included in the image, video, image information, and/or video information, or text that can be extracted from the object (or a contour (or outline) that conveys information). In the third object detection mode, the object detection device 400 may also recognize (or identify) an object included in the image, video, image information, and/or video information through image analysis, and generate masked image information by masking the region corresponding to the recognized object. The masking process may extract and/or obtain object information by extracting the object candidate region corresponding to the object through background modeling that separates objects from the background, for example the difference image method, the MOG (Mixture of Gaussians) algorithm based on GMM (Gaussian Mixture Models), or the codebook algorithm.
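The three-way mode selection described above reduces to a small decision function. The sketch below is illustrative only: the names `DetectionMode` and `select_mode` are hypothetical, and the default threshold of 100 is an arbitrary value chosen from the 0 to 150 range that the specification gives as an example.

```python
from enum import Enum


class DetectionMode(Enum):
    FIRST = 1   # recognizers such as logistic regression, SVM, Latent SVM
    SECOND = 2  # deformable part model with image-motion mixed features
    THIRD = 3   # feature extraction and background-model masking


def select_mode(object_count: int, rgb_branches: int, threshold: int = 100) -> DetectionMode:
    """Pick a detection mode from the identifier attached to the image or video.

    Assumption: `threshold` is a hypothetical value; the specification only
    says it may be an integer between 0 and 150.
    """
    if object_count < 1:
        # The specification does not cover this case; rejecting it is a choice.
        raise ValueError("object_count must be at least 1")
    if object_count > 1:
        # Multiple objects: the RGB branch count is not consulted.
        return DetectionMode.THIRD
    if rgb_branches > threshold:
        return DetectionMode.FIRST
    return DetectionMode.SECOND
```

Note that a branch count exactly equal to the threshold selects the second mode, matching the "equal to or lower than" wording of the second example.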

FIG. 2 is a diagram for explaining the process of generating an integrated feature vector according to an embodiment of the present invention.

Step S120, performed by the object detection device, may be subdivided as follows.

In step S121, the object detection device may extract a first feature vector from the frame image and a second feature vector from the motion vector. The object detection device may extract statistical characteristics of the frame image as the first feature vector, and statistical characteristics of the motion vector as the second feature vector.

The object detection device may divide the frame image and the motion vectors into a plurality of blocks. For each divided block, the object detection device may extract the first feature vector and the second feature vector corresponding to that block, and thereby generate the integrated feature vector for the block.

As one example, the object detection device may extract the first feature vector based on the brightness gradients of the pixels in the frame image, specifically based on a histogram of those brightness gradients.
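A histogram of brightness gradients for one block can be sketched as below. This is a simplified illustration in the spirit of HOG, not the full HOG descriptor and not necessarily the computation used in the embodiment: it assumes central differences on interior pixels, unsigned orientations in [0, 180), magnitude-weighted votes, and nine bins. The function name `gradient_histogram` is hypothetical.

```python
import math
from typing import List, Sequence


def gradient_histogram(block: Sequence[Sequence[float]], bins: int = 9) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations for a block.

    `block` is a 2D grid of brightness values. Border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * bins
    rows, cols = len(block), len(block[0])
    bin_width = 180.0 / bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = block[r][c + 1] - block[r][c - 1]  # horizontal brightness change
            gy = block[r + 1][c] - block[r - 1][c]  # vertical brightness change
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / bin_width), bins - 1)] += magnitude
    total = sum(hist)
    # Normalise so that blocks with different contrast are comparable.
    return [h / total for h in hist] if total else hist
```

For a block whose brightness increases only from left to right, all of the weight falls into the first bin; a purely vertical ramp puts it into the middle bin.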

As another example, the object detection device may extract the first feature vector based on the brightness levels of the pixels in the frame image, specifically based on a histogram of those brightness levels.

As yet another example, the object detection device may extract the first feature vector based on the colors of the pixels in the frame image, specifically based on a histogram of those colors.

As one example, the object detection device may extract the second feature vector based on the direction of the motion vectors. The object detection device may extract the second feature vector based on a histogram of the directions of the one or more motion vectors corresponding to a divided block. For example, when a divided block contains several motion vectors, the object detection device may extract the second feature vector based on the direction of the vector obtained by summing the motion vectors in that block.
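Both variants described above, a histogram of motion vector directions and the direction of the summed vector, can be sketched in a few lines. The bin count of eight, the skipping of zero vectors, and the function names `direction_histogram` and `summed_direction` are assumptions made for illustration. Angles are measured with `atan2(dy, dx)`; in image coordinates, where y grows downward, a positive angle therefore points toward the bottom of the frame.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = Tuple[float, float]


def direction_histogram(vectors: Sequence[Vector], bins: int = 8) -> List[int]:
    """Count the motion vectors of one block per direction bin over [0, 360)."""
    hist = [0] * bins
    bin_width = 360.0 / bins
    for dx, dy in vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no direction
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[min(int(angle / bin_width), bins - 1)] += 1
    return hist


def summed_direction(vectors: Sequence[Vector]) -> Optional[float]:
    """Direction in degrees of the sum of a block's motion vectors.

    Returns None when the vectors cancel out or the block has no motion.
    """
    sum_x = sum(dx for dx, _ in vectors)
    sum_y = sum(dy for _, dy in vectors)
    if sum_x == 0 and sum_y == 0:
        return None
    return math.degrees(math.atan2(sum_y, sum_x)) % 360.0
```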

In step S122, the object detection device may generate the integrated feature vector by combining the first feature vector and the second feature vector. The integrated feature vector is a feature vector that takes both the first and the second feature vector into account. By using the integrated feature vector, the object detection device can detect an object while considering both the static and the dynamic characteristics of the object included in the image, video, image information, and/or video information.

FIG. 3 is a diagram illustrating an example of generating an integrated feature vector from an image, video, image information, and/or video information according to an embodiment of the present invention.

According to an embodiment of the present invention, the image, video, image information, and/or video information shown in FIG. 3 may contain a triangular object and a circular object. FIG. 3 assumes a situation in which the triangular object moves downward and the circular object moves toward the upper left. In FIG. 3, a solid outline shows an object a certain time after the moment represented by the dotted outline.

The object detection device may extract a frame image from the video. It may obtain a specific frame image by extracting a plurality of temporally consecutive frame images included in the video. Based on the extracted frame image, the object detection device may statically analyze the objects included in the video.

The object detection device may extract motion vectors from the video. As one example, it may extract from the video the motion vectors generated when the video was encoded. As another example, it may extract motion vectors from a plurality of temporally consecutive frame images included in the video, using a motion vector computation algorithm such as optical flow calculation. The object detection device may divide the reference frame into a plurality of blocks and extract the motion vector for each block individually.

For example, the object detection device may extract the motion vector for a block based on the color difference in the image region corresponding to that block. The object detection device compares the current image and the previous image for the block. If the color difference between the two exceeds a predetermined value, it identifies a reference object around the region where the color difference appears and computes a motion vector for the movement of that reference object, thereby obtaining the motion vector of the block. The object detection device may construct a motion vector map from the extracted motion vectors. When the blocks of the motion vector map are not uniform in size, the object detection device may make them uniform using the smallest block size as the basis.
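A minimal sketch of this idea follows, with two simplifications that are not taken from the specification. First, single-channel intensity values stand in for color. Second, the motion of the "reference object" is estimated by exhaustive block matching (searching nearby positions in the previous frame for the best match), a standard technique swapped in here because the specification does not say how the motion is computed. The names `block_motion_vector`, `search`, and `diff_threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]


def crop(frame: Frame, x: int, y: int, size: int) -> List[Sequence[float]]:
    """Return the size-by-size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized blocks."""
    count = len(a) * len(a[0])
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / count


def block_motion_vector(
    prev: Frame, curr: Frame, x: int, y: int, size: int,
    search: int = 2, diff_threshold: float = 1.0,
) -> Tuple[int, int]:
    """Estimate (dx, dy) for the block at (x, y) in `curr`.

    The block is treated as static unless it differs from the same block in
    `prev` by more than `diff_threshold`; otherwise the displacement with the
    lowest matching cost within +/- `search` pixels is returned.
    """
    target = crop(curr, x, y, size)
    if mean_abs_diff(target, crop(prev, x, y, size)) <= diff_threshold:
        return (0, 0)
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x - dx, y - dy  # where this content would have been before
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = mean_abs_diff(target, crop(prev, px, py, size))
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

Calling this function for every block of the reference frame yields a motion vector map of the kind the paragraph above describes.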

Based on the motion vectors, the object detection device may dynamically analyze the objects included in the video.

The object detection device may extract the first feature vector from the extracted frame image. It may divide the frame image into a plurality of blocks and extract the first feature vector for each block based on the part of the frame image corresponding to that block. As one example, the first feature vector for a block may be extracted based on a histogram of the brightness gradients of the pixels in the block. As another example, it may be extracted based on a histogram of the brightness levels of the pixels in the block. As yet another example, it may be extracted based on a histogram of the colors of the pixels in the block.

The object detection device may extract the second feature vector from the extracted motion vectors, using blocks of the same size as the blocks of the frame image. That is, it may extract the second feature vector for a block based on a histogram of the directions of the one or more motion vectors contained in a block of the same size as the blocks into which the frame image is divided.

The object detection device may generate the integrated feature vector by combining the first feature vector and the second feature vector. Here, the block corresponding to the first feature vector and the block corresponding to the second feature vector may have the same size. The object detection device may combine, block by block, the first and second feature vectors that correspond to the same block. In other words, the object detection device may generate an integrated feature vector for each region.
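The block-by-block combination can be sketched as follows. The specification only says that the two feature vectors are "combined"; concatenation is assumed here as the simplest way to keep both the static and the dynamic information, and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

BlockKey = Tuple[int, int]  # (column, row) of a block; both maps use the same grid


def integrate_features(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Concatenate the image-based and motion-based feature vectors of a block."""
    return list(first) + list(second)


def integrate_by_block(
    first_by_block: Dict[BlockKey, Sequence[float]],
    second_by_block: Dict[BlockKey, Sequence[float]],
) -> Dict[BlockKey, List[float]]:
    """Build one integrated feature vector per block present in both maps."""
    shared = first_by_block.keys() & second_by_block.keys()
    return {
        key: integrate_features(first_by_block[key], second_by_block[key])
        for key in shared
    }
```

Keying both maps by the same block grid is what the equal-block-size requirement above makes possible: a block's image features and motion features describe exactly the same region.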

FIG. 4 is a diagram showing the detailed configuration of an object detection apparatus according to an embodiment of the present invention.

Referring to FIG. 4, the object detection apparatus 400 includes an extraction unit 410, a feature generation unit 420, and an object detection unit 430. The object detection apparatus 400 is an apparatus for detecting an object included in an image, a video, image information, and/or video information. The object detection apparatus 400 may be implemented as a software module, a hardware module, or a combination thereof. The object detection apparatus 400 may be installed in various computing devices and/or systems such as smartphones, tablet computers, laptop computers, desktop computers, televisions, wearable devices, security systems, and smart home systems.

The extraction unit 410 may extract frame images and motion vectors from an image, a video, image information, and/or video information. The extraction unit 410 may extract a specific frame image by extracting a plurality of temporally consecutive frame images included in the video.

The extraction unit 410 may extract, from the video, the motion vectors generated in the process of encoding the video. Alternatively, the extraction unit 410 may derive motion vectors from a plurality of temporally consecutive frame images included in the video.
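The description does not say how motion vectors are derived from consecutive frames. One common approach, shown here purely as an assumption, is exhaustive block matching that minimizes the sum of absolute differences (SAD). The function name, block size, and search range are hypothetical.

```python
def block_motion_vector(prev, curr, by, bx, block=4, search=2):
    """Estimate the motion vector of the block whose top-left corner is
    (by, bx) in `curr` by searching `prev` within +/- `search` pixels.
    Returns (dx, dy): the offset from the block in `curr` to its best match
    in `prev` (smallest sum of absolute differences)."""
    h, w = len(prev), len(prev[0])
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = by + dy, bx + dx
            # Skip candidate positions that fall outside the previous frame.
            if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                continue
            sad = sum(
                abs(curr[by + y][bx + x] - prev[y0 + y][x0 + x])
                for y in range(block)
                for x in range(block)
            )
            if best_sad is None or sad < best_sad:
                best, best_sad = (dx, dy), sad
    return best
```

When the video is already encoded, reading the motion vectors stored by the encoder avoids this search entirely, which is the first option the paragraph above describes.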

For example, although FIG. 4 shows both the frame image and the motion vector being extracted by the extraction unit 410, this is only one embodiment and does not limit the implementation of the extraction unit 410. That is, the object detection apparatus 400 may instead include, as independent units, a frame image extraction unit that extracts frame images from a video and a motion vector extraction unit that extracts motion vectors from the video.

The feature generation unit 420 generates an integrated feature vector based on the frame image and the motion vectors. The feature generation unit 420 may divide the frame image into a plurality of blocks and extract, for each block, a first feature vector based on the portion of the frame image included in that block. The feature generation unit 420 may extract statistical characteristics of the frame image as the first feature vector.

For example, the feature generation unit 420 may extract the first feature vector corresponding to a block based on the brightness gradients of the pixels in the portion of the frame image corresponding to the block. As another example, it may extract the first feature vector based on the brightness levels of those pixels. As yet another example, it may extract the first feature vector based on the colors of those pixels.

The feature generation unit 420 may divide the motion vectors into a plurality of blocks and extract, for each block, a second feature vector based on the motion vectors included in that block. The feature generation unit 420 may extract statistical characteristics of the motion vectors as the second feature vector. For example, the feature generation unit 420 may extract the second feature vector based on the direction of at least one motion vector included in the block. Here, the blocks into which the motion vectors are divided may have the same size as the blocks into which the frame image is divided.

The object detection unit 430 detects an object included in an image, a video, image information, and/or video information based on the integrated feature vector. The object detection unit 430 may detect such an object by determining, based on the integrated feature vector, whether the frame image contains the object to be detected.

Some of the techniques applicable to the present invention may be omitted to avoid obscuring the concept of the present invention. The omitted elements may be applied to the present invention with reference to "Histograms of oriented gradients for human detection" and "Object Detection with Discriminatively Trained Part Based Models".

FIG. 5 is a diagram showing the configuration of a system for providing information on an object within specific content according to an embodiment of the present invention.

Referring to FIG. 5, a system for providing information on an object within specific content according to the present invention may include an object information providing server 520 and a user device 530 connected through a communication network 510.

Here, the communication network 510 may be a wired or wireless communication network accessible to the object information providing server 520 and the user device 530. The communication network 510 refers to a connection structure that allows nodes such as terminals and servers to exchange information with one another. Examples of the communication network 510 include, but are not limited to, the Internet, Wireless LAN (Wireless Local Area Network), WAN (Wide Area Network), PAN (Personal Area Network), 3G, LTE (Long Term Evolution), WiFi (Wireless Fidelity), WiMAX (Worldwide Interoperability for Microwave Access), and WiGig (Wireless Gigabit).

The object information providing server 520 may provide the user device 530, through the communication network 510, with information on a specific object in the content that the user wants to know about. The object information providing server 520 may also provide content to the user device 530, along with information on the objects included in that content. The object information providing server 520 may identify an object received from the user device 530 through the communication network 510 and provide the user with information on the object. In this case, the information on the object may be obtained from an external source, for example from a portal site.

The user device 530 may receive content from an external source or from the user and display it. In addition, the user may select a specific object within the content through the user device 530. The user device 530 may transmit the specific object selected by the user to the object information providing server 520 through the communication network 510. The user device 530 may receive information on the selected object from the object information providing server 520 through the communication network 510 and display the received information to the user. The information provided may vary depending on the specific object the user selected, and, depending on the user, only part of the information on the specific object may be provided.

Specifically, the user device 530 may be a communication-capable desktop computer, laptop computer, notebook, smartphone, tablet PC, mobile phone, smart watch, smart glasses, e-book reader, PMP (portable multimedia player), portable game console, navigation device, digital camera, DMB (digital multimedia broadcasting) player, digital audio recorder, digital audio player, digital video recorder, digital video player, PDA (personal digital assistant), head-up display (HUD), or the like.

The user device 530 may also include a camera for acquiring an image of a specific object about which the user wants information. That is, the user may acquire an image of that object through the user device 530. The user may transmit the acquired image to the object information providing server 520, which is connected to the user device 530 through the communication network 510, and receive from the object information providing server 520 information on the specific object in the acquired image.

FIG. 6 is a block diagram illustrating a user device according to an embodiment of the present invention.

Referring to FIG. 6, the user device 530 according to an embodiment of the present invention may include a display unit 610, a storage unit 620, a communication unit 630, an object extraction unit 640, and an input unit 650. In this specification, all or some of the display unit 610, the storage unit 620, the communication unit 630, the object extraction unit 640, and the input unit 650 constituting the user device may be described as a processor. For example, the user device may consist of the communication unit 630 and a processor (comprising the display unit 610, the storage unit 620, the object extraction unit 640, and the input unit 650), in which case the processor and the communication unit may be functionally connected to each other.

The display unit 610 may display content, and may receive information on a specific object included in the content from the object information providing server 520 and display it. For example, the display unit 610 may display content previously stored by the user, content captured by the user in real time, or content received from an external source in real time. Here, the content may be a video, an image (photo), audio (voice), or the like.

The storage unit 620 may store various content saved by the user. The storage unit 620 may also store information on a specific object included in content selected by the user.

The communication unit 630 may perform communication between the user device 530 and the object information providing server 520 through the communication network 510 described with reference to FIG. 5. That is, it may transmit the specific object that the user selected in the content to the object information providing server 520 and receive information on that object from the object information providing server 520. In addition, when the information on the specific object includes an Internet link, the communication unit 630 may connect the user device 530 to that link.

The object extraction unit 640 may extract objects included in content previously stored in the storage unit 620 or in content captured directly by the user. The object extraction unit 640 may extract objects based on information learned in advance using a deep neural network (DNN) or a convolutional neural network (CNN). To indicate the objects extracted by the object extraction unit 640, a separate bounding box may be displayed overlaid on the content. For example, when one or more objects exist in the content, a bounding box may be assigned to each object. The bounding box may also exist in a transparent state without being visibly displayed in the content.

When the content is a video or an image, pixel values may be used to extract the objects included in the content. That is, the object extraction unit 640 may extract a specific object based on the similarity between the pixels of the content. Specifically, extracting an object requires separating the object from the background in the video or image, and in many cases the pixels belonging to the background have similar pixel values to one another, as do the pixels belonging to the object. Accordingly, the object extraction unit 640 may obtain the pixel values of all pixels of the video or image and extract an object based on the pixels for which the difference from an adjacent pixel's value is equal to or greater than a preset threshold. An adjacent pixel may be the pixel above, below, to the left of, or to the right of the current pixel. In other words, for each pixel in the content, the differences between its value and the values of its adjacent pixels are calculated, and the set of pixels for which these differences are equal to or greater than the preset threshold may form the bounding box described above. Here, the preset threshold may be the average of the pixel values of all pixels of the video or image.
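The adjacent-pixel comparison above can be sketched for a grayscale image as follows. This is a minimal sketch, not a definitive implementation: the function name is hypothetical, and returning the enclosing rectangle of the marked pixels is one assumed reading of "the set of pixels may form the bounding box".

```python
def edge_bounding_box(image, threshold=None):
    """image: 2D list of brightness values (0 to 255).
    Marks every pixel whose value differs from at least one 4-neighbor
    (up, down, left, right) by `threshold` or more, and returns the enclosing
    rectangle (x_min, y_min, x_max, y_max) of the marked pixels,
    or None if no pixel is marked."""
    h, w = len(image), len(image[0])
    if threshold is None:
        # Default described in the text: the average of all pixel values.
        threshold = sum(map(sum, image)) / (h * w)
    marked = []
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < h and 0 <= nx < w
                        and abs(image[y][x] - image[ny][nx]) >= threshold):
                    marked.append((x, y))
                    break
    if not marked:
        return None
    xs = [x for x, _ in marked]
    ys = [y for _, y in marked]
    return (min(xs), min(ys), max(xs), max(ys))
```

Note that the comparison marks pixels on both sides of an edge, so the resulting rectangle extends one pixel into the background around the object.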

A pixel value as described in this specification may be a value representing the brightness of a pixel. For example, when the content is black and white, a pixel value may be any value from 0 to 255. A pixel value of 0 may indicate the darkest state (i.e., black), and a pixel value of 255 may indicate the brightest state (i.e., white).

As another example, when the content is in color rather than black and white, each pixel may be expressed as a mixture of three colors: red (R), green (G), and blue (B). The pixel value of each pixel may consist of a value for R, a value for G, and a value for B, each of which may be any value from 0 to 255. An R value of 0 may indicate black, and an R value of 255 may indicate pure red (i.e., the primary color). The same applies to the G and B values. The pixel value difference described above may be calculated separately for the R, G, and B values of each pixel. In other words, the difference between the pixel values of adjacent pixels may be calculated for R, for G, and for B. Likewise, the average of the pixel values of all pixels may be the average of the R values, the average of the G values, and the average of the B values of all pixels. That is, for each of the R, G, and B pixel values, the difference is compared with the preset threshold. Specifically, a bounding box may be set when the differences in the R, G, and B pixel values are all equal to or greater than the preset threshold.

Alternatively, a bounding box may be set when any one of the differences in the R, G, and B pixel values is equal to or greater than the preset threshold. In this case, a bounding box may be set when the difference in the pixel values of the component having the smallest average among the averages of the R, G, and B pixel values of all pixels of the content is equal to or greater than the preset threshold. For example, if the average of the R pixel values of all pixels of the content is 630, the average of the G pixel values is 530, and the average of the B pixel values is 30, the preset threshold may be 30.
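The three color-content variants described above (all channels, any channel, and the channel with the smallest average) can be compared in a short sketch. The function names are hypothetical, and using each channel's own average as that channel's threshold is an assumption based on the description of the average-based threshold.

```python
def channel_means(pixels):
    """pixels: list of (R, G, B) tuples. Returns the per-channel averages."""
    k = len(pixels)
    return tuple(sum(p[c] for p in pixels) / k for c in range(3))

def differs(p, q, thresholds, mode="all"):
    """Compare two adjacent pixels channel by channel.
    mode="all": every channel difference must reach its threshold.
    mode="any": one channel difference reaching its threshold is enough."""
    hits = [abs(a - b) >= t for a, b, t in zip(p, q, thresholds)]
    return all(hits) if mode == "all" else any(hits)

def differs_on_smallest_mean(p, q, means):
    """Compare only the channel whose average is smallest, using that
    average as the threshold."""
    c = min(range(3), key=lambda i: means[i])
    return abs(p[c] - q[c]) >= means[c]

pixels = [(200, 100, 10), (100, 50, 30), (0, 150, 20)]
means = channel_means(pixels)  # (100.0, 100.0, 20.0)
print(differs(pixels[0], pixels[1], means, "all"))         # False
print(differs(pixels[0], pixels[1], means, "any"))         # True
print(differs_on_smallest_mean(pixels[0], pixels[1], means))  # True
```

The "all" rule is the strictest and marks the fewest pixels; the "any" rule and the smallest-average rule are more permissive and are likelier to mark low-contrast edges.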

The average of the pixel values may be calculated as in Equation 1 below.

[Equation 1]

In Equation 1, k is the number of pixels in the content, VRn is the R pixel value of the n-th pixel of the content, VGn is the G pixel value of the n-th pixel, and VBn is the B pixel value of the n-th pixel. Here, n is an integer from 1 to k.
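The body of Equation 1 is not reproduced in this text. Based only on the symbol definitions above, one plausible reading is the per-channel arithmetic mean written out below. The left-hand symbols for the averages are assumptions introduced here for illustration and do not appear in the original.

```latex
\bar{V}_{R} = \frac{1}{k}\sum_{n=1}^{k} VR_{n}, \qquad
\bar{V}_{G} = \frac{1}{k}\sum_{n=1}^{k} VG_{n}, \qquad
\bar{V}_{B} = \frac{1}{k}\sum_{n=1}^{k} VB_{n}
```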

When the user wants information on a specific object within the content, the input unit 650 may receive the user's selection of that object. For example, when a first object and a second object exist in the content, the user may select the first object through the input unit 650 to obtain information on it. The user may select an object within the content by using an external device (e.g., a mouse or keyboard) or by touching it directly using a touch function.

FIG. 7 is a diagram illustrating a method of providing information on a specific object within content according to an embodiment of the present invention.

FIG. 7(a) shows content being displayed on a user device, and FIG. 7(b) shows information on a specific object that the user selected in the content being displayed on the user device.

The content displayed on the user device in FIGS. 7(a) and 7(b) may be content provided from an external source (e.g., real-time video, images, or audio provided through a broadcasting platform), a video or image captured directly by the user, or audio (voice) that the user records in real time through the user device.

The user device may extract all or some of the objects in the content. Referring to FIG. 7(a), the user device may extract a first object 710, a second object 720, a third object 730, and a fourth object 740. For example, the first object 710 may be a chair, the second object 720 a laptop computer, the third object 730 a person, and the fourth object 740 a desk.

Referring to FIG. 7(b), the user may select, from among the objects extracted from the content, a specific object about which the user wants information. For example, the user may select the second object 720 (e.g., the laptop computer).

The user device may transmit the specific object selected by the user (e.g., the second object) to the object information providing server. In other words, the user device may transmit information for identifying the specific object to the object information providing server. For example, the user device may crop the specific object out of the content and transmit only the cropped portion to the server.
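Cropping the selected object before transmission can be sketched as below. The function name and the inclusive (x_min, y_min, x_max, y_max) box convention are assumptions for illustration; how the cropped data is serialized and sent to the server is not specified in the description and is left out.

```python
def crop(image, box):
    """image: 2D list of pixel values (rows of pixels).
    box: (x_min, y_min, x_max, y_max), inclusive on all sides.
    Returns only the rows and columns inside the bounding box."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]

image = [
    [0, 0, 0, 0],
    [0, 7, 8, 0],
    [0, 9, 6, 0],
    [0, 0, 0, 0],
]
print(crop(image, (1, 1, 2, 2)))  # [[7, 8], [9, 6]]
```

Sending only the cropped region keeps the payload small and tells the server unambiguously which object the user selected.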

The object information providing server may use the identifying information received from the user device to obtain information on the specific object and transmit that information to the user device. The information on the specific object may be obtained directly by the object information providing server, provided by an external search server (e.g., a search portal site), or provided directly by the content provider.

Upon receiving the information on the specific object from the object information providing server, the user device may display it. For example, when the specific object selected by the user is the second object (e.g., the laptop computer), the user device may display information 750 on the second object. The information 750 on the second object may be displayed overlaid on the content or in a separate pop-up window. For example, when the user places the mouse cursor on a specific object, information on that object may be displayed near the cursor. The information on a specific object may be composed differently depending on which object the user selected. For example, when the specific object is a thing, the information may include a model name, manufacturing date, color, price (from minimum to maximum price), and place of purchase (with a purchase link). When the specific object is a person, the information may include the person's name, date of birth, places to purchase (with purchase links) the items worn by the person (e.g., clothes, watch, shoes), filmography, and the like. In addition, the information on a specific object may be provided to the user not only as text but in any digitizable form, such as images, video, or audio.

Whether the information on a specific object is displayed overlaid on the content or in a separate pop-up window may be determined based on the data size of the information received from the object information providing server, or based on the form in which the information is provided to the user. Specifically, when the information on the specific object is provided as an image or video, it may be displayed in pop-up form. It may also be displayed in pop-up form when its data size is larger than a preset threshold.

The position at which the information on a specific object is displayed may be determined by where the object is located in the content. Specifically, when the information is displayed overlaid on the content, its display position may be determined based on the height or width of the content (for a video, the frame height or width). For example, when the height of the content is H, the information may be displayed above the specific object if the object is located below H/2 in the content, and below the specific object if the object is located above H/2. Similarly, when the width of the content is W, the information may be displayed to the right of the specific object if the object is located to the left of W/2 in the content, and to the left of the specific object if the object is located to the right of W/2.
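The choice between overlay and pop-up display can be expressed as a small decision function. This is a sketch under stated assumptions: the function name, the string labels, and the 4096-byte default are hypothetical, since the description only says that the threshold is preset.

```python
def display_mode(info_kind, size_bytes, size_threshold=4096):
    """info_kind: form of the object information, e.g. "text", "image",
    "video", or "audio" (labels are illustrative).
    Returns "popup" when the information is an image or a video, or when its
    data size exceeds the preset threshold; otherwise returns "overlay"."""
    if info_kind in ("image", "video"):
        return "popup"
    if size_bytes > size_threshold:
        return "popup"
    return "overlay"

print(display_mode("text", 200))     # overlay
print(display_mode("text", 10_000))  # popup
print(display_mode("image", 200))    # popup
```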

The position at which the information on a specific object is displayed may also be determined by considering both the height and the width of the content. For example, when the specific object is located below H/2 and to the left of W/2 in the content, the information may be displayed to the upper right of the object. When the object is located below H/2 and to the right of W/2, the information may be displayed to the upper left of the object. When the object is located above H/2 and to the left of W/2, the information may be displayed to the lower right of the object. When the object is located above H/2 and to the right of W/2, the information may be displayed to the lower left of the object.
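The four cases above amount to placing the information toward the center of the content. A minimal sketch follows, assuming image coordinates with the origin at the top-left corner and y increasing downward; the function name and the use of the object's center point are assumptions, and the behavior for an object lying exactly on H/2 or W/2 is not specified in the description (this sketch arbitrarily treats it as "above" and "right").

```python
def info_position(cx, cy, width, height):
    """cx, cy: center of the selected object, in pixels from the top-left.
    width, height: size of the content (frame size for a video).
    Returns where to draw the information relative to the object."""
    # Object in the lower half -> draw the information above it, and vice versa.
    vertical = "upper" if cy > height / 2 else "lower"
    # Object in the left half -> draw the information to its right, and vice versa.
    horizontal = "right" if cx < width / 2 else "left"
    return f"{vertical}-{horizontal}"

print(info_position(10, 90, 100, 100))  # upper-right (object is lower-left)
print(info_position(90, 10, 100, 100))  # lower-left (object is upper-right)
```

Placing the information on the side that faces the center keeps it from being clipped by the edge of the content.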

Meanwhile, the user device may also play audio content, in which case the user device may display the sound of the audio content as subtitles. When the user selects a word in the subtitles about which information is desired, the user may be provided with information on that word. In this case, the information on the word may be provided in the same manner as the information on the specific object described above.

Meanwhile, the user device may take voice content recorded by the user in real time as an input and provide the user with information on the recorded voice content. For example, the recorded voice content may be a word referring to a specific place or a specific item about which the user wants information. In this case, the information on the recorded voice content may be provided in the same manner as the information on the specific object described above.

FIG. 8 is a diagram illustrating a method of providing information on a specific object within content according to an embodiment of the present invention.

FIG. 8(a) is a diagram illustrating a user directly photographing, with a user device, a specific object about which the user wants information. FIG. 8(b) is a diagram illustrating the image captured by the user. FIG. 8(c) is a diagram illustrating information on a specific object in the image captured by the user.

Referring to FIG. 8(a), a user may wish to obtain information on a specific object encountered in daily life. Referring to FIG. 8(b), the user may capture content (e.g., an image) containing a plurality of objects, including the specific object about which information is desired. For example, a user who wants information on a specific flower may capture an image in which a plurality of flowers, including the specific flower, are present. Since a plurality of flowers exist in the captured image, the user device must extract each of the plurality of flowers from the captured image. As shown in FIG. 8(b), the user device may extract the flowers as a first flower 810 and a second flower 820, respectively, and may provide a separate bounding box for each. The user may select, from among the plurality of objects, the object about which information is desired. As described above, the user may select the object by using an external device (e.g., a keyboard or a mouse) or a touch function. After transmitting the specific object selected by the user to the object information providing server, the user device may obtain information on the specific object from the object information providing server. For example, referring to FIG. 8(c), when the user selects the first flower 810, the user device may obtain information on the first flower 810 from the object information providing server and provide it to the user. The provided information may vary according to the type of the specific object selected by the user. For example, when the specific object is a flower, the provided information may include the plant name, flowering season, place of origin, purchase price, and place of purchase. Also, as described above, the provided information may be displayed overlaid on the content or displayed in a separate pop-up window.

FIG. 9 is a flowchart illustrating a method of providing information on a specific object within content according to an embodiment of the present invention.

Meanwhile, based on the method of providing information on a specific object within content described below with reference to FIG. 9: ⓐ when the mouse is placed over a person in a YouTube video, related information may be provided automatically; ⓑ in the case of audio information, when the mouse is placed over the content's subtitles, highly relevant information may be provided; ⓒ the provided information may take any digitized form, from basic text to video; and ⓓ when a user walking down the street becomes curious about a place or an item, information may be provided in response to a request made through a smartphone button or a voice question.

Hereinafter, the method of providing information on a specific object within content according to an embodiment of the present invention, described above with reference to FIGS. 5 to 8, will be described in detail with reference to FIG. 9.

The user device 903 may display content to the user. The content may be received from the content providing server 904 (S910), or may be content directly photographed or recorded by the user. The user device 903 may extract each of the objects present in the content (S920). The user may select, from among the objects extracted by the user device 903, a specific object about which information is desired. To support this selection, the user device 903 may assign a separate bounding box to each extracted object and display it. The objects may be extracted by the method described above that uses the pixel values of the pixels of the content. The user device 903 may transmit specific object identification information for the specific object selected by the user to the object information providing server 902 (S930). The specific object identification information allows the object information providing server 902 to identify what the specific object is, and may be a crop of the selected object taken from the content. The object information providing server 902 identifies the specific object using the specific object identification information (S940). The object information providing server 902 may then transmit identification information on the identified object to the search server 901. The identification information may be a model name in the case of a product, a person's name in the case of a person, a plant name in the case of a plant, and so on. The search server 901 may use the identification information to obtain specific object information, that is, information on the specific object, and transmit it to the object information providing server 902, which in turn may transmit the specific object information to the user device 903 (S960, S970). The user device may display the specific object information and provide it to the user (S980). As described above, the specific object information may be displayed overlaid on the content or displayed in a separate pop-up window.
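The message flow of FIG. 9 can be illustrated with in-memory stand-ins for the three parties. This is a minimal sketch, not the specification's implementation: the class names, method names, the lookup tables `RECOGNIZED` and `SEARCH_INDEX`, and all sample values are hypothetical, and real recognition and search are replaced by dictionary lookups.

```python
# All names and data below are hypothetical stand-ins, not from the patent.
RECOGNIZED = {"crop-of-flower-810": "example plant name"}
SEARCH_INDEX = {
    "example plant name": {
        "plant name": "example plant name",
        "place of purchase": "https://shop.example",
    }
}


class SearchServer:
    """Stand-in for search server 901."""

    def lookup(self, identification):
        return SEARCH_INDEX.get(identification, {})


class ObjectInfoServer:
    """Stand-in for object information providing server 902."""

    def __init__(self, search_server):
        self.search_server = search_server

    def handle_request(self, crop):
        identification = RECOGNIZED.get(crop)  # S940: identify the object
        if identification is None:
            return {}
        # Forward the identification info to the search server and relay
        # the returned object information (S960, S970 in the text).
        return self.search_server.lookup(identification)


class UserDevice:
    """Stand-in for user device 903."""

    def __init__(self, info_server):
        self.info_server = info_server

    def request_info(self, selected_crop):
        # S930: send the crop of the selected object as identification info.
        info = self.info_server.handle_request(selected_crop)
        return info  # S980 would render this as an overlay or a pop-up
```

For example, `UserDevice(ObjectInfoServer(SearchServer())).request_info("crop-of-flower-810")` returns the dictionary stored for the placeholder plant, while an unrecognized crop yields an empty dictionary.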

Meanwhile, the content providing server 904 that provides the content may provide information on the objects included in the content to the object information providing server 902. In this case, the object information providing server 902 may transmit to the user device, from among the information on objects provided by the content providing server 904, the information on the object that corresponds to the specific object identification information transmitted by the user device. That is, when the content providing server 904 provides information on the objects included in the content to the object information providing server 902, steps S950 and S960 may be omitted.

The specific object information, that is, the information on the specific object, may consist of a plurality of pieces of information rather than a single piece. The object information providing server 902 may provide only some, or all, of the plurality of pieces of information to the user device 903 according to the level of the user device 903. When only some of the pieces of information are provided to the user device 903, the remaining pieces may be blurred before being provided to the user device 903. The level of the user device 903 may be set according to the amount the user has paid to the object information providing server 902 (i.e., billing).
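One way to picture the level-dependent disclosure is the sketch below. It assumes, purely for illustration, that the level is a non-negative integer and that each level unlocks a fixed number of items; the specification does not define how a level maps to visible items, and the names `disclose` and `items_per_level` are hypothetical.

```python
def disclose(items, level, items_per_level=2):
    """Mark each piece of object information as visible or blurred.

    Assumption: a device at `level` may see the first
    level * items_per_level pieces; everything after that is blurred.
    """
    visible_count = max(level, 0) * items_per_level
    return [
        {"text": text, "blurred": index >= visible_count}
        for index, text in enumerate(items)
    ]
```

With four pieces of information and a level of 1, the first two entries come back with `blurred` set to `False` and the last two with `blurred` set to `True`.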

The specific object information may include the Internet address of a shopping mall where the specific object can be purchased. There may be a plurality of such shopping malls. The object information providing server 902 may receive a shopping mall connection fee from a specific shopping mall among the plurality of shopping malls, and may either include only the Internet address of that specific shopping mall in the specific object information or list the Internet address of that specific shopping mall ahead of the Internet addresses of the other shopping malls.
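The two options described above (listing only the fee-paying mall, or listing it first) can be sketched as follows. The function name `order_mall_urls`, the `exclusive` flag, and the example addresses are assumptions made for illustration.

```python
def order_mall_urls(urls, sponsored, exclusive=False):
    """Return mall addresses with fee-paying (sponsored) malls prioritized.

    exclusive=True keeps only sponsored malls; otherwise sponsored malls
    are moved to the front while the original order is otherwise preserved.
    """
    if exclusive:
        return [url for url in urls if url in sponsored]
    # sorted() is stable, and False sorts before True, so sponsored
    # addresses come first without reordering the rest.
    return sorted(urls, key=lambda url: url not in sponsored)
```

For instance, with `urls = ["https://mall-a.example", "https://mall-b.example", "https://mall-c.example"]` and `sponsored = {"https://mall-b.example"}`, the default call places `mall-b` first followed by `mall-a` and `mall-c`.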

The object information providing server 902 and the user device 903 shown in FIG. 9 may be the same as the object information providing server 520 and the user device 530 of FIG. 5, respectively.

In addition, the object detection device 400 according to an embodiment of the present invention may include the following features.

The object detection device 400 may further include, for example: an input unit that receives image data including an object; an extraction unit 410 that extracts at least one feature point of the object from the input image data; a feature generation unit 420 that generates feature point description data corresponding to the extracted feature point; and an object detection unit 430 that identifies content corresponding to the image data based on the generated feature point description data.

The feature generation unit 420 may include: a region determination unit that determines a plurality of regions around the extracted feature point; a size determination unit that determines the size of the data describing each of the determined regions based on the distance between that region and the extracted feature point; and a generation unit that generates the feature point description data based on the determined data sizes.

The size of the data may correspond to, for example, the integer value or the number of decimal places of the feature vector between the feature point and each of the determined regions.

The object detection device 400 may receive, from the user device 530, image data that includes a specific object and serves as a request for specific content.

The object detection device 400 may extract feature points of the object included in the image data received from the user device 530 and identify content corresponding to the image data using the extracted feature points. As an example, more than one piece of content may be identified.

In this case, the object detection device 400 may transmit information on the identified content to the user device 530, so that the user device 530 requests the corresponding content from the object information providing server 520. Alternatively, the object detection device 400 may transmit information on the identified content to the object information providing server 520, so that the object information providing server 520 transmits the corresponding content to the user device 530.

The object information providing server 520 may receive a request for specific content from the user device 530 and, in response, transmit the corresponding content to the user device 530. The object information providing server 520 may also receive, from the object detection device 400, a request to transmit specific content to the user device 530, in which case the object information providing server 520 may transmit the corresponding content to the user device 530 in response to the transmission request.

When the object information providing server 520 receives a request from the object detection device 400 to transmit specific content to the user device 530, it need not transmit the content immediately. Instead, it may transmit data corresponding to the content information received from the object detection device 400 to the user device 530 and obtain confirmation that this is the content the user wants. When the object detection device 400 has identified a plurality of pieces of content, this allows the user to select which of them to view.

The content identification service may be provided through an application installed on the user device 530. Here, the application refers to an application program and may include, for example, an app running on a smartphone.

The input unit may receive image data including an object from the user device 530.

The extraction unit 410 may extract at least one feature point of the object included in the image data received by the input unit.

In addition, the extraction unit 410 may use a feature point extraction algorithm to extract at least one feature point representing a distinctive characteristic of the image data received by the input unit. The feature point extraction algorithm may be either the SURF (Speeded-Up Robust Features) algorithm or the SIFT (Scale-Invariant Feature Transform) algorithm.

The feature generation unit 420 may generate feature point description data corresponding to the feature point extracted by the extraction unit 410.

The feature generation unit 420 may include a region determination unit, a size determination unit, and a generation unit.

The region determination unit may determine a plurality of regions around the feature point extracted by the extraction unit 410. For example, the plurality of regions around the feature point may be 16 regions arranged in a 4×4 grid. This is only an example; regions of various other sizes or numbers may be used.

The size determination unit may determine the size of the data describing each region determined by the region determination unit, based on the distance between that region and the feature point extracted by the extraction unit 410. When a first region among the plurality of regions around the feature point is closer to the feature point than a second region, the size determination unit may make the size of the data describing the first region larger than the size of the data describing the second region.

In addition, the size of the data may correspond to the integer value or the number of decimal places of the feature vector between the feature point and each of the determined regions. That is, the descriptor of the feature point may be kept short by limiting the integer range or the number of decimal places used for the distance between the feature point and a given region.

In the prior art, all 16 regions have the same data size (weight). In contrast, the size determination unit may weight the four central regions closest to the feature point by allocating 4 bytes to each of them, and allocate 2 bytes to each of the 12 remaining regions on the border. Alternatively, the size determination unit may allocate 2 bytes to each of the four corner regions farthest from the feature point and 4 bytes to each of the 12 remaining regions, which are relatively close to the feature point. As a further alternative, the size determination unit may allocate 4 bytes to each of the four central regions closest to the feature point, 2 bytes to each of the four corner regions farthest from the feature point, and 3 bytes to each of the eight remaining regions at an intermediate distance.
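The three allocations above can be compared by totaling the bytes each one assigns over the 4×4 grid. The sketch below is illustrative only: the scheme names and the helper functions `region_kind` and `descriptor_bytes` are hypothetical labels, and the grid is indexed by row and column from 0 to 3.

```python
def region_kind(row, col):
    """Classify a cell of the 4x4 grid by its distance from the feature point."""
    if row in (1, 2) and col in (1, 2):
        return "center"  # the four cells adjacent to the feature point
    if row in (0, 3) and col in (0, 3):
        return "corner"  # the four cells farthest from the feature point
    return "edge"        # the eight remaining border cells


# Bytes per region for each allocation described in the text.
SCHEMES = {
    "center-weighted": {"center": 4, "edge": 2, "corner": 2},
    "corner-reduced": {"center": 4, "edge": 4, "corner": 2},
    "three-tier": {"center": 4, "edge": 3, "corner": 2},
}


def descriptor_bytes(scheme):
    """Total descriptor size in bytes for one feature point."""
    sizes = SCHEMES[scheme]
    return sum(sizes[region_kind(r, c)] for r in range(4) for c in range(4))
```

Under these allocations the totals are 40, 56, and 48 bytes respectively, compared with 64 bytes if every one of the 16 regions were given 4 bytes.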

The generation unit may generate the feature point description data based on the data sizes determined by the size determination unit.

The generation unit may generate the feature point description data using the orientation of the extracted feature point and edge histograms for the plurality of regions determined by the region determination unit.
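A per-region edge histogram can be sketched in plain Python as follows. This is a simplified illustration in the spirit of SIFT-style descriptors, not the specification's method: it assumes a square grayscale patch given as a list of lists, uses central differences for gradients, and omits the rotation of the patch to the feature point's orientation. The function name `edge_histograms` and its parameters are hypothetical.

```python
import math


def edge_histograms(patch, grid=4, bins=8):
    """Build one gradient-orientation histogram per region of a square patch.

    patch: 2D list of gray values whose side length is divisible by `grid`.
    Returns grid * grid histograms, each with `bins` magnitude-weighted bins.
    """
    size = len(patch)
    cell = size // grid
    hists = [[0.0] * bins for _ in range(grid * grid)]
    # Skip the outermost pixels so central differences stay in bounds.
    for y in range(1, size - 1):
        for x in range(1, size - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bin_index = int(angle / (2 * math.pi) * bins) % bins
            region = (y // cell) * grid + (x // cell)
            hists[region][bin_index] += magnitude
    return hists
```

Concatenating the 16 histograms, with each region's values stored at the precision chosen by the size determination unit, would give one possible form of the feature point description data.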

The object detection unit 430 may identify content corresponding to the image data received by the input unit, based on the feature point description data generated by the feature generation unit 420. The object detection unit 430 may identify a plurality of pieces of content.

The communication unit may transmit information on the content identified by the object detection unit 430 to the user device 530 or to the object information providing server 520. The object information providing server 520 may then transmit data corresponding to the transmitted content information to the user device 530. Here, the data corresponding to the content information may be data that allows the user device 530 to confirm whether the content for the image received by the input unit matches the content identified by the object detection unit 430.

The data corresponding to the content information may be the identified content itself or advertisement information for the identified content. The data corresponding to the content information may also be the results returned by any search engine for the identified content.

In addition, the data corresponding to the content information may include a signal that launches a specific application installed on the user device 530.

When the object detection unit 430 has identified a plurality of pieces of content, the communication unit may transmit information on the identified pieces of content to the object information providing server 520. The object information providing server 520 may then transmit a plurality of pieces of data, corresponding to the transmitted content information, to the user device 530. The communication unit may also receive, from the user device 530, a signal selecting one of the plurality of pieces of data.

In this case, the content information for the selected piece of data may be matched with the image data received by the input unit and stored in the database.

The database stores data. This data includes data input and output between the components inside the object detection device 400, as well as data input and output between the object detection device 400 and components outside it. For example, the database may store the position information of at least one feature point extracted by the extraction unit 410. Examples of such a database include a hard disk drive, ROM (Read Only Memory), RAM (Random Access Memory), flash memory, and a memory card located inside or outside the object detection device 400.

An electronic device (i.e., an object detection device, an object information providing server, and/or a user device) according to various embodiments of this document may include, for example, at least one of a smartphone, a tablet PC, a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a PDA, a PMP (portable multimedia player), an MP3 player, a medical device, a camera, or a wearable device. The wearable device may include at least one of an accessory type (e.g., a watch, ring, bracelet, anklet, necklace, eyeglasses, contact lens, or head-mounted device (HMD)), a fabric- or clothing-integrated type (e.g., electronic clothing), a body-attached type (e.g., a skin pad or tattoo), or a bio-implantable circuit. In some embodiments, the electronic device may include, for example, at least one of a television, a DVD (digital video disk) player, an audio system, a refrigerator, an air conditioner, a vacuum cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set-top box, a home automation control panel, a security control panel, a media box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console (e.g., Xbox™, PlayStation™), an electronic dictionary, an electronic key, a camcorder, or an electronic picture frame.

In another embodiment, the electronic device may include at least one of various medical devices (e.g., various portable medical measuring devices (such as a blood glucose meter, heart rate monitor, blood pressure monitor, or thermometer), MRA (magnetic resonance angiography), MRI (magnetic resonance imaging), CT (computed tomography), an imaging device, or an ultrasound device), a navigation device, a global navigation satellite system (GNSS), an EDR (event data recorder), an FDR (flight data recorder), an automotive infotainment device, marine electronic equipment (e.g., a marine navigation device, a gyrocompass, etc.), avionics, a security device, a vehicle head unit, an industrial or household robot, a drone, an ATM of a financial institution, a POS (point of sales) terminal of a store, or an Internet of Things device (e.g., a light bulb, various sensors, a sprinkler device, a fire alarm, a thermostat, a street light, a toaster, exercise equipment, a hot water tank, a heater, a boiler, etc.). According to some embodiments, the electronic device may include at least one of a piece of furniture, a part of a building/structure or a vehicle, an electronic board, an electronic signature receiving device, a projector, or various measuring instruments (e.g., water, electricity, gas, or radio wave measuring instruments). In various embodiments, the electronic device may be flexible, or may be a combination of two or more of the devices described above. An electronic device according to an embodiment of this document is not limited to the devices described above. In this document, the term user may refer to a person who uses the electronic device or to a device that uses the electronic device (e.g., an artificial intelligence electronic device).

The electronic device may also include a bus, a processor, a memory, an input/output interface, a display, and a communication interface. In some embodiments, the electronic device may omit at least one of these components or may additionally include other components. The bus may include circuitry that connects the components to one another and carries communications (e.g., control messages or data) between them. The processor may include one or more of a central processing unit, an application processor, or a communication processor (CP). The processor may, for example, perform operations or data processing related to the control and/or communication of at least one other component of the electronic device.

The memory may include volatile and/or non-volatile memory. The memory may store, for example, commands or data related to at least one other component of the electronic device. According to one embodiment, the memory may store software and/or programs. The programs may include, for example, a kernel, middleware, an application programming interface (API), and/or application programs (or "applications"). At least part of the kernel, the middleware, or the API may be referred to as an operating system. The kernel may, for example, control or manage the system resources (e.g., the bus, the processor, or the memory) used to execute operations or functions implemented in other programs (e.g., the middleware, the API, or the application programs). The kernel may also provide an interface through which the middleware, the API, or the application programs can access individual components of the electronic device in order to control or manage the system resources.

The middleware may, for example, act as an intermediary so that the API or the application programs can communicate with the kernel to exchange data. The middleware may also process one or more task requests received from the application programs according to priority. For example, the middleware may assign to at least one of the application programs a priority for using the system resources (e.g., the bus, the processor, or the memory) of the electronic device, and may process the one or more task requests accordingly. The API is an interface through which an application controls functions provided by the kernel or the middleware, and may include, for example, at least one interface or function (e.g., a command) for file control, window control, image processing, or text control. The input/output interface may, for example, transfer commands or data input from a user or another external device to the other component(s) of the electronic device, or output commands or data received from the other component(s) of the electronic device to the user or to another external device.
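The priority-based handling of task requests described above can be illustrated with a small sketch. The class and field names (`Middleware`, `TaskRequest`, `app_priorities`), the default priority of 100, and the convention that a lower number means higher priority are all assumptions made for illustration; the description does not specify a scheduling policy.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class TaskRequest:
    priority: int                       # lower value is served first (assumption)
    seq: int                            # tie-breaker that preserves arrival order
    app: str = field(compare=False)
    payload: str = field(compare=False)


class Middleware:
    """Hypothetical middleware that orders task requests by per-app priority."""

    def __init__(self, app_priorities):
        self._app_priorities = app_priorities
        self._queue = []
        self._seq = count()

    def submit(self, app, payload):
        # Apps without an assigned priority get a low default (assumption).
        priority = self._app_priorities.get(app, 100)
        heapq.heappush(
            self._queue, TaskRequest(priority, next(self._seq), app, payload)
        )

    def process_all(self):
        # Pop requests in priority order and return the order they were served.
        served = []
        while self._queue:
            request = heapq.heappop(self._queue)
            served.append((request.app, request.payload))
        return served
```

A heap keeps insertion and removal cheap, and the sequence counter keeps requests of equal priority in arrival order.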

The display may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a microelectromechanical systems (MEMS) display, or an electronic paper display. The display may, for example, present various content (e.g., text, images, video, icons, and/or symbols) to the user. The display may include a touch screen and may receive, for example, a touch, gesture, proximity, or hovering input made with an electronic pen or a part of the user's body. Here, a hovering input is an input in which the electronic pen or the part of the user's body does not physically (and/or directly) contact the display, but its approach is identified (and/or recognized) through the display by means of static electricity; such an input may include, for example, hovering coordinate information. The communication interface may, for example, establish communication between the electronic device and an external device (e.g., a first external electronic device, a second external electronic device, or a server). For example, the communication interface may be connected to a network through wireless or wired communication to communicate with the external device (e.g., the second external electronic device or the server).

The wireless communication may include, for example, cellular communication using at least one of LTE, LTE-A (LTE Advance), CDMA (code division multiple access), WCDMA (wideband CDMA), UMTS (universal mobile telecommunications system), WiBro (Wireless Broadband), or GSM (Global System for Mobile Communications). According to one embodiment, the wireless communication may include, for example, at least one of WiFi (wireless fidelity), Bluetooth, Bluetooth Low Energy (BLE), Zigbee, near field communication (NFC), Magnetic Secure Transmission, radio frequency (RF), or a body area network (BAN). According to one embodiment, the wireless communication may include GNSS. The GNSS may be, for example, the Global Positioning System (GPS), the Global Navigation Satellite System (Glonass), the Beidou Navigation Satellite System (hereinafter "Beidou"), or Galileo, the European global satellite-based navigation system. Hereinafter, in this document, "GPS" may be used interchangeably with "GNSS". The wired communication may include, for example, at least one of universal serial bus (USB), high definition multimedia interface (HDMI), recommended standard 232 (RS-232), power line communication, or plain old telephone service (POTS). The network may include a telecommunications network, for example, at least one of a computer network (e.g., a LAN or WAN), the Internet, or a telephone network.

Each of the first and second external electronic devices may be a device of the same type as, or a different type from, the electronic device. According to various embodiments, all or some of the operations executed in the electronic device may be executed in one or more other electronic devices (e.g., another electronic device or a server). According to one embodiment, when the electronic device has to perform a function or service automatically or upon request, the electronic device may, instead of or in addition to executing the function or service itself, request another device (e.g., another electronic device or a server) to perform at least some of the associated functions. The other device may execute the requested function or an additional function and deliver the result to the electronic device. The electronic device may provide the requested function or service by using the received result as is or by processing it further. For this purpose, for example, cloud computing, distributed computing, or client-server computing technology may be used.
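The delegation pattern described above (run a function locally, or ask another device and optionally post-process its result) can be sketched as follows. The function name `run_function`, its parameters, and the `remote_call` callable are hypothetical; the description does not define a concrete interface or transport.

```python
from typing import Any, Callable, Dict, Optional


def run_function(
    name: str,
    arg: Any,
    local_handlers: Dict[str, Callable[[Any], Any]],
    remote_call: Callable[[str, Any], Any],
    postprocess: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Execute `name` locally if a handler exists, otherwise delegate it."""
    handler = local_handlers.get(name)
    if handler is not None:
        result = handler(arg)
    else:
        # `remote_call` stands in for a request to another device or a server.
        result = remote_call(name, arg)
    # The received result is used as is, or processed further before use.
    return postprocess(result) if postprocess is not None else result
```

This sketch covers only the "instead of" case; a device that delegates in addition to running locally would combine both results, which the description leaves open.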

The electronic device may include one or more processors (e.g., an AP), a communication module, a subscriber identification module, a memory, a sensor module, an input device, a display, an interface, an audio module, a camera module, a power management module, a battery, an indicator, and a motor. The processor may, for example, run an operating system or an application program to control a plurality of hardware or software components connected to the processor, and may perform various kinds of data processing and computation. The processor may be implemented, for example, as a system on chip (SoC). According to one embodiment, the processor may further include a graphics processing unit (GPU) and/or an image signal processor. The processor may load a command or data received from at least one of the other components (e.g., a non-volatile memory) into a volatile memory, process it, and store the resulting data in the non-volatile memory.

The communication module may have the same or a similar configuration as the communication interface. The communication module may include, for example, a cellular module, a WiFi module, a Bluetooth module, a GNSS module, an NFC module, and an RF module. The cellular module may provide, for example, a voice call, a video call, a text message service, or an Internet service through a communication network. According to one embodiment, the cellular module may identify and authenticate the electronic device within the communication network by using the subscriber identification module (e.g., a SIM card). According to one embodiment, the cellular module may perform at least some of the functions that the processor can provide. According to one embodiment, the cellular module may include a communication processor (CP). According to some embodiments, at least some (e.g., two or more) of the cellular module, the WiFi module, the Bluetooth module, the GNSS module, or the NFC module may be included in a single integrated chip (IC) or IC package. The RF module may, for example, transmit and receive communication signals (e.g., RF signals). The RF module may include, for example, a transceiver, a power amp module (PAM), a frequency filter, a low noise amplifier (LNA), or an antenna. According to another embodiment, at least one of the cellular module, the WiFi module, the Bluetooth module, the GNSS module, or the NFC module may transmit and receive RF signals through a separate RF module. The subscriber identification module may include, for example, a card containing a subscriber identification module or an embedded SIM, and may contain unique identification information (e.g., an integrated circuit card identifier (ICCID)) or subscriber information (e.g., an international mobile subscriber identity (IMSI)).

The memory may include, for example, an internal memory or an external memory. The internal memory may include, for example, at least one of a volatile memory (e.g., DRAM, SRAM, or SDRAM) or a non-volatile memory (e.g., one time programmable ROM (OTPROM), PROM, EPROM, EEPROM, mask ROM, flash ROM, flash memory, a hard drive, or a solid state drive (SSD)). The external memory may include a flash drive, for example, compact flash (CF), secure digital (SD), Micro-SD, Mini-SD, extreme digital (xD), a multi-media card (MMC), or a memory stick. The external memory may be functionally or physically connected to the electronic device through various interfaces.

The sensor module may, for example, measure a physical quantity or detect an operating state of the electronic device, and convert the measured or detected information into an electrical signal. The sensor module may include, for example, at least one of a gesture sensor, a gyro sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a grip sensor, a proximity sensor, a color sensor (e.g., a red, green, blue (RGB) sensor), a biometric sensor, a temperature/humidity sensor, an illuminance sensor, or an ultraviolet (UV) sensor. Additionally or alternatively, the sensor module may include, for example, an olfactory (e-nose) sensor, an electromyography (EMG) sensor, an electroencephalogram (EEG) sensor, an electrocardiogram (ECG) sensor, an infrared (IR) sensor, an iris sensor, and/or a fingerprint sensor. The sensor module may further include a control circuit for controlling the one or more sensors it contains. In some embodiments, the electronic device may further include a processor configured to control the sensor module, either as part of the main processor or separately from it, so that the sensor module can be controlled while the main processor is in a sleep state.

The input device may include, for example, a touch panel, a (digital) pen sensor, a key, or an ultrasonic input device. The touch panel may use, for example, at least one of a capacitive, resistive, infrared, or ultrasonic method. The touch panel may further include a control circuit. The touch panel may further include a tactile layer to provide a tactile response to the user. The (digital) pen sensor may be, for example, a part of the touch panel or may include a separate recognition sheet. The key may include, for example, a hardware button, an optical key, or a keypad. The ultrasonic input device may detect, through a microphone, ultrasonic waves generated by an input tool and identify data corresponding to the detected ultrasonic waves.

The display may include a panel, a hologram device, a projector, and/or a control circuit for controlling them. The panel may be implemented to be, for example, flexible, transparent, or wearable. The panel may be configured, together with the touch panel, as one or more modules. According to one embodiment, the panel may include a pressure sensor (or force sensor) capable of measuring the intensity of the pressure of a user's touch. The pressure sensor may be implemented integrally with the touch panel, or as one or more sensors separate from the touch panel. The hologram device may show a stereoscopic image in the air by using the interference of light. The projector may display an image by projecting light onto a screen. The screen may be located, for example, inside or outside the electronic device. The interface may include, for example, HDMI, USB, an optical interface, or D-subminiature (D-sub). Additionally or alternatively, the interface may include, for example, a mobile high-definition link (MHL) interface, an SD card/multi-media card (MMC) interface, or an infrared data association (IrDA) standard interface.

The audio module may, for example, convert between sound and electrical signals in both directions. The audio module may process sound information input or output through, for example, a speaker, a receiver, earphones, or a microphone. The camera module is a device capable of capturing, for example, images (e.g., still images and/or dynamic images) and video, and according to one embodiment may include one or more image sensors (e.g., a front sensor or a rear sensor), a lens, an image signal processor (ISP), or a flash (e.g., an LED or a xenon lamp). The power management module may, for example, manage the power of the electronic device. According to one embodiment, the power management module may include a power management integrated circuit (PMIC), a charging IC, or a battery or fuel gauge. The PMIC may support wired and/or wireless charging. The wireless charging method includes, for example, a magnetic resonance method, a magnetic induction method, or an electromagnetic wave method, and the module may further include additional circuitry for wireless charging, for example, a coil loop, a resonance circuit, or a rectifier. The battery gauge may measure, for example, the remaining capacity of the battery, or its voltage, current, or temperature during charging. The battery may include, for example, a rechargeable cell and/or a solar cell.

The indicator may display a specific state of the electronic device or a part thereof (e.g., the processor), for example, a booting state, a message state, or a charging state. The motor may convert an electrical signal into mechanical vibration and may generate vibration or a haptic effect. The electronic device may include, for example, a mobile TV support device (e.g., a GPU) capable of processing media data according to standards such as digital multimedia broadcasting (DMB), digital video broadcasting (DVB), or mediaFlo™. Each of the components described in this document may consist of one or more parts, and the name of a component may vary depending on the type of electronic device. In various embodiments, the electronic device may omit some components, may include additional components, or may combine some of the components into a single entity that performs the same functions as the corresponding components did before being combined.

In various embodiments of the present invention, the electronic device may include a housing that has a front surface, a rear surface, and a side surface surrounding the space between the front surface and the rear surface. A touchscreen display (e.g., the display) may be disposed within the housing and exposed through the front surface. A microphone may be disposed within the housing and exposed through a portion of the housing. At least one speaker may be disposed within the housing and exposed through another portion of the housing. A hardware button (e.g., a key) may be disposed on yet another portion of the housing, or may be configured to be displayed on the touchscreen display. A wireless communication circuit (e.g., the communication module) may be located within the housing. The processor may be located within the housing and electrically connected to the touchscreen display, the microphone, the speaker, and the wireless communication circuit. The memory may be located within the housing and electrically connected to the processor.

In various embodiments of the present invention, the memory may be configured to store a first application program that includes a first user interface for receiving text input, and may store instructions that, when executed, cause the processor to perform a first operation and a second operation. The first operation may include: receiving a first type of user input through the button while the first user interface is not displayed on the touchscreen display; after receiving the first type of user input, receiving a first user utterance through the microphone; providing first data on the first user utterance to an external server that includes automatic speech recognition (ASR) and an intelligence system; and, after providing the first data, receiving from the external server at least one command for performing a task generated by the intelligence system in response to the first user utterance. The second operation may include: receiving the first type of user input through the button while the first user interface is displayed on the touchscreen display; after receiving the first type of user input, receiving a second user utterance through the microphone; providing second data on the second user utterance to the external server; after providing the second data, receiving from the server data on text generated by the ASR from the second user utterance, without receiving a command generated by the intelligence system; and entering the text into the first user interface.
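The difference between the first and second operations is whether the text-input interface is on screen when the button is pressed. A minimal sketch of that branching is given below. Everything named here is hypothetical: `ServerReply`, `fake_server`, and the `want_command` flag are stand-ins, and the description does not say how the device signals to the external server which kind of reply it expects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerReply:
    text: Optional[str] = None      # transcript produced by ASR
    command: Optional[str] = None   # task command from the intelligence system


def fake_server(utterance: str, want_command: bool) -> ServerReply:
    """Stand-in for the external server; the real protocol is not specified."""
    if want_command:
        return ServerReply(command=f"task:{utterance}")
    return ServerReply(text=utterance)


def handle_button_utterance(
    ui_displayed: bool,
    utterance: str,
    text_field: List[str],
    server=fake_server,
) -> Optional[str]:
    """Return a task command (first operation) or fill the text field (second)."""
    if ui_displayed:
        # Second operation: dictation only, no command is received.
        reply = server(utterance, want_command=False)
        text_field.append(reply.text)
        return None
    # First operation: the utterance is treated as a task request.
    reply = server(utterance, want_command=True)
    return reply.command
```

Here the same button press and the same utterance path lead to two outcomes, selected only by the display state.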

In various embodiments of the present invention, the button may include a physical key located on the side surface of the housing.

In various embodiments of the present invention, the first type of user input may be one of a single press of the button, a double press of the button, a triple press of the button, a single press of the button followed by a press-and-hold, or a double press of the button followed by a press-and-hold.

In various embodiments of the present invention, the instructions may further cause the processor to display the first user interface together with a virtual keyboard. The button may not be a part of the virtual keyboard.

In various embodiments of the present invention, the instructions may further cause the processor to receive, from the external server, data on text generated by the ASR from the first user utterance in the first operation.

In various embodiments of the present invention, the first application program may include at least one of a note application program, an e-mail application program, a web browser application program, or a calendar application program.

In various embodiments of the present invention, the first application program may include a message application, and the instructions may further cause the processor to automatically transmit the entered text through the wireless communication circuit when a selected time (and/or period) has elapsed after the text was entered.
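The automatic transmission after a selected time can be sketched with an explicit clock value passed in by the caller, which keeps the example deterministic. The class `AutoSendBuffer`, its methods, and the use of seconds are illustrative assumptions; the description does not state how the selected time is chosen or measured.

```python
from typing import Callable, Optional


class AutoSendBuffer:
    """Holds dictated text and sends it once a selected time has elapsed."""

    def __init__(self, send: Callable[[str], None], timeout_s: float) -> None:
        self._send = send            # stand-in for the wireless transmit path
        self._timeout_s = timeout_s  # the "selected time" (assumed in seconds)
        self._text: Optional[str] = None
        self._entered_at = 0.0

    def enter_text(self, text: str, now: float) -> None:
        # Record the text and the moment it was entered.
        self._text = text
        self._entered_at = now

    def tick(self, now: float) -> bool:
        """Send pending text if the timeout is exceeded; return True if sent."""
        if self._text is None:
            return False
        if now - self._entered_at <= self._timeout_s:
            return False
        self._send(self._text)
        self._text = None
        return True
```

In a real device `tick` would be driven by a timer; passing `now` explicitly is a choice made here so the behavior can be checked without waiting.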

In various embodiments of the present invention, the instructions may further cause the processor to perform a third operation. The third operation may include: receiving a second type of user input through the button while the first user interface is displayed on the touchscreen display; after receiving the second type of user input, receiving a third user utterance through the microphone; providing third data on the third user utterance to the external server; and, after providing the third data, receiving from the external server at least one command for performing a task generated by the intelligence system in response to the third user utterance.

In various embodiments of the present invention, the instructions may further cause the processor to perform a fourth operation. The fourth operation may include: receiving the second type of user input through the button while the first user interface is not displayed on the touchscreen display; after receiving the second type of user input, receiving a fourth user utterance through the microphone; providing fourth data on the fourth user utterance to the external server; after providing the fourth data, receiving from the external server at least one command for performing a task generated by the intelligence system in response to the fourth user utterance; receiving a fifth user utterance through the microphone; providing fifth data on the fifth user utterance to the external server; and, after providing the fifth data, receiving from the external server at least one command for performing a task generated by the intelligence system in response to the fifth user utterance.

In various embodiments of the present invention, the first type of user input and the second type of user input are different from each other, and each may be selected from among a single press of the button, a double press of the button, a triple press of the button, a single press of the button followed by a press-and-hold, or a double press of the button followed by a press-and-hold.
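The five press patterns listed above, and the requirement that the first and second input types differ, can be captured in a short sketch. The names `PressType`, `classify_press`, and `assign_input_types` are hypothetical, and so is the reading of each pattern as a count of short presses optionally followed by a press-and-hold; the description does not define how presses are detected or timed.

```python
from enum import Enum
from typing import Dict, Optional


class PressType(Enum):
    SINGLE = "single press"
    DOUBLE = "double press"
    TRIPLE = "triple press"
    SINGLE_THEN_HOLD = "single press followed by press-and-hold"
    DOUBLE_THEN_HOLD = "double press followed by press-and-hold"


_PATTERNS = {
    (1, False): PressType.SINGLE,
    (2, False): PressType.DOUBLE,
    (3, False): PressType.TRIPLE,
    (1, True): PressType.SINGLE_THEN_HOLD,
    (2, True): PressType.DOUBLE_THEN_HOLD,
}


def classify_press(press_count: int, held: bool) -> Optional[PressType]:
    """Map (number of short presses, hold follows?) to a listed pattern."""
    return _PATTERNS.get((press_count, held))


def assign_input_types(first: PressType, second: PressType) -> Dict[PressType, str]:
    """Bind two distinct patterns to the first and second input types."""
    if first == second:
        raise ValueError("first and second input types must differ")
    return {first: "first type", second: "second type"}
```

Unlisted combinations return `None` so that a caller can ignore them rather than guess at an intent.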

In various embodiments of the present invention, the memory may be further configured to store a second application program that includes a second user interface for receiving text input, and the instructions, when executed, may further cause the processor to perform a third operation. The third operation may include: receiving the first type of user input through the button while the second user interface is displayed; after receiving the first type of user input, receiving a third user utterance through the microphone; providing third data on the third user utterance to the external server; after providing the third data, receiving from the external server data on text generated by the ASR from the third user utterance, without receiving a command generated by the intelligence system; entering the text into the second user interface; and automatically transmitting the entered text through the wireless communication circuit when a selected time (and/or period) has elapsed.

In various embodiments of the present invention, the memory is configured to store a first application program including a first user interface for receiving text input, and the memory stores instructions that, when executed, cause the processor to perform a first operation and a second operation. The first operation includes: receiving a first type of user input through the button; after receiving the first type of user input, receiving a first user utterance through the microphone; providing first data on the first user utterance to an external server that includes automatic speech recognition (ASR) and an intelligence system; and, after providing the first data, receiving from the external server at least one command for performing a task generated by the intelligence system in response to the first user utterance. The second operation includes: receiving a second type of user input through the button; after receiving the second type of user input, receiving a second user utterance through the microphone; providing second data on the second user utterance to the external server; after providing the second data, receiving from the external server data on text generated by ASR from the second user utterance, without receiving a command generated by the intelligence system; and entering the text into the first user interface.
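As a non-limiting illustration, the device-side choice between the first operation (receive a command) and the second operation (receive dictated text) could be sketched as follows. `StubServer`, `ServerReply`, and `Device` are hypothetical names, and the stub does not perform real speech recognition: it treats the utterance as an already transcribed string so that the control flow can be shown in isolation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ServerReply:
    text: Optional[str] = None                         # ASR text, if returned
    commands: List[str] = field(default_factory=list)  # task commands, if any


class StubServer:
    """Stand-in for the external server with ASR and an intelligence system."""

    def process(self, utterance: str, want_commands: bool) -> ServerReply:
        text = utterance.strip()  # placeholder for ASR
        if want_commands:
            # placeholder for the intelligence system generating a command
            return ServerReply(text=text, commands=[f"run_task:{text}"])
        return ServerReply(text=text)


class Device:
    def __init__(self, server: StubServer, first_type: str, second_type: str):
        if first_type == second_type:
            raise ValueError("input types must differ")
        self.server = server
        self.first_type = first_type
        self.second_type = second_type
        self.text_field = ""            # the first user interface
        self.executed: List[str] = []   # commands handed off for execution

    def on_button(self, input_type: str, utterance: str) -> None:
        if input_type == self.first_type:
            # First operation: the reply carries at least one command.
            reply = self.server.process(utterance, want_commands=True)
            self.executed.extend(reply.commands)
        elif input_type == self.second_type:
            # Second operation: text only, no command is received.
            reply = self.server.process(utterance, want_commands=False)
            self.text_field += reply.text or ""
```

The same button therefore drives two different paths, and only the second path writes into the text field.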

In various embodiments of the present invention, the instructions may further cause the processor to display the first user interface together with a virtual keyboard, and the button may not be part of the virtual keyboard.

In various embodiments of the present invention, the instructions may further cause the processor to receive, from the external server, data on text generated by the ASR from the first user utterance within the first operation.

In various embodiments of the present invention, the instructions may further cause the processor to perform the first operation independently of whether the first user interface is displayed on the display.

In various embodiments of the present invention, the instructions may further cause the processor to perform the second operation when the electronic device is in a locked state, the touchscreen display is turned off, or both.

In various embodiments of the present invention, the instructions may further cause the processor to perform the second operation while the first user interface is displayed on the touchscreen display.

In various embodiments of the present invention, the memory may store instructions that, when executed, cause the processor to: receive a user utterance through the microphone; transmit, to an external server that performs at least one of automatic speech recognition (ASR) or natural language understanding (NLU), data on the user utterance together with information on whether to perform the NLU on text obtained by performing the ASR on that data; if the information indicates that the NLU is not to be performed, receive from the external server the text for the data on the user utterance; and, if the information indicates that the NLU is to be performed, receive from the external server a command obtained as a result of performing the NLU on the text.
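As a non-limiting illustration, the server-side branching on the transmitted information could be sketched as follows. The request fields `audio` and `perform_nlu`, and the helpers `fake_asr` and `fake_nlu`, are hypothetical: the embodiments do not define a message format, and the helpers are trivial placeholders rather than real recognition or language understanding.

```python
from typing import Any, Dict


def fake_asr(audio: bytes) -> str:
    """Placeholder for ASR: the 'audio' is assumed to be UTF-8 text."""
    return audio.decode("utf-8").strip()


def fake_nlu(text: str) -> Dict[str, Any]:
    """Placeholder for NLU: first word is the intent, the rest are arguments."""
    words = text.split()
    return {"intent": words[0] if words else "", "args": words[1:]}


def handle_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a command if NLU was requested, otherwise return the ASR text."""
    text = fake_asr(request["audio"])
    if request["perform_nlu"]:
        return {"command": fake_nlu(text)}
    return {"text": text}
```

A device would set `perform_nlu` to `False` for dictation, in which case only the text is returned, and to `True` when a command is wanted.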

According to one embodiment, a program module (e.g., a program) may include an operating system that controls resources related to an electronic device and/or various applications (e.g., application programs) that run on the operating system. The operating system may include, for example, Android™, iOS™, Windows™, Symbian™, Tizen™, or Bada™. The program module may include a kernel, middleware, an API, and/or applications. At least part of the program module may be preloaded on the electronic device or may be downloaded from an external electronic device (e.g., an electronic device or a server).

The kernel may include, for example, a system resource manager and/or a device driver. The system resource manager may control, allocate, or reclaim system resources. According to one embodiment, the system resource manager may include a process management unit, a memory management unit, or a file system management unit. The device driver may include, for example, a display driver, a camera driver, a Bluetooth driver, a shared memory driver, a USB driver, a keypad driver, a WiFi driver, an audio driver, or an inter-process communication (IPC) driver. The middleware may, for example, provide functions that applications commonly need, or provide various functions to applications through the API so that the applications can use the limited system resources inside the electronic device. According to one embodiment, the middleware may include at least one of a runtime library, an application manager, a window manager, a multimedia manager, a resource manager, a power manager, a database manager, a package manager, a connectivity manager, a notification manager, a location manager, a graphic manager, or a security manager.

The runtime library may include, for example, a library module used by a compiler to add new functions through a programming language while an application is running. The runtime library may perform input/output management, memory management, or arithmetic function processing. The application manager may, for example, manage the life cycle of an application. The window manager may manage GUI resources used on a screen. The multimedia manager may identify the format required to play media files and may encode or decode a media file using a codec suited to that format. The resource manager may manage the source code of an application or memory space. The power manager may, for example, manage battery capacity or power and provide power information required for the operation of the electronic device. According to one embodiment, the power manager may interoperate with a basic input/output system (BIOS). The database manager may, for example, create, search, or modify a database to be used by an application. The package manager may manage the installation or update of applications distributed in the form of package files.

The connectivity manager may, for example, manage wireless connections. The notification manager may, for example, notify a user of events such as incoming messages, appointments, and proximity alerts. The location manager may, for example, manage location information of the electronic device. The graphic manager may, for example, manage graphic effects to be provided to a user or a user interface related to them. The security manager may, for example, provide system security or user authentication. According to one embodiment, the middleware may include a telephony manager for managing voice or video call functions of the electronic device, or a middleware module capable of forming a combination of the functions of the components described above. According to one embodiment, the middleware may provide modules specialized for each type of operating system. The middleware may dynamically delete some existing components or add new components. The API is, for example, a set of API programming functions and may be provided in a different configuration depending on the operating system. For example, in the case of Android or iOS, one API set may be provided per platform, and in the case of Tizen, two or more API sets may be provided per platform.

The applications may include, for example, home, dialer, SMS/MMS, instant message (IM), browser, camera, alarm, contacts, voice dial, email, calendar, media player, album, watch, health care (e.g., measuring the amount of exercise or blood sugar), or environmental information (e.g., atmospheric pressure, humidity, or temperature information) applications. According to one embodiment, the applications may include an information exchange application capable of supporting information exchange between the electronic device and an external electronic device. The information exchange application may include, for example, a notification relay application for delivering specific information to the external electronic device, or a device management application for managing the external electronic device. For example, the notification relay application may deliver notification information generated by another application of the electronic device to the external electronic device, or may receive notification information from the external electronic device and provide it to the user. The device management application may, for example, install, delete, or update a function of an external electronic device that communicates with the electronic device (e.g., turning on or off the external electronic device itself (or some of its components), or adjusting the brightness (or resolution) of its display), or an application operating on the external electronic device. According to one embodiment, the applications may include an application designated according to the properties of the external electronic device (e.g., a health management application of a mobile medical device). According to one embodiment, the applications may include an application received from an external electronic device. At least part of the program module may be implemented (e.g., executed) in software, firmware, hardware (e.g., a processor), or a combination of at least two of these, and may include a module, program, routine, instruction set, or process for performing one or more functions.

The methods according to the present invention may be implemented in the form of program instructions that can be executed by various computer means and may be recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the computer-readable medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software.

Examples of the computer-readable medium may include hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions may include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as at least one software module in order to perform the operations of the present invention, and vice versa.

In addition, the method or device described above may be implemented with all or some of its components or functions combined, or may be implemented with them separated.

Although the present invention has been described above with reference to preferred embodiments, those skilled in the art will understand that the present invention may be modified and changed in various ways without departing from the spirit and scope of the present invention as set forth in the claims below.

Claims

A method and apparatus for analyzing images.