KR101356006B1

KR101356006B1 - Method and apparatus for tagging multimedia contents based upon voice enable of range setting

Info

Publication number: KR101356006B1
Application number: KR1020120011807A
Authority: KR
Inventors: 석영태; 이동원; 이호원; 이수빈
Original assignee: 한국과학기술원
Priority date: 2012-02-06
Filing date: 2012-02-06
Publication date: 2014-02-12
Also published as: KR20130090570A

Abstract

Provided are a method and apparatus for voice-based multimedia content tagging with configurable section setting.
The voice-based multimedia content tagging method according to the present invention includes: activating a voice information input mode during content playback on an electronic device; inputting voice information to the electronic device during the voice information input mode; determining the content section to which the voice information is tagged according to a dragging gesture on the touch screen, which is activated during the voice information input mode; terminating the voice information input mode; and storing the input voice information in the electronic device as tagging information for the content of the determined section. According to the present invention, multimedia content can be tagged based on voice. Because the content range to which the voice information is tagged is determined by a dragging gesture performed on the touch screen while the voice is being input, that range can be determined simply. In addition, multimedia content tagged with voice information can be searched effectively using the user's spoken input, and for multimedia used by many users, the users can tag the multimedia jointly in a collective intelligence manner. Furthermore, tags recorded as voice can be converted to text immediately or periodically and used as information that helps other users search.

Description

Method and apparatus for voice-based multimedia content tagging with configurable section setting {Method and apparatus for tagging multimedia contents based upon voice enable of range setting}

The present invention relates to a method and apparatus for voice-based multimedia content tagging with configurable section setting, and more particularly to a voice-based multimedia content tagging method and apparatus in which the user can set, with a user gesture, the range of content to which voice information is tagged.

A substantial part of data traffic consists of multimedia content that combines video and sound. Because such multimedia is actively played and searched even on communication devices such as smartphones, it is becoming very important to search for and identify the data a user is interested in among a wide variety of multimedia.

One such multimedia tagging method is based on user input for the content as a whole. This makes it possible to distinguish one multimedia item from another, but for large, long-running multimedia, fine-grained data search is difficult.

Alternatively, techniques for tagging a POI (Point Of Interest) on multimedia content have been provided. POI tagging attaches a specific place name or feature name to the content so that the user can easily identify the place or feature related to the content. However, the user must go to the trouble of entering a separate place or object name for searching. In addition, MPEG-7 has recently made it possible to tag a period starting from a specific point in time in the content, but the tagging procedure is complicated for the administrator and requires a separate input means (for example, a keyboard).

Furthermore, these prior-art tagging methods cannot effectively set the range of content to be tagged, that is, the tagging section.

Accordingly, the problem to be solved by the present invention is to provide a method and apparatus that can tag multimedia content, with a section set for it, in a simpler and more effective manner.

To solve the above problem, the present invention provides a method of tagging multimedia content by an electronic device having a touch screen, the method comprising: activating a voice information input mode during content playback on the electronic device; inputting voice information to the electronic device during the voice information input mode; determining the content section to which the voice information is tagged according to a dragging gesture on the touch screen, which is activated during the voice information input mode; terminating the voice information input mode; and storing the input voice information in the electronic device as tagging information for the content of the determined section.

In one embodiment of the present invention, the dragging gesture is a dragging gesture made with two touches; one of the two touch points corresponds to the start point of the content section to which the voice information is tagged, and the other corresponds to the end point.

In one embodiment of the present invention, the content section corresponds to the distance between the two touch points.

In one embodiment of the present invention, the method further includes displaying, on the touch screen, the frame corresponding to the end point of the section as the dragging gesture is performed.

In one embodiment of the present invention, the method further includes converting the input voice information to text.

In one embodiment of the present invention, the method further includes muting the sound produced by the content playback during the voice information input mode.

In one embodiment of the present invention, the voice information input mode is activated by an input signal generated in response to user input.

To solve the above problem, the present invention also provides a multimedia content tagging apparatus of an electronic device having a touch screen, the apparatus comprising: a playback unit that plays multimedia content on the electronic device; a voice information input mode activation unit that, during playback of the multimedia content by the playback unit, activates a voice information input mode in which voice information can be input to the electronic device; a voice information input unit that, in the voice information input mode activated by the activation unit, inputs voice information from outside into the electronic device; a dragging gesture detection unit that, in the voice information input mode, detects a dragging gesture made with two touches on the touch screen; a content section determination unit that determines the content section according to the distance between the two touches; and a storage unit that stores the voice information from the voice information input unit as tagging information for the content section determined by the content section determination unit.

In one embodiment of the present invention, the content section determination unit determines the content section in proportion to the distance between the two touches.

In one embodiment of the present invention, the dragging gesture is a pinch gesture, and the playback unit of the content tagging apparatus mutes the sound in the voice information input mode.

In one embodiment of the present invention, the apparatus further includes a display unit that displays, on the touch screen, the frame corresponding to the end point of the content section determined by the content section determination unit.

In one embodiment of the present invention, the tagging apparatus further includes a communication unit capable of transmitting, to an external server, the content information of the section to which voice information has been tagged.

The present invention also provides a voice information tag sharing system comprising: the multimedia content tagging apparatus described above, with which each of a plurality of users tags a content section and the voice information corresponding to that section; a server that receives, from each user's multimedia content tagging apparatus, the content section and the voice information tagged to it; an analysis unit that analyzes the content sections and tagged voice information received by the server and determines the tag information input by the largest number of users; and a storage unit that stores the most-received tag information determined by the analysis unit as valid tag information.

In one embodiment of the present invention, the valid tag information may be displayed on other users' clients.

According to the present invention, multimedia content can be tagged based on voice. Because the content range to which the voice information is tagged is determined by a dragging gesture performed on the touch screen while the voice is being input, that range can be determined simply. In addition, multimedia content tagged with voice information can be searched effectively using the user's spoken input, and for multimedia used by many users, the users can tag the multimedia jointly in a collective intelligence manner. Furthermore, tags recorded as voice can be converted to text immediately or periodically and used as information that helps other users search.

FIG. 1 is a flowchart of the steps of a multimedia content tagging method according to an embodiment of the present invention.
FIGS. 2 to 6 are diagrams illustrating a multimedia content tagging method according to an embodiment of the present invention.
FIG. 7 is a block diagram of a multimedia content tagging apparatus according to another embodiment of the present invention.
FIG. 8 is a block diagram of a multimedia content tagging apparatus according to yet another embodiment of the present invention.
FIG. 9 is a diagram illustrating a method and system that use the communication unit provided in the tagging apparatus according to the present invention to share tagging information among multiple users and to tag content jointly in a collective intelligence manner.

To fully understand the present invention, its operational advantages, and the objects achieved by practicing it, reference should be made to the accompanying drawings, which illustrate preferred embodiments of the present invention, and to the contents described in those drawings.

Hereinafter, the present invention is described in detail by explaining preferred embodiments with reference to the accompanying drawings. The present invention can, however, be implemented in many different forms and is not limited to the embodiments described. To describe the present invention clearly, parts unrelated to the description are omitted, and the same reference numerals in the drawings denote the same members.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit", "... device", "module", and "block" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

To solve the problems of the prior art described above, the present invention provides a voice-based multimedia content tagging method in which the section is set with a dragging gesture. In one embodiment of the present invention, the section setting consists of a start point and an end point, which correspond to two touch points on the touch screen, and the content range to which the voice information is tagged can be determined through a pinch gesture or an unpinch (spread) gesture.

That is, the present invention provides the tag, a virtual marker for identifying specific content, by voice. Multimedia tagging is thereby possible simply by recording a specific voice or text and matching it to specific content, without a user input means (mouse, touch panel input, and the like). Furthermore, the content section to be tagged is determined through a multi-touch dragging gesture made while the voice tag information is being input.

The content tagging method and apparatus according to the present invention can be used not only in electronic devices such as computers on which multimedia can be played, but also in mobile devices such as smartphones. The content according to the present invention is not limited in type or format; any distinguishable form of content that can be tagged by voice or text falls within the content according to the present invention.

Hereinafter, a multimedia content tagging method according to an embodiment of the present invention is described with reference to the drawings.

FIG. 1 is a flowchart of the steps of a multimedia content tagging method according to an embodiment of the present invention.

Referring to FIG. 1, the voice information input mode is first activated during content playback on the electronic device (S110). In this specification, the voice information input mode refers to an operating mode in which the electronic device can recognize and store voice from outside the device; this mode is activated by an input signal that an input means of the electronic device generates in response to user input. For example, when the electronic device is a mobile device such as a mobile phone, the voice information input mode is activated by pressing an input button on the mobile device.

Next, voice information is input to the electronic device during the voice information input mode (S120). The input voice information is a specific voice from outside, entered through an audio component of the electronic device such as a microphone, and it becomes the tag for the content of the specific section determined by the dragging gesture made during the voice information input mode.

Next, the content section to which the voice information is tagged is determined according to a dragging gesture on the touch screen, which is activated during the voice information input mode (S130). In one embodiment of the present invention, the dragging gesture is a multi-touch dragging gesture in which two touches are detected simultaneously; one of the two detected touch points corresponds to the start point of the content section to which the voice information is tagged, and the other corresponds to the end point of the content section. In other words, the present invention tags a specific time range (section) within the content's overall time range with a single piece of voice information. Unlike the prior art, in which only the single frame being played is tagged, content consisting of many frames over a period of time can be tagged as a bundle with a single gesture.

Next, the voice information input mode is terminated (S140). In one embodiment of the present invention, termination of the voice information input mode is also determined by an input selection signal from the user, and both activation and termination of the mode may be determined by how long the user holds a specific input button provided on the electronic device. The input voice information is then stored in the electronic device as tagging information for the content of the determined section (S150).
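The sequence S110 to S150 can be sketched in Python as a small session object. This is only an illustrative sketch, not the implementation described in the specification: the names `Tag`, `TaggingSession`, and `seconds_per_pixel` are hypothetical, the start point is assumed to be the playback position at activation, and the voice data is represented as raw bytes.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    start_s: float   # start of the tagged content section, in seconds
    end_s: float     # end of the tagged content section, in seconds
    voice: bytes     # recorded voice information


@dataclass
class TaggingSession:
    """Hypothetical model of steps S110 to S150 for one piece of content."""
    seconds_per_pixel: float = 0.25   # assumed distance-to-time ratio
    active: bool = False
    start_s: float = 0.0
    end_s: float = 0.0
    voice: bytearray = field(default_factory=bytearray)
    stored: list = field(default_factory=list)

    def activate(self, playback_position_s: float) -> None:
        # S110: the input mode is activated during playback (e.g. button press).
        self.active = True
        self.start_s = self.end_s = playback_position_s
        self.voice = bytearray()

    def add_voice(self, chunk: bytes) -> None:
        # S120: voice is accepted only while the input mode is active.
        if self.active:
            self.voice.extend(chunk)

    def drag(self, touch_distance_px: float) -> None:
        # S130: the distance between the two touches sets the section length.
        if self.active:
            self.end_s = self.start_s + touch_distance_px * self.seconds_per_pixel

    def finish(self) -> Tag:
        # S140: the mode ends. S150: the voice is stored as the section's tag.
        if not self.active:
            raise RuntimeError("voice information input mode is not active")
        self.active = False
        tag = Tag(self.start_s, self.end_s, bytes(self.voice))
        self.stored.append(tag)
        return tag


session = TaggingSession(seconds_per_pixel=0.25)
session.activate(playback_position_s=30.0)
session.add_voice(b"goal scene")
session.drag(touch_distance_px=80.0)
tag = session.finish()
print(tag.start_s, tag.end_s, tag.voice)
```

Voice input and the dragging gesture are both ignored outside the input mode, which mirrors the requirement that the section is determined only while the mode is active.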

Hereinafter, a multimedia content tagging method according to an embodiment of the present invention is described in detail by way of example.

FIGS. 2 to 6 are diagrams illustrating a multimedia content tagging method according to an embodiment of the present invention.

Referring to FIG. 2, an electronic device 100 having a touch screen 110 is shown. The electronic device 100 may be any electronic device capable of playing multimedia content such as videos and slides.

Referring to FIG. 3, the voice information input mode is activated on the electronic device 100, and voice information consisting of a specific voice is input. At this time, the sound of the multimedia content being played may be muted, which improves the recognition rate for the external voice.

Referring to FIG. 4, a pinch gesture is performed on the touch screen 110 during the voice information input mode. A pinch gesture is the so-called pinching motion in which the distance between two touch input means is narrowed or widened. That is, the present invention determines the content section in proportion to the distance between the touch points of the two touch input means detected on the touch screen. In FIG. 5, the content of section t1-t2 is tagged with the input voice information according to the distance d1 between the two touch points. The tagged content section can therefore be determined simply, with a dragging gesture that can be performed in one motion, without waiting for playback to reach the desired end of the section. In addition, a reference for touch distance and content section may be set in advance, in which case the content section for a given touch distance is determined from that reference and ratio.

In FIG. 6, at distance d2, the content of section t1-t3 becomes the content section to which the voice information is tagged. That is, the present invention matches the distance between the touch points, which changes with the pinch gesture, to the content section, so that voice tagging of content can be performed more simply.
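The proportional mapping from touch distance to content section (d1 to t1-t2, d2 to t1-t3) might look like the following Python sketch. The specification does not give a concrete formula, so the function names, the `seconds_per_pixel` ratio, and the clamping of the end point to the content duration are assumptions made for illustration.

```python
import math


def touch_distance(p1, p2):
    """Euclidean distance in pixels between two (x, y) touch points."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])


def content_section(t1, p1, p2, seconds_per_pixel, duration):
    """Return (start, end) in seconds for a section starting at t1.

    The section length is proportional to the touch distance, and the
    end point is clamped to the content duration (an assumption).
    """
    length = touch_distance(p1, p2) * seconds_per_pixel
    return t1, min(t1 + length, duration)


# Fingers 300 px apart (d1), then spread to 500 px apart (d2).
t1 = 60.0
section_d1 = content_section(t1, (100, 400), (400, 400), 0.5, 600.0)
section_d2 = content_section(t1, (50, 400), (550, 400), 0.5, 600.0)
print(section_d1)
print(section_d2)
```

Widening the fingers from d1 to d2 moves only the end point, so the start point t1 stays fixed while the tagged section grows.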

Furthermore, as the dragging gesture (for example, a pinch gesture) is performed, the frame at the end point of the content section may be displayed on the touch screen. That is, the frames shown in FIGS. 5 and 6 are the video frames corresponding to t2 and t3, respectively.
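Choosing which frame to show for the end point can be sketched as a time-to-frame-index conversion. This Python fragment assumes a constant frame rate and a hypothetical helper name `end_point_frame_index`; decoding and drawing the frame are outside its scope.

```python
def end_point_frame_index(section_end_s: float, fps: float, total_frames: int) -> int:
    """Index of the frame to preview for a section ending at section_end_s.

    Assumes a constant frame rate; the result is clamped to a valid index.
    """
    index = int(section_end_s * fps)
    return max(0, min(index, total_frames - 1))


# A 600 s clip at 30 frames per second has 18000 frames.
preview = end_point_frame_index(210.0, 30.0, 18000)
print(preview)
```

Recomputing this index each time the touch distance changes would let the preview follow the fingers as they pinch or spread.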

Another embodiment of the present invention further includes converting the input voice information to text, so that the user can also search effectively for the content of a specific section in text form.

FIG. 7 is a block diagram of a multimedia content tagging apparatus according to another embodiment of the present invention. The apparatus includes: a playback unit 210 that plays multimedia content on the electronic device; a voice information input mode activation unit 220 that, during playback of the multimedia content by the playback unit, activates a voice information input mode in which voice information can be input to the electronic device; a voice information input unit 230 that, in the voice information input mode activated by the activation unit, inputs voice information from outside into the electronic device; a dragging gesture detection unit 240 that, in the voice information input mode, detects a dragging gesture made with two touches on the touch screen; a content section determination unit 250 that determines the content section according to the distance between the two touch points of the dragging gesture detected by the dragging gesture detection unit; and a storage unit 260 that stores the voice information from the voice information input unit as tagging information for the content section determined by the content section determination unit.

That is, the multimedia content tagging apparatus according to the present invention receives a specific voice as tagging information at the frame to be tagged while the playback unit 210 is playing the content, and determines the content section to be associated with that tagging information according to the dragging gesture. In particular, the content section determination unit 250 according to the present invention determines the content section in proportion to the distance between the two touch points detected simultaneously on the touch screen, and this determination can be performed only during the voice information input mode activated by the voice information input unit 230.

In the present invention, as described above, two touches are detected simultaneously and the content section is determined by the distance between them. The dragging gesture in the present invention may therefore be a pinch gesture. Furthermore, in the content tagging apparatus according to the present invention, the playback unit mutes the sound in the voice information input mode, which can improve voice recognition accuracy.

FIG. 8 is a block diagram of a multimedia content tagging apparatus according to yet another embodiment of the present invention.

Referring to FIG. 8, the apparatus includes, in addition to the apparatus of FIG. 7, a display unit 270 that displays on the touch screen the frame corresponding to the end point of the content section determined by the content section determination unit. The user can thus effectively preview the end point of the desired content section by bringing the fingers together (pinch) or spreading them apart (unpinch). The multimedia content tagging apparatus according to an embodiment of the present invention further includes a communication unit capable of transmitting, to an external server, the content information of the section to which voice information has been tagged. FIG. 9 is a diagram illustrating a method and system that use the communication unit provided in the tagging apparatus according to the present invention to share tagging information among multiple users and to tag content jointly in a collective intelligence manner.

Referring to FIG. 9, the same multimedia content is provided to a plurality of users 420, 430, and 440 through an online server 410. When, during playback, many users tag the content of a similar playback section by voice or text, the tagged information is transmitted to the server 410. The server then analyzes the submitted content times and tag information, stores the tag information input by the largest number of users as the content tag for that time range, and displays it on other users' terminals. The voice information tag sharing system according to an embodiment of the present invention therefore includes an analysis unit 420 that analyzes the tag information input by the largest number of users. The analysis unit 420 analyzes the content section information received through the server 410 and the voice information tagged to it, and identifies the voice information most frequently received for the content of the same section.

The system according to the present invention further includes a storage unit 430 that stores the most-received tag information as valid tag information, and other users can search the stored valid tag information as the tagging information for the content section.
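Selecting the most-received tag per section can be sketched in Python as a frequency count. The specification does not say how "similar playback sections" are grouped or how voice tags are compared, so this sketch assumes the tags have already been converted to text and groups submissions into fixed time buckets by start time; `valid_tags` and `bucket_s` are hypothetical names.

```python
from collections import Counter, defaultdict


def valid_tags(submissions, bucket_s=10.0):
    """Pick the most frequently submitted tag text per time bucket.

    submissions: iterable of (user_id, start_s, end_s, tag_text).
    Returns {bucket_index: tag_text}. Grouping by fixed buckets of the
    start time is an assumption made for this sketch.
    """
    per_bucket = defaultdict(Counter)
    for _user, start_s, _end_s, text in submissions:
        bucket = int(start_s // bucket_s)
        # Normalize case and whitespace so "Goal" and "goal" count together.
        per_bucket[bucket][text.strip().lower()] += 1
    return {b: counts.most_common(1)[0][0] for b, counts in per_bucket.items()}


submissions = [
    ("u1", 61.0, 80.0, "Goal"),
    ("u2", 63.5, 82.0, "goal"),
    ("u3", 65.0, 90.0, "penalty"),
    ("u4", 120.0, 130.0, "interview"),
]
result = valid_tags(submissions)
print(result)
```

When two tags tie within a bucket, `Counter.most_common` returns the one that was submitted first; a real system would need an explicit tie-breaking policy.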

With this collective-intelligence approach, even a user encountering the content for the first time can use reliable tag information previously entered by many users. The present invention also provides a mobile device, such as a mobile phone, as an electronic device that includes the multimedia content tagging apparatus.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to those embodiments, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description. Accordingly, the spirit of the present invention should be understood only from the claims set forth below, and all equal or equivalent modifications thereof fall within the scope of the spirit of the invention.

In addition, the multimedia content tagging method and apparatus according to the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples of the recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, optical data storage devices, hard disks, and flash drives, and also include implementation in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

Claims

A method of tagging multimedia content by an electronic device having a touch screen, the method comprising:
activating a voice information input mode during content playback on the electronic device;
inputting voice information to the electronic device in the voice information input mode;
determining a content section to which the voice information is tagged according to a dragging gesture made on the touch screen, which is activated during the voice information input mode;
terminating the voice information input mode; and
when the voice information input mode ends, storing, in the electronic device, the voice information input during the voice information input mode as tagging information for the content of the determined section.

The method of claim 1,
wherein the dragging gesture is a dragging gesture made by two touches, one of the two touch points corresponding to a start point of the content section to which the voice information is tagged and the other corresponding to an end point.

3. The method of claim 2,
wherein the content section corresponds to the distance between the two touch points.

The method of claim 3,
further comprising displaying, on the touch screen, a frame corresponding to an end point of the section as the dragging gesture is performed.

The method of claim 4,
further comprising converting the input voice information into text.

The method of claim 1,
further comprising muting the sound produced by playback of the content during the voice information input mode.

Claim 7 was abandoned upon payment of the registration fee.

The method of claim 1,
wherein the voice information input mode is activated by an input signal generated by a user input.

An apparatus for tagging multimedia content in an electronic device having a touch screen, the apparatus comprising:
a playback unit that plays multimedia content on the electronic device;
a voice information input mode activator that activates a voice information input mode, in which voice information can be input to the electronic device, while the playback unit is playing multimedia content;
a voice information input unit through which external voice information is input to the electronic device in the voice information input mode activated by the voice information input mode activator;
a dragging gesture detection unit that detects, in the voice information input mode activated by the voice information input mode activator, a dragging gesture made by two touches on the touch screen;
a content section determiner that determines the content section according to the distance between the two touches; and
a storage unit that, when the voice information input mode is terminated, stores the voice information input during the voice information input mode as tagging information for the content section determined by the content section determiner.

Claim 9 was abandoned upon payment of the registration fee.

The apparatus of claim 8,
wherein the content section determiner determines the content section in proportion to the distance between the two touches.

Claim 10 was abandoned upon payment of the registration fee.

The apparatus of claim 9,
wherein the dragging gesture is a pinch gesture.

The apparatus of claim 8,
wherein the playback unit mutes the sound in the voice information input mode.

The apparatus of claim 8,
further comprising a display unit that displays, on the touch screen, a frame corresponding to an end point of the content section determined by the content section determiner.

Claim 13 was abandoned upon payment of the registration fee.

13. The tagging apparatus according to any one of claims 8 to 12,
further comprising a communication unit that transmits, to an external server, content information of the section to which the voice information is tagged.

Deleted