KR20170130729A

KR20170130729A - Method, Apparatus, and Computer-Readable Medium for Automatic editing moving picture data and audio data

Info

Publication number: KR20170130729A
Application number: KR1020160061285A
Authority: KR
Inventors: 이경준
Original assignee: (주)알투디투사운드
Priority date: 2016-05-19
Filing date: 2016-05-19
Publication date: 2017-11-29

Abstract

According to the present invention, pre-stored audio data is automatically matched to and combined with all or part of the playback section of first data, which is video data, using attribute information entered by the user for the first data and image analysis data derived from the variation and color information of the plurality of image data included in the first data, and the resulting second data is provided to the user. Accordingly, when a user edits a video, the steps of selecting audio data one by one and of selecting the playback section of the video to combine with the audio data can be minimized. Also, because the user does not need to know how to edit video and audio data, the technical barrier to video editing is minimized and the user gains the maximum convenience from omitting the editing process.

Description

[0001] The present invention relates to a method and a computer-readable medium for automatically editing video data and audio data.

[0002] The present invention relates to video editing techniques such as combining video data with audio data, and more specifically to a technique that analyzes video data and automatically combines it with matching audio data such as sound effects and background sound.

With the recent growth in the use of video sharing sites and social network services, users produce videos individually or in groups and share them to attract attention or gain fame. Such videos are typically produced by shooting footage with a recording device and applying various effects to it, and are shared in a form in which video and audio data are combined.

The audio data in a video typically comprises audio recorded directly by the recording device, that is, recorded together with the video at shooting time, and audio combined with the video during editing.

Of these, specialized programs are used to modify and edit the audio recorded with the video, or to combine new audio data when editing the video. Although the recent spread of entry-level video editing programs and applications has lowered the difficulty somewhat, cumbersome work remains unavoidable: the user must personally find the audio data to be used, store it on his or her own terminal, decide which playback section of the video it should be combined with, and perform the edit. Video editing therefore remains time-consuming and inconvenient.

In the same technical field, Korean Unexamined Patent Publication No. 2007-0098362 and others present a technique that sets a playback section for background music and combines the background music with it to generate video data. Even with that technique, however, the user must select the background music personally and must ultimately select its playback section on a timeline control tool, so the inconvenience remains.

Accordingly, an object of the present invention, in the context of combining audio data with video data, is to automatically match audio data that suits the video data and provide it to the user, minimizing the inconvenience of selecting from a large number of audio data items one by one, and to analyze and select the playback section automatically as well, so that background sound or sound effects are combined with the video automatically and the convenience of video editing is maximized.

To achieve the above object, a method for automatically editing video data and audio data according to an embodiment of the present invention is performed on a computing device that includes one or more processors and a main memory storing instructions executable by the processors, and comprises: receiving, from a user terminal, first data, which is the video data to be edited and which includes a plurality of image data arranged in order of playback time so that the image data are played back as a video; extracting image analysis data that includes information on the change pattern of the plurality of image data included in the received first data as the video plays, and color information of the plurality of image data; comparing attribute information of the first data, stored in an area of the first data, and the image analysis data with pre-stored audio data, and matching at least one audio data item to all or part of the playback time of the first data; and combining the first data with the audio data matched to it to generate second data in which video data and audio data are combined, and providing the second data to the user terminal.

According to the present invention, pre-stored audio data is automatically matched to all or part of the playback section of the first data, which is video data, using attribute information entered by the user for the first data and image analysis data derived from the variation and color information of the plurality of image data included in the first data, and the combined second data is provided to the user.

Accordingly, when a user edits a video, the steps of selecting audio data one by one and of selecting the playback section of the video to combine it with can be minimized. Because the user no longer needs to know how to edit video and audio data, the technical barrier to video editing is minimized and the user gains the maximum convenience from omitting the editing process.

In addition, from the standpoint of audio producers who create audio data themselves, there is the further effect that they can supply audio data when the technology of the present invention is provided and earn additional revenue from it.

FIGS. 1 to 5 are flowcharts of a method for automatically editing video data and audio data according to embodiments of the present invention.
FIGS. 6 to 9 are examples of screens output on the user terminal in implementations of the embodiments of the present invention.

Various embodiments and/or aspects are now disclosed with reference to the drawings. In the following description, for purposes of explanation, numerous specific details are set forth to aid an overall understanding of one or more aspects. However, those of ordinary skill in the art will recognize that such aspect(s) may be practiced without these specific details. The following description and the accompanying drawings set forth certain illustrative aspects of the one or more aspects in detail. These aspects are illustrative, however; only some of the various ways in which the principles of the various aspects may be employed are shown, and the description is intended to include all such aspects and their equivalents.

In addition, various aspects and features will be presented in terms of a system that may include a number of devices, components, and/or modules. It should also be understood and appreciated that the various systems may include additional devices, components, and/or modules, and/or may not include all of the devices, components, and modules discussed in connection with the drawings.

As used herein, "embodiment", "example", "aspect", "illustration", and the like are not necessarily to be construed as meaning that any aspect or design described is better than, or advantageous over, other aspects or designs. The terms 'component', 'module', 'system', 'interface', and the like used below generally refer to a computer-related entity and may mean, for example, hardware, a combination of hardware and software, or software.

Moreover, the term "or" is intended to mean an inclusive "or" rather than an exclusive "or". That is, unless otherwise specified or clear from context, "X uses A or B" is intended to mean any of the natural inclusive permutations: if X uses A, X uses B, or X uses both A and B, then "X uses A or B" holds in any of these cases. The term "and/or" as used herein should also be understood to refer to and include all possible combinations of one or more of the listed related items.

Also, the terms "comprises" and/or "comprising" mean that the stated feature and/or component is present, but should be understood not to exclude the presence or addition of one or more other features, components, and/or groups thereof.

It will also be understood that singular forms such as "a" and "the" include plural forms unless the specification clearly indicates otherwise. Thus, for example, "a component surface" includes one or more component surfaces.

Also, terms including ordinal numbers such as first and second may be used to describe various components, but the components are not limited by those terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of them.

Also, the terminology used herein is only for describing particular embodiments and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Also, in the embodiments of the present invention, unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meaning in the context of the related art and, unless explicitly defined in the embodiments of the present invention, are not to be interpreted in an idealized or overly formal sense.

FIGS. 1 to 5 are flowcharts of a method for automatically editing video data and audio data according to embodiments of the present invention.

Referring first to FIG. 1, the method for automatically editing video data and audio data according to the present invention is understood to be performed, as mentioned above, on a computing device that includes one or more processors and a main memory storing instructions executable by the processors. In what follows, the method is described as being performed by an 'apparatus', which is understood to mean any device capable of performing the functions of the present invention, including the computing device above.

Referring first to FIG. 1, the apparatus performs a step (S10) of receiving, from a user terminal, first data, which is the video data to be edited and which includes a plurality of image data arranged in order of playback time so that the image data are played back as a video.

In the present invention, a video means the original data to be edited, in a form in which image or audio data are combined. That is, it means the data as stored immediately before the function of the present invention, in which audio data is automatically matched and combined, is applied. Accordingly, the first data may be data containing only image data, data in which image and audio data are combined, or data to which the function of an embodiment of the present invention has already been applied.

As mentioned above, the first data may include at least a plurality of image data arranged in order of playback time, together with data on the combining order and combining relationship of the image data. The image data here are the still images generally contained in video data. Video data is understood to present a sequence of still frames, like film, that appear to move as they are played back, so a detailed description is omitted.

As described later, each of the image data may consist of pixel data and pixel combination data that place the pixels making up the image in their respective pixel regions. Each pixel data item has an RGB value, and a single still image is formed when each pixel is rendered in the region where it belongs according to the pixel combination data.

Each of the image data has a fixed playback time; during playback, the image data mapped to each point in time is loaded at that time, and the sequence is thereby played back as a video.

The first data may also include information entered from the user terminal when the first data was created. As described later, this information may be included in the first data as its attribute information. The attribute information may include the user's music preference information, which may cover music genre, tempo, mood, lyricist, composer, and the like. In other words, the attribute information of the first data can be understood to include at least keyword information about music entered by the user.
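The structure of the first data described above could be modelled roughly as follows. This is a minimal sketch in Python, not the specification's own format: the names `Frame`, `AttributeInfo`, `FirstData`, and `frame_at` are hypothetical, and the choice of seconds for playback time is an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255


@dataclass
class Frame:
    """One still image with the playback time at which it is shown."""
    start_time: float            # seconds from the start of the video (assumed unit)
    pixels: List[List[Pixel]]    # rows of RGB pixels


@dataclass
class AttributeInfo:
    """Music preferences entered by the user (all fields optional)."""
    genre: Optional[str] = None
    tempo: Optional[str] = None
    mood: Optional[str] = None
    keywords: List[str] = field(default_factory=list)


@dataclass
class FirstData:
    """Video to be edited: time-ordered frames plus attribute information."""
    frames: List[Frame]          # must be sorted by start_time
    attributes: AttributeInfo

    def frame_at(self, t: float) -> Frame:
        """Return the frame mapped to playback time t."""
        times = [frame.start_time for frame in self.frames]
        index = bisect_right(times, t) - 1
        return self.frames[max(index, 0)]
```

Looking up the frame by binary search over the start times mirrors the description that each image is loaded at the point in time it is mapped to.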

In step S10, the apparatus receives all of the information included in the first data described above from the user terminal, thereby performing the preprocessing needed for the functions of the present invention described below.

The apparatus then performs a step (S20) of extracting image analysis data that includes information on the change pattern of the plurality of image data included in the received first data as the video plays, and color information of the plurality of image data. A specific example of step S20 is shown in FIG. 2.

Referring to FIG. 2, in performing step S20 the apparatus first performs a step (S21) of extracting the object data contained in the plurality of image data, based on the way the pixels of the image data included in the first data are combined, and the RGB value of each pixel.

In step S21, for example, the apparatus extracts the pixels that mark an object boundary based on changes in RGB value between pixels, and extracts object data based on how those pixels are combined. The object data may include, for example, header information that identifies a specific object, such as a group of pixels representing a particular thing or a group of pixels representing a person, together with data on the pixels belonging to that object.
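One simple way to find boundary pixels from RGB changes is to compare each pixel with its right and lower neighbours. The sketch below assumes this neighbour comparison, a sum-of-absolute-differences distance, and a threshold of 96; none of these details are given in the specification, and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255


def rgb_distance(a: Pixel, b: Pixel) -> int:
    """Sum of absolute per-channel differences between two pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def boundary_pixels(image: List[List[Pixel]], threshold: int = 96) -> Set[Tuple[int, int]]:
    """Return (x, y) coordinates whose right or lower neighbour differs sharply.

    Assumes every row of the image has the same length.
    """
    edges: Set[Tuple[int, int]] = set()
    height = len(image)
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            right = row[x + 1] if x + 1 < len(row) else None
            below = image[y + 1][x] if y + 1 < height else None
            for neighbour in (right, below):
                if neighbour is not None and rgb_distance(pixel, neighbour) > threshold:
                    edges.add((x, y))
    return edges
```

Grouping the resulting boundary pixels into objects, and attaching header information to each group, would be a further step that this sketch leaves out.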

Once step S21 has been performed, the apparatus performs a step (S22) of computing variation information for the object data contained in the plurality of image data, using the playback order of the image data.

For example, the group of pixels belonging to an object may differ from one image data item to the next as the image data are played back as a video. That is, as the group of pixels belonging to the object data changes, the object data is recognized as changing.

For example, the group of pixels belonging to object data representing a person may consist of different pixels in image data that are adjacent in playback order. This occurs when the size of the person's object data changes, when the person is shown moving, or when the color of the object data representing the person changes.

That is, in step S22, when a change in the information of the pixel group belonging to the object data is detected, variation information for the object data is computed, so that any change in the recognized object is detected.
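The variation computed in step S22 can be illustrated by comparing the pixel group of one object in two adjacent frames. The sketch below assumes that an object is represented as a non-empty set of (x, y) coordinates and that variation is summarized by centroid displacement and size ratio; both are illustrative choices, and color change is omitted for brevity.

```python
from typing import Dict, Set, Tuple

Coordinate = Tuple[int, int]


def centroid(group: Set[Coordinate]) -> Tuple[float, float]:
    """Mean (x, y) position of a non-empty pixel group."""
    count = len(group)
    return (sum(x for x, _ in group) / count, sum(y for _, y in group) / count)


def variation(previous: Set[Coordinate], current: Set[Coordinate]) -> Dict[str, object]:
    """Describe how an object's pixel group changed between adjacent frames."""
    (px, py), (cx, cy) = centroid(previous), centroid(current)
    return {
        "changed": previous != current,                # any difference in the group
        "displacement": (cx - px, cy - py),            # movement of the object
        "size_ratio": len(current) / len(previous),    # growth or shrinkage
    }
```

A sequence of such records over the playback order is the kind of raw material that step S23 would then turn into change-pattern information.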

When step S22 is complete, the apparatus performs a step (S23) of extracting the image analysis data mentioned for step S20, by processing the variation information of the object data into change-pattern information and processing the RGB values of the pixels into color information for the plurality of image data.

The information produced by steps S20 and S23 may be the variation information of the object data described above, organized into patterns. For example, for a particular playback section, the image analysis data may indicate a video of several people in bright clothing running across a meadow. In the object data representing the people and the meadow, the swaying of the grass may be detected as a change in an object, and the object data recognized as people may be detected as moving in a fast-running pattern. The people, the grass of the meadow, and the sunlight may also be detected as color information.

It goes without saying that, besides the approach described with reference to FIG. 2, various other ways of analyzing the video may be used in step S20.

Returning to FIG. 1, when step S20 is complete, the apparatus performs a step (S30) of comparing the attribute information of the first data, stored in an area of the first data as described above, and the image analysis data extracted and generated in step S20 with the audio data pre-stored in the apparatus's library, and matching at least one audio data item to all or part of the playback time of the first data.

In the present invention, audio data means audio data submitted by an external music producer through a specific process. That is, excluding other audio data for which copyright could be an issue, it means audio data created by a producer who is registered with the apparatus as an audio data producer and submitted from that producer's terminal. Other external audio data may of course be used to the extent that copyright issues are resolved.

Step S30 is the sequence of operations in which the apparatus automatically selects a particular playback section and automatically selects the audio data to combine with that section. As described above, in this process the apparatus selects the playback section and the audio data using the attribute information of the first data, which is information about the music the user prefers, or the image analysis data produced in step S20.

In other words, step S30 is the process in which the apparatus automatically selects audio data that matches the pattern/tempo information (including the variation information derived from analyzing the video), the color information, and the preference information the user entered for the first data, and automatically matches that audio data to a playback section.

In the present invention, audio data may be combined or stored with the information that serves as the basis for matching it to all or part of the first data. For example, pre-stored audio data, being data received from an audio data producer's terminal, may include at least first identification information indicating whether it is background sound data or sound effect data, second identification information indicating its genre, and third identification information indicating its tempo. For instance, an audio data item may be set as background sound data with the genre dance and a medium tempo. The first to third identification information may be received together with the audio data from the producer's terminal, or may be generated automatically by an audio recognition program and stored combined with the audio data.
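The three kinds of identification information lend themselves to a small record type and a lookup over the library. The following is a sketch under assumptions: `SoundKind`, `AudioData`, and `find_audio` are hypothetical names, genre and tempo are stored as plain strings, and the library is an in-memory list.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class SoundKind(Enum):
    """First identification information: background sound or sound effect."""
    BACKGROUND = "background"
    EFFECT = "effect"


@dataclass(frozen=True)
class AudioData:
    title: str
    kind: SoundKind   # first identification information
    genre: str        # second identification information
    tempo: str        # third identification information


def find_audio(
    library: List[AudioData],
    kind: Optional[SoundKind] = None,
    genre: Optional[str] = None,
    tempo: Optional[str] = None,
) -> List[AudioData]:
    """Return library entries matching every criterion that is not None."""
    return [
        audio for audio in library
        if (kind is None or audio.kind is kind)
        and (genre is None or audio.genre == genre)
        and (tempo is None or audio.tempo == tempo)
    ]
```

Leaving a criterion as `None` means it is not used for filtering, which fits the description that matching may rely on the second or third identification information.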

A specific example of step S30 is described with reference to the flowcharts shown in FIGS. 3 to 5.

Referring first to FIG. 3, in matching audio data to the first data, the apparatus performs a step (S31) of computing the mean or mode of the extracted RGB values for the image data in a continuous section, within the playback time of the first data, over which the rate of change of the mean or mode of the extracted RGB values stays below a preset threshold rate of change.

The preset threshold rate of change exists because it is difficult to match the same audio data across image data whose colors change abruptly; it is used to set a section of consecutive image data that stay within the threshold as the playback section to which audio data is matched. For example, the threshold may be a ratio at which the color is considered to have changed completely, set to, say, 25% of the RGB value.
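The sectioning in step S31 can be sketched as a single pass over per-frame mean colors. The specification does not define how the rate of change is measured, so this example assumes it is the summed per-channel difference between consecutive frames divided by the maximum possible difference (3 × 255); the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]  # mean (R, G, B) of one frame


def mean_rgb(colors: Sequence[Color]) -> Color:
    """Channel-wise mean of a non-empty sequence of RGB values."""
    count = len(colors)
    return (
        sum(c[0] for c in colors) / count,
        sum(c[1] for c in colors) / count,
        sum(c[2] for c in colors) / count,
    )


def change_rate(a: Color, b: Color) -> float:
    """Assumed measure: summed channel difference as a fraction of 3 * 255."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (3 * 255)


def stable_sections(frame_means: Sequence[Color], threshold: float = 0.25) -> List[Tuple[int, int]]:
    """Split frames into (first_index, last_index) runs with change below threshold."""
    sections: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(frame_means)):
        if change_rate(frame_means[i - 1], frame_means[i]) >= threshold:
            sections.append((start, i - 1))   # close the current run
            start = i                         # an abrupt change starts a new run
    if frame_means:
        sections.append((start, len(frame_means) - 1))
    return sections
```

Each returned section is a candidate playback section, and `mean_rgb` over its frames gives the representative color used in the next step. Using the mode instead of the mean, which the text also allows, would only change how the per-frame and per-section colors are computed.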

Once step S31 has set the playback section to which audio data will be matched, the apparatus may perform a step (S32) of matching audio data to the first data so that audio data whose second or third identification information matches the color identification range containing the mean or mode of the RGB values is played during the playback time of the image data in that continuous section.

Matching information between the first to third identification information of the audio data and the above-described color identification intervals may be set in advance. For example, when the color falls in the red interval, all audio data of the dance genre with a fast tempo may be matched to it. Naturally, other implementations may be applied with various modifications.
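
The preset matching between color identification intervals and identification information could be held in a simple lookup table, as in the following sketch. Only the pairing of the red interval with the dance genre and a fast tempo comes from the description. The hue boundaries, the blue entry, the dictionary-based audio records, and the function names are all assumptions made for illustration.

```python
import colorsys

# Hypothetical hue boundaries in degrees; the description does not define them.
COLOR_INTERVALS = [
    ("red", 0, 30), ("yellow", 30, 90), ("green", 90, 150),
    ("cyan", 150, 210), ("blue", 210, 270), ("magenta", 270, 330),
    ("red", 330, 360),
]

# Only red -> (dance, fast) is taken from the description; blue is a placeholder.
COLOR_TO_TAGS = {"red": ("dance", "fast"), "blue": ("ballad", "slow")}


def color_interval(rgb):
    """Name of the color identification interval that an RGB value falls in."""
    r, g, b = (c / 255.0 for c in rgb)
    hue = colorsys.rgb_to_hsv(r, g, b)[0] * 360.0
    for name, low, high in COLOR_INTERVALS:
        if low <= hue < high:
            return name
    raise ValueError("hue out of range")


def match_by_color(rgb, library):
    """Audio records whose genre and tempo tags are preset for the color interval."""
    tags = COLOR_TO_TAGS.get(color_interval(rgb))
    if tags is None:
        return []
    genre, tempo = tags
    return [clip for clip in library
            if clip["genre"] == genre and clip["tempo"] == tempo]


library = [
    {"id": "a", "genre": "dance", "tempo": "fast"},
    {"id": "b", "genre": "ballad", "tempo": "slow"},
]
red_matches = match_by_color((205, 10, 10), library)
```

This sketch requires both the genre and the tempo to agree, following the red-interval example; an implementation that matches on either tag alone would be equally consistent with the wording of step S32.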

In the present invention, the first identification information of the audio data is used to decide whether background sound data or sound effect data is matched to the first data, according to the length of the automatically identified playback section of the plurality of image data. In the example of FIG. 3, this is the length of the continuous section in which the rate of change of the average or mode of the extracted RGB values is below the preset threshold rate of change.

For example, when the length of the section identified by the above process exceeds a preset length or ratio (for example, a length corresponding to 70% of the entire playback section), audio data that matches the first data according to the second and third identification information may be selected from among the background sound data. When the length of the section is less than a preset length or ratio (for example, a length exceeding 5% of the entire playback section), audio data that matches the first data according to the second and third identification information may be selected from among the sound effect data. Of course, the preset length or ratio may be varied according to various embodiments of the present invention.
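
The length-based choice between background sound and sound effect can be sketched as follows. The 70% and 5% figures are the examples given above. The function name `choose_sound_kind` is hypothetical, and returning `None` for lengths between the two thresholds is an assumption, since the description does not say what happens in that range.

```python
from typing import Optional


def choose_sound_kind(section_length: float,
                      total_length: float,
                      background_ratio: float = 0.70,
                      effect_ratio: float = 0.05) -> Optional[str]:
    """Decide which kind of audio data (first identification information)
    to search for a section of the given length."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    ratio = section_length / total_length
    if ratio > background_ratio:
        return "background"
    if ratio < effect_ratio:
        return "effect"
    # Not covered by the description; left undecided in this sketch.
    return None
```

For a 100-second video, an 80-second section would be assigned background sound and a 3-second section a sound effect under these defaults.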

In addition, an embodiment may be used in which the playback section is selected directly by the user terminal and the first identification information is applied according to that playback section, under criteria similar to the above example, so that background sound data or sound effect data is applied to the first data. In this case the playback section is not selected automatically. Instead, the plurality of image data included in the playback section selected by the user terminal is automatically analyzed according to the criteria described above or below, and audio data is automatically matched on that basis.

Referring to FIG. 4, the apparatus performs a step (S33) of calculating, for the plurality of image data included in the playback time period of the first data, the size variation information of the object data included in the variation information of the object data, or the movement information resulting from movement of the object data.

Here, the playback section may be selected automatically, as in FIG. 3. For example, based on the size variation information of the object data, the apparatus may automatically select as the playback section the span from the point at which a change begins, such as the object data growing or shrinking, to the point at which the size of the object data stops changing. The apparatus may also calculate the movement information, analyze the pattern in which the object data moves, and select the time during which that pattern is maintained as the playback section. Of course, in addition to these examples, various embodiments that select the playback section according to the size and movement of the object data may be applied.
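
Selecting the playback section from the point where the object size starts to change to the point where it stops could look like the sketch below. It assumes the object size has already been measured once per frame (for example as a pixel area); the function name `size_change_section` and the `tolerance` parameter are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def size_change_section(sizes: Sequence[float],
                        tolerance: float = 0.0) -> Optional[Tuple[int, int]]:
    """Return (first, last) frame indices of the first run of frames in which
    the object size keeps changing, or None if the size never changes."""
    start = None
    for i in range(1, len(sizes)):
        changing = abs(sizes[i] - sizes[i - 1]) > tolerance
        if changing and start is None:
            start = i - 1            # last frame before the change begins
        elif not changing and start is not None:
            return (start, i - 1)    # first frame at the settled size
    if start is not None:
        return (start, len(sizes) - 1)
    return None


# The object grows between frame 1 and frame 4, then holds its size.
section = size_change_section([100, 100, 110, 125, 140, 140, 140])
```

A non-zero `tolerance` would let small measurement noise be ignored, which is a design choice outside what the description specifies.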

Once step S33 has been performed, the apparatus performs a step (S34) of matching audio data to the first data such that audio data having second identification information or third identification information matched to at least one of the size variation speed of the object data, the size variation ratio of the object data, and the movement speed of the object data, each being a value calculated from the size variation information or the movement information of the object data, is played during the playback time of the plurality of image data within the time section for which the size variation information or the movement information was calculated.
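
The three quantities named in step S34 can be computed from per-frame measurements, as in this sketch. The description does not give formulas, so the definitions here are assumptions: the size variation ratio is the final size divided by the initial size, the size variation speed is the size difference per second, and the movement speed is the path length of the object center per second. The function name `motion_metrics` and its parameters are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def motion_metrics(sizes: Sequence[float],
                   centers: Sequence[Tuple[float, float]],
                   fps: float) -> Dict[str, float]:
    """Size variation speed, size variation ratio, and movement speed
    over one playback section (assumed definitions)."""
    if len(sizes) < 2 or len(sizes) != len(centers):
        raise ValueError("need at least two frames with matching lengths")
    if sizes[0] == 0 or fps <= 0:
        raise ValueError("initial size and fps must be non-zero and positive")
    duration = (len(sizes) - 1) / fps
    # Total distance travelled by the object center across the section.
    path = sum(math.dist(a, b) for a, b in zip(centers, centers[1:]))
    return {
        "size_speed": (sizes[-1] - sizes[0]) / duration,
        "size_ratio": sizes[-1] / sizes[0],
        "move_speed": path / duration,
    }


metrics = motion_metrics([100, 90, 80], [(0, 0), (3, 4), (6, 8)], fps=2)
```

The resulting values would then be compared against preset ranges to pick a genre or tempo, in the same way that color intervals are used in FIG. 3.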

For example, when the object data slowly shrinks, all audio data having the ballad genre and a slow tempo as the second and third identification information may be matched to the first data in the corresponding playback section. Alternatively, when the movement pattern of the object data is identified as slow walking, audio data having the ballad genre and a slow tempo as the second and third identification information may likewise be selected as background sound or sound effect according to the length of the playback section, as described above, and matched to the first data. Of course, various other examples similar to those described above may be applied.

Returning to the description of FIG. 1, the above-described attribute information of the first data may be reflected in all of the embodiments mentioned in the description of FIGS. 3 and 4.

As described above, the attribute information of the first data may include music preference information input from the user terminal when the first data is generated. Also as described above, at least one piece of audio data, for example a plurality, may be matched to the first data.

In this case, in step S30, the apparatus may perform control such that, among the audio data matched to the first data, the audio data corresponding to the music preference information included in the attribute information of the first data is finally matched to the first data.

For example, suppose that audio data A (ballad genre, medium tempo) and audio data B (dance genre, medium tempo) are both matched to the same playback section of one piece of first data, and the attribute information of the first data includes the dance genre as preference information. Audio data B is then finally matched to that playback section of the first data.
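
The preference-based final matching in this example can be sketched as a filter over the candidate list. The record layout and the function name `final_match` are hypothetical, and falling back to the first candidate when nothing matches the preference is an assumption that the description does not address.

```python
from typing import Dict, List, Optional


def final_match(candidates: List[Dict[str, str]],
                preferred_genre: str) -> Optional[Dict[str, str]]:
    """Pick the candidate whose genre matches the user's music preference."""
    preferred = [c for c in candidates if c["genre"] == preferred_genre]
    if preferred:
        return preferred[0]
    # Assumed fallback: keep the first automatically matched candidate.
    return candidates[0] if candidates else None


candidates = [
    {"id": "A", "genre": "ballad", "tempo": "medium"},
    {"id": "B", "genre": "dance", "tempo": "medium"},
]
chosen = final_match(candidates, preferred_genre="dance")
```

With the two candidates from the example and a dance preference, the sketch selects audio data B.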

Meanwhile, when a plurality of audio data are matched to the same first data as described above, a process such as the embodiment of FIG. 5 may be performed.

Referring to FIG. 5, in performing step S30 the apparatus selects at least one piece of matching audio data and performs a step (S35) of transmitting the selected audio data to the user terminal.

Step S35 refers, for example, to a process of presenting the plurality of audio data matched to the playback section on a screen of the interface that is output on the user terminal for using the above-described process, so that the user can review them.

Once step S35 has been performed, the apparatus performs a step (S36) of receiving from the user terminal a selection input for one of the audio data transmitted in step S35, and a step (S37) of finally matching the audio data corresponding to the selection input to the first data.

In short, the embodiment of FIG. 5 is a process of recommending audio data to the user on the basis of automatic matching, and then finally matching audio data to the first data on the basis of the user's selection among the recommended audio data. Given this purpose, the embodiment of FIG. 5 may be performed in combination with the embodiments of FIGS. 1 to 4.

That is, when a plurality of audio data are selected, the apparatus may select certain audio data so that the user's music preference information is reflected, transmit them to the user terminal, and then finally match the audio data selected at the user terminal to the first data.

Returning again to the description of FIG. 1, once the above process is complete, the apparatus performs a step (S40) of combining the first data with the audio data matched to it, thereby generating second data in which the video data and the audio data are finally combined, and transmitting the second data to the user terminal. This completes the automatic editing process for video data and audio data.

According to the above-described process of the present invention, the user can use an editing process in which one of the audio data recommended by the apparatus is selected and combined with the video, without having to check each piece of audio data individually.

In addition, because the apparatus can automatically select the section to which audio data is combined, convenience is provided to the user.

FIGS. 6 to 9 are examples of screens output on the user terminal according to implementations of the embodiments of the present invention. In the following description, parts that overlap with the description of FIGS. 1 to 5 are omitted.

Through the screen 100 of FIG. 6, users can input the video data to be edited, that is, the first data. An editing interface such as those of FIGS. 7 to 9 may then be output on the user terminal.

Referring to the screen 110 of FIG. 7, video playback can be checked on the playback screen 111, and the plurality of image data making up the video can be checked on the frame screen 112. On the analysis screen 113, the above-described image analysis data for the plurality of image data making up the video can be checked as graphs and text.

Referring to the screen 120 of FIG. 8, the playback section information 122 can be checked in addition to the screen 121 on which the video is played. Here, the playback section T1 to which audio data is matched may be set automatically, or the user may select the playback section T1 as described above.

On the screen 120 of FIG. 8, users can check the list 123 of sound effect data, that is, audio data, automatically matched to the playback section T1, and can preview each piece of audio data through the play menu 124. They can then select one piece of audio data through the selection menu 125 to finally match it to the playback section T1 as a sound effect.

On the screen 130 of FIG. 9, users can check the playback section information 132 in addition to the screen 121 on which the video is played, check the list 133 of background sound data, that is, audio data, automatically matched to be played over, for example, the entire section, and preview each piece of audio data through the play menu 134. They can then select one piece of audio data through the selection menu 135 to finally match it to the entire section as background sound.

In one or more exemplary implementations, the functions described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code means in the form of instructions or data structures and that can be accessed by a general-purpose computer, a special-purpose computer, a general-purpose processor, or a special-purpose processor.

In addition, any connection may be regarded as a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, DVD, floppy disk, and Blu-ray disc, where disks reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above may also be included within the scope of computer-readable media.

Those of ordinary skill in the art will appreciate that the various illustrative elements, components, logical blocks, modules, and algorithm steps described above may be implemented as electronic hardware, computer software, or combinations of both. To make this interchangeability of hardware and software clear, various illustrative components, blocks, modules, and steps have been described in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions do not depart from the scope of the present invention.

The various illustrative logical blocks and modules described herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

For a hardware implementation, the various illustrative logics, logic blocks, and modules of the processing units described in connection with the aspects disclosed herein may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), discrete gate or transistor logic, discrete hardware components, general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof. A general-purpose processor may be a microprocessor, but in the alternative it may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other suitable configuration). Additionally, at least one processor may include one or more modules operable to perform one or more of the steps and/or operations described herein.

Furthermore, the various aspects or features described herein may be implemented as a method, apparatus, or article of manufacture using standard programming and/or engineering techniques. The steps and/or operations of a method or algorithm described in connection with the aspects disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. Additionally, in some aspects, the steps or operations of a method or algorithm may reside as at least one or any combination or set of codes or instructions on a machine-readable medium or computer-readable medium, which may be incorporated into a computer program product. The term article of manufacture as used herein is intended to encompass a computer program accessible from any suitable computer-readable device or medium.

The description of the presented embodiments is provided to enable any person of ordinary skill in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those of ordinary skill in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the present invention. Thus, the present invention is not limited to the embodiments presented herein but is to be accorded the widest scope consistent with the principles and novel features presented herein.

Claims

A method for automatically editing moving picture data and audio data, performed in a computing device including at least one processor and a main memory storing instructions executable by the processor, the method comprising:
receiving, from a user terminal, first data that is moving picture data to be edited, the first data including a plurality of image data arranged in order of playback time and being played back through the plurality of image data;
extracting image analysis data including color information of the plurality of image data included in the received first data and information on a variation pattern of the plurality of image data as the moving picture is played;
a matching step of comparing the attribute information of the first data and the image analysis data stored in one area of the first data with pre-stored audio data, and matching at least one piece of audio data to all or part of the playback time period of the first data; and
combining the first data with the audio data matched to the first data to generate second data in which the moving picture data and the audio data are combined, and providing the second data to the user terminal.

The method according to claim 1,
wherein the extracting comprises:
extracting object data included in the plurality of image data and RGB values of each pixel, based on the combination form of the pixels included in the plurality of image data included in the first data;
calculating variation information of the object data included in the plurality of image data using the playback order of the plurality of image data; and
processing the variation information of the object data into the information on the variation pattern, and processing the color information of the plurality of image data using the RGB values of the pixels, thereby extracting the image analysis data.

3. The method of claim 2,
wherein the pre-stored audio data includes first identification information indicating whether the data is background sound data or sound effect data, second identification information indicating the genre of the audio data, and third identification information indicating the tempo of the audio data.

The method of claim 3,
wherein the matching step comprises:
calculating the average or mode of the RGB values extracted for a plurality of image data in a continuous section, among the plurality of image data included in the playback time period of the first data, in which the rate of change of the average or mode of the extracted RGB values is below a preset threshold rate of change; and
matching the audio data to the first data such that audio data having the second identification information or the third identification information matched to the color identification interval to which the average or mode of the RGB values belongs is played during the playback time of the plurality of image data in the continuous section.

The method of claim 3,
wherein the matching step comprises:
calculating, for the plurality of image data included in the playback time period of the first data, size variation information of the object data included in the variation information of the object data, or movement information resulting from movement of the object data; and
matching the audio data to the first data such that audio data having second identification information or third identification information matched to at least one of a size variation speed of the object data, a size variation ratio of the object data, and a movement speed of the object data, each being a value calculated using the size variation information or the movement information of the object data, is played during the playback time of the plurality of image data within the time section for which the size variation information or the movement information of the object data was calculated.

The method of claim 3,
wherein the matching step comprises:
matching background sound data or sound effect data to the first data by using the first identification information of the audio data, according to the length of the playback section of the plurality of image data included in the extracted image analysis data.

7. The method according to any one of claims 3 to 6,
wherein the attribute information of the first data is music preference information input from the user terminal when the first data is generated, and at least one piece of audio data is matched to the first data,
and wherein the matching step comprises:
finally matching to the first data, among the audio data matched to the first data, the audio data corresponding to the music preference information included in the attribute information of the first data.

The method according to claim 1,
wherein the matching step comprises:
selecting at least one piece of matching audio data and transmitting the selected audio data to the user terminal;
receiving, from the user terminal, a selection input for one of the transmitted audio data; and
finally matching the audio data corresponding to the selection input of the user terminal to the first data.

A computer-readable medium,
the computer-readable medium storing instructions that cause a computer to perform the following steps:
receiving, from a user terminal, first data that is moving picture data to be edited, the first data including a plurality of image data arranged in order of playback time and being played back through the plurality of image data;
extracting image analysis data including color information of the plurality of image data included in the received first data and information on a variation pattern of the plurality of image data as the moving picture is played;
a matching step of comparing the attribute information of the first data and the image analysis data stored in one area of the first data with pre-stored audio data, and matching at least one piece of audio data to all or part of the playback time period of the first data; and
combining the first data with the audio data matched to the first data to generate second data in which the moving picture data and the audio data are combined, and providing the generated second data to the user terminal.