KR20110121830A

KR20110121830A - Apparatus and method for automatically producing music video in mobile communication terminal

Info

Publication number: KR20110121830A
Application number: KR1020100041306A
Authority: KR
Inventors: 유민주; 윤제한; 김현수
Original assignee: 삼성전자주식회사
Priority date: 2010-05-03
Filing date: 2010-05-03
Publication date: 2011-11-09

Abstract

PURPOSE: An apparatus and a method for automatically producing a music video in a mobile communication terminal are provided. CONSTITUTION: A sub-section music mood determiner (108) divides the entire music section of a selected music file into one or more sub-sections, each containing music mood values that lie within an allowable error range. An image mood determiner (116) determines an image mood value for each of the one or more selected image files.

Description

Apparatus and method for automatically generating music video on a mobile terminal {APPARATUS AND METHOD FOR AUTOMATICALLY PRODUCING MUSIC VIDEO IN MOBILE COMMUNICATION TERMINAL}

The present invention relates to an apparatus and method for automatically generating a music video in a mobile communication terminal, and more particularly, to an apparatus and method for automatically generating a music video by analyzing music and still images in a mobile communication terminal.

As participatory content spreads and music playback devices with screen displays become common, music is increasingly presented visually. Interest in and demand for User Created Contents (UCC), multimedia content produced directly by users, are also growing.

Conventional music video generation assumes either that the user directly selects and arranges still images and videos according to the mood of the music, or that the terminal itself arranges still images and videos at random, regardless of the mood of the music. Arrangement by the user has the advantage of producing a music video that matches the user's intention. However, when there are many images, the user's workload grows, which is inconvenient. In particular, on a device with a small screen and limited input means, such as a mobile communication terminal, it is difficult to find and arrange images that suit the mood from a large image album.

An object of the present invention is to provide an apparatus and method for automatically generating a music video in a mobile communication terminal.

Another object of the present invention is to provide an apparatus and method for automatically generating a music video by analyzing music and still images in a mobile communication terminal.

Still another object of the present invention is to provide an apparatus and method for generating a music video by automatically matching images that suit the mood of the music in a mobile communication terminal.

According to a first aspect of the present invention for achieving the above objects, a method for generating a music video in a mobile communication terminal includes: receiving a selection of a music file and one or more image files; determining a music mood value for each segment of the selected music file; dividing, based on the determined per-segment music mood values, the entire music section of the selected music file into one or more sub-sections, each containing music mood values within an allowable error range; determining an image mood value for each of the selected one or more image files; and, for each sub-section of the music file, selecting and matching, from among the one or more image files, an image file whose image mood value corresponds to the music mood value of that sub-section.

According to a second aspect of the present invention, an apparatus for generating a music video in a mobile communication terminal includes: an input unit that receives a selection of a music file and one or more image files; a PCM data analyzer that determines a music mood value for each segment of the selected music file; a sub-section music mood determiner that divides, based on the determined per-segment music mood values, the entire music section of the selected music file into one or more sub-sections, each containing music mood values within an allowable error range; an image mood determiner that determines an image mood value for each of the selected one or more image files; and a music/image matcher that, for each sub-section of the music file, selects and matches, from among the one or more image files, an image file whose image mood value corresponds to the music mood value of that sub-section.

The present invention generates a music video in a mobile communication terminal by analyzing music and still images and automatically matching images that suit the mood of the music. This removes the burden of producing a music video manually and makes it possible to automatically generate a music video that fits the mood of each piece of music.

FIG. 1 is a block diagram illustrating the configuration of a mobile communication terminal according to the present invention;
FIG. 2 is a flowchart illustrating a method for generating a music video in a mobile communication terminal according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram illustrating how a music file is divided into sections and how the music mood of each sub-section is determined in a mobile communication terminal according to an embodiment of the present invention; and
FIG. 4 is an exemplary diagram illustrating how an image file is selected for each sub-section of a music file in a mobile communication terminal according to an embodiment of the present invention.

Hereinafter, the operating principle of the present invention is described in detail with reference to the accompanying drawings. In the following description, detailed explanations of related well-known functions or configurations are omitted where they would unnecessarily obscure the subject matter of the present invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intentions or customs of users or operators. Their definitions should therefore be based on the content of this specification as a whole.

The present invention proposes a scheme for generating a music video in a mobile communication terminal by analyzing music and still images and automatically matching images that suit the mood of the music.

In the following description, the mood of music is classified into four moods, namely Dynamic, Static, Hard, and Soft, by way of example; the present invention is of course not limited to this classification.

FIG. 1 is a block diagram illustrating the configuration of a mobile communication terminal according to the present invention.

As shown, the terminal includes a controller 100, an audio decoder 102, a MIDI data analyzer 104, a PCM data analyzer 106, a sub-section music mood determiner 108, an image decoder 110, a color histogram analyzer 112, an image frequency analyzer 114, an image mood determiner 116, a music/image matcher 118, a transition image generator 120, a video decoder 122, a storage unit 124, an input unit 126, and a display unit 128.

Referring to FIG. 1, the controller 100 performs control and processing for the overall operation of the terminal. In particular, according to the present invention, it handles the function of generating a music video by analyzing music and still images and automatically matching images that suit the mood of the music. To this end, the controller 100 receives, from the user through the input unit 126, a selection of a music file and one or more image files for generating a music video, retrieves the selected music file and image files from the storage unit 124, provides the retrieved music file to the audio decoder 102, and provides the retrieved image files to the image decoder 110.

The audio decoder 102 decodes the music file from the controller 100 into music data. If the converted music data is MIDI data, the audio decoder 102 provides it to the MIDI data analyzer 104; if the converted music data is PCM data, the audio decoder 102 provides it to the PCM data analyzer 106.

The MIDI data analyzer 104 extracts, from the MIDI data supplied by the audio decoder 102, the information used to determine music mood values, such as the tempo of the music (that is, beats per minute (BPM)) and the code number, pitch, and intensity of each note. Based on the extracted information, it determines a music mood value for each beat of the music file and provides these values to the sub-section music mood determiner 108.

The PCM data analyzer 106 extracts, segment by segment, from the PCM data supplied by the audio decoder 102, the information used to determine music mood values, such as timbre features and tempo features. Based on the extracted per-segment timbre and tempo features, it determines a music mood value for each segment of the music file and provides these values to the sub-section music mood determiner 108.

The sub-section music mood determiner 108 divides the entire section of the music file into sub-sections containing similar music mood values, based on either the per-beat music mood values from the MIDI data analyzer 104 or the per-segment music mood values from the PCM data analyzer 106. In this way the sub-section music mood determiner 108 determines a music mood value for each sub-section over the entire music file, and it provides the resulting per-sub-section music mood values to the music/image matcher 118.

The image decoder 110 decodes an image file from the controller 100 and outputs image data.

The color histogram analyzer 112 converts the decoded image data from the image decoder 110 into the HSV (Hue-Saturation-Value) color space, generates an HSV color histogram for all or part of the converted image data, determines an image mood estimate for the image data based on the generated HSV color histogram, and provides the estimate to the image mood determiner 116.

The image frequency analyzer 114 performs frequency analysis (for example, edge distribution, Discrete Cosine Transform (DCT), wavelet transform, or Gabor filtering) on the decoded image data from the image decoder 110 to determine properties such as the complexity of the image and the number of repeating patterns. Based on these properties, it determines an image mood estimate for the image data and provides the estimate to the image mood determiner 116.

The image mood determiner 116 determines a final image mood value for each item of image data based on the image mood estimates from the color histogram analyzer 112 and the image frequency analyzer 114, and provides the per-image mood values to the music/image matcher 118.

The music/image matcher 118 uses the per-sub-section music mood values from the sub-section music mood determiner 108 and the per-image mood values from the image mood determiner 116. For each sub-section of the music file, it selects, from among the selected image files, an image file whose image mood value corresponds to the music mood value of that sub-section, and provides the selection to the transition image generator 120. The music/image matcher 118 can thus automatically match, to each sub-section of the music file, the image whose mood is most similar to the mood of that sub-section.
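The matching step can be sketched as a nearest-neighbor search. This is a minimal illustration, not the patented implementation: it assumes that image mood values are expressed as points in the same two-dimensional (dynamic_static, hard_soft) space as music mood values and that "corresponds to" means smallest Euclidean distance. The function name `match_images` and the sample coordinates are hypothetical.

```python
import math

def match_images(section_moods, image_moods):
    """Return, for each sub-section, the index of the closest image.

    Both arguments are lists of (dynamic_static, hard_soft) pairs.
    Euclidean distance is an assumption; the source only says that the
    selected image's mood value "corresponds to" the sub-section's value.
    """
    matches = []
    for sx, sy in section_moods:
        best = min(
            range(len(image_moods)),
            key=lambda i: math.hypot(image_moods[i][0] - sx,
                                     image_moods[i][1] - sy),
        )
        matches.append(best)
    return matches

# Hypothetical mood values for two sub-sections and three images.
section_moods = [(0.8, 0.6), (-0.5, -0.7)]
image_moods = [(-0.4, -0.6), (0.9, 0.5), (0.0, 0.0)]
print(match_images(section_moods, image_moods))  # -> [1, 0]
```

Nothing in this sketch prevents the same image from being chosen for several sub-sections; whether reuse is acceptable is a design decision the text above leaves open.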

The transition image generator 120 takes the image files selected for each sub-section of the music file by the music/image matcher 118, generates a transition image to be inserted between each pair of consecutive selected image files, and provides the result to the video decoder 122.

The video decoder 122 receives, from the transition image generator 120, the music file, the image file for each sub-section of the music file, and the transition images to be inserted between consecutive image files. It encodes them into a single video file and stores the generated video file in the storage unit 124 as the music video for the selected music file.

The storage unit 124 stores the microcode of programs for the processing and control performed by the controller 100, various reference data, and temporary data generated while programs run. In particular, according to the present invention, the storage unit 124 stores a program for generating a music video by analyzing music and still images and automatically matching images that suit the mood of the music. The storage unit 124 also stores and manages music files, image files, and the video files (that is, music videos) generated from them.

The input unit 126 includes a plurality of numeric keys and function keys, and provides key input data corresponding to the key pressed by the user to the controller 100.

The display unit 128 displays status information generated during operation of the terminal, a limited number of characters, and large amounts of video and still images. The display unit 128 may be a color liquid crystal display (LCD).

FIG. 2 is a flowchart illustrating a method for generating a music video in a mobile communication terminal according to an embodiment of the present invention.

Referring to FIG. 2, in step 201 the terminal checks whether a music file and one or more image files for generating a music video have been selected. The one or more image files may be all or some of the images in the terminal's image album, or all images in certain folders of the image album.

When the selection of a music file and one or more image files for generating a music video is detected in step 201, the terminal checks in step 203 whether the selected music file is a MIDI file.

If the selected music file is determined to be a MIDI file in step 203, the terminal decodes the MIDI file into MIDI data in step 205 and extracts from the MIDI data the information used to determine music mood values, such as the tempo of the music (that is, beats per minute (BPM)) and the code number, pitch, and intensity of each note.

Then, in step 207, the terminal determines a music mood value for each beat of the music file based on the extracted information, such as the tempo (BPM) and the code number, pitch, and intensity of each note, and proceeds to step 213. Assuming, for example, that the mood of music is classified into the four moods Dynamic, Static, Hard, and Soft, each per-beat music mood value can be expressed as a coordinate on a dynamic_static/hard_soft graph (for example, the right-hand diagram of FIG. 3), that is, as a pair (dynamic_static value, hard_soft value).

The terminal determines the per-beat music mood value as follows. In one embodiment, the terminal holds a table that defines a dynamic_static value and a hard_soft value for each item of information, such as the tempo (BPM), note code number, pitch, and intensity. The terminal refers to this table to look up the dynamic_static value and the hard_soft value corresponding to each extracted item. In another embodiment, the terminal may run a learning-based algorithm to obtain the dynamic_static value and the hard_soft value corresponding to each extracted item. The terminal then computes, for each beat, the average of the dynamic_static values and the average of the hard_soft values obtained for the individual items, and uses these two averages as the beat's coordinate on the dynamic_static/hard_soft graph, that is, as (dynamic_static value, hard_soft value).
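The table-based embodiment can be illustrated with a short sketch. The lookup functions and every threshold and value in them are hypothetical placeholders, since the text does not disclose the table contents; pitch and intensity are assumed here to be a MIDI note number and a MIDI velocity, and the code-number item is omitted for brevity. Only the final averaging step follows the description directly.

```python
def lookup_bpm(bpm):
    # Hypothetical table entry: (dynamic_static, hard_soft) by tempo range.
    if bpm >= 120:
        return (1.0, 0.5)
    if bpm >= 80:
        return (0.0, 0.0)
    return (-1.0, -0.5)

def lookup_pitch(note):
    # Hypothetical table keyed on MIDI note number (60 is middle C).
    return (0.5, -0.5) if note >= 60 else (-0.5, 0.5)

def lookup_intensity(velocity):
    # Hypothetical table keyed on MIDI velocity (0 to 127).
    return (0.5, 1.0) if velocity >= 96 else (-0.5, -1.0)

def beat_mood(bpm, note, velocity):
    """Average the per-item values into one (dynamic_static, hard_soft) pair."""
    values = [lookup_bpm(bpm), lookup_pitch(note), lookup_intensity(velocity)]
    dynamic_static = sum(v[0] for v in values) / len(values)
    hard_soft = sum(v[1] for v in values) / len(values)
    return (dynamic_static, hard_soft)

# A fast, high, loud beat lands in the dynamic/hard quadrant.
print(beat_mood(bpm=140, note=72, velocity=110))
```

A learning-based variant would replace the three lookup functions with a trained model while keeping the same output shape.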

If, on the other hand, the selected music file is determined not to be a MIDI file in step 203, the terminal treats it as a PCM file. In step 209 it decodes the PCM file into PCM data and extracts from the PCM data, segment by segment, the information used to determine music mood values, such as timbre features and tempo features. For example, the terminal may extract Modified Discrete Cosine Transform (MDCT) coefficients from the PCM data for each segment and derive timbre features from the extracted MDCT coefficients. Representative timbre features include the spectral centroid, bandwidth, rolloff, and flux, and the spectral sub-band peak, valley, and average. The terminal may also extract MDCT coefficients from the PCM data for each segment, apply a Discrete Fourier Transform (DFT) to the extracted MDCT coefficients to obtain an MDCT modulation spectrum, and use the energy extracted from the MDCT modulation spectrum as a tempo feature.
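Three of the timbre features named above can be sketched from one segment's spectral magnitudes. This is an illustration under stated assumptions: the input is taken to be a list of non-negative magnitudes per frequency bin (for instance, absolute MDCT coefficients), features are reported in bin units rather than Hz, and the 0.85 rolloff fraction is a common convention, not a value from the text. Exact feature definitions vary between implementations.

```python
import math

def timbre_features(magnitudes, rolloff_fraction=0.85):
    """Return spectral centroid, bandwidth, and rolloff bin for one segment.

    magnitudes: non-negative spectral magnitudes, one per frequency bin.
    """
    total = sum(magnitudes)
    if total == 0:
        # Silent segment: no spectral shape to describe.
        return {"centroid": 0.0, "bandwidth": 0.0, "rolloff": 0}
    # Centroid: magnitude-weighted mean bin index.
    centroid = sum(k * m for k, m in enumerate(magnitudes)) / total
    # Bandwidth: magnitude-weighted standard deviation around the centroid.
    variance = sum((k - centroid) ** 2 * m
                   for k, m in enumerate(magnitudes)) / total
    # Rolloff: first bin at which the cumulative magnitude reaches the
    # chosen fraction of the total.
    threshold = rolloff_fraction * total
    cumulative = 0.0
    rolloff = len(magnitudes) - 1
    for k, m in enumerate(magnitudes):
        cumulative += m
        if cumulative >= threshold:
            rolloff = k
            break
    return {"centroid": centroid,
            "bandwidth": math.sqrt(variance),
            "rolloff": rolloff}

print(timbre_features([0, 0, 4, 0, 0]))
```

Spectral flux would additionally compare the magnitudes of two consecutive segments, so it needs state that this single-segment function does not keep.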

Then, in step 211, the terminal determines a music mood value for each segment of the music file based on the extracted per-segment timbre and tempo features, and proceeds to step 213. Assuming, for example, that the mood of music is classified into the four moods Dynamic, Static, Hard, and Soft, each per-segment music mood value can be expressed as a coordinate on the dynamic_static/hard_soft graph, that is, as a pair (dynamic_static value, hard_soft value).

The terminal determines the per-segment music mood value as follows. In one embodiment, the terminal holds a table that defines a dynamic_static value and a hard_soft value for each item of information, such as the timbre features and tempo features. The terminal refers to this table to look up the dynamic_static value and the hard_soft value corresponding to each extracted per-segment item. In another embodiment, the terminal may run a learning-based algorithm to obtain the dynamic_static value and the hard_soft value corresponding to each extracted per-segment item. The terminal then computes, for each segment, the average of the dynamic_static values and the average of the hard_soft values obtained for the individual items, and uses these two averages as the segment's coordinate on the dynamic_static/hard_soft graph, that is, as (dynamic_static value, hard_soft value).

Then, in step 213, the terminal divides the entire section of the music file into sub-sections containing similar music mood values, based on either the per-beat music mood values determined in step 207 or the per-segment music mood values determined in step 211. That is, based on the per-beat or per-segment (dynamic_static value, hard_soft value) pairs on the dynamic_static/hard_soft graph, the terminal places in the same sub-section all pairs that stay within the allowable error range (Th) over a continuous period of time, as shown in FIG. 3. The length of each sub-section may vary between a minimum length (D_min) and a maximum length (D_max). In addition, even if a sub-section contains pairs that fall outside the allowable error range (Th), those pairs are kept in the sub-section as long as the time they occupy is within the allowable time error range (C_min). In this way the terminal determines a music mood value for each sub-section over the entire music file.
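One possible greedy reading of this division step is sketched below. Several choices are assumptions rather than details from the text: lengths D_min, D_max, and C_min are counted in mood points instead of seconds, the first point of a sub-section serves as its reference, distance is Euclidean, and D_min is enforced only by refusing to close a sub-section early (the last sub-section of a file may still be shorter). The function name `split_into_subsections` is hypothetical.

```python
import math

def split_into_subsections(moods, th, d_min, d_max, c_min):
    """Greedily split mood points into (start, end) index ranges, end exclusive.

    moods: list of (dynamic_static, hard_soft) pairs in time order.
    th:    allowable distance from the sub-section's first point.
    d_min, d_max: minimum and maximum sub-section length, in points.
    c_min: longest run of out-of-range points tolerated inside a sub-section.
    """
    sections = []
    n = len(moods)
    start = 0
    while start < n:
        ref = moods[start]
        end = start + 1   # exclusive end of the sub-section accepted so far
        run = 0           # consecutive out-of-range points seen after `end`
        i = start + 1
        while i < n and i - start < d_max:
            dist = math.hypot(moods[i][0] - ref[0], moods[i][1] - ref[1])
            if dist <= th or i - start < d_min:
                # Accept this point together with any short outlier run
                # that preceded it.
                end = i + 1
                run = 0
            else:
                run += 1
                if run > c_min:
                    break  # outliers lasted too long: close the sub-section
            i += 1
        sections.append((start, end))
        start = end
    return sections

# Four calm points followed by four energetic points give two sub-sections.
moods = [(0.0, 0.0)] * 4 + [(1.0, 1.0)] * 4
print(split_into_subsections(moods, th=0.2, d_min=2, d_max=10, c_min=1))
# -> [(0, 4), (4, 8)]
```

A mood value for each resulting sub-section could then be taken as, for example, the mean of the in-range points it contains; the text does not specify this aggregation.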

Then, in step 215, the terminal decodes the selected one or more image files into their respective image data.

Then, in step 217, for each item of converted image data, the terminal converts the image data into the HSV (Hue-Saturation-Value) color space, generates an HSV color histogram for all or part of the converted image data, and determines an image mood estimate based on the generated HSV color histogram. In other words, for each item of image data, the terminal maps the image data to HSV color space coordinates, tabulates the distribution of the colors contained in all or part of the converted image data, and determines the image mood estimate from that distribution.
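The histogram step can be sketched with Python's standard `colorsys` module. This is a simplified illustration: it assumes the decoded image is available as a list of 8-bit (r, g, b) tuples and bins only the hue channel, whereas the text describes an HSV histogram without fixing its dimensionality or bin count. The function name `hue_histogram` and the choice of 12 bins are hypothetical.

```python
import colorsys

def hue_histogram(pixels, bins=12):
    """Count pixels per hue bin after converting RGB to HSV.

    pixels: iterable of (r, g, b) tuples with components in 0..255.
    Note: achromatic pixels (saturation 0) report hue 0 and therefore
    fall into bin 0 in this simplified sketch.
    """
    hist = [0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h is in [0, 1); clamp the index defensively to the last bin.
        index = min(int(h * bins), bins - 1)
        hist[index] += 1
    return hist

# Two pure-red pixels and one blue-ish pixel.
print(hue_histogram([(255, 0, 0), (255, 0, 0), (0, 100, 255)]))
```

To histogram only part of an image, as the text allows, the caller would pass a subset of the pixels, for example a center crop or every n-th pixel.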

Here, the terminal determines the image mood estimate from the HSV color histogram as follows. For example, the terminal may determine one image mood estimate using a representative color value defined for each image mood characteristic, and another using color-scheme color values defined for each image mood characteristic.

First, the method of determining an image mood estimate using the representative color value of each image mood characteristic is as follows. In one embodiment, the terminal determines the color with the highest histogram value in the generated HSV color histogram, that is, the most widely distributed color in the image data. Then, based on a table defining a representative color value for each image mood characteristic, it determines the representative color value that differs least from the determined color value. The terminal can thereby determine the first image mood estimate of each image data, as in Equation 1 below.

[Equation 1]

Here, the left-hand side of Equation 1 denotes the first image mood estimate of the I-th image data, H(x) denotes the HSV histogram value for color value x, and M(y) denotes the representative color value of image mood characteristic y.
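The representative-color method described above can be sketched in Python as follows. The sketch reads the description as "pick the mood characteristic y whose M(y) is closest to the color x with the largest H(x)". It simplifies colors to hue angles in degrees and measures the difference as a circular hue distance; both choices, the mood names in the usage example, and the function names are assumptions, not details given in the text.

```python
def hue_distance(a, b):
    """Shortest angular distance between two hues, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def mood_by_dominant_color(hist, mood_table):
    """hist maps hue -> count; mood_table maps mood name -> representative hue."""
    dominant = max(hist, key=hist.get)  # color with the highest histogram value
    return min(mood_table, key=lambda mood: hue_distance(mood_table[mood], dominant))
```

For a hypothetical table `{"dynamic": 0, "static": 210, "soft": 120}` and a histogram dominated by hue 10, the sketch returns `"dynamic"`, because hue 0 is the nearest representative color.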

In another embodiment, based on the generated HSV color histogram and the table defining a representative color value for each image mood characteristic, the terminal determines which of the representative color values has the highest histogram value. That is, the terminal determines the mood-characteristic representative color that is most widely distributed in the image data. The terminal can thereby determine the first image mood estimate of each image data, as in Equation 2 below.

[Equation 2]
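This second embodiment can be read as "pick the mood characteristic y for which H(M(y)) is largest". A minimal Python sketch under that reading follows; as an assumption for the example, colors are hue angles in degrees, the histogram is a plain dictionary, and the function name and mood names are hypothetical.

```python
def mood_by_best_represented_color(hist, mood_table):
    """hist maps hue -> count; mood_table maps mood name -> representative hue."""
    # Look up the histogram value of each representative color (0 if absent)
    # and return the mood whose representative color occurs most often.
    return max(mood_table, key=lambda mood: hist.get(mood_table[mood], 0))
```

Unlike the dominant-color approach, this variant only inspects the histogram at the representative colors, so an image whose most frequent hue is 30 can still be assigned the mood represented by hue 210 if hue 210 occurs more often than any other representative hue.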

Next, the method of determining an image mood estimate using the color-scheme color values of each image mood characteristic is as follows. In one embodiment, based on the generated HSV color histogram and a table defining color-scheme color values for each image mood characteristic, the terminal determines the image mood characteristic whose color-scheme color values have the highest sum of histogram values. The terminal can thereby determine the second image mood estimate of each image data, as in Equation 3 below.

[Equation 3]

Here, the left-hand side of Equation 3 denotes the second image mood estimate of the I-th image data, H(x) denotes the HSV histogram value for color value x, and M(y,i) denotes the i-th of the color-scheme color values of image mood characteristic y. Three color-scheme color values per image mood characteristic are assumed here, but the invention is of course not limited to this.
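The color-scheme method can be read as "pick the mood characteristic y that maximizes the sum of H(M(y,i)) over its color-scheme colors". The Python sketch below follows that reading. Representing colors as hue angles and the table contents in the usage example are assumptions, and `mood_by_color_scheme` is a hypothetical name.

```python
def mood_by_color_scheme(hist, scheme_table):
    """hist maps hue -> count; scheme_table maps mood name -> tuple of hues."""
    def scheme_score(mood):
        # Sum of histogram values over all colors of this mood's color scheme.
        return sum(hist.get(color, 0) for color in scheme_table[mood])
    return max(scheme_table, key=scheme_score)
```

The sketch works for any number of colors per scheme, which matches the remark that three colors per mood characteristic is only one possible choice.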

Thereafter, in step 219, for each converted image data, the terminal determines the image complexity, the number of repeating patterns, and similar properties through frequency analysis of the image data (for example, edge distribution, DCT (Discrete Cosine Transform), wavelet transform, or Gabor filtering), and determines an image mood estimate based on them. In one embodiment, the terminal holds a table that defines an image mood estimate for each item of information, such as image complexity and number of repeating patterns, and refers to this table to determine the image mood estimate corresponding to the determined complexity and pattern count. The terminal can thereby determine the third image mood estimate of each image data.
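To make the table lookup in step 219 concrete, the following Python sketch uses a deliberately simple stand-in for the frequency analysis: the fraction of neighboring pixel pairs whose brightness differs by more than a threshold, used as a complexity score. This edge-density measure, the threshold, the table layout, and both function names are assumptions for illustration; the description names edge distribution, DCT, wavelet transform, and Gabor filtering only as examples and does not fix a formula.

```python
def edge_density(gray, threshold=32):
    """Fraction of adjacent pixel pairs that differ by more than threshold.

    gray is a list of rows of brightness values (0..255).
    """
    rows, cols = len(gray), len(gray[0])
    edges = total = 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal neighbor
                total += 1
                edges += abs(gray[y][x] - gray[y][x + 1]) > threshold
            if y + 1 < rows:  # vertical neighbor
                total += 1
                edges += abs(gray[y][x] - gray[y + 1][x]) > threshold
    return edges / total if total else 0.0

def mood_from_complexity(density, table):
    """table is a list of (upper_bound, mood) pairs sorted by upper_bound."""
    for upper, mood in table:
        if density <= upper:
            return mood
    return table[-1][1]
```

A flat image scores 0.0 and a one-pixel checkerboard scores 1.0, so a hypothetical table such as `[(0.2, "static"), (1.0, "dynamic")]` would map them to different mood estimates.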

Thereafter, in step 221, the terminal determines a final image mood value for each image data based on the image mood estimates determined for that image data. For example, the terminal may determine the final image mood value of each image data as the sum of the first, second, and third image mood estimates determined for it, as in Equation 4 below.

[Equation 4]

Here, the left-hand side of Equation 4 denotes the final image mood value of the I-th image data, each summed term contains the i-th image mood estimate of the I-th image data, and the remaining symbol denotes a weight value. If one image mood estimate differs markedly from the other two, that estimate may be ignored.
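The combination in step 221 can be sketched in Python as a weighted sum with optional outlier rejection. The sketch assumes that each estimate is a numeric point, for example a (dynamic_static, hard_soft) pair, that "differs markedly" means a Euclidean distance above a chosen gap, and that the weights are renormalized when an estimate is dropped. None of these details are stated in the description, and `final_mood` is a hypothetical name.

```python
from math import dist

def final_mood(estimates, weights, outlier_gap=None):
    """Weighted combination of per-method mood estimates (numeric points)."""
    keep = list(range(len(estimates)))
    if outlier_gap is not None and len(estimates) == 3:
        for k in range(3):
            a, b = (j for j in range(3) if j != k)
            near = dist(estimates[a], estimates[b])
            far = min(dist(estimates[k], estimates[a]),
                      dist(estimates[k], estimates[b]))
            # Drop estimate k if the other two agree and k is far from both.
            if near <= outlier_gap < far:
                keep = [a, b]
                break
    total = sum(weights[j] for j in keep)
    return tuple(
        sum(weights[j] * estimates[j][d] for j in keep) / total
        for d in range(len(estimates[0]))
    )
```

With weights that sum to 1 and no outlier gap, the result equals the plain weighted sum of the three estimates.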

Thereafter, in step 223, for each sub-section of the music file, the terminal selects, from among the selected one or more image files, an image file whose image mood value corresponds to the music mood value of that sub-section. Accordingly, as shown in FIG. 4, the terminal can automatically match each sub-section of the music file with the image whose mood is most similar to the mood of that sub-section.
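The matching in step 223 amounts to a nearest-neighbor lookup, sketched below in Python. It assumes that music and image mood values live in the same numeric space and that "corresponds to" means the smallest Euclidean distance; the file names in the usage example and the function name `match_images` are hypothetical.

```python
from math import dist

def match_images(section_moods, image_moods):
    """Return, for each sub-section mood, the name of the closest-mood image.

    section_moods is a list of mood points; image_moods maps file name -> mood point.
    """
    return [
        min(image_moods, key=lambda name: dist(image_moods[name], mood))
        for mood in section_moods
    ]
```

For sub-section moods `[(0.9, 0.8), (0.1, 0.2)]` and images `{"beach.jpg": (0.8, 0.9), "forest.jpg": (0.2, 0.1)}`, the sketch returns `["beach.jpg", "forest.jpg"]`. It allows the same image to be chosen for several sub-sections; whether reuse is permitted is not stated in the description.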

Thereafter, in step 225, the terminal generates a transition image to be inserted between each pair of consecutive image files selected for the sub-sections of the music file. In one embodiment, based on the table defining color-scheme color values for each image mood characteristic, the terminal extracts the color-scheme color values of the image mood characteristics corresponding to the two consecutive image files and mixes the extracted values to generate the transition image to be inserted between them. A transition effect can thus be inserted where the image changes.
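One possible reading of "mixing the extracted color-scheme color values" in step 225 is a per-channel linear blend of the two schemes, rendered as color bands. The Python sketch below shows that reading only; the RGB representation, the pairing of the i-th colors, the band layout, and the function names are all assumptions.

```python
def mix_schemes(scheme_a, scheme_b, t=0.5):
    """Blend the i-th colors of two schemes; t=0 gives scheme_a, t=1 gives scheme_b."""
    return [
        tuple(round((1 - t) * ca + t * cb) for ca, cb in zip(color_a, color_b))
        for color_a, color_b in zip(scheme_a, scheme_b)
    ]

def transition_frame(palette, width, height):
    """Build a frame of vertical bands, one per palette color, as rows of RGB tuples."""
    row = [palette[min(x * len(palette) // width, len(palette) - 1)]
           for x in range(width)]
    return [list(row) for _ in range(height)]
```

Generating several frames with `t` stepping from 0 to 1 would give a gradual change from the color scheme of the first image to that of the second.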

Thereafter, in step 227, the terminal encodes the selected music file, the image file for each sub-section of the music file, and the transition images to be inserted between consecutive image files into a single video file, and stores the generated video file as the music video for the selected music file.

Thereafter, the terminal terminates the algorithm according to the present invention.

Meanwhile, although specific embodiments have been described in the detailed description of the present invention, various modifications are of course possible without departing from the scope of the present invention. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be defined by the claims below and their equivalents.

Audio decoder 102, MIDI data analyzer 104, PCM data analyzer 106, sub-section music mood determination unit 108, image decoder 110, color histogram analyzer 112, image frequency analyzer 114, image mood determination unit 116, music/image matching unit 118, transition image generator 120, video decoder 122

Claims

A method for generating a music video in a mobile communication terminal, the method comprising:
receiving a selection of a music file and one or more image files;
determining a music mood value for each segment of the selected music file;
dividing the entire music section of the selected music file into one or more sub-sections, each including music mood values within an allowable error range, based on the determined music mood value for each segment;
determining an image mood value for each of the selected one or more image files; and
for each sub-section of the music file, selecting and matching, from among the one or more image files, an image file having an image mood value corresponding to the music mood value of that sub-section.

The method of claim 1, further comprising:
converting the selected music file into music data; and
extracting at least one of a timbre characteristic and a tempo characteristic for each segment from the converted music data,
wherein the music mood value for each segment of the selected music file is determined based on at least one of the extracted per-segment timbre characteristic and tempo characteristic.

The method of claim 1, further comprising:
checking whether the selected music file is a MIDI file; and
when the selected music file is a MIDI file, determining a music mood value for each beat of the selected music file.

The method of claim 3, further comprising:
converting the selected music file into music data; and
extracting, from the converted music data, at least one of the tempo of the music, the code number of each note, the pitch of each note, and the intensity of each note,
wherein the music mood value for each beat of the selected music file is determined based on at least one of the extracted tempo of the music, code number of each note, pitch of each note, and intensity of each note.

The method of claim 1, further comprising:
converting the selected one or more image files into image data; and
for each converted image data, color-converting the image data into the HSV (Hue-Saturation-Value) space and generating an HSV color histogram of the color-converted image data,
wherein the image mood value for each of the selected one or more image files is determined based on the HSV color histogram generated for the corresponding image data.

The method of claim 5, wherein determining the image mood value for each image file comprises:
determining the color having the highest histogram value based on the HSV color histogram generated for the corresponding image data; and
determining, based on a table defining a representative color value for each image mood characteristic, the representative color value that differs least from the determined color value.

The method of claim 6,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

where the left-hand side denotes the image mood value of the I-th image data, H(x) denotes the HSV histogram value for color value x, and M(y) denotes the representative color value of image mood characteristic y.

The method of claim 5, wherein determining the image mood value for each image file comprises:
determining, based on the HSV color histogram generated for the corresponding image data and a table defining a representative color value for each image mood characteristic, the representative color value that has the highest histogram value.

The method of claim 8,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

The method of claim 5, wherein determining the image mood value for each image file comprises:
determining, based on the HSV color histogram generated for the corresponding image data and a table defining color-scheme color values for each image mood characteristic, the image mood characteristic whose color-scheme color values have the highest sum of histogram values.

The method of claim 10,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

where the left-hand side denotes the image mood value of the I-th image data, H(x) denotes the HSV histogram value for color value x, and M(y,i) denotes the i-th of the color-scheme color values of image mood characteristic y.

The method of claim 1, further comprising:
converting the selected one or more image files into image data; and
for each converted image data, determining at least one of the complexity of the image and the number of repeating patterns through frequency analysis of the image data,
wherein the image mood value for each of the selected one or more image files is determined based on at least one of the complexity and the number of repeating patterns determined for the corresponding image data.

The method of claim 1, further comprising:
generating a transition image to be inserted between two consecutive image files matched to the sub-sections of the music file; and
generating a video file by encoding the music file, the image file matched to each sub-section of the music file, and the transition image to be inserted between the two consecutive image files.

The method of claim 13, wherein generating the transition image comprises:
extracting, based on a table defining color-scheme color values for each image mood characteristic, the color-scheme color values of the image mood characteristics corresponding to each of the two consecutive image files; and
mixing the extracted color-scheme color values.

A device for generating a music video in a mobile communication terminal, the device comprising:
an input unit for selecting and inputting a music file and one or more image files;
a PCM data analyzer for determining a music mood value for each segment of the selected music file;
a sub-section music mood determination unit for dividing the entire music section of the selected music file into one or more sub-sections, each including music mood values within an allowable error range, based on the determined music mood value for each segment;
an image mood determination unit for determining an image mood value for each of the selected one or more image files; and
a music/image matching unit for selecting and matching, for each sub-section of the music file, an image file from among the one or more image files that has an image mood value corresponding to the music mood value of that sub-section.

The device of claim 15, further comprising an audio decoder for decoding the selected music file and converting it into music data,
wherein the PCM data analyzer extracts at least one of a timbre characteristic and a tempo characteristic for each segment from the converted music data, and determines the music mood value for each segment of the selected music file based on at least one of the extracted per-segment timbre characteristic and tempo characteristic.

The device of claim 15, further comprising:
an audio decoder for decoding the selected music file and converting it into music data and, when the selected music file is a MIDI file, providing the converted music data to a MIDI data analyzer; and
the MIDI data analyzer for determining a music mood value for each beat of the selected music file.

The device of claim 17,
wherein the MIDI data analyzer extracts, from the converted music data, at least one of the tempo of the music, the code number of each note, the pitch of each note, and the intensity of each note, and determines the music mood value for each beat of the selected music file based on at least one of the extracted tempo of the music, code number of each note, pitch of each note, and intensity of each note.

The device of claim 15, further comprising an image decoder for decoding the selected one or more image files and converting them into image data,
wherein the image mood determination unit, for each converted image data, color-converts the image data into the HSV (Hue-Saturation-Value) space, generates an HSV color histogram of the color-converted image data, and determines the image mood value for each of the selected one or more image files based on the HSV color histogram generated for the corresponding image data.

The device of claim 19, wherein the image mood determination unit determines the image mood value for each image file by:
determining the color having the highest histogram value based on the HSV color histogram generated for the corresponding image data; and
determining, based on a table defining a representative color value for each image mood characteristic, the representative color value that differs least from the determined color value.

The device of claim 20,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

The device of claim 19, wherein the image mood determination unit determines the image mood value for each image file by determining, based on the HSV color histogram generated for the corresponding image data and a table defining a representative color value for each image mood characteristic, the representative color value that has the highest histogram value.

The device of claim 22,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

The device of claim 19, wherein the image mood determination unit determines the image mood value for each image file by determining, based on the HSV color histogram generated for the corresponding image data and a table defining color-scheme color values for each image mood characteristic, the image mood characteristic whose color-scheme color values have the highest sum of histogram values.

The device of claim 24,
wherein the image mood value for each image file is determined using the following equation:

[Equation]

The device of claim 15, further comprising an image decoder for decoding the selected one or more image files and converting them into image data,
wherein the image mood determination unit, for each converted image data, determines at least one of the complexity of the image and the number of repeating patterns through frequency analysis of the image data, and determines the image mood value for each image file based on at least one of the determined complexity and number of repeating patterns.

The device of claim 15, further comprising:
a transition image generator for generating a transition image to be inserted between two consecutive image files matched to the sub-sections of the music file; and
a video decoder for generating a video file by encoding the music file, the image file matched to each sub-section of the music file, and the transition image to be inserted between the two consecutive image files.

The device of claim 27,
wherein the transition image generator extracts, based on a table defining color-scheme color values for each image mood characteristic, the color-scheme color values of the image mood characteristics corresponding to each of the two consecutive image files, and mixes the extracted color-scheme color values to generate the transition image.