KR20190109626A

KR20190109626A - Method and apparatus for generating and evaluating music

Info

Publication number: KR20190109626A
Application number: KR1020180023898A
Authority: KR
Inventors: 안창욱; 이종현; 정윤채; 정재훈
Original assignee: 주식회사 크리에이티브마인드
Priority date: 2018-02-27
Filing date: 2018-02-27
Publication date: 2019-09-26
Also published as: KR102138247B1

Abstract

Disclosed are a method and apparatus for evaluating a sound source, and a method and apparatus for generating a sound source using the same. The sound source evaluation apparatus trains at least one classifier that outputs a preference based on at least one feature value extracted for each section of a training sound source, extracts at least one feature value for each section of a sound source to be evaluated, and determines the preference of the sound source to be evaluated using the classifier.

Description

Method and apparatus for evaluating a sound source, and method and apparatus for generating a sound source using the same {Method and apparatus for generating and evaluating music}

The present invention relates to the evaluation and generation of sound sources, and more particularly to a method and apparatus for evaluating and generating sound sources using machine learning.

In general, consonant chords give people a sense of stability, whereas dissonance creates tension. Likewise, tension rises as the pitch gradually rises, and the sense of stability grows as the pitch gradually falls. Various theories thus exist about the emotions people feel in response to chords and the progression of notes. However, these existing theories alone are of limited use in assessing whether a sound source is one that people will like.

A technical object of the present invention is to provide a method and apparatus that use machine learning to evaluate a sound source in a way that approximates the preference actual people feel.

Another technical object of the present invention is to provide a method and apparatus for generating a high-quality sound source using the evaluation result of a sound source.

To achieve the above technical object, an example of a sound source evaluation method according to an embodiment of the present invention includes: training at least one classifier that outputs a preference based on at least one feature value extracted for each section of a training sound source; extracting at least one feature value for each section of a sound source to be evaluated; and determining the preference of the sound source to be evaluated using the at least one classifier.

To achieve the above technical object, an example of a sound source evaluation apparatus according to an embodiment of the present invention includes: at least one classifier that outputs a preference based on at least one feature value extracted for each section of a training sound source; a feature extraction unit that extracts at least one feature value for each section of a sound source to be evaluated; and an evaluation unit that determines the preference of the sound source to be evaluated using the at least one classifier.

To achieve the above technical object, an example of a sound source generating apparatus according to an embodiment of the present invention includes: a sound source generation unit that generates at least one first sound source to be evaluated, composed of random notes; and a sound source evolution unit that generates at least one second sound source to be evaluated by replacing at least one section of the first sound source with the corresponding section of another sound source, or by changing at least one note to another note. The sound source evolution unit repeatedly performs a process of generating at least one third sound source to be evaluated by exchanging at least one section, or changing at least one note, among those of the first or second sound sources whose preference is at or above a certain level.

According to the present invention, a sound source can be evaluated in a manner similar to evaluation by actual people. In addition, a high-quality sound source can be generated automatically using the evaluation result, and a sound source suited to a specific emotion can be generated.

FIG. 1 is a diagram showing an example of a sound source evaluation apparatus according to an embodiment of the present invention;
FIG. 2 is a diagram showing an example of a sound source generating apparatus using sound source evaluation according to an embodiment of the present invention;
FIG. 3 is a diagram showing an example of the musical score of a sound source according to an embodiment of the present invention;
FIG. 4 is a diagram showing an example of a list of feature values extracted for sound source evaluation according to an embodiment of the present invention;
FIG. 5 is a diagram showing an example of a method of training a classifier according to an embodiment of the present invention;
FIG. 6 is a diagram showing another example of a method of training a classifier according to an embodiment of the present invention;
FIG. 7 is a diagram showing an example of a sound source evaluation method using classifiers according to an embodiment of the present invention;
FIG. 8 is a diagram showing an example of a graph representing the similarity of sound sources according to an embodiment of the present invention;
FIG. 9 is a diagram showing an example of a graph representing the evaluation results of sound sources according to an embodiment of the present invention;
FIG. 10 is a diagram showing an example of a method of generating the accompaniment part of a sound source according to an embodiment of the present invention;
FIG. 11 is a diagram showing an example of a method of generating a sound source by a genetic method according to an embodiment of the present invention;
FIG. 12 is a flowchart showing an example of a sound source evaluation method according to an embodiment of the present invention;
FIG. 13 is a flowchart showing an example of a sound source generation method according to an embodiment of the present invention;
FIG. 14 is a diagram showing the configuration of an example of a sound source evaluation apparatus according to an embodiment of the present invention; and
FIG. 15 is a diagram showing the configuration of an example of a sound source generating apparatus according to an embodiment of the present invention.

Hereinafter, a sound source evaluation method and apparatus according to embodiments of the present invention, and a sound source generation method and apparatus using the same, are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram showing an example of a sound source evaluation apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the sound source evaluation apparatus 100 outputs an evaluation result for a sound source, such as a preference or a similarity to existing sound sources. In this embodiment, a sound source means any form of data from which sound or music can be output through a speaker. For example, a sound source may take any form that a computer can read and process, such as an electronic document containing a musical score, a digital signal in MIDI (Musical Instrument Digital Interface) format, or a music file storing audio data such as MP3 (MPEG-1 Audio Layer-3). A sound source in this embodiment may contain the data of an entire piece or only part of one.

Preference, which indicates how much people like or dislike a sound source, varies between individuals, but with a sufficiently large population the preference for a given sound source can be quantified statistically as a single value. For example, sound source A can be played to 100 people, each of whom rates it on a scale of 1 to 10, and the ratings can be averaged to quantify the preference for sound source A. This embodiment presents a method of determining such a preference automatically using machine learning rather than the conventional statistical method. The method of determining preference using machine learning is described with reference to FIGS. 5 to 7, and the method of determining the similarity of sound sources is described with reference to FIG. 8.

FIG. 2 is a diagram showing an example of a sound source generating apparatus using sound source evaluation according to an embodiment of the present invention.

Referring to FIG. 2, the sound source generating apparatus 200 inputs a randomly generated sound source to the sound source evaluation apparatus 100 and receives the evaluation result. The sound source generating apparatus 200 may generate a sound source with an arbitrary melody according to predefined music theory. For example, the sound source generating apparatus may select an arbitrary clef (treble clef, alto clef, bass clef, etc.), select a time signature (2/4, 3/4, 4/4, 6/8, etc.), and then generate a sound source by randomly choosing and placing note durations and pitches so that each measure fits the time signature. Various other conventional theories of composition also exist; if rules of composition theory are implemented as algorithms in the sound source generating apparatus 200, or if conventional sound source generation algorithms are built in, these may also be used to generate arbitrary sound sources.
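The random generation step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `random_melody`, the MIDI pitch ranges assigned to each clef, the subset of time signatures, and the set of allowed note durations are all hypothetical and do not come from the patent.

```python
import random
from fractions import Fraction

# Assumed MIDI pitch ranges per clef (hypothetical values for illustration).
CLEF_RANGES = {"treble": (60, 84), "alto": (53, 77), "bass": (36, 60)}
# A subset of possible time signatures as (beats, beat unit).
METERS = [(2, 4), (4, 4), (6, 8)]
# Allowed note durations as fractions of a whole note (assumed).
DURATIONS = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]

def random_melody(num_measures, rng):
    """Pick a clef and meter at random, then fill each measure with random notes."""
    clef = rng.choice(sorted(CLEF_RANGES))
    beats, unit = rng.choice(METERS)
    low, high = CLEF_RANGES[clef]
    measure_length = Fraction(beats, unit)
    measures = []
    for _ in range(num_measures):
        remaining = measure_length
        notes = []
        while remaining > 0:
            # Only durations that still fit keep the measure exactly full.
            fitting = [d for d in DURATIONS if d <= remaining]
            duration = rng.choice(fitting)
            notes.append((rng.randint(low, high), duration))
            remaining -= duration
        measures.append(notes)
    return clef, (beats, unit), measures
```

Calling `random_melody(4, random.Random(0))` returns a clef name, a time signature, and four measures whose note durations each sum to the length of one measure.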

The quality of randomly generated sound sources is not consistent. If 1,000 sound sources are randomly generated and evaluated by the sound source evaluation apparatus 100, perhaps 10 of them may score at or above a given reference value while the other 990 score poorly. A randomly generated sound source may also resemble an existing sound source, and the similarity may be even higher if the sound source is produced using existing composition theory alone.

The sound source generating apparatus 200 of this embodiment therefore takes the randomly generated sound sources with relatively good evaluation results and transforms them by a genetic method, as if they were living organisms, to generate new sound sources. In other words, just as a child's genes are newly determined through crossover or mutation between the genes of two parents, the sound source generating apparatus 200 treats two sound sources like genes and crosses over or mutates parts of them to generate a new child sound source. For example, from 1,000 randomly generated sound sources, those whose evaluation result from the sound source evaluation apparatus 100 is at or above a certain level are selected; child sound sources are generated through genetic transformation of the selected sound sources and are selected again through the sound source evaluation apparatus 100; and child sound sources are again generated and evaluated through genetic transformation among the newly selected sound sources. Repeating this process produces sound sources with progressively better evaluation results. An example of a method of evolving a sound source by a genetic method is shown in FIG. 11.
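The select, transform, and re-evaluate loop described above can be sketched roughly as below. Everything specific here is an assumption made for illustration: a sound source is represented as a list of measures, each a list of MIDI pitches; the best `keep` sources are retained instead of applying a fixed threshold; the mutation probability of 0.3 is arbitrary; and `evaluate` stands in for the sound source evaluation apparatus 100.

```python
import random

def crossover(a, b, rng):
    """Replace one randomly chosen measure of `a` with the same measure of `b`."""
    i = rng.randrange(len(a))
    child = [list(m) for m in a]
    child[i] = list(b[i])
    return child

def mutate(song, rng, low=60, high=72):
    """Change one randomly chosen note to another random pitch."""
    child = [list(m) for m in song]
    i = rng.randrange(len(child))
    j = rng.randrange(len(child[i]))
    child[i][j] = rng.randint(low, high)
    return child

def evolve(population, evaluate, generations, keep, rng):
    """Repeatedly keep the best sources and breed children from them (keep >= 2)."""
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        parents = ranked[:keep]
        children = []
        while len(parents) + len(children) < len(population):
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < 0.3:  # assumed mutation probability
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
    return max(population, key=evaluate)
```

Because the selected parents are carried over unchanged into the next generation, the best score in the population never decreases from one generation to the next in this sketch.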

FIG. 3 is a diagram showing an example of the musical score of a sound source according to an embodiment of the present invention.

Referring to FIG. 3, the sound source 300 may broadly include a melody part 310 and an accompaniment part 320. Depending on the embodiment, the sound source 300 may include only the melody part 310 or only the accompaniment part 320. For convenience of description, however, the sound source is assumed below to include both the melody part 310 and the accompaniment part 320.

A sound source may also be divided into a plurality of sections according to various predefined methods. For example, a sound source may be divided into measures (bars) or into user-defined sections. As described further below, the sound source evaluation apparatus 100 divides a sound source into a plurality of sections and then determines feature values such as those of FIG. 4 for each section. For convenience, the description below is limited to the case in which the sound source evaluation apparatus 100 divides the sound source into measures and extracts feature values for each measure.

FIG. 4 is a diagram showing an example of a list of feature values extracted for sound source evaluation according to an embodiment of the present invention.

Referring to FIG. 4, the feature value list includes pitch 400, melody 410, rhythm 420, chord 430, and the like.

Pitch 400 means how high or low a note is and is related to frequency: the higher the frequency of a note, the higher its pitch, and the lower the frequency, the lower its pitch. Examples of feature values related to pitch 400 include the range, which is the difference between the highest and lowest pitches; the occurrence rate of each pitch; and tonality. Feature values of pitch 400 may be determined from both the melody part and the accompaniment part of the sound source.

Melody 410 means the tune formed by combining notes, each having a pitch and a duration, in temporal order. Examples of feature values related to melody 410 include intervals, which represent the time interval between neighboring notes; variation between subgroups made up of several notes (for example, measures); the overall contours of the melody; and repetition patterns. Melody 410 may be determined from the melody part of the sound source.

Examples of feature values related to rhythm 420 include time intervals, note durations, placement patterns along the time axis, and syncopation. Examples of feature values related to chord 430 include the vertical intervals between notes, the type of chords, and harmonic movement. Rhythm 420 may be determined from both the melody part and the accompaniment part of the sound source, and chord 430 may be determined from the accompaniment part.

The feature value list of this embodiment is only one example given to aid understanding, and many other values may represent the characteristics of a sound. Accordingly, various modifications are possible: all of the feature values shown in FIG. 4 may be used, only some of them may be used, or entirely different feature values may be defined and used.

For a computer rather than a person to evaluate a sound source, the feature values must take the form of definite values that a computer can process. For example, because the range of pitch 400 is a frequency-related value, numbers mapped one-to-one to the frequencies of the pitches may be defined and used. The characters or numbers used to represent each of the other feature values may likewise be predefined in various ways depending on the embodiment.
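One widely used way to assign a number to each pitch is the MIDI note number, which relates to frequency through twelve-tone equal temperament with A4 = 440 Hz. The source does not say which numbering it uses, so the sketch below is only one possible choice, and the function names are hypothetical.

```python
import math

def frequency_to_midi(freq_hz):
    """Map a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def midi_to_frequency(note):
    """Map a MIDI note number back to its equal-tempered frequency in Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)
```

With this mapping, the pitch range of a measure is simply the difference between its largest and smallest note numbers, expressed in semitones.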

The sound source evaluation apparatus 100 extracts feature values for each section (for example, each measure) of the sound source. For example, if the feature value list of FIG. 4 is used and the sound source consists of 30 measures, the sound source evaluation apparatus 100 determines the feature values of pitch 400, melody 410, rhythm 420, and chord 430 for each measure.
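Per-measure extraction might look like the following minimal sketch. It assumes each measure is a list of `(midi_pitch, duration_in_beats)` pairs and computes only three simple values (pitch range, per-pitch occurrence rate, and mean note duration); the function names and the choice of these three features are illustrative, not taken from FIG. 4.

```python
from collections import Counter

def measure_features(notes):
    """Compute a few simple feature values for one measure of (pitch, duration) notes."""
    pitches = [p for p, _ in notes]
    durations = [d for _, d in notes]
    counts = Counter(pitches)
    return {
        # Difference between the highest and lowest pitch in the measure.
        "pitch_range": max(pitches) - min(pitches),
        # Fraction of notes in the measure that fall on each pitch.
        "occurrence_rate": {p: c / len(pitches) for p, c in sorted(counts.items())},
        # Average note length, a crude rhythm-related value.
        "mean_duration": sum(durations) / len(durations),
    }

def extract_features(measures):
    """Return one feature dictionary per non-empty measure, in order."""
    return [measure_features(m) for m in measures if m]
```

For a sound source of 30 measures, `extract_features` would return a list of 30 dictionaries, one per measure.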

FIG. 5 is a diagram showing an example of a method of training a classifier according to an embodiment of the present invention.

Referring to FIG. 5, the sound source evaluation apparatus 100 uses a classifier 530 to evaluate sound sources. The classifier 530 refers to an artificial intelligence that performs machine learning; the term is not limiting, and the classifier may be implemented in any of the various conventional forms that perform machine learning. For example, the classifier 530 may be implemented as a neural network, a support vector machine, a decision tree, a naive Bayes classifier, or the like. Because machine learning and neural networks are already well known, a detailed description of them is omitted. The following describes how the classifier 530, implemented in any of these conventional forms, can be trained and used for sound source evaluation.

First, the sound source evaluation apparatus 100 holds a plurality of training sound sources 500, 502, 504 whose preferences are defined in advance. The preferences of the training sound sources 500, 502, 504 are determined beforehand through, for example, a survey of a number of people or evaluation by experts.

The sound source evaluation apparatus 100 extracts feature values from the training sound sources 500, 502, 504. For example, it may extract the feature values of the feature value list 510 shown in FIG. 4 for each measure of each training sound source. The sound source evaluation apparatus 100 may arrange the extracted feature values in a predefined order to form a data set, or may form them into a vector (that is, a feature vector) 520. In this embodiment, the feature values extracted from each sound source are input to the classifier 530 in the predefined order. For convenience of description, the feature values extracted from a sound source are referred to below as the feature vector 520.

The sound source evaluation apparatus 100 inputs the feature vector 520 of each sound source to the classifier 530 and compares the output of the classifier 530 with the preference predefined for that sound source (540). It then adjusts the parameters of the classifier 530 so that, for each sound source, the output of the classifier 530 matches the preference. The more training sound sources there are, the more finely the parameters of the classifier 530 can be tuned, and the more accurate the preference evaluation using the classifier 530 becomes.
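The compare-and-adjust step can be illustrated with a very small logistic-regression classifier trained by stochastic gradient descent. This is only one of many possible classifier forms, chosen here for brevity; the source does not prescribe it. The sketch assumes preferences are encoded as labels between 0 and 1, and the names `predict` and `train` are hypothetical.

```python
import math

def predict(weights, bias, x):
    """Return the classifier output (a value in 0..1) for feature vector x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=200, lr=0.1):
    """Adjust weights and bias so outputs move toward the predefined preferences."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Difference between classifier output and the known preference.
            error = predict(weights, bias, x) - y
            # Move each parameter against the error (gradient of the log loss).
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias
```

After training, `predict` returns a value closer to 1 for feature vectors that resemble the well-liked training examples and closer to 0 for those that resemble the disliked ones.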

Depending on the embodiment, if the feature value list 510 contains dozens of kinds of feature values and there are hundreds to thousands of training sound sources 500, 502, 504, extracting every feature value from every sound source, building feature vectors, and training the classifier on them can take considerable time. An example of a method for making classifier training simpler and faster is shown in FIG. 6.

FIG. 6 is a diagram showing another example of a method of training a classifier according to an embodiment of the present invention.

Referring to FIG. 6, feature value subsets 630, 632, 634, 636, 638 are formed by randomly selecting and combining some of the feature values in the feature value list 610. For example, if the feature value list 610 consists of feature value 0, feature value 1, feature value 2, ..., feature value 7, the sound source evaluation apparatus 100 generates a first feature value subset 630 consisting of feature values 0, 1, and 2, a second feature value subset 632 consisting of feature values 3, 5, and 6, and so on. Each of the feature value subsets 630, 632, 634, 636, 638 is a different combination of the feature values in the feature value list 610.

The sound source evaluation apparatus 100 also randomly selects and combines some of the training sound sources 600 into training sound source subsets 620, 622, 624, 626, 628 and distributes them among the feature value subsets 630, 632, 634, 636, 638. For example, if the training sound sources 600 consist of training sound source 0, training sound source 1, ..., training sound source 9, the sound source evaluation apparatus 100 assigns training sound sources 1, 3, 4, 5, and 6 (620) to the first feature value subset 630 and training sound sources 0, 2, 7, 8, and 9 (622) to the second feature value subset 632.

The sound source evaluation apparatus 100 then extracts, from each training sound source subset 620, 622, 624, 626, 628, the feature values belonging to the corresponding feature value subset 630, 632, 634, 636, 638. It inputs the feature vectors 640, 642, 644, 646, 648 obtained for the training sound source subsets 620, 622, 624, 626, 628 to the respective classifiers 650, 652, 654, 656, 658. For example, the first feature vector 640 extracted using the first feature value subset 630 is input to the first classifier 650, and the second feature vector 642 extracted using the second feature value subset 632 is input to the second classifier 652.

As described with reference to FIG. 5, the sound source evaluation apparatus 100 trains the classifiers 650, 652, 654, 656, 658 by adjusting their parameters so that each output value 660, 662, 664, 666, 668 matches the preference of the corresponding training sound source.
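The pairing of a random feature subset with a random subset of training sound sources for each classifier can be sketched as follows. The function names and the independent random sampling of sound sources are assumptions for illustration; the sketch only guarantees that every classifier receives a different combination of feature indices, which requires that enough distinct combinations exist.

```python
import random

def make_training_plans(num_features, num_songs, num_classifiers,
                        features_per_subset, songs_per_subset, rng):
    """Return one (feature_ids, song_ids) pair per classifier."""
    plans, seen = [], set()
    while len(plans) < num_classifiers:
        feature_ids = tuple(sorted(rng.sample(range(num_features), features_per_subset)))
        if feature_ids in seen:
            continue  # each subset must be a different combination of features
        seen.add(feature_ids)
        song_ids = sorted(rng.sample(range(num_songs), songs_per_subset))
        plans.append((list(feature_ids), song_ids))
    return plans

def project(feature_vector, feature_ids):
    """Keep only the feature values that belong to one subset."""
    return [feature_vector[i] for i in feature_ids]
```

Each classifier is then trained only on the projected feature vectors of its assigned sound sources, so each one sees a smaller input than a single classifier trained on everything.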

FIG. 7 is a diagram showing an example of a sound source evaluation method using classifiers according to an embodiment of the present invention.

Referring to FIG. 7, there are M classifiers 730, 732, 734, which are assumed to have been trained in advance by the method of FIG. 6. Although this embodiment describes the use of a plurality of classifiers 730, 732, 734, a single classifier may instead be trained as in FIG. 5 and used alone to evaluate sound sources.

The sound source evaluation apparatus 100 extracts the feature values of each feature value subset 710, 712, 714 for each measure of the sound source 700 to be evaluated. It inputs the feature vectors 720, 722, 724 extracted using the feature value subsets 710, 712, 714 to the respective classifiers 730, 732, 734. The sound source evaluation apparatus 100 outputs a statistic of the output values 740, 742, 744 of the classifiers 730, 732, 734 (for example, their average) as the preference of the sound source 700 to be evaluated. A sound source 700 evaluated by the sound source evaluation apparatus 100 may in turn be used as a training sound source as described with reference to FIG. 5 or FIG. 6.
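Combining the classifier outputs by averaging can be sketched as below. The classifiers are modeled as plain callables that take a projected feature vector and return a number; that interface and the function name `ensemble_preference` are assumptions for illustration.

```python
from statistics import mean

def ensemble_preference(classifiers, feature_subsets, feature_vector):
    """Average the outputs of classifiers that each see their own feature subset."""
    outputs = [
        classifier([feature_vector[i] for i in feature_ids])
        for classifier, feature_ids in zip(classifiers, feature_subsets)
    ]
    return mean(outputs)
```

Other statistics, such as a median, could be substituted for `mean` without changing the rest of the sketch.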

FIG. 8 is a diagram showing an example of a graph representing the similarity of sound sources according to an embodiment of the present invention.

Referring to FIG. 8, the sound source evaluation apparatus 100 determines the similarity between the sound source 810 to be evaluated and a plurality of previously stored sound sources 830, 832, 834, 836, 838. For example, the sound source evaluation apparatus 100 may store all sound sources in a fixed data format (for example, MIDI format). It may convert this data into vectors and then compute the Euclidean distance between two vectors. The sound source evaluation apparatus 100 may determine the similarity by averaging the distances between the sound source 810 to be evaluated and each of the stored sound sources 830, 832, 834, 836, 838. Various other conventional algorithms for determining the similarity of two pieces of data may also be applied.
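The averaged Euclidean distance can be computed as in the following sketch. It assumes every sound source has already been converted to a numeric vector of the same length; how that conversion is done is not specified in the source, and the function name is hypothetical.

```python
import math

def mean_distance(target, stored):
    """Average Euclidean distance from the target vector to each stored vector."""
    return sum(math.dist(target, s) for s in stored) / len(stored)
```

For example, `mean_distance([0, 0], [[3, 4], [6, 8]])` averages the distances 5.0 and 10.0 and returns 7.5.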

If a large number of sound sources are stored, computing the distance between the evaluation target sound source 810 and every stored sound source may take a long time. In that case, the stored sound sources may be grouped into a plurality of groups 830, 832, 834, 836, and 838, and a representative vector may be created for each group. The sound source evaluation apparatus 100 may then determine the similarity by calculating the distance between the evaluation target sound source 810 and the representative vector of each group 830, 832, 834, 836, and 838.
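
The text does not say how a group's representative vector is formed. The sketch below assumes, purely for illustration, that it is the component-wise mean (centroid) of the group's vectors; `centroid` and `distances_to_groups` are hypothetical names.

```python
import math
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty group of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distances_to_groups(
    target: Sequence[float],
    groups: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Distance from the target to each group's representative (centroid) vector."""
    return [math.dist(target, centroid(group)) for group in groups]
```

With this arrangement the number of distance computations per evaluation equals the number of groups rather than the number of stored sound sources, and the centroids can be computed once in advance.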

FIG. 9 is a diagram illustrating an example of a graph showing the evaluation results of sound sources according to an embodiment of the present invention.

Referring to FIG. 9, the evaluation result of each sound source is plotted on a graph whose axes are preference and similarity. The higher the quality of a sound source, the higher its preference and the lower its similarity. The sound source evaluation apparatus 100 may select sound sources 900 whose preference and similarity are at or above a certain level. The selected sound sources 900 may be used for sound source generation by the sound source generating apparatus 200 described with reference to FIG. 2.

FIG. 10 is a diagram illustrating an example of a method of generating the accompaniment portion of a sound source according to an embodiment of the present invention.

Referring to FIG. 10, the accompaniment template database 1000 contains templates for the accompaniment portion of a sound source. An accompaniment template may be all or part of the accompaniment portion used in an existing sound source, or a newly composed accompaniment portion. For example, accompaniment templates may be stored and managed in units of measures or phrases, or the accompaniment portion of an entire song may be stored and managed as a single template.

The sound source generating apparatus 200 may randomly extract accompaniment templates from the accompaniment template database 1000 (1010) and generate an accompaniment portion (1030). For example, the apparatus may randomly generate a melody portion of 30 measures and randomly extract accompaniment templates from the database to generate a 30-measure accompaniment portion. As one example, if the templates are stored in units of measures, the apparatus may extract 30 different templates in random order to generate the accompaniment portion, or extract fewer than 30 different templates and reuse some of them to fill 30 measures. As another example, if the templates are stored and managed in units of phrases or whole songs, the apparatus may use an extracted phrase or whole-song template as is, or extract only a part of it, to generate the accompaniment portion.
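
The measure-unit case can be sketched in Python as follows. The templates are represented by opaque strings, and the function name `build_accompaniment` and the parameter `n_distinct` are assumptions introduced only for this illustration.

```python
import random
from typing import List, Optional, Sequence


def build_accompaniment(
    templates: Sequence[str],
    n_measures: int,
    rng: random.Random,
    n_distinct: Optional[int] = None,
) -> List[str]:
    """Fill n_measures with randomly drawn one-measure templates.

    With n_distinct == n_measures every measure uses a different template;
    with a smaller n_distinct some templates are reused.
    """
    limit = min(n_measures, len(templates))
    if n_distinct is None:
        n_distinct = limit
    if not 1 <= n_distinct <= limit:
        raise ValueError("n_distinct must be between 1 and min(n_measures, len(templates))")
    chosen = rng.sample(list(templates), n_distinct)  # distinct templates
    repeats = [rng.choice(chosen) for _ in range(n_measures - n_distinct)]
    measures = chosen + repeats
    rng.shuffle(measures)  # random order, as described in the text
    return measures
```

For example, `build_accompaniment(db, 30, random.Random(), n_distinct=5)` would return 30 measures drawn from exactly 5 distinct templates.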

The sound source generating apparatus 200 modulates the accompaniment portion, which is generated by arranging the accompaniment templates extracted from the accompaniment template database 1000, according to chords (1020). The chord progression used in the sound source may be chosen randomly or determined according to a predetermined composition theory. For example, if a chord progression for each genre, such as cheerful music, sad music, or quiet music, is predefined according to composition theory and stored in the sound source generating apparatus 200, the apparatus modulates the accompaniment portion to fit the chords of the relevant genre. The apparatus stores chord modulation rules that modulate the accompaniment portion to fit each chord according to a predefined composition theory.

When the sound source generating apparatus 200 receives emotion information, it may modulate the accompaniment portion with a chord progression that matches that emotion information. For example, after the user selects one of a plurality of emotions, the apparatus may modulate the accompaniment portion according to the chord progression associated with the selected emotion.
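
The following Python sketch shows one way an emotion could select a chord progression that is then applied measure by measure. Everything concrete in it is an assumption: the progression table, the emotion names, the representation of a measure as a list of MIDI pitch numbers written relative to a C chord, and the use of plain transposition by the chord root as a stand-in for the chord modulation rules, which the text does not spell out.

```python
from typing import Dict, List

# Hypothetical chord progressions per emotion (not taken from the source).
PROGRESSIONS: Dict[str, List[str]] = {
    "cheerful": ["C", "G", "Am", "F"],
    "sad": ["Am", "F", "C", "G"],
}

# Semitone offset of each chord root above C. Simplification: the major/minor
# quality of the chord is ignored and the template is only transposed.
ROOT_OFFSET: Dict[str, int] = {"C": 0, "F": 5, "G": 7, "Am": 9}


def modulate(measures: List[List[int]], emotion: str) -> List[List[int]]:
    """Transpose each measure to the chord assigned to it by the progression."""
    progression = PROGRESSIONS[emotion]
    result = []
    for index, measure in enumerate(measures):
        chord = progression[index % len(progression)]  # cycle through the progression
        shift = ROOT_OFFSET[chord]
        result.append([pitch + shift for pitch in measure])
    return result
```

A real modulation rule would also have to adjust chord tones for minor chords and keep pitches in a playable range; the sketch only shows the lookup-then-apply structure.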

FIG. 11 is a diagram illustrating an example of a method of generating a sound source through a genetic method according to an embodiment of the present invention.

Referring to FIG. 11, the sound source generating apparatus 200 exchanges portions 1120 and 1130 of two sound sources 1100 and 1110 with each other, or alters a specific portion 1140 of a sound source. For example, the apparatus may exchange mutually corresponding portions of the first sound source 1100 and the second sound source 1110. The exchanged portion need not be a whole measure; depending on the embodiment it may be a single note or a portion 1130 spanning two measures. When a sound source consists of a melody portion and an accompaniment portion, the apparatus may exchange or alter the corresponding melody and accompaniment portions of each sound source together. Alternatively, the apparatus may exchange or alter the melody portion and the accompaniment portion independently of each other. For instance, between the two sound sources 1100 and 1110, the first measure of the melody portion may be exchanged while the second measure, rather than the first, of the accompaniment portion is exchanged.

The portions 1120 and 1130 exchanged between the two sound sources 1100 and 1110 may be predefined in the sound source generating apparatus 200, but this limits the variety of the resulting variations, so it is preferable to select the exchanged portions randomly where possible. That is, the apparatus may randomly select the position, size, and so on of the exchanged portions 1120 and 1130 each time. For example, the apparatus may randomly select the number of exchanges within a preset range (for example, 5 to 10), randomly select the exchange positions over the entire length of the sound source, and either predefine the exchange size in units of measures or set it to a random size within a preset range (for example, 4 beats or 8 beats).
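
The randomized exchange can be sketched in Python as below. The sketch assumes both sound sources are equal-length lists in which each element stands for one beat-sized slot; the function name `crossover` and its default ranges (taken from the examples in the text) are illustrative only.

```python
import random
from typing import List, Sequence, Tuple


def crossover(
    a: Sequence[int],
    b: Sequence[int],
    rng: random.Random,
    n_swaps_range: Tuple[int, int] = (5, 10),
    sizes: Sequence[int] = (4, 8),
) -> Tuple[List[int], List[int]]:
    """Swap randomly placed, randomly sized corresponding segments of a and b."""
    if len(a) != len(b):
        raise ValueError("both sound sources must have the same length")
    child_a, child_b = list(a), list(b)  # leave the parents untouched
    for _ in range(rng.randint(*n_swaps_range)):  # number of exchanges
        size = min(rng.choice(list(sizes)), len(child_a))  # exchange size
        start = rng.randrange(len(child_a) - size + 1)  # exchange position
        end = start + size
        child_a[start:end], child_b[start:end] = child_b[start:end], child_a[start:end]
    return child_a, child_b
```

Because only corresponding positions are swapped, each position of the two children always holds the same pair of values as the parents, in one order or the other.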

To obtain diverse variations, the sound source generating apparatus 200 may alter a specific portion 1140 of a sound source in addition to exchanging portions between two sound sources. The apparatus may alter the pitch or the duration of the notes in the specific portion 1140. The apparatus randomly selects the altered portion 1140, and the extent of the altered portion may be predefined, for example as a single note, a plurality of notes, or a measure, or selected within a preset range. The number and positions of the altered portions 1140 may likewise be predefined or selected within a preset range.
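
A minimal Python sketch of single-note alteration follows. The `Note` type, the pitch steps, and the duration factors are assumptions chosen for illustration; the text only states that pitch or duration may be altered at randomly chosen places.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Sequence


@dataclass(frozen=True)
class Note:
    pitch: int       # e.g. a MIDI note number (assumed representation)
    duration: float  # length in beats


def mutate(song: Sequence[Note], rng: random.Random, n_mutations: int = 1) -> List[Note]:
    """Alter the pitch or the duration of randomly chosen notes."""
    result = list(song)
    for _ in range(n_mutations):
        index = rng.randrange(len(result))  # random position of the altered note
        note = result[index]
        if rng.random() < 0.5:
            # Shift the pitch by one or two semitones (no range clamping here).
            result[index] = replace(note, pitch=note.pitch + rng.choice([-2, -1, 1, 2]))
        else:
            # Halve or double the duration.
            result[index] = replace(note, duration=note.duration * rng.choice([0.5, 2.0]))
    return result
```

Altering a larger extent, such as a whole measure, could be done by applying the same change to a contiguous slice instead of a single index.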

FIG. 12 is a flowchart illustrating an example of a sound source evaluation method according to an embodiment of the present invention.

Referring to FIG. 12, the sound source evaluation apparatus 100 trains at least one classifier that outputs a preference based on at least one feature value extracted for each measure of a learning sound source (S1200). Examples of the classifier training method are shown in FIGS. 5 and 6.

When training of the classifier is complete, the sound source evaluation apparatus 100 extracts at least one feature value for each measure of the evaluation target sound source (S1210) and determines the preference of the evaluation target sound source using the at least one classifier (S1220). The apparatus also determines the similarity between the evaluation target sound source and a plurality of previously stored sound sources (S1230). The apparatus then outputs an evaluation result for the evaluation target sound source based on the preference and the similarity (S1240).

FIG. 13 is a flowchart illustrating an example of a sound source generation method according to an embodiment of the present invention.

Referring to FIG. 13, the sound source generating apparatus 200 generates at least one sound source composed of random notes (S1300). The apparatus selects at least one sound source whose evaluation result from the sound source evaluation apparatus is at or above a certain level (S1310). The apparatus then repeats a process of generating and evaluating new sound sources by exchanging at least one section between the selected sound sources or by replacing at least one note with another note (S1320).

For example, the sound source generating apparatus 200 initially generates 1000 sound sources at random and selects those whose evaluation result is at or above a certain level. If 100 sound sources meet that level, the apparatus generates a plurality of new sound sources by exchanging or altering portions of those 100 sound sources as in FIG. 11. For example, after arranging the 100 sound sources in a row, the apparatus may generate new sound sources by sequentially applying the genetic modification of FIG. 11 between the 1st and 51st sound sources, between the 2nd and 52nd sound sources, and so on. The apparatus evaluates the newly generated sound sources again, selects a certain number of them, and repeats the genetic modification of FIG. 11.

The sound source generating apparatus 200 may repeat the genetic modification of FIG. 11 a preset number of times, or until a sound source is generated whose preference and similarity in the evaluation result reach a preset criterion.
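
The select, pair, and repeat loop described above can be outlined in Python as follows. This is a sketch under several assumptions: `evaluate` and `vary` are caller-supplied placeholders for the evaluation apparatus and the modification of FIG. 11, selection keeps the `n_keep` best candidates rather than applying a threshold, and the selected parents are carried into the next generation alongside their children, which the text does not state.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Song = TypeVar("Song")


def evolve(
    population: Sequence[Song],
    evaluate: Callable[[Song], float],
    vary: Callable[[Song, Song], Tuple[Song, Song]],
    n_keep: int,
    max_generations: int,
    target: Optional[float] = None,
) -> List[Song]:
    """Repeat selection and pairwise variation; return the best n_keep songs."""
    current = list(population)
    for _ in range(max_generations):
        selected = sorted(current, key=evaluate, reverse=True)[:n_keep]
        # Optional early stop once the best score reaches the preset criterion.
        if target is not None and selected and evaluate(selected[0]) >= target:
            return selected
        half = len(selected) // 2
        children: List[Song] = []
        for i in range(half):
            # Pair the i-th song with the (i + half)-th song, e.g. 1st with 51st.
            children.extend(vary(selected[i], selected[i + half]))
        current = selected + children
    return sorted(current, key=evaluate, reverse=True)[:n_keep]
```

Passing `target=None` gives the fixed-iteration variant, while a numeric `target` gives the run-until-criterion variant, still bounded by `max_generations`.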

FIG. 14 is a diagram illustrating an example configuration of a sound source evaluation apparatus according to an embodiment of the present invention.

Referring to FIG. 14, the sound source evaluation apparatus 100 includes a learning unit 1400, a feature extraction unit 1410, and an evaluation unit 1420.

The learning unit 1400 trains the classifier using learning sound sources whose preferences are predefined. For example, as in FIGS. 5 and 6, the learning unit 1400 extracts feature values from a plurality of learning sound sources, inputs them to the classifier, and adjusts the classifier's parameters so that the output matches the preference of the corresponding learning sound source.
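
The idea of adjusting parameters until the output matches the labeled preference can be illustrated with a deliberately simple Python sketch. This passage does not specify the classifier type, so a single linear unit trained by per-sample gradient descent on squared error is used here purely as a stand-in; `train_linear_scorer`, the learning rate, and the epoch count are all assumptions.

```python
from typing import List, Sequence, Tuple


def train_linear_scorer(
    samples: Sequence[Tuple[Sequence[float], float]],
    lr: float = 0.1,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and a bias so that bias + w.x approaches each labeled preference."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, preference in samples:
            predicted = bias + sum(w * x for w, x in zip(weights, features))
            error = predicted - preference
            # Move each parameter against the gradient of the squared error.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias
```

In practice the same loop structure applies to richer models; only the prediction and the gradient computation change.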

The feature extraction unit 1410 extracts feature values for each measure of the evaluation target sound source. For example, as in FIG. 7, the feature extraction unit 1410 may divide the feature values in the feature value list into a plurality of feature value subsets and then, for each subset, extract the feature values of each measure of the evaluation target sound source.

The evaluation unit 1420 inputs the feature values extracted by the feature extraction unit 1410 into the classifier to determine the preference, and determines and outputs the similarity between the evaluation target sound source and other previously stored sound sources.

FIG. 15 is a diagram illustrating an example configuration of a sound source generating apparatus according to an embodiment of the present invention.

Referring to FIG. 15, the sound source generating apparatus 200 includes a sound source generation unit 1500 and a sound source evolution unit 1510.

The sound source generation unit 1500 generates sound sources at random. For example, the sound source generation unit 1500 may compose the notes of the melody portion at random and construct the accompaniment portion using the accompaniment template database as in FIG. 10.

The sound source evolution unit 1510 generates new sound sources by applying the genetic modification of FIG. 11 to sound sources for which the sound source evaluation apparatus yields an evaluation result at or above a certain level. From the newly generated sound sources, the sound source evolution unit 1510 selects those whose evaluation result is at or above a certain level and applies the genetic modification of FIG. 11 again. The sound source evolution unit 1510 repeats this genetic evolution process a certain number of times.

The present invention can also be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. The computer-readable recording medium can also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed fashion.

The present invention has been described above with reference to its preferred embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in a descriptive sense and not for purposes of limitation. The scope of the present invention is set forth in the claims rather than in the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

Claims

A sound source evaluation method comprising:
training at least one classifier that outputs a preference based on at least one feature value extracted for each section of a learning sound source;
extracting at least one feature value for each section of an evaluation target sound source; and
determining a preference of the evaluation target sound source using the at least one classifier.

The method of claim 1, wherein training the classifier comprises:
extracting at least one feature value for each section of a plurality of learning sound sources for which prior information about a preference is defined; and
adjusting a parameter of the classifier such that a value obtained by inputting the feature values of a learning sound source to the classifier corresponds to the preference predefined for that learning sound source.

The method according to claim 1 or 2,
wherein there are a plurality of classifiers, and
the combinations of feature values input to the respective classifiers differ from one another.

The method of claim 3, wherein determining the preference comprises:
outputting, as the preference of the evaluation target sound source, a statistic of a plurality of values obtained by inputting the different combinations of feature values extracted for each section of the evaluation target sound source into the respective classifiers.

The method of claim 1, further comprising:
determining a similarity between file data of the evaluation target sound source and file data of a plurality of previously stored sound sources; and
outputting an evaluation result of the evaluation target sound source based on the preference and the similarity.

The method of claim 1, further comprising:
generating at least one first evaluation target sound source composed of random notes;
generating at least one second evaluation target sound source by replacing at least one section of the first evaluation target sound source with a corresponding section of another sound source or by replacing at least one note with another note; and
repeating a process of generating at least one third evaluation target sound source by applying at least one section replacement or note change among those of the at least one first evaluation target sound source and the at least one second evaluation target sound source whose preference is at or above a predetermined level.

The method of claim 6,
wherein the first evaluation target sound source is composed of a melody portion and an accompaniment portion, and
generating the first evaluation target sound source comprises:
forming the melody portion from randomly selected notes, and forming the accompaniment portion by modulating a plurality of pre-stored accompaniment templates according to chords.

The method of claim 7, wherein forming the accompaniment portion comprises:
receiving emotion information; and
generating an accompaniment portion corresponding to the emotion information by using a chord progression rule predefined for each emotion.

A sound source generation method comprising:
randomly extracting a plurality of pre-stored accompaniment templates; and
modulating the randomly extracted accompaniment templates according to chords to generate an accompaniment portion of a sound source.

The method of claim 9, wherein generating the accompaniment portion comprises:
receiving emotion information; and
generating the accompaniment portion by modulating the accompaniment templates with chords corresponding to the emotion information, using a chord progression rule predefined for each emotion.

The method of claim 9, further comprising:
generating at least one melody portion composed of random notes;
generating at least one second sound source by replacing at least one section of at least one first sound source, which consists of the melody portion and the accompaniment portion, with a corresponding section of another sound source or by replacing at least one note with another note; and
repeating a process of generating at least one third sound source by applying at least one section replacement or note change among those of the at least one first sound source and the at least one second sound source whose preference is at or above a predetermined level.

A sound source evaluation apparatus comprising:
at least one classifier that outputs a preference based on at least one feature value extracted for each section of a learning sound source;
a feature extraction unit that extracts at least one feature value for each section of an evaluation target sound source; and
an evaluation unit that determines the preference of the evaluation target sound source using the at least one classifier.

A sound source generating apparatus comprising:
a sound source generation unit that generates at least one first evaluation target sound source composed of random notes; and
a sound source evolution unit that generates at least one second evaluation target sound source by replacing at least one section of the first evaluation target sound source with a section of another sound source or by replacing at least one note with another note,
wherein the sound source evolution unit repeats a process of generating at least one third evaluation target sound source by applying at least one section replacement or note change among those of the at least one first evaluation target sound source and the at least one second evaluation target sound source whose preference is at or above a predetermined level.

A computer-readable recording medium having recorded thereon a program for performing the method according to any one of claims 1 to 11.