JPH0990978A

JPH0990978A - Music constitution automatic extracting method of music information

Info

Publication number: JPH0990978A
Application number: JP7246419A
Authority: JP
Inventors: Yumiko Matsuura; 由美子松浦; Seiji Kinohara; 誠司木ノ原
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-09-25
Filing date: 1995-09-25
Publication date: 1997-04-04
Anticipated expiration: 2015-09-25
Also published as: JP3388481B2

Abstract

PROBLEM TO BE SOLVED: To store all of the extracted music composition information in a small amount of storage by dividing music information into music compositions based on phrases detected from a detected voiced portion. SOLUTION: Automatic music composition extraction is carried out by a recording unit 1, which records music information and stores it in a music information file; an analysis file creation unit 2, which processes the music information file to create an analysis file; and a music information analysis unit 3, which analyzes the analysis file and extracts the music composition. That is, the recording unit 1 creates the music information file, the analysis file creation unit 2 processes the music information file accumulated in the recording unit 1, and the music information analysis unit 3 divides the analysis file into phrases. In this way, a voiced portion containing a voice or a melody is detected and separated, by means of the unvoiced portions, from music information containing one or more sounds and voices. The music information is then divided into music compositions based on the phrases, each of which is one segment of the music, detected from the voiced portion.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a method for automatically extracting the music composition of music information, and more particularly to a method that automatically extracts the music composition from music information without score information.

[0002]

[Description of the Related Art] When the music composition of music information is to be extracted, the music information can be expressed in an electronic music description language using a computer, the meter, pitch, number of beats, and chord progression of the music information can be obtained, and the phrases and music compositions can then be extracted based on the rules of chord progression. A music composition consists of one or more phrases, and a single piece of music generally contains a plurality of music compositions.

[0003] To cut out digest information, or the characteristic portion called the "sabi" (the hook of the song), from music information, this music composition is used as the unit of cutting. In this case, if the music information is expressed as a musical score or in a music description language, the music composition can easily be extracted by dividing the music information at each rest.

[0004]

[Problems to be Solved by the Invention] In the above method of extracting the music composition of music information, the music information must be re-expressed in an electronic music description language. Needless to say, this requires thorough familiarity with the electronic music description language, and re-expressing the original music information in it takes a long time. In addition, the re-expression work requires training in the skill of matching by ear, for each timbre, the original music information with its re-expression in the electronic music description language, so automating this work involves various difficulties.

[0005] Furthermore, even if the music composition can be extracted from music information re-expressed in an electronic music description language, the beats of a computer are marked precisely at fixed intervals, whereas the beats of a human performance vary in tempo. In this respect as well, it is difficult to associate the original information with the extracted music composition. The present invention provides a method for automatically extracting the music composition of music information that extracts the music composition from the original music information without requiring electronic re-description, and that stores all of the extracted music composition information in a small amount of storage.

[0006]

[Means for Solving the Problems] A method for automatically extracting the music composition of music information is constituted in which, from music information containing one or more sounds and voices, the voiced portions containing a voice or a melody are detected and separated by using the unvoiced portions, phrases, each being one segment of the music, are detected from the detected voiced portions, and the music information is divided into music compositions based on the detected phrases.

[0007] Further, a method for automatically extracting the music composition of music information is constituted in which both an original song A, which is a song with vocals, and an accompaniment A, obtained by removing the singing voice from the original song A, are recorded as music information files; the recorded music information files are processed to create analysis files containing only the music portion of each of the original song A and the accompaniment A; and these analysis files are analyzed. Also constituted is a method for automatically extracting the music composition of music information that uses music information recorded on a medium such as a compact disc or a phonograph record.

[0008] Furthermore, a method for automatically extracting the music composition of music information is constituted in which original karaoke is used as the original song A and the accompaniment A.

[0009]

[Embodiments of the Invention] For music information recorded on a commercially available medium such as a compact disc or a phonograph record, the present invention automatically extracts the music composition according to the presence or absence of a voice portion consisting of a singing voice, and accumulates the entire sequence of division points of the music compositions. An embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a diagram explaining the overall flow of the automatic music composition extraction method.

[0010] In FIG. 1, the automatic music composition extraction method is executed by a recording unit 1, which records music information and saves it in a music information file; an analysis file creation unit 2, which processes the music information file to create an analysis file; and a music information analysis unit 3, which analyzes the analysis file and extracts the music composition. The following description takes as an example the case of extracting the music composition of an original song A. Here, the original song A is a song with vocals, consisting of singing and other sounds. First, the recording unit 1 creates the music information files. The internal configuration of the recording unit 1 is as shown in FIG. 2. The input unit 11 records the original song A, which is a song with vocals, and an accompaniment A, which is either the original song A with the singing voice removed (generally called original karaoke) or the output of an audio device that cancels the voice and other sounds recorded at the center phase of the left and right channels. The file conversion unit 12 samples and quantizes each recording and accumulates it as a music information file.

[0011] The music information files accumulated in the recording unit 1 are processed by the analysis file creation unit 2. The internal configuration of the analysis file creation unit 2 is as shown in FIG. 2. The analysis file creation unit 2 processes the music information files of both the original song A and the accompaniment A accumulated in the recording unit 1 into a form that can be analyzed. Because the analysis requires the difference in power between the two songs, the two songs are synchronized. To synchronize them, the silence detection unit 21 first scans the music information file of each song from the beginning toward the end and detects the start point, which is the first point whose value exceeds the threshold given as an argument for judging silence. Similarly, it scans each music information file from the end toward the beginning and detects the end point, which is the first point whose value exceeds the threshold. Next, the music portion cutout unit 22 cuts out of each music information file the range from the start point to the end point detected by the silence detection unit 21, thereby creating analysis files containing only the music portion of each of the original song A and the accompaniment A.
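The silence trimming of paragraph [0011] can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, the samples are assumed to be a list of amplitudes, and "a value larger than the threshold" is assumed to mean that the magnitude of a sample exceeds the threshold.

```python
from typing import List, Optional, Tuple


def find_music_bounds(samples: List[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Return (start, end) indices of the first and last samples whose
    magnitude exceeds `threshold`, or None if every sample is silent."""
    # Scan forward from the beginning for the start point.
    start = next((i for i, s in enumerate(samples) if abs(s) > threshold), None)
    if start is None:
        return None
    # Scan backward from the end for the end point.
    end = next(i for i in range(len(samples) - 1, -1, -1) if abs(samples[i]) > threshold)
    return start, end


def cut_music_part(samples: List[float], threshold: float) -> List[float]:
    """Keep only the range from the start point to the end point (inclusive)."""
    bounds = find_music_bounds(samples, threshold)
    if bounds is None:
        return []
    start, end = bounds
    return samples[start:end + 1]
```

Applying `cut_music_part` to both the original song and the accompaniment with the same threshold would give the two music-only analysis files described above.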

[0012] Next, the music information analysis unit 3 divides the analysis file into phrases. The internal configuration of the music information analysis unit 3 is as shown in FIG. 2. The music information analysis unit 3 determines the start points and end points, which are the break points at which the voice is interrupted. A point is accepted on the condition that the proportion of unvoiced portions before the break point and of voiced portions after it is larger than a given threshold, and this threshold is required as an argument. In the following, this proportion is denoted M.

[0013] First, the start point determination unit 31 computes the correlation between the original song A and the accompaniment A and, taking the point at which the correlation value is largest as the start point of both songs, reliably synchronizes them. Next, referring also to FIG. 5, the difference data calculation unit 32 calculates, for both songs, the power at the head of each frame, using the 200-millisecond frame length generally adopted in speech recognition and synthesis processing. It takes the difference between the logarithm of the head power of the original song A and the logarithm of the head power of the accompaniment A, then likewise takes the difference of the head powers of the next frame, and so on, obtaining a difference for every frame.
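The synchronization and per-frame difference of paragraph [0013] might look like the sketch below. Several details are assumptions rather than part of the description: the names `best_lag` and `log_power_difference` are hypothetical, the correlation is a plain brute-force sum of products over a limited lag range, and "the power at the head of the frame" is taken as the mean squared amplitude of the first `head_len` samples of each frame. A small `eps` is added before taking the logarithm to avoid log(0).

```python
import math
from typing import List, Sequence


def best_lag(original: Sequence[float], accompaniment: Sequence[float], max_lag: int) -> int:
    """Lag (in samples) at which original[k + lag] best lines up with
    accompaniment[k], judged by the largest correlation value."""
    def corr(lag: int) -> float:
        total = 0.0
        for k, a in enumerate(accompaniment):
            j = k + lag
            if 0 <= j < len(original):
                total += original[j] * a
        return total
    return max(range(-max_lag, max_lag + 1), key=corr)


def log_power_difference(original: Sequence[float], accompaniment: Sequence[float],
                         sample_rate: int, frame_ms: int = 200,
                         head_len: int = 1, eps: float = 1e-12) -> List[float]:
    """Per-frame difference log10(P_original) - log10(P_accompaniment)."""
    frame_len = sample_rate * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("sample_rate too low for the chosen frame length")
    n_frames = min(len(original), len(accompaniment)) // frame_len

    def head_power(x: Sequence[float], start: int) -> float:
        head = x[start:start + head_len]
        return sum(s * s for s in head) / len(head)

    diffs = []
    for f in range(n_frames):
        start = f * frame_len
        p_orig = head_power(original, start)
        p_acc = head_power(accompaniment, start)
        diffs.append(math.log10(p_orig + eps) - math.log10(p_acc + eps))
    return diffs
```

Frames where the singing voice is present should show a larger difference than frames containing only accompaniment, which is what the later threshold steps rely on.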

[0014] The unvoiced portion threshold determination unit 33 determines the threshold for judging unvoiced portions, in which there is only accompaniment and no singing. Starting from the first difference, it searches in order for the first point at which the difference calculated by the difference data calculation unit 32 is larger than the threshold given as an initial value, and determines the maximum difference within the range from the beginning to that point as the unvoiced portion threshold. The break candidate detection unit 34 detects break points, which are the points at which voiced and unvoiced portions switch. It searches for voiced points, that is, points at which the difference is larger than the threshold determined by the unvoiced portion threshold determination unit 33. For each such point, it examines the 2 seconds of data before the point, 2 seconds being the minimum duration of an interlude as found from several sample songs, and the 2 seconds of data after the point, 2 seconds likewise being the minimum duration of a sung passage. If the proportion of points judged unvoiced in the preceding range and of points judged voiced in the following range is larger than M, the point is a voiced portion start point, that is, a break point candidate. Similarly, a point for which the proportion of points judged voiced in the preceding 2 seconds and of points judged unvoiced in the following 2 seconds is larger than M is also a break point candidate. The break point candidates are indicated by the vertical dotted lines in FIG. 6.
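The two steps of paragraph [0014] can be illustrated with the sketch below. It rests on several assumptions that are not stated in the description: the function names are hypothetical, the range "from the beginning to that point" excludes the point itself, the proportion M is checked separately for the window before and the window after the point, a candidate must sit exactly where the voiced/unvoiced judgment flips, and the 2-second window equals 10 frames of 200 ms.

```python
from typing import List, Sequence, Tuple


def unvoiced_threshold(diffs: Sequence[float], initial_threshold: float) -> float:
    """Largest difference seen before the first frame exceeding the initial threshold."""
    for i, d in enumerate(diffs):
        if d > initial_threshold:
            # Assumption: the range excludes the exceeding frame itself.
            return max(diffs[:i], default=initial_threshold)
    return max(diffs, default=initial_threshold)


def break_candidates(diffs: Sequence[float], threshold: float, ratio_m: float,
                     window: int = 10) -> Tuple[List[int], List[int]]:
    """Return (voiced_starts, unvoiced_starts) as lists of frame indices."""
    voiced = [d > threshold for d in diffs]
    voiced_starts: List[int] = []
    unvoiced_starts: List[int] = []
    for i in range(window, len(voiced) - window + 1):
        before = voiced[i - window:i]
        after = voiced[i:i + window]
        voiced_before = sum(before) / window
        voiced_after = sum(after) / window
        if voiced[i] and not voiced[i - 1]:
            # Unvoiced before, voiced after: start of a voiced portion.
            if (1 - voiced_before) > ratio_m and voiced_after > ratio_m:
                voiced_starts.append(i)
        elif not voiced[i] and voiced[i - 1]:
            # Voiced before, unvoiced after: start of an unvoiced portion.
            if voiced_before > ratio_m and (1 - voiced_after) > ratio_m:
                unvoiced_starts.append(i)
    return voiced_starts, unvoiced_starts
```

Raising `ratio_m` makes the detector stricter, so short drops in the difference inside a sung passage are less likely to be reported as break points.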

[0015] The inappropriate candidate point detection unit 35 focuses on those break point candidates located at the start of an unvoiced portion, and excludes candidates that should not be detected as break points, namely those arising from a nasal sound change (hatsuonbin) or a geminate sound change (sokuonbin) in the sung lyrics. As the condition, a candidate is regarded as correct when the difference before and after it changes as shown in FIGS. 12 and 13, and the points indicated by the solid arrows marked × in FIG. 7 are excluded as points that do not satisfy this condition.

[0016] The cut point candidate detection unit 36 detects the points S_0 to S_n at the boundaries between phrases (hereinafter referred to as cut points). From the sequence of points remaining in the array after the inappropriate candidate point detection unit 35, it picks out as cut points those with larger movement in the difference, by doubling the threshold on the maximum slope that was used as the detection criterion in the break candidate detection unit 34. Next, in the case of FIG. 12, a long unvoiced portion continues, so the difference may fluctuate up and down immediately before the rise. Taking this into account, points where the difference rises steeply but the four frames before the point cannot be judged unvoiced are excluded from the cut points, and the points indicated by the vertical dotted lines in FIG. 8 are selected as points that can more accurately be judged to be the start of a voiced portion.
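A rough sketch of the cut point filtering in paragraph [0016] follows. The description does not define the slope precisely, so this sketch assumes it is the rise of the difference from the previous frame to the candidate frame; `select_cut_points`, `slope_thr`, and `lookback` are hypothetical names, and candidates with fewer than four preceding frames are simply skipped.

```python
from typing import List, Sequence


def select_cut_points(candidates: Sequence[int], diffs: Sequence[float],
                      unvoiced_thr: float, slope_thr: float,
                      lookback: int = 4) -> List[int]:
    """Keep candidates whose rise exceeds twice the slope threshold and whose
    preceding `lookback` frames can all be judged unvoiced."""
    cut_points = []
    for i in candidates:
        if i < lookback:
            continue  # assumption: not enough preceding frames to judge
        rise = diffs[i] - diffs[i - 1]
        preceded_by_unvoiced = all(d <= unvoiced_thr for d in diffs[i - lookback:i])
        if rise > 2 * slope_thr and preceded_by_unvoiced:
            cut_points.append(i)
    return cut_points
```

In this sketch a candidate is dropped either because its rise is too gentle or because voiced frames appear among the four frames before it.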

[0017] Referring to FIGS. 14 and 15, the singing start point selection unit 37 searches the sequence of points selected by the cut point candidate detection unit 36 for the singing start points P_0 to P_n, which are candidates for the start point of each chorus, that is, of what is generally called the first verse, second verse, and so on of the lyrics. First, it is assumed that each chorus start point is preceded by an introductory portion, that is, a portion judged to be unvoiced that lasts for several frames before the singing start point. Accordingly, the length L of the introductory portion is initially set to one frame, and the points that are preceded by an unvoiced portion lasting L or more are selected from the cut point sequence. Based on an examination of several sample songs, the number of choruses in one song is assumed to be 8 or less. If examining all the points yields 8 or more start points, L is too short and too many points qualify, so L is increased by one frame and the selection is repeated; the selection stops once the number of start points is 8 or less. The selected singing start points are indicated by the solid arrows in FIG. 9.
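The iterative selection in paragraph [0017] can be sketched as below. The names are hypothetical, `voiced` is assumed to be a per-frame list of voiced/unvoiced judgments, and the stopping rule is read here as "increase L while more than `max_choruses` points are selected", which is one possible reading of the "8 or more" and "8 or less" wording.

```python
from typing import List, Sequence, Tuple


def select_singing_start_points(cut_points: Sequence[int], voiced: Sequence[bool],
                                max_choruses: int = 8) -> Tuple[List[int], int]:
    """Return (selected points, final introduction length L in frames)."""
    def unvoiced_run_before(i: int) -> int:
        # Count consecutive unvoiced frames immediately before frame i.
        n = 0
        while i - n - 1 >= 0 and not voiced[i - n - 1]:
            n += 1
        return n

    runs = {p: unvoiced_run_before(p) for p in cut_points}
    length = 1  # initial value of L: one frame
    selected = [p for p in cut_points if runs[p] >= length]
    while len(selected) > max_choruses:
        length += 1
        selected = [p for p in cut_points if runs[p] >= length]
    return selected, length
```

The loop always terminates, because once L exceeds the longest unvoiced run no point is selected at all.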

[0018] The chorus start point determination unit 38 determines the start point of each of the first chorus and the second chorus. From the singing start point sequence P_0 to P_n selected by the singing start point selection unit 37, it selects the points that become the start points of the first chorus and the second chorus. Normally, the first singing start point P_0 is the start point of the first chorus and the second singing start point P_1 is the start point of the second chorus. However, when several phrases are present in the opening portion of the song, P_0 is not the start point of the first chorus. Therefore, using the cut point sequence S_0 to S_n detected by the cut point candidate detection unit 36, when the first and second singing start points P_0 and P_1 coincide with the first and second cut points S_0 and S_1 as in FIG. 9, the song is judged to have the structure of introduction, several phrases, and an interlude before the start of the first chorus, and the second point P_1 of the start point sequence is taken as the start point of the first chorus. The chorus start points are indicated by the solid arrows in FIG. 10.
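The rule of paragraph [0018] is small enough to sketch directly. The function name is hypothetical, and one detail is an assumption: when P_1 is taken as the start of the first chorus, this sketch takes the next singing start point P_2 as the start of the second chorus, which the description does not state explicitly.

```python
from typing import Sequence, Tuple


def chorus_start_points(singing_points: Sequence[int],
                        cut_points: Sequence[int]) -> Tuple[int, int]:
    """Return (start of first chorus, start of second chorus) as frame indices."""
    if len(singing_points) < 2:
        raise ValueError("at least two singing start points are required")
    # Opening phrases are assumed present when P_0, P_1 coincide with S_0, S_1.
    has_opening = (len(cut_points) >= 2 and len(singing_points) >= 3
                   and singing_points[0] == cut_points[0]
                   and singing_points[1] == cut_points[1])
    first = 1 if has_opening else 0
    return singing_points[first], singing_points[first + 1]
```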

[0019] If the first chorus and the second chorus have the same structure, rests occur at the same places in the score, so cut points are detected at the same places. The music composition detection unit 39 detects points satisfying this condition as music composition start points. First, the data length L_1 from the first point X_i to the second point X_{i+1} of the first chorus is compared with the data length L_2 from the first point Y_j to the second point Y_{j+1} of the second chorus. If L_1 = L_2, the phrases between the first and second points of each chorus are considered to be the same, and the range from the first point to the second point becomes the first music composition. Similarly, the second points are assigned to X_i and Y_j, the third points are assigned to X_{i+1} and Y_{j+1}, and the data lengths L_1 and L_2 are compared, and so on.

[0020] There may, however, be differences in the way each chorus is sung, which shift the cut points. Therefore, a tolerance W within which the data lengths are regarded as equal is set when comparing the data lengths of the two choruses. The initial value of W is 0, and the same phrase is considered to be present when

|L_1 - L_2| < W ... (a)

holds. With W = 0, the same phrase is not considered to be present unless the cut points appear at exactly the same positions. When

|L_1 - L_2| > W ... (b)

holds, the third point of the first chorus is assigned to X_{i+1}. That is, L_1 is taken as the data length from the first point to the third point of the first chorus and is compared with L_2. If expression (a) then holds, the same phrase is present from the first point to the third point of the first chorus and from the first point to the second point of the second chorus, and this range is the first music composition. If expression (a) does not hold, X_{i+1} is returned to the second point and Y_{j+1} is advanced by one; that is, the data length on the second chorus side is changed and the comparison is made in the same way. Once the first music composition has been determined, the end points of the first music composition in each chorus are set as X_i and Y_j, the next cut point candidates are set as X_{i+1} and Y_{j+1}, and the comparison and determination are carried out in the same way.

[0021] The division processing proceeds with the tolerance W, and when it is complete, each chorus is checked to see whether it has been divided into at least 3 and at most 5 parts. This range of 3 to 5 parts is a statistical number of compositions obtained by examining several sample songs. If the number of divisions falls within this range, the division processing ends. If it does not, W is increased by one frame and the division processing is performed again from the first point. Only the start point of each music composition obtained by the above processing is stored as the composition point sequence. The result of the division into music compositions is indicated by the solid double-headed arrows in FIG. 11.
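The comparison of data lengths in paragraphs [0019] and [0020] and the retry loop of paragraph [0021] can be sketched together as follows. This is a hedged illustration, not the claimed procedure: `match_sections`, `divide_into_compositions`, and `max_w` are hypothetical; the inputs are assumed to be the cut points of each chorus as frame indices, with the last entry marking the end of the chorus; the match test uses `<=` so that W = 0 accepts exactly equal lengths, as the text describes; and a failed comparison simply abandons the current W, since the description does not say what happens when neither extension matches.

```python
from typing import List, Optional, Sequence, Tuple


def match_sections(x: Sequence[int], y: Sequence[int],
                   w: int) -> Optional[List[Tuple[int, int]]]:
    """Pair up sections of chorus 1 (x) and chorus 2 (y) whose lengths agree
    within tolerance w. Returns the section start points, or None on failure."""
    i = j = 0
    starts: List[Tuple[int, int]] = []
    while i < len(x) - 1 and j < len(y) - 1:
        # Try equal steps first, then extend chorus 1, then extend chorus 2.
        for di, dj in ((1, 1), (2, 1), (1, 2)):
            if i + di < len(x) and j + dj < len(y):
                l1 = x[i + di] - x[i]
                l2 = y[j + dj] - y[j]
                if abs(l1 - l2) <= w:
                    starts.append((x[i], y[j]))
                    i, j = i + di, j + dj
                    break
        else:
            return None  # no pairing matched for this tolerance
    return starts


def divide_into_compositions(x: Sequence[int], y: Sequence[int],
                             min_parts: int = 3, max_parts: int = 5,
                             max_w: int = 50) -> Optional[Tuple[List[Tuple[int, int]], int]]:
    """Increase W one frame at a time until each chorus splits into 3 to 5 parts."""
    for w in range(max_w + 1):
        starts = match_sections(x, y, w)
        if starts is not None and min_parts <= len(starts) <= max_parts:
            return starts, w
    return None
```

Only the returned start points need to be stored, which matches the remark that the composition point sequence holds nothing but the start of each music composition.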

[0022]

[Effects of the Invention] As described above, the present invention can automatically obtain the start point of each music composition. Therefore, by associating these points with the music information used for the division, once the start point of the digest information is specified, digest information that follows the music composition can be created automatically, without searching for an end point.

[0023] Moreover, even when the length of the presented digest information must be changed around the specified digest information, the music is delimited by music composition, so it is never cut off in the middle of the singing voice by prioritizing time alone and ignoring the musical structure. In addition, since only the start point of each music composition needs to be stored as a point sequence, the invention can also be applied where music information is to be presented using a small storage capacity.
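As an illustration of how the stored start points could be used to build digest information, the sketch below looks up the music composition containing a requested start point and returns a range that ends on a composition boundary. The function and parameter names are hypothetical, and snapping the requested point back to the start of its composition is an assumption made for this example.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def digest_range(composition_starts: Sequence[int], song_end: int,
                 start_point: int, n_compositions: int = 1) -> Tuple[int, int]:
    """Return (begin, end) of a digest spanning `n_compositions` compositions.

    `composition_starts` must be sorted in ascending order."""
    idx = bisect_right(composition_starts, start_point) - 1
    if idx < 0:
        raise ValueError("start_point lies before the first music composition")
    begin = composition_starts[idx]
    end_idx = idx + n_compositions
    # The digest ends where a later composition starts, or at the end of the song.
    end = composition_starts[end_idx] if end_idx < len(composition_starts) else song_end
    return begin, end
```

Changing `n_compositions` lengthens or shortens the digest in whole compositions, so the cut never falls in the middle of a sung passage.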

[Brief description of drawings]

FIG. 1 is a diagram explaining the overall configuration of the music composition division of the present invention.

FIG. 2 is a diagram explaining the processing flow of the recording unit.

FIG. 3 is a diagram explaining the processing flow of the analysis file creation unit.

FIG. 4 is a diagram explaining the processing flow of the music information analysis unit.

FIG. 5 is a diagram showing changes in the difference data.

FIG. 6 is a diagram showing the detected break candidate points.

FIG. 7 is a diagram showing inappropriate candidate points, which are break candidate points that cannot be music composition boundaries.

FIG. 8 is a diagram showing the cut point candidates.

FIG. 9 is a diagram showing the singing start points.

FIG. 10 is a diagram showing the chorus start points.

FIG. 11 is a diagram showing the result of division into music compositions.

FIG. 12 is a diagram showing changes in the difference data near a long rest.

FIG. 13 is a diagram showing changes in the difference data near a short rest.

FIG. 14 is a diagram showing the relationship between the singing start points and the chorus start points.

FIG. 15 is a diagram explaining the processing flow of the singing start point selection unit.

FIG. 16 is a diagram explaining the processing flow of the music composition detection unit.

[Explanation of symbols]

1 Recording unit
11 Input unit
12 File conversion unit
2 Analysis file creation unit
21 Silence detection unit
22 Music portion cutout unit
3 Music information analysis unit
31 Start point determination unit
32 Difference data calculation unit
33 Unvoiced portion threshold determination unit
34 Break candidate detection unit
35 Inappropriate candidate point detection unit
36 Cut point candidate detection unit
37 Singing start point selection unit
38 Chorus start point determination unit
39 Music composition detection unit

Claims

[Claims]

1. A method for automatically extracting the music composition of music information, characterized in that, from music information containing one or more sounds and voices, voiced portions containing a voice or a melody are detected and separated by using unvoiced portions, phrases, each being one segment of the music, are detected from the detected voiced portions, and the music information is divided into music compositions based on the detected phrases.

2. The method for automatically extracting the music composition of music information according to claim 1, characterized in that both an original song A, which is a song with vocals, and an accompaniment A, obtained by removing the singing voice from the original song A, are recorded as music information files, the recorded music information files are processed to create analysis files containing only the music portion of each of the original song A and the accompaniment A, and the analysis files containing only the music portion are analyzed.

3. The method for automatically extracting the music composition of music information according to claim 1, characterized in that music information recorded on a medium such as a compact disc or a phonograph record is used.

4. The method for automatically extracting the music composition of music information according to claim 3, characterized in that original karaoke is used as the original song A and the accompaniment A.