JP2004310054A - Music file generation apparatus, music file generation method, and recording medium

Info

Publication number: JP2004310054A
Application number: JP2004017202A
Authority: JP
Inventors: Hirohito Kimoto; 裕仁木本
Original assignee: SUNS K KK
Current assignee: SUNS K KK
Priority date: 2003-03-24
Filing date: 2004-01-26
Publication date: 2004-11-04

Abstract

PROBLEM TO BE SOLVED: To allow music consisting of a singing voice and BGM (background music) to be used as a ringtone even on current mobile phones that are not equipped with a large-capacity memory, an MP3 (MPEG-1 Audio Layer 3) decoder, or the like.

SOLUTION: The apparatus comprises a singing voice extraction part 2 that extracts a person's singing voice from digital sound source data 11 and obtains singing voice data 12 in ADPCM (adaptive differential pulse code modulation) format; a BGM generation part 3 that generates BGM data 13 in MIDI (Musical Instrument Digital Interface) format; a MIDI adjustment part 4 that generates simulated singing voice data in MIDI format matching the extracted singing voice and adds it to the BGM data 13; and a file generation part 5 that processes the singing voice data 12 and the BGM + simulated singing voice data 14 into one music file 15. The overall amount of data is reduced by heavily band-limiting the singing voice part and generating the BGM part in MIDI format, and the quality of the reproduced singing voice is kept at or above a prescribed level by supplementing the singing voice part degraded by the band limitation with the MIDI data.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a music file generation device, a music file generation method, and a recording medium storing a music file with a specific data structure. More particularly, it relates to a method of generating a music file composed of a human singing voice and BGM (background music), and to the data structure of that music file.

Mobile phones have spread explosively and are now a device that almost everyone owns. On early mobile phones, the ringtone was little more than a monotonous pattern tone repeated over and over. In time, however, in response to market demand for something more individual, the so-called "ringtone melody" appeared, in which the ringtone is a melody created from MIDI (Musical Instrument Digital Interface) data.

A few years ago, mobile phones with a built-in PCM sound source also appeared, and the so-called "ringtone voice," which uses this PCM sound source to play a ringtone in the voice of an artist or the like, is now available as well. These ringtone melodies and ringtone voices can be downloaded from sites on the Internet. By downloading content to their liking, users can make their mobile phones distinctly their own.

More recently, advances in mobile phones have led to a new kind of system (hereinafter the "ringtone song" (Chaku-Uta) system) that uses the music recorded on a CD (compact disc) or the like as the ringtone itself: not a mere melody or a mere human voice, but the song itself, with the human singing voice and the BGM combined. In this kind of ringtone song system, a portion is cut out of a CD sound source and compressed in a format such as MP3 (MPEG-1 Audio Layer 3), and the compressed data is used as the content for distribution.

However, the conventional ringtone song system simply cuts out part of a CD sound source and uses it as distribution content. As a result, ringtone song content is far larger than a conventional ringtone melody (MIDI data) or ringtone voice (PCM data), and a mobile phone needs a large-capacity memory to download and use it.

To play back a ringtone song so that it is at least recognizable as a piece of music, a reasonably long stretch of the CD sound source has to be cut out. Even if the cut-out data is compressed in MP3 format, the small memory of existing mobile phones cannot accommodate it. Moreover, existing mobile phones can play MIDI and PCM sound sources but cannot play MP3 data.

Consequently, the conventional ringtone song system has the problem that the service can be used only on new models that have a very large memory and an MP3 decoder. One reason ringtone melodies and ringtone voices became so popular is that they could use, as is, the MIDI and PCM playback functions that mobile phones provided as standard. It is therefore desirable that the ringtone song service also be usable on existing models.

The present invention has been made in view of these circumstances. Its object is to allow even current mobile phones that lack a large-capacity memory, an MP3 decoder, or the like to use music composed of a singing voice and BGM as a ringtone.

The music file generation device of the present invention comprises: singing voice extraction means for extracting the singing voice from digital audio data in which a singing voice is mixed with audio other than the singing voice, and obtaining singing voice data in PCM format; MIDI generation means for generating BGM data in MIDI format, generating simulated singing voice data in MIDI format to match the singing voice extracted by the singing voice extraction means, and adjusting the MIDI data by adding the simulated singing voice data to the BGM data; and file generation means for processing the PCM-format singing voice data generated by the singing voice extraction means and the MIDI-format BGM + simulated singing voice data generated by the MIDI generation means into a single music file. The audio other than the singing voice is, for example, BGM or noise.

In another aspect of the present invention, the singing voice extraction means band-limits the digital audio data, in which the singing voice is mixed with audio other than the singing voice, to a predetermined frequency band corresponding to the singing voice.

In another aspect of the present invention, the music file generated by the file generation means includes MIDI playback control information for playing back the MIDI-format BGM + simulated singing voice data generated by the MIDI generation means, and PCM playback control information for playing back the PCM-format singing voice data generated by the singing voice extraction means in synchronization with the simulated singing voice data.

The music file generation method of the present invention comprises: a first step of extracting the singing voice from digital audio data in which a singing voice is mixed with audio other than the singing voice, and obtaining singing voice data in PCM format; a second step of generating BGM data in MIDI format; a third step of generating simulated singing voice data in MIDI format to match the singing voice extracted in the first step, and adjusting the MIDI data by adding the simulated singing voice data to the BGM data generated in the second step; and a fourth step of processing the PCM-format singing voice data generated in the first step and the MIDI-format BGM + simulated singing voice data adjusted in the third step into a single music file.

In another aspect of the present invention, the first step band-limits the digital audio data, in which the singing voice is mixed with audio other than the singing voice, to a predetermined frequency band corresponding to the singing voice.

In another aspect of the present invention, the fourth step performs an adjustment that synchronizes the playback timing of the PCM-format singing voice data generated in the first step with that of the MIDI-format BGM + simulated singing voice data generated in the third step.

In another aspect of the present invention, the music file generated in the fourth step includes MIDI playback control information for playing back the MIDI-format BGM + simulated singing voice data generated in the third step, and PCM playback control information for playing back the PCM-format singing voice data generated in the first step in synchronization with the simulated singing voice data.

The computer-readable recording medium of the present invention has recorded on it a music file whose data structure includes PCM data consisting of singing voice data in PCM format, and MIDI data in which simulated singing voice data in MIDI format, generated to match the singing voice of the PCM data, is added to BGM data in MIDI format, the PCM data and the MIDI data being integrated into a single file.

In another aspect of the present invention, the music file includes MIDI playback control information for playing back the MIDI data, and PCM playback control information for playing back the PCM data in synchronization with the MIDI data.

As described above, the present invention can reduce the data size of the music file until it fits within the file-size limit that current mobile phone models impose on ringtones, while keeping the quality of the reproduced audio at or above a predetermined level. As a result, the ringtone song service can be used even on current mobile phone models that lack a large-capacity memory, an MP3 decoder, or the like.

An embodiment of the present invention is described below with reference to the drawings.
FIG. 1 is a diagram showing an example configuration of the music file generation system according to this embodiment. As shown in FIG. 1, the music file generation system 100 of this embodiment comprises a recording unit 1, a singing voice extraction unit 2, a BGM generation unit 3, a MIDI adjustment unit 4, and a file generation unit 5.

The recording unit 1 records digital sound source data from a CD (Compact Disc), DVD (Digital Versatile Disc), or the like onto a computer's hard disk or similar storage in WAV format. For example, by placing a commercially available CD in the CD drive of a personal computer (hereinafter, PC) and recording it to the PC's built-in hard disk, digital sound source data 11 in WAV format can be obtained.

The WAV format, also called the WAVE format, is the standard audio file format of Windows (registered trademark). It is defined as a storage format for recording digital audio signals. Any compression method can be used with it. By default it supports encodings such as PCM (uncompressed) and ADPCM (Adaptive Differential Pulse Code Modulation).

The singing voice extraction unit 2 cuts out the desired few measures (for example, the opening or the chorus of the song) from the WAV-format digital sound source data 11, in which a human singing voice and BGM are mixed, then discards the BGM and extracts only the human singing voice. In doing so, it converts the WAV-format digital sound source data 11 into singing voice data 12 in ADPCM format, in accordance with the playback format implemented in the mobile phone.

Specifically, in the case of a CD, for example, the digital sound source data 11 sampled at 44.1 kHz is band-limited to a predetermined frequency band (4 kHz or 8 kHz) corresponding to the human singing voice. That is, the digital sound source data 11 is sampled at fixed intervals corresponding to 4 kHz or 8 kHz. Simple sampling alone would yield PCM format, but here the data size is reduced further by exploiting the fact that sound changes continuously and recording only the difference from the immediately preceding sample. This is the ADPCM format.
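The band limitation and difference coding described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the codec a handset actually uses: the function names `band_limit`, `delta_encode`, and `delta_decode` are hypothetical, the low-pass filter is a crude block average, and the coder is plain difference coding without the adaptive step size that real ADPCM adds.

```python
import math

def band_limit(samples, src_rate=44100, dst_rate=8000):
    """Crude resampler: average each block of source samples (a simple
    low-pass filter) and emit one output sample per destination period."""
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for k in range(n_out):
        start = k * src_rate // dst_rate
        end = max(start + 1, (k + 1) * src_rate // dst_rate)
        window = samples[start:end]
        out.append(round(sum(window) / len(window)))
    return out

def delta_encode(samples):
    """Keep the first sample, then only the difference from the previous one."""
    if not samples:
        return []
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out

# One second of a 440 Hz test tone at the CD sampling rate of 44.1 kHz.
tone = [round(10000 * math.sin(2 * math.pi * 440 * n / 44100))
        for n in range(44100)]
voice = band_limit(tone)        # resampled to 8 kHz
encoded = delta_encode(voice)   # differences are small when the sound changes smoothly
```

Because neighbouring samples of a smoothly varying signal are close in value, the differences need fewer bits than the samples themselves, which is the property the paragraph above relies on.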

The BGM generation unit 3 generates BGM data 13 in MIDI format. Here, for example, a PC is equipped with a MIDI sound source, and the BGM is produced by DTM (desktop music) using an application program called sequencer software installed on that PC. The BGM data 13 generated here is the BGM corresponding to the portion discarded by the singing voice extraction unit 2. DTM is only one example of how the MIDI data may be generated, and the present invention is not limited to it.

Combining the singing voice data 12 extracted by the singing voice extraction unit 2 with the BGM data 13 generated by the BGM generation unit 3 yields the same piece of music as the original digital sound source, such as a CD. The singing voice data 12 is produced by heavily band-limiting the original digital sound source data 11, so its data size is considerably reduced. The BGM data 13 is in MIDI format, so its data size is small to begin with. The total is therefore far smaller than data obtained by simply cutting out part of a CD sound source and compressing it in MP3 format.
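A back-of-the-envelope estimate illustrates the scale of the reduction. The CD parameters (44.1 kHz, 16-bit, stereo) are standard; the 30-second clip length, the 128 kbit/s MP3 bit rate, and the 4-bit mono ADPCM encoding are assumptions chosen for illustration and do not come from the source.

```python
def sample_data_bytes(seconds, sample_rate, bits_per_sample, channels):
    """Size in bytes of fixed-rate sample data."""
    return seconds * sample_rate * bits_per_sample * channels // 8

CLIP_SECONDS = 30  # assumed clip length

cd_clip = sample_data_bytes(CLIP_SECONDS, 44100, 16, 2)   # raw CD audio
mp3_clip = CLIP_SECONDS * 128_000 // 8                    # assumed 128 kbit/s MP3
adpcm_8k = sample_data_bytes(CLIP_SECONDS, 8000, 4, 1)    # assumed 4-bit mono ADPCM
adpcm_4k = sample_data_bytes(CLIP_SECONDS, 4000, 4, 1)

for name, size in [("CD", cd_clip), ("MP3", mp3_clip),
                   ("ADPCM 8 kHz", adpcm_8k), ("ADPCM 4 kHz", adpcm_4k)]:
    print(f"{name:12s} {size:>9,d} bytes")
```

Under these assumptions the ADPCM vocal track is a fraction of the size of the MP3 clip, and the MIDI accompaniment adds comparatively little on top of it.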

However, the singing voice data 12 extracted by the singing voice extraction unit 2 is degraded, and if played back as is it can hardly be recognized as a human voice. Raising the sampling frequency would suppress the degradation, but it would also increase the size of the singing voice data 12. In this embodiment, therefore, the human singing voice is reinforced with MIDI data, which keeps the quality of the output singing voice at or above a certain level while avoiding any growth in the size of the singing voice data 12. The MIDI adjustment unit 4 is used for this purpose.

The MIDI adjustment unit 4 generates simulated singing voice data in MIDI format that imitates the singing voice extracted by the singing voice extraction unit 2, matching its pitch, tempo, timbre, volume, and so on. It then adds this simulated singing voice data to the BGM data to adjust the MIDI data. This adjustment of the MIDI data is also carried out by DTM, for example.
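Adding the simulated singing voice to the BGM can be pictured as merging two lists of note events. The sketch below is only a model of that idea: the `Note` record, the `add_simulated_voice` function, and the choice of a dedicated channel for the voice are hypothetical, and no real MIDI library or file format is involved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Note:
    start_tick: int   # onset time in sequencer ticks
    duration: int     # length in ticks
    pitch: int        # MIDI note number, 0-127
    velocity: int     # loudness, 0-127
    channel: int      # MIDI channel, 0-15

def add_simulated_voice(bgm, melody, voice_channel=15):
    """Return the BGM notes plus the vocal melody moved to its own channel,
    ordered by onset time."""
    voice = [Note(n.start_tick, n.duration, n.pitch, n.velocity, voice_channel)
             for n in melody]
    return sorted(bgm + voice, key=lambda n: (n.start_tick, n.channel))

bgm = [Note(0, 480, 48, 80, 0), Note(480, 480, 55, 80, 0)]
# Melody following the pitch and volume of the extracted voice (made-up values).
melody = [Note(240, 240, 72, 100, 0), Note(480, 480, 74, 96, 0)]
combined = add_simulated_voice(bgm, melody)
```

Keeping the voice on a separate channel is one possible design; it lets the timbre of the simulated voice be chosen independently of the accompaniment.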

Played back on its own, the simulated singing voice data generated here does not sound like a human singing voice. When it is played back together with the singing voice data 12 extracted by the singing voice extraction unit 2, however, the simulated singing voice data in MIDI format neatly fills in the degraded parts of the ADPCM-format singing voice data 12, and the result sounds convincingly like a human singing voice.

The file generation unit 5 processes the ADPCM-format singing voice data 12 generated by the singing voice extraction unit 2 and the MIDI-format BGM + simulated singing voice data 14 adjusted by the MIDI adjustment unit 4 into a single music file 15. The music file 15 generated here is written out in the proprietary format of the mobile phone carrier. For DoCoMo, for example, the music file 15 is generated in MLD format in accordance with MFi (Melody Format for i-mode; i-mode is a registered trademark).

As noted above, it is important that the ADPCM singing voice data 12 and the MIDI simulated singing voice data be played back simultaneously with no offset between them. When the MLD-format music file 15 is generated, therefore, an adjustment is made to synchronize the playback timing of the singing voice data 12 and the BGM + simulated singing voice data 14. Specifically, the binary performance position information defined in the MLD format (performance start and end positions, sounding duration, and so on) is set appropriately for both the singing voice data 12 and the BGM + simulated singing voice data 14.

The functional blocks 1 to 5 of the music file generation system 100 configured as described above are in practice built from a computer's CPU or MPU, RAM, ROM, and so on, and can be realized by running a program stored in the RAM or ROM.

FIG. 2 is a conceptual diagram illustrating the data structure of the music file 15. In general, an MLD file has three parts: a file header part containing the identifier of the file itself, a data information part containing information about the file's data, and a track part containing the actual music data. FIG. 2 schematically shows the structure of the track part.

As shown in FIG. 2, the music file 15 contains the ADPCM-format singing voice data 12 and the MIDI-format BGM + simulated singing voice data 14. In FIG. 2, the horizontal axis represents time, and the hatched areas indicate the playback timing of the BGM 21, the simulated singing voice 22, and the singing voice 23, respectively. In the example of FIG. 2, the MIDI BGM 21 plays continuously from beginning to end, and the MIDI simulated singing voice 22 plays in two places along the way. The figure conveys that the ADPCM singing voice 23 plays at the same time as the simulated singing voice 22.

In the MIDI-format BGM + simulated singing voice data 14, the BGM 21 part and the simulated singing voice 22 part may be generated as separate MIDI data or as a single piece of MIDI data. In the former case, the performance position information of the BGM 21 and that of the simulated singing voice 22 are set separately. In the latter case, the BGM 21 and the simulated singing voice 22 are defined as chord data: a single piece of MIDI data is defined as a chord of the BGM 21 alone at times when the simulated singing voice 22 is silent, and as a chord combining the BGM 21 and the simulated singing voice 22 at times when the simulated singing voice 22 sounds. In this case, the performance position information is set for that single piece of MIDI data.

For the ADPCM-format singing voice data 12, on the other hand, the performance position information of the singing voice 23 is set so that the singing voice 23 plays at the same time as the simulated singing voice 22.

In this way, the music file 15 of this embodiment includes the MIDI playback control information needed to play back the MIDI-format BGM + simulated singing voice data 14 at the appropriate timing, and the PCM playback control information needed to play back the ADPCM-format singing voice data 12 at the appropriate timing in synchronization with the BGM + simulated singing voice data 14.
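The timing relationship of FIG. 2 can be modelled with a small data structure. This is a conceptual sketch only: the `Segment` record, the millisecond units, and the `voice_is_synchronized` check are hypothetical and do not reflect the actual binary layout of an MLD file.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    kind: str       # "bgm" or "sim_voice" (MIDI), or "voice" (ADPCM)
    start_ms: int   # performance start position
    end_ms: int     # performance end position

def voice_is_synchronized(segments):
    """True if the ADPCM voice segments and the MIDI simulated-voice
    segments cover exactly the same time ranges."""
    sim = {(s.start_ms, s.end_ms) for s in segments if s.kind == "sim_voice"}
    voice = {(s.start_ms, s.end_ms) for s in segments if s.kind == "voice"}
    return voice == sim

# Mirrors FIG. 2: BGM throughout, with the voice sounding in two places.
track = [
    Segment("bgm", 0, 30000),
    Segment("sim_voice", 5000, 12000),
    Segment("voice", 5000, 12000),
    Segment("sim_voice", 18000, 26000),
    Segment("voice", 18000, 26000),
]
print(voice_is_synchronized(track))
```

A check of this kind could be run before writing the file to catch a vocal segment whose start or end position drifted away from its simulated counterpart.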

FIG. 3 is a flowchart showing the processing procedure of the music file generation method according to this embodiment. In FIG. 3, the recording unit 1 first records the digital sound source data 11 from a CD, DVD, or the like onto a computer's hard disk or similar storage in WAV format (step S1). Next, the singing voice extraction unit 2 cuts out the desired portion (the opening of the song, the chorus, or the like) from the recorded WAV-format digital sound source data 11 (step S2). The cut-out is not limited to one portion and may consist of several. Several cut-out portions may also be joined into one.

This cut-out may be performed according to instructions that a user gives with a keyboard, mouse, or the like, or it may be performed automatically by the computer. When the computer does it automatically and the opening of the song is wanted, for example, specifying the number of measures to cut out is enough for the corresponding portion to be cut out automatically. When the chorus is wanted, it can be located by detecting cues such as the onset of backing vocals, a change in volume, or a change in the mood of the music, and then cut out automatically.
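Cutting out the opening by measure count reduces to a short calculation once the tempo is known. The sketch below assumes that the tempo (`bpm`) and the number of beats per measure are supplied by the operator; the source does not say how they are obtained, and the function names are hypothetical.

```python
def samples_for_measures(measures, bpm, beats_per_measure=4, sample_rate=44100):
    """Number of samples spanned by the given number of measures."""
    seconds = measures * beats_per_measure * 60 / bpm
    return round(seconds * sample_rate)

def cut_head(samples, measures, bpm, beats_per_measure=4, sample_rate=44100):
    """Return the first `measures` measures of a recording."""
    count = samples_for_measures(measures, bpm, beats_per_measure, sample_rate)
    return samples[:count]

# Eight measures of a 4/4 song at 120 beats per minute last 16 seconds.
print(samples_for_measures(8, 120))
```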

The singing voice extraction unit 2 then band-limits the cut-out digital sound source data 11 to a predetermined frequency band (4 kHz or 8 kHz) corresponding to the human singing voice, thereby discarding the BGM and extracting only the human singing voice (step S3). This produces the singing voice data 12 in ADPCM format. When the cut-out is performed on the basis of a user's instruction, steps S2 and S3 may be performed in the reverse order.

The BGM generation unit 3 generates, in MIDI format and by DTM for example, the BGM data 13 corresponding to the portion discarded by the singing voice extraction unit 2 (step S4). How MIDI-format BGM data 13 sounds depends heavily on the built-in sound source of each mobile phone model. The expression is therefore adjusted for each model using MML (Music Markup Language) (step S5). Next, the MIDI adjustment unit 4 generates simulated singing voice data in MIDI format that imitates the singing voice extracted by the singing voice extraction unit 2, and adds it to the BGM data to adjust the MIDI data (step S6). The processing of steps S1 to S3 and the processing of steps S4 to S6 may be performed in the reverse order.

Finally, the file generation unit 5 processes the ADPCM-format singing voice data 12 generated in steps S1 to S3 and the MIDI-format BGM + simulated singing voice data 14 generated in steps S4 to S6 into a single music file 15 (step S7). Here the file is written out as binary data in the format of the mobile phone carrier. The example above described DoCoMo's MLD format, but the music file 15 is generated in PMD format for au and in SMD format for J-Phone. Music files 15 for several carriers may be generated for a single song.
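Producing one file per carrier amounts to a lookup from carrier to format. In the sketch below the three format names come from the text above, while the lower-case file extensions, the carrier keys, and the function names are assumptions made for illustration.

```python
# Carrier-to-format table; the format names are those given in the text.
CARRIER_FORMATS = {"docomo": "mld", "au": "pmd", "j-phone": "smd"}

def output_filename(title, carrier):
    """File name for one carrier; the extension is an assumed convention."""
    return f"{title}.{CARRIER_FORMATS[carrier.lower()]}"

def output_filenames(title, carriers=tuple(CARRIER_FORMATS)):
    """File names for several carriers generated from the same song."""
    return [output_filename(title, c) for c in carriers]

print(output_filenames("song"))
```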

FIG. 4 is a diagram showing an example configuration of a music distribution system according to this embodiment that uses the music file 15 generated as described above. In FIG. 4, reference numeral 300 denotes a music distribution server that distributes the music file 15, and 400 denotes a mobile phone that receives the music file 15; both can connect to the Internet 500.

As shown in FIG. 4, the music distribution server 300 comprises a music file acquisition unit 31, a playback program acquisition unit 32, a customer information acquisition unit 33, a database (DB) registration unit 34, a distribution music DB 35, a distribution program DB 36, a customer DB 37, an encapsulation unit 38, a customer information reference unit 39, and a communication unit 40.

The music file acquisition unit 31 brings the music file 15 generated by the music file generation system 100 into the music distribution server 300. The playback program acquisition unit 32 brings the music playback program (music player) generated by the playback program generation system 200 into the music distribution server 300.

Specifically, the music file acquisition unit 31 and the playback program acquisition unit 32 load the music file 15 and the music playback program into the music distribution server 300 either from a recording medium such as a CD or flexible disk, or over the Internet 500 or another network (not shown).

The music playback program instructs the performance of the BGM 21, the simulated singing voice 22, and the singing voice 23 in accordance with the performance position information recorded in the music file 15. It includes a PCM playback control program that instructs the mobile phone's built-in synthesizer to play the ADPCM-format singing voice data 12, and a MIDI playback control program that instructs the synthesizer to play the MIDI-format BGM + simulated singing voice data 14. This music playback program is also tailored to the differing specifications of each mobile phone carrier.

The customer information acquisition unit 33 acquires various information about the customer (for example, name, user ID, password, and the carrier and model of the mobile phone 400 the customer uses). Specifically, when the user first accesses the music distribution server 300 from the mobile phone 400 via the Internet 500, the unit obtains the required customer information by asking the user to enter it (for example, by presenting an information input screen).

The DB registration unit 34 registers the music files 15 acquired by the music file acquisition unit 31, which correspond to various specifications, in the distribution music DB 35 as music data files for Chaku-Uta (ring songs). It also registers the music reproduction programs acquired by the reproduction program acquisition unit 32, which likewise correspond to various specifications, in the distribution program DB 36, and registers the customer information acquired by the customer information acquisition unit 33 in the customer DB 37. The distribution music DB 35 constitutes a recording medium of the present invention.

In response to a distribution request from a user, the encapsulation unit 38 reads from the distribution music DB 35 the music file 15 corresponding to the carrier and model of the mobile phone 400 used by that user, reads from the distribution program DB 36 the music reproduction program corresponding to the same carrier and model, and encapsulates the two to create a concatenated file. When a user requests distribution of a piece of music, the customer information reference unit 39 refers to the customer DB 37 to identify the carrier and model of the mobile phone 400 used by the requesting user and passes that information to the encapsulation unit 38.

Encapsulation is a process that combines the binary data of the music file 15 and the binary data of the music reproduction program into a single file. It is implemented, using the Java (registered trademark) class distribution process in which the generated object is self-contained and centrally managed, as a mechanism that starts the program when a call arrives. There are two ways to encapsulate the music reproduction program with the music file 15: combining them dynamically at the time a distribution request is received, or preparing static combinations in advance by batch processing. This embodiment supports both.
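As a rough illustration of combining the two binaries into one logical file, the sketch below joins a music file and a reproduction program into a single byte stream and splits them again. The container layout (a 4-byte marker followed by two big-endian length fields) and the names `encapsulate` and `decapsulate` are assumptions made for this example only; the embodiment relies on Java class distribution and does not specify a byte layout.

```python
import struct

MAGIC = b"CCAT"                  # hypothetical marker, not defined in the specification
HEADER = struct.Struct(">4sII")  # marker, music file length, program length


def encapsulate(music_file: bytes, player_program: bytes) -> bytes:
    """Combine the music file and the reproduction program into one file."""
    header = HEADER.pack(MAGIC, len(music_file), len(player_program))
    return header + music_file + player_program


def decapsulate(blob: bytes) -> tuple[bytes, bytes]:
    """Split a concatenated file back into (music_file, player_program)."""
    magic, music_len, player_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a concatenated file")
    start = HEADER.size
    if len(blob) != start + music_len + player_len:
        raise ValueError("length fields do not match payload size")
    music = blob[start:start + music_len]
    player = blob[start + music_len:]
    return music, player
```

The same function could serve either approach mentioned above: called per request for dynamic combination, or called in a batch job over every music file and program pair for static combination.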

The file to be distributed may follow either the Java file format or a scheme in which an arbitrary file format is defined together with a protocol for self-playback of the file. The physical division of the distributed file does not matter; what is required is that it be a single file logically. As for logical unity, it is sufficient that the process of building the implementation environment is self-contained in terms of operability while the user is downloading the music.

The communication unit 40 handles communication with the mobile phone 400 via the Internet 500. For example, it passes customer information sent from the mobile phone 400 to the customer information acquisition unit 33. It also receives distribution requests for desired music sent from the mobile phone 400 and passes them to the encapsulation unit 38 and the customer information reference unit 39, and it distributes the concatenated file generated by the encapsulation unit 38 to the requesting mobile phone 400. The memory (not shown) in the mobile phone 400 that stores the music file 15 contained in the concatenated file also constitutes a recording medium of the present invention.

The operation of the functional blocks 31 to 34 and 38 to 40 in the music distribution server 300 described above is controlled by a control unit (not shown) comprising a CPU or MPU, ROM, RAM, and the like. Each of the DBs 35 to 37 is implemented on a recording medium such as a hard disk.

Next, the operation of the music distribution system according to this embodiment, configured as described above, will be described with reference to the flowchart of FIG. 5. FIG. 5 is a flowchart showing the music distribution and customer registration operations of the music distribution server 300.

As shown in FIG. 5, the control unit (not shown) in the music distribution server 300 determines whether the mobile phone 400 has accessed the communication unit 40 (step S11). If it has, the control unit further determines whether a password has already been set for the user of that mobile phone 400 (step S12). This is done by determining whether the access was accompanied by password input.

If no password has been set for the user, the control unit uses the communication unit 40 to present a predetermined information input screen on the mobile phone 400, prompting the user to enter customer information. The customer information acquisition unit 33 acquires the customer information entered in response, and the DB registration unit 34 registers it in the customer DB 37 (step S13). The control unit then issues a unique password to the user (step S14).

If it is determined in step S12 that a password has already been issued to the user (that is, the access was accompanied by password input), or if a password has been newly issued in step S14, the control unit performs approval processing for that password (step S15). If the password is incorrect, a warning message to that effect is output and processing is interrupted.

If the password is approved, the control unit uses the communication unit 40 to present a members-only sound source menu screen on the mobile phone 400 (step S16). Through this menu screen, the user can request the music distribution server 300 to download the desired music. The control unit determines whether a distribution request for the desired music has been received from the mobile phone 400 (step S17); if not, processing returns to step S11.

When a music distribution request has been received, the customer information reference unit 39 refers to the customer DB 37 to identify the carrier and model of the requesting mobile phone 400 and passes them to the encapsulation unit 38 (step S18). The encapsulation unit 38 reads the music reproduction program corresponding to that carrier and model from the distribution program DB 36, reads from the distribution music DB 35 the music file 15 for the requested music that corresponds to the same carrier and model, and encapsulates the two to create a concatenated file (step S19).

Finally, the communication unit 40 distributes the concatenated file created by the encapsulation unit 38 to the mobile phone 400 (step S20). On receiving the concatenated file, the mobile phone 400 plays the music file 15 using the music reproduction program contained in it.
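The registration and distribution flow of steps S13 to S20 can be sketched roughly as follows. This is a minimal in-memory illustration, not the server's actual implementation: the dictionaries standing in for the customer DB 37, distribution music DB 35, and distribution program DB 36, the key layout, and the function names `register`, `authenticate`, and `deliver` are all assumptions introduced for the example.

```python
import secrets

# Hypothetical in-memory stand-ins for the three databases.
customer_db = {}  # user_id -> {"password", "carrier", "model"}
music_db = {}     # (song_id, carrier, model) -> music file bytes
program_db = {}   # (carrier, model) -> reproduction program bytes


def register(user_id: str, carrier: str, model: str) -> str:
    """Steps S13-S14: store customer information and issue a password."""
    password = secrets.token_hex(4)
    customer_db[user_id] = {
        "password": password, "carrier": carrier, "model": model,
    }
    return password


def authenticate(user_id: str, password: str) -> bool:
    """Step S15: approve the password presented with the access."""
    record = customer_db.get(user_id)
    if record is None:
        return False
    return secrets.compare_digest(
        record["password"].encode(), password.encode()
    )


def deliver(user_id: str, password: str, song_id: str) -> tuple[bytes, bytes]:
    """Steps S15 and S18-S19: select the files matching the user's handset."""
    if not authenticate(user_id, password):
        raise PermissionError("password not approved")
    record = customer_db[user_id]
    handset = (record["carrier"], record["model"])  # S18: look up the handset
    player = program_db[handset]                    # S19: matching program
    music = music_db[(song_id, *handset)]           # S19: matching music file
    # The pair would then be encapsulated and sent to the phone (S20).
    return music, player
```

Keying both stores by carrier and model reflects the point that the music file and the reproduction program must each match the requesting handset.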

As described in detail above, in this embodiment a digital sound source such as a CD is separated into a singing voice part and a BGM part. The data amount of the singing voice part is reduced by heavily band-limiting it and encoding it in ADPCM format, and the data amount of the BGM part is reduced by generating it as MIDI data. As a result, the data amount is far smaller than with the conventional approach of simply cutting out a section of a CD sound source and compressing it in MP3 format. In addition, because the singing voice part degraded by band limiting is supplemented with MIDI data, the quality of the reproduced singing voice can be kept at or above a predetermined level.

Therefore, Chaku-Uta (ring song) audio whose quality is guaranteed to be at or above a certain level can be distributed to and played on a mobile phone while observing the file size limit that constrains ringtones on current mobile phone models (for example, 10 Kbytes in the case of DoCoMo). In other words, according to this embodiment, the Chaku-Uta service becomes available even on current mobile phone models that lack a large-capacity memory, an MP3 decoder, or the like.
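To give a feel for the size constraint, the following back-of-the-envelope calculation compares the byte rate of CD audio with that of a heavily band-limited ADPCM voice track. Only the 10 Kbyte limit comes from the text above; the 4 kHz sampling rate and 4 bits per sample used for the ADPCM track are illustrative assumptions, and the calculation ignores the MIDI data and any header that would share the same budget.

```python
def audio_bytes(sample_rate_hz: int, bits_per_sample: int,
                channels: int, seconds: float) -> float:
    """Size in bytes of audio coded at a fixed number of bits per sample."""
    return sample_rate_hz * bits_per_sample * channels * seconds / 8


def seconds_within(budget_bytes: int, sample_rate_hz: int,
                   bits_per_sample: int, channels: int = 1) -> float:
    """Playing time that fits into a byte budget at the given coding rate."""
    return budget_bytes * 8 / (sample_rate_hz * bits_per_sample * channels)


# CD audio: 44.1 kHz, 16 bits, stereo.
cd_rate = audio_bytes(44_100, 16, 2, 1.0)    # 176400.0 bytes per second

# Assumed band-limited voice: 4 kHz, 4-bit ADPCM, mono.
adpcm_rate = audio_bytes(4_000, 4, 1, 1.0)   # 2000.0 bytes per second

# Voice-only playing time inside a 10 Kbyte (10 * 1024 byte) budget.
voice_seconds = seconds_within(10 * 1024, 4_000, 4)  # about 5.12 seconds
```

Under these assumed parameters the voice track alone needs roughly 1/88 of the CD byte rate, which is why band limiting plus ADPCM, combined with compact MIDI data for the BGM, can approach a limit of this order.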

Although the above embodiment describes generating the music file 15 for use as a mobile phone ringtone, the invention is not limited to ringtone use. The music file 15 of this embodiment can be applied to any system that needs to play music consisting of a singing voice and BGM with a small memory capacity. In that case, a CD-ROM, flexible disk, hard disk, magnetic tape, optical disk, magneto-optical disk, DVD, non-volatile memory card, or the like can be used as the recording medium that stores the music file 15, and these also constitute a recording medium of the present invention.

The above embodiment also describes the recording unit 1 as recording digital sound source data from a CD, DVD, or the like onto a computer hard disk or the like in WAV format, but the invention is not limited to this. For example, the recording unit may take in, through a microphone, the voice of a user singing along to karaoke at an entertainment facility such as a karaoke box or a game center, and record it in WAV format. In this case ambient noise is recorded together with the singing voice, but by having the singing voice extraction unit 2, the BGM generation unit 3, the MIDI adjustment unit 4, and the file generation unit 5 perform the same processing as in the above embodiment, a good-quality, noise-free Chaku-Uta (ring song) file of the user's own singing voice can be generated.

In this example, a recording device having the function of the recording unit 1 can be installed on its own at an entertainment facility such as a karaoke box or a game center, with the functions of the singing voice extraction unit 2, the BGM generation unit 3, the MIDI adjustment unit 4, and the file generation unit 5 provided by a separate editing computer. The data recorded by the recording unit 1 may then be input to the editing computer via a recording medium such as a CD, flexible disk, hard disk, magnetic tape, optical disk, magneto-optical disk, DVD, MD, or non-volatile memory card, or may be transmitted from the recording device to the editing computer over a communication network such as the Internet. The generated Chaku-Uta (ring song) file may likewise be transmitted from the editing computer to the user's mobile phone over a communication network.

Alternatively, a device having all the functions of the recording unit 1, the singing voice extraction unit 2, the BGM generation unit 3, the MIDI adjustment unit 4, and the file generation unit 5 may be installed at an entertainment facility such as a karaoke box or a game center. In this case, the BGM generation unit 3 can be replaced by a function that holds in advance the MIDI-format BGM data played as karaoke while the singing voice is recorded. That is, the MIDI-format BGM data held in advance is played back during recording of the singing voice, and the Chaku-Uta (ring song) file is generated from that same BGM data and the singing voice data extracted from the recorded audio. The MIDI adjustment unit 4 then analyzes the pitch, tempo, timbre, volume, and so on of the singing voice recorded along with the karaoke, generates MIDI-format simulated singing voice data according to the result, and adds it to the BGM data.
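One small piece of the pitch analysis performed by the MIDI adjustment unit 4 can be illustrated as follows: turning a sequence of estimated fundamental frequencies into MIDI note events. The conversion formula (note 69 = A4 = 440 Hz, 12 notes per octave) is the standard MIDI tuning convention. Everything else is an assumption for the sake of the example: the specification does not say how the analysis is carried out, the pitch estimates are taken as given, and the function names and the frame-based input format are hypothetical.

```python
import math


def frequency_to_midi_note(frequency_hz: float) -> int:
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(frequency_hz / 440.0))


def pitch_track_to_notes(pitches_hz, frame_seconds):
    """Merge per-frame pitch estimates into (note, start, duration) events.

    `pitches_hz` holds one estimate per analysis frame, with None for
    frames that contain no singing voice.
    """
    events = []
    current = None  # [note, start_time, duration] of the note being built
    for index, pitch in enumerate(pitches_hz):
        note = None if pitch is None else frequency_to_midi_note(pitch)
        if current is not None and note == current[0]:
            current[2] += frame_seconds  # same note continues
            continue
        if current is not None:
            events.append(tuple(current))  # previous note has ended
        if note is None:
            current = None
        else:
            current = [note, index * frame_seconds, frame_seconds]
    if current is not None:
        events.append(tuple(current))
    return events
```

For example, `pitch_track_to_notes([440.0, 440.0, None, 261.63], 0.5)` yields one event for note 69 lasting 1.0 s from time 0.0 and one for note 60 lasting 0.5 s from time 1.5. Tempo, timbre, and volume, which the text also mentions, are not covered by this sketch.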

The embodiment described above is merely one concrete example of carrying out the present invention, and the technical scope of the invention must not be construed as limited by it. The present invention can be implemented in various forms without departing from its spirit or its main features.

Further aspects of the present invention are summarized below.
1. A data distribution system that distributes a music file from a data distribution server to a requesting terminal, comprising:
data storage means for storing in advance music files generated by the music file generation device according to claim 1;
reproduction program storage means for storing in advance reproduction programs corresponding to the respective specifications of the terminals; and
transmission means for, when a distribution request for desired data is received from the requesting terminal, reading the corresponding music file from the data storage means, reading the reproduction program corresponding to the specification of the requesting terminal from the reproduction program storage means, and transmitting the music file and the reproduction program to the requesting terminal.
2. The data distribution system according to item 1 above, wherein the transmission means includes encapsulation means for logically encapsulating the music file and the reproduction program into one file.
3. A data distribution server that distributes a music file to a requesting terminal, comprising:
data storage means for storing music files generated by the music file generation device according to claim 1;
reproduction program storage means for storing reproduction programs corresponding to the respective specifications of the terminals; and
transmission means for, when a distribution request for desired data is received from the requesting terminal, reading the corresponding music file from the data storage means, reading the reproduction program corresponding to the specification of the requesting terminal from the reproduction program storage means, and transmitting the music file and the reproduction program to the requesting terminal.
4. The data distribution server according to item 3 above, wherein the transmission means includes encapsulation means for logically encapsulating the music file and the reproduction program into one file.

Industrial applicability

The present invention is useful for allowing music consisting of a singing voice and BGM to be used as a ringtone even on current mobile phones that lack a large-capacity memory, an MP3 decoder, or the like.

FIG. 1 is a diagram showing a configuration example of the music file generation system according to this embodiment.
FIG. 2 is a conceptual diagram showing the data structure of the music file according to this embodiment.
FIG. 3 is a flowchart showing the processing procedure of the music file generation method according to this embodiment.
FIG. 4 is a diagram showing a configuration example of the music distribution system according to this embodiment.
FIG. 5 is a flowchart showing the music distribution and customer registration operations of the music distribution server according to this embodiment.

Explanation of reference numerals

1 Recording unit
2 Singing voice extraction unit
3 BGM generation unit
4 MIDI adjustment unit
5 File generation unit
11 Digital sound source data in WAV format
12 Singing voice data in ADPCM format
13 BGM data in MIDI format
14 BGM + simulated singing voice data in MIDI format
15 Music file in MLD format
21 BGM
22 Simulated singing voice
23 Singing voice
31 Music file acquisition unit
32 Reproduction program acquisition unit
33 Customer information acquisition unit
34 DB registration unit
35 Distribution music DB
36 Distribution program DB
37 Customer DB
38 Encapsulation unit
39 Customer information reference unit
40 Communication unit
100 Music file generation system
200 Reproduction program generation system
300 Music distribution server
400 Mobile phone
500 Internet

Claims

A music file generation device comprising:
singing voice extracting means for extracting the singing voice from digital voice data in which a singing voice and a voice other than the singing voice are mixed, and obtaining singing voice data in PCM format;
MIDI generating means for generating MIDI-format BGM data, generating MIDI-format simulated singing voice data matched to the singing voice extracted by the singing voice extracting means, and adjusting the MIDI data by adding the simulated singing voice data to the BGM data; and
file generating means for processing the PCM-format singing voice data generated by the singing voice extracting means and the MIDI-format BGM + simulated singing voice data generated by the MIDI generating means into one music file.

The music file generation device according to claim 1, wherein the voice other than the singing voice is BGM.

The music file generating apparatus according to claim 1, wherein the voice other than the singing voice is noise.

The music file generation device according to claim 1, wherein the singing voice extracting means band-limits the digital voice data, in which the singing voice and the voice other than the singing voice are mixed, to a predetermined frequency band corresponding to the singing voice.

The music file generation device according to claim 1, wherein the music file generated by the file generating means includes MIDI reproduction control information for reproducing the MIDI-format BGM + simulated singing voice data generated by the MIDI generating means, and PCM reproduction control information for reproducing the PCM-format singing voice data generated by the singing voice extracting means in synchronization with the simulated singing voice data.

A music file generation method comprising:
a first step of extracting the singing voice from digital voice data in which a singing voice and a voice other than the singing voice are mixed, to obtain singing voice data in PCM format;
a second step of generating MIDI-format BGM data;
a third step of generating MIDI-format simulated singing voice data matched to the singing voice extracted in the first step, and adjusting the MIDI data by adding the simulated singing voice data to the BGM data generated in the second step; and
a fourth step of processing the PCM-format singing voice data generated in the first step and the MIDI-format BGM + simulated singing voice data adjusted in the third step into one music file.

The music file generation method according to claim 6, wherein the voice other than the singing voice is BGM.

The music file generating method according to claim 6, wherein the voice other than the singing voice is noise.

The music file generation method according to claim 6, wherein, in the first step, the digital voice data in which the singing voice and the voice other than the singing voice are mixed is band-limited to a predetermined frequency band corresponding to the singing voice.

The music file generation method according to claim 6, wherein, in the fourth step, an adjustment process is performed to synchronize the reproduction timing of the PCM-format singing voice data generated in the second step and the MIDI-format BGM + simulated singing voice data generated in the third step.

The music file generation method according to claim 6, wherein the music file generated in the fourth step includes MIDI reproduction control information for reproducing the MIDI-format BGM + simulated singing voice data generated in the third step, and PCM reproduction control information for reproducing the PCM-format singing voice data generated in the second step in synchronization with the simulated singing voice data.

A computer-readable recording medium on which is recorded a music file having a data structure comprising:
PCM data consisting of singing voice data in PCM format; and
MIDI data consisting of MIDI-format BGM data to which MIDI-format simulated singing voice data, generated in accordance with the singing voice of the PCM data, has been added,
wherein the PCM data and the MIDI data are integrated into one file.

The computer-readable recording medium according to claim 12, wherein the music file includes MIDI reproduction control information for reproducing the MIDI data, and PCM reproduction control information for reproducing the PCM data in synchronization with the MIDI data.