JP2008242376A

JP2008242376A - Musical piece introduction sentence generating device, narration adding device, and program

Info

Publication number: JP2008242376A
Application number: JP2007086751A
Authority: JP
Inventors: Takeshi Kikuchi; 菊池　　健; Atsushi Tougi; 温東儀; Keita Arimoto; 慶太有元; Yasushi Kamiya; 泰史神谷
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2007-03-29
Filing date: 2007-03-29
Publication date: 2008-10-09
Anticipated expiration: 2027-03-29
Also published as: JP5034599B2

Abstract

PROBLEM TO BE SOLVED: To generate liner notes that reflect the performance content of an arbitrary musical piece, and to add narration to that piece, without human effort.
SOLUTION: A musical piece introduction sentence generating device includes: a musical piece introduction sentence database 152 that stores data for generating or specifying one musical piece introduction sentence from characteristic values indicating a plurality of predetermined kinds of acoustic characteristics; an input means for inputting musical piece data; a feature specifying means for analyzing the musical piece data received by the input means and specifying the characteristic values indicating the plurality of kinds of acoustic characteristics; and an acquisition means for acquiring, from the characteristic values specified by the feature specifying means and the stored contents of the musical piece introduction sentence database, introduction sentence data representing an introduction sentence for the musical piece represented by the received musical piece data, and outputting that introduction sentence data.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to techniques for automatically generating text, and in particular to a technique for automatically generating a music introduction sentence that serves as the basis for liner notes or narration.

For example, music CDs and records come with a booklet printed with an introductory text, called liner notes, that describes the characteristics of the recorded pieces. Likewise, when a piece is broadcast on television or radio, its characteristics are sometimes introduced by narration during sections that do not interfere with listening, such as the prelude, interlude, or postlude. For this reason, some people who play or sing music as a hobby wish to create liner notes or narration for their own performances. However, creating liner notes or narration that accurately express the characteristics of a piece requires broad expert knowledge of music, and commissioning a music critic or similar expert with such knowledge is expensive. Because hobbyists find it difficult to bear this cost, it has been difficult for them to create liner notes and the like for their own performances. Various techniques have been proposed to solve this problem; one example is the technique disclosed in Patent Document 1.

Patent Document 1 discloses a karaoke system that adds narration suited to the type of song and plays it when performance of a requested song begins. More specifically, this karaoke system includes a music data storage device, a narration data storage device, and a host system that controls the karaoke system as a whole. The music data storage device stores the music data of each of a plurality of pieces in advance, with identification information indicating the performance content of the piece attached. The narration data storage device stores, for each of a plurality of types of performance content, narration data representing narration suited to that performance content. The host system reads the music data of the requested piece from the music data storage device and plays it, and at the start of playback reads the narration data corresponding to the performance content of that piece from the narration data storage device and plays it.
Patent Document 1: JP-A-11-167388

The technique disclosed in Patent Document 1 makes it possible to add narration suited to a karaoke song when that song is sung. However, narration cannot be added to, for example, an original piece whose lyrics or music the user has written. Even for an existing piece not created by the user, no narration is added unless narration data for that piece has been prepared in advance. In other words, the technique disclosed in Patent Document 1 cannot add, to an arbitrary piece of music, narration that reflects the performance content of that piece.

The present invention has been made in view of the above problem, and its object is to provide a technique that makes it possible, for an arbitrary piece of music and without human effort, to generate liner notes and add narration that reflect the performance content of that piece.

To solve the above problem, the present invention provides a music introduction sentence generating device comprising: a music introduction sentence database storing data for generating or specifying one music introduction sentence from characteristic values each indicating one of a plurality of predetermined types of acoustic characteristics; input means to which music data is input; feature specifying means for analyzing the music data received by the input means and specifying the characteristic value indicating each of the plurality of types of acoustic characteristics; and acquisition means for acquiring, from the characteristic values specified by the feature specifying means and the stored contents of the music introduction sentence database, introduction sentence data representing an introduction sentence for the piece represented by the music data received by the input means, and outputting the introduction sentence data.

In a more preferred aspect, the stored contents of the music introduction sentence database of the music introduction sentence generating device are classified by music genre. The acquisition means acquires the introduction sentence data for the piece represented by the music data received by the input means by referring to the data stored in the music introduction sentence database in association with either the genre indicated by an externally input genre identifier, which indicates the genre to which that piece belongs, or the genre of the piece specified from the characteristic values specified by the feature specifying means.

To solve the above problem, the present invention also provides a narration adding device comprising: input means to which music data is input; section specifying means for analyzing the music data received by the input means and specifying a prelude section, an interlude section, or a postlude section of the piece represented by that music data; acquisition means for acquiring text data representing an introduction sentence that reflects the characteristics of the piece represented by the music data received by the input means; speech synthesis means for synthesizing, from the text data acquired by the acquisition means, audio data of narration, that is, speech reading aloud the introduction sentence represented by that text data; and combining means for combining the audio data and the music data such that the speech represented by the audio data synthesized by the speech synthesis means is superimposed on the section specified by the section specifying means, and outputting the result.
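The combining step described above can be illustrated with a minimal Python sketch. It assumes that the music data and the synthesized narration are both available as lists of floating-point samples in the range -1.0 to 1.0 at the same sampling rate, and that the specified section is given as a pair of sample indices. The function name `overlay_narration` and the `music_gain` parameter (lowering the music under the narration) are hypothetical and do not come from the specification, which states only that the narration is superimposed on the specified section.

```python
from typing import List, Tuple


def overlay_narration(
    music: List[float],
    narration: List[float],
    section: Tuple[int, int],
    music_gain: float = 0.5,
) -> List[float]:
    """Superimpose narration samples on music inside section [start, end)."""
    start, end = section
    if not 0 <= start <= end <= len(music):
        raise ValueError("section must lie inside the music data")
    mixed = list(music)
    # Narration longer than the section is truncated (an assumption) so that
    # it never spills past the prelude, interlude, or postlude.
    usable = min(len(narration), end - start)
    for i in range(usable):
        # Attenuate the music (assumed behavior), add the narration,
        # then clip the result to the valid sample range.
        sample = music[start + i] * music_gain + narration[i]
        mixed[start + i] = max(-1.0, min(1.0, sample))
    return mixed
```

Samples outside the specified section are returned unchanged, so the sung or played portions of the piece are not affected.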

In a more preferred aspect, the narration adding device further comprises: a music introduction sentence database storing data for generating or specifying one music introduction sentence from characteristic values each indicating one of a plurality of predetermined types of acoustic characteristics; and feature specifying means for analyzing the music data received by the input means and specifying the characteristic value indicating each of the plurality of types of acoustic characteristics. The acquisition means acquires, from the characteristic values specified by the feature specifying means and the stored contents of the music introduction sentence database, introduction sentence data representing an introduction sentence for the piece represented by the music data.

To solve the above problem, the present invention also provides a program that causes a computer device to function as: feature specifying means for analyzing music data input to the computer device and specifying a characteristic value indicating each of a plurality of predetermined types of acoustic characteristics; and acquisition means for acquiring, from the characteristic values specified by the feature specifying means and the stored contents of a music introduction sentence database storing data for generating or specifying one music introduction sentence from characteristic values indicating each of the plurality of types of acoustic characteristics, introduction sentence data representing an introduction sentence for the piece represented by the music data, and outputting the introduction sentence data.

To solve the above problem, the present invention further provides a program that causes a computer device to function as: section specifying means for analyzing music data input to the computer device and specifying a prelude section, an interlude section, or a postlude section of the piece represented by that music data; acquisition means for acquiring text data representing an introduction sentence that reflects the characteristics of the piece represented by the music data; speech synthesis means for synthesizing, from the text data acquired by the acquisition means, audio data of narration, that is, speech reading aloud the introduction sentence represented by that text data; and combining means for combining the audio data and the music data such that the speech represented by the audio data synthesized by the speech synthesis means is superimposed on the section specified by the section specifying means, and outputting the result.

According to the present invention, liner notes can be generated and narration added for an arbitrary piece of music, reflecting its performance content, without human effort.

The best mode for carrying out the present invention is described below with reference to the drawings.
(A: First Embodiment)
(A-1: Configuration)
FIG. 1 is a diagram showing a configuration example of a liner notes generating device 10 according to a first embodiment of the music introduction sentence generating device of the present invention. The liner notes generating device 10 is a computer device connected to a communication network (not shown) such as the Internet, and functions as a content server that distributes, as digital content, music data and introduction sentence data representing the liner notes of the piece represented by that music data. The following description assumes that the music data is time-series sampling data obtained by sampling the audio waveform of a piece at a predetermined sampling interval and applying A/D conversion; data obtained by further applying compression coding to the time-series sampling data may of course be used instead. As shown in FIG. 1, the liner notes generating device 10 has a control unit 110, a communication interface (hereinafter "IF") unit 120, an external device IF unit 130, a volatile storage unit 140, a nonvolatile storage unit 150, and a bus 160 that mediates data exchange among these components.

The control unit 110 is, for example, a CPU (Central Processing Unit). By operating in accordance with the various programs stored in the nonvolatile storage unit 150, the control unit 110 functions as the control center of the liner notes generating device 10. The processing executed by the control unit 110 is described later.

The communication IF unit 120 is, for example, a NIC (Network Interface Card) and is connected to the communication network described above. The communication IF unit 120 receives communication messages transmitted in accordance with a predetermined communication protocol such as HTTP (for example, a request message requesting distribution of digital content; hereinafter, a "content distribution request message") and passes them to the control unit 110. It also transmits communication messages passed from the control unit 110 (for example, a communication message in which digital content and its destination are written; hereinafter, a "content distribution message") to their destinations via the communication network.

The external device IF unit 130 is, for example, a USB (Universal Serial Bus) interface. It is used to connect a recording medium such as a USB memory, or a recording medium reader such as a CD-ROM (Compact Disc Read Only Memory) drive, and to exchange data with it. The external device IF unit 130 serves as the input means for inputting new music data into the liner notes generating device 10 when that music data is to be stored in the device. Specifically, when a recording medium such as a USB memory on which music data has been written is connected to the external device IF unit 130, the control unit 110 reads the music data from the recording medium via the external device IF unit 130, applies predetermined processing, and then writes the result to a predetermined area in the nonvolatile storage unit 150, thereby registering new digital content. When a recording medium reader such as a CD-ROM drive is connected to the external device IF unit 130, the control unit 110 may be made to read the new music data from the recording medium set in that reader and write it to the nonvolatile storage unit 150.

The volatile storage unit 140 is, for example, a RAM (Random Access Memory), and is used as a work area when the various programs are executed. The nonvolatile storage unit 150 is a hard disk, and stores various data and programs. As shown in FIG. 1, the nonvolatile storage unit 150 stores two databases, a music database 151 and a music introduction sentence database 152, and three programs, an OS program 153, a content distribution program 154, and a content registration program 155.

The music database 151 stores, in association with a music identifier that uniquely identifies each of a plurality of pieces (for example, character string data representing the title of the piece), music data representing the audio waveform of the piece identified by that music identifier and introduction sentence data, which is text data representing a text that introduces the characteristics of the piece (so-called liner notes). In this embodiment, the music data and introduction sentence data are handled as the digital content described above. This embodiment describes the case where the introduction sentence data is text data in which the character codes of the characters making up the introduction sentence are arranged in the order in which they are written, but another data format that can be converted to and from text data, such as PDF, may of course be used.
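As an illustration of the association described above, the following Python sketch models the music database 151 as an in-memory mapping from a music identifier to a record holding music data and introduction sentence data. The class and method names (`ContentRecord`, `MusicDatabase`, `register`, `lookup`) are hypothetical, and rejecting duplicate identifiers is an assumption made here so that each identifier stays unique; the specification does not describe a concrete storage format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ContentRecord:
    music_data: bytes  # time-series samples of the audio waveform
    intro_text: str    # introduction sentence (liner notes) as text data


class MusicDatabase:
    """In-memory stand-in for the music database 151 (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[str, ContentRecord] = {}

    def register(self, music_id: str, music_data: bytes, intro_text: str) -> None:
        # Assumed policy: an identifier may be registered only once.
        if music_id in self._records:
            raise KeyError(f"music identifier already registered: {music_id}")
        self._records[music_id] = ContentRecord(music_data, intro_text)

    def lookup(self, music_id: str) -> ContentRecord:
        # Raises KeyError if the identifier is unknown.
        return self._records[music_id]
```

A content distribution request would then amount to a `lookup` on the requested music identifier, returning both pieces of digital content together.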

The music introduction sentence database 152 is referred to in the course of the processing that generates, from music data input via the external device IF unit 130, introduction sentence data for the piece whose audio waveform that music data represents (hereinafter, the "piece to be processed"). As shown in FIG. 2, the music introduction sentence database 152 is a database for uniquely specifying the introduction sentence of the piece to be processed from two inputs: the characteristic values indicating the acoustic characteristics of that piece, which are specified by analyzing its music data, and a genre identifier indicating the genre to which the piece belongs. Various implementations are conceivable; examples are given below.

(a) In a first implementation, the music introduction sentence database 152 is configured by providing, for each music genre, an introduction sentence table in which introduction sentence data is stored in association with characteristic values indicating each of the plurality of types of acoustic characteristics. The introduction sentence data represents an introduction sentence suited to expressing the characteristics of a piece having the acoustic characteristics indicated by those characteristic values. The characteristic values used may be, for example: "Audio Energy", which indicates the total power (volume) of the sound; "Spectral Centroid", which indicates the brightness, or the lightness or heaviness, of the sound; "Spectral Tilt", which indicates how powerful the sound is; "Spectral Flatness dB", which indicates the tension of the sound, such as the impression that the sound is playing at full strength; "High Frequency Content", which indicates the degree of sparkle in the sound; "MFCC", which characterizes the rough shape of the sound spectrum with a small number of values; "Energy Band Ratio" and "Bark Energy Band", which are ratios of the power of each frequency band to the total power of the sound; and "HPCP", which is the basis for indicating harmony, chord feel, key, and the like.
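To make two of these characteristic values concrete, the following Python sketch computes an audio energy and a spectral centroid from a list of samples. The specification names the characteristics but does not define formulas, so the definitions here are assumptions: energy is taken as the mean squared sample value, and the centroid as the magnitude-weighted mean frequency of a plain discrete Fourier transform. The naive transform is quadratic in the number of samples and is meant only for short illustrative inputs.

```python
import cmath
import math
from typing import List


def audio_energy(samples: List[float]) -> float:
    """Mean squared sample value (an assumed definition of 'Audio Energy')."""
    return sum(s * s for s in samples) / len(samples)


def spectral_centroid(samples: List[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(samples)
    weighted = 0.0
    total = 0.0
    # Only bins 0..n/2 are needed for a real-valued signal.
    for k in range(n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    # A silent signal has no spectrum; report 0.0 rather than divide by zero.
    return weighted / total if total > 0.0 else 0.0
```

A higher centroid corresponds to a brighter sound, which matches the role the text assigns to "Spectral Centroid".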

FIG. 3 shows an example of the stored contents of the introduction sentence table for classical music (FIG. 3(a)) and of the introduction sentence table for pop music (FIG. 3(b)) in the case where the acoustic characteristics adopted are a feature quantity indicating the volume of the piece (for example, "Audio Energy") and a feature quantity indicating whether the melody of the piece is flowing or crisp (for example, "HPCP"). An introduction sentence table is provided for each genre because, even for pieces whose acoustic characteristics are represented by the same set of characteristic values, the style of introduction sentence suited to expressing the characteristics of the piece, and the terms and expressions that should be used in it, can differ if the pieces belong to different genres. Although FIG. 3 illustrates the case where the same acoustic characteristics (volume and melodic feel) are associated with the introduction sentence data regardless of genre, different acoustic characteristics may of course be adopted for each genre.
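A per-genre introduction sentence table of this kind might be sketched in Python as follows. The contents of FIG. 3 are not reproduced in the text, so every sentence, the genre keys, the 0.0 to 1.0 value range, and the 0.5 threshold used to quantize each characteristic value into two levels are hypothetical and serve only to show the lookup structure.

```python
from typing import Dict, Tuple

# Hypothetical table contents: genre -> (volume level, melody level) -> text.
INTRO_TABLES: Dict[str, Dict[Tuple[str, str], str]] = {
    "classical": {
        ("loud", "flowing"): "A majestic performance carried by a flowing melodic line.",
        ("loud", "crisp"): "A powerful performance marked by precise articulation.",
        ("soft", "flowing"): "A gentle performance with a graceful, singing melody.",
        ("soft", "crisp"): "A delicate performance with light, clear phrasing.",
    },
    "pop": {
        ("loud", "flowing"): "A big, punchy track with a smooth hook.",
        ("loud", "crisp"): "A high-energy track with a snappy groove.",
        ("soft", "flowing"): "A mellow track with an easygoing tune.",
        ("soft", "crisp"): "A laid-back track with a bouncy feel.",
    },
}


def level(value: float, threshold: float, low: str, high: str) -> str:
    """Quantize a characteristic value into one of two named levels."""
    return high if value >= threshold else low


def acquire_intro(genre: str, volume: float, melody_smoothness: float) -> str:
    """Look up the introduction sentence for a genre and two characteristic values."""
    key = (
        level(volume, 0.5, "soft", "loud"),
        level(melody_smoothness, 0.5, "crisp", "flowing"),
    )
    return INTRO_TABLES[genre][key]
```

The same pair of characteristic values yields differently styled text depending on the genre, which is the reason the text gives for keeping one table per genre.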

(b) In a second implementation, the music introduction sentence database 152 is configured by providing, for each music genre, a keyword table that converts the characteristic values indicating each of the plurality of types of acoustic characteristics into keywords such as adjectives, together with an introduction sentence template that forms an introduction sentence when those keywords are embedded in its predetermined blanks. A keyword table is provided for each genre because, even when the characteristic value of a given acoustic characteristic is the same, the keyword suited to expressing that characteristic can differ between genres. An introduction sentence template is provided for each genre because the style suited to an introduction sentence can differ between genres.
Two implementations of the music introduction sentence database 152 have been given above, but it goes without saying that the music introduction sentence database of the music introduction sentence generating device according to the present invention is not limited to either of them. In short, any implementation may be used as long as one piece of introduction sentence data can be specified (or generated) from the characteristic values of the acoustic characteristics obtained by analyzing the music data of the piece to be processed.
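The keyword table and template approach of implementation (b) can be sketched in Python as below. The keywords, value bands, and template wording are all hypothetical, as is the assumption that each characteristic value is a number that falls into one of several bands; the specification states only that characteristic values are converted to keywords and embedded in the blanks of a per-genre template.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword tables: genre -> characteristic -> [(upper bound, keyword)].
KEYWORD_TABLES: Dict[str, Dict[str, List[Tuple[float, str]]]] = {
    "classical": {
        "volume": [(0.5, "delicate"), (float("inf"), "majestic")],
        "melody": [(0.5, "crisp"), (float("inf"), "flowing")],
    },
    "pop": {
        "volume": [(0.5, "mellow"), (float("inf"), "punchy")],
        "melody": [(0.5, "snappy"), (float("inf"), "smooth")],
    },
}

# Hypothetical templates whose blanks are named after the characteristics.
TEMPLATES: Dict[str, str] = {
    "classical": "A {volume} performance with a {melody} melody.",
    "pop": "This track pairs a {volume} sound with a {melody} tune.",
}


def to_keyword(genre: str, characteristic: str, value: float) -> str:
    """Return the keyword of the first band whose upper bound exceeds value."""
    for upper, keyword in KEYWORD_TABLES[genre][characteristic]:
        if value < upper:
            return keyword
    raise ValueError(f"no keyword band covers value {value!r}")


def generate_intro(genre: str, values: Dict[str, float]) -> str:
    """Convert each characteristic value to a keyword and fill the template."""
    keywords = {name: to_keyword(genre, name, v) for name, v in values.items()}
    return TEMPLATES[genre].format(**keywords)
```

Compared with a full table of finished sentences, this arrangement needs only one template per genre, at the cost of less varied wording.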

The OS program 153 is a program for causing the control unit 110 to implement an operating system (hereinafter "OS"). When the power supply (not shown) of the liner notes generating device 10 is turned on, the control unit 110 immediately loads the OS program 153 from the nonvolatile storage unit 150 into the volatile storage unit 140 and begins executing it. While operating in accordance with the OS program 153 and thereby implementing the OS, the control unit 110 can control the operation of each part of the liner notes generating device 10 and can start execution of other programs in accordance with a predetermined execution schedule or instructions from an operations administrator. In the liner notes generating device 10 of this embodiment, the control unit 110 operating in accordance with the OS program 153 starts the content distribution program 154 and the content registration program 155 each as a so-called daemon process.

The content distribution program 154 causes the control unit 110 to execute processing that, when a communication message requesting distribution of digital content is received, reads the corresponding digital content (in this embodiment, music data and the introduction sentence data corresponding to it) from the music database 151 and returns it to the sender of that communication message. In other words, the content distribution program 154 is a program for causing the liner notes generating device 10 to fulfill the content server role described above.

The content registration program 155 causes the control unit 110 to execute processing that generates introduction sentence data representing an introduction sentence for the piece represented by music data input via the external device IF unit 130, and registers that introduction sentence data and the music data in the music database 151 in association with each other. More specifically, the control unit 110 operating in accordance with the content registration program 155 executes the following three processes.

The first is a feature specifying process, which analyzes the music data input via the external device IF unit 130 and specifies the characteristic value indicating each of the plurality of types of acoustic characteristics described above for the piece represented by that music data. Although this embodiment describes the case where the music data to be processed is input to the liner notes generating device 10 via the external device IF unit 130, it may of course be input via the communication network. In that case, the communication IF unit 120 serves as the input means described above.

The second is an introduction sentence acquisition process, which searches the music introduction sentence database 152 using a genre identifier (an identifier indicating the genre to which the piece belongs) that the user has input by operating an operation unit (not shown) and the characteristic values specified by the feature specifying process, and acquires the corresponding introduction sentence data. Although this embodiment describes the case where the genre identifier is input via the operation unit, it may of course be input via the external device IF unit 130. Specifically, the genre identifier may be written on the recording medium in association with the music data to be processed, and the control unit 110 made to read the genre identifier together with the music data. Furthermore, although this embodiment describes the case where a genre identifier indicating the genre of the piece to be processed is input, the control unit 110 may of course be made to determine the genre of the piece from the result of the feature specifying process (that is, the degree of each acoustic characteristic of the piece to be processed), in which case no genre identifier need be input.

そして、第３に、ユーザが操作部（図示省略）を適宜操作することにより入力した楽曲識別子に対応付けて、処理対象の楽曲の楽曲データと、紹介文取得処理にて取得した紹介文データとを楽曲データベース１５１に書き込むコンテンツ登録処理である。
以上がライナーノーツ生成装置１０の構成である。 The third is a content registration process that writes the music data of the music to be processed and the introduction sentence data acquired in the introduction sentence acquisition process into the music database 151, in association with a music identifier input by the user through the operation unit (not shown).
The above is the configuration of the liner notes generating apparatus 10.

（Ａ−２：動作）
次いで、ライナーノーツ生成装置１０が実行する動作のうち、本発明に係る楽曲紹介文生成装置の特徴を顕著に示す動作、すなわち、コンテンツ登録プログラム１５５にしたがって制御部１１０が行う動作について図面を参照しつつ説明する。なお、以下に説明する動作例では、ライナーノーツ生成装置１０の電源（図示省略）は投入済みであり、コンテンツ登録プログラム１５５は、ＯＳの制御下で前述したデーモンプロセスとして制御部１１０によりその実行が開始されているとする。 (A-2: Operation)
Next, among the operations executed by the liner notes generating apparatus 10, those that most clearly show the features of the music introduction sentence generating device according to the present invention, namely the operations performed by the control unit 110 in accordance with the content registration program 155, are described with reference to the drawings. In the operation example below, it is assumed that the power (not shown) of the liner notes generating apparatus 10 is already on and that the control unit 110 has started executing the content registration program 155 as the aforementioned daemon process under the control of the OS.

上記状況下で、処理対象である楽曲データを格納した記録媒体が外部機器ＩＦ部１３０に接続され、さらに、その楽曲データの表す楽曲の属するジャンルを示すジャンル識別子が操作部（図示省略）を介して入力されると、コンテンツ登録プログラム１５５にしたがって作動している制御部１１０は、図４のフローチャートに示す処理を実行する。図４に示すように、制御部１１０は、まず、処理対象である楽曲の楽曲データを外部機器ＩＦ部１３０を介して記録媒体から読み出し、その楽曲データを解析して、その楽曲データの表す楽曲についての前述した複数種類の音響特性（楽曲紹介文データベースにキーワードが登録されている音響特性）の各々を示す特性値を算出することにより、各音響特性を特定する（ステップＳＡ１００）。このステップＳＡ１００の実行にあたっては、各音響特性についての周知のアルゴリズムを利用すれば良い。なお、処理対象の楽曲について上記複数種類の音響特性を特定する際には、楽曲全体を通して各音響特性の度合いを特定するとしても良く、楽曲の前半部分や後半部分など間奏で区切られるパート毎に各音響特性の度合いを特定するとしても良い。楽曲全体を通して各音響特性の度合いを特定する態様にあっては、その楽曲全体に対して１つの紹介文データが以下の処理により生成されることになり、パート毎に各音響特性の度合いを特定する態様にあっては、それらパート毎に以下のステップＳＡ１１０および１２０の処理が実行され、それらパート毎に紹介文データが生成されることになる。 Under the above circumstances, a recording medium storing the music data to be processed is connected to the external device IF unit 130, and a genre identifier indicating the genre to which the music represented by the music data belongs via an operation unit (not shown). The control unit 110 operating according to the content registration program 155 executes the process shown in the flowchart of FIG. As shown in FIG. 4, the control unit 110 first reads the music data of the music to be processed from the recording medium via the external device IF unit 130, analyzes the music data, and represents the music represented by the music data. Each acoustic characteristic is specified by calculating a characteristic value indicating each of the above-described plural types of acoustic characteristics (acoustic characteristics in which keywords are registered in the music introduction sentence database) (step SA100). In executing step SA100, a known algorithm for each acoustic characteristic may be used. 
In addition, when specifying the plurality of types of acoustic characteristics for the music to be processed, the degree of each acoustic characteristic may be specified throughout the music, and for each part separated by an interlude such as the first half or the second half of the music The degree of each acoustic characteristic may be specified. In the aspect of specifying the degree of each acoustic characteristic throughout the entire piece of music, one introductory sentence data is generated for the whole piece of music by the following process, and the degree of each acoustic characteristic is specified for each part. In this mode, the following steps SA110 and 120 are executed for each part, and introduction text data is generated for each part.
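By way of illustration only, the two modes of step SA100 (one set of characteristic values for the whole piece, or one set per part) may be sketched as follows. The two features used here (RMS level and zero-crossing rate), the function names, and the representation of music data as a list of float samples are assumptions made for this sketch; the specification states only that a known algorithm may be used for each acoustic characteristic.

```python
import math


def characteristic_values(samples):
    """Return illustrative characteristic values for float samples in [-1, 1]."""
    if not samples:
        return {"loudness": 0.0, "brightness": 0.0}
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Zero-crossing rate as a crude stand-in for "brightness" (assumption).
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(samples) - 1) if len(samples) > 1 else 0.0
    return {"loudness": rms, "brightness": zcr}


def analyze(samples, part_boundaries=None):
    """Whole-piece mode when part_boundaries is None, otherwise one result per part.

    part_boundaries is a list of sample indices at which the piece is split.
    """
    if part_boundaries is None:
        return [characteristic_values(samples)]
    edges = [0] + list(part_boundaries) + [len(samples)]
    return [characteristic_values(samples[a:b]) for a, b in zip(edges, edges[1:])]
```

In the per-part mode, each element of the returned list would then be passed separately through steps SA110 and SA120.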

ステップＳＡ１００に後続して実行されるステップＳＡ１１０においては、制御部１１０は、操作部を介して入力されたジャンル識別子およびステップＳＡ１００にて特定した各特性値を用いて楽曲紹介文データベース１５２の検索を行い、それら特性値およびジャンル識別子に対応する紹介文データを取得する。例えば、楽曲紹介文データベース１５２が前述した第１の態様で実装されている場合には、制御部１１０は、上記ジャンル識別子で識別されるジャンルに対応する紹介文テーブルから、上記各特性値に対応する紹介文データを読み出すことにより、処理対象の楽曲についての紹介文データを取得する。また、楽曲紹介文データベース１５２が前述した第２の態様で実装されている場合には、制御部１１０は、上記ジャンル識別子に対応するキーワードテーブルを参照して各特性値に対応するキーワードを特定し、それらキーワードを上記ジャンル識別子に対応する紹介文テンプレートに埋め込んで紹介文データを生成することによって、処理対象の楽曲についての紹介文データを取得する。 In step SA110 executed subsequent to step SA100, the control unit 110 searches the music introduction sentence database 152 using the genre identifier input via the operation unit and the characteristic values specified in step SA100. To obtain introductory sentence data corresponding to the characteristic values and genre identifiers. For example, when the music introduction sentence database 152 is implemented in the first aspect described above, the control unit 110 corresponds to each characteristic value from the introduction sentence table corresponding to the genre identified by the genre identifier. The introductory text data about the music to be processed is acquired by reading the introductory text data to be processed. When the music introduction sentence database 152 is implemented in the above-described second mode, the control unit 110 refers to the keyword table corresponding to the genre identifier and identifies the keyword corresponding to each characteristic value. The introductory sentence data about the music to be processed is acquired by embedding those keywords in the introductory sentence template corresponding to the genre identifier and generating the introductory sentence data.
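The two implementations of the music introduction sentence database 152 referred to in step SA110 may be contrasted with a minimal sketch. All table contents, characteristic names, and the threshold-based `quantize` helper below are hypothetical and serve only to show the difference between reading a finished sentence (first mode) and embedding keywords in a template (second mode).

```python
# Hypothetical contents of the music introduction sentence database 152.

# First mode: (genre, quantized characteristic levels) -> finished sentence.
INTRO_TABLE = {
    ("jazz", ("fast", "bright")): "An up-tempo jazz number with a bright sound.",
}

# Second mode: genre -> characteristic -> level -> keyword, plus one template per genre.
KEYWORD_TABLE = {
    "jazz": {
        "tempo": {"fast": "up-tempo", "slow": "relaxed"},
        "tone": {"bright": "bright", "dark": "mellow"},
    },
}
TEMPLATES = {"jazz": "This jazz number is {tempo} with a {tone} sound."}


def quantize(value, threshold, low, high):
    """Map a numeric characteristic value to a coarse level (assumed scheme)."""
    return high if value >= threshold else low


def intro_from_table(genre, levels):
    """First mode: read the sentence stored for this genre and these levels."""
    return INTRO_TABLE.get((genre, tuple(levels)))


def intro_from_template(genre, levels):
    """Second mode: look up one keyword per characteristic and fill the template."""
    keywords = {name: KEYWORD_TABLE[genre][name][level] for name, level in levels.items()}
    return TEMPLATES[genre].format(**keywords)
```

The first mode needs one stored sentence per combination of levels, whereas the second mode needs only one keyword per level and one template per genre, at the cost of less varied wording.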

そして、制御部１１０は、ステップＳＡ１１０にて取得した紹介文データを上記処理対象の楽曲についてのライナーノーツデータとして、その楽曲についての楽曲データとともにその楽曲の楽曲識別子と対応付けて楽曲データベース１５１に格納する（ステップＳＡ１２０）。以降、制御部１１０は、通信ＩＦ部１２０を介してコンテンツ配信要求メッセージを受信すると、その要求内容に応じたコンテンツ（楽曲データやライナーノーツデータ、または、その両者）を楽曲データベース１５１から読み出し、そのコンテンツ要求メッセージの送信元へ配信する。なお、このようなコンテンツ配信を行う場合には、所定の課金アルゴリズムにしたがって課金を行っても良いことは勿論である。このようにしてライナーノーツ生成装置１０より配信されるライナーノーツデータの表す文章が、そのライナーノーツデータと対になる楽曲の特徴を反映したものであることは、前述した通りである。また、本実施形態によれば、任意の楽曲についてその楽曲の音響特性と楽曲紹介文データベース１５２の格納内容とから、その楽曲の特徴を反映した紹介文（ライナーノーツ）が自動生成されるのであるから、音楽評論家などの専門家の手を煩わせることもない。 Then, the control unit 110 stores the introduction sentence data acquired in step SA110 in the music database 151 as liner notes data for the music to be processed, together with the music data of that music and in association with its music identifier (step SA120). Thereafter, on receiving a content distribution request message via the communication IF unit 120, the control unit 110 reads the content corresponding to the request (music data, liner notes data, or both) from the music database 151 and distributes it to the sender of that request message. Such content distribution may of course be charged for according to a predetermined charging algorithm. As described above, the text represented by the liner notes data distributed from the liner notes generating apparatus 10 reflects the characteristics of the music paired with that liner notes data. Furthermore, according to this embodiment, an introduction sentence (liner notes) reflecting the characteristics of any given piece of music is generated automatically from the acoustic characteristics of that music and the stored contents of the music introduction sentence database 152, so there is no need to trouble an expert such as a music critic.
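The registration of step SA120 and the subsequent content distribution may be pictured with the following sketch. The in-memory dictionary standing in for the music database 151, the `Content` record, and the request types `"music"`, `"liner_notes"`, and `"both"` are assumptions introduced here; the specification does not define a storage layout or message format, and charging is omitted.

```python
from dataclasses import dataclass


@dataclass
class Content:
    music_data: bytes
    liner_notes: str


# Stand-in for the music database 151, keyed by music identifier (assumption).
music_database = {}


def register_content(music_id, music_data, intro_text):
    """Step SA120: store music data and liner notes under one music identifier."""
    music_database[music_id] = Content(music_data, intro_text)


def handle_delivery_request(music_id, want):
    """Return the requested content: 'music', 'liner_notes' or 'both'."""
    entry = music_database[music_id]
    if want == "music":
        return {"music": entry.music_data}
    if want == "liner_notes":
        return {"liner_notes": entry.liner_notes}
    if want == "both":
        return {"music": entry.music_data, "liner_notes": entry.liner_notes}
    raise ValueError(f"unknown request type: {want}")
```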

（Ｂ：第２実施形態）
次いで、本発明の第２実施形態に係るナレーション付加装置３０について説明する。
ナレーション付加装置３０は、前述したライナーノーツ生成装置１０と同様、楽曲とその紹介文とをデジタルコンテンツとして配信するコンテンツサーバであり、楽曲を表す楽曲データを解析することによって、その楽曲の特徴を反映した紹介文を示す紹介文データを生成する点については、前述したライナーノーツ生成装置１０と同一である。しかしながら、このナレーション付加装置３０は、上記紹介文データの示す紹介文の読み上げ音であるナレーションをその楽曲の前奏区間や間奏区間、後奏区間などの所定区間に重畳させた楽曲データを合成する点が前述したライナーノーツ生成装置１０と異なっている。
以下、ナレーション付加装置３０の構成および動作について図面を参照しつつ説明する。 (B: Second embodiment)
Next, a narration adding device 30 according to a second embodiment of the present invention will be described.
Like the liner notes generating apparatus 10 described above, the narration adding device 30 is a content server that distributes music and its introduction sentences as digital content, and it is identical to the liner notes generating apparatus 10 in that it generates introduction sentence data representing an introduction sentence that reflects the characteristics of a piece of music by analyzing the music data representing that music. It differs from the liner notes generating apparatus 10, however, in that it synthesizes music data in which a narration, that is, the introduction sentence read aloud, is superimposed on a predetermined section of the music such as a prelude section, an interlude section, or a postlude section.
Hereinafter, the configuration and operation of the narration adding device 30 will be described with reference to the drawings.

（Ｂ−１：構成）
図５は、本発明の第２実施形態に係るナレーション付加装置３０の構成例を示すブロック図である。図５と図１とを比較すれば明らかなように、ナレーション付加装置３０のハードウェア構成は、前述したライナーノーツ生成装置１０のハードウェア構成（図１参照）と同一である。具体的には、ナレーション付加装置３０は、制御部１１０、通信ＩＦ部１２０、外部機器ＩＦ部１３０、揮発性記憶部１４０、不揮発性記憶部１５０およびバス１６０を有している。しかしながら、ナレーション付加装置３０に特徴的な処理を実現するため、ナレーション付加装置３０のソフトウェア構成（不揮発性記憶部１５０に格納されているデータやプログラム）は、ライナーノーツ生成装置１０のソフトウェア構成とは異なっている。 (B-1: Configuration)
FIG. 5 is a block diagram showing a configuration example of the narration adding device 30 according to the second embodiment of the present invention. As is apparent from a comparison between FIG. 5 and FIG. 1, the hardware configuration of the narration adding device 30 is the same as that of the liner notes generating apparatus 10 described above (see FIG. 1). Specifically, the narration adding device 30 includes a control unit 110, a communication IF unit 120, an external device IF unit 130, a volatile storage unit 140, a nonvolatile storage unit 150, and a bus 160. However, to realize the processing characteristic of the narration adding device 30, its software configuration (the data and programs stored in the nonvolatile storage unit 150) differs from that of the liner notes generating apparatus 10.

具体的には、楽曲データベース１５１および楽曲紹介文データベース１５２の他に、音声合成データベース２５６が不揮発性記憶部１５０に格納されている点、およびコンテンツ登録プログラム１５５に換えてコンテンツ登録プログラム２５５が不揮発性記憶部１５０に格納されている点が異なっている。音声合成データベース２５６は、所謂音韻データベースであり、音声合成の際に参照される音韻データの集合体である。この音声合成データベース２５６には、発生する音声の種類、例えば、男声、女声、軽快な声質、深みのある声質など様々な声質ごとに音韻データの集合体が予め格納されている。この音声合成データベース２５６の格納内容は、処理対象の楽曲データについて生成された紹介文を表す文章データを音声データに変換して、その楽曲データの表す楽曲についてのナレーションを表すナレーションデータを生成する際に利用される。なお、本実施形態では、何れの声質の音声を合成するかをユーザに指定させる場合について説明するが、例えば、音韻データに対応付けてその音韻データで合成される音声でのナレーションが好適な楽曲のジャンルを示すジャンル識別子を対応付けておき、処理対象の楽曲の属するジャンルに好適な声質でナレーションを合成しても良い。 Specifically, the software configuration differs in that a speech synthesis database 256 is stored in the nonvolatile storage unit 150 in addition to the music database 151 and the music introduction sentence database 152, and in that a content registration program 255 is stored in the nonvolatile storage unit 150 in place of the content registration program 155. The speech synthesis database 256 is a so-called phoneme database, that is, a collection of phoneme data referred to during speech synthesis. It stores in advance a collection of phoneme data for each type of voice to be generated, for example for each of various voice qualities such as a male voice, a female voice, a light voice, and a deep voice. The stored contents of the speech synthesis database 256 are used when the text data representing the introduction sentence generated for the music data to be processed is converted into voice data to generate narration data representing the narration for the music represented by that music data. In this embodiment the user specifies which voice quality is to be synthesized; alternatively, each set of phoneme data may be associated with a genre identifier indicating the genre of music for which narration in the resulting voice is suitable, so that the narration is synthesized with a voice quality suited to the genre of the music to be processed.
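The choice between a user-specified voice quality and a genre-based one may be sketched as below. The voice names, the genre tags attached to each voice, and the fallback default are hypothetical; the specification says only that phoneme data may be associated with genre identifiers.

```python
# Hypothetical index over the speech synthesis database 256:
# voice quality -> genres for which narration in that voice is considered suitable.
VOICE_GENRES = {
    "male_deep": {"jazz", "classical"},
    "female_light": {"pop"},
}


def choose_voice(genre, user_choice=None, default="female_light"):
    """Prefer the user's explicit choice; otherwise pick a voice tagged for the genre."""
    if user_choice is not None:
        return user_choice
    for voice, genres in VOICE_GENRES.items():
        if genre in genres:
            return voice
    # No voice is tagged for this genre: fall back to an assumed default.
    return default
```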

一方、コンテンツ登録プログラム２５５は、処理対象である楽曲の楽曲データを解析して得られる各特性値とその楽曲のジャンル識別子とを用いて楽曲紹介文データベースを検索し、その楽曲の特徴を反映した紹介文データを取得する処理を制御部１１０に実行させる点は、前述したコンテンツ登録プログラム１５５と同一である。しかしながら、コンテンツ登録プログラム２５５は、上記のようにして取得した紹介文データの表す文章を読み上げた音声を示すナレーションデータをその紹介文データと音声合成データベースの格納内容とから合成し、その音声をナレーションとして上記楽曲データの表す楽曲に重畳させる処理を制御部１１０に実行させる点が前述したコンテンツ登録プログラム１５５と異なる。
以上がナレーション付加装置３０の構成である。 The content registration program 255, on the other hand, is identical to the content registration program 155 described above in that it causes the control unit 110 to search the music introduction sentence database using the genre identifier of the music to be processed and the characteristic values obtained by analyzing its music data, and to acquire introduction sentence data reflecting the characteristics of the music. It differs from the content registration program 155 in that it also causes the control unit 110 to synthesize, from the introduction sentence data and the stored contents of the speech synthesis database, narration data representing the introduction sentence read aloud, and to superimpose that voice as narration on the music represented by the music data.
The above is the configuration of the narration adding device 30.

（Ｂ−２：動作）
次いで、ナレーション付加装置が行う動作のうち、本発明に係る楽曲紹介文生成装置の特徴を顕著に示す動作、すなわち、コンテンツ登録プログラム２５５にしたがって制御部１１０が実行する動作について説明する。
図６は、コンテンツ登録プログラム２５５にしたがって制御部１１０が実行する動作の流れを示すフローチャートである。図４と図６とを比較すれば明らかなように、コンテンツ登録プログラム２５５にしたがって制御部１１０が行う動作は、ステップＳＡ１１０の処理の実行後、ステップＳＢ１００からステップＳＢ１２０までの３つの処理を実行した後に、ステップＳＡ１２０の処理を実行する点のみが、コンテンツ登録プログラム１５５にしたがって制御部１１０が実行する動作と異なっている。以下、ライナーノーツ生成装置１０の動作との相違点であるステップＳＢ１００からステップＳＢ１２０の処理を中心に説明する。 (B-2: Operation)
Next, among the operations performed by the narration adding device, the operation that remarkably shows the characteristics of the music introduction sentence generating device according to the present invention, that is, the operation executed by the control unit 110 according to the content registration program 255 will be described.
FIG. 6 is a flowchart showing the flow of operations executed by the control unit 110 in accordance with the content registration program 255. As is apparent from a comparison between FIG. 4 and FIG. 6, these operations differ from those executed in accordance with the content registration program 155 only in that, after step SA110, the three processes of steps SB100 to SB120 are executed before step SA120. The following description therefore focuses on steps SB100 to SB120, which are the points of difference from the operation of the liner notes generating apparatus 10.

ステップＳＢ１００においては、制御部１１０は、処理対象である楽曲の楽曲データを解析し、その楽曲データの表す楽曲のうちナレーションが付与されるべき区間を特定する。例えば、上記楽曲データが歌唱曲を表す場合には、制御部１１０は、その楽曲データにスペクトル解析を施し、そのスペクトル変化（すなわち、人の声に特徴的なスペクトルなど歌唱音に特徴的なスペクトルの有無）から歌い出しの位置を特定し、その歌い出し位置よりも前の区間（すなわち、前奏区間）をナレーションが付与されるべき区間として特定する。なお、本実施形態では、ナレーションが付与されるべき区間として楽曲の前奏区間を用いる場合について説明したが、間奏区間や後奏区間であっても良いことは勿論である。間奏区間や後奏区間の特定に関しては、前述した前奏区間の特定と同様で歌唱音に特徴的なスペクトルの検出によりその特定が可能である。また、歌唱を伴わない楽曲については、音量の変化から前奏区間等を特定しても良く、メロディ抽出などによりメインメロディと推定されるパートを特定し、そのパートに属するスペクトルの有無により前奏区間等の特定を行っても良い。 In step SB100, the control unit 110 analyzes the music data of the music to be processed and identifies the section of the music to which narration should be added. For example, when the music data represents a vocal piece, the control unit 110 performs spectrum analysis on the music data, identifies the position where the singing begins from the change in the spectrum (that is, from the presence or absence of a spectrum characteristic of a singing sound, such as a spectrum characteristic of the human voice), and identifies the section before that position (that is, the prelude section) as the section to which narration should be added. Although this embodiment uses the prelude section as the section to which narration is added, an interlude section or a postlude section may of course be used instead. These sections can be identified in the same way as the prelude section, by detecting the spectrum characteristic of the singing sound. For music without singing, the prelude section and the like may be identified from changes in volume, or a part presumed to be the main melody may be identified by melody extraction or the like and the prelude section and the like identified from the presence or absence of the spectrum belonging to that part.
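Of the approaches mentioned for step SB100, the volume-based one for music without singing is the simplest to illustrate. The sketch below is only one possible reading of "identified from changes in volume": it assumes the prelude is quieter than what follows and reports the first frame whose RMS level exceeds a multiple of the opening level. The frame length, the ratio, and the function names are assumptions, and the spectrum-based detection of a singing voice is not covered.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of each complete, non-overlapping frame."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_prelude_end(samples, sample_rate, frame_len=1024, ratio=2.0):
    """Return the time in seconds of the first frame whose RMS level exceeds
    `ratio` times the level of the first frame, or None if there is no such frame."""
    levels = frame_rms(samples, frame_len)
    if not levels:
        return None
    baseline = max(levels[0], 1e-9)  # avoid a zero baseline for digital silence
    for index, level in enumerate(levels):
        if level > ratio * baseline:
            return index * frame_len / sample_rate
    return None
```

The section from time zero up to the returned time would then be treated as the prelude section; a piece that starts at full volume yields None under this heuristic and would need another method.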

次いで、制御部１１０は、ステップＳＡ１１０にて取得した紹介文データの表す紹介文を読み上げた音声を示す音声データを、その紹介文データとユーザの指定する声質に対応する音韻データとから生成し（ステップＳＢ１１０）、ステップＳＢ１００にて特定したナレーション付加区間にその音声データの表す音声が重畳されるように、その音声データと処理対象の楽曲データとを合成する（ステップＳＢ１２０）。なお、本実施形態では、処理対象の楽曲データに対してステップＳＡ１００からステップＳＡ１１０の処理により得られる紹介文データの表す紹介文の音声を音声合成により生成し、その楽曲の所定区間に重畳する場合について説明した。しかし、処理対象の楽曲データとともにその紹介文を示す紹介文データ（例えば、ユーザ自らがワードプロセッサなどで作成した紹介文データ）をナレーション付加装置３０に与え、その紹介文データについてステップＳＢ１００以降の処理を実行しても良いことは勿論である。 Next, the control unit 110 generates speech data indicating the speech read out from the introductory sentence represented by the introductory sentence data acquired in step SA110 from the introductory sentence data and phonological data corresponding to the voice quality designated by the user ( In step SB110, the voice data and the music data to be processed are synthesized so that the voice represented by the voice data is superimposed on the narration addition section identified in step SB100 (step SB120). In the present embodiment, the speech of the introductory text represented by the introductory text data obtained by the processing from step SA100 to step SA110 is generated by speech synthesis for the music data to be processed and is superimposed on a predetermined section of the music. Explained. However, introductory text data (for example, introductory text data created by the user himself / herself with a word processor) indicating the introductory text along with the music data to be processed is given to the narration adding device 30, and the processing after step SB100 is performed on the introductory text data. Of course, it may be executed.
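The superimposition of step SB120 may be sketched as a sample-wise mix, assuming that both the music and the synthesized narration are available as lists of float samples in the range [-1.0, 1.0] at the same sample rate. Attenuating the music under the narration and clipping the sum are design choices made for this sketch and are not taken from the specification; speech synthesis itself (step SB110) is outside its scope.

```python
def mix_narration(music, narration, start, music_gain=0.5):
    """Overlay `narration` on `music` beginning at sample index `start`.

    The music is attenuated by `music_gain` only where narration is present,
    and each mixed sample is clipped to [-1.0, 1.0]. The input lists are not modified.
    """
    mixed = list(music)
    for offset, voice in enumerate(narration):
        index = start + offset
        if index >= len(mixed):
            break  # narration runs past the end of the music: drop the tail
        value = mixed[index] * music_gain + voice
        mixed[index] = max(-1.0, min(1.0, value))
    return mixed
```

Here `start` would be the sample index corresponding to the beginning of the narration addition section identified in step SB100.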

そして、制御部１１０は、ステップＳＢ１２０にて合成したナレーション付の楽曲データに、ステップＳＡ１１０にて生成した紹介文データを対応付けて楽曲データベースに書き込み（ステップＳＡ１２０）、本動作を終了する。なお、本実施形態では、ステップＳＡ１１０にて取得した紹介文データの表す紹介文全体を読み上げた音声をナレーションとして付加する場合について説明したが、ナレーションを付加するべき区間の時間長が短く、上記紹介文全体を読み上げた音声を重畳することができない場合には、既存の文章要約アルゴリズムを用いて上記紹介文データの表す紹介文を適宜要約した後に音声合成を行うようにしても良い。また、ナレーションを付与するべき区間として複数の区間がステップＳＢ１００にて特定された場合には、紹介文データの表す紹介文の長さとそれら複数の区間の各々の時間長とを比較してそれら複数の区間からナレーションを付与する区間を特定するとしても良い。 Then, the control unit 110 associates the introductory sentence data generated in step SA110 with the narrated music data synthesized in step SB120 and writes it in the music database (step SA120), and ends this operation. In the present embodiment, a case has been described in which the voice that reads out the entire introduction sentence represented by the introduction sentence data acquired in step SA110 is added as a narration. If speech that reads out the entire sentence cannot be superimposed, speech synthesis may be performed after appropriately summarizing the introduction sentence represented by the introduction sentence data using an existing sentence summarization algorithm. Further, when a plurality of sections are identified as sections to which narration is to be assigned in step SB100, the length of the introduction sentence represented by the introduction sentence data is compared with the time length of each of the plurality of sections. The section to which narration is given may be specified from the section.
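The comparison between the length of the introduction sentence and the time length of each candidate section may be sketched as follows. The fixed speaking rate used to estimate the narration duration and the rule of preferring the shortest section that still fits are assumptions; the specification leaves the selection rule open. A result of None corresponds to the case in which the introduction sentence would first be summarized.

```python
def estimate_duration(text, chars_per_second=8.0):
    """Rough reading time in seconds; the speaking rate is an assumed constant."""
    return len(text) / chars_per_second


def choose_section(text, sections, chars_per_second=8.0):
    """Return the shortest (start, end) section, in seconds, that can hold the
    narration, or None if no section is long enough."""
    needed = estimate_duration(text, chars_per_second)
    fitting = [s for s in sections if s[1] - s[0] >= needed]
    if not fitting:
        return None
    return min(fitting, key=lambda s: s[1] - s[0])
```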

以降、制御部１１０は、通信ＩＦ部１２０を介してコンテンツ配信要求メッセージを受信すると、その要求内容に応じたコンテンツ（楽曲データやライナーノーツデータ、または、その両者）を楽曲データベース１５１から読み出し、そのコンテンツ要求メッセージの送信元へ配信する。これにより、任意の楽曲データを、その楽曲データの表す楽曲の前奏区間にその楽曲の特徴を反映したナレーションを付加した楽曲を表す楽曲データに変換して配信することが可能になる。また、本第２実施形態においても、上記ナレーション付の楽曲データを生成する際に、音楽評論家などの専門家の手を煩わせることがないことは、前述した第１実施形態と同様である。 Thereafter, on receiving a content distribution request message via the communication IF unit 120, the control unit 110 reads the content corresponding to the request (music data, liner notes data, or both) from the music database 151 and distributes it to the sender of that request message. This makes it possible to convert arbitrary music data into music data in which a narration reflecting the characteristics of the music has been added to its prelude section, and to distribute the result. As in the first embodiment described above, generating the narrated music data in this second embodiment does not require troubling an expert such as a music critic.

（Ｃ：他の実施形態）
以上、本発明の一実施形態について説明したが、係る実施形態に以下に説明する変形を加えても良いことは勿論である。
（１）上述した各実施形態では、本発明に係る楽曲紹介文生成装置に特徴的な処理をソフトウェアモジュールで実現する場合について説明したが、その少なくとも１部または全部を電子回路（すなわち、ハードウェアモジュール）で実現しても勿論良い。 (C: other embodiment)
Although one embodiment of the present invention has been described above, it goes without saying that modifications described below may be added to such an embodiment.
(1) In each of the above-described embodiments, the processing characteristic of the music introduction sentence generating device according to the present invention is implemented by software modules, but part or all of it may of course be implemented by electronic circuits (that is, hardware modules).

（２）上述した実施形態では、楽曲紹介文データベースの格納内容を楽曲のジャンル毎に分類しておく場合について説明したが、このような分類は必ずしも必須ではない。例えば、前述した第１の実施態様にあっては、各ジャンルに亘って共通の紹介文テーブルを用いても良く、また、第２の実施態様にあっては、各ジャンルに亘って共通のキーワードテーブルや各ジャンルに亘って共通の紹介文テンプレートを用いても勿論良い。 (2) In the above-described embodiment, the case has been described in which the contents stored in the music introduction sentence database are classified for each genre of music, but such classification is not necessarily essential. For example, in the first embodiment described above, a common introduction sentence table may be used for each genre, and in the second embodiment, a keyword common to each genre. Of course, a common introductory sentence template may be used across tables and genres.

また、上述した実施形態では、処理対象である楽曲の楽曲データを解析して、予め定められた複数種類の音響特性についてのその楽曲の特徴を示す特性値を算出し、その特性値を用いてその楽曲についての楽曲紹介文を示す紹介文データを取得する場合について説明したが、例えば、操作部を適宜操作することにより入力される楽曲名や演奏者の氏名等を上記紹介文データの表す紹介文に付加しても勿論良い。具体的には、「それでは、○○の演奏する□□をお楽しみください。」という文章テンプレートを楽曲紹介文データベースに格納しておき、その文章テンプレートの○○部分にユーザの入力した演奏者氏名を埋め込み、同□□部分にユーザの入力した楽曲名を埋め込んで得られる文章データを上記紹介文データの末尾に付加するようにすれば良い。 In the above-described embodiments, the music data of the music to be processed is analyzed to calculate characteristic values indicating the features of the music with respect to a plurality of predetermined types of acoustic characteristics, and those characteristic values are used to acquire introduction sentence data representing an introduction sentence for the music. In addition, the title of the music, the name of the performer, and the like, input through the operation unit, may of course be appended to the introduction sentence represented by the introduction sentence data. Specifically, a sentence template such as "Now, please enjoy □□ performed by ○○." may be stored in the music introduction sentence database, and the sentence data obtained by embedding the performer name input by the user in the ○○ part of the template and the music title input by the user in the □□ part may be appended to the end of the introduction sentence data.
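The template-based closing sentence described above may be sketched with a standard string template. The English wording of the template and the placeholder names `$performer` and `$title` (standing for the ○○ and □□ parts respectively) are assumptions made for this sketch.

```python
from string import Template

# Assumed English rendering of the sentence template stored in the database.
CLOSING_TEMPLATE = Template("Now, please enjoy $title performed by $performer.")


def append_closing(intro_text, performer, title):
    """Embed the user-supplied performer name and music title in the template
    and append the result to the end of the introduction sentence."""
    closing = CLOSING_TEMPLATE.substitute(performer=performer, title=title)
    return f"{intro_text} {closing}"
```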

（３）上述した各実施形態では、ライナーノーツ生成装置１０やナレーション付加装置３０に、本発明に係る楽曲紹介文生成装置に特徴的な処理を制御部１１０に実行させるためのプログラム（コンテンツ登録プログラム）を予めインストールしておく場合について説明した。しかしながら、例えばＣＤ−ＲＯＭやＤＶＤ（Digital Versatile Disk）などコンピュータ装置読み取り可能な記録媒体に上記コンテンツ登録プログラムを記録して配布しても良く、また、インターネットなどの電気通信回線経由のダウンロードにより上記コンテンツ登録プログラムを配布しても良い。 (3) In each of the above-described embodiments, a program (content registration program) that causes the control unit 110 to execute the processing characteristic of the music introduction sentence generating device according to the present invention is installed in advance in the liner notes generating apparatus 10 or the narration adding device 30. However, the content registration program may instead be recorded on and distributed via a computer-readable recording medium such as a CD-ROM or DVD (Digital Versatile Disk), or may be distributed by download over a telecommunication line such as the Internet.

本発明の第１実施形態に係るライナーノーツ生成装置１０の構成例を示すブロック図である。 A block diagram showing a configuration example of the liner notes generating apparatus 10 according to the first embodiment of the present invention.
同ライナーノーツ生成装置１０が有する楽曲紹介文データベース１５２の働きを説明するための図である。 A diagram for explaining the function of the music introduction sentence database 152 of the liner notes generating apparatus 10.
第１の実施態様に係る楽曲紹介文データベース１５２の格納内容の一例を説明するための図である。 A diagram for explaining an example of the stored contents of the music introduction sentence database 152 according to the first mode of implementation.
同ライナーノーツ生成装置１０の制御部１１０が実行するライナーノーツ生成処理の流れを示すフローチャートである。 A flowchart showing the flow of the liner notes generation process executed by the control unit 110 of the liner notes generating apparatus 10.
本発明の第２実施形態に係るナレーション付加装置３０の構成例を示すブロック図である。 A block diagram showing a configuration example of the narration adding device 30 according to the second embodiment of the present invention.
同ナレーション付加装置３０の制御部１１０が実行するナレーション付与処理の流れを示すフローチャートである。 A flowchart showing the flow of the narration adding process executed by the control unit 110 of the narration adding device 30.

Explanation of symbols

１０…ライナーノーツ生成装置、３０…ナレーション付加装置、１１０…制御部、１２０…通信ＩＦ部、１３０…外部機器ＩＦ部、１４０…揮発性記憶部、１５０…不揮発性記憶部、１５１…楽曲データベース、１５２…楽曲紹介文データベース、１５３…ＯＳプログラム、１５４…コンテンツ配信プログラム、１５５，２５５…コンテンツ登録プログラム、２５６…音声合成データベース、１６０…バス。 10: liner notes generating apparatus, 30: narration adding device, 110: control unit, 120: communication IF unit, 130: external device IF unit, 140: volatile storage unit, 150: nonvolatile storage unit, 151: music database, 152: music introduction sentence database, 153: OS program, 154: content distribution program, 155, 255: content registration program, 256: speech synthesis database, 160: bus.

Claims

A music introduction sentence generating device comprising:
a music introduction sentence database that stores data for generating or specifying one music introduction sentence from characteristic values indicating each of a plurality of predetermined types of acoustic characteristics;
input means for inputting music data;
feature specifying means for analyzing the music data received by the input means and specifying characteristic values indicating each of the plurality of types of acoustic characteristics; and
acquisition means for acquiring, from the characteristic values specified by the feature specifying means and the stored contents of the music introduction sentence database, introduction sentence data representing an introduction sentence for the music represented by the music data received by the input means, and outputting the introduction sentence data.

The music introduction sentence generating device according to claim 1, wherein
the stored contents of the music introduction sentence database are classified by music genre, and
the acquisition means acquires the introduction sentence data for the music represented by the music data received by the input means by referring to the data stored in the music introduction sentence database in association with either the genre indicated by an externally input genre identifier, which indicates the genre to which that music belongs, or the genre of the music determined from the characteristic values specified by the feature specifying means.

A narration adding device comprising:
input means for inputting music data;
section specifying means for analyzing the music data received by the input means and specifying a prelude section, an interlude section, or a postlude section of the music represented by the music data;
acquisition means for acquiring sentence data representing an introduction sentence reflecting the characteristics of the music represented by the music data received by the input means;
voice synthesizing means for synthesizing, from the sentence data acquired by the acquisition means, voice data of a narration that is the introduction sentence represented by the sentence data read aloud; and
synthesizing means for synthesizing and outputting the voice data and the music data so that the voice represented by the voice data synthesized by the voice synthesizing means is superimposed on the section specified by the section specifying means.

The narration adding device according to claim 3, further comprising:
a music introduction sentence database that stores data for generating or specifying one music introduction sentence from characteristic values indicating each of a plurality of predetermined types of acoustic characteristics; and
feature specifying means for analyzing the music data received by the input means and specifying characteristic values indicating each of the plurality of types of acoustic characteristics,
wherein the acquisition means acquires introduction sentence data representing an introduction sentence for the music represented by the music data from the characteristic values specified by the feature specifying means and the stored contents of the music introduction sentence database.

A program that causes a computer device to function as:
feature specifying means for analyzing music data input to the computer device and specifying characteristic values indicating each of a plurality of predetermined types of acoustic characteristics; and
acquisition means for acquiring and outputting introduction sentence data representing an introduction sentence for the music represented by the music data, from the characteristic values specified by the feature specifying means and the stored contents of a music introduction sentence database that stores data for generating or specifying one music introduction sentence from characteristic values indicating each of the plurality of types of acoustic characteristics.

A program that causes a computer device to function as:
section specifying means for analyzing music data input to the computer device and specifying a prelude section, an interlude section, or a postlude section of the music represented by the music data;
acquisition means for acquiring sentence data representing an introduction sentence reflecting the characteristics of the music represented by the music data;
voice synthesizing means for synthesizing, from the sentence data acquired by the acquisition means, voice data of a narration that is the introduction sentence represented by the sentence data read aloud; and
synthesizing means for synthesizing and outputting the voice data and the music data so that the voice represented by the voice data synthesized by the voice synthesizing means is superimposed on the section specified by the section specifying means.