JP2005055547A

JP2005055547A - Music data formation system and program

Info

Publication number: JP2005055547A
Application number: JP2003206612A
Authority: JP
Inventors: Eiichiro Aoki; 栄一郎青木
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2003-08-07
Filing date: 2003-08-07
Publication date: 2005-03-03

Abstract

PROBLEM TO BE SOLVED: To provide a music data formation system capable of forming music data suited to video or narration through a simple operation on a portable terminal.

SOLUTION: When the video data to be reproduced and the music style of the music data to be formed are specified in a video and music style specification section A of a portable terminal PT, the specified video data is extracted from a video database VDB of a server SV. When a time acquisition section B acquires the reproduction time of the specified video data from that data, a composition engine EG functioning as a music data forming section starts composing, with the acquisition of the reproduction time as a trigger, and forms music data that complies with the specified music style and corresponds to the reproduction time. A reproduction section 4R then reproduces the formed music data together with the video data. Narration (reading voice) can be used in place of video, and the video or narration data can also be input from the camera or microphone of the portable terminal PT.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
この発明は、携帯用電話機などの携帯用情報処理端末において映像などに合わせて音楽を楽しむために、映像などに相応しい音楽データを作成することができる音楽データ作成システムに関する。
【０００２】
【従来の技術】
近年、携帯用電話機などの携帯通信端末の機能や性能は飛躍的に向上している。これに伴い、音楽的な面では、例えば、特許文献１には、音楽素材の入力を受けてサーバーが作成した音楽コンテンツを携帯通信端末に返信するようにしたものが提案されている。また、画像については、静止画再生だけでなく映像（動画像）までも再生できるようになっており、このような映像を再生する場合、そのＢＧＭ（背景音楽）が必然的に欲しくなる。これに対して、例えば、特許文献２には、映像に合った作曲をモバイルＰＣ上で実現することが開示されている。
【０００３】
【特許文献１】
特開２００２−５５６７９号公報
【特許文献２】
特開２００２−２８７７４６号公報（段落〔０１０９〕）
【０００４】
従って、携帯用電話機などの携帯通信端末において、映像に相応しい雰囲気で而も映像の再生に一致する音楽をより簡単に楽しむことができるような作曲システムが望まれるところである。また、映像だけでなく、ナレーション（朗読音声）に合う音楽を作成することができれば、携帯端末の利用範囲がより豊かなものになるものと期待される。
【０００５】
【発明が解決しようとする課題】
この発明は、携帯用電話機を含む携帯端末において映像などに合わせて音楽を楽しむことができる１つの方法を提供しようとするものであって、特に、携帯端末での簡単な操作によって、映像やナレーションに相応しい音楽データを作成することができる音楽データ作成システムを提供することを目的とする。
【０００６】
【課題を解決するための手段】
この発明の主たる特徴に従うと、作成される音楽データの曲風を指定する指定手段（Ａ，Ａ２）と、映像データ又はナレーションデータから当該データの再生時間を取得する時間取得手段（Ｂ）と、この再生時間を取得したことに応じて、当該再生時間に対応し、指定された曲風に従う音楽データを生成するデータ生成手段（ＥＧ）とを具備する音楽データ作成システム（請求項１）、並びに、作成される音楽データの曲風を指定するステップ（Ａ，Ａ２）と、映像データ又はナレーションデータから当該データの再生時間を取得するステップ（Ｂ）と、この再生時間を取得したことに応じて、当該再生時間に対応し、指定された曲風に従う音楽データを生成するステップ（ＥＧ）とから成る手順を情報処理装置に実行させるための音楽データ作成プログラム（請求項４）が提供される。なお、括弧書きは、理解の便のために、後で詳述する実施例において用いられる対応記号等であり、以下においても同様である。
【０００７】
また、この発明の更なる特徴によると、複数の映像データ又はナレーションデータを記憶するデータ記憶手段（２，ＶＤＢ）と、再生すべき映像データ又はナレーションデータ及び作成される音楽データの曲風を指定する指定手段（Ａ）と、指定された映像データ又はナレーションデータをデータ記憶手段（２，ＶＤＢ）から抽出するデータ抽出手段（ＶＤ）と、抽出された映像データ又はナレーションデータから当該データの再生時間を取得する時間取得手段（Ｂ）と、この再生時間を取得したことに応じて、当該再生時間に対応し、指定された曲風に従う音楽データを生成するデータ生成手段（ＥＧ）とを具備する音楽データ作成システム（請求項２）、並びに、作成される音楽データの曲風を指定する指定手段）Ａ２）と、外部から入力される映像又は音声に対応する映像データ又はナレーションデータを記録するデータ記録手段（２Ｖ）と、この映像データ又はナレーションデータの記録が終了したことに応じて当該データの再生時間を取得する時間取得手段（Ｂ）と、この再生時間を取得したことに応じて、当該再生時間に対応し、指定された曲風に従う音楽データを生成するデータ生成手段（ＥＧ）とを具備する音楽データ作成システム（請求項３）が提供される。
【０００８】
〔発明の作用〕
この発明は、携帯端末（ＰＴ）において、映像データやナレーションデータと、このデータの内容に合った曲風を指定すると（Ａ）、まず、この指定に該当する映像データやナレーションデータを抽出して（ＶＤ，ＶＤＢ）、当該データの再生時間情報を取得する（Ｂ）。次いで、データ生成手段（ＥＧ）がこの再生時間情報の取得をトリガーとして作曲を開始する。これにより、当該再生時間情報に合わせて、指定された映像やナレーションに相応しい音楽データを生成し、生成された音楽データを映像やナレーションと共に再生することができる。
【０００９】
或いは、曲風を指定し（Ａ２）、カメラ（３Ｃ）やマイクを動作させると、まず、カメラで撮影した映像データやマイクから記録したナレーションデータから、再生時間情報を取得する（Ｂ）。次いで、データ生成手段（ＥＧ）がこの再生時間情報の取得をトリガーとして作曲を開始する。これにより、当該再生時間情報に合わせて、記録された映像やナレーションに相応しい音楽データを生成し、記録した映像やナレーションと共にこの音楽を再生することができる。
【００１０】
この発明は、このように、映像や音声（ナレーション）の時間取得をトリガーとして作曲を開始することにより、携帯端末におけるユーザの操作を少なくするものであり、従って、データ及び曲風を指定するだけの簡単なユーザ操作によって、映像やナレーションに相応しい雰囲気をもち、しかも、映像やナレーションと再生時間が一致する音楽を得ることができ、携帯端末で再生する映像やナレーションにこの音楽を付加して再生することができる。
【００１１】
【発明の実施の形態】
以下、図面を参照しつつ、この発明の好適な実施形態を詳述する。なお、以下の実施形態は単なる一例であって、この発明の精神を逸脱しない範囲で種々の変更が可能である。
【００１２】
〔システムの概要〕
図６は、この発明の一実施例による携帯通信端末とサーバを含む音楽データ作成システムのハードウエア構成の概要を表わす。この音楽データ作成システムは、音楽データの作成を指示する携帯端末ＰＴが、基地局ＢＳを介してインターネット或いは電話回線網等の通信ネットワークＣＮを通じ、各携帯端末ＰＴにサービスを提供するサーバＳＶと相互に通信可能に接続されて構成される。なお、サーバＳＶはパーソナルコンピュータやワークステーションなどの任意のタイプの情報処理装置で構成され、携帯端末ＰＴには携帯用電話機が特に好適に用いられるが、通信機能を有するＰＤＡなど、他の任意形態の携帯型情報処理端末を携帯端末ＰＴに用いることもできる。
【００１３】
各携帯端末ＰＴは、破線内の内部構成ブロック図に例示されるように、中央処理装置（ＣＰＵ）１、記憶手段２、入力手段３、出力手段４、通信手段５などがバス６に接続されて成る。記憶手段２は、制御プログラムや制御用データを記憶した読出専用メモリ（ＲＯＭ）部、処理用データ等を一時記憶するランダムアクセスメモリ（ＲＡＭ）部、種々のデータやプログラムを記憶する外部記憶部などから成り、これらの記憶部は半導体メモリで構成することができる。そして、ＣＰＵ１は、ＲＡＭ部をワークメモリとしてＲＯＭ部の制御プログラムに従い外部記憶部のデータ等を利用して当該携帯端末ＰＴの動作を制御する。
【００１４】
例えば、記憶手段２のＲＯＭ部又は外部記憶部には、作曲エンジン（楽曲データ生成ソフトウエア）ＥＧや作曲用データ（ＭＤ）、映像（動画像）データ（ＶＤ）などを記憶しておくことができ、ＣＰＵ１は、音楽データ作成に関する端末側制御プログラム（端末側音楽データ作成プログラム）に従って、映像データに相応しい音楽データ（「曲データ」とも呼ばれる）を生成し、映像データと共に再生することができる。また、記憶手段２のＲＡＭ部は、再生される映像データ及び音楽データを記憶する映像及び音楽メモリ２Ｖ，２Ｍとして機能することができる。
【００１５】
入力手段３は、各種キースイッチ等の操作子３Ｍの外に、ビデオカメラ３Ｃやマイクロフォン等の、外部からの画像乃至音響情報を入力するための入力装置などを含む。また、出力手段４は、ＬＣＤ等のディスプレイを含む表示出力部や、通話音声や演奏楽音を放音するための音響出力部などを備え、音響出力部には楽音生成のための音源やスピーカ等が含まれる。ここで、表示出力部並びに音響出力部のうち、映像データに基づく映像を生成するための映像生成部、並びに、ナレーションデータに基づく音声を生成するための音声生成部及び音楽データに基づく楽音を生成するための楽音生成部は、再生部４Ｒと総称される。
【００１６】
通信手段５は、基地局ＢＳとの無線通信を介して通信ネットワークＣＮを通じサーバＳＶ或いは他の携帯端末と通信するための端末送受信部５Ｔ，５Ｒを有し、サーバＳＶとの通信で種々の制御プログラムやデータを受信することができる。例えば、サーバＳＶから端末側音楽データ作成プログラムや作曲エンジン等をダウンロードして記憶手段２の外部記憶部に格納し、これらのプログラム乃至エンジンを利用して音楽データ作成に関係する各種処理を行うことができる。
【００１７】
サーバＳＶは、携帯端末ＰＴを表わす図示破線内の内部構成ブロックとほぼ同様の内部構成を有しており、音楽データ作成を含むサーバ側制御プログラムに従って動作する。例えば、サーバＳＶ側にも、サーバでの作曲のために作曲エンジンＥＧを搭載することができる。さらに、サーバＳＶの記憶手段には、各携帯端末ＰＴにプログラムやデータを提供するため、磁気記録媒体（フレキシブルディスク、テープデバイス、ハードディスク等）或いは光記憶媒体（ＣＤ、ＤＶＤ、ＭＯ等）などの適当な記録媒体で構成される大容量記憶装置が含まれ、映像データベースＶＤＢや作曲用データベースＭＤＢを構築することができる。また、携帯端末ＰＴとのデータ送受のためにサーバ送受信部ＳＴ，ＳＲを備える。
【００１８】
この発明の一実施例による音楽データ作成システムは、映像データやナレーションデータに相応しい音楽データを生成して、映像データやナレーションデータと共に音楽データを携帯端末ＰＴの出力手段４の再生部４Ｒから同時に再生しようとするものであり、サーバ及び携帯端末の何れにおいて映像データ乃至ナレーションデータ源及び音楽データ（曲データ）生成機能を利用するかに応じて、例えば、図１〜図５に示される第１〜第５実施形態のように、異なる実施形態で実施することができる。
【００１９】
ここで、図１を用いて、この音楽データ作成システムの概要を簡単に説明しておく。このシステムでは、携帯端末ＰＴの映像及び曲風指定部Ａで、再生される映像データと作成される音楽データの曲風を指定すると、サーバＳＶの映像データベースＶＤＢから指定された映像データが抽出される。時間取得部Ｂにより、指定された映像データからその再生時間を取得すると、音楽データ生成部として機能する作曲エンジンＥＧは、この再生時間取得をトリガーとして作曲を開始して、当該再生時間に対応し、指定された曲風に従う音楽データを生成する。そして、再生部４Ｒによって、生成された音楽データを映像データと共に再生する。映像に代えてナレーション（朗読音声）を用いることができ、映像やナレーションのデータは、携帯端末ＰＴのカメラやマイクから入力することもできる。
【００２０】
以下、映像データに相応しい音楽データを生成して、映像データ及び音楽データを携帯端末ＰＴの出力手段４の再生部４Ｒから同時に再生する第１〜第５実施形態について、図１〜図５を用いて詳細に説明する。なお、これらの実施形態でも、携帯端末ＰＴには携帯用電話機が用られる。また、各図では、音楽データ作成処理に関係する機能を明瞭に説明するために、通話などの他の機能に関わる構成については記載を省略している。
【００２１】
〔第１実施形態〕
この発明の第１実施形態は、図１の機能ブロック図で表わされるように、映像データがサーバにあり作曲エンジンを携帯端末が持つ場合に適用することができる音楽データの作成方式である。第１実施形態では、サーバＳＶの映像データベースＶＤＢに映像データが蓄積されており、楽曲データ生成のためのソフトウエアである“作曲エンジン”ＥＧが携帯端末ＰＴの記憶手段２に搭載されている場合、携帯端末ＰＴでは、サーバＳＶから映像データの提供を受けたことに応じて、自動的に、作曲エンジンＥＧにより音楽データ（曲データ）が作成される。
【００２２】
第１実施形態の動作乃至処理の流れをより詳しく説明する。まず、携帯端末ＰＴでは、端末ユーザが入力手段３の操作子３Ｍを操作して、端末の動作モードを楽曲生成モードにすると、サーバＳＶの映像データベースＶＤＢ中から再生したい映像データを選択するための映像選択画面及びこれから作成しようとする音楽データのジャンルやタイプ等の曲風を選択するための曲風選択画面が出力手段３のディスプレイ上に順次表示される。そこで、各選択画面を利用し操作子３Ｍを操作して、所望の映像データの種別及び音楽データの曲風に関する項目を選択的に入力すると、映像及び曲風指定部Ａは、これに応じて、当該映像種別及び曲風を表わす映像指定情報及び曲風指定情報を生成し、それぞれ、端末送信部５Ｔ及び作曲用データ取得部ＭＤに出力する。
【００２３】
端末送信部５Ｔは映像指定情報をサーバＳＶに向かって送信し、サーバＳＶは、サーバ受信部ＳＲで映像指定情報を受信すると、映像データベースＶＤＢから、受信した映像指定情報により指定される映像データを抽出し、抽出された映像データをサーバ送信部ＳＴから携帯端末ＰＴに向かって送信する。
【００２４】
一方、携帯端末ＰＴでは、記憶手段２の所定記憶エリアに複数組の作曲用データ（作曲パラメータともいう）が予め記憶されており、作曲用データ取得部ＭＤは、映像及び曲風指定部Ａからの曲風指定情報により指定される作曲用データを抽出して、作曲エンジンＥＧの動作のためにＲＡＭ部の所定エリアに保持している。ここで、端末受信部５ＲがサーバＳＶからの映像データを受信すると、映像時間取得部Ｂは、受信した映像データから、当該映像データの再生に必要なトータルの映像時間を算出・取得し、取得された映像時間に対応する時間情報を作曲エンジンＥＧに供給する。
【００２５】
作曲エンジンＥＧは、映像時間取得部Ｂからの時間情報の取得をトリガーとして、作曲用データ取得部ＭＤからの作曲用データ及び当該時間情報の内容に基づいて、当該作曲用データに従った曲風をもち当該時間情報が表わす時間で終了する音楽データを生成し、音楽メモリ（曲データメモリ）２Ｍに格納すると共に、作曲終了情報を端末送信部５Ｔを介してサーバＳＶに送る。サーバＳＶは、サーバ受信部ＳＲで作曲終了情報を受信すると、先の映像指定情報により指定される映像データを映像データベースＶＤＢから再度抽出して、この映像データをサーバ送信部ＳＴから携帯端末ＰＴに送信する。
【００２６】
そして、携帯端末ＰＴは、端末受信部５Ｒでの映像データの再受信に基づき音楽メモリ２Ｍからのデータ読出しを開始させて、音楽メモリ２Ｍの音楽データを再生部４Ｒの楽音生成部に入力すると共に、端末受信部５Ｒで再受信した映像データを再生部４Ｒの映像生成部に入力する。従って、所望の映像が出力手段４のディスプレイ上に表示されると共に、この映像に合う音楽が出力手段４のスピーカから出力される。
【００２７】
なお、上述の動作例では、映像時間情報の算出機能（Ｂ）を携帯端末ＰＴ側に持たせたが、サーバＳＶ側で映像時間情報を算出・取得するようにしてもよい。また、携帯端末ＰＴの映像時間取得部Ｂで時間抽出の対象として受信した映像データを映像メモリ２Ｖに記憶させておき、作曲エンジンＥＧによる曲生成と同時に映像の再生を行うように構成して、サーバＳＶにおいて映像データを再抽出しないようにすることができる。この場合、作曲終了情報は、サーバへの返信（図示）をせず、音楽メモリ２Ｍに読出し開始指令として与えるようにすればよい。
【００２８】
また、作曲用データの記憶及び抽出手段をサーバＳＶ側におき、携帯端末ＰＴの映像及び曲風指定部Ａからの曲風指定情報をサーバＳＶに送信し、サーバＳＶが、この曲風指定情報に基づいて必要な作曲用データを抽出し、携帯端末ＰＴの作曲用データ取得部ＭＤに返信するようにしてもよい。さらに、映像データの再生は、上述のように、ストリーム再生してもよいが、外部記憶部のファイルにダウンロードしてから再生してもよい。
【００２９】
〔第２実施形態〕
この発明の第２実施形態は、第１実施形態とは逆に、作曲エンジンがサーバにあり映像データを携帯端末が持つ場合に適用することができる音楽データの作成方式であって、図２の機能ブロック図で表わされる。すなわち、第２実施形態では、サーバＳＶに楽曲データ生成のための作曲エンジンＥＧがあり、携帯端末ＰＴにおける記憶手段２に複数の映像データが記憶されている場合、携帯端末ＰＴで再生される映像データが指定されたことに応じて、自動的に、サーバＳＶ側で作曲エンジンＥＧにより音楽データが作成され携帯端末ＰＴに提供される。
【００３０】
第２実施形態において、携帯端末ＰＴは、操作子３Ｍのユーザ操作により楽曲生成モードになると、記憶手段２に記憶されている映像データの中から再生したい映像データを選択するための映像選択画面及び作成しようとする音楽データの曲風を選択するための曲風選択画面が出力手段３のディスプレイ上に順次表示される。ユーザが各選択画面において所望の映像データの種別及び音楽データの曲風を選択的に入力すると、映像及び曲風指定部Ａは、対応する映像指定情報及び曲風指定情報を生成し、それぞれ、映像データ取得部ＶＤ及び端末送信部５Ｔに出力する。
【００３１】
携帯端末ＰＴの記憶手段２の所定記憶エリアには、上述のように、複数の映像データが予め記憶されており、映像データ取得部ＶＤは、映像及び曲風指定部Ａからの映像指定情報により指定される映像データを抽出し、その再生のためにＲＡＭ部の所定エリアに保持する。また、映像時間取得部Ｂは、映像データ取得部ＶＤで抽出した映像データから、当該映像データの再生に必要なトータル時間を算出して、この映像時間に対応する時間情報を端末送信部５Ｔに出力する。
【００３２】
端末送信部５Ｔは、映像及び曲風指定部Ａからの曲風指定情報及び映像時間取得部Ｂからの映像時間情報をサーバＳＶに送信し、サーバＳＶは、サーバ受信部ＳＲで曲風指定情報及び映像時間情報を受信すると、作曲用データベースＭＤＢから、受信した曲風指定情報により指定される作曲用データを抽出し、抽出された作曲用データ及び受信した映像時間情報を作曲エンジンＥＧに供給する。作曲エンジンＥＧは、この映像時間情報の受信をトリガーとして、供給される作曲用データ及び映像時間情報の内容に基づいて、当該作曲用データに従った曲風をもち当該映像時間情報が表わす時間で終了する音楽データを生成する。さらに、作曲エンジンＥＧにより生成される音楽データは、サーバ送信部ＳＴを介して携帯端末ＰＴに送信する。
【００３３】
そして、携帯端末ＰＴは、端末受信部５Ｒでの音楽データの受信に基づき映像データ取得部ＶＤにより保持されている映像データの読出しを開始させて、同映像データを再生部４Ｒの映像生成部に入力すると共に、端末受信部５Ｒで受信した音楽データを再生部４Ｒの楽音生成部に入力する。従って、所望の映像が出力手段４のディスプレイ上に表示されると共に、この映像に合う音楽が出力手段４のスピーカから出力される。
【００３４】
なお、図２に示す例では、サーバＳＶの作曲用データベースＭＤＢに、作曲エンジンＥＧの稼働に用いられる作曲用データが蓄積されているものとして説明したが、第１実施形態と同様に、複数組の作曲用データを携帯端末ＰＴの記憶手段２に記憶しておき、映像及び曲風指定部Ａからの曲風指定情報により指定された作曲用データをサーバＳＶの作曲エンジンに供給するようにしてもよい。また、音楽データの再生は、上述のように、ストリーム再生してもよいが、外部記憶部のファイルにダウンロードしてから再生してもよい。
【００３５】
〔第３実施形態〕
この発明の第３実施形態は、第１或いは第２実施形態のように作曲エンジン或いは映像データを携帯端末が持つのではなく、作曲エンジン及び映像データの何れもがサーバにある場合に適用することができる音楽データの作成方式であって、図３の機能ブロック図で表わされる。すなわち、第３実施形態では、サーバＳＶの映像データベースＶＤＢに多数の映像データが蓄積され、楽曲データ生成のための作曲エンジンＥＧもサーバＳＶに備えられている場合、携帯端末ＰＴで再生される映像データ及び作成しようとする音楽データの曲風が指定されたことに応じて、自動的に、サーバＳＶ側で、対応する映像データが選択されると共にこれに相応しい音楽データが作成され、携帯端末ＰＴに提供される。
【００３６】
第３実施形態においては、携帯端末ＰＴは、操作子３Ｍのユーザ操作で楽曲生成モードになると、サーバＳＶの映像データベースＶＤＢに蓄積されている映像データの中から再生したい映像データを選択するための映像選択画面及び作成しようとする音楽データの曲風を選択するための曲風選択画面が出力手段３のディスプレイ上に順次表示される。ユーザが各選択画面において所望の映像データの種別及び音楽データの曲風を選択的に入力すると、映像及び曲風指定部Ａは、対応する映像指定情報及び曲風指定情報を生成して端末送信部５Ｔに出力する。
【００３７】
端末送信部５Ｔは、映像及び曲風指定部Ａからの映像指定情報及び曲風指定情報をサーバＳＶに送信する。サーバＳＶは、サーバ受信部ＳＲでこれらの指定情報を受信すると、一方では、映像データベースＶＤＢから、受信した映像指定情報により指定される映像データを抽出する。そして、映像時間取得部Ｂによって、抽出された映像データから、当該映像データの再生に必要な映像時間を表わす時間情報を生成し、作曲エンジンＥＧに供給する。また、サーバＳＶは、他方では、作曲用データベースＭＤＢから、受信した曲風指定情報により指定される作曲用データを選択し、選択された作曲用データを作曲エンジンＥＧに供給する。
【００３８】
作曲エンジンＥＧは、映像時間取得部Ｂからの映像時間情報の供給をトリガーとして、供給される作曲用データ及び映像時間情報の内容に基づいて、当該作曲用データに従った曲風をもち当該映像時間情報が表わす時間で終了する音楽データを生成する。次いで、サーバＳＶは、作曲エンジンＥＧが音楽データの生成を終了したこと（作曲終了情報の発生）に基づいて、再度、映像データベースＶＤＢから当該映像データを抽出する。そして、作曲エンジンＥＧにより生成された音楽データを、映像データベースＶＤＢから再抽出された映像データと共に、サーバ送信部ＳＴを介して携帯端末ＰＴに送信する。
【００３９】
携帯端末ＰＴでは、端末受信部５Ｒで受信された音楽データ及び映像データが再生部４Ｒに渡され、再生部４Ｒは、これら音楽データ及び映像データを再生処理して、両データに基づく音楽及び映像を出力手段４のスピーカ及びディスプレイに出力する。
【００４０】
なお、図３に示す例では、サーバＳＶの作曲用データベースＭＤＢに、作曲エンジンＥＧの稼働に用いられる作曲用データが蓄積されているものとして説明したが、第１実施形態と同様に、複数組の作曲用データを携帯端末ＰＴの記憶手段２に記憶しておき、映像及び曲風指定部Ａからの曲風指定情報により指定された作曲用データをサーバＳＶの作曲エンジンに供給するようにしてもよい。また、映像データや音楽データの再生は、上述のように、ストリーム再生してもよいが、外部記憶部のファイルにダウンロードしてから再生してもよい。
【００４１】
〔第４実施形態〕
この発明の第４実施形態は、第３形態とは逆に、作曲エンジン及び映像データの何れもが携帯端末にある場合に適用することができる音楽データの作成方式であって、図４の機能ブロック図で表わされる。すなわち、第４実施形態では、携帯端末ＰＴの記憶手段２に複数の映像データが記憶され、楽曲データ生成のための作曲エンジンＥＧも携帯端末ＰＴに搭載されている場合、携帯端末ＰＴ内において、再生される映像データ及び作成しようとする音楽データの曲風が指定されたことに応じて、自動的に、対応する映像データが選択されると共にこれに相応しい音楽データが作成される。
【００４２】
第４実施形態においては、携帯端末ＰＴは、操作子３Ｍのユーザ操作により楽曲生成モードになると、記憶手段２に記憶されている映像データの中から再生したい映像データを選択するための映像選択画面及び作成しようとする音楽データの曲風を選択するための曲風選択画面が出力手段３のディスプレイ上に順次表示される。ユーザが各選択画面において所望の映像データの種別及び音楽データの曲風を選択的に入力すると、映像及び曲風指定部Ａは、対応する映像指定情報及び曲風指定情報を生成し、それぞれ、映像データ取得部ＶＤ及び作曲用データ取得部ＭＤに出力する。
【００４３】
映像データ取得部ＶＤは、映像及び曲風指定部Ａからの映像指定情報により指定される映像データを抽出して映像時間取得部Ｂに与え、映像時間取得部Ｂは、抽出された映像データから、当該映像データの再生に必要な映像時間を表わす時間情報を算出し、作曲エンジンＥＧに供給する。また、作曲用データ取得部ＭＤは、記憶手段２に予め記憶されている複数の作曲用データから、映像及び曲風指定部Ａからの曲風指定情報により指定される作曲用データを選択し、選択された作曲用データを作曲エンジンＥＧに供給する。
【００４４】
作曲エンジンＥＧは、映像時間取得部Ｂからの映像時間情報の供給をトリガーとして、供給される作曲用データ及び映像時間情報の内容に基づいて、当該作曲用データに従った曲風をもち当該映像時間情報が表わす時間で終了する音楽データを生成する。さらに、作曲エンジンＥＧは、音楽データの生成を終了すると、作曲終了情報を映像データ取得部ＶＤに出力して、映像データ取得部ＶＤにより、再度、記憶手段２から当該映像データを抽出させ、これを再生部４Ｒの映像生成部に供給させると共に、この映像データの供給に合わせて、作曲エンジンＥＧ自身により生成された音楽データを再生部４Ｒの楽音生成部に供給する。
【００４５】
そして、再生部４Ｒは、映像データ取得部ＶＤ及び作曲エンジンＥＧから供給される映像データ及び音楽データの再生処理を行い、両データに基づく映像及び音楽を出力手段４のディスプレイ及びスピーカに出力する。
【００４６】
〔第５実施形態〕
この発明の第５実施形態は、楽曲データ生成のための作曲エンジンが携帯端末にあり、再生される映像データが携帯端末自体に設けられたビデオカメラのような画像情報入力手段から入力される場合に適用することができる音楽データの作成方式であって、図５の機能ブロック図で表わされる。すなわち、第５実施形態では、携帯端末ＰＴに作曲エンジンＥＧがあり入力手段３にビデオカメラ３Ｃが設けられている場合、このビデオカメラ３Ｃで撮影した映像データに対応して作成される音楽データの曲風が指定されたことに応じて、自動的に、この映像データに相応しい音楽データが作成される。
【００４７】
第５実施形態においては、携帯端末ＰＴは、操作子３Ｍのユーザ操作により楽曲生成モードになると、音楽データの曲風を選択するための曲風選択画面が出力手段３のディスプレイ上に表示される。ユーザがこの選択画面を用いて、これからビデオカメラ３Ｃで撮影しようとする映像に対して作成しようとする音楽データにつき、所望の曲風を選択的に入力すると、曲風指定部Ａ２は、対応する曲風指定情報を生成して作曲用データ取得部ＭＤに出力する。
【００４８】
作曲用データ取得部ＭＤは、記憶手段２に予め記憶されている複数の作曲用データから、曲風指定部Ａからの曲風指定情報により指定される作曲用データを選択し、選択された作曲用データを作曲エンジンＥＧに供給すべくＲＡＭ部の所定エリアに保持する。
【００４９】
次に、ユーザ操作によりビデオカメラ３Ｃを動作状態として撮影を開始し、撮影により得られる映像データを映像メモリ２Ｖに記憶していく。ビデオカメラ３Ｃでの撮影が終了すると同時に、映像時間取得部Ｂは、撮影された映像データの再生に必要な映像時間（通常は、撮影速度＝再生速度とされるので、撮影時間に等しい）を表わす時間情報を算出し、作曲エンジンＥＧに供給する。
【００５０】
作曲エンジンＥＧは、映像時間取得部Ｂからの映像時間情報の供給をトリガーとして、作曲用データ取得部ＭＤにより保持されている作曲用データと当該映像時間情報の内容に基づいて、当該作曲用データに従った曲風をもち当該映像時間情報が表わす時間で終了する音楽データを生成すると共に、映像メモリ２Ｖに記憶されている映像データの読出しを開始させる。そして、これら音楽データ及び映像データを再生部４Ｒに供給する。
【００５１】
再生部４Ｒは、作曲エンジンＥＧにより生成される音楽データを楽音生成部で再生処理し、映像メモリ２Ｖから読み出される映像データを映像生成部で再生処理して、両データに基づく音楽及び映像を出力手段４のスピーカ及びディスプレイに出力する。
【００５２】
なお、図５に示す例では、映像時間情報について、ビデオカメラ３Ｃの映像データから直接取得するようにしたが、映像メモリ２Ｖに一旦記憶した映像データから取得するように構成してもよい。さらに、ビデオカメラ３Ｃで撮影した映像データを複数記憶し、再生時に、別途設けた映像指定部からの映像指定情報により再生すべき映像データを指定するようにしてもよい。この場合、記憶された各映像データに対応して作成された音楽データも記憶され、再生時には、映像指定情報により、再生すべき音楽データを指定するように構成すればよい。さらに、第２実施形態のように、作曲エンジンＥＧをサーバＳＶに装備し、サーバＳＶ側で作成された音楽データを携帯端末ＰＴで利用するようにしてもよい。
【００５３】
〔ナレーションの場合における実施形態〕
以上の実施形態は、作成される音楽データが合わせようとする対象が何れも映像データであるが、この対象をナレーションデータ（朗読音声を表わすデータ）としてもよい。つまり、各実施形態において映像データに係る構成をナレーションデータに係る構成に変更する（例えば、第５実施形態については、ビデオカメラに代えてマイクロフォンを用いる。）ことによって、ナレーションデータに相応しい音楽データを生成して、ナレーションデータ及び音楽データを携帯端末ＰＴの出力手段４の再生部４Ｒから同時に再生するシステムが得られる。
【００５４】
なお、このようにナレーションデータに合わせて音楽データを作成する場合の構成変更の具体例を挙げると、次のとおりである：映像及び曲風指定部Ａ→ナレーション及び曲風指定部、映像時間情報取得部Ｂ→ナレーション（朗読）時間情報取得部、映像データ取得部ＶＤ→ナレーションデータ取得部、映像メモリ２Ｖ→音声メモリ、ビデオカメラ３Ｃ→マイクロフォン、〔作成部４Ｒ内〕映像生成部→音声生成部、〔サーバＳＶ側〕映像データベースＶＤＢ→ナレーション（朗読音声）データベース、等々。
【００５５】
【発明の効果】
以上説明したように、この発明によれば、映像や音声（ナレーション）の時間取得をトリガーとして作曲を開始するようにしているので、携帯端末におけるユーザの操作を少なくし、データ及び曲風を指定するだけの簡単なユーザ操作によって、映像やナレーションに相応しい雰囲気をもち、しかも、映像やナレーションと再生時間が一致する音楽を得ることができ、携帯端末で再生する映像やナレーションにこの音楽を付加して再生することができる。
【図面の簡単な説明】
【図１】図１は、この発明の第１実施形態の概要を表わす機能ブロック図である。
【図２】図２は、この発明の第２実施形態の概要を表わす機能ブロック図である。
【図３】図３は、この発明の第３実施形態の概要を表わす機能ブロック図である。
【図４】図４は、この発明の第４実施形態の概要を表わす機能ブロック図である。
【図５】図５は、この発明の第５実施形態の概要を表わす機能ブロック図である。
【図６】図６は、この発明の一実施例による音楽データ作成システムのハードウエア構成を極く概略的に表わすブロック図である。
【符号の説明】
Ａ，Ａ２映像及び曲風指定部並びに曲風指定部、
ＭＤ，ＶＤ作曲用データ取得部及び映像データ取得部、
ＶＤＢ，ＭＤＢ映像データベース及び作曲用データベース、
Ｂ映像時間情報取得部（再生時間取得部）、
ＥＧ作曲エンジン（音楽データ生成部）、
２Ｍ，２Ｖ音楽メモリ及び映像メモリ（ＲＡＭ部）、
４Ｒ映像生成部、音声生成部及び楽音生成部を備える再生部。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a music data creation system that can create music data suited to video and similar content, so that music can be enjoyed together with such content on a portable information processing terminal such as a portable telephone.
[0002]
[Prior art]
In recent years, the functions and performance of portable communication terminals such as portable telephones have improved dramatically. On the musical side, for example, Patent Document 1 proposes an arrangement in which a server receives input of musical material, creates music content, and returns that content to the portable communication terminal. As for images, terminals can now reproduce not only still images but also video (moving images), and when such video is reproduced, background music (BGM) is naturally desired. In this connection, for example, Patent Document 2 discloses realizing composition suited to video on a mobile PC.
[0003]
[Patent Document 1]
JP 2002-55679 A
[Patent Document 2]
JP 2002-287746 A (paragraph [0109])
[0004]
Accordingly, there is a demand for a composition system that makes it easier to enjoy, on a portable communication terminal such as a portable telephone, music that has an atmosphere suited to the video and that also coincides with the reproduction of the video. Furthermore, if music can be created to match not only video but also narration (reading voice), the range of uses of portable terminals is expected to become richer.
[0005]
[Problems to be solved by the invention]
The present invention is intended to provide one way of enjoying music together with video and the like on a portable terminal, including a portable telephone. In particular, its object is to provide a music data creation system that can create music data suited to video or narration through a simple operation on the portable terminal.
[0006]
[Means for Solving the Problems]
According to the main feature of the present invention, there is provided a music data creation system (claim 1) comprising: designation means (A, A2) for designating the music style of the music data to be created; time acquisition means (B) for acquiring, from video data or narration data, the reproduction time of that data; and data generation means (EG) for generating, in response to the acquisition of the reproduction time, music data that corresponds to the reproduction time and follows the designated music style. There is also provided a music data creation program (claim 4) for causing an information processing apparatus to execute a procedure consisting of: a step (A, A2) of designating the music style of the music data to be created; a step (B) of acquiring, from video data or narration data, the reproduction time of that data; and a step (EG) of generating, in response to the acquisition of the reproduction time, music data that corresponds to the reproduction time and follows the designated music style. The symbols in parentheses are the corresponding reference symbols used in the embodiments described in detail later, given for ease of understanding; the same applies below.
[0007]
According to a further feature of the present invention, there is provided a music data creation system (claim 2) comprising: data storage means (2, VDB) for storing a plurality of items of video data or narration data; designation means (A) for designating the video data or narration data to be reproduced and the music style of the music data to be created; data extraction means (VD) for extracting the designated video data or narration data from the data storage means (2, VDB); time acquisition means (B) for acquiring, from the extracted video data or narration data, the reproduction time of that data; and data generation means (EG) for generating, in response to the acquisition of the reproduction time, music data that corresponds to the reproduction time and follows the designated music style. There is also provided a music data creation system (claim 3) comprising: designation means (A2) for designating the music style of the music data to be created; data recording means (2V) for recording video data or narration data corresponding to video or audio input from outside; time acquisition means (B) for acquiring the reproduction time of that data in response to the end of recording of the video data or narration data; and data generation means (EG) for generating, in response to the acquisition of the reproduction time, music data that corresponds to the reproduction time and follows the designated music style.
[0008]
[Operation of the Invention]
According to the present invention, when video data or narration data and a music style matching the content of that data are designated on a portable terminal (PT) (A), the video data or narration data corresponding to the designation is first extracted (VD, VDB), and reproduction time information for that data is acquired (B). Next, the data generation means (EG) starts composing, with the acquisition of the reproduction time information as a trigger. As a result, music data suited to the designated video or narration can be generated in accordance with the reproduction time information, and the generated music data can be reproduced together with the video or narration.
[0009]
Alternatively, when the music style is designated (A2) and the camera (3C) or the microphone is operated, reproduction time information is first acquired from the video data shot with the camera or the narration data recorded through the microphone (B). Next, the data generation means (EG) starts composing, with the acquisition of the reproduction time information as a trigger. As a result, music data suited to the recorded video or narration can be generated in accordance with the reproduction time information, and this music can be reproduced together with the recorded video or narration.
[0010]
As described above, the present invention reduces the number of user operations on the portable terminal by starting composition with the acquisition of the video or audio (narration) time as a trigger. Consequently, through the simple user operation of merely designating the data and the music style, the user can obtain music that has an atmosphere suited to the video or narration and whose reproduction time matches that of the video or narration, and this music can be added to the video or narration reproduced on the portable terminal.
[0011]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the drawings. The following embodiments are merely examples, and various modifications can be made without departing from the spirit of the present invention.
[0012]
[System Overview]
FIG. 6 shows an outline of the hardware configuration of a music data creation system, including a portable communication terminal and a server, according to one embodiment of the present invention. In this music data creation system, portable terminals PT, which instruct the creation of music data, are connected via a base station BS and through a communication network CN, such as the Internet or a telephone network, to a server SV that provides services to each portable terminal PT, so that they can communicate with each other. The server SV is an information processing apparatus of any type, such as a personal computer or a workstation. A portable telephone is particularly suitable as the portable terminal PT, but any other form of portable information processing terminal, such as a PDA with a communication function, can also be used as the portable terminal PT.
[0013]
As illustrated in the internal block diagram inside the broken line, each portable terminal PT comprises a central processing unit (CPU) 1, storage means 2, input means 3, output means 4, communication means 5, and the like, connected to a bus 6. The storage means 2 consists of a read-only memory (ROM) section storing control programs and control data, a random access memory (RAM) section temporarily storing processing data and the like, an external storage section storing various data and programs, and so on; these storage sections can be implemented with semiconductor memory. Using the RAM section as work memory, the CPU 1 controls the operation of the portable terminal PT in accordance with the control program in the ROM section, making use of the data and the like in the external storage section.
[0014]
For example, a composition engine (music data generation software) EG, composition data (MD), video (moving image) data (VD), and the like can be stored in the ROM section or the external storage section of the storage means 2. In accordance with a terminal-side control program relating to music data creation (terminal-side music data creation program), the CPU 1 can generate music data (also called "song data") suited to the video data and reproduce it together with the video data. The RAM section of the storage means 2 can function as a video memory 2V and a music memory 2M that store the video data and music data to be reproduced.
[0015]
The input means 3 includes, in addition to operation elements 3M such as various key switches, input devices for inputting image or sound information from outside, such as a video camera 3C and a microphone. The output means 4 includes a display output section containing a display such as an LCD, and a sound output section for emitting call audio and performance tones; the sound output section includes a sound source for generating musical tones, a speaker, and the like. Within the display output section and the sound output section, the video generation unit for generating video based on video data, the voice generation unit for generating voice based on narration data, and the musical tone generation unit for generating musical tones based on music data are collectively referred to as the reproducing unit 4R.
[0016]
The communication means 5 has a terminal transmission unit 5T and a terminal reception unit 5R for communicating with the server SV or other portable terminals over the communication network CN by way of wireless communication with the base station BS, and can receive various control programs and data through communication with the server SV. For example, the terminal-side music data creation program, the composition engine, and the like can be downloaded from the server SV and stored in the external storage section of the storage means 2, and these programs and the engine can then be used to perform the various processes related to music data creation.
[0017]
The server SV has substantially the same internal configuration as the block diagram inside the illustrated broken line representing the portable terminal PT, and operates in accordance with a server-side control program that includes music data creation. For example, the composition engine EG can also be installed on the server SV side for composition on the server. In addition, to provide programs and data to each portable terminal PT, the storage means of the server SV includes a mass storage device composed of a suitable recording medium such as a magnetic recording medium (flexible disk, tape device, hard disk, etc.) or an optical storage medium (CD, DVD, MO, etc.), in which a video database VDB and a composition database MDB can be built. The server also has a server transmission unit ST and a server reception unit SR for exchanging data with the portable terminal PT.
[0018]
The music data creation system according to one embodiment of the present invention generates music data suited to video data or narration data and reproduces the music data simultaneously with the video data or narration data from the reproducing unit 4R of the output means 4 of the portable terminal PT. Depending on whether the source of the video or narration data and the music data (song data) generation function are located on the server or on the portable terminal, the system can be implemented in different forms, for example as in the first to fifth embodiments shown in FIGS. 1 to 5.
[0019]
Here, an outline of the music data creation system is briefly described with reference to FIG. 1. In this system, when the video data to be reproduced and the music style of the music data to be created are designated in the video and music style designation unit A of the portable terminal PT, the designated video data is extracted from the video database VDB of the server SV. When the time acquisition unit B acquires the reproduction time from the designated video data, the composition engine EG, which functions as a music data generation unit, starts composing with the acquisition of the reproduction time as a trigger, and generates music data that corresponds to the reproduction time and follows the designated music style. The reproducing unit 4R then reproduces the generated music data together with the video data. Narration (reading voice) can be used in place of video, and the video or narration data can also be input from the camera or microphone of the portable terminal PT.
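The trigger relationship in this overview can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the composition engine of the patent: the table STYLE_PARAMS, the functions compose and on_reproduction_time_acquired, the tempo and scale values, and the rule of one note per beat are all hypothetical. The sketch shows two points only: composition starts when the reproduction time becomes available, and the generated piece is fitted so that it ends exactly at that time.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical composition parameters keyed by music style (not from the patent).
STYLE_PARAMS: Dict[str, dict] = {
    "calm": {"tempo_bpm": 60, "scale": [60, 62, 64, 67, 69]},
    "lively": {"tempo_bpm": 120, "scale": [60, 62, 64, 65, 67, 69, 71]},
}


@dataclass
class Note:
    start_ms: int   # onset, in milliseconds from the start of playback
    length_ms: int  # duration of the note in milliseconds
    pitch: int      # MIDI note number


def compose(style: str, duration_ms: int) -> List[Note]:
    """Toy stand-in for the composition engine EG.

    Emits one note per beat, cycling through the style's scale, and
    shortens the final note so the piece ends exactly at duration_ms.
    """
    params = STYLE_PARAMS[style]
    beat_ms = 60_000 // params["tempo_bpm"]
    scale = params["scale"]
    notes: List[Note] = []
    start = 0
    while start < duration_ms:
        length = min(beat_ms, duration_ms - start)
        notes.append(Note(start, length, scale[len(notes) % len(scale)]))
        start += length
    return notes


def on_reproduction_time_acquired(style: str, duration_ms: int) -> List[Note]:
    """Called once the reproduction time is known; that event is the trigger."""
    return compose(style, duration_ms)
```

Times are kept as integer milliseconds so that the end of the last note can be compared with the reproduction time without floating-point rounding.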
[0020]
The first to fifth embodiments, in which music data suited to video data is generated and the video data and music data are reproduced simultaneously from the reproducing unit 4R of the output means 4 of the portable terminal PT, are described in detail below with reference to FIGS. 1 to 5. In these embodiments as well, a portable telephone is used as the portable terminal PT. In each figure, to explain the functions related to the music data creation process clearly, components related to other functions such as calls are omitted.
[0021]
[First Embodiment]
As shown in the functional block diagram of FIG. 1, the first embodiment of the present invention is a music data creation scheme applicable when the video data resides on the server and the portable terminal has the composition engine. In the first embodiment, video data is stored in the video database VDB of the server SV, and the "composition engine" EG, which is software for generating music data, is installed in the storage means 2 of the portable terminal PT. In this case, in response to receiving video data from the server SV, the portable terminal PT automatically creates music data (song data) with the composition engine EG.
[0022]
The operation and processing flow of the first embodiment is described in more detail. First, when the terminal user operates the operation element 3M of the input means 3 of the portable terminal PT to set the terminal's operation mode to the music generation mode, a video selection screen for selecting the video data to be reproduced from the video database VDB of the server SV, and a music style selection screen for selecting the music style, such as the genre or type, of the music data to be created, are displayed in sequence on the display of the output means 4. When the user operates the operation element 3M on each selection screen to selectively input the desired type of video data and the items relating to the music style of the music data, the video and music style designation unit A accordingly generates video designation information and music style designation information representing that video type and music style, and outputs them to the terminal transmission unit 5T and the composition data acquisition unit MD, respectively.
[0023]
The terminal transmission unit 5T transmits the video designation information to the server SV. When the server SV receives the video designation information at the server reception unit SR, it extracts from the video database VDB the video data designated by the received video designation information, and transmits the extracted video data from the server transmission unit ST to the portable terminal PT.
[0024]
Meanwhile, in the portable terminal PT, a plurality of sets of composition data (also called composition parameters) are stored in advance in a predetermined storage area of the storage means 2. The composition data acquisition unit MD extracts the composition data designated by the music style designation information from the video and music style designation unit A, and holds it in a predetermined area of the RAM section for use by the composition engine EG. When the terminal reception unit 5R receives the video data from the server SV, the video time acquisition unit B calculates and acquires, from the received video data, the total video time required to reproduce that video data, and supplies time information corresponding to the acquired video time to the composition engine EG.
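The two preparatory steps in this paragraph, selecting composition data by music style and obtaining the total video time, might be sketched as follows. The patent does not state how the total time is derived from the video data; this sketch assumes that a frame count and a constant frame rate can be read from the data. The table COMPOSITION_DATA, its contents, and both function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict

# Hypothetical composition data sets ("composition parameters"), one per style.
COMPOSITION_DATA: Dict[str, dict] = {
    "pop": {"tempo_bpm": 120, "key": "C major"},
    "ballad": {"tempo_bpm": 72, "key": "A minor"},
}


def acquire_composition_data(style_designation: str) -> dict:
    """Stand-in for the composition data acquisition unit MD."""
    # Return a copy, mirroring the data being held in RAM for the engine.
    return dict(COMPOSITION_DATA[style_designation])


def acquire_video_time_ms(frame_count: int, frames_per_second: Fraction) -> int:
    """Stand-in for the video time acquisition unit B.

    Assumes a constant frame rate: total time = frame_count / fps,
    rounded to the nearest millisecond.
    """
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return round(Fraction(frame_count) * 1000 / frames_per_second)
```

Using Fraction for the frame rate keeps rates such as 30000/1001 exact, so the computed time does not drift for long clips.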
[0025]
The composition engine EG is triggered by the acquisition of the time information from the video time acquisition unit B. Based on the composition data from the composition data acquisition unit MD and the content of the time information, it generates music data that has the music style specified by the composition data and ends at the time indicated by the time information, and stores this music data in the music memory (music data memory) 2M. Composition end information is then sent to the server SV via the terminal transmission unit 5T. When the server SV receives the composition end information at the server reception unit SR, it again extracts the video data designated by the earlier video designation information from the video database VDB and transmits this video data from the server transmission unit ST to the portable terminal PT.
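The source does not say how the composition engine EG makes the music end exactly at the acquired time. One plausible approach, shown here purely as an assumption, is to choose a whole number of bars close to the video length at the nominal tempo and then adjust the tempo slightly so that those bars last exactly as long as the video. The function name and parameters are hypothetical.

```python
def fit_bars_to_duration(duration_s, nominal_tempo_bpm, beats_per_bar=4):
    """Return (bars, adjusted_tempo_bpm) so the piece lasts exactly duration_s.

    Assumed strategy, not taken from the source: round to the nearest whole
    number of bars at the nominal tempo, then rescale the tempo to fit.
    """
    if duration_s <= 0 or nominal_tempo_bpm <= 0 or beats_per_bar <= 0:
        raise ValueError("all arguments must be positive")
    # Number of beats that would fit at the nominal tempo.
    nominal_beats = duration_s * nominal_tempo_bpm / 60.0
    # Use at least one bar, even for very short clips.
    bars = max(1, round(nominal_beats / beats_per_bar))
    # Tempo at which exactly `bars` bars fill the video time.
    adjusted_tempo_bpm = bars * beats_per_bar * 60.0 / duration_s
    return bars, adjusted_tempo_bpm
```

For a 30-second video at a nominal 90 BPM in 4/4 time, this picks 11 bars and an adjusted tempo of 88 BPM, since 44 beats at 88 BPM last exactly 30 seconds.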
[0026]
When the terminal reception unit 5R receives the video data again, the portable terminal PT starts reading from the music memory 2M and inputs the music data to the musical sound generation unit of the playback unit 4R, while the video data received again by the terminal reception unit 5R is input to the video generation unit of the playback unit 4R. As a result, the desired video is displayed on the display of the output means 4, and music matching the video is output from the speaker of the output means 4.
[0027]
In the above operation example, the video time calculation function (B) is provided on the portable terminal PT side, but the video time information may instead be calculated and acquired on the server SV side. In addition, the video data received for time extraction by the video time acquisition unit B of the portable terminal PT may be stored in the video memory 2V and reproduced from there once the composition engine EG has generated the music data, so that the server SV does not need to extract the video data again. In this case, the composition end information may be given to the music memory 2M as a read start command without being returned to the server (shown).
[0028]
Alternatively, the composition data storage and extraction means may be placed on the server SV side. In that case, the music style designation information from the video and music style designation unit A of the portable terminal PT is transmitted to the server SV, and the server SV extracts the necessary composition data based on that information and returns it to the composition data acquisition unit MD of the portable terminal PT. Furthermore, the video data may be played back as a stream, as described above, or may be played after being downloaded to a file in the external storage unit.
[0029]
[Second Embodiment]
In contrast to the first embodiment, the second embodiment of the present invention is a music data creation method applicable when the composition engine is in the server and the portable terminal holds the video data. It is represented by the functional block diagram of FIG. 2. That is, in the second embodiment, the server SV has the composition engine EG for generating music data, and a plurality of video data are stored in the storage means 2 of the portable terminal PT. When the video data to be played on the portable terminal PT is designated, music data is automatically created by the composition engine EG on the server SV side and provided to the portable terminal PT.
[0030]
In the second embodiment, when the portable terminal PT enters the music generation mode through a user operation of the operator 3M, a video selection screen for selecting the video data to be reproduced from the video data stored in the storage means 2 and a music style selection screen for selecting the music style of the music data to be created are displayed in sequence on the display of the output means 4. When the user selects the desired video data and music style on each selection screen, the video and music style designation unit A generates the corresponding video designation information and music style designation information and outputs them to the video data acquisition unit VD and the terminal transmission unit 5T, respectively.
[0031]
As described above, a plurality of video data are stored in advance in a predetermined storage area of the storage means 2 of the portable terminal PT. The video data acquisition unit VD extracts the video data designated by the video designation information from the video and music style designation unit A and holds it in a predetermined area of the RAM for reproduction. The video time acquisition unit B calculates, from the video data extracted by the video data acquisition unit VD, the total time required to reproduce it, and outputs time information corresponding to this video time to the terminal transmission unit 5T.
[0032]
The terminal transmission unit 5T transmits the music style designation information from the video and music style designation unit A and the video time information from the video time acquisition unit B to the server SV. When the server SV receives the music style designation information and the video time information at the server reception unit SR, it extracts the composition data designated by the received music style designation information from the composition database MDB and supplies the extracted composition data and the received video time information to the composition engine EG. Triggered by receipt of the video time information, the composition engine EG generates, based on the supplied composition data and the content of the video time information, music data that has the music style specified by the composition data and ends at the time represented by the video time information. The generated music data is then transmitted to the portable terminal PT via the server transmission unit ST.
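The exchange in the second embodiment can be sketched as a small in-process simulation. The class names, method names, and data shapes below are hypothetical, the network transfer is replaced by a direct method call, and the composition step is a placeholder that records only the style and target length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicData:
    style: str
    duration_s: float


class Server:
    """Hypothetical stand-in for server SV (database MDB plus engine EG)."""

    def __init__(self, composition_db):
        self.composition_db = composition_db  # style designation -> parameters

    def on_receive(self, style_designation, video_time_s):
        # Receipt of the video time information is the trigger for composition.
        composition_data = self.composition_db[style_designation]
        return self._compose(composition_data, video_time_s)

    @staticmethod
    def _compose(composition_data, video_time_s):
        # Placeholder for engine EG: a real engine would generate notes.
        return MusicData(composition_data["style"], video_time_s)


class Terminal:
    """Hypothetical stand-in for portable terminal PT holding the video data."""

    def __init__(self, server, stored_videos):
        self.server = server
        self.stored_videos = stored_videos  # video name -> duration in seconds
        self.events = []

    def create_and_play(self, video_name, style_designation):
        # Roles of units VD and B: pick the video and obtain its playback time.
        video_time_s = self.stored_videos[video_name]
        # Roles of 5T/SR and ST/5R: send the request, receive the music data.
        music = self.server.on_receive(style_designation, video_time_s)
        # Receipt of the music data starts playback of both video and music.
        self.events.append(("play_video", video_name))
        self.events.append(("play_music", music.style, music.duration_s))
        return music
```

In this sketch only the style designation and the video time travel to the server, and only the music data comes back; the video itself never leaves the terminal.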
[0033]
When the terminal reception unit 5R receives the music data, the portable terminal PT starts reading the video data held by the video data acquisition unit VD and inputs it to the video generation unit of the playback unit 4R, while the music data received by the terminal reception unit 5R is input to the musical sound generation unit of the playback unit 4R. As a result, the desired video is displayed on the display of the output means 4, and music matching the video is output from the speaker of the output means 4.
[0034]
In the example shown in FIG. 2, the composition data used by the composition engine EG has been described as being stored in the composition database MDB of the server SV. Alternatively, the composition data may be stored in the storage means 2 of the portable terminal PT, and the composition data designated by the music style designation information from the video and music style designation unit A may be supplied to the composition engine EG of the server SV. The music data may be played back as a stream, as described above, or may be played after being downloaded to a file in the external storage unit.
[0035]
[Third Embodiment]
The third embodiment of the present invention applies when, unlike the first and second embodiments, the portable terminal has neither the composition engine nor the video data, and the server has both. It is represented by the functional block diagram of FIG. 3. That is, in the third embodiment, a large number of video data are accumulated in the video database VDB of the server SV, and the composition engine EG for generating music data is also provided in the server SV. When the video data to be played on the portable terminal PT and the music style of the music data to be created are designated, the corresponding video data is automatically selected and the corresponding music data is created on the server SV side, and both are provided to the portable terminal PT.
[0036]
In the third embodiment, when the portable terminal PT enters the music generation mode through a user operation of the operator 3M, a video selection screen for selecting the video data to be reproduced from the video data stored in the video database VDB of the server SV and a music style selection screen for selecting the music style of the music data to be created are displayed in sequence on the display of the output means 4. When the user selects the desired video data and music style on each selection screen, the video and music style designation unit A generates the corresponding video designation information and music style designation information and outputs them to the terminal transmission unit 5T.
[0037]
The terminal transmission unit 5T transmits the video designation information and the music style designation information from the video and music style designation unit A to the server SV. When the server SV receives this designation information at the server reception unit SR, it extracts the video data designated by the received video designation information from the video database VDB. The video time acquisition unit B then generates, from the extracted video data, time information representing the video time required to reproduce it, and supplies the time information to the composition engine EG. In parallel, the server SV selects the composition data designated by the received music style designation information from the composition database MDB and supplies the selected composition data to the composition engine EG.
[0038]
Triggered by the supply of the video time information from the video time acquisition unit B, the composition engine EG generates, based on the supplied composition data and the content of the video time information, music data that has the music style specified by the composition data and ends at the time indicated by the time information. When the composition engine EG finishes generating the music data (that is, when composition end information is generated), the server SV extracts the video data from the video database VDB again. The music data generated by the composition engine EG is then transmitted to the portable terminal PT via the server transmission unit ST together with the video data re-extracted from the video database VDB.
[0039]
In the portable terminal PT, the music data and video data received by the terminal reception unit 5R are passed to the playback unit 4R, which performs playback processing on them and outputs the resulting music and video to the speaker and display of the output means 4.
[0040]
In the example shown in FIG. 3, the composition data used by the composition engine EG has been described as being stored in the composition database MDB of the server SV. Alternatively, the composition data may be stored in the storage means 2 of the portable terminal PT, and the composition data designated by the music style designation information from the video and music style designation unit A may be supplied to the composition engine EG of the server SV. In addition, the video data and music data may be played back as a stream, as described above, or may be played after being downloaded to a file in the external storage unit.
[0041]
[Fourth Embodiment]
In contrast to the third embodiment, the fourth embodiment of the present invention is a music data creation method applicable when both the composition engine and the video data are in the portable terminal. It is represented by the functional block diagram of FIG. 4. That is, in the fourth embodiment, a plurality of video data are stored in the storage means 2 of the portable terminal PT, and the composition engine EG for generating music data is also installed in the portable terminal PT. In response to the designation of the video data to be reproduced and the music style of the music data to be created, the corresponding video data is automatically selected and the corresponding music data is created.
[0042]
In the fourth embodiment, when the portable terminal PT enters the music generation mode through a user operation of the operator 3M, a video selection screen for selecting the video data to be reproduced from the video data stored in the storage means 2 and a music style selection screen for selecting the music style of the music data to be created are displayed in sequence on the display of the output means 4. When the user selects the desired video data and music style on each selection screen, the video and music style designation unit A generates the corresponding video designation information and music style designation information and outputs them to the video data acquisition unit VD and the composition data acquisition unit MD, respectively.
[0043]
The video data acquisition unit VD extracts the video data designated by the video designation information from the video and music style designation unit A and passes it to the video time acquisition unit B, which calculates time information representing the video time required to reproduce the video data and supplies it to the composition engine EG. Meanwhile, the composition data acquisition unit MD selects, from the plurality of composition data stored in advance in the storage means 2, the composition data designated by the music style designation information from the video and music style designation unit A, and supplies the selected composition data to the composition engine EG.
[0044]
Triggered by the supply of the video time information from the video time acquisition unit B, the composition engine EG generates, based on the supplied composition data and the content of the video time information, music data that has the music style specified by the composition data and ends at the time indicated by the time information. When generation of the music data is complete, the composition engine EG outputs composition end information to the video data acquisition unit VD. The video data acquisition unit VD then extracts the video data from the storage means 2 again and supplies it to the video generation unit of the playback unit 4R, and the composition engine EG supplies the music data it has generated to the musical sound generation unit of the playback unit 4R in step with the supply of the video data.
[0045]
Then, the playback unit 4R performs playback processing of video data and music data supplied from the video data acquisition unit VD and the composition engine EG, and outputs video and music based on both data to the display and speaker of the output means 4.
[0046]
[Fifth Embodiment]
In the fifth embodiment of the present invention, the composition engine for generating music data is provided in the portable terminal, and the video data to be reproduced is input from image information input means, such as a video camera, provided in the portable terminal itself. It is represented by the functional block diagram of FIG. 5. That is, in the fifth embodiment, the portable terminal PT has the composition engine EG and the input means 3 includes the video camera 3C. In response to the designation of the music style of the music data to be created for the video data shot by the video camera 3C, music data suited to that video data is automatically created.
[0047]
In the fifth embodiment, when the portable terminal PT enters the music generation mode through a user operation of the operator 3M, a music style selection screen for selecting the music style of the music data is displayed on the display of the output means 4. When the user uses this screen to select the desired music style for the music data to be created for the video about to be shot with the video camera 3C, the music style designation unit A2 generates the corresponding music style designation information and outputs it to the composition data acquisition unit MD.
[0048]
The composition data acquisition unit MD selects, from the plurality of composition data stored in advance in the storage means 2, the composition data designated by the music style designation information from the music style designation unit A2, and holds the selected composition data in a predetermined area of the RAM so that it can be supplied to the composition engine EG.
[0049]
Next, the user operates the video camera 3C to start shooting, and the video data obtained by shooting is stored in the video memory 2V. As soon as shooting with the video camera 3C ends, the video time acquisition unit B calculates time information representing the video time required to play back the shot video data (usually equal to the shooting time, because the playback speed is set equal to the shooting speed) and supplies it to the composition engine EG.
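The "end of shooting triggers time acquisition" step can be sketched as a recorder that invokes a callback with the playback time when it stops. The class, its methods, and the `speed_ratio` parameter are hypothetical; timestamps are passed in explicitly so that the example stays deterministic.

```python
class VideoRecorder:
    """Hypothetical recorder that reports the playback time when shooting ends."""

    def __init__(self, on_time_acquired, speed_ratio=1.0):
        # speed_ratio = shooting speed / playback speed.
        # 1.0 is the usual case, where playback time equals shooting time.
        if speed_ratio <= 0:
            raise ValueError("speed_ratio must be positive")
        self._on_time_acquired = on_time_acquired
        self._speed_ratio = speed_ratio
        self._started_at = None

    def start(self, now_s):
        """Begin shooting at timestamp now_s (seconds)."""
        self._started_at = now_s

    def stop(self, now_s):
        """End shooting and pass the playback time to the callback."""
        if self._started_at is None:
            raise RuntimeError("recording was not started")
        shooting_time_s = now_s - self._started_at
        self._started_at = None
        video_time_s = shooting_time_s * self._speed_ratio
        # Role of unit B: hand the time information to the composition engine.
        self._on_time_acquired(video_time_s)
        return video_time_s
```

Passing the composition engine's entry point as `on_time_acquired` means composition begins with no further user operation after the user stops the camera, which matches the trigger behaviour described above.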
[0050]
Triggered by the video time information from the video time acquisition unit B, the composition engine EG generates, based on the composition data held by the composition data acquisition unit MD and the content of the video time information, music data that has the music style specified by the composition data and ends at the time indicated by the video time information. It also starts the reading of the video data stored in the video memory 2V. The music data and video data are then supplied to the playback unit 4R.
[0051]
The playback unit 4R reproduces the music data generated by the composition engine EG with the musical sound generation unit and the video data read from the video memory 2V with the video generation unit, and outputs the resulting music and video to the speaker and display of the output means 4.
[0052]
In the example shown in FIG. 5, the video time information is directly acquired from the video data of the video camera 3C. However, the video time information may be acquired from the video data temporarily stored in the video memory 2V. Further, a plurality of video data shot by the video camera 3C may be stored, and video data to be played back may be designated by video designation information from a video designation unit provided separately during reproduction. In this case, music data created corresponding to each stored video data is also stored, and at the time of playback, the music data to be played back may be designated by the video designation information. Furthermore, as in the second embodiment, the composition engine EG may be installed in the server SV, and the music data created on the server SV side may be used on the portable terminal PT.
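The five embodiments differ mainly in where the video data originates and where the composition engine EG runs. The following sketch tabulates the basic configuration of each figure and derives which data must travel from the server SV to the portable terminal PT. The class and function names are hypothetical, and the variations mentioned in the text (for example, moving the engine to the server in the fifth embodiment) are not modelled.

```python
from dataclasses import dataclass

TERMINAL = "terminal PT"
SERVER = "server SV"


@dataclass(frozen=True)
class Embodiment:
    name: str
    video_source: str     # where the video data originates
    engine_location: str  # where composition engine EG runs


# Basic configurations of FIG. 1 to FIG. 5 as described in the text.
EMBODIMENTS = [
    Embodiment("first", SERVER, TERMINAL),
    Embodiment("second", TERMINAL, SERVER),
    Embodiment("third", SERVER, SERVER),
    Embodiment("fourth", TERMINAL, TERMINAL),
    Embodiment("fifth", TERMINAL, TERMINAL),  # video comes from camera 3C
]


def data_sent_to_terminal(embodiment):
    """List the data that must be sent from server SV to terminal PT."""
    sent = []
    if embodiment.video_source == SERVER:
        sent.append("video data")
    if embodiment.engine_location == SERVER:
        sent.append("music data")
    return sent
```

Under this simplified model, the fourth and fifth embodiments need no download at all, while the third requires both the video data and the music data to be sent to the terminal.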
[0053]
[Embodiment in the case of narration]
In the above embodiments, the created music data is combined with video data, but it may instead be combined with narration data (data representing recitation audio). That is, by replacing the configuration related to video data in each embodiment with a configuration related to narration data (for example, using a microphone instead of a video camera in the fifth embodiment), a system is obtained that generates music data suited to the narration data and reproduces the narration data and music data simultaneously from the playback unit 4R of the output means 4 of the portable terminal PT.
[0054]
Specific examples of the configuration changes needed to create music data for narration data are as follows:
video and music style designation unit A → narration and music style designation unit
video time information acquisition unit B → narration (recitation) time information acquisition unit
video data acquisition unit VD → narration data acquisition unit
video memory 2V → audio memory
video camera 3C → microphone
[within the playback unit 4R] video generation unit → audio generation unit
[server SV side] video database VDB → narration (recitation audio) database
[0055]
[Effect of the Invention]
As described above, according to the present invention, composition is started with the acquisition of the playback time of the video or audio (narration) as a trigger, so fewer user operations are required on the portable terminal. With the simple operation of designating the data and the music style, the user can obtain music that suits the video or narration and has the same playback time, and can play it back together with the video or narration.
[Brief description of the drawings]
FIG. 1 is a functional block diagram showing an outline of a first embodiment of the present invention.
FIG. 2 is a functional block diagram showing an outline of a second embodiment of the present invention.
FIG. 3 is a functional block diagram showing an outline of a third embodiment of the present invention.
FIG. 4 is a functional block diagram showing an outline of a fourth embodiment of the present invention.
FIG. 5 is a functional block diagram showing an outline of a fifth embodiment of the present invention.
FIG. 6 is a block diagram very schematically representing a hardware configuration of a music data creation system according to an embodiment of the present invention.
[Explanation of symbols]
A, A2 video and music style designation unit and music style designation unit,
MD, VD composition data acquisition unit and video data acquisition unit,
VDB, MDB video database and composition database,
B video time information acquisition unit (playback time acquisition unit),
EG composition engine (music data generation unit),
2M, 2V music memory and video memory (RAM),
4R playback unit including a video generation unit, an audio generation unit, and a musical sound generation unit.

Claims

A music data creation system comprising:
designation means for designating the music style of the music data to be created;
time acquisition means for acquiring, from video data or narration data, the playback time of that data; and
data generation means for generating music data corresponding to the designated music style in response to the acquisition of the playback time.

A music data creation system comprising:
data storage means for storing a plurality of video data or narration data;
designation means for designating the video data or narration data to be reproduced and the music style of the music data to be created;
data extraction means for extracting the designated video data or narration data from the data storage means;
time acquisition means for acquiring, from the extracted video data or narration data, the playback time of that data; and
data generation means for generating music data corresponding to the designated music style in response to the acquisition of the playback time.

A music data creation system comprising:
designation means for designating the music style of the music data to be created;
data recording means for recording video data or narration data corresponding to video or audio input from the outside;
time acquisition means for acquiring the playback time of the data in response to the end of recording of the video data or narration data; and
data generation means for generating music data corresponding to the designated music style in response to the acquisition of the playback time.

A music data creation program for causing an information processing apparatus to execute a procedure comprising:
a step of designating the music style of the music data to be created;
a step of acquiring, from video data or narration data, the playback time of that data; and
a step of generating, in response to the acquisition of the playback time, music data of the designated music style corresponding to the playback time.