JP2004093975A

JP2004093975A - Communication terminal and program

Info

Publication number: JP2004093975A
Application number: JP2002255999A
Authority: JP
Inventors: Toshihisa Nakamura; 中村　利久
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2002-08-30
Filing date: 2002-08-30
Publication date: 2004-03-25

Abstract

PROBLEM TO BE SOLVED: To enable synthesized voice to be generated and played back without being affected by the distance between devices, and to enable playback that more faithfully reflects the intention of the person who input each voice.

SOLUTION: In a performer terminal 20, when performance data in the form of an IP packet is input from a communication device 33, a CPU 21 extracts a voice data packet from the IP packet and outputs it to a DSP (digital signal processor) 22. The DSP 22 applies prescribed acoustic processing to the input voice data packet and then stores it in a performance data buffer 24b. The DSP 22 also reads the voice data packets stored in the performance data buffer 24b according to their packet numbers, combines them with the terminal's own performance voice data, and outputs the result to a D/A conversion circuit 29. The combined data is output externally as ensemble sound via the D/A conversion circuit 29 and an amplifier 30.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、通信端末及びプログラムに関する。
【０００２】
【従来の技術】
従来、複数の装置間をネットワークで結び、各装置からの入力音声をリアルタイムに受信するとともに、これらの音声を各装置で再生させる通信システムとして、特開平１１−２１９１７４号公報に開示されているネットワーク演奏システムや、特開平１０−３９８９７号公報に開示されているカラオケシステムなどが知られている。
【０００３】
上記ネットワーク演奏システムにおいては、複数の端末装置それぞれが、入力された演奏パート（以下、適宜、単に「パート」という。）の操作情報をＭＩＤＩファイルに変換し、インターネットを介してサーバへ送信する。そして、サーバが、各端末装置から受信される各パートの操作情報に基づいて合奏音声を生成し、各端末装置へ送信している。
【０００４】
また、上記カラオケシステムにおいては、複数のカラオケ装置それぞれに入力されたカラオケ音声を、通信ネットワークを介して相互に送信することで、デュエットを可能としている。
【０００５】
【発明が解決しようとする課題】
ところで、上述のような通信システムにおいては、各装置間の距離に起因する時間遅延の問題が発生する。即ち、上記カラオケシステムのように、各端末装置が、同一店舗内といった比較的近距離に配置されている場合には問題とならないが、ネットワークとしてインターネットを利用するといった場合には、各装置間の距離が遠くなるにつれ、物理的な距離や介在するサーバの存在、通信プロトコルといった様々な要因から、受信データの時間遅延の発生は避け難いものとなり、生音声のリアルタイム演奏を行うには不充分である。
【０００６】
また、上記ネットワーク演奏システムのように、操作情報をＭＩＤＩ形式のファイルで送信する方法では、楽曲１曲分をまとめてＭＩＤＩ形式とする必要上、リアルタイム演奏は不可能に近く、更には、演奏者の意図した音色・音量・音質を完全に再現することが困難であった。
【０００７】
本発明の課題は、他の通信機器との間で送受信される音声等のデータを適切に合成・出力できるようにすることである。
【０００８】
【課題を解決するための手段】
上記課題を解決するために、請求項１に記載の発明は、
通信回線を介して接続された他の通信端末と音声データの送受信を行いつつ、前記他の通信端末から受信した音声データと入力された音声データとを合成し、合成音声として再生する通信端末（例えば、図１の演者端末２０）であって、
入力音声データに、予め定められている同期調整データ（例えば、図５の送信端末識別番号及びパケット番号）を付加して音声通信データを作成する作成手段（例えば、図１０の演者端末２０、図１６ステップＳ２４０）、
を備えることを特徴としている。
【０００９】
また、請求項１０に記載の発明は、
所定の通信回線を介して接続された他の通信端末と音声データの送受信を行なう通信端末であるコンピュータ（例えば、図１の演者端末２０）に、
入力された音声データに予め定められた同期調整データ（例えば、図５の送信端末識別番号及びパケット番号）を付加して音声通信データを作成する作成機能（例えば図１６ステップＳ２４０）と、
前記他の通信端末から送信された同期調整データが付加された音声通信データを受信する受信機能（例えば、図１５ステップＳ２２１〜Ｓ２２４）と、
この受信手段により受信された同期調整データが付加された音声通信データに含まれる音声データと前記作成手段により作成された同期調整データが付加された入力音声データに含まれる音声データとを、それぞれに対応する同期調整データに従って、同期した合成音声として再生する音声合成再生機能（図１７、Ｓ２５１〜Ｓ２５４）と、
を実現させるためのプログラムである。
【００１０】
この請求項１又は１０に記載の発明によれば、入力された音声データに、予め定められた同期調整データを付加した通信データを作成することができる。
【００１１】
そして、請求項２に記載の発明のように、
請求項１に記載の通信端末において、
前記作成手段により作成された同期調整データが付加された音声通信データを送信する送信手段（例えば、図１０の通信装置３３、ＣＰＵ２１、図１６；Ｓ２３９〜Ｓ２４４）と、
他機から送信される通信データを受信する受信手段（例えば、図１０の通信装置３３、ＣＰＵ２１、図１５；Ｓ２２１〜Ｓ２２４）と、
この受信手段により受信された通信データに含まれる音声データを、その通信データに含まれる同期調整データに従って、前記音声入力手段から出力された音声データと共に同期した合成音声として再生する再生手段（例えば、図１０のＣＰＵ２１及びＤＳＰ２２、図１７；Ｓ２５１〜Ｓ２５４）と、
を備えることとすれば、次の効果が得られる。
【００１２】
即ち、他の通信端末から受信した音声通信データに含まれる音声データと作成手段により作成された同期調整データが付加された入力音声データに含まれる音声データとを、それぞれに対応する同期調整データに従って、同期した合成音声として再生することができる。
【００１３】
ここで、請求項３に記載の発明のように、
請求項２に記載の通信端末において、
前記同期調整データには順序データが含まれ、
前記受信手段により受信された音声通信データを蓄積記憶する他データ記憶手段（例えば、図１０の出力バッフ２２ｂ）を更に備え、
前記再生手段は、前記他データ記憶手段に記憶された音声通信データを、その音声通信データに含まれる順序データに従って読み出し、前記入力された音声データと共に同期した合成音声として再生することとしても良い。
【００１４】
この請求項３に記載の発明によれば、音声通信データを受信した順序に関わらず、正しい順序で合成音声の再生ができる。
【００１５】
また、請求項４に記載の発明のように、
請求項２又は３に記載の通信端末において、
前記作成手段は、作成する音声通信データに、再生条件データを更に付加する再生条件付加手段（例えば、図１０のＤＳＰ２２、ＣＰＵ２１、図１６；Ｓ２３６）を有し、
前記再生手段は、前記受信手段により受信された音声通信データに含まれる音声データを、その音声通信データに含まれる再生条件データに基づいて再生する（例えば、図１０のＤＳＰ２２、図１５；Ｓ２２８）ように構成しても良い。
【００１６】
ここで、再生条件データとは、上記再生手段が音声データを再生する際の条件を指定するデータであり、具体的には、エコーやリバーブ（残響）、トーンコントロールの度合、音量バランス、ステレオ化の有無やそのＬ／Ｒ比等が挙げられる。
【００１７】
従って、この請求項４に記載の発明によれば、それぞれの音声通信データを加工し、音色・音量・音質を、より正確に再現することが可能となるとともに、また、再生する場所に応じて、より相応しい合成音声を生成することが可能となる。尚、この再生条件データは、利用者の入力指示に応じたものであっても良いし、通信端末が、再生場所等の条件に応じて、適宜決定するものであっても良い。
【００１８】
また、請求項５に記載の発明のように、
請求項１〜４の何れか一項に記載の通信端末において、
前記作成手段は、入力された音声データの内、所定単位の音声データごとに前記同期調整データを付加することにより、音声通信データを順次作成するように構成（図１６；Ｓ２３１〜Ｓ２３８）しても良い。
【００１９】
この請求項５に記載の発明によれば、音声データを所定単位毎に同期調整データを付加して音声通信データを順次作成できるので、音声通信データの取り扱いがしやすい。例えば上記音声データをパケット化するとともに、上記所定の通信回線を代表的なＩＰネットワークであるインターネットに適用することで、より多数、且つ広範にわたる他機との音声データの送受信が可能となる。
【００２０】
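Further, as in the invention described in claim 6,
the communication terminal according to any one of claims 2 to 5 may
further comprise inquiry means (for example, the CPU 21 in FIG. 10; S13, S14, S18, S21 and S22 in FIG. 13) for transmitting desired communication condition data to a server (for example, the management server 10 in FIG. 1) connected via the predetermined communication line, and for receiving the communication address of another communication terminal that has transmitted desired communication condition data matching that data,
and the transmitting means and the receiving means may transmit and receive voice communication data on the basis of the communication address received by the inquiry means.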
【００２１】
ここで所望通信条件とは、通信データの送信及び受信を行うこととなる他の通信端末に望む条件であり、例えば、音声データの送受信を希望する日時、その内容等が挙げられる。
【００２２】
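Therefore, according to the invention described in claim 6, if, for example, the communication terminal of the present invention is applied to an apparatus in which it and other communication terminals exchange performance data with one another to realize a real-time band performance, other terminals that satisfy the desired communication conditions can be found by transmitting conditions such as the desired tune, part, and date and time to the server as the desired communication conditions.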
【００２３】
また、請求項７記載の発明のように、
請求項２〜６の何れか一項に記載の通信端末において、
前記送信手段が送信する音声通信データを暗号化する暗号化手段（例えば、図１０のＣＰＵ２１、図１６；Ｓ２４３）と、
前記受信手段が受信した音声通信データを復号する復号化手段（例えば、図１０のＣＰＵ２１、図１５；Ｓ２２２）と、
を更に備えるように構成しても良い。
【００２４】
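According to the invention described in claim 7, encrypting the communication data to be transmitted and decrypting the received communication data improves security when performing data communication over the predetermined communication line.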
【００２５】
請求項８に記載の発明は、
所定の通信回線を介して接続された他の通信端末とデータの送受信を行いつつ、前記他の通信端末から受信したデータと入力されたデータと即時合成して再生する通信端末（例えば、図１の演者端末２０）であって、
入力データに予め定められた同期調整データ（例えば、図５の送信端末識別番号及びパケット番号）を付加して通信データを作成する作成手段（例えば、図１０の演者端末２０、図１６ステップＳ２４０）、
を備えることを特徴としている。
【００２６】
この請求項８に記載の発明によれば、入力された音声データに、予め定められた同期調整データを付加した通信データを作成することができる。従って、例えば、当該通信端末の入力データと、所定の通信回線を介して他の通信端末から受信した通信データに含まれる入力データとを、それぞれに対応する同期調整データに従って、正確に同期させて再生することができ、自機（当該通信端末）と他機（他の通信端末）との間の距離に影響されることなく、他機との通信データの送受信を行いつつ、これらのデータを即時合成して再生することが可能となる。
【００２７】
請求項９に記載の発明は、
サーバを介して、前記サーバに接続された他の通信端末とデータの送受信を行いつつ、入力データと他の通信端末から受信したデータとを即時合成して再生する通信端末（例えば、図１８の演者端末２０）であって、
入力データに予め定められた同期調整データを付加して通信データを作成する作成手段を備えることを特徴としている。
【００２８】
この請求項９に記載の発明によれば、入力されたデータに、予め定められた同期調整データを付加した通信データを作成することができる。従って、例えば、当該通信端末の入力データと、サーバを介して他の通信端末から受信した通信データに含まれる入力データとを、それぞれに対応する同期調整データに従って、正確に同期させて再生することができ、自機（当該通信端末）と他機（他の通信端末）との間の距離に影響されることなく、他機との通信データの送受信を行いつつ、これらのデータを即時合成して再生することが可能となる。また、通信端末のデータの送信先はサーバのみであるため、他機（他の通信端末）の台数が増加しても、通信データの送信にかかる負担を増加させずに済む。
【００２９】
【発明の実施の形態】
以下、図面を参照し、発明の実施の形態を詳細に説明する。尚、以下においては、本発明を適用した演奏システムを例にとって説明するが、本発明の適用はこれに限らない。
【００３０】
ここで演奏システムとは、インターネットを介して接続された複数の演者端末（通信端末）が、それぞれに割り当てられたパートで、同時にある曲目の演奏（即ち、合奏）を行うシステムである。この時、合奏に参加している各演者端末においては、利用者によって入力されたパートの演奏音声と、他の演者端末において入力されたパートの演奏音声とが合成された合奏音声が、出力される。尚、各演者端末に割り当てるパートは、それぞれ異なるものであっても良いし、複数の演者端末で同一のパートを割り当てることとしても良い。
【００３１】
図１は、本実施の形態における演奏システム１の構成を示す図である。
同図によれば、演奏システム１は、管理サーバ１０及び複数の演者端末２０より構成されるとともに、これらの機器は、インターネットＮに接続されている。尚、同図においては、３台の演者端末２０が示されているが、勿論、これは何台であっても構わない。
【００３２】
管理サーバ１０は、ＣＰＵ（Ｃｅｎｔｒａｌ　Ｐｒｏｃｅｓｓｉｎｇ　Ｕｎｉｔ）、ＲＡＭ（Ｒａｎｄｏｍ　Ａｃｃｅｓｓ　Ｍｅｍｏｒｙ）、記憶装置、通信装置等がシステムバスを介して接続される、周知のサーバ装置により実現される。また、管理サーバ１０は、Ｗｅｂページ用ファイルを提供するＷｅｂサーバ機能を有している。
【００３３】
演者端末２０は、ＣＰＵ、ＲＡＭ、記憶装置、入力装置、表示装置、通信装置等がシステムバスを介して接続される、周知のＰＣ（Ｐｅｒｓｏｎａｌ　Ｃｏｍｐｕｔｅｒ）により実現される。
【００３４】
図２は、演奏システム１における各機器の間のデータの流れを示す図である。同図（ａ）は、合奏前の様子を、同図（ｂ）は、合奏中の様子を、それぞれ示している。尚、これらのデータは、適宜暗号化され、インターネットＮを介して送受信される。
【００３５】
同図（ａ）によれば、演者端末２０は、利用者の指示入力に応じた演奏条件データを、管理サーバ１０へ送信する。この演奏条件データには、利用者が演奏を希望する曲目、パート及び日時のデータが含まれる。従って、管理サーバ１０は、各演者端末２０から送信される演奏条件データを、受信する（図中▲１▼）。
【００３６】
また、管理サーバ１０は、これら受信した演奏条件データに基づき、合奏に参加する演者端末２０を特定する。そして、これら特定した各演者端末２０に対して、当該合奏の予定を示す合奏予定データを送信する。この合奏予定データには、合奏を予定している日時、曲目及び当該合奏に参加する演者端末２０を示すデータが含まれる。従って、合奏に参加予定の演者端末２０は、それぞれ、管理サーバ１０から送信される合奏予定データを受信する（図中▲２▼）。
【００３７】
一方、同図（ｂ）によれば、合奏が予定された日時となると、管理サーバ１０は、当該合奏に参加予定の演者端末２０それぞれに対して、演奏開始を指示する（図中▲３▼）。
【００３８】
そして、管理サーバ１０から演奏開始が指示された演者端末２０は、先に受信した合奏予定データに基づき、当該合奏に参加する他の演者端末２０と、演奏データの送受信を行い、合奏を実現する。この時、各演者端末２０は、利用者により入力された演奏音声のデータ（以下、適宜「演奏データ」という。）と、他の演者端末２０から受信した演奏データとを合成し、合奏音声として出力する。
【００３９】
即ち、図３に示すように、演者端末２０の利用者が、演者端末２０（図中、「ライブギア」と表記されている。）に接続されたエレキギター等の楽器４１を演奏すると、この演奏に応じた演奏データが、インターネットＮを介して他の演者端末２０に送信される。また、演者端末２０は、インターネットＮを介して共演者の演奏データを受信するとともに、自身の演奏データと合成し、合奏音声としてスピーカ４２より出力する。従って、演者端末２０の利用者にとっては、あたかも共演者（即ち、他の演者端末２０の利用者）と共演しているかのような合奏音声を、聴くことができる。
【００４０】
また、本実施の形態においては、通信プロトコルとして、ＵＤＰ／ＩＰが採用されている。即ち、インターネットＮを経由した管理サーバ１０及び演者端末２０の間のデータ通信は、各機器に割り当てられたＩＰアドレスに基づくＩＰパケットの送受信によって、実現される。
【００４１】
図４は、ＩＰパケットに含まれるＩＰヘッダのフォーマットを示す図である。同図によれば、ＩＰヘッダは、準拠しているＩＰ規格のバージョン（即ち“６”）、当該ＩＰパケットの優先度、通信帯域の予約を保証するためのフローラベル、実データの大きさを示すペイロード長、後続ヘッダのＩＤを示すネクストヘッダ、ルータによる中継限界数を示すホップリミット、送信元の機器（管理サーバ１０、或いは演者端末２０）のＩＰアドレスを示す送信元ＩＰアドレス及び宛先の機器（管理サーバ１０、或いは演者端末２０）のＩＰアドレスを示す宛先ＩＰアドレスより構成される。
【００４２】
また、図５は、ＩＰパケットに含まれる実データ（ペイロード部）のデータ構成を示す図である。尚、同図においては、ＩＰパケットの実データ（ペイロード部）の内、合奏時に、複数の演者端末２０の間で送受信される演奏データについてのデータ構成を示している。
同図によれば、ＩＰパケットに含まれる実データは、指示ヘッダ及び音声データより構成される。
【００４３】
指示ヘッダには、当該ＩＰパケットが演奏システム１のための演奏データであることを示す演奏データ識別、送信元の演者端末２０を示す送信端末識別番号、上記送信元の演者端末２０から送信された何番目のＩＰパケットであるかを示すパケット番号、音声データの記録方式（符号化方式）を示す音声記録方式（即ち“ＰＣＭ”）、その際のパラメータであるサンプリングレート（サンプリング周波数）、ビット数（量子化ビット数）、ビットレート、そして音声データに対するエフェクトを指定するエフェクト条件が含まれている。
【００４４】
ここで、送信端末識別番号及びパケット番号を合わせて、「同期調整データ」という。この同期調整データは、後述のように、演奏データを同期合成するために用いられる。
【００４５】
エフェクトとは、音声データに対して付加する音響効果を意味しており、例えば、エコーやリバーブ（残響）、トーンコントロール、音量バランスの変更、ステレオ化の有無等が該当する。
【００４６】
また、音声データは、利用者によって入力された、アナログ信号である演奏音声を、ＰＣＭ（Ｐｕｌｓｅ　Ｃｏｄｅ　Ｍｏｄｕｌａｔｉｏｎ）方式によってデジタル信号に変換したデータであり、１つのＩＰパケットにつき、５１２サンプリング分のデータが含まれる。即ち、演奏データは、複数の演者端末２０の間で、５１２サンプリング分を単位としてやり取りされる。以下、この１パケット分（即ち、５１２サンプリング分）の音声データを、適宜「音声データパケット」という。
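The payload described above (an instruction header followed by 512 PCM samples) can be sketched as follows. This is only an illustration: the field widths, the byte order, the `PERF` identifier, the 16-bit sample size and the name `PerformancePacket` are assumptions, since the description lists the header fields but not their binary layout.

```python
import struct
from dataclasses import dataclass

SAMPLES_PER_PACKET = 512   # from the description: 512 samples per IP packet
BITS_PER_SAMPLE = 16       # assumption: 16-bit signed PCM
METHOD_PCM = 0             # assumption: numeric code for the "PCM" recording method
MAGIC = b"PERF"            # assumption: performance-data identifier

# Hypothetical wire layout of the instruction header (network byte order):
# identifier, terminal id, packet number, recording method,
# sampling rate, bits per sample, bit rate, effect-condition length.
HEADER_FORMAT = "!4sBIBIBIH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


@dataclass
class PerformancePacket:
    terminal_id: int      # transmission terminal identification number
    packet_number: int    # position of this packet in the sender's stream
    sampling_rate: int    # Hz
    effect: bytes         # opaque effect-condition bytes
    samples: list         # exactly 512 signed 16-bit samples

    def pack(self) -> bytes:
        if len(self.samples) != SAMPLES_PER_PACKET:
            raise ValueError("a packet carries exactly 512 samples")
        bit_rate = self.sampling_rate * BITS_PER_SAMPLE
        header = struct.pack(
            HEADER_FORMAT, MAGIC, self.terminal_id, self.packet_number,
            METHOD_PCM, self.sampling_rate, BITS_PER_SAMPLE, bit_rate,
            len(self.effect),
        )
        audio = struct.pack(f"!{SAMPLES_PER_PACKET}h", *self.samples)
        return header + self.effect + audio

    @classmethod
    def unpack(cls, data: bytes) -> "PerformancePacket":
        magic, tid, number, method, rate, _bits, _bit_rate, effect_len = (
            struct.unpack_from(HEADER_FORMAT, data)
        )
        if magic != MAGIC or method != METHOD_PCM:
            raise ValueError("not a PCM performance-data payload")
        effect = data[HEADER_SIZE:HEADER_SIZE + effect_len]
        samples = list(struct.unpack_from(
            f"!{SAMPLES_PER_PACKET}h", data, HEADER_SIZE + effect_len))
        return cls(tid, number, rate, effect, samples)
```

The terminal identification number and the packet number together play the role of the synchronization adjustment data; the remaining fields only tell the receiver how to decode and process the audio.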
【００４７】
図６は、演奏システム１における通信プロトコルのスタックを示す図である。同図によれば、通信プロトコルは、下位層から順に、イーサネット（登録商標）層（ＰＰＰｏｖｅｒＥｔｈｅｒ）、ＩＰ層（ネットワーク層）、ＵＤＰ層（トランスポート）及びアプリケーション層（ＡＰＩを含む）より構成される。
【００４８】
即ち、演者端末２０が他の演者端末２０から受信したＩＰパケットは、イーサネット（登録商標）層を介してＩＰ層へ渡され、このＩＰ層にて、暗号解読処理がなされる。その後、ＵＤＰ層を経由して、アプリケーション層に渡される。そして、アプリケーション層にて、後述のように、ＣＰＵ２１によりＩＰパケットの実データに含まれる音声データが抜き出され、１つの音声データパケットとして、指示ヘッダとともに、ＤＳＰ２２へ入力される。
【００４９】
また、利用者により演者端末２０へ入力された演奏データは、後述のように、アプリケーション層にて、ＤＳＰ２２により音声データパケットに変換され、ＣＰＵ２１へ出力される。そして、ＵＤＰ層、続いてＩＰ層へ渡され、このＩＰ層にて暗号化処理がなされ、ＩＰパケット化される。その後、生成されたＩＰパケットは、イーサネット（登録商標）層を経由し、他の演者端末２０へ送信される。
【００５０】
ここで、ＩＰ層にて実現されるデータの暗号／復号化について説明する。
暗号化とは、インターネット等のネットワークを通じて文書や画像等のデジタルデータをやりとりする際に、通信途中で第三者に盗み見られたり改ざんされることを防ぐため、予め定められた規則に従ってデータを変換し、解読が極めて困難な状態にすることである。そして、暗号化されたデータを元に戻し、解読可能な状態にすることを復号化という。
【００５１】
また、一般に、暗号／復号化には、暗号表にあたる「鍵」を使用するが、対になる２つの鍵を使う公開鍵暗号方式と、どちらにも同じ鍵を用いる共通鍵暗号方式とがある。前者にはＲＳＡ、ＥｌＧａｍａｌ暗号、楕円曲線暗号等があり、後者には米国政府標準のＤＥＳやＩＤＥＡ、ＦＥＡＬ、ＭＩＳＴＹ等があり、本実施の形態では上記の何れを使うこととしても良い。
【００５２】
次に、管理サーバ１０及び演者端末２０の内部構成について、図７〜図１１を参照して説明する。
【００５３】
図７は、管理サーバ１０の構成を示すブロック図である。
同図によれば、管理サーバ１０は、ＣＰＵ１１、記憶装置１２、ＲＡＭ１３及び通信装置１４より構成され、各部はシステムバス１５によりデータ通信可能に接続されている。
【００５４】
ＣＰＵ１１は、記憶装置１２に記憶されるプログラムに基づいて、管理サーバ１０を構成する各部を集中制御する。具体的には、通信装置１４から入力される信号に応答して、記憶装置１２に記憶されているプログラムを読み出してＲＡＭ１３に一時記憶させるとともに、係るプログラムに基づく処理を実行して、管理サーバ１０を機能させる。その際、ＣＰＵ１１は、処理結果をＲＡＭ１３内の所定領域に格納するとともに、必要に応じて、その処理結果を通信装置１４から外部機器へ送信する。
【００５５】
また、ＣＰＵ１１は、本実施の形態の特徴的な部分として、後述する合奏処理（図１３参照）を実行する。
【００５６】
具体的には、合奏処理において、ＣＰＵ１１は、演者端末２０から送信される演奏条件データを受信すると、この演奏条件データを、ＲＡＭ１３内の演奏条件データ格納領域１３ａに格納する。また、ＣＰＵ１１は、この演奏条件データに適合するデータを演奏条件データ格納領域１３ａに格納されているデータの内から検索し、合奏に参加可能な演者端末２０を特定する。そして、合奏を予定する日時、曲目、特定した演者端末２０のＩＰアドレスを含む合奏予定データを生成する。
【００５７】
その後、生成した合奏予定データを、ＲＡＭ１３内の合奏予定データ格納領域１３ｂに格納するとともに、特定した演者端末２０のそれぞれに対して、生成した合奏予定データを送信する。更に、ＣＰＵ１１は、合奏予定日時の所定時間前となると、当該合奏に参加予定の演者端末２０に対して、演奏開始を指示する。
【００５８】
記憶装置１２は、管理サーバ１０の動作に係る各種処理プログラムや、本実施の形態の機能を実現するためのプログラム（具体的には、合奏プログラム１２ａ）及びこれらのプログラムの実行に係る処理データ等を記憶する。
【００５９】
ＲＡＭ１３は、ＣＰＵ１１により実行される各プログラムを展開するプログラムエリア（不図示）、入力指示や入力データ及び上記プログラムが実行される際に生じる処理結果等を一時的に格納するワークメモリエリアを備えている。また、上記ワークメモリエリアには、演奏条件データ格納領域１３ａ及び合奏予定データ格納領域１３ｂが形成される。
【００６０】
図８は、演奏条件データ格納領域１３ａに格納されるデータの構成を示す図である。
同図によれば、演奏条件データ格納領域１３ａには、演奏条件及びＩＰアドレスが対応付けて格納される。この演奏条件は、演者端末２０から受信した演奏条件データに該当し、演奏を希望する曲目、パート及び日時を含んでいる。尚、これらの格納されているデータは、互いに適合するものが検索され、ＣＰＵ１１により合奏予定データが生成されると、該当するデータが、この演奏条件データ格納領域１３ａより削除される。
【００６１】
図９は、合奏予定データ格納領域１３ｂに格納されるデータの構成を示す図である。
同図によれば、この合奏予定データ格納領域１３ｂには、合奏予定データ、即ち、合奏を予定する日時、曲目及び各パートに割り当てられた演者端末２０のＩＰアドレスが、対応付けて格納される。また、これらの合奏予定データは、合奏を予定する日時が現在時刻に近い順に格納されており、該当する合奏が開始されると、該当するデータが、この合奏予定データ格納領域１３ｂより削除される。
【００６２】
通信装置１４は、インターネットＮを介して他の機器（主に、演者端末２０）とのデータ通信を行うためのインターフェースである。
【００６３】
図１０は、演者端末２０の構成を示すブロック図である。
同図によれば、演者端末２０は、ＣＰＵ２１、ＤＳＰ（Ｄｉｇｉｔａｌ　Ｓｉｇｎａｌ　Ｐｒｏｃｅｓｓｏｒ）２２、記憶装置２３、ＲＡＭ２４、入力装置２５、マルチプレクサ２６、サンプルホールド回路２７、Ａ／Ｄ（Ａｎａｌｏｇ−Ｄｉｇｉｔａｌ）変換回路２８、Ｄ／Ａ（Ｄｉｇｉｔａｌ−Ａｎａｌｏｇ）変換回路２９、アンプ３０、表示駆動回路３１、表示装置３２及び通信装置３３より構成される。
【００６４】
ＣＰＵ２１は、記憶装置２３に記憶されるプログラムに基づいて、演者端末２０を構成する各部を集中制御する。具体的には、通信装置３３、或いは入力装置２５から入力される信号に応答して、記憶装置２３に記憶されているプログラムを読み出してＲＡＭ２４に一時記憶させるとともに、係るプログラムに基づく処理を実行して、演者端末２０を機能させる。その際、ＣＰＵ２１は、処理結果をＲＡＭ２４内の所定領域に格納するとともに、必要に応じて、その処理結果を通信装置３３から外部機器へ送信するとともに、表示装置３２に表示させる。
【００６５】
また、ＣＰＵ２１は、本実施の形態に特徴的な部分として、後述する合奏処理（図１３参照）、リンク確立処理（図１４参照）、演奏データ受信処理（図１５参照）及び演奏データ送信処理（図１６参照）を実行する。
【００６６】
具体的には、合奏処理において、ＣＰＵ２１は、利用者の入力指示に応じた演奏条件データを、当該演者端末２０のＩＰアドレスとともに管理サーバ１０へ送信するとともに、管理サーバ１０から送信される合奏予定データを受信する。そして、管理サーバ１０より演奏開始を指示されると、ＣＰＵ２１は、リンク確立処理、次いで演奏処理を実行する。また、この演奏処理において、ＣＰＵ２１は、演奏データ受信処理及び演奏データ送信処理を実行するとともに、ＤＳＰ２２に、後述する演奏データ受信処理（図１５参照）、演奏データ送信処理（図１６参照）及び合奏音声出力処理（図１７参照）を実行させる。
【００６７】
リンク確立処理において、ＣＰＵ２１は、受信した合奏予定データに基づき、合奏に参加する他の演者端末２０に対して、利用者により入力された接続条件を送信する。それとともに、この接続条件と、他の演者端末２０から受信した接続条件とを照合し、当該合奏における接続条件を決定する。
【００６８】
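Here, the connection conditions include communication conditions and effect conditions.
The communication conditions are conditions relating to data communication between the performer terminals 20; specifically, they include the encoding method for the input performance voice (including parameters such as the sampling frequency and the number of quantization bits) and the connection rate (transmission speed) used for data communication with the other performer terminals 20. These communication conditions are chosen from among the conditions that can be set given the specifications of each performer terminal 20, either according to the user's instruction or automatically as the optimum values.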
【００６９】
また、エフェクト条件とは、上述のように、演奏音声に対する音響効果に関する条件であり、具体的には、エコーやリバーブ（残響）、トーンコントロールの度合、音量バランス及びステレオ化の有無やそのＬ／Ｒ比等が該当する。そして、このエフェクト条件は、想定する演奏場所（例えば、コンサートホールや野外ステージ、ライブハウス等）に応じて、合奏音声に施すエコーやリバーブ（残響）、トーンコントロール等の度合が決定されたり、或いは、想定するステージ上における各パートの位置（例えば、中央にボーカル、右にギター、左にベース等）に応じて、当該各パートの音声の音量バランスや、ステレオ音声として出力する際の左右バランス（Ｌ／Ｒ比）が決定される。
【００７０】
演奏データ受信処理において、ＣＰＵ２１は、通信装置３３から入力されたＩＰパケットの実データに含まれる音声データを抜き出し、音声データパケットとして、指示ヘッダとともにＤＳＰ２２へ出力する。
【００７１】
また、演奏データ送信処理において、ＣＰＵ２１は、ＤＳＰ２２から入力される音声データパケットに所定のＩＰヘッダ等を付加したＩＰパケットを生成し、通信装置３３へ出力する。
【００７２】
ＤＳＰ２２は、デジタルデータの高速処理に特化したプロセッサである。また、ＤＳＰ２２は、本実施の形態の特徴的な部分として、演奏データ受信処理（図１５参照）、演奏データ送信処理（図１６参照）及び合奏音声出力処理（図１７参照）を実行する。
【００７３】
具体的には、演奏データ受信処理において、ＤＳＰ２２は、ＣＰＵ２１から入力された音声データパケットに対して所定の音響効果処理を行った後、演奏データバッファ２４ｂの所定領域に格納する。
【００７４】
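In the performance data transmission process, the DSP 22 stores the voice data input from the A/D conversion circuit 28 in the input buffer 24a, and outputs each 512 samples of voice data to the CPU 21 as one voice data packet. In addition, it applies the prescribed acoustic effect processing to this voice data packet and then stores it in the designated area of the performance data buffer 24b.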
【００７５】
また、合奏音声出力処理において、ＤＳＰ２２は、演奏データバッファ２４ｂに格納されている音声データパケットを合成し、合成後の音声データをＤ／Ａ変換回路２９へ出力する。
【００７６】
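FIG. 11 is a diagram showing the concept of the processing that the CPU 21 and the DSP 22 perform on voice data packets. The figure shows the case where the performance system 1 consists of the terminal itself (hereinafter "performer terminal A" where appropriate) and two other performer terminals 20 (hereinafter "performer terminal B" and "performer terminal C", respectively, where appropriate).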
【００７７】
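As described above, the performance data received by the performer terminal 20 is input to the CPU 21 as IP packets. In the figure, these IP packets are simplified and represented only by the voice data and the synchronization adjustment data, which consists of the transmission terminal identification number and the packet number; they are shown from the left in the order of input to the CPU 21.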
【００７８】
また、送信端末識別番号“Ａ”は演者端末Ａを、送信端末識別番号“Ｂ”は演者端末Ｂを、そして送信端末識別番号“Ｃ”は演者端末Ｃを、それぞれ示している。
【００７９】
そして、音声データ“データＡｈ”は、演者端末Ａ自身のパケット番号“ｈ”の音声データパケットを、音声データ“データＢｉ”は、演者端末Ｂより受信したパケット番号“ｉ”の音声データパケットを、そして、音声データ“データＣｊ”は、演者端末Ｃより受信したパケット番号“ｊ”の音声データパケットを、それぞれ示している。
【００８０】
同図（ａ）によれば、ＣＰＵ２１は、入力されたＩＰパケットから音声データを抜き出す。そして、この音声データを１つの音声データパケットとし、付加されている端末識別番号に従って送信元の演者端末２０毎に振り分けるとともに、パケット番号に従って並べ替える。即ち、同図（ａ）においては、図中上から、演者端末Ａ、Ｂ、Ｃの順に振り分けた様子を示している。
【００８１】
次いで、同図（ｂ）によれば、ＤＳＰ２２は、これらの音声データパケットそれぞれに対し、指定されたエフェクト条件に基づいた音響処理を施す。具体的には、エコーやリバーブ、トーンコントロール等のエフェクトを、指定された度合で付加するとともに、指定されたＬ／Ｒ比のステレオデータに変換する。
【００８２】
例えば、音声データ“データＡ０”に対し、所定のエフェクトを付加するとともに、指定されたＬ／Ｒ比のステレオデータ、即ちＬチャネルの音声データ“データＡ０Ｌ”及び右チャネルの音声データ“データＡ０Ｒ”を、生成する。
【００８３】
また、音声データ“データＢ０”及び“データＣ０”についても同様に、所定のエフェクトを付加し、指定されたＬ／Ｒ比で、Ｌチャネルの音声データ“データＢ０Ｌ”及び“データＣ０Ｌ”、そして右チャネルの音声データ“データＢ０Ｒ”及び“データＣ０Ｒ”を、それぞれ生成する。
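The conversion of one mono voice data packet into L-channel and R-channel data at a specified L/R ratio could look like the sketch below. The function name `to_stereo` and the convention that the ratio is a linear gain share between 0.0 and 1.0 for the left channel are assumptions; the description only says that a specified L/R ratio is applied.

```python
def to_stereo(samples, left_ratio):
    """Split mono samples into (left, right) lists.

    left_ratio is the share of the signal sent to the left channel
    (assumed convention: 0.0 = fully right, 1.0 = fully left).
    """
    if not 0.0 <= left_ratio <= 1.0:
        raise ValueError("left_ratio must be between 0.0 and 1.0")
    right_ratio = 1.0 - left_ratio
    left = [s * left_ratio for s in samples]
    right = [s * right_ratio for s in samples]
    return left, right
```

For instance, `to_stereo(data_a0, 0.5)` would place part A in the centre, while a larger ratio moves it toward the left of the imagined stage.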
【００８４】
このように、ＤＳＰ２２によって音響処理が施された音声データパケット（即ち、同図（ｂ）の状態）が、演奏データバッファ２４ｂに格納される。
【００８５】
その後、同図（ｃ）によれば、ＤＳＰ２２は、これら音響処理済みの音声データパケットを合成し、合奏音声を再生するための音声データを生成する。具体的には、これら音響処理済みの音声データパケットの内、パケット番号が同一のもの同士を、チャネル毎に合成する。
【００８６】
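For example, for packet number "0", the voice data "data A0L", "data B0L" and "data C0L" are combined to generate the L-channel voice data "data A0L+B0L+C0L", and the voice data "data A0R", "data B0R" and "data C0R" are combined to generate the R-channel voice data "data A0R+B0R+C0R".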
【００８７】
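Similarly, for packet number "1", the voice data "data A1L", "data B1L" and "data C1L" are combined to generate the L-channel voice data "data A1L+B1L+C1L", and the voice data "data A1R", "data B1R" and "data C1R" are combined to generate the R-channel voice data "data A1R+B1R+C1R".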
【００８８】
そして、ＤＳＰ２２は、このように生成した音声データを、Ｄ／Ａ変換回路２９へ出力する。
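The channel-by-channel combination of packets that share a packet number, as in FIG. 11(c), can be sketched as a sample-wise sum. The name `mix_packets` is hypothetical, and clipping to the 16-bit range is an assumption added here so the sum stays valid PCM; the description itself only mentions a later output-level adjustment.

```python
def mix_packets(stereo_packets, limit=32767):
    """Mix packets that share one packet number.

    stereo_packets maps a terminal id to a (left, right) pair of equally
    long sample lists. Returns the mixed (left, right) pair.
    """
    lefts = [pair[0] for pair in stereo_packets.values()]
    rights = [pair[1] for pair in stereo_packets.values()]

    def mix(channels):
        # Sum the samples at each position, then clip (assumed behaviour).
        return [max(-limit - 1, min(limit, sum(column)))
                for column in zip(*channels)]

    return mix(lefts), mix(rights)
```

Called with the processed packets A0, B0 and C0, this yields the data labelled "A0L+B0L+C0L" and "A0R+B0R+C0R" in the text.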
【００８９】
また、図１０において、記憶装置２３は、演者端末２０の動作に係る各種処理プログラムや、本実施の形態の機能を実現するためのプログラム（具体的には、図１３の合奏処理を実行するための合奏プログラム２３ａ、図１４のリンク確立処理を実行するためのリンク確立プログラム２３ｂ、図１５の演奏データ受信処理を実行するための演奏データ受信プログラム２３ｃ、図１６の演奏データ送信処理を実行するための演奏データ送信プログラム２３ｄ及び図１７の合奏音声出力処理を実行するための合奏音声出力プログラム２３ｅ）及びこれらのプログラムに係る処理データ等を記憶する。
【００９０】
ＲＡＭ２４は、ＣＰＵ２１、或いはＤＳＰ２２により実行される各プログラムを展開するプログラムエリア（不図示）、入力指示や入力データ及び上記プログラムが実行される際に生じる処理結果等を一時的に格納するワークメモリエリアを備えている。また、このワークメモリエリアには、入力バッファ２４ａ及び演奏データバッファ２４ｂが形成される。
【００９１】
入力バッファ２４ａは、利用者によって入力された演奏音声のデータ（演奏データ）が格納される領域であり、詳細には、Ａ／Ｄ変換回路２８から入力された、５１２サンプリング分の音声データを格納する領域が確保されている。
【００９２】
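The performance data buffer 24b is an area that stores the performer terminal 20's own performance data and the performance data received from the other performer terminals 20. The data structure of this performance data buffer 24b is shown in FIG. 12.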
【００９３】
図１２は、演奏データバッファ２４ｂのデータ構成を示す図である。
同図によれば、演奏データバッファ２４ｂには、当該演者端末２０へ入力された演奏音声のデータ（演奏データ）を格納する領域ＯＵＴ［０］及び他の演者端末２０から受信した音声データを格納するＮ個の領域ＯＵＴ［ｎ］（但し、ｎ＝１、２、・・・、Ｎである。また、Ｎは、合奏に参加している他の演者端末２０の数である。）が備えられている。
【００９４】
また、これらの領域ＯＵＴ［ｎ］は、合奏に参加している他の演者端末２０と１対１で対応付けられており、各領域ＯＵＴ［ｎ］毎に、対応付けられた演者端末２０から受信された音声データが格納される。
【００９５】
具体的には、これらの領域ＯＵＴ［ｍ］（但し、ｍ＝０、１、・・・Ｎである。）には、音声データパケットが、そのパケット番号に従って格納される。即ち、同図においては、領域ＯＵＴ［ｍ］毎に、音声データパケットが、パケット番号順に、図中左から１つづつ格納された様子を示している。
【００９６】
また、ここで格納される音声データパケットは、所定の音響処理が施された後のデータ、具体的には、所定のエフェクトが付加されるとともに、ステレオ化されたデータである。即ち、同図においては、領域ＯＵＴ［ｍ］毎に、図中上半分にＬチャネル用のデータが、下半分にＲチャネル用のデータが、それぞれ格納された様子を示している。
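One way to picture the areas OUT[0] to OUT[N] is a list of per-terminal tables indexed by packet number, each entry holding the L-channel and R-channel data. The class and method names below are hypothetical, and using a dictionary per area is a design assumption rather than the layout of FIG. 12.

```python
class PerformanceDataBuffer:
    """Sketch of buffer 24b: OUT[0] is the local terminal, OUT[1..N] the others."""

    def __init__(self, num_other_terminals):
        self.out = [dict() for _ in range(num_other_terminals + 1)]

    def store(self, area, packet_number, left, right):
        # Packets are filed by packet number, whatever order they arrive in.
        self.out[area][packet_number] = (left, right)

    def is_complete(self, packet_number):
        # True once every area holds a packet with this number.
        return all(packet_number in area for area in self.out)

    def take(self, packet_number):
        # Remove and return the (left, right) pair from every area, OUT[0] first.
        return [area.pop(packet_number) for area in self.out]
```

Because each area is keyed by packet number, a packet from a distant terminal that arrives late still lands in the slot that lines up with the local packet of the same number.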
【００９７】
図１０において、入力装置２５は、演者端末２０の利用者が、演奏を希望する条件や、合奏に参加する他の演者端末２０との接続条件等を入力するためのものであり、文字キー、数字キーや各種機能キーを備えたキーボード又はタッチパネル、或いはマウスやトラックボール、トラックポイント、ポインティング・スティック等のポインティングデバイス等により構成される。そして、これらの操作に応じた操作信号を、ＣＰＵ２１へ出力する。
【００９８】
マルチプレクサ２６は、複数の入力信号の内から１つの信号を選択し、出力する回路であり、具体的には、ＤＳＰ２２から入力される指示信号（選択信号）に従って、入力される演奏音声の一つを選択し、サンプルホールド回路２７へ出力する。
【００９９】
サンプルホールド回路２７は、ＤＳＰ２２から入力されるクロック信号の入力タイミングに従い、マルチプレクサ２６から入力される演奏音声（アナログ信号）の波高を一時的にホールド（保持）するとともに、Ａ／Ｄ変換回路２８へ出力する。
【０１００】
Ａ／Ｄ変換回路２８は、サンプルホールド回路２７から入力されるアナログ信号をデジタル信号に変換する回路であり、変換後のデジタル音声データを、ＤＳＰ２２へ出力する。
【０１０１】
Ｄ／Ａ変換回路２９は、ＤＳＰ２２から入力されるデジタル信号をアナログ信号に変換する回路であり、変換後のアナログ音声信号を、アンプ３０へ出力する。
【０１０２】
アンプ３０は、Ｄ／Ａ変換回路２９から入力されるアナログ信号（アナログ音声信号）を所定レベルに増幅し、増幅後の信号を、外部接続されたスピーカ（図３のスピーカ４２に相当する。）へ出力する。
【０１０３】
表示駆動回路３１は、ＣＰＵ２１から入力される表示信号に基づいて表示装置３２を制御し、各種画面を表示させる回路である。また、表示装置３２は、ＣＲＴ（Ｃａｔｈｏｄｅ　Ｒａｙ　Ｔｕｂｅ）やＬＣＤ（Ｌｉｑｕｉｄ　Ｃｒｙｓｔａｌ　Ｄｉｓｐｌａｙ）等により構成され、ＣＰＵから入力される表示信号に従った表示画面を表示する。
【０１０４】
通信装置３３は、インターネットＮを介して他の機器（主に、管理サーバ１０及び他の演者端末２０）とのデータ通信を行うためのインターフェースである。
【０１０５】
次に、演奏システム１の動作を、図１３〜図１７を参照して説明する。
【０１０６】
図１３は、合奏処理を説明するためのフローチャートである。この合奏処理は、ＣＰＵ２１及びＤＳＰ２２により、記憶装置２３に記憶された合奏プログラム２３ａに従って、実行される処理である。
【０１０７】
同図によれば、演者端末２０において、ＣＰＵ２１は、利用者からの入力指示に従い、管理サーバ１０が提供するＷｅｂサイトへアクセスし（ステップＳ１１）、この管理サーバ１０から送信されるＨＰ情報（Ｗｅｂページを表示させるための情報）を受信する（ステップＳ１２）。そして、このＨＰ情報に基づいて、表示装置３２に参加登録画面（不図示）を表示させる。
【０１０８】
次いで、この参加登録画面上の入力フォーマットに従って、利用者により演奏を希望する曲目やパート、日時等を含む演奏条件データが入力されると（ステップＳ１３）、ＣＰＵ２１は、この演奏条件データ及び演者端末２０のＩＰアドレスを、参加登録要請とともに、管理サーバ１０へ送信する（ステップＳ１４）。
【０１０９】
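Meanwhile, in the management server 10, the CPU 11 receives the performance condition data and IP address transmitted from the performer terminal 20 (step S15), associates them with each other, and adds them to the performance condition data storage area 13a in the RAM 13.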
【０１１０】
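Next, the CPU 11 searches the data stored in the performance condition data storage area 13a for data that matches the received performance condition data, and identifies the performer terminals 20 that can take part in the ensemble (step S16). Specifically, it searches for data whose desired tune and date and time are the same and whose part is different.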
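The search in step S16 (same tune and date and time, different part) can be sketched as a simple filter over the stored entries. The record type `PerformanceCondition`, its field names and the use of plain strings for the date and time are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceCondition:
    tune: str
    part: str
    date_time: str    # assumed to be a normalized string such as "2002-09-01T20:00"
    ip_address: str   # address of the performer terminal that registered


def find_partners(new, stored):
    """Return stored entries with the same tune and date/time but another part."""
    return [c for c in stored
            if c.tune == new.tune
            and c.date_time == new.date_time
            and c.part != new.part]
```

The IP addresses of the returned entries, together with that of the new request, are what the server would place in the ensemble schedule data.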
【０１１１】
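When data with matching performance conditions is found, the CPU 11 generates ensemble schedule data from that data. It then adds the generated ensemble schedule data to the ensemble schedule data storage area 13b in the RAM 13, specifically at the position that keeps the entries ordered from the scheduled date and time nearest the current date and time, and transmits the generated ensemble schedule data to each of the identified performer terminals 20 (step S17). As described above, this ensemble schedule data contains the scheduled date and time of the performance, the tune, and the IP addresses of the performer terminals 20 assigned to each part. The CPU 11 then deletes the data identified as matching from the performance condition data storage area 13a.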
【０１１２】
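Although not illustrated, if in step S16 no data matching the received performance condition data is stored in the performance condition data storage area 13a, that is, if no performer terminal 20 able to take part in the ensemble can be identified, the CPU 11 returns to step S12 and waits for the next performance condition data to be received.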
【０１１３】
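The CPU 11 of the management server 10 also continually compares the scheduled performance date and time stored at the head of the ensemble schedule data storage area 13b with the current date and time. When the predetermined time before the scheduled performance date and time is reached (step S19: YES), it instructs the relevant performer terminals 20, that is, all the performer terminals 20 scheduled to take part in the ensemble, to start the performance (step S20).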
【０１１４】
そして、演者端末２０においては、管理サーバ１０より演奏開始を指示されると（ステップＳ２１）、ＣＰＵ２１は、後述のリンク確立処理（図１４参照）を実行することで、当該合奏に参加予定の他の演者端末２０と、接続条件の提示・照合を行うとともに、直接通信リンクを確立する（ステップＳ２２）。
【０１１５】
その後、ＣＰＵ２１は、後述の演奏処理を実行することで、複数の演者端末２０による合奏を実現する（ステップＳ２３）。そして、この演奏処理が終了することで、予定した合奏が終了となる。
【０１１６】
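The link establishment process executed in step S22 of FIG. 13 will now be described. FIG. 14 is a flowchart for explaining the link establishment process. As noted above, this process is executed in step S22 of FIG. 13, and is executed by the CPU 21 and the DSP 22 in accordance with the link establishment program 23b stored in the storage device 23.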
【０１１７】
同図によれば、演者端末２０において、ＣＰＵ２１は、利用者からの入力指示に従った接続条件を、他の演者端末２０それぞれに対して提示するとともに、他の演者端末２０それぞれから提示される接続条件と照合する（ステップＳ２１１）。尚、ここで照合される接続条件は、上述のように、通信条件及びエフェクト条件が含まれるものである。
【０１１８】
照合の結果、他の演者端末２０から提示された接続条件との合意が得られないと判断した場合（ステップＳ２１２：ＮＯ）、ＣＰＵ２１は、合意されていない接続条件の内容を、代替案の候補とともに表示装置３２に表示させる（ステップＳ２１３）。
【０１１９】
そして、これらの代替案から選択された内容を、新たな接続条件として他の演者端末２０に提示し、他の演者端末２０から提示される接続条件との照合を、再度行う（ステップＳ２１４）。
【０１２０】
接続条件の再照合の結果、合意が得られた場合には（ステップＳ２１２：ＹＥＳ）、ＣＰＵ２１は、合意した接続条件、詳細には、演者端末２０自身に関する接続条件を、ＲＡＭ２４に記憶しておく。
その後、ＣＰＵ２１は、本リンク確立処理を終了する。
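The collation in steps S211 and S212 can be pictured as comparing the local terminal's proposed communication conditions with those presented by each other terminal and listing the items that still differ. Treating agreement as equality of each named value, and the names `disagreements`, `sampling_rate`, `bits` and `connection_rate`, are assumptions; the description does not say how agreement is judged.

```python
def disagreements(own, proposals):
    """Return the sorted names of conditions on which some peer differs.

    own is a dict of condition name -> value for the local terminal;
    proposals is a list of such dicts, one per other performer terminal.
    """
    differing = set()
    for other in proposals:
        for name, value in own.items():
            if other.get(name) != value:
                differing.add(name)
    return sorted(differing)
```

An empty result corresponds to "step S212: YES"; a non-empty one gives the items that would be shown to the user together with alternative candidates in step S213.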
【０１２１】
次に、図１３のステップＳ２３にて実行される演奏処理について説明する。
この演奏処理において、ＣＰＵ２１は、演奏データ受信処理（図１５参照）、演奏データ送信処理（図１６参照）及び合奏音声出力処理（図１７参照）の３つの処理を、並行して実行する。そして、合奏の終了を判断した場合には、実行中のこれらの処理を終了させ、当該演奏処理を終了する。
【０１２２】
図１５は、演奏データ受信処理を説明するためのフローチャートである。この演奏データ受信処理は、ＣＰＵ２１及びＤＳＰ２２により、記憶装置２３に記憶された演奏データ受信プログラム２３ｃに従って実行される。
【０１２３】
同図によれば、演者端末２０において、ＣＰＵ２１は、通信装置３３からＩＰパケットが入力されると（ステップＳ２２１：ＹＥＳ）、入力されたＩＰパケットに対して、暗号解読処理を含むＩＰ層プロトコル処理（ステップＳ２２２）及びＵＤＰ層プロトコル処理を行うとともに（ステップＳ２２３）、アプリケーション層処理により、このＩＰパケットから音声データを抜き出し、音声データパケットとして、指示ヘッダとともにＤＳＰ２２へ出力する（ステップＳ２２４）。
【０１２４】
その後、ＣＰＵ２１は、本処理の終了が指示されているか否かを判定する（ステップＳ２２５）。判定の結果、終了を指示されていない場合には（ステップＳ２２５：ＮＯ）、ＣＰＵ２１は、再度ステップＳ２２１に移行し、続いて入力されるＩＰパケットに対して、同様の処理を繰り返す（ステップＳ２２１〜Ｓ２２４）。
【０１２５】
また、ステップＳ２２５において、本処理の終了が指示されていると判定した場合には（ステップＳ２２５：ＹＥＳ）、ＣＰＵ２１は、本演奏データ受信処理を終了する。
【０１２６】
一方、ＤＳＰ２２は、ＣＰＵ２１から音声データパケットが入力されると（ステップＳ２２６）、入力された音声データパケットに対し、ともに転送される指示ヘッダに含まれる音声記録方式に従って、デコード処理を行う（ステップＳ２２７）。
【０１２７】
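Next, the DSP 22 applies acoustic processing to the decoded voice data packet in accordance with the effect conditions contained in the instruction header (step S228), and then stores it in the corresponding area of the performance data buffer 24b (step S229).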
【０１２８】
具体的には、図１２を参照して説明したように、エコーやリバーブ、トーンコントロール等のエフェクトを、指定された度合で付加した後、指定されたＬ／Ｒ比でステレオ化する。そして、演奏データバッファ２４ｂ内の、送信元の演者端末２０に対応する領域ＯＵＴに、パケット番号に従って格納する。
【０１２９】
その後、ＤＳＰ２２は、再度ステップＳ２２６に移行し、続いて入力される音声データパケットに対して、同様の処理を繰り返す（ステップＳ２２６〜Ｓ２２９）。
【０１３０】
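FIG. 16 is a flowchart for explaining the performance data transmission process. This process is executed by the CPU 21 and the DSP 22 in accordance with the performance data transmission program 23d stored in the storage device 23.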
【０１３１】
同図によれば、演者端末２０において、本処理を開始した後、所定のサンプリング周期が経過すると（ステップＳ２３１：ＹＥＳ）、ＤＳＰ２２は、Ａ／Ｄ変換回路２８から出力される音声データを取込み（ステップＳ２３２）、取込んだ音声データを、入力バッファ２４ａへ格納する（ステップＳ２３３）。
【０１３２】
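Here, the sampling period is the reciprocal of the sampling frequency used when converting the analog performance voice into a digital signal; it is included in the connection conditions agreed in step S212 of FIG. 14.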
【０１３３】
次いで、ＤＳＰ２２は、１パケット分、即ち５１２サンプリング分の音声データの取込みを行ったか否かを判定し（ステップＳ２３４）、行っていないと判定した場合には（ステップＳ２３４：ＮＯ）、再度ステップＳ２３１に移行し、再度サンプリング周期を経過後、同様の処理を繰り返す（ステップＳ２３１〜Ｓ２３４）。
【０１３４】
このように、サンプリング周期毎に音声データの取込みを繰り返し、１パケット分、即ち５１２サンプリング分の音声データの取込みを行ったと判定すると（ステップＳ２３４：ＹＥＳ）、次いで、ＤＳＰ２２は、入力バッファ２４ａに格納されている音声データを読み出す。そして、読み出した音声データに対して所定の音響処理を行った後（ステップＳ２３５）、演奏データバッファ２４ｂの該当する領域に格納する（ステップＳ２３６）。
【０１３５】
即ち、図１２を参照して説明したように、先に合意した接続条件に含まれるエフェクト条件に従い、１パケット分の音声データに、エコーやリバーブ、トーンコントロール等のエフェクトを付加するとともに、指定されたＬ／Ｒ比でステレオ化する。
【０１３６】
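The DSP 22 also encodes the one packet's worth of voice data read from the input buffer 24a in accordance with the PCM method (step S237), and then transfers it to the CPU 21 as a voice data packet together with the instruction header (step S238). As described above, this instruction header contains the transmission terminal identification number that identifies the performer terminal 20 itself, the packet number, the audio recording method, the effect conditions, and so on.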
【０１３７】
その後、ＤＳＰ２２は、再度ステップＳ２３１へ移行し、次の１パケット分の音声データを取込み、同様の処理を繰り返す（ステップＳ２３１〜Ｓ２３８）。
【０１３８】
一方、ＣＰＵ２１は、ＤＳＰ２２から音声データパケットが入力されると（ステップＳ２３９）、この音声データパケットに、同期調整データ、即ち演者端末２０自身を識別する送信端末識別番号及び音声データパケットが入力された順を示すパケット番号を付加する（ステップＳ２４０）。そして、当該合奏に参加している他の演者端末２０の内から、１つの演者端末２０を、演奏データの送信先として特定する（ステップＳ２４１）。
【０１３９】
次いで、ＣＰＵ２１は、ＵＤＰ層プロトコル処理（ステップＳ２４２）、暗号化処理を含むＩＰ層プロトコル処理を行い（ステップＳ２４３）、ＩＰパケットを生成する。この時、ＣＰＵ２１は、上記特定した演者端末２０のＩＰアドレスを、ＩＰヘッダの宛先ＩＰアドレスに設定する。そして、ＣＰＵ２１は、生成したＩＰパケットを通信装置３３へ出力し、上記特定した他の演者端末２０へ送信させる（ステップＳ２４４）。
【０１４０】
その後、ＣＰＵ２１は、当該合奏に参加している他の演者端末２０の内、当該音声データパケットを含むＩＰパケットが未送信の演者端末２０が存在するか否かを判定する（ステップＳ２４５）。
【０１４１】
判定の結果、存在する場合には（ステップＳ２４５：ＹＥＳ）、ＣＰＵ２１は、再度ステップＳ２３９に移行し、未送信の演者端末２０に対して、同様の処理を繰り返す（ステップＳ２３９〜Ｓ２４５）。
【０１４２】
また、ステップＳ２４５において、合奏に参加している全ての他の演者端末２０に、当該音声データを含むＩＰパケットを送信したと判定した場合には（ステップＳ２４５：ＮＯ）、ＣＰＵ２１は、続いて、本処理の終了が指示されているか否かを判定する（ステップＳ２４６）。
【０１４３】
判定の結果、終了を指示されていない場合には（ステップＳ２４６：ＮＯ）、ＣＰＵ２１は、再度ステップＳ２３９に移行し、続いて入力される音声データパケットに対して、同様の処理を繰り返す（ステップＳ２３９〜Ｓ２４６）。一方、ステップＳ２４６において、本処理の終了が指示されていると判定した場合には（ステップＳ２４６：ＹＥＳ）、ＣＰＵ２１は、本処理を終了する。
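The loop of steps S241 to S245, in which the same voice data packet is sent once to every other performer terminal, can be sketched as below. The function name `send_to_all` is hypothetical, `sock` is assumed to be any object with a `sendto(data, address)` method such as a UDP socket, and the encryption performed in the IP layer in the description is left out.

```python
def send_to_all(sock, payload, peer_addresses):
    """Send one payload to every other performer terminal in turn.

    Returns the list of addresses the payload was sent to.
    """
    sent = []
    for address in peer_addresses:
        # One datagram per destination, as in steps S241 to S244.
        sock.sendto(payload, address)
        sent.append(address)
    return sent
```

With the standard library this could be driven by `socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)`, which matches the UDP/IP transport the embodiment describes. The number of datagrams grows with the number of peers, which is the cost that the server-relayed variation mentioned later avoids.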
【０１４４】
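FIG. 17 is a flowchart for explaining the ensemble sound output process. This process is executed by the DSP 22 in accordance with the ensemble sound output program 23e stored in the storage device 23.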
【０１４５】
同図によれば、演者端末２０において、本処理を開始した後、所定の再生サンプリング周期が経過すると（ステップＳ２５１：ＹＥＳ）、ＤＳＰ２２は、演奏データバッファ２４ｂから、所定のパケット番号の音声データパケットを読み出す（ステップＳ２５２）。
【０１４６】
尚、ここで、再生サンプリング周期とは、１パケット分の音声データ（即ち、５１２サンプリング分）を再生するのに要する時間であり、具体的には、上記サンプリング周期の５１２倍に相当する。
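The relation above (playback sampling period = 512 × sampling period) is easy to check numerically. The 44.1 kHz rate used in the usage note is only an assumed example; the description leaves the sampling frequency to the agreed connection conditions.

```python
from fractions import Fraction


def playback_period(sampling_rate_hz, samples_per_packet=512):
    """Return the time in seconds needed to play back one packet."""
    # One sampling period is 1 / sampling_rate_hz seconds.
    return Fraction(samples_per_packet, sampling_rate_hz)
```

At an assumed 44.1 kHz, `float(playback_period(44100))` is roughly 0.0116 s, so each packet covers a little under 12 ms of sound.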
【０１４７】
次いで、ＤＳＰ２２は、演奏データバッファ２４ｂから読み出した音声データパケットを、図１２を参照して説明したように、チャネル毎に合成（同期合成）し、合奏音声を再現するための音声データパケットを生成する（ステップＳ２５３）。
【０１４８】
その後、ＤＳＰ２２は、生成した音声データパケットの出力レベルを調整した後、Ｄ／Ａ変換回路２９へ出力する（ステップＳ２５４）。尚、ここでＤ／Ａ変換回路２９に入力された音声データは、アナログ信号に変換された後、アンプ３０にて増幅され、演奏音声としてスピーカから出力される。
【０１４９】
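The DSP 22 then determines whether termination of this process has been instructed (step S255). If termination has not been instructed (step S255: NO), the DSP 22 returns to step S251 and repeats the same processing for the voice data packets with the next packet number (steps S251 to S255).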
【０１５０】
また、ステップＳ２５５において、本処理の終了が指示されていると判定した場合には（ステップＳ２５５：ＹＥＳ）、ＤＳＰ２２は、本処理を終了する。
【０１５１】
このように、演者端末２０において、並行して実行される３つの処理（演奏データ受信処理、演奏データ送信処理及び合奏音声出力処理）を全て終了すると、ＣＰＵ２１は、演奏処理を終了する。
【０１５２】
以上のように構成することで、インターネットＮを介して接続された複数の演者端末２０それぞれにおいて、入力された演奏音声のデータと、他の演者端末２０から送信される演奏データとを、パケット番号に従って合成・出力することが可能となる。即ち、各演者端末２０の間の距離に関わらず、これらの演者端末２０における演奏音声を正確に同期させて再生することができ、リアルタイムな合奏を実現することが可能となる。
【０１５３】
更に、演者端末２０に入力される演奏音声のデータを、エフェクト条件とともに送信することで、入力された演奏音声に、利用者が所望する音響効果（例えば、エコーやリバーブ、ステレオ化等）を施した合奏音声を、再生することが可能となる。
【０１５４】
尚、本発明は、上記実施の形態に限定されることなく、本発明の趣旨を逸脱しない範囲で適宜変更可能である。例えば、演者端末２０間の演奏データの送受信を、管理サーバ１０を介して行うこととしても良い。
【０１５５】
即ち、図１８に示すように、各演者端末２０は、入力された演奏音声のデータ（演奏データ）を、管理サーバ１０へ送信する。そして、管理サーバ１０は、これらの演奏データを受信するとともに、各演者端末２０へ一斉配信する。また、演者端末２０は、管理サーバ１０から送信される演奏データを、パケット番号に従って同期させ、指定されたエフェクト条件に従って音響処理を施し、合奏音声として出力する。
【０１５６】
従って、演者端末２０による演奏データの送信先は、管理サーバ１０のみとなるため、演者端末２０における、演奏データの送受信に係る負担を軽減することができるとともに、合奏に参加する演者端末２０が多数になる程、より効果的となる。
【０１５７】
また、この時、管理サーバ１０は、各演者端末２０から送信されるこれらの演奏データを、パケット番号に従って同期させ、各演者端末２０へ配信することとしても良いし、更に、管理サーバ１０は、上記パケット番号に従って同期させた演奏データを、指定されたエフェクト条件に基づいてエフェクト付加及びステレオ化し、各演者端末２０へ送信することとしても良い。
【０１５８】
また、上記実施の形態においては、各演者端末２０の利用者は１人であるとしたが、複数であっても良い。その場合には、マルチプレクサ２６は、各利用者から入力される複数の演奏音声を、ＤＳＰ２２からの指示信号に従って分離・選択し、サンプルホールド回路２７へ出力する。そして、ＤＳＰ２２及びＣＰＵ２１は、これらの演奏音声を、利用者毎に振り分けて処理する。
【０１５９】
【発明の効果】
請求項１又は１０に記載の発明によれば、入力された音声データに、予め定められた同期調整データを付加した通信データを作成することができる。
【０１６０】
請求項２に記載の発明によれば、他の通信端末から受信した音声通信データに含まれる音声データと作成手段により作成された同期調整データが付加された入力音声データに含まれる音声データとを、それぞれに対応する同期調整データに従って、同期した合成音声として再生することができる。
【０１６１】
請求項３に記載の発明によれば、音声通信データを受信した順序に関わらず、正しい順序で合成音声の再生ができる。
【０１６２】
請求項４に記載の発明によれば、それぞれの音声通信データを加工し、音色・音量・音質を、より正確に再現することが可能となるとともに、また、再生する場所に応じて、より相応しい合成音声を生成することが可能となる。
【０１６３】
請求項５に記載の発明によれば、音声データを所定単位毎に同期調整データを付加して音声通信データを順次作成できるので、音声通信データの取り扱いがしやすい。
【０１６４】
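According to the invention described in claim 6, if, for example, the communication terminal of the present invention is applied to an apparatus in which it and other communication terminals exchange performance data with one another to realize a real-time band performance, other terminals that satisfy the desired communication conditions can be found by transmitting conditions such as the desired tune, part, and date and time to the server as the desired communication conditions.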
【０１６５】
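According to the invention described in claim 7, encrypting the communication data to be transmitted and decrypting the received communication data improves security when performing data communication over the predetermined communication line.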
【０１６６】
請求項８，９に記載の発明によれば、入力された音声データに、予め定められた同期調整データを付加した通信データを作成することができる。
【図面の簡単な説明】
【図１】本実施の形態における演奏システムの構成を示す図である。
【図２】演奏システムにおけるデータの流れを示す図である。
【図３】本実施の形態の概念を示す図である。
【図４】ＩＰパケットのＩＰヘッダのフォーマットを示す図である。
【図５】ＩＰパケットの実データの構成を示す図である。
【図６】演奏システムにおける通信プロトコルを示す図である。
【図７】管理サーバの内部構成を示すブロック図である。
【図８】演奏条件データ格納領域のデータ構成を示す図である。
【図９】合奏予定データ格納領域のデータ構成を示す図である。
【図１０】演者端末の内部構成を示すブロック図である。
【図１１】演奏データに対する処理の概略を示す図である。
【図１２】演奏データバッファのデータ構成を示す図である。
【図１３】合奏処理を説明するためのフローチャートである。
【図１４】リンク確立処理を説明するためのフローチャートである。
【図１５】演奏データ受信処理を説明するためのフローチャートである。
【図１６】演奏データ送信処理を説明するためのフローチャートである。
【図１７】合奏音声出力処理を説明するためのフローチャートである。
【図１８】本実施の形態の変形例を示す図である。
【符号の説明】
１　演奏システム
１０　管理サーバ
１１　ＣＰＵ
１２　記憶装置
１２ａ　合奏プログラム
１３　ＲＡＭ
１３ａ　演奏条件データ格納領域
１３ｂ　合奏予定データ格納領域
１４　通信装置
２０　演者端末
２１　ＣＰＵ
２２　ＤＳＰ
２３　記憶装置
２３ａ　合奏プログラム
２３ｂ　リンク確立プログラム
２３ｃ　演奏データ受信プログラム
２３ｄ　演奏データ送信プログラム
２３ｅ　合奏音声出力プログラム
２４　ＲＡＭ
２４ａ　入力バッファ
２４ｂ　演奏データバッファ
２５　入力装置
２６　マルチプレクサ
２７　サンプルホールド回路
２８　Ａ／Ｄ変換回路
２９　Ｄ／Ａ変換回路
３０　アンプ
３１　表示駆動回路
３２　表示装置
３３　通信装置
Ｎ　インターネット

[0001]
[Technical field of the invention]
The present invention relates to a communication terminal and a program.
[0002]
[Prior art]
Conventionally, communication systems that connect a plurality of devices via a network, receive the input voice from each device in real time, and reproduce that voice on each device are known, such as the network performance system disclosed in Japanese Patent Application Laid-Open No. H11-219174 and the karaoke system disclosed in Japanese Patent Application Laid-Open No. H10-39897.
[0003]
In the network performance system, each of the plurality of terminal devices converts the operation information of the input performance part (hereinafter, simply referred to as “part”) into a MIDI file and transmits the MIDI file to the server via the Internet. Then, the server generates an ensemble sound based on the operation information of each part received from each terminal device, and transmits the ensemble sound to each terminal device.
[0004]
Further, in the karaoke system, a duet is made possible by mutually transmitting the karaoke voices input to each of the plurality of karaoke apparatuses via a communication network.
[0005]
[Problems to be solved by the invention]
In a communication system as described above, time delay arising from the distance between the devices becomes a problem. As in the karaoke system above, this is not an issue when the terminal devices are located relatively close together, for example in the same store. When the Internet is used as the network, however, delay in the received data becomes difficult to avoid as the distance between the devices grows, owing to various factors such as the physical distance, the presence of intervening servers, and the communication protocol, and the system is inadequate for real-time performance of live audio.
[0006]
Further, with the method of transmitting operation information as a MIDI format file, as in the network performance system, real-time performance is almost impossible because an entire piece of music must be converted to MIDI format at once; moreover, it was difficult to completely reproduce the timbre, volume and sound quality intended by the performer.
[0007]
An object of the present invention is to make it possible to appropriately synthesize and output data such as voice that is transmitted to and received from other communication devices.
[0008]
[Means for Solving the Problems]
In order to solve the above problems, the invention described in claim 1 is
a communication terminal (for example, the performer terminal 20 in FIG. 1) that, while transmitting and receiving voice data to and from another communication terminal connected via a communication line, synthesizes the voice data received from the other communication terminal with input voice data and reproduces the result as synthesized voice,
characterized by comprising creation means (for example, the performer terminal 20 in FIG. 10; step S240 in FIG. 16) for creating voice communication data by adding predetermined synchronization adjustment data (for example, the transmission terminal identification number and packet number in FIG. 5) to the input voice data.
[0009]
The invention described in claim 10 is
a program for causing a computer (for example, the performer terminal 20 in FIG. 1), which is a communication terminal that transmits and receives voice data to and from another communication terminal connected via a predetermined communication line, to realize:
a creation function (for example, step S240 in FIG. 16) of creating voice communication data by adding predetermined synchronization adjustment data (for example, the transmission terminal identification number and the packet number in FIG. 5) to input voice data;
a receiving function (for example, steps S221 to S224 in FIG. 15) of receiving voice communication data, with synchronization adjustment data added, transmitted from the other communication terminal; and
a voice synthesis and reproduction function (FIG. 17, S251 to S254) of reproducing, as synchronized synthesized voice, the voice data contained in the voice communication data received by the receiving function and the voice data contained in the input voice data to which the creation function has added synchronization adjustment data, each in accordance with its corresponding synchronization adjustment data.
[0010]
According to the first or tenth aspect of the present invention, communication data in which predetermined synchronization adjustment data is added to input audio data can be created.
[0011]
Further, as in the invention according to claim 2,
the communication terminal according to claim 1 may further comprise:
transmitting means (for example, the communication device 33 and the CPU 21 in FIG. 10; S239 to S244 in FIG. 16) for transmitting the voice communication data created by the creation means, to which the synchronization adjustment data has been added;
receiving means (for example, the communication device 33 and the CPU 21 in FIG. 10; S221 to S224 in FIG. 15) for receiving communication data transmitted from another device; and
reproducing means (for example, the CPU 21 and the DSP 22 in FIG. 10; S251 to S254 in FIG. 17) for reproducing the audio data included in the communication data received by the receiving means as a synthesized voice synchronized with the audio data output from the audio input means, in accordance with the synchronization adjustment data included in the communication data,
in which case the following effect is obtained.
[0012]
That is, the voice data included in the voice communication data received from the other communication terminal and the voice data included in the voice communication data created by the creation means can be reproduced as a synthesized voice synchronized according to the respective synchronization adjustment data.
[0013]
Here, as in the invention according to claim 3,
The communication terminal according to claim 2,
The synchronization adjustment data includes order data,
Further comprising other data storage means (for example, an output buffer 22b in FIG. 10) for accumulating and storing the voice communication data received by the reception means;
The reproducing means may read out the voice communication data stored in the other data storage means in accordance with the order data included in the voice communication data and reproduce the voice communication data as a synthesized voice synchronized with the input voice data.
[0014]
According to the third aspect of the invention, the synthesized speech can be reproduced in a correct order regardless of the order in which the voice communication data is received.
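The reordering described for claim 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `ReorderBuffer`, its methods, and the choice to drop late duplicates are assumptions introduced here. Received packets are accumulated in any order and read out strictly by packet number (the "order data").

```python
from typing import Dict, Iterator, Tuple


class ReorderBuffer:
    """Hypothetical store that releases packets in packet-number order."""

    def __init__(self, first_number: int = 0) -> None:
        self._pending: Dict[int, bytes] = {}
        self._next = first_number  # next packet number due for playback

    def store(self, packet_number: int, audio: bytes) -> None:
        # Assumption: packets older than the playback position are dropped.
        if packet_number >= self._next:
            self._pending[packet_number] = audio

    def read_ready(self) -> Iterator[Tuple[int, bytes]]:
        # Yield consecutive packets starting from the next expected number.
        while self._next in self._pending:
            yield self._next, self._pending.pop(self._next)
            self._next += 1


# Packets arrive out of order but are read out in order.
buf = ReorderBuffer()
for number, audio in [(1, b"B"), (0, b"A"), (3, b"D"), (2, b"C")]:
    buf.store(number, audio)
ordered = list(buf.read_ready())
```

If a packet number is missing, `read_ready` stops at the gap; how long a real terminal would wait before skipping it is not stated in the source.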
[0015]
Also, as in the invention according to claim 4,
The communication terminal according to claim 2 or 3,
The creation means includes reproduction condition addition means (for example, the DSP 22 and the CPU 21 in FIG. 10; S236 in FIG. 16) for further adding reproduction condition data to the voice communication data to be created,
and the reproducing means (for example, the DSP 22 in FIG. 10; S228 in FIG. 15) may be configured to reproduce the audio data included in the voice communication data received by the receiving means based on the reproduction condition data included in that voice communication data.
[0016]
Here, the reproduction condition data is data specifying the conditions under which the reproducing means reproduces the audio data. Specifically, it includes the degree of echo, reverb (reverberation), and tone control, the volume balance, the presence or absence of stereo conversion, and the L/R ratio used for stereo conversion.
[0017]
Therefore, according to the invention described in claim 4, each piece of voice communication data can be processed individually, so that timbre, volume, and sound quality are reproduced more accurately and a more suitable synthesized voice can be generated. The reproduction condition data may be set in response to a user's input instruction, or may be determined by the communication terminal as appropriate according to conditions such as the reproduction place.
[0018]
Also, as in the invention according to claim 5,
In the communication terminal according to any one of claims 1 to 4,
The creation means may be configured to sequentially create the voice communication data by adding the synchronization adjustment data to each predetermined unit of the input audio data (S231 to S238 in FIG. 16).
[0019]
According to the fifth aspect of the present invention, voice communication data is created sequentially by adding the synchronization adjustment data to each predetermined unit of the voice data, which makes the voice communication data easy to handle. For example, by packetizing the voice data and using the Internet, a typical IP network, as the predetermined communication line, voice data can be exchanged with a larger number of other devices over a wider area.
[0020]
Further, as in the invention according to claim 6,
in the communication terminal according to any one of claims 2 to 5,
inquiry means (for example, the CPU 21 in FIG. 10; S13, S14, S18, S21, and S22 in FIG. 13) may be further provided for transmitting desired communication condition data to a server (for example, the management server 10 in FIG. 1) connected via the predetermined communication line and for receiving the communication address of another communication terminal whose desired communication condition data conforms to the transmitted data,
and the transmitting means and the receiving means may be configured to transmit and receive voice communication data based on the communication address received by the inquiry means.
[0021]
Here, the desired communication condition is a condition that the other communication terminal with which communication data is exchanged is desired to satisfy, and includes, for example, the date and time at which voice data is to be transmitted and received and the content of that data.
[0022]
Therefore, according to the invention described in claim 6, when the communication terminal of the present invention is applied, for example, to an apparatus that realizes a real-time band performance by exchanging performance data with other communication terminals, another device that satisfies the desired communication conditions can be found by transmitting conditions such as the desired music piece, part, and date and time to the server.
[0023]
Also, as in the invention according to claim 7,
The communication terminal according to any one of claims 2 to 6,
encryption means (for example, the CPU 21 in FIG. 10; S243 in FIG. 16) for encrypting the voice communication data transmitted by the transmitting means; and
decryption means (for example, the CPU 21 in FIG. 10; S222 in FIG. 15) for decrypting the voice communication data received by the receiving means
may be further provided.
[0024]
According to the seventh aspect of the present invention, the communication data to be transmitted is encrypted and the received communication data is decrypted, so that security during data communication via the predetermined communication line can be enhanced.
[0025]
The invention according to claim 8 is
A communication terminal (for example, the performer terminal 20) that, while transmitting and receiving data to and from another communication terminal connected via a predetermined communication line, immediately synthesizes and reproduces the data received from the other communication terminal and input data,
characterized by comprising creation means (for example, the performer terminal 20 in FIG. 10; step S240 in FIG. 16) for creating communication data by adding predetermined synchronization adjustment data (for example, the transmission terminal identification number and packet number in FIG. 5) to the input data.
[0026]
According to the eighth aspect of the invention, communication data in which predetermined synchronization adjustment data is added to the input data can be created. Therefore, for example, the input data of the communication terminal and the data included in the communication data received from another communication terminal via the predetermined communication line can be reproduced in accurate synchronization according to the corresponding synchronization adjustment data. As a result, these data can be synthesized and reproduced immediately while communication data is exchanged with other devices, without being affected by the distance between the own device (the communication terminal) and the other devices (other communication terminals).
[0027]
The invention according to claim 9 is
A communication terminal (for example, the performer terminal 20) that, while transmitting and receiving data to and from another communication terminal connected via a server, immediately synthesizes and reproduces input data and the data received from the other communication terminal,
characterized by comprising creation means for creating communication data by adding predetermined synchronization adjustment data to the input data.
[0028]
According to the ninth aspect, communication data in which predetermined synchronization adjustment data is added to input data can be created. Therefore, for example, the input data of the communication terminal and the data included in the communication data received from another communication terminal via the server can be reproduced in accurate synchronization according to the corresponding synchronization adjustment data. As a result, these data can be synthesized and reproduced immediately while communication data is exchanged with other devices, without being affected by the distance between the own device (the communication terminal) and the other devices (other communication terminals). Further, since the only data transmission destination of the communication terminal is the server, the load of transmitting communication data does not increase even if the number of other devices (other communication terminals) increases.
[0029]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In the following, a performance system to which the present invention is applied will be described as an example, but the application of the present invention is not limited to this.
[0030]
Here, the performance system is a system in which a plurality of performer terminals (communication terminals) connected via the Internet simultaneously perform a piece of music, each playing the part assigned to it (that is, an ensemble). Each performer terminal participating in the ensemble outputs an ensemble sound obtained by synthesizing the performance sound of the part input by its user with the performance sounds of the parts input at the other performer terminals. The parts assigned to the performer terminals may all differ, or the same part may be assigned to a plurality of performer terminals.
[0031]
FIG. 1 is a diagram showing a configuration of a performance system 1 according to the present embodiment.
As shown in FIG. 1, the performance system 1 includes a management server 10 and a plurality of performer terminals 20, and these devices are connected to the Internet N. Although three performer terminals 20 are shown in the figure, any number of performer terminals may be used.
[0032]
The management server 10 is realized by a known server device to which a CPU (Central Processing Unit), a RAM (Random Access Memory), a storage device, a communication device, and the like are connected via a system bus. The management server 10 has a Web server function of providing a Web page file.
[0033]
The performer terminal 20 is realized by a well-known PC (Personal Computer) in which a CPU, a RAM, a storage device, an input device, a display device, a communication device, and the like are connected via a system bus.
[0034]
FIG. 2 is a diagram showing the flow of data between the devices in the performance system 1. FIG. 2A shows the state before the ensemble, and FIG. 2B shows the state during the ensemble. Note that these data are encrypted as appropriate before being transmitted and received via the Internet N.
[0035]
According to FIG. 2A, the performer terminal 20 transmits performance condition data according to the user's instruction input to the management server 10. The performance condition data includes data of a music piece, a part, and a date and time that the user desires to perform. Therefore, the management server 10 receives the performance condition data transmitted from each performer terminal 20 ((1) in the figure).
[0036]
Further, the management server 10 specifies the performer terminals 20 that will participate in the ensemble based on the received performance condition data. It then transmits ensemble schedule data indicating the ensemble schedule to each of the specified performer terminals 20. The ensemble schedule data includes data indicating the date and time at which the ensemble is scheduled, the music piece, and the performer terminals 20 participating in the ensemble. Therefore, each of the performer terminals 20 scheduled to participate in the ensemble receives the ensemble schedule data transmitted from the management server 10 ((2) in the figure).
[0037]
On the other hand, according to FIG. 2B, at the scheduled date and time of the ensemble, the management server 10 instructs each of the performer terminals 20 scheduled to participate in the ensemble to start playing ((3) in the figure).
[0038]
The performer terminal 20 instructed by the management server 10 to start playing realizes the ensemble by transmitting and receiving performance data to and from the other performer terminals 20 participating in the ensemble, based on the previously received ensemble schedule data. At this time, each performer terminal 20 synthesizes the performance sound data input by its user (hereinafter referred to as “performance data” as appropriate) with the performance data received from the other performer terminals 20, and outputs the result as ensemble sound.
[0039]
That is, as shown in FIG. 3, when the user of the performer terminal 20 plays a musical instrument 41 such as an electric guitar connected to the performer terminal 20 (indicated as “live gear” in the figure), performance data corresponding to the performance is transmitted to the other performer terminals 20 via the Internet N. In addition, the performer terminal 20 receives the performance data of the co-performers via the Internet N, synthesizes it with its own performance data, and outputs the result from the speaker 42 as ensemble sound. The user of the performer terminal 20 can therefore listen to the ensemble sound as if performing together with the co-performers (that is, the users of the other performer terminals 20).
[0040]
Further, in the present embodiment, UDP / IP is adopted as a communication protocol. That is, data communication between the management server 10 and the performer terminal 20 via the Internet N is realized by transmission and reception of IP packets based on IP addresses assigned to the respective devices.
[0041]
FIG. 4 is a diagram showing the format of the IP header included in an IP packet. According to the figure, the IP header consists of the version of the IP standard with which the packet complies (that is, “6”), the priority of the IP packet, a flow label for guaranteeing reservation of communication bandwidth, a payload length indicating the size of the actual data, a next header indicating the ID of the header that follows, a hop limit indicating the maximum number of router relays, a source IP address indicating the IP address of the source device (the management server 10 or a performer terminal 20), and a destination IP address indicating the IP address of the destination device (the management server 10 or a performer terminal 20).
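The fields listed above match the 40-byte fixed header of IP version 6. As an illustrative sketch only, the following parses such a header using the current IPv6 bit layout (8-bit traffic class, 20-bit flow label); mapping the patent's "priority" field to the traffic class is an assumption, and the names `IPv6Header` and `parse_ipv6_header` are hypothetical.

```python
import ipaddress
import struct
from typing import NamedTuple


class IPv6Header(NamedTuple):
    version: int
    traffic_class: int  # assumed to correspond to the "priority" field
    flow_label: int
    payload_length: int
    next_header: int
    hop_limit: int
    source: ipaddress.IPv6Address
    destination: ipaddress.IPv6Address


def parse_ipv6_header(packet: bytes) -> IPv6Header:
    if len(packet) < 40:
        raise ValueError("an IPv6 fixed header is 40 bytes long")
    # First 32 bits: version (4), traffic class (8), flow label (20).
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", packet[:8]
    )
    return IPv6Header(
        version=first_word >> 28,
        traffic_class=(first_word >> 20) & 0xFF,
        flow_label=first_word & 0xFFFFF,
        payload_length=payload_length,
        next_header=next_header,
        hop_limit=hop_limit,
        source=ipaddress.IPv6Address(packet[8:24]),
        destination=ipaddress.IPv6Address(packet[24:40]),
    )


# Sample header: version 6, next header 17 (UDP), hop limit 64.
sample = struct.pack("!IHBB", (6 << 28) | 0x12345, 1032, 17, 64)
sample += ipaddress.IPv6Address("2001:db8::1").packed
sample += ipaddress.IPv6Address("2001:db8::2").packed
header = parse_ipv6_header(sample)
```

The addresses are documentation-range examples, not values from the source.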
[0042]
FIG. 5 is a diagram showing the data configuration of the actual data (payload part) included in an IP packet. Of the actual data (payload part) of IP packets, the figure shows the data configuration of the performance data transmitted and received between the plurality of performer terminals 20 during an ensemble.
According to the figure, actual data included in an IP packet is composed of an instruction header and audio data.
[0043]
The instruction header includes: a performance data identifier indicating that the IP packet carries performance data for the performance system 1; a transmission terminal identification number indicating the source performer terminal 20; a packet number indicating the ordinal position of the IP packet among those transmitted from the source performer terminal 20; an audio recording method (that is, “PCM”) indicating the recording (encoding) method of the audio data, together with its parameters, namely the sampling rate (sampling frequency), the number of bits (quantization bit number), and the bit rate; and effect conditions specifying the effects to be applied to the audio data.
[0044]
Here, the transmission terminal identification number and the packet number are collectively referred to as “synchronization adjustment data”. The synchronization adjustment data is used for synthesizing the performance data synchronously as described later.
[0045]
The effect means a sound effect to be added to audio data, and includes, for example, echo and reverb (reverberation), tone control, change in volume balance, presence or absence of stereo conversion, and the like.
[0046]
The audio data is obtained by converting the performance sound input by the user, which is an analog signal, into a digital signal by the PCM (Pulse Code Modulation) method, and each IP packet contains 512 samples of it. That is, the performance data is exchanged between the plurality of performer terminals 20 in units of 512 samples. Hereinafter, the audio data for one packet (that is, 512 samples) is referred to as an “audio data packet” as appropriate.
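To make the payload layout concrete, the following sketch packs an instruction header and 512 PCM samples into bytes and parses them back. The source does not give field widths, byte order, or sample format, so everything in `HEADER`, the `b"PERF"` identifier, the 16-bit samples, and the 44.1 kHz default are assumptions chosen for illustration.

```python
import struct
from typing import List, Sequence, Tuple

SAMPLES_PER_PACKET = 512  # stated in the source
# Assumed layout: identifier, terminal ID, packet number, recording method,
# sampling rate, bits per sample, bit rate, 4 opaque bytes of effect conditions.
HEADER = struct.Struct("!4sHIBIBI4s")
MAGIC = b"PERF"  # hypothetical performance data identifier
PCM = 0          # hypothetical code for the PCM recording method


def build_payload(terminal_id: int, packet_number: int,
                  samples: Sequence[int], sampling_rate: int = 44100,
                  effects: bytes = b"\x00" * 4) -> bytes:
    if len(samples) != SAMPLES_PER_PACKET:
        raise ValueError("one audio data packet holds exactly 512 samples")
    bit_depth = 16
    bit_rate = sampling_rate * bit_depth  # mono bit rate
    header = HEADER.pack(MAGIC, terminal_id, packet_number, PCM,
                         sampling_rate, bit_depth, bit_rate, effects)
    return header + struct.pack(f"!{SAMPLES_PER_PACKET}h", *samples)


def parse_payload(payload: bytes) -> Tuple[Tuple[int, int], List[int]]:
    fields = HEADER.unpack_from(payload)
    if fields[0] != MAGIC:
        raise ValueError("not performance data")
    samples = struct.unpack_from(f"!{SAMPLES_PER_PACKET}h", payload, HEADER.size)
    # (terminal ID, packet number) together form the synchronization adjustment data.
    return (fields[1], fields[2]), list(samples)


def packet_duration_ms(sampling_rate: int) -> float:
    # Playback time covered by one 512-sample packet.
    return 1000.0 * SAMPLES_PER_PACKET / sampling_rate


payload = build_payload(terminal_id=2, packet_number=7, samples=[0] * 512)
sync, audio = parse_payload(payload)
```

Under the assumed 44.1 kHz rate, one packet covers roughly 11.6 ms of sound, which gives a sense of the granularity at which the terminals exchange data.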
[0047]
FIG. 6 is a diagram showing the communication protocol stack in the performance system 1. According to the figure, the communication protocol consists of, in order from the lowest layer, an Ethernet (registered trademark) layer (PPP over Ethernet), an IP layer (network layer), a UDP layer (transport layer), and an application layer (including an API).
[0048]
That is, an IP packet received by the performer terminal 20 from another performer terminal 20 is passed to the IP layer via the Ethernet (registered trademark) layer, where it is decrypted. It is then passed to the application layer via the UDP layer. In the application layer, the audio data included in the actual data of the IP packet is extracted by the CPU 21, as described later, and is input to the DSP 22 together with the instruction header as one audio data packet.
[0049]
The performance data input by the user to the performer terminal 20 is converted into audio data packets by the DSP 22 in the application layer and output to the CPU 21 as described later. Then, the data is transferred to the UDP layer and subsequently to the IP layer, where the encryption processing is performed in the IP layer and the IP packet is formed. Thereafter, the generated IP packet is transmitted to another performer terminal 20 via the Ethernet (registered trademark) layer.
[0050]
Here, encryption / decryption of data realized in the IP layer will be described.
Encryption means converting data according to a predetermined rule when digital data such as documents and images is exchanged over a network such as the Internet, so that the data is extremely difficult to decipher and cannot be stolen or tampered with by a third party during communication. Restoring encrypted data to a readable form is called decryption.
[0051]
In general, a "key" corresponding to a cipher table is used for encryption and decryption. There are public key cryptosystems, which use a pair of two keys, and common key cryptosystems, which use the same key for both operations. The former include RSA, ElGamal encryption, and elliptic curve cryptography; the latter include DES, which was standardized by the U.S. government, as well as IDEA, FEAL, and MISTY. In the present embodiment, any of these may be used.
[0052]
Next, the internal configuration of the management server 10 and the performer terminal 20 will be described with reference to FIGS.
[0053]
FIG. 7 is a block diagram illustrating a configuration of the management server 10.
According to the figure, the management server 10 includes a CPU 11, a storage device 12, a RAM 13, and a communication device 14, and these units are connected by a system bus 15 so that they can exchange data.
[0054]
The CPU 11 centrally controls each unit constituting the management server 10 based on a program stored in the storage device 12. Specifically, in response to a signal input from the communication device 14, it reads out the program stored in the storage device 12, temporarily stores it in the RAM 13, and executes processing based on the program, thereby causing the management server 10 to function. At that time, the CPU 11 stores the processing result in a predetermined area in the RAM 13 and transmits the processing result from the communication device 14 to an external device as necessary.
[0055]
In addition, the CPU 11 executes a later-described ensemble process (see FIG. 13) as a characteristic portion of the present embodiment.
[0056]
Specifically, in the ensemble processing, on receiving the performance condition data transmitted from a performer terminal 20, the CPU 11 stores the performance condition data in the performance condition data storage area 13a in the RAM 13. The CPU 11 then searches the data stored in the performance condition data storage area 13a for data conforming to the received performance condition data, and specifies the performer terminals 20 that can participate in the ensemble. It then generates ensemble schedule data including the scheduled date and time of the ensemble, the music piece, and the IP addresses of the specified performer terminals 20.
[0057]
Thereafter, the generated ensemble schedule data is stored in the ensemble schedule data storage area 13b in the RAM 13, and the generated ensemble schedule data is transmitted to each of the specified performer terminals 20. Further, when a predetermined time comes before the scheduled ensemble date and time, the CPU 11 instructs the performer terminals 20 scheduled to participate in the ensemble to start playing.
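The server-side matching can be sketched as below. The source does not define when two sets of performance conditions "conform", so this sketch assumes a match on identical music piece and date and time, and assumes an ensemble is scheduled once a minimum number of terminals (here two) match. All class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PerformanceCondition:
    piece: str
    part: str
    date_time: str
    ip_address: str


@dataclass
class EnsembleSchedule:
    date_time: str
    piece: str
    parts: Dict[str, List[str]]  # part -> IP addresses of assigned terminals


class ConditionStore:
    """Stand-in for the performance condition data storage area."""

    def __init__(self, min_terminals: int = 2) -> None:
        self._stored: List[PerformanceCondition] = []
        self._min = min_terminals  # assumed threshold, not from the source

    def receive(self, condition: PerformanceCondition) -> Optional[EnsembleSchedule]:
        self._stored.append(condition)
        # Assumed conformity rule: same piece and same date and time.
        matches = [c for c in self._stored
                   if c.piece == condition.piece
                   and c.date_time == condition.date_time]
        if len(matches) < self._min:
            return None
        # Matched entries leave the store once a schedule is generated.
        self._stored = [c for c in self._stored if c not in matches]
        parts: Dict[str, List[str]] = {}
        for c in matches:
            parts.setdefault(c.part, []).append(c.ip_address)
        return EnsembleSchedule(condition.date_time, condition.piece, parts)


store = ConditionStore()
first = store.receive(
    PerformanceCondition("Song X", "guitar", "2002-09-01T20:00", "2001:db8::a"))
second = store.receive(
    PerformanceCondition("Song X", "bass", "2002-09-01T20:00", "2001:db8::b"))
```

Here `first` is `None` because no partner exists yet, and `second` carries the schedule that would be sent to both terminals.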
[0058]
The storage device 12 stores various processing programs related to the operation of the management server 10, a program for realizing the functions of the present embodiment (specifically, the ensemble program 12a), and processing data related to the execution of these programs.
[0059]
The RAM 13 includes a program area (not shown) into which each program executed by the CPU 11 is expanded, and a work memory area for temporarily storing input instructions, input data, processing results generated when a program is executed, and the like. In the work memory area, a performance condition data storage area 13a and an ensemble schedule data storage area 13b are formed.
[0060]
FIG. 8 is a diagram showing a configuration of data stored in the performance condition data storage area 13a.
According to the figure, a performance condition and an IP address are stored in the performance condition data storage area 13a in association with each other. The performance conditions correspond to the performance condition data received from the performer terminal 20, and include a desired music piece, part, and date and time. The stored data is searched for compatible data, and when the ensemble schedule data is generated by the CPU 11, the corresponding data is deleted from the performance condition data storage area 13a.
[0061]
FIG. 9 is a diagram showing a configuration of data stored in the ensemble schedule data storage area 13b.
According to the figure, the ensemble schedule data storage area 13b stores the ensemble schedule data, that is, the date and time at which the ensemble is scheduled, the music piece, and the IP address of the performer terminal 20 assigned to each part. The ensemble schedule data are stored in order of how close the scheduled date and time of the ensemble is to the current time. When an ensemble starts, the corresponding data is deleted from the ensemble schedule data storage area 13b.
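One way to keep schedules ordered by scheduled time and remove them when the ensemble starts is a priority queue, sketched below. The source does not say how the ordering is maintained; the heap, the `ScheduleArea` class, and the tuple layout are assumptions made for illustration.

```python
import heapq
from datetime import datetime
from typing import Dict, List, Tuple

# (scheduled start, music piece, part -> IP address); layout is hypothetical.
Schedule = Tuple[datetime, str, Dict[str, str]]


class ScheduleArea:
    """Stand-in for the ensemble schedule data storage area."""

    def __init__(self) -> None:
        self._heap: List[Tuple[datetime, int, Schedule]] = []
        self._counter = 0  # tie-breaker so equal times never compare dicts

    def add(self, schedule: Schedule) -> None:
        heapq.heappush(self._heap, (schedule[0], self._counter, schedule))
        self._counter += 1

    def pop_due(self, now: datetime) -> List[Schedule]:
        # Remove and return every schedule whose start time has arrived.
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due

    def __len__(self) -> int:
        return len(self._heap)


area = ScheduleArea()
area.add((datetime(2002, 9, 1, 21, 0), "Song Y", {"vocal": "2001:db8::c"}))
area.add((datetime(2002, 9, 1, 20, 0), "Song X",
          {"guitar": "2001:db8::a", "bass": "2001:db8::b"}))
started = area.pop_due(datetime(2002, 9, 1, 20, 0))
```

After the call, `started` holds only the 20:00 ensemble and the later one remains stored.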
[0062]
The communication device 14 is an interface for performing data communication with another device (mainly, the performer terminal 20) via the Internet N.
[0063]
FIG. 10 is a block diagram showing a configuration of the performer terminal 20.
According to the figure, the performer terminal 20 includes a CPU 21, a DSP (Digital Signal Processor) 22, a storage device 23, a RAM 24, an input device 25, a multiplexer 26, a sample and hold circuit 27, an A / D (Analog-Digital) conversion circuit 28. , D / A (Digital-Analog) conversion circuit 29, amplifier 30, display drive circuit 31, display device 32, and communication device 33.
[0064]
The CPU 21 centrally controls each unit constituting the performer terminal 20 based on a program stored in the storage device 23. Specifically, in response to a signal input from the communication device 33 or the input device 25, it reads out a program stored in the storage device 23, temporarily stores it in the RAM 24, and executes processing based on the program, thereby causing the performer terminal 20 to function. At this time, the CPU 21 stores the processing result in a predetermined area in the RAM 24, transmits the processing result from the communication device 33 to an external device as necessary, and causes the display device 32 to display the processing result.
[0065]
Further, the CPU 21 executes ensemble processing (see FIG. 13), link establishment processing (see FIG. 14), performance data reception processing (see FIG. 15), and performance data transmission processing (see FIG. 16), which are characteristic parts of the present embodiment.
[0066]
Specifically, in the ensemble processing, the CPU 21 transmits the performance condition data according to the user's input instruction to the management server 10 together with the IP address of the performer terminal 20, and receives the ensemble schedule data transmitted from the management server 10. When the management server 10 instructs the start of the performance, the CPU 21 executes the link establishment process and then the performance process. In the performance process, the CPU 21 executes the performance data reception process and the performance data transmission process, and instructs the DSP 22 to execute the performance data reception process (see FIG. 15), the performance data transmission process (see FIG. 16), and the ensemble sound output process (see FIG. 17).
[0067]
In the link establishment process, the CPU 21 transmits the connection conditions input by the user to other performer terminals 20 participating in the ensemble based on the received ensemble schedule data. At the same time, the connection condition is collated with the connection condition received from another performer terminal 20, and the connection condition in the ensemble is determined.
[0068]
Here, the connection condition includes a communication condition and an effect condition.
The communication condition is a condition for data communication between the performer terminals 20; specifically, it covers the coding method of the input performance sound (including parameters such as the sampling frequency and the quantization bit number) and the connection rate (transmission rate) used for data communication with the other performer terminals 20. The communication condition is determined either by a user's instruction or automatically, from among the conditions that can be set given the specifications of each performer terminal 20.
[0069]
The effect condition is, as described above, a condition relating to the acoustic effects applied to the performance sound. Specifically, it covers the degree of echo, reverb (reverberation), and tone control, the volume balance, the presence or absence of stereo conversion, and the L/R ratio. For example, the degree of echo, reverb, and tone control applied to the ensemble sound is determined according to the assumed performance place (for example, a concert hall, an outdoor stage, or a live music club), and the volume balance of each part, or its left/right balance (L/R ratio) when output as stereo sound, is determined according to the position of each part on the assumed stage (for example, vocals in the center, guitar on the right, and bass on the left).
[0070]
In the performance data receiving process, the CPU 21 extracts audio data included in the actual data of the IP packet input from the communication device 33, and outputs it as an audio data packet to the DSP 22 together with the instruction header.
[0071]
In the performance data transmission process, the CPU 21 generates an IP packet in which a predetermined IP header or the like is added to the audio data packet input from the DSP 22 and outputs the IP packet to the communication device 33.
[0072]
The DSP 22 is a processor specialized for high-speed processing of digital data. The DSP 22 executes performance data reception processing (see FIG. 15), performance data transmission processing (see FIG. 16), and ensemble audio output processing (see FIG. 17) as characteristic parts of the present embodiment.
[0073]
Specifically, in the performance data receiving process, the DSP 22 performs a predetermined sound effect process on the audio data packet input from the CPU 21, and then stores it in a predetermined area of the performance data buffer 24b.
[0074]
In the performance data transmission process, the DSP 22 stores the audio data input from the A/D conversion circuit 28 in the input buffer 24a, and outputs each 512 samples of audio data to the CPU 21 as one audio data packet. At the same time, after performing predetermined sound effect processing on the audio data packet, it stores the packet in a predetermined area of the performance data buffer 24b.
[0075]
In the ensemble audio output process, the DSP 22 synthesizes audio data packets stored in the performance data buffer 24b, and outputs the synthesized audio data to the D / A conversion circuit 29.
[0076]
FIG. 11 is a diagram illustrating the concept of the processing performed by the CPU 21 and the DSP 22 on audio data packets. The figure shows the case where the performance system 1 consists of the own performer terminal (hereinafter referred to as “performer terminal A” as appropriate) and two other performer terminals 20 (hereinafter referred to as “performer terminal B” and “performer terminal C”, respectively, as appropriate).
[0077]
The performance data received by the performer terminal 20 is input to the CPU 21 as IP packets, as described above. In the figure, each IP packet is simplified and represented only by its synchronization adjustment data, consisting of the transmission terminal identification number and the packet number, and its audio data; the packets are shown from left to right in the order in which they are input to the CPU 21.
[0078]
The transmission terminal identification number “A” indicates the performer terminal A, the transmission terminal identification number “B” indicates the performer terminal B, and the transmission terminal identification number “C” indicates the performer terminal C.
[0079]
The audio data “data Ah” indicates the audio data packet of packet number “h” of the performer terminal A itself, the audio data “data Bi” indicates the audio data packet of packet number “i” received from the performer terminal B, and the audio data “data Cj” indicates the audio data packet of packet number “j” received from the performer terminal C.
[0080]
According to FIG. 11A, the CPU 21 extracts the audio data from each input IP packet. The audio data is formed into one audio data packet, sorted by source performer terminal 20 according to the attached transmission terminal identification number, and rearranged according to the packet number. That is, FIG. 11A shows the packets sorted into rows for the performer terminals A, B, and C, in that order from the top of the figure.
[0081]
Next, according to FIG. 11B, the DSP 22 performs acoustic processing on each of these audio data packets based on the designated effect condition. Specifically, effects such as echo, reverb, and tone control are added to the specified degree, and the data is converted into stereo data having the specified L/R ratio.
[0082]
For example, a predetermined effect is added to the audio data “data A0”, and stereo data having the specified L / R ratio, that is, L-channel audio data “data A0L” and R-channel audio data “data A0R”, is generated.
[0083]
Similarly, a predetermined effect is added to the audio data “data B0” and “data C0”, and the L-channel audio data “data B0L” and “data C0L” and the R-channel audio data “data B0R” and “data C0R” are generated at the specified L / R ratio.
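The stereo conversion can be illustrated with a minimal sketch. The specification does not state how the L / R ratio is applied, so the linear split below (a share `left_ratio` to the L channel and the remainder to the R channel) is an assumption, as are the function name `to_stereo` and the sample values; the echo, reverb, and tone control effects are omitted.

```python
def to_stereo(samples, left_ratio):
    """Split mono samples into (L, R) lists; left_ratio in [0, 1] is the share sent to the L channel."""
    if not 0.0 <= left_ratio <= 1.0:
        raise ValueError("left_ratio must be between 0 and 1")
    right_ratio = 1.0 - left_ratio
    left = [s * left_ratio for s in samples]
    right = [s * right_ratio for s in samples]
    return left, right

# Hypothetical samples standing in for "data A0".
data_a0 = [100, -200, 300]
data_a0_l, data_a0_r = to_stereo(data_a0, 0.75)
```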
[0084]
Thus, the audio data packets subjected to the acoustic processing by the DSP 22 (that is, in the state of FIG. 11(b)) are stored in the performance data buffer 24b.
[0085]
After that, according to FIG. 11(c), the DSP 22 synthesizes the audio data packets subjected to the acoustic processing, and generates audio data for reproducing the ensemble audio. Specifically, among the audio data packets that have been subjected to the acoustic processing, those having the same packet number are synthesized for each channel.
[0086]
For example, for the packet number “0”, the audio data “data A0L”, “data B0L”, and “data C0L” are combined to generate the L-channel audio data “data A0L + B0L + C0L”, and the audio data “data A0R”, “data B0R”, and “data C0R” are combined to generate the R-channel audio data “data A0R + B0R + C0R”.
[0087]
Similarly, for the packet number “1”, the audio data “data A1L”, “data B1L”, and “data C1L” are combined to generate the L-channel audio data “data A1L + B1L + C1L”, and the audio data “data A1R”, “data B1R”, and “data C1R” are combined to generate the R-channel audio data “data A1R + B1R + C1R”.
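The per-channel synthesis of packets having the same packet number can be sketched as a sample-by-sample sum. The dictionary layout, the function name `mix_packet`, and the handling of a packet that has not arrived (it is simply left out of the sum) are assumptions for illustration; the specification does not describe that case here.

```python
def mix_packet(buffers, packet_no):
    """Sum, per channel, the stereo packets with the given packet number from every terminal."""
    packets = [buf[packet_no] for buf in buffers.values() if packet_no in buf]
    if not packets:
        return [], []
    length = len(packets[0][0])
    mixed_l = [sum(p[0][i] for p in packets) for i in range(length)]
    mixed_r = [sum(p[1][i] for p in packets) for i in range(length)]
    return mixed_l, mixed_r

# terminal -> {packet number: (L samples, R samples)}; values are hypothetical.
buffers = {
    "A": {0: ([1, 2], [3, 4])},
    "B": {0: ([10, 20], [30, 40])},
    "C": {0: ([100, 200], [300, 400])},
}
left, right = mix_packet(buffers, 0)
```

Here `left` corresponds to “data A0L + B0L + C0L” and `right` to “data A0R + B0R + C0R”.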
[0088]
Then, the DSP 22 outputs the audio data thus generated to the D / A conversion circuit 29.
[0089]
In FIG. 10, the storage device 23 stores various processing programs related to the operation of the performer terminal 20, programs for realizing the functions of the present embodiment (specifically, an ensemble program 23a for executing the ensemble processing of FIG. 13, a link establishment program 23b for executing the link establishment process of FIG. 14, a performance data reception program 23c for executing the performance data reception process of FIG. 15, a performance data transmission program 23d for executing the performance data transmission process of FIG. 16, and an ensemble audio output program 23e for executing the ensemble audio output process of FIG. 17), and processing data related to these programs.
[0090]
The RAM 24 has a program area (not shown) into which programs executed by the CPU 21 or the DSP 22 are expanded, and a work memory area for temporarily storing input instructions, input data, processing results generated when the programs are executed, and the like. An input buffer 24a and a performance data buffer 24b are formed in the work memory area.
[0091]
The input buffer 24a is an area for storing performance audio data (performance data) input by the user; specifically, an area capable of storing 512 samples of audio data input from the A / D conversion circuit 28 is secured.
[0092]
The performance data buffer 24b is an area in which the performance data of the performer terminal 20 itself and the performance data received from the other performer terminals 20 are stored. FIG. 12 shows the data configuration of the performance data buffer 24b.
[0093]
FIG. 12 is a diagram showing the data configuration of the performance data buffer 24b.
According to the figure, the performance data buffer 24b is formed of an area OUT [0] for storing the performance audio data (performance data) input to the performer terminal 20 itself, and areas OUT [n] (where n = 1, 2, ..., N, and N is the number of other performer terminals 20 participating in the ensemble) for storing the audio data received from the other performer terminals 20.
[0094]
These areas OUT [n] are associated one-to-one with the other performer terminals 20 participating in the ensemble, and each stores the audio data received from the corresponding performer terminal 20.
[0095]
Specifically, in these areas OUT [m] (where m = 0, 1, ..., N), audio data packets are stored according to their packet numbers. That is, the figure shows a state in which, for each area OUT [m], audio data packets are stored one by one from the left in the order of their packet numbers.
[0096]
The audio data packets stored here are data that has been subjected to the predetermined acoustic processing, specifically, data that has been given a predetermined effect and converted into stereo. That is, the figure shows a state in which, in each area OUT [m], the data for the L channel is stored in the upper half and the data for the R channel is stored in the lower half.
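The buffer layout described with reference to FIG. 12 can be modeled roughly as follows. The class name, the method names, and the use of a dictionary keyed by packet number for each area are assumptions for illustration; the specification describes only the areas OUT [0] to OUT [N] and the L / R halves.

```python
class PerformanceDataBuffer:
    """Areas OUT[0..N]; each maps a packet number to an (L samples, R samples) pair."""

    def __init__(self, num_other_terminals):
        # OUT[0] holds this terminal's own audio; OUT[1..N] belong to the other terminals.
        self.out = [dict() for _ in range(num_other_terminals + 1)]

    def store(self, area, packet_no, left, right):
        """Store one processed stereo packet in area OUT[area] under its packet number."""
        self.out[area][packet_no] = (list(left), list(right))

    def read(self, packet_no):
        """Return one (L, R) pair per area, or None where that packet is not stored."""
        return [area.get(packet_no) for area in self.out]

buf = PerformanceDataBuffer(num_other_terminals=2)
buf.store(0, 0, [1, 2], [3, 4])   # own performance, packet 0
buf.store(2, 0, [5, 6], [7, 8])   # packet 0 from the second other terminal
slots = buf.read(0)
```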
[0097]
In FIG. 10, the input device 25 is used by the user of the performer terminal 20 to input desired performance conditions, conditions for connection with the other performer terminals 20 participating in the ensemble, and the like. It is composed of a keyboard or touch panel having numeric keys and various function keys, or a pointing device such as a mouse, a trackball, a track point, or a pointing stick, and outputs operation signals corresponding to these operations to the CPU 21.
[0098]
The multiplexer 26 is a circuit that selects and outputs one signal from a plurality of input signals. Specifically, in accordance with an instruction signal (selection signal) input from the DSP 22, it selects one of the input performance sounds and outputs it to the sample-and-hold circuit 27.
[0099]
The sample-and-hold circuit 27 temporarily holds the wave height of the performance sound (analog signal) input from the multiplexer 26 in accordance with the input timing of the clock signal input from the DSP 22, and outputs it to the A / D conversion circuit 28.
[0100]
The A / D conversion circuit 28 is a circuit that converts an analog signal input from the sample and hold circuit 27 into a digital signal, and outputs the converted digital audio data to the DSP 22.
[0101]
The D / A conversion circuit 29 is a circuit that converts a digital signal input from the DSP 22 into an analog signal, and outputs the converted analog audio signal to the amplifier 30.
[0102]
The amplifier 30 amplifies the analog signal (analog audio signal) input from the D / A conversion circuit 29 to a predetermined level, and outputs the amplified signal to an externally connected speaker (corresponding to the speaker 42 in FIG. 3).
[0103]
The display drive circuit 31 is a circuit that controls the display device 32 based on a display signal input from the CPU 21 and displays various screens. The display device 32 includes a CRT (Cathode Ray Tube), an LCD (Liquid Crystal Display), or the like, and displays a display screen according to a display signal input from the CPU.
[0104]
The communication device 33 is an interface for performing data communication with other devices (mainly, the management server 10 and other performer terminals 20) via the Internet N.
[0105]
Next, the operation of the performance system 1 will be described with reference to FIGS.
[0106]
FIG. 13 is a flowchart for explaining the ensemble processing. This ensemble process is a process executed by the CPU 21 and the DSP 22 in accordance with the ensemble program 23a stored in the storage device 23.
[0107]
According to the figure, in the performer terminal 20, the CPU 21 accesses a Web site provided by the management server 10 in accordance with an input instruction from the user (step S11), and receives the HP information (information for displaying a Web page) transmitted from the management server 10 (step S12). Then, a participation registration screen (not shown) is displayed on the display device 32 based on the HP information.
[0108]
Next, when the user inputs performance condition data including a desired song, part, date and time, and the like in accordance with the input format on the participation registration screen (step S13), the CPU 21 transmits the performance condition data and the IP address of the performer terminal 20 to the management server 10 together with a participation registration request (step S14).
[0109]
On the other hand, in the management server 10, the CPU 11 receives the desired performance condition data and the IP address transmitted from the performer terminal 20 (step S15), and additionally stores the performance condition data and the IP address, in association with each other, in the performance condition data storage area 13a in the RAM 13.
[0110]
Next, the CPU 11 searches the data stored in the performance condition data storage area 13a for data that matches the received performance condition data, and specifies the performer terminals 20 that can participate in the ensemble (step S16). More specifically, it searches for data in which the desired song and the date and time match and the part differs.
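The matching rule used in the search of step S16 (same song and same date and time, different part) can be sketched as a simple filter. The record fields `song`, `datetime`, `part`, and `ip`, the function name, and the example values (addresses taken from the documentation range 192.0.2.0/24) are assumptions for illustration.

```python
def find_ensemble_partners(stored, request):
    """Return stored entries with the same song and date/time but a different part than `request`."""
    return [
        entry for entry in stored
        if entry["song"] == request["song"]
        and entry["datetime"] == request["datetime"]
        and entry["part"] != request["part"]
    ]

stored = [
    {"song": "Song X", "datetime": "2002-08-30 19:00", "part": "guitar", "ip": "192.0.2.1"},
    {"song": "Song X", "datetime": "2002-08-30 19:00", "part": "drums", "ip": "192.0.2.2"},
    {"song": "Song Y", "datetime": "2002-08-30 19:00", "part": "bass", "ip": "192.0.2.3"},
]
request = {"song": "Song X", "datetime": "2002-08-30 19:00", "part": "drums", "ip": "192.0.2.9"}
partners = find_ensemble_partners(stored, request)
```

In this example only the guitar entry qualifies: the drums entry has the same part as the request, and the bass entry is for a different song.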
[0111]
When data matching the performance conditions is found, the CPU 11 generates ensemble schedule data based on these data. The generated ensemble schedule data is additionally stored in the ensemble schedule data storage area 13b in the RAM 13, specifically at a position such that the entries are arranged in order of scheduled ensemble date and time, closest to the current date and time first. Then, the generated ensemble schedule data is transmitted (step S17). As described above, the ensemble schedule data includes the scheduled date and time of the performance, the piece to be performed, and the IP address of the performer terminal 20 assigned to each part. The CPU 11 then deletes the data identified as matching from the performance condition data storage area 13a.
[0112]
Although not shown, if in step S16 no data matching the received performance condition data is stored in the performance condition data storage area 13a, that is, if no performer terminal 20 that can participate in the ensemble can be specified, the CPU 11 again proceeds to step S12 and waits for subsequently received performance condition data.
[0113]
The CPU 11 of the management server 10 compares, as needed, the scheduled performance date and time stored at the head of the ensemble schedule data storage area 13b with the current date and time. When the time a predetermined period before the scheduled performance date and time arrives (step S19: YES), it instructs the corresponding performer terminals 20, that is, all the performer terminals 20 scheduled to participate in the ensemble, to start the performance (step S20).
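Keeping the schedule ordered by date and time and checking only its head entry can be sketched as below. The tuple layout (start time, piece, IP addresses), the function names, and the ten-minute lead time are assumptions for illustration; the specification states only that the start is instructed a predetermined time before the scheduled date and time.

```python
import bisect
from datetime import datetime, timedelta

def add_schedule(schedule, entry):
    """Insert (start_time, piece, ips) so that the earliest start time stays at the head."""
    bisect.insort(schedule, entry)

def due_for_start(schedule, now, lead_time):
    """True when `now` has reached the head entry's start time minus `lead_time`."""
    return bool(schedule) and now >= schedule[0][0] - lead_time

schedule = []
add_schedule(schedule, (datetime(2002, 8, 30, 20, 0), "Song Y", ("192.0.2.3",)))
add_schedule(schedule, (datetime(2002, 8, 30, 19, 0), "Song X", ("192.0.2.1", "192.0.2.2")))
due = due_for_start(schedule, datetime(2002, 8, 30, 18, 55), timedelta(minutes=10))
```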
[0114]
Then, in the performer terminal 20, when the start of the performance is instructed by the management server 10 (step S21), the CPU 21 executes a link establishment process (see FIG. 14) described later, presents and collates connection conditions with the other performer terminals 20 that are to participate in the ensemble, and establishes direct communication links with them (step S22).
[0115]
Thereafter, the CPU 21 realizes an ensemble by the plurality of performer terminals 20 by executing a performance process described later (step S23). When the performance process ends, the scheduled ensemble ends.
[0116]
The link establishment process executed in step S22 of FIG. 13 will now be described. FIG. 14 is a flowchart for explaining the link establishment process. As described above, this process is executed in step S22 of FIG. 13, by the CPU 21 and the DSP 22 in accordance with the link establishment program 23b stored in the storage device 23.
[0117]
According to the figure, in the performer terminal 20, the CPU 21 presents the connection conditions according to the input instruction from the user to each of the other performer terminals 20, and collates them with the connection conditions presented from each of the other performer terminals 20 (step S211). Note that the connection conditions collated here include the communication conditions and the effect conditions as described above.
[0118]
As a result of the collation, if it is determined that agreement with the connection conditions presented from another performer terminal 20 cannot be obtained (step S212: NO), the CPU 21 displays on the display device 32 the contents of the connection condition on which agreement was not reached, together with candidate alternatives (step S213).
[0119]
Then, the contents selected from these alternatives are presented to the other actor terminals 20 as new connection conditions, and the collation with the connection conditions presented from the other actor terminals 20 is performed again (step S214).
[0120]
If agreement is obtained as a result of the re-collation of the connection conditions (step S212: YES), the CPU 21 stores the agreed connection conditions, specifically, the connection conditions relating to the performer terminal 20 itself, in the RAM 24.
Thereafter, the CPU 21 ends the link establishment processing.
[0121]
Next, the performance processing executed in step S23 of FIG. 13 will be described.
In this performance process, the CPU 21 executes three processes, performance data reception process (see FIG. 15), performance data transmission process (see FIG. 16), and ensemble audio output process (see FIG. 17) in parallel. Then, when it is determined that the ensemble ends, the processing being executed is ended, and the performance processing is ended.
[0122]
FIG. 15 is a flowchart for explaining the performance data receiving process. The performance data receiving process is executed by the CPU 21 and the DSP 22 in accordance with the performance data receiving program 23c stored in the storage device 23.
[0123]
According to the figure, in the performer terminal 20, when an IP packet is input from the communication device 33 (step S221: YES), the CPU 21 performs IP layer protocol processing including decryption on the input IP packet (step S222), performs UDP layer protocol processing (step S223), extracts the audio data from the IP packet by application layer processing, and outputs it to the DSP 22 together with an instruction header as an audio data packet (step S224).
[0124]
Thereafter, the CPU 21 determines whether or not the end of the present process has been instructed (step S225). As a result of the determination, when the end has not been instructed (step S225: NO), the CPU 21 proceeds to step S221 again, and repeats the same processing for the subsequently input IP packet (steps S221 to S224).
[0125]
If it is determined in step S225 that the end of this process has been instructed (step S225: YES), the CPU 21 ends this performance data reception process.
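The application layer step that separates the synchronization adjustment data from the audio data can be sketched with a byte layout. The layout used here (a 1-byte terminal identification number, a 4-byte packet number, then 16-bit samples, all in network byte order) is purely hypothetical, as are the function names; decryption and the instruction header are omitted.

```python
import struct

# Hypothetical header: 1-byte terminal ID followed by a 4-byte unsigned packet number.
HEADER = struct.Struct("!cI")

def build_payload(terminal_id, packet_no, samples):
    """Pack the header and 16-bit signed samples into one payload."""
    body = struct.pack(f"!{len(samples)}h", *samples)
    return HEADER.pack(terminal_id.encode("ascii"), packet_no) + body

def parse_payload(payload):
    """Recover (terminal_id, packet_no, samples) from a payload made by build_payload."""
    tid, packet_no = HEADER.unpack_from(payload)
    count = (len(payload) - HEADER.size) // 2
    samples = list(struct.unpack_from(f"!{count}h", payload, HEADER.size))
    return tid.decode("ascii"), packet_no, samples

payload = build_payload("B", 7, [1, -2, 300])
parsed = parse_payload(payload)
```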
[0126]
On the other hand, when an audio data packet is input from the CPU 21 (step S226), the DSP 22 decodes the input audio data packet in accordance with the audio recording method included in the instruction header transferred together with it (step S227).
[0127]
Next, the DSP 22 performs acoustic processing on the decoded audio data packet in accordance with the effect condition included in the instruction header (step S228), and stores the processed audio data packet in the corresponding area of the performance data buffer 24b (step S229).
[0128]
Specifically, as described with reference to FIG. 12, effects such as echo, reverb, and tone control are added at a specified degree, and then stereo is formed at a specified L / R ratio. Then, the data is stored in the area OUT corresponding to the performer terminal 20 of the transmission source in the performance data buffer 24b according to the packet number.
[0129]
Thereafter, the DSP 22 proceeds to Step S226 again, and repeats the same processing for the subsequently input audio data packet (Steps S226 to S229).
[0130]
FIG. 16 is a flowchart for explaining the performance data transmission process. This performance data transmission process is executed by the CPU 21 and the DSP 22 in accordance with the performance data transmission program 23d stored in the storage device 23.
[0131]
According to the figure, in the performer terminal 20, when a predetermined sampling period elapses after the present process is started (step S231: YES), the DSP 22 captures the audio data output from the A / D conversion circuit 28 (step S232), and stores the captured audio data in the input buffer 24a (step S233).
[0132]
Here, the sampling period is the period, determined by the sampling frequency, at which the performance sound, which is an analog signal, is converted into a digital signal; it is included in the connection conditions agreed in step S212 of the link establishment process.
[0133]
Next, the DSP 22 determines whether or not one packet, that is, 512 samples, of audio data has been captured (step S234). If it determines that this amount has not yet been captured (step S234: NO), it returns to step S231 and repeats the same processing after the sampling period elapses again (steps S231 to S234).
[0134]
In this way, audio data is captured repeatedly at every sampling period. When it is determined that one packet, that is, 512 samples, of audio data has been captured (step S234: YES), the DSP 22 reads out the audio data stored in the input buffer 24a, performs predetermined acoustic processing on it (step S235), and stores the result in the corresponding area of the performance data buffer 24b (step S236).
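The capture loop of steps S231 to S234 amounts to accumulating samples until one packet of 512 is complete. A minimal sketch follows; the class name `InputBuffer`, the `push` method, and the choice to assign the packet number at this point are assumptions for illustration.

```python
PACKET_SAMPLES = 512  # one packet of audio data, as stated in the description

class InputBuffer:
    """Collects samples one at a time and hands back a full packet every 512 samples."""

    def __init__(self):
        self.samples = []
        self.next_packet_no = 0

    def push(self, sample):
        """Store one sample; return (packet_no, samples) once 512 are collected, else None."""
        self.samples.append(sample)
        if len(self.samples) < PACKET_SAMPLES:
            return None
        packet = (self.next_packet_no, self.samples)
        self.samples = []
        self.next_packet_no += 1
        return packet

inbuf = InputBuffer()
packets = []
for sample in range(1024):          # stand-in for samples arriving from the A/D converter
    packet = inbuf.push(sample)
    if packet is not None:
        packets.append(packet)
```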
[0135]
That is, as described with reference to FIG. 12, effects such as echo, reverb, and tone control are added to the one packet of audio data in accordance with the effect conditions included in the previously agreed connection conditions, and the data is converted into stereo at the set L / R ratio.
[0136]
Also, the DSP 22 encodes the one packet of audio data read from the input buffer 24a according to the PCM method (step S237), and transfers the data together with the instruction header to the CPU 21 as an audio data packet (step S238). As described above, the instruction header includes the transmitting terminal identification number for identifying the performer terminal 20 itself, the packet number, the above-described audio recording method, the effect conditions, and the like.
[0137]
Thereafter, the DSP 22 proceeds to Step S231 again, fetches the next one packet of audio data, and repeats the same processing (Steps S231 to S238).
[0138]
On the other hand, when the audio data packet is input from the DSP 22 (step S239), the CPU 21 adds the synchronization adjustment data, that is, the transmitting terminal identification number for identifying the performer terminal 20 itself and the packet number indicating the order of the audio data packet (step S240). Then, it specifies one of the other performer terminals 20 participating in the ensemble as the destination (step S241).
[0139]
Next, the CPU 21 performs UDP layer protocol processing (step S242), performs IP layer protocol processing including encryption (step S243), and generates an IP packet. At this time, the CPU 21 sets the IP address of the specified performer terminal 20 as the destination IP address of the IP header. Then, the CPU 21 outputs the generated IP packet to the communication device 33, and transmits the IP packet to the specified other performer terminal 20 (step S244).
[0140]
Thereafter, the CPU 21 determines whether or not there is an actor terminal 20 to which an IP packet including the audio data packet has not been transmitted among other actor terminals 20 participating in the ensemble (step S245).
[0141]
As a result of the determination, if there is any (step S245: YES), the CPU 21 proceeds to step S239 again, and repeats the same processing for the untransmitted performer terminal 20 (steps S239 to S245).
[0142]
If it is determined in step S245 that an IP packet including the audio data has been transmitted to all the other performer terminals 20 participating in the ensemble (step S245: NO), the CPU 21 determines whether or not the end of this process has been instructed (step S246).
[0143]
As a result of the determination, if the end has not been instructed (step S246: NO), the CPU 21 proceeds to step S239 again, and repeats the same processing for the subsequently input audio data packet (steps S239 to S246). On the other hand, if it is determined in step S246 that the end of this process has been instructed (step S246: YES), the CPU 21 ends this process.
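The loop that sends the same audio data packet to every other participating performer terminal can be sketched as below. The function name and the injected `send` callable are assumptions for illustration; in a real program `send` could be, for example, the `sendto` method of a UDP socket, with a port number chosen by the implementer. Encryption is omitted.

```python
def send_to_all(payload, destinations, send):
    """Send one payload to every destination address; `send(payload, address)` is supplied by the caller."""
    for address in destinations:
        send(payload, address)
    return len(destinations)

# Record the calls instead of using the network (addresses and port are hypothetical).
sent = []
others = [("192.0.2.2", 5004), ("192.0.2.3", 5004)]
count = send_to_all(b"packet-0", others, lambda data, addr: sent.append((data, addr)))
```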
[0144]
FIG. 17 is a flowchart for explaining the ensemble audio output process. This ensemble audio output process is executed by the DSP 22 in accordance with the ensemble audio output program 23e stored in the storage device 23.
[0145]
According to the figure, in the performer terminal 20, when a predetermined reproduction sampling period elapses after the present process is started (step S251: YES), the DSP 22 reads the audio data packets of the predetermined packet number from the performance data buffer 24b (step S252).
[0146]
Here, the reproduction sampling period is a time required to reproduce one packet of audio data (that is, 512 samplings), and specifically, corresponds to 512 times the sampling period.
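The relationship between the two periods can be checked with a short calculation. The sampling frequency is not given in this passage, so the 8000 Hz value below is only an assumed example, and the function name is hypothetical.

```python
def reproduction_period_seconds(sampling_hz, samples_per_packet=512):
    """Time needed to reproduce one packet: samples_per_packet times the sampling period (1 / sampling_hz)."""
    return samples_per_packet / sampling_hz

# With an assumed sampling frequency of 8000 Hz, one 512-sample packet lasts 64 ms.
period = reproduction_period_seconds(8000)
```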
[0147]
Next, the DSP 22 synthesizes (synchronous synthesis) the audio data packets read from the performance data buffer 24b for each channel as described with reference to FIG. 12, and generates audio data packets for reproducing the ensemble audio. (Step S253).
[0148]
After that, the DSP 22 adjusts the output level of the generated audio data packet, and outputs it to the D / A conversion circuit 29 (Step S254). Here, the audio data input to the D / A conversion circuit 29 is converted into an analog signal, amplified by the amplifier 30, and output from the speaker as performance audio.
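The specification does not say how the output level is adjusted. One common approach, shown here only as an assumption, is to scale the mixed samples by a gain and clamp them to the 16-bit range so that the sum of several terminals' audio does not overflow; the function name and default values are hypothetical.

```python
def adjust_level(samples, gain=1.0, limit=32767):
    """Scale samples by `gain`, round to integers, and clamp to the range [-limit - 1, limit]."""
    out = []
    for s in samples:
        value = int(round(s * gain))
        out.append(max(-limit - 1, min(limit, value)))
    return out

mixed = [40000, -6, -70000]          # hypothetical sums that exceed the 16-bit range
clamped = adjust_level(mixed)
halved = adjust_level(mixed, gain=0.5)
```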
[0149]
Then, the DSP 22 determines whether or not the end of the process has been instructed (step S255). If the end has not been instructed as a result of the determination (step S255: NO), the DSP 22 proceeds to step S251 again and repeats the same processing for the audio data packet of the next packet number (steps S251 to S255).
[0150]
If it is determined in step S255 that the end of this process has been instructed (step S255: YES), the DSP 22 ends this process.
[0151]
As described above, when all three processes (performance data reception process, performance data transmission process, and ensemble sound output process) performed in parallel at the performer terminal 20 are completed, the CPU 21 ends the performance process.
[0152]
With the above-described configuration, each of the plurality of performer terminals 20 connected via the Internet N can synthesize the data of the performance sound input to it and the performance data transmitted from the other performer terminals 20 in accordance with the packet numbers, and output the result. That is, regardless of the distance between the performer terminals 20, the performance sounds of the performer terminals 20 can be accurately synchronized and reproduced, and a real-time ensemble can be realized.
[0153]
Further, by transmitting the performance sound data input to the performer terminal 20 together with the effect conditions, an ensemble sound can be reproduced in which each input performance sound has been given the sound effects (for example, echo, reverb, and stereo conversion) desired by its user.
[0154]
It should be noted that the present invention is not limited to the above embodiment, and can be appropriately changed without departing from the spirit of the present invention. For example, transmission and reception of performance data between the performer terminals 20 may be performed via the management server 10.
[0155]
That is, as shown in FIG. 18, each performer terminal 20 transmits the input performance sound data (performance data) to the management server 10. Then, the management server 10 receives these performance data and simultaneously distributes them to each of the performer terminals 20. The performer terminal 20 then synchronizes the performance data transmitted from the management server 10 according to the packet numbers, performs acoustic processing according to the specified effect conditions, and outputs the result as an ensemble sound.
[0156]
Therefore, since the only destination of the performance data from the performer terminal 20 is the management server 10, the burden on the performer terminal 20 of transmitting and receiving the performance data can be reduced. This is more effective the larger the number of performer terminals 20 participating in the ensemble.
[0157]
At this time, the management server 10 may synchronize the performance data transmitted from each of the performer terminals 20 in accordance with the packet numbers before distributing it to each of the performer terminals 20. Alternatively, the management server 10 may add effects to, and convert into stereo, the performance data synchronized in accordance with the packet numbers, based on the specified effect conditions, and then transmit it to each performer terminal 20.
[0158]
Further, in the above embodiment, the number of users of each performer terminal 20 is one, but a plurality may be provided. In this case, the multiplexer 26 separates and selects a plurality of performance sounds input from each user according to the instruction signal from the DSP 22 and outputs the selected performance sounds to the sample and hold circuit 27. Then, the DSP 22 and the CPU 21 distribute and process these performance sounds for each user.
[0159]
[Effects of the invention]
According to the first or tenth aspect of the present invention, communication data in which predetermined synchronization adjustment data is added to input audio data can be created.
[0160]
According to the second aspect of the present invention, the voice data included in the voice communication data received from another communication terminal and the voice data included in the input voice data to which the synchronization adjustment data has been added by the creating means can be reproduced as a synthesized voice synchronized in accordance with the respective corresponding synchronization adjustment data.
[0161]
According to the third aspect of the invention, the synthesized speech can be reproduced in the correct order regardless of the order in which the voice communication data is received.
[0162]
According to the fourth aspect of the present invention, each item of voice communication data can be processed so that timbre, volume, and sound quality are reproduced more accurately, and a synthesized voice better suited to the place of reproduction can be generated.
[0163]
According to the fifth aspect of the present invention, since the voice communication data can be sequentially created by adding the synchronization adjustment data to the voice data for each predetermined unit, the voice communication data can be easily handled.
[0164]
According to the sixth aspect of the present invention, for example, if the communication terminal of the present invention is applied to a device that transmits and receives performance data to and from another communication terminal and realizes a real-time band performance, then by transmitting to the server conditions such as a desired song, part, and date and time as communication conditions, it is possible to find another device that satisfies the desired communication conditions.
[0165]
According to the seventh aspect of the present invention, the communication data to be transmitted is encrypted and the received communication data is decrypted, so that the security of data communication over the predetermined communication line can be enhanced.
[0166]
According to the eighth and ninth aspects of the invention, it is possible to create communication data in which predetermined synchronization adjustment data is added to input audio data.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration of a performance system according to the present embodiment.
FIG. 2 is a diagram showing a data flow in the performance system.
FIG. 3 is a diagram showing the concept of the present embodiment.
FIG. 4 is a diagram showing a format of an IP header of an IP packet.
FIG. 5 is a diagram showing a configuration of actual data of an IP packet.
FIG. 6 is a diagram showing a communication protocol in the performance system.
FIG. 7 is a block diagram illustrating an internal configuration of a management server.
FIG. 8 is a diagram showing a data configuration of a performance condition data storage area.
FIG. 9 is a diagram showing a data configuration of an ensemble schedule data storage area.
FIG. 10 is a block diagram showing an internal configuration of an actor terminal.
FIG. 11 is a diagram showing an outline of processing for performance data.
FIG. 12 is a diagram showing a data configuration of a performance data buffer.
FIG. 13 is a flowchart for explaining ensemble processing.
FIG. 14 is a flowchart illustrating a link establishment process.
FIG. 15 is a flowchart for explaining performance data reception processing.
FIG. 16 is a flowchart for explaining performance data transmission processing.
FIG. 17 is a flowchart illustrating ensemble audio output processing.
FIG. 18 is a diagram showing a modification of the present embodiment.
[Explanation of symbols]
1 Performance system
10 Management server
11 CPU
12 Storage device
12a Ensemble program
13 RAM
13a Performance condition data storage area
13b Ensemble schedule data storage area
14 Communication device
20 Performer terminal
21 CPU
22 DSP
23 Storage device
23a Ensemble program
23b Link establishment program
23c Performance data reception program
23d Performance data transmission program
23e Ensemble audio output program
24 RAM
24a Input buffer
24b Performance data buffer
25 Input device
26 Multiplexer
27 Sample-and-hold circuit
28 A / D conversion circuit
29 D / A conversion circuit
30 Amplifier
31 Display drive circuit
32 Display device
33 Communication device
N Internet

Claims

A communication terminal that transmits and receives voice data to and from another communication terminal connected via a communication line, synthesizes the voice data received from the other communication terminal with input voice data, and reproduces the result as a synthesized voice,
the communication terminal comprising creating means for creating voice communication data by adding predetermined synchronization adjustment data to the input voice data.

The communication terminal according to claim 1, further comprising:
transmission means for transmitting the voice communication data to which the synchronization adjustment data has been added by the creating means;
receiving means for receiving voice communication data to which synchronization adjustment data has been added and which is transmitted from the other communication terminal; and
reproducing means for reproducing the voice data included in the voice communication data received by the receiving means and the voice data included in the input voice data to which the synchronization adjustment data has been added by the creating means, as a synthesized voice synchronized in accordance with the respective corresponding synchronization adjustment data.

The communication terminal according to claim 2, wherein
the synchronization adjustment data includes order data,
the communication terminal further comprises other-data storage means for accumulating and storing the voice communication data received by the receiving means, and
the reproducing means reads out the voice communication data stored in the other-data storage means in accordance with the order data included in the voice communication data, and reproduces it as a synthesized voice synchronized with the input voice data.

The communication terminal according to claim 2, wherein
the creating means has reproduction condition adding means for further adding a reproduction condition to the voice communication data to be created, and
the reproducing means reproduces the voice data included in the voice communication data received by the receiving means based on the reproduction condition data included in that voice communication data.

The communication terminal according to claim 1, wherein the creation means sequentially creates the voice communication data by adding the synchronization adjustment data to each predetermined unit of voice data in the input voice data.
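
A minimal sketch of this per-unit creation step, assuming the synchronization adjustment data is a running packet number carried in a 4-byte big-endian header. The header layout, the function names `create_packets` and `parse_packet`, and the `unit_size` parameter are hypothetical and chosen only for illustration.

```python
import struct
from typing import Iterator, Tuple

# Assumed header layout: one unsigned 32-bit big-endian packet number.
HEADER = struct.Struct("!I")


def create_packets(pcm: bytes, unit_size: int) -> Iterator[bytes]:
    """Split input voice data into units and prefix each with its number."""
    for seq, offset in enumerate(range(0, len(pcm), unit_size)):
        yield HEADER.pack(seq) + pcm[offset:offset + unit_size]


def parse_packet(packet: bytes) -> Tuple[int, bytes]:
    """Return (packet number, voice data) from one created packet."""
    (seq,) = HEADER.unpack_from(packet)
    return seq, packet[HEADER.size:]
```

For example, `list(create_packets(b"abcdefgh", 3))` yields three packets numbered 0, 1, and 2, the last of which carries the remaining two bytes.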

The communication terminal according to any one of claims 2 to 5, further comprising inquiry means for transmitting desired communication condition data to a server connected via the predetermined communication line and receiving the communication address of another communication terminal that has transmitted desired communication condition data matching the transmitted desired communication condition data,
wherein the transmission means and the receiving means transmit and receive voice communication data based on the communication address received by the inquiry means.
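
The server-side matching that this inquiry relies on might look like the following sketch. It assumes that conditions match by exact equality and that a terminal with no partner yet simply waits; the class `MatchingServer`, the method `inquire`, and the address strings are hypothetical, and a real server would also need to notify the waiting terminal of its partner's address.

```python
from typing import List, Optional, Tuple


class MatchingServer:
    """Pairs terminals whose desired communication conditions match."""

    def __init__(self) -> None:
        # Waiting terminals as (condition, address) pairs, oldest first.
        self._waiting: List[Tuple[str, str]] = []

    def inquire(self, condition: str, address: str) -> Optional[str]:
        """Return a matching partner's address, or None and start waiting."""
        for i, (cond, addr) in enumerate(self._waiting):
            # Assumption: conditions match only when they are identical.
            if cond == condition and addr != address:
                del self._waiting[i]
                return addr
        self._waiting.append((condition, address))
        return None
```

Once a terminal obtains an address this way, voice communication data can be exchanged directly with that address rather than through the server.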

The communication terminal according to any one of claims 2 to 6, further comprising:
encryption means for encrypting the voice communication data transmitted by the transmission means; and
decryption means for decrypting the voice communication data received by the receiving means.

A communication terminal that transmits and receives data to and from another communication terminal connected via a predetermined communication line while immediately synthesizing and reproducing input data and data received from the other communication terminal,
the communication terminal comprising creation means for creating communication data by adding predetermined synchronization adjustment data to the input data.

A communication terminal that transmits and receives data to and from another communication terminal connected via a server while immediately synthesizing and reproducing input data and data received from the other communication terminal,
the communication terminal comprising creation means for creating communication data by adding predetermined synchronization adjustment data to the input data.

A program for causing a computer that transmits and receives voice data to and from another communication terminal connected via a predetermined communication line to realize:
a creation function of creating voice communication data by adding predetermined synchronization adjustment data to input voice data;
a receiving function of receiving voice communication data, transmitted from the other communication terminal, to which synchronization adjustment data has been added; and
a voice synthesis playback function of reproducing, as synthesized voice synchronized according to the respective corresponding synchronization adjustment data, the voice data included in the voice communication data received by the receiving function and the input voice data to which the synchronization adjustment data has been added by the creation function.