JP3594068B2 - Recording / reproducing apparatus and recording / reproducing method

Info

Publication number: JP3594068B2
Application number: JP05700298A
Authority: JP
Inventors: 隆大澤; 浩桂林; 恵理子田丸
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1998-03-09
Filing date: 1998-03-09
Publication date: 2004-11-24
Anticipated expiration: 2018-03-09
Also published as: JPH11262096A

Description

【０００１】
【発明が属する技術分野】
この発明は、例えば会議などにおける音声情報を記録し、再生する記録再生装置および記録再生方法に関する。
【０００２】
【従来の技術】
従来のこの種の記録再生装置としては、市販の製品でいえば、カセットテープレコーダや、例えばミニディスクを用いたディスクレコーダなどが代表的なものである。それらは、カセットテープやディスクなどの記録媒体に音声情報を記録し、再生の必要があれば、時間や機器のカウンタ数値を指定して所望の箇所の再生を開始することができる。
【０００３】
特に、ミニディスクなどのようなディスク媒体に記録されている音声の場合、ランダムアクセス機能により、所望の音声情報の記録位置に瞬時にアクセスすることが可能となっている。
【０００４】
この種の記録再生装置において、再生時に、記録された音声のどのあたりの箇所を再生しているかは、表示されているカウンタ数値や時刻情報を見ることで確認することができる。
【０００５】
また、これらの市販の機器の機能を更に充実させたシステムとして、特に、会議、講演会などをターゲットにした記録再生システムが、従来から提案されている。それらの代表的な例としては、特開平６−３４３１４６号公報に記載されたものがある。
【０００６】
このシステムでは、会議の音声・映像などのマルチメディア情報を記録する一方で、会議参加者のペン入力やキーボード入力などのユーザ入力情報と、その入力時刻を記録する機能がある。そして、後で、ユーザ入力情報の入力時刻を利用して、そのユーザ入力情報に関連するマルチメディア情報を再現できるような仕組みが備えられている。
【０００７】
そして、記録した音声を再生している時には、時間軸上に音声波形を示した画面を表示させ、そこに再生している箇所を示す印が表示されているので、記録中のどのあたりを再生しているかをユーザは確認することができる。
【０００８】
【発明が解決しようとする課題】
上述の従来技術に挙げたテープレコーダ、ディスクレコーダなどの市販の機器や、特開平６−３４３１４６号公報に記載のような記録再生システムは、記録情報中のどのあたり、あるいは、いつの記録情報を再生しているかを確認するには、カウンタ数値や時刻を表示している視覚的ディスプレイを見て確認する必要がある。
【０００９】
また、ディスクレコーダではランダムアクセス機能により、記録時の本来の時間的流れとは全く関係なく、さまざまな箇所に自由にアクセスすることが容易なため、アクセスが本来の時間の流れとは異なる順序で行われることが良くあるが、このような場合に、個々のアクセス箇所の情報の時間関係を把握しづらいという問題がある。また、再生した音声情報から記録時の時間的な音声情報の流れを把握するためには、個々の再生アクセス位置における時刻やカウンタの値を覚えておき、そこから本来の音声情報の流れを再構成する必要がある。
【００１０】
この発明は、以上の点にかんがみ、記録媒体上の再生箇所の時間的な位置を、聴覚的な情報のみで把握できるようにし、更に、記録されている音声情報の時間的な流れを、聴覚的な情報のみで容易に把握できる記録再生装置を提供することを目的とする。
【００１１】
【課題を解決するための手段】
上記課題を解決するため、請求項１の発明による記録再生装置は、
入力音声情報を、記録のために処理する記録音声信号処理手段と、
前記記録音声信号処理手段で処理された音声情報を、その先頭から終わりまでの全記録時間に関する付加情報とともに記録媒体に格納する音声情報格納手段と、
前記記録媒体に記録されている前記音声情報の内から再生対象の音声情報を指定する再生指定手段と、
前記再生指定手段により指定された再生対象の音声情報の付加情報から、前記再生対象の音声情報の前記全記録時間中の時間位置に応じた、聴空間上の仮想音源位置を定めるための仮想音源配置情報を生成する仮想音源配置手段と、
前記仮想音源配置手段からの前記仮想音源配置情報を受け取るとともに、前記再生指定手段により指定された音声情報を前記記録媒体から読み出し、その読み出した音声情報について、前記全記録時間中のそれぞれの前記時間位置の音声出力が、前記仮想音源配置情報に基づいて決定される聴空間上位置に定位して聴取されるように再生処理する再生手段と、
前記再生手段で再生処理された音声情報による音声を出力するための音声出力手段と、
前記各手段の動作を制御する制御手段と、
を備えることを特徴とする。
【００１２】
また、請求項２の発明は、請求項１に記載の記録再生装置において、
前記再生指定手段は、前記記録媒体に記録されている前記音声情報の内から再生対象の音声情報を指定するとともに、その再生対象の音声情報の内の所望の再生区間を指定するものであり、
前記仮想音源配置手段は、前記指定手段で指定された前記再生区間の情報をも取得して、この指定された再生区間の先頭から終りまでの各時間位置に応じた聴空間上位置を定めるための仮想音源配置情報を生成して、その仮想音源配置情報を前記再生手段に送り、
前記再生手段は、前記指定された再生区間内の音声情報について、それぞれの時間位置の音声出力が、前記仮想音源配置情報に基づいて決定される聴空間上位置に定位して聴取されるように再生処理する
ことを特徴とする。
【００１３】
また、請求項３の発明は、請求項１に記載の記録再生装置において、
前記音声情報の時間軸に関連した表示情報を生成する表示情報生成手段と、
前記表示情報生成手段で生成された前記表示情報を、表示画面に表示する表示手段と、
を備え、
前記仮想音源配置手段は、前記再生対象の音声情報の各時間位置に応じた聴空間上位置を、前記表示画面に表示された表示情報上に配置する
ことを特徴とする。
【００１４】
また、請求項４の発明の記録再生装置は、
入力音声情報を、記録のために処理する記録音声信号処理手段と、
前記記録音声信号処理手段で処理された音声情報を、記録媒体に記録して格納する音声情報格納手段と、
前記入力音声情報に関連する関連情報を、前記入力音声情報の前記記録媒体上の記録位置との対応関係とともに、記録して格納する関連情報格納手段と、
前記関連情報格納手段から読み出された前記関連情報の複数個を表示画面に表示する表示手段と、
前記記録媒体に記録されている前記音声情報の内から再生対象の音声情報を指定する再生指定手段と、
前記再生指定手段により指定された再生対象の音声情報の各時間位置に対応する前記関連情報を、前記関連情報格納手段からの情報に基づいて検出し、前記指定された再生対象の音声情報の各時間位置に応じた聴空間上位置を、前記表示画面に表示された複数の関連情報の内の、前記再生対象の音声情報の各時間位置に応じた関連情報上に配置する
ことを特徴とする。
【００１５】
また、請求項５の発明は、請求項１〜請求項４のいずれかに記載の記録再生装置において、
前記再生手段は、頭部伝達関数を用いた演算を用いて音像の定位を実現することを特徴とする。
【００１６】
また、請求項６の発明は、請求項１〜請求項４のいずれかに記載の記録再生装置において、
前記音声出力の聴取者の頭部運動を検出する頭部運動検出手段を備え、
前記仮想音源配置手段は、前記頭部運動検出手段で検出された頭部運動を考慮して、前記仮想音源配置情報を生成することを特徴とする。
【００１７】
【作用】
上述のような構成の請求項１の発明においては、再生指定手段で再生する音声情報が指定されると、仮想音源配置手段は、その指定された音声情報の付加情報から、その全記録時間を検知し、記録媒体から順次に再生される音声情報の、前記全記録時間中の各時間位置に応じた仮想音源位置を聴空間上に定めるための仮想音源配置情報を生成し、その生成した仮想音源配置情報を再生手段に送る。
【００１８】
再生手段は、再生指定手段で指定された音声情報を記録媒体から読み出し、その読み出した音声情報について、前記仮想音源配置手段からの仮想音源配置情報に基づいて決定される聴空間上位置に定位して聴取されるようにする処理を施す。そして、その処理後の音声情報を、音声出力手段に供給する。
【００１９】
この結果、音声出力手段から出力される音声は、再生音声情報の、全記録時間中の時間位置に応じた聴空間上位置に定位するように聴取される。したがって、音声情報を聴取するだけで、現在再生中の音声情報が、指定した音声情報のどの当たりのものかを知ることができる。また、再生中の音声情報の音像定位位置は、時間の経過とともに変化するので、音声情報の時間的な流れを容易に把握することができる。
【００２０】
また、請求項２の発明によれば、再生指定手段は、再生対象の音声情報のうちの所望の再生区間を指定することが可能であり、仮想音源配置手段は、その指定された再生区間内における各時間位置に応じた聴空間上位置を定めるための仮想音源配置情報を生成する。したがって、音声情報の全体を再生する場合でなくとも、その指定された音声再生区間が、全体のどの当たりのものであるかを知ることができる。また、その再生区間内における音声情報の時間的な流れも容易に把握することができる。
【００２１】
また、請求項３の発明によれば、音声情報の時間軸に関連した表示情報が表示情報生成手段で生成される。例えば、入力音声情報が会議音声の場合において、複数の会議参加者のそれぞれの発言区間が検出され、その各会議参加者の発言区間がチャートとして表示手段の表示画面に表示される。
【００２２】
仮想音源配置手段は、再生対象の音声情報の各時間位置に応じた聴空間上位置を、表示画面の表示情報上に配置する。前記発言区間のチャートが表示される例では、チャート上の当該音声の発言者の発言区間の部分に、再生音声出力が定位するようにする仮想音源配置情報を生成する。
【００２３】
したがって、この仮想音源配置情報により再生手段で、各時間位置の音声情報が処理されて、音声出力手段により音声出力されることにより、例えば、各発言者の音声出力は、その発言者の発言区間の表示部分からあたかも出力されるかの如く聴取される。
【００２４】
また、請求項４の発明においては、音声記録手段により記録される音声情報に関連する関連情報が、音声情報との対応が付けられて、関連情報格納手段に格納されている。この関連情報としては、例えば、会議において提示された文書資料などが挙げられる。
【００２５】
そして、再生時において、複数の関連情報が画面に表示されている時に、音声情報が再生される場合、仮想音源配置手段により、音声出力は、表示画面上に表示されている対応する時間位置に関連して格納されている関連情報から出力されるように定位するようにされる。前述の会議の文書資料の場合であれば、再生されている音声情報が記録された時に、提示されていた文書資料の位置からあたかも音声が出力されるようになり、聴取者は、どの文書資料に関する音声情報であるかを容易に把握することができる。
【００２６】
また、請求項５の発明によれば、頭部伝達関数が用いられて音声処理が行われて、音像が、定位感良く、上述のような聴空間上の所定の位置に配置できる。
【００２７】
また、請求項６の発明によれば、聴取者の頭の位置が変わっても、その頭部の運動を検出して、その頭部運動に応じた仮想音源配置を行うことができるので、聴空間として絶対空間内に音像を位置付けることができる。
【００２８】
【発明の実施の形態】
以下、この発明による記録再生装置の実施の形態を図を参照しながら説明する。
【００２９】
［第１の実施の形態］
図２は、第１の実施の形態の記録再生装置のシステム概観を示すもので、マイクロホン１で収音した音声の信号を、Ａ／Ｄ変換ボード２によりデジタル信号に変換し、例えばパーソナルコンピュータ３に入力する。パーソナルコンピュータ３では、入力されたデジタル音声信号を、必要に応じて圧縮処理を施し、例えば内蔵ハードディスクに記録する。
【００３０】
パーソナルコンピュータ３には、ユーザ操作入力部として、キーボード４やマウス５が接続されている。このユーザ操作入力部からの再生指示があると、パーソナルコンピュータ３では、指示されたデジタル音声情報を、ハードディスクから読み出し、圧縮されている場合には、伸長処理を施して、Ｄ／Ａ変換ボード６を通じて、この例では、ヘッドフォン７に供給し、音声を再生出力する。８は、パーソナルコンピュータ３の表示情報を表示するためのディスプレイである。
【００３１】
そして、この第１の実施の形態では、パーソナルコンピュータ３では、ヘッドフォン７の聴取者が再生音声を聴取するだけで、一塊の音声情報のうちの、どの時間位置部分を再生しているかを容易に把握することができるようにするため、後述するように、聴感上の仮想音源位置（音像位置）を、聴空間上の所定の位置に配置するようにする演算処理を行う。
【００３２】
図１は、この第１の実施の形態の記録再生装置を、その処理機能によりブロック化して示した機能ブロック図である。すなわち、この第１の実施の形態の記録再生装置は、音声入力部１１と、記録音声信号処理部１２と、音声情報格納部１３と、ユーザ入力部１４と、制御部１５と、表示部１６と、再生部１７と、音声出力部１８と、仮想音源配置部１９とから構成される。
【００３３】
音声入力部１１は、記録する音声信号の入力部であり、この例の場合には、マイクロホン１により構成される。記録音声信号処理部１２は、Ａ／Ｄ変換や圧縮処理など、記録のための音声信号処理を施す部分であり、Ａ／Ｄ変換ボード２と、パーソナルコンピュータ３のソフトウエアとからなる。
【００３４】
音声情報格納部１３は、記録音声信号処理部１２からの音声信号を記録媒体に記録して格納するものであって、この例の場合には、記録媒体としてハードディスクを用いるので、ハードディスクドライバにより構成される。なお、記録媒体としては、メモリなどの半導体素子や、いわゆるＭＯやＭＤなどの光磁気ディスク、また、フロッピーディスクなどを用いることもできる。
【００３５】
ユーザ入力部１４は、ユーザ入力を受け付けるものであって、この例の場合には、キーボード４やマウス５により構成される。
【００３６】
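The control unit 15 is implemented in software on the personal computer 3. It receives and analyzes user input from the user input unit 14 and, in accordance with that user input, controls the recording audio signal processing unit 12, the audio information storage unit 13, the reproduction unit 17, and the virtual sound source arrangement unit 19, thereby controlling the writing of audio information to and reading of it from the recording medium, the sound image localization processing, and the like. In this embodiment, the control unit 15 also controls the display unit 16 and the virtual sound source arrangement unit 19 described later.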
【００３７】
表示部１６は、制御部１５の制御を受けて、所定の情報を表示するもので、この例の場合は、ＣＲＴやＬＣＤ（液晶ディスプレイ）などを表示素子として用いたパーソナルコンピュータ３の表示部を使用する。
【００３８】
再生部１７は、制御部１５の制御を受け、また、後述する仮想音源配置部１９からの仮想音源配置情報を受けて、音声情報格納部１３から読み出された音声情報を、音声出力部１８に供給する信号に変換する処理を行うもので、パーソナルコンピュータ３のソフトウエアと、Ｄ／Ａ変換ボード６とからなる。
【００３９】
この実施の形態の場合、再生部１７では、ユーザにより指定された音声情報のうちの、指定された再生区間内の各時間位置の音声情報が、その各時間位置に応じた聴空間上位置の仮想音源から出力されるように定位させるようにする演算処理を施す。
【００４０】
この実施の形態の再生部１７で用いる演算方法は、日本音響学会誌４９巻７号（１９９３）、Ｐ５１５〜Ｐ５２１の「住空間音疑似体験システムの開発」に記載されている演算方法を用いている。
【００４１】
この方法は、仮想聴空間内で、指定された仮想音源方向から聞こえるように音の定位を実現させる演算方法で、聴取者の頭の位置・方向と、仮想音源の位置を与えると、頭部伝達関数が決定され、この頭部伝達関数と、左右２チャンネルのデジタル音声信号とのコンボリューション演算を行うことにより、仮想聴空間上の指定された仮想音源位置に定位するように音声出力をさせることができるというものである。
【００４２】
この実施の形態の場合、再生部１７は、聴取者の頭の位置・方向および仮想音源の位置と、これらに対する頭部伝達関数との対応テーブルを、例えばＲＯＭなどのメモリに格納して用意している。すなわち、予め、聴取者の頭の位置・方向および仮想音源の位置として想定される複数個の情報が与えられ、それらに対応する頭部伝達関数がそれぞれ計測されることにより得られるテーブルの情報が、例えばＲＯＭに格納されて用意されている。
【００４３】
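The virtual sound source arrangement unit 19 generates the virtual sound source arrangement information used to determine the head-related transfer function and supplies it to the reproduction unit 17. Specifically, in this first embodiment, based on the position on the time axis of the playback section of the audio information requested for reproduction, it generates virtual sound source arrangement information for placing the virtual sound source, in the listening space, at a position corresponding to each position on the time axis, and supplies this information to the reproduction unit 17. In this example, the virtual sound source arrangement unit 19 is implemented in software on the personal computer 3.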
【００４４】
仮想音源配置部１９で用いられる音声情報の時間軸上の位置情報は、後述するように、制御部１５と、音声情報格納部１３から与えられる。制御部１５からの時間軸情報は、ユーザ入力部１４からのユーザ入力に応じたものであり、また、音声情報格納部１３からの時間軸情報は、音声情報に関連して記録された音声記録開始時刻や終了時刻などの情報である。
【００４５】
音声出力部１８は、再生された音声信号から放音音声を出力する音響変換を行うもので、この例の場合には、ヘッドフォン７で構成されている。再生部１７での前記の仮想音源定位処理により、仮想聴空間上において、再生信号の各時間軸上の位置に応じた位置にある仮想音源から音声が出力されているように、ヘッドフォン７で音声出力が聴取される。
【００４６】
この第１の実施の形態における音声情報の記録時の処理および再生時の処理を、以下に詳述する。
【００４７】
［第１の実施の形態の音声記録時の説明］
まず、第１の実施の形態における音声情報の記録シーンについて説明するとともに、音声入力部１１、記録音声信号処理部１２、音声情報格納部１３について説明する。
【００４８】
ユーザ入力部１４から音声記録命令が入力されると、その命令は、制御部１５で処理され、記録音声信号処理部１２および音声情報格納部１３に送られる。記録音声信号処理部１２では、音声記録命令に応じて、Ａ／Ｄ変換ボードにおいて、音声入力部１１からの入力音声信号のＡ／Ｄ変換処理を行い、必要な圧縮処理を施す。音声情報格納部１３では、記録音声信号処理部１２からの音声情報の記録を開始する。
【００４９】
ユーザ入力部１４から記録終了命令が入力されると、制御部１５はその命令を記録音声信号処理部１２および音声情報格納部１３に送る。この記録終了命令に応じて、記録音声信号処理部１２は、音声のＡ／Ｄ変換処理および圧縮処理を終了し、音声情報格納部１３は音声情報の記録を終了する。
【００５０】
この音声記録に際し、音声情報格納部１３では、記録された音声に、記録開始時刻と記録終了時刻の時間情報を対応付けて記録する。
【００５１】
また、音声情報格納部１３では、記録開始から記録終了まで時間的に連続する音声情報を、一つの音声情報ファイルとして格納管理するようにしており、ユーザ入力部１４からのユーザ入力操作により付与された音声情報ファイル名の情報が、制御部１５を通じて音声情報格納部１３に記録される。さらに、複数の音声情報ファイルを格納するために、各音声情報ファイルの記録媒体上の位置も、格納して保持している。
【００５２】
なお、この第１の実施の形態においては、各音声情報ファイルに関する時間情報は、その音声情報ファイルの全記録時間（記録開始から記録終了までの時間）が判れば良いので、記録する時間情報としては、記録開始時刻と、記録音声情報量、あるいは記録音声情報量と記録終了時刻などであっても良い。
【００５３】
この実施の形態の場合、ユーザは、音声情報格納部１３に、どのような音声情報ファイルが格納されているかを表示部１６に表示することが可能である。この例では、格納されている音声情報ファイルのファイル名を表示する。そして、ユーザは、このファイル名の表示によって、どのような音声情報が格納されているかを知ることができるだけでなく、ファイル名をユーザ入力部１４の、例えばマウスを用いる等して選択操作することにより、その選択した音声情報ファイルの再生指定をすることができるように構成されている。
【００５４】
［第１の実施の形態の音声再生時の説明］
次に、この第１の実施の形態の場合の再生時の動作について、以下に説明する。
【００５５】
再生に当たって、ユーザは、ユーザ入力部１４から、再生命令と、音声出力先と、再生音声指定情報とを含む再生要求を入力する。音声出力先は、この例ではヘッドフォン７であり、また、再生音声指定情報は、音声情報ファイル名と、再生開始オフセットと、再生終了オフセットとからなる。
【００５６】
再生開始オフセットは、再生要求する音声情報ファイルの先頭からどれだけ後の時間位置から再生を開始するかの情報であり、ゼロであれば、音声情報ファイルの先頭からの再生の要求である。同様に、再生終了オフセットは、再生要求する音声情報ファイルの記録終了時点からどれだけ前の時間位置までを再生するかの情報であり、ゼロであれば、音声情報ファイルの最後の記録終了時点までの再生要求となる。
【００５７】
制御部１５は、ユーザ入力部１４からのユーザ入力が再生命令を含む再生要求であることを判別すると、再生命令と、音声出力先と、再生音声指定情報とを含む再生要求の情報を再生部１７に送るとともに、その再生要求のあったときの現在時刻を保持しておく。
【００５８】
以下に、制御部１５からの再生要求の情報が渡されたときの再生部１７の動作と、再生部１７に関連して動作する仮想音源配置部１９の動作を説明する。まず、再生部１７の動作を、図３のフローチャートを参照しながら説明する。
【００５９】
再生部１７は、音声情報の再生要求を受け取ると、図３の処理ルーチンを起動して、まず、ステップ１０１において、仮想音源配置部１９に再生音声指定情報（音声情報ファイル名、再生開始オフセット、再生終了オフセット）を送る。
【００６０】
仮想音源配置部１９は、この再生音声指定情報の送付を受けると、後述するようにして、再生開始オフセットから再生終了オフセットまでの再生区間内の音声情報の各時間位置に対応する聴空間上の位置に、仮想音源を配置するようにするための仮想音源配置情報を生成する。
【００６１】
再生部１７は、この仮想音源配置部１９で生成された仮想音源配置情報を、ステップ１０２で受け取る。そして、ステップ１０３に進み、この仮想音源配置情報と、聴取者の頭の位置の情報とを用いて、前記テーブルに記録されている情報を参照し、再生対象の音声情報の各時刻における頭部伝達関数を決定する。
【００６２】
なお、この例では、音声出力部１８は、頭部に装着されたヘッドフォンであるため、聴取者の頭部の方向、位置は、固定されていると考えてよく、予め、それらの情報、すなわち、位置および方向の情報は与えられている。
【００６３】
こうして頭部伝達関数が決定したら、ステップ１０４に進み、音声情報格納部１３に格納されている、指定された再生区間の音声情報のデジタルデータと、頭部伝達関数とのコンボリューション演算を行い、再生対象の音声情報の各時刻のヘッドフォン７への再生音声信号を計算する。そして、ステップ１０５に進み、ステップ１０４で計算された再生音声信号を音声出力部１８としてのヘッドフォン７に供給し、順次、音声出力する。
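Steps 103 and 104 above (FIG. 3) amount to a table lookup followed by a convolution for each ear. The following Python sketch illustrates that flow only; it is not the method of the cited journal article. The table `HRIR_TABLE`, its three entries, and the helper names are hypothetical, and the impulse responses are toy delay-and-gain filters standing in for measured head-related transfer functions.

```python
# Toy stand-in for the ROM table: azimuth in degrees -> (left, right) impulse responses.
# Real entries would be measured responses; these values are invented for illustration.
HRIR_TABLE = {
    30: ([0.0, 0.0, 0.5], [1.0]),
    90: ([1.0], [1.0]),
    150: ([1.0], [0.0, 0.0, 0.5]),
}


def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def nearest_hrir(angle_deg):
    """Pick the table entry whose azimuth is closest to the requested angle (step 103)."""
    key = min(HRIR_TABLE, key=lambda a: abs(a - angle_deg))
    return HRIR_TABLE[key]


def render_block(samples, angle_deg):
    """Convolve one block of audio with the left and right responses (step 104)."""
    left_ir, right_ir = nearest_hrir(angle_deg)
    return convolve(samples, left_ir), convolve(samples, right_ir)
```

Calling `render_block(block, angle)` for each successive block, with `angle` advancing over time, yields the two headphone channels that step 105 would send to the output.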
【００６４】
次に、仮想音源配置部１９の、この再生時の動作を、図４のフローチャートを参照して説明する。
【００６５】
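When the reproduction audio designation information (audio information file name, playback start offset, playback end offset) is sent from the reproduction unit 17, the virtual sound source arrangement unit 19 proceeds to step 201, obtains the recording start time and recording end time of the designated audio information file from the audio information storage unit 13, and calculates the total recording time of the designated audio information file.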
【００６６】
次に、ステップ２０２に進み、再生音声の聴空間上の仮想音源位置を決める。この場合の仮想音源位置の決定方法の概念図を図５に示す。
【００６７】
まず、この例においては、図５に示すように、ユーザの頭部９を中心とした半径ｒの円弧状の空間の座標系を仮想的に定義し、これをユーザの頭部座標系とする。そして、この頭部座標系において、所定の聴空間範囲を決定する角度αから角度βまでに相当する時間を定義し、この仮想的な時間空間上に再生音声の各時間位置の仮想音源を配置するようにする。
【００６８】
この場合の聴空間範囲はどのように定義することもできるが、この実施の形態では、上述したように、音声情報格納部１３から取得した音声情報ファイルの記録開始時刻および記録終了時刻を対応させて、音声情報ファイルの記録開始時刻Ｒｓを角度α、記録終了時刻Ｒｅを角度βに対応させる。なお、この例では、角度αは、例えば３０度、角度βは、例えば１５０度とする。
【００６９】
このようにユーザの頭部座標系を定義した後、再生音声指定情報の再生開始オフセットおよび再生終了オフセットで指定される再生開始時刻Ｔｓおよび再生終了時刻Ｔｅの時点にそれぞれ対応する、聴空間上の角度θｓおよび角度θｅを求める。この角度θｓおよび角度θｅが、再生開始時刻Ｔｓおよび再生終了時刻Ｔｅの仮想音源位置となる。
【００７０】
次に、角度αから角度βまでの角度範囲において、単位時間Δｔ当たりの単位角度Δθを求める。これにより、再生開始時刻Ｔｓから再生終了時刻Ｔｅの間の区間においては、単位時間Δｔごとに角度位置が定められ、それぞれの角度位置が、対応する時刻の音声の仮想音源位置と定められる。
【００７１】
こうして、指定された再生区間の各時刻の音声が定位すべき位置、すなわち、仮想音源位置を決めるための情報が得られると、ステップ２０３に進み、仮想音源配置部１９は、聴空間を形成する頭部座標系の半径ｒと、再生開始オフセットにより定まる再生開始時刻Ｔｓの聴空間上の角度と、単位時間Δｔ当たりの角度Δθを、仮想音源配置情報として、再生部１７に送る。
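The arc placement of steps 201 to 203 is a linear map from recording time to angle: the recording start time Rs maps to α, the recording end time Re maps to β, and the playback start and end times follow from the two offsets. A minimal Python sketch under those definitions follows. The function names, the seconds-based time unit, and the Cartesian conversion are assumptions made for illustration; the 30 and 150 degree defaults are the example values given in the text.

```python
import math


def arc_placement(rs, re, start_offset, end_offset, dt=1.0,
                  alpha_deg=30.0, beta_deg=150.0, radius=1.0):
    """Return (radius, theta_s, theta_e, dtheta) for the requested playback section."""
    total = re - rs                      # total recording time (step 201)
    ts = rs + start_offset               # playback start time Ts
    te = re - end_offset                 # playback end time Te
    span = beta_deg - alpha_deg
    theta_s = alpha_deg + span * (ts - rs) / total
    theta_e = alpha_deg + span * (te - rs) / total
    dtheta = span * dt / total           # angle advanced per unit time dt
    return radius, theta_s, theta_e, dtheta


def source_position(radius, theta_s, dtheta, steps):
    """Position of the virtual source in head coordinates after `steps` unit times."""
    theta = math.radians(theta_s + dtheta * steps)
    return radius * math.cos(theta), radius * math.sin(theta)
```

For a one-hour recording played from the 15-minute mark to the 45-minute mark, `arc_placement(0, 3600, 900, 900)` gives a start angle of 60 degrees and an end angle of 120 degrees, so the section occupies the middle half of the arc.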
【００７２】
前述したように、再生部１７では、これらの情報に基づいて、再生対象区間内の各時間位置に応じた仮想音源位置から音声が出力されるようにする音声処理を、音声情報格納部１３から読み出された再生対象の音声情報ファイルの指定された再生区間の音声情報に対して施す。
【００７３】
この実施の形態では、以上のようにして再生して、音声出力することにより、聴感上、聴空間上を、音声の時間軸上の変化に応じて、仮想音源が移動するようにして再生されるようになる。
【００７４】
なお、上記例では、頭部を中心にした半径ｒの円上に各時刻の音声出力を配置するようにする例であったが、別の配置方法でもよい。例えば、図６に示すように、直線上に各時刻の音声を配置しても構わない。
【００７５】
この図６の例は、仮想３次元空間上の、点Ａ（ｘａ，ｙａ，ｚａ）と、点Ｂ（ｘｂ，ｙｂ，ｚｂ）の時刻を、指定した音声情報ファイルの記録開始時刻Ｒｓと、記録終了時刻Ｒｅとに対応させる。そして、単位時間Δｔ当たりの移動距離を、移動ベクトルΔｖ（＝（ｓ，ｔ，ｕ））として求める。すると、点Ａと点Ｂの直線方程式から、指定した音声再生区間の各時刻の仮想音源位置が定まる。
【００７６】
この図６の例の場合には、図４のフローチャートのステップ２０３では、開始オフセットの位置と、単位時間Δｔ当たりの移動ベクトルΔｖを再生部１７に送ればよい。
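For the straight-line variant of FIG. 6, the same idea applies with points instead of angles: point A corresponds to Rs, point B to Re, and the source advances by a fixed movement vector Δv per unit time. The sketch below is an illustrative assumption about how the start position and Δv could be computed; the function names are hypothetical.

```python
def line_placement(a, b, rs, re, start_offset, dt=1.0):
    """Return (start_position, dv) for a source moving on the line from A to B."""
    total = re - rs
    # Movement vector per unit time, corresponding to (s, t, u) in the text.
    dv = tuple((bi - ai) * dt / total for ai, bi in zip(a, b))
    frac = start_offset / total
    start = tuple(ai + (bi - ai) * frac for ai, bi in zip(a, b))
    return start, dv


def position_after(start, dv, steps):
    """Virtual source position after `steps` unit times of playback."""
    return tuple(p + d * steps for p, d in zip(start, dv))
```

Only the start position and Δv need to be handed to the reproduction side; every later position follows by repeated addition.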
【００７７】
なお、上記例では、角度αと角度β、あるいは点Ａと点Ｂを、指定した音声情報ファイルの記録開始時刻と記録終了時刻に当てはめたが、予めユーザが別の日時を指定しても構わない。また、上述の説明では、一つの音声を聴空間上に定位させるように再生したが、複数の音声を選択して、それらを聴空間上に配置して、並列的に再生させるようにすることもできる。
【００７８】
このように、ユーザは同時に複数の音声を聴取できる。複数の音声を同時に理解することはできないが、単語を拾うことはできる。これにより、記録された音声の文脈や流れを推定することができる。また所望の情報がどれであるか、あるいはどこにあるかの見当をすばやくつけることができる。
【００７９】
［第２の実施の形態］
第１の実施の形態では、聴取者の頭部を中心とした頭部座標系を用いた聴空間上において、再生時間位置に対応して仮想音源を、頭部位置に対して相対的に配置するようにした。すなわち、第１の実施の形態では、音声の再生時間位置に対応して仮想音源を、頭部に対する相対空間上に配置させるようにした。しかし、この発明は、このような相対空間上において、再生時間位置に対応させて仮想音源位置を配置させる場合に限らない。
【００８０】
以下に説明する第２の実施の形態は、絶対空間上に時間軸を位置付け、その絶対空間上の時間軸上において、再生区間の各時間位置に対応した仮想音源位置を配置することができるようにした場合である。
【００８１】
この第２の実施の形態においては、図７に示すように、図１に示した第１の実施の形態の機能ブロック図のブロックに加えて、聴取者の頭部の動きを検出する頭部運動検出部２１を設ける。
【００８２】
頭部運動検出部２１は、例えば３次元磁気センサを用いて構成する。この３次元磁気センサにより頭部の位置、および方向（角度）を検出する。これを用いて、時間軸の位置を、聴取者は絶対空間上に位置づけることが可能になる。
【００８３】
すなわち、頭部が動いたとしても、絶対空間上の時間軸の位置は動かない。この概念を、図８を用いて説明する。
【００８４】
今、時刻Ｔにおける聴取者の位置をＰ（Ｔ）、時刻（Ｔ＋１）における聴取者の位置をＰ（Ｔ＋１）とする。頭部運動検出部２１が無い場合には、頭部がＰ（Ｔ＋１）に移動すると、次の時刻（Ｔ＋２）の音声も、頭部が移動した分だけ移動した位置に聞こえることになる。
【００８５】
この実施の形態の場合には、頭部運動検出部２１によって頭部位置および方向角度情報が検出され、その検出された頭部位置および方向角度情報は、制御部１５を介して仮想音源配置部１９に送られる。仮想音源配置部１９では、頭部位置Ｐ（Ｔ＋１）に基づいて、時刻（Ｔ＋２）以降の音声情報の頭部座標系の位置を計算し、その情報を再生部１７に送る。再生部１７では、送られた仮想音源配置情報から各時刻における頭部伝達関数を特定し、時刻（Ｔ＋２）以降の音声信号を計算し直す。
【００８６】
この頭部の位置が変化した場合における仮想音源配置部１９の動作を、図９のフローチャートを参照して説明する。
【００８７】
すなわち、仮想音源配置部１９は、制御部１５から頭部位置および方向角度情報を受け取ると、ステップ３０１において、頭部位置および方向角度情報が以前の時刻の頭部位置および方向角度情報と同じかを判別する。もし同じであれば、処理を終了し、もし異なれば、ステップ３０２に進む。ステップ３０２では、検出された頭部位置および方向角度情報に基づいて、これから再生する音声の各時刻における頭部座標系における仮想音源位置を特定するための仮想音源配置情報を計算する。そして、ステップ３０３に進み、その仮想音源配置情報を再生部１７に送る。
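The recalculation in steps 301 to 303 can be pictured as re-expressing fixed absolute source positions in the listener's head coordinate system whenever the detected pose changes. The Python sketch below restricts this to the horizontal plane for brevity. The pose format `(x, y, yaw_deg)`, the convention that yaw is measured counter-clockwise from the +x axis with the head's facing direction as its own +x axis, and the class name are all assumptions for illustration.

```python
import math


def to_head_coords(source_xy, head_xy, head_yaw_deg):
    """Express an absolute 2-D position in head-centred coordinates."""
    dx = source_xy[0] - head_xy[0]
    dy = source_xy[1] - head_xy[1]
    yaw = math.radians(head_yaw_deg)
    # Rotate by -yaw so that the facing direction becomes the +x axis.
    x = dx * math.cos(yaw) + dy * math.sin(yaw)
    y = -dx * math.sin(yaw) + dy * math.cos(yaw)
    return x, y


class PlacementUpdater:
    """Recompute head-relative source positions only when the pose changes."""

    def __init__(self):
        self.last_pose = None

    def update(self, pose, upcoming_sources):
        """Return new head-relative positions, or None if the pose is unchanged (step 301)."""
        if pose == self.last_pose:
            return None
        self.last_pose = pose
        x, y, yaw = pose
        # Step 302: positions for the audio still to be reproduced.
        return [to_head_coords(s, (x, y), yaw) for s in upcoming_sources]
```

A real sensor would need a tolerance rather than exact equality when comparing poses; exact comparison is used here only to mirror the "same as before" test in the flowchart.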
【００８８】
次に、この場合の再生部１７の処理動作のフローチャートを図１０に示す。
【００８９】
すなわち、仮想音源配置部１９より各時刻の仮想音源位置に関する情報が送られると、ステップ４０１へ進み、その情報に基づいて、指定された再生区間内の各時刻の音声の頭部伝達関数を特定する。そして、ステップ４０２に進み、音声情報格納部１３に記録されている、これから再生する音声信号のデジタルデータを読み出して、特定された頭部伝達関数とコンボリューション演算を行う。次に、ステップ４０３に進み、ステップ４０２の演算結果をＤ／Ａ変換ボードを介してヘッドフォン７に出力する。
【００９０】
このような機能をもつことで、第２の実施の形態によれば、絶対空間上に時間軸を位置づけ、その時間軸上で音声を再生することが可能である。また、特定の音声をよく聞きたいときに、そこに近づくことで聞き易くするようなことが可能である。
【００９１】
また、第２の実施の形態によれば、絶対空間上に時間軸を位置づけ、その時間軸上で音声を再生することが可能であるので、仮想空間上の時間軸において、仮想音源を再生音声の時間位置に応じて単に配置するだけでなく、画面上に表示されたチャート（時間軸を有し、その時間軸にそって情報が視覚的に表現されたもの）上に、再生音声の時間位置に応じて仮想音源を配置することもできる。
【００９２】
以下に、第２の実施の形態のいくつかの実施例について説明する。
【００９３】
［第１の実施例］
この第２の実施の形態の第１の実施例においては、音声情報の時間軸に関連する表示情報が表示部１６の画面に表示される場合に、再生対象の音声情報が、どの表示情報部分に関連するかを、表示画面上の表示情報に仮想音源を配置することにより、容易に把握することができるようにした場合である。
【００９４】
この場合に、前述したように、表示画面のような固定的な空間上において、仮想音源を配置するようにする場合には、聴取者の頭部の位置が変化すると、音源位置が変化してしまうことになるので、聴取者の頭部位置に応じた補正を行うようにしている。
【００９５】
この第１の実施例を、会議情報の記録再生の場合を例に取って、以下に説明することにする。前述した図７は、この第１の実施例の場合の記録再生装置の機能ブロック図でもある。
【００９６】
この第１の実施例においては、図１に示した第１の実施の形態の機能ブロック図のブロックに加えて、聴取者の頭部の動きを検出する頭部運動検出部２１と、音声情報の時間軸上に関連する情報を視覚的に配置したチャートを作成するチャート作成部２２を設ける。
【００９７】
この第１の実施例の場合には、ユーザ入力部１４から、再生命令と、再生開始オフセットおよび再生終了オフセットを含む再生指示情報とが入力されると、表示部１６の画面には、図１１に示すように、再生区間を示す時間軸バー３１が、チャート作成部２２で作成されて表示される。この図１１の例の場合は、再生開始オフセットおよび再生終了オフセットがともにゼロで、会議開始から終了までを再生区間として指定した状態を示している。
【００９８】
このチャートは、ユーザが入力部１４から直接的に入力して表示させるようにすることもできる。すなわち、この実施例の場合に、ユーザは、ユーザ入力部１４より、時間軸バー、文字、画像情報を入力することにより、制御部１５を介して表示部１６に、時間軸バー３１を表示できる。
【００９９】
そして、この第１の実施例の場合には、再生音声信号は、この表示部１６の画面に表示された時間軸バー３１上の、各再生音声の時間位置に対応する位置に仮想音源が定位するように、仮想音源配置部１９で、仮想音源配置情報が生成され、その仮想音源配置情報により、前述と同様にして、再生部１７において、頭部伝達関数を用いた演算処理が行われる。この場合の仮想音源配置部１９での処理は、前述の図６を用いて説明した場合に相当する。
【０１００】
ただし、前述したように、聴取者の頭部の位置が変化した場合には、仮想音源配置部１９で、その変化後の頭部を中心とした仮想音源配置情報の演算がやり直される。そして、その変化後の仮想音源配置情報が再生部１７に与えられる。さらに、頭部運動検出部２１で検出された頭部の位置情報も、再生部１７に与えられる。そして、再生部１７では、それらの情報に基づく変化後の頭部伝達関数が求められ、その頭部伝達関数が用いられて、各時間位置の音声情報のコンボリューション演算が行われる。
【０１０１】
すなわち、この第１の実施例においては、制御部１５は、時間軸バー３１の表示情報（画面の位置、画面中の時間軸バー３１の位置、時間軸バー上の時間情報）を仮想音源配置部１９に送る。すると、仮想音源配置部１９では、３次元磁気センサより検出される聴取者の頭部位置および方向角度情報と、画面上の時間軸バー３１についての情報から、頭部座標系における時間軸バー３１上の、再生音声の各時間位置における仮想音源位置に関する情報を生成して、再生部１７に供給するようにする。
【０１０２】
なお、仮想音源配置部１９では、ユーザからの音声再生命令があった時に、それに伴って送られてくる再生開始オフセット、再生終了オフセットの情報から、再生対象区間の音声の各時刻における仮想音源の配置位置を決定するのは前述の場合と同様である。
【０１０３】
図１１において、丸印３２は、この実施例の場合の仮想音源の定位位置の例を示すものである。すなわち、表示部１６の画面上の時間軸バー３１上において、再生中の音声の時間位置に応じた位置から音声が出力されるように、仮想音源が配置されるものである。なお、図１１では、概念的に判りやすくするために、丸印３２により仮想音源位置を示したが、実際上は、この丸印３２は表示上は、存在せず、当該丸印３２の位置から音声出力が発せられるように聴取されるものである。
【０１０４】
これにより、図１１の模式図に示すようにして、聴取者は、単に、再生音声を聴取するだけで、現在再生中の音声が、時間軸上のどの位置の音声であるかが分かる。
【０１０５】
［第２の実施例］
この第２の実施例も会議情報の記録再生装置の場合の例である。この第２の実施例の場合の記録再生装置の機能ブロック図を図１２に示す。
【０１０６】
この第２の実施例においては、音声入力部１１は、複数人の会議出席者の各々に割り当てられた複数本のマイクロホンで構成され、発言区間検出部２３では、各会議出席者ごとの発言区間を検出し、その検出結果を、音声情報格納部１３に格納するようにする。この第２の実施例の場合には、チャート作成部２２は、各会議出席者ごとの発言区間を時間軸上において表示するようにする発言者チャートの作成部を構成する。
【０１０７】
各会議出席者の発言区間は、図１３のフローチャートおよび図１４の説明図に示すようにして、発言区間検出部２３において検出される。この例では、各会議出席者のマイクロホンからの音声情報が、予め設定されている或るレベル以上、かつ、或る時間だけ継続した場合には、そのマイクロホンを使用する会議出席者が発言しているとして検出する。
【０１０８】
すなわち、図１４に示すように、マイクロホンから、あるレベルＬ１以上の音声信号が出力されると、ステップ５０１へ進み、予め、発言開始を検出するために適切な単位時間長として定められた時間区間Δｔ１以上に渡って、前記レベルＬ１以上の音声信号レベルが持続するかを監視する。もし、持続しなければ、それは発言とはみなさず、発言区間の検知を終了する。もし、持続したと判別した場合、ステップ５０２へ進み、現在の時刻（Ｔ１）を検出し、Ｔ１−Δｔ１を発言開始時刻とする。
【０１０９】
そして、ステップ５０３へ進み、音声の終了時刻を求めるために、音声が、ある時刻Ｔ２から、予め、発言終了を検出するために適切な単位時間長として定められた時間Δｔ２以上、あるレベルＬ２を下回ったか否かを監視する。そして、下回った場合、ステップ５０４へ進み、Ｔ２―Δｔ２を発言終了時刻として検出する。
【０１１０】
なお、上述の例では、発言開始時刻の検出のためのレベルＬ１と、発言終了時刻の検出のためのレベルＬ２とは、Ｌ１＝Ｌ２としたが、レベルＬ１とレベルＬ２とは、必ずしも等しくなくともよい。
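The detection rule of steps 501 to 504 can be sketched as a small state machine over a sequence of level measurements. The Python sketch below rests on several assumptions that are not in the text: the levels are sampled once per unit time, Δt1 and Δt2 are counted in samples, the end time is taken as the instant at which the level first dropped below L2, and an utterance still open when the data ends is closed at the end of the data.

```python
def detect_speech_sections(levels, l1, l2, dt1, dt2):
    """Return a list of (start, end) times; levels[i] is the level at time i."""
    sections = []
    start = None   # start time of the utterance currently open, if any
    run = 0        # length of the current run of qualifying samples
    for t, level in enumerate(levels):
        if start is None:
            # Steps 501-502: wait for dt1 consecutive samples at or above l1.
            run = run + 1 if level >= l1 else 0
            if run >= dt1:
                start = t + 1 - dt1          # back-date to the first loud sample
                run = 0
        else:
            # Steps 503-504: wait for dt2 consecutive samples below l2.
            run = run + 1 if level < l2 else 0
            if run >= dt2:
                sections.append((start, t + 1 - dt2))   # back-date to the first quiet sample
                start = None
                run = 0
    if start is not None:
        sections.append((start, len(levels)))   # assumption: close at end of data
    return sections
```

Because the end condition requires a sustained quiet period, a short pause inside an utterance does not split it into two sections.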
【０１１１】
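The information on the speech sections of each conference participant detected as described above is compiled into a data structure such as that shown in FIG. 15. The speech section detection unit 23 stores this information on the speech sections in the audio information storage unit 13 as a record of the speech situation.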
【０１１２】
図１５に示すように、記録される発言区間に関する情報は、それぞれの発言区間を識別するための発言識別データ（発言ＩＤ）と、その発言者、発言開始時刻および発言終了時刻とからなる。発言者の情報としては、予め登録された会議出席者名が記録される。なお、発言者名と発言者識別データとの対応テーブルを別に用意して、発言者識別データを発言者名の代わりに記録するようにしてもよい。
【０１１３】
この第２の実施例においては、再生に際し、ユーザは、ユーザ入力部１４からチャート作成の命令と、前述の場合と同様にして、再生時間範囲を指定することができる。チャート作成命令および再生時間範囲が入力されると、その命令および再生時間範囲の情報は、制御部１５を介して、チャート作成部２２へ送られる。
【０１１４】
チャート作成部２２は、音声情報格納部１３から指定された再生時間範囲の発言状況の記録情報を読み出して、図１６に示すような発言者チャートを作成し、制御部１５を通じて表示部１６に表示する。
【０１１５】
発言チャートは、図１６に示すように、発言者を識別するための発言者名Ａ，Ｂ，Ｃを表示する発言者名領域４１と、発言の遷移の状態を視覚的に表示するための発言遷移表示領域４２とから構成される。発言遷移表示領域４２には、発言区間検出部２３で検出された、発言者ごとの発言区間が、矩形のバー表示ＶＢにより表示される。
【０１１６】
図１６において、Ｔｓは、ユーザにより指定された再生区間の開始時刻、Ｔｅは当該再生区間の終了時刻である。図１６の例は、再生開始オフセットおよび再生終了オフセットがゼロの場合の例であり、会議の開始から終了までの区間が対象となっている。
【０１１７】
この発言者チャートにより、各会議出席者が、いつ、どのくらいの時間の発言を行ったのかが、矩形の発言区間バーＶＢの表示位置と長さにより示される。そして、この発言遷移表示領域４２の全会議参加者の発言区間バーＶＢの遷移として表示される発言構造を読み取ることで、誰の発言から誰の発言へと遷移したのかという、前記再生区間における発言遷移構造を読み取ることも可能となる。
【０１１８】
そして、聴取者に対する表示部１６の位置や、発言者チャートの表示部１６内の発言区間バーＶＢの表示位置は、制御部１５から仮想音源配置部１９に送られる。仮想音源配置部１９では、再生中の音声に対応する発言区間バーＶＢの位置を、仮想音源位置とするような仮想音源配置情報を生成し、再生部１７に送る。
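Combining the speech-section records of FIG. 15 with the chart layout, the virtual source for each speaker who is talking at the current playback time can be placed on that speaker's bar. The Python sketch below is one possible reading: it places the source at the current time position along the bar. The record fields follow FIG. 15 as described in the text, while the layout parameters `chart_left`, `chart_width`, and `row_y` are hypothetical screen coordinates.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    utterance_id: int   # speech ID
    speaker: str        # registered attendee name
    start: float        # speech start time
    end: float          # speech end time


def active_sources(utterances, t, ts, te, chart_left, chart_width, row_y):
    """Return {speaker: (x, y)} for every utterance in progress at playback time t."""
    # Horizontal position of time t on a chart spanning the playback section [ts, te].
    x = chart_left + chart_width * (t - ts) / (te - ts)
    return {u.speaker: (x, row_y[u.speaker])
            for u in utterances if u.start <= t < u.end}
```

When several speakers overlap, the result contains one position per speaker, which matches the simultaneous reproduction of several voices described for FIG. 16.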
【０１１９】
再生部１７は、この仮想音源配置情報に基づいて、前述と同様の演算処理を再生する音声信号に対して行い、音声出力部１８としてのヘッドフォン７に出力する。
【０１２０】
図１６の例では、丸印で示す発言区間バーの位置に仮想音源が存在するように、音声出力されている。すなわち、発言者Ａ，Ｂ，Ｃのそれぞれの発言に対応するように定位するようにされる。なお、図１６の例では、３人の発言を同時に再生していることを示している。
【０１２１】
このように、ユーザは、同時に複数の音声を聴取することもできる。複数の音声を同時に理解することは困難であるが、単語を拾うことはできる。これにより、記録された音声情報の文脈や流れを推定することができる。また、所望の情報がどれであるか、あるいはどこにあるかの見当を、素早く付けることができる。
【０１２２】
なお、表示部１６が動く可能性がある場合には、表示部１６の位置を検出する機構を設ければよい。例えば、携帯型のパーソナルコンピュータ上で記録再生装置を実現する場合には、携帯型のパーソナルコンピュータに３次元磁気センサを装着することで実現できる。また、表示部１６は、可搬なパーソナルコンピュータや電子黒板のようなものであってもよい。
【０１２３】
［第３の実施例］
図１７は、第２の実施の形態の記録再生装置の第３の実施例の機能ブロック図である。この第３の実施例は、会議情報の記録再生装置であって、会議において使用した参照資料としての文書の情報を、すべて記録しておき、再生に際して、その文書を表示画面に表示し、会議の再生音声がどの文書に関するものであるかを、仮想音源配置により、再生音声を聴取するだけで把握することができるようにする場合の例である。
【０１２４】
図１７において、文書記録部５１には、会議資料として使用された文書情報が記録されている。この文書情報には、ワープロソフトやプレゼンテーション用のソフトなどで作成されたさまざまな文書を含む。
【０１２５】
表示文書履歴記録部５２では、会議音声情報の記録が行われている際に、表示部１６に表示されていた文書の表示履歴が記録されている。表示文書履歴記録部５２ヘの履歴の記録は、ユーザ入力部１４から文書の表示・非表示の命令が入力される都度、制御部１５によって記録される。
【０１２６】
表示文書履歴記録部５２に記録されている情報の構造を図１８に示す。文書ＩＤは、各文書の識別情報である。同じ文書が異なる時間に再び資料として使用される場合もある。表示開始時刻および表示終了時刻は、文書が表示されていた時間を示すためのものである。
【０１２７】
表示文書記録部５３は、音声情報を再生している時に表示している文書とその表示位置や大きさが記録されている。これは、ある時点での文書の表示状態を表わす記録であり、履歴ではない。したがって、表示状態をユーザが変化させれば、この記録も変化する。
【０１２８】
表示文書記録部５３に記録されている情報の構造の例を図１９に示す。この実施例では、文書の表示位置を、文書の中心位置で記録しているが、文書のどの位置情報をどのように表現するかを限定するものではない。
【０１２９】
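The document chart creation unit 54 has the function of arranging documents along a time axis on the screen of the display unit 16. More specifically, the document chart creation unit 54 creates the information needed to display the documents used in the conference on the display unit 16 in chronological order.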
【０１３０】
図２０は、会議における各文書の表示状況を時間軸上に示したものである。すなわち、図２０では、文書１は、時刻ｔ０〜ｔ１の時間帯において表示され、文書２は、時刻ｔ２〜ｔ３の時間帯において表示され、文書３は、時刻ｔ４〜ｔ５の時間帯において表示されたことを示している。
【０１３１】
図２１は、会議音声の再生に先立ち、表示部１６の表示画面において、前述の第２の実施例と同様の時間軸バー３１に対応して、各文書を並べた文書チャートの例を示すものである。
【０１３２】
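This document chart is created by the document chart creation unit 54. The processing operation of the document chart creation unit 54 is shown in the flowchart of FIG. 22.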
【０１３３】
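That is, the document chart creation unit 54 creates the document chart when a document chart creation command, a chart creation time, and a display document history file name are input from the user input unit 14 via the control unit 15. Here, the chart creation time refers to the time section covered by the time axis bar 31 displayed on the display unit 16.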
【０１３４】
文書チャートの作成方法の別の方法としては、ユーザ自身が、時間軸バー３１と文書を、表示文書履歴記録部５２の情報を参照しながら、手入力で表示画面上に作成する方法がある。なお、その場合には、文書チャート作成部５４は使われない。
【０１３５】
この第３の実施例においては、前述の実施例と同様にして、音声情報を再生する場合に、仮想音源配置部１９は、再生対象の音声の時間位置において記録時に表示されていた文書を検索し、その文書位置に仮想音源を配置するように仮想音源配置情報を生成する。このため、仮想音源配置部１９には、制御部１５を通じて、文書チャートに関する表示情報が与えられている。
【０１３６】
再生部１７は、それを受けて、各音声の時間位置に対応した文書位置を仮想音源として音声出力がなされるように音声信号に対して、前述したような演算処理を施す。
【０１３７】
この結果、再生音声は、表示部１６に表示された時間軸バー３１に沿って表示されている文書に定位して出力するようにされる。図２１の丸印で示したのは、それぞれの文書１、文書２、文書３に対応する音声の定位位置を示している。
【０１３８】
なお、再生音声の時間位置に対応する文書がない場合、すなわち、会議中において、文書が表示されていなかった時間帯の音声情報については、図２１において、破線の丸印で示すように、時間軸バー３１上において、その前後の文書の中心に定位させるように仮想音源位置を配置して、当該再生音声情報の時間軸上の位置は知らせることができる。
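The lookup described above, including the fallback for periods in which no document was displayed, can be sketched as follows. The history tuples mirror the fields of FIG. 18 (document ID, display start time, display end time). The mapping `doc_x` from document ID to a horizontal screen position is hypothetical, as is the handling of times before the first or after the last document, which the text does not specify. A non-empty history sorted by time is assumed.

```python
def document_source_x(history, doc_x, t):
    """Return the horizontal position at which audio recorded at time t is localized.

    history: list of (doc_id, shown_from, shown_until), sorted by time.
    doc_x:   dict mapping doc_id to the document's position along the time axis bar.
    """
    for doc_id, start, end in history:
        if start <= t < end:
            return doc_x[doc_id]                 # a document was on display at time t
    before = [d for d, _, end in history if end <= t]
    after = [d for d, start, _ in history if start > t]
    if before and after:
        # No document at time t: localize midway between the neighbouring documents.
        return (doc_x[before[-1]] + doc_x[after[0]]) / 2
    # Assumption: outside the first or last display period, use the nearest document.
    nearest = before[-1] if before else after[0]
    return doc_x[nearest]
```

The same document ID may appear in the history more than once, since a document can be shown again later in the conference; the lookup handles that without change.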
【０１３９】
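Depending on the relationship between the size of the displayed documents and the scale of the time axis, the localization position of the audio information relating to a document may deviate from the position of that document. When this deviation is to be taken into account, it is more desirable for the virtual sound source arrangement unit 19 to identify, from the information recorded in the display document history recording unit 52, the time period of the audio information related to the document and, during that time period, to localize the audio at the center position of the document using the information in the display document recording unit 53.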
【０１４０】
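Alternatively, by using the chart creation unit 22 together with the document chart creation unit 54, the speaker chart and the document chart can be displayed overlaid, in association with the time axis bar 31, on the display screen of the display unit 16. FIG. 23 shows an image of this case. In this case, the virtual sound source position corresponding to each time position of the reproduced audio may be on the speaker chart or on the document chart.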
【０１４１】
さらに、別の実施例として、音声再生時に文書を必ずしも時間軸に沿って表示しないような場合も考えられる。図２４は、文書の表示位置が時間軸に沿ってはいない場合の表示例である。
【０１４２】
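In such an example, the virtual sound source arrangement unit 19 refers to the display document history recording unit 52 to identify the time period in which a document was displayed and its document ID. When that time period is reached, it determines from the display document recording unit 53 whether the document is currently displayed; if it is, the audio is localized there, as indicated by the solid circle 61 in FIG. 24. If the document is not displayed, the audio is localized at a predetermined place where no document is displayed, for example the place indicated by the broken-line circle 62 in FIG. 24.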
【０１４３】
以上のようにして、この発明による記録再生装置の第１および第２の実施の形態においては、再生された音声を聴取するだけで、その再生音声の時間位置が、記録音声情報の全体中で、どの時間位置のものであるかを容易に把握することができる。
【０１４４】
また、いつ、どのような話題があったかを記憶する時に、頭部座標系、または絶対空間の座標系の場所に話題を結び付けて記憶することができる。
【０１４５】
また、様々な音声情報にアクセスした場合、それらの時間関係を記録しておかなくとも、空間的な位置と話題を対応づけて記憶することによって、話題の流れをつかみ易くなる。このようなことは、我々が日常用いる記憶方略にも役立てることが可能である。
【０１４６】
また、再生したい話題を再生するためのインデックスが、音声情報に付与されていなくとも、その話題が再生された時の音声の位置から時刻を連想できるという効果もある。
【０１４７】
また、聴取者は、表示チャート上の視覚情報と重ねあわせて音声情報を聴取することができる。それによって、音声情報の理解が促進される。
【０１４８】
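Note that, in the second embodiment described above, when a head-mounted display is used as the display unit 16, the head movement detection unit 21 is unnecessary.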
【０１４９】
【発明の効果】
以上説明したように、請求項１の発明によれば、再生された音声を聴取するだけで、その再生音声の時間位置が、記録音声情報の全体中で、どの時間位置のものであるかを容易に把握することができる。また、再生中の音声情報の音像定位位置は、時間の経過とともに変化するので、音声情報の時間的な流れを容易に把握することができる。
【０１５０】
また、請求項２の発明によれば、音声情報の全体を再生する場合でなくとも、その指定された音声再生区間が、全体のどの当たりのものであるかを知ることができる。また、その再生区間内における音声情報の時間的な流れも容易に把握することができる。
【０１５１】
また、請求項３の発明によれば、表示画面に表示された音声に関連する表示情報に再生音声出力が定位するようにできるので、例えば、会議における各発言者の音声出力は、その発言者の発言区間の表示部分からあたかも出力されるかの如く聴取されるようにすることができる。
【０１５２】
また、請求項４の発明によれば、指定された再生区間の音声に関連する、例えば、会議において提示された文書資料などを表示画面に表示したときに、その関連する文書に音声出力を定位するようにすることができるため、聴取者は、再生中の音声が、どの文書資料に関する音声情報であるかを容易に把握することができる。
【０１５３】
また、請求項５の発明によれば、頭部伝達関数が用いられて音声処理が行われて、音像が、定位感良く、上述のような聴空間上の所定の位置に配置できる。
【０１５４】
また、請求項６の発明によれば、聴取者の頭の位置が変わっても、その頭部の運動を検出して、その頭部運動に応じた仮想音源配置を行うことができるので、聴空間として絶対空間内に音像を位置付けることができる。
【図面の簡単な説明】
【図１】この発明による記録再生装置の第１の実施の形態の機能ブロック図である。
【図２】この発明による記録再生装置の第１の実施の形態が適用されるシステムの全体の概要を説明するための図である。
【図３】この発明による記録再生装置の第１の実施の形態の再生部の動作の例のフローチャートである。
【図４】この発明による記録再生装置の第１の実施の形態の仮想音源配置部の動作の例のフローチャートである。
【図５】この発明による記録再生装置の第１の実施の形態における仮想音源配置の方法の例を示す概念図である。
【図６】この発明による記録再生装置の第１の実施の形態における仮想音源配置の方法の例を示す概念図である。
【図７】この発明による記録再生装置の第２の実施の形態の第１の実施例の機能ブロック図である。
【図８】頭部の動きと、音の定位位置との関係を説明するための図である。
【図９】この発明による記録再生装置の第２の実施の形態の第１の実施例において、頭部運動検出部を具備した時の仮想音源配置部の動作を説明するためのフローチャートである。
【図１０】この発明による記録再生装置の第２の実施の形態の第１の実施例において、頭部運動検出部を具備した時の再生部の動作を説明するためのフローチャートである。
【図１１】第１の実施例において、表示部上の時間軸上で再生音声出力を定位させている様子を示す図である。
【図１２】この発明による記録再生装置の第２の実施の形態の第２の実施例の機能ブロック図である。
【図１３】第２の実施例における発言区間検出部の動作を説明するためのフローチャートである。
【図１４】第２の実施例における発言区間検出部の動作を説明するために用いる図である。
【図１５】第２の実施例における発言区間検出部の動作を説明するために用いる図である。
【図１６】第２の実施例において、表示部上の発言者チャート上で再生音声出力を定位させている様子を示す図である。
【図１７】この発明による記録再生装置の第２の実施の形態の第３の実施例の機能ブロック図である。
【図１８】第３の実施例における表示文書履歴記録部の記録情報の構造の例を示す図である。
【図１９】第３の実施例における表示文書記録部の記録情報の構造の例を示す図である。
【図２０】第３の実施例において、記録時に文書が表示されていた様子を時間軸上で示した図である。
【図２１】第３の実施例において、表示部上の文書チャート上で再生音声出力を定位させている様子を示す図である。
【図２２】第３の実施例における文書チャート作成部の動作を説明するためのフローチャートである。
【図２３】第３の実施例において、表示部上の発言者チャートあるいは文書チャート上で再生音声出力を定位させる場合の表示画面を示す図である。
【図２４】第３の実施例において、表示部上の文書チャート上で再生音声出力を定位させる場合の表示画面を示す図である。
【符号の説明】
１１音声入力部
１２記録音声信号処理部
１３音声情報格納部
１４ユーザ入力部
１５制御部
１６表示部
１７再生部
１８音声出力部
１９仮想音源配置部
２１頭部運動検出部
２２チャート作成部
２３画像入力部
２４画像信号記録処理部
２５画像情報格納部
２６発言区間検出部
３１時間軸バー
５１文書記録部
５２表示文書履歴記録部
５３表示文書記録部
５４文書チャート作成部
[0001]
[Technical Field of the Invention]
The present invention relates to a recording / reproducing apparatus and a recording / reproducing method for recording and reproducing audio information in, for example, a conference.
[0002]
[Prior art]
Typical commercially available recording / reproducing apparatuses of this type are cassette tape recorders and disc recorders that use, for example, a MiniDisc. These record audio information on a recording medium such as a cassette tape or a disc and, when playback is needed, can start reproduction from a desired point specified by a time or by the device's counter value.
[0003]
In particular, in the case of audio recorded on a disk medium such as a mini-disc, a random access function enables instant access to a desired audio information recording position.
[0004]
In this type of recording / reproducing apparatus, at the time of reproduction, which part of the recorded audio is being reproduced can be confirmed by looking at the displayed counter value and time information.
[0005]
As a system that further enhances the functions of these commercially available devices, a recording / reproducing system particularly targeting a conference, a lecture, and the like has been conventionally proposed. A typical example thereof is described in JP-A-6-343146.
[0006]
This system has a function of recording multimedia information such as audio and video of a conference, and recording user input information such as a pen input and a keyboard input of a conference participant, and the input time. Then, a mechanism is provided so that multimedia information related to the user input information can be reproduced later by using the input time of the user input information.
[0007]
While recorded audio is being reproduced, a screen showing the audio waveform along a time axis is displayed, together with a mark indicating the portion being reproduced, so the user can confirm which part of the recording is being reproduced.
[0008]
[Problems to be solved by the invention]
With commercially available devices such as the tape recorders and disc recorders cited above as prior art, and with recording / reproducing systems such as that described in JP-A-6-343146, confirming which part of the recorded information, or information recorded at what time, is being reproduced requires looking at a visual display that shows the counter value or the time.
[0009]
In addition, the random access function of a disc recorder makes it easy to access various points freely, regardless of the original flow of time during recording, so accesses are often made in an order different from the original flow of time. In such cases it is difficult to grasp the temporal relationship between the pieces of information at the individual access points. Moreover, to grasp the temporal flow of the audio information as recorded from the reproduced audio, the user has to remember the time or counter value at each playback access position and reconstruct the original flow of the audio information from them.
[0010]
In view of the above, an object of the present invention is to provide a recording / reproducing apparatus that allows the temporal position of the playback point on the recording medium to be grasped from auditory information alone and, furthermore, allows the temporal flow of the recorded audio information to be grasped easily from auditory information alone.
[0011]
[Means for Solving the Problems]
In order to solve the above-mentioned problems, a recording / reproducing apparatus according to the first aspect of the present invention includes:
Recording audio signal processing means for processing input audio information for recording;
Audio information storage means for storing the audio information processed by the recording audio signal processing means in a recording medium together with additional information on the entire recording time from the beginning to the end,
Reproduction designating means for designating audio information to be reproduced from among the audio information recorded on the recording medium,
Virtual sound source arrangement means for generating, from the additional information of the audio information to be reproduced that is designated by the reproduction designating means, virtual sound source arrangement information for determining a virtual sound source position in a listening space in accordance with the time position, within the total recording time, of the audio information to be reproduced;
Reproducing means for receiving the virtual sound source arrangement information from the virtual sound source arrangement means, reading the audio information designated by the reproduction designating means from the recording medium, and processing the read audio information so that the audio output at each time position within the total recording time is heard localized at the position in the listening space determined on the basis of the virtual sound source arrangement information,
Audio output means for outputting audio based on the audio information reproduced by the reproduction means,
Control means for controlling the operation of each means,
It is characterized by having.
[0012]
According to a second aspect of the present invention, in the recording and reproducing apparatus according to the first aspect,
The reproduction designating means designates audio information to be reproduced from among the audio information recorded on the recording medium, and also designates a desired playback section within that audio information,
The virtual sound source arrangement means also acquires information on the playback section designated by the reproduction designating means, generates virtual sound source arrangement information for determining a position in the listening space for each time position from the beginning to the end of the designated playback section, and sends that virtual sound source arrangement information to the reproducing means, and
The reproducing means processes the audio information within the designated playback section so that the audio output at each time position is heard localized at the position in the listening space determined on the basis of the virtual sound source arrangement information.
It is characterized by the above.
[0013]
According to a third aspect of the present invention, in the recording and reproducing apparatus according to the first aspect,
Display information generating means for generating display information related to the time axis of the audio information, and
Display means for displaying the display information generated by the display information generating means on a display screen
are further provided, and
The virtual sound source arrangement means places the position in the listening space corresponding to each time position of the audio information to be reproduced on the display information displayed on the display screen.
It is characterized by the above.
[0014]
The recording / reproducing apparatus of the invention according to claim 4 is:
Recording audio signal processing means for processing input audio information for recording;
Audio information storage means for recording and storing, on a recording medium, the audio information processed by the recording audio signal processing means,
Related information storage means for recording and storing related information related to the input audio information, together with the correspondence between the input audio information and a recording position on the recording medium,
Display means for displaying a plurality of the related information read from the related information storage means on a display screen,
Playback designation means for designating audio information to be played from among the audio information recorded on the recording medium,
The related information corresponding to each time position of the audio information to be reproduced, as designated by the reproduction designating means, is detected based on the information from the related information storage means, and the position in the listening space corresponding to each time position of the designated audio information is arranged on the corresponding piece of related information among the plurality of pieces of related information displayed on the display screen.
The apparatus is characterized by the above.
[0015]
According to a fifth aspect of the present invention, in the recording / reproducing apparatus according to any one of the first to fourth aspects,
The reproduction means realizes localization of a sound image by using an operation using a head-related transfer function.
[0016]
According to a sixth aspect of the present invention, in the recording / reproducing apparatus according to any one of the first to fourth aspects,
Comprising a head movement detecting means for detecting the head movement of the listener of the audio output,
The virtual sound source arrangement unit generates the virtual sound source arrangement information in consideration of the head movement detected by the head movement detection unit.
[0017]
[Action]
According to the first aspect of the present invention, when audio information to be reproduced is designated by the reproduction designating means, the virtual sound source arranging means detects the total recording time from the additional information of the designated audio information. It then generates virtual sound source arrangement information that determines, for the audio information sequentially reproduced from the recording medium, a virtual sound source position in the listening space corresponding to each time position within the total recording time, and sends this virtual sound source arrangement information to the reproducing means.
[0018]
The reproducing means reads the audio information designated by the reproduction designating means from the recording medium and processes it so that it is heard as localized at the position in the listening space determined from the virtual sound source arrangement information supplied by the virtual sound source arranging means. The processed audio information is then supplied to the audio output means.
[0019]
As a result, the sound output from the audio output means is heard as if the reproduced audio information were localized at the position in the listening space corresponding to its time position within the total recording time. Therefore, simply by listening to the audio, the listener can tell which part of the designated audio information is currently being reproduced. Further, since the sound image localization position of the audio information being reproduced changes with the passage of time, the temporal flow of the audio information can be easily grasped.
[0020]
According to the second aspect of the present invention, the reproduction designating means can designate a desired reproduction section within the audio information to be reproduced, and the virtual sound source arranging means generates virtual sound source arrangement information that determines a position in the listening space corresponding to each time position within the designated reproduction section. Therefore, even if the audio information is not reproduced in its entirety, the listener can tell which part of the designated reproduction section is being reproduced. Further, the temporal flow of the audio information within the reproduction section can be easily grasped.
[0021]
According to the third aspect of the invention, the display information related to the time axis of the audio information is generated by the display information generating means. For example, when the input voice information is the conference voice, each speech section of a plurality of conference participants is detected, and the speech section of each conference participant is displayed as a chart on the display screen of the display means.
[0022]
The virtual sound source arranging means arranges a position in the listening space corresponding to each time position of the audio information to be reproduced on the display information on the display screen. In the example in which the chart of the utterance section is displayed, virtual sound source arrangement information for causing the reproduced sound output to be localized in the utterance section of the speaker of the sound on the chart is generated.
[0023]
Therefore, the audio information at each time position is processed by the reproducing means according to the virtual sound source arrangement information and is output as sound by the audio output means, so that, for example, each speaker's voice is heard as if it were output from the corresponding part of the display.
[0024]
In the invention according to claim 4, the related information related to the sound information recorded by the sound recording means is stored in the related information storage means in association with the sound information. The related information includes, for example, document materials presented at the meeting.
[0025]
Then, at the time of reproduction, when the audio information is reproduced while a plurality of pieces of related information are displayed on the screen, the virtual sound source arranging means localizes the audio output so that it appears to come from the piece of related information, displayed on the display screen, that is stored in association with the corresponding time position. In the case of the conference document materials mentioned above, the sound is output as if from the position of the document material that was being presented when the audio being reproduced was recorded, so the listener can easily grasp which document material the audio information relates to.
[0026]
According to the fifth aspect of the present invention, the sound processing is performed using the head-related transfer function, and the sound image can be arranged at a predetermined position in the listening space as described above with a good sense of localization.
[0027]
According to the invention of claim 6, even if the position of the listener's head changes, the movement of the head is detected and the virtual sound source is arranged in accordance with that movement, so a sound image can be positioned in an absolute space serving as the listening space.
[0028]
[Best Mode for Carrying Out the Invention]
Hereinafter, an embodiment of a recording and reproducing apparatus according to the present invention will be described with reference to the drawings.
[0029]
[First Embodiment]
FIG. 2 shows an overview of the system of the recording / reproducing apparatus according to the first embodiment. An audio signal collected by a microphone 1 is converted into a digital signal by an A / D conversion board 2 and input to, for example, a personal computer 3. In the personal computer 3, the input digital audio signal is subjected to compression processing as necessary and is recorded on, for example, a built-in hard disk.
[0030]
A keyboard 4 and a mouse 5 are connected to the personal computer 3 as a user operation input unit. When a reproduction instruction is given from the user operation input unit, the personal computer 3 reads out the designated digital audio information from the hard disk and, in this example, supplies it to the headphones 7, where it is reproduced and output as sound. Reference numeral 8 denotes a display for displaying display information of the personal computer 3.
[0031]
In the first embodiment, the personal computer 3 performs arithmetic processing, described later, that places the perceived virtual sound source position (sound image position) at a predetermined position in the listening space, so that the listener wearing the headphones 7 can easily grasp, simply by listening to the reproduced sound, which time position within a single piece of audio information is being reproduced.
[0032]
FIG. 1 is a functional block diagram showing the recording / reproducing apparatus according to the first embodiment in terms of its processing functions. That is, the recording and reproducing apparatus according to the first embodiment includes an audio input unit 11, a recording audio signal processing unit 12, an audio information storage unit 13, a user input unit 14, a control unit 15, a display unit 16, a reproducing unit 17, an audio output unit 18, and a virtual sound source arranging unit 19.
[0033]
The audio input unit 11 is an input unit for an audio signal to be recorded. In this example, the audio input unit 11 includes the microphone 1. The recording audio signal processing unit 12 is a part that performs audio signal processing for recording such as A / D conversion and compression processing, and includes an A / D conversion board 2 and software of a personal computer 3.
[0034]
The audio information storage unit 13 records and stores the audio signal from the recording audio signal processing unit 12 on a recording medium. In this example, a hard disk is used as the recording medium. A semiconductor device such as a memory, a magneto-optical disk such as a so-called MO or MD, or a floppy disk can also be used as the recording medium.
[0035]
The user input unit 14 receives a user input, and in this example, includes the keyboard 4 and the mouse 5.
[0036]
The control unit 15 is configured by software of the personal computer 3. It receives and analyzes user input from the user input unit 14 and, in accordance with that input, controls the recording audio signal processing unit 12, the audio information storage unit 13, the reproducing unit 17, and the virtual sound source arranging unit 19, thereby controlling the writing and reading of audio information to and from the recording medium and the localization processing of the sound image. In this embodiment, the control unit 15 also controls the display unit 16, the virtual sound source arranging unit 19 described later, and the like.
[0037]
The display unit 16 displays predetermined information under the control of the control unit 15. In this example, the display of the personal computer 3, which uses a CRT or an LCD (liquid crystal display) as its display element, serves as the display unit 16.
[0038]
The reproduction unit 17, under the control of the control unit 15, receives virtual sound source arrangement information from the virtual sound source arranging unit 19, which will be described later, and converts the audio information read from the audio information storage unit 13 into a signal to be supplied to the audio output unit 18. It is composed of the D / A conversion board 6 and software of the personal computer 3.
[0039]
In this embodiment, the reproducing unit 17 performs arithmetic processing that localizes the sound so that the audio at each time position in the designated reproduction section of the user-designated audio information is heard as if output from a virtual sound source at the position in the listening space corresponding to that time position.
[0040]
The calculation method used in the reproduction unit 17 of this embodiment is based on the calculation method described in “Development of a Psycho-Experience System for Living Space Sound,” Journal of the Acoustical Society of Japan, Vol. 49, No. 7, 1993, pages 515 to 521.
[0041]
This method is a calculation method for realizing sound localization so that sound is heard from a specified virtual sound source direction in a virtual listening space. Given the position and direction of the listener's head and the position of the virtual sound source, a head-related transfer function is determined, and a convolution operation is performed between the head-related transfer function and the digital audio signals of the left and right channels. This makes it possible to output audio that is localized at the specified virtual sound source position in the virtual listening space.
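As a rough illustration of the convolution step described above, the following Python sketch convolves a sample sequence with a left-ear and a right-ear impulse response (the time-domain form of a head-related transfer function). The function names, the use of a single monaural input, and the very short impulse responses are assumptions made for illustration; they are not taken from this specification or from the cited paper.

```python
def convolve(signal, impulse):
    """Direct-form discrete convolution of two sample sequences."""
    if not signal or not impulse:
        return []
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def localize(signal, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one virtual source position.

    hrir_left / hrir_right are hypothetical impulse responses for the
    left and right ears at that position.
    """
    return convolve(signal, hrir_left), convolve(signal, hrir_right)


# Example: a unit impulse simply reproduces each ear's impulse response.
left, right = localize([1.0, 0.0], [0.0, 1.0], [0.5])
```

In practice the impulse responses are hundreds of samples long and an FFT-based convolution would normally be used; the direct form is shown only because it makes the operation explicit.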
[0042]
In this embodiment, the reproducing unit 17 holds, in a memory such as a ROM, a correspondence table that relates the position and direction of the listener's head and the position of the virtual sound source to the corresponding head-related transfer function. That is, a number of assumed combinations of listener head position and direction and virtual sound source position are prepared in advance, the head-related transfer function for each combination is measured, and the resulting table is stored in, for example, a ROM.
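The table lookup described above might be sketched as follows in Python. This is only an illustration under stated assumptions: the head is treated as fixed, so the table is keyed by the azimuth of the virtual sound source alone; the nearest stored entry is selected, which is a choice made here and is not stated in the specification; and the impulse response values are placeholders, not measured data.

```python
# Hypothetical table: source azimuth in degrees -> (left, right) impulse responses.
# The numbers are placeholders for illustration, not measured HRTF data.
HRIR_TABLE = {
    30.0: ([0.9, 0.1], [0.5, 0.2]),
    90.0: ([0.7, 0.2], [0.7, 0.2]),
    150.0: ([0.5, 0.2], [0.9, 0.1]),
}


def lookup_hrir(azimuth_deg, table=HRIR_TABLE):
    """Return the stored impulse-response pair whose azimuth is closest."""
    nearest = min(table, key=lambda stored: abs(stored - azimuth_deg))
    return table[nearest]


pair = lookup_hrir(100.0)  # selects the entry stored for 90 degrees
```

A fuller table would also be indexed by head position and direction, as the paragraph above describes.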
[0043]
The virtual sound source arranging unit 19 generates the virtual sound source arrangement information used to determine the head-related transfer function and supplies it to the reproducing unit 17. That is, in the first embodiment, based on the position information on the time axis of the reproduction section of the audio information requested to be reproduced, it determines the position in the listening space corresponding to each position on the time axis, generates virtual sound source arrangement information for arranging the virtual sound source at that position, and supplies this information to the reproducing unit 17. In this example, the virtual sound source arranging unit 19 is configured by software of the personal computer 3.
[0044]
The position information on the time axis of the audio information used in the virtual sound source arranging unit 19 is provided from the control unit 15 and the audio information storage unit 13, as described later. The time axis information from the control unit 15 reflects the user input from the user input unit 14, and the time axis information from the audio information storage unit 13 is information, such as the recording start time and the recording end time, recorded in association with the audio information.
[0045]
The audio output unit 18 converts the reproduced audio signal into sound and emits it. In this example, the audio output unit 18 consists of the headphones 7. As a result of the virtual sound source localization processing in the reproducing unit 17 described above, the sound output from the headphones 7 is heard as if it came from a virtual sound source located, in the virtual listening space, at the position corresponding to each time axis position of the reproduced signal.
[0046]
The processing at the time of recording and the processing at the time of reproducing the audio information in the first embodiment will be described in detail below.
[0047]
[Explanation of the first embodiment at the time of audio recording]
First, a recording scene of audio information in the first embodiment will be described, and an audio input unit 11, a recorded audio signal processing unit 12, and an audio information storage unit 13 will be described.
[0048]
When a voice recording command is input from the user input unit 14, the command is processed by the control unit 15 and sent to the recording voice signal processing unit 12 and the voice information storage unit 13. In the recording audio signal processing unit 12, the A / D conversion board performs A / D conversion processing of the input audio signal from the audio input unit 11 in accordance with the audio recording command, and performs necessary compression processing. The audio information storage unit 13 starts recording audio information from the recording audio signal processing unit 12.
[0049]
When a recording end instruction is input from the user input unit 14, the control unit 15 sends the instruction to the recording audio signal processing unit 12 and the audio information storage unit 13. In response to the recording end command, the recording audio signal processing unit 12 ends the audio A / D conversion processing and compression processing, and the audio information storage unit 13 ends the recording of audio information.
[0050]
At the time of the audio recording, the audio information storage unit 13 records the recorded audio in association with the time information of the recording start time and the recording end time.
[0051]
The audio information storage unit 13 stores and manages audio information that is temporally continuous from the start of recording to the end of recording as one audio information file. The audio information file name, given by a user input operation on the user input unit 14, is recorded in the audio information storage unit 13 via the control unit 15. Further, so that a plurality of audio information files can be stored, the position of each audio information file on the recording medium is also stored and held.
[0052]
In the first embodiment, the time information relating to each audio information file need only allow the total recording time (the time from the start of recording to the end of recording) of the file to be determined. It may therefore be a recording start time and a recorded audio information amount, or a recorded audio information amount and a recording end time.
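The following Python sketch illustrates how the total recording time could be derived from either form of time information. It assumes that, when only the recorded audio information amount is known, the audio has a constant data rate; the parameter names and the bytes-per-second figure are hypothetical and are not given in the specification.

```python
from datetime import datetime


def total_recording_seconds(start=None, end=None,
                            amount_bytes=None, bytes_per_second=None):
    """Total recording time from start/end times, or from a data amount.

    The data-amount path assumes a constant data rate (an assumption
    made for this sketch).
    """
    if start is not None and end is not None:
        return (end - start).total_seconds()
    if amount_bytes is not None and bytes_per_second:
        return amount_bytes / bytes_per_second
    raise ValueError("not enough information to determine the recording time")


# From recording start and end times (30 minutes).
t1 = total_recording_seconds(start=datetime(1998, 3, 9, 10, 0, 0),
                             end=datetime(1998, 3, 9, 10, 30, 0))

# From the recorded amount: 16 kHz, 16-bit mono is 32000 bytes per second.
t2 = total_recording_seconds(amount_bytes=1_920_000, bytes_per_second=32_000)
```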
[0053]
In this embodiment, the user can have the display unit 16 show which audio information files are stored in the audio information storage unit 13. In this example, the file names of the stored audio information files are displayed. The displayed file names not only let the user know what audio information is stored, but also allow the user to designate reproduction of an audio information file by selecting its file name through the user input unit 14, for example with the mouse.
[0054]
[Explanation at the time of audio reproduction of the first embodiment]
Next, the operation at the time of reproduction in the case of the first embodiment will be described below.
[0055]
At the time of reproduction, the user inputs a reproduction request including a reproduction command, a sound output destination, and reproduction sound designation information from the user input unit 14. The audio output destination is the headphone 7 in this example, and the reproduction audio designation information includes an audio information file name, a reproduction start offset, and a reproduction end offset.
[0056]
The reproduction start offset indicates how far from the beginning of the audio information file reproduction is to start; if it is zero, reproduction from the beginning of the audio information file is requested. Similarly, the reproduction end offset indicates how far before the recording end point of the audio information file reproduction is to stop; if it is zero, reproduction up to the end of the audio information file is requested.
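The relationship between the two offsets and the reproduction section can be sketched in Python as follows. The sketch assumes both offsets are expressed in seconds, with the start offset measured forward from the beginning of the file and the end offset measured backward from the recording end point; the function name is hypothetical.

```python
def playback_section(total_seconds, start_offset, end_offset):
    """Return (Ts, Te), the reproduction start and end times within the file."""
    ts = start_offset
    te = total_seconds - end_offset
    if not 0 <= ts <= te <= total_seconds:
        raise ValueError("offsets do not describe a valid reproduction section")
    return ts, te


whole_file = playback_section(1800, 0, 0)     # both offsets zero: entire file
middle = playback_section(1800, 300, 600)     # skip first 5 min and last 10 min
```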
[0057]
When the control unit 15 determines that the user input from the user input unit 14 is a reproduction request including a reproduction command, it passes the reproduction request information, consisting of the reproduction command, the audio output destination, and the reproduction audio designation information, to the reproducing unit 17 and holds the current time at the time of the reproduction request.
[0058]
Hereinafter, the operation of the reproduction unit 17 when the information of the reproduction request is passed from the control unit 15 and the operation of the virtual sound source arranging unit 19 that operates in association with the reproduction unit 17 will be described. First, the operation of the reproduction unit 17 will be described with reference to the flowchart of FIG.
[0059]
Upon receiving the audio information reproduction request, the reproduction unit 17 starts the processing routine of FIG. 3 and first, in step 101, sends the reproduction audio designation information (audio information file name, reproduction start offset, reproduction end offset) to the virtual sound source arranging unit 19.
[0060]
When the virtual sound source arranging unit 19 receives the reproduction audio designation information, it generates, as described later, virtual sound source arrangement information for arranging a virtual sound source at the position in the listening space corresponding to each time position of the audio information in the reproduction section from the reproduction start offset to the reproduction end offset.
[0061]
In step 102, the reproduction unit 17 receives the virtual sound source arrangement information generated by the virtual sound source arranging unit 19. The process then proceeds to step 103, where the head-related transfer function for the audio information to be reproduced at each time is determined by referring to the information recorded in the table, using the virtual sound source arrangement information and the information on the position of the listener's head.
[0062]
In this example, since the audio output unit 18 is a pair of headphones worn on the head, the direction and position of the listener's head may be regarded as fixed, so fixed position and orientation information is given.
[0063]
When the head-related transfer function has been determined in this way, the process proceeds to step 104, where a convolution operation is performed between the digital data of the audio information of the designated reproduction section stored in the audio information storage unit 13 and the head-related transfer function, yielding the audio signal to be supplied to the headphones 7 at each time of the audio information to be reproduced. The process then proceeds to step 105, in which the reproduced audio signal calculated in step 104 is supplied to the headphones 7 serving as the audio output unit 18, and the audio is output sequentially.
[0064]
Next, the operation of the virtual sound source arranging unit 19 during the reproduction will be described with reference to the flowchart of FIG.
[0065]
In the virtual sound source arranging unit 19, when the reproduction audio designation information (the audio information file name, the reproduction start offset, and the reproduction end offset) is sent from the reproduction unit 17, the process proceeds to step 201, where the recording start time and the recording end time of the designated audio information file are obtained from the audio information storage unit 13, and the total recording time of the designated audio information file is determined.
[0066]
Next, the process proceeds to step 202, where a virtual sound source position in the listening space of the reproduced sound is determined. FIG. 5 shows a conceptual diagram of a method of determining the virtual sound source position in this case.
[0067]
First, in this example, as shown in FIG. 5, a coordinate system of an arc-shaped space with radius r centered on the user's head 9 is virtually defined as the user's head coordinate system. Then, in this head coordinate system, a time range is associated with the range from angle α to angle β that delimits a predetermined listening space, and the virtual sound source for each time position of the reproduced sound is arranged in this virtual time space.
[0068]
In this case, the time range associated with the listening space range can be defined in any manner. In this embodiment, as described above, it is defined by the recording start time and the recording end time of the audio information file acquired from the audio information storage unit 13: the recording start time Rs of the audio information file is made to correspond to the angle α, and the recording end time Re to the angle β. In this example, the angle α is, for example, 30 degrees, and the angle β is, for example, 150 degrees.
[0069]
After the user's head coordinate system has been defined in this way, the angles θs and θe in the listening space that correspond, respectively, to the reproduction start time Ts and the reproduction end time Te specified by the reproduction start offset and the reproduction end offset of the reproduction audio designation information are obtained. The angles θs and θe give the virtual sound source positions at the reproduction start time Ts and the reproduction end time Te.
[0070]
Next, a unit angle Δθ per unit time Δt is obtained in an angle range from the angle α to the angle β. Thus, in the section between the reproduction start time Ts and the reproduction end time Te, the angular position is determined for each unit time Δt, and each angular position is determined as the virtual sound source position of the sound at the corresponding time.
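The mapping from time to angle described in the preceding paragraphs can be sketched in Python as below. The sketch reads the correspondence between the recording period [Rs, Re] and the angle range [α, β] as a linear one, which is what a constant angle Δθ per unit time Δt implies; the function names are hypothetical, and times are taken in seconds.

```python
def time_to_angle(t, rs, re, alpha=30.0, beta=150.0):
    """Angle (degrees) of the virtual sound source for time t, with Rs -> alpha, Re -> beta."""
    return alpha + (beta - alpha) * (t - rs) / (re - rs)


def angle_step(rs, re, dt, alpha=30.0, beta=150.0):
    """Unit angle (degrees) swept per unit time dt."""
    return (beta - alpha) * dt / (re - rs)


# A 20-minute recording (Rs = 0 s, Re = 1200 s), reproduced from Ts = 300 s to Te = 900 s.
theta_s = time_to_angle(300, 0, 1200)   # angle at the reproduction start time
theta_e = time_to_angle(900, 0, 1200)   # angle at the reproduction end time
d_theta = angle_step(0, 1200, 10)       # angle swept in each 10 s unit
```

With these values, a listener hears the reproduced section sweep across only the middle of the arc, which is how the position of the section within the whole recording becomes audible.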
[0071]
When the position at which the sound at each time in the designated reproduction section should be localized, that is, the information for determining the virtual sound source position, has been obtained, the process proceeds to step 203. There, the virtual sound source arranging unit 19 sends to the reproduction unit 17, as virtual sound source arrangement information, the radius r of the head coordinate system forming the listening space, the angle in the listening space at the reproduction start time Ts determined by the reproduction start offset, and the angle Δθ per unit time Δt.
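As an illustration of how the receiving side might expand these three values (radius r, starting angle, and angle Δθ per unit time) into a sequence of source positions, consider the following Python sketch. The two-dimensional coordinates and the axis convention (angle measured counter-clockwise from the x axis, head at the origin) are assumptions made here; the specification does not define them.

```python
import math


def source_positions(r, theta_s_deg, d_theta_deg, steps):
    """Positions (x, y) of the virtual source at the start and after each unit time."""
    positions = []
    for k in range(steps + 1):
        theta = math.radians(theta_s_deg + k * d_theta_deg)
        positions.append((r * math.cos(theta), r * math.sin(theta)))
    return positions


# Radius 1, starting at 30 degrees, advancing 60 degrees per unit time, two units.
path = source_positions(1.0, 30.0, 60.0, 2)   # angles 30, 90, 150 degrees
```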
[0072]
As described above, based on these pieces of information, the reproduction unit 17 applies, to the audio information in the designated reproduction section of the audio information file read from the audio information storage unit 13, audio processing that makes the sound be heard from the virtual sound source position corresponding to each time position in the section.
[0073]
In this embodiment, reproducing and outputting the sound as described above causes the virtual sound source to be perceived as moving through the listening space as the time position of the sound advances.
[0074]
In the above example, the audio output at each time is arranged on a circle having a radius r centered on the head. However, another arrangement method may be used. For example, as shown in FIG. 6, the sound at each time may be arranged on a straight line.
[0075]
In the example of FIG. 6, a point A (xa, ya, za) and a point B (xb, yb, zb) in a virtual three-dimensional space are made to correspond to the recording start time Rs and the recording end time Re of the designated audio information file, respectively. The movement distance per unit time Δt is then obtained as a movement vector Δv (= (s, t, u)), and the virtual sound source position at each time of the designated audio reproduction section is determined from the equation of the straight line through points A and B.
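The straight-line arrangement can be sketched in Python in the same way as the arc arrangement. The sketch assumes that the source moves from A to B at constant speed over the recording period; the function names and the sample coordinates are hypothetical.

```python
def movement_vector(a, b, rs, re, dt):
    """Movement vector (s, t, u) of the virtual source per unit time dt."""
    scale = dt / (re - rs)
    return tuple((bi - ai) * scale for ai, bi in zip(a, b))


def position_at(t, a, b, rs, re):
    """Point on segment AB corresponding to time t, with Rs -> A and Re -> B."""
    f = (t - rs) / (re - rs)
    return tuple(ai + (bi - ai) * f for ai, bi in zip(a, b))


A = (-1.0, 1.0, 0.0)   # corresponds to recording start time Rs = 0 s
B = (1.0, 1.0, 0.0)    # corresponds to recording end time Re = 100 s
dv = movement_vector(A, B, 0, 100, 10)   # displacement in each 10 s unit
mid = position_at(50, A, B, 0, 100)      # halfway through the recording
```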
[0076]
In the case of the example of FIG. 6, in step 203 of the flowchart of FIG. 4, the position of the start offset and the movement vector Δv per unit time Δt may be sent to the reproducing unit 17.
[0077]
In the above example, the angle α and the angle β, or the point A and the point B, are assigned to the recording start time and the recording end time of the designated audio information file, but the user may instead specify other dates and times in advance. Also, in the above description a single sound is reproduced so as to be localized in the listening space, but a plurality of sounds can also be selected, arranged in the listening space, and reproduced in parallel.
[0078]
In this way, the user can listen to multiple sounds at the same time. Although multiple voices cannot be fully understood simultaneously, individual words can be picked up, so the context and flow of the recorded audio can be estimated. It also becomes possible to determine quickly which recording, or which part of it, contains the desired information.
[0079]
[Second embodiment]
In the first embodiment, a head coordinate system centered on the listener's head is used, and the virtual sound source is arranged in the listening space, relative to the head position, in accordance with the reproduction time position. That is, in the first embodiment, the virtual sound source is arranged in a space relative to the head in accordance with the reproduction time position of the sound. However, the present invention is not limited to arranging the virtual sound source position in such a relative space.
[0080]
In the second embodiment described below, a time axis is positioned in an absolute space, and the virtual sound source position corresponding to each time position of the reproduction section can be arranged on that time axis in the absolute space.
[0081]
In the second embodiment, as shown in FIG. 7, a head movement detecting unit 21 for detecting movement of the listener's head is provided in addition to the functional blocks of the first embodiment shown in FIG. 1.
[0082]
The head movement detecting unit 21 is configured using, for example, a three-dimensional magnetic sensor. The position and direction (angle) of the head are detected by the three-dimensional magnetic sensor. Using this, the listener can position the time axis in absolute space.
[0083]
That is, even if the head moves, the position of the time axis in the absolute space does not move. This concept will be described with reference to FIG.
[0084]
Now, assume that the position of the listener at time T is P (T), and the position of the listener at time (T + 1) is P (T + 1). If the head movement detecting unit 21 is not provided, when the head moves to P (T + 1), the sound at the next time (T + 2) is also heard at the position where the head has moved.
[0085]
In this embodiment, the head movement detecting unit 21 detects head position and direction angle information, which is sent to the virtual sound source arranging unit 19 via the control unit 15. Based on the head position P (T + 1), the virtual sound source arranging unit 19 calculates the position in the head coordinate system of the audio information from the time (T + 2) onward and sends this information to the reproducing unit 17. The reproducing unit 17 specifies the head-related transfer function at each time from the transmitted virtual sound source arrangement information and recalculates the audio signal from the time (T + 2) onward.
[0086]
The operation of the virtual sound source arrangement unit 19 when the position of the head changes will be described with reference to the flowchart in FIG.
[0087]
That is, when the virtual sound source arranging unit 19 receives the head position and direction angle information from the control unit 15, it determines in step 301 whether this information is the same as the head position and direction angle information at the previous time. If it is the same, the process ends; if not, the process proceeds to step 302. In step 302, based on the detected head position and direction angle information, virtual sound source arrangement information for specifying the virtual sound source position in the head coordinate system at each time of the sound to be reproduced is calculated. Then, the process proceeds to step 303, where the virtual sound source arrangement information is sent to the reproducing unit 17.
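Steps 301 and 302 can be illustrated by the following Python sketch, which keeps the virtual sources fixed in absolute space and re-expresses them in the head coordinate system whenever the head pose changes. It is a simplified two-dimensional sketch: the pose representation ((x, y), yaw in degrees, measured counter-clockwise) and the function names are assumptions made here, not details given in the specification.

```python
import math


def to_head_coords(source_xy, head_xy, head_yaw_deg):
    """Express an absolute-space point in the head coordinate system."""
    dx = source_xy[0] - head_xy[0]
    dy = source_xy[1] - head_xy[1]
    yaw = math.radians(head_yaw_deg)
    # Rotate the offset by -yaw so it is expressed along the head's own axes.
    hx = dx * math.cos(yaw) + dy * math.sin(yaw)
    hy = -dx * math.sin(yaw) + dy * math.cos(yaw)
    return hx, hy


def update_arrangement(sources_xy, head_pose, previous_pose):
    """Return new head-relative source positions, or None if the pose is unchanged."""
    if head_pose == previous_pose:
        return None  # corresponds to step 301: nothing to recalculate
    head_xy, yaw = head_pose
    return [to_head_coords(s, head_xy, yaw) for s in sources_xy]


sources = [(0.0, 1.0)]                                   # fixed in absolute space
unchanged = update_arrangement(sources, ((0.0, 0.0), 0.0), ((0.0, 0.0), 0.0))
turned = update_arrangement(sources, ((0.0, 0.0), 90.0), ((0.0, 0.0), 0.0))
```

Because the source stays put while the head turns, its head-relative direction changes, and the head-related transfer function selected for it changes accordingly.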
[0088]
Next, FIG. 10 shows a flowchart of the processing operation of the reproducing unit 17 in this case.
[0089]
That is, when the information about the virtual sound source position at each time is sent from the virtual sound source arranging unit 19, the process proceeds to step 401, where the head-related transfer function for the sound at each time in the designated reproduction section is specified based on that information. The process then proceeds to step 402, where the digital data of the audio signal to be reproduced, which is recorded in the audio information storage unit 13, is read and convolved with the specified head-related transfer function. Next, the process proceeds to step 403, where the result of step 402 is output to the headphones 7 via the D / A conversion board.
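Steps 401 and 402 can be illustrated with a short sketch. It assumes, beyond what the text states, that the head-related transfer functions are available in the time domain as pairs of impulse responses (left ear, right ear) stored in a table keyed by azimuth in degrees; `select_hrir`, `convolve`, and `render_binaural` are hypothetical names, and the nearest-azimuth lookup ignores wrap-around at 360 degrees.

```python
def select_hrir(table, azimuth_deg):
    """Step 401 (sketch): pick the stored response pair nearest to the azimuth."""
    nearest = min(table, key=lambda a: abs(a - azimuth_deg))
    return table[nearest]


def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(samples, hrir_left, hrir_right):
    """Step 402 (sketch): convolve the audio with the left and right responses."""
    return convolve(samples, hrir_left), convolve(samples, hrir_right)
```

Step 403, the output through the D / A conversion board, depends on the hardware and is left out of the sketch.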
[0090]
With such a function, according to the second embodiment, it is possible to position the time axis in the absolute space and reproduce the sound on the time axis. Also, when the user wants to hear a specific voice well, it is possible to make it easier to hear by approaching the specific voice.
[0091]
Further, according to the second embodiment, because the time axis can be positioned in absolute space and the sound reproduced along it, the virtual sound source can be arranged not only according to a time position on a time axis in virtual space, but also according to a position on a chart displayed on a screen (a chart that has a time axis and visually represents information along that axis).
[0092]
Hereinafter, some examples of the second embodiment will be described.
[0093]
[First Embodiment]
In the first example of the second embodiment, display information related to the time axis of the audio information is displayed on the screen of the display unit 16, and the virtual sound source is arranged within that display information on the display screen so that the listener can easily grasp which part of the display information the reproduced audio corresponds to.
[0094]
In this case, as described above, when the virtual sound source is arranged in a fixed space such as a display screen, the sound source position relative to the listener changes whenever the position of the listener's head changes. Therefore, correction is performed according to the position of the listener's head.
[0095]
The first embodiment will be described below by taking a case of recording and reproducing conference information as an example. FIG. 7 described above is also a functional block diagram of the recording and reproducing apparatus in the case of the first embodiment.
[0096]
In this first example, in addition to the functional blocks of the first embodiment shown in FIG. 1, there are provided a head movement detecting unit 21 for detecting movement of the listener's head and a chart creation unit 22 for creating a chart in which information related to the audio information is visually arranged on the time axis.
[0097]
In the case of the first embodiment, when a reproduction command and reproduction instruction information including a reproduction start offset and a reproduction end offset are input from the user input unit 14, a time axis bar 31 indicating the playback section is created by the chart creation unit 22 and displayed on the screen of the display unit 16, as shown in FIG. 11. The example of FIG. 11 shows a state in which the reproduction start offset and the reproduction end offset are both zero, so that the period from the start to the end of the conference is specified as the playback section.
[0098]
The chart may also be input directly by the user from the user input unit 14 and displayed. That is, in this embodiment, the user can display the time axis bar 31 on the display unit 16 via the control unit 15 by inputting the time axis bar, characters, and image information from the user input unit 14.
[0099]
In the case of the first embodiment, virtual sound source arrangement information is generated by the virtual sound source arrangement unit 19 so that each reproduced sound is localized at the position corresponding to its time position on the time axis bar 31 displayed on the screen of the display unit 16, and the reproducing unit 17 performs arithmetic processing using the head-related transfer function based on that information in the same manner as described above. The processing in the virtual sound source arrangement unit 19 in this case corresponds to the case described with reference to FIG.
[0100]
However, as described above, when the position of the listener's head changes, the virtual sound source arranging unit 19 recalculates the virtual sound source arrangement information centering on the changed head. Then, the virtual sound source arrangement information after the change is provided to the reproducing unit 17. Further, the position information of the head detected by the head movement detecting unit 21 is also provided to the reproducing unit 17. Then, the reproducing unit 17 obtains a changed head related transfer function based on the information, and uses the head related transfer function to perform a convolution calculation of the audio information at each time position.
[0101]
That is, in the first embodiment, the control unit 15 sends the display information of the time axis bar 31 (the position of the screen, the position of the time axis bar 31 within the screen, and the time information on the time axis bar) to the virtual sound source arranging unit 19. The virtual sound source arranging unit 19 then obtains the time axis bar 31 in the head coordinate system from the information on the time axis bar 31 on the screen and the listener's head position and direction angle information detected by the three-dimensional magnetic sensor, generates information on the virtual sound source position for each time position of the reproduced sound, and supplies it to the reproducing unit 17.
[0102]
In addition, from the information on the reproduction start offset and the reproduction end offset sent when the user issues an audio reproduction command, the virtual sound source arranging unit 19 determines, in the same manner as described above, the virtual sound source arrangement position for each time of the audio in the section to be reproduced.
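The mapping from a time position to a point on the time axis bar 31 can be sketched as a linear interpolation. The function names below are hypothetical, and two points are assumptions rather than statements of the text: the start offset is taken to be measured from the beginning of the recording and the end offset back from its end (which is consistent with both being zero for a full-length section), and the bar is described by the coordinates of its two end points.

```python
def playback_section(total_length, start_offset, end_offset):
    """Return (section_start, section_end) for the given offsets (assumed meaning)."""
    return start_offset, total_length - end_offset


def bar_position(t, section_start, section_end, bar_left, bar_right):
    """Linearly map time t in the playback section to a point on the bar.

    bar_left and bar_right are coordinate tuples of the bar's two ends.
    """
    if section_end <= section_start:
        raise ValueError("playback section must have positive length")
    ratio = (t - section_start) / (section_end - section_start)
    ratio = min(1.0, max(0.0, ratio))  # clamp times outside the section
    return tuple(l + ratio * (r - l) for l, r in zip(bar_left, bar_right))
```

The resulting point is still expressed in screen or absolute-space coordinates; converting it to the head coordinate system is a separate step that uses the detected head position and direction angle.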
[0103]
In FIG. 11, a circle 32 indicates an example of the localization position of the virtual sound source in this embodiment. That is, the virtual sound source is arranged such that the sound is heard from the position on the time axis bar 31, on the screen of the display unit 16, that corresponds to the time position of the sound being reproduced. In FIG. 11, the virtual sound source position is drawn as the circle 32 to make the concept easier to understand. In practice, however, the circle 32 does not appear on the display; the sound is simply heard as if it were emitted from the position of the circle 32.
[0104]
As a result, as shown in the schematic diagram of FIG. 11, the listener can know, simply by listening to the reproduced sound, at which position on the time axis the currently reproduced sound lies.
[0105]
[Second embodiment]
The second embodiment is also an example in the case of a conference information recording / reproducing apparatus. FIG. 12 shows a functional block diagram of the recording / reproducing apparatus in the case of the second embodiment.
[0106]
In the second embodiment, the voice input unit 11 is composed of a plurality of microphones, one assigned to each of a plurality of conference attendees. The speech section of each attendee is detected by the speech section detection unit 23, and the detection result is stored in the audio information storage unit 13. In the second embodiment, the chart creation unit 22 serves as a speaker chart creation unit that displays the speech sections of each conference attendee on a time axis.
[0107]
The speech section of each conference attendee is detected by the speech section detection unit 23 as shown in the flowchart of FIG. 13 and the explanatory diagram of FIG. 14. In this example, if the audio information from a conference participant's microphone stays at or above a predetermined level for a certain period of time, the participant using that microphone is detected as speaking.
[0108]
That is, as shown in FIG. 14, when an audio signal at or above a certain level L1 is output from the microphone, the process proceeds to step 501, where it is monitored whether the audio signal level stays at or above L1 throughout a time interval Δt1, which is set in advance as a suitable unit time length for detecting the start of speech. If it does not, the signal is not regarded as speech and the speech section detection ends. If it does, the process proceeds to step 502, where the current time T1 is obtained and T1 - Δt1 is set as the speech start time.
[0109]
Then, the process proceeds to step 503, where, in order to obtain the end time of the speech, it is monitored whether the audio level stays below a certain level L2, from a certain time T2, for at least a time Δt2 that is set in advance as a suitable unit time length for detecting the end of speech. If it does, the process proceeds to step 504, and T2 - Δt2 is detected as the speech end time.
[0110]
In the above example, the level L1 for detecting the speech start time and the level L2 for detecting the speech end time are set so that L1 = L2, but the levels L1 and L2 need not be equal.
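The detection of steps 501 to 504 can be sketched as a small state machine over level samples taken at a fixed interval. Several points are assumptions of this sketch: the input is a list of per-frame levels, T1 and T2 are both taken to be the moment at which the corresponding run has lasted Δt1 or Δt2 (the text states this explicitly only for T1), and a speech still open at the end of the input is closed at the last frame. The function name and parameter names are hypothetical.

```python
def detect_speech_sections(levels, frame_dt, l1, l2, dt1, dt2):
    """Return a list of (start_time, end_time) speech sections.

    levels   : audio level per frame for one microphone
    frame_dt : duration of one frame
    l1, dt1  : start threshold and the time it must be sustained
    l2, dt2  : end threshold and the time the level must stay below it
    """
    sections = []
    speaking = False
    run = 0.0      # length of the current qualifying run
    start = None
    for i, level in enumerate(levels):
        now = (i + 1) * frame_dt  # time at the end of frame i
        if not speaking:
            run = run + frame_dt if level >= l1 else 0.0
            if run >= dt1:                  # steps 501 and 502
                start = now - dt1           # T1 - dt1
                speaking, run = True, 0.0
        else:
            run = run + frame_dt if level < l2 else 0.0
            if run >= dt2:                  # steps 503 and 504
                sections.append((start, now - dt2))  # T2 - dt2
                speaking, run = False, 0.0
    if speaking:
        # Assumption: a speech that is still open when the input ends is closed there.
        sections.append((start, len(levels) * frame_dt))
    return sections
```

Using separate thresholds for start and end, with L2 lower than L1, would give the detector hysteresis, which is one reason a designer might choose unequal levels.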
[0111]
Information on the speech sections of each conference participant detected as described above is collected in a data structure as shown in FIG. 15. The speech section detection unit 23 stores the information on the speech sections in the audio information storage unit 13 as record information of the speech state.
[0112]
As shown in FIG. 15, the information about the recorded speech section includes speech identification data (speech ID) for identifying each speech section, the speaker, the speech start time, and the speech end time. As the speaker information, a conference attendee name registered in advance is recorded. It is also possible to separately prepare a correspondence table between the speaker name and the speaker identification data, and record the speaker identification data instead of the speaker name.
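A record with these four fields, together with the optional correspondence table, might look like the following sketch. The class name, field names, and the helper `speaker_name` are hypothetical, and times are assumed to be numbers of seconds from the start of the recording.

```python
from dataclasses import dataclass


@dataclass
class SpeechRecord:
    speech_id: int  # speech identification data (speech ID)
    speaker: str    # attendee name, or speaker identification data
    start: float    # speech start time
    end: float      # speech end time


def speaker_name(record, name_table=None):
    """Resolve the name to display for a record.

    If a correspondence table from speaker identification data to names
    is supplied, it is consulted; otherwise the stored value is used as is.
    """
    if name_table is None:
        return record.speaker
    return name_table.get(record.speaker, record.speaker)
```

Storing identification data instead of names keeps the records compact and lets a name be corrected in one place, at the cost of one extra lookup when the chart is drawn.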
[0113]
In the second embodiment, at the time of reproduction, the user can specify a chart creation command from the user input unit 14 and a reproduction time range in the same manner as described above. When a chart creation command and a playback time range are input, the command and information on the playback time range are sent to the chart creation unit 22 via the control unit 15.
[0114]
The chart creation unit 22 reads out the record information of the speech state in the designated reproduction time range from the audio information storage unit 13, creates a speaker chart as shown in FIG. 16, and displays the speaker chart on the display unit 16 through the control unit 15.
[0115]
As shown in FIG. 16, the speaker chart includes a speaker name area 41, which displays the speaker names A, B, and C identifying the speakers, and a speech transition display area 42, which visually displays how the speech moves from one speaker to another. In the speech transition display area 42, the speech sections of each speaker detected by the speech section detection unit 23 are displayed as rectangular bars VB.
[0116]
In FIG. 16, Ts is the start time of the playback section specified by the user, and Te is the end time of the playback section. The example of FIG. 16 is an example where the reproduction start offset and the reproduction end offset are zero, and covers a section from the start to the end of the conference.
[0117]
The speaker chart shows when and for how long each conference participant spoke, through the display position and length of the rectangular speech section bars VB. In addition, by reading the pattern formed by the speech section bars VB of all conference participants in the speech transition display area 42, the user can also read the speech transition structure of the playback section, that is, whose speech was followed by whose.
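How the position and length of each bar VB follow from the speech records and the playback section Ts to Te can be sketched as below. The function name `chart_bars`, the tuple layout, and the use of a horizontal pixel range for the speech transition display area are assumptions made for illustration.

```python
def chart_bars(records, ts, te, area_left, area_width):
    """Compute (speaker, x, width) for each speech section bar.

    records    : iterable of (speaker, start, end) tuples
    ts, te     : start and end time of the playback section
    area_left  : x coordinate of the left edge of the display area
    area_width : width of the display area
    """
    bars = []
    for speaker, start, end in records:
        lo, hi = max(start, ts), min(end, te)  # clip to the playback section
        if lo >= hi:
            continue  # the speech lies entirely outside the section
        x = area_left + (lo - ts) / (te - ts) * area_width
        width = (hi - lo) / (te - ts) * area_width
        bars.append((speaker, x, width))
    return bars
```

The same bar coordinates can then serve as the virtual sound source positions, since the sound of each speech is localized at its own bar.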
[0118]
The position of the display unit 16 with respect to the listener and the display positions of the speech section bars VB in the speaker chart on the display unit 16 are sent from the control unit 15 to the virtual sound source arrangement unit 19. The virtual sound source arranging unit 19 generates virtual sound source arrangement information such that the position of the speech section bar VB corresponding to the sound being reproduced is used as the virtual sound source position, and sends it to the reproducing unit 17.
[0119]
The reproducing unit 17 performs the same arithmetic processing as described above on the reproduced audio signal based on the virtual sound source arrangement information, and outputs the same to the headphone 7 as the audio output unit 18.
[0120]
In the example of FIG. 16, the sound is output so that the virtual sound source exists at the position of the speech section bar indicated by the circle. That is, localization is performed so as to correspond to each of the speakers A, B, and C. Note that the example of FIG. 16 shows that three people's remarks are reproduced simultaneously.
[0121]
In this way, the user can listen to a plurality of voices at the same time. Although it is difficult to understand several voices at once, individual words can still be picked out. This makes it possible to estimate the context and the flow of the recorded audio information. It also makes it possible to judge quickly what the desired information is or where it is located.
[0122]
If the display unit 16 is likely to move, a mechanism for detecting the position of the display unit 16 may be provided. For example, when a recording / reproducing apparatus is realized on a portable personal computer, it can be realized by attaching a three-dimensional magnetic sensor to the portable personal computer. The display unit 16 may be a portable personal computer or an electronic blackboard.
[0123]
[Third embodiment]
FIG. 17 is a functional block diagram of a third example of the recording / reproducing device of the second embodiment. The third embodiment is a conference information recording / reproducing apparatus that records the information of all documents used as reference material in a conference and displays those documents on the display screen at the time of reproduction. By means of the virtual sound source arrangement, the listener can tell which document the reproduced sound relates to simply by listening to it.
[0124]
In FIG. 17, document information used as meeting material is recorded in the document recording unit 51. This document information includes various documents created by word processing software or presentation software.
[0125]
The display document history recording unit 52 records the display history of the documents displayed on the display unit 16 while the conference audio information is being recorded. The control unit 15 writes the history to the display document history recording unit 52 each time a document display / non-display instruction is input from the user input unit 14.
[0126]
FIG. 18 shows the structure of information recorded in the display document history recording unit 52. The document ID is identification information of each document. The same document may be used again as material at different times. The display start time and the display end time are for indicating the time during which the document was displayed.
[0127]
The display document recording unit 53 records the document displayed when the audio information is reproduced, and the display position and size thereof. This is a record indicating the display state of the document at a certain point in time, not a history. Therefore, if the user changes the display state, this record also changes.
[0128]
FIG. 19 shows an example of the structure of information recorded in the display document recording unit 53. In this embodiment, the display position of a document is recorded as the center position of the document, but this does not limit how the position information of the document may be expressed.
[0129]
The document chart creation unit 54 has a function of arranging a document on a time axis on the screen of the display unit 16. More specifically, the document chart creation unit 54 creates information for displaying the documents used in the meeting on the display unit 16 in chronological order.
[0130]
FIG. 20 shows the display status of each document in the meeting on the time axis. That is, FIG. 20 shows that document 1 is displayed from time t0 to time t1, document 2 from time t2 to time t3, and document 3 from time t4 to time t5.
[0131]
FIG. 21 shows an example of a document chart in which, prior to the reproduction of the conference audio, each document is arranged on the display screen of the display unit 16 in correspondence with a time axis bar 31 similar to that of the above-described embodiment.
[0132]
This document chart is created by the document chart creation unit 54. The processing operation of the document chart creation unit 54 is shown in the flowchart of FIG. 22.
[0133]
That is, when a document chart creation command, a chart creation time, and a display document history file name are input from the user input unit 14 via the control unit 15, the document chart creation unit 54 creates this document chart. Here, the chart creation time refers to a time section of the time axis bar 31 displayed on the display unit 16.
[0134]
As another method of creating the document chart, there is a method in which the user himself manually creates the time axis bar 31 and the document on the display screen while referring to the information of the display document history recording unit 52. In this case, the document chart creation unit 54 is not used.
[0135]
In the third embodiment, when audio information is reproduced in the same manner as in the above-described embodiment, the virtual sound source arranging unit 19 searches for a document displayed at the time of recording at the time position of the audio to be reproduced. Then, virtual sound source arrangement information is generated such that the virtual sound source is arranged at the document position. For this reason, display information on the document chart is given to the virtual sound source arrangement unit 19 through the control unit 15.
[0136]
Receiving the information, the reproducing unit 17 performs the above-described arithmetic processing on the audio signal so that the audio signal is output using the document position corresponding to the time position of each audio as a virtual sound source.
[0137]
As a result, the reproduced sound is localized and output to the document displayed along the time axis bar 31 displayed on the display unit 16. The circles in FIG. 21 indicate the sound localization positions corresponding to the respective documents 1, 2, and 3.
[0138]
In addition, when there is no document corresponding to the time position of the reproduced sound, that is, for audio information from a time zone of the conference in which no document was displayed, the virtual sound source position is placed on the time axis bar 31 so that the sound is localized midway between the preceding and following documents, as shown by the broken circle in the figure. In this way the position of the reproduced audio information on the time axis can still be conveyed.
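The lookup of the document that was displayed at the time of the reproduced audio, together with the fallback to a point between the neighbouring documents, can be sketched as follows. The record type mirrors the fields described for the display document history (document ID, display start time, display end time). The names, the use of a single x coordinate per document, and the behaviour when only one neighbour exists are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class DisplayHistoryEntry:
    doc_id: int
    display_start: float
    display_end: float


def document_source_x(t, history, doc_x):
    """Return the x coordinate at which audio recorded at time t is localized.

    history : list of DisplayHistoryEntry
    doc_x   : maps doc_id to the x coordinate of the document on the chart
    """
    before = after = None
    for entry in sorted(history, key=lambda e: e.display_start):
        if entry.display_start <= t <= entry.display_end:
            return doc_x[entry.doc_id]  # a document was displayed at time t
        if entry.display_end < t:
            before = entry              # latest document shown before t
        elif after is None:
            after = entry               # earliest document shown after t
    neighbours = [doc_x[e.doc_id] for e in (before, after) if e is not None]
    if not neighbours:
        return None                     # no documents were recorded at all
    # Midpoint of the neighbours; with one neighbour, that document's position.
    return sum(neighbours) / len(neighbours)
```

Because the history holds one entry per display period, a document that was shown again at a different time simply appears in several entries with the same document ID.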
[0139]
Depending on the relationship between the size of the displayed documents and the scale of the time axis, the localization position of the audio information related to a document may deviate from the position of that document. To allow for such a deviation, it is more preferable that the virtual sound source arranging unit 19 identify the time zone of the audio information related to the document from the record information of the display document history recording unit 52 and, during that time zone, localize the sound at the center position of the document using the information of the display document recording unit 53.
[0140]
In addition, by using the chart creation unit 22 and the document chart creation unit 54 together, the speaker chart and the document chart can both be displayed on the display screen of the display unit 16 in association with the time axis bar 31. FIG. 23 shows an image diagram in this case. In this case, the virtual sound source position corresponding to each time position of the reproduced sound may be on the speaker chart or on the document chart.
[0141]
Further, as another embodiment, a case may be considered in which a document is not necessarily displayed along the time axis during audio reproduction. FIG. 24 is a display example when the display position of the document is not along the time axis.
[0142]
In such a case, the virtual sound source arranging unit 19 refers to the display document history recording unit 52 to identify the time zone in which each document was displayed and its document ID. When playback reaches that time zone, it determines from the display document recording unit 53 whether the document is currently displayed. If it is, the sound is localized at the document, as indicated by the solid circle 61 in FIG. 24. If it is not, the sound can be localized at a predetermined place where no document is displayed, for example as indicated by the broken circle 62 in FIG. 24.
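The decision described above can be sketched in a few lines. The function name `localize`, the tuple layout of the history, and the representation of the current display state as a mapping from document ID to center position are hypothetical choices made for illustration.

```python
def localize(t, history, displayed, fallback):
    """Choose the localization position for audio recorded at time t.

    history   : list of (doc_id, display_start, display_end) from recording time
    displayed : maps doc_id to the (x, y) center of each currently displayed document
    fallback  : predetermined position used when no matching document is displayed
    """
    for doc_id, start, end in history:
        if start <= t <= end and doc_id in displayed:
            return displayed[doc_id]  # localize at the document (circle 61)
    return fallback                   # localize at the predetermined place (circle 62)
```

Because `displayed` reflects the current state rather than a history, moving or closing a document during playback changes where subsequent audio is localized.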
[0143]
As described above, in the first and second embodiments of the recording / reproducing apparatus according to the present invention, the listener can easily grasp, simply by listening to the reproduced audio, where its time position lies within the entire recorded audio information.
[0144]
Further, when remembering what topics came up and when, the listener can remember each topic by linking it to a location in the head coordinate system or in the coordinate system of the absolute space.
[0145]
In addition, when various pieces of audio information are accessed, the flow of topics can be grasped easily by remembering spatial positions together with topics, without having to keep track of the time relationships among them. This fits the memory strategies that people use every day.
[0146]
Further, even if no index for locating a topic is attached to the audio information, there is an effect that time can be associated with the position from which the voice was heard when that topic was reproduced.
[0147]
Further, the listener can listen to the audio information by superimposing the visual information on the display chart. Thereby, the understanding of the voice information is promoted.
[0148]
In the above-described second embodiment, when a head-mounted display is used as the display unit 16, the head movement detecting unit 21 is unnecessary.
[0149]
[Effect of the Invention]
As described above, according to the first aspect of the present invention, the listener can easily grasp, simply by listening to the reproduced sound, where the time position of that sound lies within the entire recorded audio information. Further, since the sound image localization position of the audio information being reproduced changes with the passage of time, the temporal flow of the audio information can be easily grasped.
[0150]
Further, according to the second aspect of the present invention, even if the entire audio information is not reproduced, the listener can know where the reproduced audio lies within the whole of the specified playback section. Further, the temporal flow of the audio information in the playback section can be easily grasped.
[0151]
According to the third aspect of the present invention, since the reproduced audio output can be localized in the display information related to the audio displayed on the display screen, the audio output of each speaker in a conference, for example, can be heard as if it were emitted from the displayed portion of that speaker's speech section.
[0152]
According to the fourth aspect of the present invention, when a document related to the sound of the designated reproduction section, for example a document presented in a conference, is displayed on the display screen, the sound output is localized at the related document. Therefore, the listener can easily understand which document the sound being reproduced relates to.
[0153]
According to the fifth aspect of the present invention, the sound processing is performed using the head-related transfer function, and the sound image can be arranged at a predetermined position in the listening space as described above with a good sense of localization.
[0154]
According to the invention of claim 6, even if the position of the listener's head changes, the movement of the head is detected and the virtual sound source is arranged according to that movement, so that the sound image can be positioned in an absolute space serving as the listening space.
[Brief description of the drawings]
FIG. 1 is a functional block diagram of a recording and reproducing apparatus according to a first embodiment of the present invention.
FIG. 2 is a diagram for explaining an overview of an entire system to which a first embodiment of a recording / reproducing apparatus according to the present invention is applied.
FIG. 3 is a flowchart illustrating an example of an operation of a reproducing unit of the recording and reproducing apparatus according to the first embodiment of the present invention.
FIG. 4 is a flowchart illustrating an example of an operation of a virtual sound source arranging unit of the first embodiment of the recording / reproducing apparatus according to the present invention.
FIG. 5 is a conceptual diagram showing an example of a method of arranging virtual sound sources in the first embodiment of the recording / reproducing apparatus according to the present invention.
FIG. 6 is a conceptual diagram showing an example of a method for arranging virtual sound sources in the first embodiment of the recording / reproducing apparatus according to the present invention.
FIG. 7 is a functional block diagram of a first example of the second embodiment of the recording / reproducing apparatus according to the present invention.
FIG. 8 is a diagram for explaining the relationship between the movement of the head and the sound localization position.
FIG. 9 is a flowchart for explaining an operation of a virtual sound source arranging unit when a head movement detecting unit is provided in the first example of the second embodiment of the recording and reproducing device according to the present invention.
FIG. 10 is a flowchart for explaining the operation of the reproducing unit when the head movement detecting unit is provided in the first example of the second embodiment of the recording and reproducing device according to the present invention.
FIG. 11 is a diagram showing a state in which a reproduced audio output is localized on a time axis on a display unit in the first embodiment.
FIG. 12 is a functional block diagram of a second example of the recording / reproducing apparatus according to the second embodiment of the present invention.
FIG. 13 is a flowchart illustrating an operation of a speech section detection unit according to the second embodiment.
FIG. 14 is a diagram used to explain the operation of the speech section detection unit in the second embodiment.
FIG. 15 is a diagram used to explain the operation of a speech section detection unit in the second embodiment.
FIG. 16 is a diagram showing a state in which a reproduced voice output is localized on a speaker chart on a display unit in the second embodiment.
FIG. 17 is a functional block diagram of a third example of the second embodiment of the recording / reproducing apparatus according to the present invention.
FIG. 18 is a diagram illustrating an example of a structure of recording information of a display document history recording unit according to a third embodiment.
FIG. 19 is a diagram illustrating an example of a structure of recording information of a display document recording unit according to the third embodiment.
FIG. 20 is a diagram showing, on a time axis, a state in which a document is displayed at the time of recording in the third embodiment.
FIG. 21 is a diagram showing a state in which a reproduced audio output is localized on a document chart on a display unit in the third embodiment.
FIG. 22 is a flowchart for explaining the operation of a document chart creation unit in the third embodiment.
FIG. 23 is a diagram showing a display screen when the reproduced audio output is localized on the speaker chart or the document chart on the display unit in the third embodiment.
FIG. 24 is a diagram showing a display screen in the case where the reproduced audio output is localized on the document chart on the display unit in the third embodiment.
[Explanation of symbols]
11 Voice input section
12 Recorded audio signal processing unit
13 Voice information storage
14 User input section
15 Control part
16 Display
17 Reproduction unit
18 Audio output unit
19 Virtual sound source placement unit
21 Head motion detector
22 Chart creator
23 Image input section
24 Image signal recording processing unit
25 Image information storage
26 Remark section detection section
31 Time axis bar
51 Document Recorder
52 display document history recording unit
53 Display Document Recorder
54 Document Chart Creation Unit

Claims

Recording audio signal processing means for processing input audio information for recording;
Audio information storage means for storing the audio information processed by the recording audio signal processing means in a recording medium together with additional information on the entire recording time from the beginning to the end,
Playback designation means for designating audio information to be played from among the audio information recorded on the recording medium,
Virtual sound source arrangement means for generating, from the additional information of the audio information to be reproduced specified by the reproduction specifying means, virtual sound source arrangement information for determining a virtual sound source position in a listening space in accordance with a time position in the total recording time of the audio information to be reproduced;
Playback means for receiving the virtual sound source arrangement information from the virtual sound source arrangement means, reading out the audio information designated by the reproduction designating means from the recording medium, and performing playback processing on the read-out audio information so that the audio output at each time position in the total recording time is localized and heard at a position in the listening space determined based on the virtual sound source arrangement information,
Audio output means for outputting audio based on the audio information reproduced by the reproduction means,
Control means for controlling the operation of each means,
A recording / reproducing apparatus comprising:

The recording / reproducing apparatus according to claim 1, wherein
the reproduction designating means designates the audio information to be reproduced from among the audio information recorded on the recording medium and also designates a desired reproduction section within that audio information,
the virtual sound source arrangement means also acquires information on the reproduction section designated by the reproduction designating means, generates virtual sound source arrangement information that determines a position in the listening space according to each time position from the beginning to the end of the designated reproduction section, and sends the virtual sound source arrangement information to the reproducing means, and
the reproducing means performs reproduction processing on the audio information in the designated reproduction section so that the audio output at each time position is heard localized at the position in the listening space determined from the virtual sound source arrangement information.
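
As a non-limiting illustration of arrangement information generated for a designated reproduction section, the following Python sketch builds a table of (time position, azimuth) pairs spanning the section from its beginning to its end. The name `arrangement_for_section`, the fixed sampling step, and the linear -90 to +90 degree arc are assumptions made for illustration only and are not taken from the specification.

```python
def arrangement_for_section(start, end, step, left_deg=-90.0, right_deg=90.0):
    """Return a list of (time, azimuth_deg) pairs for the section
    [start, end], sampled every `step` seconds (hypothetical format)."""
    if end <= start or step <= 0:
        raise ValueError("need end > start and step > 0")
    table = []
    n = int((end - start) // step)  # number of whole steps in the section
    for i in range(n + 1):
        t = start + i * step
        frac = (t - start) / (end - start)  # 0.0 at section start, 1.0 at end
        table.append((t, left_deg + frac * (right_deg - left_deg)))
    return table


# A section from 2:00 to 3:00 of the recording, sampled every 30 seconds.
print(arrangement_for_section(120, 180, 30))
```

In this sketch the whole arc is devoted to the designated section rather than to the total recording time, so a short section is spread over the same range of positions as a long one.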

The recording / reproducing apparatus according to claim 1, further comprising:
display information generating means for generating display information related to the time axis of the audio information; and
display means for displaying the display information generated by the display information generating means on a display screen,
wherein the virtual sound source arrangement means places the position in the listening space corresponding to each time position of the audio information to be reproduced on the display information displayed on the display screen.

A recording / reproducing apparatus comprising:
recording audio signal processing means for processing input audio information for recording;
audio information storage means for recording and storing, on a recording medium, the audio information processed by the recording audio signal processing means;
related information storage means for recording and storing related information associated with the input audio information, together with its correspondence to the recording position of the input audio information on the recording medium;
display means for displaying a plurality of pieces of the related information read from the related information storage means on a display screen; and
reproduction designating means for designating audio information to be reproduced from among the audio information recorded on the recording medium,
wherein the related information corresponding to each time position of the audio information to be reproduced designated by the reproduction designating means is detected based on information from the related information storage means, and the position in the listening space corresponding to each time position of the designated audio information is placed on the piece of related information, among the plurality of pieces displayed on the display screen, that corresponds to that time position.
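
As a non-limiting illustration of placing the listening position on displayed related information, the following Python sketch looks up the related item associated with a time position and derives an azimuth from where that item is drawn on the screen. The `RelatedItem` record, its `screen_x` field, the rule that an item applies from its start time until the next item begins, and the 45-degree half width are all assumptions made for illustration only.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class RelatedItem:
    start_time: float  # seconds; when the item became associated with the audio
    label: str         # e.g. a displayed document or a user-input note
    screen_x: float    # horizontal centre on the screen, 0.0 (left) to 1.0 (right)


def item_at(items, t):
    """Return the item with the latest start_time not after t.
    `items` must be sorted by start_time."""
    starts = [it.start_time for it in items]
    i = bisect_right(starts, t) - 1
    return items[max(i, 0)]


def azimuth_for_time(items, t, half_width_deg=45.0):
    """Place the virtual source in the direction of the matching item."""
    x = item_at(items, t).screen_x
    return (2.0 * x - 1.0) * half_width_deg


items = [
    RelatedItem(0, "agenda", 0.25),
    RelatedItem(300, "chart", 0.5),
    RelatedItem(600, "memo", 1.0),
]
print(item_at(items, 450).label, azimuth_for_time(items, 450))  # chart 0.0
```

With such a lookup, audio recorded while a given item was associated with the input is heard from the direction in which that item appears on the display screen.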

The recording / reproducing apparatus according to any one of claims 1 to 4,
wherein the reproducing means localizes the sound image by computation using a head-related transfer function.
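
As a non-limiting illustration of sound image localization by computation using a head-related transfer function, the following Python sketch convolves a monaural signal with a pair of head-related impulse responses (the time-domain form of the transfer function), one per ear. The function names and the impulse responses below are toy placeholders chosen for illustration, not measured data; a practical device would select a measured pair for the azimuth given by the virtual sound source arrangement information.

```python
def convolve(signal, kernel):
    """Direct-form discrete convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binaural_render(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one virtual source position."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the listener's right: the right ear
# receives the sound directly, the left ear later and attenuated.
hrir_right = [1.0]
hrir_left = [0.0, 0.0, 0.5]

left, right = binaural_render([1.0, 2.0], hrir_left, hrir_right)
print(left, right)  # [0.0, 0.0, 0.5, 1.0] [1.0, 2.0]
```

The interaural delay and level difference carried by the two impulse responses are what cause the output, heard over headphones, to be perceived as coming from the virtual source position.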

The recording / reproducing apparatus according to any one of claims 1 to 4, further comprising
head movement detecting means for detecting head movement of a listener of the audio output,
wherein the virtual sound source arrangement means generates the virtual sound source arrangement information taking into account the head movement detected by the head movement detecting means.
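
As a non-limiting illustration of taking head movement into account, the following Python sketch subtracts a detected head yaw angle from the azimuth of the virtual sound source so that the source stays fixed in the room while the listener turns. The function name `compensated_azimuth`, the restriction to yaw only, and the convention that positive angles point to the listener's right are assumptions made for illustration only.

```python
def compensated_azimuth(source_deg, head_yaw_deg):
    """Azimuth of a room-fixed virtual source relative to the listener's
    current facing direction, wrapped to the range [-180, 180)."""
    rel = source_deg - head_yaw_deg
    return (rel + 180.0) % 360.0 - 180.0


# The source sits 30 degrees to the right; the listener turns 30 degrees
# to the right, so the source is now straight ahead.
print(compensated_azimuth(30.0, 30.0))  # 0.0
```

The compensated angle, rather than the room-fixed angle, would then be used to select the localization processing for each ear.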

A recording / reproducing method comprising:
an audio information storing step of storing audio information, processed by recording audio signal processing means that processes input audio information for recording, on a recording medium together with additional information on its total recording time from beginning to end;
a virtual sound source arrangement step of, when audio information to be reproduced is designated from among the audio information recorded on the recording medium, generating, from the additional information of the designated audio information, virtual sound source arrangement information that determines a virtual sound source position in a listening space according to the time position within the total recording time of that audio information;
a reproduction processing step of receiving the virtual sound source arrangement information generated in the virtual sound source arrangement step, reading the designated audio information from the recording medium, and performing reproduction processing on the read audio information so that the audio output at each time position within the total recording time is heard localized at the position in the listening space determined from the virtual sound source arrangement information; and
an audio output step of outputting sound based on the audio information reproduced in the reproduction processing step.

A recording / reproducing method comprising:
an audio information storing step of recording and storing, on a recording medium, audio information processed by recording audio signal processing means that processes input audio information for recording;
a related information storing step of recording and storing, in a related information storage unit, related information associated with the input audio information, together with its correspondence to the recording position of the input audio information on the recording medium;
a display step of reading, from the related information storage unit, a plurality of pieces of the related information recorded in the related information storing step and displaying them on a display screen; and
a step of detecting, based on information read from the related information storage unit, the related information corresponding to each time position of the audio information to be reproduced that is designated by reproduction designating means from among the audio information recorded on the recording medium, and placing the position in the listening space corresponding to each time position of the designated audio information on the piece of related information, among the plurality of pieces displayed on the display screen, that corresponds to that time position.