JP2003018561A

JP2003018561A - Pantoscopic video image recording/reproducing system, conference recording/reproducing system, pantoscopic video image transmitting apparatus, conference video image transmitting apparatus, pantoscopic video image reproducing apparatus, conference video image reproducing apparatus, pantoscopic video image recording/reproducing method, conference video image reproducing method, pantoscopic video image transmitting method, conference video image transmitting method, pantoscopic video image reproducing method, conference video image reproducing method and program

Info

Publication number: JP2003018561A
Application number: JP2001203958A
Authority: JP
Inventors: Norihiko Murata; 憲彦村田; Shin Aoki; 青木　　伸
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2001-07-04
Filing date: 2001-07-04
Publication date: 2003-01-17
Anticipated expiration: 2021-07-04
Also published as: JP4439763B2

Abstract

PROBLEM TO BE SOLVED: To efficiently reproduce a conference while maintaining the sense of presence. SOLUTION: A conference recording/reproducing system is provided with a hyperboloid mirror 211 for receiving a wide-angle (pantoscopic) video image centered on, or having as its axis, the vertical direction; a plurality of microphones 221 for receiving voice; a voice direction detector (not shown) for detecting the direction of the voice on the basis of the voice input to the microphones 221; a recording section (not shown) for recording the pantoscopic image, the voice, and the voice source direction; a video image deforming section for deforming, out of the recorded pantoscopic images, the image in a prescribed direction corresponding to the voice direction into a rectangular output image; and a video image outputting section (not shown) for synchronously outputting the deformed image and the voice corresponding to that image.

Description

Detailed Description of the Invention

[0001] [Technical Field of the Invention] The present invention relates to a wide-angle image recording/reproducing system that handles wide-angle images, a conference recording/reproducing system, a wide-angle image transmitting apparatus, a conference image transmitting apparatus, a wide-angle image reproducing apparatus, a conference image reproducing apparatus, a wide-angle image recording/reproducing method, a conference recording/reproducing method, a wide-angle image transmitting method, a conference image transmitting method, a wide-angle image reproducing method, a conference image reproducing method, and a program.

[0002] [Prior Art] In recent years, with the development of telecommunications technology, many companies and organizations have come to use video conference systems that capture a conference and transmit the acquired images to a remote site. To further improve the convenience of such systems, many apparatuses for capturing a conference as video, and many systems for transmitting a partial video in which only the speaker is cut out, have been proposed.

[0003] One example of such prior art is Japanese Patent Application Laid-Open No. 5-122689, "Video Conference System". This publication discloses a video conference system that detects the voice input from microphones to identify the speaker, and in which a camera control unit automatically controls the camera based on the identification result so as to capture the speaker.

[0004] Another example is Japanese Patent Application Laid-Open No. 11-331827, "Television Camera Apparatus". This publication discloses a television camera apparatus that uses a fisheye or ultra-wide-angle lens and a variable-directivity microphone. Specifically, it discloses an invention that determines the direction of the sound source position, tracks that direction, and cuts out the image in that direction to generate a video signal.

[0005] [Problems to Be Solved by the Invention] However, the prior art has the following problems. In the technique disclosed in Japanese Patent Application Laid-Open No. 5-122689, turning the camera toward the speaker takes a certain amount of time, so the speaker appears on screen only a short while after starting to speak. In addition, the picture sweeps across the scene while the camera is moving, which makes the conference image hard to watch. In other words, the sense of presence is impaired.

[0006] In the technique disclosed in Japanese Patent Application Laid-Open No. 11-331827, when the television camera apparatus using a fisheye or ultra-wide-angle lens is placed on a desk or the like, unimportant objects such as the ceiling generally occupy most of the field of view, while important subjects such as human faces lie at the periphery of the field of view and are affected by peripheral light falloff and aberration. In other words, the conference cannot be viewed efficiently.

[0007] Furthermore, when such a lens is used, the calculation for distortion correction depends heavily on the image position, which increases the computational load. The design of such a lens or optical system is also very difficult and costly.

[0008] In recent years, in addition to the real-time capability that characterizes conventional video conference systems, there has also been a demand to review the content of a conference afterwards.

[0009] The present invention has been made in view of the above, and its object is to make it possible to reproduce a conference efficiently while maintaining the sense of presence.

[0010] A further object is to provide a small and inexpensive apparatus.

[0011] [Means for Solving the Problems] To achieve the above objects, the wide-angle image recording/reproducing system according to claim 1 comprises: wide-angle image input means for inputting a wide-angle image centered on, or having as its axis, the vertical direction; a plurality of audio input means for inputting audio; sound source direction detecting means for detecting a sound source direction based on the audio input by the plurality of audio input means; recording means for recording the wide-angle image input by the wide-angle image input means, the audio input by the audio input means, and the sound source direction detected by the sound source direction detecting means; image transforming means for transforming, out of the wide-angle image recorded by the recording means, the image of a predetermined region in the direction corresponding to the sound source direction into a rectangular output image; and image/audio output means for outputting, in synchronization, the image transformed by the image transforming means and the audio recorded in the recording means that corresponds to that image.

[0012] That is, the invention according to claim 1 records a wide-angle image and, at reproduction time, reproduces a scene centered on the sound source direction while correcting the distortion of the image.
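
As a non-limiting illustration of the transformation summarized above, the following Python sketch unwraps an angular sector of a mirror-type omnidirectional frame into a rectangular image. The function name `unwarp_sector`, its parameters, the nearest-neighbour sampling, and the linear mapping between output row and image radius are all assumptions made for illustration; an actual mirror would call for a conversion formula derived from its shape.

```python
import numpy as np

def unwarp_sector(img, center, r_inner, r_outer, azimuth_deg, fov_deg, out_w, out_h):
    """Unwrap the sector centred on azimuth_deg into an out_h x out_w rectangle.

    Hypothetical helper: columns map to azimuth, rows map linearly to radius
    (top row = r_outer, bottom row = r_inner). Nearest-neighbour sampling.
    """
    cx, cy = center
    # One azimuth per output column, spanning the requested field of view.
    az = np.deg2rad(azimuth_deg + np.linspace(-fov_deg / 2, fov_deg / 2, out_w))
    # One radius per output row.
    r = np.linspace(r_outer, r_inner, out_h)
    rr, aa = np.meshgrid(r, az, indexing="ij")
    # Polar -> Cartesian source coordinates, clipped to the frame.
    x = np.clip(np.rint(cx + rr * np.cos(aa)).astype(int), 0, img.shape[1] - 1)
    y = np.clip(np.rint(cy + rr * np.sin(aa)).astype(int), 0, img.shape[0] - 1)
    return img[y, x]

# Synthetic frame whose pixel value encodes its own position (row * 200 + col).
frame = np.arange(200 * 200).reshape(200, 200)
view = unwarp_sector(frame, (100, 100), 20, 90, azimuth_deg=0, fov_deg=60,
                     out_w=65, out_h=32)
```

In this sketch the sound source direction recorded with each frame would be passed as `azimuth_deg`, so the output rectangle stays centered on the current speaker.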

[0013] The wide-angle image recording/reproducing system according to claim 2 comprises: wide-angle image input means for inputting a wide-angle image centered on, or having as its axis, the vertical direction; image transforming means for transforming the wide-angle image input by the wide-angle image input means into a rectangular output image using a predetermined conversion formula; a plurality of audio input means for inputting audio; sound source direction detecting means for detecting a sound source direction based on the audio input by the plurality of audio input means; recording means for recording the image transformed by the image transforming means, the audio input by the audio input means, and the sound source direction detected by the sound source direction detecting means; image extracting means for extracting, out of the rectangular image recorded by the recording means, the image of a predetermined region in the direction corresponding to the sound source direction; and image/audio output means for outputting, in synchronization, the image extracted by the image extracting means and the audio recorded in the recording means that corresponds to that image.

[0014] That is, the invention according to claim 2 records an image unwrapped into a panoramic shape and makes it possible to reproduce, at any time, a scene centered on the sound source direction.
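
As a non-limiting illustration of extracting a region from an already unwrapped panorama, the following Python sketch cuts out a window centered on a given azimuth and wraps around the seam between 360 and 0 degrees. The function name `crop_panorama` and the assumption that the panorama columns cover 360 degrees uniformly are illustrative and are not taken from this specification.

```python
import numpy as np

def crop_panorama(pano, azimuth_deg, fov_deg):
    """Return the columns of `pano` centred on azimuth_deg, fov_deg wide.

    Hypothetical helper: assumes column 0 is azimuth 0 and the full width
    spans 360 degrees, so the window may wrap around the image seam.
    """
    w = pano.shape[1]
    center = int(round((azimuth_deg % 360.0) / 360.0 * w))
    half = int(round(fov_deg / 360.0 * w / 2))
    # Modulo indexing lets the window cross the 360/0 degree seam.
    cols = np.arange(center - half, center + half) % w
    return pano[:, cols]

# Panorama with one column per degree; each pixel stores its column index.
pano = np.tile(np.arange(360), (4, 1))
window = crop_panorama(pano, azimuth_deg=0, fov_deg=20)
```

Because the panorama is stored once, only the cropping step needs to run at reproduction time.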

[0015] The wide-angle image recording/reproducing system according to claim 3 is the wide-angle image recording/reproducing system according to claim 1 or 2, wherein the wide-angle image input means is composed of a mirror body having a paraboloid of a predetermined shape, a mirror body having a hyperboloid of a predetermined shape, or a mirror body having a predetermined conical shape, together with an image pickup device.

[0016] That is, the invention according to claim 3 captures a wide-angle image with a simple configuration.

[0017] The wide-angle image recording/reproducing system according to claim 4 is the wide-angle image recording/reproducing system according to claim 1, 2, or 3, wherein the sound source direction detecting means detects the sound source direction based on the time difference between the audio signals input by the plurality of audio input means.

[0018] That is, the invention according to claim 4 detects the sound source direction with a simple configuration.
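
As a non-limiting illustration of detection based on a time difference, the following Python sketch estimates the delay between two microphone signals from the peak of their cross-correlation and converts a delay into a bearing under a far-field assumption. The function names, the cross-correlation approach, the far-field model, and the speed of sound of 343 m/s are assumptions made for illustration and are not stated in this specification.

```python
import numpy as np

def delay_samples(sig_a, sig_b):
    """Lag, in samples, by which sig_b trails sig_a (cross-correlation peak)."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    # In "full" mode, index len(sig_a) - 1 corresponds to zero lag.
    return int(np.argmax(corr)) - (len(sig_a) - 1)

def bearing_from_delay(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Far-field bearing in degrees, measured from the broadside of the pair."""
    s = np.clip(speed_of_sound * delay_s / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

# Two impulses three samples apart stand in for the microphone signals.
a = np.zeros(64)
a[10] = 1.0
b = np.zeros(64)
b[13] = 1.0
lag = delay_samples(a, b)
```

Dividing `lag` by the sampling rate gives the delay in seconds expected by `bearing_from_delay`. A single pair leaves a front/back ambiguity, which is one reason a plurality of microphones is useful.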

[0019] The wide-angle image recording/reproducing system according to claim 5 is the wide-angle image recording/reproducing system according to any one of claims 1 to 4, further comprising direction correcting means for correcting the direction corresponding to the sound source direction.

[0020] That is, the invention according to claim 5 reproduces the desired video even when the sound source direction has not been detected correctly because of noise or the like.

[0021] The wide-angle image recording/reproducing system according to claim 6 is the wide-angle image recording/reproducing system according to any one of claims 1 to 5, further comprising region fixing means for fixing the predetermined region in the direction corresponding to the sound source direction.

[0022] That is, the invention according to claim 6 prevents image shake even when the direction of the sound source shifts slightly.
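
As a non-limiting illustration of how a region might be held fixed against small shifts in the detected direction, the following Python sketch keeps the previous direction until the newly detected direction departs from it by more than a threshold. The function name `hold_direction`, the dead-band strategy, and the 15 degree default are assumptions made for illustration; this section does not specify how the region fixing means is implemented.

```python
def hold_direction(detected_deg, threshold_deg=15.0):
    """Return the held direction for each detected direction (degrees).

    Hypothetical dead-band filter: the held value changes only when the
    detected direction differs from it by more than threshold_deg.
    """
    held = None
    out = []
    for d in detected_deg:
        if held is None:
            held = d % 360.0
        else:
            # Signed smallest angular difference, in the range [-180, 180).
            diff = (d - held + 180.0) % 360.0 - 180.0
            if abs(diff) > threshold_deg:
                held = d % 360.0
        out.append(held)
    return out

steady = hold_direction([90, 93, 88, 200, 203])
```

The wrap-around difference ensures that directions such as 358 and 2 degrees are treated as 4 degrees apart rather than 356.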

[0023] The conference recording/reproducing system according to claim 7 is a conference recording/reproducing system in which the wide-angle image recording/reproducing system according to any one of claims 1 to 6 is applied to the recording and reproduction of a conference, and comprises: speaker position determining means for determining the position of a speaker based on the color distribution of the image or on a moving part in the image; and predetermined region determining means for determining the predetermined region according to the determination result of the speaker position determining means.

[0024] That is, the invention according to claim 7 accurately extracts the speaker portion.
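
As a non-limiting illustration of locating a speaker from a moving part in the image, the following Python sketch differences two consecutive grayscale frames and returns the column that contains the most changed pixels. The function name `motion_column`, the frame-differencing approach, and the threshold value are assumptions made for illustration; the color-distribution alternative is not shown.

```python
import numpy as np

def motion_column(prev, curr, thresh=20):
    """Column index with the most changed pixels, or None if nothing moved.

    Hypothetical helper operating on two grayscale frames of equal shape.
    """
    # Cast to int first so that uint8 subtraction cannot wrap around.
    changed = np.abs(curr.astype(int) - prev.astype(int)) > thresh
    counts = changed.sum(axis=0)
    if counts.max() == 0:
        return None
    return int(np.argmax(counts))

prev = np.zeros((10, 30), dtype=np.uint8)
curr = prev.copy()
curr[2:8, 12] = 200  # simulated movement in column 12
col = motion_column(prev, curr)
```

On an unwrapped panorama, the returned column could be converted to an azimuth and used to refine the region chosen from the sound source direction.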

[0025] The wide-angle image transmitting apparatus according to claim 8 comprises: wide-angle image input means for inputting a wide-angle image centered on, or having as its axis, the vertical direction; a plurality of audio input means for inputting audio; sound source direction detecting means for detecting a sound source direction based on the audio input by the plurality of audio input means; and data transmitting means for transmitting, to predetermined data storage means, data on the wide-angle image input by the wide-angle image input means, data on the audio input by the audio input means, and data on the sound source direction detected by the sound source direction detecting means.

[0026] That is, the invention according to claim 8 captures a wide-angle image and makes it possible to reproduce a scene centered on the sound source.

[0027] The wide-angle image transmitting apparatus according to claim 9 is the wide-angle image transmitting apparatus according to claim 8, further comprising image transforming means for transforming the wide-angle image input by the wide-angle image input means into a rectangular output image using a predetermined conversion formula, wherein the data transmitting means transmits data on the image transformed by the image transforming means in place of the data on the wide-angle image input by the wide-angle image input means.

[0028] That is, the invention according to claim 9 makes it possible to reproduce an image unwrapped into a panoramic shape.

[0029] The wide-angle image transmitting apparatus according to claim 10 is the wide-angle image transmitting apparatus according to claim 9, further comprising image extracting means for extracting, out of the image transformed by the image transforming means, the image of a predetermined region in the sound source direction detected by the sound source direction detecting means, wherein the data transmitting means transmits data on the image extracted by the image extracting means either in place of, or together with, the data on the wide-angle image input by the wide-angle image input means and the data on the sound source direction detected by the sound source direction detecting means.

[0030] That is, the invention according to claim 10 makes it possible to reproduce a scene centered on the sound source direction.

[0031] The wide-angle image transmitting apparatus according to claim 11 is the wide-angle image transmitting apparatus according to claim 8, 9, or 10, wherein the wide-angle image input means is composed of a mirror body having a paraboloid of a predetermined shape, a mirror body having a hyperboloid of a predetermined shape, or a mirror body having a predetermined conical shape, together with an image pickup device.

[0032] That is, the invention according to claim 11 captures a wide-angle image with a simple configuration.

[0033] The wide-angle image transmitting apparatus according to claim 12 is the wide-angle image transmitting apparatus according to any one of claims 8 to 11, wherein the sound source direction detecting means detects the sound source direction based on the time difference between the audio signals input by the plurality of audio input means.

[0034] That is, the invention according to claim 12 detects the sound source direction with a simple configuration.

[0035] The wide-angle image transmitting apparatus according to claim 13 is the wide-angle image transmitting apparatus according to claim 12, wherein the sound source direction detecting means detects the sound source direction based on the time difference between the audio input by a given audio input means and the audio input by the audio input means located farthest from that audio input means.

[0036] That is, the invention according to claim 13 detects the sound source direction with high accuracy.
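
As a non-limiting illustration of pairing a microphone with the one farthest from it, the following Python sketch selects the partner by Euclidean distance. The function name `farthest_partner` and the square four-microphone layout are assumptions made for illustration. A wider spacing produces a larger arrival-time difference for a given bearing, which is generally easier to resolve at a fixed sampling rate.

```python
import numpy as np

def farthest_partner(mic_xy, i):
    """Index of the microphone farthest from microphone i.

    Hypothetical helper: mic_xy is an (N, 2) array of positions in metres.
    """
    dist = np.linalg.norm(mic_xy - mic_xy[i], axis=1)
    return int(np.argmax(dist))

# Four microphones on a circle of radius 0.1 m (illustrative layout).
mics = np.array([[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]])
partner = farthest_partner(mics, 0)
```

In this layout each microphone is paired with the one diametrically opposite, so the pair spans the full diameter of the array.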

[0037] The wide-angle image transmitting apparatus according to claim 14 is the wide-angle image transmitting apparatus according to any one of claims 8 to 11, wherein the plurality of audio input means are directional microphones, and the sound source direction detecting means detects the sound source direction based on the intensity of the audio input by the directional microphones.

[0038] That is, the invention according to claim 14 detects the sound source direction with a simple configuration.
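
As a non-limiting illustration of detection based on intensity, the following Python sketch takes one block of samples per directional microphone and reports the pointing azimuth of the microphone with the highest RMS level. The function name `direction_from_intensity`, the use of RMS as the intensity measure, and the three-microphone layout are assumptions made for illustration.

```python
import numpy as np

def direction_from_intensity(frames, mic_azimuths_deg):
    """Pointing azimuth of the loudest directional microphone.

    Hypothetical helper: frames is an (N_mics, N_samples) array and
    mic_azimuths_deg lists the direction each microphone faces.
    """
    frames = np.asarray(frames, dtype=float)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return float(mic_azimuths_deg[int(np.argmax(rms))])

frames = np.array([[0.1, -0.1], [0.5, -0.5], [0.2, 0.2]])
azimuth = direction_from_intensity(frames, [0, 120, 240])
```

This sketch resolves the direction only to the nearest microphone axis; interpolating between neighbouring levels would be one possible refinement.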

[0039] The wide-angle image transmitting apparatus according to claim 15 is the wide-angle image transmitting apparatus according to any one of claims 8 to 14, wherein the plurality of audio input means are arranged so that their centroid substantially coincides with the optical center of the wide-angle image input means.

[0040] That is, the invention according to claim 15 makes the coordinate system of the audio input means coincide with that of the wide-angle image input means, which simplifies the various calculations.

[0041] The wide-angle image transmitting apparatus according to claim 16 is the wide-angle image transmitting apparatus according to any one of claims 8 to 15, wherein the plurality of audio input means and the image pickup device are arranged on the pedestal side and the mirror body is arranged facing the pedestal via a transparent member, or wherein the plurality of audio input means, the image pickup device, and the mirror body are arranged on the pedestal side and a second mirror body, which reflects the light reflected from the mirror body on the pedestal side toward the image pickup device on the pedestal side, is arranged facing the pedestal via a transparent member.

[0042] That is, the invention according to claim 16 embeds the electrical system in the pedestal and prevents the image from being interrupted by lead wires or the like.

[0043] The wide-angle image transmitting apparatus according to claim 17 is the wide-angle image transmitting apparatus according to any one of claims 8 to 16, further comprising elevation angle setting means for setting the elevation angle of the speaker relative to the plane on which the apparatus is installed, wherein the image extracting means extracts the image based on the sound source direction detected by the sound source direction detecting means and the elevation angle set by the elevation angle setting means.

[0044] That is, the invention according to claim 17 extracts the image with high accuracy.

[0045] The conference image transmitting apparatus according to claim 18 is a conference image transmitting apparatus in which the wide-angle image transmitting apparatus according to any one of claims 10 to 17 is applied to the recording of a conference, and comprises: speaker position determining means for determining the position of a speaker based on the color distribution of the image or on a moving part in the image; and predetermined region determining means for determining the predetermined region according to the determination result of the speaker position determining means.

[0046] That is, the invention according to claim 18 accurately extracts the speaker portion.

[0047] The wide-angle image reproducing apparatus according to claim 19 comprises: data input means for inputting moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on a sound source direction; image transforming means for transforming, out of the moving image data input by the data input means, the moving image data of a predetermined region into a rectangular output image based on the data on the sound source direction; and image/audio output means for outputting, in synchronization, the moving image data transformed by the image transforming means and the audio data corresponding to that moving image data.

[0048] That is, the invention according to claim 19 outputs a scene centered on the sound source direction while correcting the distorted wide-angle moving image data.

[0049] The wide-angle image reproducing apparatus according to claim 20 comprises: data input means for inputting moving image data in which a panoramic wide-angle image is captured, audio data synchronized with the moving image data, and data on a sound source direction; image extracting means for extracting, out of the moving image data input by the data input means, the moving image data of a predetermined region based on the data on the sound source direction; and image/audio output means for outputting, in synchronization, the moving image data extracted by the image extracting means and the audio data corresponding to that moving image data.

[0050] That is, the invention according to claim 20 outputs a scene centered on the sound source direction.

[0051] The wide-angle image reproducing apparatus according to claim 21 is the wide-angle image reproducing apparatus according to claim 19 or 20, further comprising: recording means for recording the moving image data, the audio data, and the data on the sound source direction input by the data input means; reproduction instructing means for instructing reproduction of the moving image data; and output control means for, when a reproduction instruction is given by the reproduction instructing means, controlling the image/audio output means based on the data recorded by the recording means so that the moving image data and the audio data corresponding to that moving image data are output in synchronization.

[0052] That is, the invention according to claim 21 makes it possible to reproduce, at any time, a scene centered on the sound source direction.

[0053] The wide-angle image reproducing apparatus according to claim 22 is the wide-angle image reproducing apparatus according to claim 19, 20, or 21, further comprising direction correcting means for correcting the direction corresponding to the sound source direction.

[0054] That is, the invention according to claim 22 reproduces the desired video even when the sound source direction is not the desired direction because of noise or the like.

[0055] The wide-angle image reproducing apparatus according to claim 23 is the wide-angle image reproducing apparatus according to any one of claims 19 to 22, further comprising region fixing means for fixing the predetermined region in the direction corresponding to the sound source direction.

[0056] That is, the invention according to claim 23 prevents image shake even when the direction of the sound source shifts slightly.

[0057] The conference image reproducing apparatus according to claim 24 is a conference image reproducing apparatus in which the wide-angle image reproducing apparatus according to any one of claims 19 to 23 is applied to the reproduction of a conference, and comprises: speaker position determining means for determining the position of a speaker based on the color distribution of the moving image data or on a moving part in the moving image data; and predetermined region determining means for determining the predetermined region according to the determination result of the speaker position determining means.

[0058] That is, the invention according to claim 24 accurately extracts the speaker portion.

[0059] The wide-angle image recording/reproducing method according to claim 25 includes: an input step of inputting a wide-angle image centered on, or having as its axis, the vertical direction, audio synchronized with the image, and the sound source direction of the audio; a recording step of recording the wide-angle image, the audio, and the sound source direction input in the input step; a transforming step of transforming, out of the wide-angle image recorded in the recording step, the image of a predetermined region in the direction corresponding to the sound source direction into a rectangular output image; and a reproducing step of reproducing, in synchronization, the image transformed in the transforming step and the audio, recorded in the recording step, that corresponds to the transformed image.

[0060] That is, the invention according to claim 25 records a wide-angle image and, at reproduction time, reproduces a scene centered on the sound source direction while correcting the distortion of the image.

【００６１】The wide-angle image recording/reproducing method according to claim 26 includes: an input step of inputting a wide-angle image whose center or axis is the vertical direction, sound synchronized with the image, and the sound source direction of the sound; a transformation step of transforming the wide-angle image input in the input step into a rectangular output image using a predetermined conversion formula; a recording step of recording the image transformed in the transformation step, the sound, and the sound source direction; an extraction step of extracting, from the rectangular image recorded in the recording step, the image of a predetermined area in the direction corresponding to the sound source direction; and a reproduction step of reproducing the image extracted in the extraction step in synchronization with the sound recorded in the recording step for that image.

【００６２】That is, the invention according to claim 26 records an image developed into a panoramic shape and allows a scene centered on the sound source direction to be reproduced at any time.

【００６３】The conference recording/reproducing method according to claim 27 is a conference recording/reproducing method in which the wide-angle image recording/reproducing method according to claim 25 or 26 is applied to recording and reproducing a conference, and includes: a speaker position determining step of determining the position of a speaker based on the color distribution of the image or on a moving part in the image; and a predetermined area determining step of determining the predetermined area based on the result of the speaker position determining step.

【００６４】That is, the invention according to claim 27 accurately extracts the speaker portion.

【００６５】The wide-angle image transmitting method according to claim 28 includes: an input step of inputting a wide-angle image whose center or axis is the vertical direction, sound synchronized with the image, and the sound source direction of the sound; a transformation step of transforming the wide-angle image input in the input step into a rectangular output image using a predetermined conversion formula; and a data sending step of sending data on the image transformed in the transformation step, data on the sound, and data on the sound source direction to a predetermined data storage destination.

【００６６】That is, the invention according to claim 28 makes it possible to reproduce an image developed into a panoramic shape.

【００６７】The wide-angle image transmitting method according to claim 29 is the wide-angle image transmitting method according to claim 28, further including an extraction step of extracting, from the image transformed in the transformation step, the image of a predetermined area in the sound source direction, wherein the data sending step sends data on the image extracted in the extraction step either in place of, or together with, the data on the wide-angle image input in the input step and the data on the sound source direction.

【００６８】That is, the invention according to claim 29 makes it possible to reproduce a scene centered on the sound source direction.

【００６９】The conference image transmitting method according to claim 30 is a conference image transmitting method in which the wide-angle image transmitting method according to claim 29 is applied to recording a conference, and includes: a speaker position determining step of determining the position of a speaker based on the color distribution of the image or on a moving part in the image; and a predetermined area determining step of determining the predetermined area based on the determination result of the speaker position determining step.

【００７０】That is, the invention according to claim 30 accurately extracts the speaker portion.

【００７１】The wide-angle image reproducing method according to claim 31 includes: a recording step of recording moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction; a transformation step of transforming, based on the data on the sound source direction, the moving image data of a predetermined area out of the moving image data recorded in the recording step into a rectangular output image; and an output step of outputting the moving image data transformed in the transformation step in synchronization with the audio data corresponding to that moving image data.

【００７２】That is, the invention according to claim 31 outputs a scene centered on the sound source direction while correcting the distorted wide-angle moving image data.

【００７３】The wide-angle image reproducing method according to claim 32 includes: a recording step of recording moving image data in which a panoramic wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction; an extraction step of extracting, based on the data on the sound source direction, the moving image data of a predetermined area out of the moving image data recorded in the recording step; and an output step of outputting the moving image data extracted in the extraction step in synchronization with the audio data corresponding to that moving image data.

【００７４】That is, the invention according to claim 32 outputs a scene centered on the sound source direction.

【００７５】The conference image reproducing method according to claim 33 is a conference image reproducing method in which the wide-angle image reproducing method according to claim 31 or 32 is applied to reproducing a conference, and includes: a speaker position determining step of determining the position of a speaker based on the color distribution of the moving image data or on a moving part in the moving image data; and a predetermined area determining step of determining the predetermined area based on the result of the speaker position determining step.

【００７６】That is, the invention according to claim 33 accurately extracts the speaker portion.

【００７７】The program according to claim 34 causes a computer to execute each step of the method according to any one of claims 25 to 33. That is, the invention according to claim 34 provides the same operation and effect as the method according to any one of claims 25 to 33.

【００７８】[0078]

【Embodiments of the Invention】Embodiments of the present invention are described in detail below with reference to the drawings. Embodiment 1. Embodiment 1 describes a conference recording/reproducing system in which the wide-angle image recording/reproducing system of the present invention is applied to recording and reproducing a conference. The description first outlines how the conference recording/reproducing system is used, then explains the elements that make up the system (the conference image transmitting apparatus, which serves as the image and audio input unit, and the conference image reproducing apparatus, which serves as the unit that records and reproduces those images and audio), and finally explains the processing flow.

【００７９】(Example of use of the conference recording/reproducing system) FIG. 1 is an explanatory diagram outlining an example in which the present invention is installed in a conference setting. The conference recording/reproducing system 100 has a conference image transmitting apparatus 200 that inputs a wide-angle image and audio, and a conference image reproducing apparatus 300 that records and reproduces the image and audio input by the conference image transmitting apparatus 200.

【００８０】As shown in the figure, the conference image transmitting apparatus 200 is placed on a table 1, captures in a single shot the directions in which the conference participants (speakers) 2 are located, that is, the entire surroundings looking out over the horizontal plane, and also inputs the conference audio. The conference image reproducing apparatus 300 is housed in a cabinet 3, records the images from the conference image transmitting apparatus 200, and reproduces the recorded conference content as needed in response to user requests (the playback monitor is omitted from the figure). On reproduction, the conference image reproducing apparatus 300 transforms the omnidirectional image captured by the conference image transmitting apparatus 200 into a rectangular output image.

【００８１】Next, each part of the conference recording/reproducing system 100 is described. (External configuration of the conference image transmitting apparatus 200) FIG. 2 is an external perspective view of the conference image transmitting apparatus 200 of Embodiment 1, and FIG. 3 shows a front view and a plan view of the same apparatus. The conference image transmitting apparatus 200 has a camera unit 201 that inputs a wide-angle image whose center or axis is the vertical direction, and a microphone unit 202 that inputs audio. Here, a wide-angle image means an image that covers at least the entire surroundings (360°) looking out over the horizontal plane.

【００８２】As shown in the figures, the conference image transmitting apparatus 200 of Embodiment 1 has four microphones 221. These microphones 221 and the image sensor (CCD) of the camera unit 201, described later, are mounted on a pedestal 203. The hyperboloidal mirror 211 of the camera unit 201, also described later, is held facing the pedestal 203 by transparent glass 204. Because transparent glass 204 is used, light directed at the hyperboloidal mirror 211 is not blocked, so an image of the entire surroundings can be input. Reference numeral 205 denotes a cable that transmits the various data.

【００８３】(Conference image transmitting apparatus 200: camera unit 201) FIG. 4 is an explanatory diagram showing a configuration example of the camera unit 201 of the conference image transmitting apparatus 200 of Embodiment 1. The camera unit 201 has a hyperboloidal mirror 211, a lens 212, a diaphragm 213, and a CCD (Charge Coupled Device) 214, which is a photoelectric conversion element.

【００８４】The camera unit 201 also includes: a drive processing unit 215 that controls the timing of the CCD 214 and performs A/D (analog-to-digital) conversion of the video signal obtained by the CCD 214; a preprocessing circuit 216 that applies preprocessing such as edge enhancement and gamma correction to the digital signal obtained by the drive processing unit 215; and a motor drive unit 217 that drives the diaphragm 213 to control the iris.

【００８５】The optical system is described next. The hyperboloidal mirror 211 is a reflecting mirror that makes wide-angle capture possible. Embodiment 1 uses a hyperboloidal mirror as the example reflecting mirror throughout, but any configuration that can capture a wide-angle image may be used. Examples of other reflecting mirrors are described in Embodiment 3.

【００８６】For the technique of capturing images with a hyperboloidal mirror 211, see, for example, A. M. Bruckstein and T. J. Richardson: Omniview Cameras with Curved Surface Mirrors, Proc. of the IEEE Workshop on Omnidirectional Vision 2000, pp. 79-84. That paper shows that using a hyperboloidal mirror allows important subjects near the horizontal direction, such as human faces, to be captured at relatively high resolution.

【００８７】FIG. 5 illustrates the optical path when the hyperboloidal mirror 211 of Embodiment 1 is used, and FIG. 6 shows the wide-angle image formed on the surface of the CCD 214 by the hyperboloidal mirror 211 of Embodiment 1. As shown, the image captured via the hyperboloidal mirror 211 is donut-shaped. The central portion in FIG. 6 shows the direction of the pedestal 203 and carries no important image information. The apex 218 of the hyperboloidal mirror 211 may therefore be painted black so that this portion is recorded as black. Alternatively, depending on the mode of use, a reference line may be drawn on the apex 218 and used for initial settings such as focus adjustment, by driving the motor drive unit 217 when the conference image transmitting apparatus 200 starts up.

【００８８】As described above, the camera unit 201 can be built from a general-purpose CCD 214 and a simply constructed hyperboloidal mirror 211. This makes it possible to capture the desired subjects in a single shot at high resolution while providing an inexpensive camera unit 201.

【００８９】(Conference image transmitting apparatus 200: configuration of the microphone unit 202) Next, the microphone unit 202 is described. As explained with reference to FIG. 2 and FIG. 3, the microphone unit 202 has a plurality of microphones 221. In the following, this plurality of microphones 221 is referred to as a microphone array where appropriate. Various types of microphone can be used for the microphones 221, such as piezoelectric and capacitive (so-called condenser) microphones. As described later, using a plurality of microphones makes it possible to detect the sound source direction (speaker direction).

【００９０】(Configuration of the conference image reproducing apparatus 300) Next, the configuration of the conference image reproducing apparatus 300 is described. FIG. 7 shows a configuration example of the conference image reproducing apparatus 300 of Embodiment 1. The conference image reproducing apparatus 300 has: a CPU (Central Processing Unit) 301 that performs the various control and processing operations; an SDRAM (Synchronous Dynamic Random Access Memory) 302; an HDD (Hard Disk Drive) 303; an input interface (hereinafter, I/F) 304 for a pointing device such as a mouse, a keyboard, buttons, and the like; a power supply 305; a display I/F 306; a large-capacity recording device 307 such as a DVD (Digital Versatile Disc)-RAM drive; and an external I/F 308 for connecting to the conference image transmitting apparatus 200. These are connected via a bus 309. The display I/F 306 is connected to a display such as a CRT.

【００９１】Next, each component of the conference image reproducing apparatus 300 is described. Following a predetermined program stored in the HDD 303, the CPU 301 transforms the wide-angle donut-shaped image shown in FIG. 6 into a rectangular output image. The CPU 301 also extracts a predetermined area in the sound source direction. The transformation and extraction processing are described later. The SDRAM 302 is used as the work area of the CPU 301 and also as a storage area for the processing programs stored in the HDD 303 and for other control programs (for example, the OS).
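As a rough illustration of turning a donut-shaped image into a rectangular one, the following sketch performs a plain polar unwrap with nearest-neighbor sampling. It is not the predetermined conversion formula of this document, which would take the hyperboloidal mirror geometry into account; the function name `unwrap_donut`, its parameters, and the choice to map the outer radius to the top row are all assumptions made for illustration.

```python
import math

def unwrap_donut(image, center, r_inner, r_outer, out_width, out_height):
    """Unwrap an annular region of `image` (a list of rows) into a rectangle.

    Each output column corresponds to one azimuth angle and each output row
    to one radius between r_outer (top row) and r_inner (bottom row).
    """
    cx, cy = center
    rows = []
    for y in range(out_height):
        # Assumed orientation: top output row samples the outer radius.
        r = r_outer - (r_outer - r_inner) * y / max(1, out_height - 1)
        row = []
        for x in range(out_width):
            angle = 2.0 * math.pi * x / out_width
            sx = int(round(cx + r * math.cos(angle)))
            sy = int(round(cy + r * math.sin(angle)))
            inside = 0 <= sy < len(image) and 0 <= sx < len(image[0])
            row.append(image[sy][sx] if inside else 0)  # 0 for out-of-range samples
        rows.append(row)
    return rows

# Tiny synthetic image whose pixel value encodes its position (10 * y + x).
demo_image = [[10 * y + x for x in range(9)] for y in range(9)]
demo_panorama = unwrap_donut(demo_image, (4, 4), 1, 3, 4, 3)
```

A practical version would interpolate between pixels and use a radius-to-row mapping derived from the mirror profile rather than a linear one.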

【００９２】As mentioned above, the external I/F 308 is the interface used to input the data sent from the conference image transmitting apparatus 200. The data input from the conference image transmitting apparatus 200 includes the wide-angle image (moving image data), the audio data, and the sound source direction data. Various kinds of I/F can be used for the external I/F 308: for example, a wired connection such as USB (Universal Serial Bus) or IEEE1394, or a wireless connection such as IrDA or Bluetooth. The data input through the external I/F 308 is stored in the large-capacity recording device 307.

【００９３】(Wide-angle conference recording/reproducing system 100: functional configuration) Next, the functional configuration of the wide-angle conference recording/reproducing system 100 is described, together with the image processing that transforms a wide-angle image into a rectangular output image and the processing that detects the sound source direction. FIG. 8 is a block diagram showing an example of the functional configuration of the conference recording/reproducing system 100.

【００９４】As its functional configuration, the wide-angle conference recording/reproducing system 100 has a wide-angle image input unit 801, an audio input unit 802, a sound source direction detection unit 803, a recording unit 804, an image transformation unit 805, a direction correction unit 806, an area fixing unit 807, and an image/audio output unit 808.

【００９５】(Wide-angle conference recording/reproducing system 100: wide-angle image input unit 801) The wide-angle image input unit 801 captures a wide-angle image whose center or axis is the vertical direction and sends the image data to the recording unit 804. An example of a wide-angle image is the donut-shaped image shown in FIG. 6. The function of the wide-angle image input unit 801 can be realized by, for example, the hyperboloidal mirror 211, the lens 212, the diaphragm 213, the CCD 214, the drive processing unit 215, and the preprocessing circuit 216 shown in FIG. 4.

【００９６】(Conference recording/reproducing system 100: audio input unit 802 and sound source direction detection unit 803) The audio input unit 802 inputs sound, converts it into an electric signal (audio data), and sends the audio data to the sound source direction detection unit 803 and the recording unit 804. The function of the audio input unit 802 can be realized by the microphones 221 (see FIG. 2 or FIG. 3). As described above, a plurality of microphones 221 are provided, and the sound source direction is detected based on the audio data from each microphone 221.

【００９７】The sound source direction detection unit 803 receives the audio data from the audio input unit 802 and detects the sound source direction. Detecting the sound source direction makes it possible to extract (cut out) the image of the speaker from the wide-angle image, so that the conference can be reproduced efficiently while preserving the sense of presence. The sound source direction detection processing is described next.
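The idea of cutting out the speaker portion can be sketched as follows, under the assumption that the wide-angle image has already been developed into a panorama whose width spans the full 360°. The function name `crop_around_azimuth`, the list-of-rows image representation, and the `view_deg` parameter are hypothetical and chosen only for this illustration.

```python
def crop_around_azimuth(panorama, azimuth_deg, view_deg):
    """Return the columns of `panorama` centered on `azimuth_deg`.

    `panorama` is a list of rows whose width covers 360 degrees.
    Columns wrap around, so a window centered near 0 degrees also
    includes columns from the right-hand edge of the panorama.
    """
    width = len(panorama[0])
    center = int(round((azimuth_deg % 360.0) / 360.0 * width)) % width
    half = max(1, int(round(view_deg / 360.0 * width / 2)))
    columns = [(center + dx) % width for dx in range(-half, half)]
    return [[row[c] for c in columns] for row in panorama]

# One-row panorama with 36 columns, i.e. 10 degrees per column.
demo_row = [list(range(36))]
window_at_0 = crop_around_azimuth(demo_row, 0, 40)
window_at_90 = crop_around_azimuth(demo_row, 90, 40)
```

The wrap-around handling matters because a speaker sitting at the seam of the panorama would otherwise be split across the two edges of the image.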

【００９８】This section describes how the sound source direction detection unit 803 detects the sound source direction from the difference in arrival time of the sound input to the microphone array. FIG. 9 illustrates the principle of sound source direction detection by the sound source direction detection unit 803. As shown in the figure, when two microphones 221 (referred to as microphone 1 and microphone 2 for convenience) are placed a distance l apart and sound arrives from the direction θ, the relationship between the audio data s1(t) output by microphone 1 and the audio data s2(t) output by microphone 2 can be expressed by the following equation (1), where t is time and v is the speed of sound.
s1(t) = s2(t - (l·cos θ)/v) ... (1)

【００９９】Equation (1) indicates that the audio data of microphone 1 arrives earlier than the audio data of microphone 2 by a time of (l·cos θ)/v. The sound source direction detection unit 803 uses this arrival time difference to identify the direction of the speaker's voice.
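To give a feel for the magnitudes involved, the following sketch evaluates the arrival time difference (l·cos θ)/v for a given microphone spacing and direction. The speed of sound value of 343 m/s and the function name `arrival_delay` are assumptions for illustration, not values taken from this document.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def arrival_delay(spacing_m, theta_deg, v=SPEED_OF_SOUND):
    """Arrival time difference (l * cos(theta)) / v in seconds."""
    return spacing_m * math.cos(math.radians(theta_deg)) / v

# Sound travelling along the baseline of two microphones 0.343 m apart
# takes about one millisecond to cover the spacing.
delay_along_baseline = arrival_delay(0.343, 0.0)
delay_broadside = arrival_delay(0.343, 90.0)
```

For sound arriving perpendicular to the baseline (θ = 90°) the difference vanishes, and it is largest in magnitude when the sound travels along the baseline.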

【０１００】To identify the sound source direction, the arrival time difference between the audio data of microphone 1 and microphone 2 is detected first. This arrival time difference is calculated, for example, from the cross-correlation value between the audio data s1(t) of microphone 1 and the audio data s2(t + dt) of microphone 2. The cross-correlation value C(t, dt) is calculated by the following equation (2).

【数１】 [Equation 1]

【０１０１】Equation (2) indicates that a sum of products is computed over the N samples preceding time t, where N is a positive integer giving the size of the correlation window. A detailed explanation is omitted, but the value of dt that maximizes C(t, dt) is the arrival time difference.
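The search for the dt that maximizes the cross-correlation can be sketched as below. Equation (2) itself is not reproduced in this text, so the exact indexing used here (a sum of s1[t - i]·s2[t - i + dt] over i = 0 .. N-1, in units of samples) is an assumption, as are the function names and the `max_lag` search bound.

```python
def cross_correlation(s1, s2, t, dt, n):
    """Sum of s1[t - i] * s2[t - i + dt] over the n samples up to index t."""
    total = 0.0
    for i in range(n):
        a = t - i
        b = t - i + dt
        # Samples that fall outside either signal contribute nothing.
        if 0 <= a < len(s1) and 0 <= b < len(s2):
            total += s1[a] * s2[b]
    return total

def estimate_lag(s1, s2, t, n, max_lag):
    """Return the lag dt in [-max_lag, max_lag] that maximizes the correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda dt: cross_correlation(s1, s2, t, dt, n))

# Synthetic check: s1 is s2 delayed by three samples.
demo_s2 = [0.0] * 20
demo_s2[8], demo_s2[9] = 1.0, 0.5
demo_s1 = [0.0, 0.0, 0.0] + demo_s2[:-3]
demo_lag = estimate_lag(demo_s1, demo_s2, t=19, n=20, max_lag=5)
```

The lag is returned in samples; dividing by the sampling rate gives the time difference in seconds. With this indexing, a signal s1 that lags s2 by three samples yields dt = -3.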

【０１０２】Next, using the microphone spacing l, the arrival time difference dt, and the speed of sound v, the angle θ between the sound direction and the baseline of the microphones is calculated by the following equation (3).

【数２】 [Equation 2] Here, θ ranges from 0° to 180° inclusive.
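Since equation (3) is not reproduced in this text, the following is only a sketch of how θ could be recovered by inverting the delay (l·cos θ)/v that appears in equation (1). The sign convention (a positive value means microphone 1 receives the sound first), the clamping of the cosine to [-1, 1], and the names `angle_from_lead` and `SPEED_OF_SOUND` are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def angle_from_lead(lead_s, spacing_m, v=SPEED_OF_SOUND):
    """Angle theta in degrees (0 to 180) between the sound direction and the baseline.

    `lead_s` is the time by which microphone 1 leads microphone 2, assumed
    equal to (l * cos(theta)) / v. Measurement noise can push the ratio
    slightly outside [-1, 1], so it is clamped before taking the arccosine.
    """
    ratio = v * lead_s / spacing_m
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))

theta_end_fire = angle_from_lead(0.001, 0.343)
theta_broadside = angle_from_lead(0.0, 0.343)
theta_opposite = angle_from_lead(-0.001, 0.343)
```

Because the arccosine only returns values from 0° to 180°, a single microphone pair cannot tell which side of the baseline the sound comes from.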

【０１０３】With the above procedure alone, the direction is detected only within a 180° range in front of the microphones 221, and the sound source direction is not uniquely determined. That is, the angle θ output by the sound source direction detection unit 803 is really the angle between the arrival direction of the sound and the baseline joining the two microphones, and, as shown in FIG. 10, the actual sound direction lies somewhere on the side surface of a cone of apex angle θ whose apex is the midpoint of the two microphones.

【０１０４】To resolve this problem, a correction is made using another pair of microphones that is not parallel to the pair formed by microphone 1 and microphone 2. FIG. 11 is an explanatory diagram showing how four microphones 221 are divided into two pairs to detect the sound source direction. As shown, each pair combines a given microphone 221 (for example, microphone 1 (or microphone 3)) with the microphone 221 farthest from it (microphone 2 (or microphone 4)).
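One way to combine the angles from two non-parallel pairs can be sketched as follows. The sketch assumes that the two baselines are perpendicular (as the diagonals of a square arrangement of four microphones would be), that pair A lies along the x axis and pair B along the y axis, and that the source is far from the array. The function name and the axis assignment are hypothetical and are not taken from this document.

```python
import math

def azimuth_elevation(theta_a_deg, theta_b_deg):
    """Combine the baseline angles of two perpendicular microphone pairs.

    For a distant source with azimuth alpha and elevation phi,
    cos(theta_a) = cos(phi) * cos(alpha) and cos(theta_b) = cos(phi) * sin(alpha).
    Returns (azimuth in [0, 360), magnitude of elevation) in degrees.
    """
    ca = math.cos(math.radians(theta_a_deg))
    cb = math.cos(math.radians(theta_b_deg))
    azimuth = math.degrees(math.atan2(cb, ca)) % 360.0
    horizontal = min(1.0, math.hypot(ca, cb))  # clamp against rounding error
    elevation = math.degrees(math.acos(horizontal))
    return azimuth, elevation

# Pair A alone reports 135 degrees for sources at azimuth 135 and 225;
# the angle from pair B separates the two cases.
source_front = azimuth_elevation(135.0, 45.0)
source_back = azimuth_elevation(135.0, 135.0)
```

Two horizontal pairs fix the azimuth over the full 360° but only the magnitude of the elevation, which is usually sufficient when the device sits on a table and the speakers are above it.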

【０１０５】Using the pair of microphones that are farthest apart maximizes the arrival time difference of the sound and improves the accuracy of direction detection. In Embodiment 1 the microphone unit 202 has four microphones 221, but the sound source direction can also be detected accurately with three microphones. FIG. 12 is an explanatory diagram showing how microphone pairs are chosen when the microphone unit consists of three microphones. As shown, arranging the microphones in an equilateral triangle allows the sound source direction to be detected accurately whichever pair is used. In the illustrated example, the first and second pairs suffice to detect sound sources in all directions, and the third pair may be used as a supplement.

【０１０６】The function of the sound source direction detection unit 803 can be realized by, for example, a control unit (not shown) of the microphones 221. Depending on the mode of use, the function may instead be realized by the CPU 301 of the conference image reproducing apparatus 300 (see FIG. 7). In that case, the sound input from each microphone 221 must be input separately to the conference image reproducing apparatus 300.

【０１０７】(Wide-angle conference recording/reproducing system 100: recording unit 804) The recording unit 804 records the moving image data of the wide-angle image output from the image input unit 801, the audio data output from the audio input unit 802, and the data on the sound source direction output from the sound source direction detection unit 803. Various recording methods are possible. For example, the moving image data is recorded in a moving image encoding format such as MPEG. For the audio data, either the MPEG audio format or the PCM format may be used.

[0108] For the sound source direction data, the time at which the sound source direction changed and the azimuth and elevation angles at that time are recorded as they occur, which makes possible the extraction (cropping) of the screen described later. FIG. 13 shows an example of the structure of the sound source direction data. It records the time at which the sound source direction changed (Time) and the azimuth angle (θ) and elevation angle (φ) of the new direction. This direction data is recorded in the large-capacity recording device 307 together with the moving image data and audio data, in a form such as a text file.
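As an illustration only, direction records of the kind described for FIG. 13 might be stored and queried as in the following Python sketch. The one-record-per-line text layout (time, azimuth, and elevation separated by whitespace) and the names parse_direction_log and direction_at are assumptions made for this example; the description only states that the data is kept in a form such as a text file.

```python
import bisect

def parse_direction_log(text):
    """Parse lines of 'time theta phi' into a time-sorted list of tuples.

    The whitespace-separated layout is an assumed file format.
    """
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        t, theta, phi = (float(x) for x in line.split())
        records.append((t, theta, phi))
    records.sort()
    return records

def direction_at(records, t):
    """Return the (theta, phi) in effect at time t, or None before the first record."""
    times = [r[0] for r in records]
    i = bisect.bisect_right(times, t) - 1  # last change at or before t
    if i < 0:
        return None
    return records[i][1], records[i][2]
```

Because only changes of direction are stored, a lookup returns the most recent record at or before the requested playback time.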

[0109] In the above example, the sound source direction data is not combined with the moving image data or the audio data. However, if a streaming format such as the RealMedia format provided by RealNetworks is used, the sound source direction data can also be embedded in a single file. Alternatively, the sound source direction data can be written to a file using a content description standard for multimedia information such as MPEG-7.

[0110] In addition, the moving image data and audio data may be recorded together in a single file, as in an MPEG program stream. Using such encoding reduces the required recording capacity. The function of the recording unit 804 can be realized by, for example, the large-capacity recording device 307. Depending on the mode of use, it may instead be realized by the HDD 303. For example, long conferences and regular conferences, which need to be preserved, may be recorded in the large-capacity recording device 307 composed of a DVD or the like, while short conferences and others with little need for long-term preservation may be recorded in the HDD 303.

[0111] (Wide-angle conference recording/playback system 100: image transformation unit 805 and related units) Next, the image transformation unit 805 and its associated functional units will be described. The image transformation unit 805 transforms the donut-shaped (or circular) wide-angle image into a rectangular output image. In general, an image obtained by capturing a wide-angle range at once differs from the shape of the scene as seen by the human eye and contains large distortion. Transformation processing is therefore required to play back, at a later time, a conference recorded in the recording unit 804.

[0112] The transformation processing for the case where the hyperboloidal mirror 211 shown in FIG. 2 or FIG. 3 is used will be described. The image transformation unit 805 transforms the donut-shaped image shown in FIG. 6 (hereinafter referred to as a donut image) into an upright image with a 360-degree viewing angle as shown in FIG. 14 (hereinafter referred to as a panoramic image).

[0113] FIGS. 15 and 16 illustrate the principle of the transformation when the hyperboloidal mirror 211 is used. FIG. 15 shows the coordinate systems of the donut image and the panoramic image, and FIG. 16 shows the relationship between the apex angle ψ as seen from the CCD 214 and the elevation angle φ. In FIG. 16, the lens 212 and the diaphragm 213 are omitted for simplicity. To simplify the conversion formulas, the optical system from the lens 212 to the CCD 214 is treated here as a pinhole camera model.

[0114] The variables in the figures have the following meanings.
(u, v): coordinates in the donut image
(u0, v0): coordinates of the center of the hyperboloidal mirror in the donut image
(θ, φ): coordinates in the panoramic image
r: distance in pixels from (u0, v0) to (u, v)
r_max: radius in pixels of the hyperboloidal mirror in the donut image
θ: azimuth angle
φ: elevation angle
ψ: apex angle measured from the optical axis of the camera
F: focal point of the hyperboloidal mirror
F': focal point of the hyperboloid paired with the hyperboloidal mirror (coincides with the optical center of the camera)

[0115] At this time, the following relationship holds between the apex angle ψ and the elevation angle φ.

[Equation 3]

where

[0116]

[Equation 4]

Further, φ_max is the elevation angle corresponding to the position of radius r_max on the donut image, and represents the limit of the camera's field of view in the elevation direction. The values of r_max and φ_max can generally be determined easily.

[0117] The transformation procedure is as follows.
(i): The polar coordinates (r, θ) corresponding to the point (u, v) are obtained by solving equation (6):
(u, v) = (r cos θ + u0, r sin θ + v0) ... (6)
(ii): The apex angle ψ corresponding to the r calculated from equation (6) is obtained from equation (7):

[Equation 5]

[0118] where

[Equation 6]

and ψ_max is the value of the apex angle ψ corresponding to the position of radius r_max on the donut image and to the elevation angle φ_max. The value of ψ_max can be obtained by substituting φ_max into equation (4).
(iii): The elevation angle φ corresponding to the ψ calculated from equation (7) is obtained from equation (4).

[0119] By the above procedure, an arbitrary point (u, v) in the donut image captured via the hyperboloidal mirror 211 can be converted into the point (θ, φ) in the panoramic image. In other words, the donut image is transformed into a panoramic image.
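The three-step procedure can be sketched in Python as follows. Only step (i), the inversion of equation (6), is taken directly from the text. Equations (4) and (7) are not reproduced in this copy, so the r-to-ψ and ψ-to-φ mappings are passed in as functions; pinhole_r_to_psi is an assumed stand-in based on the pinhole camera model (r proportional to tan ψ) and may differ from the actual equation (7). All function names are hypothetical.

```python
import math

def to_polar(u, v, u0, v0):
    """Step (i): solve (u, v) = (r cos θ + u0, r sin θ + v0) for (r, θ)."""
    du, dv = u - u0, v - v0
    r = math.hypot(du, dv)
    theta = math.atan2(dv, du) % (2.0 * math.pi)  # azimuth folded into [0, 2π)
    return r, theta

def pinhole_r_to_psi(r, r_max, psi_max):
    """ASSUMED form of step (ii): pinhole projection, r proportional to tan ψ."""
    return math.atan((r / r_max) * math.tan(psi_max))

def donut_to_panorama(u, v, u0, v0, r_to_psi, psi_to_phi):
    """Steps (i) to (iii): map a donut-image point to (θ, φ).

    r_to_psi and psi_to_phi are caller-supplied stand-ins for
    equations (7) and (4), which are not reproduced here.
    """
    r, theta = to_polar(u, v, u0, v0)
    psi = r_to_psi(r)
    phi = psi_to_phi(psi)
    return theta, phi
```

Passing the two mappings as arguments keeps the sketch independent of the mirror parameters, which would enter only through the ψ-to-φ relationship.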

[0120] When the processing capability of the processor is low, the transformation of the image data takes considerable computation time, so the conversion (u, v) → (θ, φ) may instead be performed by referring to a predetermined conversion table. FIG. 17 schematically shows an example of a conversion table for (u, v) → (θ, φ). The illustrated table stores, for each point (u, v) in the donut image, the corresponding point (θ, φ) in the panoramic image. Using this table makes it possible to perform high-speed image transformation while reducing the processing load.
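A conversion table of this kind could be precomputed once and then consulted per pixel, as in the following Python sketch. The names make_converter and build_table, the inner radius r_min, and the r_to_phi callback are assumptions for illustration; r_to_phi stands in for the combination of equations (7) and (4), which is not reproduced here.

```python
import math

def make_converter(u0, v0, r_min, r_max, r_to_phi):
    """Return a function mapping (u, v) to (theta, phi), or None outside the donut.

    r_min (inner radius of the donut) is a hypothetical parameter.
    """
    def convert(u, v):
        r = math.hypot(u - u0, v - v0)
        if r < r_min or r > r_max:
            return None  # pixel does not belong to the donut image
        theta = math.atan2(v - v0, u - u0) % (2.0 * math.pi)
        return theta, r_to_phi(r)
    return convert

def build_table(width, height, convert):
    """Precompute convert(u, v) for every pixel; table[v][u] holds the result."""
    return [[convert(u, v) for u in range(width)] for v in range(height)]
```

At run time the per-pixel trigonometry is replaced by a single lookup, table[v][u], which is the saving in processing load that the text describes.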

[0121] In addition to performing the above conversion processing, the image transformation unit 805 outputs a predetermined image area. That is, in order to reproduce the conference efficiently while preserving its sense of presence, the conference recording/playback system 100 extracts and outputs the speaker portion of the panoramic image. As shown in FIG. 8, the conference recording/playback system 100 has, as part of its functional configuration, a speaker position determination unit 809 and an area determination unit 810.

[0122] The speaker position determination unit 809 determines the speaker position from the image data input from the wide-angle image input unit 801 or the image data recorded in the recording unit 804, based on the color distribution of the image or on the moving portions of the image. One method based on color distribution is to detect a portion where skin color is locally concentrated. Determination from moving portions is possible because the speaker's mouth is necessarily moving and, in some cases, the speaker is also moving his or her body with gestures. The speaker position can therefore be determined from the portion of the image with the largest amount of movement.
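One simple way to locate the portion with the largest amount of movement is frame differencing, sketched below in Python. This is an assumed technique chosen for illustration, not necessarily the method used by the speaker position determination unit 809. Frames are taken to be grayscale images given as lists of rows, and the function name and the band_width parameter are hypothetical.

```python
def most_active_column_band(prev, curr, band_width):
    """Return the start column of the band with the largest summed frame difference.

    prev and curr are equally sized grayscale frames (lists of rows of ints).
    The search does not wrap around the 0/360-degree seam of a panorama.
    """
    height, width = len(curr), len(curr[0])
    # Per-column motion: absolute difference summed over all rows.
    col_motion = [
        sum(abs(curr[y][x] - prev[y][x]) for y in range(height))
        for x in range(width)
    ]
    best_start, best_score = 0, -1
    for start in range(0, width - band_width + 1):
        score = sum(col_motion[start:start + band_width])
        if score > best_score:
            best_start, best_score = start, score
    return best_start
```

In a panoramic image the returned column corresponds to an azimuth, so the result could be compared with, or restricted to, the detected sound source direction.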

[0123] The area determination unit 810 determines which part around the speaker position determined by the speaker position determination unit 809 is to be extracted. When the table 1 is elliptical, the distance between the camera unit 201 and each speaker differs, and so does the size of the speaker in the wide-angle image or panoramic image. If the output area had a uniform size, the speaker would in some cases appear too large or, conversely, too small. The area determination unit 810 therefore determines the area of the speaker portion so that the speaker appears at an appropriate size. The image transformation unit 805 enlarges or reduces this image as appropriate for display.

[0124] The direction correction unit 806 corrects the direction corresponding to the sound source direction. This is because the sound source direction detected by the sound source direction detecting unit 803 may not be the desired direction, owing to noise such as applause or to isolated utterances, such as a brief reply, by someone other than the speaker. In practice, there may also be a request to show, for example, a little more to the right than the area determined by the area determination unit 810. In particular, the direction may need to be corrected when the speaker is giving a presentation and writing on a whiteboard. The direction correction unit 806 corrects the sound source direction to meet such requests.

[0125] The area fixing unit 807 fixes the image area that lies in the direction corresponding to the sound source direction and that was determined by the area determination unit 810. That is, whereas the area determination unit 810 determines a relative area, for example 160 × 90 pixels, the area fixing unit 807 fixes that area at an absolute position so that it does not shake as the sound source direction changes. This prevents the image from shaking when the sound source direction shifts slightly, for example because the speaker sways.
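The description does not state how the area is held in place. One plausible approach, sketched below in Python purely as an assumption, is a dead band: the displayed direction is kept unchanged until the detected azimuth moves away from it by more than a threshold. The class name AreaFixer and the 10-degree default are hypothetical.

```python
def angular_difference(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

class AreaFixer:
    """Hold the displayed azimuth fixed while the detected azimuth stays within a dead band."""

    def __init__(self, dead_band_deg=10.0):
        self.dead_band_deg = dead_band_deg  # assumed threshold
        self.fixed_theta = None

    def update(self, theta):
        """Feed a newly detected azimuth (degrees); return the azimuth to display."""
        if (self.fixed_theta is None
                or abs(angular_difference(theta, self.fixed_theta)) > self.dead_band_deg):
            self.fixed_theta = theta % 360.0  # large change: move the area
        return self.fixed_theta
```

With this scheme, small sways of the speaker leave the cropped area untouched, while a change of speaker moves it in a single step.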

[0126] As described above, the image transformation unit 805 outputs the image of the speaker portion appropriately and without distortion. The functions of the image transformation unit 805, the area fixing unit 807, and the speaker position determination unit 809 can be realized by, for example, the CPU 301 of the conference image reproducing device 300 shown in FIG. 3 and a predetermined program stored in the HDD 303. The functions of the direction correction unit 806 and the area determination unit 810 can be realized by, for example, the CPU 301 of the conference image reproducing device 300 shown in FIG. 3, a predetermined program stored in the HDD 303, and a pointing device, keyboard, or buttons connected to the input I/F 304.

[0127] (Conference recording/playback system 100: image/audio output unit 808) The image/audio output unit 808 outputs the image (moving image data) output from the image transformation unit 805 in association with the audio that was recorded (input) at the same time that image was captured (input). That is, the image and the audio are output in synchronization. Depending on the processing speed of the processor (for example, the CPU 301), a time lag can arise between the audio and the image; by synchronizing them, the image/audio output unit 808 reproduces the conference naturally. The function of the image/audio output unit 808 can be realized by, for example, the CPU 301 of the conference image reproducing device 300 shown in FIG. 3 and a predetermined program stored in the HDD 303.

[0128] (Conference recording/playback system 100: processing flow) Next, the processing flow of the conference recording/playback system 100 will be described. FIG. 18 shows an example of this processing flow. The conference recording/playback system 100 first starts recording when a recording start button (not shown) is pressed (step S1801). From then on, wide-angle images (donut images) centered on, or having as their axis, the vertical direction are input sequentially from the camera unit 201, and audio is input sequentially from the microphone unit 202 (step S1802). For the audio input from the microphone unit 202, the sound source direction is detected continuously using the microphone pairs described above.

[0129] Next, the donut image input from the camera unit 201, the audio input from the microphone unit 202, and the detected sound source direction are recorded (step S1803). The recording is tagged as appropriate with the recording time, a file name (conference name), and so on for later playback. Because the sound source direction has already been detected, it is not necessary to record all four channels from the individual microphones 221 (that is, the four microphones 221); it suffices to record any one channel or the average of the four. In principle the time differences described above are present between the channels, but given the size of the conference image transmitting device 200 and the speed of sound, they are not at a level that causes any practical problem.
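The averaging of the four channels, and the size of the arrival-time difference that is being neglected, can be illustrated with the short Python sketch below. The function names are hypothetical, and the speed of sound is an approximate room-temperature value. The description gives no microphone spacing; for an assumed spacing of 0.1 m the largest possible delay is 0.1 / 343, or roughly 0.3 ms.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees C

def downmix(channels):
    """Average equally long channels sample by sample into one mono channel."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]

def max_arrival_delay_s(mic_spacing_m):
    """Upper bound on the arrival-time difference between two microphones.

    mic_spacing_m is an assumed parameter; the source states no spacing.
    """
    return mic_spacing_m / SPEED_OF_SOUND_M_S
```

Recording a single averaged channel instead of four reduces the stored audio to a quarter while the direction information survives in the separately recorded direction data.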

[0130] Recording ends when a recording end button (not shown) is pressed at the end of the conference (step S1804). Through the above steps, the content of the conference can be recorded as omnidirectional images, that is, as unprocessed donut images. Recording the unprocessed images makes later editing possible (extraction of image areas, correction of the sound source direction, and so on).

[0131] Next, playback of a recorded conference will be described. Playback starts when a playback start button (not shown) is pressed (step S1805). When multiple conferences are recorded on the recording medium (for example, a DVD-RAM), an index is displayed so that the user can select which conference to play back before playback starts.

[0132] The speaker position is determined using the skin-colored portion in the sound source direction of the recorded donut image as a clue (step S1806), and the area to be displayed is specified (step S1807). If the user wishes to adjust the direction of the image deliberately, an instruction to correct the direction is given at this point as appropriate.

[0133] Next, the specified area of the donut image is transformed into a rectangular image (step S1808). The transformation may be performed using the conversion formulas or by referring to the conversion table. Finally, the extracted and properly transformed image is output together with the audio (step S1809). Following these steps makes it possible to reproduce the conference efficiently while preserving the sense of presence.

[0134] In the conference recording/playback system 100, the microphone unit 202 is preferably designed so that its center of gravity lies on the optical axis of the camera unit 201. Most preferably, the center of gravity of the CCD 214 coincides with that of the plurality of microphones 221. With this design or arrangement, the coordinate system used to calculate the sound source and the coordinate system used for image conversion can be made to coincide, which reduces the computational load.

[0135] In the present embodiment, the microphone unit 202 is provided on the pedestal 203, but the direction of the sound source can also be detected by having each participant 2 carry a microphone 221 equipped with wireless communication means. For example, units that emit radio waves are installed at a plurality of known positions in the conference room, and the position of each microphone 221 can be detected by triangulation from the signal strength and time differences of the radio waves reaching it. The direction of the microphone 221 at which the largest signal amplitude is obtained can then be detected as the speaker direction. A communication technology such as Bluetooth can be used as the wireless communication means.
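Once the microphone positions are known, choosing the speaker direction reduces to picking the loudest microphone and computing its bearing from the camera, as in the Python sketch below. The position measurement itself (triangulation from radio signals) is not modeled. The function name, the (x, y, amplitude) tuple layout, and the convention that azimuth is measured counterclockwise from the +x axis are assumptions for this example.

```python
import math

def speaker_azimuth(mics, camera_xy=(0.0, 0.0)):
    """Return the azimuth in degrees, in [0, 360), of the loudest microphone.

    mics: iterable of (x, y, amplitude), positions in the camera's floor plane.
    camera_xy: position of the camera's optical axis in the same plane.
    """
    x, y, _ = max(mics, key=lambda m: m[2])  # microphone with the largest amplitude
    cx, cy = camera_xy
    return math.degrees(math.atan2(y - cy, x - cx)) % 360.0
```

The resulting azimuth can be recorded in the same way as the direction obtained from the pedestal-mounted microphone unit.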

[0136] The functions of the conference image reproducing device 300 can be realized by a personal computer. In that case, the software implementing each functional unit is stored on a hard disk, and the functions are realized by executing the processing programs as needed.

[0137] As described above, the conference recording/playback system of the first embodiment can capture all conference participants at once with a simple configuration, using a simple optical system based on a hyperboloidal mirror. Recording this content allows the conference to be reproduced. For playback, transforming and outputting only the necessary portion reproduces the conference with a strong sense of presence, centered on the speaker. In particular, because the conference is recorded in all directions, the user can also look back on conference scenes under whatever conditions he or she prefers.

[0138] Embodiment 2. The second embodiment describes a conference image recording/playback system that transforms the wide-area image into a panoramic image before recording it. In the second embodiment, components identical to those of the first embodiment are given the same reference numerals and their description is omitted. The external configuration, hardware configuration, functional configuration, and processing flow of the conference recording/playback system 1900 are described below in that order.

[0139] (External configuration of the conference recording/playback system 1900) FIG. 19 shows an example of the external configuration of the image recording/playback system of the second embodiment. The image recording/playback system 1900 has a cross button 1901, an enter button 1902, an image/audio output terminal 1903, and a medium insertion slot 1904. In the conference recording/playback system 100 of the first embodiment, the conference image transmitting device 200, which captures images and audio, and the conference image reproducing device 300, which records, processes, and plays back the moving images, are separate units. The conference recording/playback system 1900 of the second embodiment, by contrast, performs input, storage, processing, and playback output of images and audio in a single housing.

[0140] First, the externally visible parts listed above will be described. The cross button 1901 is used to move a menu or pointer displayed on a screen (not shown). For example, it is used to enter a conference name when creating a conference file. When multiple conferences have been recorded, it is also used to select the name of the conference file to be played back. It is further used to correct the sound source direction, for example by entering the elevation angle of the speaker.

[0141] The enter button 1902 is used to confirm various choices, for example to confirm the item selected with the cross button 1901. The enter button can also serve as a multi-function button, for example by assigning power on/off and playback stop to it.

[0142] The image/audio output terminal 1903 outputs the data processed by the conference recording/playback system 1900, that is, the image signal of a distortion-free image in which a given speaker has been cropped out, together with the audio signal accompanying that image. The data format may be the MPEG format or RealAudio format mentioned above, but here the signal format transmitted and received through the VIDEO terminal (VHF/UHF terminal) found on an ordinary television is adopted. With such a general-purpose signal format, the conference can be played back on an ordinary television without any special control circuit.

[0143] The medium insertion slot 1904 is a slot into which the recording medium used to record the conference is inserted. The first embodiment assumed a DVD-RAM or the like, but here a PCMCIA socket is adopted so that a high-density, large-capacity card-type HDD can be inserted. This configuration allows the device to be made smaller. In some cases, the slot may instead accept a DVD-RW or a DAT tape. When the device has a mechanical drive unit, a noise-suppressing structure is adopted so that the conference recording/playback system 1900 does not pick up mechanical noise.

[0144] (Conference recording/playback system 1900: hardware configuration) Next, the hardware configuration of the conference recording/playback system 1900 will be described. FIG. 20 shows an example of the hardware configuration of the conference image recording/playback system of the second embodiment. In addition to the CPU 301, the conference recording/playback system 1900 has a RAM 2001, a ROM 2002, an operation unit 2003, an output I/F 2004, a camera unit 2005, a microphone unit 2006, and a removable media unit 2007. The camera unit 2005 is a notation used for convenience to denote the camera unit 201, including its optical system, shown in FIG. 19; likewise, the microphone unit 2006 denotes the microphone unit 202, including the microphones 221, shown in FIG. 19.

[0145] The RAM 2001 is used as the work area of the CPU 301 and also as a storage area for the processing programs stored in the HDD 303 and for other control programs (for example, the OS). The ROM 2002 stores fixed control information and coefficients. For example, it may store the conversion table (correspondence table) shown in FIG. 17.

[0146] The operation unit 2003 consists of the cross button 1901 and the enter button 1902. The output I/F 2004 consists of the image/audio output terminal 1903, a video card, and a video memory, and sends the image signal and audio signal to the video input terminal of a television (not shown). The removable media unit 2007 controls the drive for writing to and reading from the PCMCIA-type large-capacity HDD inserted in the medium insertion slot 1904.

[0147] (Conference recording/playback system 1900: functional configuration) Next, the functional configuration of the conference recording/playback system 1900 will be described. FIG. 21 shows an example of this functional configuration. In addition to the functional units described with reference to FIG. 8, the conference recording/playback system 1900 has a wide-angle image expansion unit 2101 and an image extraction unit 2102.

[0148] (Conference recording/playback system 1900: wide-angle image expansion unit 2101) The wide-angle image expansion unit 2101 transforms the donut image into a panoramic image. The conference recording/playback system 100 of the first embodiment transforms the image at playback time (see the image transformation unit 805 in FIG. 8), whereas the conference recording/playback system 1900 of the second embodiment transforms the image at recording time. In other words, in the conference recording/playback system 1900 the wide-angle image is expanded into a panoramic image before being recorded in the recording unit 804, and it is this panoramic image that is recorded. The expansion can be calculated using equations (4) to (8), so its description is omitted.

[0149] When the processing capability of the CPU 301 (see FIG. 20) is low, the transformation of the image data takes considerable computation time, so the wide-angle image and the panoramic image may instead be associated with each other by referring to a predetermined conversion table. Using such a table makes it possible to perform high-speed image transformation while reducing the processing load.

[0150] The function of the wide-angle image expansion unit 2101 can be realized by, for example, the CPU 301 and a wide-angle image expansion program stored in the HDD 303. Both the conference recording/playback system 1900 and the conference recording/playback system 100 retain 100% of the original information, so a scene in any desired sound source direction can be played back at any time.

[0151] (Conference Recording/Playback System 1900: Image Extraction Unit 2102) The image extraction unit 2102 cuts out (extracts) a predetermined image area corresponding to the sound source direction from the panoramic image recorded in the recording unit 804 and outputs it to the image/audio output unit 808. For example, while conference participant A (see FIG. 6) is speaking, the portion corresponding to participant A is extracted, based on the sound source direction data, from the video data expanded and recorded as shown in FIG. 14. Hereinafter, this cut-out image is referred to as a partial image. FIG. 22 is an explanatory diagram showing an example of image extraction. As illustrated, the image extraction unit 2102 generates partial image data showing only participant A.

[0152] The image extraction procedure is as follows. FIG. 23 is an explanatory diagram illustrating how the image extraction unit 2102 of Embodiment 2 generates partial image data. First, the angular range to be extracted as partial image data is set in advance. This range is Δθ in the azimuth direction and Δφ in the elevation direction. Next, the azimuth θ and elevation φ detected by the sound source direction detection unit 803 are read. Finally, partial image data is generated by extracting, from the panoramic image data input from the recording unit 804, the region corresponding to the azimuth θ and elevation φ, that is, the region bounded by (θ - Δθ/2, φ - Δφ/2), (θ + Δθ/2, φ - Δφ/2), (θ - Δθ/2, φ + Δφ/2), and (θ + Δθ/2, φ + Δφ/2).
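The extraction of the Δθ × Δφ window can be sketched as follows. This is a minimal illustration under stated assumptions: the panorama columns are taken to span 0° to 360° of azimuth, the rows to span `phi_max` (top) down to `phi_min` (bottom), and the function name `extract_partial` is hypothetical. Azimuth wraps around the panorama, so a speaker near the seam is still extracted as one contiguous window.

```python
def extract_partial(pano, theta, phi, d_theta, d_phi, phi_min, phi_max):
    """Crop the d_theta x d_phi window centred on (theta, phi); angles in degrees.

    Assumptions: columns cover 0..360 degrees of azimuth, row 0 is phi_max.
    """
    height = len(pano)
    width = len(pano[0])
    cols_per_deg = width / 360.0
    rows_per_deg = height / (phi_max - phi_min)

    # Azimuth wraps around, so column indices are taken modulo the width.
    col_start = int(round((theta - d_theta / 2.0) * cols_per_deg))
    n_cols = max(1, int(round(d_theta * cols_per_deg)))
    cols = [(col_start + i) % width for i in range(n_cols)]

    # Elevation does not wrap, so the row range is clamped to the image.
    row_top = int(round((phi_max - (phi + d_phi / 2.0)) * rows_per_deg))
    row_bottom = int(round((phi_max - (phi - d_phi / 2.0)) * rows_per_deg))
    row_top = max(0, row_top)
    row_bottom = min(height, row_bottom)

    return [[pano[r][c] for c in cols] for r in range(row_top, row_bottom)]

# Usage: a 360 x 90 panorama (1 pixel per degree) whose pixels record (row, col).
pano = [[(r, c) for c in range(360)] for r in range(90)]
part = extract_partial(pano, theta=10, phi=0, d_theta=40, d_phi=20,
                       phi_min=-45, phi_max=45)
```

In this usage the window straddles the 0° seam, so its columns run from 350 through 359 and then from 0 through 29.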

[0153] Depending on the mode of use, the image extraction unit 2102 may extract the image directly from the donut image. In this case, as shown in FIG. 15 for the donut image, only the Δθ × Δφ rectangular region centered on (θ, φ) of the coordinate conversion table is accessed, so that partial image data is cut out of the donut image data and transformed. When an image transformation unit 805 is provided as in Embodiment 1, the Δθ × Δφ rectangular region centered on (θ, φ) may instead be extracted directly from the panoramic image generated by the image transformation unit 805.

[0154] The function of the image extraction unit 2102 can be realized by, for example, the CPU 301 of the conference recording/playback system 1900 shown in FIG. 20 and an image extraction program stored in the HDD 303. In Embodiment 2, the image/audio output unit 808 outputs the image (moving image data) output from the image extraction unit 2102 in association with the audio that was recorded (input) at the same time the image was captured (input). That is, the image and the audio are output in synchronization. Depending on the processing speed of the CPU 301 (see FIG. 20), a time lag can arise between audio and image, so the image/audio output unit 808 synchronizes the two to reproduce the conference naturally.

[0155] (Conference Recording/Playback System 1900: Processing Flow) Next, the processing flow of the conference recording/playback system 1900 is described. FIG. 24 is an explanatory diagram showing an example of that processing flow. The system first starts recording when a recording start button (not shown) is pressed (step S2401). From then on, wide-angle images centered on, or having as their axis, the vertical direction are input sequentially from the camera unit 201, and audio is input sequentially from the microphone unit 202 (step S2402). For the audio input from the microphone unit 202, the sound source direction is detected continuously using the microphone pairs described earlier.

[0156] Next, the wide-angle images (donut images) input from the camera unit 201 are transformed one after another into panoramic images (step S2403). The panoramic image, the audio input from the microphone unit 202, and the detected sound source direction are then recorded (step S2404). For later playback, the recording is tagged as appropriate with the recording time, a file name (conference name), and the like.

[0157] When the conference ends and a recording end button (not shown) is pressed, recording stops (step S2405). Through the above steps, an image of the entire surroundings, that is, content covering the whole conference, can be recorded. Because the stored image covers the entire surroundings, the user can edit it later as desired (extracting image regions, correcting the sound source direction, and so on).

[0158] Next, playback of the recorded conference is described. The conference recording/playback system 1900 starts playback when a playback start button (not shown) is pressed (step S2406). If multiple conferences are recorded on the recording medium (a PCMCIA-type hard disk), an index is displayed so that the user can select which conference to play back.

[0159] The speaker position is determined using the skin-colored portion of the recorded panoramic image in the sound source direction as a clue (step S2407), and the image of the region to be displayed is extracted (step S2408). Finally, the extracted image is output together with the audio (step S2409). Following these steps makes it possible to reproduce the conference efficiently while maintaining the sense of presence.

[0160] As described above, the conference recording/playback system of Embodiment 2 can capture all conference participants at once with a simple configuration, using a simple optical system built around a hyperboloidal mirror. Recording this content allows the conference to be reproduced. Furthermore, because the donut image is recorded after being expanded into a panoramic image, a system with a low load at playback time can be built.

[0161] Embodiment 3. Embodiment 3 describes a conference recording/playback system whose camera unit and microphone unit differ from those of Embodiment 1 or 2. FIG. 25 is an explanatory diagram showing an example of the external configuration of an apparatus including the camera unit of Embodiment 3. As is clear from the figure, the camera unit 2501 of the conference recording/playback system 2500 has a conical mirror body 2502 instead of a hyperboloidal mirror. The conversion formula from the donut image to the panoramic image is not described here, but a lens is arranged as appropriate, like the lens 212 shown in FIG. 4, so that the image is focused on the surface of the CCD 214. Depending on the mode of use, a mirror body with a paraboloidal surface may be used instead.

[0162] In the examples above, a single reflecting mirror is used (the hyperboloidal mirror 211, the conical mirror body 2502, or a paraboloidal mirror body), but the configuration is not limited to this, and two reflecting mirrors may be used. FIG. 26 is an external configuration diagram of a camera unit configured to capture a donut image using two reflecting mirrors. The camera unit 2600 has a first reflecting mirror 2601, which is a paraboloidal or hyperboloidal mirror, and a second reflecting mirror 2602, which deflects the light reflected by the first reflecting mirror toward the CCD. A hole is opened at the apex of the first reflecting mirror 2601 to admit the light reflected from the second reflecting mirror.

[0163] Next, the microphone unit is described. FIG. 27 is an explanatory diagram illustrating the relationship between the microphone unit of Embodiment 3 and the sound source direction. The microphone unit 202 of Embodiments 1 and 2 uses omnidirectional microphones 221 and detects the sound source direction from differences in the arrival time of the sound. The microphone unit 2701 of Embodiment 3 has four directional microphones 2702 and determines the sound source direction from the sound intensity. For convenience, the four microphones 2702 are called microphones 1 to 4.

[0164] Suppose the sound intensity is 20 at microphone 1, 30 at microphone 2, 20 at microphone 3, and 5 at microphone 4. In this case the sound source is judged to lie in the direction of microphone 2. Comparing the intensities at microphones 1 and 3 shows that both have the same value, 20, so the sound source direction is finally determined to be the direction of microphone 2 (the direction marked θ = 45° in the figure).

[0165] Consider another example in which the sound intensity is 15 at microphone 1, 30 at microphone 2, 25 at microphone 3, and 5 at microphone 4. The initial judgment is again that the sound source lies in the direction of microphone 2. Comparing microphones 1 and 3, the intensity at microphone 3 is greater than at microphone 1, so the sound source direction is determined to be the direction of microphone 2 shifted slightly toward microphone 3 (the direction marked θ = 30° in the figure). The amount of this shift can be determined in advance according to the characteristics of the directional microphones. Using the directional microphones 2702 in this way avoids calculations such as Expressions (1) to (3) and therefore reduces the load on the processor.
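The two worked examples can be reproduced with a short sketch of the intensity rule: pick the loudest microphone, then nudge the estimate toward whichever neighbour is louder. The microphone angles (135°, 45°, 315°, 225° for microphones 1 to 4) and the fixed 15° shift are assumptions chosen only so that the sketch matches the figures quoted in the text; the description says only that the shift is predetermined from the microphone characteristics. The function name `estimate_azimuth` is hypothetical.

```python
def estimate_azimuth(levels, mic_angles, shift_deg):
    """Estimate the source azimuth (degrees) from directional-microphone levels.

    levels and mic_angles are listed in ring order, so list neighbours are
    physical neighbours. shift_deg is a predetermined constant (assumption).
    """
    n = len(levels)
    peak = max(range(n), key=lambda i: levels[i])   # loudest microphone
    prev_i, next_i = (peak - 1) % n, (peak + 1) % n
    angle = mic_angles[peak]

    def direction_toward(neighbour):
        # Sign of the shortest angular path from the peak to the neighbour.
        diff = (mic_angles[neighbour] - mic_angles[peak] + 180) % 360 - 180
        return 1 if diff > 0 else -1

    if levels[next_i] > levels[prev_i]:
        angle += direction_toward(next_i) * shift_deg
    elif levels[prev_i] > levels[next_i]:
        angle += direction_toward(prev_i) * shift_deg
    # Equal neighbours: keep the direction of the loudest microphone.
    return angle % 360

# Assumed layout for microphones 1 to 4, in degrees.
MIC_ANGLES = [135, 45, 315, 225]
first = estimate_azimuth([20, 30, 20, 5], MIC_ANGLES, shift_deg=15)
second = estimate_azimuth([15, 30, 25, 5], MIC_ANGLES, shift_deg=15)
```

The rule needs only comparisons and one addition per estimate, which is the processor saving the text refers to relative to correlation-based detection.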

[0166] Embodiment 4. Embodiment 4 describes a versatile conference image transmitting apparatus and conference image reproducing apparatus. Versatile here means that, even if there are several types of conference image transmitting apparatus and conference image reproducing apparatus, differing in the configuration of the mirror body that captures the wide-angle image, the type of microphone, and so on, a conference can be recorded and played back with any combination of them. In Embodiment 4 as well, components identical to those of Embodiments 1 to 3 are given the same reference numerals unless otherwise noted, and their description is omitted.

[0167] The conference recording/playback system 2800 of Embodiment 4 has a conference image transmitting apparatus 2801 and a conference image reproducing apparatus 2802. FIG. 28 shows the functional blocks of the two apparatuses. As its functional configuration, the conference image transmitting apparatus 2801 has a wide-angle image input unit 2811, an audio input unit 2812, a sound source direction detection unit 2813, an elevation angle setting unit 2814, and a data transmission unit 2815.

[0168] The wide-angle image input unit 2811 captures a wide-angle image centered on, or having as its axis, the vertical direction, and outputs the image data to the data transmission unit 2815. The wide-angle image may be input using the hyperboloidal mirror 211 described in Embodiment 1, the conical mirror body 2502 described in Embodiment 3, or a paraboloidal reflecting mirror.

[0169] The audio input unit 2812 takes in sound, converts it into an electric signal (audio data), and sends the audio data to the sound source direction detection unit 2813 and the data transmission unit 2815. The sound may be input using the omnidirectional microphones 221 described in Embodiment 1 or the directional microphones 2702 described in Embodiment 3. The sound source direction detection unit 2813 detects the sound source direction from the time difference or intensity of the audio input from the audio input unit 2812. The detection principle has already been described and is omitted here.

[0170] The elevation angle setting unit 2814 sets the elevation angle, that is, the angle in the height direction of the speaker. As explained with reference to FIG. 10, a sound source direction detection unit generally has a large error in the elevation direction. The elevation angle setting unit 2814 therefore sets the elevation angle measured from the plane on which the conference image transmitting apparatus 2801 is installed. The angle ψ may be entered directly, for example with a numeric keypad, or it may be set based on detection of the speaker's image data (skin-color data).

[0171] The data transmission unit 2815 sends the wide-angle image, the audio, and the data on the sound source direction, including the elevation angle, to a predetermined data storage means. Here the data is sent to the conference image reproducing apparatus 2802. Embodiments 1 to 3 described wired data transmission, but transmission is not limited to this and may be wireless. Various wireless transmission methods can be used; for example, a wireless interface such as IrDA or Bluetooth can be adopted.

[0172] Next, the conference image reproducing apparatus 2802 is described. As its functional configuration, the conference image reproducing apparatus 2802 has a data input unit 2821, a recording unit 2822, an image transformation unit 2823, a region determination unit 2824, and an image/audio output unit 2825. It also has the direction correction unit 806 and the region fixing unit 807. Each functional unit is described separately below, but the functions of the conference image reproducing apparatus 2802 can be realized with a personal computer. In that case, software implementing each functional unit is stored on the hard disk, and the functions are realized by executing the processing programs as appropriate.

[0173] The data input unit 2821 receives, from a predetermined data source, moving image data of captured wide-angle images, audio data synchronized with the moving image data, and data on the sound source direction. The data source here is the conference image transmitting apparatus 2801, but as long as the moving image data, audio data, and sound source direction data are in a format that lets their types be recognized, the input does not depend on the source apparatus. The data type can be determined from the file extension or the file header. The wide-angle image is assumed here to be a donut image, but it may be a panoramic image; this distinction is likewise determined from the extension or header. The data input unit 2821 can use, for example, a wireless interface such as IrDA or Bluetooth.

[0174] The recording unit 2822 records the wide-angle moving image data, the audio data, and the sound source direction data including the elevation angle that were input by the data input unit 2821. Various recording formats are possible; as mentioned earlier, the MPEG format or the RealAudio format can be adopted.

[0175] The image transformation unit 2823 transforms the wide-angle image into a rectangular output image. Because the optics are designed to focus on the CCD 214, the image captured by the CCD 214 is always a donut image. The transformation is therefore performed, as described earlier, by referring to a correspondence table (not shown) between the donut image and the panoramic image. Since the final output image is the portion containing the speaker, the image transformation unit 2823 transforms only the image region determined by the region determination unit 2824.

[0176] The region determination unit 2824 determines the region to be reproduced based on the sound source direction data, including the elevation angle, recorded in the recording unit 2822. As described in Embodiment 1, it may be combined with the speaker position determination unit 809 to improve the accuracy of speaker position detection. The image/audio output unit 2825 outputs the image (moving image data) output from the image transformation unit 2823 in association with the audio recorded (input) at the same time the image was captured (input).

[0177] Next, the processing flow of the conference image transmitting apparatus 2801 is described. FIG. 29 is a flowchart showing an example of this processing flow. First, the user starts the system of the conference image transmitting apparatus 2801, and capture of data (image data and audio data) begins (step S2901). Next, it is determined whether a stop of capture (recording) has been instructed (step S2902); if so (step S2902: Yes), capture ends.

[0178] As long as no stop is instructed (step S2902: No), the image data sent from the CCD 214 and the audio data sent from the microphone array continue to be input (step S2903). When a certain amount of audio data has been input, for example as many samples as the size N of the correlation window C in Expression (2), the sound source direction is detected and sound source direction data is generated (step S2904). The conference image transmitting apparatus 2801 sequentially outputs the image data, audio data, and sound source direction data to a predetermined destination, for example a PC (step S2905). Thereafter, the operations of steps S2902 to S2904 are repeated, and data continues to be sent until the user instructs recording to stop.
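The buffering rule of step S2904, in which a direction estimate is produced each time N samples have accumulated, can be sketched as a small generator. This is an illustration only: `stream_directions` is a hypothetical name, and `detect_direction` is a stand-in callback for the correlation-based detector of Expressions (1) to (3), which are not reproduced here.

```python
def stream_directions(sample_blocks, window_size, detect_direction):
    """Yield one direction estimate per full window of audio samples.

    sample_blocks: iterable of sample lists as they arrive from the microphones.
    window_size:   N, the number of samples needed for one estimate.
    detect_direction: hypothetical callback that maps a window to a direction.
    """
    buffer = []
    for block in sample_blocks:
        buffer.extend(block)
        # Emit an estimate for every complete window; keep the remainder.
        while len(buffer) >= window_size:
            window, buffer = buffer[:window_size], buffer[window_size:]
            yield detect_direction(window)

# Usage with a dummy detector (sum of the samples) in place of a real one.
blocks = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
estimates = list(stream_directions(blocks, window_size=4, detect_direction=sum))
```

Samples that do not yet fill a window stay in the buffer, so input blocks of any size can be fed in as they arrive.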

[0179] Next, the processing flow of the conference image reproducing apparatus 2802 is described. FIG. 30 is a flowchart showing an example of this processing flow. First, the user starts the system of the conference image reproducing apparatus 2802 (step S3001). Next, the image to be played back is selected according to the screen shown on a display (television), not shown (step S3002). FIG. 31 shows an example of the screen for selecting the image to be played back. As illustrated, the conference files are named Meeting1 and Meeting2, and each consists of image data (MPEG-2 Video), audio data (MPEG Audio), and sound source direction data (TEXT).

[0180] Next, the conference image reproducing apparatus 2802 reads the wide-angle image data, audio data, and sound source direction data and starts playback (step S3003). It then determines whether a stop of playback has been instructed (step S3004) and, if so, stops playback. If no stop has been instructed (step S3004: No), it determines whether the time to query the sound source direction data has been reached (step S3005). The query time is, for example, a time at which the sound source direction changed, as shown in FIG. 13.

[0181] When the query time has been reached (step S3005: Yes), the sound source direction data is accessed and a new sound source direction (values of the azimuth θ and the elevation φ) is obtained (step S3006). The conference image reproducing apparatus 2802 then extracts the partial image data corresponding to the azimuth θ and elevation φ obtained in step S3006 (step S3007) and outputs (plays back) the extracted partial image data in synchronization with the audio (step S3008). If the query time has not been reached in step S3005 (step S3005: No), playback of the partial image data currently being played simply continues (step S3009).
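The lookup in steps S3005 and S3006 can be sketched as a search over a time-sorted change log. The log layout, a list of (time in seconds, θ, φ) tuples with one entry per change of sound source direction, is an assumption made for illustration; the description states only that the direction data is stored as text. The function name `direction_at` is hypothetical.

```python
import bisect

def direction_at(change_log, t):
    """Return the (theta, phi) in effect at playback time t, or None.

    change_log: list of (time, theta, phi) tuples sorted by time, one entry
    per change of sound source direction (assumed layout).
    """
    times = [entry[0] for entry in change_log]
    # Index of the last change that happened at or before time t.
    i = bisect.bisect_right(times, t) - 1
    if i < 0:
        return None  # no direction has been recorded yet
    return change_log[i][1:]

# Usage: the direction only changes at the logged times.
log = [(0.0, 45, 10), (3.5, 120, 12), (8.0, 300, 8)]
current = direction_at(log, 3.4)   # still the first speaker
changed = direction_at(log, 3.5)   # switches to the second speaker
```

Between logged times the function keeps returning the same direction, which corresponds to continuing with the current partial image in step S3009.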

[0182] As described above, in Embodiment 4 the conference can be reproduced efficiently while maintaining the sense of presence even when the conference image transmitting apparatus and the conference image reproducing apparatus are configured as separate, independent units, like a video camera and a video deck.

[0183] Embodiment 5. Embodiment 5 describes another configuration example of a versatile conference image transmitting apparatus and conference image reproducing apparatus. In Embodiment 5 as well, components identical to those of Embodiments 1 to 4 are given the same reference numerals unless otherwise noted, and their description is omitted.

[0184] FIG. 32 is a functional block diagram of the conference image transmitting apparatus and the conference image reproducing apparatus of Embodiment 5. The conference recording/playback system 3200 of Embodiment 5 has a conference image transmitting apparatus 3201 and a conference image reproducing apparatus 3202. As its functional configuration, the conference image transmitting apparatus 3201 has a wide-angle image input unit 3211, an audio input unit 3212, a sound source direction detection unit 3213, a wide-angle image expansion unit 3214, an image extraction unit 3215, and a data transmission unit 3216.

[0185] The wide-angle image input unit 3211 captures a wide-angle image centered on, or having as its axis, the vertical direction, and outputs the image data to the wide-angle image expansion unit 3214. As in Embodiment 4, the wide-angle image may be input using the hyperboloidal mirror 211, the conical mirror body 2502, or a paraboloidal reflecting mirror. The audio input unit 3212 takes in sound, converts it into an electric signal (audio data), and sends the audio data to the sound source direction detection unit 3213 and the data transmission unit 3216. The audio input unit 3212 may use either directional or omnidirectional microphones. The sound source direction detection unit 3213 detects the sound source direction from the time difference or intensity of the audio input from the audio input unit 3212 and outputs it to the image extraction unit 3215 and the data transmission unit 3216.

[0186] The wide-angle image expansion unit 3214 transforms the donut image into a panoramic image and outputs it to the image extraction unit 3215 and the data transmission unit 3216. The image extraction unit 3215 extracts, from the panoramic image output by the wide-angle image expansion unit 3214, the image of a predetermined portion in the speaker direction, based on the sound source direction output by the sound source direction detection unit 3213. The data transmission unit 3216 sends the panoramic image (entire region), the extracted image (partial image in the speaker direction), the audio, and the data on the sound source direction to a predetermined data storage means. Here the data is sent to the conference image reproducing apparatus 3202.

【０１８７】次に、会議画像再生装置３２０２について
説明する。会議画像再生装置３２０２は、その機能的構
成として、データ入力部３２２１と、記録部３２２２
と、画像音声出力部３２２３と、方向修正部３２２４と
を有する。なお、以降では各機能部を分説するが、会議
画像再生装置３２０２はパーソナルコンピュータにより
その機能を実現させることができる。この場合は各機能
部を実現するソフトウェアをハードディスクに格納し、
適宜処理プログラムを実行させることによりその機能を
実現させることができる。Next, the conference image reproducing apparatus 3202 will be described. As its functional configuration, the conference image reproducing apparatus 3202 has a data input unit 3221, a recording unit 3222, an image/audio output unit 3223, and a direction correction unit 3224. Although each functional unit is described separately below, the functions of the conference image reproducing apparatus 3202 can be realized by a personal computer. In that case, software implementing each functional unit is stored on the hard disk, and the functions are realized by executing the processing program as appropriate.

【０１８８】データ入力部３２２１は、所定のデータ送
信元から広角画像が撮像された動画データと、当該動画
データに同期した音声データと、音源方向に関するデー
タと、を入力する。ここでは、所定のデータ送信元を会
議画像送出装置３２０１としているが、動画データ（全
体画像と部分画像）、音声データ、音源方向に関するデ
ータを、そのデータの種別が認識できる様な形式であれ
ば送信元の装置には依存しない。The data input unit 3221 receives, from a predetermined data transmission source, moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction. Here the data transmission source is the conference image transmitting apparatus 3201, but the unit does not depend on the source device as long as the moving image data (entire image and partial image), the audio data, and the sound source direction data arrive in a format that allows the type of each to be recognized.

【０１８９】記録部３２２２は、データ入力部３２２１
が入力したパノラマ画像と話者方向の部分画像の動画デ
ータ、音声データ、音源方向に関するデータを記録す
る。記録の方式は様々挙げられるが、前述した様に、Ｍ
ＰＥＧ形式やＲｅａｌＡｕｄｉｏ形式を採用することが
できる。画像音声出力部３２２３は、記録部３２２２か
ら出力された話者方向の部分画像（動画データ）と、こ
の画像が撮影（入力）された際に同時に録音（入力）し
た音声を対応づけて出力する。The recording unit 3222 records the moving image data of the panoramic image and of the partial image in the speaker direction, the audio data, and the sound source direction data input by the data input unit 3221. Various recording formats are possible; as mentioned above, the MPEG format or the RealAudio format can be adopted. The image/audio output unit 3223 outputs the partial image in the speaker direction (moving image data) output from the recording unit 3222 in association with the audio that was recorded (input) at the same time that image was captured (input).

【０１９０】但し、話者方向の部分画像が適正に抽出さ
れていない場合や、話者以外の画像、たとえば、隣り合
った二人やホワイトボードを含んだ話者を表示させたい
場合がある。そこで、この様な要求を満たすべく、会議
画像再生装置３２０２は、方向修正部３２２４を備え
る。方向修正部３２２４は、音源方向に対応する方向を
修正し、ユーザによる所望の音声方向を選択可能にす
る。なお、ユーザによる方向の選択については後述す
る。However, there are cases where the partial image in the speaker direction has not been extracted properly, or where the user wants to display more than the speaker alone, for example two adjacent people, or the speaker together with a whiteboard. To meet such needs, the conference image reproducing apparatus 3202 includes the direction correction unit 3224. The direction correction unit 3224 corrects the direction corresponding to the sound source direction so that the user can select the desired direction. Selection of the direction by the user is described later.

【０１９１】次に、会議画像送出装置３２０１の処理流
れについて説明する。図３３は、実施の形態５の会議画
像送出装置３２０１の処理流れの例を示したフローチャ
ートである。まず、会議画像送出装置３２０１のシステ
ムがユーザにより起動され、データ（画像データと音声
データ）の取り込み動作を開始する（ステップＳ３３０
１）。次に、取り込み停止（記録停止）が指示された否
かを判断し（ステップＳ３３０２）、指示があれば（ス
テップＳ３３０２：Ｙｅｓ）、取り込みを終了する。Next, the processing flow of the conference image transmitting apparatus 3201 will be described. FIG. 33 is a flowchart showing an example of the processing flow of the conference image transmitting apparatus 3201 according to the fifth embodiment. First, the user starts the system of the conference image transmitting apparatus 3201, which begins capturing data (image data and audio data) (step S3301). Next, it is determined whether a stop of capturing (stop of recording) has been instructed (step S3302); if so (step S3302: Yes), capturing ends.

【０１９２】取り込み停止の指示がない限り（ステップ
Ｓ３３０２：ＮＯ）、ＣＣＤ２１４から送信される画像
データとマイクロフォンアレイから送信される音声デー
タを入力する（ステップＳ３３０３）。音声データがあ
る一定量、たとえば相関窓の大きさＮと同数のサンプル
が入力された場合には、音源方向を検出し、音源方向デ
ータを順次生成する（ステップＳ３３０４）。会議画像
送出装置３２０１は、ステップＳ３３０３で入力した広
角画像（ドーナツ画像）をパノラマ画像に順次展開し
（ステップＳ３３０５）、展開されたパノラマ画像のう
ち、音源方向の部分画像データを生成する（ステップＳ
３３０６）。As long as no stop of capturing is instructed (step S3302: No), the image data transmitted from the CCD 214 and the audio data transmitted from the microphone array are input (step S3303). Once a certain amount of audio data has been input, for example as many samples as the correlation window size N, the sound source direction is detected and sound source direction data is generated sequentially (step S3304). The conference image transmitting apparatus 3201 sequentially develops the wide-angle image (donut image) input in step S3303 into a panoramic image (step S3305), and generates partial image data in the sound source direction from the developed panoramic image (step S3306).
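
As an illustration of step S3304, the sketch below estimates the arrival-time difference between two microphone channels over one correlation window of N samples and converts it to an angle. The passage above states only that the direction is detected once N samples are available; the brute-force cross-correlation, the far-field model, the parameter names, and the default speed of sound are assumptions made for this example.

```python
import math
from typing import Sequence


def estimate_lag(a: Sequence[float], b: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) of b relative to a that maximizes cross-correlation.

    A positive result means b is a delayed copy of a, i.e. the sound reached
    microphone a first. Both windows must have the same length N.
    """
    n = len(a)
    if n != len(b):
        raise ValueError("windows must have equal length")
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += a[i] * b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_angle_deg(lag: int, sample_rate_hz: float, mic_spacing_m: float,
                     speed_of_sound: float = 343.0) -> float:
    """Convert a sample lag to an arrival angle under a far-field assumption.

    0 degrees means broadside (equal arrival time). The sine argument is
    clamped so that noisy lags beyond the physical maximum stay valid.
    """
    s = lag / sample_rate_hz * speed_of_sound / mic_spacing_m
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))
```

A practical implementation would likely use an FFT-based correlation for speed; the nested loop is kept here because it makes the role of the window size N explicit.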

【０１９３】会議画像送出装置２８０１は、パノラマ画
像データ、部分画像データ、音声データおよび音源方向
データを、所定の送信先、たとえば、ＰＣに順次出力す
る（ステップＳ３３０７）。以降は、ステップＳ３３０
２〜ステップＳ３３０７までの動作を順次繰り返し、ユ
ーザが記録停止を指示するまでデータを送出する。The conference image transmitting apparatus 3201 sequentially outputs the panoramic image data, the partial image data, the audio data, and the sound source direction data to a predetermined destination, for example a PC (step S3307). Thereafter, the operations of steps S3302 to S3307 are repeated, and data is sent until the user instructs recording to stop.
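
The control flow of steps S3302 to S3307 can be sketched as a simple loop. Every name below (`run_sender` and its callable parameters) is hypothetical; the processing stages are passed in as functions so that the sketch stays independent of any particular camera, microphone, or transport.

```python
from typing import Any, Callable, Dict, Tuple


def run_sender(
    stop_requested: Callable[[], bool],
    capture: Callable[[], Tuple[Any, Any]],
    detect_direction: Callable[[Any], float],
    develop: Callable[[Any], Any],
    extract: Callable[[Any, float], Any],
    send: Callable[[Dict[str, Any]], None],
) -> int:
    """Repeat capture, detect, develop, extract, send until a stop is requested.

    Returns the number of packets sent. Step numbers in the comments refer
    to the flowchart of FIG. 33.
    """
    sent = 0
    while not stop_requested():                     # S3302
        donut, audio = capture()                    # S3303
        direction = detect_direction(audio)         # S3304
        panorama = develop(donut)                   # S3305
        partial = extract(panorama, direction)      # S3306
        send({"panorama": panorama, "partial": partial,
              "audio": audio, "direction": direction})  # S3307
        sent += 1
    return sent
```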

【０１９４】次に、会議画像再生装置３３０２の処理流
れについて説明する。図３４は、実施の形態５の会議画
像再生装置３３０２の処理流れの例を示したフローチャ
ートである。まず、会議画像再生装置３３０２のシステ
ムがユーザにより起動される（ステップＳ３４０１）。
次に、図示しないディスプレイ（テレビ）に表示される
画像にしたがって、再生する画像を選択する（ステップ
Ｓ３４０２）。図３５は、再生させたい画像を選択する
画面構成の例を示した図である。図示したように、会議
のファイルはＭｅｅｔｉｎｇ１、Ｍｅｅｔｉｎｇ２と名
付けられており、各ファイルは、パノラマ画像データ
（ＭＰＥＧ−２Ｖｉｄｅｏ）と、音声データ（ＭＰＥＧ
Ａｕｄｉｏ）と、音源方向データ（ＴＥＸＴ）と、更
に、部分画像データ（ＭＰＥＧ−２ＶＩｄｅｏ）から構
成されていることが分かる。Next, the processing flow of the conference image reproducing apparatus 3202 will be described. FIG. 34 is a flowchart showing an example of the processing flow of the conference image reproducing apparatus 3202 according to the fifth embodiment. First, the user starts the system of the conference image reproducing apparatus 3202 (step S3401). Next, the image to be reproduced is selected according to the screen shown on a display (television), not illustrated (step S3402). FIG. 35 shows an example of the screen configuration for selecting the image to be reproduced. As illustrated, the conference files are named Meeting1 and Meeting2, and each consists of panoramic image data (MPEG-2 Video), audio data (MPEG Audio), sound source direction data (TEXT), and, in addition, partial image data (MPEG-2 Video).

【０１９５】次に、会議画像再生装置３３０２は、部分
画像データ、音声データを読み出し、再生動作を開始す
る（ステップＳ３４０３）。続いて、会議画像再生装置
２８０２は、再生停止の指示があるか否かを判定し（ス
テップＳ３４０４）、指示された場合には再生を停止す
る。一方、再生停止の指示がない場合（ステップＳ３４
０４：ＮＯ）、方向修正部３２２４からの入力があった
かを判断する（ステップＳ３４０５）。方向の修正があ
った場合（ステップＳ３４０５：Ｙｅｓ）、指定された
部分画像をパノラマ画像から抽出し、音声と併せて出力
（再生）する（ステップＳ３４０６）。Next, the conference image reproducing apparatus 3202 reads the partial image data and the audio data and starts reproduction (step S3403). It then determines whether a stop of reproduction has been instructed (step S3404) and, if so, stops reproduction. If no stop has been instructed (step S3404: No), it determines whether there has been an input from the direction correction unit 3224 (step S3405). If the direction has been corrected (step S3405: Yes), the specified partial image is extracted from the panoramic image and output (reproduced) together with the audio (step S3406).

【０１９６】一方、方向修正部３２２４からの入力がな
い場合（ステップＳ３４０５：ＮＯ）、そのまま部分画
像データを出力する（ステップＳ３４０７）。なお、会
議画像再生装置３２０２は、予め抽出された部分画像を
順次出力するので、方向修正がされない限り、図３５に
示したＭｅｅｔｉｎｇ１＿ｐｖを再生すればよい。On the other hand, if there is no input from the direction correcting unit 3224 (step S3405: NO), the partial image data is output as it is (step S3407). Since the conference image reproducing device 3202 sequentially outputs the partial images extracted in advance, it is sufficient to reproduce Meeting1_pv shown in FIG. 35 unless the direction is corrected.
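
The branch at step S3405 can be sketched as a per-frame selection. The function and parameter names are hypothetical, and representing "no correction" as `None` is an assumption of this example rather than something the specification states.

```python
from typing import Any, Callable, Optional


def select_frame(
    recorded_partial: Any,
    panorama: Any,
    corrected_direction: Optional[float],
    extract: Callable[[Any, float], Any],
) -> Any:
    """Choose the frame to output for one playback step.

    If the user supplied a corrected direction (S3405: Yes), re-extract from
    the panorama (S3406); otherwise output the pre-extracted partial image
    unchanged (S3407).
    """
    if corrected_direction is not None:
        return extract(panorama, corrected_direction)
    return recorded_partial
```

The check is `is not None` rather than truthiness so that a corrected direction of 0 degrees is still treated as a correction.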

【０１９７】次に、会議画像録画再生装置３２００から
出力される画像の構成例について説明する。図３６は、
会議画像録画再生装置３２００から出力される画像の構
成例（画面例）を示した説明図である。画面は話者方向
の画像３６０１だけでなく、モード切替部３６０２、方
向指示操作部３６０３、再生操作指示部３６０４といっ
たユーザインターフェースも含んでいる。Next, an example of the configuration of the image output from the conference image recording/reproducing apparatus 3200 will be described. FIG. 36 is an explanatory diagram showing a configuration example (screen example) of the image output from the conference image recording/reproducing apparatus 3200. The screen contains not only the image 3601 in the speaker direction but also user interface elements: a mode switching unit 3602, a direction instruction operation unit 3603, and a reproduction operation instruction unit 3604.

【０１９８】次に、各ユーザインターフェースを説明す
る。モード切替部３６０２は、広角画像データにおける
特定の部分画像を再生するか否かを切り替えるものであ
る。図３６に示したように、ラジオボタンを用いて、動
作モードを切り替えることができる。すなわち、「ＡＵ
ＴＯ」と描かれたラジオボタンが選択されると、音源方
向データに基づいて加工抽出され、記録部３２２２に記
録された部分画像が自動的再生される。一方、「ＭＡＮ
ＵＡＬ」と描かれたラジオボタンが選択されると、図３
７に示したように、ドーナツ画像３６０５が表示され、
ユーザの操作により再生させたい部分を手動で選択する
ことのできる「手動切替モード」に移行する。Next, each user interface element will be described. The mode switching unit 3602 switches whether or not a specific partial image in the wide-angle image data is reproduced. As shown in FIG. 36, the operation mode is switched with radio buttons. When the radio button labeled "AUTO" is selected, the partial image that was extracted based on the sound source direction data and recorded in the recording unit 3222 is reproduced automatically. When the radio button labeled "MANUAL" is selected, a donut image 3605 is displayed as shown in FIG. 37, and the apparatus shifts to a "manual switching mode" in which the user can manually select the portion to be reproduced.

【０１９９】手動切替モードでは、上下左右の向きの矢
印が描かれた４つのボタンである方向指示操作部３６０
３によりポインタ３６０７を移動させる。ポインタ３６
０７を移動させることにより、部分画像データの描画方
向を移動させ、図３８の様に抽出部分が変更された画像
を出力させることができる。この操作により、たとえ
ば、ホワイトボード上の描画内容を適切に出力させるこ
とができる。なお、画面の構成としては、図３６〜図３
８に限られることなく、たとえば図３９の様に、４分割
画面を同時に出力させるようにしてもよい。なお、ここ
で、符号３９０１は、４分割画面とそのうちの一画面と
の出力切り替えをおこなうＧＵＩである。In the manual switching mode, the pointer 3607 is moved with the direction instruction operation unit 3603, which consists of four buttons marked with up, down, left, and right arrows. Moving the pointer 3607 shifts the drawing direction of the partial image data, so that an image with a changed extracted portion is output, as in FIG. 38. This makes it possible, for example, to properly display what is drawn on a whiteboard. The screen configuration is not limited to FIGS. 36 to 38; for example, four split screens may be output simultaneously as in FIG. 39. Reference numeral 3901 denotes a GUI for switching the output between the four-split screen and one of its screens.
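
One way the four arrow buttons could update the viewing direction is sketched below. The specification says only that the buttons move the pointer 3607; modeling the pointer as an azimuth and elevation pair, the 5-degree step, and the elevation limits are all assumptions chosen for illustration.

```python
from typing import Tuple


def move_pointer(azimuth_deg: float, elevation_deg: float, button: str,
                 step_deg: float = 5.0,
                 elevation_range: Tuple[float, float] = (-30.0, 30.0)) -> Tuple[float, float]:
    """Apply one press of an arrow button to the viewing direction.

    Left/right rotate around the vertical axis and wrap at 360 degrees;
    up/down tilt the view and are clamped to elevation_range.
    """
    lo, hi = elevation_range
    if button == "left":
        azimuth_deg = (azimuth_deg - step_deg) % 360.0
    elif button == "right":
        azimuth_deg = (azimuth_deg + step_deg) % 360.0
    elif button == "up":
        elevation_deg = min(hi, elevation_deg + step_deg)
    elif button == "down":
        elevation_deg = max(lo, elevation_deg - step_deg)
    else:
        raise ValueError(f"unknown button: {button!r}")
    return azimuth_deg, elevation_deg
```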

【０２００】一方、再生操作指示部３６０４は、図示し
たように、左から再生、停止、一時停止、早送り、巻き
戻しの機能が割り付けられているＧＵＩを有し、各部が
押下されることにより、その機能に対応した動作を実現
する。なお、ここではソフトウェア的な処理として説明
したが、会議画像再生装置３３０２側にハードウェア的
にボタンを配置してもよく、また、リモートコントロー
ラを別途設けて利便性を高めてもよい。The reproduction operation instruction unit 3604, as illustrated, has a GUI to which the functions of play, stop, pause, fast-forward, and rewind are assigned from left to right; pressing each part performs the corresponding operation. Although this is described here as software processing, hardware buttons may instead be provided on the conference image reproducing apparatus 3202, and a separate remote controller may be provided for added convenience.

【０２０１】この様な手動切替モードや４分割画面を設
けることにより、たとえば１人の参加者が長時間話し続
けるシーンを後で再生する場合、発言者を映した映像の
みを延々と再生するよりも、間欠的に話者以外の参加者
を再生する方が退屈感を与えず、臨場感がます。このよ
うに、発言者以外の参加者の表情など方向データで指定
された部分以外の映像を見たい場合に、モード切替部３
６０２、方向指示操作部３６０３が特に有用となる。Providing such a manual switching mode and four-split screen helps when, for example, a scene in which one participant speaks for a long time is reproduced later: rather than showing only the speaker's video endlessly, intermittently showing the other participants is less tedious and increases the sense of presence. The mode switching unit 3602 and the direction instruction operation unit 3603 are thus particularly useful when the user wants to see video other than the portion specified by the direction data, such as the facial expressions of participants other than the speaker.

【０２０２】なお、実施の形態５の会議画像送出装置３
２０１は、パノラマ画像（全領域）と抽出された画像
（話者方向の部分画像）をいずれも送出したが、使用の
態様によっては、部分画像のみを送出してもよい。ま
た、このときは音源方向データは、会議画像再生装置３
２０２側で画像の抽出や音源方向の判定がなされないの
で、会議画像再生装置３２０２に送出する必要はない。The conference image transmitting apparatus 3201 of the fifth embodiment sends both the panoramic image (entire region) and the extracted image (partial image in the speaker direction), but depending on the mode of use it may send only the partial image. In that case, the sound source direction data need not be sent to the conference image reproducing apparatus 3202, because neither image extraction nor determination of the sound source direction is performed on the conference image reproducing apparatus 3202 side.

【０２０３】以上説明したように、実施の形態５は、実
施の形態４と同様に、会議画像送出装置と会議画像再生
装置が別個独立に構成されていても、会議内容を臨場感
を維持しつつ効率的に再現させることができる。As described above, in the fifth embodiment, as in the fourth embodiment, the contents of a conference can be reproduced efficiently while maintaining the sense of presence, even though the conference image transmitting apparatus and the conference image reproducing apparatus are configured separately and independently.

【０２０４】なお、ここまでの例では、主として会議を
録画するシステムについて説明したが、本発明は、この
用途に限定されるものではなく、たとえば、天上に備え
付けることにより防犯カメラとして利用することもでき
る。また、夜行性の動物の生態を調べる用途にも使用す
ることができる。この場合は、高感度ＣＣＤを用いる。The examples so far have mainly described a system for recording conferences, but the present invention is not limited to this use; for example, it can be mounted on a ceiling and used as a security camera. It can also be used to study the behavior of nocturnal animals, in which case a high-sensitivity CCD is used.

【０２０５】[0205]

【発明の効果】以上説明したように、本発明の広角画像
録画再生システム（請求項１）は、広角画像入力手段
が、鉛直方向を中心もしくは軸とした広角画像を入力
し、複数の音声入力手段が、音声をそれぞれ入力し、音
源方向検出手段が、前記複数の音声入力手段により入力
された音声に基づいて音源方向を検出し、記録手段が、
前記広角画像入力手段により入力された広角画像と、前
記音声入力手段により入力された音声と、前記音源方向
検出手段により検出された音源方向を記録し、画像変形
手段が、前記記録手段により記録された広角画像のう
ち、前記音源方向に対応する方向の所定領域の画像を矩
形の出力画像となるように変形し、画像音声出力手段
が、前記画像変形手段により変形された画像と、当該画
像に対応した前記記録手段に記録された音声とを同期さ
せて出力するので、広角の画像を記録し、再生の際にそ
の歪みを正しつつ音源方向を中心としたシーンを再生で
き、これにより、臨場感を維持しつつ、録画場面を効率
的に再現可能となる。As described above, in the wide-angle image recording/reproducing system of the present invention (claim 1), the wide-angle image input means inputs a wide-angle image with the vertical direction as its center or axis; a plurality of voice input means each input voice; the sound source direction detecting means detects the sound source direction based on the voice input by the plurality of voice input means; the recording means records the wide-angle image input by the wide-angle image input means, the voice input by the voice input means, and the sound source direction detected by the sound source direction detecting means; the image transforming means transforms, out of the wide-angle image recorded by the recording means, the image of a predetermined region in the direction corresponding to the sound source direction into a rectangular output image; and the image/audio output means outputs the image transformed by the image transforming means in synchronization with the corresponding voice recorded in the recording means. A wide-angle image can therefore be recorded and, at reproduction, a scene centered on the sound source direction can be reproduced with its distortion corrected, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２０６】また、本発明の広角画像録画再生システム
（請求項２）は、広角画像入力手段が、鉛直方向を中心
もしくは軸とした広角画像を入力し、画像変形手段が、
前記広角画像入力手段により入力された広角画像を所定
の変換式により矩形の出力画像となるように変形し、複
数の音声入力手段が、音声をそれぞれ入力し、音源方向
検出手段が、前記複数の音声入力手段により入力された
音声に基づいて音源方向を検出し、記録手段が、前記画
像変形手段により変形された画像と、前記音声入力手段
により入力された音声と、前記音源方向検出手段により
検出された音源方向を記録し、画像抽出手段が、前記記
録手段により記録された矩形形状の画像のうち、前記音
源方向に対応する方向の所定領域の画像を抽出し、画像
音声出力手段が、前記画像抽出手段により抽出された画
像と、当該画像に対応した前記記録手段に記録された音
声とを同期させて出力するので、パノラマ形状に展開さ
れた画像を記録し、音源方向を中心としたシーンを随時
再生でき、これにより、臨場感を維持しつつ、録画場面
を効率的に再現可能となる。Further, in the wide-angle image recording/reproducing system of the present invention (claim 2), the wide-angle image input means inputs a wide-angle image with the vertical direction as its center or axis; the image transforming means transforms the wide-angle image input by the wide-angle image input means into a rectangular output image using a predetermined conversion formula; a plurality of voice input means each input voice; the sound source direction detecting means detects the sound source direction based on the voice input by the plurality of voice input means; the recording means records the image transformed by the image transforming means, the voice input by the voice input means, and the sound source direction detected by the sound source direction detecting means; the image extracting means extracts, from the rectangular image recorded by the recording means, the image of a predetermined region in the direction corresponding to the sound source direction; and the image/audio output means outputs the image extracted by the image extracting means in synchronization with the corresponding voice recorded in the recording means. An image developed into panoramic form is therefore recorded, and a scene centered on the sound source direction can be reproduced at any time, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２０７】また、本発明の広角画像録画再生システム
（請求項３）は、請求項１または２に記載の広角画像録
画再生システムにおいて、前記広角画像入力手段が、所
定形状の放物面を有する鏡面体、所定形状の双曲面を有
する鏡面体または所定の円錐形状を有する鏡面体と、画
像撮像素子とから構成されるので、簡易な構成で広角画
像を取り込むことができ、これにより、小型で安価な装
置を提供することが可能となる。In the wide-angle image recording/reproducing system of the present invention (claim 3), in the wide-angle image recording/reproducing system according to claim 1 or 2, the wide-angle image input means consists of an image pickup element and one of a mirror body having a paraboloid of a predetermined shape, a mirror body having a hyperboloid of a predetermined shape, or a mirror body having a predetermined conical shape. A wide-angle image can therefore be captured with a simple configuration, which makes it possible to provide a small and inexpensive apparatus.

【０２０８】また、本発明の広角画像録画再生システム
（請求項４）は、請求項１、２または３に記載の広角画
像録画再生システムにおいて、前記音源方向検出手段
が、前記複数の音声入力手段により入力された音声の時
間差に基づいて音源方向を検出するので、簡易な構成で
音源方向を検出することができ、これにより、小型で安
価な装置を提供することが可能となる。A wide-angle image recording / reproducing system according to the present invention (claim 4) is the wide-angle image recording / reproducing system according to claim 1, 2 or 3, wherein the sound source direction detecting means is the plurality of audio input means. Since the sound source direction is detected based on the time difference between the input voices, it is possible to detect the sound source direction with a simple configuration, and thus it is possible to provide a small and inexpensive device.

【０２０９】また、本発明の広角画像録画再生システム
（請求項５）は、請求項１〜４に記載の広角画像録画再
生システムにおいて、方向修正手段が、前記音源方向に
対応する方向を修正するので、ノイズ等により音源方向
が正しく検出されなかった場合でも所望の映像を再生で
き、これにより、臨場感を維持しつつ、録画場面を効率
的に再現可能となる。According to the wide-angle image recording / reproducing system of the present invention (claim 5), in the wide-angle image recording / reproducing system of claims 1 to 4, the direction correcting means corrects the direction corresponding to the sound source direction. Therefore, even when the sound source direction is not correctly detected due to noise or the like, a desired image can be reproduced, and thus, it is possible to efficiently reproduce the recorded scene while maintaining the realism.

【０２１０】また、本発明の広域画像録画再生システム
（請求項６）は、請求項１〜５のいずれか一つに記載の
広角画像録画再生システムにおいて、領域固定手段が、
前記音源方向に対応する方向の所定領域を固定するの
で、音源の方向が微妙に移動する場合であっても、画像
ブレを防止でき、これにより、臨場感を維持しつつ、録
画場面を効率的に再現可能となる。In the wide-area image recording/reproducing system of the present invention (claim 6), in the wide-angle image recording/reproducing system according to any one of claims 1 to 5, the region fixing means fixes the predetermined region in the direction corresponding to the sound source direction. Image shake can therefore be prevented even when the direction of the sound source shifts slightly, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.
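
One possible way to fix the region against small shifts in the detected direction is a simple dead band, sketched below. The specification does not say how the region fixing means is realized; the threshold approach, the 15-degree default, and the function names are assumptions for illustration only.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def fix_region(current_center_deg: float, detected_deg: float,
               threshold_deg: float = 15.0) -> float:
    """Return the region center to use for this frame.

    The center stays fixed while the detected direction is within
    threshold_deg of it, and jumps to the detected direction otherwise.
    """
    if angular_difference(current_center_deg, detected_deg) <= threshold_deg:
        return current_center_deg
    return detected_deg
```

Measuring the difference on the circle matters near the seam: a center at 355 degrees and a detection at 5 degrees are only 10 degrees apart.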

【０２１１】また、本発明の会議録画再生システム（請
求項７）は、請求項１〜６のいずれか一つに記載の広角
画像録画再生システムを会議の録画再生に適用した会議
録画再生システムであって、話者位置判断手段が、画像
の色分布もしくは画像中の移動部分に基づいて話者の位
置を判断し、所定領域決定手段が、前記話者位置判断手
段の判断結果により前記所定領域を決定するので、話者
部分を的確に抽出でき、これにより、臨場感を維持しつ
つ、会議を効率的に再現可能となる。The conference recording/reproducing system of the present invention (claim 7) is a conference recording/reproducing system in which the wide-angle image recording/reproducing system according to any one of claims 1 to 6 is applied to the recording and reproduction of a conference. The speaker position determining means determines the position of the speaker based on the color distribution of the image or on a moving part in the image, and the predetermined region determining means determines the predetermined region from the result of that determination. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining the sense of presence.

【０２１２】また、本発明の広角画像送出装置（請求項
８）は、広角画像入力手段が、鉛直方向を中心もしくは
軸とした広角画像を入力し、複数の音声入力手段が、音
声を入力し、音源方向検出手段が、前記複数の音声入力
手段により入力された音声に基づいて音源方向を検出
し、データ送出手段が、前記広角画像入力手段により入
力された広角画像に関するデータと、前記音声入力手段
により入力された音声に関するデータと、前記音源方向
検出手段により検出された音源方向に関するデータと、
を所定のデータ格納手段へ送出するので、広角の画像を
取り込み、音源方向を中心としたシーンを再生可能と
し、これにより、臨場感を維持しつつ、録画場面を効率
的に再現可能とする。Further, in the wide-angle image transmitting apparatus of the present invention (claim 8), the wide-angle image input means inputs a wide-angle image with the vertical direction as its center or axis; a plurality of voice input means input voice; the sound source direction detecting means detects the sound source direction based on the voice input by the plurality of voice input means; and the data sending means sends data on the wide-angle image input by the wide-angle image input means, data on the voice input by the voice input means, and data on the sound source direction detected by the sound source direction detecting means to a predetermined data storage means. A wide-angle image is therefore captured and a scene centered on the sound source direction is made reproducible, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２１３】また、本発明の広角画像送出装置（請求項
９）は、請求項８に記載の広角画像送出装置において、
画像変形手段が、前記広角画像入力手段により入力され
た広角画像を所定の変換式により矩形の出力画像となる
ように変形し、前記データ送出手段が、前記広角画像入
力手段により入力された広角画像に関するデータに換え
て、前記画像変形手段により変形された画像に関するデ
ータを送出するので、パノラマ形状に展開された画像を
再生可能とし、これにより、臨場感を維持しつつ、録画
場面を効率的に再現可能とする。Further, in the wide-angle image transmitting apparatus of the present invention (claim 9), in the wide-angle image transmitting apparatus according to claim 8, the image transforming means transforms the wide-angle image input by the wide-angle image input means into a rectangular output image using a predetermined conversion formula, and the data sending means sends data on the image transformed by the image transforming means in place of the data on the wide-angle image input by the wide-angle image input means. An image developed into panoramic form is therefore made reproducible, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２１４】また、本発明の広角画像送出装置（請求項
１０）は、請求項９に記載の広角画像送出装置におい
て、画像抽出手段が、前記画像変形手段により変形され
た画像のうち、前記音源方向検出手段により検出された
音源方向の所定領域の画像を抽出し、前記データ送出手
段が、前記広角画像入力手段により入力された広角画像
に関するデータおよび前記音源方向検出手段により検出
された音源方向に関するデータに換えて、または、前記
広角画像入力手段により入力された広角画像に関するデ
ータおよび前記音源方向検出手段により検出された音源
方向に関するデータと共に、前記画像抽出手段により抽
出された画像に関するデータを送出するので、音源方向
を中心としたシーンを再生可能とし、これにより、臨場
感を維持しつつ、録画場面を効率的に再現可能とする。In the wide-angle image transmitting apparatus of the present invention (claim 10), in the wide-angle image transmitting apparatus according to claim 9, the image extracting means extracts, from the image transformed by the image transforming means, the image of a predetermined region in the sound source direction detected by the sound source direction detecting means, and the data sending means sends data on the image extracted by the image extracting means, either in place of, or together with, the data on the wide-angle image input by the wide-angle image input means and the data on the sound source direction detected by the sound source direction detecting means. A scene centered on the sound source direction is therefore made reproducible, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２１５】また、本発明の広角画像送出装置（請求項
１１）は、請求項８、９または１０に記載の広角画像送
出装置において、前記広角画像入力手段が、所定形状の
放物面を有する鏡面体、所定形状の双曲面を有する鏡面
体または所定の円錐形状を有する鏡面体と、画像撮像素
子とから構成されるので、簡易な構成で広角画像を取り
込むことができ、これにより、小型で安価な装置を提供
することが可能となる。In the wide-angle image transmitting apparatus of the present invention (claim 11), in the wide-angle image transmitting apparatus according to claim 8, 9 or 10, the wide-angle image input means consists of an image pickup element and one of a mirror body having a paraboloid of a predetermined shape, a mirror body having a hyperboloid of a predetermined shape, or a mirror body having a predetermined conical shape. A wide-angle image can therefore be captured with a simple configuration, which makes it possible to provide a small and inexpensive apparatus.

【０２１６】また、本発明の広角画像送出装置（請求項
１２）は、請求項８〜１１のいずれか一つに記載の広角
画像送出装置において、前記音源方向検出手段が、前記
複数の音声入力手段により入力された音声の時間差に基
づいて音源方向を検出するので、簡易な構成で音源方向
を検出することができ、これにより、小型で安価な装置
を提供することが可能となる。Further, a wide-angle image transmitting device (claim 12) of the present invention is the wide-angle image transmitting device according to any one of claims 8 to 11, wherein the sound source direction detecting means has a plurality of voice inputs. Since the sound source direction is detected based on the time difference between the voices input by the means, it is possible to detect the sound source direction with a simple configuration, which makes it possible to provide a small and inexpensive device.

【０２１７】また、本発明の広角画像送出装置（請求項
１３）は、請求項１２に記載の広角画像送出装置におい
て、前記音源方向検出手段が、ある音声入力手段により
入力された音声と、当該音声入力手段と最も距離の離れ
た音声入力手段により入力された音声との時間差に基づ
いて音源方向を検出するので、高精度に音源方向を検出
でき、これにより、臨場感を維持しつつ、録画場面を効
率的に再現可能とする。In the wide-angle image transmitting apparatus of the present invention (claim 13), in the wide-angle image transmitting apparatus according to claim 12, the sound source direction detecting means detects the sound source direction based on the time difference between the voice input by a given voice input means and the voice input by the voice input means located farthest from it. The sound source direction can therefore be detected with high accuracy, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.
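
Selecting the most widely separated microphones can be sketched as below. For simplicity this example picks the farthest pair over the whole array rather than the farthest partner of one given microphone; the 2-D coordinates and the function name are assumptions for illustration. A longer baseline yields a larger arrival-time difference for the same source angle, which is why it tends to give finer angular resolution at a given sample rate.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]


def farthest_pair(mics: Sequence[Point]) -> Tuple[int, int]:
    """Return the indices of the two microphones separated by the largest distance."""
    if len(mics) < 2:
        raise ValueError("need at least two microphones")
    return max(combinations(range(len(mics)), 2),
               key=lambda p: math.dist(mics[p[0]], mics[p[1]]))
```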

【０２１８】また、本発明の広角画像送出装置（請求項
１４）は、請求項８〜１１のいずれか一つに記載の広角
画像送出装置において、前記複数の音声入力手段は指向
性マイクロフォンにより構成され、前記音源方向検出手
段が、前記指向性マイクロフォンにより入力された音声
の強度に基づいて音源方向を検出するので、簡易な構成
で音源方向を検出することができ、これにより、小型で
安価な装置を提供することが可能となる。Further, a wide-angle image transmitting device (claim 14) of the present invention is the wide-angle image transmitting device according to any one of claims 8 to 11, wherein the plurality of audio input means are constituted by directional microphones. Since the sound source direction detecting means detects the sound source direction based on the intensity of the voice input by the directional microphone, it is possible to detect the sound source direction with a simple configuration, which is small and inexpensive. It becomes possible to provide a device.
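
Intensity-based detection with directional microphones can be sketched as picking the microphone whose window has the highest level. Using RMS as the intensity measure and keying the input by each microphone's facing angle are assumptions of this example; the specification says only that the direction is detected from the intensity of the input voice.

```python
import math
from typing import Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one window of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def loudest_direction(windows: Dict[float, Sequence[float]]) -> float:
    """Return the facing angle (degrees) of the microphone with the highest RMS.

    windows maps each directional microphone's facing angle to its samples.
    """
    if not windows:
        raise ValueError("at least one microphone is required")
    return max(windows, key=lambda angle: rms(windows[angle]))
```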

【０２１９】また、本発明の広角画像送出装置（請求項
１５）は、請求項８〜１４のいずれか一つに記載の広角
画像送出装置において、前記複数の音声入力手段を、当
該音声入力手段の重心位置が前記広角画像入力手段の光
学中心と略一致するようにそれぞれ配置したので、音声
入力手段の座標系と広角画像入力手段の座標系とを一致
させて各種計算を簡略化でき、これにより、小型で安価
な装置とすることが可能となる。In the wide-angle image transmitting apparatus of the present invention (claim 15), in the wide-angle image transmitting apparatus according to any one of claims 8 to 14, the plurality of voice input means are arranged so that their center of gravity substantially coincides with the optical center of the wide-angle image input means. The coordinate system of the voice input means can therefore be made to coincide with that of the wide-angle image input means, which simplifies the various calculations and makes a small and inexpensive apparatus possible.

【０２２０】また、本発明の広角画像送出装置（請求項
１６）は、請求項８〜１５のいずれか一つに記載の広角
画像送出装置において、前記複数の音声入力手段と前記
撮像素子とを台座側に配置し、前記鏡面体を透明部材を
介して前記台座に対峙させて配置したこと、もしくは、
前記複数の音声入力手段、前記撮像素子および前記鏡面
体を台座側に配置し、前記台座側の鏡面体からの反射光
を前記台座側の撮像素子へ向けて反射する第２の鏡面体
を透明部材を介して当該台座に対峙させて配置したの
で、電気系を台座に埋め込み、導線等による画像の分断
を防止でき、これにより、臨場感を維持しつつ、録画場
面を効率的に再現可能とする。Further, in the wide-angle image transmitting apparatus of the present invention (claim 16), in the wide-angle image transmitting apparatus according to any one of claims 8 to 15, either the plurality of voice input means and the image pickup element are arranged on the pedestal side and the mirror body is arranged facing the pedestal via a transparent member, or the plurality of voice input means, the image pickup element, and the mirror body are arranged on the pedestal side and a second mirror body, which reflects the light reflected from the pedestal-side mirror body toward the pedestal-side image pickup element, is arranged facing the pedestal via a transparent member. The electrical system can therefore be embedded in the pedestal and the image is not interrupted by wiring and the like, so that the recorded scene can be reproduced efficiently while maintaining the sense of presence.

【０２２１】また、本発明の広角画像送出装置（請求項
１７）は、請求項８〜１６のいずれか一つに記載の広角
画像送出装置において、仰角設定手段が、装置の設置さ
れる平面を基準とする話者の仰角を設定し、前記画像抽
出手段が、前記音源方向検出手段により検出された音源
方向と前記仰角設定手段により設定された仰角とに基づ
いて画像を抽出するので、高精度に画像を抽出でき、こ
れにより、臨場感を維持しつつ、録画場面を効率的に再
現可能とする。Further, the wide-angle image transmitting device of the present invention (claim 17) is the wide-angle image transmitting device according to any one of claims 8 to 16, wherein the elevation angle setting means is set on a plane on which the device is installed. Since the elevation angle of the reference speaker is set and the image extraction means extracts an image based on the sound source direction detected by the sound source direction detection means and the elevation angle set by the elevation angle setting means, high accuracy is achieved. The image can be extracted, and the recorded scene can be efficiently reproduced while maintaining the realism.

[0222] Further, the conference image transmitting apparatus of the present invention (claim 18) is a conference image transmitting apparatus in which the wide-angle image transmitting apparatus according to any one of claims 10 to 17 is applied to the recording of a conference. The speaker position determining means determines the position of the speaker based on the color distribution of the image or a moving portion in the image, and the predetermined area determining means determines the predetermined area based on the determination result of the speaker position determining means. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining a sense of presence.
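One simple way to find a "moving portion" is frame differencing. The Python sketch below counts, for each column of two consecutive grayscale frames, how many pixels changed by more than a threshold, and then places an area around the most active column. The function names, the threshold, and the grayscale list-of-rows representation are assumptions made for illustration; the patent does not specify the method at this level of detail, and determination by color distribution is not shown.

```python
def locate_moving_column(prev_frame, curr_frame, threshold=20):
    """Return the column with the most changed pixels, or None if nothing moved."""
    width = len(curr_frame[0])
    changed = [0] * width
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            # A pixel counts as "moving" when its brightness changed noticeably.
            if abs(c - p) > threshold:
                changed[x] += 1
    best = max(range(width), key=lambda x: changed[x])
    return best if changed[best] > 0 else None


def area_around(column, half_width, width):
    """Return (left, right) column bounds of the area, clamped to the image."""
    left = max(column - half_width, 0)
    right = min(column + half_width, width - 1)
    return left, right
```

In practice the result would presumably be combined with the detected sound source direction, for example by searching for motion only near that direction.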

[0223] Further, in the wide-angle image reproducing apparatus of the present invention (claim 19), the data input means inputs moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction. The image transforming means transforms, out of the moving image data input by the data input means, the moving image data of a predetermined area, selected based on the data on the sound source direction, into a rectangular output image. The image/audio output means outputs the moving image data transformed by the image transforming means in synchronization with the audio data corresponding to that moving image data. A scene centered on the sound source direction can therefore be output while the distortion of the wide-angle moving image data is corrected, which enables efficient viewing by the user while maintaining a sense of presence.
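The transformation of a predetermined area of a ring-shaped (donut) wide-angle frame into a rectangular image can be sketched as a polar-to-rectangular resampling. The Python sketch below is a simplified illustration, not the patent's conversion: it assumes a linear relation between output row and radius, nearest-neighbor sampling, and that the top output row samples the outer radius. A hyperboloidal mirror requires a mirror-specific relation between radius and elevation, which is not reproduced here. The function name and all parameters are hypothetical.

```python
import math


def unwarp_sector(donut, center, r_inner, r_outer,
                  azimuth_deg, span_deg, out_w, out_h):
    """Resample a sector of a donut image into an out_h x out_w rectangle.

    donut is a list of pixel rows; center is (cx, cy) in pixels.
    The sector is centered on azimuth_deg and spans span_deg degrees.
    """
    cx, cy = center
    height, width = len(donut), len(donut[0])
    out = []
    for row in range(out_h):
        # Assumed linear radius mapping: top row = outer radius.
        t = row / (out_h - 1) if out_h > 1 else 0.0
        r = r_outer - (r_outer - r_inner) * t
        line = []
        for col in range(out_w):
            # Horizontal position maps linearly to azimuth within the sector.
            frac = col / (out_w - 1) - 0.5 if out_w > 1 else 0.0
            theta = math.radians(azimuth_deg + frac * span_deg)
            u = int(round(cx + r * math.cos(theta)))
            v = int(round(cy + r * math.sin(theta)))
            # Clamp to the source image so that sampling never goes out of range.
            u = min(max(u, 0), width - 1)
            v = min(max(v, 0), height - 1)
            line.append(donut[v][u])
        out.append(line)
    return out
```

Because the source coordinates depend only on the output position, a real implementation could precompute them once as a lookup table and reuse it for every frame.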

[0224] Further, in the wide-angle image reproducing apparatus of the present invention (claim 20), the data input means inputs moving image data in which a panoramic wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction. The image extracting means extracts, out of the moving image data input by the data input means, the moving image data of a predetermined area based on the data on the sound source direction. The image/audio output means outputs the moving image data extracted by the image extracting means in synchronization with the audio data corresponding to that moving image data. A scene centered on the sound source direction can therefore be output, which enables efficient viewing by the user while maintaining a sense of presence.

[0225] Further, in the wide-angle image reproducing apparatus of the present invention (claim 21), which is the wide-angle image reproducing apparatus according to claim 19 or 20, the recording means records the moving image data, the audio data, and the data on the sound source direction input by the data input means, and the reproduction instructing means instructs reproduction of the moving image data. When a reproduction instruction is given by the reproduction instructing means, the output control means controls the image/audio output means, based on the data recorded by the recording means, so that the moving image data and the audio data corresponding to it are output in synchronization. A scene centered on the sound source direction can therefore be reproduced at any time, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.
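To reproduce a scene centered on the sound source at an arbitrary playback time, the recorded sound source direction has to be looked up for each frame. The Python sketch below assumes, purely for illustration, that the direction data is stored as two parallel lists: ascending timestamps (in seconds) at which the direction changed, and the azimuth that applies from each timestamp onward. This layout and the name `direction_at` are hypothetical and are not the data format of the patent.

```python
import bisect


def direction_at(change_times, directions, t):
    """Return the recorded direction in effect at playback time t.

    change_times must be sorted in ascending order; directions[i] applies
    from change_times[i] until the next change. Returns None before the
    first record.
    """
    # Index of the last change at or before t.
    i = bisect.bisect_right(change_times, t) - 1
    if i < 0:
        return None
    return directions[i]
```

For example, with `change_times = [0.0, 2.5, 7.0]` and `directions = [30, 120, 300]`, a frame at 3.0 seconds would be centered on azimuth 120.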

[0226] Further, in the wide-angle image reproducing apparatus of the present invention (claim 22), which is the wide-angle image reproducing apparatus according to claim 19, 20, or 21, the direction correcting means corrects the direction corresponding to the sound source direction. Even when the detected sound source direction is not the desired direction because of noise or the like, the desired video can therefore be reproduced, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0227] Further, in the wide-angle image reproducing apparatus of the present invention (claim 23), which is the wide-angle image reproducing apparatus according to any one of claims 19 to 22, the area fixing means fixes the predetermined area in the direction corresponding to the sound source direction. Even when the sound source direction shifts slightly, image shake can therefore be prevented, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.
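One plausible way to keep the displayed area from shaking when the detected direction fluctuates is a dead band: the area stays where it is until the direction moves by more than a set angle. The Python sketch below illustrates this idea only; the dead-band approach, the 15-degree default, and the function names are assumptions and are not stated in the patent.

```python
def angular_difference(a_deg, b_deg):
    """Signed smallest difference a - b in degrees, in the range [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def fix_area(directions_deg, dead_band_deg=15.0):
    """Hold the area direction until the source moves beyond the dead band."""
    fixed = []
    current = None
    for d in directions_deg:
        # Re-center only on the first sample or on a clearly different direction.
        if current is None or abs(angular_difference(d, current)) > dead_band_deg:
            current = d % 360.0
        fixed.append(current)
    return fixed
```

The difference is computed on the circle, so a shift from 358 degrees to 2 degrees counts as a 4-degree movement and does not move the area.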

[0228] Further, the conference image reproducing apparatus of the present invention (claim 24) is a conference image reproducing apparatus in which the wide-angle image reproducing apparatus according to any one of claims 19 to 23 is applied to the reproduction of a conference. The speaker position determining means determines the position of the speaker based on the color distribution of the moving image data or a moving portion in the moving image data, and the predetermined area determining means determines the predetermined area based on the determination result of the speaker position determining means. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining a sense of presence.

[0229] Further, in the wide-angle image recording/reproducing method of the present invention (claim 25), the input step inputs a wide-angle image having the vertical direction as its center or axis, audio synchronized with the image, and the sound source direction of the audio. The recording step records the wide-angle image, the audio, and the sound source direction input in the input step. The transforming step transforms, out of the wide-angle image recorded in the recording step, the image of a predetermined area in the direction corresponding to the sound source direction into a rectangular output image. The reproducing step reproduces the image transformed in the transforming step in synchronization with the audio, recorded in the recording step, that pertains to the transformed image. A wide-angle image is therefore recorded, and at reproduction a scene centered on the sound source direction can be reproduced while its distortion is corrected, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0230] Further, in the wide-angle image recording/reproducing method of the present invention (claim 26), the input step inputs a wide-angle image having the vertical direction as its center or axis, audio synchronized with the image, and the sound source direction of the audio. The transforming step transforms the wide-angle image input in the input step into a rectangular output image by a predetermined conversion formula. The recording step records the image transformed in the transforming step, the audio, and the sound source direction. The extracting step extracts, from the rectangular image recorded in the recording step, the image of a predetermined area in the direction corresponding to the sound source direction. The reproducing step reproduces the image extracted in the extracting step in synchronization with the audio, recorded in the recording step, that pertains to the extracted image. An image developed into panoramic form is therefore recorded, and a scene centered on the sound source direction can be reproduced at any time, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0231] Further, the conference recording/reproducing method of the present invention (claim 27) is a conference recording/reproducing method in which the wide-angle image recording/reproducing method according to claim 25 or 26 is applied to the recording and reproduction of a conference. The speaker position determining step determines the position of the speaker based on the color distribution of the image or a moving portion in the image, and the predetermined area determining step determines the predetermined area based on the determination result of the speaker position determining step. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining a sense of presence.

[0232] Further, in the wide-angle image transmitting method of the present invention (claim 28), the input step inputs a wide-angle image having the vertical direction as its center or axis, audio synchronized with the image, and the sound source direction of the audio. The transforming step transforms the wide-angle image input in the input step into a rectangular output image by a predetermined conversion formula. The data sending step sends data on the image transformed in the transforming step, data on the audio, and data on the sound source direction to a predetermined data storage destination. An image developed into panoramic form can therefore be reproduced, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0233] Further, in the wide-angle image transmitting method of the present invention (claim 29), which is the wide-angle image transmitting method according to claim 28, the extracting step extracts, from the image transformed in the transforming step, the image of a predetermined area in the sound source direction. The data sending step sends data on the image extracted in the extracting step either in place of, or together with, the data on the wide-angle image input in the input step and the data on the sound source direction. A scene centered on the sound source direction can therefore be reproduced, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0234] Further, the conference image transmitting method of the present invention (claim 30) is a conference image transmitting method in which the wide-angle image transmitting method according to claim 29 is applied to the recording of a conference. The speaker position determining step determines the position of the speaker based on the color distribution of the image or a moving portion in the image, and the predetermined area determining step determines the predetermined area based on the determination result of the speaker position determining step. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining a sense of presence.

[0235] Further, in the wide-angle image reproducing method of the present invention (claim 31), the recording step records moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction. The transforming step transforms, out of the moving image data recorded in the recording step, the moving image data of a predetermined area, selected based on the data on the sound source direction, into a rectangular output image. The output step outputs the moving image data transformed in the transforming step in synchronization with the audio data corresponding to that moving image data. A scene centered on the sound source direction can therefore be output while the distortion of the wide-angle moving image data is corrected, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0236] Further, in the wide-angle image reproducing method of the present invention (claim 32), the recording step records moving image data in which a panoramic wide-angle image is captured, audio data synchronized with the moving image data, and data on the sound source direction. The extracting step extracts, out of the moving image data recorded in the recording step, the moving image data of a predetermined area based on the data on the sound source direction. The output step outputs the moving image data extracted in the extracting step in synchronization with the audio data corresponding to that moving image data. A scene centered on the sound source direction can therefore be output, so that the recorded scene can be reproduced efficiently while maintaining a sense of presence.

[0237] Further, the conference image reproducing method of the present invention (claim 33) is a conference image reproducing method in which the wide-angle image reproducing method according to claim 31 or 32 is applied to the reproduction of a conference. The speaker position determining step determines the position of the speaker based on the color distribution of the moving image data or a moving portion in the moving image data, and the predetermined area determining step determines the predetermined area based on the determination result of the speaker position determining step. The speaker portion can therefore be extracted accurately, so that the conference can be reproduced efficiently while maintaining a sense of presence.

[0238] Further, the program of the present invention (claim 34) causes a computer to execute each step of the method according to any one of claims 25 to 33, so that the conference can be reproduced efficiently while maintaining a sense of presence.

[Brief description of drawings]

FIG. 1 is an explanatory diagram outlining a usage example in which the present invention is installed at a conference.

FIG. 2 is an external perspective view of the conference image transmitting apparatus according to the first embodiment.

FIG. 3 shows a front view and a plan view of the conference image transmitting apparatus according to the first embodiment.

FIG. 4 is an explanatory diagram showing a configuration example of the camera unit of the conference image transmitting apparatus according to the first embodiment.

FIG. 5 is a diagram illustrating the optical path when the hyperboloidal mirror of the first embodiment is used.

FIG. 6 is a diagram showing the wide-angle image formed on the surface of the CCD by the hyperboloidal mirror of the first embodiment.

FIG. 7 is a diagram showing a configuration example of the conference recording/reproducing apparatus according to the first embodiment.

FIG. 8 is a block diagram showing an example of the functional configuration of the recorded image reproducing system according to the first embodiment.

FIG. 9 is a diagram illustrating the principle by which the sound source direction detection unit of the first embodiment detects the sound source direction.

FIG. 10 is a diagram illustrating that the direction in which the sound source exists lies on a cone.

FIG. 11 is an explanatory diagram showing how four microphones are divided into two pairs to detect the sound source direction.

FIG. 12 is an explanatory diagram illustrating how microphone pairs are chosen when the microphone unit consists of three microphones.

FIG. 13 is a diagram showing an example of the data structure of the sound source direction according to the first embodiment.

FIG. 14 is an explanatory diagram showing a donut image captured via the hyperboloidal mirror transformed into a panoramic image.

FIG. 15 is one of the diagrams illustrating the transformation principle when a hyperboloidal mirror is used, showing the coordinate systems of the donut image and the panoramic image.

FIG. 16 is one of the diagrams illustrating the transformation principle when a hyperboloidal mirror is used, showing the relationship between the apex angle ψ as seen from the CCD and the elevation angle φ.

FIG. 17 is an explanatory diagram schematically showing an example of a conversion table used to convert from the coordinate system (u, v) of the donut image to the coordinate system (θ, φ) of the panoramic image.

FIG. 18 is an explanatory diagram showing an example of the processing flow of the conference recording/reproducing system according to the first embodiment.

FIG. 19 is a diagram showing an example of the external configuration of the image recording/reproducing system according to the second embodiment.

FIG. 20 is an explanatory diagram showing an example of the hardware configuration of the conference image recording/reproducing system according to the second embodiment.

FIG. 21 is an explanatory diagram showing an example of the functional configuration of the conference recording/reproducing system according to the second embodiment.

FIG. 22 is an explanatory diagram showing an example of image extraction according to the second embodiment.

FIG. 23 is an explanatory diagram illustrating how the image extraction unit of the second embodiment generates partial image data.

FIG. 24 is an explanatory diagram showing an example of the processing flow of the conference recording/reproducing system according to the second embodiment.

FIG. 25 is an explanatory diagram showing an example of the external configuration of an apparatus including the camera unit of the third embodiment.

FIG. 26 is an external configuration diagram of a camera unit configured to capture a donut image using two reflecting mirrors.

FIG. 27 is an explanatory diagram illustrating the relationship between the microphone unit and the sound source direction according to the third embodiment.

FIG. 28 is a diagram showing the functional blocks of the conference image transmitting apparatus and the conference recording/reproducing apparatus according to the fourth embodiment.

FIG. 29 is a flowchart showing an example of the processing flow of the conference image transmitting apparatus according to the fourth embodiment.

FIG. 30 is a flowchart showing an example of the processing flow of the conference image reproducing apparatus according to the fourth embodiment.

FIG. 31 is a diagram showing an example of a screen layout for selecting the image to be reproduced.

FIG. 32 is a functional block diagram of the conference image transmitting apparatus and the conference recording/reproducing apparatus according to the fifth embodiment.

FIG. 33 is a flowchart showing an example of the processing flow of the conference image transmitting apparatus 3201 according to the fifth embodiment.

FIG. 34 is a flowchart showing an example of the processing flow of the conference image reproducing apparatus according to the fifth embodiment.

FIG. 35 is a diagram showing an example of a screen layout for selecting the image to be reproduced.

FIG. 36 is an explanatory diagram showing a configuration example (screen example) of the image output from the conference image recording/reproducing apparatus according to the fifth embodiment.

FIG. 37 is an explanatory diagram showing how the configuration of the image shown in FIG. 36 changes after the "MANUAL" button is selected.

FIG. 38 is a diagram showing an image whose extracted portion has been changed by the direction instruction operation unit of the fifth embodiment.

FIG. 39 is an explanatory diagram showing another example of the screen layout, namely a four-way split screen.

[Explanation of symbols]

100, 1900, 2500, 2800, 3200: Conference recording/reproducing system
200, 2801, 3201: Conference image transmitting apparatus
201, 2005, 2501, 2600: Camera unit
202, 2006, 2701: Microphone unit
203, 2032: Pedestal
204: Transparent glass
211: Hyperboloidal mirror
212: Lens
213: Aperture
221, 2702: Microphone
300, 2802, 3202, 3302: Conference image reproducing apparatus
307: Large-capacity recording device
801, 2811, 3211: Wide-angle image input unit
802, 2812, 3212: Audio input unit
803, 3213: Sound source direction detection unit
804, 2813, 2822, 3222: Recording unit
805, 2803, 2823: Image transformation unit
806, 3224: Direction correction unit
807, 2824: Area fixing unit
808, 2825, 3223: Image/audio output unit
809: Speaker position determination unit
810: Area determination unit
1901: Cross button
1902: Enter button
1903: Image/audio output terminal
1904: Medium insertion slot
2003: Operation unit
2007: Removable media unit
2101, 3214: Wide-angle image development unit
2102, 3215: Image extraction unit
2502: Mirror body
2601: First reflecting mirror
2602: Second reflecting mirror
2814: Elevation angle setting unit
2815, 3216: Data sending unit
2821, 3221: Data input unit
3602: Mode switching unit
3603: Direction instruction operation unit
3604: Playback operation instruction unit

Continuation of front page. F-term (reference): 5C022 AA00 AB68 AC21 AC51 5C053 FA30 GB11 GB38 HA27 HA29 HA40 JA01 LA01 LA14 5C054 AA01 CA04 CC02 CE04 CH01 DA01 EA01 EA07 EG10 FD02 GB06 GD01 HA00 5C064 AA02 AB04 AC03 AC04 AC06 AC09 AC13 AC18 AC22 AD02 AD08. (54) [Title of Invention] Wide-angle image recording/reproducing system, conference recording/reproducing system, wide-angle image transmitting apparatus, conference image transmitting apparatus, wide-angle image reproducing apparatus, conference image reproducing apparatus, wide-angle image recording/reproducing method, conference recording/reproducing method, wide-angle image transmitting method, conference image transmitting method, wide-angle image reproducing method, conference image reproducing method, and program

Claims

[Claims]

1. A wide-angle image recording/reproducing system comprising: wide-angle image input means for inputting a wide-angle image having the vertical direction as its center or axis; a plurality of audio input means for inputting audio; sound source direction detecting means for detecting a sound source direction based on the audio input by the plurality of audio input means; recording means for recording the wide-angle image input by the wide-angle image input means, the audio input by the audio input means, and the sound source direction detected by the sound source direction detecting means; image transforming means for transforming, out of the wide-angle image recorded by the recording means, an image of a predetermined area in a direction corresponding to the sound source direction into a rectangular output image; and image/audio output means for outputting the image transformed by the image transforming means in synchronization with the audio, recorded by the recording means, that corresponds to the image.

2. A wide-angle image recording/reproducing system comprising: wide-angle image input means for inputting a wide-angle image having the vertical direction as its center or axis; image transforming means for transforming the wide-angle image input by the wide-angle image input means into a rectangular output image by a predetermined conversion formula; a plurality of audio input means for inputting audio; sound source direction detecting means for detecting a sound source direction based on the audio input by the plurality of audio input means; recording means for recording the image transformed by the image transforming means, the audio input by the audio input means, and the sound source direction detected by the sound source direction detecting means; image extracting means for extracting, from the rectangular image recorded by the recording means, an image of a predetermined area in a direction corresponding to the sound source direction; and image/audio output means for outputting the image extracted by the image extracting means in synchronization with the audio, recorded by the recording means, that corresponds to the image.

3. The wide-angle image recording/reproducing system according to claim 1 or 2, wherein the wide-angle image input means comprises an image pickup device and one of a mirror body having a paraboloid of a predetermined shape, a mirror body having a hyperboloid of a predetermined shape, and a mirror body having a predetermined conical shape.

4. The wide-angle image recording/reproducing system according to claim 1, 2, or 3, wherein the sound source direction detecting means detects the sound source direction based on the time difference between the audio signals input by the plurality of audio input means.

5. The wide-angle image recording / reproducing system according to claim 1, further comprising direction correcting means for correcting a direction corresponding to the sound source direction.

6. The wide-angle image recording / reproducing system according to claim 1, further comprising area fixing means for fixing a predetermined area in a direction corresponding to the sound source direction.

7. A conference recording / playback system in which the wide-angle image recording / playback system according to any one of claims 1 to 6 is applied to recording / playback of a conference, wherein a color distribution of an image or a moving portion in the image is displayed. A conference recording / playback system comprising: a speaker position determining unit that determines the position of the speaker based on the speaker position; and a predetermined region determining unit that determines the predetermined region based on the determination result of the speaker position determining unit. .

8. A wide-angle image input means for inputting a wide-angle image centered on or in the vertical direction, a plurality of voice input means for inputting voice, and a sound source based on voice input by the plurality of voice input means. Sound source direction detecting means for detecting a direction, data about a wide-angle image input by the wide-angle image input means, data about a voice input by the voice input means, and a sound source direction detected by the sound source direction detecting means A wide-angle image transmission device comprising: a data transmission unit for transmitting data to a predetermined data storage unit.

9. The wide-angle image transmitting apparatus according to claim 8, further comprising image transforming means for transforming the wide-angle image input by the wide-angle image input means into a rectangular output image by a predetermined conversion formula, wherein the data sending means sends data on the image transformed by the image transforming means in place of the data on the wide-angle image input by the wide-angle image input means.
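
The claims do not state the "predetermined conversion formula". As a rough illustration of the idea in claim 9, the sketch below unwraps a ring-shaped image into a rectangle with a plain polar-to-Cartesian mapping and nearest-neighbour sampling. The linear radius mapping is an assumption made for brevity; a real mirror (hyperboloid, paraboloid, or cone) would need a formula derived from its geometry. All names and parameters are hypothetical.

```python
import math


def unwrap_to_rectangle(img, cx, cy, r_inner, r_outer, out_w, out_h):
    """Map the ring between r_inner and r_outer around (cx, cy) to an
    out_h x out_w rectangle using nearest-neighbour sampling.

    img is a list of rows. Output column u corresponds to azimuth
    2*pi*u/out_w; output row v runs from r_outer (top) to r_inner (bottom).
    """
    height, width = len(img), len(img[0])
    out = []
    for v in range(out_h):
        # Radius for this output row, linearly interpolated (assumption).
        t = v / (out_h - 1) if out_h > 1 else 0.0
        r = r_outer + (r_inner - r_outer) * t
        row = []
        for u in range(out_w):
            theta = 2.0 * math.pi * u / out_w
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            # Pixels that fall outside the source image are filled with 0.
            inside = 0 <= x < width and 0 <= y < height
            row.append(img[y][x] if inside else 0)
        out.append(row)
    return out
```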

10. The wide-angle image transmitting apparatus according to claim 9, further comprising image extracting means for extracting, from the image transformed by the image transforming means, an image of a predetermined region in the sound source direction detected by the sound source direction detecting means, wherein the data sending means sends data on the image extracted by the image extracting means either in place of, or together with, the data on the wide-angle image input by the wide-angle image input means and the data on the sound source direction detected by the sound source direction detecting means.
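
To make the extraction in claim 10 concrete, the following sketch cuts a window out of a 360-degree rectangular image, centred on a given azimuth and wrapping around the seam where needed. It assumes the image columns span 0 to 360 degrees uniformly. The function name and the `width_deg` parameter are hypothetical.

```python
def extract_region(panorama, azimuth_deg, width_deg):
    """Return the columns of a 360-degree panorama centred on azimuth_deg.

    panorama is a list of rows whose columns span 0..360 degrees. The
    region wraps around the left/right seam when necessary.
    """
    total_cols = len(panorama[0])
    # Column at the centre of the region, and region width in columns.
    centre = int(round((azimuth_deg % 360.0) / 360.0 * total_cols)) % total_cols
    n_cols = max(1, int(round(width_deg / 360.0 * total_cols)))
    start = centre - n_cols // 2
    # Modulo indexing handles regions that straddle the seam.
    cols = [(start + i) % total_cols for i in range(n_cols)]
    return [[row[c] for c in cols] for row in panorama]
```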

11. The wide-angle image transmitting apparatus according to claim 8, 9, or 10, wherein the wide-angle image input means comprises an image pickup device and a mirror body having a paraboloid of a predetermined shape, a hyperboloid of a predetermined shape, or a predetermined conical shape.

12. The wide-angle image transmitting apparatus, wherein the sound source direction detecting means detects the sound source direction based on a time difference between the voices input by the plurality of voice input means.

13. The wide-angle image transmitting apparatus according to claim 12, wherein the sound source direction detecting means detects the sound source direction based on a time difference between the voice input by one voice input means and the voice input by the voice input means located farthest from that voice input means.
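
Pairing each microphone with the one farthest from it, as in claim 13, gives the longest baseline, and a longer baseline produces a larger time difference for the same change in angle. The sketch below only shows the pairing step, assuming microphone positions are known as (x, y) coordinates. The function name is hypothetical.

```python
import math


def farthest_partner(mic_positions):
    """For each microphone index, return the index of the microphone that
    lies farthest from it. mic_positions is a list of (x, y) tuples."""
    partners = []
    for i, (xi, yi) in enumerate(mic_positions):
        best_j, best_d = None, -1.0
        for j, (xj, yj) in enumerate(mic_positions):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if d > best_d:
                best_j, best_d = j, d
        partners.append(best_j)
    return partners
```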

14. The wide-angle image transmitting apparatus according to any one of claims 8 to 11, wherein the plurality of voice input means comprise directional microphones, and the sound source direction detecting means detects the sound source direction based on the strength of the sound input by the directional microphones.
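
One simple way to read the strength-based detection in claim 14 is to take the pointing direction of the loudest directional microphone. The sketch below assumes each microphone has a known azimuth and uses root-mean-square level as the measure of strength. Both choices, and the function name, are assumptions rather than details from the patent.

```python
import math


def direction_from_strength(mic_azimuths_deg, samples_per_mic):
    """Return the azimuth (degrees) of the microphone with the highest
    RMS level. samples_per_mic holds one sample list per microphone."""

    def rms(samples):
        return math.sqrt(sum(s * s for s in samples) / len(samples))

    levels = [rms(s) for s in samples_per_mic]
    loudest = max(range(len(levels)), key=levels.__getitem__)
    return mic_azimuths_deg[loudest]
```

This picks one of a fixed set of directions; interpolating between neighbouring microphones would give a finer estimate.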

15. The wide-angle image transmitting apparatus according to any one of the above, wherein the plurality of voice input means are arranged so that their barycentric position substantially coincides with the optical center of the wide-angle image input means.

16. The wide-angle image transmitting apparatus according to claim 8, wherein either the plurality of voice input means and the image pickup device are arranged on a pedestal side and the mirror body is arranged to face the pedestal via a transparent member, or the plurality of voice input means, the image pickup device, and the mirror body are arranged on the pedestal side and a second mirror body, which reflects light reflected from the pedestal-side mirror body toward the pedestal-side image pickup device, is arranged to face the pedestal via a transparent member.

17. The wide-angle image transmitting apparatus according to any one of claims 8 to 16, further comprising elevation angle setting means for setting an elevation angle of a speaker with respect to the plane on which the apparatus is installed, wherein the image extracting means extracts the image based on the sound source direction detected by the sound source direction detecting means and the elevation angle set by the elevation angle setting means.

18. A conference image transmitting apparatus in which the wide-angle image transmitting apparatus according to any one of claims 10 to 17 is applied to recording a conference, comprising: speaker position determining means for determining the position of a speaker based on a color distribution of the image or a moving portion in the image; and predetermined area determining means for determining the predetermined area based on the determination result of the speaker position determining means.

19. A wide-angle image reproducing device comprising: data input means for inputting moving image data in which a wide-angle image is captured, audio data synchronized with the moving image data, and data on a sound source direction; image transforming means for transforming, among the moving image data input by the data input means, the moving image data of a predetermined area based on the data on the sound source direction into a rectangular output image; and image / audio output means for outputting the moving image data transformed by the image transforming means in synchronization with the audio data corresponding to that moving image data.

20. A wide-angle image reproducing device comprising: data input means for inputting moving image data obtained by capturing a panoramic wide-angle image, audio data synchronized with the moving image data, and data on a sound source direction; image extracting means for extracting, from the moving image data input by the data input means, the moving image data of a predetermined area based on the data on the sound source direction; and image / audio output means for outputting the moving image data extracted by the image extracting means in synchronization with the audio data corresponding to that moving image data.

21. The wide-angle image reproducing device according to claim 19, further comprising: recording means for recording the moving image data, the audio data, and the data on the sound source direction input by the data input means; reproduction instructing means for instructing reproduction of the moving image data; and output control means for controlling, when a reproduction instruction is given by the reproduction instructing means, the image / audio output means based on the data recorded by the recording means so that the moving image data is output in synchronization with the audio data corresponding to that moving image data.

22. The wide-angle image reproducing device according to claim 20 or 21, further comprising direction correcting means for correcting a direction corresponding to the sound source direction.

23. The wide-angle image reproducing apparatus according to claim 19, further comprising area fixing means for fixing a predetermined area in a direction corresponding to the sound source direction.

24. A conference image reproducing device in which the wide-angle image reproducing device according to any one of claims 19 to 23 is applied to reproducing a conference, comprising: speaker position determining means for determining the position of a speaker based on a color distribution of the moving image data or a moving portion in the moving image data; and predetermined area determining means for determining the predetermined area based on the determination result of the speaker position determining means.

25. A wide-angle image recording / reproducing method comprising: an input step of inputting a wide-angle image having the vertical direction as its center or axis, a sound synchronized with the image, and a sound source direction of the sound; a recording step of recording the wide-angle image, the sound, and the sound source direction input in the input step; a transforming step of transforming, out of the wide-angle image recorded in the recording step, an image of a predetermined area in a direction corresponding to the sound source direction into a rectangular output image; and a reproducing step of reproducing the image transformed in the transforming step in synchronization with the sound for that image recorded in the recording step.

26. A wide-angle image recording / reproducing method comprising: an input step of inputting a wide-angle image having the vertical direction as its center or axis, a sound synchronized with the image, and a sound source direction of the sound; a transforming step of transforming the wide-angle image input in the input step into a rectangular output image by a predetermined conversion formula; a recording step of recording the image transformed in the transforming step, the sound, and the sound source direction; an extracting step of extracting, from the rectangular image recorded in the recording step, an image of a predetermined area in a direction corresponding to the sound source direction; and a reproducing step of reproducing the image extracted in the extracting step in synchronization with the sound for that image recorded in the recording step.

27. A conference recording / reproducing method in which the wide-angle image recording / reproducing method according to claim 25 or 26 is applied to recording and reproducing a conference, comprising: a speaker position determining step of determining the position of a speaker based on a color distribution of an image or a moving portion in the image; and a predetermined area determining step of determining the predetermined area based on the determination result of the speaker position determining step.

28. A wide-angle image transmitting method comprising: an input step of inputting a wide-angle image having the vertical direction as its center or axis, a voice synchronized with the image, and a sound source direction of the voice; a transforming step of transforming the wide-angle image input in the input step into a rectangular output image by a predetermined conversion formula; and a data transmitting step of transmitting data on the image transformed in the transforming step, data on the voice, and data on the sound source direction to a predetermined data storage destination.

29. The wide-angle image transmitting method described, further comprising an extracting step of extracting an image of a predetermined region in the sound source direction from the image transformed in the transforming step, wherein the data transmitting step transmits data on the image extracted in the extracting step either in place of, or together with, the data on the wide-angle image and the data on the sound source direction input in the input step.

30. A conference image transmitting method in which the wide-angle image transmitting method according to claim 29 is applied to recording a conference, comprising: a speaker position determining step of determining the position of a speaker based on a color distribution of the image or a moving portion in the image; and a predetermined area determining step of determining the predetermined area based on the determination result of the speaker position determining step.

31. A wide-angle image reproducing method comprising: a recording step of recording moving image data obtained by capturing a wide-angle image, audio data synchronized with the moving image data, and data on a sound source direction; an image transforming step of transforming, among the moving image data recorded in the recording step, the moving image data of a predetermined area based on the data on the sound source direction into a rectangular output image; and an output step of outputting the moving image data transformed in the image transforming step in synchronization with the audio data corresponding to that moving image data.

32. A wide-angle image reproducing method comprising: a recording step of recording moving image data obtained by capturing a panoramic wide-angle image, audio data synchronized with the moving image data, and data on a sound source direction; an extracting step of extracting, from the moving image data recorded in the recording step, the moving image data of a predetermined area based on the data on the sound source direction; and an output step of outputting the moving image data extracted in the extracting step in synchronization with the audio data corresponding to that moving image data.

33. A conference image reproducing method in which the wide-angle image reproducing method according to claim 31 or 32 is applied to reproducing a conference, comprising: a speaker position determining step of determining the position of a speaker based on a color distribution of the moving image data or a moving portion in the moving image data; and a predetermined area determining step of determining the predetermined area based on the determination result of the speaker position determining step.

34. A program for causing a computer to execute each step of the method according to any one of claims 25 to 33.