JP4742196B2 - Presentation system and content creation system

Info

Publication number: JP4742196B2
Application number: JP2004334335A
Authority: JP
Inventors: 弘明千代倉
Original assignee: Keio University
Current assignee: Keio University
Priority date: 2004-11-18
Filing date: 2004-11-18
Publication date: 2011-08-10
Anticipated expiration: 2024-11-18
Also published as: JP2006146481A

Description

The present invention relates to a presentation system that displays images generated using a computer, and to a content creation system that creates content including images generated by the presentation system.

Presentation systems with a large display screen are used for lectures, talks, presentations, and the like. One kind of presentation system is a computer-based system that displays computer-generated images on a liquid crystal projector, a large flat panel display, or the like. Systems have also been proposed that, during a presentation or lecture, generate and display image information in which handwritten information is added to a previously prepared electronic document (see Patent Document 1 and Patent Document 2).

Switching the displayed presentation image or displaying a handwritten image requires an input means operated by the presenter, lecturer, or the like; a keyboard, mouse, tablet, or similar device is generally used. However, because these input means are fixed in place near the computer, it is difficult to give a presentation or lecture while moving around.

Patent Document 3 describes a technique that allows a computer to be operated from a position away from where it is installed. In the pointing method described in Patent Document 3, a camera is installed to capture an image that includes a color tag held in the operator's hand, and a cursor is displayed on the display screen based on the position of the color tag in the captured image. Because the operator can move the cursor by moving the color tag, the operator can operate the computer even from a position away from the vicinity of where the computer is installed.

The pointing method described in Patent Document 3 requires setting conversion parameters that associate the position of the color tag in the captured image with the cursor position. These parameters perform the conversion by rotation, enlargement/reduction, and translation, taking into account the camera installation position, the display installation position, and the camera's angle of view. They are set in advance by calibration so that an appropriate range on the display can be pointed to from the operator's average standing position. Consequently, the range within which the operator can operate the computer is limited to the camera's predetermined angle of view.

Patent Document 1: JP 2002-44585 A
Patent Document 2: JP 2001-16384 A
Patent Document 3: JP 2004-272598 A

The present invention has been made in view of the above circumstances, and an object thereof is to provide a presentation system that can extend the range over which a presenter can move while still operating the displayed image. A further object is to provide a content creation system that creates content including images created using such a presentation system.

The presentation system of the present invention is a presentation system that displays images generated using a computer, comprising: imaging means for capturing an image of a presenter; image composition means for generating composite image data that includes, as a part thereof, captured image data from the imaging means; display means for displaying an image based on the composite image data; and pointing processing means for recognizing a pointing operation by the presenter based on the captured image data included in the composite image data. The pointing processing means identifies a specific object that is moved by the presenter and included in the captured image data, recognizes relative position coordinates of the specific object within the display area of the image based on the captured image data, and recognizes a pointing position on the display screen based on the relative position coordinates. The image composition means generates the composite image data by compositing mirror image data of the captured image data.

According to the present invention, the region of the captured image data from the imaging means that is to be composited can be changed easily, so the image region used to recognize the presenter's pointing position, that is, the region in which the specific object moved by the presenter can be displayed, can also be changed easily. Therefore, even if the presenter moves, pointing remains possible by changing the region to be composited, and the range over which the presenter can operate the displayed image is extended. In addition, when the captured image of the presenter contains a disturbing object that hinders identification of the specific object, the recognition accuracy of the pointing position can be improved by using an image of a region that does not contain the disturbing object as the composited region. Furthermore, because the captured image data included in the composite image data for display can be limited to the portion needed for the pointing process, the recognition accuracy of the pointing position can be improved and the recognition processing load can be reduced. Pointing over the entire area of the display screen also becomes easier for the presenter.

The presentation system of the present invention includes one in which the image composition means recognizes the region of the captured image data to be composited based on an operation on the captured image displayed on the display means. According to the present invention, the captured image region used for pointing position recognition can be changed easily.

The presentation system of the present invention includes one in which the pointing processing means recognizes, as the pointing position, coordinates obtained by scale-converting the relative position coordinates to the coordinates of the entire display area of the display means.

The presentation system of the present invention includes one in which the specific object is an object that emits or reflects light of a specific color. Specifically, a light emitter, tape of a specific color, or the like can be used.

The content creation system of the present invention comprises: video signal generation means for generating a composite image signal for display based on the composite image data generated by the image composition means in the presentation system described above; and moving image file generation means for generating a moving image file that includes digital moving image data based on the composite image signal for display.

The lecture video creation system of the present invention outputs, as a lecture video, the moving image file generated using the content creation system described above.

As is clear from the above description, the present invention provides a presentation system that can extend the range over which a presenter can move while still operating the displayed image. It also provides a content creation system that creates content including images created using such a presentation system.

Embodiments of the present invention are described below with reference to the drawings. The following description takes as an application example a lecture video creation system that uses the presentation system for a lecture and creates a lecture video at the same time.

FIG. 1 is a diagram showing the schematic configuration of a lecture video creation system, which is an example of a content creation system. The lecture video creation system of FIG. 1 creates a lecture video concurrently with a lecture given in a classroom or the like, and comprises an instructor computer 1, a camera 2, a tablet 3, a projector 4, a scan converter 5, a recording computer 6, a microphone 7, and a video server 8.

The instructor computer 1 is the computer that the instructor, who is the presenter, uses for the lecture; it is, for example, a notebook PC. Lecture materials created with presentation software such as Power Point are prepared on the instructor computer 1 in advance. If website content is to be used in the lecture, a web browser is installed and the computer is made able to connect to the Internet.

The camera 2 and the tablet 3 are connected to the instructor computer 1, for example by USB. The camera 2 is an instructor-capturing camera that films the instructor during the lecture and inputs the moving image to the instructor computer 1. The tablet 3 lets the instructor input handwritten data, in the same way as writing on a blackboard during a lecture. The instructor computer 1 also has a keyboard and a mouse as operation means, and a liquid crystal display unit (none of which are shown). The instructor computer 1 is operated with these operation means to switch the displayed image and so on, and it additionally supports pointing operations based on the image captured by the camera 2. The pointing operation is described later.

Software that displays the video from the camera 2 on the desktop and software that draws handwritten information from the tablet 3 on the desktop are installed on the instructor computer 1 in advance. Such software can be created easily using well-known techniques. Software that has already been created can be downloaded from, for example, "COE e-Learning Tools" <URL: http://coe-el.sfc.keio.ac.jp/>. The software downloaded from this site generates composite image data by compositing the moving image captured by the camera 2 and the handwritten image from the tablet 3 with one or more lecture material images.

The composite image data generated here may be a superposition of multiple images, an image in which some images partially overwrite others, or an arrangement in which each image is placed in a region of a predetermined size. The size of each application image to be composited (including the camera image and the handwritten image) is arbitrary and can be changed by the instructor. The instructor can also change which region of the camera image is composited. The camera image, however, is composited into only a part of the display area so that it does not obstruct the lecture materials. For the pointing operation described later, the camera image is preferably converted to mirror image data before being composited.
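The mirror-image compositing described above can be sketched as follows. This is only an illustration under stated assumptions, not the software referred to in the text: images are modeled as lists of pixel rows with row 0 at the top, and the names `mirror` and `composite` are hypothetical.

```python
from typing import List

Pixel = int
Image = List[List[Pixel]]  # rows of pixel values, row 0 at the top (assumed layout)


def mirror(image: Image) -> Image:
    """Return the left-right mirror image of `image`."""
    return [list(reversed(row)) for row in image]


def composite(desktop: Image, camera: Image, left: int, top: int) -> Image:
    """Overwrite part of `desktop` with the mirrored camera image.

    (left, top) is where the camera image's upper-left corner lands on the
    desktop. Pixels that would fall outside the desktop are dropped.
    The input `desktop` is not modified; a new image is returned.
    """
    out = [row[:] for row in desktop]
    for r, cam_row in enumerate(mirror(camera)):
        y = top + r
        if not 0 <= y < len(out):
            continue
        for c, value in enumerate(cam_row):
            x = left + c
            if 0 <= x < len(out[y]):
                out[y][x] = value
    return out
```

Mirroring makes the on-screen image move in the same left-right direction as the presenter's hand, which is presumably why the text prefers it for pointing.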

The camera image is resized, and its composited region is changed, by mouse operations. FIG. 2 shows an example of resizing the camera image. Assume that the camera image is displayed at the size of area 210a in the lower right part of display screen 200. In this state, the size of the camera image is changed by dragging point 220a at the upper left corner of area 210a. For example, dragging point 220a to point 220b displays the camera image at the size of area 210b, and dragging it to point 220c displays the camera image at the size of area 210c.

FIG. 3 shows an example of changing the composited region of the camera image. Assume that the entire camera image is displayed at the size of area 310a in the lower right part of display screen 300. In this state, the camera image is selected and moved; this can be done with the well-known operation of selecting and moving a display window. When the selected camera image is moved toward the lower right, the camera image shown on display screen 300 appears in area 310b (the area hatched diagonally toward the lower left) with part of the captured area trimmed off. When it is moved further toward the lower right, the camera image appears in area 310c (the area hatched diagonally toward the lower right). In other words, what is composited is the captured image data with the portions in areas 320b and 320c, shown virtually in FIG. 3, removed.
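The trimming that results from moving the camera window past the screen edge amounts to intersecting the window rectangle with the screen rectangle. A minimal sketch, assuming pixel coordinates with the origin at the upper left and y increasing downward (a convention chosen for this example only; the function name and parameters are hypothetical):

```python
from typing import Optional, Tuple

# (x offset, y offset, width, height) inside the camera image
Region = Tuple[int, int, int, int]


def visible_camera_region(
    screen_w: int, screen_h: int,
    win_left: int, win_top: int,
    win_w: int, win_h: int,
) -> Optional[Region]:
    """Return the part of the camera image that remains on screen.

    The result is expressed in camera-image coordinates, or None when the
    window lies completely outside the screen.
    """
    x0 = max(win_left, 0)
    y0 = max(win_top, 0)
    x1 = min(win_left + win_w, screen_w)
    y1 = min(win_top + win_h, screen_h)
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0 - win_left, y0 - win_top, x1 - x0, y1 - y0)
```

For a 1024 x 768 screen and a 320 x 240 camera window, moving the window from (704, 528) to (804, 608) leaves the upper-left 220 x 160 pixels of the camera image, which matches the behavior described for areas 310b and 310c.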

In the above description the camera image is displayed at the lower right of the display screen, but it need not be at the lower right as long as its position does not obstruct the lecture materials. Also, the system of FIG. 1 has a single camera 2, but multiple cameras may be provided, each displayed in its own window. Furthermore, although the trimming above keeps the upper left part of the camera image by moving the camera image, any part can be kept by mouse operation.

The composite image data for displaying such a composite image is written to the frame buffer (not shown) of the instructor computer 1 and used to display the desktop screen of the instructor computer 1. The image data in the frame buffer is also used for the pointing process described later.

The projector 4, which displays the desktop screen as video on a large screen (not shown), is connected to the external monitor output terminal (not shown) of the instructor computer 1. The scan converter 5 is also connected to the external monitor output terminal (not shown) of the instructor computer 1 and converts the digital signal output from this terminal into an analog video signal, which is one kind of display image signal.

The recording computer 6 receives the analog video signal obtained by the scan converter 5 through a video capture board and, using existing video capture software, encodes it in real time into a moving image file, for example in Windows Media format (.WMV). Moving image files in Windows Media format are very lightweight. For example, with the recording resolution set to 640 × 480 pixels and the distribution bit rate set to 250 bps, the file size is about 100 MB per hour. At a recording resolution of 640 × 480 pixels, the materials on the screen of the instructor computer 1 and the blackboard-style writing drawn with the tablet are legible without difficulty. The frame rate is about 10 fps, which allows the instructor's facial expressions, the movement of the writing, and so on to be viewed without any sense of unnaturalness. The recording computer 6 has, for example, a Pentium IV, 4 GHz processor, 1 GB of memory, and a 180 GB hard disk.
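The relationship between bit rate and hourly file size can be checked with simple arithmetic. The sketch below is illustrative only; it assumes a constant bit rate and 1 MB = 1,000,000 bytes, and the function name is hypothetical. Note that the stated figure of about 100 MB per hour corresponds to roughly 250 kbps; reading the bit rate literally as 250 bps would give only about 0.11 MB per hour, so the kbps interpretation is an assumption of this example, not something the text states.

```python
def file_size_megabytes(bitrate_bits_per_s: float, seconds: float) -> float:
    """Size of a constant-bit-rate stream in megabytes (10**6 bytes)."""
    return bitrate_bits_per_s * seconds / 8 / 1_000_000


one_hour = 3600
size_at_250_kbps = file_size_megabytes(250_000, one_hour)  # 112.5 MB
size_at_250_bps = file_size_megabytes(250, one_hour)       # 0.1125 MB
```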

The microphone 7 acquires the instructor's audio signal and is connected to the recording computer 6. The recording computer 6 adds the audio data when generating the moving image file. Although FIG. 1 shows the microphone 7 connected to the recording computer 6, it may instead be connected to the instructor computer 1, with the audio data acquired by the instructor computer 1 sent to the recording computer 6.

The video server 8 receives the uploaded moving image files created by the recording computer 6 and distributes them by streaming. The video server 8 is, for example, a computer on which Windows 2000 Server is installed, with a Pentium III, 750 MHz processor, 512 MB of memory, and a 240 GB hard disk.

The operation of the lecture video creation system configured as above is now described. The equipment other than the instructor computer 1 is set up in the lecture room in advance. The instructor connects the camera 2 and the tablet 3 to the USB terminals of his or her own computer 1, which stores the lecture materials, and connects the projector 4 and the scan converter 5 to its video output terminal. The instructor then starts all the equipment and launches the application for displaying lecture materials that was prepared on the instructor computer 1.

With the system in this state, the instructor begins the lecture and proceeds while displaying the necessary lecture materials on the instructor computer 1. Because the image display signal of the instructor computer 1 is sent to the projector 4, the image is also displayed on the large screen (not shown). On the instructor computer 1, the image captured by the camera 2 is displayed over part of the lecture materials. FIG. 4 shows an example of the displayed image.

FIG. 4 schematically shows a state in which a display image 410 produced by the presentation software occupies most of display screen 400 and a captured image 420 of the instructor is displayed in the lower right part of display screen 400. The instructor 430 in captured image 420 is holding an object 430a used to recognize the pointing position. A cursor 440 is displayed at the position on display screen 400 that corresponds to the relative position coordinates of object 430a within the area occupied by captured image 420. The object 430a used to recognize the pointing position is, for example, an object that emits or reflects light of a specific color.

The image display signal of the instructor computer 1 is sent to the scan converter 5 at the same time, and the scan converter 5 generates an analog video signal based on it. The generated analog video signal is sent to the recording computer 6 and converted into a digital moving image file. Specifically, the analog video signal is input through the video capture board (not shown) of the recording computer 6 and encoded in real time into digital image data in Windows Media format (WMV) using existing video capture software. The audio signal input from the microphone 7 is digitized at the same time and output together with the video.

The Windows Media format (WMV) moving image file created by the recording computer 6 is uploaded to the video server 8 and made available for lecture video distribution over a network (not shown). Because the uploaded lecture video is a moving image file, it can also be streamed; live distribution nearly simultaneous with the actual lecture is therefore possible, which also makes remote lectures feasible.

Next, the pointing process performed by the instructor computer 1 is described. The pointing process recognizes a pointing operation by the instructor based on the camera image data included in the composite image data. Specifically, it identifies a specific object held by the instructor (for example, an object that emits or reflects light of a specific color) in the camera image data, recognizes the relative position coordinates of the specific object within the display area of the image based on the camera image data, and recognizes the pointing position on the display screen based on those relative position coordinates. The pointing position on the display screen is obtained by scale-converting the recognized relative position coordinates to the coordinates of the entire display area.

FIG. 5 shows a schematic flow of an example of the pointing process. In step S101, the composite image data stored in the frame buffer is acquired, and in step S102, data specifying the camera image display window within the display area is acquired. Assume that the display screen consists of $(X_0 + 1) \times (Y_0 + 1)$ pixels and that pixel data is stored in the frame buffer in correspondence with the coordinates shown in FIG. 6(a). Area 610 in FIG. 6(b) is the window area of the camera image. When compositing is performed in this way, the coordinates of the upper left point of area 610 are acquired as the data specifying the camera image window.

Next, in step S103, the camera image window area is searched for the specific object. The image processing for this search can use, for example, the technique described in Patent Document 3. If the search finds exactly one specific object in the camera image window area, the search is judged successful; if no specific object is recognized, or if two or more are recognized, the search is judged unsuccessful (step S104).
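The success rule of steps S103 and S104 (exactly one object found) can be illustrated with a simple stand-in for the search. The sketch below is not the technique of Patent Document 3; it assumes a color predicate supplied by the caller and groups matching pixels with a 4-connected flood fill. All names are hypothetical, and the centroid is returned in window coordinates with row 0 at the top.

```python
from collections import deque
from typing import Callable, List, Optional, Tuple

RGB = Tuple[int, int, int]
Window = List[List[RGB]]  # camera image window, row 0 at the top (assumed)


def find_blobs(window: Window, is_target: Callable[[RGB], bool]) -> List[List[Tuple[int, int]]]:
    """Group target-colored pixels into 4-connected blobs of (x, y) points."""
    h = len(window)
    w = len(window[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or not is_target(window[r][c]):
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((x, y))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and is_target(window[ny][nx])):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def search_specific_object(window: Window, is_target: Callable[[RGB], bool]) -> Optional[Tuple[float, float]]:
    """Return the centroid of the object if exactly one blob is found, else None.

    None covers both failure cases named in the text: no object, or two or more.
    """
    blobs = find_blobs(window, is_target)
    if len(blobs) != 1:
        return None
    blob = blobs[0]
    cx = sum(x for x, _ in blob) / len(blob)
    cy = sum(y for _, y in blob) / len(blob)
    return (cx, cy)
```

Treating two or more detections as a failure avoids guessing which object the presenter intends, at the cost of rejecting frames where a similarly colored item enters the window; that is the situation the adjustable composited region is meant to avoid.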

If the search succeeds, the relative coordinates of the centroid of the recognized specific object are calculated (step S105). The relative coordinates are coordinates within window area 610 and are obtained as follows. As shown in FIG. 6(a), in the display screen coordinate system $(X, Y)$, let the coordinates of the upper left point of window area 610 be $(X_W, Y_W)$ and the coordinates of the centroid $P_1$ of the specific object be $(X_1, Y_1)$. Taking the coordinate system $(x, y)$ with origin A shown in FIG. 6(b), the relative coordinates $(x_1, y_1)$ of point $P_1$ are $(X_1 - X_W,\ Y_1)$.

Next, the relative coordinates of point $P_1$ in the coordinate system $(x, y)$ of window area 610 are scale-converted to coordinates of the entire display screen (step S106). The entire display screen has $(X_0 + 1) \times (Y_0 + 1)$ pixels and window area 610 has $(X_0 - X_W + 1) \times (Y_W + 1)$ pixels, so the scale-converted coordinates of point $P_1$ are $\left( \dfrac{(X_0 + 1)(X_1 - X_W)}{X_0 - X_W + 1},\ \dfrac{(Y_0 + 1)\,Y_1}{Y_W + 1} \right)$. In the example of FIG. 6(a), the position of point $P_c$ is the scale-converted coordinate. In step S107, the scale-converted coordinates are output as pointing position information. This pointing position information is used, for example, to display a cursor or to create handwritten trajectory data.
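Steps S105 and S106 can be written directly from the formulas above. The sketch assumes, as the formulas imply, that the camera window occupies the lower right of the screen, that the screen Y axis increases upward from the bottom edge, and that origin A is therefore at $(X_W, 0)$ in screen coordinates. The function names are hypothetical.

```python
from typing import Tuple


def to_relative(X1: float, Y1: float, XW: float) -> Tuple[float, float]:
    """S105: screen coordinates of P1 -> coordinates relative to origin A.

    A is assumed to sit at (XW, 0), so only the X component shifts.
    """
    return (X1 - XW, Y1)


def to_screen(x1: float, y1: float, X0: int, Y0: int, XW: int, YW: int) -> Tuple[float, float]:
    """S106: scale window-relative coordinates up to the whole screen.

    Screen size is (X0 + 1) x (Y0 + 1) pixels; window size is
    (X0 - XW + 1) x (YW + 1) pixels.
    """
    sx = (X0 + 1) * x1 / (X0 - XW + 1)
    sy = (Y0 + 1) * y1 / (YW + 1)
    return (sx, sy)
```

For example, with a 1024 x 768 screen (X0 = 1023, Y0 = 767) and a 256 x 192 window (XW = 768, YW = 191), the object centroid at screen position (896, 96) has relative coordinates (128, 96) and maps to the pointing position (512.0, 384.0), the center of the screen.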

The pointing process allows the same kind of pointing as with a mouse to be performed away from the computer. Input corresponding to a mouse click is made, for example, by operating a separate remote control terminal. Because such a remote control terminal only has to output one or two on/off signals, a small and inexpensive one can be used. Alternatively, the specific object held by the presenter may be one that can emit or reflect light of multiple colors, so that a click is recognized when the color of the specific object changes.

If the search is judged unsuccessful in step S104, the failure is reported to the OS of the instructor computer 1 in step S108, and the process ends.
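The overall control flow of FIG. 5 (steps S101 to S108) can be summarized as one function that takes each step as a callable. This is a structural sketch only; the callables and their signatures are hypothetical placeholders for whatever the actual system uses.

```python
from typing import Callable, Optional, Tuple

Point = Tuple[float, float]


def pointing_step(
    read_frame_buffer: Callable[[], object],               # S101
    read_camera_window: Callable[[], object],              # S102
    search: Callable[[object, object], Optional[Point]],   # S103 to S105
    scale: Callable[[Point, object], Point],                # S106
    emit_position: Callable[[Point], None],                 # S107
    notify_failure: Callable[[], None],                     # S108
) -> bool:
    """Run one pass of the pointing process; return True on success."""
    frame = read_frame_buffer()
    window = read_camera_window()
    relative = search(frame, window)
    if relative is None:
        # S104 judged the search unsuccessful: report it and stop.
        notify_failure()
        return False
    emit_position(scale(relative, window))
    return True
```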

As described above, in the present invention the specific object is identified based on the composite image data, so the procedure of the pointing process itself does not change and the processing load does not increase. Also, because the image captured by the camera is displayed on part of the display means, the presenter can set the pointing position freely by moving the specific object so that it stays within that image.

Furthermore, by changing the region of the camera image data to be composited as appropriate, only the portion suited to the pointing process is composited, so the recognition accuracy of the pointing position can be improved and the recognition processing load can be reduced. It is also possible to composite a region that makes it easy for the presenter to point over the entire area of the display screen. The composited region may be changed by the presenter using a mouse or the like, or by a person assisting with the presentation using a separate mouse or the like.

FIG. 1 is a diagram showing the schematic configuration of a lecture video creation system according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the operation for changing the size of the camera image in the lecture video creation system according to the embodiment of the present invention.
FIG. 3 is a diagram showing an example of the operation for changing the composited region of the camera image in the lecture video creation system according to the embodiment of the present invention.
FIG. 4 is a diagram showing an example of a displayed image in the lecture video creation system according to the embodiment of the present invention.
FIG. 5 is a diagram showing a schematic flow of an example of the pointing process in the lecture video creation system according to the embodiment of the present invention.
FIG. 6 is a diagram explaining an example of the pointing process in the lecture video creation system according to the embodiment of the present invention.

Explanation of symbols

1 ... Instructor computer
2 ... Camera
3 ... Tablet
4 ... Projector
5 ... Scan converter
6 ... Recording computer
7 ... Microphone
8 ... Video server

Claims

1. A presentation system for displaying an image generated using a computer, comprising:
imaging means for capturing an image of a presenter;
image composition means for generating composite image data that includes, as a part thereof, captured image data from the imaging means;
display means for displaying an image based on the composite image data; and
pointing processing means for recognizing a pointing operation by the presenter based on the captured image data included in the composite image data,
wherein the pointing processing means identifies a specific object that is moved by the presenter and included in the captured image data, recognizes relative position coordinates of the specific object within the display area of the image based on the captured image data, and recognizes a pointing position on the display screen based on the relative position coordinates, and
wherein the image composition means generates the composite image data by compositing mirror image data of the captured image data.

2. The presentation system according to claim 1,
wherein the image composition means recognizes a region of the captured image data to be composited based on an operation on the captured image displayed on the display means.

3. The presentation system according to claim 1 or 2,
wherein the pointing processing means recognizes, as the pointing position, coordinates obtained by scale-converting the relative position coordinates to coordinates of the entire display area of the display means.

4. The presentation system according to any one of claims 1 to 3,
wherein the specific object is an object that emits or reflects light of a specific color.

5. A content creation system comprising:
video signal generation means for generating a composite image signal for display based on the composite image data generated by the image composition means in the presentation system according to any one of claims 1 to 4; and
moving image file generation means for generating a moving image file that includes digital moving image data based on the composite image signal for display.

6. A lecture video creation system that outputs, as a lecture video, the moving image file generated using the content creation system according to claim 5.