JP2012182688A

JP2012182688A - Image processing apparatus, image processing method, and image processing program

Info

Publication number: JP2012182688A
Application number: JP2011044605A
Authority: JP
Inventors: Hiroshi Ono; 浩志大野
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2011-03-02
Filing date: 2011-03-02
Publication date: 2012-09-20

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus capable of reducing the processing load on a communication apparatus when the communication apparatus transmits data for a plurality of images captured by a plurality of imaging means to execute a teleconference, and an image processing method and an image processing program. SOLUTION: A communication system comprises a PC which transmits data via a network, and a plurality of teleconference terminals connected to the PC in series. Each of the teleconference terminals acquires the number of other teleconference terminals connected in a chain to the PC. A teleconference terminal acquires its own image data from a camera which captures an image (S60), and acquires image data of other devices from the teleconference terminals connected downstream (S61). The acquired own image data and the image data of the other devices are combined to generate composite image data (S62). The teleconference terminal outputs the generated composite image data to the PC (S65).

Description

The present invention relates to an image processing apparatus, an image processing method, and an image processing program for processing image data that is transmitted from a communication device via a network in order to execute a video conference.

Techniques are known for reducing the processing load on a destination device when image data captured by imaging means is output to that device. For example, the imaging apparatus described in Patent Document 1 acquires memory size information from a printer when outputting image data to the printer for printing. Based on the acquired memory size, the imaging apparatus calculates the size of image data that can be output to the printer. If the image data to be output is larger than that size, the imaging apparatus recompresses the already compressed image data and outputs it to the printer. As a result, the image is printed properly.

JP 2006-101220 A (Japanese Unexamined Patent Application Publication No. 2006-101220)

In a communication system for executing a video conference, image data is transmitted and received among a plurality of communication devices so that images of a plurality of sites are shared within the system. When a single communication device transmits data of a plurality of images captured by a plurality of imaging means, the communication device has had to combine the images and transmit the data of the combined image. The processing load on the communication device therefore increases, and the image displayed during the video conference may be disturbed. Conventional techniques cannot reduce the processing load on a communication device that transmits data of a plurality of images captured by a plurality of imaging means.

An object of the present invention is to provide an image processing apparatus, an image processing method, and an image processing program that can reduce the processing load on a communication device when the communication device executes a video conference by transmitting data of a plurality of images captured by a plurality of imaging means.

An image processing apparatus according to a first aspect of the present invention is used in a communication system that includes a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, and is capable of outputting, to the communication device, image data of an image to be displayed at another site. The image processing apparatus includes: count acquisition means for acquiring the number of other image processing apparatuses connected in a chain to the communication device; first image data acquisition means for acquiring first image data from imaging means that captures an image; second image data acquisition means for acquiring second image data from another image processing apparatus connected to the own apparatus; generation means for generating composite image data by combining the first image data acquired by the first image data acquisition means and the second image data acquired by the second image data acquisition means in accordance with the number acquired by the count acquisition means; and output means for outputting the composite image data generated by the generation means to another connected image processing apparatus or to the communication device.

The image processing apparatus according to the first aspect generates composite image data by combining image data of images captured by a plurality of imaging means, and can output the generated composite image data to another image processing apparatus or to the communication device. A communication device that receives the composite image data from the image processing apparatus does not need to combine image data itself. The image processing apparatus can therefore reduce the processing load on the communication device and reduce the possibility that communication is disrupted during a video conference.

The image processing apparatus may further include: connection order information acquisition means for acquiring the connection order of the own apparatus with respect to the communication device, among the own apparatus and the other image processing apparatuses connected in a chain to the communication device; and table acquisition means for acquiring, from storage means, a table that associates coordinate information, which indicates the compositing position of the first image data in coordinates on the composite image data, with the connection order of the own apparatus. The generation means may combine the first image data at the compositing position indicated by the coordinate information that corresponds, in the table acquired by the table acquisition means, to the connection order acquired by the connection order information acquisition means. The image processing apparatus can thus combine the first image data at an appropriate position according to its own connection order with respect to the communication device. The communication device does not need to instruct the image processing apparatus where to composite the first image data. The image processing apparatus can therefore further reduce the processing load on the communication device.
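The table lookup described above can be illustrated with a minimal Python sketch. The table contents, the 640x480 canvas, the nested-list image representation, and the names `COMPOSITE_POSITION_TABLE`, `blank_canvas`, and `paste_own_image` are all assumptions made for illustration; they are not taken from the patent.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), origin at top-left

# Hypothetical table: connection order (1 = terminal attached directly to the
# communication device) -> position of that terminal's own image inside the
# composite. The values are illustrative only.
COMPOSITE_POSITION_TABLE: Dict[int, Rect] = {
    1: (0, 0, 320, 240),
    2: (320, 0, 320, 240),
    3: (0, 240, 320, 240),
}


def blank_canvas(width: int, height: int) -> List[List[int]]:
    """Return a height x width canvas filled with zeros (blank pixels)."""
    return [[0] * width for _ in range(height)]


def paste_own_image(canvas: List[List[int]],
                    image: List[List[int]],
                    connection_order: int,
                    table: Dict[int, Rect] = COMPOSITE_POSITION_TABLE) -> List[List[int]]:
    """Copy the terminal's own image into the slot the table assigns to its order."""
    x, y, w, h = table[connection_order]
    if len(image) != h or any(len(row) != w for row in image):
        raise ValueError("image size does not match the table entry")
    for dy in range(h):
        # Same-length slice assignment keeps every canvas row at its original width.
        canvas[y + dy][x:x + w] = image[dy]
    return canvas
```

Because each terminal derives its slot from its own connection order, no per-frame placement instruction from the communication device is needed in this sketch.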

The image processing apparatus may further include setting information acquisition means for acquiring coordinate information that is set when a user inputs an operation on a template displayed on display means, and that indicates the compositing position, specified by the user in coordinates on the template, of the first image data of each image processing apparatus. The generation means may combine the first image data at the compositing position indicated by the coordinate information acquired by the setting information acquisition means. In this case, the user can freely decide the layout of the images captured by the respective imaging means.

The second image data acquisition means may acquire, from the other connected image processing apparatus, second image data that has been subjected to transform coding and quantization. The image processing apparatus may further include: determination means for determining whether the own apparatus is connected to the communication device without any other image processing apparatus in between; and compression processing means for performing, when the determination means determines that the own apparatus is connected to the communication device without any other image processing apparatus in between, compression processing that includes predictive coding on the composite image data generated by the generation means. The output means may output the composite image data processed by the compression processing means to the communication device. Because the image processing apparatus acquires second image data that has already been transform coded and quantized, it avoids the increase in processing load that acquiring large second image data would cause. Furthermore, the image processing apparatus performs compression processing that includes predictive coding only when it outputs the composite image data to the communication device. The second image data therefore does not need to be decoded when the composite image data is generated, and processing is efficient. Since the communication device receives composite image data that has already been compressed, the processing load on the communication device is reduced further.
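The decision of where the final compression step runs can be sketched as follows. This is only an illustration of the control flow: `zlib` is used as a stand-in for the compression that includes predictive coding, and the function name `prepare_output` and its parameters are hypothetical.

```python
import zlib


def prepare_output(composite: bytes, directly_connected_to_pc: bool) -> bytes:
    """Return the data a terminal passes upstream.

    Only the terminal attached directly to the communication device runs the
    final compression step; every other terminal forwards its composite as is.
    zlib is a placeholder here, not the codec described in the text.
    """
    if directly_connected_to_pc:
        return zlib.compress(composite)
    return composite
```

In this sketch a terminal further down the chain never pays for the final compression step, and the communication device receives data that is already compressed.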

The image processing apparatus may further include: output information acquisition means for acquiring, from the communication device, the resolution and frame rate of the composite image data to be output to the communication device; and notification means for notifying the other image processing apparatuses connected in a chain to the communication device of the resolution and frame rate acquired by the output information acquisition means. The generation means may generate the composite image data in accordance with the resolution and frame rate acquired by the output information acquisition means, or with the resolution and frame rate notified by the notification means of another image processing apparatus. In this case, the image processing apparatus can share the resolution and frame rate requested by the communication device with the other image processing apparatuses and output appropriate composite image data to the communication device.
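One way to picture this sharing of output settings is the sketch below, in which the terminal that receives the settings from the communication device stores them and passes them to the next terminal in the chain. The class `Terminal`, its fields, and the method `apply_output_settings` are hypothetical names, and relaying hop by hop is an assumption about how the notification could be delivered.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Terminal:
    name: str
    downstream: Optional["Terminal"] = None       # next terminal away from the PC
    resolution: Optional[Tuple[int, int]] = None  # (width, height)
    frame_rate: Optional[int] = None              # frames per second

    def apply_output_settings(self, resolution: Tuple[int, int], frame_rate: int) -> None:
        """Store the requested settings, then notify the terminal downstream."""
        self.resolution = resolution
        self.frame_rate = frame_rate
        if self.downstream is not None:
            self.downstream.apply_output_settings(resolution, frame_rate)
```

For example, building the chain `Terminal("1A", Terminal("1B", Terminal("1C")))` and calling `apply_output_settings((640, 480), 30)` on the first terminal leaves all three terminals with the same settings.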

An image processing method according to a second aspect of the present invention is executed by an image processing apparatus that is used in a communication system including a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, and that is capable of outputting, to the communication device, image data of an image to be displayed at another site. The method includes: a count acquisition step of acquiring the number of other image processing apparatuses connected in a chain to the communication device; a first image data acquisition step of acquiring first image data from imaging means that captures an image; a second image data acquisition step of acquiring second image data from another image processing apparatus connected to the own apparatus; a generation step of generating composite image data by combining the first image data acquired in the first image data acquisition step and the second image data acquired in the second image data acquisition step in accordance with the number acquired in the count acquisition step; and an output step of outputting the composite image data generated in the generation step to another connected image processing apparatus or to the communication device.

According to the image processing method of the second aspect, the image processing apparatus generates composite image data by combining image data of images captured by a plurality of imaging means, and can output the generated composite image data to another image processing apparatus or to the communication device. A communication device that receives the composite image data from the image processing apparatus does not need to combine image data itself. The image processing apparatus can therefore reduce the processing load on the communication device and reduce the possibility that communication is disrupted during a video conference.

An image processing program according to a third aspect of the present invention is used in an image processing apparatus that is used in a communication system including a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, and that is capable of outputting, to the communication device, image data of an image to be displayed at another site. The program includes instructions for causing a controller of the image processing apparatus to execute: a count acquisition step of acquiring the number of other image processing apparatuses connected in a chain to the communication device; a first image data acquisition step of acquiring first image data from imaging means that captures an image; a second image data acquisition step of acquiring second image data from another image processing apparatus connected to the own apparatus; a generation step of generating composite image data by combining the first image data acquired in the first image data acquisition step and the second image data acquired in the second image data acquisition step in accordance with the number acquired in the count acquisition step; and an output step of outputting the composite image data generated in the generation step to another connected image processing apparatus or to the communication device.

According to the image processing program of the third aspect, the image processing apparatus generates composite image data by combining image data of images captured by a plurality of imaging means, and can output the generated composite image data to another image processing apparatus or to the communication device. A communication device that receives the composite image data from the image processing apparatus does not need to combine image data itself. The image processing apparatus can therefore reduce the processing load on the communication device and reduce the possibility that communication is disrupted during a video conference.

FIG. 1 is a diagram showing the system configuration of the communication system 100.
FIG. 2 is a block diagram showing the electrical configuration of the video conference terminal 1 and the PC 3.
FIG. 3 is a diagram showing an example of an image displayed on the display device 46 during a video conference.
FIG. 4 is a flowchart of the PC conference process executed by the PC 3.
FIG. 5 is a diagram showing an example of the display device 46 displaying the image layout input template 51.
FIG. 6 is an explanatory diagram of the data structure of the layout information.
FIG. 7 is a flowchart of the terminal conference process executed by the video conference terminal 1.
FIG. 8 is a flowchart of the top-level terminal process executed during the terminal conference process.
FIG. 9 is an explanatory diagram of the data structure of the compositing position table stored in the flash memory 13 of the video conference terminal 1.
FIG. 10 is a flowchart of the lower terminal process executed during the terminal conference process.

A video conference terminal 1, which is an embodiment of the image processing apparatus of the present invention, is described below with reference to the drawings. The drawings are used to explain technical features that the present invention may adopt. The device configurations, the flowcharts of the various processes, and the like shown in the drawings are merely illustrative examples and are not intended to limit the invention.

The schematic configuration of a communication system 100 that includes the video conference terminal 1 is described with reference to FIG. 1. The communication system 100 includes a plurality of communication devices that can transmit and receive data via a network 8. In this embodiment, a personal computer (hereinafter, "PC") 3 is used as the communication device, but a dedicated video conferencing device or the like that can transmit and receive data may be used instead.

The video conference terminal 1 can be connected to the PC 3. The video conference terminal 1 captures the audio of the site where it is installed through a microphone 26 (see FIG. 2) and captures images through a camera 28 (see FIG. 2), and outputs the captured audio and image data to the PC 3. The video conference terminal 1 can also reproduce the audio of another site from a speaker 27 (see FIG. 2) based on audio data received from the PC 3.

A plurality of video conference terminals 1 can also be connected to the PC 3. Therefore, as shown in FIG. 1, when a plurality of users gather at one site to hold a video conference, one of the video conference terminals 1 can be placed near each user. As a result, a clear image of each user is displayed at the other site, and each user's speech is reproduced accurately at the other site.

In this embodiment, when a plurality of video conference terminals 1 are connected to one PC 3, they are connected to the PC 3 in series. In the example shown in FIG. 1, a video conference terminal 1A is connected to the PC 3. A video conference terminal 1B is connected to the video conference terminal 1A, and a video conference terminal 1C is connected to the video conference terminal 1B. Data that the video conference terminal 1C captures from its own microphone 26 and camera 28 is therefore output to the PC 3 via the video conference terminals 1B and 1A. Similarly, audio and image data captured by the video conference terminal 1B is output to the PC 3 via the video conference terminal 1A. The PC 3 transmits the data received from the video conference terminals 1A to 1C to the PC 3 at another site via the network 8. As a result, the speech and images of users A, B, and C are output at the other site.

As described above, connecting a plurality of video conference terminals 1 to one PC 3 allows accurate speech and a clear image of each user to be output at the other site. With conventional techniques, however, the PC 3 has had to combine the plurality of image data received from the plurality of video conference terminals 1, which increased the processing load on the PC 3. When the processing load on the PC 3 increases, data communication, audio output, image display, and the like may be disturbed, and the video conference may not run smoothly. The video conference terminal 1 of this embodiment reduces the processing load on the PC 3 so that the video conference can run smoothly.

In the following description, among the plurality of video conference terminals 1 connected in series to the PC 3, a terminal closer to the PC 3 is called "upper" and a terminal farther from the PC 3 is called "lower". A video conference terminal 1 that is connected directly to the PC 3 without any other video conference terminal 1 in between (for example, the video conference terminal 1A in FIG. 1) is called the "top-level terminal". The video conference terminal 1 farthest from the PC 3 (for example, the video conference terminal 1C in FIG. 1) is called the "lowest terminal". A video conference terminal 1 that is connected to both an upper and a lower video conference terminal 1 (for example, the video conference terminal 1B in FIG. 1) is called an "intermediate terminal".
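The naming convention above can be expressed as a short Python sketch that labels each terminal in a chain listed from the PC outward. The function name `classify_terminals` is hypothetical, and treating a lone terminal as the top-level terminal is an assumption, since the text does not name that case.

```python
from typing import Dict, List


def classify_terminals(chain: List[str]) -> Dict[str, str]:
    """Label each terminal in a chain ordered from nearest to farthest from the PC.

    Assumption: when only one terminal is connected, it is labelled
    "top-level" because it is attached directly to the PC.
    """
    roles: Dict[str, str] = {}
    last = len(chain) - 1
    for index, name in enumerate(chain):
        if index == 0:
            roles[name] = "top-level"
        elif index == last:
            roles[name] = "lowest"
        else:
            roles[name] = "intermediate"
    return roles
```

Applied to the chain of FIG. 1, `classify_terminals(["1A", "1B", "1C"])` labels 1A as top-level, 1B as intermediate, and 1C as lowest.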

The electrical configuration of the video conference terminal 1 and the PC 3 is described with reference to FIG. 2. The video conference terminal 1 includes a CPU 10 that controls the video conference terminal 1. A ROM 11, a RAM 12, a flash memory 13, and an input/output interface 14 are connected to the CPU 10 via a bus 19.

The ROM 11 stores programs for operating the video conference terminal 1, initial values, and the like. The program for executing the terminal conference process described later (see FIG. 7 and others) is stored in the ROM 11. The RAM 12 temporarily stores various information used by the control program. The flash memory 13 is a nonvolatile storage device. The compositing position table described later (see FIG. 9) is stored in the flash memory 13. A storage device such as an EEPROM or a memory card may be used instead of the flash memory 13.

A USB device I/F 21, an audio input processing unit 22, an audio output processing unit 23, a video input processing unit 24, a USB host I/F 25, and an operation unit 29 are connected to the input/output interface 14. The USB device I/F 21 connects to the USB host I/F 43 of the PC 3 or to the USB host I/F 25 of another video conference terminal 1, and outputs the audio and image data of the local site to the connected device. When the video conference terminal 1 is connected to another device through the USB device I/F 21, the other device drives the USB communication protocol. The USB host I/F 25, on the other hand, connects to the USB device I/F 21 of a lower video conference terminal 1 and receives the audio and image data of the local site from that terminal. When the video conference terminal 1 is connected to another device through the USB host I/F 25, the terminal itself drives the USB communication protocol. Various signals can of course be exchanged with other devices through the USB device I/F 21 and the USB host I/F 25. The audio input processing unit 22 processes the input of audio data from the microphone 26. The audio output processing unit 23 processes the operation of the speaker 27. The video input processing unit 24 processes the input of image data from the camera 28. The operation unit 29 consists of a power button, volume control buttons, a microphone mute button, and the like, and accepts various operation inputs from the user.

The PC 3 includes a CPU 30 that controls the PC 3. A ROM 31, a RAM 32, a hard disk drive (hereinafter, "HDD") 33, and an input/output interface 34 are connected to the CPU 30 via a bus 39. The ROM 31 stores programs for operating the PC 3, initial values, and the like. The RAM 32 temporarily stores various information. The HDD 33 is a nonvolatile storage device.

A video output processing unit 41, an external communication I/F 42, a USB host I/F 43, and an operation unit 47 are connected to the input/output interface 34. The video output processing unit 41 processes the operation of a display device 46 that displays video. The external communication I/F 42 connects the PC 3 to the network 8. The USB host I/F 43 connects the PC 3 to the video conference terminal 1 (the top-level terminal) via a USB cable. The operation unit 47 consists of a keyboard and the like and accepts various operation inputs.

An example of an image displayed on the display device 46 of the PC 3 while a video conference is being held in the communication system 100 is described with reference to FIG. 3. In the example shown in FIG. 3, three users A, B, and C are gathered at site X, and one user D is at site Y. Three video conference terminals 1 are connected to the PC 3 at site X, and each video conference terminal 1 captures one of the three users A, B, and C. The PC 3 at site X displays on the display device 46 a composite image 48 in which the three images captured by the three video conference terminals 1 are combined. It also displays on the display device 46 an image 49 of user D received from the PC 3 at site Y. The PC 3 at site X transmits the image data of the composite image 48 to the PC 3 at site Y. An image similar to the one shown in FIG. 3 is therefore also displayed on the display device 46 of the PC 3 at site Y. The PC 3 may hold the video conference by having a projector project the image illustrated in FIG. 3, or by displaying it on another display device connected to the PC 3.

The video conference terminal 1 according to this embodiment can combine a plurality of images captured by a plurality of video conference terminals 1 to generate composite image data, and can output the generated composite image data to the PC 3. The video conference terminal 1 can therefore reduce the processing load on the PC 3. In addition, the plurality of video conference terminals 1 can share coordinate information that indicates where in the composite image data the image captured by each video conference terminal 1 is placed, and can composite the images at appropriate positions. Furthermore, the video conference terminal 1 can output to the PC 3 composite image data that matches the resolution and frame rate requested by the PC 3. The processing is described in detail below.

The PC conference process executed by the PC 3 will be described with reference to FIGS. 4 to 6. An application program for executing a video conference is installed on the HDD 33 of the PC 3. When an instruction to launch the application is input from the operation unit 47, the CPU 30 of the PC 3 executes the PC conference process shown in FIG. 4.

As shown in FIG. 4, when the PC conference process is started, a number survey command is output to the connected highest terminal (S1). The number survey command is a command that instructs a survey of the number of video conference terminals 1 connected in a chain to the PC 3. Although details will be described later, when the number survey command is input from the PC 3, the video conference terminal 1 surveys the number of video conference terminals 1 and notifies the PC 3 of the result. The CPU 30 of the PC 3 acquires the number from the highest terminal and stores it in the RAM 32 (S2).

Next, it is determined whether a layout designation command has been input to the operation unit 47 (S3). The layout designation command is input by the user when the user wishes to designate the layout of the composite image personally. If the layout designation command has not been input (S3: NO), it is determined whether an instruction to start the conference using the default layout has been input (S4). If that instruction has not been input either (S4: NO), the process returns to the determination of S3.

When the layout designation command is input (S3: YES), an image layout input template is launched and displayed on the display device 46 (S6). FIG. 5 shows an example of the display device 46 displaying an image layout input template 51. The image layout input template is a template for accepting the user's designation of the layout of the composite image. In S6, the template that is launched corresponds to the number of terminals acquired in S2 and to the image resolution adopted for the video conference. The image layout input template 51 illustrated in FIG. 5 is the template used when there are three video conference terminals 1 and the adopted resolution is VGA (640 × 480). In this case, the template displays "position of image 1" 52, indicating the position of the image captured by the highest terminal, "position of image 2" 53, indicating the position of the image captured by the intermediate terminal, and "position of image 3" 54, indicating the position of the image captured by the lowest terminal. Coordinate values with the upper-left corner as the origin are also displayed. The user designates the position and size of each image by using the operation unit 47 to operate a pointer (not shown) displayed on the display device 46. To confirm the designated layout, the user operates the confirm button 57. Note that the hatched portion 55 is a blank portion in which no captured image is displayed.

The image layout input template remains active until the confirm button 57 is operated (S7: NO). When the confirm button 57 is operated and the layout designation is complete (S7: YES), layout information is set and output to the highest terminal (S8). As shown in FIG. 6, the layout information indicates the position, within the composite image data, at which each image captured by each video conference terminal 1 is to be combined. Specifically, coordinate values indicating the composite position of each image in the coordinate system of the composite image data are set and output as the layout information.
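As a rough illustration of the layout information described above, the following Python sketch represents it as a mapping from connection number to a rectangle and checks that every rectangle lies inside the composite frame. The function name `validate_layout`, the `(x, y, w, h)` tuple form, and all coordinate values are assumptions made for this sketch; they are not the format or the values shown in FIG. 6.

```python
def validate_layout(layout, width, height):
    """Check that every rectangle in a layout fits inside the composite frame.

    layout maps a connection number to an (x, y, w, h) rectangle whose origin
    is the upper-left corner of the composite image (hypothetical format).
    """
    for connection_number, (x, y, w, h) in layout.items():
        if w <= 0 or h <= 0:
            return False
        if x < 0 or y < 0 or x + w > width or y + h > height:
            return False
    return True


# Illustrative values only; they are not the values shown in FIG. 6.
example_layout = {
    1: (0, 0, 320, 240),      # image from the highest terminal
    2: (320, 0, 320, 240),    # image from the intermediate terminal
    3: (160, 240, 320, 240),  # image from the lowest terminal
}
```

A PC-side application could run a check of this kind before S8, so that only a layout that fits the adopted resolution is sent to the highest terminal.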

When the output of the layout information (S8) is complete, or when an instruction to start the conference using the default layout has been input (S4: YES), a conference start instruction is output to the highest terminal (S10). The image resolution and frame rate adopted for the video conference are then output to the highest terminal (S11), and the PC conference process ends. After ending the PC conference process, the CPU 30 of the PC 3 starts a process of transmitting and receiving image and audio data to and from the other PC 3. Since this is a common process, its description is omitted.

The terminal conference process executed by the video conference terminal 1 will be described with reference to FIGS. 7 to 10. The ROM 11 of the video conference terminal 1 stores an image processing program for outputting image data to the PC 3 during a video conference. The CPU 10 of the video conference terminal 1 executes the terminal conference process shown in FIG. 7 when the power is turned on.

As shown in FIG. 7, when the terminal conference process is started, it is determined whether a number survey command has been input from the PC 3 (S21). If not (S21: NO), it is determined whether a number survey command has been input from the video conference terminal 1 connected on the upper side (S22). If not (S22: NO), the determinations of S21 and S22 are repeated.

When the own device is the highest terminal and is connected to the PC 3 without any other video conference terminal 1 in between, the PC 3 outputs the number survey command to it at the start of the video conference. Accordingly, when the number survey command is input directly from the PC 3 (S21: YES), the CPU 10 recognizes the own device as the highest terminal and sets the connection number of the own device to "1" (S23). The connection number is a number indicating the position of the own device, relative to the PC 3, in the connection order of the plurality of video conference terminals 1 connected in a chain to the PC 3. Although details will be described later, the video conference terminal 1 uses the connection number to recognize the composite position of the image data captured by its own camera 28. In the present embodiment, connection numbers are assigned so that they increase in order starting from the uppermost video conference terminal 1. Specifically, a terminal connected directly to the PC 3 has connection order "1" and connection number "1". A terminal connected to the PC 3 via one video conference terminal 1 (the highest terminal) has connection order "2" and connection number "2". When the number survey command is input from the video conference terminal 1 connected on the upper side (S22: YES), the process proceeds directly to S24.

When the number survey command is input, the number survey command is output to the video conference terminal 1 connected on the lower side (S24). It is then determined whether a reply giving the number of lower terminals has been received in response to the output command (S26). If there is no reply (S26: NO), no video conference terminal 1 is connected below the own device, so the own device is recognized as the lowest terminal (S27). The own device then reports to the upper video conference terminal 1 that the number of lower terminals is one, namely the lowest terminal alone (S28). Next, the process waits until the total number is input from the upper video conference terminal 1 (S33: NO). Although details will be described later, the total number is passed down in order from the highest terminal toward the lower video conference terminals 1. When the total number is input (S33: YES), the value obtained by subtracting the number of terminals below the own device from the total number is set as the connection number of the own device (S34). That is, if the own device is the lowest terminal, the number of terminals below it is "0", so its connection number equals the total number. Next, it is determined whether the own device is the lowest terminal (S39). If it is (S39: YES), the process proceeds directly to the determination of S41.

When a reply to the number survey command is received (S26: YES), the number of lower terminals input from the lower video conference terminal 1 is recognized and stored in the RAM 12 (S30). Next, it is determined whether the own device is the highest terminal (S31). If the own device is not the highest terminal (S31: NO), the number of lower terminals input from the lower video conference terminal 1 plus "1" is output to the upper video conference terminal 1 as that terminal's number of lower terminals (S32). When the total number is input from the upper video conference terminal 1 (S33: YES), the value obtained by subtracting the own device's number of lower terminals from the total number is set as the connection number of the own device (S34). If the own device is not the lowest terminal (S39: NO), the total number input from the upper video conference terminal 1 is output (forwarded) unchanged to the video conference terminal 1 connected on the lower side (S40), and the process proceeds to the determination of S41.

When the own device is the highest terminal (S31: YES), the number of lower terminals input from the lower video conference terminal 1 plus "1" is recognized as the total number (S36). The recognized total number is output to the PC 3 connected to the own device and to the lower video conference terminal 1 (S37). As described above, the total number is passed down in order to the lowest terminal, so every video conference terminal 1 connected in the chain to the PC 3 can recognize the total number. The process then proceeds to the determination of S41.
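The counting scheme of S24 to S40 can be checked with a small simulation. The Python sketch below is only an illustration: the function name `survey_chain` and the list-based model of the chain are assumptions, and the message passing between terminals is replaced by simple loops.

```python
def survey_chain(num_terminals):
    """Simulate the number survey over a chain of terminals.

    Index 0 is the highest terminal (attached directly to the PC) and the
    last index is the lowest terminal. Returns (total, connection_numbers).
    """
    if num_terminals < 1:
        raise ValueError("at least one terminal is required")

    # Upward pass: each terminal learns how many terminals are below it.
    lower_counts = [0] * num_terminals
    reported = 1  # the lowest terminal reports "1" upward (S28)
    for i in range(num_terminals - 2, -1, -1):
        lower_counts[i] = reported      # S30: store the count received from below
        reported = lower_counts[i] + 1  # S32: add one before reporting upward

    total = lower_counts[0] + 1  # S36: the highest terminal derives the total

    # Downward pass (S33, S34): connection number = total - terminals below.
    connection_numbers = [total - count for count in lower_counts]
    return total, connection_numbers
```

For a chain of three terminals this yields a total of 3 and connection numbers 1, 2, and 3 from the highest terminal downward, which matches the numbering described in the text.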

Next, it is determined whether a conference start instruction has been input from the PC 3 or from the upper video conference terminal 1 (S41), and the terminal waits until the instruction is input (S41: NO). When the conference start instruction is input (S41: YES), it is determined whether the own device is the highest terminal (S42). If it is the highest terminal (S42: YES), the highest terminal process is performed (S43), and the process ends. If the own device is not the highest terminal (S42: NO), the lower terminal process is performed (S44), and the process ends.

The highest terminal process will be described with reference to FIG. 8. When the highest terminal process is started, the resolution and frame rate adopted for the video conference are input from the PC 3 (S51). The PC 3 outputs the resolution and frame rate together with the conference start instruction (see S10 and S11 in FIG. 4). The input resolution, frame rate, and conference start instruction are forwarded to the video conference terminal 1 connected on the lower side (S52).

It is then determined whether layout information (see FIG. 6) has been input from the PC 3 (S53). As described above, when the user designates the arrangement of the images using the image layout input template, the PC 3 outputs to the highest terminal layout information indicating the composite position of each image within the composite image data. When layout information has been input (S53: YES), the input layout information is forwarded to the lower video conference terminal 1 (S54). Of the composite positions indicated by the layout information, the one corresponding to connection number "1" of the highest terminal is recognized as the composite position of the image captured by the own device (S55).

When layout information has not been input (S53: NO), the user has input an instruction to start the conference using the default layout. In this case, the composite position table stored in the flash memory 13 is acquired (S56). As shown in FIG. 9, the composite position table associates coordinate information, which indicates the composite position of each image in the coordinate system of the composite image data, with the connection order (connection number) of the video conference terminals 1. In the present embodiment, one table is provided for VGA resolution and another for XGA resolution. For each possible total number of video conference terminals 1 connected to the PC 3, the table associates the coordinate information indicating each terminal's composite position with its connection order. The CPU 10 of the video conference terminal 1 acquires the coordinate information corresponding to the total number, the resolution, and the connection order of the own device. The composite position indicated by the acquired coordinate information is recognized as the position at which the image data captured by the own device is to be combined (S57).
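The table lookup of S56 and S57 can be pictured as a dictionary keyed by resolution and total number of terminals. The following Python sketch is hypothetical: the table contents are placeholder coordinates chosen for illustration (they are not the values of FIG. 9), and the names `COMPOSITE_POSITION_TABLE` and `lookup_composite_position` are assumptions.

```python
# Hypothetical default table:
# (resolution, total terminals) -> {connection order: (x, y, width, height)}
# The coordinates are placeholders, not the values of FIG. 9.
COMPOSITE_POSITION_TABLE = {
    ("VGA", 1): {1: (0, 0, 640, 480)},
    ("VGA", 2): {1: (0, 120, 320, 240), 2: (320, 120, 320, 240)},
    ("VGA", 3): {1: (0, 0, 320, 240), 2: (320, 0, 320, 240),
                 3: (160, 240, 320, 240)},
    ("XGA", 2): {1: (0, 192, 512, 384), 2: (512, 192, 512, 384)},
}


def lookup_composite_position(resolution, total, connection_order):
    """Return the rectangle for this terminal, or raise KeyError if unknown."""
    return COMPOSITE_POSITION_TABLE[(resolution, total)][connection_order]
```

Because every terminal holds the same table and already knows the total number, the resolution, and its own connection order, each terminal can resolve its own position without further instruction from the PC 3.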

Next, data of the image captured by the camera 28 of the own device (own-device image data) is acquired (S60). In S60, the own-device image data is acquired at the resolution input from the PC 3 in S51, and at the frame rate input in S51. Next, image data (other-device image data) is acquired from the video conference terminal 1 connected on the lower side (S61). If only one video conference terminal 1 is connected on the lower side, the other-device image data acquired in S61 is the image data of the image captured by that terminal. If a plurality of video conference terminals 1 are connected on the lower side, the other-device image data is a composite of the images captured by those lower terminals. Although details will be described later, the resolution and frame rate of the other-device image data are the same as those of the own-device image data acquired in S60. In addition, DCT and quantization have already been applied to the other-device image data. Next, the own-device image data is combined into the other-device image data at the position recognized in S55 or S57 (S62). As a result, composite image data is generated.
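The combining step S62 amounts to writing the own-device image into a rectangular region of the frame received from below. The Python sketch below illustrates this on a plain two-dimensional grid of values. It is a simplification under stated assumptions: the function name `composite_frames` is hypothetical, and the grid cells may be thought of as pixels or as blocks of already transformed data; the source does not specify the in-memory format.

```python
def composite_frames(other_frame, own_frame, x, y):
    """Return a copy of other_frame with own_frame pasted at column x, row y.

    Both frames are lists of rows. The input frames are left unmodified.
    """
    height, width = len(other_frame), len(other_frame[0])
    own_height, own_width = len(own_frame), len(own_frame[0])
    if x < 0 or y < 0 or x + own_width > width or y + own_height > height:
        raise ValueError("own frame does not fit at the requested position")

    result = [row[:] for row in other_frame]  # copy each row
    for dy, own_row in enumerate(own_frame):
        result[y + dy][x:x + own_width] = own_row
    return result
```

Applied terminal by terminal up the chain, each call fills one more region of the same frame, so the highest terminal ends up holding the full composite.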

In the highest terminal, predictive coding is applied to the generated composite image data (S63), followed by Huffman coding (S64). Because the highest terminal compresses the data by performing predictive coding and Huffman coding, the PC 3 does not need to perform these processes. In S64, the CPU 10 may perform an entropy coding other than Huffman coding. Alternatively, only one of predictive coding and entropy coding may be performed. The encoded composite image data is output to the PC 3 connected to the own device (S65). The processes of S60 to S66 are repeated until a conference end instruction is input from the operation unit 29 or the like (S66: NO). When the conference end instruction is input (S66: YES), the process ends.

The lower terminal process will be described with reference to FIG. 10. When the lower terminal process is started, the resolution and frame rate are input from the video conference terminal 1 connected on the upper side (S71). If a video conference terminal 1 is connected on the lower side (that is, if the own device is not the lowest terminal), the input resolution and frame rate and the conference start instruction are forwarded to the lower video conference terminal 1 (S72).

It is then determined whether layout information (see FIG. 6) has been input from the video conference terminal 1 connected on the upper side (S73). When layout information has been input (S73: YES) and the own device is not the lowest terminal, the layout information is forwarded to the lower video conference terminal 1 (S74). Of the composite positions indicated by the layout information, the one corresponding to the connection number of the own device is recognized as the composite position of the own-device image (S75). When layout information has not been input (S73: NO), the composite position table (see FIG. 9) is acquired from the flash memory 13 (S76). The coordinate information corresponding to the total number, the resolution, and the connection order of the own device is acquired from the composite position table. The position indicated by the acquired coordinate information is recognized as the composite position of the own-device image (S77).

Next, own-device image data is acquired from the camera 28 of the own device (S79). In S79, the own-device image data is acquired according to the resolution and frame rate acquired in S71. In other words, a common resolution and frame rate are used across the plurality of video conference terminals 1. If the own device is the lowest terminal (S80: YES), the own-device image data is placed at the position recognized in S75 or S77 (S81), and discrete cosine transform (DCT) and quantization are applied to this image data (S85). The image data that has undergone DCT and quantization is output to the video conference terminal 1 connected on the upper side (S86). Performing DCT and quantization before outputting the image data reduces the processing load on the upper video conference terminal 1 and the PC 3. Furthermore, in the present embodiment, predictive coding and Huffman coding are performed only by the highest terminal. Accordingly, the intermediate terminal and the highest terminal can combine their own-device image data with the other-device image data without decoding the other-device image data. In S85, another transform coding such as discrete Fourier transform (DFT) may be performed instead, or only one of transform coding and quantization may be performed.
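To make the S85 step concrete, the following Python sketch applies a two-dimensional DCT-II to one 8 × 8 block and then quantizes the coefficients. It rests on several assumptions that the source does not state: the 8 × 8 block size, the orthonormal scaling, and a single uniform quantization step (a real codec would typically use a quantization table). The names `dct_2d` and `quantize` are hypothetical.

```python
import math

N = 8  # assumed block size; the source does not specify one


def dct_2d(block):
    """Orthonormal 2D DCT-II of an N x N block given as a list of rows."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def quantize(coeffs, step):
    """Uniform quantization: divide by the step and round to an integer."""
    return [[round(c / step) for c in row] for row in coeffs]
```

For a flat block, all of the energy ends up in the single top-left (DC) coefficient and every other quantized coefficient is zero, which is why this representation is cheaper to pass up the chain than raw samples.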

If the own device is not the lowest terminal but an intermediate terminal (S80: NO), other-device image data is acquired from the video conference terminal 1 connected on the lower side (S83). The own-device image data is combined into the acquired other-device image data at the position recognized in S75 or S77 (S84). As a result, composite image data is generated. DCT and quantization are applied to the generated composite image data (S85), and the composite image data is output to the upper video conference terminal 1 (S86). The processes of S79 to S87 are repeated until a conference end instruction is input from the operation unit 29 or the like (S87: NO). When the conference end instruction is input (S87: YES), the process ends.

As described above, the video conference terminal 1 of the present embodiment generates composite image data by combining the image data of a plurality of images captured by a plurality of cameras 28. The generated composite image data can be output to the upper video conference terminal 1 or to the PC 3. The PC 3, which receives the composite image data from the highest terminal, does not need to combine image data itself. The video conference terminal 1 can therefore reduce the processing load on the PC 3 and reduce the likelihood that communication is disrupted during the video conference.

The video conference terminal 1 can easily recognize the composite position of the image data captured by the own device (own-device image data) from the composite position table and the connection order of the own device, and can combine the own-device image data at the recognized position. The PC 3 does not need to instruct each video conference terminal 1 where to combine its image data. Accordingly, when the composite position table is used, the video conference terminal 1 can further reduce the processing load on the PC 3.

The video conference terminal 1 can also recognize the composite position of the own-device image data from layout information set by the user operating the operation unit 47 of the PC 3. In this case, the user can freely decide the layout of the images captured by the video conference terminals 1. The video conference terminal 1 can combine the own-device image data at the position indicated by the layout information.

When operating as an intermediate terminal or as the lowest terminal, the video conference terminal 1 outputs image data that has undergone transform coding and quantization to the upper video conference terminal 1. This prevents the upper video conference terminal 1 from having to acquire image data of a large size, which would increase its processing load. In addition, neither predictive coding nor entropy coding has been applied to the other-device image data that the upper video conference terminal 1 receives. The video conference terminal 1 therefore does not need to decode the other-device image data when generating composite image data. Furthermore, when operating as the highest terminal, the video conference terminal 1 outputs to the PC 3 composite image data that has undergone predictive coding and entropy coding. The processing load on the PC 3 can thus be reduced further.

When operating as the highest terminal, the video conference terminal 1 acquires the resolution and frame rate from the PC 3 and notifies the lower video conference terminals 1 of them. Each video conference terminal 1 can generate composite image data according to the resolution and frame rate acquired from the PC 3. The video conference terminal 1 can therefore share the resolution and frame rate with the other video conference terminals 1 and output appropriate composite image data to the PC 3.

In the above embodiment, the video conference terminal 1 corresponds to the "image processing apparatus" of the present invention. The PC 3 corresponds to the "communication device" of the present invention. The CPU 10 that acquires the total number in the terminal conference process of FIG. 7 functions as the "number acquisition means". The own-device image data corresponds to the "first image data" of the present invention. The CPU 10 that acquires the own-device image data in S60 of FIG. 8 and S79 of FIG. 10 functions as the "first image data acquisition means". The other-device image data corresponds to the "second image data" of the present invention. The CPU 10 that acquires the other-device image data in S61 of FIG. 8 and S83 of FIG. 10 functions as the "second image data acquisition means". The CPU 10 that generates the composite image data in S62 of FIG. 8 and S84 of FIG. 10 functions as the "generation means". The CPU 10 that outputs the composite image data in S65 of FIG. 8 and S86 of FIG. 10 functions as the "output means".

The CPU 10 that acquires the connection number in S23 and S34 of FIG. 7 functions as the "connection order information acquisition means". The CPU 10 that acquires the composite position table in S56 of FIG. 8 and S76 of FIG. 10 functions as the "table acquisition means". The CPU 10 that acquires the layout information in S53 of FIG. 8 and S73 of FIG. 10 functions as the "setting information acquisition means". The CPU 10 that determines in S21 of FIG. 7 whether the own device is directly connected to the PC 3 functions as the "determination means". The CPU 10 that performs compression processing, including predictive coding, in S63 and S64 of FIG. 8 functions as the "compression processing means". The CPU 10 that acquires the resolution and frame rate in S51 of FIG. 8 functions as the "output information acquisition means". The CPU 10 that forwards the resolution and frame rate in S52 of FIG. 8 and S72 of FIG. 10 functions as the "notification means".

The process of acquiring the total number in the terminal conference process of FIG. 7 corresponds to the "number acquisition step". The process of acquiring the own-device image data in S60 of FIG. 8 and S79 of FIG. 10 corresponds to the "first image data acquisition step". The process of acquiring the other-device image data in S61 of FIG. 8 and S83 of FIG. 10 corresponds to the "second image data acquisition step". The process of generating the composite image data in S62 of FIG. 8 and S84 of FIG. 10 corresponds to the "generation step". The process of outputting the composite image data in S65 of FIG. 8 and S86 of FIG. 10 corresponds to the "output step".

Needless to say, the present invention is not limited to the above embodiment, and various modifications are possible. For example, a device other than the video conference terminal 1 can be used as the "image processing apparatus" according to the present invention. For example, a video conference can be executed by connecting a plurality of camera-equipped PCs in series to the PC 3 connected to the network 8. In this case, each of the camera-equipped PCs corresponds to the "image processing apparatus" of the present invention. Further, the image processing apparatus itself does not need to include a camera. The image processing apparatus may receive the own-apparatus image data from an external camera connected to it. Similarly, the "communication device" according to the present invention need not be the PC 3. A dedicated communication device for executing a video conference, for example, can also be used as the "communication device" according to the present invention.

The connection between the PC 3 and the highest terminal and the connections between the video conference terminals 1 are not limited to USB connections; connection methods conforming to other protocols may be used. Needless to say, wireless communication may also be used.

In the above embodiment, the plurality of video conference terminals 1 are connected to the PC 3 in series. However, the present invention can be applied to various configurations other than the one in which all of the video conference terminals 1 are connected to the PC 3 in parallel. For example, even when a plurality of video conference terminals 1 are connected to the PC 3 in parallel, the present invention can be applied if a further video conference terminal 1 is connected to each of those terminals. In this case, the PC 3 still needs to receive and combine the image data from the video conference terminals 1 connected to it in parallel. However, when the present invention is applied, each highest terminal directly connected to the PC 3 can combine the plurality of images input to it and output the result to the PC 3. The amount of image-combining processing performed by the PC 3 is therefore lower than when the present invention is not applied. Further, when a plurality of video conference terminals 1 are connected in parallel to the highest terminal, the highest terminal can combine all the image data and output the result to the PC 3, thereby reducing the processing load on the PC 3.
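The reduction described above can be illustrated by counting how many image streams the PC 3 itself would have to combine in a given topology. The following is a rough sketch only: the names `TerminalNode`, `count_terminals`, `pc_streams_with_compositing`, and `pc_streams_without_compositing` are hypothetical, and the stream count is used merely as a proxy for processing load.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TerminalNode:
    """A video conference terminal and the terminals connected below it."""
    name: str
    children: List["TerminalNode"] = field(default_factory=list)


def count_terminals(node: TerminalNode) -> int:
    """Number of terminals in the subtree rooted at node, including node."""
    return 1 + sum(count_terminals(child) for child in node.children)


def pc_streams_with_compositing(direct: List[TerminalNode]) -> int:
    # Each terminal directly connected to the PC outputs one combined stream.
    return len(direct)


def pc_streams_without_compositing(direct: List[TerminalNode]) -> int:
    # Assumed baseline: every terminal's image reaches the PC separately.
    return sum(count_terminals(terminal) for terminal in direct)


# Two terminals connected to the PC in parallel, each with one more terminal.
branch_a = TerminalNode("A1", [TerminalNode("A2")])
branch_b = TerminalNode("B1", [TerminalNode("B2")])
print(pc_streams_with_compositing([branch_a, branch_b]))     # 2
print(pc_streams_without_compositing([branch_a, branch_b]))  # 4

# One highest terminal with three terminals connected to it in parallel.
top = TerminalNode("T1", [TerminalNode("T2"), TerminalNode("T3"), TerminalNode("T4")])
print(pc_streams_with_compositing([top]))     # 1
print(pc_streams_without_compositing([top]))  # 4
```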

In the above embodiment, the PC 3 does not capture images itself. However, the PC 3 may acquire its own image data and combine it with the other-apparatus image data input from the highest terminal. In this case, it is desirable that the highest terminal output the composite image data to the PC 3 without performing predictive coding or entropy coding.
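The choice of coding stages in this variation can be sketched as a small decision function. This is an illustration under stated assumptions, not the embodiment's logic: the function name `encoding_stages`, its parameters, and the stage labels are hypothetical, and it is assumed that transform coding and quantization are applied in every case while predictive coding and entropy coding are added only by the highest terminal when the PC 3 does not combine an image of its own.

```python
from typing import List


def encoding_stages(directly_connected_to_pc: bool,
                    pc_adds_own_image: bool) -> List[str]:
    """Return the coding stages a terminal applies before output (assumed model)."""
    # Assumption: data passed between devices is always transform coded
    # and quantized.
    stages = ["transform_coding", "quantization"]
    # Only the highest terminal finishes the compression, and only when the
    # PC will not need to combine the data with an image of its own.
    if directly_connected_to_pc and not pc_adds_own_image:
        stages += ["predictive_coding", "entropy_coding"]
    return stages


print(encoding_stages(True, False))
# ['transform_coding', 'quantization', 'predictive_coding', 'entropy_coding']
print(encoding_stages(True, True))
# ['transform_coding', 'quantization']
```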

In the above embodiment, when the user specifies the image layout, the layout information is set by the PC 3 accepting the user's operation input. However, the video conference terminal 1 may instead accept the user's operation input and set the layout information.

1 Video conference terminal
3 PC
8 Network
10 CPU
11 ROM
13 Flash memory
21 USB device I/F
25 USB host I/F
28 Camera
51 Image layout input template
100 Communication system

Claims

An image processing apparatus used in a communication system including a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, the image processing apparatus being capable of outputting, to the communication device, image data of an image to be displayed at another site, the image processing apparatus comprising:
number acquisition means for acquiring the number of other image processing apparatuses connected to the communication device;
first image data acquisition means for acquiring first image data from imaging means that captures an image;
second image data acquisition means for acquiring second image data from another image processing apparatus connected to the apparatus itself;
generation means for generating composite image data by combining, in accordance with the number acquired by the number acquisition means, the first image data acquired by the first image data acquisition means and the second image data acquired by the second image data acquisition means; and
output means for outputting the composite image data generated by the generation means to another connected image processing apparatus or to the communication device.

The image processing apparatus according to claim 1, further comprising:
connection order information acquisition means for acquiring the connection order of the apparatus itself with respect to the communication device, among the apparatus itself and the other image processing apparatuses connected to the communication device; and
table acquisition means for acquiring, from storage means, a table that associates the connection order of the apparatus itself with coordinate information indicating the composite position of the first image data in coordinates on the composite image data,
wherein the generation means combines the first image data at the composite position indicated by the coordinate information that corresponds, in the table acquired by the table acquisition means, to the connection order acquired by the connection order information acquisition means.

The image processing apparatus according to claim 1, further comprising setting information acquisition means for acquiring coordinate information that is set by a user inputting an operation on a template displayed on display means and that indicates, in coordinates on the template, the composite position of the first image data of each image processing apparatus as designated by the user,
wherein the generation means combines the first image data at the composite position indicated by the coordinate information acquired by the setting information acquisition means.

The image processing apparatus according to claim 1, wherein
the second image data acquisition means acquires, from the other image processing apparatus connected to the apparatus itself, the second image data that has been subjected to transform coding and quantization,
the image processing apparatus further comprises:
determination means for determining whether or not the apparatus itself is connected to the communication device without passing through another image processing apparatus; and
compression processing means for performing, when the determination means determines that the apparatus itself is connected to the communication device without passing through another image processing apparatus, compression processing including predictive coding processing on the composite image data generated by the generation means, and
the output means outputs the composite image data processed by the compression processing means to the communication device.

The image processing apparatus according to any one of claims 1 to 4, further comprising:
output information acquisition means for acquiring, from the communication device, the resolution and frame rate of the composite image data to be output to the communication device; and
notification means for notifying another image processing apparatus connected to the communication device of the resolution and frame rate acquired by the output information acquisition means,
wherein the generation means generates the composite image data in accordance with the resolution and frame rate acquired by the output information acquisition means, or in accordance with the resolution and frame rate notified by the notification means of another image processing apparatus.

An image processing method executed by an image processing apparatus that is used in a communication system including a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, and that is capable of outputting, to the communication device, image data of an image to be displayed at another site, the image processing method comprising:
a number acquisition step of acquiring the number of other image processing apparatuses connected to the communication device;
a first image data acquisition step of acquiring first image data from imaging means that captures an image;
a second image data acquisition step of acquiring second image data from another image processing apparatus connected to the apparatus itself;
a generation step of generating composite image data by combining, in accordance with the number acquired in the number acquisition step, the first image data acquired in the first image data acquisition step and the second image data acquired in the second image data acquisition step; and
an output step of outputting the composite image data generated in the generation step to another connected image processing apparatus or to the communication device.

An image processing program for use in an image processing apparatus that is used in a communication system including a communication device that transmits data via a network and a plurality of image processing apparatuses connected in series to the communication device, and that is capable of outputting, to the communication device, image data of an image to be displayed at another site, the image processing program including instructions for causing a controller of the image processing apparatus to execute:
a number acquisition step of acquiring the number of other image processing apparatuses connected to the communication device;
a first image data acquisition step of acquiring first image data from imaging means that captures an image;
a second image data acquisition step of acquiring second image data from another image processing apparatus connected to the apparatus itself;
a generation step of generating composite image data by combining, in accordance with the number acquired in the number acquisition step, the first image data acquired in the first image data acquisition step and the second image data acquired in the second image data acquisition step; and
an output step of outputting the composite image data generated in the generation step to another connected image processing apparatus or to the communication device.