JP2023026844A

JP2023026844A - Video recording system, video recording playback device and image processing method

Info

Publication number: JP2023026844A
Application number: JP2021132243A
Authority: JP
Inventors: 耕司桑田; Koji Kuwata
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2021-08-16
Filing date: 2021-08-16
Publication date: 2023-03-01

Abstract

To provide a video recording system that can display a first image during playback even when the video was recorded with a second image superimposed on the first image. SOLUTION: A video recording system 60 comprises: a communication unit that receives first image data from another site; an image processing unit that moves the position of the first image data or of second image data within a frame of a video in which the second image data, received from another site or prepared at the own site, is embedded in the first image; an image composition unit that generates a video file in which the second image data is embedded in the first image data; an image construction unit that constructs, from the video file, the first image data that does not include the second image data; and a display control unit that displays the first image data constructed by the image construction unit. SELECTED DRAWING: Figure 1

Description

The present invention relates to a recording system, a recording/playback device, and an image processing method.

Communication systems that transmit and receive video and audio to and from remote sites via a network are known. In such a system, a communication terminal at the site of one party to a conference captures images and collects audio such as speech, converts them into digital data, and transmits the data to the other party's communication terminal. The other party's communication terminal displays the images on a display and outputs the audio from a speaker, which enables a video call.

Techniques for recording images during a conference have been devised (see, for example, Patent Literature 1). Patent Literature 1 discloses a system that first records the unprocessed images so that they can be edited later, which allows the conference to be reproduced efficiently.

However, in the conventional technology the second image is displayed superimposed on the first image, so the first image cannot be displayed even when the user later plays back the recording. That is, the communication terminal displays not only the camera images of multiple sites (second image) but also the material (first image) on the display. Because the material often contains text, users generally display it as large as possible, and the camera images of the sites are reduced and overlaid on the material in a Picture In Picture style. This makes the following possible:
・All screens can be viewed at the same time.
・Important screens and materials (shared material, video of the site that is currently speaking, etc.) can be shown large.

However, even if the recording system records the material and the camera images of each site as shown on the display, part of the first image (the image displayed large) is hidden, and the user cannot display the parts of the first image that are of interest.

In view of the above problem, an object of the present invention is to provide a recording system that can display the first image during playback even when the recording was made with a second image superimposed on the first image.

To address the above problem, the present invention provides a recording system comprising: a communication unit that receives first image data from another site; an image processing unit that moves the position of the first image data or of second image data within a frame of a video in which the second image data, received from another site or prepared at the own site, is embedded in the first image; an image composition unit that generates a video file in which the second image data is embedded in the first image data; an image construction unit that constructs, from the video file, the first image data that does not include the second image data; and a display control unit that displays the first image data constructed by the image construction unit.

A recording system can be provided that displays the first image during playback even when the recording was made with a second image superimposed on the first image.

FIG. 1 is an example of a schematic diagram of the system configuration of a communication system.
FIG. 2 is an example of a configuration diagram of a recording system at one site.
FIG. 3 is an example of a hardware configuration diagram of a communication terminal.
FIG. 4 is an example of a hardware configuration diagram of a recording/playback device.
FIG. 5 is an example of a functional block diagram that describes the functions of the communication terminal and the recording/playback device divided into blocks.
FIG. 6 is a diagram explaining the information stored in a composite image storage unit.
FIG. 7 is a diagram explaining the information stored in a camera image storage unit.
FIG. 8 is a diagram explaining the information stored in a material image storage unit.
FIG. 9 is a diagram showing an example of a conventional composite image created by a communication terminal.
FIG. 10 is a diagram outlining the image processing that moves the position of a camera image in a composite image.
FIG. 11 is an example of a flowchart explaining the recording process for a composite image.
FIG. 12 is a diagram showing the movement of the camera image in a composite image as a screen display image.
FIG. 13 is an example of a flowchart explaining the process of step S7 in FIG. 11.
FIG. 14 is a diagram showing an example of metadata attached to a frame of a composite image.
FIG. 15 is a diagram explaining how the recording/playback device plays back a material image.
FIG. 16 is an example of a flowchart explaining the process by which the recording/playback device saves a camera image (video file) and a material image.
FIG. 17 is a diagram outlining the image processing that creates a composite image while sliding the material image upward.
FIG. 18 is a diagram explaining the vertical sliding of a material image.
FIG. 19 is a flowchart explaining the process of step S7 in FIG. 11 in the process of sliding the material image upward.
FIG. 20 is an example of a flowchart explaining how a second material image construction unit constructs a material image in the process of sliding the material image vertically.
FIG. 21 is a diagram showing three types of images that can be used when playing back a composite image.
FIG. 22 is a diagram showing an example of image display using two displays.

As an example of an embodiment of the present invention, a recording system and an image processing method performed by the recording system are described below with reference to the drawings.

<Outline of recording system operation>
In the recording system of this embodiment, at least the following two kinds of images are transmitted and received between sites.
1. Camera images (video) of the site, such as the conference participants, captured by a camera
2. Material images to be shared in the conference (generally still images)
The communication terminal at each site fits images 1 and 2 into one screen (a frame for one screen) so that they are visible at the same time while keeping the material easy to read. In this case, the user has no choice but to adopt a Picture In Picture style layout (hereinafter referred to as embedding 1 in 2).

When the communication terminal records the entire screen on which the camera images (an example of second image data) and the material image (an example of first image data) are laid out, a person who could not attend the conference and plays the recording back to review it faces the following inconveniences:
・Carefully reviewing the entire recording may reveal the blind-spot areas of the material image, but this takes time.
・Alternatively, the blind-spot areas may remain unseen to the end.

<Outline of processing in this embodiment>
Therefore, in this embodiment, the following processing allows the recording system to reproduce, at any time during playback, areas that were not visible (were hidden) during the conference.

◆ During recording (processing by the communication terminal)
The communication terminal executes at least one of the following processes during recording.
A. When a camera image is embedded in the material image, the terminal keeps slowly moving the display position of the frontmost image (usually the camera image).
B. When the backmost image (usually the material image) changes to a new page or the material itself is replaced, the terminal switches to the next page by sliding the material image vertically or horizontally.
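Process A can be illustrated with a minimal sketch. The function names, the bounce-at-the-edge motion, the step size, and the per-frame dictionary layout below are assumptions made for illustration only; the description above states only that the frontmost image keeps moving slowly. Images are modeled as small lists of pixel values so that the example needs no external library.

```python
def drift_positions(frame_w, overlay_w, step, n_frames, start_x=0):
    """Return the overlay's left edge (in pixels) for each recorded frame.

    Hypothetical motion model: the overlay moves `step` pixels per frame and
    reverses direction when it would leave the frame, so it never stops.
    """
    max_x = frame_w - overlay_w
    x, dx = start_x, step
    positions = []
    for _ in range(n_frames):
        positions.append(x)
        nxt = x + dx
        if nxt < 0 or nxt > max_x:
            dx = -dx  # bounce off the frame edge
            nxt = max(0, min(max_x, x + dx))
        x = nxt
    return positions


def composite(material, camera, x, y):
    """Return a copy of `material` with `camera` pasted at column x, row y."""
    frame = [row[:] for row in material]
    for r, cam_row in enumerate(camera):
        frame[y + r][x:x + len(cam_row)] = cam_row
    return frame


material = [[0] * 10 for _ in range(3)]   # toy 10x3 material image
camera = [[9, 9, 9, 9]]                   # toy 4x1 camera image

# Each recorded frame keeps the overlay position next to its pixels
# (an assumed layout, used here only to make the example inspectable).
recorded = [
    {"overlay_x": x, "overlay_y": 1, "pixels": composite(material, camera, x, 1)}
    for x in drift_positions(frame_w=10, overlay_w=4, step=3, n_frames=6)
]
```

In this toy model the overlay vacates every pixel at some point only when its travel range (`frame_w - overlay_w`) is at least as wide as the overlay itself and the overlay actually reaches both ends of that range.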

◆ During playback (processing by the recording/playback device)
C (corresponding to A). Because the display position of the frontmost camera image moves, the recording/playback device can also obtain the areas of the material image that were hidden behind the camera image. By temporarily saving this material image as a separate file during playback, the recording/playback device can display the entire material image or move the camera image to any position on the material image. "Temporarily during playback" means that the recording is stored as a single file, as in an ordinary recording device, and the recording/playback device generates the material image from that single file as a separate file at playback time.

Because the material image is usually a document used in the conference, it can be treated as an almost still image. This is what makes such a playback method possible.
D (corresponding to B). The recording/playback device uses the fact that the material image slides when it is switched to obtain the information in the blind-spot areas. By temporarily saving this information as a separate file during playback, the recording/playback device can display the entire material image or move the camera image to any position on the material image.
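The idea behind process C can be sketched as follows, assuming the material image is static over the frames considered and that the overlay rectangle of each frame is known (for example from per-frame metadata). The function names and the `(pixels, rect)` frame representation are hypothetical and chosen only for this illustration; this is not the construction procedure of the embodiment itself.

```python
def reconstruct_material(frames, height, width):
    """Rebuild the background from frames whose overlay rectangle moves.

    `frames` is a list of (pixels, (x, y, w, h)) pairs, where the rectangle
    marks the area covered by the camera image in that frame. A pixel is
    copied from the first frame in which it is not covered. Pixels that are
    covered in every frame stay None.
    """
    result = [[None] * width for _ in range(height)]
    for pixels, (ox, oy, ow, oh) in frames:
        for r in range(height):
            for c in range(width):
                covered = ox <= c < ox + ow and oy <= r < oy + oh
                if not covered and result[r][c] is None:
                    result[r][c] = pixels[r][c]
    return result


def make_frame(material, rect, fill=-1):
    """Toy recorder: overwrite the rectangle with `fill` (the camera image)."""
    x, y, w, h = rect
    frame = [row[:] for row in material]
    for r in range(y, y + h):
        for c in range(x, x + w):
            frame[r][c] = fill
    return frame


material = [[r * 10 + c for c in range(6)] for r in range(2)]  # 6x2 toy image
rects = [(0, 0, 2, 2), (2, 0, 2, 2), (4, 0, 2, 2)]             # overlay moves right
frames = [(make_frame(material, rect), rect) for rect in rects]

rebuilt = reconstruct_material(frames, height=2, width=6)
```

Here `rebuilt` equals `material` because every pixel is uncovered in at least one frame. Any `None` left in the result would mark a blind spot that the overlay never vacated.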

Processes C and D are feasible because, unlike during the conference, the composite image in which the camera images are embedded in the material image has already been recorded and does not need to be displayed in real time. That is, the recording/playback device processes the recorded composite image as if editing it, and can thereby play back and display the composite image.

Thus, with the communication system of this embodiment, when a user who could not attend the conference reviews the composite image, playback does not end with the blind-spot areas still unseen, and the user does not need to follow the entire composite image carefully.

<Terms>
A blind spot is the area of the first image data over which the second image data is overlaid. The blind-spot area is invisible to the user.

Real time means that, when a process is executed now, its result is obtained within a fixed delay.

<Configuration example of recording system>
FIG. 1 is a schematic diagram of the system configuration of a communication system 100. In the communication system 100, communication terminals 10A to 10D located at sites A to D communicate via a communication network N such as the Internet. Sites A to D each have a recording system 60 (a communication terminal 10 and a recording/playback device 30). A recording/playback device 30 does not have to be installed at every site. The number of communication terminals 10A to 10D in FIG. 1 (four) is an example; any number of two or more (two or more sites) is possible. An arbitrary one of the communication terminals 10A to 10D is referred to as a "communication terminal 10".

The communication terminals 10A to 10D transmit captured camera images and audio to a communication management system 50, and receive camera images from the communication management system 50. A site does not need to receive its own camera image. Any communication terminal 10 among the communication terminals 10A to 10D transmits a material image to the communication management system 50. The material image is an image of conference material that the communication terminals 10A to 10D have imported from an external PC or hold internally. The communication terminals 10A to 10D receive the material image from the communication management system 50. The material image may also have been registered in the communication management system 50 in advance.

The communication terminals 10A to 10D at sites A to D display the camera images captured by the communication terminals 10A to 10D and the material image described above. The same applies to audio data. The camera images transmitted by the communication terminals 10A to 10D are assumed to be video, but they may be still images. The communication terminals 10A to 10D may also simply receive camera images and audio without transmitting them.

Each of the communication terminals 10A to 10D may be a dedicated video conference terminal or a PC (Personal Computer) running an application. In the latter case, the PC is normally used as a general-purpose information processing device and operates as the communication terminal 10 when it executes a video conference application.

Besides a PC, the communication terminal 10 as a general-purpose information processing device may be a smartphone, a tablet terminal, a PDA (Personal Digital Assistant), a mobile phone, a projector, an electronic whiteboard, a car navigation system, or the like.

The communication management system 50 is connected to the communication network N. The communication management system 50 manages and controls communication among the communication terminals 10A to 10D. It may be built on a single computer, or on multiple computers among which its parts (functions, means, or storage units) are divided and arbitrarily allocated.

The communication management system 50 manages, for example, the communication terminals 10 participating in the same conference by associating them with a session ID. Based on the session ID, the communication management system 50 transmits the images (material images and camera images) and audio from each site to the other sites participating in the same conference. This enables a video conference among multiple sites.
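The session-ID association described above can be sketched as a small registry. The class name, method names, and string identifiers are hypothetical; the description states only that terminals in the same conference are associated by a session ID and that data from one site is forwarded to the other sites in that conference.

```python
from collections import defaultdict


class SessionRegistry:
    """Toy model of grouping terminals by session ID (illustrative only)."""

    def __init__(self):
        # session ID -> set of terminal IDs participating in that conference
        self._sessions = defaultdict(set)

    def join(self, session_id, terminal_id):
        self._sessions[session_id].add(terminal_id)

    def recipients(self, session_id, sender_id):
        """Terminals in the same session, excluding the sender itself."""
        members = self._sessions.get(session_id, set())
        return sorted(members - {sender_id})


registry = SessionRegistry()
for terminal in ("10A", "10B", "10C"):
    registry.join("conference-1", terminal)
registry.join("conference-2", "10D")

# Data sent by terminal 10A is forwarded only within its own conference.
targets = registry.recipients("conference-1", "10A")
```

Excluding the sender mirrors the statement that a site does not need to receive its own camera image.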

The communication management system 50 is implemented by one or more information processing devices. It may be implemented by cloud computing or by a single information processing device. Cloud computing refers to a form of use in which resources on a network are used without awareness of specific hardware resources. The communication management system 50 may reside on the Internet or on-premises.

FIG. 2 shows the configuration of the recording system 60 at one site. The recording system 60 has a communication terminal 10 and a recording/playback device 30. A display 20 may be connected to the recording/playback device 30. A display 20 can also be connected to the communication terminal 10.

A camera and a microphone may be externally connected to the communication terminal 10. The camera captures the range within its angle of view at the own site and generates a camera image; the microphone collects the sound at the own site and generates audio data. The communication terminal 10 displays the camera image on a built-in or external display 20 and transmits it to the other sites. The communication terminal 10 also transmits the audio data to the other sites.

The communication terminal 10 generates a composite image by embedding the camera image of its own site and the camera images of other sites in a material image prepared by its own site or another site. The communication terminal 10 transmits the composite image and audio data to the recording/playback device 30. The recording/playback device 30 records the composite images and audio data generated during the conference and creates a video file.

The recording/playback device 30 is installed at each site, department, or the like, and is used mainly by users who did not attend a conference to grasp its content from a digest of the conference.

After the conference ends, the recording/playback device 30 plays back the video file of the composite image in response to a user operation. The played-back video is displayed on the display 20. Alternatively, the recording/playback device 30 may act as a Web server and transmit the played-back video file of the composite image to a user terminal. The recording/playback device 30 may also upload the video file to the cloud, and the recording/playback device 30 itself may reside in the cloud.

In FIG. 2 the communication terminal 10 and the recording/playback device 30 are separate units, but the communication terminal 10 may have the functions of the recording/playback device 30.

<Hardware configuration example>
The hardware configurations of the communication terminal 10 and the recording/playback device 30 included in the recording system 60 according to this embodiment are described with reference to FIGS. 3 and 4.

<<Communication terminal>>
FIG. 3 shows an example hardware configuration of the communication terminal 10. The communication terminal 10 includes a camera module 301, a video processing unit 303, a video CODEC unit 314, a video output processing unit 310, a microphone array 304, an audio output unit 307, an audio processing unit 308, a network processing unit 313, an overall processing unit 315, an operation unit 316, a RAM 306, a recording device I/F unit 317, a video characteristic analysis unit 302, a RAM 305, and a Capture processing unit 312. The camera module 301 may instead be an externally connected general-purpose camera.

The camera module 301 is an example of an "imaging device". The camera module 301 captures video of the conference scene. The camera module 301 has a lens 301a, an imaging unit 301b (image sensor), and a DSP 301c. The imaging unit 301b generates video data (RAW data) by converting the light collected through the lens 301a into an electrical signal. The DSP 301c generates video data (YUV data) by applying known camera video processing, such as Bayer conversion and 3A control, to the video data (RAW data) output from the imaging unit 301b.

The video processing unit 303 applies various kinds of video processing, such as cropping and scaling 303a, to the video data (YUV data) output from the camera module 301 according to the purpose. For example, the video processing unit 303 acquires face detection information from the video characteristic analysis unit 302 and beamforming information from the audio processing unit 308, and generates a close-up video of the speaker. The generated video is transferred to the video output processing unit 310. The video processing unit 303 uses the RAM 306 as a buffer when performing this video processing.

The video CODEC unit 314 encodes and decodes the video data (video stream data) transmitted to and received from other communication terminals 10. For example, the video CODEC unit 314 uses the video encoder 314a to encode the video data output from the video processing unit 303 and transmits the encoded video data to another communication terminal 10 via the network processing unit 313. Alternatively, the video CODEC unit 314 encodes the video data that has been laid out by the video output processing unit 310 and transmits the encoded video data to another communication terminal 10 via the network processing unit 313.

In addition, for example, the video CODEC unit 314 acquires, via the network processing unit 313, video data transmitted from another communication terminal 10 (video data encoded by that communication terminal 10) and decodes it with the video decoder 314b. The video CODEC unit 314 then outputs the decoded video data to the video output processing unit 310. The video CODEC unit 314 is implemented by, for example, a CODEC circuit or software that uses a compression standard such as H.264/265.

The video output processing unit 310 displays video based on the video data on the display of the touch panel unit 309. The display may be an ordinary external monitor.

・For example, the video output processing unit 310 displays, on the display of the touch panel unit 309, video based on the video data decoded by the video CODEC unit 314 (that is, video of another site). There may be video from more than one other site.

・In addition, for example, the video output processing unit 310 displays, on the display of the touch panel unit 309, video based on the video data output from the camera module 301 (that is, video of the own site). The video from the camera module 301 may have been processed by the video processing unit 303 into a close-up of the speaker.

・In addition, for example, the video output processing unit 310 displays, on the display of the touch panel unit 309, video based on the video data output from the Capture processing unit 312 (that is, video such as a material screen displayed on the external PC 311).

The video output processing unit 310 thus displays a wide variety of video, and because all of this video conference video has to fit in one frame, the video output processing unit 310 performs layout processing. In some cases the individual videos are simply displayed side by side; in others the shared material is shown large and the participant video of each site is displayed in Picture In Picture format. The video output processing unit 310 changes the layout as needed according to the characteristics of the video being displayed and the current layout. The layout change of this embodiment (movement of camera images and the like) is executed in this module.
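The two layouts mentioned above (side by side, and Picture In Picture over the shared material) can be sketched as rectangle computations. The function names, the thumbnail scale, the margin, and the placement of the thumbnails along the right edge are assumptions made for illustration; the description does not specify these values.

```python
def side_by_side(frame_w, frame_h, n_sources):
    """Split the frame into equal-width vertical tiles, one per source.

    Each rectangle is (x, y, width, height) in pixels.
    """
    tile_w = frame_w // n_sources
    return [(i * tile_w, 0, tile_w, frame_h) for i in range(n_sources)]


def picture_in_picture(frame_w, frame_h, n_cameras, scale=4, margin=8):
    """Material fills the frame; camera thumbnails stack up the right edge.

    `scale` and `margin` are hypothetical parameters: each thumbnail is
    1/scale of the frame size and is separated from its neighbors and from
    the frame edges by `margin` pixels.
    """
    thumb_w, thumb_h = frame_w // scale, frame_h // scale
    material = (0, 0, frame_w, frame_h)
    cameras = []
    for i in range(n_cameras):
        x = frame_w - thumb_w - margin
        y = frame_h - (i + 1) * (thumb_h + margin)
        cameras.append((x, y, thumb_w, thumb_h))
    return material, cameras


tiles = side_by_side(1280, 720, 2)
material_rect, camera_rects = picture_in_picture(1280, 720, n_cameras=2)
```

In the Picture In Picture case, each camera rectangle is a blind spot on the material image, which is why the embodiment moves these rectangles during recording.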

The microphone array 304 has a microphone array 304a and an A/D converter 304b. The microphone array 304a collects the voices of the video conference participants and outputs an audio signal (analog signal). The A/D converter 304b converts the audio signal (analog signal) output from the microphone array 304a into a digital signal and outputs the converted audio signal (digital signal) to the audio processing unit 308.

The audio output unit 307 has a D/A converter 307B and a speaker 307a. The D/A converter 307B converts the audio signal (digital signal) transmitted from another communication terminal 10 into an analog signal. The speaker 307a is supplied with the audio signal (analog signal) converted by the D/A converter 307B and outputs the voices of the video conference participants collected at the other site.

The audio processing unit 308 has a DSP 308a, an audio CODEC function 308b, a noise cancellation (NR/EC) function 308c, an audio discrimination function 308d, and a beamforming function 308e. It applies predetermined audio processing (for example, audio CODEC processing, noise cancellation (NR/EC), audio discrimination, and beamforming) to the audio data that forms part of the video data received from another communication terminal 10, and outputs the processed audio data to the audio output unit 307. At the same time, while keeping track of the audio data it outputs to the audio output unit 307, the audio processing unit 308 performs echo cancellation (EC) on the audio data that leaks back into the microphone array 304, and outputs the processed audio data to the network processing unit 313. The audio processing unit 308 also identifies the direction of the sound with the beamforming function, and the video processing unit 303 generates a close-up video of the speaker based on that information.

ネットワーク処理部３１３は、映像ＣＯＤＥＣ部（エンコーダ）３１４から出力された符号化済みの映像データを、ネットワークを介して、送信先の他の通信端末１０へ送信するＮＩＣ３１３ａを有する。また、ネットワーク処理部３１３は、他の通信端末１０から送信された符号化済みの映像データを、ネットワークを介して受信する。そして、ネットワーク処理部３１３は、当該映像データを、映像ＣＯＤＥＣ部（デコーダ）３１４へ出力する。また、ネットワーク処理部３１３は、符号化パラメータ（QP値、等）を決めるための、ネットワークの帯域をモニターする機能（ネットワーク状態検知部３１３ｂ）を有する。また、ネットワーク処理部３１３は、符号化パラメータ（QP値、等）や送信モードの設定を最適化するための、相手局の機能や性能に関する情報を取得する機能（相手局機能判別部３１３ｃ）を有する。 The network processing unit 313 has a NIC 313a that transmits encoded video data output from the video CODEC unit (encoder) 314 to another communication terminal 10 as a transmission destination via the network. The network processing unit 313 also receives encoded video data transmitted from another communication terminal 10 via the network, and outputs that video data to the video CODEC unit (decoder) 314. The network processing unit 313 also has a function of monitoring the network bandwidth (network state detection unit 313b) in order to determine encoding parameters (QP value, etc.). In addition, the network processing unit 313 has a function of acquiring information about the functions and performance of the partner station (partner station function determination unit 313c) in order to optimize the settings of the encoding parameters (QP value, etc.) and the transmission mode.

全体処理部３１５は、通信端末１０の全体の制御を行う。全体処理部３１５は、ＣＰＵ３１５ａ、ＲＯＭ３１５ｂ、ＳＳＤ３１５ｃ、ＲＡＭ３１５ｄ等を備えて構成されている。例えば、全体処理部３１５は、オペレータの指示に従って、各モジュール及び各ブロックのモード設定、ステータス管理等を行う。また、全体処理部３１５は、システムメモリ（ＲＡＭ）の使用権及びシステムバスのアクセス権限の調停機能等を有する。 The overall processing unit 315 performs overall control of the communication terminal 10 . The overall processing unit 315 includes a CPU 315a, a ROM 315b, an SSD 315c, a RAM 315d, and the like. For example, the overall processing unit 315 performs mode setting, status management, etc. of each module and each block according to an operator's instruction. The overall processing unit 315 also has a function of arbitrating the right to use the system memory (RAM) and the right to access the system bus.

また、全体処理部３１５は、カメラモジュール３０１の撮像モードの設定を行う。カメラモジュール３０１の撮像モードの設定は、環境に応じて自動的に設定される自動設定項目（例えば、測光条件等）と、オペレータの操作入力により手動的に設定される手動設定項目とを含み得る。 Also, the overall processing unit 315 sets the imaging mode of the camera module 301 . The setting of the imaging mode of the camera module 301 can include automatic setting items (for example, photometric conditions) that are automatically set according to the environment, and manual setting items that are manually set by the operator's operation input. .

また、全体処理部３１５は、映像出力処理部３１０で行われるレイアウト処理に関する設定を行う。全体処理部３１５は、共有資料を優先的に表示するためのPicture In Picture表示にする、あるいは、特定の拠点を大きく映す、等の予め決められている表示フォーマットを選択・設定する。 The overall processing unit 315 also makes settings related to layout processing performed by the video output processing unit 310 . The overall processing unit 315 selects and sets a predetermined display format, such as Picture In Picture display for preferentially displaying shared materials, or displaying a specific site in a large size.

これらの設定は、オペレータによる操作部３１６の操作によって行われ、通信端末１０が備えるメモリ（ＲＡＭ）に記憶される。そして、これらの設定は、映像処理部３０３によって使用される。 These settings are made by the operator operating the operation unit 316 and stored in the memory (RAM) of the communication terminal 10 . These settings are then used by the video processing unit 303 .

操作部３１６は、各種入力デバイス（例えば、タッチパネル、操作ボタン、リモコン等）を備える。操作部３１６は、オペレータによる各種入力デバイスに操作により、各種入力（例えば、各種設定、会議参加者の呼び出し等）を受け付ける。 The operation unit 316 includes various input devices (eg, touch panel, operation buttons, remote control, etc.). The operation unit 316 receives various inputs (for example, various settings, calling of conference participants, etc.) by operating various input devices by the operator.

録画装置I/F部３１７は、音声処理部３０８から出力される音声データと、映像出力処理部３１０で生成された映像データと、を組み合わせて録画データを構成させ、その合成されたデータを録画再生装置３０に出力するためのＩ／Ｆ機能を有する。 The recording device I/F unit 317 combines the audio data output from the audio processing unit 308 and the video data generated by the video output processing unit 310 to form recording data, and records the synthesized data. It has an I/F function for outputting to the playback device 30 .

映像特性解析部３０２は、検知部３０２ａ及び動き判定部３０２ｂを有する。検知部３０２ａは、カメラモジュール３０１から出力された映像データを構成するフレーム画像から、人の顔が存在するエリアを検知する。動き判定部３０２ｂは、カメラモジュール３０１から出力された映像データを構成するフレーム画像から、人が動いているエリアを検知する。映像特性解析部３０２は、各エリアの検知結果を、映像処理部３０３へ出力する。なお、映像特性解析部３０２は、各エリアの検知を行う際に、ＲＡＭ３０５をバッファとして使用する。 The video characteristic analysis unit 302 has a detection unit 302a and a motion determination unit 302b. The detection unit 302 a detects an area in which a person's face exists from the frame images forming the video data output from the camera module 301 . The motion determination unit 302b detects an area in which a person is moving from frame images forming video data output from the camera module 301 . The image characteristic analysis unit 302 outputs the detection result of each area to the image processing unit 303 . Note that the video characteristic analysis unit 302 uses the RAM 305 as a buffer when detecting each area.

Capture処理部３１２は、外部ＰＣ３１１から入力された映像を取り込んで、映像出力処理部３１０に転送する。Capture処理部３１２は、本実施形態にかかわる機能としては、改頁（更新）検出機能を有する。 The capture processing unit 312 captures the video input from the external PC 311 and transfers it to the video output processing unit 310 . The capture processing unit 312 has a page break (update) detection function as a function related to this embodiment.

・外部ＰＣ３１１から転送されてくる資料（画面）共有用の映像について、Capture処理部３１２は、改頁（更新）されたかどうか検出する。 The capture processing unit 312 detects whether or not a page break (update) has occurred in the material (screen) sharing video transferred from the external PC 311 .

・検出方法は特に限定するものはないが、ＰＣ画面の映像データなのでノイズ的な要素はないため、単純にフレーム間でのベリファイチェックやサムチェックでもよい。Capture処理部３１２は、演算量を抑えるために、Captureした画像の解像度を落としてから上記の処理を行ってもよい。 The detection method is not particularly limited, but since it is video data of a PC screen and there is no element of noise, a simple verify check or sum check between frames may be used. In order to reduce the amount of calculation, the capture processing unit 312 may perform the above processing after lowering the resolution of the captured image.

・ここでいう画面の改頁（更新）とは、会議がスタートした後に外部ＰＣ３１１が接続されて、外部ＰＣ３１１から資料（画面）共有の映像転送がスタートしたことも改頁トリガーに含める。 The page break (update) of the screen here includes the fact that the external PC 311 is connected after the conference has started and video transfer for material (screen) sharing has started from the external PC 311 as a page break trigger.

・Capture処理部３１２は、改頁（更新）を検出したら、その旨を映像出力処理部３１０に知らせる。 - When the capture processing unit 312 detects a page break (update), it notifies the video output processing unit 310 of that fact.

・外部ＰＣ３１１から転送されてくる映像は、ここでは資料（画面）共有用の画像という扱いなので、カメラモジュールの映像のような動画としては扱わない。よって、フレームレートにも上限が設けられる（すなわち、改頁（更新）検出の間隔をある程度確保する）。 - The video transferred from the external PC 311 is treated here as an image for material (screen) sharing, so it is not treated as a video like the camera module video. Therefore, an upper limit is also set for the frame rate (that is, a certain amount of page break (update) detection interval is secured).

・なお、外部ＰＣ３１１上で再生している動画を、フレームレートの制約を設けず通常の動画として転送したい場合は、カメラモジュールからの入力映像をユーザーが無効にすればよい。ユーザーはシステムの動作モード指定時にそのような設定をCapture制御部にインプットする。 - If the user wants to transfer a moving image being played back on the external PC 311 as a normal moving image without the frame rate limit, the user can disable the input video from the camera module. The user enters this setting into the Capture control unit when specifying the operating mode of the system.
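
The page break (update) detection described in the list above can be sketched as follows. This is only an illustrative sketch, not the implementation of the Capture processing unit 312: it assumes that a frame is a list of rows of 8-bit pixel values, uses CRC32 as one possible form of the "sum check", and the names `downscale`, `frame_checksum`, and `PageChangeDetector` are hypothetical.

```python
import zlib


def downscale(frame, step=2):
    """Keep every `step`-th pixel in both directions to reduce the work."""
    return [row[::step] for row in frame[::step]]


def frame_checksum(frame):
    """CRC32 over all pixel values of a frame (rows of ints in 0..255)."""
    crc = 0
    for row in frame:
        crc = zlib.crc32(bytes(row), crc)
    return crc


class PageChangeDetector:
    """Reports a page break when the checksum differs from the previous frame."""

    def __init__(self, step=2):
        self.step = step
        self.last = None  # no frame seen yet

    def update(self, frame):
        # The first frame also counts as a page break, which mirrors the
        # "sharing has just started" trigger described above.
        current = frame_checksum(downscale(frame, self.step))
        changed = current != self.last
        self.last = current
        return changed
```

Because PC screen video has essentially no noise, an exact comparison is sufficient here. Note that downscaling by dropping pixels can miss a change confined to the dropped pixels, so the step size is a trade-off between computation and sensitivity.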

＜＜録画再生装置＞＞
図４は、録画再生装置３０のハードウェア構成図である。図４に示されているように、録画再生装置３０は、ＣＰＵ５０１、ＲＡＭ５０２、操作部５０３、ＲＯＭ５０４、入力Ｉ/Ｆ５０５、ＣＯＤＥＣ５０６、録画再生回路５０７、暗号化回路５０８、出力Ｉ/Ｆ５１０、及び、ＮＩＣ５１１を有している。 <<Recording and playback device>>
FIG. 4 is a hardware configuration diagram of the recording/playback device 30. As shown in FIG. 4, the recording/playback device 30 includes a CPU 501, a RAM 502, an operation unit 503, a ROM 504, an input I/F 505, a CODEC 506, a recording/playback circuit 507, an encryption circuit 508, an output I/F 510, and a NIC 511.

ＣＰＵ５０１は、ＲＯＭ５０４に格納された所定のプログラムに従って、本実施形態で説明される録画機能を実現するために、録画再生装置３０の各ブロックを制御する。 The CPU 501 controls each block of the recording/reproducing device 30 according to a predetermined program stored in the ROM 504 in order to implement the recording function described in this embodiment.

ＲＡＭ５０２は、ＣＰＵ５０１の作業領域として利用されると共に、ＲＯＭ５０４に格納される各処理プログラムなどの記憶領域としても利用される。ＲＡＭ５０２は、通信端末１０から転送される画像データや音声データの一時的な格納先として利用される。また、ＲＡＭ５０２は、CODEC５０６や録画再生回路５０７のワークメモリとしても利用される。 The RAM 502 is used as a work area for the CPU 501 and also as a storage area for each processing program stored in the ROM 504 . The RAM 502 is used as a temporary storage destination for image data and audio data transferred from the communication terminal 10 . The RAM 502 is also used as a work memory for the CODEC 506 and recording/playback circuit 507 .

操作部５０３は、ハードキーorリモコン等から構成され、録画再生装置３０の起動、モード設定などを行う一般的なユーザーインタフェースである。 The operation unit 503 is composed of hard keys, a remote control, or the like, and is a general user interface for starting the recording/reproducing apparatus 30, setting the mode, and the like.

入力Ｉ/Ｆ５０５は、通信端末１０から転送される画像データと音声データを入力する際に使用されるインターフェースである。I/Fとしては専用のもの、あるいは一般的なHDMI（登録商標）やDisplayPortで実現可能である。 The input I/F 505 is an interface used when inputting image data and audio data transferred from the communication terminal 10 . The I/F can be a dedicated one, or a general HDMI (registered trademark) or DisplayPort.

ＣＯＤＥＣ５０６は、入力Ｉ/Ｆ５０５で入力された画像データのフレーム（映像ストリームデータ）のエンコード/デコード処理を行うため、H.264/265等のCODEC回路あるいはソフトウェアで構成される。上記エンコード処理で符号化されたデータは暗号化回路５０８で暗号化されてからストレージ装置５０９（HDD、SSD、SDメモリカード、等）に格納される。ストレージ装置５０９に格納されたデータが合成画像の動画ファイルそのものである。 A CODEC 506 is configured by a CODEC circuit such as H.264/265 or software in order to encode/decode a frame of image data (video stream data) input from the input I/F 505 . The data encoded by the encoding process is encrypted by the encryption circuit 508 and then stored in the storage device 509 (HDD, SSD, SD memory card, etc.). The data stored in the storage device 509 is the composite image moving image file itself.

録画再生回路５０７は、録画データを再利用しやすくするために、本実施形態で説明される一連の画像処理を行う。 A recording/playback circuit 507 performs a series of image processing described in this embodiment in order to facilitate reuse of recorded data.

出力Ｉ/Ｆ５１０は、通信端末１０から転送された画像データをディスプレイに出力する。あるいは、出力Ｉ/Ｆは、合成画像データから再生した合成画像をディスプレイに出力する。I/Fの規格としては、HDMI（登録商標）やDisplayPort（登録商標）等がある。画像録画データには音声データも含まれる。 The output I/F 510 outputs the image data transferred from the communication terminal 10 to the display. Alternatively, the output I/F outputs a synthesized image reproduced from the synthesized image data to the display. I/F standards include HDMI (registered trademark) and DisplayPort (registered trademark). Audio data is also included in the image recording data.

ＮＩＣ５１１は、LAN（Local Area Network）等のネットワークを介してインターネットに接続でき、外部サーバーやＮＡＳに合成画像データを転送する。 The NIC 511 can be connected to the Internet via a network such as a LAN (Local Area Network), and transfers composite image data to an external server or NAS.

＜機能について＞
図５は、通信端末１０と録画再生装置３０が有する機能をブロックに分けて説明する機能ブロック図の一例である。 <About functions>
FIG. 5 is an example of a functional block diagram for explaining the functions of the communication terminal 10 and the recording/playback device 30 by dividing them into blocks.

＜＜通信端末＞＞
通信端末１０は、カメラ画像取得部１１、音声取得部１２、通信部１３、合成画像送信部１４、移動部１５、ページ切替部１６、画像合成部１７、及び、操作受付部１８を有する。通信端末１０が有するこれらの機能は図３に示した通信端末１０のハードウェア回路で実現されるが、ＣＰＵがプログラムで実行することで実現されてもよい。また、図５の機能は通信端末１０が有する主要な機能を示したに過ぎず、図示する他に機能を有していてよい。 <<communication terminal>>
The communication terminal 10 has a camera image acquisition section 11 , a voice acquisition section 12 , a communication section 13 , a composite image transmission section 14 , a movement section 15 , a page switching section 16 , an image synthesis section 17 and an operation reception section 18 . These functions of the communication terminal 10 are realized by the hardware circuit of the communication terminal 10 shown in FIG. 3, but may be realized by the CPU executing a program. Also, the functions in FIG. 5 merely show the main functions that the communication terminal 10 has, and the communication terminal 10 may have functions other than those shown.

カメラ画像取得部１１は、カメラモジュール３０１が撮像したカメラ画像をカメラモジュール３０１からリアルタイムに取得する。音声取得部１２は、マイクアレイ３０４が集音した音声をＰＣＭ変換して音声データを生成する。 The camera image acquisition unit 11 acquires the camera image captured by the camera module 301 from the camera module 301 in real time. The voice acquisition unit 12 converts the voice collected by the microphone array 304 into PCM to generate voice data.

通信部１３は、通信管理システムを介してカメラ画像を他の拠点の通信端末１０に送信し、また、他の拠点の通信端末１０から通信管理システムを介してカメラ画像を受信する。通信部１３は、通信管理システム５０を介して音声データを他の拠点の通信端末１０に送信し、また、他の拠点の通信端末１０から通信管理システム５０を介して音声データを受信する。通信部１３は、他の拠点の通信端末１０が資料画像を共有する場合は、通信管理システム５０を介して資料画像を受信する。自拠点の通信端末１０が資料画像を共有する場合は、通信部１３は通信管理システム５０を介して資料画像を他の拠点の通信端末１０に送信する。 The communication unit 13 transmits camera images to the communication terminals 10 at other bases via the communication management system, and receives camera images from the communication terminals 10 at other bases via the communication management system. The communication unit 13 transmits voice data to the communication terminals 10 at other sites via the communication management system 50 and receives voice data from the communication terminals 10 at other sites via the communication management system 50 . The communication unit 13 receives the document image via the communication management system 50 when the communication terminals 10 at other bases share the document image. When the communication terminal 10 of its own site shares the document image, the communication unit 13 transmits the document image to the communication terminal 10 of another site via the communication management system 50 .

移動部１５は、上記Ａの方法の画像処理に関し、資料画像に対しカメラ画像をゆっくりと移動させることを継続的に行う。 Regarding the image processing of method A, the moving unit 15 continuously moves the camera image slowly with respect to the document image.

画像合成部１７は、ユーザー操作に応じて又は自動的に、自拠点のカメラ画像、他拠点のカメラ画像、及び、資料画像を配置して１フレームの合成画像を生成する。フレームとは動画における個々の静止画である。ユーザーはこれらのカメラ画像や資料画像を縮小してタイル状に配置することも、資料画像にカメラ画像を埋め込んで（又はその逆に）配置することも可能である。本実施形態では、資料画像にカメラ画像が埋め込まれて配置される場合を説明する。 The image synthesizing unit 17 arranges the camera image of the own base, the camera images of the other bases, and the document image to generate a synthesized image of one frame according to the user's operation or automatically. A frame is an individual still image in a moving image. The user can reduce these camera images and material images and arrange them in tiles, or embed the camera images in the material images (or vice versa). In this embodiment, a case where a camera image is embedded in a document image and arranged will be described.

また、画像合成部１７は、合成画像におけるカメラ画像の位置情報・サイズと拠点の識別情報をメタデータなどでフレームに添付する。 In addition, the image synthesizing unit 17 attaches the position information/size of the camera image in the synthesized image and the identification information of the base to the frame as metadata or the like.

ページ切替部１６は、上記Ｂの画像処理に関し、資料画像の切り替えを検出して、資料画像を上下方向又は左右方向にスライドさせて切り替え前の資料画像から切り替え後の資料画像に切り替える。合成画像の各フレームには切り替わる途中の資料画像がスライドしながら記録される。補足すると、資料画像の切り替え自体はユーザー操作で行われる。ページ切替部１６はユーザー操作による資料画像の切り替えを検出して、Ｂの処理のために切り替え後の資料画像を用意する（切り替え前の資料画像はすでに取得済み）。ページ切替部１６は用意した切り替え後の資料画像と切り替え前の資料画像をスライドさせながら切り替える。 Regarding the image processing of B above, the page switching unit 16 detects switching of the document image and slides the document image vertically or horizontally to switch from the document image before switching to the document image after switching. In each frame of the composite image, the material image in the middle of switching is recorded while sliding. Supplementally, the switching of the material image itself is performed by the user's operation. The page switching unit 16 detects the switching of the material image by the user's operation, and prepares the material image after switching for the processing of B (the material image before switching has already been obtained). The page switching unit 16 switches the prepared document image after switching and the document image before switching while sliding.
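
The sliding transition performed by the page switching unit 16 can be illustrated with a short sketch. It assumes a vertical slide, pages represented as lists of pixel rows of equal height, and a hypothetical function name `slide_frames`; the actual number of steps and slide direction are not specified here.

```python
def slide_frames(old_page, new_page, steps):
    """Return the intermediate frames of a vertical slide from old to new.

    At each step the old page moves up by `shift` rows and the top `shift`
    rows of the new page enter from the bottom.
    """
    height = len(old_page)
    frames = []
    for i in range(1, steps + 1):
        shift = height * i // steps
        rows = old_page[shift:] + new_page[:shift]
        frames.append([row[:] for row in rows])  # copy rows for each frame
    return frames
```

During such a slide, every row of both pages passes through screen positions that the camera images do not cover, which is what makes later reconstruction of the full page possible.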

なお、移動部１５とページ切替部１６は画像処理部２１として機能する。移動部１５とページ切替部１６はいずれか一方が動作してもよいし、両方が動作してもよい。 Note that the moving section 15 and the page switching section 16 function as an image processing section 21 . Either one of the moving unit 15 and the page switching unit 16 may operate, or both may operate.

画像合成部１７は、ページ切替部１６が生成する、スライドしながら記録された資料画像に、自拠点のカメラ画像及び他拠点のカメラ画像を配置して１フレームの合成画像を生成する。 The image synthesizing unit 17 arranges the camera image of the own base and the camera images of the other bases in the document image generated by the page switching unit 16 and recorded while sliding, thereby generating a synthesized image of one frame.

合成画像送信部１４は、合成画像を動画のように繰り返し録画再生装置３０に送信する。動画の場合、１秒間に例えば３０フレーム以上のフレームが次々に送信される。なお、合成画像送信部１４は合成画像をリアルタイムに送信するほか、（会議中で録画再生装置３０がディスプレイ出力をしていない場合は）一定量、蓄積してから送信してもよい。 The composite image transmission unit 14 repeatedly transmits the composite image to the recording/playback device 30 like a moving image. In the case of moving images, for example, 30 or more frames are transmitted one after another per second. In addition to transmitting the composite image in real time, the composite image transmission unit 14 may accumulate a certain amount of the composite image (when the recording/playback device 30 is not outputting the display during the meeting) and then transmit the composite image.

操作受付部１８は、通信端末１０に対するユーザーの操作を受け付ける。 The operation accepting unit 18 accepts a user's operation on the communication terminal 10 .

＜＜録画再生装置＞＞
録画再生装置３０は、受信部３１、動画ファイル作成部３２、第一資料画像構築部３３、操作受付部３４、切替検出部３５、第二資料画像構築部３６、表示制御部３７、及び、記憶部４９を有している。録画再生装置３０が有するこれらの機能は図４に示したハードウェアによりで実現されるが、ＣＰＵがプログラムで実行することで実現されてもよい。 <<Recording and playback device>>
The recording/playback device 30 includes a receiving unit 31, a moving image file creating unit 32, a first document image constructing unit 33, an operation accepting unit 34, a switching detecting unit 35, a second document image constructing unit 36, a display control unit 37, and a storage unit 49. These functions of the recording/playback device 30 are implemented by the hardware shown in FIG. 4, but may be implemented by the CPU executing a program.

受信部３１は、通信端末１０から合成画像を受信し、合成画像記憶部４１に保存する。このように、合成画像記憶部４１には資料画像にカメラ画像が埋め込まれている合成画像が保存される。 The receiving unit 31 receives the composite image from the communication terminal 10 and stores it in the composite image storage unit 41 . In this manner, the composite image storage unit 41 stores a composite image in which the camera image is embedded in the document image.

動画ファイル作成部３２は、カメラ画像の位置情報・サイズに基づいて合成画像からカメラ画像を取り込み、カメラ画像からなる動画ファイル（第二の動画ファイルの一例）を作成する。合成画像からカメラ画像を取り出すことをトリミングという場合がある。動画ファイル作成部３２は拠点の識別情報に基づいて同じ拠点のカメラ画像を時系列にカメラ画像記憶部４２に保存する。カメラ画像は合成画像の各フレームから取得できるので、合成画像と同じfps（frame per second）のフレームがカメラ画像ごと（拠点ごと）に保存される。 The moving image file creating unit 32 extracts the camera images from the composite image based on the position information and size of each camera image, and creates a moving image file (an example of the second moving image file) composed of the camera images. Extracting a camera image from a composite image is sometimes called trimming. Based on the identification information of the site, the moving image file creating unit 32 stores the camera images of the same site in time series in the camera image storage unit 42. Since a camera image can be obtained from each frame of the composite image, frames are saved for each camera image (each site) at the same fps (frames per second) as the composite image.
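
The trimming step can be sketched as follows, under the assumption that the metadata attached to each frame carries a site identifier plus the position and size of each camera image. The `CameraRegion` type and the function names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class CameraRegion:
    """Assumed per-frame metadata: which site, and where its image sits."""
    site_id: str
    x: int
    y: int
    width: int
    height: int


def trim_camera_images(frame, regions):
    """Crop every camera image out of one composite frame, keyed by site."""
    return {
        r.site_id: [row[r.x:r.x + r.width] for row in frame[r.y:r.y + r.height]]
        for r in regions
    }


def split_by_site(frames_with_meta):
    """Build one time-ordered list of cropped frames per site."""
    per_site = {}
    for frame, regions in frames_with_meta:
        for site_id, crop in trim_camera_images(frame, regions).items():
            per_site.setdefault(site_id, []).append(crop)
    return per_site
```

Each composite frame contributes exactly one cropped frame per site, so each per-site sequence has the same frame count, and hence the same fps, as the composite recording.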

第一資料画像構築部３３は、上記Ｃの処理として、撮像時刻が異なる合成画像（カメラ画像が切り取られた合成画像を中間画像という）にＯＲ演算を行い、資料画像を構築する。第一資料画像構築部３３は、構築した資料画像を資料画像記憶部４３に保存する。この資料画像は死角がなくなるまでカメラ画像が移動するのに必要な時間ごとに構築される。 As the processing of C above, the first document image constructing unit 33 performs an OR operation on composite images captured at different times (a composite image from which the camera images have been cut out is referred to as an intermediate image) to construct a document image. The first document image constructing unit 33 stores the constructed document image in the document image storage unit 43. One document image is constructed for each interval that the camera images need to move far enough that no blind spot remains.
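
A minimal sketch of the OR-based construction is given below. It assumes that cut-out pixels in an intermediate image are set to 0, that the document image itself does not change between the frames being merged, and that camera regions are given as `(x, y, width, height)` tuples. The function names are hypothetical.

```python
def cut_out(frame, regions):
    """Return an intermediate image: a copy with the camera regions zeroed."""
    out = [row[:] for row in frame]
    for (x, y, w, h) in regions:
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                out[yy][xx] = 0
    return out


def or_merge(intermediates):
    """Bitwise-OR the intermediate images pixel by pixel."""
    merged = [row[:] for row in intermediates[0]]
    for img in intermediates[1:]:
        for y, row in enumerate(img):
            for x, value in enumerate(row):
                merged[y][x] |= value
    return merged
```

Where a pixel is cut out in one frame, the OR takes its value from a frame in which it was visible; where it is visible in several frames, `v | v` leaves it unchanged. A pixel that is hidden in every merged frame stays 0, which is why the camera images must move until no blind spot remains.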

切替検出部３５は、上記Ｄの処理として、資料画像そのもの又は資料画像のページ（以下、区別せずにページの切り替わりという）が切り替わったことを検出する。切替検出部３５は、合成画像のカメラ画像以外の領域を、撮像時刻が異なるフレームごとに比較する。差異が一定以上の場合、切替検出部３５は、資料画像のページが切り替わったことを検出する。あるいは、切替検出部３５は、通信端末１０から送信されたページを切り替える操作信号により資料画像のページの切り替わりを検出する。 As the processing of D above, the switching detection unit 35 detects that the document image itself or the page of the document image (hereinafter referred to as page switching without distinction) has been switched. The switching detection unit 35 compares the areas other than the camera image of the composite image for each frame having different imaging times. If the difference is equal to or greater than a certain value, the switching detection unit 35 detects that the page of the document image has been switched. Alternatively, the switching detection unit 35 detects page switching of the document image based on a page switching operation signal transmitted from the communication terminal 10 .
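
The frame comparison used by the switching detection unit 35 can be sketched as follows. The sketch assumes frames given as `(timestamp, frame, camera_regions)` tuples with regions as `(x, y, width, height)`, and uses the fraction of changed pixels as the difference measure. The threshold value and function names are hypothetical.

```python
def in_any_region(x, y, regions):
    """True if pixel (x, y) lies inside one of the camera regions."""
    return any(rx <= x < rx + w and ry <= y < ry + h for (rx, ry, w, h) in regions)


def changed_ratio(prev, curr, prev_regions, curr_regions):
    """Fraction of differing pixels, ignoring pixels under a camera image."""
    compared = changed = 0
    for y, (prow, crow) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prow, crow)):
            if in_any_region(x, y, prev_regions) or in_any_region(x, y, curr_regions):
                continue
            compared += 1
            changed += p != c
    return changed / compared if compared else 0.0


def detect_page_switches(frames, threshold=0.3):
    """Return the timestamps at which the document area changed enough."""
    switches = []
    for (_, prev, pregs), (t, curr, cregs) in zip(frames, frames[1:]):
        if changed_ratio(prev, curr, pregs, cregs) >= threshold:
            switches.append(t)
    return switches
```

Excluding pixels covered by a camera image in either frame keeps the movement of the camera images from being mistaken for a page switch.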

第二資料画像構築部３６は、上記Ｄの処理として、ページが切り替わったことが検出されると、カメラ画像が重なっていない部分の資料画像を結合して、資料画像を構築する。第二資料画像構築部３６は構築した資料画像を資料画像記憶部４３に保存する。この資料画像はページの切り替わりごとに構築される。 As the processing of D above, the second document image constructing unit 36 constructs a document image by combining the document images of the portions where the camera images do not overlap when it is detected that the page has been switched. The second document image constructing unit 36 stores the constructed document image in the document image storage unit 43 . This material image is constructed each time the page is switched.
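
One way to combine the portions not covered by camera images is sketched below. It rests on several assumptions that are not stated above: the page slides vertically, the number of rows by which the page has slid upward (`offset`) is known for each frame, and camera regions are `(x, y, width, height)` tuples in screen coordinates. The function name `rebuild_page` is hypothetical.

```python
def rebuild_page(frames, height, width):
    """Combine the uncovered parts of several frames into one page image.

    frames: list of (frame, offset, camera_regions), where screen row `sy`
    shows page row `sy + offset`. Pixels never seen uncovered stay None.
    """
    page = [[None] * width for _ in range(height)]
    for frame, offset, regions in frames:
        for sy, row in enumerate(frame):
            page_y = sy + offset
            if not 0 <= page_y < height:
                continue  # this screen row shows a different page
            for x, value in enumerate(row):
                covered = any(
                    rx <= x < rx + w and ry <= sy < ry + h
                    for (rx, ry, w, h) in regions
                )
                if not covered and page[page_y][x] is None:
                    page[page_y][x] = value
    return page
```

Any `None` left in the result marks a pixel that was hidden in every frame supplied, which can be used to check whether the reconstruction is complete.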

なお、第一資料画像構築部３３と第二資料画像構築部３６はいずれか一方が処理してもよいし、両方が処理してもよい。両方が処理する場合、同じ資料画像が作成される場合があるが、同じ画像は削除してもよい。 Either one of the first document image constructing section 33 and the second document image constructing section 36 may perform the processing, or both of them may perform the processing. If both process, the same document image may be created, but the same image may be deleted.

操作受付部３４は、録画再生装置３０に対する再生操作や録画操作を受け付ける。表示制御部３７は、合成画像、カメラ画像、又は、資料画像をディスプレイ２０に表示する。 The operation accepting unit 34 accepts playback operations and recording operations for the recording/playback device 30 . The display control unit 37 displays the synthesized image, camera image, or document image on the display 20 .

図６は、合成画像記憶部４１に記憶される情報を説明する図である。合成画像記憶部４１では、録画開始から終了までが１つの動画ファイルで保存される。図６では、録画ＩＤ、撮像開始時刻、撮像終了時刻、拠点数、サイズ、録画時間、ファイル名の各項目が動画ファイルごとに管理されている。
・録画ＩＤは動画ファイルを識別する識別情報であり、カメラ画像や資料画像を合成画像と対応付ける情報である。
・撮像開始時刻は録画の開始時刻である。
・撮像終了時刻は録画の終了時刻である。
・拠点数は会議に参加した拠点の数であり、通信端末１０から送信される。なお、拠点数には自拠点も含まれる。
・サイズは動画ファイルの容量である。
・録画時間は、開始時刻から終了時刻の経過時間である。
・ファイル名は、ファイルパスと共に保存される合成画像のファイル名である。動画ファイルの形式は制限されない。 FIG. 6 is a diagram for explaining information stored in the composite image storage unit 41. In the composite image storage unit 41, the period from the start to the end of recording is stored as one moving image file. In FIG. 6, the items of recording ID, imaging start time, imaging end time, number of sites, size, recording time, and file name are managed for each moving image file.
- The recording ID is identification information for identifying a moving image file, and is used to associate camera images and document images with the composite image.
- The imaging start time is the recording start time.
- The imaging end time is the recording end time.
- The number of sites is the number of sites participating in the conference, and is transmitted from the communication terminal 10. The number of sites includes the own site.
- The size is the data size of the moving image file.
- The recording time is the elapsed time from the start time to the end time.
- The file name is the file name of the composite image, saved together with its file path. The moving image file format is not limited.
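
The items managed for each moving image file can be pictured as a simple record. The sketch below is illustrative only: the field names and types are assumptions, and the recording time is derived from the start and end times rather than stored separately.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CompositeRecording:
    """One row of the table in FIG. 6 (field names are hypothetical)."""
    recording_id: str
    start: datetime      # imaging start time
    end: datetime        # imaging end time
    site_count: int      # number of sites, including the own site
    size_bytes: int
    file_name: str       # file name including its path

    @property
    def recording_time(self) -> timedelta:
        """Elapsed time from the start time to the end time."""
        return self.end - self.start
```

For example, a recording that starts at 10:00:00 and ends at 10:45:30 on the same day has a recording time of 45 minutes and 30 seconds.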

図７は、カメラ画像記憶部４２に記憶される情報を説明する図である。カメラ画像記憶部４２には、拠点ごとのカメラ画像が１つの動画ファイルで保存される。図７では、録画ＩＤ、撮像開始時刻、撮像終了時刻、拠点ＩＤ、サイズ、録画時間の各項目が動画ファイルごとに管理されている。録画ＩＤ、撮像開始時刻、撮像終了時刻、サイズ、及び録画時間については図６と同様でよい。
・拠点ＩＤは、拠点を識別する識別情報である。録画再生装置３０が拠点名を通信端末１０から取得してもよいし、重複しない番号を採番してもよい。 FIG. 7 is a diagram for explaining information stored in the camera image storage unit 42. In the camera image storage unit 42, the camera images of each site are stored as one moving image file. In FIG. 7, the items of recording ID, imaging start time, imaging end time, site ID, size, and recording time are managed for each moving image file. The recording ID, imaging start time, imaging end time, size, and recording time may be the same as in FIG. 6.
- The site ID is identification information for identifying a site. The recording/playback device 30 may acquire the site name from the communication terminal 10, or may assign a unique number.

なお、図７ではファイル名を省略したが、拠点ごとのカメラ画像が動画ファイルで保存されている。また、各拠点のカメラ画像は１つの動画ファイルに保存されてもよい。 Although the file names are omitted in FIG. 7, the camera images for each location are saved as moving image files. Also, the camera images of each site may be saved in one moving image file.

図８は、資料画像記憶部４３に記憶される情報を説明する図である。図８（ａ）は第一資料画像構築部３３が構築した資料画像の情報であり、図８（ｂ）は第二資料画像構築部３６が構築した資料画像の情報である。 FIG. 8 is a diagram for explaining information stored in the document image storage unit 43. FIG. 8(a) shows the information of the document image constructed by the first document image constructing unit 33, and FIG. 8(b) shows the information of the document image constructed by the second document image constructing unit 36.

図８（ａ）の資料画像記憶部４３では、合成画像の動画ファイルから構築された１つ以上の資料画像が保存される。図８（ａ）では、録画ＩＤ、構築開始時刻、構築終了時刻、資料画像ＩＤ、サイズ、ファイル名の各項目が資料画像ごとに管理されている。
・構築開始時刻は、資料画像の構築の開始を録画の開始時刻を基準として記録した時刻である。構築の開始は、後述する状態Ａの時刻でよい。
・構築終了時刻は、資料画像の構築の終了を録画の開始時刻を基準として記録した時刻である。構築の終了は、後述する状態Ｃの時刻でよい。
・資料画像ＩＤは、資料画像を識別する識別情報である。
・サイズは資料画像の容量である。
・ファイル名は、資料画像のファイル名である。図８（ａ）では資料画像が静止画であるが、動画として保存されてもよい。 In the document image storage unit 43 shown in FIG. 8A, one or more document images constructed from moving image files of composite images are stored. In FIG. 8A, the items of recording ID, construction start time, construction end time, material image ID, size, and file name are managed for each material image.
- The construction start time is the time at which construction of the document image started, recorded relative to the recording start time. The start of construction may be the time of state A, which will be described later.
- The construction end time is the time at which construction of the document image ended, recorded relative to the recording start time. The end of construction may be the time of state C, which will be described later.
- The document image ID is identification information for identifying the document image.
- The size is the data size of the document image.
- The file name is the file name of the document image. Although the document image is a still image in FIG. 8(a), it may be saved as a moving image.

図８（ｂ）の資料画像記憶部４３では、録画ＩＤ、ページ切り替え時刻、資料画像ＩＤ、サイズ、ファイル名の各項目が資料画像ごとに管理されている。録画ＩＤ、資料画像ＩＤ、サイズ、及びファイル名の各項目は、図８（ａ）と同様でよい。
・ページ切り替え時刻は、ページの切り替わりが検出された時刻を、録画の開始時刻を基準として記録した時刻である。 In the document image storage unit 43 shown in FIG. 8B, items such as recording ID, page switching time, document image ID, size, and file name are managed for each document image. The items of recording ID, material image ID, size, and file name may be the same as those in FIG. 8(a).
- The page switching time is the time at which the page switch was detected, recorded relative to the recording start time.

＜従来の合成画像の方法＞
図９は、通信端末１０が作成する従来の合成画像の一例である。図９の合成画像は以下のような構成である。
・画面全体には協議中の資料画像２０１が表示されている。細かなテキストも含まれるため、できるだけ大きく表示されることが望ましい。
・画面左下はある拠点の参加者全員が撮像されているパノラマ画像２０２（カメラ画像）である。
・画面中央下は、在宅勤務等のため一人でリモート参加している個人ごとの個別画像２０３（カメラ画像）である。そのうちの１つが自拠点のカメラが撮像したカメラ画像である。
・画面右下は、発話中の人がやや大きめに表示されている話者カメラ画像２０４である。 <Conventional synthetic image method>
FIG. 9 is an example of a conventional composite image created by the communication terminal 10. The composite image in FIG. 9 has the following configuration.
- The document image 201 under discussion is displayed over the entire screen. Since it contains fine text, it is desirable to display it as large as possible.
- The bottom left of the screen is a panorama image 202 (camera image) in which all participants at a certain site are captured.
- The bottom center of the screen shows individual images 203 (camera images), one for each person who participates remotely alone, for example because of working from home. One of them is the camera image captured by the camera at the own site.
- The bottom right of the screen is the speaker camera image 204, in which the person who is speaking is displayed at a slightly larger size.

図９の合成画像をそのまま録画再生装置３０が録画すると、会議に参加できなかった人がこの再生した場合に以下のような問題があった。
・資料の死角が発生することが避けられない。
・資料画像を小さく表示すれば死角はなくなるかもしれないが、資料等はテキストが含まれるケースが多いため、縮小表示ではその内容を読み取りにくくなり、会議の進行に支障をきたす。
・死角の領域が見える時間帯があるのかもしれないが、見える時間帯がわからないとユーザーが再生時に全時間の映像を見る必要があるかもしれない。その場合、時間節約のために端折って再生映像を見るということがやりにくい方法があるが、結局、最後まで見えなかったというケースもあり得る。 If the recording/reproducing apparatus 30 records the synthesized image of FIG. 9 as it is, the following problems arise when a person who could not participate in the conference reproduces the image.
- Blind spots in the document are unavoidable.
- Displaying the document image at a smaller size might eliminate the blind spots, but documents often contain text, so a reduced display makes the content hard to read and hinders the progress of the conference.
- The blind-spot area may be visible during some time period, but if the user does not know when, the user may have to watch the entire recording during playback. In that case it is hard to save time by skipping through the recording, and the user may still reach the end without ever having seen the hidden area.

従って、合成画像は、会議に参加できなかった人にとっては使いづらい録画映像となっていた。 Therefore, the synthesized image is a recorded video that is difficult to use for those who could not participate in the conference.

＜本実施形態の画像処理の概略＞
以下では、合成画像に対する二種類の画像処理を説明する。 <Overview of image processing according to the present embodiment>
Two types of image processing for a composite image are described below.

Ａ．カメラ画像の位置の移動
まず、図１０は、合成画像におけるカメラ画像の位置を移動させる画像処理の概略を説明する図である。 A. Movement of Position of Camera Image First, FIG. 10 is a diagram for explaining an outline of image processing for moving the position of a camera image in a synthesized image.

通信端末１０の移動部１５は、各拠点のカメラ画像を、図１０（ａ）→図１０（ｂ）→図１０（ｃ）のように、ゆっくりと微妙に位置を変えて合成する。図１０（ａ）～（ｃ）によれば、パノラマ画像２０２、在宅リモート参加者の個別画像２０３、及び、発話者映像が移動していることが分かる。移動部１５は、カメラ画像の位置を変えることで全く見えないままの領域を残さないように移動する。例えば、画像合成部１７は、「カメラ画像のＷｉｎｄｏｗの大きさ（サイズ）」と「表示位置座標（位置情報）」を把握しながら、死角を残さないように（サイズよりも大きく）ゆっくりと位置を変えていく。 The moving unit 15 of the communication terminal 10 composites the camera images of each site while slowly and slightly changing their positions, in the order of FIG. 10(a) → FIG. 10(b) → FIG. 10(c). FIGS. 10(a) to 10(c) show that the panorama image 202, the individual images 203 of the remote participants at home, and the speaker video are moving. The moving unit 15 changes the positions of the camera images so that no area remains hidden the whole time. For example, the image synthesizing unit 17 keeps track of the window size (size) of each camera image and its display position coordinates (position information), and slowly shifts the position by more than the window size so that no blind spot remains.
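
The condition that the movement leaves no blind spot can be checked with a small sketch. It assumes, for simplicity, a single camera window that sweeps horizontally along one edge of the frame. The function names and the back-and-forth sweep pattern are hypothetical.

```python
def window_positions(frame_width, window_width, step):
    """Left edges of the window for one sweep to the right edge and back."""
    max_x = frame_width - window_width
    forward = list(range(0, max_x + 1, step))
    if forward[-1] != max_x:
        forward.append(max_x)  # make sure the sweep reaches the far edge
    return forward + forward[-2::-1]


def has_blind_spot(frame_width, window_width, positions):
    """True if some column is covered by the window at every position."""
    for x in range(frame_width):
        if all(p <= x < p + window_width for p in positions):
            return True
    return False
```

With a window 3 pixels wide, positions `[0, 1]` leave columns 1 and 2 permanently hidden, whereas positions `[0, 3]`, a shift by the full window width, uncover every column at least once. This corresponds to moving the camera image by more than its own size.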

録画再生装置３０では、合成画像が圧縮符号化の後に記録される。再生時に（又は録画から再生までの間に）、録画再生装置３０の第一資料画像構築部３３は、Ｃの処理として、カメラ画像の表示位置が変わっていくことを利用して、死角のない資料画像を別途作成する。すでに会議が終了しているので（リアルタイムである必要がないため）、このような操作が可能になる。 In the recording/reproducing device 30, the composite image is recorded after being compression-encoded. At the time of playback (or between recording and playback), the first document image construction unit 33 of the recording/playback device 30 uses the change in the display position of the camera image as the process C to eliminate blind spots. Separately create a material image. Since the meeting has already ended (because it doesn't have to be in real time), such an operation is possible.

また、再生時には、録画再生装置３０は死角のない資料画像のみを表示することもできる。また２つのディスプレイがあれば、資料画像とカメラ画像を別々のディスプレイに同時に表示させることができる。 Also, during playback, the recording/playback device 30 can display only material images without blind spots. If there are two displays, the document image and the camera image can be displayed simultaneously on separate displays.

以上の操作で、再生時の会議内容把握にかかる時間を大幅に削減できることが期待できる。 With the above operations, it is expected that the time required to grasp the content of the conference during playback can be greatly reduced.

<Processing procedure>
FIG. 11 is a flowchart explaining the recording process for a composite image.

A user of the communication terminal 10 starts the communication terminal 10 or the application. At startup, the user performs initial settings for the recording system 60 (S1). The initial settings include, for example, designating the destination site and specifying the layout of the conference screen.

Next, the communication terminal 10 initializes input/output devices such as the camera, microphone, and speaker, sets the operation modes, and starts each device (S2). For example, the communication terminal 10 sets the camera imaging mode, such as the photometric conditions, to suit the conference environment.

Once the main unit is ready after steps S1 and S2, the communication unit 13 requests the communication management system 50 to start communication, and the conference begins (S3). Alternatively, the communication unit 13 may start communication in response to a communication request from the other party. If the communication terminal 10 is to record the composite image, the recording/playback device 30 is also started at this point.

If the composite image is already being recorded, the process moves to step S7; otherwise, the process proceeds to step S5 (S4).

If the recording/playback device 30 is in the READY state, the process proceeds to step S6; otherwise, the process proceeds to step S8 (S5).

The composite image transmission unit 14 of the communication terminal 10 transmits the composite image to the recording/playback device 30, whereupon the recording/playback device 30 starts the recording process (S6).

The composite image transmission unit 14 executes or continues the recording process (S7). When the conference ends, the communication terminal 10 shifts to a standby state (S8). Until the conference ends, the process returns to step S4 and repeats steps S4 to S7.

<<Movement of the position of the camera image>>
FIG. 12 illustrates, as screen display images, how the camera images in the composite image move. If the moving unit 15 moves the camera images as shown in FIG. 12, blind spots in the document image can be eliminated.
1. During the transition from state A to state B, the panorama image 202 and the speaker camera image 204 move as shown in the figure (upward).
2. During the transition from state B to state C, the individual image 203 and the speaker camera image 204 move as shown in the figure (the individual image 203 to the left, the speaker camera image 204 upward).
3. During the transition from state C to state D, the individual image 203 and the speaker camera image 204 move as shown in the figure (the individual image 203 to the right, the speaker camera image 204 downward). That is, each camera image heads back toward its original position.
4. During the transition from state D to state A, the panorama image 202 and the speaker camera image 204 move as shown in the figure (downward, heading back to their initial positions).
The states transition as follows:
State A → state B → state C → state D → state A → state B → state C → state D → ... (repeated thereafter)
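
The cyclic transition above can be pictured as a small state machine. The following is only an illustrative sketch; the names `STATES`, `next_state`, and `state_sequence` are hypothetical and do not appear in the specification.

```python
from itertools import cycle, islice

# Hypothetical names; the specification only defines the order A -> B -> C -> D -> A.
STATES = ("A", "B", "C", "D")


def next_state(state: str) -> str:
    """Return the state that follows `state` in the repeating cycle."""
    return STATES[(STATES.index(state) + 1) % len(STATES)]


def state_sequence(count: int) -> list:
    """Return the first `count` states of the cycle, starting from state A."""
    return list(islice(cycle(STATES), count))
```

Calling `next_state("D")` wraps around to `"A"`, which corresponds to the return leg described in item 4.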

FIG. 13 is a flowchart explaining the process of step S7 in FIG. 11. In FIG. 13, the direction in which the camera images move is determined by the current movement state. As described with reference to FIG. 12, a moving camera image is in one of the three states A to C (the leg from state D to state A is the return path, so it may be omitted as a state), and the camera images are moved as follows.

If the camera images are in the middle of moving gradually from state A toward state B (Yes in S71), the moving unit 15 gradually moves them from state A toward state B (S72). If step S71 is No, the process proceeds to step S73.

If the camera images are in the middle of moving gradually from state B toward state C (Yes in S73), the moving unit 15 gradually moves them from state B toward state C (S74). If step S73 is No, the process proceeds to step S75.

If the camera images are in the middle of moving gradually from state C toward state D (Yes in S75), the moving unit 15 gradually moves them from state C toward state D (S76). If step S75 is No, the process proceeds to step S77, in which the moving unit 15 gradually moves the camera images from state D toward state A (S77).

The moving unit 15 of the communication terminal 10 is therefore always performing one of steps S72, S74, S76, and S77. For example, the time taken to move from state A to state C is preset and may be leisurely, such as 30 seconds or 1 minute (slow enough that the user is not distracted by the movement of the camera images).
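
One way to picture the gradual movement between two states is linear interpolation of a camera window's position over the preset time. The following is a minimal sketch under that assumption; the specification does not state how intermediate positions are computed, and the function name and coordinates are illustrative only.

```python
def interpolate_position(start, end, elapsed, duration):
    """Linearly interpolate an (x, y) window position between two states.

    `start` and `end` are the window's top-left corners in the two states,
    `elapsed` is the time since the transition began, and `duration` is the
    preset transition time (for example 30 seconds). All names are assumptions.
    """
    # Clamp progress to [0, 1] so the window never overshoots the target state.
    t = min(max(elapsed / duration, 0.0), 1.0)
    return (
        round(start[0] + (end[0] - start[0]) * t),
        round(start[1] + (end[1] - start[1]) * t),
    )
```

A long `duration` keeps the per-frame displacement small, which matches the stated goal that the user should not be distracted by the movement.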

FIG. 14 shows an example of the metadata attached to each frame of the composite image. A frame ID, region ID, position information, size, and state are attached to each frame.
- The frame ID is, for example, identification information of the frame. It preferably includes the imaging time or the elapsed time since imaging started.
- The site ID is identification information that identifies a site.
- The position information is the coordinates of a camera image within the composite image. It is a value controlled by the moving unit 15.
- The size is the width and height of a camera image. It may be set by the user or fixed; either way, it is known.

The site ID, position information, and size are included once for each camera image.
- The state stores state A to D when the arrangement of the camera images corresponds to A to D above. It is a value controlled by the moving unit 15.
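
The per-frame metadata described above could be modeled roughly as follows. This is a sketch only: FIG. 14 does not prescribe a data format, and every class and field name here (`CameraRegion`, `FrameMetadata`, `elapsed_seconds`, and so on) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CameraRegion:
    """One camera image inside the composite frame (hypothetical layout)."""
    site_id: str   # identifies the site the camera image came from
    x: int         # left edge within the composite image
    y: int         # top edge within the composite image
    width: int
    height: int


@dataclass(frozen=True)
class FrameMetadata:
    """Metadata attached to one frame of the composite image."""
    frame_id: int
    elapsed_seconds: float            # time since imaging started
    state: Optional[str]              # "A" to "D" when the layout matches, else None
    regions: Tuple[CameraRegion, ...]  # one entry per camera image
```

Keeping one `CameraRegion` per camera image mirrors the statement that the site ID, position information, and size are repeated for each camera image.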

<C. Construction of the Document Image in the Recording/Playback Device>
FIG. 15 is a diagram explaining how the recording/playback device 30 reproduces a document image. The first document image construction unit 33 of the recording/playback device 30 uses the recorded composite image to construct a document image with no blind spots.

The first document image construction unit 33 acquires a frame for each of state A, state B, and state C from the image data of the composite image. Whether a frame is in state A, B, or C may be determined by the recording/playback device 30 from the positions of the camera images, or by the image synthesizing unit 17 from the metadata of FIG. 14.

The first document image construction unit 33 replaces the regions where camera images were displayed with black pixels (0 on a 256-level scale) to create intermediate images 250 (FIGS. 15(a) to 15(c)). An intermediate image 250 may also be the composite image that remains after the video file creation unit 32 has trimmed out the camera images. The first document image construction unit 33 may use white pixels instead. The positions and sizes of the camera images in states A, B, and C can be obtained from the metadata.

By combining the intermediate images 250 of states A', B', and C' with an OR operation at each pixel position, the first document image construction unit 33 can construct a document image 251 with no blind spots (FIG. 15(d)). In this OR operation, for example, any pixel value of 0 among the three intermediate images 250 is discarded at each pixel position, and the average of the non-zero values (or the value from any one of the images) is adopted.
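
The per-pixel "OR" combination described above can be sketched as follows, assuming grayscale images stored as nested lists and a mask value of 0. The function name `merge_intermediate_images` is hypothetical. Note that in this simplified sketch a pixel that is genuinely black in the document is also treated as masked.

```python
def merge_intermediate_images(images):
    """Combine same-sized grayscale images, ignoring masked (0) pixels.

    `images` is a list of 2D lists of pixel values in 0..255, where 0 marks
    a region that a camera image used to cover.
    """
    height, width = len(images[0]), len(images[0][0])
    merged = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Discard masked values and average whatever remains visible.
            visible = [img[y][x] for img in images if img[y][x] != 0]
            if visible:
                merged[y][x] = sum(visible) // len(visible)
    return merged
```

A pixel stays 0 in the result only if it is masked in every intermediate image, which is exactly the blind-spot case that the camera movement is meant to avoid.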

Because the document image may have been switched between states A and C (that is, while the camera images were changing position), the first document image construction unit 33 should divide the camera-free region of each intermediate image 250 into blocks and compare corresponding blocks across the three intermediate images 250 to detect a switch. If the document image has been switched, the first document image construction unit 33 does not construct a document image. Alternatively, if two of the images show the same, unswitched document image, a document image may be constructed from them (in which case it may not be possible to construct the entire document image).
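
A block-wise comparison of two intermediate images might look like the sketch below. The block size, the tolerance, and the use of a mean absolute difference are all assumptions for illustration; the specification only says that blocks are compared.

```python
def document_switched(img_a, img_b, block_size=8, tolerance=8):
    """Return True if any fully visible block differs between two images.

    Images are 2D lists of grayscale values where 0 marks a masked camera
    region. Blocks touching a masked pixel in either image are skipped.
    """
    height, width = len(img_a), len(img_a[0])
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            pairs = [
                (img_a[y][x], img_b[y][x])
                for y in range(top, min(top + block_size, height))
                for x in range(left, min(left + block_size, width))
            ]
            # Only compare blocks that are camera-free in both images.
            if any(a == 0 or b == 0 for a, b in pairs):
                continue
            mean_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)
            if mean_diff > tolerance:
                return True
    return False
```

A non-zero tolerance is used here because the recorded frames are compression-encoded, so identical pages are unlikely to match pixel for pixel.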

The description above uses three composite images, for states A to C, but any number of composite images that eliminates the blind spots will do; two, or four or more, may be used.

<Creation of the document image and video files by process C>
FIG. 16 is a flowchart explaining the process by which the recording/playback device 30 saves the camera images (video files) and the document image. The process of FIG. 16 can be executed once the video file of the composite image has been saved in the composite image storage unit 41. It may also be started when the user begins an operation on the recording/playback device 30 to play back the composite image (for ordinary storage, the data can remain in composite-image form).

First, the video file creation unit 32 acquires the composite image from the composite image storage unit 41 (S11). The composite image to be acquired may be a video file specified by the user, or a video file for which the camera-image video files or document images have not yet been constructed.

The video file creation unit 32 cuts the camera images out of each frame based on the position information and size of the camera images contained in the frame's metadata. The video file creation unit 32 saves these, site by site, as video files in the camera image storage unit 42 (S12).

Next, the first document image construction unit 33 identifies the frames whose state is A to C in the video file of the composite image (S13). Here, the intermediate images are assumed to have already been created by the video file creation unit 32.

Frames in states A to C appear at roughly regular intervals in the time series. The first document image construction unit 33 treats the frames of states A to C as one set and combines them with the OR operation to construct a document image (S14).

The first document image construction unit 33 saves the document image in the document image storage unit 43 in association with a recording ID, construction start time, construction end time, document image ID, size, and file name (S15). The construction start time is the imaging time of the state A frame, and the construction end time is the imaging time of the state C frame.
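
The cropping in step S12 and the creation of an intermediate image can be sketched with two small helpers, assuming a frame is a 2D list of pixel values and that the rectangle comes from the frame metadata. The function names are hypothetical.

```python
def crop_region(frame, x, y, width, height):
    """Return the camera image occupying the given rectangle of the frame."""
    return [row[x:x + width] for row in frame[y:y + height]]


def mask_region(frame, x, y, width, height, fill=0):
    """Return a copy of the frame with the rectangle replaced by `fill`.

    With fill=0 this yields an intermediate image in which the camera
    region is black; the input frame is left unchanged.
    """
    masked = [list(row) for row in frame]
    for row in masked[y:y + height]:
        row[x:x + width] = [fill] * len(row[x:x + width])
    return masked
```

Cropping produces the per-site camera images, while masking leaves the remainder of the frame for the document image construction in steps S13 and S14.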

<B. Vertical Slide of the Document Image>
Next, the generation of a composite image by "B. Vertical slide of the document image" and the construction of the document image from the recording are described.

FIG. 17 is a diagram outlining the image processing that creates a composite image while sliding the document image upward.

The page switching unit 16 of the communication terminal 10 transitions to the next document image (the next page) while moving the document image upward, in the order FIG. 17(a) → FIG. 17(b) → FIG. 17(c). The slide speed is set slow enough for the recording/playback device 30 to capture the document image. In FIG. 17 the camera images are at the bottom of the document image, so the page switching unit 16 slides the document image upward; if the camera images are at the side of the document image, the page switching unit 16 slides the document image left or right. If the camera images are at the top of the document image, the page switching unit 16 slides the document image downward. In this way, the page switching unit 16 determines the slide direction in consideration of the camera image layout.

In FIG. 17(a) → FIG. 17(b) → FIG. 17(c), document image A (before the switch) and document image B (after the switch) gradually slide upward.

The details are described with reference to FIG. 18. FIG. 18 is a diagram explaining the vertical slide of the document image. First, as shown in FIG. 18(a), the page switching unit 16 detects that the document image has been switched. For example, the page switching unit 16 can compare the document images sent from another site at regular intervals and detect the switch. It may also detect the switch from an operation signal indicating that the page has been changed. The page switching unit 16 prepares document image B, the image after the switch.

When a switch of the document image is detected, the page switching unit 16 joins the two document images A and B vertically and, as shown in FIG. 18(b) → FIG. 18(c) → FIG. 18(d), cuts out the portion that fits in one frame 220 as the document image. The page switching unit 16 gradually moves the cut-out range from the top of the two joined document images toward the bottom. For example, in FIG. 18(b) the whole of document image A is placed in one frame 220; in FIG. 18(c) the lower half of document image A and the upper half of document image B are placed in one frame 220; and in FIG. 18(d) the whole of document image B is placed in one frame 220. The image synthesizing unit 17 places the camera images on such frames to create the composite image.
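
The cut-out from the vertically joined pages can be sketched as a sliding window over rows. This sketch assumes that both pages have the same height as one frame and represents images as lists of rows; `slide_window` and `offset` are hypothetical names.

```python
def slide_window(page_a, page_b, offset):
    """Return the frame-sized slice of page A stacked on top of page B.

    `offset` is the number of rows already scrolled; 0 shows all of page A
    and len(page_a) shows all of page B.
    """
    frame_height = len(page_a)
    stacked = page_a + page_b  # page before the switch on top, page after below
    offset = min(max(offset, 0), frame_height)
    return stacked[offset:offset + frame_height]
```

With an offset of half the page height, the slice holds the lower half of page A above the upper half of page B, which corresponds to FIG. 18(c).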

The recording/playback device 30 then records the composite image described with reference to FIGS. 17 and 18 after compression encoding. This composite image is a video in which the document image gradually changes while sliding vertically. Camera images are also embedded in the document image. When the document image is played back, the second document image construction unit 36 of the recording/playback device 30 performs process D: it exploits the fact that the document image is displayed while sliding to separately create a document image with no blind spots.

That is, the second document image construction unit 36 constructs the document image by joining the portions of the document image that are not overlapped by camera images. If the region containing camera images extends to a height L from the bottom edge, the region above L is what should be cut out. For example, the second document image construction unit 36 cuts out the upper half 221 (part of the frame) of the screen (frame) in FIG. 18(b) and the upper half 222 (part of the frame) of the screen (frame) in FIG. 18(c). The second document image construction unit 36 then joins the two cut-out images to construct the document image. If the time taken to switch from FIG. 18(b) to FIG. 18(d) is T seconds, the second document image construction unit 36 cuts out the upper halves 221 and 222 from the frame immediately before the switch is detected and from the frame T/2 seconds after the switch is detected, respectively.

The image processing of FIGS. 17 and 18 is suited to cases where there are many conference participants (because many sites are participating), where moving the camera images as in method A is difficult, or where page changes in the materials occur at fairly short intervals.

<Processing procedure>
FIG. 19 is a flowchart explaining the process of step S7 in FIG. 11 when the document image is slid upward.

The page switching unit 16 determines whether the document image (page) has been switched, based on the document image sent from another site and the document image currently displayed (S21). When the communication terminal 10 at the own site is displaying the document image, the page switching unit 16 can detect the page switch from the user's operation.

The page switching unit 16 also determines the slide direction based on the position of the camera images in the document image (S22). Here, the page switching unit 16 is assumed to have determined that the slide is upward.

Next, the page switching unit 16 joins the two document images vertically, with the document image before the switch on top and the document image after the switch below (S23).

The page switching unit 16 determines the slide speed by dividing the height of the upper document image by a time T (S24). The time T is preset. The time T should preferably be as short as possible within the range in which the capture process can keep up (for example, 10 seconds), so that the document image is displayed without the user falling behind the discussion in the meeting.

For each frame of the composite image, the page switching unit 16 determines, based on the slide speed, the range to include in the frame, measured from the top edge of the upper document image (S25).

The page switching unit 16 repeats the process of step S24 until the upper document image no longer falls within the frame (S26).
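
Steps S24 to S26 can be sketched as computing a scroll offset for every frame of the transition. The frame rate `fps` is an assumption that the specification does not give, and `slide_offsets` is a hypothetical name.

```python
def slide_offsets(page_height, duration_s, fps):
    """Return the scroll offset in rows for each frame of the slide.

    The slide speed is the upper page height divided by the preset time T
    (step S24); each offset marks where the frame's cut-out range begins,
    measured from the top edge of the upper page (step S25).
    """
    speed = page_height / duration_s  # rows per second
    total_frames = int(duration_s * fps)
    # The final offset equals page_height: the upper page has left the frame (S26).
    return [min(page_height, round(speed * i / fps)) for i in range(total_frames + 1)]
```

For a 100-row page, T = 10 seconds, and 1 frame per second, the offsets advance by 10 rows per frame from 0 to 100.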

<D. Construction of the Document Image in the Recording/Playback Device>
Next, construction of the document image in the recording/playback device 30 is described with reference to FIG. 20. FIG. 20 is a flowchart explaining how the second document image construction unit 36 constructs the document image when the document image is slid vertically.

First, the second document image construction unit 36 acquires the video file of the composite image from the composite image storage unit 41 (S31). The composite image to be acquired may be a video file specified by the user, or a video file for which the camera-image video files or document images have not yet been constructed.

The switching detection unit 35 monitors the frames of the composite image and determines whether the document image has been switched (S32).

If the determination in step S32 is Yes, the second document image construction unit 36 first captures the upper half of the document image as it was before the switch (S33). Because the composite image has already been saved, the second document image construction unit 36 can identify the frame immediately before the switch began.

Next, the second document image construction unit 36 captures the upper half of the document image at the point T/2 seconds after the switch was detected (S34).

The second document image construction unit 36 joins the upper half and the lower half of the document image to construct the document image (S35).

The second document image construction unit 36 saves the document image in the document image storage unit 43 in association with a recording ID, page switching time, document image ID, size, and file name (S36).

The second document image construction unit 36 repeats the process of FIG. 20 until the end of the video file of the composite image (S37).
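
Steps S33 to S35 amount to stitching two frame halves together. The sketch below assumes frames are lists of rows with an even row count and that the camera images stay within the lower half of each frame; `rebuild_page` is a hypothetical name.

```python
def rebuild_page(frame_before, frame_mid):
    """Rebuild a full page from two recorded frames.

    `frame_before` is the frame just before the slide starts, whose upper
    half shows the top of the page. `frame_mid` is the frame T/2 seconds
    after the switch was detected, whose upper half shows the bottom of
    the same page.
    """
    half = len(frame_before) // 2
    return frame_before[:half] + frame_mid[:half]
```

If the camera region were taller than half the frame, more than two captures would be needed; that case is outside this sketch.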

<Viewing of document images by a user who did not participate in the meeting>
FIG. 21 is a diagram showing three types of images available when playing back a composite image. FIG. 21(a) shows the composite image 260.

FIG. 21(b) is a file for the document image.
- FIG. 21(b) is a file of the document image 251, created by process C or D, which shows the materials shared during the meeting with no blind spots.

FIG. 21(c) is a video file showing only the camera images 252, such as the panorama image 202, the individual image 203, and the speaker camera image 204.
- If the camera images do not need to be shown together with the materials during playback, they can be enlarged, making people's facial expressions and the like easier to read.

FIG. 21(d) is a video file of the composite image 260, the same as in FIG. 21(a).
- FIG. 21(d) is the original composite image and can be displayed as a video.

An example use case is as follows. A user who did not participate in the meeting connects a PC to the recording/playback device 30 and selects a video file of a composite image. The user can tell which composite image belongs to the desired meeting from, for example, the imaging start time. As shown in FIG. 21(d), the user watches the composite image from the beginning and, on being bothered by a blind spot in the materials, double-clicks the materials, for example. The operation reception unit 34 detects this request for the materials. The display control unit 37 detects the display time of the composite image at which the materials were requested and acquires the document image corresponding to that display time from the document image storage unit 43. For example, if the materials are requested 1 minute 15 seconds after the start of the composite image, the display control unit 37 acquires the document image 001-003.jpeg, which was constructed from the frames between 1 minute and 1 minute 30 seconds. The display control unit 37 can then display this blind-spot-free document image at a large size on the display.
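
The lookup of a document image by playback time can be sketched as a search over the stored construction intervals. The record layout (start seconds, end seconds, file name) and the file name `001-002.jpeg` are assumptions for illustration; only `001-003.jpeg` and its 1:00 to 1:30 interval come from the example above.

```python
def find_document_image(records, playback_s):
    """Return the file name whose construction interval covers `playback_s`.

    `records` is a list of (start_s, end_s, file_name) tuples, one per
    constructed document image. Returns None if no interval matches.
    """
    for start_s, end_s, file_name in records:
        if start_s <= playback_s <= end_s:
            return file_name
    return None


# Hypothetical storage contents; times are seconds from the start of the recording.
RECORDS = [
    (30, 60, "001-002.jpeg"),
    (60, 90, "001-003.jpeg"),
]
```

With these records, a request at 75 seconds (1 minute 15 seconds) resolves to `001-003.jpeg`, as in the use case.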

Similarly, a user who is interested in the participants' facial expressions double-clicks any camera image, for example. The operation reception unit 34 detects this request for the camera images. The display control unit 37 detects the display time of the composite image at which the camera images were requested and acquires the camera images corresponding to that display time from the camera image storage unit 42. For example, if the camera images are requested 1 minute 15 seconds after the start of the composite image, the camera images (video files) of each site are played back from the 1 minute 15 second mark.

As shown in FIG. 22, if two displays 20A and 20B are available, the recording/playback device 30 can display the camera images 252 and the document image 251 at the same time. FIG. 22 shows an example in which the camera images 252 and the document image 251 created from the composite image 260 are shown on two displays. Preparing the camera images and the document image separately, as described with reference to FIG. 21, enables the following playback methods.
- If the recording/playback device 30 has two output I/Fs, the images of FIG. 21(b) and FIG. 21(c) can be output simultaneously to the two displays 20A and 20B, respectively.
- Alternatively, by using dual display with the multi-stream function of DisplayPort (which allows displays to be connected to each other), the images of FIG. 21(b) and FIG. 21(c) can be output to the monitors simultaneously.

<Main effects>
As described above, in the recording system 60 of this embodiment, the camera images are moved gradually, the document image is slid to switch pages, or both, so that the recording/playback device 30 can also acquire the regions that were hidden behind an image. By saving these as a separate file, the recording/playback device 30 can, during playback, display the whole document image or move a camera image to any position on the document image.

＜その他の適用例＞
以上、本発明を実施するための最良の形態について実施例を用いて説明したが、本発明はこうした実施例に何等限定されるものではなく、本発明の要旨を逸脱しない範囲内において種々の変形及び置換を加えることができる。 <Other application examples>
Although the best mode for carrying out the present invention has been described above using examples, the present invention is by no means limited to such examples, and various modifications can be made without departing from the scope of the present invention. and substitutions can be added.

例えば、本実施形態では、資料画像にカメラ画像が埋め込まれる例を説明したが、画像に何が映っているかに関わらず本実施形態を適用できる。例えば資料画像は風景画像などでもよい。カメラ画像は動画投稿サイトが提供する画像でもよい。また、資料画像をＷｅｂページとして、カメラ画像を広告としてもよい。 For example, in this embodiment, an example in which a camera image is embedded in a document image has been described, but this embodiment can be applied regardless of what is shown in the image. For example, the material image may be a landscape image. The camera image may be an image provided by a video posting site. Alternatively, the document image may be the web page, and the camera image may be the advertisement.

また、本実施形態では録画再生装置３０が資料画像を構築したが、サーバーが資料画像を構築してもよい。 Also, in the present embodiment, the recording/reproducing device 30 constructs the material image, but the server may construct the material image.

また、図５などの構成例は、通信端末１０、及び録画再生装置３０による処理の理解を容易にするために、主な機能に応じて分割したものである。処理単位の分割の仕方や名称によって本願発明が制限されることはない。通信端末１０、及び録画再生装置３０の処理は、処理内容に応じて更に多くの処理単位に分割することもできる。また、１つの処理単位が更に多くの処理を含むように分割することもできる。 Further, the configuration examples in FIG. 5 and the like are divided according to main functions in order to facilitate understanding of processing by the communication terminal 10 and the recording/playback device 30 . The present invention is not limited by the division method or name of the unit of processing. The processing of the communication terminal 10 and the recording/playback device 30 can also be divided into more processing units according to the content of the processing. Also, one processing unit can be divided to include more processing.

Each function of the embodiments described above can be implemented by one or more processing circuits. Here, a "processing circuit" in this specification includes a processor programmed by software to execute each function, such as a processor implemented by an electronic circuit, as well as devices designed to execute each function described above, such as an ASIC (Application Specific Integrated Circuit), a DSP (Digital Signal Processor), an FPGA (Field Programmable Gate Array), and conventional circuit modules.

10 communication terminal
30 recording/playback device
60 recording system
100 communication system

Japanese Patent Application Laid-Open No. 2009-18298

Claims

1. A recording system comprising:
a communication unit that receives first image data from another site;
an image processing unit that moves the position of the first image data or second image data within a frame of a moving image in which the second image data, received from the other site or prepared at the own site, is embedded in the first image data;
an image synthesizing unit that generates a moving image file in which the second image data is embedded in the first image data;
an image construction unit that constructs, from the moving image file, the first image data that does not include the second image data; and
a display control unit that displays the first image data constructed by the image construction unit.

2. The recording system according to claim 1, further comprising a moving unit that arranges the first image data in a frame of the moving image and, taking a preset time, moves the second image data relative to the first image data by more than the size of the second image data,
wherein the image construction unit acquires, from the moving image file, a plurality of pieces of the first image data in which the position of the second image data differs, removes the second image data from each piece of the first image data, and synthesizes the results, thereby constructing the first image data that does not include the second image data.

3. The recording system according to claim 2, wherein the image construction unit constructs the first image data that does not include the second image data by performing an OR operation on the plurality of pieces of the first image data from which the second image data has been removed.

4. The recording system according to claim 1, further comprising a page switching unit that, when it is detected that the first image data has been switched, takes a preset time to slide the first image data before switching, arranged in a frame of the moving image, so as to switch to the first image data after switching,
wherein the image synthesizing unit generates a moving image file in which the second image data is embedded in the first image data in the middle of the slide, and
when it is detected from the moving image file that the first image data has been switched, the image construction unit cuts out a portion of the first image data before switching and a portion of the first image data after sliding, and synthesizes them to construct the first image data that does not include the second image data.

5. The recording system according to claim 4, wherein, when it is detected from the moving image file that the first image data has been switched, the image construction unit cuts out half of the first image data before switching and half of the first image data after half of the time has elapsed, and synthesizes them to construct the first image data that does not include the second image data.

6. The recording system according to any one of claims 1 to 5, wherein the moving image file has position information and a size of the second image data within the first image data,
the recording system further comprising a moving image file creation unit that creates a second moving image file of the second image data cut out from the first image data based on the position information and the size.

7. The recording system according to any one of claims 1 to 6, wherein said first image data not including said second image data is displayed according to a user's operation.

8. The recording system according to claim 6, wherein said second moving image file is displayed according to a user's operation.

9. A recording/playback device comprising:
a receiving unit that receives a moving image file from a communication terminal, the communication terminal having a communication unit that receives first image data from another site, an image processing unit that moves the position of the first image data or second image data within a frame of a moving image in which the second image data, received from the other site or prepared at the own site, is embedded in the first image data, and an image synthesizing unit that generates the moving image file in which the second image data is embedded in the first image data;
an image construction unit that constructs, from the moving image file received by the receiving unit, the first image data that does not include the second image data; and
a display control unit that displays the first image data constructed by the image construction unit.

10. An image processing method comprising:
a step in which a communication unit receives first image data from another site;
a step in which an image processing unit moves the position of the first image data or second image data within a frame of a moving image in which the second image data, received from the other site or prepared at the own site, is embedded in the first image data;
a step in which an image synthesizing unit generates a moving image file in which the second image data is embedded in the first image data;
a step in which an image construction unit constructs, from the moving image file, the first image data that does not include the second image data; and
a step in which a display control unit displays the first image data constructed by the image construction unit.