JP4795141B2

JP4795141B2 - Video coding / synthesizing apparatus, video coding / synthesizing method, and video transmission system

Info

Publication number: JP4795141B2
Application number: JP2006179708A
Authority: JP
Inventors: 正樹佐藤
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2006-06-29
Filing date: 2006-06-29
Publication date: 2011-10-19
Anticipated expiration: 2026-06-29
Also published as: JP2008011191A

Description

The present invention relates to a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system that encode input video and synthesize a plurality of encoded video data.

FIG. 21 shows a configuration example of a conventional video coding/synthesizing apparatus. The apparatus comprises a multipoint control device 1701 and a plurality of user terminals 1702 to 1706. Each of the user terminals 1702 to 1706 has a video input unit and a video display unit. FIG. 22 shows the screen of the video display unit of a user terminal. The screen 1801 of the video display unit of user terminal 1702 displays the videos (user 2, user 3, user 4, user 5) input to the video input units of user terminals 1703 to 1706, respectively.

FIG. 23 shows the configuration of the video input unit of user terminal 1702. The video input units of user terminals 1703 to 1706 have the same configuration. User terminal 1702 outputs encoded video data to the multipoint control device 1701 in a format that can be synthesized, and has an input processing unit 1901, a frame memory 1902, an encoding unit 1903, a buffer 1904, an input frame control unit 1905, an encoding control unit 1906, a data amount monitoring unit 1907, and a transmission unit 1908.

The input processing unit 1901 converts video from a camera (not shown) into a digital video signal and reduces it to 1/4 of its original size. The frame memory 1902 stores the video data processed by the input processing unit 1901. The encoding unit 1903 encodes the video data stored in the frame memory 1902, treating each row of macroblocks (one macroblock line) as one video packet. The buffer 1904 stores the video data output from the encoding unit 1903. The data amount monitoring unit 1907 monitors the amount of data accumulated in the buffer 1904 and, when that amount is at or below a certain threshold, causes the transmission unit 1908 to transmit. The input frame control unit 1905 selects the frames to be encoded under the control of the data amount monitoring unit 1907. The encoding control unit 1906 aborts encoding under the control of the data amount monitoring unit 1907.

Meanwhile, the multipoint control device 1701 receives from user terminal 1702 video data encoded with one macroblock line as one video packet, and composes the video data (see Patent Document 1). FIG. 24 shows the video data synthesizing operation. FIG. 24(A) shows video data encoded with one macroblock line as one video packet 2001. FIG. 24(B) shows a composite screen 2002 composed of a plurality of received video data.

Japanese Patent Laid-Open No. 2005-20466

The conventional video coding/synthesizing apparatus described above has the following problems. When video input at a user terminal is reduced, then encoded and transmitted, and a multipoint control device is interposed, the receiving user terminal can display only the reduced video and cannot display video at the original resolution. In addition, the code amount generated at each user terminal is not taken into account, so when the characteristics of the video input to the user terminals vary widely, image quality becomes uneven among the user terminals.

Also, in the conventional example, combining the apparatus with an image recognition function and reflecting the recognition result in code amount control during encoding makes it possible, for example, to encode a human face region with high image quality. In this case, however, the video is encoded only after image recognition has finished, so the delay until the video is encoded increases.

The present invention has been made in view of the above circumstances, and an object thereof is to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of easily displaying video having the original resolution in addition to reduced video.

Another object of the present invention is to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of making image quality uniform among the divided videos even when the characteristics of the input video vary widely.

A further object of the present invention is to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of reducing the delay until video is encoded.

The video coding/synthesizing apparatus of the present invention comprises: a preprocessing unit that generates slice videos from input video; an encoding unit that slice-encodes the slice videos; a synthesizing unit that synthesizes a plurality of slice-encoded video data so as to produce a multi-screen display; a decoding unit that decodes the plurality of video data synthesized for multi-screen display or the slice-encoded video data; a restoration unit that restores the decoded video data to the original video data so as to produce a single-screen display; a selection unit that selects either the video data synthesized for multi-screen display and decoded, or the video data restored for single-screen display; and a display unit that displays video based on the selected video data on a screen. The apparatus further comprises an object detection unit that detects a specific object contained in the video, and when the object detection unit has not finished detecting the specific object, the encoding unit changes the encoding characteristics based on past detection results for the specific object.
This makes it possible to easily display video having the original resolution in addition to reduced video.

The present invention is also the above video coding/synthesizing apparatus, comprising an object detection unit that detects a specific object contained in the video, wherein, when detection of the specific object has not finished, the encoding unit changes the encoding characteristics based on past detection results for the specific object.
This makes it possible to encode the region of the object with high image quality from past information, without waiting for detection of the object to finish.

The present invention is also the above video coding/synthesizing apparatus, wherein the encoding unit omits encoding of the divided video when the specific object is not detected.
As a result, video data in which no object was detected is not encoded, which reduces the amount of data transmitted to the network.

The present invention is also the above video coding/synthesizing apparatus, comprising an object detection unit that detects a specific object contained in the video, wherein the encoding unit changes the encoding characteristics when the specific object is detected.
This makes it possible to reduce the delay until the video is encoded.

The present invention is also the above video coding/synthesizing apparatus, wherein the encoding unit cuts out the region corresponding to the detected specific object from the video and encodes it.
This makes it possible to encode only the region of the detected object with high image quality, based on high-resolution video data.

The present invention is also the above video coding/synthesizing apparatus, wherein, when the slice-encoded video data are to be synthesized for multi-screen display, the encoding unit encodes so that the code amount matches a target code amount.
This eliminates wasted code amount and improves average image quality. Consequently, even when the characteristics of the input video vary widely, image quality can be made uniform among the slice videos.

The video coding/synthesizing method of the present invention comprises: a preprocessing step of generating slice videos from input video; an encoding step of slice-encoding the divided slice videos; a synthesizing step of synthesizing a plurality of slice-encoded video data so as to produce a multi-screen display; a decoding step of decoding the plurality of video data synthesized for multi-screen display or the slice-encoded video data; a restoration step of restoring the decoded video data to the original video data so as to produce a single-screen display; a selection step of selecting either the video data synthesized for multi-screen display and decoded, or the video data restored for single-screen display; and a display step of displaying video based on the selected video data on a screen. The method further comprises an object detection step of detecting a specific object contained in the video, and when the object detection step has not finished detecting the specific object, the encoding step changes the encoding characteristics based on past detection results for the specific object.

The video transmission system of the present invention is a system in which a video transmitting device, a video synthesizing device, and a video receiving device are connected via a network, and which encodes input video and synthesizes and displays a plurality of encoded video data. The video transmitting device comprises a preprocessing unit that generates slice videos from the input video, an encoding unit that slice-encodes the slice videos, and a first transmission unit that transmits the slice-encoded video data to the network. The video synthesizing device comprises a first receiving unit that receives the slice-encoded video data from the network, a synthesizing unit that synthesizes a plurality of slice-encoded video data input from different video transmitting devices so as to produce a multi-screen display, and a second transmission unit that transmits the synthesized video data to the network. The video receiving device comprises a second receiving unit that receives from the network the plurality of video data synthesized for multi-screen display or the slice-encoded video data, a decoding unit that decodes the plurality of video data synthesized for multi-screen display or the slice-encoded video data, a restoration unit that restores the decoded video data to the original video data so as to produce a single-screen display, a selection unit that selects either the video data synthesized for multi-screen display and decoded, or the video data restored for single-screen display, and a display unit that displays video based on the selected video data on a screen. The video transmitting device further comprises an object detection unit that detects a specific object contained in the video, and when the object detection unit has not finished detecting the specific object, the encoding unit changes the encoding characteristics based on past detection results for the specific object.

According to the present invention, it is possible to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of easily displaying video having the original resolution in addition to reduced video. It is also possible to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of making image quality uniform among the divided videos even when the characteristics of the input video vary widely. It is further possible to provide a video coding/synthesizing apparatus, a video coding/synthesizing method, and a video transmission system capable of reducing the delay until video is encoded.

The video coding/synthesizing apparatus of the present embodiment is applied to, for example, a video transmission system that transmits images captured by surveillance cameras. This type of video transmission system has functions such as encoding video input from a plurality of cameras or the like and synthesizing the encoded video data onto a single screen for display. Examples of the configuration and operation of the video coding/synthesizing apparatus and the video transmission system according to the present embodiment are described below.

(First embodiment)
FIG. 1 shows the configuration of a video transmission system according to the first embodiment of the present invention. In this video transmission system, a plurality of video transmitting devices 101a to 101d, a video synthesizing device 102, and a plurality of video receiving devices 103a to 103d are connected via a first network 104 and a second network 105. The video transmitting devices 101a to 101d are collectively referred to as the video transmitting device 101. Similarly, the video receiving devices 103a to 103d are collectively referred to as the video receiving device 103.

The video transmitting device 101 receives video data from a camera (not shown), encodes the input video data, and transmits it to the first network 104. The video synthesizing device 102 receives from the first network 104 the video data encoded by each of the plurality of video transmitting devices 101, synthesizes these video data so as to produce a multi-screen display, and transmits the result to the second network 105. The video receiving device 103 receives the encoded video data from the second network 105, decodes it, and displays it. The first network 104 consists of a LAN or an intra-device bus such as PCI, and connects the video transmitting device 101 and the video synthesizing device 102. The second network 105 likewise consists of a LAN or an intra-device bus such as PCI, and connects the video synthesizing device 102 and the video receiving device 103.

The video transmitting device 101 comprises a video input unit 107, a video preprocessing unit 108, a video encoding unit 109, a transmission/reception unit 110, and an encoding control unit 106. The video input unit 107 receives video from the camera. The video preprocessing unit 108 functions as a dividing unit and performs thinning on the video data captured by the video input unit 107 to generate slice video data. The video encoding unit 109 functions as an encoding unit and slice-encodes the slice video data generated by the video preprocessing unit 108. The transmission/reception unit 110 transmits the slice-encoded video data to the first network 104. The encoding control unit 106 controls the generation of slice video data by the video preprocessing unit 108 and the slice encoding by the video encoding unit 109.

The video synthesizing device 102 comprises a transmission/reception unit 114, a synthesizing unit 113, a transmission/reception unit 112, and a synthesis control unit 111. The transmission/reception unit 114 receives slice-encoded video data from the first network 104. The synthesizing unit 113 synthesizes a plurality of slice-encoded video data. The transmission/reception unit 112 transmits the slice-synthesized video data to the second network 105. The synthesis control unit 111 controls the synthesis performed by the synthesizing unit 113.

The video receiving device 103 comprises a transmission/reception unit 117 that receives slice-synthesized video data from the second network 105, a video decoding unit 116 that decodes the slice-synthesized video data, a video display unit 115 that displays the decoded image, and a user interface (UI) 118 through which requests from the user are input. The video decoding unit 116 functions as a decoding unit and a restoration unit, and the user interface 118 functions as a selection unit and a request input unit. The video display unit 115 functions as a display unit.

The functions of the above units are realized by a processor executing control programs stored in the storage media provided for the respective units.

The operation of the video transmission system of the first embodiment having the above configuration is described next, beginning with the operation of the video preprocessing unit 108. FIG. 2 shows the thinning process and the slice video creation process in the first embodiment.

Viewed at the pixel level, the input video 201 consists of pixels "1, 2, ..., yx", as shown in input video 202. When thinning the input video 201, the video preprocessing unit 108 thins it at the pixel level and generates four slice videos 203. Viewed at the pixel level, a slice video 203 consists of, for example, pixels "1, 3, 5, ...", as shown in slice video 204. Although this embodiment shows the case where four slice videos are generated by taking every second pixel, the thinning method is not particularly limited and can be set to any method by the encoding control unit 106.
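
The thinning described above can be sketched as follows. This is only an illustration under stated assumptions: the frame is held as a list of pixel rows, and the four slice videos are taken to be the four 2x2 sampling phases. The text fixes only that every second pixel is kept, so the phase assignment and the names make_slice_videos and restore_frame are hypothetical.

```python
def make_slice_videos(frame):
    """Split a frame (list of pixel rows) into four quarter-size slice videos.

    Assumption: slice (dy, dx) keeps the pixels whose row index has phase dy
    and whose column index has phase dx, so every input pixel lands in
    exactly one slice.
    """
    slices = []
    for dy in (0, 1):
        for dx in (0, 1):
            slices.append([row[dx::2] for row in frame[dy::2]])
    return slices


def restore_frame(slices, height, width):
    """Re-interleave the four slices into a full-resolution frame."""
    frame = [[None] * width for _ in range(height)]
    phases = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for (dy, dx), slice_video in zip(phases, slices):
        for y, row in enumerate(slice_video):
            for x, pixel in enumerate(row):
                frame[2 * y + dy][2 * x + dx] = pixel
    return frame
```

Under this assumed layout, each slice is a reduced view of the whole picture, and the four slices together carry every pixel needed to rebuild the original-resolution frame.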

FIG. 3 shows how macroblocks are handled in the video encoding unit 109 of the first embodiment. A macroblock is the unit used when encoding video and consists of, for example, 16x16 pixels. Macroblocks (MB) as defined in international standards such as MPEG-2, MPEG-4, and H.264 are used. As described above, the slice video 203 generated from the input video 201 by the video preprocessing unit 108 is composed of macroblocks MB(1), MB(2), ..., MB((k+1)n). The video encoding unit 109 performs slice encoding with the macroblock as the basic unit. Details of slice encoding are described later. The other slice videos are likewise composed of macroblocks.
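
As a rough illustration of partitioning a slice video into macroblocks, the sketch below lists the top-left corner of each 16x16 macroblock in raster order. The raster ordering, the requirement that the dimensions be multiples of 16, and the name macroblock_origins are assumptions made for this example, not details taken from FIG. 3.

```python
MB_SIZE = 16  # macroblock edge length in pixels, as in the 16x16 example


def macroblock_origins(width, height, mb_size=MB_SIZE):
    """Return the (x, y) top-left corner of every macroblock in raster order."""
    if width % mb_size or height % mb_size:
        raise ValueError("dimensions must be multiples of the macroblock size")
    return [
        (x, y)
        for y in range(0, height, mb_size)
        for x in range(0, width, mb_size)
    ]
```

For a 352x288 slice video this yields 22 macroblocks per row and 18 rows.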

FIG. 4 shows the slice synthesis operation in the synthesizing unit 113 of the first embodiment. The synthesizing unit 113 slice-synthesizes the slice-encoded data 401 encoded by the video transmitting device 101a, the slice-encoded data 402 encoded by the video transmitting device 101b, the slice-encoded data 403 encoded by the video transmitting device 101c, and the slice-encoded data 404 encoded by the video transmitting device 101d, and outputs the result as slice-synthesized data 405.

The slice-encoded data 401 consists of slice data 1-1, 1-2, 1-3, and 1-4. Similarly, the slice-encoded data 402 consists of slice data 2-1, 2-2, 2-3, and 2-4. The slice-encoded data 403 consists of slice data 3-1, 3-2, 3-3, and 3-4. The slice-encoded data 404 consists of slice data 4-1, 4-2, 4-3, and 4-4.

The slice-synthesized data 405 is formed by taking, from the respective slice-encoded data, the upper-left slice data 1-1 of the slice-encoded data 401, the upper-right slice data 2-2 of the slice-encoded data 402, the lower-left slice data 3-3 of the slice-encoded data 403, and the lower-right slice data 4-4 of the slice-encoded data 404, and combining them as slice data for one screen.
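
The selection pattern above (slice 1-1, 2-2, 3-3, 4-4) can be sketched as follows, assuming each transmitter's slice-encoded data arrives as a list of four slice payloads ordered upper-left, upper-right, lower-left, lower-right. The function name compose_multiscreen and the list representation are hypothetical.

```python
def compose_multiscreen(encoded_streams):
    """Pick slice i from transmitter i to form one multi-screen picture.

    encoded_streams: four lists (one per transmitter), each holding four
    slice payloads in screen-position order.
    """
    if len(encoded_streams) != 4 or any(len(s) != 4 for s in encoded_streams):
        raise ValueError("expected four transmitters with four slices each")
    return [encoded_streams[i][i] for i in range(4)]
```

No decoding or re-encoding takes place in this sketch; the synthesizer only selects already encoded slices.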

For example, using a stream format conforming to H.264 as the encoding scheme makes it easy to find the head of each piece of slice data. Also, by configuring the macroblocks as shown in FIG. 3, the slice-synthesized data 405 can be produced simply by concatenating the slice data extracted from the slice-encoded data.
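
One way to locate the head of each slice, assuming the stream uses the H.264 Annex B byte stream format (the text says only "a stream format conforming to H.264", so this choice is an assumption), is to scan for the 0x000001 start code prefix and read the NAL unit type from the low five bits of the following byte. The helper names below are hypothetical.

```python
START_CODE = b"\x00\x00\x01"   # Annex B start code prefix
SLICE_NAL_TYPES = {1, 5}       # 1: non-IDR coded slice, 5: IDR coded slice


def split_nal_units(stream):
    """Split a byte stream into NAL units at each start code prefix."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # A NAL unit never ends in 0x00, so trailing zeros belong to a
        # following 4-byte start code or to padding.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def slice_units(stream):
    """Keep only the NAL units that carry coded slice data."""
    return [u for u in split_nal_units(stream) if (u[0] & 0x1F) in SLICE_NAL_TYPES]
```

Merging slices from several encoders into one decodable picture generally also requires consistent parameter sets and macroblock addressing across the encoders; the sketch does not cover that.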

FIG. 5 is a flowchart showing the operation procedure of the video preprocessing unit 108 in the first embodiment. The video preprocessing unit 108 waits until video is input (step S1) and, when video is input, generates slice videos as shown in FIG. 2 (step S2). It outputs a generated slice video to the video encoding unit 109 (step S3). It then determines whether the final slice video has been output to the video encoding unit 109 (step S4). If the final slice video has been output, the process returns to step S1 and waits for video input. If the final slice video has not yet been output, the process returns to step S3.

FIG. 6 is a flowchart showing the operation procedure of the video encoding unit 109 in the first embodiment. The video encoding unit 109 waits until a slice video is input (step S11) and, when a slice video is input, generates macroblocks (MB) from it (step S12). It performs slice encoding using the generated macroblocks (step S13). Details of this slice encoding are described later. It then outputs the slice-encoded slice data to the transmission/reception unit 110 (step S14). After this, the process returns to step S11 and waits for the next slice video.

FIG. 7 shows the slice encoding operation performed by the video encoding unit 109 in step S13 of the first embodiment. First, the video encoding unit 109 receives the MB data generated in step S12 and the MB encoding characteristics obtained from the encoding control unit 106 (T601), and starts processing the MB data (T602). For each macroblock, it performs matching within several preceding and following frames stored in the frame memory to detect a motion vector, performs motion compensation based on the motion vector information, and generates a predicted image (T603).

In addition, intra-frame prediction (intra prediction) is performed on the macroblock that underwent motion-compensated prediction, generating a second predicted image (T604). A prediction error is calculated for each of the predicted images generated at T603 and T604, and the predicted image with the smaller error is selected (T605). The selected predicted image is compared with the input slice video to generate a prediction difference signal (T606).
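
The selection at T605 and the difference at T606 can be sketched as follows. The error measure is not named in the text; the sum of absolute differences (SAD) is used here purely as an assumed example, and the function names are hypothetical.

```python
def sad(block, prediction):
    """Sum of absolute differences between a block and its prediction."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def select_prediction(block, inter_pred, intra_pred):
    """Choose the prediction with the smaller error and form the residual."""
    if sad(block, inter_pred) <= sad(block, intra_pred):
        mode, pred = "inter", inter_pred
    else:
        mode, pred = "intra", intra_pred
    # Prediction difference signal: input block minus the chosen prediction.
    residual = [
        [a - b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block, pred)
    ]
    return mode, residual
```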

A DCT that decomposes the prediction difference signal into two-dimensional frequency components is applied, and the resulting DCT coefficients are then mapped to discrete representative values using the input MB encoding characteristics (for example, the quantization step), and the quantized coefficients are output (T607).

Here, the quantization step is given as an example of an MB encoding characteristic. Reducing the quantization step increases the code amount generated for the macroblock. Conversely, increasing the quantization step reduces the code amount generated for the macroblock. Entropy encoding is performed on the quantized coefficients output at T607 (T608), and the slice data is output.
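
The effect of the quantization step on code amount can be illustrated with a simplified uniform quantizer. This is a sketch under assumptions: real codecs use more elaborate quantizers, and the count of nonzero levels is used here only as a crude proxy for the bits the entropy coder would produce.

```python
def quantize(coeffs, qstep):
    """Map each coefficient to an integer level, truncating toward zero."""
    levels = []
    for c in coeffs:
        level = abs(c) // qstep
        levels.append(level if c >= 0 else -level)
    return levels


def nonzero_count(levels):
    """Crude proxy for code amount: number of nonzero quantized levels."""
    return sum(1 for level in levels if level != 0)
```

With the coefficients [120, -45, 18, 7, -3, 1], a step of 4 leaves four nonzero levels, a step of 16 leaves three, and a step of 64 leaves one, matching the trend described above.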

The encoding control unit 106 receives feedback on the code amount produced by entropy encoding and controls the MB encoding characteristics so that a target reference value (for example, the code amount per second) is met. In addition, inverse quantization and an inverse DCT are applied to the quantized coefficients output at T607 to generate a prediction difference signal (T609).
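
The feedback control described above might be sketched as a simple step-up/step-down rule. The text does not specify the control law, so the unit step size, the clamping range, and the name update_qstep are all assumptions made for illustration.

```python
def update_qstep(qstep, produced_bits, target_bits, qmin=1, qmax=64):
    """Nudge the quantization step toward the target code amount.

    Too many bits produced: coarser quantization (larger step).
    Too few bits produced: finer quantization (smaller step).
    """
    if produced_bits > target_bits:
        qstep += 1
    elif produced_bits < target_bits:
        qstep -= 1
    # Keep the step inside the assumed valid range.
    return max(qmin, min(qmax, qstep))
```

A practical controller would typically scale the adjustment with the size of the overshoot; the fixed unit step keeps the sketch short.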

The encoding control unit 106 then adds the predicted image selected in T605 (generated by either motion-compensated prediction or intra prediction) to the prediction difference signal generated in T609 to produce a decoded image (T610). Deblocking filtering is applied to the macroblock boundaries of the decoded image so that block boundaries are not noticeable (T611), and the filtered decoded image is stored in the frame memory (T612).

FIG. 8 is a flowchart showing the operation procedure of the synthesizing unit 113 in the first embodiment. The synthesizing unit 113 waits until slice-encoded data is input (step S21) and, when it arrives, holds the input slice data in memory (step S22). It then determines whether the slice data is the last slice data (step S23); here, this means determining whether it is the fourth slice data.

If it is not the last slice data, the process returns to step S21 and waits for the next slice data. If it is the last slice data, the synthesizing unit determines whether the multi-screen composition mode is active (step S24). The mode is determined by a user request entered through the user interface (UI) 118 of the video reception device 103. The determined mode information is reported from the video reception device 103 to the video synthesis device 102.

If the multi-screen composition mode is active in step S24, slice synthesis is performed as shown in FIG. 4 (step S25), and the resulting slice synthesis data is output to the transmission/reception unit 112 (step S26). The process then returns to step S21. If the multi-screen composition mode is not active in step S24, slice synthesis is skipped and the slice data is output unchanged to the transmission/reception unit 112 in step S26.
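
The branch in steps S24 to S26 can be sketched roughly as below. The details of FIG. 4 are not reproduced in this section, so the choice of taking the first slice from each sender for the composite, the `selected_sender` parameter, and the function name are all assumptions made for illustration.

```python
def synthesize(frames, multi_screen, selected_sender=None, slices_per_frame=4):
    """frames maps a sender id to the list of slice data received for one frame.

    Multi-screen mode: build one composite from one slice per sender (S25).
    Otherwise: pass the selected sender's slice data through unchanged (S26).
    """
    for sender, slices in frames.items():
        if len(slices) != slices_per_frame:
            raise ValueError(f"sender {sender}: frame is not complete yet")
    if multi_screen:
        # Assumption: the composite uses slice index 0 of every sender.
        return [frames[sender][0] for sender in sorted(frames)]
    return list(frames[selected_sender])


frames = {n: [f"{n}-{k}" for k in range(1, 5)] for n in range(1, 5)}
composite = synthesize(frames, multi_screen=True)
single = synthesize(frames, multi_screen=False, selected_sender=2)
```

Because each slice is already independently encoded, the composite is assembled by selecting encoded data only; nothing is decoded or re-encoded in this sketch.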

FIG. 9 is a flowchart showing the operation procedure of the video decoding unit 116 in the first embodiment. The video decoding unit 116 waits until slice data is input (step S31) and, when it arrives, decodes the slice data (step S32). This decoding is realized by performing the inverse of the slice encoding process of FIG. 7.

It then determines whether the slice data is the last slice data (step S33). If it is not, the process returns to step S31 and waits for the next slice data. If it is, the video decoding unit determines whether the multi-screen composition mode is active (step S34).

In the multi-screen composition mode, the decoded video data is output unchanged to the video display unit 115 (step S35), and the process returns to step S31. If the multi-screen composition mode is not active in step S34, the inverse of the thinning process of the video preprocessing unit 108 shown in FIG. 2 (a complementing process) is performed (step S36), and the resulting video data is output to the video display unit 115 in step S35.
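
The thinning process and its inverse (the complementing process of step S36) can be sketched for the four-slice case. The exact pixel-to-slice mapping of FIG. 2 is not given in this section, so the 2x2 phase assignment used here is an assumption, as are the function names.

```python
def thin(frame):
    """Split a frame (list of rows, even width and height) into four slice
    images, one per position inside each 2x2 pixel group."""
    return [[row[dx::2] for row in frame[dy::2]]
            for dy in (0, 1) for dx in (0, 1)]


def complement(slices):
    """Inverse of thin(): interleave four slice images back into one frame."""
    height, width = len(slices[0]), len(slices[0][0])
    frame = [[None] * (2 * width) for _ in range(2 * height)]
    for k, slice_image in enumerate(slices):
        dy, dx = divmod(k, 2)  # same ordering as in thin()
        for y in range(height):
            for x in range(width):
                frame[2 * y + dy][2 * x + dx] = slice_image[y][x]
    return frame


frame = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
slices = thin(frame)
restored = complement(slices)
```

Each slice image is a quarter-resolution view of the whole picture, which is why a single slice can serve as a reduced sub-screen while all four together restore the original resolution.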

Thus, in the video encoding/synthesizing device and video transmission system of the first embodiment, the video preprocessing unit 108 performs thinning to create slice videos, each composed of macroblocks; the video encoding unit 109 slice-encodes the slice videos; and the synthesizing unit 113 combines the slice-encoded slice data. Video data from a plurality of cameras can therefore be encoded and combined easily. It is also easy to select between a multi-screen display in which reduced videos are combined and a single-screen display of video at the original resolution, so the system can switch quickly, in response to a user request, between the multi-screen composite display and a single-screen display with the same resolution as the camera input.

(Second Embodiment)
FIG. 10 is a diagram showing the configuration of a video transmission system according to the second embodiment of the present invention. Components identical to those of the first embodiment are given the same reference numerals, and their description is omitted where appropriate. Unlike the first embodiment, in the second embodiment the transmission/reception unit 110, on receiving encoding control information (for example, a target code amount) from the video synthesis device 102, outputs it to the encoding control unit 106. In addition, the synthesizing unit 113 holds a video transmission device management table 1001 and a slice synthesis data management table 1002, described later. The rest of the configuration is the same as in the first embodiment.

The operation of the video transmission system of the second embodiment configured as above is described next. FIG. 11 is a flowchart showing the operation procedure of the synthesizing unit 113 in the second embodiment.

The synthesizing unit 113 waits until slice-encoded data is input (step S41) and, when it arrives, holds the input slice data (step S42). It also monitors the code amount of the received slice data and updates the video transmission device management table 1001 and the slice synthesis data management table 1002 (step S43).

FIG. 12 is a diagram showing the video transmission device management table 1001 and the slice synthesis data management table 1002 in the second embodiment. The video transmission device management table 1001 shown in FIG. 12(A) manages the upper limit of the code amount generated by each video transmission device 101 and the code amount actually generated. The slice synthesis data management table 1002 shown in FIG. 12(B) manages which video transmission devices supplied the slice data making up each item of slice synthesis data, together with the upper limit of the total code amount of that slice data and the code amount actually generated.

In step S43, a target code amount is calculated using the video transmission device management table 1001 and the slice synthesis data management table 1002 and is output to the transmission/reception unit 114. The transmission/reception unit 114 transmits the calculated target code amount to the video transmission device 101 via the first network 104. The encoding control unit 106 in the video transmission device 101 controls the MB encoding characteristics so that the generated code amount approaches the received target code amount. The target code amount is calculated, for example, by the following steps (a), (b), and (c).

(a) The slice synthesis data management table 1002 is searched, and the slice synthesis data with the largest difference between target code amount and generated code amount is selected.
(b) For the video transmission devices that make up the selected slice synthesis data, the video transmission device management table 1001 is searched, and the device with the largest difference between target code amount and generated code amount (video transmission device MAX) and the device with the smallest difference (video transmission device MIN) are selected.
(c) A fraction, for example one half, of the difference between the target code amount and the generated code amount of video transmission device MAX is computed and added to the target code amount of video transmission device MIN.
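
Steps (a) to (c) can be sketched as follows. Several points are assumptions rather than statements from the description: the difference is taken as the signed value target minus generated, the amount given to MIN is also subtracted from MAX (the text only says that new targets for both devices are sent), and the table layout and names are hypothetical.

```python
def slack(entry):
    """Unused budget: target code amount minus generated code amount (assumed signed)."""
    return entry["target"] - entry["generated"]


def reallocate_targets(devices, composites, fraction=0.5):
    """Apply steps (a) to (c) once and return the new targets for MAX and MIN."""
    # (a) slice synthesis data with the largest target/generated difference
    composite = max(composites.values(), key=slack)
    members = composite["members"]
    # (b) member devices with the largest and smallest difference
    dev_max = max(members, key=lambda d: slack(devices[d]))
    dev_min = min(members, key=lambda d: slack(devices[d]))
    if dev_max == dev_min:
        return {}
    # (c) move a fraction (one half by default) of MAX's difference to MIN
    transfer = int(slack(devices[dev_max]) * fraction)
    devices[dev_min]["target"] += transfer
    devices[dev_max]["target"] -= transfer  # assumption: MAX gives up what MIN gains
    return {dev_max: devices[dev_max]["target"],
            dev_min: devices[dev_min]["target"]}


devices = {
    "A": {"target": 100, "generated": 40},
    "B": {"target": 100, "generated": 95},
    "C": {"target": 100, "generated": 80},
}
composites = {"X": {"members": ["A", "B", "C"], "target": 300, "generated": 215}}
new_targets = reallocate_targets(devices, composites)
```

In this toy table, device A uses far less than its budget while device B is close to its limit, so half of A's unused amount is shifted to B and the total budget of the composite stays unchanged.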

The new target code amounts for video transmission device MAX and video transmission device MIN obtained in this way are transmitted to the video transmission devices 101. The subsequent steps S44 to S47 are the same as steps S23 to S26 in the first embodiment, so their description is omitted.

Thus, in the video encoding/synthesizing device and video transmission system of the second embodiment, holding the slice synthesis data management table 1002 and the video transmission device management table 1001 in the synthesizing unit 113 makes it possible to change the allocation of code amount among the video transmission devices 101. The image quality of the slice images, that is, the image quality across the sub-screens in a multi-screen display, can therefore be made uniform.

(Third Embodiment)
FIG. 13 is a diagram showing the configuration of a video transmission system according to the third embodiment of the present invention. Components identical to those of the first embodiment are given the same reference numerals, and their description is omitted where appropriate. Unlike the first embodiment, the third embodiment additionally includes an object detection unit 1101.

The object detection unit 1101 receives slice video from the video preprocessing unit 108, detects a specific object using an image recognition function, and outputs the detection result to the video encoding unit 109. The detection result is used to control the MB encoding characteristics of the macroblocks that contain the object and thereby raise the image quality of the object region.

The operation of the video transmission system of the third embodiment configured as above is described next. FIG. 14 is a diagram showing the timing of the encoding process in the third embodiment. FIG. 14(A) shows the processing timing of conventional frame encoding. FIG. 14(B) shows the processing timing of slice encoding in this embodiment.

First, the processing timing 1201 of conventional frame encoding is described. The top row of FIG. 14(A) shows the numbers "1, 2, 3, ..." representing timing steps. The upper row shows the video input timing, the middle row the object detection timing, and the lower row the frame encoding timing. In this example, object detection is performed once for every two video inputs. In conventional frame encoding, object detection is performed at timing 2 on the video input at timing 1, and frame encoding that uses the detection result to raise the image quality of the object region is performed at timing 3. Three steps are therefore needed from video input to completion of encoding.

Next, the processing timing 1202 of slice encoding in this embodiment is described. The top row of FIG. 14(B) likewise shows the numbers "1, 2, 3, ..." representing timing steps. The upper row shows the video input timing, the middle row the object detection timing, and the lower row the slice encoding timing. The video input timing and the object detection timing are unchanged; only the slice encoding timing in the lower row differs. When four slice videos are generated, slice encoding starts at timing 2, the same timing as object detection. In this case, once object detection process a has finished, its result can be used to raise the image quality of the object region, but only for the fourth slice video.

Thus, with the slice encoding timing of this embodiment, two steps are needed from video input to completion of encoding, which reduces the delay by one step compared with conventional frame encoding.

FIG. 15 is a flowchart showing the operation procedure of the object detection unit 1101 in the third embodiment. The object detection unit 1101 waits until slice video is input (step S51). When slice video is input, it performs object detection (step S52). For the detection, a method such as the one disclosed in Japanese Patent Application Laid-Open No. 2001-222719 is used. In that method, edges are extracted from the target image to generate an edge image, a voting process using a template is performed at each pixel position of the edge image, the resulting clusters are evaluated, and the position and size of a face contained in the target image are obtained. The object detection information obtained in step S52 is output to the video encoding unit 109 (step S53). The process then returns to step S51.

FIG. 16 is a flowchart showing the operation procedure of the video encoding unit 109 in the third embodiment. The video encoding unit 109 waits until slice video is input (step S61). When slice video is input, it determines whether object detection information has been obtained from the object detection unit 1101 (step S62). If object detection information has not been obtained, steps S63 to S65 are performed in the same way as steps S12 to S14 of the first embodiment.

If object detection information has been obtained, macroblocks are generated from the slice video (step S66), and the MB encoding characteristics supplied by the encoding control unit 106 are corrected on the basis of the object detection information (step S67). Specifically, the quantization step of the macroblocks contained in the region where the object was detected is changed to a smaller value. Steps S68 and S69 then perform the same processing as steps S64 and S65.
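
The correction in step S67 can be sketched as a per-macroblock map of quantization steps. The 16x16 macroblock size, the overlap test, the amount of reduction (`delta`), and the lower bound are assumptions chosen for illustration; the description only states that the quantization step is made smaller for macroblocks in the detected region.

```python
MB_SIZE = 16  # assumed macroblock size in pixels


def corrected_steps(width, height, base_step, region, delta=4, min_step=1):
    """Return a rows x cols grid of quantization steps for one slice video.

    region is (x, y, w, h) in pixels; macroblocks that overlap it get a
    smaller quantization step, all others keep base_step.
    """
    cols = (width + MB_SIZE - 1) // MB_SIZE
    rows = (height + MB_SIZE - 1) // MB_SIZE
    rx, ry, rw, rh = region
    grid = []
    for my in range(rows):
        row = []
        for mx in range(cols):
            x0, y0 = mx * MB_SIZE, my * MB_SIZE
            overlaps = (x0 < rx + rw and rx < x0 + MB_SIZE and
                        y0 < ry + rh and ry < y0 + MB_SIZE)
            row.append(max(min_step, base_step - delta) if overlaps else base_step)
        grid.append(row)
    return grid


steps = corrected_steps(64, 32, base_step=20, region=(20, 4, 10, 10))
```

Here the detected region falls entirely inside the second macroblock of the first row, so only that macroblock is encoded with the finer step.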

Thus, in the video encoding/synthesizing device and video transmission system of the third embodiment, providing the object detection unit 1101 reduces the delay until video is encoded and also raises the image quality of the region where the object was detected.

When the MB encoding characteristics are corrected, a specific quantization step value (for example, 40) may be used, with information about that value exchanged in advance between the video transmission device 101 and the video reception device 103. This allows the object detection result to be conveyed by the video stream itself, without transmitting additional information. The video display unit 115 of the video reception device 103 can then identify the detection region and highlight it, for example by drawing a frame around it.

In the above embodiment, the object detection information is used when slice-encoding the last slice video, but which slice uses it can be changed to suit the object detection processing time.

In the above embodiment, the slice videos are slice-encoded in the order slice video 1, slice video 2, slice video 3, slice video 4, but this order may be changed for each video transmission device. For example, in the slice synthesis shown in FIG. 4, the slice video corresponding to the slice data that will be used for synthesis is slice-encoded last. As a result, only slice videos whose object regions have been enhanced using the detection result are used for slice synthesis.

(Fourth Embodiment)
The configurations of the video encoding/synthesizing device and the video transmission system in the fourth embodiment are the same as in the first embodiment. FIG. 17 is a flowchart showing the operation procedure of the video preprocessing unit 108 in the fourth embodiment of the present invention. Steps S71 to S73 of the fourth embodiment are the same as steps S1 to S3 of the first embodiment, so their description is omitted.

After outputting slice video to the video encoding unit 109 in step S73, the video preprocessing unit 108 determines whether that slice video is the one immediately before the last slice video (step S74). If it is not, the process returns to step S73. If it is, the frame video input from the camera, rather than a slice video, is output to the video encoding unit 109 (step S75). The process then returns to step S71.

FIG. 18 is a flowchart showing the operation procedure of the video encoding unit 109 in the fourth embodiment. The processing is basically the same as that of FIG. 6 in the first embodiment and FIG. 16 in the third embodiment, with the following difference. Steps S81 and S82 are the same as steps S61 and S62 of FIG. 16, steps S83 to S85 are the same as steps S12 to S14 of FIG. 6, and steps S87 to S90 are the same as steps S66 to S69 of FIG. 16.

If object detection information has been obtained in step S82, it is used to cut out the object detection region from the frame video received from the video preprocessing unit 108 (step S86). As a result, the region cut out of the frame video is slice-encoded in place of the last slice video.

Thus, in the video encoding/synthesizing device and video transmission system of the fourth embodiment, the frame video is output in step S75 and the object detection region is cut out in step S86, so that the cut-out portion of the frame video is encoded in place of the last slice video. Only the object detection region is thereby slice-encoded at high image quality from high-resolution video data.

If the object detection region is large, it may be reduced in size before slice encoding. Also, although the frame video is used in place of the last slice video here, this is not limited to the last slice video and can be changed to suit the processing time of the object detection unit.

(Fifth Embodiment)
The configurations of the video encoding/synthesizing device and the video transmission system in the fifth embodiment are the same as in the first embodiment. FIG. 19 is a flowchart showing the operation procedure of the video encoding unit 109 in the fifth embodiment of the present invention. The processing of the fifth embodiment is basically the same as that of FIG. 16 in the third embodiment but differs in step S103. Steps S101 and S102 are the same as steps S61 and S62 of FIG. 16, and steps S104 to S107 are the same as steps S66 to S69 of FIG. 16.

If object detection information is not obtained in step S102, the video encoding unit 109 retrieves past object detection information held in the object detection unit 1101 (step S103). Using this past information, regions where the object is likely to be present are encoded at high image quality until the object detection unit finishes its processing.

Thus, in the video encoding/synthesizing device and video transmission system of the fifth embodiment, retaining past object detection information makes it possible to slice-encode at high image quality the regions where the object is likely to be present.

(Sixth Embodiment)
The configurations of the video encoding/synthesizing device and the video transmission system in the sixth embodiment are the same as in the first embodiment. FIG. 20 is a flowchart showing the operation procedure of the video encoding unit 109 in the sixth embodiment of the present invention. The processing of the sixth embodiment is basically the same as that of FIG. 16 in the third embodiment but differs in the step described below. Steps S111 to S115 are the same as steps S61 to S65 of FIG. 16, and steps S117 to S120 are the same as steps S66 to S69 of FIG. 16.

When object detection information has been obtained in step S112, the video encoding unit 109 determines whether it contains a detected object (step S116). If no detected object is contained, the process returns to step S111 and slice encoding is skipped. If a detected object is contained, the process proceeds to step S117.
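
The branching of steps S112 and S116 can be sketched as below. How detection results are represented is not specified in the description, so the convention used here (`None` for "no result yet", an empty list for "detection finished, nothing found") and the `encode` callback are assumptions.

```python
def encode_frame_slices(slices, detections, encode):
    """Encode the slice videos of one frame, skipping slices without an object.

    detections[i] is None when no detection result is available for slice i,
    an empty list when detection found nothing, otherwise a list of regions.
    """
    encoded = []
    for slice_video, regions in zip(slices, detections):
        if regions is None:
            # No detection information: normal slice encoding (S113 to S115).
            encoded.append(encode(slice_video, regions=[]))
        elif not regions:
            # Detection finished but found no object: skip this slice (S116).
            continue
        else:
            # Object present: encode with the correction applied (S117 to S120).
            encoded.append(encode(slice_video, regions=regions))
    return encoded


def fake_encode(slice_video, regions):
    """Stand-in for the real encoder: records which slices were encoded."""
    return (slice_video, len(regions))


result = encode_frame_slices(
    ["s1", "s2", "s3"],
    [None, [], [(0, 0, 8, 8)]],
    fake_encode,
)
```

Only the first and third slices are encoded in this example; the second is dropped because detection completed without finding an object, which is where the reduction in transmitted data comes from.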

Thus, in the video encoding/synthesizing device and video transmission system of the sixth embodiment, the object detection information is checked for a detected object, and slice encoding is skipped for slice videos in which no object is present. This reduces the amount of slice data transmitted over the network.

The present invention is not limited to what is shown in the above embodiments. Modifications and applications made by those skilled in the art on the basis of the description in the specification and well-known techniques are also anticipated by the present invention and fall within the scope for which protection is sought.

For example, by using a wired LAN, a wireless LAN, a public network, or the like as the network, it is possible to build a video transmission system that encodes and transmits video from remote locations and can combine and display a plurality of videos. In addition, although the above embodiments show the case where the number of slices is four, the present invention is equally applicable to other values that allow pixel thinning, such as 9, 16, 25, and so on.

In the above embodiments, the slice data used for slice synthesis is obtained by simple thinning. For example, in the slice-encoded data 401 of FIG. 4, slice data 1-1 is obtained by simply thinning out slice data 1-2, 1-3, and 1-4. The present invention is not limited to simple thinning and applies equally when the average of neighboring pixel values is used. For example, in the input video 202 of FIG. 2, slice data may be obtained from the sum of the pixel values of pixels 1, 2, x+1, and x+2 divided by 4. To return to a single screen, the inverse operation is performed.
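
The averaging variant can be sketched as a 2x2 block average. This assumes that pixels 1, 2, x+1, and x+2 denote the four pixels of one 2x2 group in a frame of width x, which is an interpretation of the numbering rather than something this section states; the function name is hypothetical.

```python
def average_downsample(frame):
    """Replace each 2x2 pixel group with the mean of its four pixel values.

    frame is a list of rows with even width and height.
    """
    height, width = len(frame), len(frame[0])
    return [[(frame[y][x] + frame[y][x + 1] +
              frame[y + 1][x] + frame[y + 1][x + 1]) / 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]


small = average_downsample([[1, 3, 10, 10],
                            [5, 7, 10, 10]])
```

Compared with simple thinning, averaging uses all four input pixels for the reduced image, so the sub-screen is less prone to aliasing, at the cost of a little arithmetic per output pixel.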

The above embodiments deal with video input from a plurality of cameras, but the present invention is not limited to cameras and is equally applicable when video recorded on a plurality of recording media is input.

The present invention makes it easy to display video at the original resolution in addition to reduced video, makes it possible to equalize image quality across the divided videos even when the characteristics of the input videos vary widely, and makes it possible to reduce the delay until video is encoded. It is useful for video encoding/synthesizing devices, video encoding/synthesizing methods, video transmission systems, and the like that encode input video and combine a plurality of encoded video data.

FIG. 1 is a diagram showing the configuration of the video transmission system according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the thinning process and the slice video creation process in the first embodiment.
FIG. 3 is a diagram showing how macroblocks are handled in the video encoding unit of the first embodiment.
FIG. 4 is a diagram showing the slice synthesis operation in the synthesizing unit of the first embodiment.
FIG. 5 is a flowchart showing the operation procedure of the video preprocessing unit in the first embodiment.
FIG. 6 is a flowchart showing the operation procedure of the video encoding unit in the first embodiment.
FIG. 7 is a diagram showing the slice encoding operation performed by the video encoding unit in step S13 of the first embodiment.
FIG. 8 is a flowchart showing the operation procedure of the synthesizing unit in the first embodiment.
FIG. 9 is a flowchart showing the operation procedure of the video decoding unit in the first embodiment.
FIG. 10 is a diagram showing the configuration of the video transmission system according to the second embodiment of the present invention.
FIG. 11 is a flowchart showing the operation procedure of the synthesizing unit in the second embodiment.
FIG. 12 is a diagram showing the video transmission device management table and the slice synthesis data management table in the second embodiment.
FIG. 13 is a diagram showing the configuration of the video transmission system according to the third embodiment of the present invention.
FIG. 14 is a diagram showing the timing of the encoding process in the third embodiment.
FIG. 15 is a flowchart showing the operation procedure of the object detection unit in the third embodiment.
FIG. 16 is a flowchart showing the operation procedure of the video encoding unit in the third embodiment.
FIG. 17 is a flowchart showing the operation procedure of the video preprocessing unit in the fourth embodiment.
FIG. 18 is a flowchart showing the operation procedure of the video encoding unit in the fourth embodiment.
FIG. 19 is a flowchart showing the operation procedure of the video encoding unit in the fifth embodiment.
FIG. 20 is a flowchart showing the operation procedure of the video encoding unit in the sixth embodiment.
FIG. 21 is a diagram showing a configuration example of a conventional video encoding/synthesizing device.
FIG. 22 is a diagram showing the screen of the video display unit of a user terminal in the conventional example.
FIG. 23 is a diagram showing the configuration of the video input unit of a user terminal in the conventional example.
FIG. 24 is a diagram showing the video data synthesis operation in the conventional example.

Explanation of symbols

101 Video transmission device
102 Video synthesis device
103 Video reception device
104 First network
105 Second network
107 Video input unit
108 Video pre-processing unit
109 Video encoding unit
110 Transmission / reception unit
111 Synthesis control unit
112, 114 Transmission / reception units
113 Synthesis unit
115 Video display unit
116 Video decoding unit
117 Transmission / reception unit
118 User interface (UI)

Claims

A pre-processing unit that generates a slice video from the input video;
An encoding unit for encoding the slice video;
A synthesizing unit that synthesizes the plurality of slice-encoded video data so as to be a multi-screen display;
A decoding unit that decodes the plurality of video data synthesized to be the multi-screen display or the slice-encoded video data;
A restoration unit that restores the decoded video data to the original video data so as to be displayed on a single screen;
A selection unit for selecting video data synthesized and decoded to be the multi-screen display or video data restored to be the single-screen display;
A display unit for displaying a video based on the selected video data on a screen;
A video encoding / synthesizing device comprising the above units, the device further comprising:
An object detection unit that detects a specific object included in the video,
wherein, when the detection of the specific object by the object detection unit is not completed,
the encoding unit changes encoding characteristics based on past detection results for the specific object.

The video encoding / synthesizing device according to claim 1,
wherein, when the specific object is not detected by the object detection unit,
the encoding unit omits encoding of the divided video.

The video encoding / synthesizing device according to claim 2,
wherein, when the specific object is detected by the object detection unit,
the encoding unit changes encoding characteristics.

The video encoding / synthesizing device according to claim 3,
wherein the encoding unit extracts, from the video, the area corresponding to the detected specific object and encodes that area.

The video encoding / synthesizing device according to claim 4,
wherein, when the slice-encoded video data is synthesized to be a multi-screen display,
the encoding unit performs encoding so that the code amount becomes a target code amount.
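
As an illustration only, the behavior recited for the encoding unit in the device claims above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Detection`, `EncodePlan`, and `plan_encoding` are hypothetical, "changing encoding characteristics" is reduced to a single boolean flag, and reusing the past region while detection is pending is one possible reading of "based on past detection results".

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Tuple

# (x, y, width, height); a hypothetical representation of an object area
Region = Tuple[int, int, int, int]


class Detection(Enum):
    PENDING = auto()       # detection has not completed for this frame
    NOT_DETECTED = auto()  # detection completed, no specific object found
    DETECTED = auto()      # detection completed, specific object found


@dataclass
class EncodePlan:
    skip: bool                   # True: omit encoding of this video
    region: Optional[Region]     # area to extract and encode (None = whole video)
    changed_characteristics: bool  # stand-in for "changed encoding characteristics"
    target_bits: Optional[int]   # code-amount target, used only for multi-screen


def plan_encoding(state: Detection,
                  region: Optional[Region],
                  past_region: Optional[Region],
                  multi_screen: bool,
                  target_bits: int) -> EncodePlan:
    """Choose how to encode one video from the detection state (assumed logic)."""
    if state is Detection.PENDING:
        # Detection not completed: fall back to the past detection result.
        plan = EncodePlan(False, past_region, past_region is not None, None)
    elif state is Detection.NOT_DETECTED:
        # No specific object: encoding is omitted.
        plan = EncodePlan(True, None, False, None)
    else:
        # Object detected: change characteristics and encode its area.
        plan = EncodePlan(False, region, True, None)

    # When the data will be synthesized into a multi-screen display,
    # encode toward a target code amount.
    if multi_screen and not plan.skip:
        plan.target_bits = target_bits
    return plan
```

For example, `plan_encoding(Detection.PENDING, None, (0, 0, 8, 8), True, 1000)` yields a plan that reuses the past region `(0, 0, 8, 8)` and carries a target of 1000 bits, while a `NOT_DETECTED` state yields a plan with `skip` set.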

A pre-processing step for generating a slice video from the input video;
An encoding step of performing slice encoding on the divided slice video;
A synthesis step of synthesizing the plurality of slice-coded video data so as to be a multi-screen display;
A decoding step of decoding a plurality of video data synthesized to be the multi-screen display or the slice-encoded video data;
A restoration step of restoring the decoded video data to the original video data so as to be displayed on a single screen;
A selection step of selecting the video data synthesized and decoded to be the multi-screen display or the video data restored to be the single-screen display;
A display step of displaying a video based on the selected video data on a screen;
A video encoding / synthesizing method comprising the above steps, the method further comprising:
An object detection step of detecting a specific object included in the video,
wherein, when the detection of the specific object in the object detection step is not completed,
the encoding step changes encoding characteristics based on past detection results for the specific object.
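
The pre-processing, synthesis, restoration, and selection steps of the method above can be illustrated with a toy data-flow sketch. Several things here are assumptions that the claims do not specify: the slice videos are produced by 2x2 pixel thinning (four reduced slices per input frame), frames are plain lists of pixel rows with even dimensions, and encoding and decoding are omitted entirely. The function names `make_slices`, `restore`, `compose_multi_screen`, and `select` are hypothetical.

```python
from typing import List

# Rows of pixel values; a hypothetical stand-in for real video frames.
Frame = List[List[int]]

# The four 2x2 sampling phases, as (row offset, column offset).
PHASES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def make_slices(frame: Frame) -> List[Frame]:
    """Pre-processing: thin a frame into four reduced slice videos."""
    return [[row[dx::2] for row in frame[dy::2]] for dy, dx in PHASES]


def restore(slices: List[Frame]) -> Frame:
    """Restoration: interleave four slices back into the original frame."""
    h, w = len(slices[0]), len(slices[0][0])
    out = [[0] * (2 * w) for _ in range(2 * h)]
    for sl, (dy, dx) in zip(slices, PHASES):
        for y in range(h):
            for x in range(w):
                out[2 * y + dy][2 * x + dx] = sl[y][x]
    return out


def compose_multi_screen(tiles: List[Frame]) -> Frame:
    """Synthesis: tile four reduced slices (one per source) into a 2x2 screen."""
    top = [a + b for a, b in zip(tiles[0], tiles[1])]
    bottom = [a + b for a, b in zip(tiles[2], tiles[3])]
    return top + bottom


def select(multi: Frame, single: Frame, want_single: bool) -> Frame:
    """Selection: choose the multi-screen or the restored single-screen data."""
    return single if want_single else multi
```

Under these assumptions, each slice is itself a reduced view of the whole frame, so one slice from each of four sources fills a multi-screen frame of the original size, while all four slices of one source restore that source at its original resolution.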

A video transmission system in which a video transmission device, a video synthesis device, and a video reception device are connected via a network, and which encodes input video and synthesizes and displays a plurality of encoded video data, wherein
The video transmission device includes:
A pre-processing unit that generates a slice video from the input video;
An encoding unit for encoding the slice video;
A first transmission unit that transmits the slice-encoded video data to the network;
The video synthesis device includes:
A first receiving unit that receives the slice-encoded video data from the network;
A synthesizing unit that synthesizes a plurality of slice-encoded video data input from different video transmission devices so as to be a multi-screen display;
A second transmission unit that transmits the synthesized video data to the network;
The video reception device includes:
A second receiving unit that receives, from the network, the plurality of video data synthesized to be the multi-screen display or the slice-encoded video data;
A decoding unit that decodes the plurality of video data synthesized to be the multi-screen display or the slice-encoded video data;
A restoration unit that restores the decoded video data to the original video data so as to be displayed on a single screen;
A selection unit for selecting video data synthesized and decoded to be the multi-screen display or video data restored to be the single-screen display;
A display unit for displaying a video based on the selected video data on a screen;
The video transmission device further includes:
An object detection unit that detects a specific object included in the video,
wherein, when the detection of the specific object by the object detection unit is not completed,
the encoding unit changes encoding characteristics based on past detection results for the specific object.

A pre-processing unit that generates a slice video from the input video;
An encoding unit for encoding the slice video;
A synthesizing unit that synthesizes the plurality of slice-encoded video data so as to be a multi-screen display;
A decoding unit that decodes the plurality of video data combined to be the multi-screen display or the slice-encoded video data that is not combined;
A restoration unit that restores the video data that is not synthesized to the original video data so as to be displayed on a single screen;
A video encoding / synthesizing device comprising the above units, the device further comprising:
An object detection unit that detects a specific object included in the video,
wherein the encoding unit
changes encoding characteristics when the specific object is detected by the object detection unit, and
changes encoding characteristics based on past detection results for the specific object when detection of the specific object is not completed.