JP2010226594A

JP2010226594A - Image transmission device and imaging device mounting the same

Info

Publication number: JP2010226594A
Application number: JP2009073556A
Authority: JP
Inventors: Hideo Hirono; 英雄廣野
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2009-03-25
Filing date: 2009-03-25
Publication date: 2010-10-07
Anticipated expiration: 2029-03-25
Also published as: JP5235746B2

Abstract

PROBLEM TO BE SOLVED: To improve transmission efficiency of a moving image in a region of interest.

SOLUTION: In an imaging device, a coding unit 204 calculates a prediction code amount of an entire image in the region of interest based on a code amount generated when coding the entire image in the region of interest. The coding unit 204 calculates the prediction code amount of a specific object included in the region of interest based on the code amount generated when coding the specific object included in the region of interest. A transmission unit 206 calculates a transmission rate of a wireless communication network. A control unit 207 controls a region-of-interest setting unit 202 and a region-of-interest processing unit 203 so as to change a ratio of the specific object in the region of interest on a frame image of a coding target according to the transmission rate, the prediction code amount of the entire image in the region of interest, and the prediction code amount of the specific object included in the region of interest.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to an image transmission device that encodes and transmits a moving image, and to an imaging device that uses it.

In recent years, digital movie cameras capable of shooting moving images have become widespread. Their image quality improves year by year, and models that support full HD (High Definition) quality have been put into practical use. Along with this, digital movie cameras that compress and encode moving images according to the H.264/AVC standard, which offers high compression efficiency, have also been put into practical use.

Some of these digital movie cameras extract feature quantities of a specific object, such as its shape and color, and track the object. With such a camera, a user can, for example, shoot a child running at a school sports day while tracking the child so that the child stays within a predetermined region of interest (see, for example, Patent Document 1).

Patent Document 1: JP-A-7-95597

In addition, with the development of wireless communication technologies such as mobile phones and wireless LANs (Local Area Networks), large amounts of data can now be transmitted at high speed over wireless communication networks as well. For this reason, digital movie cameras are expected not only to record captured moving images on a recording medium but also to transmit them over a wireless communication network.

However, the transmission rate of a wireless communication network is easily affected by the propagation environment, and a sufficient transmission rate cannot always be secured. On the other hand, when a user shoots while tracking a specific object, the user mainly wants to view the moving image in the region of interest that contains that object. If the moving image in the region of interest can be transmitted efficiently, it is therefore often unnecessary to transmit the moving image of the entire captured screen.

The present invention has been made in view of this situation, and its object is to provide an image transmission device capable of efficiently transmitting a moving image in a region of interest, and an imaging device equipped with the image transmission device.

One aspect of the present invention is an image transmission device. The image transmission device comprises: a setting unit that sets a region of interest on an image so as to include a specific object on the image; an encoding unit that encodes the image in the region of interest to generate region-of-interest encoded data; a transmission unit that transmits the region-of-interest encoded data; and a processing unit that changes the proportion of the region of interest occupied by the specific object according to the transmission rate of the region-of-interest encoded data acquired by the transmission unit and the code amounts of the region-of-interest encoded data and of the specific object predicted by the encoding unit.

Preferably, when the transmission rate is lower than a first threshold, the code amount of the region-of-interest encoded data is larger than a second threshold, and the code amount of the specific object is larger than a third threshold, the processing unit executes processing that reduces the proportion of the region of interest occupied by the specific object.

Preferably, when the transmission rate is lower than the first threshold, the code amount of the region-of-interest encoded data is larger than the second threshold, and the code amount of the specific object is smaller than the third threshold, the processing unit executes processing that increases the proportion of the region of interest occupied by the specific object.

According to the present invention, the transmission efficiency of a moving image in a region of interest can be improved.

FIG. 1 is a conceptual diagram showing the configuration of the imaging device 1 according to an embodiment of the present invention.

A flowchart for changing the proportion of the region of interest occupied by the specific object.

Before describing the present invention in detail, an overview is given. The embodiment of the present invention relates to an imaging device that can capture images while tracking a specific object. When the imaging device detects a specific object such as a person in a frame image of a moving image, it sets a region of interest on the frame image (hereinafter also referred to as the whole region) so as to include that object. The imaging device also extracts a feature quantity from the object (for example, the shape of a person's face) and tracks the object. It then makes the region of interest follow the movement of the object so that the object stays continuously within the region of interest.

Furthermore, the imaging device transmits the moving image in the region of interest over a wireless communication network (for example, a wireless LAN such as IEEE 802.11n). When data is transmitted over a wireless communication network, the transmission rate fluctuates with the propagation environment, so the moving image in the region of interest sometimes cannot be transmitted within a predetermined period.

Therefore, the imaging device according to the embodiment of the present invention acquires the transmission rate of the wireless communication network, predicts the code amount generated when the moving image in the region of interest is encoded and the code amount generated when the object is encoded, and variably controls the proportion of the region of interest occupied by the object according to the transmission rate.

As a result, the code amount of the object included in the region of interest can be adjusted according to the transmission rate, and the moving image in the region of interest can be transmitted efficiently in accordance with the propagation environment.

FIG. 1 is a conceptual diagram showing the configuration of the imaging device 1 according to the embodiment of the present invention. The imaging device 1 includes an imaging unit 10 and an image transmission device 20.

The imaging unit 10 acquires a moving image and supplies it to the image transmission device 20. The imaging unit 10 includes a solid-state image sensor such as a CCD (Charge Coupled Device) sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor, and a signal processing unit that converts the analog three-primary-color signals from the solid-state image sensor into digital luminance and color-difference signals.

The image transmission device 20 compresses and encodes the moving image acquired from the imaging unit 10, for example according to the H.264/AVC standard, to generate encoded data (hereinafter also referred to as whole-region encoded data), and records it on a recording medium. The image transmission device 20 also compresses and encodes the moving image in the region of interest set on the moving image to generate encoded data (hereinafter also referred to as region-of-interest encoded data), and outputs it over the wireless communication network to an image receiving device (not shown).

The image transmission device 20 includes an object extraction unit 201, a region-of-interest setting unit 202, a region-of-interest processing unit 203, an encoding unit 204, a recording unit 205, a transmission unit 206, and a control unit 207.

The object extraction unit 201 detects a specific object in the frame images of the moving image received from the imaging unit 10. The user can designate the specific object that the object extraction unit 201 should detect. Examples of objects include people, pets such as dogs and cats, and moving bodies such as cars and airplanes. In the following, for convenience of explanation, a person is assumed to be the specific object to be detected.

The object extraction unit 201 identifies a person by detecting a face in the frame image. Below the face region containing the detected face, it sets a torso region whose size is proportional to the size of the face region. Face detection may be performed by any known method; in this embodiment, an edge detection method is used.
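The torso-region step can be illustrated with a minimal sketch. The patent only says the torso region is placed below the face region and sized in proportion to it; the `Rect` type, the function name `torso_from_face`, and the scale factors 2.0 and 3.0 are assumptions chosen for illustration, not values from the source.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int  # left edge in pixels
    y: int  # top edge in pixels
    w: int  # width in pixels
    h: int  # height in pixels


def torso_from_face(face: Rect, frame_w: int, frame_h: int,
                    width_scale: float = 2.0,
                    height_scale: float = 3.0) -> Rect:
    """Place a torso rectangle directly below the face, sized in proportion to it.

    The scale factors are hypothetical; the source gives no concrete ratio.
    """
    w = int(face.w * width_scale)
    h = int(face.h * height_scale)
    center_x = face.x + face.w // 2
    x = center_x - w // 2          # horizontally centered under the face
    y = face.y + face.h            # starts at the bottom edge of the face
    # Clip the rectangle to the frame boundaries.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    return Rect(x0, y0, max(0, x1 - x0), max(0, y1 - y0))


torso = torso_from_face(Rect(100, 50, 40, 40), frame_w=640, frame_h=480)
```

The color of the resulting rectangle is what the tracking step would then search for in subsequent frames.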

The object extraction unit 201 tracks the person by searching subsequent frame images for a region whose color is similar to that of the torso region. Tracking accuracy can be improved by also taking into account the result of face detection in the subsequent frame images.

The region-of-interest setting unit 202 sets a region of interest on the frame image so as to include the person detected by the object extraction unit 201. The region of interest may be a rectangle that contains the entire person together with the surrounding area. The region of interest may be set by the user, or it may be set automatically based on the position and size information of the detected person within the whole region, received from the object extraction unit 201.

The region-of-interest setting unit 202 also detects the motion vector of the person as a whole from the tracking state of the object extraction unit 201, and moves the region of interest according to that motion vector so that the region of interest follows the person's movement.
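As a rough illustration of this following step, the sketch below shifts a rectangular region of interest by a motion vector and keeps it inside the frame. The tuple layout `(x, y, w, h)`, the function name `follow`, and the clamping behavior at the frame edge are assumptions; the source does not say what happens when the person nears the border.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (x, y, w, h) in pixels


def follow(roi: Rect, motion: Tuple[int, int],
           frame_w: int, frame_h: int) -> Rect:
    """Shift the region of interest by the person's motion vector.

    Assumes the region of interest is no larger than the frame. The shifted
    rectangle is clamped so that it never leaves the frame (an assumption).
    """
    x, y, w, h = roi
    dx, dy = motion
    new_x = min(max(x + dx, 0), frame_w - w)
    new_y = min(max(y + dy, 0), frame_h - h)
    return (new_x, new_y, w, h)


moved = follow((200, 100, 100, 100), (15, -20), frame_w=640, frame_h=480)
```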

Furthermore, in accordance with instructions from the control unit 207, the region-of-interest setting unit 202 executes processing on the region-of-interest moving image that changes the proportion of the region of interest occupied by the person. For example, when the control unit 207 instructs it to increase the proportion occupied by the person, the region-of-interest setting unit 202 cancels the region of interest that was set to include the target person (hereinafter also referred to as the first region of interest) and, based on the enlargement ratio specified by the control unit 207, sets a region of interest that includes the target person and is smaller than the first region of interest (hereinafter also referred to as the second region of interest). Because the second region of interest is smaller than the first, the person occupies a larger proportion of the second region of interest than of the first.

Conversely, when the control unit 207 instructs it to decrease the proportion occupied by the person, the region-of-interest setting unit 202 cancels the first region of interest and, based on the reduction ratio specified by the control unit 207, sets a region of interest that includes the target person and is larger than the first region of interest (hereinafter also referred to as the third region of interest). Because the third region of interest is larger than the first, the person occupies a smaller proportion of the third region of interest than of the first.
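The relationship between the first, second, and third regions of interest can be sketched as follows. This is only one possible reading: the patent does not define how the magnification maps to a rectangle, so the sketch assumes the magnification scales the person's apparent linear size, that the new rectangle is centered on the person, and that the person lies fully inside the frame. `Rect`, `occupancy`, and `resize_roi` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def occupancy(person: Rect, roi: Rect) -> float:
    """Fraction of the region-of-interest area covered by the person's box."""
    return (person.w * person.h) / (roi.w * roi.h)


def resize_roi(roi: Rect, person: Rect, magnification: float,
               frame_w: int, frame_h: int) -> Rect:
    """Return a new region of interest centered on the person.

    magnification > 1 yields a smaller region (person occupies more);
    magnification < 1 yields a larger region (person occupies less).
    The region never becomes smaller than the person or larger than the frame.
    """
    new_w = min(max(round(roi.w / magnification), person.w), frame_w)
    new_h = min(max(round(roi.h / magnification), person.h), frame_h)
    # Center on the person, then clamp into the frame.
    x = person.x + (person.w - new_w) // 2
    y = person.y + (person.h - new_h) // 2
    x = min(max(x, 0), frame_w - new_w)
    y = min(max(y, 0), frame_h - new_h)
    return Rect(x, y, new_w, new_h)


first = Rect(100, 100, 200, 200)
person = Rect(170, 150, 60, 100)
second = resize_roi(first, person, 2.0, 640, 480)   # smaller region
third = resize_roi(first, person, 0.5, 640, 480)    # larger region
```

Under these assumptions the person's occupancy rises in `second` and falls in `third` relative to `first`, which matches the behavior described above.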

The region-of-interest processing unit 203 receives the position information, shape information, and so on of the region of interest on the frame image from the region-of-interest setting unit 202. Referring to this information, it extracts the image in the region of interest from each frame image of the moving image received from the imaging unit 10.

The region-of-interest processing unit 203 treats the image in the region of interest extracted from each frame image as a unit image (hereinafter also referred to as a region-of-interest unit image), and composes a region-of-interest moving image as a sequence of these unit images.
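The extraction of unit images can be pictured with a small sketch. Frames are modeled here as nested lists of pixel values and regions as `(x, y, w, h)` tuples; both representations, and the names `crop` and `build_roi_video`, are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]           # rows of pixel values
Rect = Tuple[int, int, int, int]  # (x, y, w, h)


def crop(frame: Frame, roi: Rect) -> Frame:
    """Extract the region-of-interest unit image from one frame."""
    x, y, w, h = roi
    return [row[x:x + w] for row in frame[y:y + h]]


def build_roi_video(frames: Sequence[Frame], rois: Sequence[Rect]) -> List[Frame]:
    """Crop each frame with its own region of interest, in order."""
    return [crop(frame, roi) for frame, roi in zip(frames, rois)]


# A 4x4 test frame whose pixel value encodes (row, column) as row*10 + column.
frame = [[r * 10 + c for c in range(4)] for r in range(4)]
unit_image = crop(frame, (1, 2, 2, 2))
```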

The region-of-interest processing unit 203 supplies the encoding unit 204 with the region-of-interest moving image, together with the information received from the region-of-interest setting unit 202: the position and shape information of the region of interest on the frame image (hereinafter also referred to as the first position information, etc.) and the position and shape information of the person (hereinafter also referred to as the second position information, etc.).

The encoding unit 204 performs motion compensation, orthogonal transformation, quantization, entropy coding, and so on for the moving image received from the imaging unit 10, generates whole-region encoded data using a compression encoding method compliant with the H.264/AVC standard, and supplies it to the recording unit 205.

The encoding unit 204 performs the same processing on the region-of-interest moving image received from the region-of-interest processing unit 203, generates region-of-interest encoded data using a compression encoding method compliant with the H.264/AVC standard, and supplies it to the transmission unit 206.

Furthermore, referring to the first and second position information, etc., the encoding unit 204 calculates the code amount generated when the H.264/AVC-compliant compression encoding method is applied to the entire image in the region of interest and to the person included in the region of interest, respectively. In general, moving images are correlated in the time direction, and the code amount generated by encoding a given frame image is often close to the code amount generated by encoding temporally adjacent frame images. The encoding unit 204 therefore supplies the control unit 207 with the code amounts obtained for the entire image in the region of interest currently being encoded and for the person included in it, as the predicted code amounts of the entire image in the region of interest to be encoded next and of the person included in that region of interest.
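The prediction described here amounts to carrying the most recent measured code amounts forward by one frame. A minimal sketch follows, assuming code amounts are tracked in bits; the class name `CodeAmountPredictor` and its methods are hypothetical and do not correspond to any encoder API.

```python
from typing import Optional, Tuple


class CodeAmountPredictor:
    """Use the code amounts of the frame just encoded as the prediction
    for the next frame, relying on temporal correlation between frames."""

    def __init__(self) -> None:
        self._last: Optional[Tuple[int, int]] = None

    def update(self, roi_bits: int, person_bits: int) -> None:
        """Record the code amounts measured for the frame just encoded."""
        self._last = (roi_bits, person_bits)

    def predict(self) -> Optional[Tuple[int, int]]:
        """Return (predicted ROI bits, predicted person bits) for the next
        frame, or None if no frame has been encoded yet."""
        return self._last


predictor = CodeAmountPredictor()
predictor.update(roi_bits=120_000, person_bits=70_000)
prediction = predictor.predict()
```

A smoothed estimate over several frames would also be possible, but the source describes only the previous-frame value.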

In the description above, when the control unit 207 instructs that the proportion of the region of interest occupied by the person be increased, the region-of-interest setting unit 202 makes the area from which the image is extracted smaller than the region of interest. Conversely, when the control unit 207 instructs that the proportion be decreased, the region-of-interest setting unit 202 makes the area from which the image is extracted larger than the region of interest.

However, the invention is not limited to this, and the region-of-interest processing unit 203 may instead change the proportion of the region of interest occupied by the person. For example, when the control unit 207 instructs it to increase the proportion occupied by the person, the region-of-interest processing unit 203 performs interpolation on the person's pixel data. Specifically, the region-of-interest processing unit 203 identifies the macroblocks containing the person's pixel data from the position and shape information of the person on the frame image received from the region-of-interest setting unit 202. Based on the enlargement ratio specified by the control unit 207, it calculates the coefficients of an FIR (Finite Impulse Response) filter and constructs an interpolation filter with a predetermined passband characteristic. It then enlarges the person's shape by inserting zeros into the pixel data of the identified macroblocks at predetermined intervals in the horizontal and vertical directions and applying the constructed interpolation filter two-dimensionally.
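Zero insertion followed by FIR filtering can be sketched in one dimension and then applied separably to rows and columns. The patent does not give the filter coefficients, so the triangular (linear-interpolation) kernel below is an assumption, as are the integer-only `factor` and the function names. Samples beyond the block edge are treated as zero, so values taper at the trailing edge.

```python
from typing import List, Sequence


def upsample_1d(samples: Sequence[float], factor: int) -> List[float]:
    """Enlarge a 1-D signal by an integer factor: zero insertion + FIR filter."""
    # Step 1: insert (factor - 1) zeros after every sample.
    stuffed: List[float] = []
    for s in samples:
        stuffed.append(float(s))
        stuffed.extend([0.0] * (factor - 1))
    # Step 2: triangular FIR kernel (assumed), e.g. [0.5, 1.0, 0.5] for factor 2.
    taps = [1.0 - abs(k) / factor for k in range(-(factor - 1), factor)]
    half = factor - 1
    out: List[float] = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, tap in enumerate(taps):
            i = n + k - half
            if 0 <= i < len(stuffed):   # samples outside the block count as zero
                acc += tap * stuffed[i]
        out.append(acc)
    return out


def upsample_2d(block: Sequence[Sequence[float]], factor: int) -> List[List[float]]:
    """Apply the 1-D interpolation along rows, then along columns."""
    rows = [upsample_1d(row, factor) for row in block]
    cols = [upsample_1d(list(col), factor) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


line = upsample_1d([0, 2, 4], 2)
enlarged = upsample_2d([[0, 2], [4, 6]], 2)
```

A production filter would normally have more taps and a designed passband; the short kernel here only shows the structure of the operation.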

When the control unit 207 instructs it to decrease the proportion of the region of interest occupied by the person, the region-of-interest processing unit 203 performs thinning (decimation) on the person's pixel data. Specifically, the region-of-interest processing unit 203 identifies the macroblocks containing the person's pixel data from the position and shape information of the person on the frame image received from the region-of-interest setting unit 202. Based on the reduction ratio specified by the control unit 207, it calculates the coefficients of an FIR filter and constructs a thinning filter with a predetermined passband characteristic. It then reduces the person's shape by applying the constructed filter two-dimensionally, in the horizontal and vertical directions, to the pixel data of the identified macroblocks while discarding pixel data at predetermined intervals.
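The thinning step can be sketched as a low-pass FIR filter followed by discarding samples. The box (moving-average) filter used here is an assumption standing in for the unspecified "predetermined passband characteristic", and only integer reduction factors are handled; trailing samples that do not fill a whole group are dropped.

```python
from typing import List, Sequence


def decimate_1d(samples: Sequence[float], factor: int) -> List[float]:
    """Shrink a 1-D signal by an integer factor.

    Each output sample is the box-filtered (averaged) value of one group of
    `factor` input samples, which is equivalent to filtering with a length-
    `factor` moving-average FIR filter and keeping every `factor`-th output.
    """
    taps = [1.0 / factor] * factor
    out: List[float] = []
    for start in range(0, len(samples) - factor + 1, factor):
        window = samples[start:start + factor]
        out.append(sum(t * s for t, s in zip(taps, window)))
    return out


def decimate_2d(block: Sequence[Sequence[float]], factor: int) -> List[List[float]]:
    """Apply the 1-D thinning along rows, then along columns."""
    rows = [decimate_1d(row, factor) for row in block]
    cols = [decimate_1d(list(col), factor) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


thinned = decimate_1d([2, 4, 6, 8], 2)
small = decimate_2d([[0, 2, 4, 6], [4, 6, 8, 10]], 2)
```

Filtering before discarding samples is what limits aliasing; dropping pixels without the filter would also shrink the person but with visible artifacts.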

When the region-of-interest processing unit 203 has changed the proportion of the region of interest occupied by the person in the region-of-interest moving image in response to an instruction from the control unit 207, it supplies the encoding unit 204 with the position and shape information of the person on the frame image after the change (hereinafter also referred to as the third position information, etc.), in addition to the first and second position information, etc.

In this case, the encoding unit 204 refers to the first through third position information, etc., and calculates the code amount generated when the H.264/AVC-compliant compression encoding method is applied to the entire image in the region of interest and to the person included in the region of interest, respectively.

Furthermore, when the region-of-interest processing unit 203 has performed thinning on the person's pixel data, the encoding unit 204 uses the second and third position information, etc. received from the region-of-interest processing unit 203 to identify the area enclosed between the boundary line along the person's shape before thinning and the boundary line along the person's shape after thinning. The encoding unit 204 then performs compensation processing on the macroblocks included in the identified area.

Specifically, the pixel data of the macroblocks in the identified area is predicted from the pixel data of the person's background in the same frame image, or from the pixel data of the person's background in temporally preceding or following frame images. The macroblocks in the identified area are then filled in with the predicted pixel data. Instead of using predicted pixel data, the macroblocks in the identified area may be filled so that they are displayed in a specific color (for example, blue). Alternatively, information about the thinning applied to the person (for example, the reduction ratio) may be included in the region-of-interest encoded data and sent to the image receiving device, which can then use that information to enlarge the person back to the shape it had before thinning.

The recording unit 205 records the whole-region encoded data received from the encoding unit 204 on a recording medium such as an HDD (Hard Disk Drive), an optical disc, or flash memory.

The transmission unit 206 stores the region-of-interest encoded data received from the encoding unit 204 in a transmission buffer (not shown). When the region-of-interest encoded data stored in the transmission buffer reaches a predetermined transmission unit (for example, 1500 bytes), the transmission unit 206 reads the data from the transmission buffer, converts its data format, and transmits it to the wireless communication network.
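The buffering behavior can be sketched as follows. Only the 1500-byte unit size comes from the text; the class name `SendBuffer`, the `push` method, and the choice to return completed units to the caller (rather than performing format conversion and network I/O) are assumptions for illustration.

```python
from typing import List


class SendBuffer:
    """Accumulate encoded bytes and release them in fixed-size transmission units."""

    def __init__(self, unit_size: int = 1500) -> None:
        self.unit_size = unit_size
        self._buf = bytearray()

    def push(self, data: bytes) -> List[bytes]:
        """Append encoded data; return every complete unit now ready to send."""
        self._buf.extend(data)
        units: List[bytes] = []
        while len(self._buf) >= self.unit_size:
            units.append(bytes(self._buf[:self.unit_size]))
            del self._buf[:self.unit_size]
        return units

    @property
    def pending(self) -> int:
        """Number of bytes still waiting for a full unit."""
        return len(self._buf)


buf = SendBuffer()
first_units = buf.push(b"a" * 1000)    # not enough for one unit yet
second_units = buf.push(b"b" * 2500)   # 3500 bytes total: two units, 500 left
```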

The transmission unit 206 also measures the number of bytes of region-of-interest encoded data that it was able to transmit within a predetermined period (for example, one frame period of the moving image), calculates the transmission rate of the wireless communication network for each such period, and supplies it to the control unit 207.
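The rate calculation is simple arithmetic: bytes sent in the period, converted to bits, divided by the period length. The sketch below assumes the rate is expressed in bits per second; the function name and the unit choice are not taken from the source.

```python
def transmission_rate_bps(bytes_sent: int, period_s: float) -> float:
    """Estimate the network transmission rate for one measurement period.

    bytes_sent: bytes of region-of-interest encoded data actually transmitted.
    period_s:   length of the measurement period in seconds
                (for example, one frame period such as 1/30 s).
    """
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    return bytes_sent * 8 / period_s


# 5000 bytes delivered in half a second corresponds to 80 kbit/s.
rate = transmission_rate_bps(5000, 0.5)
```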

The control unit 207 controls the image transmission device as a whole. In addition, based on the predicted code amounts of the entire image in the region of interest to be encoded and of the person included in the region of interest, received from the encoding unit 204, and on the transmission rate of the wireless communication network received from the transmission unit 206, the control unit 207 decides whether the proportion of the region of interest occupied by the person should be changed.

The control unit 207 holds three thresholds determined in advance by simulation or experiment: a threshold for the transmission rate of the wireless communication network (hereinafter also referred to as the first threshold), a threshold for the predicted code amount of the entire image in the region of interest (hereinafter also referred to as the second threshold), and a threshold for the predicted code amount of the person included in the region of interest (hereinafter also referred to as the third threshold). By comparing the transmission rate, the predicted code amount of the entire image in the region of interest, and the predicted code amount of the person included in the region of interest with these thresholds, it decides whether the proportion of the region of interest occupied by the person should be changed.

Specifically, when the transmission rate of the wireless communication network received from the transmission unit 206 is lower than the first threshold, the predicted code amount of the entire image in the region of interest received from the encoding unit 204 is larger than the second threshold, and the predicted code amount of the person included in the region of interest is larger than the third threshold, the control unit 207 instructs the region-of-interest processing unit 203 to decrease the proportion of the region of interest occupied by the person.

When the transmission rate is lower than the first threshold, the amount of information that can be sent to the wireless communication network within a predetermined period is also small, so the code amount of the region-of-interest encoded data should be as small as possible. When the predicted code amount of the person included in the region of interest is larger than the third threshold, the encoding of the person is considered to contribute more to the code amount of the region-of-interest encoded data than the encoding of the rest of the region of interest.

Therefore, when the predicted code amount of the entire image in the region of interest is larger than the second threshold, so that the control unit 207 judges the code amount of the region-of-interest encoded data to be too large relative to the transmission rate of the wireless communication network, and it judges the main cause to be the encoding of the person included in the region of interest, it decreases the proportion of the region of interest occupied by the person.

When the person occupies a smaller proportion of the region of interest, fewer macroblocks contain pixel data of the person to be encoded, and the magnitude of the person's motion vectors also decreases in proportion. The code amount generated by encoding the person can therefore be reduced. Because the code amount of the person, which contributes heavily to the code amount of the region-of-interest encoded data, is reduced, the code amount of the region-of-interest encoded data is reduced as a result.

On the other hand, when the transmission rate of the wireless communication network received from the transmission unit 206 is lower than the first threshold, the predicted code amount of the entire image in the region of interest received from the encoding unit 204 is larger than the second threshold, and the predicted code amount of the person included in the region of interest is smaller than the third threshold, the control unit 207 instructs the region-of-interest processing unit 203 to increase the proportion of the region of interest occupied by the person.

When the predicted code amount of the person included in the region of interest is smaller than the third threshold, the encoding of the person is considered to contribute less to the code amount of the region-of-interest encoded data than the encoding of the area of the region of interest excluding the person.

Accordingly, when the control unit 207 determines that the predicted code amount of the entire image in the region of interest is larger than the second threshold, so that the code amount of the region-of-interest encoded data is too large for the transmission rate of the wireless communication network, and further determines that the main cause is the encoding of the area of the region of interest excluding the person, it increases the proportion of the region of interest occupied by the person.

When the proportion of the region of interest occupied by the person becomes larger, fewer macroblocks belong to the area of the region of interest excluding the person. The code amount generated by encoding that area can therefore be reduced. Because the code amount of the area excluding the person, which contributes heavily to the code amount of the region-of-interest encoded data, is reduced, the code amount of the region-of-interest encoded data is reduced as a result.
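The two cases described above for the control unit 207 can be summarized in a short sketch. The names `RatioChange` and `decide_ratio_change` and all parameter names are hypothetical, and the units of the rate and code amounts are left to the caller. Two behaviors are assumptions of this sketch rather than statements from the description: the proportion is left unchanged when the transmission rate is not below the first threshold or the predicted code amount of the entire image is not above the second threshold, and it is also left unchanged when the person's predicted code amount exactly equals the third threshold.

```python
from enum import Enum


class RatioChange(Enum):
    KEEP = "keep"
    DECREASE = "decrease"  # shrink the person's share of the region
    INCREASE = "increase"  # enlarge the person's share of the region


def decide_ratio_change(
    transmission_rate: float,
    roi_predicted_code: float,
    person_predicted_code: float,
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
) -> RatioChange:
    """Decide how to change the person's proportion in the region of interest."""
    # Assumption: nothing changes unless the network is slow ...
    if transmission_rate >= first_threshold:
        return RatioChange.KEEP
    # ... and the region of interest is predicted to be too costly to send.
    if roi_predicted_code <= second_threshold:
        return RatioChange.KEEP
    # The person dominates the code amount: make the person smaller.
    if person_predicted_code > third_threshold:
        return RatioChange.DECREASE
    # The area other than the person dominates: make the person larger.
    if person_predicted_code < third_threshold:
        return RatioChange.INCREASE
    # Assumption: the equality case is not addressed, so keep the ratio.
    return RatioChange.KEEP


person_heavy = decide_ratio_change(1.0, 900.0, 600.0, 2.0, 800.0, 400.0)
background_heavy = decide_ratio_change(1.0, 900.0, 200.0, 2.0, 800.0, 400.0)
fast_network = decide_ratio_change(3.0, 900.0, 600.0, 2.0, 800.0, 400.0)
```

Returning an explicit `KEEP` value keeps the unspecified cases visible instead of folding them silently into one of the two documented branches.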

FIG. 2 is a flowchart showing the procedure for changing the proportion of the region of interest occupied by the specific object. The encoding unit 204 calculates the code amount generated by encoding the entire image in the region of interest set on the frame image Fn, which is the encoding target at time Tn, and supplies it to the control unit 207 as the predicted code amount of the entire image in the region of interest (S10). The encoding unit 204 also calculates the code amount generated by encoding the person included in the region of interest set on the frame image Fn, and supplies it to the control unit 207 as the predicted code amount of the person included in the region of interest (S12).

At time Tn, the transmission unit 206 refers to the remaining amount in the transmission buffer, calculates the transmission rate of the wireless communication network, and supplies it to the control unit 207 (S14).
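The description does not give the formula by which the transmission rate is derived from the transmission buffer. One plausible approach, shown here only as a sketch under stated assumptions, is to measure how many bytes left the buffer during an observation interval. The function name `estimate_rate_bps` and its parameters are hypothetical, and the "remaining amount" is interpreted here as the number of bytes still queued for transmission, which is itself an assumption.

```python
def estimate_rate_bps(
    queued_before: int,
    enqueued: int,
    queued_after: int,
    interval_s: float,
) -> float:
    """Estimate the transmission rate in bits per second from buffer drain.

    queued_before: bytes waiting in the buffer at the start of the interval
    enqueued:      bytes added to the buffer during the interval
    queued_after:  bytes still waiting at the end of the interval
    interval_s:    length of the observation interval in seconds
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Bytes that actually left the buffer onto the network.
    drained = queued_before + enqueued - queued_after
    if drained < 0:
        raise ValueError("buffer counters are inconsistent")
    return drained * 8 / interval_s


# 40,000 bytes left the buffer in half a second.
rate = estimate_rate_bps(50_000, 30_000, 40_000, 0.5)
```

A real transmitter would likely smooth this value over several intervals, since a single short interval can be dominated by bursts; that refinement is outside what the description states.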

When the frame image Fn+1 becomes the encoding target at time Tn+1, the control unit 207 compares the transmission rate with the first threshold (S16). If the transmission rate is lower than the first threshold (Y in S16), the control unit 207 compares the predicted code amount of the entire image in the region of interest with the second threshold (S18). If the predicted code amount of the entire image in the region of interest is larger than the second threshold (Y in S18), the control unit 207 compares the predicted code amount of the person included in the region of interest with the third threshold (S20). If the predicted code amount of the person included in the region of interest is larger than the third threshold (Y in S20), the control unit 207 instructs the region-of-interest processing unit 203 to reduce the proportion of the region of interest occupied by the person included in the region of interest set on the frame image Fn+1 (S22). If, on the other hand, the predicted code amount of the person included in the region of interest is smaller than the third threshold (N in S20), the control unit 207 instructs the region-of-interest processing unit 203 to increase the proportion of the region of interest occupied by the person included in the region of interest set on the frame image Fn+1 (S24).

The embodiment of the present invention described above provides the following operational effects.

(1) The proportion of the region of interest occupied by the person included in it is changed based on the transmission rate of the wireless communication network, the predicted code amount of the entire image in the region of interest, and the predicted code amount of the person included in the region of interest. Region-of-interest encoded data can therefore be transmitted in a manner suited to the propagation environment of the wireless communication network.

(2) When the transmission rate of the wireless communication network is lower than the first threshold, the predicted code amount of the entire image in the region of interest is larger than the second threshold, and the predicted code amount of the person included in the region of interest is larger than the third threshold, the proportion of the region of interest occupied by the person is reduced. The code amount can therefore be adjusted without increasing the quantization scale or lowering the resolution, and high-quality region-of-interest encoded data suited to the propagation environment can be transmitted.

(3) When the transmission rate of the wireless communication network is lower than the first threshold, the predicted code amount of the entire image in the region of interest is larger than the second threshold, and the predicted code amount of the person included in the region of interest is smaller than the third threshold, the proportion of the region of interest occupied by the person is increased. The code amount can therefore be adjusted without increasing the quantization scale or lowering the resolution, and high-quality region-of-interest encoded data suited to the propagation environment can be transmitted.

The transmission efficiency of the region-of-interest encoded data can thus be improved.

The mode for carrying out the present invention has been described above. However, the present invention is not limited to the configuration of this embodiment; various modifications are possible as long as they fall within the scope of the invention defined in the claims and can achieve the functions provided by the configuration of the embodiment described above.

For example, in the embodiment of the present invention the image transmission device 20 performs compression encoding in accordance with the H.264/AVC standard, but it may instead perform compression encoding in accordance with a standard such as MPEG-2 or MPEG-4.

DESCRIPTION OF SYMBOLS: 201 object extraction unit, 202 region-of-interest setting unit, 203 region-of-interest processing unit, 204 encoding unit, 205 recording unit, 206 transmission unit, 207 control unit.

Claims

An image transmission device comprising:
a setting unit that sets a region of interest on an image so as to include a specific object on the image;
an encoding unit that encodes the image in the region of interest to generate region-of-interest encoded data, and predicts a code amount of the region-of-interest encoded data and a code amount of the specific object;
a transmission unit that transmits the region-of-interest encoded data and acquires a transmission rate at the time the region-of-interest encoded data is transmitted; and
a processing unit that changes the proportion of the region of interest occupied by the specific object in accordance with the transmission rate of preceding region-of-interest encoded data acquired by the transmission unit, and with the code amount of the region-of-interest encoded data and the code amount of the specific object predicted by the encoding unit.

The image transmission device according to claim 1, wherein the processing unit executes a process of reducing the proportion of the region of interest occupied by the specific object when the transmission rate is lower than a first threshold, the code amount of the region-of-interest encoded data is larger than a second threshold, and the code amount of the specific object is larger than a third threshold.

The image transmission device according to claim 1, wherein the processing unit executes a process of increasing the proportion of the region of interest occupied by the specific object when the transmission rate is lower than a first threshold, the code amount of the region-of-interest encoded data is larger than a second threshold, and the code amount of the specific object is smaller than a third threshold.

An imaging device comprising:
an imaging unit that acquires a moving image; and
the image transmission device according to any one of claims 1 to 3, which processes the moving image acquired by the imaging unit.