JP5958082B2

JP5958082B2 - Image processing apparatus and image processing method

Info

Publication number: JP5958082B2
Application number: JP2012119996A
Authority: JP
Inventors: 太田　雄介; 浪江　健史
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2011-12-28
Filing date: 2012-05-25
Publication date: 2016-07-27
Anticipated expiration: 2032-05-25
Also published as: JP2013153404A

Description

The present invention relates to an image processing apparatus and an image processing method.

Video conference systems that allow conferences to be held between remote locations have long been available. In such a system, an imaging device captures the conference participants, and the captured image is transmitted to a video conference terminal at the remote site. Some video conference systems use a wide-angle lens in order to capture many people at once. In that case the image captured by the imaging device is often distorted, and various techniques for removing the distortion have been proposed.

The technique described in Patent Document 1 performs a first correction that brings the apparent sizes of the conference participants closer together by stretching the image in the vertical direction. The stretch is based on two functions: one that connects the edges of the conference desk and one that represents a curve connecting the tops of the participants' heads. A second correction then adjusts the horizontal size according to the position on the screen. In the technique described in Patent Document 1, distortion is corrected by the first correction and the second correction.

However, the technique described in Patent Document 1 presupposes an image obtained by combining the video from a plurality of imaging devices whose lenses have little distortion to begin with. Consequently, although the vertical distortion caused by the first correction is taken into account, the horizontal distortion caused by the first correction is not. As a result, performing the second correction makes the video horizontally elongated and makes the person images displayed in a single image smaller, so that the image after the second correction looks unnatural to the user.

In view of the above problem, an object of the present invention is to provide an image processing apparatus and an image processing method that determine a correction rate for correcting an image captured by an imaging device into an image that does not look unnatural to the user.

To achieve the above object, the present invention provides an image processing apparatus comprising: a determination unit that determines whether a person image exists in an image captured by an imaging device; a measurement unit that, when the determination unit determines that the person image exists in the image, measures the size or contrast amount of the person image; a recognition unit that, when the determination unit determines that the person image exists in the image, recognizes the position of the person image within the image; a generation unit that, based on the position of the person image recognized by the recognition unit and the size or contrast amount of the person image measured by the measurement unit, generates position information indicating the position, relative to the imaging device, of the person to whom the person image corresponds; and a decision unit that, based on the position information generated by the generation unit, determines a correction rate for correcting the image captured by the imaging device.

According to the image processing apparatus and the image processing method of the present invention, it is possible to determine a correction rate for correcting an image captured by an imaging device into an image that does not look unnatural to the user.

A diagram showing a functional configuration example of the image processing apparatus in Example 1.
A perspective view of the imaging device in Example 1.
A diagram showing the hardware configuration of the imaging device and the image processing device in Example 1.
A diagram showing an example of the information used by each functional unit of the imaging device and the image processing device in Example 1.
A diagram for explaining the correction in Example 1.
A diagram showing an example of an image before correction and an image after correction in Example 1.
A diagram showing a functional configuration example of the CPU in Example 1.
A diagram showing the processing flow of the image processing device in Example 1.
A diagram showing an example of a frame image and the like in Example 1.
A diagram showing an example of a magnification table.
A diagram showing an example of a distance table.
A diagram showing an example of an angle table.
A diagram showing a predetermined area.
A diagram showing an example of a parameter table.
A diagram showing experimental results in Example 1.
A diagram showing the hardware configuration of the imaging device and the image processing device in Example 2.
A diagram showing an example of the information used by each functional unit of the imaging device and the image processing device in Example 2.
A diagram showing a functional configuration example of the CPU in Example 2.
A diagram showing an example of a lens position-distance table.
A diagram showing the processing flow of the image processing device in Example 2.
A diagram showing an example of the distance measurement flow in Example 2.
A diagram showing the distance measurement procedure (part 1) in Example 2.
A diagram showing the distance measurement procedure (part 2) in Example 2.
A diagram showing the distance measurement procedure (part 3) in Example 2.
A diagram showing the distance measurement procedure (part 4) in Example 2.

Embodiments for carrying out the present invention are described below. The image processing apparatus of this example is preferably used for, for example, a video conference. The following description assumes that the image processing apparatus of this example is used for a video conference.

[Example 1]
<About the video conference system>
FIG. 1 shows a schematic configuration diagram of the video conference system of Example 1. The video conference system consists of a plurality of video conference terminals 3000 and a network 4000 to which these video conference terminals 3000 are connected. Each video conference terminal 3000 includes an imaging device 3010 and an image processing device 3020.

Video captured by the imaging device 3010 of any video conference terminal 3000 undergoes the correction described later, which is applied by the image processing device 3020. The corrected image is displayed on the display device 120 (see FIG. 2) of that video conference terminal 3000, and is also transmitted to the other video conference terminals 3000 connected to the network 4000 and displayed on their display devices 120.

When the user presses the zoom-in or zoom-out button on any video conference terminal 3000, a digital zoom image reflecting the zoom-in or zoom-out is displayed on the display of that video conference terminal 3000. The digital zoom image is also transmitted to the other video conference terminals 3000 connected to the network 4000 and displayed on their display devices 120.

With this video conference system, the plurality of video conference terminals connected to the network can display corrected and digitally zoomed video in real time.

<About video conference terminals>
Next, a functional configuration example of the video conference terminal is described. FIG. 2 shows an example of a specific external view of the video conference terminal 3000. In the following, the longitudinal direction of the video conference terminal 3000 is the X-axis direction, the direction orthogonal to the X-axis direction in the horizontal plane is the Y-axis direction (width direction), and the direction orthogonal to both the X-axis and Y-axis directions (the vertical, or height, direction) is the Z-axis direction.

The video conference terminal 3000 includes a housing 1100, an arm 1200, and a camera housing 1300. A sound collection hole 1131 is provided in the right wall surface 1130 of the housing 1100. External sound that passes through the sound collection hole 1131 is picked up by the sound input unit 14 provided inside.

The upper surface portion 1150 is provided with a power switch 109 and a sound output hole 1151. The user can start the video conference terminal 3000 by turning on the power switch 109. Sound output from the sound output unit 12 passes through the sound output hole 1151 to the outside.

On the left wall surface 1140 side of the housing 1100, a recess-shaped accommodating portion 1160 is formed to accommodate the arm 1200 and the camera housing 1300. A connection port (not shown) is also provided in the left wall surface 1140 of the housing 1100. The connection port receives a cable 120c that connects the video output unit 30 to the display device 120 (display).

The arm 1200 is attached to the housing 1100 by a torque hinge 1210 and can rotate up and down relative to the housing 1100 within a tilt angle ω1 range of 135 degrees. FIG. 2 shows the state in which the tilt angle ω1 is 90 degrees. Setting the tilt angle ω1 to 0 degrees allows the arm 1200 and the camera housing 1300 to be stored in the accommodating portion 1160.

The camera housing 1300 houses a built-in photographing unit 10. The photographing unit 10 can photograph people (for example, video conference participants), characters and symbols written on paper, the room, and so on. A torque hinge 1310 is formed on the camera housing 1300, and the camera housing 1300 is attached to the arm 1200 via the torque hinge 1310. Taking the state shown in FIG. 2 as 0 degrees, the camera housing 1300 can rotate up, down, left, and right relative to the arm 1200 within a pan angle ω2 range of ±180 degrees and a tilt angle ω3 range of ±45 degrees.

The video conference terminal 3000 of Example 1 is not limited to the configuration shown in FIG. 2 and may have another configuration. For example, a PC (personal computer) with the sound output unit 12 and the sound input unit 14 connected externally may be used. The video conference terminal 3000 of Example 1 may also be applied to a portable terminal such as a smartphone.

<Example of hardware configuration of the imaging device and the image processing device>
FIG. 3 shows a hardware configuration example of the imaging device 3010 and the image processing device 3020. The imaging device 3010 and the image processing device 3020 are connected by wire (USB or the like) or wirelessly.

First, the imaging device 3010 is described. The sensor 12 converts the optical image formed by the lens 11 into a frame image in the form of an electrical signal. The sensor is, for example, a CCD (charge-coupled device) image sensor or a CMOS (complementary metal-oxide-semiconductor) image sensor.

The image processing unit 13 performs predetermined image processing on the frame image. The image processing unit 13 is, for example, an ISP (image signal processor). The I/F unit 14 exchanges frame images, converted frame images, other data, control signals, and the like with the image processing device 3020.

Next, the image processing device 3020 is described. The I/F unit 21 exchanges frame images, converted frame images, other data, control signals, and the like with the imaging device 3010.

The CPU 22 executes various kinds of processing. The storage unit 23 stores the software and data required for the processing of the CPU 22, frame images and converted frame images, and the various tables and functions (arithmetic expressions) described later. The storage unit 23 is a collective term for, for example, RAM, ROM, and an HDD. The video output unit 24 sends a video signal to the monitor 120 (see FIG. 2) or the like.

The communication unit 25 transmits video signals and the like to another video conference terminal 3000 connected to the network 4000. The control unit 26 controls the image processing device 3020 as a whole. The bus 27 connects the units within the image processing device 3020.

<Information used by the imaging device and the image processing device>
Next, the information used by each functional unit of the imaging device 3010 and the image processing device 3020 is described in detail. FIG. 4 shows an example of the information used by each device.

First, the imaging device 3010 is described. The image acquisition unit 121 generates a frame image and sends it to the first correction unit 131 and to the frame image transmission unit 142.

The first correction unit 131 corrects the frame image sent from the image acquisition unit 121 using the correction rate set by the correction rate setting unit 132, and generates a corrected frame image. The correction rate setting unit 132 stores the correction rate sent from the image processing device 3020 in memory inside the image processing unit 13 (an ISP or the like).

The corrected frame image transmission unit 141 sends the corrected frame image to the image processing device 3020. In practice, the first correction unit 131 and the corrected frame image transmission unit 141 operate in parallel. Corrected frame images are transmitted at a higher rate than the frame images sent for correction rate calculation. The frame image transmission unit 142 sends the frame image to the image processing device 3020.

The video output unit 24 causes the display device 120 to display the corrected frame image sent from the imaging device 3010. The communication unit 25 transmits the corrected frame image sent from the imaging device 3010 to the other video conference terminals 3000 via the network 4000.

The CPU 22 calculates the correction rate from the frame image sent by the frame image transmission unit 142. The detailed processing of the CPU 22 is described later. The correction rate transmission unit 211 sends the correction rate to the imaging device 3010, and the correction rate setting unit 132 sets the correction rate it receives.

<Correction by the first correction unit>
Next, the correction performed by the first correction unit 131 is described. In the following, the vertical direction of the image (image y-axis direction) is the direction that corresponds to the depth direction of the conference scene when the image is viewed as a plane, and the horizontal direction of the image (image x-axis direction) is the direction orthogonal to the vertical direction.

In general, conference participants (users) do not like to see human faces and bodies in a conference image appear bent by distortion. In Example 1, therefore, the first correction unit 131 corrects the horizontal component of the distortion so that it is eliminated entirely, and corrects the vertical component of the distortion so that part of it (a predetermined amount) remains. Correcting in this way suppresses both the exaggeration of perspective in the conference image and the image deformation caused by the correction itself.

FIG. 5 shows the relationship between the pixels of the image before conversion and the pixels of the image after conversion. The processing of the first correction unit 131 is described with reference to FIG. 5. First, some terms. Each square in FIG. 5 is one pixel. The optical axis center C (see FIG. 5) is the origin, with coordinates (0, 0). The pixel of interest is any pixel, among all the pixels of the image before correction (the real image), that is currently being considered. The coordinates of the pixel of interest are P1(x', y').

The coordinates of the pixel of interest after conversion, when the image is corrected so that a predetermined amount of distortion remains, are P2(x'', y''). The coordinates of the pixel of interest after conversion, when the image is corrected so that no distortion remains, are P0(x, y). When the first correction unit 131 corrects the image, P1(x', y') is converted to P2(x'', y'').

FIG. 6 schematically shows the pixels of the image before correction (hereinafter "pre-conversion pixels") and the pixels of the image after correction (hereinafter "post-conversion pixels"). The processing of the first correction unit 131 is outlined with reference to FIG. 6. The first correction unit 131 finds the pre-conversion pixel that corresponds to each post-conversion pixel. A pre-conversion pixel at coordinates (a, b) is written "pre-conversion pixel (a, b)", and a post-conversion pixel at coordinates (c, d) is written "post-conversion pixel (c, d)".

In the example of FIG. 6, the number of post-conversion pixels in the X-axis direction is N_x and the number of post-conversion pixels in the Y-axis direction is N_y, so there are N_x × N_y post-conversion pixels in total. In the example of FIG. 6, post-conversion pixel (1, 1) corresponds to pre-conversion pixel (3, 3). How the pre-conversion pixel corresponding to a post-conversion pixel is found is described later. The post-conversion pixel of interest is then changed in turn to (2, 1), (3, 1), ..., (N_x, 1), (1, 2), (1, 3), ..., (1, N_y), ..., (N_x, N_y), and the coordinates of the pre-conversion pixel corresponding to each of these post-conversion pixels are found.

Here, the following expression (1) holds between the pre-conversion pixel P1(x', y') and the post-conversion pixel P2(x'', y'').

Here, h is the ideal image height, that is, the distance from the optical axis center C(0, 0) to P0(x, y), so h = (x^2 + y^2)^(1/2). The value of h for each pixel of interest is measured in advance by calibration or the like. The conversion coefficient c_m is obtained in advance from (x, y) and (x', y'). The constant M is determined in advance from, for example, the type of camera unit used in the photographing unit 10.

The correction rates α and β are values that control how strongly the distortion aberration is reduced, with 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1, and they are determined based on person arrangement data calculated from the image information. In this case α = 1, because the horizontal (X-axis) component of the distortion is corrected so as to be eliminated completely. θ is the angle between the horizontal line A and the pixel of interest P1(x', y'), and its value is measured each time the post-conversion pixel of interest changes. Owing to the nature of distortion aberration, the origin (0, 0), P1(x', y'), and P0(x, y) lie on a single straight line (the arrow of the ideal image height h).
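The collinearity stated above means that, once h is known for a pixel, the undistorted position P0 follows from θ alone. The following is a minimal sketch of that geometric relation only, not of expression (1). It assumes that coordinates are taken relative to the optical axis center C and that the horizontal line A is the x axis through C; the function name `ideal_point` is hypothetical.

```python
import math

def ideal_point(p1, h):
    """Return (P0, theta) for a distorted pixel P1 and its ideal image height h.

    Assumptions (not from the source): coordinates are relative to the
    optical axis center C, and horizontal line A is the x axis through C.
    """
    x_d, y_d = p1
    theta = math.atan2(y_d, x_d)  # angle between line A and the ray C -> P1
    # C, P1 and P0 are collinear, so P0 lies on the same ray at distance h.
    p0 = (h * math.cos(theta), h * math.sin(theta))
    return p0, theta
```

For example, a pixel observed at (3, 4) with a calibrated ideal image height of 10 would map to (6, 8) on the same ray.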

From expression (1), the pre-conversion coordinates P1(x', y') corresponding to the post-conversion coordinates P2(x'', y'') can be found. In this way, the coordinates of the pre-conversion pixels corresponding to all N_x × N_y post-conversion pixels are found.

The first correction unit 131 then obtains the luminance value of every pre-conversion pixel P1(x', y') calculated by expression (1). A known technique may be used to obtain the luminance value. The first correction unit 131 sets the luminance value obtained for each pre-conversion pixel as the luminance value of the corresponding post-conversion pixel.

For example, when the post-conversion pixel of interest is (1, 1), the corresponding pre-conversion pixel (3, 3) is found from expression (1), and the luminance value of that pre-conversion pixel (3, 3) is obtained. The luminance value obtained for pre-conversion pixel (3, 3) is then set for post-conversion pixel (1, 1). In this way, the first correction unit 131 can generate the corrected image.
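The procedure above is a backward (inverse) mapping: each output pixel looks up its source pixel and copies the luminance. The sketch below illustrates that loop under stated assumptions. The `inverse_map` callable is a hypothetical stand-in for expression (1), which is not reproduced here; coordinates are 0-based array indices rather than coordinates centered on C; and nearest-neighbor sampling is used as one example of a known technique for obtaining the luminance value.

```python
def remap_backward(src, inverse_map, out_w, out_h, fill=0):
    """Build the corrected image by looking up each output pixel's source.

    src         -- pre-conversion image as a list of rows of luminance values
    inverse_map -- hypothetical stand-in for expression (1): takes a
                   post-conversion (x, y) and returns the pre-conversion (x, y)
    fill        -- value used when the source position is outside src
    """
    src_h, src_w = len(src), len(src[0])
    dst = [[fill] * out_w for _ in range(out_h)]
    for y_out in range(out_h):
        for x_out in range(out_w):
            x_src, y_src = inverse_map(x_out, y_out)
            # Nearest-neighbor sampling (an assumption; any known
            # interpolation method could be substituted here).
            xi, yi = int(round(x_src)), int(round(y_src))
            if 0 <= xi < src_w and 0 <= yi < src_h:
                dst[y_out][x_out] = src[yi][xi]
    return dst
```

With a toy mapping that shifts by two pixels in each direction, output pixel (0, 0) takes its value from source pixel (2, 2), which mirrors the (1, 1) to (3, 3) example in 1-based coordinates.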

The first correction unit 131 may first find the pre-conversion pixels corresponding to all the post-conversion pixels and then obtain the luminance values of all those pre-conversion pixels. Alternatively, it may obtain the luminance value of each pre-conversion pixel immediately after finding that one pre-conversion pixel. As a further alternative, it may find a predetermined number of pre-conversion pixels, obtain the luminance values of all the pixels in that batch, and repeat this process until the luminance values of all the pre-conversion pixels have been obtained.

<Processing of the image processing device>
Next, the processing of the image processing device of Example 1 is described. FIG. 7 shows a functional configuration example of the CPU 22. FIG. 8 shows the main processing flow of the image processing device.

First, in step S2, the control unit 26 loads the image data sent from the frame image transmission unit 142 of the imaging device 3010 into the storage unit 23. Next, in step S4, the determination unit 102 detects a person image in the loaded image data. Here, the person image is an image of a person's face (hereinafter simply "face image"). The detection processing of the determination unit 102 may be triggered by a user operation or may be performed at predetermined time intervals. FIG. 9(A) shows an example of an image corresponding to the loaded image data. In the example of FIG. 9(A), the image is distorted because the lens 11 is a wide-angle lens.

Template matching, for example, may be used as the face image detection method. The following description covers the case where template matching is used. Details of template matching processing are described in, for example, Japanese Patent No. 4219521.

Template matching processing is described briefly. To perform template matching, a group of face feature data (templates) is stored in the storage unit 23 in advance. The feature data group is data that defines, for each face type, feature quantities such as the face contour and the face components (for example, chin, mouth, eyes, and nose). The face types include adult, child, baby, elderly, female, and male, as well as racial groups (Asian, white, and Black). The description below assumes that the conference participants are Asian, that is, that their face color falls within the skin tone range referred to here as "skin color".

The determination unit 102 then divides the image into a plurality of regions. The regions may have any shape, such as rectangular, circular, or elliptical; here they are assumed to be rectangular. The determination unit 102 calculates the similarity between the feature quantities of the image within a rectangular region and the feature data group. A rectangular region for which the similarity has been calculated is called an inspection region. The determination unit 102 determines whether the calculated similarity is equal to or greater than a predetermined threshold. If it is, the determination unit 102 determines that the inspection region is a region of a face image (hereinafter "face image region"), that is, that a face image has been detected.
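The region-by-region thresholding described above can be sketched as follows. This is only an illustration under assumptions: the source does not specify the feature quantities or the similarity measure, so normalized cross-correlation on raw luminance values is swapped in as a stand-in, the image is split into non-overlapping tiles the size of a single template, and the names `ncc` and `find_face_regions` are hypothetical.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences, in [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    # A flat tile or flat template has no structure to compare; treat as 0.
    return sum(p * q for p, q in zip(da, db)) / denom if denom else 0.0

def find_face_regions(image, template, threshold):
    """Return (x, y, w, h) for each rectangular tile whose similarity to the
    template is at or above the threshold."""
    t_h, t_w = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    hits = []
    for y in range(0, len(image) - t_h + 1, t_h):        # non-overlapping tiles
        for x in range(0, len(image[0]) - t_w + 1, t_w):
            tile = [image[y + j][x + i] for j in range(t_h) for i in range(t_w)]
            if ncc(tile, flat_t) >= threshold:           # threshold test
                hits.append((x, y, t_w, t_h))
    return hits
```

A practical detector would more likely slide overlapping windows at several scales and compare against several templates; the tiling here is kept deliberately simple to mirror the "divide into regions, score, threshold" description.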

更に判定手段１０２は、顔画像領域内の肌色領域以外の領域を可能な限り小さくするように、顔画像領域の大きさを変更する。判定手段１０２は当該変更することにより、顔画像の大きさに応じた顔画像領域を生成することが出来る。 Further, the determination unit 102 changes the size of the face image area so as to make the area other than the skin color area in the face image area as small as possible. By making the change, the determination unit 102 can generate a face image area corresponding to the size of the face image.

そして、ステップＳ６で、判定手段１０２は、検査領域が顔画像領域であるか否かを判断する。判定手段１０２が、検査領域が顔画像領域であると判断すると（ステップＳ６のＹｅｓ）、ステップＳ７、Ｓ８、Ｓ９、Ｓ１０（後述する）の処理を行なう。また、判定手段１０２が、検査領域が顔画像領域ではないと判断すると（ステップＳ６のＮｏ）、ステップＳ１２に移行する。 In step S6, the determination unit 102 determines whether the inspection area is a face image area. If the determination unit 102 determines that the inspection area is a face image area (Yes in step S6), the processes of steps S7, S8, S9, and S10 (described later) are performed. If the determination unit 102 determines that the inspection area is not a face image area (No in step S6), the process proceeds to step S12.

ステップＳ１２では、判定手段１０２が画像の全領域を検査したか否かを判断する。判定手段１０２が、画像の全領域を検査したと判断すると（ステップＳ１２のＹｅｓ）、ステップＳ１４に移行する。また、判定手段１０２が、未だ、画像の全領域を検査していないと判断すると、ステップＳ４に戻る。 In step S12, the determination unit 102 determines whether or not the entire area of the image has been inspected. When the determination unit 102 determines that the entire area of the image has been inspected (Yes in step S12), the process proceeds to step S14. If the determination unit 102 determines that the entire area of the image has not yet been inspected, the process returns to step S4.

このようにして、判定手段１０２は、展開された画像データの画像（図９（Ａ）参照）内に人物画像（ここでは、顔画像）が存在するか否かを判定する。また、判定手段１０２が、テンプレートマッチング処理を用いる場合には、人物画像を顔画像としたが、判定手段１０２が、テンプレートマッチング処理以外の処理で人物画像検出を行なう場合、または、当該画像処理装置がテレビ会議システム以外の他の用途で用いられる場合などには、人物画像を、人物の顔以外の画像としてもよい。 In this way, the determination unit 102 determines whether a person image (here, a face image) exists in the image of the developed image data (see FIG. 9A). The person image is a face image when the determination unit 102 uses the template matching process. When the determination unit 102 detects person images by a process other than template matching, or when the image processing apparatus is used for a purpose other than a video conference system, the person image may be an image of something other than the person's face.

図９（Ａ）に、撮像装置３０１０により撮像された画像の一例を示す。図９（Ａ）の例では、レンズ１１が広角レンズであるため、画像は歪んでいる。次に、ステップＳ４で判定手段１０２は、顔の特徴データ群を用いて、顔画像検出を行う。 FIG. 9A illustrates an example of an image captured by the imaging device 3010. In the example of FIG. 9A, since the lens 11 is a wide-angle lens, the image is distorted. Next, in step S4, the determination unit 102 performs face image detection using the facial feature data group.

ステップＳ７では、測定手段１０６は、人物画像の大きさを測定する。ここで、人物画像の大きさを、顔画像領域のｙ軸方向の長さとする。つまり、測定手段１０６は、顔画像領域のｙ軸方向の長さを測定する。顔画像領域のｙ軸方向の長さとは、例えば、顔画像領域のｙ軸方向のピクセル数である。また、人物画像の大きさとして、例えば、顔画像の面積、つまり、顔の輪郭内の肌色部分の面積などとしてもよい。 In step S7, the measuring unit 106 measures the size of the person image. Here, the size of the person image is the length of the face image area in the y-axis direction. That is, the measuring unit 106 measures the length of the face image area in the y-axis direction. The length of the face image area in the y-axis direction is, for example, the number of pixels in the y-axis direction of the face image area. The size of the person image may be, for example, the area of the face image, that is, the area of the skin color portion in the outline of the face.

図９（Ａ）には、２つの顔画像Ａ、Ｂが存在している。また、顔画像Ａ、Ｂについての領域を、それぞれ、顔画像領域Ａ、Ｂとする。また、顔画像領域Ｒ_Ａのｙ軸方向の長さをｈ_Ａとし、顔画像領域Ｒ_Ｂのｙ軸方向の長さｈ_Ｂとする。測定手段１０６は、顔画像領域Ｒ_Ａのｙ軸方向の長さｈ_Ａを測定する。また、測定手段１０６は、顔画像領域Ｒ_Ｂのｙ軸方向の長さｈ_Ｂを測定する。また、以下では、人物画像の大きさを「顔画像の長さ」という。 In FIG. 9A, two face images A and B exist. The regions for the face images A and B are referred to as face image areas R_A and R_B, respectively. The length of the face image area R_A in the y-axis direction is h_A, and the length of the face image area R_B in the y-axis direction is h_B. The measuring unit 106 measures the length h_A of the face image area R_A in the y-axis direction, and likewise measures the length h_B of the face image area R_B in the y-axis direction. Hereinafter, the size of the person image is referred to as "the length of the face image".

ステップＳ８では、認識手段１０８が、画像の全領域に対して、顔画像領域の位置を認識する。ここで、認識手段１０８の顔画像領域の位置の認識とは、例えば、「撮像装置３０１０が撮像した画像の中央Ｃ（図９（Ａ）参照）から、顔画像領域の中心（重心）までの距離Ｌ」を算出することをいう。図９（Ａ）の例では、中央Ｃから、顔画像領域Ｒ_Ａの中心までの距離をＬ_Ａとし、中央Ｃから、顔画像領域Ｒ_Ｂの中心までの距離をＬ_Ｂとする。 In step S8, the recognition unit 108 recognizes the position of the face image area relative to the entire area of the image. Here, recognizing the position of the face image area means, for example, calculating "the distance L from the center C of the image captured by the imaging device 3010 (see FIG. 9A) to the center (center of gravity) of the face image area". In the example of FIG. 9A, the distance from the center C to the center of the face image area R_A is L_A, and the distance from the center C to the center of the face image area R_B is L_B.
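The distance L of step S8 can be sketched as follows, assuming that a face image area is an axis-aligned rectangle given as (left, top, width, height) in pixels and that its center of gravity is the rectangle center. The function name and the box layout are illustrative assumptions.

```python
import math

def center_distance(image_size, face_box):
    """Return L: pixel distance from the image center C to the face area center.

    image_size: (width, height) of the captured image.
    face_box: (left, top, width, height) of the face image area (assumed layout).
    """
    img_w, img_h = image_size
    left, top, box_w, box_h = face_box
    cx, cy = img_w / 2.0, img_h / 2.0                 # image center C
    fx, fy = left + box_w / 2.0, top + box_h / 2.0    # face area center
    return math.hypot(fx - cx, fy - cy)
```

For a 640 x 480 image and a 60 x 60 face area whose top-left corner is at (410, 370), the face center is (440, 400) and L is 200 pixels.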

ステップＳ７、Ｓ８の処理は、どちらを先に行なってもよく、並列的におこなってもよい。 Either of the processes of steps S7 and S8 may be performed first or in parallel.

ところで、理想的には、測定手段１０６は、撮像装置３０１０から人物までの距離に対応するように、当該人物についての顔画像領域の高さを測定することが好ましい。例えば、撮像装置３０１０のレンズ１１から等距離にある複数の顔画像領域については、当該複数の顔画像についての顔画像領域の高さは全て等しいものとして当該顔画像領域の高さを測定することが好ましい。 Ideally, the measuring unit 106 measures the height of the face image area of a person so that it corresponds to the distance from the imaging device 3010 to that person. For example, for a plurality of face image areas that are equidistant from the lens 11 of the imaging device 3010, the heights of those face image areas should preferably all be measured as equal.

換言すると、図９（Ａ）に示すように、撮像装置３０１０で撮像された画像は、歪んでいるために、測定手段１０６が測定した顔画像の高さｈ_Ａ、ｈ_Ｂと、実際の人物の顔の高さとは対応したものとならない。そこで、ステップＳ９で、第２補正手段１０４は、顔画像の高さを補正することが好ましい。 In other words, as shown in FIG. 9A, the image captured by the imaging device 3010 is distorted, so the face image heights h_A and h_B measured by the measuring unit 106 do not correspond to the actual heights of the persons' faces. Therefore, in step S9, the second correction unit 104 preferably corrects the height of the face image.

ステップＳ９では、認識手段１０８が認識した人物画像の位置に基づいた補正を、測定手段１０６が測定した人物画像の大きさ（顔画像の高さ（ピクセル数））に対して行なう。ここで、第２補正手段１０４の補正手法について説明する。例えば、「撮像装置３０１０が撮像した画像の中央Ｃ（図９（Ａ）参照）から、顔画像（顔画像領域）までの距離Ｌ」と、「人物画像の大きさｈに乗算する倍率ｇ」と、を対応付けた倍率テーブルを用いる。当該倍率テーブルは、実験的に求められるものであり、予め記憶手段２３に記憶されているものである。 In step S9, a correction based on the position of the person image recognized by the recognition unit 108 is applied to the size of the person image (the height of the face image, in pixels) measured by the measuring unit 106. The correction method of the second correction unit 104 is as follows. For example, a magnification table is used that associates "the distance L from the center C of the image captured by the imaging device 3010 (see FIG. 9A) to the face image (face image area)" with "the magnification g by which the size h of the person image is multiplied". The magnification table is obtained experimentally and is stored in the storage unit 23 in advance.

図１０に倍率テーブルの一例を示す。図１０の例では、距離Ｌと倍率ｇとが対応付けられている。図１０の例では、例えば、距離Ｌ_１に対応する倍率は、ｇ_１となる。第２補正手段１０４は、倍率テーブルを参照して、距離Ｌ_Ａに対応する倍率ｇ_Ａを求める。また、第２補正手段１０４は、倍率テーブルを参照して、距離Ｌ_Ｂに対応する倍率ｇ_Ｂを求める。 FIG. 10 shows an example of the magnification table. In the example of FIG. 10, the distance L and the magnification g are associated with each other; for example, the magnification corresponding to the distance L_1 is g_1. The second correction unit 104 refers to the magnification table to obtain the magnification g_A corresponding to the distance L_A, and likewise obtains the magnification g_B corresponding to the distance L_B.

そして、第２補正手段１０４は、顔画像領域の高さｈに対して、求められた倍率ｇを乗算する。図９（Ａ）の例では、ｈ_Ａ'＝ｈ_Ａ・ｇ_Ａ、および、ｈ_Ｂ'＝ｈ_Ｂ・ｇ_Ｂを演算することにより、顔画像領域の高さｈを補正する。ただし、ｈ_Ａ'、ｈ_Ｂ'は補正後の顔画像領域の高さである。図９（Ｂ）にｈ_Ａ'、ｈ_Ｂ'について示す。 Then, the second correction unit 104 multiplies the height h of the face image area by the obtained magnification g. In the example of FIG. 9A, the height h of the face image area is corrected by calculating h_A' = h_A · g_A and h_B' = h_B · g_B, where h_A' and h_B' are the heights of the face image areas after correction. FIG. 9B shows h_A' and h_B'.

また、倍率テーブルを用いずとも、予め、ｇ＝Ｆ_１（Ｌ）となる関数Ｆ_１（・）を求めておき、当該関数を用いて、距離Ｌ_Ａに対応する倍率ｇ_Ａを求めるようにしてもよい。 Alternatively, instead of using the magnification table, a function F_1(·) satisfying g = F_1(L) may be obtained in advance and used to obtain the magnification g_A corresponding to the distance L_A.
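The height correction of step S9 can be sketched as follows. The table values are invented for illustration (the text only says the table is obtained experimentally), and linear interpolation between entries is an assumption that sits between the table lookup and the function F_1 described above.

```python
from bisect import bisect_left

# Hypothetical magnification table: (distance L in pixels, magnification g).
# The actual values are determined experimentally and are not given in the text.
MAGNIFICATION_TABLE = [(0, 1.00), (100, 1.05), (200, 1.15), (300, 1.30)]

def lookup_magnification(distance_l, table=MAGNIFICATION_TABLE):
    """Return g for a distance L, interpolating linearly and clamping at the ends."""
    xs = [l for l, _ in table]
    if distance_l <= xs[0]:
        return table[0][1]
    if distance_l >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, distance_l)
    (l0, g0), (l1, g1) = table[i - 1], table[i]
    return g0 + (g1 - g0) * (distance_l - l0) / (l1 - l0)

def corrected_height(height_px, distance_l):
    """Corrected face area height: h' = h * g(L)."""
    return height_px * lookup_magnification(distance_l)
```

With these made-up values, a face area 40 pixels high at L = 150 pixels is corrected to about 44 pixels, since g is interpolated to 1.10.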

ステップＳ１０で、生成手段１１０は、認識手段１０８が認識した人物画像の位置と、測定手段１０６が測定した（補正後の）人物画像の大きさと、に基づいて、位置情報を生成する。ここで、位置情報とは、人物画像に係る人物の、撮像装置３０１０からの位置を示す情報である。図９（Ｃ）、（Ｄ）に人物画像Ａ、Ｂについての人物をＨ_Ａ、Ｈ_Ｂを示す。 In step S10, the generation unit 110 generates position information based on the position of the person image recognized by the recognition unit 108 and the (corrected) size of the person image measured by the measuring unit 106. Here, the position information is information indicating the position, relative to the imaging device 3010, of the person shown in the person image. FIGS. 9C and 9D show the persons H_A and H_B corresponding to the person images A and B.

ここで、図９（Ｄ）に示すように、人物の位置情報とは、撮像装置３０１０から人物までの距離Ｐと、撮像装置３０１０からの水平方向の人物の角度θとを含むものとする。また、撮像装置３０１０からの水平方向角度θとは、真上から見て、撮像装置３０１０のレンズ１１の中心と人物（例えば、Ｈ_Ａ、Ｈ_Ｂ）とを結ぶ直線と、レンズ１１の光軸であるｙ軸と、がなす角度である。 Here, as illustrated in FIG. 9D, the position information of a person includes the distance P from the imaging device 3010 to the person and the horizontal angle θ of the person as seen from the imaging device 3010. The horizontal angle θ is the angle, viewed from directly above, between the straight line connecting the center of the lens 11 of the imaging device 3010 to the person (for example, H_A or H_B) and the y-axis, which is the optical axis of the lens 11.

まず、生成手段１１０、距離Ｐの算出手法について説明する。第２補正手段１０４により補正された顔画像領域の高さｈ'と、距離Ｐと、の対応を示す距離テーブルを、予め求めておき、記憶手段２３に記憶させる。図１１に距離テーブルの一例を示す。図１１の例では、顔画像領域の高さｈ'が５０ピクセルの場合には、撮像装置３０１０から当該顔画像の人物までの距離Ｐは、５０ｃｍとなる。 First, the method by which the generation unit 110 calculates the distance P will be described. A distance table indicating the correspondence between the height h' of the face image area corrected by the second correction unit 104 and the distance P is obtained in advance and stored in the storage unit 23. FIG. 11 shows an example of the distance table. In the example of FIG. 11, when the height h' of the face image area is 50 pixels, the distance P from the imaging device 3010 to the person in that face image is 50 cm.

生成手段１１０は、当該距離テーブルを参照して、第２補正手段１０４が求めた顔画像領域の高さｈ'に対応する距離Ｐを求める。また、距離テーブルを用いずとも、Ｐ＝Ｆ_２（ｈ'）となる関数Ｆ_２（・）を予め算出し、当該関数を用いて、距離Ｐを求めるようにしてもよい。 The generation unit 110 refers to the distance table to obtain the distance P corresponding to the height h' of the face image area obtained by the second correction unit 104. Alternatively, instead of using the distance table, a function F_2(·) satisfying P = F_2(h') may be calculated in advance and used to obtain the distance P.
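Both ways of obtaining the distance P can be sketched as follows. Only the 50 pixel to 50 cm entry comes from the text; the other table entries, the nearest-entry lookup, and the inverse-proportional form chosen for F_2 (a pinhole-camera style assumption) are illustrative.

```python
# Hypothetical distance table: corrected face height h' (pixels) -> distance P (cm).
# Only the 50 px -> 50 cm entry is taken from the text; the rest are made up.
DISTANCE_TABLE = {25: 100.0, 50: 50.0, 100: 25.0}

def distance_from_table(h_corrected, table=DISTANCE_TABLE):
    """Return P for the table entry whose height is closest to h'."""
    nearest = min(table, key=lambda h: abs(h - h_corrected))
    return table[nearest]

def distance_from_function(h_corrected, k=2500.0):
    """One possible F_2: P = k / h', with k fitted to the 50 px -> 50 cm entry."""
    if h_corrected <= 0:
        raise ValueError("face height must be positive")
    return k / h_corrected
```

A larger corrected face height maps to a shorter distance in both variants; a calibrated system would replace the constant k or the table entries with measured values.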

次に、撮像装置３０１０からの水平方向の角度θの求め方について説明する。距離Ｌと、水平方向角度θと、の対応を示す角度テーブルを予め求めておき、記憶手段２３に記憶させる。図１２に角度テーブルの一例を示す。図１２では、例えば、距離Ｌ_１に対応する水平方向角度はθ_１である。 Next, how to obtain the horizontal angle θ from the imaging device 3010 will be described. An angle table indicating the correspondence between the distance L and the horizontal angle θ is obtained in advance and stored in the storage unit 23. FIG. 12 shows an example of the angle table. In FIG. 12, for example, the horizontal angle corresponding to the distance L_1 is θ_1.

第２補正手段１０４は、当該角度テーブルを参照して、距離Ｌに対応する水平方向角度θを求める。また、距離テーブルを用いずとも、θ＝Ｆ_３（Ｌ）となる関数Ｆ_３（・）を予め算出し、当該関数を用いて、水平方向角度θを求めるようにしてもよい。 The second correction unit 104 refers to the angle table to obtain the horizontal angle θ corresponding to the distance L. Alternatively, instead of using the distance table, a function F_3(·) satisfying θ = F_3(L) may be calculated in advance and used to obtain the horizontal angle θ.

一般的に、撮像装置３０１０に撮像される画像の特性は、レンズ１１の設計により異なる。また、方向性のあるレンズを除き、通常のレンズはレンズ中央から遠ざかる同心円上で同じ特性がある。このことから、レンズの設計による特性が分かれば、画像内の人物画像の位置により、撮像装置３０１０からの水平方向角度θは決定される。 In general, the characteristics of an image picked up by the image pickup device 3010 differ depending on the design of the lens 11. Also, except for directional lenses, ordinary lenses have the same characteristics on concentric circles that are far from the center of the lens. From this, if the characteristics of the lens design are known, the horizontal direction angle θ from the imaging device 3010 is determined by the position of the person image in the image.

図９（Ｄ）に、人物Ｈ_Ａ、Ｈ_Ｂについての位置情報を示す。人物Ｈ_Ａについては、撮像装置３０１０からの距離はＬ_Ａとなり、水平方向の角度はθ_Ａとなる。また、人物Ｈ_Ｂについては、撮像装置３０１０からの距離はＬ_Ｂとなり、水平方向の角度はθ_Ｂとなる。 FIG. 9D shows the position information for the persons H_A and H_B. For the person H_A, the distance from the imaging device 3010 is L_A and the horizontal angle is θ_A. For the person H_B, the distance from the imaging device 3010 is L_B and the horizontal angle is θ_B.

このようにして、生成手段１１０は、撮像装置３０１０からの距離Ｌと、水平方向角度θと、を含む位置情報を生成する。また、位置情報の他の例として、撮像装置３０１０を上から見て、当該撮像装置３０１０を原点（０、０）とした場合の、各人物Ｈ_Ａ、Ｈ_Ｂの座標（ｘ_Ａ、ｙ_Ａ）、（ｘ_Ａ、ｙ_Ｂ）としてもよい。 In this way, the generation unit 110 generates position information including the distance L from the imaging device 3010 and the horizontal angle θ. As another example, the position information may be the coordinates (x_A, y_A) and (x_A, y_B) of the persons H_A and H_B when the imaging device 3010 is viewed from above and taken as the origin (0, 0).
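The coordinate form of the position information can be sketched as follows, assuming that the distance from the imaging device and the horizontal angle θ (measured from the y-axis, the optical axis) are already known. The sign convention, with positive θ toward positive x, is an assumption, as is the function name.

```python
import math

def to_cartesian(distance, theta_deg):
    """Convert (distance, horizontal angle) to top-view coordinates (x, y).

    The imaging device is at the origin and the optical axis is the y-axis,
    so theta is measured from the y-axis (positive toward +x, by assumption).
    """
    theta = math.radians(theta_deg)
    return distance * math.sin(theta), distance * math.cos(theta)
```

A person straight ahead at 100 cm maps to (0, 100); a person at 30 degrees and 100 cm maps to roughly (50, 86.6).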

また、ステップＳ１１において、検出手段１１２が、水平方向角度θが所定範囲内であるか否かを検出することが好ましい。これは換言すれば、検出手段１１２は予め定められた所定領域内に人物がいるか否かを検出することである。所定領域とは、水平方向角度の所定範囲内の領域をいう。図１３に、所定領域Ｍ（ハッチング部分）について示す。また、所定領域以外の領域を所定外領域Ｎという。図１３の例では、所定領域Ｍとは、水平角度（画角）が１００度、つまり、−５０度〜＋５０度の領域である。 In step S11, it is preferable that the detection unit 112 detects whether or not the horizontal angle θ is within a predetermined range. In other words, the detecting means 112 detects whether or not there is a person within a predetermined area. The predetermined area refers to an area within a predetermined range of the horizontal angle. FIG. 13 shows the predetermined region M (hatched portion). An area other than the predetermined area is referred to as a predetermined outside area N. In the example of FIG. 13, the predetermined area M is an area having a horizontal angle (view angle) of 100 degrees, that is, −50 degrees to +50 degrees.
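The check of step S11 can be sketched as follows for the example of FIG. 13, where the predetermined region M spans -50 to +50 degrees. Treating the boundary angles as inside the region is an assumption, and the function names are hypothetical.

```python
def in_predetermined_region(theta_deg, half_angle_deg=50.0):
    """True if the horizontal angle lies inside the predetermined region M."""
    return -half_angle_deg <= theta_deg <= half_angle_deg

def anyone_outside(thetas_deg, half_angle_deg=50.0):
    """True if at least one person is in the region N outside M."""
    return any(not in_predetermined_region(t, half_angle_deg)
               for t in thetas_deg)
```

For persons at 10 and -60 degrees, `anyone_outside` reports that someone is outside the predetermined region, which is the condition used later when the correction rate is chosen.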

ステップＳ１２でＹｅｓと判断されると、ステップＳ１４では、生成手段１１０が生成した位置情報、及び、検出手段１１２の検出結果に基づいて、決定手段１１４は、補正率β（上記式（１）参照）を決定する。決定手段１１４の決定手法について説明する。決定手段１１４は、補正率テーブルを用いて補正率β（補正データ）を決定する。 If YES is determined in step S12, in step S14, the determination unit 114 determines the correction factor β (see the above formula (1)) based on the position information generated by the generation unit 110 and the detection result of the detection unit 112. ). A determination method of the determination unit 114 will be described. The determination unit 114 determines the correction rate β (correction data) using the correction rate table.

まず、人物画像が２つ以上ある場合について説明する。この場合には、決定手段１１４は、撮像装置３０１０から最も遠い人物と撮像装置３０１０から最も近い人物との奥行き方向の距離差Ｄを求める。最も遠い人物と最も近い人物の判断は、決定手段１１４は、距離Ｐの大小を比較すればよい。また、奥行き方向とは、撮像装置３０１０の画角の中心線であるｙ軸方向をいう。 First, a case where there are two or more person images will be described. In this case, the determination unit 114 obtains the distance difference D in the depth direction between the person farthest from the imaging device 3010 and the person closest to the imaging device 3010. To determine the farthest person and the closest person, the determining unit 114 may compare the distances P. The depth direction refers to the y-axis direction that is the center line of the angle of view of the imaging device 3010.

図９（Ｄ）の例では、撮像装置３０１０から最も遠い人物は人物Ｈ_Ａであり、撮像装置３０１０から最も近い人物は人物Ｈ_Ｂである。そして、決定手段１１４は、以下の式（２）により距離差Ｄを求めることが出来る。 In the example of FIG. 9D, the person farthest from the imaging device 3010 is the person H_A, and the person closest to the imaging device 3010 is the person H_B. The determination unit 114 can obtain the distance difference D by the following equation (2).
Ｄ＝Ｐ_Ａｃｏｓθ_Ａ−Ｐ_Ｂｃｏｓθ_Ｂ（２） D = P_A cos θ_A − P_B cos θ_B (2)
更に、決定手段１１４は、検出手段１１２の検出結果を解析することにより、１以上の人物の水平方向の角度が、所定角度（図１３参照）内であるか否かを判断する。これは換言すると、所定外領域Ｎに人物が存在するか否かを判断するものである。 Furthermore, the determination unit 114 analyzes the detection result of the detection unit 112 to determine whether the horizontal angle of one or more persons is within the predetermined angle (see FIG. 13). In other words, it determines whether a person exists in the non-predetermined area N.
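Equation (2) can be sketched as follows. The farthest and nearest persons are chosen by comparing the distances P, as described above, and each distance is projected onto the optical axis with cos θ. Returning 0 when fewer than two persons are given is a convention of this sketch; the data layout and function name are assumptions.

```python
import math

def depth_difference(people):
    """Return the depth-direction distance difference D of equation (2).

    people: list of (P, theta_deg) tuples, one per detected person.
    The farthest and nearest persons are selected by comparing P.
    """
    if len(people) < 2:
        return 0.0  # sketch convention: no difference with fewer than two persons

    def depth(person):
        p, theta_deg = person
        return p * math.cos(math.radians(theta_deg))

    farthest = max(people, key=lambda person: person[0])
    nearest = min(people, key=lambda person: person[0])
    return depth(farthest) - depth(nearest)
```

For a person at P = 200 cm straight ahead and another at P = 100 cm and 60 degrees, D = 200 − 100 · 0.5 = 150 cm.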

そして、決定手段１１４は、奥行き方向の差Ｄおよび検出結果と、に対応する補正率βを求める。例えば、人物Ｈ_Ａ、Ｈ_Ｂについての奥行き方向の距離差Ｄ＝６０ｃｍとし、人物Ｈ_Ａ、Ｈ_Ｂが共に、所定外領域Ｎに存在しない（人物Ｈ_Ａ、Ｈ_Ｂが所定領域Ｍに存在する）場合には、決定手段１１４は、補正率βを９０％（＝０．９）と決定する。また、人物Ｈ_Ａ、Ｈ_Ｂについての奥行き方向の距離差Ｄ＝６０ｃｍとし、人物Ｈ_Ａ、Ｈ_Ｂの何れか一方が、所定外領域Ｎに存在する場合には、決定手段１１４は、補正率βを８０％と決定する。また、人物Ｈ_Ａ、Ｈ_Ｂについての奥行き方向の距離差Ｄ＝１００ｃｍである場合には、人物Ｈ_Ａ、Ｈ_Ｂが所定領域Ｍに存在するか否かに関らず、決定手段１１４は、補正率βを８０％と決定する。 Then, the determination unit 114 obtains the correction rate β corresponding to the depth-direction difference D and the detection result. For example, when the depth-direction distance difference for the persons H_A and H_B is D = 60 cm and neither person is in the non-predetermined area N (both are in the predetermined area M), the determination unit 114 sets the correction rate β to 90% (= 0.9). When D = 60 cm and either H_A or H_B is in the non-predetermined area N, the determination unit 114 sets β to 80%. When D = 100 cm, the determination unit 114 sets β to 80% regardless of whether H_A and H_B are in the predetermined area M.

また、人物画像が３以上である場合には、当該３以上の人物のうち、撮像装置３０１０から最も離れた人物と、撮像装置３０１０から最も近い人物との奥行き方向の距離差Ｄを決定手段１１４は求める。撮像装置３０１０から最も離れた人物と、撮像装置３０１０から最も近い人物との定め方は、３以上の人物画像それぞれの距離Ｐの大小を、決定手段１１４は比較するようにすればよい。 When there are three or more person images, the determination unit 114 obtains the depth-direction distance difference D between the person farthest from the imaging device 3010 and the person closest to the imaging device 3010 among those persons. To identify the farthest and closest persons, the determination unit 114 may compare the distances P of the three or more person images.

また、人物画像が１である場合には、決定手段１１４は、奥行き方向の距離差Ｄ＝０であるとして補正率βを決定する。 When the person image is 1, the determination unit 114 determines the correction rate β assuming that the distance difference D = 0 in the depth direction.

このように判定手段１０２は人物画像の数を計測するようにし、人物画像が「１」の場合と、人物画像が「２以上」の場合とで、補正率βは異なる。 In this way, the determination unit 102 measures the number of person images, and the correction rate β differs between the case where the person image is “1” and the case where the person image is “2 or more”.

また、記憶手段２３に補正率テーブルを記憶させずとも、β＝Ｆ_４（Ｄ）となる関数Ｆ_４（・）を定めて、当該関数を用いるようにしてもよい。 Alternatively, instead of storing the correction rate table in the storage unit 23, a function F_4(·) satisfying β = F_4(D) may be defined and used.
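The choice of β for two or more persons can be sketched as follows. The rule reproduces only the three examples given above (D = 60 cm with everyone inside M gives 90%, D = 60 cm with someone in N gives 80%, D = 100 cm gives 80%); the 100 cm boundary between a "short" and a "long" depth difference is an assumption, since the correction rate table itself is not reproduced here.

```python
def correction_rate_beta(depth_diff_cm, anyone_outside, long_threshold_cm=100.0):
    """Return the vertical correction rate beta as a fraction (0.9 means 90%).

    depth_diff_cm: depth-direction distance difference D.
    anyone_outside: True if at least one person is in the area N outside M.
    long_threshold_cm: hypothetical boundary at which D counts as long.
    """
    if depth_diff_cm >= long_threshold_cm:
        return 0.80          # long depth difference: reduce beta regardless of N
    if anyone_outside:
        return 0.80          # short depth difference, but someone is in N
    return 0.90              # short depth difference, everyone inside M
```

A lower β weakens the vertical correction, which is how the exaggerated perspective and the stretching of persons near the image edges are reduced.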

決定手段１１４が決定した補正率α、βを、補正率伝送手段２１１（図４参照）は、補正率設定手段１３２に送信する。そして、第１補正手段１３１は、補正率設定手段１３２により設定された補正率α、βを用いて、フレーム画像を補正する。 The correction rate transmission unit 211 (see FIG. 4) transmits the correction rates α and β determined by the determination unit 114 to the correction rate setting unit 132. Then, the first correction unit 131 corrects the frame image using the correction rates α and β set by the correction rate setting unit 132.

また、図８のステップＳ７〜Ｓ１１の処理の説明では、顔画像Ａ、Ｂについて同時に説明したが、図８では、ステップＳ６で１つの顔画像があると判定されるごとに、ステップＳ７〜Ｓ１１の処理は行なわれる。 Further, in the description of the processing of steps S7 to S11 in FIG. 8, the face images A and B are described simultaneously. However, in FIG. 8, every time it is determined in step S6 that there is one face image, steps S7 to S11 are performed. Processing is performed.

＜実験結果＞ <Experimental results>
次に、実験結果について説明する。図１５（Ａ）に、撮像装置３０１０が撮像した元画像を示す。図１５（Ｂ）に、補正率α、βを共に「１００％」として、元画像を補正した場合の画像を示す。図１５（Ｃ）に、図８の例に示す処理フローにより決定した補正率β、（α＝１００％）で元画像を補正した場合の画像を示す。 Next, experimental results will be described. FIG. 15A shows an original image captured by the imaging device 3010. FIG. 15B shows the image obtained when the original image is corrected with the correction rates α and β both set to "100%". FIG. 15C shows the image obtained when the original image is corrected with the correction rate β determined by the processing flow shown in the example of FIG. 8 (with α = 100%).

図１５（Ａ）は、レンズ１１による歪曲収差が発生し、両端の人物の姿勢が曲がったり、蛍光灯などが上下に歪み、不自然な画像になっている。図１５（Ｂ）では、補正率α、βを共に「１００％」とした結果、中心から離れるに従い、画像が引き伸ばされる。従って、縦横の本来直線である部分は直線になっているが、特に両端の人物が引き伸ばされる影響を受けて、横長になってしまい、不自然な画像になる。また遠近感が非常に強調されている。 In FIG. 15A, distortion is caused by the lens 11, the posture of the person at both ends is bent, and the fluorescent lamp is distorted vertically, resulting in an unnatural image. In FIG. 15B, when the correction factors α and β are both set to “100%”, the image is stretched as the distance from the center increases. Therefore, although the vertical and horizontal portions that are originally straight lines are straight lines, they are particularly elongated due to the influence of the stretching of the persons at both ends, resulting in an unnatural image. Perspective is also very emphasized.

一方、図１５（Ｃ）は、垂直方向の補正率を人物の配置に合せて調節したことから、水平方向にやや歪みが残るが、垂直方向の直線が保たれるとともに、人物も自然な画像になっている。当実験から明らかなように、実施例１の画像処理装置で決定された補正率を用いて補正することで、ユーザにとって違和感の無い画像に補正できる。 On the other hand, in FIG. 15C, because the vertical correction rate is adjusted according to the arrangement of the persons, slight distortion remains in the horizontal direction, but vertical straight lines are preserved and the persons also appear natural. As this experiment shows, correcting with the correction rate determined by the image processing apparatus of the first embodiment yields an image that does not look unnatural to the user.

図１４記載の補正率テーブルでは、２人以上の人物において奥行き方向の距離Ｄが長い場合には、奥行き方向の距離Ｄが短い場合と比較して、補正率βを小さく設定している。これにより、奥行き方向の遠近感の強調を軽減することが出来る。 In the correction rate table illustrated in FIG. 14, when the distance D in the depth direction is long in two or more persons, the correction rate β is set smaller than in the case where the distance D in the depth direction is short. This can reduce the emphasis on perspective in the depth direction.

また、奥行き方向の距離Ｄが小さい場合でも、所定範囲外の領域Ｎに人物が存在する場合の補正率βは、所定範囲外の領域Ｎに人物が存在しない場合の補正率βよりも低く設定している。これにより、後述する図１５（Ｃ）に示すように、画像両端の人物は引き伸ばされるという現象を緩和することが出来る。つまり、決定手段１１４は、検出手段１１２の検出結果を用いることにより、更に、違和感の無い画像に補正することが出来る。 Even when the distance D in the depth direction is small, the correction rate β when the person is present in the region N outside the predetermined range is set lower than the correction rate β when the person is not present in the region N outside the predetermined range. doing. Thereby, as shown in FIG. 15C, which will be described later, it is possible to mitigate the phenomenon that the persons at both ends of the image are stretched. In other words, the determination unit 114 can further correct the image without a sense of incongruity by using the detection result of the detection unit 112.

［実施例２］ [Example 2]
次に、実施例２におけるテレビ会議システムについて説明する。実施例２では、撮像装置から人物までの距離を求めるのに、人物画像のコントラスト量を用いる。これにより、距離測定の精度を向上させることができる。 Next, the video conference system in Example 2 will be described. In the second embodiment, the contrast amount of the person image is used to obtain the distance from the imaging device to the person. This can improve the accuracy of the distance measurement.

カメラはレンズの位置により、ピントの合う距離が変動する。この変動を利用して被写体とピントが合ったときのレンズの位置を知ることで、被写体とカメラの位置を求めることができる。実施例２では、この手法を用いて距離を測定する。 The focusing distance of the camera varies depending on the lens position. By knowing the position of the lens when the subject is in focus using this variation, the positions of the subject and the camera can be obtained. In Example 2, the distance is measured using this method.

また、いつピントが合っているかを判定するために、実施例２では画像のコントラスト量を用いる。一般にピントがあっているときにはコントラスト量が最大になる。 Further, in order to determine when the image is in focus, the image contrast amount is used in the second embodiment. In general, the amount of contrast is maximized when the subject is in focus.

実施例２におけるテレビ会議システムの概略構成は、図１に示す構成と同様であり、実施例２におけるテレビ会議端末の機能構成は、図２に示す構成と同様である。なお、実施例２では、撮像装置と画像処理装置とについて、それぞれ符号５０１０、５０２０を用いて説明する。 The schematic configuration of the video conference system in the second embodiment is the same as the configuration shown in FIG. 1, and the functional configuration of the video conference terminal in the second embodiment is the same as the configuration shown in FIG. In the second embodiment, an imaging apparatus and an image processing apparatus will be described using reference numerals 5010 and 5020, respectively.

＜撮像装置と画像処理装置のハードウェア構成例について＞ <Example of Hardware Configuration of Imaging Device and Image Processing Device>
図１６は、実施例２における撮像装置５０１０と画像処理装置５０２０のハードウェア構成例を示す。実施例２における撮像装置５０１０と、画像処理装置５０２０とのハードウェアについて、図３と同様のものは同じ符号を付す。以下では、実施例１と異なる処理を行うハードウェアを主に説明する。 FIG. 16 illustrates a hardware configuration example of the imaging device 5010 and the image processing device 5020 according to the second embodiment. Hardware of the imaging device 5010 and the image processing device 5020 that is the same as in FIG. 3 is denoted by the same reference numerals. Hereinafter, hardware that performs processing different from that of the first embodiment will be mainly described.

実施例２における撮像装置５０１０は、フォーカス制御ユニット１５を有する。フォーカス制御ユニット１５は、レンズ１１の位置を調整する。 The imaging device 5010 according to the second embodiment includes a focus control unit 15. The focus control unit 15 adjusts the position of the lens 11.

実施例２における画像処理装置５０２０のＣＰＵ２８は、種々の処理を実行する。また、ＣＰＵ２８は、実施例１と異なり、人物画像のコントラスト量に基づいて、撮像装置５０１０から人物までの距離を測定する。 The CPU 28 of the image processing apparatus 5020 according to the second embodiment executes various processes. Unlike the first embodiment, the CPU 28 measures the distance from the imaging device 5010 to the person based on the contrast amount of the person image.

＜撮像装置と画像処理装置とが用いる情報について＞ <Information Used by Imaging Device and Image Processing Device>
次に、撮像装置５０１０と画像処理装置５０２０との各機能部が用いる情報の詳細について説明する。図１７は、各装置が用いる情報の一例を示す。実施例１と同様の処理を行う機能部には、図４に示す機能と同じ符号を付す。図１７では、実施例１と異なる機能や情報について主に説明する。 Next, details of the information used by the functional units of the imaging device 5010 and the image processing device 5020 will be described. FIG. 17 shows an example of the information used by each device. Functional units that perform the same processing as in the first embodiment are denoted by the same reference numerals as the functions illustrated in FIG. 4. With reference to FIG. 17, functions and information that differ from those in the first embodiment will be mainly described.

フォーカス制御ユニット１５は、画像処理装置５０２０のＣＰＵ２８からの信号によりレンズ１１の位置を変動させる。画像取得手段１２１は、レンズ１１の各位置で撮影されたフレーム画像を取得する。この各フレーム画像に対して以降の処理が施される。なお、第１補正手段１３１による補正は、実施例１と同様である。 The focus control unit 15 changes the position of the lens 11 by a signal from the CPU 28 of the image processing device 5020. The image acquisition unit 121 acquires frame images taken at each position of the lens 11. The subsequent processing is performed on each frame image. The correction by the first correction unit 131 is the same as that in the first embodiment.

ＣＰＵ２８は、フレーム画像伝送手段１４２から伝送された各フレーム画像からコントラスト量を算出する。ＣＰＵ２８は、コントラスト量に基づいて距離を算出し、補正率を算出する。ＣＰＵ２８の詳細な処理については後述する。 The CPU 28 calculates a contrast amount from each frame image transmitted from the frame image transmission unit 142. The CPU 28 calculates a distance based on the contrast amount and calculates a correction rate. Detailed processing of the CPU 28 will be described later.

＜画像処理装置の処理について＞ <Processing of Image Processing Apparatus>
次に、実施例２における画像処理装置５０２０の処理について説明する。図１８は、実施例２におけるＣＰＵ２８の機能構成の一例を示す図である。 Next, processing of the image processing apparatus 5020 in the second embodiment will be described. FIG. 18 is a diagram illustrating an example of the functional configuration of the CPU 28 according to the second embodiment.

図１８に示す機能において、図７に示す機能と同様のものは同じ符号を付す。実施例２におけるＣＰＵ２８は、コントラスト測定手段２０２及び生成手段２０４を有する。 Among the functions shown in FIG. 18, those that are the same as the functions shown in FIG. 7 are denoted by the same reference numerals. The CPU 28 in the second embodiment includes a contrast measurement unit 202 and a generation unit 204.

コントラスト測定手段２０２は、各フレーム画像に対し、判定手段１０２により検出された人物画像のコントラスト量を計算する。以降では、人物画像として、例えば人物の顔領域の画像（顔画像）を用いるが、これに限られない。コントラスト測定手段２０２は、顔画像のコントラスト量として、顔画像内の最大輝度と最小輝度の差を用いてもよいし、顔画像内の輝度の分散を用いてもよい。 The contrast measuring unit 202 calculates the contrast amount of the person image detected by the determining unit 102 for each frame image. Hereinafter, for example, an image of a human face area (face image) is used as the person image, but the present invention is not limited to this. The contrast measuring unit 202 may use the difference between the maximum luminance and the minimum luminance in the face image as the contrast amount of the face image, or may use the variance of the luminance in the face image.
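The two contrast measures named above can be sketched as follows, assuming that the luminance values of the pixels inside a face image are available as a flat list of numbers. The function names are illustrative, and the population variance is used for the variance measure.

```python
from statistics import pvariance

def contrast_range(luma):
    """Contrast amount as the difference between maximum and minimum luminance."""
    return max(luma) - min(luma)

def contrast_variance(luma):
    """Contrast amount as the (population) variance of the luminance values."""
    return pvariance(luma)
```

A blurred face image has luminance values close together, so both measures are small; as the image comes into focus, edges sharpen and both measures grow.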

生成手段２０４は、実施例１と異なり、顔画像のコントラスト量が最大になるレンズ１１の位置から人物（例えば顔）までの距離を算出する。生成手段２０４は、レンズ１１の位置と実際の距離とを対応付けたレンズ位置−距離テーブルを予め作成しておく。このレンズ位置−距離テーブルは、例えば記憶手段２３などに記憶され、生成手段２０４により適宜読み出される。 Unlike the first embodiment, the generation unit 204 calculates the distance to the person (for example, the face) from the position of the lens 11 at which the contrast amount of the face image is maximized. The generation unit 204 creates in advance a lens position-distance table in which positions of the lens 11 are associated with actual distances. This lens position-distance table is stored in, for example, the storage unit 23 and is read out by the generation unit 204 as needed.

図１９は、レンズ位置−距離テーブルの一例を示す。図１９に示す例では、レンズ位置Ｓと距離Ｐとが対応付けられている。距離Ｐは、例えば、撮像装置５０１０から人物までの距離である。図１９に示す例では、例えば、レンズ位置Ｓ_１１に対する距離は、Ｐ_１１である。生成手段２０４は、図１９に示すようなレンズ位置−距離テーブルを参照して、レンズ位置に対応する距離Ｐを求める。 FIG. 19 shows an example of the lens position-distance table. In the example shown in FIG. 19, the lens position S and the distance P are associated with each other. The distance P is, for example, the distance from the imaging device 5010 to a person. In the example shown in FIG. 19, the distance for the lens position S_11 is P_11. The generation unit 204 refers to a lens position-distance table such as that shown in FIG. 19 to obtain the distance P corresponding to the lens position.

また、生成手段２０４は、レンズ位置−距離テーブルを用いずとも、Ｐ＝Ｆ_３（Ｓ）となる関数Ｆ_３（・）を予め算出し、当該関数を用いて、距離Ｐを求めるようにしてもよい。 Alternatively, instead of using the lens position-distance table, the generation unit 204 may calculate in advance a function F_3(·) satisfying P = F_3(S) and use that function to obtain the distance P.
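The distance estimation from a focus sweep can be sketched as follows. The sketch assumes that, for each lens position, a record of (lens position, face image ID, contrast amount) has been collected, and that a lens position-distance table is available; the table values and names are hypothetical.

```python
# Hypothetical lens position -> distance table (cm); real values come from calibration.
LENS_DISTANCE_TABLE = {0: 50.0, 1: 100.0, 2: 200.0, 3: 400.0}

def distances_from_sweep(records, table=LENS_DISTANCE_TABLE):
    """Return {face_id: P} from a focus sweep.

    records: iterable of (lens_position, face_id, contrast) tuples.
    For each face, the lens position with the maximum contrast is taken as
    the in-focus position and converted to a distance via the table.
    """
    best = {}  # face_id -> (lens_position, contrast) with the highest contrast so far
    for lens_pos, face_id, contrast in records:
        if face_id not in best or contrast > best[face_id][1]:
            best[face_id] = (lens_pos, contrast)
    return {face_id: table[pos] for face_id, (pos, _) in best.items()}
```

If face A is sharpest at lens position 1 and face B at lens position 2, the sketch returns 100 cm for A and 200 cm for B with the table above. Each face is handled independently, so persons at different depths can be measured in a single sweep.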

図２０は、画像処理装置５０２０の主な処理フローを示す。図２０に示すステップＳ１０１で、判定手段１０２は、顔認識処理を行う。ステップＳ１０１の処理は、図８に示すステップＳ２〜Ｓ４の処理と同様である。 FIG. 20 shows a main processing flow of the image processing apparatus 5020. In step S101 shown in FIG. 20, the determination unit 102 performs face recognition processing. The processing in step S101 is the same as the processing in steps S2 to S4 shown in FIG.

ステップＳ１０２で、コントラスト測定手段２０２、生成手段２０４などは、認識された顔画像から距離Ｐを算出する。ステップＳ１０２の詳細な処理は、図２１を用いて後述する。 In step S102, the contrast measurement unit 202, the generation unit 204, and the like calculate the distance P from the recognized face image. Detailed processing in step S102 will be described later with reference to FIG.

ステップＳ１０３で、決定手段１１４は、奥行き方向の差Ｄを用いて、補正率βを決定する。ステップＳ１０３の処理は、図８に示すステップＳ１４の処理と同様である。 In step S103, the determination unit 114 determines the correction rate β using the difference D in the depth direction. The process of step S103 is the same as the process of step S14 shown in FIG.

Next, the processing of step S102 is described. FIG. 21 shows an example of the distance measurement flow in the second embodiment. In step S201 shown in FIG. 21, the contrast measurement unit 202 determines whether the number of face images recognized by the determination unit 102 in the target frame image is one or more. If the number of face images is one or more (Yes in step S201), the process proceeds to step S202. Otherwise (No in step S201), the distance measurement process ends.

In step S202, the contrast measurement unit 202 acquires a frame image including the face images.

In step S203, the contrast measurement unit 202 calculates the contrast amount of each face image in the acquired frame image. The contrast amount of a face image may be the difference between the maximum luminance and the minimum luminance in the face image, or the variance of the luminance in the face image.
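The two contrast measures named in step S203 can be sketched as follows. The function names and the representation of a face image as a list of rows of luminance values are assumptions made for illustration. The population variance is used here, since the description does not specify which variance is meant.

```python
from statistics import pvariance


def contrast_range(face_luma):
    """Contrast amount as maximum luminance minus minimum luminance."""
    values = [v for row in face_luma for v in row]
    return max(values) - min(values)


def contrast_variance(face_luma):
    """Contrast amount as the (population) variance of the luminance."""
    values = [v for row in face_luma for v in row]
    return pvariance(values)
```

Either measure tends to grow as the face comes into focus, which is what the later steps rely on when they search for the lens position with the largest contrast amount.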

In step S204, the contrast measurement unit 202 stores the current lens position, the face image ID (for example, face image A or B), and the contrast amount in memory. The memory is, for example, the storage unit 23.

In step S205, the contrast measurement unit 202 determines whether processing has been performed at all lens positions. To make this determination, the contrast measurement unit 202 only needs to know the movable range of the lens 11. If all lens positions have been processed (Yes in step S205), the process proceeds to step S207; otherwise (No in step S205), the process proceeds to step S206.

In step S206, the CPU 28 transmits a signal for shifting the position of the lens 11 to the focus control unit 15. That is, the CPU 28 controls the focus control unit 15 of the imaging device 5010 to adjust the position of the lens 11. After step S206, the process returns to step S202, and the processing of steps S202 to S204 is performed at another lens position.

In step S207, the generation unit 204 obtains, for each face image, the position of the lens 11 at which the contrast amount is maximized. The generation unit 204 can find this lens position by comparing the stored contrast amounts for each face image.

In step S208, the generation unit 204 obtains the distance P for each face image by looking up the lens position of maximum contrast in the lens position-distance table.
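Steps S204, S207, and S208 can be sketched as follows, assuming that the records stored in step S204 are available as (lens position, face image ID, contrast amount) tuples. The function names and the record layout are hypothetical, and `lens_to_distance` stands in for either the lens position-distance table or the function F_3.

```python
def best_lens_positions(records):
    """Return {face_id: lens position with the largest contrast amount}.

    records is an iterable of (lens_position, face_id, contrast) tuples, one
    per face image per lens position, as stored in step S204. On a tie the
    earliest record wins.
    """
    best = {}
    for lens_position, face_id, contrast in records:
        if face_id not in best or contrast > best[face_id][1]:
            best[face_id] = (lens_position, contrast)
    return {face_id: pos for face_id, (pos, _) in best.items()}


def distances_per_face(records, lens_to_distance):
    """Map each face image to a distance P (step S208).

    lens_to_distance is any callable that converts a lens position to a
    distance, for example a table lookup or a fitted function.
    """
    positions = best_lens_positions(records)
    return {face_id: lens_to_distance(pos) for face_id, pos in positions.items()}
```

Keeping the conversion as a separate callable means the sweep logic does not depend on whether a table or a relational expression is used.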

After obtaining the distance P, the generation unit 204 generates position information including the position of the face image recognized by the recognition unit 108 and the distance P.

Once the generation unit 204 has obtained the position information, the detection unit 112 and the determination unit 114 perform the same processing as in the first embodiment. The determination unit 114 thereby determines the correction rate β.

<Distance Measurement Procedure in the Second Embodiment>
Next, the distance measurement procedure in the second embodiment is described with reference to FIGS. 22 to 25. FIG. 22 shows the distance measurement procedure (part 1) in the second embodiment. As shown in FIG. 22, the determination unit 102 (1) performs, for example, face recognition on the frame image from the image acquisition unit 121. The determination unit 102 stores the recognition result information in memory. The memory may be the storage unit 23 or another storage unit. In the case shown in FIG. 22, for example, the determination unit 102 stores the coordinates and sizes of the rectangles of faces A and B in memory.

FIG. 23 shows the distance measurement procedure (part 2) in the second embodiment. As shown in FIG. 23, the imaging device 5010 (2) captures images continuously while moving the position of the lens 11. To do this, the CPU 28 controls the focus control unit 15 to move the lens. The next process (3) may be performed after capturing at all lens positions has finished, or each time a single image is captured.

In this example, a single lens is described for simplicity, but a plurality of lenses may be used. When the imaging device 5010 has a plurality of lenses, the contrast amount may be measured for each of the lenses.

The contrast measurement unit 202 (3) measures the contrast amount of each face image in each captured image. The contrast measurement unit 202 stores the measured contrast amount, the lens position, and the face image in memory in association with one another. The memory is, for example, the storage unit 23, but may be another storage unit.

FIG. 24 shows the distance measurement procedure (part 3) in the second embodiment. As shown in FIG. 24, the generation unit 204 (4) obtains, for each face image, the lens position at which the contrast amount is maximized. For example, the generation unit 204 obtains the lens position lensA (mm) from the image in which the contrast amount of face image A is maximized, and the lens position lensB (mm) from the image in which the contrast amount of face image B is maximized.

FIG. 25 shows the distance measurement procedure (part 4) in the second embodiment. As shown in FIG. 25, the generation unit 204 (5) obtains the actual distance from the lens position S and the relational expression F_3 obtained in advance. The relational expression F_3 between the lens position and the actual distance is obtained in advance, for example by experiment. To improve the accuracy of the actual distance, a relational expression F_3 may be prepared for each angle from the front of the camera. The relational expression F_3 is not necessarily linear as shown in FIG. 25 and may be expressed by, for example, a quadratic function.
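One way to obtain a linear relational expression from experimental measurements is an ordinary least-squares fit, sketched below. The fitting method, the function name `fit_linear`, and the calibration samples are assumptions; the description only states that F_3 is obtained in advance, for example by experiment. A quadratic F_3, or one expression per camera angle, would need a different fit or one fit per angle.

```python
def fit_linear(samples):
    """Fit P = a * S + b to (S, P) calibration pairs by least squares.

    Returns a callable that maps a lens position S to an estimated distance P.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two calibration pairs are required")
    mean_s = sum(s for s, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    sxx = sum((s - mean_s) ** 2 for s, _ in samples)
    if sxx == 0:
        raise ValueError("lens positions must not all be identical")
    sxy = sum((s - mean_s) * (p - mean_p) for s, p in samples)
    a = sxy / sxx
    b = mean_p - a * mean_s
    return lambda s: a * s + b


# Hypothetical calibration data: (lens position S, measured distance P).
f3 = fit_linear([(1.0, 1.0), (2.0, 3.0), (3.0, 5.0)])
```

The returned callable can be used wherever a lens position has to be converted to a distance, in place of the lens position-distance table.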

As described above, according to the second embodiment, the distance is obtained using focus adjustment, and correction is performed with the correction rate determined from that distance, so that the image can be corrected into one that does not look unnatural to the user.

Further, the image processing apparatus of the present embodiment determines the correction rate of the vertical component according to the position information indicating the position of the person relative to the imaging device. That is, the correction rate of the vertical component is determined according to the arrangement of the persons; in other words, a correction rate is determined for a correction that leaves part of the distortion in place. It is therefore possible to determine a correction rate that yields an image that does not look unnatural to the user.

In the examples shown in FIGS. 4 and 18, the imaging devices 3010 and 5010 have the first correction unit 131. Alternatively, the image processing apparatuses 3020 and 5020 may have the first correction unit 131 and perform the correction of formula (1) above.

The above embodiments may also be combined so that either process can be performed, allowing the user to switch between the processing of the first embodiment and the processing of the second embodiment.

The present invention is not limited to the above embodiments as they are; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed. For example, some constituent elements may be deleted from all the constituent elements shown in the embodiments.

DESCRIPTION OF SYMBOLS
11 ... Lens
12 ... Sensor
13 ... Image processing unit
14 ... I/F means
21 ... I/F means
22, 28 ... CPU
23 ... Storage means
24 ... Video output unit
25 ... Communication unit
26 ... Control unit
27 ... Bus
102 ... Determination means
104 ... Second correction means
106 ... Measuring means
108 ... Recognizing means
110 ... Generating means
112 ... Detecting means
114 ... Determining means
131 ... First correction means
132 ... Correction rate setting means
141 ... Converted frame image transmission means
142 ... Frame image transmission means
202 ... Contrast generation means
3010 ... Imaging device
3020 ... Image processing apparatus

Japanese Patent No. 4279613

Claims

An image processing apparatus comprising:
determination means for determining whether or not a person image exists in an image captured by an imaging device;
measurement means for measuring the size or contrast amount of the person image when the determination means determines that the person image exists in the image;
recognition means for recognizing the position of the person image in the image when the determination means determines that the person image exists in the image;
generating means for generating, based on the position of the person image recognized by the recognition means and the size or contrast amount of the person image measured by the measurement means, position information indicating the position, relative to the imaging device, of the person related to the person image;
determining means for determining a correction rate for correcting the image based on the position information generated by the generating means; and
correction means for correcting the size of the person image measured by the measurement means based on the position of the person image recognized by the recognition means,
wherein the generating means generates the position information based on the position of the person image recognized by the recognition means and the size of the person image corrected by the correction means.

The image processing apparatus according to claim 1, wherein
the determination means counts the number of person images present in the image, and
when the number of person images counted by the determination means is two or more,
the determining means determines the correction rate based on the difference in distance in the depth direction between the person farthest from the imaging device and the person closest to the imaging device among the persons related to the two or more person images.

The image processing apparatus according to claim 2, wherein,
when the number of person images counted by the determination means is one, the determining means determines the correction rate on the assumption that the distance difference in the depth direction is zero.

The image processing apparatus according to any one of claims 1 to 3, further comprising
detecting means for detecting whether or not the angle formed by
a straight line connecting the center of the lens of the imaging device to the person related to the person image and
the optical axis of the imaging device
is within a predetermined range,
wherein the determining means determines the correction rate based on the position information and the detection result of the detecting means.

The image processing apparatus according to any one of claims 1 to 4, wherein the size of the person image is the length in the y-axis direction of the face image of the person related to the person image.

The image processing apparatus according to any one of claims 1 to 5, wherein
the determining means determines the correction rate for performing correction so that distortion remains in the y-axis direction component of the image.

The image processing apparatus according to claim 1, wherein
the measurement means measures the contrast amount of the person image in a plurality of images captured while changing the position of the lens, and
the generating means obtains the distance from the imaging device to the person based on the lens position of the image in which the contrast amount is maximum.

An image processing method comprising:
a determination step of determining whether or not a person image exists in an image captured by an imaging device;
a measurement step of measuring the size or contrast amount of the person image when it is determined in the determination step that the person image exists in the image;
a recognition step of recognizing the position of the person image in the image when it is determined in the determination step that the person image exists in the image;
a correction step of correcting the size of the person image measured in the measurement step based on the position of the person image recognized in the recognition step;
a generation step of generating, based on the position of the person image recognized in the recognition step and the size or contrast amount of the person image measured in the measurement step, position information indicating the position, relative to the imaging device, of the person related to the person image; and
a determining step of determining a correction rate for correcting the image based on the position information generated in the generation step,
wherein, in the generation step, the position information is generated based on the position of the person image recognized in the recognition step and the size of the person image corrected in the correction step.