JP6996538B2 - Image processing apparatus, image processing method, and image processing system

Info

Publication number: JP6996538B2
Application number: JP2019190608A
Authority: JP
Inventors: 宣浩綱島
Original assignee: Sony Corp; Sony Group Corp
Current assignee: Sony Corp; Sony Group Corp
Priority date: 2019-10-17
Filing date: 2019-10-17
Publication date: 2022-01-17
Anticipated expiration: 2035-04-14
Also published as: JP2020018004A

Description

The present disclosure relates to an image processing apparatus, an image processing method, and an image processing system.

Conventionally, various techniques have been developed for cutting out, from a captured image, the region of an object to be detected, such as a person.

For example, Patent Document 1 describes a technique of detecting moving bodies in an image captured by a fisheye lens camera and cutting out the circumscribed rectangular region of each detected moving body. Patent Document 2 describes a technique of extracting a partial region from each frame image based on position information of the partial region extracted in the immediately preceding frame image and on physical feature amounts analyzed from the current frame image. Patent Document 3 describes a technique of detecting, among the moving bodies extracted from video data, the moving body having the largest size, existence time, or moving speed, and cutting out a region including the detected moving body.

Patent Document 1: Japanese Unexamined Patent Publication No. 2001-333422
Patent Document 2: Japanese Unexamined Patent Publication No. 2004-334587
Patent Document 3: Japanese Unexamined Patent Publication No. 2014-222825

However, with the techniques described in Patent Documents 1 to 3, the position of the cutout region may be constrained. For example, with the technique described in Patent Document 3, if an object that has once been set as the detection target stays in the same place, that same place continues to be set as the cutout region for a long time.

Therefore, the present disclosure proposes a new and improved image processing apparatus, image processing method, and image processing system capable of determining a cutout region adaptively to the length of time for which the object to be detected has been stopped.

According to the present disclosure, there is provided an image processing apparatus including a control unit that, at a first time point of a moving image, selects a first object as a gaze target from among one or more detected objects based on a predetermined criterion, and that, when it detects at a second time point later than the first time point that the first object has been stopped for a first predetermined time or longer, excludes the first object from the gaze target.

Further, according to the present disclosure, there is provided an image processing method including: selecting, at a first time point of a moving image, a first object as a gaze target from among one or more detected objects based on a predetermined criterion; and excluding the first object from the gaze target when it is detected, at a second time point later than the first time point, that the first object has been stopped for a predetermined time or longer.

Further, according to the present disclosure, there is provided an image processing system including: a control unit that, at a first time point of a moving image, selects a first object as a gaze target from among one or more detected objects based on a predetermined criterion, and that, when it detects at a second time point later than the first time point that the first object has been stopped for a predetermined time or longer, excludes the first object from the gaze target; a cutout region determination unit that determines a cutout region so as to include the detection position of the object that is the gaze target; a cutout image generation unit that generates a cutout image by cutting out the cutout region determined by the cutout region determination unit; and a storage unit that stores the generated cutout image.

As described above, according to the present disclosure, the cutout region can be determined adaptively to the length of time for which the object to be detected has been stopped. Note that the effects described here are not necessarily limiting, and the effect may be any of the effects described in the present disclosure.

An explanatory diagram showing a configuration example of an image processing system according to an embodiment of the present disclosure.
An explanatory diagram showing an example of the reduced image 32 generated by the camera 10.
An explanatory diagram showing an example of a plurality of cropped images 50 generated from the frame image 30.
A functional block diagram showing the configuration of the camera 10 according to the embodiment.
An explanatory diagram showing the relationship between the frame image 30 and the cropping region 40.
A functional block diagram showing the configuration of the monitoring terminal 22 according to the embodiment.
An explanatory diagram showing a display example of the evaluation criteria setting screen according to the embodiment.
A functional block diagram showing the configuration of the area setting unit 104 according to the embodiment.
A flowchart showing the operation according to the embodiment.
A flowchart showing part of the operation of the cropped image generation processing according to the embodiment.
A flowchart showing part of the operation of the cropped image generation processing according to the embodiment.

Preferred embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings. In the present specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and duplicate description is omitted.

Further, in the present specification and the drawings, a plurality of components having substantially the same functional configuration may be distinguished by appending different letters to the same reference numeral. For example, a plurality of components having substantially the same functional configuration are distinguished as necessary, as in the video cropping unit 106a and the video cropping unit 106b. However, when there is no particular need to distinguish among such components, only the common reference numeral is used. For example, when there is no particular need to distinguish between the video cropping unit 106a and the video cropping unit 106b, they are simply referred to as the video cropping unit 106.

In addition, the "Mode for Carrying Out the Invention" will be described in the following order.
1. Basic configuration of image processing system
2. Detailed description of the embodiment
3. Modification examples

<< 1. Basic configuration of image processing system >>
The present disclosure may be implemented in a variety of forms, as described in detail in "2. Detailed description of the embodiment" as one example. First, the basic configuration of the image processing system according to the present embodiment will be described with reference to FIG. 1.

As shown in FIG. 1, the image processing system according to the present embodiment includes a camera 10, a storage 20, a monitoring terminal 22, and a communication network 24.

<1-1. Camera 10>
The camera 10 is an example of the image processing apparatus in the present disclosure. The camera 10 is a device for capturing moving images of the external environment. The camera 10 may be installed in a place with heavy pedestrian or vehicle traffic, a place to be monitored, or the like. For example, the camera 10 may be installed on a road, or at a station, airport, commercial building, amusement park, park, parking lot, or restricted area.

Further, the camera 10 can generate another image from a captured frame image and can transmit the generated image to another device via the communication network 24 described later. Here, the frame image is, for example, an image having the highest resolution that the camera 10 can capture. As an example, the frame image may be a 4K image.

For example, the camera 10 generates another image with a smaller amount of data based on the frame image. This is because the frame image has a large amount of data, so transmitting the frame image itself to another device is undesirable, for example because transmission would take a long time.

Here, examples of the other images generated by the camera 10 are a reduced image, which is the frame image with its resolution simply lowered, and a cropped image, which is an image obtained by cropping (cutting out) the region of a gaze target. The reduced image may be, for example, a full HD image.

FIG. 2 is an explanatory diagram showing an example of a reduced image (reduced image 32). The reduced image 32 includes the entire region contained in the frame image. On the other hand, as shown in FIG. 2, the region of a gaze target, such as a person's face, can become very small in the reduced image 32 and therefore hard to see. The region 40 shown in FIG. 2 is a region corresponding to a cropping region described later. A cropping region is normally set in the frame image, but in FIG. 2, for convenience of explanation, the region of the reduced image 32 that corresponds to the cropping region is shown as the region 40.

FIG. 3 is an explanatory diagram showing an example of a plurality of cropped images generated from a single frame image (a set 52 of cropped images 50). A cropped image 50 has the same resolution as the frame image, but as shown in FIG. 3, each cropped image 50 contains only part of the frame image. The camera 10 according to the present embodiment therefore basically generates one reduced image and one or more cropped images from one frame image. With this approach, the user can check the entire scene captured by the camera 10 and can also check the region of the gaze target at high resolution, while the total amount of data is kept below that of the frame image.
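The data-amount argument above can be illustrated with a rough pixel count. The following is only a sketch: the text says the frame image may be 4K and the reduced image may be full HD, but the exact pixel dimensions and the cropping-region size used here are assumptions, and compression is ignored.

```python
# Pixel counts as a rough proxy for data amount (compression is ignored).
FRAME_4K = (3840, 2160)      # frame image; "4K" per the text, exact size assumed
REDUCED_FHD = (1920, 1080)   # reduced image; "full HD" per the text, exact size assumed
CROP_SIZE = (640, 360)       # hypothetical cropping-region size (not given in the text)
NUM_CROPS = 4                # four video cropping units, as in the FIG. 4 example


def pixels(size):
    """Number of pixels in an image of the given (width, height)."""
    width, height = size
    return width * height


def transmitted_pixels(reduced, crop, num_crops):
    """Pixels sent per frame: one reduced image plus num_crops cropped images."""
    return pixels(reduced) + num_crops * pixels(crop)


frame_px = pixels(FRAME_4K)
sent_px = transmitted_pixels(REDUCED_FHD, CROP_SIZE, NUM_CROPS)
print(f"frame: {frame_px} px, sent: {sent_px} px, ratio: {sent_px / frame_px:.2f}")
```

Under these assumed sizes, the reduced image plus four cropped images amount to roughly a third of the pixels of the frame image; the saving shrinks as the cropping regions grow.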

The internal configuration of the camera 10 will now be described with reference to FIG. 4. As shown in FIG. 4, the camera 10 includes a photographing unit 100, a video reduction unit 102, an area setting unit 104, a plurality of video cropping units 106, and a communication unit 108. Although FIG. 4 shows an example in which four video cropping units 106 are provided, the configuration is not limited to this example, and any number of one or more may be provided.

[1-1-1. Photographing unit 100]
The photographing unit 100 has a function of acquiring a frame image by forming an image of the outside scene, through a lens, on an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor. For example, the photographing unit 100 acquires frame images by capturing the outside scene at a predetermined frame rate.

[1-1-2. Video reduction unit 102]
The video reduction unit 102 generates a reduced image by reducing the frame image acquired by the photographing unit 100 to a predetermined size.

[1-1-3. Area setting unit 104]
The area setting unit 104 sets, in the frame image acquired by the photographing unit 100, a cropping region, which is the region from which a cropped image is generated. For example, the area setting unit 104 sets as many cropping regions in the frame image acquired by the photographing unit 100 as there are video cropping units 106 provided in the camera 10.

FIG. 5 is an explanatory diagram showing an example of how the area setting unit 104 sets a cropping region. In FIG. 5, the width of the cropping region 40 is denoted "crop_width" and the height of the cropping region 40 is denoted "crop_height".

As shown in FIG. 5, the area setting unit 104 detects an object to be detected, such as a person 300, in the frame image 30, and sets the cropping region 40 based on the detection position 302 of the object.
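One way to set a cropping region from a detection position is sketched below. The text only states that the region is set based on the detection position 302; centering the region on that position and shifting it to stay inside the frame image are assumptions made for illustration, and `Rect` and `set_cropping_region` are hypothetical names. Only `crop_width` and `crop_height` come from FIG. 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    width: int
    height: int


def set_cropping_region(detect_x, detect_y, frame_width, frame_height,
                        crop_width, crop_height):
    """Return a crop_width x crop_height region around the detection position.

    Assumption: the region is centered on the detection position and then
    shifted just enough to stay inside the frame image.
    """
    if crop_width > frame_width or crop_height > frame_height:
        raise ValueError("cropping region must fit inside the frame image")
    left = detect_x - crop_width // 2
    top = detect_y - crop_height // 2
    # Clamp so that the whole region lies within the frame image.
    left = max(0, min(left, frame_width - crop_width))
    top = max(0, min(top, frame_height - crop_height))
    return Rect(left, top, crop_width, crop_height)


region = set_cropping_region(1000, 500, 3840, 2160, 640, 360)
print(region)
```

Clamping keeps the cropped image at a fixed size even when the object is near the edge of the frame, at the cost of the object no longer being centered.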

[1-1-4. Video cropping unit 106]
The video cropping unit 106 is an example of the cutout image generation unit in the present disclosure. The video cropping unit 106 generates a cropped image by cutting out the cropping region set by the area setting unit 104 from the frame image acquired by the photographing unit 100.

For example, FIG. 3 shows four cropped images 50, one generated by each of four video cropping units 106. As shown in FIG. 3, the video cropping unit 106a, for example, generates the cropped image 50a from the cropping region, set by the area setting unit 104, that corresponds to the region 40a shown in FIG. 2. Similarly, the video cropping unit 106b generates the cropped image 50b from the cropping region, set by the area setting unit 104, that corresponds to the region 40b shown in FIG. 2.

[1-1-5. Communication unit 108]
The communication unit 108 is an example of the acquisition unit according to the present embodiment. The communication unit 108 transmits and receives various kinds of information to and from devices connected to the communication network 24, described later, via the communication network 24. For example, the communication unit 108 transmits the reduced image acquired by the video reduction unit 102 and the plurality of cropped images generated by the plurality of video cropping units 106 to the storage 20. The communication unit 108 also receives, from the monitoring terminal 22, detection designation information set by the user for selecting the cropping target.

Note that the detection designation information may be stored in the camera 10 from the outset instead of being received from the monitoring terminal 22. The following description focuses on the example in which the detection designation information is received from the monitoring terminal 22.

<1-2. Storage 20>
The storage 20 is a storage device for storing the reduced images and cropped images received from the camera 10. For example, the storage 20 stores the identification information of the camera 10, the shooting date and time, the received reduced image, and the received plurality of cropped images in association with one another. The storage 20 may be installed in, for example, a data center or a monitoring center where monitoring staff work.

<1-3. Monitoring terminal 22>
The monitoring terminal 22 is an information processing terminal for displaying the reduced images and cropped images generated by the camera 10. The monitoring terminal 22 may be installed in, for example, a monitoring center and used by monitoring staff.

The configuration of the monitoring terminal 22 will now be described in detail. FIG. 6 is a functional block diagram showing the configuration of the monitoring terminal 22 according to the present embodiment. As shown in FIG. 6, the monitoring terminal 22 has a control unit 220, a communication unit 222, a display unit 224, and an input unit 226.

[1-3-1. Control unit 220]
The control unit 220 controls the overall operation of the monitoring terminal 22 using hardware built into the monitoring terminal 22, such as a CPU (Central Processing Unit), RAM (Random Access Memory), and ROM (Read Only Memory).

[1-3-2. Communication unit 222]
The communication unit 222 transmits and receives various kinds of information to and from devices connected to the communication network 24, described later, via the communication network 24. For example, the communication unit 222 receives from the storage 20 the reduced images and cropped images stored there. The communication unit 222 can also receive the reduced image and the plurality of cropped images generated by the camera 10 directly from the camera 10.

Further, under the control of the control unit 220, the communication unit 222 transmits to the camera 10 the detection designation information for selecting the object to be cropped, which the user has entered on the evaluation criteria setting screen described later.

[1-3-3. Display unit 224]
The display unit 224 is configured by a display such as a liquid crystal display (LCD: Liquid Crystal Display) or an OLED (Organic Light Emitting Diode) display. The display unit 224 displays, for example, a monitoring screen including the reduced image or cropped images received from the storage 20.

Further, the display unit 224 displays the evaluation criteria setting screen under the control of the control unit 220. The evaluation criteria setting screen is a screen on which the user sets (or changes) the detection designation information used to determine the object to be cropped from the frame image captured by the camera 10. For example, the detection designation information is set so as to include one or more evaluation items selected on the evaluation criteria setting screen and the evaluation criterion for each of those evaluation items.

A display example of the evaluation criteria setting screen (evaluation criteria setting screen 60) will now be described with reference to FIG. 7. As shown in FIG. 7, the evaluation criteria setting screen 60 includes setting fields 600 for a plurality of evaluation items, such as an object size setting field 600a and an object speed setting field 600b. As shown in FIG. 7, the evaluation items include, for example, the size of the object, the speed of the object, the staying time of the object, the aspect ratio of the object, the predicted time until the object leaves the screen, the distance between the object and the home position, the tracking time of the object, and the stop time of the object. Here, the aspect ratio of the object is an evaluation item used to distinguish the type of the object to be detected, for example to distinguish a person from an automobile. One advantage of using the aspect ratio is that the type of object can be identified with a small amount of computation.
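As an illustration of why the aspect ratio is computationally cheap, a type check needs only one division and one comparison per object. The sketch below is hypothetical: the text does not define how the aspect ratio is computed or what the specified value is, so the height-to-width definition, the threshold of 1.5, and the function names are all assumptions.

```python
def aspect_ratio(box_width, box_height):
    """Height divided by width of the object's bounding box (assumed definition)."""
    if box_width <= 0 or box_height <= 0:
        raise ValueError("bounding box dimensions must be positive")
    return box_height / box_width


def classify_by_aspect_ratio(box_width, box_height, threshold=1.5):
    """Label tall boxes as 'person' and wide boxes as 'vehicle'.

    The threshold is a hypothetical value, not one given in the text.
    """
    if aspect_ratio(box_width, box_height) >= threshold:
        return "person"
    return "vehicle"


print(classify_by_aspect_ratio(50, 150))   # tall bounding box
print(classify_by_aspect_ratio(200, 100))  # wide bounding box
```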

The predicted time until the object leaves the screen is obtained by, for example, calculating the moving speed of the object from the change in its position over past frames and then predicting the time from the calculated moving speed. This "predicted time until the object leaves the screen" is an evaluation item used when, for example, it is desired to capture even a fast-moving object at least once. The home position is an example of the monitoring target region in the present disclosure. The home position is determined, for example, for each object detection frame. As an example, a detection frame may be defined for each place that the user wishes to monitor, such as a street, a building entrance, or a restricted area. The object detection frames may also be set in association with each of the plurality of video cropping units 106 included in the camera 10.
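The prediction described above can be sketched as follows. This is a minimal illustration that assumes constant velocity estimated from only two consecutive detection positions; the text says only that the speed is calculated from position changes in past frames, so the two-frame estimate, the coordinate convention (origin at the top-left of the frame), and the function name are assumptions.

```python
import math


def predict_exit_time(prev_pos, curr_pos, frame_interval,
                      frame_width, frame_height):
    """Seconds until an object at curr_pos leaves the frame.

    Velocity is estimated from two consecutive positions and assumed constant.
    Returns math.inf when the object is not moving.
    """
    vx = (curr_pos[0] - prev_pos[0]) / frame_interval
    vy = (curr_pos[1] - prev_pos[1]) / frame_interval
    times = []
    if vx > 0:
        times.append((frame_width - curr_pos[0]) / vx)   # exits at right edge
    elif vx < 0:
        times.append(curr_pos[0] / -vx)                  # exits at left edge
    if vy > 0:
        times.append((frame_height - curr_pos[1]) / vy)  # exits at bottom edge
    elif vy < 0:
        times.append(curr_pos[1] / -vy)                  # exits at top edge
    return min(times) if times else math.inf


# An object moving right by 10 pixels every 0.5 s in a 3840x2160 frame.
print(predict_exit_time((100, 500), (110, 500), 0.5, 3840, 2160))
```

An object with a short predicted exit time could then be rated higher so that it is cropped before it disappears.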

For example, FIG. 7 shows a case in which the user has selected the speed of the object, the aspect ratio of the object, the distance to the home position, and the stop time of the object as evaluation items. FIG. 7 further shows that the evaluation criteria have been specified so that an object is rated higher when its speed is faster than the specified value, when its aspect ratio is smaller than the specified value, when its distance to the home position is longer than the specified value, or when its stop time is shorter than the specified value.

[1-3-4. Input unit 226]
The input unit 226 includes an input device such as a mouse, a keyboard, a touch panel, or a microphone. The input unit 226 accepts various inputs made by the user to the monitoring terminal 22. For example, the input unit 226 accepts input of the detection designation information on the evaluation criteria setting screen displayed on the display unit 224.

<1-4. Communication network 24>
The communication network 24 is a wired or wireless transmission path for information transmitted from devices connected to the communication network 24. For example, the communication network 24 may include public networks such as a telephone network, the Internet, and a satellite communication network, as well as various LANs (Local Area Networks) including Ethernet (registered trademark) and WANs (Wide Area Networks). The communication network 24 may also include a dedicated network such as an IP-VPN (Internet Protocol-Virtual Private Network).

The image processing system according to the present embodiment is not limited to the configuration described above. For example, the storage 20 and the monitoring terminal 22 may be configured as a single unit. Alternatively, the image processing system may omit the storage 20 or the monitoring terminal 22.

In the image processing system described above, the camera 10 according to the present embodiment can automatically select the optimum cropping target based on the detection designation information specified by the user.

<< 2. Detailed description of the embodiment >>
<2-1. Configuration>
The configuration of the image processing system according to the present embodiment has been described above. Next, the configuration of the camera 10 according to the present embodiment will be described in detail.

The characteristic feature of the configuration of the camera 10 according to the present embodiment lies in particular in the configuration of the area setting unit 104. The configuration of the area setting unit 104 will be described in more detail below with reference to FIG. 8.

As shown in FIG. 8, the area setting unit 104 includes an object detection unit 120, a cropping region determination unit 122, and an overlap determination unit 124.

[2-1-1. Object detection unit 120]
(2-1-1-1. Detection example 1)
The object detection unit 120 detects the object to be cropped from within a detection frame in the frame image acquired by the photographing unit 100 (hereinafter referred to as the current frame image), based on, for example, the detection designation information received from the monitoring terminal 22. For example, among the plurality of objects included in the detection frame, the object detection unit 120 detects, as the cropping target, the object with the highest value when evaluated according to the evaluation criterion of the evaluation item included in the received detection designation information.

When the received detection designation information includes a plurality of evaluation items, the object detection unit 120 can also detect the object to be cropped, from among the plurality of objects included in the detection frame, based on a combination of the evaluation values obtained under the evaluation criteria of those evaluation items. In this case, for example, the object detection unit 120 first evaluates each of the objects included in the detection frame on each of the evaluation items indicated by the detection designation information according to the corresponding evaluation criterion, and calculates the sum of the evaluation values over the evaluation items. The object detection unit 120 then detects the object with the highest sum of evaluation values as the cropping target.
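The summation described above can be sketched as follows. The text does not specify how an individual evaluation value is computed, so scoring each item as 1.0 when its criterion is met and 0.0 otherwise is an assumption, as are the names `Criterion`, `evaluate`, and `select_cropping_target`, the dictionary representation of objects, and the sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    item: str               # evaluation item, e.g. "speed" (hypothetical key)
    specified: float        # specified value set on the setting screen
    higher_is_better: bool  # True: rate higher when above the specified value


def evaluate(obj, criterion):
    """Evaluation value for one item: 1.0 if the criterion is met, else 0.0.

    This binary scoring is an assumption; the text gives no formula.
    """
    value = obj[criterion.item]
    if criterion.higher_is_better:
        return 1.0 if value > criterion.specified else 0.0
    return 1.0 if value < criterion.specified else 0.0


def select_cropping_target(objects, criteria):
    """Return the object with the highest sum of evaluation values, or None."""
    if not objects:
        return None
    return max(objects, key=lambda obj: sum(evaluate(obj, c) for c in criteria))


criteria = [
    Criterion("speed", 5.0, higher_is_better=True),
    Criterion("stop_time", 10.0, higher_is_better=False),
]
objects = [
    {"id": "a", "speed": 8.0, "stop_time": 30.0},
    {"id": "b", "speed": 6.0, "stop_time": 0.0},
    {"id": "c", "speed": 1.0, "stop_time": 2.0},
]
target = select_cropping_target(objects, criteria)
print(target["id"])
```

With binary scores, ties are broken here by list order; a graded score per item would separate such cases, but the text leaves that choice open.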

(2-1-1-2. Detection example 2)
When the received detection designation information includes the evaluation item "stop time of the object", the object detection unit 120 determines whether to switch the object to be detected based on a comparison between the length of time for which the current detection target is presumed to have been stopped and the stop upper-limit time included in the detection designation information.

For example, when the length of time for which the current object to be detected is estimated to have been stopped is longer than the stop upper limit time, the object detection unit 120 decides to switch the object to be detected to another object. When that length of time is equal to or less than the stop upper limit time, the object detection unit 120 decides to keep the same object to be detected as in the previous frame image. The object detection unit 120 can estimate that the object is stopped when, for example, the amount of change in the detection position of the object between consecutive frame images is within a predetermined range.
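The stop-time check described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `StopTimer`, the use of Euclidean pixel distance as the "amount of change", and the per-frame time step `dt_s` are not taken from the disclosure.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StopTimer:
    stop_limit_s: float        # stop upper limit time
    move_tolerance_px: float   # assumed form of the "predetermined range"
    stopped_s: float = 0.0     # time the object is estimated to have been stopped
    last_pos: Optional[Tuple[float, float]] = None

    def update(self, pos: Tuple[float, float], dt_s: float) -> bool:
        """Feed one detection position; return True when the target should be switched."""
        moved_little = (
            self.last_pos is not None
            and math.dist(pos, self.last_pos) <= self.move_tolerance_px
        )
        if moved_little:
            self.stopped_s += dt_s   # still estimated to be stopped
        else:
            self.stopped_s = 0.0     # first detection or the object moved: restart
        self.last_pos = pos
        # Switch only when the stopped time is longer than the upper limit.
        return self.stopped_s > self.stop_limit_s


timer = StopTimer(stop_limit_s=2.0, move_tolerance_px=3.0)
decisions = [timer.update(p, 1.0)
             for p in [(100, 100), (101, 100), (101, 101), (100, 101), (200, 200)]]
```

In this run the object stays within the tolerance for three consecutive intervals, so the fourth update requests a switch, and the large jump in the fifth update resets the timer.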

When the received detection designation information includes the evaluation item "object tracking time", the object detection unit 120 determines whether or not to switch the object to be detected, based on a comparison between the duration for which the current object to be detected has been continuously detected and the tracking upper limit time included in the detection designation information. The tracking upper limit time is the upper limit of the time for which the same object is kept as the cropping target.

For example, when the duration of detection of the current object to be detected exceeds the tracking upper limit time, the object detection unit 120 decides to switch the object to be detected to another object. When the duration of detection of the current object to be detected is equal to or less than the tracking upper limit time, the object detection unit 120 decides to keep the same object to be detected as in the previous frame image.

When the received detection designation information includes the evaluation item "distance between the object and the home position", the object detection unit 120 determines whether or not to switch the object to be detected, based on a comparison between the distance from the home position preset in the moving image to the current object to be detected and the monitoring upper limit distance included in the detection designation information.

For example, when the distance between the home position and the current object to be detected exceeds the monitoring upper limit distance, the object detection unit 120 decides to switch the object to be detected to another object, for example, the object located closest to the home position. When the distance between the home position and the current object to be detected is within the monitoring upper limit distance, the object detection unit 120 decides to keep the same object to be detected as in the previous frame image.
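The home-position rule above can be sketched as follows. The function name `next_target`, the representation of objects as 2D points, and the behavior when no other object exists (returning None) are assumptions made for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def next_target(current: Point, others: List[Point],
                home: Point, monitoring_limit: float) -> Optional[Point]:
    """Return the position of the object to track in the next frame."""
    if math.dist(current, home) <= monitoring_limit:
        return current  # within the monitoring upper limit distance: keep the target
    if not others:
        return None     # assumption: nothing to switch to
    # Switch to the object located closest to the home position.
    return min(others, key=lambda p: math.dist(p, home))


home = (0.0, 0.0)
kept = next_target((3.0, 4.0), [(1.0, 1.0)], home, monitoring_limit=5.0)
switched = next_target((3.0, 4.0), [(10.0, 0.0), (1.0, 1.0)], home, monitoring_limit=4.0)
```

The tracking upper limit time check works the same way, with the detection duration compared against the limit instead of a distance.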

(2-1-1-3. Detection example 3)
When the overlap determination unit 124 described later determines that cropping regions overlap, the object detection unit 120 decides to switch the cropping target object of one of the overlapping cropping regions to another object.

For example, when the "detection target switching time" included in the received detection designation information has elapsed since the overlap determination unit 124 determined that the cropping regions overlap, the object detection unit 120 decides to switch the cropping target object of one of the overlapping cropping regions to another object.

As a modification, the object detection unit 120 may decide, in accordance with the received detection designation information, which of the cropping regions determined to overlap by the overlap determination unit 124 is to have its cropping target object switched to another object (hereinafter referred to as the cropping target switching region). For example, among the overlapping cropping regions, the object detection unit 120 may select as the cropping target switching region the cropping region whose cropping target has the lower value for the evaluation item included in the received detection designation information. Alternatively, among the overlapping cropping regions, the object detection unit 120 may select as the cropping target switching region the cropping region whose cropping target object has been detected (tracked) continuously from the earlier time.
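The two ways of choosing the cropping target switching region in this modification can be sketched as follows. The record fields `score` and `tracking_started_at`, the policy names, and the tie-breaking rule (the second region is chosen on a tie) are hypothetical.

```python
def pick_switch_region(region_a: dict, region_b: dict,
                       policy: str = "lower_score") -> dict:
    """Choose which of two overlapping cropping regions gets a new target."""
    if policy == "lower_score":
        # The region whose target has the lower evaluation value is switched.
        return region_a if region_a["score"] < region_b["score"] else region_b
    if policy == "earlier_tracking":
        # The region that has tracked its target from the earlier time is switched.
        if region_a["tracking_started_at"] < region_b["tracking_started_at"]:
            return region_a
        return region_b
    raise ValueError(f"unknown policy: {policy}")


first = {"name": "crop-1", "score": 0.8, "tracking_started_at": 12.0}
second = {"name": "crop-2", "score": 0.3, "tracking_started_at": 30.0}
by_score = pick_switch_region(first, second, "lower_score")
by_time = pick_switch_region(first, second, "earlier_tracking")
```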

(2-1-1-4. Detection example 4)
The object detection unit 120 can detect a predetermined number of objects from the current frame image, for example, a number not exceeding the number of video cropping units 106.

(2-1-1-5. Detection example 5)
When no object to be cropped (tracked) exists in the detection frame, the object detection unit 120 can refrain from detecting any object.

[2-1-2. Cropping region determination unit 122]
(2-1-2-1. Determination example 1)
The cropping region determination unit 122 is an example of the cutout region determination unit in the present disclosure. The cropping region determination unit 122 determines the cropping region in the current frame image so that it includes the detection position of the object detected by the object detection unit 120. For example, the cropping region determination unit 122 determines the cropping region in the current frame image so that the detection position of the object detected by the object detection unit 120 is at the center of the cropping region.

The shape and size of the cropping region in a frame image are basically the same for all frame images. The size of the cropping region is basically set to a predetermined size.
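Determining a fixed-size cropping region centered on the detection position can be sketched as follows. The `(left, top, width, height)` tuple layout and the function name are assumptions; the text does not describe how regions that would extend past the frame edge are handled, so this sketch leaves them unclamped.

```python
from typing import Tuple

Region = Tuple[int, int, int, int]  # (left, top, width, height), an assumed layout


def crop_region_centered(center_x: float, center_y: float,
                         width: int, height: int) -> Region:
    """Place a region of predetermined size so the detection position is its center."""
    left = int(round(center_x - width / 2))
    top = int(round(center_y - height / 2))
    return (left, top, width, height)


region = crop_region_centered(960, 540, width=640, height=360)
```

Because the size is fixed, only the top-left corner changes from frame to frame, which keeps the per-frame computation small.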

(2-1-2-2. Determination example 2)
When the object detection unit 120 detects no object, the cropping region determination unit 122 may decide not to output any region. Alternatively, in that case, the cropping region determination unit 122 may determine a region including the home position as the cropping region.

[2-1-3. Overlap determination unit 124]
When a cropping region is determined by the cropping region determination unit 122, the overlap determination unit 124 determines whether or not that cropping region overlaps another cropping region, based on the overlap determination condition included in the received detection designation information. For example, when the ratio of the area of the overlapping portion to the area of the cropping region is equal to or greater than a predetermined threshold, the overlap determination unit 124 determines that the cropping region overlaps the other cropping region. Likewise, when the distance from the center of the cropping region to the other cropping region is equal to or less than a predetermined threshold, the overlap determination unit 124 determines that the cropping region overlaps the other cropping region.

<2-2. Operation>
The configuration according to the present embodiment has been described above. Next, the operation according to the present embodiment will be described with reference to FIGS. 9 to 11. The example described here is for a case in which the camera 10 has four video cropping units 106 and generates one reduced image and four cropped images from one frame image. This operation is repeated at a predetermined frame rate.

[2-2-1. Overall operation]
FIG. 9 is a flowchart showing an operation example according to the present embodiment. As shown in FIG. 9, first, when a predetermined shooting timing arrives, the photographing unit 100 of the camera 10 acquires a frame image by capturing video of the outside (S101).

Subsequently, the image reduction unit 102 generates a reduced image by reducing the frame image acquired in S101 (hereinafter referred to as the current frame image) to a predetermined size (S103).

After that, the camera 10 repeats the "cropping image generation process" described later as many times as there are video cropping units 106, that is, four times (S105 to S111).

After that, the communication unit 108 transmits the reduced image generated in S103 and the four cropped images generated in S107 to the storage 20 (S113).
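The per-frame flow of S101 to S113 can be sketched as follows. This is a toy illustration only: a frame is modeled as a list of pixel rows, reduction is assumed to be simple subsampling, the cropping regions are passed in as already determined, and transmission to the storage 20 is represented by the return value. None of these choices is specified in the text.

```python
from typing import List, Tuple

Frame = List[List[int]]             # rows of pixel values (stand-in for a real image)
Region = Tuple[int, int, int, int]  # (left, top, width, height), assumed layout


def reduce_image(frame: Frame, step: int) -> Frame:
    """S103: shrink the frame by keeping every step-th pixel (assumed method)."""
    return [row[::step] for row in frame[::step]]


def crop_image(frame: Frame, region: Region) -> Frame:
    """S107: cut the cropping region out of the full-resolution frame."""
    left, top, width, height = region
    return [row[left:left + width] for row in frame[top:top + height]]


def process_frame(frame: Frame, regions: List[Region],
                  step: int = 2) -> Tuple[Frame, List[Frame]]:
    """One pass of S103 to S113 for an already acquired frame."""
    reduced = reduce_image(frame, step)
    crops = [crop_image(frame, r) for r in regions]  # S105 to S111, one per unit
    return reduced, crops                            # S113: payload for the storage


frame = [[r * 4 + c for c in range(4)] for r in range(4)]
reduced, crops = process_frame(frame, [(1, 1, 2, 2)])
```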

[2-2-2. Cropping image generation process]
The operation of the "cropping image generation process" in S107 will now be described in detail with reference to FIGS. 10 and 11. As shown in FIG. 10, first, the object detection unit 120 of the camera 10 detects the object set as the I-th cropping target, that is, the object being tracked (S151).

Subsequently, the cropping region determination unit 122 determines the cropping region in the current frame image so that the detection position of the object detected in S151 is at the center of the cropping region (S153).

The operation after S153 will now be described with reference to FIG. 11. As shown in FIG. 11, after S153, the object detection unit 120 determines whether or not the total tracking time of the object detected in S151 (measured from the start of tracking) has exceeded the tracking upper limit time (S161). When the total tracking time of the object has exceeded the tracking upper limit time (S161: Yes), the object detection unit 120 switches the I-th cropping target to another object (S163). After that, the camera 10 performs the operation of S153 again.

On the other hand, when the tracking time of the object has not exceeded the tracking upper limit time (S161: No), the object detection unit 120 next determines whether or not the total stop time of the object detected in S151 (measured from when the stop was detected) has exceeded the stop upper limit time (S165). When the total stop time of the object has exceeded the stop upper limit time (S165: Yes), the object detection unit 120 performs the operation of S163.

On the other hand, when the total stop time of the object has not exceeded the stop upper limit time (S165: No), the overlap determination unit 124 determines whether or not the cropping region determined in S153 overlaps another cropping region in the same frame image (S167). When the cropping region overlaps another cropping region (S167: Yes), the object detection unit 120 performs the operation of S163.

On the other hand, when the cropping region does not overlap any other cropping region (S167: No), the video cropping unit 106 generates a cropped image by cutting the cropping region out of the current frame image (S169).
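The decision chain of S161 to S169 can be sketched as follows. The function name, the argument names, and the string return values are hypothetical; the inputs are assumed to have been measured elsewhere.

```python
def cropping_step(tracked_s: float, tracking_limit_s: float,
                  stopped_s: float, stop_limit_s: float,
                  overlaps_other_region: bool) -> str:
    """Return "switch" (go to S163) or "crop" (go to S169) for one cropping target."""
    if tracked_s > tracking_limit_s:   # S161: tracking upper limit time exceeded
        return "switch"
    if stopped_s > stop_limit_s:       # S165: stop upper limit time exceeded
        return "switch"
    if overlaps_other_region:          # S167: overlaps another cropping region
        return "switch"
    return "crop"                      # S169: generate the cropped image


decision = cropping_step(tracked_s=12.0, tracking_limit_s=30.0,
                         stopped_s=1.0, stop_limit_s=5.0,
                         overlaps_other_region=False)
```

After a "switch" result, the flow in FIG. 11 returns to S153 to determine the cropping region for the new target, so a caller would re-run the region determination and this check for that target.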

<2-3. Effect>
[2-3-1. Effect 1]
As described above with reference to, for example, FIG. 4 and FIGS. 8 to 11, the camera 10 according to the present embodiment detects an object based on an evaluation item selected by the user from a plurality of evaluation items, and determines the cropping region in the current frame image so that it includes the detection position of the detected object. Therefore, the optimum cropping target can be selected automatically from among the plurality of objects included in the frame image. The length of time for which an object is cropped can be optimized in the same way.

[2-3-2. Effect 2]
When the time for which the cropping target object is estimated to have been stopped exceeds the stop upper limit time, the camera 10 switches the cropping target to another object and determines the cropping region so that the new object is at its center. Therefore, even if an object once set as the cropping target remains in the same place, the cropping target is changed to another object once the stop upper limit time has elapsed. This prevents the same place from being set as the cutout region for a long time.

[2-3-3. Effect 3]
Since the method by which the cropping region determination unit 122 determines the cropping region is simple, the camera 10 can generate cropped images in real time.

[2-3-4. Effect 4]
According to the present embodiment, the camera 10 can generate the reduced image and the cropped images by itself. The camera 10 therefore does not need to transmit frame images to another device, such as a server, for generating the reduced image and the cropped images, so the amount of communication can be reduced.

<<3. Modifications>>
Although the preferred embodiments of the present disclosure have been described in detail with reference to the accompanying drawings, the present disclosure is not limited to these examples. It is clear that a person with ordinary knowledge in the technical field to which the present disclosure belongs could conceive of various changes or modifications within the scope of the technical ideas set forth in the claims, and these are naturally understood to belong to the technical scope of the present disclosure.

<3-1. Modification 1>
In the above-described embodiment, an example in which the image processing apparatus according to the present disclosure is the camera 10 has been described, but the present disclosure is not limited to this example. For example, when the monitoring terminal 22 (its control unit 220) has, instead of the camera 10, all of the image reduction unit 102, the area setting unit 104, and the plurality of video cropping units 106 described above, the image processing apparatus according to the present disclosure may be the monitoring terminal 22.

Further, when a separately provided server (not shown) can communicate with the camera 10 via, for example, the communication network 24, and this server has, instead of the camera 10, all of the image reduction unit 102, the area setting unit 104, and the plurality of video cropping units 106 described above, the image processing apparatus according to the present disclosure may be the server. The server and the storage 20 may also be configured as a single unit.

<3-2. Modification 2>
In the above description, the tracking upper limit time and the stop upper limit time of an object are values included in the detection designation information, that is, fixed values. However, the present disclosure is not limited to this example, and the camera 10 may determine the tracking upper limit time or the stop upper limit time dynamically. For example, the camera 10 may dynamically determine the tracking upper limit time or the stop upper limit time for each detection frame. As one example, for each detection frame, the camera 10 may make the tracking upper limit time longer as the number of untracked objects becomes smaller. The camera 10 may also make the tracking upper limit time longer when the detection frame includes, for example, a monitoring target area.

<3-3. Modification 3>
According to the above-described embodiment, it is also possible to provide a computer program for causing hardware such as a CPU, ROM, and RAM to exhibit functions equivalent to those of the image reduction unit 102, the area setting unit 104, and the video cropping unit 106 described above. A recording medium on which the computer program is recorded is also provided.

The following configurations also belong to the technical scope of the present disclosure.
(1)
An image processing apparatus including:
an object detection unit that detects a first object from a first frame image included in a moving image; and
a cutout region determination unit that determines a cutout region of the first frame image so as to include a detection position of the first object,
wherein the cutout region determination unit determines, in a second frame image later than the first frame image, a cutout region of the second frame image based on the length of time for which the first object is estimated to have been stopped.
(2)
The image processing apparatus according to (1), wherein the cutout region determination unit determines the cutout region of the second frame image based on whether or not the length of time for which the first object is estimated to have been stopped is longer than a stop upper limit time.
(3)
The image processing apparatus according to (2), wherein, when the second frame image is a frame image at or after the time when the length of time for which the first object is estimated to have been stopped exceeds the stop upper limit time, the object detection unit detects a second object from the second frame image, and
the cutout region determination unit determines the cutout region of the second frame image so as to include a detection position of the second object.
(4)
The image processing apparatus according to (2) or (3), wherein, when the second frame image is a frame image at a time when the length of time for which the first object is estimated to have been stopped is equal to or less than the stop upper limit time, the cutout region determination unit determines the cutout region of the second frame image so as to include the detection position of the first object.
(5)
The image processing apparatus according to any one of (2) to (4), wherein the length of time for which the first object is estimated to have been stopped is the length of time during which the amount of change in the detection position of the first object between consecutive frame images is within a predetermined range.
(6)
The image processing apparatus according to (5), further including an acquisition unit that acquires the length of the stop upper limit time specified by a user,
wherein the cutout region determination unit determines the cutout region of the second frame image based on whether or not the length of time for which the first object is estimated to have been stopped is equal to or longer than the stop upper limit time acquired by the acquisition unit.
(7)
The image processing apparatus according to any one of (2) to (6), wherein the shape and size of the cutout region of the second frame image are the same as those of the cutout region of the first frame image.
(8)
The image processing apparatus according to any one of (2) to (5), further including an acquisition unit that acquires target information selected by a user from a plurality of pieces of target information for determining a cutout region from the moving image,
wherein the cutout region determination unit further determines the cutout region of the second frame image based on the target information acquired by the acquisition unit.
(9)
The image processing apparatus according to (8), wherein the plurality of pieces of target information include a tracking upper limit time, which is an upper limit of the time for which the same object is kept as the cutout target corresponding to a cutout region,
when the second frame image is a frame image at or after the time when the duration of detection of the first object by the object detection unit exceeds the tracking upper limit time acquired by the acquisition unit, the object detection unit detects a second object from the second frame image, and
the cutout region determination unit determines the cutout region of the second frame image so as to include a detection position of the second object.
(10)
The image processing apparatus according to (8), wherein the object detection unit further detects a predetermined number of objects from the second frame image, and
the cutout region determination unit determines the predetermined number of cutout regions in the second frame image based on the respective detection positions of the predetermined number of objects.
(11)
The image processing apparatus according to (10), wherein the plurality of pieces of target information include an overlap determination condition for determining overlap between a plurality of cutout regions,
when it is determined, based on the overlap determination condition acquired by the acquisition unit, that a first cutout region and a second cutout region among the predetermined number of cutout regions partially overlap, the object detection unit detects, from the second frame image, a third object different from the object detected in association with the second cutout region, and
the cutout region determination unit changes the position of the second cutout region so as to include a detection position of the third object.
(12)
The image processing apparatus according to (8), wherein a monitoring target area is preset in the moving image,
the plurality of pieces of target information include a monitoring upper limit distance, which is an upper limit of the distance between the cutout target object corresponding to a cutout region and the monitoring target area,
when the second frame image is a frame image at or after the time when the distance between the first object and the monitoring target area exceeds the monitoring upper limit distance acquired by the acquisition unit, the object detection unit detects a second object from the second frame image, and
the cutout region determination unit determines the cutout region of the second frame image so as to include a detection position of the second object.
(13)
The image processing apparatus according to (12), wherein the second object is the object located closest to the monitoring target area among a plurality of objects included in the second frame image.
(14)
The image processing apparatus according to any one of (9) to (13), wherein the plurality of pieces of target information include the region size of the object to be detected, the moving speed of the object to be detected, the residence time of the object to be detected, the aspect ratio of the object to be detected, or the predicted time until the object to be detected moves out of the region of the moving image.
(15)
The image processing apparatus according to any one of (1) to (14), further including a cutout image generation unit that generates a cutout image by cutting out, from the second frame image, the cutout region determined by the cutout region determination unit.
(16)
An image processing method including:
detecting a first object from a first frame image included in a moving image;
determining a cutout region of the first frame image so as to include a detection position of the first object; and
determining, in a second frame image later than the first frame image, a cutout region of the second frame image based on the length of time for which the first object is estimated to have been stopped.
(17)
An image processing system including:
an object detection unit that detects a first object from a first frame image included in a moving image;
a cutout region determination unit that determines a cutout region of the first frame image so as to include a detection position of the first object;
a cutout image generation unit that generates a cutout image by cutting out, from the first frame image, the cutout region determined by the cutout region determination unit; and
a storage unit that stores the generated cutout image,
wherein the cutout region determination unit determines, in a second frame image later than the first frame image, a cutout region of the second frame image based on the length of time for which the first object is estimated to have been stopped.

10 Camera
20 Storage
22 Monitoring terminal
24 Communication network
100 Imaging unit
102 Video reduction unit
104 Area setting unit
106 Video cropping unit
108 Communication unit
120 Object detection unit
122 Cropping area determination unit
124 Overlap determination unit
220 Control unit
222 Communication unit
224 Display unit
226 Input unit

Claims

An image processing apparatus comprising a control unit that:
selects, at a first time point of a moving image, a first object as a gaze target from one or more detected objects based on a predetermined criterion; and
switches, when it is detected at a second time point after the first time point that the first object has been stopped for a first predetermined time or more, the gaze target from the first object to a second object different from the first object,
wherein the control unit determines the first predetermined time based on the number of objects, among the one or more detected objects in the moving image, that are not selected as the gaze target.

The image processing apparatus according to claim 1, wherein, when the control unit detects at the second time point that the first object has been stopped for the first predetermined time or more, the control unit switches the gaze target from the first object to the second object selected from two or more objects detected at the second time point.

The image processing apparatus according to claim 1 or 2, wherein the control unit detects that the first object is stopped when the amount of change in the detection position of the first object between frame images of the moving image is within a predetermined range.
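
The stop test described in this claim can be illustrated with a minimal sketch. It assumes the detection position is a 2D pixel coordinate and that the "predetermined range" is a maximum Euclidean shift between consecutive frames; the function name `is_stopped` and the parameter `max_shift` are hypothetical and do not appear in the specification.

```python
from math import hypot


def is_stopped(positions, max_shift=2.0):
    """Return True if the object barely moved across the given frames.

    positions: detection positions (x, y) of one object, one per frame,
               in temporal order (hypothetical input format).
    max_shift: assumed "predetermined range", in pixels, for the change
               in detection position between consecutive frames.
    """
    if len(positions) < 2:
        # A single observation gives no change to measure.
        return False
    return all(
        hypot(x1 - x0, y1 - y0) <= max_shift
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    )
```

For example, `is_stopped([(10, 10), (11, 10), (11, 11)])` treats the object as stopped, while a jump from `(10, 10)` to `(20, 10)` does not.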

The image processing apparatus according to any one of claims 1 to 3, wherein, when the control unit detects at the second time point that the first object has been continuously selected as the gaze target for a second predetermined time or longer, the control unit switches the gaze target from the first object to the second object.

The image processing apparatus according to claim 4, wherein the control unit determines the first predetermined time or the second predetermined time based on the number of objects, among the one or more detected objects in the moving image, that are not selected as the gaze target.

The image processing apparatus according to claim 5, wherein the control unit makes the first predetermined time or the second predetermined time longer as the number of objects, among the one or more detected objects in the moving image, that are not selected as the gaze target decreases.
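
As an illustration of this relationship, the following sketch computes a switching threshold that grows as fewer unselected objects are waiting. The claims only require that the time become longer as the count decreases; the specific formula, the names `dwell_threshold` and `should_switch`, and the default values are assumptions made for the example.

```python
def dwell_threshold(num_unselected, min_time=5.0, max_time=60.0):
    """Return an assumed switching threshold in seconds.

    The threshold equals max_time when no other object is waiting and
    approaches min_time as the number of unselected objects grows.
    This particular formula is illustrative only.
    """
    if num_unselected < 0:
        raise ValueError("num_unselected must be non-negative")
    return min_time + (max_time - min_time) / (1 + num_unselected)


def should_switch(stopped_seconds, num_unselected):
    """True if the gaze target has been stopped long enough to switch."""
    return stopped_seconds >= dwell_threshold(num_unselected)
```

With the default values, an object stopped for 20 seconds is replaced when three other objects are waiting, but is kept as the gaze target when none are.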

The image processing apparatus according to any one of claims 4 to 6, wherein the control unit lengthens the first predetermined time or the second predetermined time when a specific region is included in the moving image.

The image processing apparatus according to any one of claims 1 to 7, wherein, when the control unit detects at the second time point that the distance between the first object and a specific region in the moving image exceeds a predetermined distance, the control unit excludes the first object from the gaze target.

The image processing apparatus according to claim 8, wherein, when the first object is excluded from the gaze target because the control unit has detected at the second time point that the distance between the first object and the specific region exceeds the predetermined distance, the control unit selects, as the gaze target, the object located closest to the specific region among the one or more objects detected at the second time point.
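
The exclusion and reselection described in the two preceding claims can be sketched as follows. The sketch assumes that the specific region is represented by a single reference point and that distance is Euclidean; it also assumes the excluded object is not a candidate for reselection, which the claims do not state explicitly. The name `update_gaze_target` and its parameters are hypothetical.

```python
from math import hypot


def update_gaze_target(current_id, detections, region_point, max_distance):
    """Return the id of the object to gaze at, or None if none qualifies.

    detections:   dict mapping object id -> (x, y) detection position.
    region_point: assumed reference point (x, y) of the specific region.
    max_distance: assumed "predetermined distance".
    """
    def dist(pos):
        return hypot(pos[0] - region_point[0], pos[1] - region_point[1])

    # Keep the current target while it stays close enough to the region.
    if current_id in detections and dist(detections[current_id]) <= max_distance:
        return current_id

    # Otherwise exclude it and pick the remaining object nearest the region.
    candidates = {k: v for k, v in detections.items() if k != current_id}
    if not candidates:
        return None
    return min(candidates, key=lambda k: dist(candidates[k]))
```

When the function returns `None`, a caller could fall back to gazing at the specific region itself.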

The image processing apparatus according to any one of claims 1 to 9, wherein the control unit selects a specific region in the moving image as the gaze target when there is no detected object in the moving image.

The image processing apparatus according to any one of claims 1 to 10, wherein the control unit selects a predetermined number of objects among one or more detected objects in the moving image as the gaze target.

The image processing apparatus according to any one of claims 1 to 11, wherein the predetermined criterion includes at least one of the size, moving speed, staying time, aspect ratio, tracking time, and stop time of a detected object in the moving image, and an estimated time until the detected object moves outside the area of the moving image.

The image processing apparatus according to any one of claims 1 to 12, further comprising a cutout area determination unit that determines a cutout area of the moving image so as to include a detection position of the object selected as the gaze target.

The image processing apparatus according to claim 13, wherein, when it is determined at the second time point that the cutout area corresponding to the first object partially overlaps the cutout area corresponding to a third object selected as the gaze target, the control unit switches either the first object or the third object to the second object, which is different from the first object and the third object.

The image processing apparatus according to claim 14, wherein among the first object and the third object, an object that has been continuously selected as the gaze target for a longer period of time is switched to the second object.
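
A minimal sketch of the overlap handling in the two preceding claims is shown below. It assumes that cutout areas are axis-aligned rectangles given as `(left, top, right, bottom)` and treats any intersection as an overlap, which is a simplification of "partially overlaps". The function names and the tuple layout are hypothetical.

```python
def rects_overlap(a, b):
    """True if two rectangles (left, top, right, bottom) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def target_to_switch(first, third):
    """Decide which of two gaze targets should be replaced.

    first, third: (cutout_rect, selected_seconds) for each gaze target,
                  where selected_seconds is how long the object has been
                  continuously selected (assumed input format).
    Returns "first", "third", or None when the cutout areas do not overlap.
    """
    (rect_a, time_a), (rect_b, time_b) = first, third
    if not rects_overlap(rect_a, rect_b):
        return None
    # Replace the object that has been gazed at for the longer period.
    return "first" if time_a >= time_b else "third"
```

The object returned by `target_to_switch` would then be replaced by a second object chosen from the remaining detections.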

The image processing apparatus according to any one of claims 13 to 15, further comprising a cut-out image generation unit that generates a cut-out image by cutting out the cutout area determined by the cutout area determination unit.
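
The determination of a cutout area and the generation of a cut-out image can be sketched as follows. The sketch assumes a fixed cutout size no larger than the frame, centers the area on the detection position, and shifts it inward at the frame edges; the frame is modeled as a list of pixel rows. The names `cutout_area` and `cut_out` are hypothetical.

```python
def cutout_area(center, crop_size, frame_size):
    """Return (left, top, right, bottom) of a cutout containing center.

    center:     detection position (x, y) of the gaze target.
    crop_size:  (width, height) of the cutout, assumed <= frame_size.
    frame_size: (width, height) of the frame image.
    """
    cx, cy = center
    cw, ch = crop_size
    fw, fh = frame_size
    # Center the area on the object, then clamp it inside the frame.
    left = min(max(cx - cw // 2, 0), fw - cw)
    top = min(max(cy - ch // 2, 0), fh - ch)
    return left, top, left + cw, top + ch


def cut_out(frame, area):
    """Crop a frame given as a list of rows to the given area."""
    left, top, right, bottom = area
    return [row[left:right] for row in frame[top:bottom]]
```

Clamping keeps the cutout size constant even when the object is near the edge of the frame, at the cost of the object no longer being centered.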

An image processing method comprising:
selecting, at a first time point of a moving image, a first object as a gaze target from one or more detected objects based on a predetermined criterion;
switching, when it is detected at a second time point after the first time point that the first object has been stopped for a first predetermined time or more, the gaze target from the first object to a second object different from the first object; and
determining the first predetermined time based on the number of objects, among the one or more detected objects in the moving image, that are not selected as the gaze target.

An image processing system comprising:
a control unit that selects, at a first time point of a moving image, a first object as a gaze target from one or more detected objects based on a predetermined criterion, and that switches, when it is detected at a second time point after the first time point that the first object has been stopped for a first predetermined time or more, the gaze target from the first object to a second object different from the first object;
a cutout area determination unit that determines a cutout area so as to include a detection position of the object selected as the gaze target;
a cut-out image generation unit that generates a cut-out image by cutting out the cutout area determined by the cutout area determination unit; and
a storage unit that stores the generated cut-out image,
wherein the control unit determines the first predetermined time based on the number of objects, among the one or more detected objects in the moving image, that are not selected as the gaze target.