JP6516478B2

JP6516478B2 - Image processing apparatus and control method of image processing apparatus

Info

Publication number: JP6516478B2
Application number: JP2015006928A
Authority: JP
Inventors: 成己松岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2015-01-16
Filing date: 2015-01-16
Publication date: 2019-05-22
Anticipated expiration: 2035-01-16
Also published as: JP2016134679A

Description

The present invention relates to an image processing apparatus and a control method for the image processing apparatus, and more particularly to a technique suitable for extracting a moving object in a moving image.

Background subtraction is one technique for extracting a moving object. It computes the difference between an image in which the moving object does not appear and an image in which it does, and extracts as the subject the region in which the difference is equal to or greater than a threshold. Patent Document 1 describes a configuration in which, among images captured by continuous shooting or moving-image shooting, the images of a section in which no moving part is detected are stored as a background image, and the subject is extracted using the difference between the background image and a series of images that include the subject.
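As an illustration only, background subtraction as described above can be sketched in a few lines of Python. The function name, the nested-list image layout, and the choice to compare the largest per-channel difference against the threshold are assumptions made for this sketch; the cited document does not specify them.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B)
Image = List[List[Pixel]]         # image[row][col]


def background_subtraction(background: Image, frame: Image, threshold: int) -> List[List[bool]]:
    """Return a mask that is True where the frame differs from the background.

    Assumption: a pixel counts as "different" when the largest absolute
    difference over the R, G, and B channels is >= threshold.
    """
    mask = []
    for bg_row, fr_row in zip(background, frame):
        mask.append([
            max(abs(a - b) for a, b in zip(bg_px, fr_px)) >= threshold
            for bg_px, fr_px in zip(bg_row, fr_row)
        ])
    return mask
```

The mask marks the subject region; the method depends entirely on having a background image in which the moving object does not appear.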

Subject tracking is another technique sometimes used to extract a moving object. It extracts the moving object by finding corresponding feature points or regions in the images of adjacent frames. Patent Document 2 describes a configuration that extracts the color gamut of images over several frames and extracts a moving object by determining whether the position of the extracted color gamut has changed.

Patent Document 1: JP 2010-283637 A
Patent Document 2: JP 2000-348185 A

However, using background subtraction as described in Patent Document 1 requires capturing an image in which the moving object does not appear. Even when the captured moving image contains no such image, one conceivable substitute is to create a moving-object-removed image, from which the moving object has been removed, using the individual frame images of the moving image.

In that case, a moving-object-removed image can be created by selecting, for each pixel, a frame that recorded something other than the moving object, that is, the background. However, if some pixel records the moving object in every frame, for example because the moving object moves periodically around a fixed position, the moving object cannot be removed in principle. Furthermore, if a pixel records the moving object and the background alternately and a frame that recorded the moving object is selected by mistake, removal of the moving object fails.

Furthermore, the tracking method described in Patent Document 2 does not determine which color gamut represents the moving object. Consequently, when the moving object passes in front of a stationary object that has its own distinctive color gamut, for example, the stationary object may be erroneously detected as a moving object.
In view of these problems, an object of the present invention is to make it possible to extract a moving object even when no image without the subject has been captured, or when the moving object crosses in front of a stationary object that has a distinctive color gamut.

An image processing apparatus according to the present invention extracts a moving object in a moving image and comprises: generating means for generating a moving-object-removed image, in which the moving object does not appear, by obtaining a combination of sections with high reproducibility in the moving image, selecting one frame from the sections with high reproducibility, and applying the pixel value in that frame to each pixel; extraction means for extracting a moving-object seed, which is a part of the moving object, using the difference between the moving-object-removed image and the image in each frame of the moving image; and classification means for classifying a target pixel as the moving object by comparing the feature amounts of the moving-object seed extracted by the extraction means and the adjacent target pixel, and for classifying as background the pixels classified as not being the moving object when the classification is finished. The generating means obtains the combination of sections with high reproducibility based on the number of times that, at each pixel of the frames in a section of the moving image, frames whose feature-amount variation is equal to or less than a threshold continue intermittently. The classification means classifies pixels that are not the moving object as background when the invasion from the moving object is finished, and then performs invasion from the background into the moving object. The extraction means extracts the moving object from the moving image based on the classification by the classification means.

According to the present invention, a moving object can be correctly extracted from a moving image.

FIG. 1 is a block diagram showing a configuration example of an imaging apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart showing an example of the overall processing procedure for moving-object extraction according to the embodiment.
FIG. 3 is a flowchart showing an example of the procedure for creating a moving-object-removed image according to the embodiment.
FIG. 4 is a flowchart showing an example of the procedure for extracting a moving object according to the embodiment.
FIG. 5 is a flowchart showing an example of the procedure for invasion from the moving-object seed according to the embodiment.
FIG. 6 is a flowchart showing an example of the procedure for invasion from the background seed according to the embodiment.
FIG. 7 is a diagram showing an example of creating a moving-object-removed image according to the embodiment.
FIG. 8 is a diagram showing an example of creating a moving-object-removed image and an example of background recording candidate sections according to the embodiment.
FIG. 9 is a diagram showing an example of background recording candidate sections according to the embodiment.
FIG. 10 is a diagram showing an example of selecting background recording sections according to the embodiment.
FIG. 11 is a diagram showing an example of background recording sections according to the embodiment.
FIG. 12 is a diagram showing an example of moving-object extraction processing according to the embodiment.
FIG. 13 is a diagram showing an example of creating a moving-object-removed image according to the embodiment.
FIG. 14 is a diagram showing an example of a moving-object-removed image according to the embodiment.
FIG. 15 is a diagram showing an example of invasion from the moving-object seed according to the embodiment.
FIG. 16 is a diagram showing an example of invasion from the background seed according to the embodiment.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
[First Embodiment]
FIG. 1 is a block diagram showing the basic configuration of an imaging apparatus 100, which is one example of a device that realizes the image processing apparatus of the present invention.
The imaging apparatus 100 may be a camera such as a digital camera or digital video camera, or any other electronic device with a camera function, such as a camera-equipped mobile phone or a camera-equipped computer.

The optical system 101 comprises a lens, a shutter, and a diaphragm and, under the control of the CPU 103, forms an image of the light from the subject on the image sensor 102. The image sensor 102, such as a CCD image sensor or a CMOS image sensor, converts the light imaged through the optical system 101 into an image signal.

The CPU 103 realizes the functions of the imaging apparatus 100 by controlling each unit of the imaging apparatus 100 in accordance with input signals and programs stored in advance.
The primary storage unit 104 is a volatile device such as a RAM; it stores temporary data and serves as working memory for the CPU 103. The information stored in the primary storage unit 104 is also used by the image processing unit 105 and recorded on the recording medium 106.

The recording medium 106 records data stored in the primary storage unit 104, such as image data obtained by shooting. The recording medium 106 is removable from the imaging apparatus 100, like a semiconductor memory card, and the recorded data can be read out by mounting the recording medium in a personal computer or the like. In other words, the imaging apparatus 100 has an attachment and detachment mechanism for the recording medium 106 and a function for reading from and writing to it.

The secondary storage unit 107 is a non-volatile storage device such as an EEPROM; it stores the program (firmware) for controlling the imaging apparatus 100 and various setting information, and is used by the CPU 103.
The display unit 108 displays the viewfinder image during shooting, captured images, GUI images for interactive operation, and the like. The operation unit 109 is a group of input devices that accept user operations and pass the input information to the CPU 103. It may consist of buttons, levers, a touch panel, and the like, or of input devices that use voice, line of sight, and so on.

The imaging apparatus 100 of this embodiment has a plurality of image processing patterns that the image processing unit 105 applies to captured images, and a pattern can be set from the operation unit 109 as an imaging mode. The image processing unit 105 performs the image processing known as development, as well as color tone adjustment according to the shooting mode. At least some of the functions of the image processing unit 105 may be realized in software by the CPU 103.

FIG. 2 shows the overall flow of moving-object extraction in this embodiment. FIGS. 3, 4, 5, and 6 show, respectively, the flow of creating the moving-object-removed image, the flow of extracting the moving object, the flow of invasion from the moving-object seed, and the flow of invasion from the background seed.

The flowcharts of FIGS. 2 to 6 are realized by the CPU 103 loading the program recorded in the secondary storage unit 107 into the work memory area of the primary storage unit 104, executing it, and controlling each unit.
This embodiment, which extracts a moving object in a moving image, is described below using the example of FIG. 12.
First, in S201, the CPU 103 creates a moving-object-removed image, from which the moving object has been removed, by selecting for each pixel of the moving image a section in which the background was recorded (hereinafter, a background recording section).

The sequence of frame selection for creating the moving-object-removed image at a given pixel is described using the example of the temporal variation of RGB values in FIG. 7.
In S301, the CPU 103 assigns 1 to the variable i. In S302, the CPU 103 assigns 1 to the variable j. Then, in S303, the CPU 103 selects sections in which the background may have been recorded (hereinafter, background recording candidate sections). When the background is recorded, a single color is recorded continuously; the procedure for finding such sections is as follows.

First, the CPU 103 calculates, for each of the R, G, and B values, the difference between adjacent frames and takes its absolute value (FIG. 8(a)).
Next, the CPU 103 finds each section in which frames whose absolute differences for all of R, G, and B are equal to or less than a first threshold continue for a period equal to or longer than a second threshold, and selects that section as a background recording candidate section. FIG. 8(b) and FIG. 9 show the background recording candidate sections selected in this way. The first and second thresholds are each stored in the CPU 103 in advance.
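The candidate selection of S303 can be sketched for a single pixel as follows. This is a minimal illustration, not the implementation of the embodiment: the function name, the representation of a section as an inclusive pair of frame indices, and measuring the second threshold as a number of frames are all assumptions.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def candidate_sections(series: List[Pixel], thr1: int, thr2: int) -> List[Tuple[int, int]]:
    """Find runs of frames in which one pixel keeps almost the same color.

    series: RGB values of a single pixel, one entry per frame.
    thr1:   maximum per-channel change between adjacent frames inside a run.
    thr2:   minimum run length, in frames (assumed unit).
    Returns inclusive (start_frame, end_frame) pairs.
    """
    sections = []
    start = 0
    for k in range(1, len(series) + 1):
        # A run ends at the last frame, or where any channel jumps by more than thr1.
        at_end = k == len(series)
        if at_end or any(abs(a - b) > thr1 for a, b in zip(series[k], series[k - 1])):
            if k - start >= thr2:
                sections.append((start, k - 1))
            start = k
    return sections
```

Short runs, such as a moving object briefly crossing the pixel, fall below the second threshold and are discarded.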

Next, in S304, the CPU 103 selects, from the background recording candidate sections selected in S303, the sections in which the background was actually recorded (hereinafter, background recording sections). First, the CPU 103 extracts the RGB values of the central frame of each background recording candidate section as the representative value of that candidate section.

Next, based on the combination of the representative values described above, the CPU 103 finds the sections whose combination has the highest reproducibility and selects them as background recording sections. The reproducibility vote(s) of a section s starts at 0 and is incremented by 1 each time all the inequalities of Formula 1 are satisfied, where t ranges over the section numbers other than the section s of interest.

|candi(s, R) - candi(t, R)| < thr3
|candi(s, G) - candi(t, G)| < thr3    ... (Formula 1)
|candi(s, B) - candi(t, B)| < thr3
Here, candi(s, R), candi(s, G), and candi(s, B) are the R, G, and B components of the representative value of section s, and thr3 is a third threshold.
For the example of FIG. 9, the reproducibility of each candidate section is as shown in FIG. 10. Based on FIG. 10, sections 1, 3, and 5 are selected as background recording sections, as shown in FIG. 11.
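The voting of Formula 1 can be sketched as below. The function names and the tie handling (keeping every section that reaches the highest vote count, which matches the outcome that sections 1, 3, and 5 are selected together) are assumptions made for this illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Section = Tuple[int, int]      # inclusive (start_frame, end_frame)


def representative(series: List[Pixel], section: Section) -> Pixel:
    """RGB value at the central frame of a candidate section."""
    start, end = section
    return series[(start + end) // 2]


def votes(reps: List[Pixel], thr3: int) -> List[int]:
    """vote(s): number of other sections t whose representative satisfies Formula 1."""
    result = []
    for s, rep_s in enumerate(reps):
        count = 0
        for t, rep_t in enumerate(reps):
            if t != s and all(abs(a - b) < thr3 for a, b in zip(rep_s, rep_t)):
                count += 1
        result.append(count)
    return result


def background_sections(sections: List[Section], reps: List[Pixel], thr3: int) -> List[Section]:
    """Keep the candidate sections with the highest vote count (assumed tie handling)."""
    counts = votes(reps, thr3)
    if not counts:
        return []
    best = max(counts)
    return [sec for sec, n in zip(sections, counts) if n == best]
```

With five candidates whose first, third, and fifth representatives are nearly the same color, those three each receive two votes and the other two receive none.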

In S305, the CPU 103 determines the RGB pixel value to apply to the pixel of interest. It selects an arbitrary frame from the background recording sections selected in S304 and uses the RGB values in that frame as the recorded background information for the pixel of interest.

In S306, the CPU 103 increments the variable j by 1. In S307, it determines whether all columns have been processed; if not, the process returns to S303, and if so, the process proceeds to S308.
In S308, the CPU 103 increments the variable i by 1. In S309, it determines whether all rows have been processed; if not, the process returns to S302, and if so, the creation of the moving-object-removed image ends.

Any method may be used to choose a frame from the background recording sections, but, taking changes in ambient light and the like into account, it is desirable to choose a frame nearer the center. In this embodiment, therefore, attention is paid to the background recording section located at the temporal center, and the RGB values of the center frame of that section are used.
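The frame choice described above can be sketched as follows. The function name is hypothetical, the sections are assumed to be listed in time order and non-empty, and taking the upper-middle section when the count is even is an assumption.

```python
from typing import List, Tuple

Section = Tuple[int, int]  # inclusive (start_frame, end_frame)


def pick_background_frame(sections: List[Section]) -> int:
    """Return the center frame of the temporally central background recording section.

    Assumptions: `sections` is non-empty and sorted by time; with an even
    number of sections the later of the two middle sections is used.
    """
    start, end = sections[len(sections) // 2]
    return (start + end) // 2
```

For the sections (0, 3), (9, 12), and (18, 22), the middle section is (9, 12), so frame 10 supplies the background color for the pixel.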

In this embodiment, sections that repeatedly record the same color are selected as background recording sections. Alternatively, the background recording section may be selected, for example, by choosing the section recorded for the longest time, or by using the recording time of each section instead of the number of intermittent repetitions counted per section. Likewise, although this embodiment uses the center frame of the central section among the selected background recording sections, the frame to use may be chosen by another criterion, such as using the RGB values of the frame closest to the central time of the moving-image recording period. Furthermore, instead of using the combination of RGB values in a single frame, the mode or median of each of the R, G, and B values over the background recording sections may be obtained separately and used.

In S201, the CPU 103 creates the moving-object-removed image by repeating the above procedure for every pixel (S301 to S309).
As described above, a frame having a highly reproducible combination of feature amounts is selected from the temporal variation of the feature amounts of each pixel, and the moving-object-removed image is generated from the combination of feature amounts in that frame. The feature amount is at least one of color information, a luminance value, and distance information.

FIG. 12 shows one frame of a moving image in which a yajirobe (balancing toy) moves in front of a daruma otoshi (stacking toy). FIGS. 13(a), 13(b), and 13(c) show the temporal variation of the RGB values at pixels (1), (2), and (3) of FIG. 12, respectively.

As FIG. 13(a) shows, pixel (1) of FIG. 12 periodically records the rod of the yajirobe and the background in alternation, but the background color is recorded for longer continuous periods, so a frame that recorded the background can be selected correctly at this pixel.
Next, as FIG. 13(b) shows, pixel (2) of FIG. 12 records the object on the fulcrum of the yajirobe (hereinafter, the on-fulcrum object) in every frame. The moving object cannot be removed at a pixel like this, where the moving object is always recorded.

Furthermore, as FIG. 13(c) shows, pixel (3) of FIG. 12, like pixel (1), periodically records the on-fulcrum object and the background in alternation, but here the on-fulcrum object is recorded for longer continuous periods. At this pixel, therefore, a frame that recorded the on-fulcrum object is selected by mistake. The moving-object-removed image thus cannot be said to have removed the moving object completely; a method for correctly extracting the moving object even in such a case is described later.

FIG. 14 shows the moving-object-removed image created by the procedure of S201 for the moving image that includes the image of FIG. 12.
As FIG. 14 shows, part of the moving object has been removed, but portions undergoing periodic motion, such as the on-fulcrum object, cannot be removed.

The description now returns to the flowchart of FIG. 2.
In S202, the CPU 103 assigns 1 to the variable n, and in S203, the CPU 103 performs the moving-object extraction processing, which is described with reference to the flowchart of FIG. 4.
In S401 and S402, the CPU 103 creates the seeds needed for moving-object extraction. Here, a "moving-object seed" or a "background seed" is created from the pixels that make up the image of the frame being processed. The moving-object seed is created first, in S401, and the background seed is created next, in S402.

In S401, the CPU 103 creates the moving-object seed. The moving-object seed is some or all of the pixels that make up the moving object. Pixels at which the absolute difference between the moving-object-removed image created in S201 and the image of the frame from which the moving object is to be extracted (hereinafter, the frame of interest) is equal to or greater than the third threshold are taken as the moving-object seed.

Next, in S402, the CPU 103 creates the background seed. The background seed consists of the pixels that recorded background information in every frame. Whether a pixel did so is determined by taking the absolute value of the frame-to-frame difference of the RGB values of the pixel of interest; a pixel whose absolute difference is equal to or less than the first threshold in every frame is determined to be background.
Finally, any pixel determined to be neither a moving-object seed nor a background seed is treated as indeterminate.
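The seed creation of S401 and S402 can be sketched as follows. The label strings, the function name, and the per-channel interpretation of "absolute difference" (any channel for the moving-object seed, every channel for the background seed) are assumptions for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # image[row][col]

MOVING, BACKGROUND, INDETERMINATE = "moving", "background", "indeterminate"


def make_seeds(removed: Image, frames: List[Image], n: int,
               seed_thr: int, thr1: int) -> List[List[str]]:
    """Label every pixel of frame n as moving seed, background seed, or indeterminate.

    removed:  moving-object-removed image.
    seed_thr: threshold on |removed - frame n| for the moving-object seed.
    thr1:     threshold on frame-to-frame change for the background seed.
    """
    height, width = len(removed), len(removed[0])
    labels = [[INDETERMINATE] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # S401: large difference from the removed image in any channel (assumed).
            if any(abs(a - b) >= seed_thr for a, b in zip(removed[i][j], frames[n][i][j])):
                labels[i][j] = MOVING
            # S402: the pixel never changes by more than thr1 between adjacent frames.
            elif all(abs(a - b) <= thr1
                     for prev, cur in zip(frames, frames[1:])
                     for a, b in zip(prev[i][j], cur[i][j])):
                labels[i][j] = BACKGROUND
    return labels
```

A pixel that matches the removed image in the frame of interest but changes in some other frame ends up indeterminate and is resolved by the later invasion steps.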

Next, in S403 and S404, the CPU 103 extracts the moving object using the "moving-object seed", "background seed", and "indeterminate" labels created in S401 and S402. The process of comparing the feature amounts of a pixel of interest and its surrounding pixels and, when they are sufficiently similar, assigning the pixel of interest to the same group as the surrounding pixel (here, moving object or background) is referred to below as "invasion".

First, in S403, the CPU 103 performs invasion from the moving-object seed into the indeterminate pixels. The procedure for this invasion, which classifies pixels of interest as the moving object, is shown in the flowchart of FIG. 5.
In S501, the CPU 103 tests whether an indeterminate pixel of interest I_indet and a moving-object pixel I_move among its four adjacent neighbors satisfy Formula 2; if they do, the pixel of interest is added to the moving object in S502. Here, thr4 is a fourth threshold.
|I_indet(R) - I_move(R)| < thr4
|I_indet(G) - I_move(G)| < thr4    ... (Formula 2)
|I_indet(B) - I_move(B)| < thr4

In S503, the CPU 103 determines, for all indeterminate pixels adjacent to moving-object pixels, whether any was added to the moving object in S502. The CPU 103 repeats S501 to S503 until no indeterminate pixel adjacent to a moving-object pixel is added to the moving object any longer.

FIGS. 15(a), 15(b), and 15(c) show an example of the process of invasion from the moving-object seed into indeterminate pixels.
As FIG. 15 shows, because the moving-object seed contains pixels that are actually background, part of the background is misjudged as the moving object. In addition, regions whose color is not similar to that of the moving object cannot be invaded and remain indeterminate.
In S504, the CPU 103 judges that the pixels still indeterminate when the invasion from the moving-object seed has finished are not the moving object, and adds these indeterminate pixels to the background seed.
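Steps S501 to S504 can be sketched as a repeated scan over the label map. The label strings and the function name are assumptions, and the in-place update during a scan is one possible reading of the loop; the final labels do not depend on the scan order, because a pixel joins the moving object exactly when a chain of similar 4-neighbors connects it to a seed pixel.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # image[row][col]


def invade_from_moving(labels: List[List[str]], image: Image, thr4: int) -> List[List[str]]:
    """Grow the moving object into indeterminate pixels, then close out (S501 to S504).

    labels holds "moving", "background", or "indeterminate" and is updated in place.
    """
    height, width = len(labels), len(labels[0])
    changed = True
    while changed:                      # S503: repeat until a full pass adds nothing
        changed = False
        for i in range(height):
            for j in range(width):
                if labels[i][j] != "indeterminate":
                    continue
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < height and 0 <= nj < width
                            and labels[ni][nj] == "moving"
                            # S501: Formula 2 must hold for R, G, and B.
                            and all(abs(a - b) < thr4
                                    for a, b in zip(image[i][j], image[ni][nj]))):
                        labels[i][j] = "moving"   # S502
                        changed = True
                        break
    # S504: whatever is still indeterminate is treated as background.
    for row in labels:
        for j, label in enumerate(row):
            if label == "indeterminate":
                row[j] = "background"
    return labels
```

After this step every pixel is labeled either moving object or background, which is the starting point for the invasion from the background seed.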

For the reason given above, pixels that are actually background may be judged to be the moving object, as in FIG. 15(c). To classify these pixels correctly as background, the CPU 103 performs invasion from the background seed into the moving object in S404.

The procedure for invasion from the background seed into the moving object is shown in the flowchart of FIG. 6.
In S601, the CPU 103 tests whether a moving-object pixel of interest I_move and a background pixel I_back among its four adjacent neighbors satisfy Formula 3; if they do, the pixel of interest is added to the background in S602. Here, thr5 is a fifth threshold. thr4 and thr5 may have the same value, but thr5 may also be set smaller than thr4. This avoids erroneous invasion into the moving object through an ambiguous edge during invasion from the background when the edges of the moving object are blurred, for example because the image of the frame of interest has been resized.
|I_move(R) - I_back(R)| < thr5
|I_move(G) - I_back(G)| < thr5    ... (Formula 3)
|I_move(B) - I_back(B)| < thr5

In S603, the CPU 103 determines, for all moving-object pixels adjacent to background pixels, whether any was added to the background in S602. The CPU 103 repeats S601 to S603 until no moving-object pixel adjacent to a background pixel is added to the background any longer.
FIGS. 16(a) and 16(b) show the process of invasion from the background seed into moving-object pixels. FIG. 16(c) shows the result after the invasion from the background seed has finished, that is, the state in which the moving object has been extracted.
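The invasion from the background (S601 to S603) can be sketched with a work queue instead of repeated full scans. This is a swapped-in formulation chosen for brevity, not the loop of FIG. 6 itself; it reaches the same final labels because a moving-object pixel is relabeled exactly when a chain of similar 4-neighbors connects it to a background pixel. The label strings and the function name are assumptions.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # image[row][col]


def invade_from_background(labels: List[List[str]], image: Image, thr5: int) -> List[List[str]]:
    """Relabel moving-object pixels as background when Formula 3 holds with a
    4-neighbor that is already background. labels is updated in place."""
    height, width = len(labels), len(labels[0])
    queue = deque((i, j) for i in range(height) for j in range(width)
                  if labels[i][j] == "background")
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if (0 <= ni < height and 0 <= nj < width
                    and labels[ni][nj] == "moving"
                    and all(abs(a - b) < thr5
                            for a, b in zip(image[i][j], image[ni][nj]))):
                labels[ni][nj] = "background"
                queue.append((ni, nj))   # the new background pixel may invade further
    return labels
```

Choosing thr5 smaller than thr4 makes this pass more conservative, which limits leakage into the moving object across blurred edges.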

In S204, the CPU 103 increments n by 1. In S205, it determines whether all frames have been processed; if not, the process returns to S203 to process the next frame, and if so, the process proceeds to S206.
The above procedure (S203 to S205) is repeated for every frame of the input moving image. This makes it possible to extract the moving object even when no image without the moving object has been captured, or when the moving object crosses in front of a stationary object that has a distinctive color gamut.

Further, in S206, the CPU 103 creates a single still image (hereinafter, the representative still image). By combining the moving object extracted in the present embodiment with the representative still image, the CPU 103 can create a cinemagraph.
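
The combining in S206 could be sketched as follows, again only as an illustration. The embodiment states that the extracted moving object is combined with the representative still image but does not specify how; the per-pixel copy below, the function names, and the label values `MOVE` and `BACK` are assumptions of this sketch.

```python
# Hypothetical label values; the embodiment does not name them.
MOVE, BACK = "move", "back"


def composite_frame(still, frame, labels):
    """Overlay the moving object pixels of one frame on the representative still.

    still, frame : lists of rows of (R, G, B) tuples with the same size
    labels       : list of rows of MOVE / BACK for that frame
    """
    return [
        [f if lab == MOVE else s for s, f, lab in zip(s_row, f_row, l_row)]
        for s_row, f_row, l_row in zip(still, frame, labels)
    ]


def make_cinemagraph(still, frames, labels_per_frame):
    """Return one composited image per input frame."""
    return [
        composite_frame(still, frame, labels)
        for frame, labels in zip(frames, labels_per_frame)
    ]
```

Played back in order, the composited images show only the extracted moving object changing, while every other pixel keeps the value of the representative still image.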

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100 image pickup apparatus
101 optical system
102 image pickup element
103 CPU
104 primary storage unit
105 image processing unit
106 recording medium
107 secondary storage unit
108 display unit
109 operation unit

Claims

An image processing apparatus for extracting a moving object in a moving image, comprising:
generation means for generating a moving-object-removed image, in which no moving object appears, by finding a combination of sections having high reproducibility in the moving image, selecting one frame from the sections having high reproducibility, and applying the pixel values in that frame to each pixel;
extraction means for extracting a moving object seed, which is a part of the moving object, using a difference between the moving-object-removed image and the image in each frame of the moving image; and
classification means for classifying a pixel of interest as the moving object by comparing the feature amount of the moving object seed extracted by the extraction means with that of the adjacent pixel of interest, and for further classifying as background the pixels classified as not being the moving object at the time the classification is finished,
wherein the generation means obtains the combination of sections having high reproducibility based on the number of times that frames whose feature amount variation is equal to or less than a threshold continue intermittently, in each pixel of the frames of a section in the moving image,
the classification means classifies pixels that are not the moving object as the background when the invasion from the moving object is finished, and further performs invasion from the background into the moving object, and
the extraction means extracts the moving object from the moving image based on the classification by the classification means.

The image processing apparatus according to claim 1, wherein the feature amount is at least one of color information, a luminance value, and distance information.

The image processing apparatus according to claim 1 or 2, wherein the invasion classifies the pixel of interest as the moving object or the background.

The image processing apparatus according to any one of claims 1 to 3, further comprising a combining unit that combines the moving object extracted from the image of each frame with a still image,
wherein the combining unit creates a cinemagraph.

A control method of an image processing apparatus for extracting a moving object in a moving image, comprising:
a generation step of generating a moving-object-removed image, in which no moving object appears, by finding a combination of sections having high reproducibility in the moving image, selecting one frame from the sections having high reproducibility, and applying the pixel values in that frame to each pixel;
an extraction step of extracting a moving object seed, which is a part of the moving object, using a difference between the moving-object-removed image and the image in each frame of the moving image; and
a classification step of classifying a pixel of interest as the moving object by comparing the feature amount of the moving object seed extracted in the extraction step with that of the adjacent pixel of interest, and of further classifying as background the pixels classified as not being the moving object at the time the classification is finished,
wherein, in the generation step, the combination of sections having high reproducibility is obtained based on the number of times that frames whose feature amount variation is equal to or less than a threshold continue intermittently, in each pixel of the frames of a section in the moving image,
in the classification step, pixels that are not the moving object are classified as the background when the invasion from the moving object is finished, and invasion from the background into the moving object is further performed, and
in the extraction step, the moving object is extracted from the moving image based on the classification in the classification step.