JP2013055408A

JP2013055408A - Video processing device and control method therefor

Info

Publication number: JP2013055408A
Application number: JP2011190684A
Authority: JP
Inventors: Naoki Matsuki; 直紀松木
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2011-09-01
Filing date: 2011-09-01
Publication date: 2013-03-21

Abstract

PROBLEM TO BE SOLVED: To provide a video processing device that lets the user easily know the minimum video length required for editing. SOLUTION: A video processing device inputs video (302), acquires video information from the input video (304), acquires a recommended shooting time on the basis of the acquired video information (305), and presents information based on the acquired recommended shooting time (306).

Description

The present invention relates to a technique for processing moving images.

In recent years, opportunities to shoot digital moving images have increased, owing to the spread of home digital video cameras and the addition of movie shooting functions to digital single-lens reflex cameras.

In general, watching footage exactly as it was shot is tedious. It is therefore necessary to collect and edit source material, such as captured moving images and still images, and to produce an edited video that is worth watching. In doing so, varying the playback time of each cut according to its content makes the edited video easier to watch. For example, the playback time of a cut should be longer the more information the video contains, because the viewer needs more time to understand its content.

Adjusting the playback time of a cut according to the video is generally done manually by the user, but systems that do it automatically also exist.

For example, according to the method disclosed in Patent Document 1, when a slide show is played back, the playback time of an image is shortened as the number of persons in the image increases. This allows images in which the main subject appears to be displayed for a longer time.

According to the method disclosed in Patent Document 2, in the automatic page turning of an electronic album, the playback time of a page is varied according to the number of images on the album page and the duration of any moving image it contains. This allows pages containing more information to be displayed for a longer time.

By applying these methods, the playback time of a cut can easily be varied according to the video when editing.

Patent Document 1: JP 2006-245646 A
Patent Document 2: JP 2002-157275 A

To use the above methods, source footage of sufficient length must be shot in advance.

However, it is very difficult for a user to judge, while shooting, how long the footage being shot will need to be at editing time. As a result, the user may fail to shoot footage of the required length. Every shot could be taken longer than necessary, but then surplus footage that is never used in editing consumes storage capacity and shortens the total recording time.

In view of the above problems, an object of the present invention is to provide a video processing apparatus that can easily inform the user of the minimum length required for editing.

To achieve the above object, a video processing apparatus according to one aspect of the present invention has the following configuration. That is, it comprises: video input means for inputting video; video information acquisition means for acquiring video information from the video input by the video input means; recommended shooting time acquisition means for acquiring a recommended shooting time based on the video information acquired by the video information acquisition means; and presentation means for presenting information based on the recommended shooting time acquired by the recommended shooting time acquisition means.

According to the present invention, the user can easily be informed of the minimum length required for editing, based on the video information of the video being shot.

FIG. 1 is a diagram showing an outline of the video processing apparatus.
FIG. 2 is a diagram showing the appearance of the video processing apparatus.
FIG. 3 is a block diagram showing the functional configuration.
FIG. 4 is a flowchart showing the flow of processing when the user starts shooting.
FIG. 5 is a diagram showing the correspondence between user actions and video camera actions during video shooting.
FIG. 6 is a diagram showing a display example of the recommended shooting time and the fixed shooting duration.
FIG. 7 is a diagram showing an example of a table associating the information amount of video with the recommended shooting time.
FIG. 8 is a flowchart showing the flow of processing for acquiring the recommended shooting time.
FIG. 9 is a flowchart showing the flow of processing for calculating the information amount based on the number of faces.
FIG. 10 is a flowchart showing the flow of processing for calculating the information amount based on the number of colors.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 is a diagram showing an outline of a video processing apparatus that implements the present embodiment.

The imaging unit 101 comprises a zoom lens, a focus lens, a shake correction lens, a diaphragm, a shutter, an optical low-pass filter, an IR cut filter, a color filter, and a sensor such as a CMOS or CCD sensor, and detects the amount of light from the subject.

The A/D conversion unit 102 converts the amount of light from the subject into a digital value. The signal processing unit 103 performs demosaicing, white balance processing, gamma processing, and the like on the digital value to generate a digital image. The D/A conversion unit 104 performs analog conversion on the digital image.

The encoder unit 105 converts the digital image into a file format such as JPEG, MPEG, or H.264. The media interface 106 is an interface for connecting the video processing apparatus to a PC or other media (for example, a hard disk, a memory card, a CF card, an SD card, or a USB memory).

The CPU 107 is involved in all processing of each component; it sequentially reads and interprets the instructions stored in the ROM 108 and the RAM 109 and executes processing according to the results. The ROM 108 and the RAM 109 provide the CPU 107 with the programs, data, work areas, and the like required for that processing. Each process described later is likewise executed by the CPU 107 after it reads the processing program stored in the ROM or RAM.

The imaging system control unit 110 controls the imaging system as instructed by the CPU 107, for example by focusing, opening the shutter, and adjusting the aperture.

The operation unit 111 corresponds to buttons, a mode dial, and the like, and receives user instructions input through them. Instructions such as lens zoom can also be given via the operation unit 111. The character generator 112 generates characters, graphics, and the like.

The display unit 113 is typically a liquid crystal display and displays the captured images and characters received from the character generator 112 and the D/A conversion unit 104. It may also have a touch screen function, in which case user instructions given through it can be handled as input to the operation unit 111.

The motion detection unit 114 comprises a known three-axis acceleration sensor, an angular velocity sensor, and the like, and detects information on the motion applied to the video processing apparatus (tilt, direction of motion, acceleration, and so on).

The apparatus has components other than those described above, but they are not the focus of the present invention, so their description is omitted.

The present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a camera, a copier, or a facsimile machine).

The object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which software program code realizing the functions of the above-described embodiment is recorded, and having that system or apparatus read and execute the program code stored in the storage medium.

In this case, the program code read from the storage medium itself realizes the functions of the above-described embodiment, and the storage medium storing the program code constitutes the present invention. Storage media that can be used to supply the program code include, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile data storage unit, and a ROM.

The functions of the above-described embodiment are realized not only when the computer executes the program code it has read. It goes without saying that the invention also covers the case in which an OS (operating system) or the like running on the computer performs part of the actual processing based on the instructions of the program code, and the functions of the embodiment are realized by that processing.

It further goes without saying that the invention covers the case in which the program code read from the storage medium is written to a data storage unit provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that board or unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.

In the present embodiment, a video camera is described as an example of the video processing apparatus.

FIG. 2 is a diagram showing the appearance of a video camera that is the video processing apparatus of the present embodiment.

The button 201 is used to instruct the start and end of shooting. When the button 201 is pressed while shooting is not in progress, video shooting starts. When the button 201 is pressed while shooting is in progress, shooting ends.

The display 202 displays various kinds of information and corresponds to the display unit 113 in FIG. 1. In addition to the video being shot, it can display characters, graphics, and the like. Here, "video" mainly refers to moving images, but the apparatus may also be capable of shooting still images.

FIG. 3 is a block diagram showing the functional configuration of the present embodiment. The video processing apparatus 301 includes a video input unit 302, a processing unit 303, a video information acquisition unit 304, a recommended video time acquisition unit 305, and a presentation unit 306. The processing unit 303 corresponds to the CPU 107 in FIG. 1. The presentation unit 306 corresponds to the display unit 113 in FIG. 1 and the display 202 in FIG. 2.

When video is input through the video input unit 302, the video information acquisition unit 304 acquires the necessary information from the input video, the recommended video time acquisition unit 305 acquires a recommended time from that information, and the presentation unit 306 presents the acquired recommended time.

FIG. 4 is a flowchart showing the flow of processing in the video camera of the present embodiment when the user presses the button 201 to start shooting. FIG. 5 shows the correspondence between the user's shooting actions and the operation of the video camera. Each S*** in FIG. 4 denotes a step. In S401, 0 is assigned to the flag F. In S402, it is determined whether the user's shooting operation is fixed. This is determined from the input information of the operation unit 111 and the information on the motion of the video camera obtained from the motion detection unit 114.

In the present embodiment, the user's shooting operation is determined to be fixed when there is no input from the operation unit 111 and no motion is applied to the video camera. When judging the motion applied to the video camera, slight motion caused by camera shake or the like is ignored. If the user's shooting operation is determined to be fixed, the process proceeds to S403. Otherwise, the process proceeds to S405.
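The fixed-state test of S402 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `MotionSample` type, its fields, and the `SHAKE_THRESHOLD` value are assumptions introduced here, since the text only says that slight motion due to camera shake is ignored.

```python
from dataclasses import dataclass

# Hypothetical threshold: motion at or below this magnitude is treated as hand shake.
SHAKE_THRESHOLD = 0.05


@dataclass
class MotionSample:
    """One reading from the motion detection unit (illustrative fields only)."""
    ax: float
    ay: float
    az: float


def is_shooting_fixed(has_operation_input: bool, motion: MotionSample,
                      threshold: float = SHAKE_THRESHOLD) -> bool:
    """Return True when there is no user input and no motion beyond hand shake."""
    magnitude = (motion.ax ** 2 + motion.ay ** 2 + motion.az ** 2) ** 0.5
    return (not has_operation_input) and magnitude <= threshold
```

A zoom operation or a pan larger than the threshold makes the function return False, which corresponds to the branch to S405.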

In S403, the value of the flag F is checked. If F = 0, the process proceeds to S404. Otherwise, the process proceeds to S408. In S404, 1 is assigned to the flag F. In S405, 0 is assigned to the flag F.

In S406, the current time T_0 is acquired. T_0 is expressed as the number of seconds elapsed since the moment the shooting button was pressed. In S407, the recommended shooting time T is acquired from the content of the video being shot. The recommended shooting time T is the minimum time (in seconds) required for use in later editing. The method of acquiring the recommended shooting time T is described later with reference to FIG. 8.

In S408, the current time T_1 is acquired. T_1 is expressed as the number of seconds elapsed since the moment the shooting button was pressed. In S409, the fixed shooting duration T_k is acquired. T_k is the time (in seconds) elapsed since the user's shooting operation was determined to be fixed, and is calculated by the following equation.

T_k = T_1 - T_0 ... (1)

In S410, the recommended shooting time T acquired in S407 and the fixed shooting duration T_k acquired in S409 are displayed on the display 202, for example as shown in FIG. 6. FIG. 6 draws a meter whose maximum value is the recommended shooting time T and marks the fixed shooting duration T_k on it, so that the relationship between T and T_k is shown graphically.

The degree to which the current shooting time has reached the recommended time may also be displayed as a percentage in a message.
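The loop of S402 to S410 and the percentage display can be sketched as a small state holder. This is a hedged illustration under stated assumptions: the text does not say where S404 and S405 lead, so the sketch assumes that S404 continues through S406 and S407 to S408, and that nothing is displayed while the shot is not fixed. The names `FixedShotTimer`, `update`, and `achievement_percent` are hypothetical.

```python
class FixedShotTimer:
    """Tracks the flag F, the start time T_0, and the fixed shooting duration T_k."""

    def __init__(self):
        self.flag = 0            # F, initialised in S401
        self.t0 = None           # T_0, set when a fixed period begins
        self.recommended = None  # T, acquired once per fixed period

    def update(self, now, fixed, recommend):
        """Process one loop iteration.

        `now` is seconds since shooting started, `fixed` is the S402 result,
        and `recommend` is a callable returning T in seconds.
        Returns (T, T_k) while the shot is fixed, otherwise None.
        """
        if not fixed:
            self.flag = 0        # S405
            return None
        if self.flag == 0:       # S403 -> S404, then S406 and S407 (assumed order)
            self.flag = 1
            self.t0 = now
            self.recommended = recommend()
        t_k = now - self.t0      # S408, S409: T_k = T_1 - T_0
        return self.recommended, t_k


def achievement_percent(t_k, recommended):
    """Progress toward the recommended time, capped at 100 percent."""
    if recommended <= 0:
        return 100.0
    return min(100.0, 100.0 * t_k / recommended)
```

With a recommended time of 4 seconds, two seconds of fixed shooting gives `achievement_percent(2.0, 4.0)`, which evaluates to 50.0.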

In S411, it is determined whether the user has pressed the button 201. If so, the process proceeds to S412. Otherwise, the process returns to S402.

In S412, the shooting operation of the video camera ends and the captured video is saved.

In the above description, the determination that the shooting operation is fixed triggers the start of shooting-time measurement. Alternatively, a scene change may be detected and the shooting time measured from the scene boundary.

(Recommended shooting time acquisition process)
This section describes in detail the acquisition of the recommended shooting time performed in S407 of FIG. 4.

In the present embodiment, to determine the recommended shooting time, the content of the video is first analyzed to calculate the information amount I of the video. The information amount I is expressed as a value between 0 and 1. A database in which the information amount I and the recommended shooting time T are stored in association with each other is prepared in advance, and the recommended shooting time T is acquired using the calculated information amount I as a key. An example of the database is shown in FIG. 7. The method of calculating the information amount is described later.

FIG. 8 is a flowchart showing the flow of processing for acquiring the recommended shooting time in the present embodiment.

In S801, 0 is assigned to the information amount I. In S802, the information amount I_F based on the number of faces in the video is calculated. I_F is expressed as a value between 0 and 1. The method of calculating I_F is described later with reference to FIG. 9.

In S803, the information amount I_C based on the number of colors in the video is calculated. I_C is expressed as a value between 0 and 1. The method of calculating I_C is described later with reference to FIG. 10.

In S804, the information amount I of the video is calculated from I_F and I_C as follows. Here, a and b are the weights for I_F and I_C, respectively, and may be set to any values such that a + b = 2. In the present embodiment, a = b = 1.

I = (a · I_F + b · I_C) / 2 ... (2)

In S805, the recommended shooting time is acquired from the database shown in FIG. 7 using the information amount I as a key.
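Steps S804 and S805 can be sketched as a weighted average followed by a table lookup. The contents of the FIG. 7 database are not reproduced in the text, so `RECOMMENDED_TIME_TABLE` below is a purely hypothetical stand-in, and the function names are likewise illustrative.

```python
import bisect

# Hypothetical stand-in for the FIG. 7 table: (upper bound of I, recommended seconds).
# The real values in the patent's figure are not given in the text.
RECOMMENDED_TIME_TABLE = [(0.2, 2.0), (0.4, 3.0), (0.6, 4.0), (0.8, 5.0), (1.0, 6.0)]


def information_amount(i_f, i_c, a=1.0, b=1.0):
    """Equation (2): I = (a * I_F + b * I_C) / 2, with weights satisfying a + b = 2."""
    if abs((a + b) - 2.0) > 1e-9:
        raise ValueError("weights must satisfy a + b = 2")
    return (a * i_f + b * i_c) / 2.0


def recommended_shooting_time(i, table=RECOMMENDED_TIME_TABLE):
    """Look up the recommended time T for an information amount I in [0, 1] (S805)."""
    bounds = [upper for upper, _ in table]
    # First row whose upper bound is >= I; clamp to the last row for I above 1.
    index = min(bisect.bisect_left(bounds, i), len(table) - 1)
    return table[index][1]
```

Because a + b = 2 and both inputs lie in [0, 1], I also stays in [0, 1], which is why a single table keyed on that range is sufficient.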

FIG. 9 is a flowchart showing the flow of processing in S802 of FIG. 8 for calculating the information amount I_F based on the number of faces.

The following processing is performed on a still image. The image to be processed is one frame selected from the video shot by the video camera. Alternatively, a single image may be created from a plurality of frames of the video and used as the processing target.

In S901, 0 is assigned to I_F. In S902, the number of faces N_F in the image is acquired. Face regions in the image are detected by a known method. For example, Japanese Patent Laid-Open No. 09-252534 discloses a technique for extracting a face region using a standard face image (template) registered in advance. The same publication further discloses a technique that extracts candidate feature points such as the eyeballs (pupils) and nostrils from the extracted face region, and detects facial parts such as the eyes, nose, and mouth from their arrangement and their similarity to pre-registered templates of eye, nose, and mouth regions.

In S903, a value is assigned to the maximum number of faces N_{F,MAX}. Any value of 1 or more may be assigned; in the present embodiment, N_{F,MAX} is 10. In S904, N_F is compared with N_{F,MAX}. If N_F > N_{F,MAX}, the process proceeds to S905. Otherwise, the process proceeds to S906. In S905, 1.0 is assigned to I_F. In S906, N_F / N_{F,MAX} is assigned to I_F.
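Steps S903 to S906 amount to a clamped ratio. A minimal sketch follows, assuming the face count N_F has already been obtained by some face detector (not shown); the function name is illustrative.

```python
def face_information_amount(n_faces, n_faces_max=10):
    """Return I_F in [0, 1] from the face count N_F (S903 to S906).

    n_faces_max corresponds to N_{F,MAX}; the embodiment uses 10.
    """
    if n_faces_max < 1:
        raise ValueError("N_{F,MAX} must be 1 or more")
    if n_faces > n_faces_max:        # S904 -> S905
        return 1.0
    return n_faces / n_faces_max     # S906
```

For instance, five detected faces with the embodiment's maximum of 10 give I_F = 0.5, and any count above 10 saturates at 1.0.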

FIG. 10 is a flowchart showing the flow of processing in S803 of FIG. 8 for calculating the information amount I_C based on the number of colors.

Each pixel of the image is assumed to store, as color information, the values of three channels R, G, and B at 8 bits each. R, G, and B therefore each take a value in the range 0 to 255.

Each pixel of the image is assumed to be accessible in a coordinate system whose origin is the lower left of the image, with +x to the right and +y upward. In this coordinate system, r(i, j), g(i, j), and b(i, j) denote the R, G, and B values of the pixel at position x = i, y = j.

The width of the image is denoted by W and the height by H.

In S1001, 0 is assigned to I_C. In S1002, a buffer for a 256 × 256 × 256 three-dimensional array FLG is created. Any position in the array can be accessed with the notation FLG[r][g][b]. In S1003, 0 is assigned to every position of the three-dimensional array FLG.

In S1004, 0 is assigned to the variable j. In S1005, 0 is assigned to the variable i. In S1006, the value of FLG[r(i, j)][g(i, j)][b(i, j)] is checked. If the value is 0, the process proceeds to S1007. Otherwise, the process proceeds to S1008.

In S1007, 1 is assigned to FLG[r(i, j)][g(i, j)][b(i, j)]. In S1008, 1 is added to the variable i. In S1009, the variable i is compared with W. If i > W, the process proceeds to S1010. Otherwise, the process returns to S1006. In S1010, 1 is added to the variable j.

In S1011, the variable j is compared with H. If j > H, the process proceeds to S1012. Otherwise, the process returns to S1005. In S1012, the number of colors N_C of the image is acquired. N_C is the sum of the values at all positions of the three-dimensional array FLG. In S1013, a value is assigned to the maximum number of colors N_{C,MAX}. Any value between 1 and 16777216 may be assigned; in the present embodiment, N_{C,MAX} is 1000.

In S1014, N_C is compared with N_{C,MAX}. If N_C > N_{C,MAX}, the process proceeds to S1015. Otherwise, the process proceeds to S1016. In S1015, 1.0 is assigned to I_C. In S1016, N_C / N_{C,MAX} is assigned to I_C.
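The color-counting procedure of FIG. 10 can be sketched as below. Note the substitution: instead of the 256 × 256 × 256 flag array FLG, the sketch records seen colors in a Python set, which yields the same distinct-color count N_C without allocating 16,777,216 entries. The sketch also visits each pixel of the image exactly once (indices 0 to W-1 and 0 to H-1). The image representation (a list of rows of `(r, g, b)` tuples) and the function names are assumptions.

```python
def count_colors(pixels):
    """Count distinct (r, g, b) values; `pixels` is a list of rows of RGB tuples."""
    seen = set()                 # plays the role of the FLG flag array
    for row in pixels:
        for rgb in row:
            seen.add(rgb)        # equivalent to setting FLG[r][g][b] = 1
    return len(seen)             # N_C: number of flags that are set


def color_information_amount(pixels, n_colors_max=1000):
    """Return I_C in [0, 1] (S1013 to S1016); the embodiment uses N_{C,MAX} = 1000."""
    n_colors = count_colors(pixels)
    if n_colors > n_colors_max:
        return 1.0
    return n_colors / n_colors_max
```

A flat, single-color frame gives N_C = 1 and therefore a very small I_C, while a frame with more than N_{C,MAX} distinct colors saturates at 1.0.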

In the above example, each pixel holds the values of three channels R, G, and B at 8 bits each as color information. However, the color information may be held in other formats. For example, when the three channels R, G, and B are held at 16 bits each, each value ranges from 0 to 65535, and the three-dimensional array FLG is created as a 65536 × 65536 × 65536 buffer. When the number of channels increases, the number of dimensions of FLG increases accordingly.

In the above example, the colors were counted pixel by pixel. However, the image may instead be divided into a plurality of blocks and the colors counted per block. For example, the average color of each 20 × 20 pixel block may be computed and counted.
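The block-wise variant can be sketched as follows. The text does not specify how the average is rounded or how partial blocks at the image edge are treated, so this sketch assumes an integer (truncated) per-channel mean and averages edge blocks over the pixels they actually contain. The function name and image representation (a rectangular list of rows of `(r, g, b)` tuples) are illustrative.

```python
def count_block_colors(pixels, block=20):
    """Count distinct average colors over block x block tiles of the image."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    seen = set()
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            # Collect the pixels of this tile; edge tiles may be smaller.
            cells = [rgb
                     for row in pixels[y0:y0 + block]
                     for rgb in row[x0:x0 + block]]
            n = len(cells)
            # Truncated per-channel mean (assumed rounding rule).
            mean = tuple(sum(c[k] for c in cells) // n for k in range(3))
            seen.add(mean)
    return len(seen)
```

Averaging per block makes the count less sensitive to sensor noise, since many nearly identical pixel values collapse into one block color.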

As described above with reference to FIG. 8, the present embodiment uses the information amount of the video as the video information, determines the minimum video length required for editing, acquires the recommended shooting time, and presents it to the user. This makes it possible to easily inform the user, for each video, of the minimum length required for editing. It also reduces the storage capacity consumed by shooting surplus footage.

In the present embodiment, the number of faces and the number of colors in the video are used to calculate the information amount of the video, but other items may be used. For example, the image may be frequency-analyzed and the information amount calculated from the number of high-frequency components. Alternatively, objects may be detected and the number of objects used.

In the present embodiment, the acquired recommended shooting time is only presented to the user as information, but other operations may be added. For example, shooting may be forcibly ended when the fixed shooting duration exceeds the recommended shooting time.

In the above embodiment, the recommended shooting time is acquired from the information amount of the video, but the present invention is not limited to this. For example, it is also effective to use the face size as the video information and to lengthen the recommended shooting time the larger the face is relative to the display screen. In this way, a close-up of a face can be displayed for a long time.

Thus, by acquiring the recommended shooting time using the face size as the video information and presenting it to the user, the user can easily be informed, for each video, of the minimum length required for editing.

According to the above embodiments, by acquiring the recommended shooting time from the video information and presenting it to the user, the user can easily be informed, for each video, of the minimum length required for editing.

Claims

A video processing apparatus comprising:
input means for inputting video;
video information acquisition means for acquiring video information from the input video;
recommended shooting time acquisition means for acquiring a recommended shooting time based on the acquired video information; and
presentation means for presenting information based on the acquired recommended shooting time.

The video processing apparatus according to claim 1, wherein the video information is an information amount of video input by the video input unit.

The video processing apparatus according to claim 2, wherein the information amount of the video is acquired based on the number of faces in the video.

The video processing apparatus according to claim 2, wherein the information amount of the video is acquired based on the number of colors in the video.

The video processing apparatus according to claim 1, wherein the video information is a size of a subject of the video input by the video input unit.

The video processing apparatus according to claim 1, further comprising fixed determination means for determining whether the shooting operation is fixed,
wherein the recommended shooting time acquisition means starts acquiring the recommended shooting time when the fixed determination means determines that the shooting operation is fixed.

The video processing apparatus according to claim 1, wherein the presentation unit graphically presents the recommended shooting time acquired by the recommended shooting time acquisition unit.

A method for controlling a video processing apparatus, comprising:
a step of inputting video by input means;
a step of acquiring video information from the input video by video information acquisition means;
a step of acquiring a recommended shooting time based on the acquired video information by recommended shooting time acquisition means; and
a step of presenting information based on the acquired recommended shooting time by presentation means.

A non-transitory computer-readable storage medium storing a program for causing a computer to execute the video processing apparatus control method according to claim 8 by being read by the computer.

A computer-readable storage medium storing the computer program according to claim 9.