JP4240098B2

JP4240098B2 - Image processing apparatus and image processing method

Info

Publication number: JP4240098B2
Application number: JP2006260179A
Authority: JP
Inventors: 英雄阿部
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2006-09-26
Filing date: 2006-09-26
Publication date: 2009-03-18
Anticipated expiration: 2018-12-24
Also published as: JP2007082240A

Description

The present invention relates to an image processing apparatus and an image processing method, and more particularly to an image processing apparatus and an image processing method that display and output, as still images, moving image information captured by a video camera, an electronic still camera, or the like.

In recent years, AV equipment has become remarkably widespread and capable. In particular, highly functional image recording devices such as video cameras and electronic still cameras are now used in a wide range of settings, both consumer and professional. Images captured by these devices can easily be displayed on a liquid crystal panel mounted on the device body, or on the monitor of a television or personal computer (hereinafter, PC) connected via a cable or the like, and can also be printed directly via a printer or the like. Conventionally, known methods of representing moving image information as still images, covering not only moving images captured by a video camera or electronic still camera but also general moving image information (including television broadcast video), include extracting an arbitrary frame image from the plurality of frame images constituting the moving image information and displaying it alone, and, where several frame images are extracted, displaying them sequentially or all together on a divided screen.

The prior art is described specifically below.
FIG. 13 is a conceptual diagram of a method for selecting and displaying an arbitrary frame image from the plurality of frame images constituting moving image information. In FIG. 13, the moving image data VD is assumed to contain images of a vehicle X traveling from left to right in the drawing. The moving image data VD can be regarded as consisting of a plurality of frame images (still images) arranged in time series. Accordingly, the user of the image processing apparatus first plays back and views the moving image data VD to grasp the movement of the subject (the vehicle), changes in the shooting situation, and so on, and then designates an arbitrary frame image from the series of frame images, so that an image of any scene contained in the moving image data can be displayed and output as a still image. For example, as shown in FIG. 13, by operating the switches of the image processing apparatus to select a desired frame image (here, frame image F4 at a specific time T4) from the series of frame images constituting the moving image data, any one scene of the vehicle X traveling can be extracted from the moving image data and displayed on a monitor or the like. When a plurality of frame images are designated, they are displayed, for example, in time series on the monitor in a scroll display mode, or all together in a multi-screen mode.

In the still-image selection and display method described above, the shooting content, such as changes in the shooting situation and the sequence of the subject's movements, can be recognized only after all of the moving image data has been played back. The user of the image processing apparatus must therefore first play back and view the moving image data VD, grasp the changes in the shooting situation, the movement of the subject, and so on, and then designate the frame images containing the desired scenes. As a result, checking the shooting content takes a long time, and outputting the desired content as still images requires various editing operations and is extremely cumbersome.

An object of the present invention is therefore to provide an image processing apparatus and an image processing method that can automatically extract still images in accordance with changes in the shooting situation, movement of the subject, and the like contained in a series of moving image data, and can display and output them in an image representation that allows the shooting content to be recognized intuitively.

The image processing apparatus according to claim 1 comprises: feature amount detection means for detecting a change in a feature amount contained in a plurality of frame images constituting moving image information; frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means; frame image enhancement means for applying, to the specific frame image selected by the frame image selection means, enhancement processing that changes its display frame shape; and image output means for displaying and outputting, as still images, the plurality of frame images including the specific frame image to which the frame image enhancement means has applied the enhancement processing that changes the display frame shape.
The image processing apparatus according to claim 2 comprises: feature amount detection means for detecting a change in a feature amount contained in a plurality of frame images constituting moving image information; frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means; frame image enhancement means for applying predetermined enhancement processing to the specific frame image selected by the frame image selection means; and print output means for printing out, as still images, the plurality of frame images including the specific frame image enhanced by the frame image enhancement means.

The image processing apparatus according to claim 4 is the image processing apparatus according to any one of claims 1 to 3, wherein the feature amount is the movement of a subject contained in the plurality of frame images constituting the moving image information.
The image processing apparatus according to claim 5 is the image processing apparatus according to any one of claims 1 to 3, wherein the feature amount is audio information accompanying each of the plurality of frame images constituting the moving image information.
The image processing apparatus according to claim 6 comprises: moving image information input means for inputting moving image information including a plurality of frame images; moving image information extraction means for extracting frame images at predetermined time intervals from the moving image information input by the moving image information input means; feature amount detection means for detecting a change in a feature amount contained in the plurality of frame images extracted by the moving image information extraction means; frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means; frame image enhancement means for applying predetermined enhancement processing to the specific frame image selected by the frame image selection means; and image output means for displaying and outputting, as still images, the plurality of frame images including the specific frame image enhanced by the frame image enhancement means.

The image processing method according to claim 7 includes the steps of: causing a feature amount detection unit to detect a change in a feature amount contained in a plurality of frame images constituting moving image information; causing a display image selection unit to select a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection unit; causing an image processing unit to apply, to the specific frame image selected by the display image selection unit, enhancement processing that changes its display frame shape; and causing an image output unit to display and output, as still images, the plurality of frame images including the specific frame image to which the image processing unit has applied the enhancement processing.
The image processing method according to claim 8 includes the steps of: causing a feature amount detection unit to detect a change in a feature amount contained in a plurality of frame images constituting moving image information; causing a display image selection unit to select a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection unit; causing an image processing unit to apply predetermined enhancement processing to the specific frame image selected by the display image selection unit; and causing a printer to print out, as still images, the plurality of frame images including the specific frame image enhanced by the image processing unit.

According to the present invention, a frame image selected based on a change in a feature amount can be displayed and output in a special image representation that differs from normal image display. Therefore, even when a series of frame images is displayed, the frame images that indicate a change in the shooting situation or the like can be highlighted, and the user of the image processing apparatus can be made to recognize them appropriately.

Embodiments of the present invention are described below with reference to the drawings.
<First Embodiment>
FIG. 1 is a block diagram showing a first embodiment of an image processing apparatus according to the present invention.
In FIG. 1, 10 is a moving image data capturing unit, 20 is a frame memory, 30 is an MPU, 40 is a hard disk, 50 is a monitor such as an LCD (image output means), 60 is a printer (image output means), 70 is an input unit such as key switches (feature amount selection means, processing method selection means), and 80 is a bus for data and command transmission. The MPU 30 has the functions of a feature amount detection unit (feature amount detection means) 31, a display image selection unit (frame image selection means) 32, and an image processing unit (frame image processing means) 33.

These functions are broadly as follows.
(1) Moving image data capturing unit 10
The moving image data capturing unit 10 extracts a plurality of frame images, together with audio information, from moving image data and stores them in the frame memory 20 described later. The moving image data may consist of video data or of a plurality of continuously shot still images, and may consist of image information alone, without audio information. In short, in the present invention the moving image data need only consist of a sequence of at least a plurality of frame images, and the captured frame images may be either analog or digital images.
A schematic configuration of the moving image data capturing unit 10 is described with reference to the drawings. The description here assumes that the moving image data has a data structure in which audio information accompanies the image information. When the moving image data consists of image information alone, the configuration for processing the audio signal is omitted.

In FIG. 2, 11 is a frame image selection unit, 12 is an image/audio separation unit, 13a is an analog-to-digital converter for image signals (hereinafter, image A/D), 13b is an analog-to-digital converter for audio signals (hereinafter, audio A/D), 14a is an image signal compression unit, 14b is an audio signal compression unit, 15 is an image/audio synthesis unit (hereinafter, mixer), and 16 is a bus interface.
The frame image selection unit 11 selects and extracts frame images, together with audio information, from the moving image data at predetermined time intervals in accordance with commands from the MPU 30 described later. The selected frame images may be all of the frame images constituting the moving image data, or frame images at a specific time interval, for example every 1/5 sec or 1/10 sec. In either case, the time interval must be short enough that changes in the feature amounts contained in the moving image data can be detected. The definition of a feature amount is given later. The time interval used for selecting frame images can be set in various ways, for example by using a reference value preset by the MPU 30, or by using an arbitrary value specified by the user of the image processing apparatus via the input unit 70.
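The interval-based selection performed by the frame image selection unit 11 can be illustrated with a minimal Python sketch. The function name `sample_frames` and its parameters `source_fps` and `interval_sec` are hypothetical and do not appear in the specification; the sketch also assumes that the source frame rate is known and that frames are held in an indexable sequence.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], source_fps: float, interval_sec: float) -> List[T]:
    """Pick one frame per `interval_sec` from a sequence recorded at `source_fps`.

    Hypothetical helper: an interval shorter than one source frame
    degenerates to selecting every frame.
    """
    if source_fps <= 0 or interval_sec <= 0:
        raise ValueError("source_fps and interval_sec must be positive")
    # Number of source frames between two selected frames (never below 1).
    step = max(1, round(source_fps * interval_sec))
    return list(frames[::step])


# Usage: 30 frames recorded at 30 fps, sampled every 1/5 sec.
selected = sample_frames(list(range(30)), source_fps=30.0, interval_sec=1 / 5)
```

A shorter interval keeps more frames and makes feature changes easier to detect, at the cost of more data in the frame memory.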

The image/audio separation unit 12 separates the image signal and the audio signal by exploiting the difference between their frequency bands, so that subsequent signal processing is carried out on separate paths for image and audio.
The image A/D 13a converts the analog image signal into a digital image signal when the selected frame image is an analog signal. Likewise, the audio A/D 13b converts the analog audio signal into a digital audio signal when the audio information accompanying the selected frame image is an analog signal.

The image signal compression unit 14a processes the image signal of each frame image, whether digitized by the image A/D 13a or extracted as a digital image signal by the frame image selection unit 11, so that it conforms to a predetermined image compression standard. The JPEG standard or the like can be applied as the compression coding method for frame images. JPEG (Joint Photographic Experts Group) is a standard for compressing and encoding original image data using techniques such as the DCT (discrete cosine transform), quantization, and variable-length coding, and is an international standard adopted in color facsimile machines, electronic still cameras, and the like. Since various formats and standards other than JPEG, such as GIF, TIFF, LHA, and ZIP, are also in general use for compressing image information, an appropriate method can be adopted according to the embodiment.

The audio signal compression unit 14b processes the audio signal accompanying each frame image, whether digitized by the audio A/D 13b or extracted as a digital audio signal by the frame image selection unit 11, so that it conforms to a predetermined audio compression standard. The MPEG standard or the like can be applied as the compression coding method for audio signals. MPEG (Moving Picture Experts Group) is a standard that compresses and encodes a moving image consisting of a plurality of images (screens) continuous in time series by predictive coding, in which motion-compensated difference data between images is compressed and encoded; it can also be applied to the compression coding of audio signals.

If the image signal is compressed and encoded according to the JPEG standard and the audio signal according to the MPEG standard in this way, image processing can be implemented easily using the JPEG- and MPEG-compatible software that, with the recent spread of the Internet and the like, is installed on ordinary PCs.
The mixer 15 associates the image signal and the audio signal, each compressed and encoded according to a predetermined standard, and combines them into a single unit of frame data (packetization). The bus interface 16 converts the combined frame data to the transmission width of the bus 80 and transfers it to the frame memory 20.

(2) Frame memory 20
The frame memory 20 consists of DRAM (Dynamic Random Access Memory) or the like. It associates header information with the image signal and audio signal (frame data) selected and compression-encoded by the moving image data capturing unit 10, and stores them in the image data storage area designated by the MPU 30.
FIG. 3 is a conceptual diagram showing the internal areas of the frame memory 20.
As shown in FIG. 3, the internal area of the frame memory 20 is broadly divided into a format table area, an information table area, a data area, and an offset area.
The format table area stores format information, which is general information about the image information. The information table area stores header information for identifying image information, such as image identification information including number information for identifying the image information, and time information indicating the position (time) in the time series of the moving image data. The data area stores the frame data, that is, the compression-encoded image signal together with its accompanying audio signal, and the offset area stores offset data (blanks) used to make the frame data in the data area a fixed length.
In this way, each unit of frame data is stored in the data area in association with the header information stored in the information table area.
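The relationship between the information table area, the data area, and the offset area can be sketched as follows. This is an illustrative Python model only: the class names `Header` and `FrameMemory`, the field names, and the choice of zero bytes as offset data are assumptions, and the format table area is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Header:
    """Hypothetical header entry kept in the information table area."""
    number: int       # identification number of the frame data
    time_sec: float   # position of the frame in the moving image time series
    image_len: int    # length of the compressed image signal in bytes
    audio_len: int    # length of the compressed audio signal in bytes


class FrameMemory:
    """Fixed-length slots: frame data followed by blank offset data."""

    def __init__(self, slot_size: int) -> None:
        self.slot_size = slot_size
        self.info_table: List[Header] = []  # information table area
        self.data = bytearray()             # data area plus offset padding

    def store(self, time_sec: float, image: bytes, audio: bytes) -> int:
        payload = image + audio
        pad = self.slot_size - len(payload)
        if pad < 0:
            raise ValueError("frame data larger than slot size")
        number = len(self.info_table)
        self.info_table.append(Header(number, time_sec, len(image), len(audio)))
        # Offset data (blank bytes) brings every slot to the same length.
        self.data += payload + bytes(pad)
        return number

    def load(self, number: int) -> Tuple[Header, bytes, bytes]:
        h = self.info_table[number]
        start = number * self.slot_size
        image = bytes(self.data[start:start + h.image_len])
        audio = bytes(self.data[start + h.image_len:start + h.image_len + h.audio_len])
        return h, image, audio
```

Because every slot has the same length, the position of a frame in the data area follows directly from its number, which is one plausible reason for padding frame data to a fixed length.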

(3) Feature amount detection unit 31
The feature amount detection unit 31 detects changes in the feature amounts of the image signal or the audio signal from the plurality of units of frame data stored in the frame memory 20. A feature amount of the image signal is, for example, a subject (moving object) contained in the image of the frame data, that is, in the frame image, or the luminance or saturation of the screen; it must be something from which changes in the image can be appropriately extracted. A feature amount of the audio signal is, for example, the volume (level) or range (frequency band) of the audio accompanying the image; it must be something from which changes in the audio can be appropriately extracted. The unit monitors successive units of frame data for predetermined changes in such feature amounts, that is, sharp changes that exceed a preset threshold, continuity (a state of no change), and the like, and thereby grasps the change characteristics of the images and audio contained in the series of frame data (the moving image data).
The method of detecting feature amounts is described later.
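As an illustration of threshold-based monitoring between successive frames, the following Python sketch uses mean luminance as the feature amount. It is a simplified stand-in under stated assumptions, not the detection method the specification describes later: the function names, the representation of a frame as rows of luminance values, and the threshold value are all hypothetical.

```python
from typing import List, Sequence


def mean_luminance(frame: Sequence[Sequence[int]]) -> float:
    """Average luminance of a frame given as rows of pixel values."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def detect_changes(features: Sequence[float], threshold: float) -> List[int]:
    """Indices of frames whose feature differs from the previous frame
    by more than `threshold` (the first frame after each sharp change)."""
    return [
        i
        for i in range(1, len(features))
        if abs(features[i] - features[i - 1]) > threshold
    ]


# Usage: one feature value per frame, e.g. mean luminance or audio level.
changes = detect_changes([10, 11, 10, 80, 82, 81, 20], threshold=30)
```

The same `detect_changes` function applies unchanged to an audio feature such as per-frame volume, since it only compares successive scalar values.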

(4) Display image selection unit 32
The display image selection unit 32 selects and extracts specific frame data from the plurality of units of frame data stored in the frame memory 20, based on the change characteristics of the image or audio contained in the frame data as detected by the feature amount detection unit 31, and displays and outputs it on the monitor 50 or the printer 60 either via the image processing unit 33 described later or directly. The specific frame data may be frame data containing an image from immediately after a switch in the shooting situation, immediately after a sudden movement of the subject, or the like, as revealed by the change characteristics described above, or it may be the frame data immediately before and immediately after such a switch or movement. In short, it is sufficient to select and extract images that, when displayed and output via the monitor 50 or the printer 60 described later, allow the user of the image processing apparatus to recognize that a change in the feature amount has occurred.
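The two selection policies mentioned above (the frame immediately after a change, or the frames immediately before and after it) can be sketched in Python as follows. The function name `select_frames` and the mode strings are hypothetical; the sketch assumes each change index refers to the first frame after a detected change, so that the preceding index always exists.

```python
from typing import Iterable, List


def select_frames(change_indices: Iterable[int], mode: str = "after") -> List[int]:
    """Choose frame indices to present for each detected change.

    mode="after":            only the frame immediately after the change.
    mode="before_and_after": the frames immediately before and after it.
    """
    if mode not in ("after", "before_and_after"):
        raise ValueError(f"unknown mode: {mode}")
    selected = set()
    for i in change_indices:
        selected.add(i)
        if mode == "before_and_after":
            selected.add(i - 1)
    # Return in time-series order without duplicates.
    return sorted(selected)
```

Selecting both neighbours makes the change itself visible to the viewer, since the frames before and after can be compared directly.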

(5) Image processing unit 33
The image processing unit 33 processes the image contained in the frame data selected and extracted by the display image selection unit 32 into a predetermined representation format.
Processing into a predetermined representation format means, for example, enlarging the size of the image to be displayed, changing its display position, display quality, or image tone, or deforming the image frame (uniform or non-uniform scaling, and so on); that is, changing the image to a format in which it is highlighted compared with other images displayed before, after, or at the same time as it.
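One of the listed forms of emphasis, enlarging the displayed size of a selected frame relative to the others, can be sketched as a layout computation. This Python sketch is illustrative only: the function name `layout_sizes`, the base tile size, and the scale factors are assumptions. Passing different horizontal and vertical factors corresponds to non-uniform deformation of the image frame.

```python
from typing import Collection, List, Tuple


def layout_sizes(
    n_frames: int,
    emphasized: Collection[int],
    base: Tuple[int, int] = (160, 120),
    scale: Tuple[float, float] = (2.0, 2.0),
) -> List[Tuple[int, int]]:
    """Display size (width, height) for each frame in a multi-frame layout.

    Frames listed in `emphasized` are scaled by `scale`; all other
    frames keep the `base` size.
    """
    sizes = []
    for i in range(n_frames):
        sx, sy = scale if i in emphasized else (1.0, 1.0)
        sizes.append((round(base[0] * sx), round(base[1] * sy)))
    return sizes
```

For example, `layout_sizes(4, {2})` keeps three frames at the base size and doubles the third, while `scale=(2.0, 1.0)` would stretch the emphasized frame horizontally only.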

(6) MPU 30, hard disk 40
The hard disk 40 stores the programs executed by the MPU 30 and the data needed for operation. By executing an application program stored on the hard disk 40, the MPU 30 implements the functions of the feature amount detection unit 31, the display image selection unit 32, and the image processing unit 33 described above in software, and performs the series of image processing operations described later, memory management, and control of output to the monitor 50 and the printer 60.
(7) Monitor 50, printer 60
These display or print out the image selected by the display image selection unit 32 or the image processed by the image processing unit 33, and are output devices such as a television or PC monitor, or a printer. Although FIG. 1 shows the monitor 50 and the printer 60 as connected directly to the bus 80, the present invention is not limited to this; they may be a facsimile machine, a personal digital assistant (PDA), a PC, or the like connected over a communication line via a communication interface connected to the bus 80. In the description of the present invention, all operations that output an image, including display output to the monitor 50 and print output to the printer 60, are referred to as "display output" for convenience.

(8) Input unit 70
The input unit 70 consists of the various key switches provided on the image processing apparatus, and generates control signals for the execution of application programs by the MPU 30, image processing, display output to the monitor 50 and the printer 60, and so on. It also serves as the means for selecting feature amounts and image processing methods in the embodiments described later. It includes not only dedicated key switches provided on the image processing apparatus but also, when the present invention is implemented on a PC or the like, various input devices such as a keyboard, a mouse, and a pen tablet.
In the present embodiment, a configuration has been described in which the moving image data is captured with audio information already attached to and combined with the image information. However, the present invention is not limited to this embodiment; the image information and audio information may each be captured and processed as separate signals, as with image and audio information immediately after being shot and recorded by a video camera, an electronic still camera, or the like. In that case, a configuration can be applied in which the captured information is input to the image signal compression unit 14a and the audio signal compression unit 14b via the image A/D 13a and the audio A/D 13b if it is an analog signal, or directly if it is a digital signal.

Next, the processing operation of the image processing apparatus configured as described above is explained with reference to the drawings.
FIG. 4 is a flowchart showing the processing operation of the image processing apparatus according to this embodiment. An outline of the processing operation is given first with reference to the configuration described above, after which each step is explained individually.
As shown in FIG. 4, in steps S101 and S102, the moving image data capturing unit 10 selects and extracts a series of frame images and their accompanying audio information from the input moving image data at predetermined time intervals, compresses and encodes them so that they conform to standards such as JPEG and MPEG respectively, and stores them in a predetermined storage area of the frame memory 20. At this time, the compression-encoded image signal and audio signal are associated with each other, combined into a single unit of frame data, and stored in association with predetermined header information.
Next, in steps S103 and S104, the feature amount detection unit 31 detects the feature amounts contained in the series of frame data stored in the frame memory 20 and grasps the change characteristics of the images and audio.

Next, in step S105, the display image selection unit 32 identifies, based on the above change characteristics of the images and audio, a switch in the shooting situation, an abrupt movement of the subject, or the like, and selects, for example, the frame data containing the image immediately after the change. Next, in step S106, the image processing unit 33 applies processing such as enlarging the displayed image size, changing the display position, display image quality, or image tone, or deforming the image frame, thereby creating a display image that is emphasized relative to the other images. In step S107, the display image created by the image processing unit 33 is displayed and output on the monitor 50, the printer 60, or the like.
As described above, the processing operation of the image processing apparatus in the present embodiment is roughly divided into a moving image data capturing step, a feature amount detection step, a display image selection step, and an image processing step.

Hereinafter, each step will be described with reference to the drawings.
(1) Moving image data capturing step
FIG. 5 is a conceptual diagram showing the moving image data capturing step. The following description uses, as an example of subject movement, a traveling vehicle X and a stopped vehicle Y.
Moving image data shot by a video camera or an electronic still camera, and moving image data such as the video information of a television broadcast, are composite data consisting of a set of frame images arranged in time series and the audio information accompanying them. In this step, therefore, as shown in FIG. 5, the moving image data capturing unit 10 selects and extracts frame images from the series at predetermined time intervals, for example the frame images F0, F2, F4, F6, F8, F10, ... at times T0, T2, T4, T6, T8, T10, ..., which correspond to positions on the time series of the moving image data VD, together with the accompanying audio data A0, A2, A4, A6, A8, A10, .... The frame images selected and extracted may also be all of the frame images constituting the moving image data VD; that is, the extraction is not limited to the above time interval as long as changes in the shooting situation and movements of the subject can be determined from the extracted frame images and audio.
The extracted frame images F0, F2, F4, F6, F8, F10, ... and the accompanying audio data A0, A2, A4, A6, A8, A10, ... are converted into digital image signals and digital audio signals so that the signal processing described later can be carried out easily. They are then subjected to predetermined compression encoding such as JPEG or MPEG in order to use the storage capacity of the frame memory 20 effectively, combined into frame data, associated with header information, and stored sequentially in a predetermined data area.
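The capture step can be illustrated with a minimal sketch. It assumes the frames and audio segments are already available as in-memory byte strings; the names `FrameData` and `capture_frames`, the `step` parameter, and the use of a plain list as a stand-in for the frame memory 20 are hypothetical and do not appear in the specification. Compression encoding is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FrameData:
    """One stored record: header information plus the paired image and audio."""
    time_index: int  # header: position on the time series (0 for T0, 2 for T2, ...)
    image: bytes     # stand-in for the compression-encoded frame image
    audio: bytes     # stand-in for the compression-encoded accompanying audio


def capture_frames(images: Sequence[bytes], audio: Sequence[bytes],
                   step: int = 2) -> List[FrameData]:
    """Pick every `step`-th frame together with the audio recorded at that time."""
    if step < 1:
        raise ValueError("step must be at least 1")
    if len(images) != len(audio):
        raise ValueError("each frame needs its accompanying audio")
    return [FrameData(t, images[t], audio[t]) for t in range(0, len(images), step)]


# Eleven source frames F0..F10 with audio A0..A10, sampled at every second frame.
source_images = [f"F{t}".encode() for t in range(11)]
source_audio = [f"A{t}".encode() for t in range(11)]
frame_memory = capture_frames(source_images, source_audio, step=2)
```

With `step=1` every frame of the moving image data is kept, which corresponds to the case in which all frame images are extracted.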

(2) Feature amount detection step
In this step, the feature amount detection unit 31 detects abrupt changes or sustained states of the image or audio from the plurality of frame data stored in the frame memory 20, and determines the change characteristics of the images and audio contained in the moving image data.
The methods for detecting the feature amount are described below.
(a) First feature amount detection method
As a first example of the feature amount detection method, a technique for detecting the movement of a subject contained in the image can be applied. This method applies the block matching method: the images of frame data at different times are each divided into block areas, block matching against the reference image is performed for each corresponding block area, and the movement of the subject between frames is detected from the coordinate position at which the error is minimized. The block matching method is widely used, for example in the IS 11172-2 standard internationally standardized by ISO/IEC JTC1/SC29/WG11.

The block matching method is now described.
FIG. 6 is a conceptual diagram showing the block matching method applied to the feature amount detection step.
For example, The Institute of Television Engineers (ed.), "Image Information Compression," Ohmsha, p. 92, 1991, describes a block matching method for detecting the movement of a subject contained in moving image data composed of a plurality of consecutive frame images. According to this document, as shown in FIG. 6, pattern matching is performed between blocks (areas) Rn and Rm at specific positions in the image of interest Fn (hereinafter called the current image for convenience) and the image at the immediately preceding time Fm (hereinafter called the previous image for convenience). In the pattern matching, for example, the sum S of the absolute values of the differences between each pixel X_n in block Rn and the pixel X_{n-i} in block Rm shifted by i pixels is obtained according to the following expression {1}, and the amount of movement is detected by searching for the shift position i that minimizes this sum S, that is, the evaluation amount.
S = Σ |X_n - X_{n-i}|  ... {1}
Here, the summation Σ in expression {1} is taken over all pixels X_n belonging to block Rn.
Thus, in the block matching method, the current image is divided into blocks, and for each block the most similar position in the previous image is searched for by pattern matching, so that the movement of a subject contained in the series of images can be detected.
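As an illustration of expression {1}, the following minimal sketch searches a small two-dimensional window for the displacement that minimizes the sum of absolute differences. It is not the implementation of the specification: the two-dimensional search, the function names `block_sad` and `match_block`, the `search` radius, and the representation of images as nested lists of luminance values are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # luminance values, indexed as image[row][column]


def block_sad(cur: Image, prev: Image, top: int, left: int,
              size: int, dy: int, dx: int) -> int:
    """Sum of absolute differences between a block of `cur` and the block of
    `prev` displaced by (dy, dx), in the manner of expression {1}."""
    return sum(
        abs(cur[top + y][left + x] - prev[top + y + dy][left + x + dx])
        for y in range(size)
        for x in range(size)
    )


def match_block(cur: Image, prev: Image, top: int, left: int,
                size: int, search: int) -> Optional[Tuple[int, int, int]]:
    """Return (sad, dy, dx) for the displacement into `prev` with the smallest
    sum, or None if no candidate block fits inside `prev`."""
    height, width = len(prev), len(prev[0])
    best: Optional[Tuple[int, int, int]] = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy and top + dy + size <= height
                      and 0 <= left + dx and left + dx + size <= width)
            if not inside:
                continue  # skip candidates that fall outside the previous image
            score = block_sad(cur, prev, top, left, size, dy, dx)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best


def blank(n: int) -> Image:
    return [[0] * n for _ in range(n)]


# A bright 2x2 patch sits at columns 2-3 in the previous image and at
# columns 4-5 in the current image, i.e. it moved two pixels to the right.
previous, current = blank(8), blank(8)
for row in (2, 3):
    for col in (2, 3):
        previous[row][col] = 9
    for col in (4, 5):
        current[row][col] = 9

result = match_block(current, previous, top=2, left=4, size=2, search=2)
```

The returned displacement points from the block in the current image to its best match in the previous image, so the subject's movement between the two frames is the negative of (dy, dx).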

(b) Second feature amount detection method
As a second example of the feature amount detection method, a technique for calculating the change in the pixel characteristics of a specific region contained in the image can be applied. In this technique, Laplacian processing is first applied to the luminance component of the image, the zero-crossing points of the processed image are detected as region boundary lines, and a portion enclosed by a continuous boundary line (a subject region) is extracted as the specific region. Alternatively, the color components of the image are analyzed, and a continuous portion with little change in hue is replaced with a representative color and extracted as the specific region. The amount of change in the extracted specific region is then calculated between images, which gives the amount of movement of the region as a whole. Furthermore, by setting the specific region to the entire image and calculating the amount of change, changes such as a switch in the shooting situation (scene) can be detected.
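The change calculation in the second method can be sketched as follows, assuming the specific region has already been extracted and is supplied as a Boolean mask. The region extraction itself (Laplacian zero-crossings or hue grouping) is not shown, and the mean absolute luminance difference used here is only one plausible choice of change measure; the function name `region_change` is hypothetical.

```python
from typing import List, Optional

Image = List[List[int]]    # luminance values, indexed as image[row][column]
Mask = List[List[bool]]    # True where a pixel belongs to the specific region


def region_change(prev: Image, cur: Image, mask: Optional[Mask] = None) -> float:
    """Mean absolute luminance difference between two frames.

    With a mask, only pixels of the specific region are compared; with
    mask=None the region is the entire image, which suits scene-switch checks.
    """
    total = 0
    count = 0
    for y, (row_prev, row_cur) in enumerate(zip(prev, cur)):
        for x, (p, c) in enumerate(zip(row_prev, row_cur)):
            if mask is None or mask[y][x]:
                total += abs(c - p)
                count += 1
    return total / count if count else 0.0


frame_a = [[10, 10], [10, 10]]
frame_b = [[10, 10], [10, 50]]
whole_image_change = region_change(frame_a, frame_b)
subject_change = region_change(frame_a, frame_b,
                               mask=[[False, False], [False, True]])
```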

(c) Third feature amount detection method
As a third example of the feature amount detection method, a technique for calculating the change in audio characteristics, such as the level or frequency band of the audio accompanying the images, can be applied.
FIG. 7 is a schematic diagram showing changes in audio characteristics. FIG. 7(a) is a schematic diagram showing the change in the level of the audio accompanying each image, and FIG. 7(b) is a schematic diagram showing the change in the frequency band of the audio.
As shown in FIG. 7(a), when the level of the audio accompanying the images F0 to F10 is detected as the feature amount, the changes in audio characteristics among the engine sound level L2 while vehicle X is traveling, the brake sound level L4 when vehicle X stops, the level L6 when the horn sounds, the engine sound level L8 when vehicle Y starts, and the engine sound level L10 when vehicle X starts are detected based on the audio data A0, A2, A4, A6, A8, and A10.

On the other hand, as shown in FIG. 7(b), when the frequency band of the audio data A0 to A10 accompanying the images F0 to F10 is detected as the feature amount, the changes in audio characteristics among the engine sound frequency B2 while vehicle X is traveling, the brake sound frequency B4 when vehicle X stops, the frequency B6 when the horn sounds, the engine sound frequency B8 when vehicle Y starts, and the engine sound frequency B10 when vehicle X starts are detected based on the audio data A0, A2, A4, A6, A8, and A10.
Therefore, by monitoring the rise and fall of the audio level between consecutive images, changes to a silent state, transitions in the frequency band, and the like, the movements of the vehicles relative to each other can be detected.
The feature amount detection methods described above are examples applicable to the present invention and do not limit the embodiments of the present invention in any way. These feature amount detection methods may be used alone or in any suitable combination.
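The two audio features can be sketched as follows. The specification does not say how the level or the frequency band is measured, so the root-mean-square level and the zero-crossing frequency estimate used here are assumptions chosen for simplicity, and the function names are hypothetical.

```python
import math
from typing import Sequence


def audio_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio segment (0.0 for an empty segment)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dominant_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Rough frequency estimate in Hz from the number of sign changes.

    A pure tone crosses zero twice per cycle, so the estimate is
    crossings / (2 * duration). Noisy or mixed signals need a spectral method.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration = len(samples) / sample_rate
    return crossings / (2.0 * duration)


# One second of a 100 Hz tone sampled at 8000 Hz (phase offset avoids exact zeros).
tone = [math.sin(2 * math.pi * 100 * n / 8000 + 0.5) for n in range(8000)]
tone_frequency = dominant_frequency(tone, 8000)
tone_level = audio_level(tone)
```

Computing both values for each of the audio data A0, A2, ..., A10 yields level and frequency series of the kind plotted in FIG. 7(a) and FIG. 7(b).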

(3) Display image selection step
In this step, the display image selection unit 32 compares the detected feature amount with a preset threshold value or allowable range, and when a feature amount exceeding the threshold value or allowable range appears, selects the image of the frame data having that feature amount as a display image or as an image to be processed in the image processing step described later.
FIG. 8 is a conceptual diagram showing an example of the display image selection step in which frame data is selected based on the change characteristics of the image.
As shown in FIG. 8, for the series of frame images F0 to F10 stored in the frame memory 20, the amounts of movement of vehicles X and Y, the presence or absence of vehicles X and Y as subjects, and the like determined in the feature amount detection step are compared with preset threshold values, and frame images showing a change in the shooting situation or vigorous movement of the subject are selected and extracted. For example, if vehicle X is detected traveling at a constant speed in frame image F2 and an abrupt change in traveling speed due to sudden braking is detected in frame image F4, and the amount of that change is larger than the preset threshold value or than the amounts of change between the other frame images, frame image F4 is judged to be a highly topical image within the moving image data and is selected.

Alternatively, focusing on the presence of vehicle X as a subject, if there is a frame image F0 in which vehicle X is absent and frame images F2 to F10 in which vehicle X is always present, the frame images F0 and F2 before and after vehicle X appears are judged to be highly topical images within the moving image data and are selected.
The method of selecting frame data based on the change characteristics of the image is not limited to the techniques described above; an abrupt change in the luminance or saturation of the subject, for example, may also be used. In short, any method is acceptable as long as it monitors a feature amount from which changes in the shooting situation or movements of the subject can be determined.
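The threshold comparison can be sketched as below. The sketch assumes one scalar feature value per stored frame and selects a frame when its value differs from that of the preceding frame by more than a threshold; the function name `select_by_change`, the example values, and the threshold are hypothetical and are not taken from the specification.

```python
from typing import List, Sequence


def select_by_change(features: Sequence[float], threshold: float) -> List[int]:
    """Indices of frames whose feature differs from the previous frame's
    feature by more than `threshold`."""
    return [
        k for k in range(1, len(features))
        if abs(features[k] - features[k - 1]) > threshold
    ]


# Hypothetical movement amounts of vehicle X for stored frames F0, F2, ..., F10.
movement = [12.0, 12.0, 1.0, 0.0, 0.0, 3.0]
selected = select_by_change(movement, threshold=5.0)
```

Here the only selected index is 2, the third stored frame (F4 in the numbering of FIG. 5), where the movement amount drops sharply as it would under sudden braking.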

Next, an example of the display image selection step in which frame data is selected based on the change characteristics of the audio is described with reference to FIG. 7.
As shown in FIG. 7, the actions of vehicles X and Y and the changes in the shooting situation in the moving image data are also recorded as audio. Therefore, the audio level, frequency band, and the like of the engine sounds, horn, and other sounds emitted by vehicles X and Y, as determined in the feature amount detection step, are compared with preset threshold values, and frame images showing a change in the shooting situation or having high topicality are selected and extracted. For example, the audio levels emitted by vehicles X and Y are as shown in FIG. 7(a). If the audio level L6 when the horn sounds in audio data A6 is extremely large compared with the traveling sound level L2 of vehicle X in audio data A2 or with a preset threshold value (allowable range), the frame image F6 corresponding to audio data A6 is judged to be a highly topical image within the moving image data and is selected.

Focusing instead on the frequency bands emitted by vehicles X and Y, as shown in FIG. 7(b), if the frequency B4 of the brake sound accompanying the sudden stop of vehicle X in audio data A4 is extremely high compared with the frequency B2 of the traveling sound of vehicle X in audio data A2 or with a preset threshold value (allowable range), the frame image F4 corresponding to audio data A4 is judged to be a highly topical image within the moving image data and is selected.
The method of selecting frame data based on the change characteristics of the audio is not limited to the techniques described above; an abrupt change in the audio level or frequency, or the duration of an unchanged audio state (for example, a silent state), may also be used.
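The duration-based criterion mentioned last can be sketched as a search for runs of consecutive frames whose audio level stays at or below a silence level. The function name `sustained_runs`, the silence level, and the minimum run length are hypothetical parameters introduced for the example.

```python
from typing import List, Sequence, Tuple


def sustained_runs(levels: Sequence[float], silence_level: float,
                   min_length: int) -> List[Tuple[int, int]]:
    """Return (first, last) frame indices of every run of at least
    `min_length` consecutive frames whose level is <= `silence_level`."""
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current quiet run began, if any
    for k, level in enumerate(levels):
        if level <= silence_level:
            if start is None:
                start = k
        else:
            if start is not None and k - start >= min_length:
                runs.append((start, k - 1))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(levels) - start >= min_length:
        runs.append((start, len(levels) - 1))
    return runs


levels = [5.0, 0.0, 0.0, 0.0, 6.0, 0.0, 7.0, 0.0, 0.0]
quiet = sustained_runs(levels, silence_level=0.5, min_length=2)
```

A frame at the start or end of such a run could then be treated as a candidate display image in the same way as a frame that exceeds a level threshold.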

(4) Image processing step
In this step, the image processing unit 33 processes the selected frame image into a predetermined form of expression that emphasizes the change in the shooting situation or the movement of the subject, and the result is displayed and output via the monitor 50 or the printer 60. The degree of enhancement applied by the image processing unit 33 to the image displayed and output on the monitor 50, the printer 60, or the like may be varied according to the degree of change in the feature amount. As for the method of display on the monitor 50, the printer 60, or the like, the enhanced image may be displayed collectively together with the other images, or it may be displayed and output in preference to the other images. Furthermore, when images are output in a scrolling display, a switch in the shooting situation or an abrupt movement of the subject may be announced and brought to the viewer's attention by flashing, character display, an alarm, or the like.
The enhancement processing applied in the image processing step is described below.

(a) First enhancement processing
FIG. 9 is a conceptual diagram showing an example of the first enhancement processing in the image processing step.
In the first enhancement processing, the shape of the display frame (panel) used when the frame image selected by the display image selection unit 32 is displayed and output on the monitor 50, the printer 60, or the like is changed to a distinctive shape different from the usual one.
As shown in FIG. 9, for example, when the display image selection unit 32 selects the highly topical frame images F4, F6, and F10, which show sudden braking, horn sounding, and sudden starting in the movement of vehicle X, the display frames of these frame images are given a trapezoidal or parallelogram shape, or a mount frame prepared in advance, in place of the usual square or rectangle. In this way, still images that are emphasized at each change in the movement of vehicle X can be displayed and output.

(b) Second enhancement processing
FIG. 10 is a conceptual diagram showing an example of the second enhancement processing in the image processing step.
In the second enhancement processing, the display size used when the frame image selected by the display image selection unit 32 is displayed and output is changed to a size different from the usual one.
As shown in FIG. 10, for example, when the display image selection unit 32 selects the highly topical frame image F6, which shows the horn sounding in the movement of vehicle X, the display size is changed so that this image is displayed larger than the other frame images. In this way, a still image that emphasizes the change in the movement of vehicle X can be displayed and output.

(c) Third enhancement processing
FIG. 11 is a conceptual diagram showing an example of the third enhancement processing in the image processing step.
In the third enhancement processing, the display size or the degree of deformation of the display frame of the frame image selected by the display image selection unit 32 is varied according to the degree of change in the feature amount.
As shown in FIG. 11, for example, the display size is changed in steps or continuously in correspondence with the strength of the audio level emitted by vehicle X as detected by the feature amount detection unit 31. As one example, the audio level shown in FIG. 7 is calculated as a magnification relative to a reference level, and that magnification (for example, 1.2 times) is applied directly to the reference display size. As a result, the frame images corresponding to audio data with a high audio level, such as the sudden braking sound and the horn sound of vehicle X in audio data A4 and A6, are processed to a display size that corresponds to the audio level, so that the change characteristics of the audio can be displayed and output by means of still images. Although the audio level of the subject has been used here as the feature amount that determines the degree of enhancement, this is only an example and does not limit the embodiments of the present invention in any way. The degree of enhancement may therefore also be varied according to, for example, the amount of movement of the subject.
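The magnification rule in this example can be sketched as follows. The ratio of the audio level to the reference level is applied directly to the reference display size, as described above; the lower clamp at 1.0 (so that quiet frames are never shrunk), the upper limit `max_scale`, and the function name `emphasized_size` are assumptions added for the sketch.

```python
from typing import Tuple


def emphasized_size(base_width: int, base_height: int, level: float,
                    reference_level: float,
                    max_scale: float = 2.0) -> Tuple[int, int]:
    """Scale the reference display size by level / reference_level.

    The scale is clamped to the range [1.0, max_scale]; both limits are
    assumptions of this sketch rather than values from the specification.
    """
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    scale = level / reference_level
    scale = max(1.0, min(scale, max_scale))
    return round(base_width * scale), round(base_height * scale)


# An audio level 1.2 times the reference enlarges a 160x120 frame to 192x144.
horn_frame_size = emphasized_size(160, 120, level=6.0, reference_level=5.0)
```

A stepwise variant would round `scale` to a small set of allowed magnifications before applying it.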

The image processing technique is not limited to the enhancement processing described above; it may also enhance the image quality or display gradation of the selected frame image relative to normal display, or change its display position.
According to the present embodiment, which has the series of steps described above, a switch in the shooting situation or a movement of the subject in the moving image data is detected as a change in a feature amount contained in the image or audio, and a representative image in the moving image data can be selected based on this feature amount and displayed with emphasis. Still images can therefore be extracted easily from the moving image data and displayed and output in a form of image expression that allows the shot content to be recognized intuitively. In the present embodiment, an example has been shown in which, when the feature amount detected in the feature amount detection step shows a predetermined change, the frame image having that feature amount is selected as a display image in the display image selection step and is displayed and output (after image processing). However, the present invention is not limited to this; processing may be applied only to the frame images that showed the predetermined change in feature amount in the feature amount detection step, and these may be displayed and output, sequentially or collectively, together with the other frame images to which no processing has been applied.

<Second Embodiment>
Next, a second embodiment of the image processing apparatus according to the present invention will be described with reference to the drawings.
In this embodiment, the first embodiment described above is modified so that the user of the image processing apparatus can freely select and set the feature amount and the image processing method.
That is, in the image processing apparatus shown in FIG. 1, the user of the image processing apparatus inputs a feature amount selection command and a processing method selection command by operating the input unit 70 or via a communication line or the like (not shown). The user thereby selects frame images satisfying a desired condition on the basis of a change in the feature amount of the image signal, a change in the feature amount of the audio signal, or changes in the feature amounts of both the image signal and the audio signal, and further selects the processing method for the frame images, thus freely setting the form of expression on the monitor 50, the printer 60, or the like.

The processing operation of this embodiment is described below with reference to the flowchart of FIG. 12. The description of steps equivalent to those of the first embodiment is simplified.
As shown in FIG. 12, in steps S201 and S202, a series of frame images and the accompanying audio information are selected and extracted from the moving image data at predetermined time intervals, subjected to compression encoding, and stored in a predetermined storage area of the frame memory 20. Next, in steps S203, S204, and S205, a change in the feature amount selected and specified by the user via the input unit 70 or the like is detected for the series of frame images and audio stored in the frame memory 20, and the change characteristics of the image or audio are determined. That is, for the moving image data that has been shot and recorded, a frame image having the desired content is selected, displayed, and output in the subsequent steps on the basis of the change in the feature amount specified by the user, in other words the change characteristics of the image or audio.

Next, in step S206, the display image selection unit 32 selects, based on the above change characteristics of the image or audio, a frame image containing a switch in the shooting situation, an abrupt movement of the subject, or the like. In steps S207 and S208, the selected frame image is subjected to the image processing selected and specified by the user via the input unit 70 or the like. Here, any processing method is selected from among a plurality of image processing operations including the enhancement processing described above. For example, for a frame image showing a switch in the shooting situation or a change in the subject, the display frame is controlled by making the display size larger than that of a normal frame image, deforming the display frame, or moving the display position to the center of the monitor screen, or the display image quality is controlled by enhancing the luminance or saturation of the display screen or by flashing.

Then, in step S209, the frame image processed by the image processing unit 33 is displayed and output on the monitor 50, the printer 60, or a facsimile apparatus, PDA, personal computer, or the like connected via a communication line or the like. According to the present embodiment, which has the series of steps described above, the user can select any feature amount and any image processing, so that a frame image having the desired content (change in feature amount) can be displayed and output in a highly visible form of expression. Although the present embodiment has been described for the case in which the user freely selects and sets both the feature amount and the image processing method, the item to be selected and set may be either the feature amount or the image processing method alone.
It goes without saying that the image processing apparatus and image processing method shown in each embodiment can be realized satisfactorily by being incorporated into image processing equipment such as a video player, facsimile apparatus, or printer, in addition to the video camera, electronic still camera, and personal computer described above, or by being provided as application software.

FIG. 1 is a block diagram showing a first embodiment of an image processing apparatus according to the present invention.
FIG. 2 is a block diagram showing the schematic configuration of the moving image data capturing unit.
FIG. 3 is a conceptual diagram showing the internal areas of the frame memory.
FIG. 4 is a flowchart showing the processing operation in the first embodiment.
FIG. 5 is a conceptual diagram showing the moving image data capturing step.
FIG. 6 is a conceptual diagram showing the block matching method applied to the feature amount detection step.
FIG. 7 is a schematic diagram showing changes in audio characteristics.
FIG. 8 is a conceptual diagram showing an example of the display image selection step in which frame data is selected based on the change characteristics of the image.
FIG. 9 is a conceptual diagram showing an example of the first enhancement processing in the image processing step.
FIG. 10 is a conceptual diagram showing an example of the second enhancement processing in the image processing step.
FIG. 11 is a conceptual diagram showing an example of the third enhancement processing in the image processing step.
FIG. 12 is a flowchart showing the processing operation in the second embodiment.
FIG. 13 is a conceptual diagram of a technique for selecting and displaying arbitrary frame images from a plurality of frame images constituting moving image information.

Explanation of symbols

10 Moving image data capture unit
20 Frame memory
30 MPU
31 Feature amount detection unit (feature amount detection means)
32 Display image selection unit (frame image selection means)
33 Image processing unit (frame image processing means)
40 Hard disk
50 Monitor (image output means)
60 Printer (image output means)
70 Input unit (feature amount selection means, processing method selection means)
80 Bus

Claims

An image processing apparatus comprising:
feature amount detection means for detecting a change in a feature amount included in a plurality of frame images constituting moving image information;
frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means;
frame image enhancement means for performing, on the specific frame image selected by the frame image selection means, an enhancement process that changes a display frame shape; and
image output means for displaying and outputting, as still images, the plurality of frame images including the specific frame image subjected by the frame image enhancement means to the enhancement process that changes the display frame shape.

An image processing apparatus comprising:
feature amount detection means for detecting a change in a feature amount included in a plurality of frame images constituting moving image information;
frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means;
frame image enhancement means for applying a predetermined enhancement process to the specific frame image selected by the frame image selection means; and
print output means for printing out, as still images, the plurality of frame images including the specific frame image subjected to the enhancement process by the frame image enhancement means.

An image processing apparatus comprising:
moving image information input means for inputting moving image information including a plurality of frame images;
moving image information extraction means for extracting frame images at predetermined time intervals from the moving image information input by the moving image information input means;
feature amount detection means for detecting a change in a feature amount included in the plurality of frame images extracted by the moving image information extraction means;
frame image selection means for selecting a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection means;
frame image enhancement means for applying a predetermined enhancement process to the specific frame image selected by the frame image selection means; and
image output means for displaying and outputting, as still images, the plurality of frame images extracted by the moving image information extraction means, including the specific frame image subjected to the enhancement process by the frame image enhancement means.

The image processing apparatus according to claim 1, wherein the feature amount is a movement of a subject included in a plurality of frame images constituting the moving image information.

The image processing apparatus according to claim 1, wherein the feature amount is audio information associated with each of a plurality of frame images constituting the moving image information.

An image processing apparatus comprising:
feature amount detection means for detecting the degree of change in audio information accompanying each frame image constituting moving image information;
frame image enhancement means for changing the form of each frame image constituting the moving image information according to the degree of change in the audio information detected by the feature amount detection means; and
image output means for displaying and outputting, as still images, the frame images including a frame image whose form has been changed by the frame image enhancement means,
wherein the plurality of frame images output by the image output means have different forms.

An image processing method comprising:
a step of causing a feature amount detection unit to detect a change in a feature amount included in a plurality of frame images constituting moving image information;
a step of causing a display image selection unit to select a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection unit;
a step of causing an image processing unit to perform, on the specific frame image selected by the display image selection unit, an enhancement process that changes a display frame shape; and
a step of causing an image output unit to display and output the plurality of frame images including the specific frame image subjected to the enhancement process by the image processing unit.

An image processing method comprising:
a step of causing a feature amount detection unit to detect a change in a feature amount included in a plurality of frame images constituting moving image information;
a step of causing a display image selection unit to select a specific frame image from the plurality of frame images based on the change in the feature amount detected by the feature amount detection unit;
a step of causing an image processing unit to perform a predetermined enhancement process on the specific frame image selected by the display image selection unit; and
a step of printing out, as still images, the plurality of frame images including the specific frame image subjected to the enhancement process by the image processing unit.