JP2010176240A

JP2010176240A - Video analysis device, video analysis program, video analysis control device, and video analysis control program

Info

Publication number: JP2010176240A
Application number: JP2009016141A
Authority: JP
Inventors: Yasuhiko Miyazaki; 泰彦宮崎; Akira Kojima; 明小島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2009-01-28
Filing date: 2009-01-28
Publication date: 2010-08-12

Abstract

PROBLEM TO BE SOLVED: To achieve a video analysis device with excellent versatility and scalability by allowing a control unit to pass each video analysis unit the frames it requires through uniform processing.

SOLUTION: A video analysis unit 10 has a feature amount output unit 12 that implements a feature extraction technique, and a next-frame designation unit 11 that, each time image data is input, designates next-frame information, namely the frame number or frame time information required next. A control unit 14 passes frame data to a video analysis unit 10 only when the frame designated by the next-frame designation unit 11 of that video analysis unit 10 has been obtained by a frame data acquisition unit 13.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a video analysis device and a video analysis control device for analyzing various feature amounts from video data, and to programs for them.

Various techniques have long existed for analyzing video data and automatically outputting characteristic sections of the video and the values associated with them (hereinafter referred to as "feature amounts").

Patent Document 1 describes a technique for detecting video cut points from a video data sequence. This method can output cut points (cut sections), including scene changes that progress slowly over time.

Patent Document 2 describes a technique for estimating camera parameters from an image data sequence (used there in the same sense as video data). This method can output, from video data, the sections in which camera operations such as pan and tilt occur, together with their parameter values.

Patent Document 3 describes a character region extraction technique that extracts character portions from a frame image as connected pixel regions. Applying this method to the frame images of video data makes it possible to output the video sections that contain character regions, commonly called telops, together with the telop position information.

Patent Document 4 describes a technique for detecting moving-object close-up frame images (also called up-shots) from video. This method can output the video frame sections in which a moving object is shown in close-up and appears relatively large.

What these techniques have in common is that they treat video as a sequence of image data (called frames), take multiple frames as input data, and calculate and output feature amounts such as cuts, camera work, telops, and moving-object up-shots. It has therefore been possible to combine these techniques with a single frame data acquisition unit (a function that acquires video as frame data) to build a video analysis device or video analysis program that outputs various feature amounts simultaneously for a single piece of video data. In practice, acquiring frame data from video consumes a large amount of computer resources such as CPU and memory, so extracting multiple feature amounts in a single decoding pass offers a significant benefit.

Video data consists of many frame images. For example, in standard 30 fps (frames per second) video data, each frame is an image covering about 33 msec (milliseconds). Implementing the techniques of Patent Documents 1, 2, 3, and 4 requires substantial computational resources such as CPU and memory. To improve execution speed and efficiency, embodiments therefore often do not process every frame at 33 msec intervals, but instead acquire frame images at a period such as every 100 msec or every 500 msec and run each technique's extraction processing on those.
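The arithmetic behind such sampling periods can be illustrated with a short sketch. The function name `sampled_timestamps` and the rule "take the first frame at or after each due time" are assumptions made for illustration; the publication does not prescribe how a given embodiment picks frames.

```python
def sampled_timestamps(fps, interval_ms, duration_ms):
    """Return the timestamps (rounded to msec) of the frames that would be
    processed when one frame is taken at or after each interval boundary.

    Hypothetical helper for illustration only.
    """
    picked = []
    next_due = 0.0
    n = 0
    # Frame n of an fps-rate stream starts at n * 1000 / fps msec.
    while n * 1000 / fps < duration_ms:
        t = n * 1000 / fps
        if t >= next_due:
            picked.append(round(t))
            next_due = t + interval_ms  # skip frames until the next due time
        n += 1
    return picked
```

At 30 fps, a 500 msec period processes 2 of the 30 frames in each second, whereas a period of 0 processes every frame.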

For example, a very short cut section of 1 second or less is not something the user can perceive in the first place, and should rather be treated as momentary noise. For cut section extraction, therefore, practical problems are unlikely to arise even if cut points are extracted from frames taken every 300 msec.

The same applies to camera work. Because camera work feature extraction requires comparatively more resources than cut detection, it is more practical to widen the interval further and process every 500 msec.

Moving-object up-shot detection, on the other hand, searches for the frame that captures the subject in the tightest close-up. In sections judged to contain a moving object, it therefore processes every frame at 33 msec intervals as input, just as a user would search by stepping through the video frame by frame.

For telops, as described in paragraphs [0031] to [0037] of Patent Document 3, inputting multiple temporally consecutive frame images and creating an averaged color image is known to suppress false telop detection. This requires control such as processing four consecutive images every 500 msec.

Patent Document 1: Japanese Patent No. 2839132
Patent Document 2: Japanese Patent No. 3408117
Patent Document 3: Japanese Patent No. 3467195
Patent Document 4: JP 2006-244074 A

As described above, there are many techniques that treat video as consecutive frames and output feature amounts using multiple temporally adjacent frames as input data, and techniques for extracting new feature amounts are expected to be developed in the future as well.

However, it becomes increasingly difficult to combine video analysis devices or programs that implement these individual feature extraction techniques into a device or program that outputs various feature amounts simultaneously. Because each feature extraction technique requires frame data at a different interval, the device or program that controls the whole must call each video analysis unit while managing each of those intervals. Furthermore, whenever a previously unknown feature extraction technique is developed, the control unit, that is, the device or program controlling the whole, must be modified to accommodate it.

FIG. 7 illustrates the problem addressed by the present invention and shows a configuration example of a device that combines individual feature extraction techniques to output various feature amounts simultaneously. FIG. 8 is a processing flowchart of the control unit shown in FIG. 7.

When video analysis devices that implement the individual feature extraction of the prior art are combined into a device that outputs various feature amounts simultaneously, the configuration would be, for example, as shown in FIG. 7. In FIG. 7, the CUT video analysis unit 101 detects cut points as described in Patent Document 1, and has a CUT feature amount output unit 102 that outputs the feature amounts related to the detected cut points (hereinafter referred to as "CUT feature amounts").

The CAM video analysis unit 111 estimates camera parameters as described in Patent Document 2, and has a CAM feature amount output unit 112 that outputs the feature amounts related to camera work obtained from the estimation (hereinafter referred to as "CAM feature amounts").

The TLP video analysis unit 121 detects telops as described in Patent Document 3, and has a TLP feature amount output unit 122 that outputs the feature amounts related to the detected telops (hereinafter referred to as "TLP feature amounts").

The UPS video analysis unit 131 detects moving-object close-up frame images as described in Patent Document 4, and has a UPS feature amount output unit 132 that outputs the feature amounts related to the detected moving-object close-up frame images (hereinafter referred to as "UPS feature amounts").

The frame data acquisition unit 140 decodes compression-encoded video data into frame images and passes the decoded frame data to the control unit 150. The control unit 150 passes the frame data obtained from the frame data acquisition unit 140 to the CUT video analysis unit 101, the CAM video analysis unit 111, the TLP video analysis unit 121, and the UPS video analysis unit 131 at the appropriate timing, and has each of these units analyze the video.

The processing performed by the control unit 150 is described with reference to FIG. 8. In FIG. 8, TScut is a memory area holding the time stamp [msec] that the CUT video analysis unit 101 will process next, TScam is a memory area holding the time stamp that the CAM video analysis unit 111 will process next, and TStlp is a memory area holding the time stamp that the TLP video analysis unit 121 will process next. Because the UPS video analysis unit 131 analyzes every frame, a memory area (TSups) holding its next time stamp is not needed. Ctlp is a memory area holding a counter of the number of frame images to be processed consecutively from that point on.

The control unit 150 first initializes each function, for example by opening the video file that stores the video data to be analyzed and allocating buffer areas (step S101). Next, it sets the time stamp variables TSxxx and the counter Ctlp all to 0 (step S102). It then acquires frame data from the frame data acquisition unit 140 (step S103).

The control unit 150 then repeats the processing of steps S105 to S116 described below, followed by step S103, until the data of the video file is exhausted (step S104).

Let the time stamp of the acquired frame be t [msec]. First, the control unit determines whether the time stamp t has reached TScut (step S105). If not, the process proceeds to step S108. If it has, the frame data is passed to the CUT video analysis unit 101 (step S106), and TScut is then set to t plus 300 (step S107). In other words, the next frame to be processed is set to the frame 300 msec later.

Next, the control unit determines whether the time stamp t has reached TScam (step S108). If not, the process proceeds to step S111. If it has, the frame data is passed to the CAM video analysis unit 111 (step S109), and TScam is then set to the current time stamp t plus 500 (step S110).

Next, the control unit determines whether Ctlp is greater than 0 or the time stamp t has reached TStlp (step S111). If the result is false, the process proceeds to step S116. If it is true, the frame data is passed to the TLP video analysis unit 121 (step S112). The control unit then determines whether Ctlp is smaller than 3 (step S113); if so, Ctlp is incremented by 1 (step S114). Otherwise, TStlp is set to the current time stamp t plus 500 and Ctlp is reset to 0 (step S115).

The frame data of every acquired frame is passed to the UPS video analysis unit 131 (step S116), so the UPS video analysis unit 131 analyzes all frames. The control unit 150 then returns to step S103, acquires the frame data of the next frame from the frame data acquisition unit 140, and repeats the same processing.

When the video file has ended (step S104), the control unit performs the termination processing for each function, such as releasing the buffer areas and closing the video file (step S117).
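The flow of FIG. 8 can be sketched roughly as follows. This is an illustrative reading rather than the actual implementation: the function and parameter names are hypothetical, frames are modeled as (timestamp, image) pairs, the analysis units are plain callables, and "t has reached TSxxx" is interpreted as `t >= TSxxx`.

```python
def run_conventional_controller(frames, cut, cam, tlp, ups):
    """Hard-coded dispatch loop in the style of FIG. 8 (illustrative sketch).

    frames: iterable of (timestamp_ms, image) pairs.
    cut, cam, tlp, ups: callables taking (timestamp_ms, image).
    """
    ts_cut = ts_cam = ts_tlp = 0   # S102: reset time stamp variables
    c_tlp = 0                      # S102: reset consecutive-frame counter
    for t, image in frames:        # S103/S104: fetch frames until end of file
        if t >= ts_cut:            # S105 ("reached" read as >=, an assumption)
            cut(t, image)          # S106
            ts_cut = t + 300       # S107: CUT-specific 300 msec interval
        if t >= ts_cam:            # S108
            cam(t, image)          # S109
            ts_cam = t + 500       # S110: CAM-specific 500 msec interval
        if c_tlp > 0 or t >= ts_tlp:   # S111
            tlp(t, image)              # S112
            if c_tlp < 3:              # S113
                c_tlp += 1             # S114: still inside a run of 4 frames
            else:
                ts_tlp = t + 500       # S115: schedule the next run
                c_tlp = 0
        ups(t, image)              # S116: UPS receives every frame
```

Every interval (300, 500) and the four-frame telop rule live inside the loop, which is the coupling that the following paragraphs identify as the problem.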

As is clear from the above, simply adding a control unit 150 that passes frame data to the various conventional video analysis units 101 to 131 introduces processing specific to each of the video analysis units 101 to 131 into the flow of the control unit 150, and complicated control becomes necessary. The control unit 150 therefore cannot perform unified, uniform processing, which greatly impairs the extensibility of the video analysis device or video analysis program as a whole.

The present invention aims to solve the above problem. Its object is to enable the construction of a video analysis system with excellent versatility and extensibility by providing a mechanism through which the control unit can pass each of the various video analysis units the frames it requires using uniform processing.

To solve the above problem, the present invention provides each video analysis unit with a next-frame designation function in addition to the feature amount output function that implements its feature extraction technique. The control unit passes frame data to a video analysis unit only when the frame designated by that unit's next-frame designation function has been acquired from the frame data acquisition unit.

More specifically, the video analysis device according to the present invention comprises: a frame data acquisition unit that acquires video as image data for each frame; one or more video analysis units, each having a feature amount output unit that, as the image data for each frame is input sequentially, outputs feature amounts, namely characteristic sections of the video or the values associated with them, and a next-frame designation unit that, each time image data is input, designates next-frame information, namely the frame number or frame time information required next; and a control unit that, based on the next-frame information designated by each video analysis unit, acquires the image data of the frames that each video analysis unit requires from the frame data acquisition unit and sends it to that video analysis unit.

These units can be implemented on a single computer, but the video analysis units and the control unit can also be implemented on different computers connected by a network. In that case, the frame data acquisition unit can be implemented on the same computer as the control unit or on yet another computer.

According to the present invention, the control unit can pass each video analysis unit the frames it requires without knowing in advance the frame data acquisition interval specific to that unit. Therefore, even when a previously unknown feature extraction technique is developed, a mechanism that outputs multiple feature amounts simultaneously can be realized easily without any special modification of the control unit.

FIG. 1 is a diagram showing a configuration example of a video analysis device according to an embodiment of the present invention.
FIG. 2 is a processing flowchart of the frame data acquisition unit.
FIG. 3 is a processing flowchart of the control unit.
FIG. 4 is a processing flowchart of a video analysis unit that detects cut points.
FIG. 5 is a processing flowchart of a video analysis unit that detects telops.
FIG. 6 is a diagram showing an example of a video analysis device configured as a network system.
FIG. 7 is a diagram showing a configuration example of a device, used to explain the problem addressed by the present invention.
FIG. 8 is a processing flowchart of the control unit shown in FIG. 7.

The following mainly describes, as an example, the case where each unit of the present invention is implemented using computer hardware such as a CPU and memory together with software programs.

FIG. 1 shows a configuration example of a video analysis device according to an embodiment of the present invention. The video analysis device 1 comprises a plurality of video analysis units 10[0], 10[1], 10[2], ... that each extract a different feature amount from video data, a frame data acquisition unit 13 that acquires the video to be analyzed as image data for each frame, and a control unit 14. Because the video analysis device 1 is general-purpose, it also functions correctly with a single video analysis unit 10 rather than several.

The video analysis units 10 respectively include next-frame designation units 11[0], 11[1], 11[2], and so on, and feature amount output units 12[0], 12[1], 12[2], and so on.

Each time a frame is input to the video analysis unit 10, the next-frame designation unit 11 notifies the control unit 14 of the next-frame information, that is, the frame number or frame time information required next. The feature amount output unit 12 takes the image data for each frame as sequential input and outputs feature amounts such as characteristic sections of the video and the values associated with them.

Based on the next-frame information designated by each video analysis unit 10, the control unit 14 acquires the frame data that each video analysis unit 10 requires from the frame data acquisition unit 13 and sends it to that video analysis unit 10.

In this example, the video analysis units 10, the frame data acquisition unit 13, and the control unit 14 of this embodiment are software modules executed on a single computer, and the interactions between the components are described as being performed in the form of function calls within that software.

The frame data acquisition unit 13 is a software module that takes as input a file that has been encoded and compressed in a format such as MPEG and decodes it into frame images. Many such software implementations exist.

FIG. 2 shows a processing flowchart of the frame data acquisition unit 13. As shown in FIG. 2, the frame data acquisition unit 13 consists mainly of three phases, "initialization", "frame acquisition", and "finalization", and provides its functions by being invoked through successive function calls from the higher-level software (the control unit 14). The details are as follows.

First, when the control unit 14 calls the "initialization" function with the name of the target video file as an argument, the frame data acquisition unit 13 performs initialization operations: it opens the file specified by the argument (step S10) and allocates the buffer areas it needs internally (step S11).

Thereafter, the control unit 14 repeatedly calls the "frame acquisition" function with the address of a storage area within the control unit 14 as an argument. On each call, the frame data acquisition unit 13 reads one frame of decoded frame data from the video file into its internal buffer area (step S12). If the video file has ended, it notifies the calling control unit 14 of the end of file (step S13). If frame data was obtained from the video file, it copies that one frame of data to the specified storage area address within the control unit 14 (step S14) and returns to the control unit 14. The returned frame data includes a time stamp, which is temporal index information for the frame image within the video file. Depending on how the frame data acquisition unit 13 is implemented, the data may include only an index number indicating the frame's position from the beginning rather than temporal information. In that case, multiplying the index by the reciprocal of the frame rate (the number of frames per unit time) yields a value that can be treated in the same way as a time stamp.

When all analysis of the video has finished and the control unit 14 calls the "finalization" function, the frame data acquisition unit 13 closes the video file (step S15), releases the buffer areas allocated at initialization (step S16), and ends its processing.
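The three phases of FIG. 2 can be sketched as a small class. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, a list of already-decoded images stands in for the video file and decoder, and the function returns the frame instead of writing it to a caller-supplied storage address. It also shows the conversion from a frame index to a time stamp via the reciprocal of the frame rate.

```python
class FrameDataAcquirer:
    """Stand-in for the frame data acquisition unit 13 (illustrative only)."""

    END_OF_FILE = None  # sentinel returned when no frames remain (S13)

    def __init__(self, fps=30.0):
        self.fps = fps
        self._frames = None
        self._index = 0

    def initialize(self, decoded_frames):
        # Stands in for opening the file (S10) and allocating buffers (S11).
        self._frames = list(decoded_frames)
        self._index = 0

    def get_frame(self):
        # S12: read the next decoded frame, or report end of file (S13).
        if self._index >= len(self._frames):
            return self.END_OF_FILE
        image = self._frames[self._index]
        # Frame index times the reciprocal of the frame rate, in msec.
        timestamp_ms = self._index * 1000.0 / self.fps
        self._index += 1
        return timestamp_ms, image  # S14: hand the frame to the caller

    def finalize(self):
        # Stands in for closing the file (S15) and releasing buffers (S16).
        self._frames = None
        self._index = 0
```

A caller would invoke `initialize` once, call `get_frame` until it returns `END_OF_FILE`, and then call `finalize`.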

FIG. 3 shows the processing flow of the control unit 14. In the present invention, the multiple video analysis units 10 that take frames as input data and output feature amounts can be generalized, so the N video analysis units 10 are denoted video analysis unit 10[0], video analysis unit 10[1], ..., video analysis unit 10[N-1]. Concretely, depending on the purpose of use, the video analysis unit 10[0] may be a cut point detection video analysis unit, the video analysis unit 10[1] a telop detection video analysis unit, and so on. Such definitions can, for example, be written in a configuration file (not shown) of the video analysis device 1, which is read when the control unit 14 is initialized.

The control unit 14 holds in the memory area 15 an array TS[] of N elements corresponding to the N video analysis units 10. Using the values in this array, it decides in step S25 of FIG. 3 whether to actually pass the frame data to each video analysis unit 10 and call its analysis function. When the analysis function is called, the value of TS[i] is updated in step S27 with the function's return value (or, in another embodiment, an output parameter value). This allows the control unit 14 to manage, in a general-purpose manner, the frame that each video analysis unit 10 requires next.

The processing executed by the control unit 14 is as follows. First, it initializes each function (step S20). This includes calling the "initialization" function described above, allocating the internal buffer area used as the frame data storage area, and, where necessary, issuing initialization instructions to each video analysis unit 10. It also sets all elements TS[0] to TS[N-1] of the time stamp array in the memory area 15 to 0 (step S21).

The control unit 14 then calls the "frame acquisition" function with the storage area address of the internal buffer area as an argument and acquires frame data from the frame data acquisition unit 13 (step S22). It checks whether the return value of the function call is "end of file" (step S23); if so, the process proceeds to step S29.

If the return value is not "end of file" but "acquired normally", and the frame data has been stored in the internal buffer area, the control unit repeats steps S25 to S27 below while incrementing the loop counter i from 0 (step S24) until i = N-1 has been processed (step S28).

Let the time stamp of the acquired frame be t [unit: msec]. In each iteration, the control unit first determines whether the time stamp t is greater than or equal to TS[i] (step S25). If t is smaller than TS[i], steps S26 and S27 are skipped. If t is greater than or equal to TS[i], the frame data stored in the internal buffer area is passed to the i-th video analysis unit 10[i] (step S26). The value of the next-frame information designated by the next-frame designation unit 11[i] of the video analysis unit 10[i] is then set in the time stamp array element TS[i] (step S27).

Once the above iteration has completed through i = N-1 (step S28), the process returns to step S22 and acquires the next frame data.

When the frame data acquisition unit 13 reports "end of file", the control unit performs the termination processing for each function (step S29), which includes calling the "finalization" function described above, releasing the internal buffer area used as the frame data storage area, and, where necessary, issuing termination instructions to each video analysis unit 10. The processing then ends.
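The uniform loop of FIG. 3 can be sketched in a few lines. The names are hypothetical, frames are modeled as (timestamp, image) pairs, and each analysis unit is modeled as a callable whose return value is its next-frame information in msec; initialization and termination (steps S20 and S29) are omitted.

```python
def run_controller(frames, analyzers):
    """Generic dispatch loop in the style of FIG. 3 (illustrative sketch).

    frames: iterable of (timestamp_ms, image) pairs.
    analyzers: list of callables analyze(timestamp_ms, image) that return
        the time stamp (msec) of the next frame they need.
    """
    ts = [0] * len(analyzers)                     # S21: TS[0..N-1] = 0
    for t, image in frames:                       # S22/S23: until end of file
        for i, analyze in enumerate(analyzers):   # S24/S28: i = 0 .. N-1
            if t >= ts[i]:                        # S25
                # S26: pass the frame; S27: store the returned next-frame info
                ts[i] = analyze(t, image)
```

The loop contains no interval constants: a unit that wants every frame can simply return `t`, and a new kind of analysis unit can be appended to `analyzers` without changing the loop.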

次に，映像解析部１０の処理フローを図４，図５の例に従って説明する。ここでは，特許文献１の映像解析技術を利用した例を図４に示し，特許文献３の映像解析技術を利用した例を図５に示す。 Next, the processing flow of the video analysis unit 10 will be described following the examples of FIGS. 4 and 5. FIG. 4 shows an example using the video analysis technique of Patent Document 1, and FIG. 5 shows an example using the video analysis technique of Patent Document 3.

図４に示す映像解析部１０［０］は，映像解析によりカット点検出を行うものであり，制御部１４から１フレーム分のフレームデータを引数とする解析処理の指示があると，特許文献１に記載されているようなカット点の抽出処理を実行する（ステップＳ３０）。カット点抽出処理によりカット点が検出されたかどうかを判定し（ステップＳ３１），カット点が検出されたならば，特徴量出力部１２［０］を呼び出す。そうでなければ，次フレーム指定部１１［０］の処理（ステップＳ３３）へ進む。 The video analysis unit 10 [0] shown in FIG. 4 detects cut points by video analysis. When the control unit 14 issues an analysis processing instruction with one frame of frame data as an argument, the unit executes cut point extraction processing such as that described in Patent Document 1 (step S30). It then determines whether a cut point was detected by the extraction processing (step S31); if a cut point was detected, it calls the feature amount output unit 12 [0]. Otherwise, the process proceeds to the next frame designation unit 11 [0] (step S33).

特徴量出力部１２［０］は，検出されたカット点の情報を所定の結果ファイル（または結果格納メモリ域）に出力する（ステップＳ３２）。 The feature amount output unit 12 [0] outputs information of the detected cut point to a predetermined result file (or result storage memory area) (step S32).

次フレーム指定部１１［０］は，ｔに３００［ｍｓｅｃ］を加えた値を次フレーム情報として設定し（ステップＳ３３），その次フレーム情報を戻り値として制御部１４に返却する。次フレーム情報をｔ＋３００として，制御部１４に通知することにより，３００ｍｓｅｃごとのフレームに対してカット点の検出処理を行うことができるようになる。 The next frame specifying unit 11 [0] sets a value obtained by adding 300 [msec] to t as next frame information (step S33), and returns the next frame information to the control unit 14 as a return value. By notifying the control unit 14 of the next frame information as t + 300, it becomes possible to perform a cut point detection process on a frame every 300 msec.

図５に示す映像解析部１０［１］は，映像解析によりテロップの検出を行うものである。メモリ領域１６に用意されたカウンタＣｔｌｐの初期値は０に設定されている。この例では，５００ｍｓｅｃごとに連続した４枚のフレーム画像を入力し，平均化カラー画像を作成して，テロップを検出するものとする。 The video analysis unit 10 [1] shown in FIG. 5 detects telops by video analysis. The initial value of the counter Ctlp prepared in the memory area 16 is set to zero. In this example, four continuous frame images are input every 500 msec, an averaged color image is created, and a telop is detected.

制御部１４から１フレーム分のフレームデータを引数とする解析処理の指示があると，まず，カウンタＣｔｌｐの値が３より小さいかどうかを判定する（ステップＳ４０）。３より小さい場合には，特許文献３に記載されている方法による複数フレーム平均化処理を実行する（ステップＳ４１）。一方，カウンタＣｔｌｐの値が３になった場合には，特許文献３に記載されている方法によるテロップ検出処理を実行する（ステップＳ４２）。テロップが検出された場合には（ステップＳ４３），特徴量出力部１２［１］を呼び出し，検出されたカット点の情報を所定の結果ファイル（または結果格納メモリ域）に出力する（ステップＳ４４）。 When the control unit 14 issues an analysis processing instruction with one frame of frame data as an argument, it is first determined whether the value of the counter Ctlp is smaller than 3 (step S40). If it is smaller than 3, multi-frame averaging processing is executed by the method described in Patent Document 3 (step S41). If, on the other hand, the value of the counter Ctlp has reached 3, telop detection processing is executed by the method described in Patent Document 3 (step S42). When a telop is detected (step S43), the feature amount output unit 12 [1] is called, and information on the detected cut point is output to a predetermined result file (or result storage memory area) (step S44).

その後，次フレーム指定部１１［１］の処理へ進み，カウンタＣｔｌｐの値が３より小さいかどうかを判定する（ステップＳ４５）。３より小さい場合には，カウンタＣｔｌｐの値を１カウントアップし（ステップＳ４６），ｔに１［ｍｓｅｃ］を加えた値を次フレーム情報として設定し（ステップＳ４７），その次フレーム情報を戻り値として制御部１４に返却する。このように，次フレーム情報をｔ＋１とすることによって，タイムスタンプが現在よりも少しでも進めば，再度，映像解析部１０［１］が制御部１４から呼び出されるようになるため，連続してフレームデータが渡されることになる。 Thereafter, the process proceeds to the next frame specifying unit 11 [1], which determines whether the value of the counter Ctlp is smaller than 3 (step S45). If it is smaller than 3, the value of the counter Ctlp is incremented by 1 (step S46), a value obtained by adding 1 [msec] to t is set as the next frame information (step S47), and that next frame information is returned to the control unit 14 as the return value. By setting the next frame information to t + 1 in this way, the video analysis unit 10 [1] is called again by the control unit 14 as soon as the time stamp advances even slightly beyond the current one, so that consecutive frames are passed to it.

一方，カウンタＣｔｌｐの値が３になったならば，カウンタＣｔｌｐの値を０に戻し（ステップＳ４８），ｔに５００［ｍｓｅｃ］を加えた値を次フレーム情報として設定し（ステップＳ４９），その次フレーム情報を戻り値として制御部１４に返却する。 On the other hand, if the value of the counter Ctlp becomes 3, the value of the counter Ctlp is returned to 0 (step S48), and a value obtained by adding 500 [msec] to t is set as the next frame information (step S49). The next frame information is returned to the control unit 14 as a return value.

以上のようにして，次フレーム情報を制御部１４に通知することにより，５００ｍｓｅｃごとに４枚のフレームデータを入力し，テロップの検出処理を行うことができるようになる。 As described above, the next frame information is notified to the control unit 14, so that four frames of data can be input every 500 msec, and a telop detection process can be performed.
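The counter logic of steps S40 to S49 can be sketched, for illustration only, as a small Python state machine. `TelopAnalyzer` and `feed` are hypothetical names. The averaging and telop detection of Patent Document 3 are replaced here by simply recording which timestamps were grouped together, and the sketch assumes that the fourth frame of a group is consumed by the detection step.

```python
from typing import Iterable, List


class TelopAnalyzer:
    """Stand-in for video analysis unit 10[1]: four consecutive frames, then a gap."""

    def __init__(self) -> None:
        self.c_tlp = 0                    # counter Ctlp, initial value 0
        self.current: List[int] = []      # timestamps gathered for the current group
        self.groups: List[List[int]] = [] # completed groups (placeholder for detection)

    def analyze(self, t: int) -> int:
        """Process one frame and return the next frame information in msec."""
        self.current.append(t)
        if self.c_tlp < 3:                # S40: still collecting frames
            pass                          # S41: averaging would happen here
        else:
            self.groups.append(self.current)  # S42 to S44: detection placeholder
            self.current = []
        # Next frame designation unit 11[1]
        if self.c_tlp < 3:                # S45
            self.c_tlp += 1               # S46
            return t + 1                  # S47: ask for the very next frame
        self.c_tlp = 0                    # S48
        return t + 500                    # S49: skip ahead 500 msec


def feed(analyzer: TelopAnalyzer, timestamps: Iterable[int]) -> None:
    """Minimal driver: deliver a frame only when its timestamp has been requested."""
    next_ts = 0
    for t in timestamps:
        if t >= next_ts:
            next_ts = analyzer.analyze(t)


if __name__ == "__main__":
    analyzer = TelopAnalyzer()
    feed(analyzer, range(0, 1200, 100))   # frames every 100 msec
    print(analyzer.groups)
```

Under these assumptions the 500 msec interval is measured from the fourth frame of each group, since t + 500 is returned only when that frame is processed.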

ここでは，映像解析部１０の次フレーム指定部１１により指定された次フレーム情報については，映像解析部１０を呼び出す解析処理関数の戻り値として制御部１４に返却される例を説明したが，別の実施例としては，解析処理関数コールにおいて，出力用のパラメータを設け，その出力用パラメータに次フレーム情報を設定してリターンするという方法を採ることも可能である。 Here, an example has been described in which the next frame information specified by the next frame specifying unit 11 of the video analysis unit 10 is returned to the control unit 14 as a return value of an analysis processing function that calls the video analysis unit 10. As an embodiment of the above, it is also possible to adopt a method in which an output parameter is provided in the analysis processing function call, the next frame information is set in the output parameter, and the process returns.
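The two ways of handing back the next frame information described here can be contrasted in a short Python sketch. All names (`NextFrameInfo`, `analyze_returning`, `analyze_with_out_param`) are hypothetical, and the fixed 300 msec step is borrowed from the cut point example purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class NextFrameInfo:
    """Hypothetical container used as the output parameter."""
    timestamp_msec: int = 0


def analyze_returning(t: int, data: bytes) -> int:
    """Variant 1: the next frame information is the function's return value."""
    return t + 300


def analyze_with_out_param(t: int, data: bytes, out: NextFrameInfo) -> None:
    """Variant 2: the next frame information is written into an output parameter."""
    out.timestamp_msec = t + 300


if __name__ == "__main__":
    print(analyze_returning(1000, b""))
    info = NextFrameInfo()
    analyze_with_out_param(1000, b"", info)
    print(info.timestamp_msec)
```

The output-parameter form leaves the return value free for something else, such as a status code, which may be why the specification mentions it as an alternative.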

カット点検出とテロップ検出の映像解析部１０の例を説明したが，他の種類の特徴量を抽出する映像解析部１０についても同様に，次フレーム指定部１１によって，次に必要となるフレーム番号やフレーム時間情報などの次フレーム情報を制御部１４に通知する手段を設けることにより，制御部１４では，フレームデータの映像解析部１０への引渡し処理を一律に実行することができるようになる。 Examples of the video analysis unit 10 for cut point detection and telop detection have been described. For a video analysis unit 10 that extracts other types of feature amounts, likewise providing a means by which the next frame designation unit 11 notifies the control unit 14 of next frame information, such as the frame number or frame time information required next, allows the control unit 14 to execute the delivery of frame data to the video analysis units 10 uniformly.

以上の実施形態として，説明を分かりやすくするため，図１に示す各部の機能を，単一のコンピュータに実装する場合の例を説明したが，他の実施形態としては，各機能を別々のコンピュータ上で実行されるソフトウェアモジュールによって実現し，これらのコンピュータは，ＬＡＮを通じて相互に通信ができるものとすることもできる。 In the embodiment above, to keep the explanation simple, an example was described in which the functions of the units shown in FIG. 1 are implemented on a single computer. In another embodiment, each function can be realized by a software module executed on a separate computer, with these computers able to communicate with one another over a LAN.

図６に，映像解析装置をネットワークシステムを用いて構築する例を示す。本実施形態における映像解析装置は，それぞれ異なるコンピュータ（ＣＰＵおよびメモリ等）で実現される映像解析制御装置２０と，映像解析処理装置３０［０］，３０［１］，３０［２］，…と，ＬＡＮなどのネットワーク４０とから構成される。映像解析制御装置２０は，図１で説明したフレームデータ取得部１３と制御部１４とを有する。 FIG. 6 shows an example in which the video analysis apparatus is constructed using a network system. The video analysis apparatus in this embodiment comprises a video analysis control device 20 and video analysis processing devices 30 [0], 30 [1], 30 [2], …, each realized by a different computer (CPU, memory, and so on), together with a network 40 such as a LAN. The video analysis control device 20 includes the frame data acquisition unit 13 and the control unit 14 described with reference to FIG. 1.

すなわち，映像解析制御装置２０の制御部１４は，フレームデータ取得部１３から映像をフレームごとの画像データとしてフレームの時間順に取得する手段と，映像解析処理装置３０から指定される次フレーム情報に基づき，フレームデータ取得部１３から取得したフレームが，映像解析処理装置３０が必要なフレームの画像データであるかどうかを判定する手段と，フレームデータ取得部１３から取得したフレームが，映像解析処理装置３０が必要なフレームの画像データであると判定された場合に，そのフレームの画像データを映像解析処理装置３０へ送る手段と，映像解析処理装置３０から受信した次フレーム情報を記憶する手段とを備える。 That is, the control unit 14 of the video analysis control device 20 comprises: means for acquiring video from the frame data acquisition unit 13 as image data for each frame, in frame time order; means for determining, based on the next frame information designated by the video analysis processing device 30, whether a frame acquired from the frame data acquisition unit 13 is image data of a frame that the video analysis processing device 30 requires; means for sending the image data of that frame to the video analysis processing device 30 when it is so determined; and means for storing the next frame information received from the video analysis processing device 30.

各映像解析処理装置３０は，図１で説明した映像解析部１０を有する。映像解析制御装置２０および各映像解析処理装置３０は，ネットワーク４０を通じて，命令とフレームデータや次フレーム情報等のデータを送受信することで，単一のコンピュータによって映像解析装置を実現する場合と同様に，本発明を実施することができる。もちろん，フレームデータ取得部１３と制御部１４とを，異なるコンピュータに実装して本発明を実施することも可能である。 Each video analysis processing device 30 includes the video analysis unit 10 described with reference to FIG. The video analysis control device 20 and each video analysis processing device 30 transmit and receive commands and data such as frame data and next frame information through the network 40, as in the case where the video analysis device is realized by a single computer. The present invention can be implemented. Of course, it is possible to implement the present invention by mounting the frame data acquisition unit 13 and the control unit 14 on different computers.
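The specification does not define a wire format for the frame data and next frame information exchanged over the network 40. Purely as an assumed illustration, the following Python sketch packs a frame message (timestamp plus payload) and a reply message (next frame information) with the standard `struct` module. The field layout and all function names are hypothetical.

```python
import struct
from typing import Tuple

# Assumed layout, network byte order:
#   frame message: uint64 timestamp in msec, uint32 payload length, payload bytes
#   reply message: uint64 next frame information in msec
FRAME_HEADER = struct.Struct("!QI")
REPLY = struct.Struct("!Q")


def encode_frame(timestamp_msec: int, data: bytes) -> bytes:
    """Control device 20 side: serialize one frame for a processing device 30."""
    return FRAME_HEADER.pack(timestamp_msec, len(data)) + data


def decode_frame(message: bytes) -> Tuple[int, bytes]:
    """Processing device 30 side: recover the timestamp and frame data."""
    timestamp_msec, length = FRAME_HEADER.unpack_from(message)
    start = FRAME_HEADER.size
    return timestamp_msec, message[start:start + length]


def encode_reply(next_timestamp_msec: int) -> bytes:
    """Processing device 30 side: serialize the next frame information."""
    return REPLY.pack(next_timestamp_msec)


def decode_reply(message: bytes) -> int:
    """Control device 20 side: read back the next frame information."""
    return REPLY.unpack(message)[0]


if __name__ == "__main__":
    wire = encode_frame(1500, b"abc")
    print(decode_frame(wire))
    print(decode_reply(encode_reply(1800)))
```

With messages of this kind, the control device can keep the same comparison against TS[i] as in the single-computer case and replace only the function call with a send and a receive.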

DESCRIPTION OF SYMBOLS
１ 映像解析装置 1 Video analysis apparatus
１０ 映像解析部 10 Video analysis unit
１１ 次フレーム指定部 11 Next frame designation unit
１２ 特徴量出力部 12 Feature amount output unit
１３ フレームデータ取得部 13 Frame data acquisition unit
１４ 制御部 14 Control unit

Claims

A video analysis apparatus comprising:
a frame data acquisition unit that acquires video as image data for each frame;
one or a plurality of video analysis units, each having a feature amount output unit that, from sequentially input image data for each frame, outputs a feature amount that is a characteristic section of the video or a value associated with it, and a next frame designation unit that, each time image data is input, designates next frame information, which is the frame number or frame time information of the frame required next; and
a control unit that, based on the next frame information designated by each video analysis unit, acquires the image data of the frame required by that video analysis unit from the frame data acquisition unit and sends it to that video analysis unit.

A video analysis program for causing a computer to function as:
a frame data acquisition unit that acquires video as image data for each frame;
one or a plurality of video analysis units, each having a feature amount output unit that, from sequentially input image data for each frame, outputs a feature amount that is a characteristic section of the video or a value associated with it, and a next frame designation unit that, each time image data is input, designates next frame information, which is the frame number or frame time information of the frame required next; and
a control unit that, based on the next frame information designated by each video analysis unit, acquires the image data of the frame required by that video analysis unit from the frame data acquisition unit and sends it to that video analysis unit.

A video analysis control device that analyzes video by sending image data of frames to be analyzed to one or a plurality of video analysis processing devices, each including a video analysis unit having a feature amount output unit that, from sequentially input image data for each frame, outputs a feature amount that is a characteristic section of the video or a value associated with it, and a next frame designation unit that, each time image data is input, designates next frame information, which is the frame number or frame time information of the frame required next, the video analysis control device comprising:
means for acquiring video as image data for each frame, in frame time order, from a frame data acquisition unit provided in the video analysis control device or in an external device;
means for determining, based on the next frame information designated by the video analysis processing device, whether a frame acquired from the frame data acquisition unit is image data of a frame required by the video analysis processing device;
means for sending the image data of the frame to the video analysis processing device when the frame acquired from the frame data acquisition unit is determined to be image data of a frame required by the video analysis processing device; and
means for storing the next frame information received from the video analysis processing device.

A video analysis control program for causing a computer to function as each of the means included in the video analysis control device.