JP7398869B2 - Image data extraction device and image data extraction method

Info

Publication number: JP7398869B2
Application number: JP2019021389A
Authority: JP
Inventors: 涼平戸崎
Original assignee: Toshiba IT and Control Systems Corp
Current assignee: Toshiba IT and Control Systems Corp
Priority date: 2019-02-08
Filing date: 2019-02-08
Publication date: 2023-12-15
Anticipated expiration: 2039-02-08
Also published as: JP2020129741A

Description

The present embodiment relates to an image data extraction device and an image data extraction method for extracting a target image from a moving image composed of a plurality of consecutive frames.

Image data extraction devices that extract a target image from a moving image composed of a plurality of frames are known.

Japanese Unexamined Patent Application Publication No. 2015-69432
Japanese Unexamined Patent Application Publication No. 2016-212784

In recent years, digital video cameras and smartphones have become widespread, and moving images are shot frequently. Moving images may also be captured by security cameras and the like. The captured image data amounts to an enormous volume of data.

In recent years, image data has also been used for object analysis by machine learning. Object analysis by machine learning is performed based on a huge amount of past basic data. The larger the amount of past basic data, the more the accuracy of the analysis can be improved. It is therefore desirable that a large amount of image data be stored in advance as the basic data used for object analysis by machine learning.

A moving image is composed of a plurality of consecutive frames. Moreover, the image data desired by the operator is often only a part of the screen captured in the moving image. The desired image data is therefore extracted as a target image by cutting out a part of the captured screen. However, a moving image is composed of a large number of frames, and cutting out a part of the screen from these frames one at a time to create target images takes a great deal of labor.

For this reason, a computer-based device is used when creating target images by cutting out a part of the screen from the many frames constituting a moving image. However, creating target images in this way still depends heavily on manual work, and the work is difficult for the operator to perform.

To solve the above problem, an object of the present embodiment is to provide an image data extraction device and an image data extraction method with which an operator can, through a short and simple operation, extract a desired image as a target image from each of a plurality of consecutive frames constituting a moving image.

The image data extraction device of this embodiment is characterized by having the following configuration.
(1) A display unit that plays back a moving image composed of a plurality of consecutive frames.
(2) An input unit through which first coordinates and second coordinates are input on the moving image being played back by the display unit.
(3) A storage unit that stores the frame displayed on the display unit at the time the first coordinates are input as the n-th frame, and stores the frame displayed on the display unit at the time the second coordinates are input as the (n+k)-th frame.
(4) A selection unit that creates first coordinate data from the first coordinates input to the input unit, creates second coordinate data from the second coordinates, and selects, from the (n+k)-th frame, the image inside an enclosed area surrounded by a polygon or an ellipse formed based on the first coordinate data and the second coordinate data as a reference image.
(5) An extraction unit that, based on the reference image selected by the selection unit, extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the moving image.
(6) The storage unit stores the target images extracted by the extraction unit.

FIG. 1 is a diagram showing the image data extraction device according to the first embodiment.
FIG. 2 is an external view of the image data extraction device according to the first embodiment.
FIG. 3 is a diagram showing the program flow of the selection unit of the image data extraction device according to the first embodiment.
FIG. 4 is a diagram showing the program flow of the extraction unit of the image data extraction device according to the first embodiment.
FIG. 5 is a diagram explaining the image extraction operation of the image data extraction device according to the first embodiment.
FIG. 6 is a diagram showing the program flow of the image adjustment unit of the image data extraction device according to the first embodiment.

[1. First embodiment]
[1-1. Configuration]
An image data extraction device 1 as an example of this embodiment will be described with reference to FIGS. 1 and 2. The image data extraction device 1 is a device configured with a microcomputer or the like. The image data extraction device 1 is operated by an operator and is used to extract image data from a moving image. The image data extraction device 1 is used to extract data for object analysis of target persons, articles, equipment, and the like. The extracted images and the coordinate data indicating the positions of features are used as training data and the like in machine learning.

In the image data extraction device 1, the following commands and data are input, output, and stored.
Command J1: command indicating point A of the image to be extracted
Command J2: command indicating point B of the image to be extracted
Command K1: command to enlarge the enclosed area in the vertical direction
Command K2: command to reduce the enclosed area in the vertical direction
Command K3: command to enlarge the enclosed area in the horizontal direction
Command K4: command to reduce the enclosed area in the horizontal direction
Command L1: command to stop playback of the moving image
Command L2: command to start playback of the moving image
Command L3: command to play the moving image at low speed
Command L4: command to play the moving image at high speed
Command L5: command to play the moving image in reverse (rewind playback)
Coordinate data A: data indicating the coordinates of point A of the image to be extracted
Coordinate data B: data indicating the coordinates of point B of the image to be extracted
Moving image data D1: data of a moving image composed of a plurality of consecutive frames
Reference image data E1: data of the target image selected by the operator from the moving image D1
Target image data F1: data of the plurality of target images extracted from the plurality of frames of the moving image D1
The coordinates of point A in FIG. 5 correspond to the first coordinates in the claims, and the coordinates of point B correspond to the second coordinates in the claims. The enclosed area is the area surrounded by a figure formed on the screen based on the coordinates of point A and the coordinates of point B. The image inside the enclosed area is extracted.

The image data extraction device 1 includes an input unit 2, a display unit 3, a calculation unit 4, and a storage unit 5. The calculation unit 4 includes a selection unit 41, an extraction unit 42, and an image adjustment unit 43. The image data extraction functions of the image data extraction device 1 are implemented by an installed computer program.

(Input unit 2)
The input unit 2 is an input device composed of a mouse 21 and a keyboard 22. The input unit 2 is connected to the calculation unit 4. The operator operates the input unit 2 to input commands J1 to J2, K1 to K4, and L1 to L5. The input unit 2 outputs the input commands to the calculation unit 4.

(Display unit 3)
The display unit 3 is a display device composed of a plasma display, a liquid crystal panel, or the like. The display unit 3 is connected to the calculation unit 4. The display unit 3 displays images based on the moving image data D1 and the reference image data E1 output from the calculation unit 4.

(Storage unit 5)
The storage unit 5 is composed of a storage medium such as a semiconductor memory or a hard disk. The storage unit 5 is connected to the calculation unit 4. The storage unit 5 stores coordinate data A, coordinate data B, moving image data D1, reference image data E1, and target image data F1. Writing to and reading from the storage unit 5 are controlled by the calculation unit 4. The storage unit 5 is built into the console 9.

(Calculation unit 4)
The calculation unit 4 is composed of the CPU of a microcomputer or the like. The calculation unit 4 includes the selection unit 41, the extraction unit 42, and the image adjustment unit 43, each of which is configured as a program module containing the computer program described later. The calculation unit 4 is connected to the input unit 2, the display unit 3, and the storage unit 5, and works with them to perform the calculations and control described below. The calculation unit 4 is built into the console 9.

a. Calculation and control by the selection unit 41
The selection unit 41 receives the above-mentioned commands J1 and J2 from the input unit 2. Based on commands J1 and J2, the selection unit 41 creates coordinate data A (data indicating the coordinates of point A of the image to be extracted) and coordinate data B (data indicating the coordinates of point B of the image to be extracted) and stores them in the storage unit 5.

Based on coordinate data A for the coordinates of point A and coordinate data B for the coordinates of point B, the selection unit 41 forms a rectangle on the screen, selects the image inside the enclosed area surrounded by that rectangle as a reference image from one of the plurality of consecutive frames, creates reference image data E1 (data of the target image selected by the operator from the moving image D1), and stores it in the storage unit 5. The selection unit 41 operates based on the program shown in FIG. 3.

b. Calculation and control by the extraction unit 42
Based on the reference image data E1 (data of the target image selected by the operator from the moving image D1) stored in the storage unit 5, the extraction unit 42 extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the moving image of the moving image data D1 (data of a moving image composed of a plurality of consecutive frames), creates target image data F1 (data of the plurality of target images extracted from the plurality of frames of the moving image D1), and stores it in the storage unit 5. The extraction unit 42 operates based on the program shown in FIG. 4.

The extraction unit 42 assigns a preset name to the target image data F1 (data of the plurality of target images extracted from the plurality of frames of the moving image D1) for the extracted target images and stores it in the storage unit 5.

c. Calculation and control by the image adjustment unit 43
The image adjustment unit 43 receives commands L1 to L5 from the input unit 2. Based on commands L1 to L5, the image adjustment unit 43 causes the display unit 3 to display the moving image data D1 (data of a moving image composed of a plurality of consecutive frames) stored in the storage unit 5.

The image adjustment unit 43 receives commands K1 to K4 from the input unit 2. Based on commands K1 to K4, the image adjustment unit 43 changes the vertical or horizontal length of the enclosed area and causes the display unit 3 to display the resized area. The enclosed area is the area surrounded by the rectangle formed on the screen based on coordinate data A for the coordinates of point A and coordinate data B for the coordinates of point B. The image inside the enclosed area is selected by the selection unit 41 as the reference image. The image adjustment unit 43 performs the above operations based on the program shown in FIG. 6.

The above is the configuration of the image data extraction device 1.

[1-2. Operation]
Next, the operation of the image data extraction device 1 of this embodiment will be described with reference to FIGS. 1 to 6. The image data extraction device 1 is used to extract data for detecting target persons, articles, and equipment and performing object analysis. The extracted images are used as training data and the like in machine learning. The image data extraction device 1 is operated by an operator and extracts image data from a moving image.

The selection unit 41 of the image data extraction device 1 selects, as a reference image, the image inside the enclosed area surrounded by the rectangle formed based on the coordinates of point A (the first coordinates) and the coordinates of point B (the second coordinates) input to the input unit 2, from one of the plurality of consecutive frames constituting the moving image.

Based on the reference image selected by the selection unit 41, the extraction unit 42 of the image data extraction device 1 extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the moving image.

[a. Operation of the selection unit 41]
The operation of the selection unit 41 is described below. The selection unit 41 operates according to the program shown in FIG. 3. The program shown in FIG. 3 is built into the calculation unit 4 and is executed repeatedly by the calculation unit 4.

(Step S01: Play back the moving image)
The selection unit 41 plays back the moving image from which target images are to be extracted. The moving image data D1 (a moving image composed of a plurality of consecutive frames) is stored in the storage unit 5 in advance. The moving image of the moving image data D1 is displayed on the display unit 3. During playback, the operator extracting the target images inputs commands J1 to J2, K1 to K4, and L1 to L5 from the input unit 2. In this embodiment, commands J1 and J2 are input with the mouse 21 of the input unit 2, and commands K1 to K4 and L1 to L5 are input with the keyboard 22 of the input unit 2.

(Step S02: Determine whether command J1 has been input)
Next, the selection unit 41 determines whether command J1 (the command indicating point A of the image to be extracted) has been input. The operator watches the moving image played back on the display unit 3 and uses the mouse 21 to move the cursor displayed on the display unit 3 to point A. Point A is the start point of the image to be extracted. When the selection unit 41 detects that the left button of the mouse 21 of the input unit 2 has been pressed (turned ON), it determines that command J1 has been input.

If it determines that command J1 has been input (YES in step S02), the selection unit 41 proceeds to step S03. Otherwise (NO in step S02), it waits for command J1 to be input.

(Step S03: Create and store coordinate data A, and display point A)
If it determines in step S02 that command J1 has been input, the selection unit 41 detects the coordinates of point A on the moving image at which command J1 was input and creates coordinate data A (data indicating the coordinates of point A of the image to be extracted). The selection unit 41 stores coordinate data A in the storage unit 5. In addition, as shown in FIG. 5, the selection unit 41 displays a dot "・" at point A on the moving image displayed on the display unit 3 based on coordinate data A.

(Step S04: Determine whether command J2 has been input)
Next, the selection unit 41 determines whether command J2 (the command indicating point B of the image to be extracted) has been input. The operator watches the moving image played back on the display unit 3 and uses the mouse 21 to move the cursor displayed on the display unit 3 to point B. Point B is the end point of the image to be extracted. A rectangle with point A and point B as diagonally opposite corners is drawn on the moving image displayed on the display unit 3. When the selection unit 41 detects that the left button of the mouse 21 of the input unit 2 has been released (turned OFF), it determines that command J2 has been input.

If it determines that command J2 has been input (YES in step S04), the selection unit 41 proceeds to step S05. Otherwise (NO in step S04), it waits for command J2 to be input.

(Step S05: Create and store coordinate data B, and display point B and the enclosed area)
If it determines in step S04 that command J2 has been input, the selection unit 41 detects the coordinates of point B on the moving image at which command J2 was input and creates coordinate data B (data indicating the coordinates of point B of the image to be extracted). The selection unit 41 stores coordinate data B in the storage unit 5. In addition, as shown in FIG. 5, the selection unit 41 forms, based on coordinate data A and coordinate data B, a rectangle with point A and point B as diagonally opposite corners on the moving image displayed on the display unit 3 and displays it as the enclosed area.

(Step S06: Store the frame at the time command J2 was input)
The selection unit 41 stores the frame of the moving image at the time command J2 was input. For example, as shown in FIG. 5, if command J1 is input while the n-th frame is displayed and command J2 is input while the (n+k)-th frame is displayed, the selection unit 41 causes the storage unit 5 to store that the frame in which command J2 was input is the (n+k)-th frame.
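Steps S02 to S06 can be pictured as a small state machine driven by mouse events. The following is a minimal sketch, not the patented implementation: the names `Rect`, `rect_from_points`, and `SelectionSession` are hypothetical, and it assumes pixel coordinates with the origin at the top-left of the frame.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[int, int]


@dataclass(frozen=True)
class Rect:
    """Axis-aligned enclosed area; right and bottom are exclusive (assumption)."""
    left: int
    top: int
    right: int
    bottom: int


def rect_from_points(a: Point, b: Point) -> Rect:
    """Build the rectangle whose diagonally opposite corners are A and B."""
    (ax, ay), (bx, by) = a, b
    return Rect(min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))


class SelectionSession:
    """Hypothetical holder for coordinate data A/B and the frame indices."""

    def __init__(self) -> None:
        self.point_a: Optional[Point] = None
        self.point_b: Optional[Point] = None
        self.frame_a: Optional[int] = None  # n
        self.frame_b: Optional[int] = None  # n + k

    def on_press(self, x: int, y: int, frame_index: int) -> None:
        """Command J1: left button pressed, record point A (steps S02-S03)."""
        self.point_a = (x, y)
        self.frame_a = frame_index

    def on_release(self, x: int, y: int, frame_index: int) -> Rect:
        """Command J2: left button released, record point B and the frame
        shown at that moment (steps S04-S06), then return the enclosed area."""
        if self.point_a is None:
            raise ValueError("command J2 received before command J1")
        self.point_b = (x, y)
        self.frame_b = frame_index
        return rect_from_points(self.point_a, self.point_b)
```

Normalising with `min`/`max` means the operator may drag in any direction; that behaviour is an assumption of this sketch rather than something the description states.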

(Step S07: Create and store reference image data E1)
The selection unit 41 selects, as a reference image, the image inside the enclosed area surrounded by the rectangle formed based on coordinate data A and coordinate data B input to the input unit 2, from one of the plurality of consecutive frames. The coordinates of coordinate data A correspond to the first coordinates in the claims, and the coordinates of coordinate data B correspond to the second coordinates in the claims.

The selection unit 41 selects, as the reference image, the image in the (n+k)-th frame inside the enclosed area surrounded by the rectangle whose diagonally opposite corners are point A of coordinate data A and point B of coordinate data B, creates reference image data E1 (data of the target image selected by the operator from the moving image D1), and stores it in the storage unit 5.
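Step S07 amounts to cropping the enclosed area out of the (n+k)-th frame. The sketch below illustrates this under stated assumptions: a frame is modelled as a list of pixel rows, the right and bottom edges are treated as exclusive, and the function names `crop` and `select_reference_image` are hypothetical.

```python
from typing import List, Sequence

Frame = List[List[int]]  # one grayscale value per pixel (assumption)


def crop(frame: Frame, left: int, top: int, right: int, bottom: int) -> Frame:
    """Return the pixels inside the enclosed area of a single frame."""
    return [row[left:right] for row in frame[top:bottom]]


def select_reference_image(frames: Sequence[Frame], frame_index: int,
                           left: int, top: int, right: int, bottom: int) -> Frame:
    """Crop the reference image from the frame shown when J2 was input."""
    return crop(frames[frame_index], left, top, right, bottom)


# Usage: a tiny 4x5 "frame" whose pixel value encodes its row and column.
frame = [[r * 10 + c for c in range(5)] for r in range(4)]
reference = select_reference_image([frame], 0, 1, 1, 3, 3)
```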

The above is the operation of the selection unit 41.

[b. Operation of the extraction unit 42]
The operation of the extraction unit 42 is described below. The extraction unit 42 operates according to the program shown in FIG. 4. The program shown in FIG. 4 is built into the calculation unit 4 and is executed repeatedly by the calculation unit 4.

(Step S11: Create and store target image data F1)
Based on the reference image selected by the selection unit 41, the extraction unit 42 extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the moving image.

Specifically, based on the reference image data E1 (data of the target image selected by the operator from the moving image D1) created by the selection unit 41 and stored in the storage unit 5, the extraction unit 42 extracts images corresponding to the reference image data E1 from the plurality of frames constituting the moving image data D1 (data of a moving image composed of a plurality of consecutive frames), creates target image data F1 (data of the plurality of target images extracted from the plurality of frames of the moving image D1), and stores it in the storage unit 5.
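The description does not say how the image "corresponding to the reference image" is located in each frame. One possible reading is template matching; the sketch below uses an exhaustive sum-of-absolute-differences search purely as an illustrative assumption. The function names are hypothetical, and frames are modelled as lists of grayscale pixel rows.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]


def sad(frame: Frame, template: Frame, ox: int, oy: int) -> int:
    """Sum of absolute differences between the template and the frame
    region whose top-left corner is (ox, oy)."""
    return sum(
        abs(frame[oy + r][ox + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def find_best_match(frame: Frame, template: Frame) -> Tuple[int, int]:
    """Return the (x, y) offset where the template fits the frame best."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    if th > fh or tw > fw:
        raise ValueError("template is larger than the frame")
    best_score, best_pos = None, (0, 0)
    for oy in range(fh - th + 1):
        for ox in range(fw - tw + 1):
            score = sad(frame, template, ox, oy)
            if best_score is None or score < best_score:
                best_score, best_pos = score, (ox, oy)
    return best_pos


def extract_targets(frames: Sequence[Frame], template: Frame) -> List[Frame]:
    """Crop the best-matching region from every frame (target image data)."""
    th, tw = len(template), len(template[0])
    targets = []
    for frame in frames:
        ox, oy = find_best_match(frame, template)
        targets.append([row[ox:ox + tw] for row in frame[oy:oy + th]])
    return targets
```

An exhaustive search like this is slow on real video; it is shown only because it is easy to verify. A production system would more likely use an optimised matcher or an object tracker.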

(Step S12: Assign an image name to the target image data F1)
The extraction unit 42 assigns a preset name to the target image data F1 (data of the plurality of target images extracted from the plurality of frames of the moving image D1), in the same way as for the past images already stored in the storage unit 5, and stores it in the storage unit 5.
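The naming rule itself is not given in the description. As a hedged illustration only, the sketch below assumes a preset prefix followed by a zero-padded sequence number that continues after the images already stored; the function name `target_image_names`, the six-digit padding, and the `.png` extension are all hypothetical.

```python
from typing import List


def target_image_names(prefix: str, count: int, already_stored: int = 0) -> List[str]:
    """Generate one name per extracted target image.

    prefix          preset name chosen in advance (assumption)
    count           number of target images extracted in this run
    already_stored  number of past images that already use this prefix
    """
    start = already_stored + 1
    return [f"{prefix}_{i:06d}.png" for i in range(start, start + count)]


# Usage: three new images added after 120 previously stored ones.
names = target_image_names("person", 3, already_stored=120)
```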

The above is the operation of the extraction unit 42.

[c. Operation of the image adjustment unit 43]
The operation of the image adjustment unit 43 is described below. The image adjustment unit 43 operates according to the program shown in FIG. 6. The program shown in FIG. 6 is built into the calculation unit 4 and is executed by an interrupt while the selection unit 41 or the extraction unit 42 is running. When any key on the keyboard 22 of the input unit 2 is pressed, an interrupt occurs and the program shown in FIG. 6 is started.

Commands K1 to K4 and L1 to L5 are input by pressing the following keys on the keyboard 22.
Command K1 (command to enlarge the enclosed area in the vertical direction): key [↑]
Command K2 (command to reduce the enclosed area in the vertical direction): key [↓]
Command K3 (command to enlarge the enclosed area in the horizontal direction): key [→]
Command K4 (command to reduce the enclosed area in the horizontal direction): key [←]
Command L1 (command to stop playback of the moving image): key [S]
Command L2 (command to start playback of the moving image): key [R]
Command L3 (command to play the moving image at low speed): key [T]
Command L4 (command to play the moving image at high speed): key [U]
Command L5 (command to play the moving image in reverse): key [V]
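The key assignments above map naturally onto a lookup table. The following sketch is illustrative only: the `Command` enum, the key-name strings such as "Up", and the `command_for_key` helper are hypothetical, since the description does not specify how key events are represented.

```python
from enum import Enum
from typing import Optional


class Command(Enum):
    K1 = "enlarge enclosed area vertically"
    K2 = "reduce enclosed area vertically"
    K3 = "enlarge enclosed area horizontally"
    K4 = "reduce enclosed area horizontally"
    L1 = "stop playback"
    L2 = "start playback"
    L3 = "low-speed playback"
    L4 = "high-speed playback"
    L5 = "reverse playback"


# Key names are assumptions; a real GUI toolkit would supply its own codes.
KEY_TO_COMMAND = {
    "Up": Command.K1,
    "Down": Command.K2,
    "Right": Command.K3,
    "Left": Command.K4,
    "S": Command.L1,
    "R": Command.L2,
    "T": Command.L3,
    "U": Command.L4,
    "V": Command.L5,
}


def command_for_key(key: str) -> Optional[Command]:
    """Return the command for a key press, or None for unassigned keys."""
    return KEY_TO_COMMAND.get(key)
```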

(Step S21: If command L1 is input, stop playback of the moving image)
If the image adjustment unit 43 determines that the key [S] on the keyboard 22 has been pressed and command L1 has been input (step S21a), it stops playback of the moving image of the moving image data D1 stored in the storage unit 5 and causes the display unit 3 to display a still image (step S21b).

(Step S22: If command L2 is input, play back the moving image)
If the image adjustment unit 43 determines that the key [R] on the keyboard 22 has been pressed and command L2 has been input (step S22a), it plays back the moving image of the moving image data D1 stored in the storage unit 5 and causes the display unit 3 to display it (step S22b).

(Step S23: If command L3 is input, play back the moving image at low speed)
If the image adjustment unit 43 determines that the key [T] on the keyboard 22 has been pressed and command L3 has been input (step S23a), it lowers the playback speed and causes the display unit 3 to display the moving image of the moving image data D1 (step S23b). The image adjustment unit 43 reduces the playback speed of the currently playing moving image by, for example, 20%. If command L3 is input multiple times, the playback speed is reduced cumulatively.

(Step S24: If command L4 is input, play back the moving image at high speed)
If the image adjustment unit 43 determines that the key [U] on the keyboard 22 has been pressed and command L4 has been input (step S24a), it raises the playback speed and causes the display unit 3 to display the moving image of the moving image data D1 (step S24b). The image adjustment unit 43 increases the playback speed of the currently playing moving image by, for example, 20%. If command L4 is input multiple times, the playback speed is increased cumulatively.

(Step S25: If command L5 is input, play the video in rewind)
When the image adjustment unit 43 determines that the key [V] of the keyboard 22 has been pressed and the command L5 has been input (step S25a), it displays the video corresponding to the video data D1 on the display unit 3 in rewind playback (step S25b). If the command L5 is input multiple times, the rewind playback speed is increased cumulatively.

The reference image data E1 (data of the target image selected by the operator in the video D1) is created on the video as adjusted by the operator through steps S21 to S25 above.
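The playback adjustments in steps S22 to S25 can be sketched as a small state update. This is a minimal illustration, not the device's actual implementation: the names `PlaybackState` and `apply_command` are hypothetical, the 20% step is the example value from the description, and "cumulative" is assumed here to mean multiplying the current speed by 0.8 or 1.2 on each input. It is also an assumption that the first L5 input switches to rewind and later L5 inputs speed the rewind up.

```python
from dataclasses import dataclass

STEP = 0.20  # "for example, 20%" in the description


@dataclass
class PlaybackState:
    speed: float = 1.0    # playback speed multiplier (1.0 = normal speed)
    direction: int = 1    # +1 = forward playback, -1 = rewind playback
    playing: bool = False


def apply_command(state: PlaybackState, command: str) -> PlaybackState:
    """Update the playback state for commands L2 to L5 (assumed semantics)."""
    if command == "L2":        # key [R]: start forward playback
        state.playing = True
        state.direction = 1
    elif command == "L3":      # key [T]: slow down, cumulative
        state.speed *= 1.0 - STEP
    elif command == "L4":      # key [U]: speed up, cumulative
        state.speed *= 1.0 + STEP
    elif command == "L5":      # key [V]: rewind; repeated input speeds it up
        if state.direction == -1:
            state.speed *= 1.0 + STEP
        state.direction = -1
        state.playing = True
    return state
```

Under these assumptions, pressing [T] twice after [R] leaves the video playing forward at roughly 0.64 times normal speed.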

(Step S26: If command K1 is input, increase the vertical length of the enclosed area)
When the image adjustment unit 43 determines that the key [↑] of the keyboard 22 has been pressed and the command K1 has been input (step S26a), it increases the vertical length of the enclosed area (step S26b). The enclosed area is the region bounded by a rectangle constructed from the coordinates of coordinate data A and coordinate data B, and it is displayed over the video played back on the display unit 3. The image adjustment unit 43 increases the vertical length of the currently displayed enclosed area by, for example, 10 pixels. If the command K1 is input multiple times, the vertical length is increased cumulatively.

(Step S27: If command K2 is input, reduce the vertical length of the enclosed area)
When the image adjustment unit 43 determines that the key [↓] of the keyboard 22 has been pressed and the command K2 has been input (step S27a), it reduces the vertical length of the enclosed area (step S27b). The image adjustment unit 43 reduces the vertical length of the currently displayed enclosed area by, for example, 10 pixels. If the command K2 is input multiple times, the vertical length is reduced cumulatively.

(Step S28: If command K3 is input, increase the horizontal length of the enclosed area)
When the image adjustment unit 43 determines that the key [→] of the keyboard 22 has been pressed and the command K3 has been input (step S28a), it increases the horizontal length of the enclosed area (step S28b). The image adjustment unit 43 increases the horizontal length of the currently displayed enclosed area by, for example, 10 pixels. If the command K3 is input multiple times, the horizontal length is increased cumulatively.

(Step S29: If command K4 is input, reduce the horizontal length of the enclosed area)
When the image adjustment unit 43 determines that the key [←] of the keyboard 22 has been pressed and the command K4 has been input (step S29a), it reduces the horizontal length of the enclosed area (step S29b). The image adjustment unit 43 reduces the horizontal length of the currently displayed enclosed area by, for example, 10 pixels. If the command K4 is input multiple times, the horizontal length is reduced cumulatively.

The reference image data E1 (data of the target image selected by the operator in the video D1) is created after the size of the enclosed area has been adjusted in steps S26 to S29 above.
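The size adjustments in steps S26 to S29 can be sketched as follows. The class `EnclosedArea`, the function `resize`, and the `KEY_TO_COMMAND` table are hypothetical names for illustration. The 10-pixel step is the example value from the description. Two details are assumptions: the top-left corner is kept fixed while the area grows or shrinks, and the area is never allowed to drop below 1 pixel in either direction.

```python
from dataclasses import dataclass

STEP_PX = 10  # "for example, 10 pixels" in the description

# Arrow keys of the keyboard 22 mapped to commands K1 to K4.
KEY_TO_COMMAND = {"Up": "K1", "Down": "K2", "Right": "K3", "Left": "K4"}


@dataclass
class EnclosedArea:
    left: int
    top: int
    width: int
    height: int

    @classmethod
    def from_points(cls, a, b):
        """Build the rectangle spanned by point A and point B (any order)."""
        (ax, ay), (bx, by) = a, b
        return cls(min(ax, bx), min(ay, by), abs(bx - ax), abs(by - ay))


def resize(area: EnclosedArea, command: str) -> EnclosedArea:
    """Apply one K1 to K4 command; repeated commands accumulate."""
    if command == "K1":      # enlarge vertically
        area.height += STEP_PX
    elif command == "K2":    # shrink vertically (assumed floor of 1 px)
        area.height = max(1, area.height - STEP_PX)
    elif command == "K3":    # enlarge horizontally
        area.width += STEP_PX
    elif command == "K4":    # shrink horizontally (assumed floor of 1 px)
        area.width = max(1, area.width - STEP_PX)
    return area
```

For example, an area built from A = (120, 80) and B = (40, 30) starts at 80 by 50 pixels, and two K1 inputs raise its height to 70 pixels.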

The above is the operation of the image adjustment unit 43.

The above is the operation of the image data extraction device 1. As described above, an enclosed area is displayed over the video corresponding to the video data D1 shown on the display unit 3 of the image data extraction device 1, and the reference image data E1 (data of the target image selected by the operator in the video D1) is created based on the enclosed area. Based on the reference image data E1, the target image data F1 (data of multiple target images extracted from multiple frames of the video D1) is created.

[1-3. Effects]
(1) According to the present embodiment, the image data extraction device 1 includes: a display unit 3 that plays back a video composed of a plurality of consecutive frames; an input unit 2 into which first coordinates and second coordinates in the video played back by the display unit 3 are input; a selection unit 41 that selects, as a reference image from one of the plurality of consecutive frames, the image inside an enclosed area surrounded by a polygon or an ellipse constructed based on the first and second coordinates input to the input unit 2; an extraction unit 42 that, based on the reference image selected by the selection unit 41, extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the video; and a storage unit 5 that stores the target images extracted by the extraction unit 42. This makes it possible to provide an image data extraction device with which an operator can, through simple operations, extract a desired image as a target image from each of the consecutive frames constituting a video.

According to the present embodiment, the operator can select a desired image as the reference image by inputting the first coordinates and the second coordinates from the input unit 2 on the video displayed on the display unit 3, so the reference image can be selected with a simple operation. The operator does not need to pause the video and select a reference image from a still image.

According to the present embodiment, the extraction unit 42 extracts, based on the reference image selected by the selection unit 41, an image corresponding to the reference image as a target image from each of the consecutive frames constituting the video, so the operator does not need to pause the video and select a large number of images from still images. As a result, a large amount of image data can be extracted in a short time. The extracted image data can then be used for purposes such as object analysis by machine learning.
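The passage above does not say how the extraction unit 42 locates the image "corresponding to the reference image" in each frame. Purely as an illustration, the sketch below uses exhaustive sum-of-squared-differences template matching on grayscale frames held as nested lists. The function names `crop`, `ssd`, `find_best_match`, and `extract_targets` and the matching criterion itself are assumptions, not the method described in this publication.

```python
def crop(frame, left, top, width, height):
    """Cut a width x height patch out of a frame (a list of pixel rows)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def ssd(frame, template, left, top):
    """Sum of squared differences between the template and one frame window."""
    return sum(
        (frame[top + y][left + x] - template[y][x]) ** 2
        for y in range(len(template))
        for x in range(len(template[0]))
    )


def find_best_match(frame, template):
    """Return (left, top) of the window that differs least from the template."""
    t_h, t_w = len(template), len(template[0])
    f_h, f_w = len(frame), len(frame[0])
    best = None
    for top in range(f_h - t_h + 1):
        for left in range(f_w - t_w + 1):
            score = ssd(frame, template, left, top)
            if best is None or score < best[0]:
                best = (score, left, top)
    return best[1], best[2]


def extract_targets(frames, reference):
    """Extract one target image per frame, each the size of the reference."""
    t_h, t_w = len(reference), len(reference[0])
    targets = []
    for frame in frames:
        left, top = find_best_match(frame, reference)
        targets.append(crop(frame, left, top, t_w, t_h))
    return targets
```

The exhaustive search is chosen only because it is easy to follow. It scales with frame area times template area, so a practical system would likely rely on an optimized library routine or a tracking method instead.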

(2) According to the present embodiment, the vertical or horizontal length of the enclosed area is changed based on a command input to the input unit 2, so the operator can easily change the size of the reference image.

(3) According to the present embodiment, the playback speed of the consecutive frames constituting the video is changed based on a command input to the input unit 2, so the operator can select the reference image on a video played back at any desired speed. This makes selecting the reference image easier for the operator.

(4) According to the present embodiment, each target image is stored in the storage unit 5 under the name of the most similar image among the previously stored images, so the operator can easily tell what the data stored in the storage unit 5 contains. Because similar target images are stored under similar names, the operator can also easily classify the data. In addition, the operator can retrieve data by target image name and use it to understand how things changed over time.
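The naming behavior in effect (4) can be sketched as a nearest-neighbor lookup. This passage does not specify the similarity measure or the naming scheme, so both are assumptions here: similarity is taken as the mean absolute pixel difference between same-sized grayscale images, and "similar names" is modeled as a shared base name plus a running number. The names `mean_abs_diff`, `name_for`, and `next_name` are hypothetical.

```python
def mean_abs_diff(image_a, image_b):
    """Mean absolute pixel difference of two same-sized grayscale images."""
    total = 0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def name_for(target, stored):
    """Return the name of the stored image most similar to the target.

    `stored` maps a name to a previously stored image and must not be empty.
    """
    return min(stored, key=lambda name: mean_abs_diff(target, stored[name]))


def next_name(base, existing):
    """Derive an unused name such as 'car_0002' from a base name (assumed scheme)."""
    index = 1
    while f"{base}_{index:04d}" in existing:
        index += 1
    return f"{base}_{index:04d}"
```

With stored images named "car" and "person", a new target image closest to the "car" image would be saved under a name such as "car_0001", which keeps related images grouped when sorted by name.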

(5) According to the present embodiment, the input unit 2 is composed of at least one of the mouse 21, the keyboard 22, a voice input device, and an eye tracking device, so the operator can configure the image data extraction device 1 easily and inexpensively from commonly available devices.

[2. Other embodiments]
Although embodiments including modifications have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents. The following are examples.

(1) In the above embodiment, the input unit 2 is composed of the mouse 21 and the keyboard 22; commands J1 and J2 are input with the mouse 21, and commands K1 to K4 and L1 to L5 are input with the keyboard 22. However, these commands may instead be input through an input unit 2 composed of other input devices. For example, the input unit 2 may include a voice input device, and the following commands may be input to it by voice.
Command K1 (instruction to enlarge the enclosed area vertically): voice [うえ] ("up")
Command K2 (instruction to reduce the enclosed area vertically): voice [した] ("down")
Command K3 (instruction to enlarge the enclosed area horizontally): voice [みぎ] ("right")
Command K4 (instruction to reduce the enclosed area horizontally): voice [ひだり] ("left")
Command L1 (instruction to stop video playback): voice [一時停止] ("pause")
Command L2 (instruction to start video playback): voice [再生] ("play")
Command L3 (instruction to play the video at low speed): voice [低速] ("low speed")
Command L4 (instruction to play the video at high speed): voice [高速] ("high speed")
Command L5 (instruction to play the video in rewind): voice [巻き戻し] ("rewind")
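The voice command assignments listed above amount to a lookup table from a recognized word to a command. The sketch below assumes that some speech recognizer (not shown, and not specified in this publication) has already turned the audio into text; `VOICE_TO_COMMAND` and `command_from_speech` are hypothetical names.

```python
# Spoken word (as recognized text) mapped to the command it triggers.
VOICE_TO_COMMAND = {
    "うえ": "K1",        # enlarge the enclosed area vertically
    "した": "K2",        # reduce the enclosed area vertically
    "みぎ": "K3",        # enlarge the enclosed area horizontally
    "ひだり": "K4",      # reduce the enclosed area horizontally
    "一時停止": "L1",    # stop playback
    "再生": "L2",        # start playback
    "低速": "L3",        # low-speed playback
    "高速": "L4",        # high-speed playback
    "巻き戻し": "L5",    # rewind playback
}


def command_from_speech(recognized_text):
    """Return the command for a recognized word, or None if it is unknown."""
    return VOICE_TO_COMMAND.get(recognized_text.strip())
```

Returning `None` for unrecognized words lets the caller ignore stray speech instead of triggering an unintended command.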

(2) In the above embodiment, command J1 (instruction indicating point A of the image to be extracted) and command J2 (instruction indicating point B of the image to be extracted) are input with the mouse 21 of the input unit 2 operated by the operator. However, these commands may instead be input through an input unit 2 composed of other input devices. For example, the input unit 2 may include an eye tracking device, and commands J1 and J2 may be input to the eye tracking device according to the direction of the operator's line of sight.

(3) In the above embodiment, the enclosed area is a rectangle constructed based on the coordinates of point A and the coordinates of point B. However, the shape of the enclosed area is not limited to this. The enclosed area may be, for example, a triangle, a polygon with five or more sides, or an ellipse constructed based on the coordinates of point A and the coordinates of point B.
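As one illustration of a non-rectangular enclosed area, the sketch below treats points A and B as opposite corners of a bounding box and uses the ellipse inscribed in that box. This passage does not say how the ellipse is derived from the two points, so the inscribed-ellipse construction is an assumption, and `in_ellipse` and `ellipse_mask` are hypothetical names.

```python
def in_ellipse(point, a, b):
    """True if the point lies in the ellipse inscribed in the box spanned by A and B."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    center_x, center_y = (ax + bx) / 2, (ay + by) / 2
    radius_x, radius_y = abs(bx - ax) / 2, abs(by - ay) / 2
    if radius_x == 0 or radius_y == 0:
        return False  # degenerate box: no area to enclose
    dx = (px - center_x) / radius_x
    dy = (py - center_y) / radius_y
    return dx * dx + dy * dy <= 1.0


def ellipse_mask(a, b):
    """Boolean mask over the bounding box; True marks pixels inside the ellipse."""
    left, right = sorted((a[0], b[0]))
    top, bottom = sorted((a[1], b[1]))
    return [
        [in_ellipse((x, y), a, b) for x in range(left, right + 1)]
        for y in range(top, bottom + 1)
    ]
```

Such a mask could be applied to the rectangular crop so that only pixels inside the ellipse contribute to the reference image.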

(4) In the above embodiment, the reference image data E1 is created from the frame of the video at the time command J2 is input, but the frame from which the reference image data E1 is created is not limited to this. The reference image data E1 may be created from the frame of the video at the time command J1 is input. Alternatively, a frame of the video between the input of command J1 and the input of command J2 may be selected to create the reference image data E1.
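The three options in (4) can be expressed as a choice of frame index. Following the claims, the frame shown when the first coordinates are input is called the n-th frame and the frame shown when the second coordinates are input is the (n+k)-th frame. The function name `reference_frame_index`, the policy labels, and the use of the midpoint for the "between" case are assumptions, since the text only says that some frame between the two inputs may be selected.

```python
def reference_frame_index(n, k, policy="J2"):
    """Pick the frame used to create the reference image data E1.

    n      -- index of the frame displayed when command J1 was input
    k      -- number of frames elapsed until command J2 was input (k >= 0)
    policy -- "J2" (the embodiment described above), "J1", or "between"
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if policy == "J2":
        return n + k
    if policy == "J1":
        return n
    if policy == "between":
        return n + k // 2  # midpoint, chosen only for illustration
    raise ValueError(f"unknown policy: {policy}")
```

For example, with n = 100 and k = 30 the three policies give frames 130, 100, and 115 respectively.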

1 ... Image data extraction device
2 ... Input unit
3 ... Display unit
4 ... Arithmetic unit
5 ... Storage unit
9 ... Console
21 ... Mouse
22 ... Keyboard
41 ... Selection unit
42 ... Extraction unit
43 ... Image adjustment unit

Claims

1. An image data extraction device comprising:
a display unit that plays back a video composed of a plurality of consecutive frames;
an input unit into which first coordinates and second coordinates are input on the video being played back by the display unit;
a storage unit that stores the frame displayed on the display unit at the time the first coordinates were input as the n-th frame, and stores the frame displayed on the display unit at the time the second coordinates were input as the (n+k)-th frame;
a selection unit that creates first coordinate data from the first coordinates input to the input unit, creates second coordinate data from the second coordinates, and selects, as a reference image from the (n+k)-th frame, the image inside an enclosed area surrounded by a polygon or an ellipse constructed based on the first coordinate data and the second coordinate data; and
an extraction unit that, based on the reference image selected by the selection unit, extracts an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the video,
wherein the storage unit stores the target image extracted by the extraction unit.

2. The image data extraction device according to claim 1, wherein the vertical or horizontal length of the enclosed area is changed based on a command input to the input unit.

3. The image data extraction device according to claim 1 or 2, wherein the playback speed of the plurality of consecutive frames constituting the video is changed based on a command input to the input unit.

4. The image data extraction device according to any one of claims 1 to 3, wherein the target image is given the name of the most similar image among previously stored images and is stored in the storage unit.

5. The image data extraction device according to any one of claims 1 to 4, wherein the input unit is composed of at least one of a mouse, a keyboard, a voice input device, and an eye tracking device.

6. An image data extraction method comprising:
an input procedure of inputting first coordinates and second coordinates on a video being played back by a display unit that plays back a video composed of a plurality of consecutive frames;
a storage procedure of storing the frame displayed on the display unit at the time the first coordinates were input as the n-th frame, and storing the frame displayed on the display unit at the time the second coordinates were input as the (n+k)-th frame;
a selection procedure of creating first coordinate data from the first coordinates input in the input procedure, creating second coordinate data from the second coordinates, and selecting, as a reference image from the (n+k)-th frame, the image inside an enclosed area surrounded by a polygon or an ellipse constructed based on the first coordinate data and the second coordinate data; and
an extraction procedure of extracting, based on the reference image selected in the selection procedure, an image corresponding to the reference image as a target image from each of the plurality of consecutive frames constituting the video,
wherein the storage procedure stores the target image extracted in the extraction procedure.