JP7446605B2

JP7446605B2 - Systems, programs, trained models, learning model generation methods, generation devices, etc.

Info

Publication number: JP7446605B2
Application number: JP2020062878A
Authority: JP
Inventors: 要岩佐; 直紀松田
Original assignee: 株式会社ユピテル
Priority date: 2020-03-31
Filing date: 2020-03-31
Publication date: 2024-03-11
Anticipated expiration: 2040-03-31
Also published as: JP2024055911A; JP2021164034A

Description

The present invention relates to, for example, a system, a program, a trained model, a learning model generation method, a generation device, and the like.

In conventional technology, an imaging device such as a drive recorder creates event recording information when triggered by the occurrence of a predetermined event. Among such devices, one has been proposed that efficiently collects image data captured under abnormal conditions (Patent Document 1).

Japanese Patent Application Publication No. 2019-016227

It would be desirable to have a technique for efficiently collecting images of a specific scene, such as a scene desired by a user, in addition to image data captured under abnormal conditions.

In view of the above problem, one object of the present invention is to provide a technique for detecting a scene of a specific image that can be captured by a vehicle.

The object of the present invention is not limited to this, and the applicant intends to acquire rights, through divisional applications, amendments, and the like, to configurations aimed at obtaining the effects produced by any part of the configurations disclosed in this specification and the drawings. For example, this specification discloses problems obtained by reading passages such as "can be done" or "is possible" as "is a problem to be solved." Each problem is described as independent, and the applicant intends to acquire rights individually, through divisional applications, amendments, and the like, to the configuration for solving each problem. Even where a problem is only implicitly understood from the description, the applicant intends to claim part of the configurations described in this specification in an amendment or divisional application. Configurations that solve a combination of these independent problems are also disclosed, and the applicant intends to acquire rights to them.

(1) A system according to the present invention has a function of detecting, based on data defining a scene of a specific image that can be captured by a vehicle, an image representing that scene from among images captured by the vehicle at each of a plurality of time points.

In this way, a technique for detecting a scene of a specific image that can be captured by a vehicle can be provided.

(2) The system preferably has a function of capturing images with a camera installed in the vehicle and a function of recording, on a recording medium, the images captured by the camera at each of a plurality of time points, and the detecting function preferably detects, from among the images at each time point recorded on the recording medium, an image representing a scene whose degree of match with the scene of the specific image is equal to or greater than a threshold.

In this way, an image representing a scene whose degree of match with the scene of the specific image is equal to or greater than the threshold can be detected from the images captured by the camera installed in the vehicle.
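
As an illustrative sketch only (the degree of match is not tied to any particular formula at this point in the text), the threshold test in (2) could be expressed as follows. The sketch assumes that each recorded frame and the specific image have already been reduced to feature vectors and that cosine similarity serves as the degree of match; the names `cosine_similarity` and `detect_matching_frames` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_matching_frames(
    frame_features: Sequence[Sequence[float]],
    target_features: Sequence[float],
    threshold: float,
) -> List[int]:
    """Return the indices of frames whose degree of match is at or above the threshold."""
    return [
        index
        for index, features in enumerate(frame_features)
        if cosine_similarity(features, target_features) >= threshold
    ]
```

In practice the feature vectors would come from whatever model defines the scene; the comparison against the threshold is the only part taken from the text.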

(3) The data may be an image at a single time point.

If the specific image is an image at a single time point, the time required to detect the scene can be shortened.

(4) The data may be images at two, or three or more, time points.

If the specific image consists of images at two, or three or more, time points, the accuracy of scene detection can be improved. The images at two, or three or more, time points are preferably temporally continuous images, as in a video (images in which the movement of the main subject is continuous).

(5) The data may be generated by deep learning from input consisting of an image identical to the specific image or a plurality of images that approximate the specific image.

In this way, a scene of a specific image that can be captured by a vehicle can be detected using deep learning.

(6) The data may be a trained first learning model obtained by machine learning using the specific image as training data, and the detecting function may detect the image representing the scene by inputting the images captured at each time point into the trained model.

In this way, a specific frame can be detected relatively quickly.

(7) The specific scene may include, for example, at least one of a scene that may lead to an accident or disaster and a scene in which an accident or disaster has occurred.

In this way, a scene close to one that may lead to an accident or disaster, or to one in which an accident or disaster has occurred, can be detected from a video.

(8) The system may further include warning control means for controlling a warning device so as to issue a warning in response to detection of an image representing the scene.

For example, when the system is installed in a vehicle, records video while capturing it, and finds in that video a scene close to the specific image, the driver of the vehicle can be alerted that the current situation is close to the scene of the specific image.

(9) The warning control means preferably controls the warning device so as to change the content of the warning according to the type or degree of danger of the scene.

In this way, the driver of the vehicle can easily grasp how much caution is required.
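
A minimal sketch of how the warning content in (9) might vary with the scene type and degree of danger is shown below. The scene types, the three-level danger scale, and the message strings are all assumptions introduced for illustration and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedScene:
    kind: str          # hypothetical scene type, e.g. "pedestrian"
    danger_level: int  # hypothetical scale: 1 (low) to 3 (high)


def select_warning(scene: DetectedScene) -> str:
    """Pick a warning message whose urgency follows the danger level."""
    if scene.danger_level >= 3:
        return f"DANGER: {scene.kind} ahead"
    if scene.danger_level == 2:
        return f"Caution: {scene.kind} nearby"
    return f"Notice: {scene.kind} detected"
```

A real warning device could equally vary volume, tone, or display color; the point of the sketch is only that the output depends on both inputs.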

(10) The system preferably further includes first recording control means for controlling a recording device so as to record the image of the detected scene on a recording medium.

In this way, the image of a frame of the specific scene can be recorded, which makes it easier, for example, to refer to that frame later.

(11) The detecting function preferably detects the image representing the scene based on the data and a physical quantity indicating the running state of the vehicle.

In this way, by taking the running state of the vehicle into account, the desired scene can be detected more accurately.
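
One conceivable way to take the running state into account, shown purely as an assumption, is to require both a sufficient degree of match and a sufficient acceleration magnitude before treating a frame as detected. The thresholds, the units, and the function name `is_scene_detected` are hypothetical; the text does not specify how the two inputs are combined.

```python
def is_scene_detected(
    match_degree: float,
    accel_magnitude: float,
    match_threshold: float = 0.8,   # hypothetical value
    accel_threshold: float = 2.0,   # hypothetical value, in m/s^2
) -> bool:
    """Detect a scene only when the image match and the physical quantity both qualify."""
    return match_degree >= match_threshold and accel_magnitude >= accel_threshold
```

Other combinations, such as feeding the physical quantity into the learning model as an extra input, would be equally consistent with (11).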

(12) Preferably, the images captured by the vehicle at each time point are stored on the recording medium, and the system further includes playback control means for controlling a playback device so as to play back the images captured by the vehicle at each time point and recorded on the recording medium, and notification control means for controlling a notification device so as to give notice of the detected image representing the scene in association with the image played back on the playback device under the control of the playback control means.

This makes it easier to find a scene that approximates the specific image when playing back images recorded on the recording medium.

(13) Preferably, the images captured by the vehicle at each time point are stored on the recording medium with the interval from when a recording start command is given until a recording stop command is given treated as one period, and the system further includes generation means for generating an image corresponding to a missing portion of the images within that one period, and second recording control means for controlling a recording device so as to record the image generated by the generation means in the missing portion.

In this way, the missing portion of the video can be restored.
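
Before a missing portion can be regenerated, it has to be located. A minimal sketch of that step follows, assuming each recorded frame carries a consecutive frame number within the recording period; the numbering scheme and the function name `find_missing_ranges` are assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def find_missing_ranges(
    recorded: Iterable[int], first: int, last: int
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of frame numbers absent from one recording period."""
    missing: List[Tuple[int, int]] = []
    expected = first
    for number in sorted(set(recorded)):
        if number < first or number > last:
            continue  # ignore frames outside the period
        if number > expected:
            missing.append((expected, number - 1))
        expected = number + 1
    if expected <= last:
        missing.append((expected, last))
    return missing
```

Each returned range identifies where generated frames would need to be written back.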

(14) The generation means preferably generates the image corresponding to the missing portion by inputting at least one of the images before and after the missing data into a trained second learning model. This model has been trained by machine learning in which the input is a second image located at least one of before and after a first portion, the first portion corresponding to the missing portion in images captured at each of a plurality of time points by another vehicle, and the output is the first portion.

In this way, video for the missing portion can be generated relatively accurately.

(15) A program for causing a computer to implement the functions of the above system may be provided.

In this way, by installing the program on a device, that device can detect a scene of a specific image that can be captured by a vehicle.

(16) A trained model according to the present invention uses specific images that can be captured by a vehicle as training data, takes as input images captured by the vehicle at each of a plurality of time points, and outputs the detection, from the input images, of an image representing the scene of the specific image.

In this way, a trained model for detecting a scene of a specific image that can be captured by a vehicle can be provided.

(17) A learning model generation method according to the present invention generates a learning model that uses specific images that can be captured by a vehicle as training data, takes as input images captured by the vehicle at each of a plurality of time points, and outputs, from the input images, an image representing the scene of the specific image.

In this way, a learning model can be generated that detects a relatively large number of images of the specific scene, or that detects images of the specific scene with high accuracy.

(18) The specific images are preferably images captured by a plurality of imaging devices.

(19) Preferably, the imaging device is a drive recorder installed in a vehicle, and the specific image represents a scene at the time a recording command was given to the drive recorder or a scene at the time an impact was applied to the vehicle in which the drive recorder is installed.

In this way, a learning model can be generated that detects images of scenes in which a recording command was given to the drive recorder or an impact was applied to the vehicle in which the drive recorder is installed.

(20) Preferably, an image representing a scene in which an object is moving at or above a certain speed or acceleration is detected.

In this way, a learning model can be generated that detects images containing an object that moved at or above a certain speed or acceleration.

(21) Preferably, a learning model is generated for detecting the image representing the scene based on the data and a physical quantity indicating the running state of the vehicle.

In this way, a learning model can be generated that detects the desired scene more accurately by taking the running state of the vehicle into account.

(22) Preferably, a first image, which is an image of a predetermined period among images captured at each of a plurality of time points, and a second image, which is at least one of the images before and after the first image, are used as training data; learning is performed with the first image as input and the second image as output; and then a third image, captured at each of a plurality of time points and containing a missing portion in a predetermined period, is taken as input, and a fourth image, which is the image of the missing portion, is estimated from the input third image.

In this way, a learning model for generating the image of the missing portion can be generated.

(23) A trained model generation device according to the present invention generates a learning model by the learning model generation method of any one of (17) to (22).

In this way, a learning model can be generated that yields a trained model for detecting a scene of a specific image that can be captured by a vehicle.

The inventions described in (1) to (22) above can be combined arbitrarily. For example, at least part of the configuration of at least one of the inventions (2) to (23) may be added to all or part of the configuration of the invention described in (1). In particular, an invention obtained by adding at least part of the configuration of at least one of the inventions (2) to (23) to the invention described in (1) is preferable. Arbitrary configurations may also be extracted from the inventions described in (1) to (23) and combined.

The applicant intends to acquire rights to inventions that include these configurations. Even where the description uses expressions such as "in the case of" or "when," the configuration is not limited to that case or time. These expressions indicate examples of preferable configurations, and the applicant intends to acquire rights to configurations outside those cases or times as well. Likewise, passages described in a particular order are not limited to that order. Configurations in which some parts are deleted or the order is changed are also disclosed, and the applicant intends to acquire rights to them.

According to the present invention, a technique for detecting a scene of a specific image that can be captured by a vehicle can be provided.

The effects of the invention are not limited to this. Effects produced by any part of the configurations disclosed in this specification and the drawings are also disclosed, and the applicant intends to acquire rights, through divisional applications, amendments, and the like, to configurations that produce those effects. For example, passages in this specification such as "can be done" or "is possible" explicitly state effects, and some passages indicate effects even without such wording. Even without such statements, there are effects that can be understood from the configuration.

An example of a drive recorder.
A view showing the drive recorder installed inside a vehicle.
A block diagram showing the electrical configuration of the drive recorder.
A block diagram showing the electrical configuration of a server.
An example of a learning model.
An example of a target image.
An example of a flowchart for generating a learning model.
An example of a learning model that generates a plurality of target images.
(A) to (C) are examples of target images.
A block diagram showing the electrical configuration of a generative adversarial network that generates a target image.
An example of a learning model.
A flowchart showing the event recording procedure.
An example of a learning model.
A flowchart showing the match degree calculation procedure.
An example of a frame.
An example of frames constituting a video.
An example of a learning model.
An example of a personal computer.
An example of a recording format.
An example of a playback window.
A flowchart showing the playback procedure.
An example of a playback window.
An example of a playback window.
(A) and (B) show the relationship between frame numbers and address position information.
A flowchart showing the repair procedure.
An example of a recording format.
An example of a learning model.
A view showing how the learning model is trained.
A flowchart showing the learning model generation procedure.
A flowchart showing the procedure for generating the missing portion of a video.

Embodiments for carrying out the present invention are described below with reference to the drawings. The embodiment shown below is one embodiment of the present invention, and the content of the invention is not to be construed as limited by the following description.

[How the invention of the present application was conceived]
In current drive recorders, event recording is performed automatically (passively) according to the detection results of sensors such as an acceleration sensor, but the only way for the user to record an event actively is to press a switch. As a result, video could not be recorded when the sensors did not respond and the user was not in a position to press the switch (for example, while driving, or when the driver is away from the vehicle while it is parked or stopped). The inventors therefore conceived a solution in which accidents and dangerous situations that do not trigger the sensors are learned in advance from video data, by a method such as deep learning, and event recording is performed automatically when the same or a similar situation occurs (first and second embodiments, etc.). This solution thus supplements event recording with machine learning by learning from video of situations in which the sensors do not respond. The inventors also considered that, apart from event recording of accidents and the like, scenery encountered while driving or locations the user wants to capture could be learned in advance as teacher images, so that similar situations are determined automatically and recorded by the drive recorder. The inventors thought that the range of uses of the drive recorder would grow if it could be used as an imaging device and not only for video of accidents (third embodiment, etc.).

[Overview of the embodiments]
Each embodiment described below relates to a technique for detecting, based on data defining a scene of a specific image that can be captured by a vehicle, an image representing that scene from among images captured by the vehicle at each of a plurality of time points. In each embodiment, the specific image is an image that defines a desired scene (also called a desired image). "Detecting an image representing the scene" includes generating a trigger for executing predetermined processing. In the first and second embodiments the predetermined processing is event recording. In the third embodiment it includes displaying information about the image representing the scene; displaying, when the displayed information is selected, the event-recorded video containing the image representing the scene based on the corresponding video data; fast-forwarding, in response to that selection during video playback, to the time point at which the image of the scene is displayed; and jumping (cueing) to that time point during playback. Without being limited to these, "detecting an image representing the scene" may also include processing such as associating with the image identification information indicating that it was detected. Each embodiment describes a method for finding, among the frames constituting a video represented by video data, a frame representing a scene whose degree of match with the scene of the desired image is equal to or greater than a threshold. A video is formed by arranging images at each of a plurality of time points (here still images, also called frames in the embodiments) on a time axis. When a video is played back based on video data, the images are played back sequentially along that time axis.

The specific image may be an image at a single time point (that is, a still image), images at two, or three or more, time points, or images at two, or three or more, time points captured consecutively in time that represent frames of a video. The desired image may be any image: an image of a scene that may lead to an accident such as (but not limited to) a car accident or to a disaster such as (but not limited to) a natural disaster, an image of a scene in which an accident or disaster has occurred, or an image of a desired landscape. As described later, it may also be an image of a scene in which a recording command was given to the drive recorder by an operation button or the like, or of a scene in which an impact was applied to the vehicle in which the drive recorder is installed.
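
As a small illustration of the cueing step mentioned for the third embodiment, the playback position of a detected frame can be derived from its index if the video has a constant frame rate. Both the constant frame rate and the function name `frame_to_seconds` are assumptions made for this sketch.

```python
def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Convert a zero-based frame index to a playback position in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return frame_index / fps
```

A player would seek to the returned position when the user selects the displayed information for a detected scene.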

[First embodiment]
FIGS. 1 to 11 show the first embodiment. In the first embodiment, a target image is generated by deep learning from a plurality of images that are identical to or approximate the desired image (or a combination of both), and the generated target image is used as the desired image. Generation of a trained model is also described.

FIG. 1 is a perspective view of a drive recorder 1 (an example of the system) seen obliquely from the rear. In this embodiment, the drive recorder 1 is a device that is retrofitted to a vehicle. The vehicle is, for example, a private or commercial automobile (passenger car). However, the vehicle may also be a bus, a truck, a special-purpose vehicle such as a forklift, or a vehicle used in public transportation such as a train, monorail, or linear motor car.

An SD card insertion slot 10 is formed in one side surface of the housing of the drive recorder 1. A display 11 covers almost the entire rear surface of the housing, and a plurality of operation buttons 12 are provided on both the left and right sides of the display 11. A joint rail 13 is provided on the top surface of the drive recorder 1. Although not visible in FIG. 1, a camera lens is provided on the front surface of the drive recorder 1. A DC jack is formed in the side surface not visible in FIG. 1, and a speaker and an HD (high definition) output terminal are formed in the bottom surface of the drive recorder 1.

The camera, including the lens, captures video of, for example, the area in front of the vehicle. It may also capture the vehicle interior and the vehicle exterior (to the sides of the vehicle, behind it, and so on). The DC jack is for connecting to a DC power source via a power cable. The SD card insertion slot 10 is for inserting an SD card [an example of a memory card; another memory card such as an xD-Picture Card, or memory other than a memory card such as a USB (Universal Serial Bus) memory, may be used instead]. The speaker outputs sounds and voice. The HD output terminal is for connecting to other information equipment via a cable. The joint rail 13 is for attaching a joint used to mount the drive recorder 1 in the vehicle. The display 11 displays various images, such as video captured by the camera of the drive recorder 1. The operation buttons 12 are operated by the user to input various commands to the drive recorder 1.

FIG. 2 shows the view looking forward from inside the cabin of a vehicle in which the drive recorder 1 is mounted.

A rearview mirror 4 is provided near the top center of the windshield 3 of the vehicle. The drive recorder 1 is fixed to the vehicle by a joint at a position adjacent to the rearview mirror 4 on the passenger-seat side (the left side when facing forward in FIG. 2).

The DC jack of the drive recorder 1 is connected to a cigarette lighter socket 5 via a power cable 6. When the accessory power of the vehicle is turned on, power is supplied from the cigarette lighter socket 5 to the drive recorder 1. The drive recorder 1 is not limited to the form shown in FIGS. 1 and 2 and may, for example, be equipped with an omnidirectional (full-sphere) camera or a hemispherical camera.

図３は、ドライブ・レコーダ１の電気的構成を示すブロック図である。 FIG. 3 is a block diagram showing the electrical configuration of the drive recorder 1. As shown in FIG.

ドライブ・レコーダ１には、コントローラ20が含まれている。このコントローラ20にドライブ・レコーダ１の全体の動作を統括するＣＰＵ(Central Processing Unit)20ａ、ドライブ・レコーダ１の動作プログラムなどが格納されているＲＯＭ(Read Only Memory)20ｂ、データ等を一時的に記憶するＲＡＭ(Random Access Memory)20ｃおよびタイマ20ｄが含まれている。コントローラ20は、ＧＰＳ情報処理プログラム、映像処理プログラム、通信処理プログラムなどの機能を有する。 The drive recorder 1 includes a controller 20. The controller 20 includes a CPU (Central Processing Unit) 20a that supervises the overall operation of the drive recorder 1, a ROM (Read Only Memory) 20b that stores the operation program of the drive recorder 1 and the like, a RAM (Random Access Memory) 20c that temporarily stores data and the like, and a timer 20d. The controller 20 has functions such as a GPS information processing program, a video processing program, and a communication processing program.

また、ドライブ・レコーダ１にはＧＰＳ(Global Positioning System)受信機16、カメラ17、ＳＤカード・リーダ・ライタ14および加速度センサ18が含まれている。ＧＰＳ受信機16は、ドライブ・レコーダ１の位置を検出するもので、ドライブ・レコーダ１が搭載されている車両の位置がわかることとなる。カメラ17は、上述のようにドライブ・レコーダ１が搭載された車両の前方などを撮影する。ＳＤカード・リーダ・ライタ14は、上述のようにＳＤカード挿入口10にＳＤカード23が挿入されることにより、挿入されたＳＤカード23に記録されたデータを読み取り、かつデータをＳＤカード23に書き込む。加速度センサ18は、ドライブ・レコーダ１に与えられた上下、左右および前後の加速度を検出するもので、ドライブ・レコーダ１が搭載されている車両の上下方向、左右方向および前後方向の加速度がわかることとなる。 The drive recorder 1 also includes a GPS (Global Positioning System) receiver 16, a camera 17, an SD card reader/writer 14, and an acceleration sensor 18. The GPS receiver 16 detects the position of the drive recorder 1, so that the position of the vehicle on which the drive recorder 1 is mounted is known. The camera 17 captures the area in front of the vehicle on which the drive recorder 1 is mounted, among other directions, as described above. When the SD card 23 is inserted into the SD card insertion slot 10 as described above, the SD card reader/writer 14 reads data recorded on the inserted SD card 23 and writes data to the SD card 23. The acceleration sensor 18 detects the vertical, lateral, and longitudinal acceleration applied to the drive recorder 1, so that the vertical, lateral, and longitudinal acceleration of the vehicle on which the drive recorder 1 is mounted is known.

ＧＰＳ受信機16から出力されるドライブ・レコーダ１の位置を表すＧＰＳデータ、カメラ17によって撮影された映像を表す映像データ、加速度センサ18によって検出された加速度を表す信号は、それぞれコントローラ20に入力する。コントローラ20はＧＰＳ情報処理プログラムを実行することにより、ＧＰＳ受信機16から出力されたＧＰＳデータを、ＳＤカード挿入口10に挿入されたＳＤカード23に記録できる。また、コントローラ20は映像処理プログラムを実行することにより、カメラ17から得られた映像データを時刻と関連付けてＳＤカード23に記録できる。 The GPS data output from the GPS receiver 16 and representing the position of the drive recorder 1, the video data representing the video captured by the camera 17, and the signal representing the acceleration detected by the acceleration sensor 18 are each input to the controller 20. By executing the GPS information processing program, the controller 20 can record the GPS data output from the GPS receiver 16 on the SD card 23 inserted in the SD card insertion slot 10. By executing the video processing program, the controller 20 can also record the video data obtained from the camera 17 on the SD card 23 in association with the time.

さらに、ドライブ・レコーダ１には、上述した音声等を出力するスピーカ15、映像等を表示するディスプレイ11、操作ボタン12、通信回路19およびＬＴＥモジュール21が含まれている。 Furthermore, the drive recorder 1 includes a speaker 15 that outputs the above-mentioned audio and the like, a display 11 that displays images and the like, operation buttons 12, a communication circuit 19, and an LTE module 21.

コントローラ20から音声データ等がスピーカ15に出力することにより、スピーカ15から音声等を出力し、コントローラ20から映像データ等がディスプレイ11に出力することにより、ディスプレイ11に映像等を表示する。操作ボタン12からの各種指令はコントローラ20に入力する。 The controller 20 outputs audio data, etc. to the speaker 15, so that the speaker 15 outputs the audio, and the controller 20 outputs the video data, etc. to the display 11, so that the display 11 displays images. Various commands from the operation button 12 are input to the controller 20.

通信回路19は、外部機器、例えばサーバ、パーソナル・コンピュータ、スマートフォン、タブレット端末等と無線通信を行うための通信手段として機能する。コントローラ20は、通信処理プログラムを実行することにより、通信回路19を介して、外部機器に映像等のデータを送信する機能を有する。通信回路19として、例えばＷｉＦｉ規格、Ｂｌｕｅｔｏｏｔｈ(登録商標)等の近距離無線通信の規格に準拠したものを用いるとよい。近距離無線通信の規格は、例えば構内で稼働する作業車両（フォークリフト等）と外部機器との通信に適用することができる。移動通信システムの規格は、例えば、より広範囲の領域内で移動する車両と外部機器との通信に適用することができる。 The communication circuit 19 functions as a communication means for wirelessly communicating with external devices such as servers, personal computers, smartphones, tablet terminals, and the like. The controller 20 has a function of transmitting data such as video to an external device via the communication circuit 19 by executing a communication processing program. As the communication circuit 19, it is preferable to use one that complies with short-range wireless communication standards such as the WiFi standard and Bluetooth (registered trademark). The short-range wireless communication standard can be applied, for example, to communication between work vehicles (forklifts, etc.) operating within a premises and external devices. Mobile communication system standards can be applied, for example, to communications between vehicles that move within a wider area and external equipment.

ＬＴＥ(Long Term Evolution)モジュール21は、携帯電話の通信規格に準じた通信回路である。ＬＴＥモジュール21の代わりに４Ｇ等の他の移動通信システムの規格等に準拠した通信回路を用いてもよい。ＬＴＥモジュール21によってカメラ17が撮影した映像データ等を外部サーバ等に送信できる。 The LTE (Long Term Evolution) module 21 is a communication circuit that complies with the communication standard of mobile phones. Instead of the LTE module 21, a communication circuit compliant with other mobile communication system standards such as 4G may be used. The LTE module 21 can transmit video data captured by the camera 17 to an external server or the like.

ドライブ・レコーダ１の動作プログラムはＳＤカード23にあらかじめ記録されていてもよいし、インターネットなどを介してＬＴＥモジュール21によって受信しＳＤカード23に記録されてもよいし、ＳＤカード23をパーソナル・コンピュータなどに挿入し、パーソナル・コンピュータを介してインターネットに接続してパーソナル・コンピュータからダウンロードしてＳＤカード23に記録してもよい。 The operation program of the drive recorder 1 may be recorded on the SD card 23 in advance; it may be received by the LTE module 21 via the Internet or the like and recorded on the SD card 23; or the SD card 23 may be inserted into a personal computer or the like that is connected to the Internet, so that the program is downloaded through the personal computer and recorded on the SD card 23.

図４は、ドライブ・レコーダ１と通信するサーバ30の電気的構成を示すブロック図である。 FIG. 4 is a block diagram showing the electrical configuration of a server 30 that communicates with the drive recorder 1.

サーバ30には、全体の動作を統括する制御装置31が含まれている。 The server 30 includes a control device 31 that controls the entire operation.

サーバ30の制御装置31には、ドライブ・レコーダ１その他のクライアント・コンピュータなどと通信する通信装置、データなどを記憶するメモリ33およびハード・ディスク・ドライブ34が接続されている。ハード・ディスク・ドライブ34は、動画データなどを記憶するハード・ディスク35にアクセスして、ハード・ディスク35に記録されている動画データの読み取りおよびハード・ディスク35に動画データなどを記録する。 Connected to the control device 31 of the server 30 are a communication device that communicates with the drive recorder 1 and other client computers, a memory 33 that stores data and the like, and a hard disk drive 34. The hard disk drive 34 accesses a hard disk 35 that stores video data and the like, reads video data recorded on the hard disk 35, and records video data and the like on the hard disk 35.

図５は、学習モデル40の一例である。 FIG. 5 is an example of the learning model 40.

図５に示す学習モデル40は、ディープ・ラーニングにより複数の画像を入力して目標画像を生成するものである。学習モデル40には入力層41、中間層（隠れ層）42および出力層43が含まれている。 The learning model 40 shown in FIG. 5 generates a target image by inputting a plurality of images by deep learning. The learning model 40 includes an input layer 41, an intermediate layer (hidden layer) 42, and an output layer 43.

任意の１枚の画像（所望の画像、所望の画像に近似した画像だけでなく、その他の画像も含まれる）を構成する画素Ｐ１からＰＮのそれぞれを学習モデル40の入力層41から入力し、中間層42および出力層43を通してディープ・ラーニングにより学習させることを、膨大な数の任意の画像だけ行う。その後、所望の画像、所望の画像に近似した画像が多く含まれる任意の画像を学習モデル40の入力層41から入力すると、所望の画像、所望の画像に近似した画像に反応するニューロン43ａを探し出すことができる。未知の画像を学習モデル40の入力層41から入力しニューロン43ａが反応したら、入力した未知の画像は所望の画像、所望の画像に近似した画像であると判断できる。これにより、学習モデル40によって画像認識ができる。 Each of the pixels P1 to PN constituting an arbitrary image (including not only desired images and images approximating the desired image but also other images) is input from the input layer 41 of the learning model 40, and learning by deep learning through the intermediate layer 42 and the output layer 43 is repeated for an enormous number of arbitrary images. After that, when arbitrary images that include many desired images and images approximating the desired image are input from the input layer 41 of the learning model 40, a neuron 43a that responds to the desired image and to images approximating it can be identified. When an unknown image is input from the input layer 41 of the learning model 40 and the neuron 43a responds, the unknown image can be judged to be the desired image or an image approximating it. In this way, the learning model 40 can perform image recognition.

さらに、反応したニューロン43ａから逆に入力層41側に辿ることにより、もっとも所望の画像らしい目標画像を生成することができる。このようにして生成された目標画像がディープ・ラーニング（教師なし学習）により生成された所望の画像となる。 Furthermore, by tracing back from the responsive neuron 43a to the input layer 41 side, it is possible to generate a target image that is most likely to be the desired image. The target image generated in this way becomes a desired image generated by deep learning (unsupervised learning).

たとえば、事故に至りそうな画像、災害に至りそうな画像、事故に至った画像、災害に至った画像、その他の事故、災害に関係のない膨大な画像を学習モデル40に入力し、ディープ・ラーニングにより学習すると、事故に至りそうな画像が学習モデル40に入力したときに反応するニューロン（第１のニューロンとする）、災害に至りそうな画像が学習モデルに入力したときに反応するニューロン（第２のニューロンとする）、事故に至った画像が学習モデルに入力したときに反応するニューロン（第３のニューロンとする）、災害に至った画像が学習モデルに入力したときに反応するニューロン（第４のニューロンとする）などがわかる。 For example, when images likely to lead to an accident, images likely to lead to a disaster, images that led to an accident, images that led to a disaster, and an enormous number of other images unrelated to accidents or disasters are input to the learning model 40 and learned by deep learning, the following neurons, among others, can be identified: a neuron that responds when an image likely to lead to an accident is input to the learning model 40 (the first neuron), a neuron that responds when an image likely to lead to a disaster is input (the second neuron), a neuron that responds when an image that led to an accident is input (the third neuron), and a neuron that responds when an image that led to a disaster is input (the fourth neuron).

反応した第１のニューロンから逆に入力層41側に辿ることにより、事故に至りそうな画像らしい目標画像を生成することができ、反応した第２のニューロンから逆に入力層41側に辿ることにより、災害に至りそうな画像らしい目標画像を生成することができ、反応した第３のニューロンから逆に入力層41側に辿ることにより、事故に至った画像らしい目標画像を生成することができ、反応した第４のニューロンから逆に入力層41側に辿ることにより、災害に至った画像らしい目標画像を生成することができる。生成された目標画像を所望の画像として利用できる。 Tracing back from the responding first neuron toward the input layer 41 generates a target image that resembles an image likely to lead to an accident; tracing back from the responding second neuron toward the input layer 41 generates a target image that resembles an image likely to lead to a disaster; tracing back from the responding third neuron toward the input layer 41 generates a target image that resembles an image that led to an accident; and tracing back from the responding fourth neuron toward the input layer 41 generates a target image that resembles an image that led to a disaster. The generated target images can be used as desired images.
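
The description does not specify how a responding neuron is "traced back" to the input layer. One common way to realize such a step is activation maximization: gradient ascent on the input pixels so that the chosen neuron responds as strongly as possible. The following is a minimal sketch under that assumption, using a single sigmoid neuron with made-up weights; the function names and the weight values are hypothetical and are not taken from the specification.

```python
import math

def neuron_response(weights, bias, pixels):
    """Sigmoid response of one output neuron to a flat list of pixel values."""
    z = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def trace_back_to_input(weights, bias, n_steps=200, lr=0.5):
    """Gradient ascent on the input pixels to maximise the neuron's response.

    Starts from a mid-grey image and clips pixels to [0, 1] after each step.
    """
    pixels = [0.5] * len(weights)
    for _ in range(n_steps):
        r = neuron_response(weights, bias, pixels)
        # d(response)/d(pixel_i) = r * (1 - r) * w_i for a sigmoid neuron
        pixels = [min(1.0, max(0.0, p + lr * r * (1.0 - r) * w))
                  for p, w in zip(pixels, weights)]
    return pixels

# Hypothetical weights of a neuron that has learned to respond to some scene.
weights = [2.0, -1.5, 0.8, -0.3]
target_image = trace_back_to_input(weights, bias=-0.5)
```

In a multi-layer model the same idea applies, with the gradient propagated back through every layer rather than through a single neuron.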

図６は、上述のようにして生成された目標画像50の一例であり、事故に至りそうな画像の一例である。 FIG. 6 is an example of the target image 50 generated as described above, and is an example of an image that is likely to lead to an accident.

目標画像50は、ディープ・ラーニングにもとづいて生成された画像であり、実際に撮影された画像とは異なる。ただし、偶然に実際に撮影された画像とほぼ同一の画像となることはあり得る。 The target image 50 is an image generated based on deep learning, and is different from an actually photographed image. However, it is possible that the image may be almost the same as the image that was actually photographed by chance.

この実施例の一例としては、動画を構成するフレームの中から、この目標画像50のシーンとの一致度がしきい値以上のシーンのフレームが見つけられる。 As an example of this embodiment, a frame of a scene whose degree of coincidence with the scene of the target image 50 is equal to or higher than a threshold value is found among the frames constituting the moving image.
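
The frame search described here can be sketched as follows. The specification does not define how the degree of matching is computed, so this sketch assumes a simple measure (one minus the mean absolute pixel difference); `matching_degree`, `find_matching_frames`, and the sample data are illustrative names and values only.

```python
def matching_degree(frame, target):
    """Similarity in [0, 1]: 1 minus the mean absolute pixel difference.

    Both images are flat lists of pixel values in [0, 1] of equal length.
    """
    if len(frame) != len(target):
        raise ValueError("frame and target must have the same number of pixels")
    diff = sum(abs(a - b) for a, b in zip(frame, target)) / len(target)
    return 1.0 - diff

def find_matching_frames(frames, target, threshold):
    """Return indices of frames whose matching degree is at or above threshold."""
    return [i for i, frame in enumerate(frames)
            if matching_degree(frame, target) >= threshold]

target = [0.0, 1.0, 1.0, 0.0]
frames = [
    [0.0, 0.9, 1.0, 0.1],   # close to the target
    [1.0, 0.0, 0.0, 1.0],   # inverse of the target
    [0.0, 1.0, 1.0, 0.0],   # identical to the target
]
hits = find_matching_frames(frames, target, threshold=0.9)
```

Here `hits` contains the indices of the first and third frames; the inverted frame falls far below the threshold.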

図７は、目標画像生成処理手順を示すフローチャートである。 FIG. 7 is a flowchart showing the target image generation processing procedure.

図７に示す目標画像生成処理手順は、多数の画像が得られていればどのようなタイミングで行われてもよい。また、ＳＤカード23またはデータベース22に多数の画像を表す画像データが記録されていればドライブ・レコーダ１において行われてもよいし、サーバ30のハード・ディスク35に多数の画像を表す画像データが記録されていればサーバ30において行われてもよい。複数の車両に搭載されているドライブ・レコーダ（複数のドライブレコーダ）から画像を収集する場合は、サーバ30で目標画像生成処理手順が実行されることが望ましい。さらに、インターネットにアクセスできれば、ＳＤカード23、データベース22、ハード・ディスク35に多数の画像が記録されていなくとも、インターネットから多数の画像を読み取ることにより、ドライブ・レコーダ１またはサーバ30のいずれにおいて目標画像生成処理が行われてもよい。さらに、クライアント・コンピュータ（パーソナル・コンピュータ）において目標画像生成処理が行われてもよい。 The target image generation processing procedure shown in FIG. 7 may be performed at any timing as long as a large number of images are obtained. Further, if image data representing a large number of images is recorded in the SD card 23 or database 22, the process may be performed in the drive recorder 1, or if image data representing a large number of images is stored in the hard disk 35 of the server 30. If it is recorded, it may be performed on the server 30. When collecting images from drive recorders (multiple drive recorders) installed in a plurality of vehicles, it is desirable that the server 30 executes a target image generation processing procedure. Furthermore, if you have access to the Internet, even if many images are not recorded on the SD card 23, database 22, or hard disk 35, by reading many images from the Internet, you can set the target on either the drive recorder 1 or the server 30. Image generation processing may also be performed. Furthermore, target image generation processing may be performed in a client computer (personal computer).

多数の任意の画像が学習モデル40に入力し（ステップ61）、学習モデル40においてディープ・ラーニングが行われる（ステップ62）。上述のように目標画像が得られる（ステップ63）。 A large number of arbitrary images are input to the learning model 40 (step 61), and deep learning is performed in the learning model 40 (step 62). A target image is obtained as described above (step 63).

学習モデル40のプログラムは、ドライブ・レコーダ１のコントローラ20またはサーバ30の制御装置31にインストールされているのはいうまでもない。 Needless to say, the learning model 40 program is installed in the controller 20 of the drive recorder 1 or the control device 31 of the server 30.

図８は、複数の目標画像を生成する方法を示している。 FIG. 8 shows a method for generating multiple target images.

第１の学習モデル40Ａ、第２の学習モデル40Ｂおよび第３の学習モデル40Ｃは、図５に示した学習モデルと同様に、入力層、中間層および出力層を含む。 The first learning model 40A, the second learning model 40B, and the third learning model 40C include an input layer, a middle layer, and an output layer, similar to the learning model shown in FIG.

第１の学習モデル40Ａに入力する画像を第１の画像とし、第２の学習モデル40Ｂに入力する画像を第２の画像とし、第３の学習モデル40Ｃに入力する画像を第３の画像とする。図５を参照して説明したように第１の学習モデル40Ａに膨大な画像を入力してディープ・ラーニングにより学習させると特定の画像（たとえば、事故に至る可能性の高い画像）に反応するニューロンが探し出され、そのニューロンを入力層に辿ることにより特定の画像の特徴を表す画像が得られる。たとえば、事故に至る可能性の高い特徴を表す第１の目標画像が得られる。同様に、第２の学習モデル40Ｂに膨大な画像を入力してディープ・ラーニングにより学習させると特定の画像（たとえば、災害に至る可能性の高い画像）に反応するニューロンが探し出され、そのニューロンを入力層に辿ることにより特定の画像の特徴を表す画像が得られる。たとえば、災害に至る可能性の高い特徴を表す第２の目標画像が得られる。第３の学習モデル40Ｃに膨大な画像を入力してディープ・ラーニングにより学習させると特定の画像（たとえば、事故に至った画像）に反応するニューロンが探し出され、そのニューロンを入力層に辿ることにより特定の画像の特徴を表す画像が得られる。たとえば、事故に至った画像の特徴を表す第３の目標画像が得られる。このように複数種類の目標画像が得られる。 The images input to the first learning model 40A are referred to as first images, those input to the second learning model 40B as second images, and those input to the third learning model 40C as third images. As described with reference to FIG. 5, when an enormous number of images are input to the first learning model 40A and it is trained by deep learning, a neuron that responds to a specific image (for example, an image highly likely to lead to an accident) is identified, and tracing that neuron back to the input layer yields an image representing the features of the specific image. For example, a first target image representing features highly likely to lead to an accident is obtained. Similarly, when an enormous number of images are input to the second learning model 40B and it is trained by deep learning, a neuron that responds to a specific image (for example, an image highly likely to lead to a disaster) is identified, and tracing that neuron back to the input layer yields an image representing the features of the specific image. For example, a second target image representing features highly likely to lead to a disaster is obtained. When an enormous number of images are input to the third learning model 40C and it is trained by deep learning, a neuron that responds to a specific image (for example, an image that led to an accident) is identified, and tracing that neuron back to the input layer yields an image representing the features of the specific image. For example, a third target image representing the features of an image that led to an accident is obtained. In this way, multiple types of target images are obtained.

図８においては複数の学習モデル40Ａ-40Ｃを用いて複数種類の目標画像を得ているが、図５に示すような１つの学習モデル40においても、第１の目標画像に反応するニューロン、第２の目標画像に反応するニューロン、第３の目標画像に反応するニューロンなどを探し出すことにより、入力層41に辿ることにより第１の目標画像、第２の目標画像、第３の目標画像などを見つけることができる。 In FIG. 8, multiple types of target images are obtained using multiple learning models 40A to 40C, but even in one learning model 40 as shown in FIG. By searching for neurons that respond to the second target image, neurons that respond to the third target image, etc., the first target image, second target image, third target image, etc. are traced to the input layer 41. can be found.

図８に示す例においては、第１の画像、第２の画像および第３の画像は、時系列的に連続して撮影された画像とすることが好ましい。それにより、得られる第１の目標画像、第２の目標画像および第３の目標画像も時系列的に順に連続しているものを表すようになる。たとえば、第１の目標画像、第２の目標画像および第３の目標画像が事故に至る可能性の高い画像の特徴を表しているものとすると、第１の目標画像、第２の目標画像および第３の目標画像に近いフレームが順に動画の中で現れると、そのような状況では事故に至る可能性がより高いものと判断でき、１枚の目標画像のみで事故に至る可能性が高いと判断する場合よりも精度が高くなる。 In the example shown in FIG. 8, the first images, second images, and third images are preferably images captured consecutively in time series. The resulting first, second, and third target images then also represent a sequence that is consecutive in time. For example, suppose the first, second, and third target images each represent features of images highly likely to lead to an accident. If frames close to the first target image, the second target image, and the third target image appear in that order in a video, the situation can be judged more likely to lead to an accident, and the judgment is more accurate than one based on a single target image alone.
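
The ordered check described above can be sketched as a single pass over per-frame matching degrees. This is an illustration only: the function name, the threshold, and the score layout are assumptions, and the matching degrees are taken as already computed.

```python
def targets_appear_in_order(frame_scores, threshold):
    """Check whether the target images are matched one after another.

    frame_scores[i][k] is the matching degree between frame i and target
    image k. Returns True once every target has been matched in order.
    """
    if not frame_scores:
        return False
    n_targets = len(frame_scores[0])
    next_target = 0
    for scores in frame_scores:
        if scores[next_target] >= threshold:
            next_target += 1
            if next_target == n_targets:
                return True
    return False

# Frames resembling target 1, then target 2, then target 3.
approaching = [[0.90, 0.20, 0.10], [0.30, 0.95, 0.20], [0.10, 0.30, 0.92]]
# The same frames in reverse order do not form the expected sequence.
receding = list(reversed(approaching))

in_order = targets_appear_in_order(approaching, threshold=0.9)
reversed_order = targets_appear_in_order(receding, threshold=0.9)
```

Requiring the matches to occur in sequence is what distinguishes this check from testing each target image independently.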

図９（Ａ）、（Ｂ）および（Ｃ）は、上述のようにして生成された第１の目標画像、第２の目標画像および第３の目標画像の一例である。 FIGS. 9A, 9B, and 9C are examples of the first target image, second target image, and third target image generated as described above.

第１の目標画像51、第２の目標画像52および第３の目標画像53となるにつれて被写体として映っている子供が自動車に近づいてしまい、事故に至る可能性が高くなっていることがわかる。これらの第１の目標画像51、第２の目標画像52および第３の目標画像53を所望の画像とし、これらの第１の目標画像51、第２の目標画像52および第３の目標画像53と同じようなシーンが動画の中から見つけられると、そのシーンのときには高い確率で事故に至ると判断できる。 It can be seen that, going from the first target image 51 to the second target image 52 and then to the third target image 53, the child appearing as the subject comes closer to the automobile and the likelihood of an accident increases. When the first target image 51, the second target image 52, and the third target image 53 are used as the desired images and scenes similar to them are found in a video, it can be judged that those scenes lead to an accident with high probability.

上述の実施例では、３枚の第１の目標画像51、第２の目標画像52および第３の目標画像53を所望の画像としているが、２枚または４枚以上の目標画像を生成し、所望の画像としてもよい。 In the above embodiment, the three first target images 51, the second target image 52, and the third target image 53 are the desired images, but two or four or more target images may be generated, It may be a desired image.

図10は、目標画像を生成する他の方法を示すもので、敵対的生成ネットワークの電気的構成を示すブロック図である。 FIG. 10 shows another method of generating a target image, and is a block diagram showing the electrical configuration of a generative adversarial network.

所望の画像の特徴量を表す特徴量データが生成器71に入力する。生成器71において、入力した特徴量データから所望の画像の疑似画像を表す疑似画像データが生成される。生成された疑似画像データは生成器71から識別器72に入力する。識別器72には所望の画像を表す画像データも入力する。識別器72において、所望の画像を表す画像データによって表される所望の画像と疑似画像データによって表される疑似画像とが同じまたは類似した画像かどうかが識別される。識別結果を表すデータは生成器71および重みづけ変更器73のそれぞれに入力する。識別結果を表すデータにもとづいて生成器71が調整され、かつ重みづけ変更器73によって識別器72の特徴量の重みづけが変更させられる。 Feature amount data representing the feature amount of a desired image is input to the generator 71. The generator 71 generates pseudo image data representing a pseudo image of a desired image from the input feature data. The generated pseudo image data is input from the generator 71 to the discriminator 72. Image data representing a desired image is also input to the discriminator 72. The classifier 72 identifies whether the desired image represented by the image data representing the desired image and the pseudo image represented by the pseudo image data are the same or similar images. Data representing the identification result is input to each of the generator 71 and the weighting changer 73. The generator 71 is adjusted based on the data representing the classification result, and the weighting changer 73 changes the weighting of the feature amount of the classifier 72.

生成器71における疑似画像データの生成、識別器72における識別、生成器71における識別結果にもとづく調整および重みづけ変更器73における識別器72の重みづけの変更が繰り返されると、生成器71によって生成された疑似画像データによって表される疑似画像と所望の画像とが同一または近似と識別器72において判定される。そのように判定された疑似画像が目標画像とされる。同一種類の複数の画像を所望の画像とし、上述した処理を繰り返すことにより得られた疑似画像を目標画像としてもよい。このようにして得られた目標画像が動画を構成するフレームから見つけられる所望の画像となる。 When the generation of pseudo image data in the generator 71, the identification in the discriminator 72, the adjustment based on the discrimination result in the generator 71, and the change in the weighting of the discriminator 72 in the weighting changer 73 are repeated, the generator 71 generates The discriminator 72 determines that the pseudo image represented by the generated pseudo image data and the desired image are the same or similar. The pseudo image determined in this way is set as the target image. A plurality of images of the same type may be used as desired images, and a pseudo image obtained by repeating the above-described process may be used as the target image. The target image obtained in this manner becomes a desired image found from the frames constituting the moving image.

上述の実施例においては、任意の画像を用いて目標画像を生成し生成した目標画像を所望の画像としたり、所望の画像を用いて疑似画像を生成し生成した疑似画像を所望の画像としたりしているが、事故または災害に至る可能性の高い画像、事故または災害に至った画像、ユーザが探したいシーンの画像などがすでに分かっている場合などには、目標画像、疑似画像などを、ディープ・ラーニングなどを用いて生成することなく、それらの画像自体を所望の画像として利用してもよい。たとえば、イベント記録の動画（たとえば、車両に搭載されているドライブ・レコーダが動画を記録する場合に車両に衝撃などが与えられた場合に記録される動画）がある場合には、そのイベント記録の動画を構成する１または複数のフレームを所望の画像としてもよい。 In the above embodiment, a target image is generated using an arbitrary image and the generated target image is used as the desired image, or a pseudo image is generated using the desired image and the generated pseudo image is used as the desired image. However, if you already know the images that are likely to lead to an accident or disaster, the images that led to the accident or disaster, or the images of the scene that the user wants to search, you can use target images, pseudo images, etc. These images themselves may be used as desired images without generating them using deep learning or the like. For example, if there is a video of an event record (for example, a video recorded when a drive recorder installed in a vehicle records a video when the vehicle is subjected to an impact, etc.), the event record One or more frames constituting a moving image may be used as the desired image.

図11は、学習モデルに学習させて学習済みモデルを生成する方法を示している。 FIG. 11 shows a method for training a learning model to generate a trained model.

目標画像を生成する他の一例であり、学習モデルの一例を示している。 This is another example of generating a target image, and shows an example of a learning model.

学習モデル80には、入力層81、中間層（隠れ層）82および出力層83が含まれている。 The learning model 80 includes an input layer 81, an intermediate layer (hidden layer) 82, and an output layer 83.

所望の画像を教師データとし、所望の画像を構成する画素Ｐ１からＰＮを表すデータを、画像を構成する順に入力層81に入力する。中間層82を介して入力層81に入力した画素Ｐ１からＰＮと同じデータが出力層83から所望の画像を構成する順と同じ順に出力するように、入力層81、中間層82および出力層83の特徴量、重みづけなどを必要に応じて調整する。すなわち、学習モデル80の出力が入力とほぼ同じようになるように機械学習を行い、学習モデル80の特徴量、重みづけなどを必要に応じて調整する。 Using a desired image as teacher data, data representing pixels P1 to PN constituting the desired image are input to the input layer 81 in the order of constituting the image. The input layer 81, the intermediate layer 82, and the output layer 83 are configured such that the same data as pixels P1 to PN inputted to the input layer 81 via the intermediate layer 82 is outputted from the output layer 83 in the same order as the desired image. Adjust the feature values, weighting, etc. as necessary. That is, machine learning is performed so that the output of the learning model 80 is almost the same as the input, and the feature amounts, weighting, etc. of the learning model 80 are adjusted as necessary.

入力層81に入力する画素を表すデータの数を減らしたり、出力層83から出力する画素を表すデータの数を減らしたりしてもよい。画像の同一位置の画素を表すデータが入力と出力とでほぼ同じとなるように学習モデル80の特徴量、重みづけなどを必要に応じて調整してもよい。 The number of data representing pixels input to the input layer 81 may be reduced, or the number of data representing pixels output from the output layer 83 may be reduced. The feature amounts, weighting, etc. of the learning model 80 may be adjusted as necessary so that the data representing pixels at the same position in the image is almost the same between input and output.

このような学習をさせた学習モデルに任意の画像（例えば、動画を構成する各フレーム）を表すデータを入力すると、所望の画像に一致する画像ほど出力層83から出力する各画素のデータが、所望の画像を構成する画素Ｐ１からＰＮのデータに近くなる。 When data representing an arbitrary image (for example, each frame constituting a video) is input to a learning model that has been trained in this way, the data of each pixel output from the output layer 83 will be The data is close to that of pixels P1 to PN that constitute the desired image.
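
The training and use of learning model 80 described above correspond to what is commonly called an autoencoder: the model is fitted so that its output reproduces the desired image, and afterwards images resembling the desired image are reproduced more closely than others. The following is a minimal sketch under strong simplifying assumptions (one hidden unit, linear layers, a single four-pixel desired image, plain gradient descent); all names and values are hypothetical and the specification does not prescribe this structure.

```python
def train_autoencoder(desired, n_steps=2000, lr=0.05):
    """Fit a one-hidden-unit linear autoencoder so that output ~= input.

    The encoder maps the pixels to one hidden value h = enc . x, and the
    decoder maps h back to pixels y_i = dec_i * h.
    """
    n = len(desired)
    enc = [0.1] * n
    dec = [0.1] * n
    for _ in range(n_steps):
        h = sum(e * x for e, x in zip(enc, desired))
        err = [d * h - x for d, x in zip(dec, desired)]      # output - input
        back = sum(2.0 * r * d for r, d in zip(err, dec))    # dLoss/dh
        dec = [d - lr * 2.0 * r * h for d, r in zip(dec, err)]
        enc = [e - lr * back * x for e, x in zip(enc, desired)]
    return enc, dec

def reconstruction_error(enc, dec, image):
    """Sum of squared differences between the model output and its input."""
    h = sum(e * x for e, x in zip(enc, image))
    return sum((d * h - x) ** 2 for d, x in zip(dec, image))

desired = [0.1, 0.9, 0.5, 0.3]   # stands in for the desired image
other = [0.9, 0.1, 0.2, 0.8]     # stands in for an unrelated frame
enc, dec = train_autoencoder(desired)
err_desired = reconstruction_error(enc, dec, desired)
err_other = reconstruction_error(enc, dec, other)
```

After training, `err_desired` is much smaller than `err_other`, which is the property the second embodiment relies on when scoring frames.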

第１実施例によると、所望の画像を生成でき、動画を構成する複数フレームの中から所望の画像のシーンに近似したシーンのフレームを検出するのに利用できる学習済みモデルを得られる。 According to the first embodiment, a trained model can be obtained that can generate a desired image and can be used to detect a frame of a scene that approximates the scene of the desired image from among a plurality of frames constituting a moving image.

［第２実施例］ [Second Embodiment]
図12から図16は、第２実施例を示している。第２実施例は、図11を参照して生成された学習済みの学習モデル80を利用して動画データを構成する複数フレームの中から所望の画像との一致度が高いフレームを探し出すものである。第２実施例では、撮影装置によって撮影された動画データを記録し、記録した動画データを読み取り、読み取られた動画データを構成する複数フレームの中から所望の画像との一致度が高いフレームを探し出している。以下では、所望の画像を検出するために加速度センサ18により検出された加速度を用いること、および動画の中に一定以上の速度または加速度で移動している対象物があるかどうかを判断する処理が併用されているが、少なくとも、ドライブ・レコーダ１により撮影された画像を、第１実施例で説明した方法で生成した学習済みモデルに適用することにより、所望の画像を検出するものであればよい。 FIGS. 12 to 16 show a second embodiment. The second embodiment uses the trained learning model 80, generated as described with reference to FIG. 11, to search the frames constituting video data for frames with a high degree of matching with a desired image. In the second embodiment, video data captured by a photographing device is recorded, the recorded video data is read, and frames with a high degree of matching with the desired image are searched for among the frames constituting the read video data. In the following, detection of the desired image is combined with use of the acceleration detected by the acceleration sensor 18 and with processing that determines whether the video contains an object moving at or above a certain speed or acceleration; however, it suffices that the desired image is detected at least by applying the images captured by the drive recorder 1 to the trained model generated by the method described in the first embodiment.

図12は、図１および図２に示すドライブ・レコーダ１によって常時記録中に行われるイベント記録の処理手順を示すフローチャートである。 FIG. 12 is a flowchart showing an event recording process performed by the drive recorder 1 shown in FIGS. 1 and 2 during continuous recording.

図12に示す処理手順を実施するためのプログラムはドライブ・レコーダのコントローラ20にインストールされていてもよいし、ＳＤカード（プログラムを格納した記録媒体の一例である）23に格納されていてもよいし、ネットワークを介して送信されたものを通信回路19によって受信し、ドライブ・レコーダ１にインストールしてもよい。 A program for implementing the processing procedure shown in FIG. 12 may be installed in the controller 20 of the drive recorder, or may be stored in the SD card 23 (which is an example of a recording medium storing the program). However, the information transmitted via the network may be received by the communication circuit 19 and installed in the drive recorder 1.

ドライブ・レコーダ１が車両に搭載されている場合、その車両のエンジンがかけられるとドライブ・レコーダ１のカメラ17により撮影が開始され、撮影によって得られた動画データがＳＤカード・リーダ・ライタ14によってＳＤカード23への記録（常時記録）が開始される。常時記録は車両のエンジンがかけられることにより開始し、車両のエンジンが切られることにより終了する。また、車両のエンジンがかかっていない場合でも操作ボタン12から記録開始指令がコントローラ20に与えられると常時記録が開始され、操作ボタン12から記録終了指令（記録停止指令の一例である）がコントローラ20に与えられると常時記録が終了する。常時記録の開始から終了までが常時記録の一つの動画となる。また、この実施例においてはカメラ17による撮影によって得られた動画データはＲＡＭ20ｃにも与えられ記録される。ＲＡＭ20ｃに記録された動画データは一定期間の間だけ周期的に記録され、その一定期間が経過すると繰り返し上書きされる。 When the drive recorder 1 is mounted on a vehicle, the camera 17 of the drive recorder 1 starts shooting when the vehicle's engine is started, and the SD card reader/writer 14 starts recording the resulting video data on the SD card 23 (continuous recording). Continuous recording starts when the vehicle's engine is started and ends when the engine is turned off. Even when the engine is not running, continuous recording starts when a recording start command is given to the controller 20 from the operation buttons 12, and ends when a recording end command (an example of a recording stop command) is given to the controller 20 from the operation buttons 12. The span from the start to the end of continuous recording forms one continuously recorded video. In this embodiment, the video data obtained by shooting with the camera 17 is also supplied to and recorded in the RAM 20c. The video data in the RAM 20c is recorded cyclically for a fixed period only and is repeatedly overwritten once that period has elapsed.
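
The cyclic recording in the RAM 20c, where only the most recent period of video is kept and older data is overwritten, behaves like a ring buffer. A minimal sketch, assuming the buffer is sized in frames rather than in seconds; the class and method names are hypothetical.

```python
from collections import deque

class FrameRingBuffer:
    """Keeps only the most recent `capacity` frames, like the cyclic RAM area."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def record(self, timestamp, frame):
        # When the buffer is full, appending drops the oldest entry.
        self._frames.append((timestamp, frame))

    def snapshot(self):
        """Return the buffered (timestamp, frame) pairs, oldest first."""
        return list(self._frames)

buffer = FrameRingBuffer(capacity=3)
for t in range(5):
    buffer.record(t, f"frame-{t}")
recent = buffer.snapshot()
```

After five frames have been recorded into a three-frame buffer, `recent` holds only frames 2, 3, and 4.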

この実施例においては加速度センサ18によりイベントが検出される、または操作ボタン12による記録指令が与えられることによりイベントが検出されると（ステップ91でＹＥＳ）、ＳＤカード23のイベント記録領域にイベント記録が行われる（ステップ98）。 In this embodiment, when an event is detected by the acceleration sensor 18, or when an event is detected because a recording command is given via the operation buttons 12 (YES in step 91), event recording is performed in the event recording area of the SD card 23 (step 98).

加速度センサ18または記録指令によるイベントの検出がされない場合には（ステップ91でＮＯ）、コントローラ20（第１の記録制御手段の一例である）によってＲＡＭ20ｃ（記録媒体の一例である）に記録されている動画データが読み取られ、読み取られた動画データによって表される動画の中に一定以上の速度または加速度で移動している対象物があるかどうかがコントローラ20によって判断される（ステップ92）。この対象物は、あらかじめ定められていてもよいし、定められてなくともよい。ドライブ・レコーダ１が搭載されている車両と対象物との相対的な速度または加速度でもよいし、対象物の絶対的な速度または加速度でもよい。 If no event is detected by the acceleration sensor 18 or by a recording command (NO in step 91), the controller 20 (an example of a first recording control means) reads the video data recorded in the RAM 20c (an example of a recording medium) and determines whether the video represented by the read video data contains an object moving at or above a certain speed or acceleration (step 92). This object may or may not be predetermined. The speed or acceleration may be relative between the object and the vehicle on which the drive recorder 1 is mounted, or may be the absolute speed or acceleration of the object.

読み取られた動画データによって表される動画の中に一定以上の速度または加速度で移動している対象物がある場合には（ステップ92でＹＥＳ）、ＲＡＭ20ｃから読み取られた動画データは学習済みの学習モデル80に１フレームずつ入力させられる（ステップ93）。学習モデル80において学習処理が行われ、学習モデル80の出力層83から特徴画像を表す画像データが１フレームずつ得られる（ステップ94）。学習モデル80はハードウエアのように記載されているが、実際にはコントローラ20においてソフトウエアによって実施される。 If the video represented by the read video data contains an object moving at or above a certain speed or acceleration (YES in step 92), the video data read from the RAM 20c is input to the trained learning model 80 one frame at a time (step 93). Learning processing is performed in the learning model 80, and image data representing a feature image is obtained from the output layer 83 of the learning model 80 one frame at a time (step 94). Although the learning model 80 is described as if it were hardware, it is actually implemented by software in the controller 20.

得られた特徴画像と所望の画像との一致度がコントローラ20によって算出される（ステップ95）。 The degree of coincidence between the obtained feature image and the desired image is calculated by the controller 20 (step 95).

図13は、学習済みの学習モデル80に動画データが１フレームずつ入力させられる様子を示している。 FIG. 13 shows how video data is input frame by frame to the trained learning model 80.

１フレームを構成する画素Ｐ11からＰ1Nが１フレームの配列にしたがって入力層81に入力し、中間層82および出力層83を介して特徴画像を表す画像データとして画素Ｐ21からＰ2Nのデータが出力される。 The pixels P11 to P1N constituting one frame are input to the input layer 81 in accordance with the pixel arrangement of the frame, and the data of pixels P21 to P2N is output, via the intermediate layer 82 and the output layer 83, as image data representing the feature image.

特徴画像を表すこれらの画素Ｐ21からＰ2Nのデータと所望の画像を構成する各画素のデータとが比較され、特徴画像と所望の画像との一致度算出処理がコントローラ20において行われる。 The data of these pixels P21 to P2N representing the characteristic image are compared with the data of each pixel constituting the desired image, and the controller 20 performs a matching degree calculation process between the characteristic image and the desired image.
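
The pixel-by-pixel comparison described above can be illustrated with a minimal Python sketch. The source does not specify how the degree of coincidence is computed, so the metric here (one minus the normalized mean absolute difference) and the names `matching_degree` and `max_value` are assumptions made purely for illustration.

```python
from typing import Sequence


def matching_degree(feature: Sequence[float],
                    desired: Sequence[float],
                    max_value: float = 255.0) -> float:
    """Return a similarity in [0, 1]; 1.0 means the pixel arrays are identical.

    Assumed metric (not from the patent): 1 - mean absolute pixel
    difference, normalized by the maximum pixel value.
    """
    if len(feature) != len(desired) or len(feature) == 0:
        raise ValueError("images must be non-empty and have the same pixel count")
    total_diff = sum(abs(f - d) for f, d in zip(feature, desired))
    return 1.0 - total_diff / (max_value * len(feature))
```

Here `feature` would hold the data of pixels P21 to P2N and `desired` the pixels of the desired image, both flattened in the same pixel order.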

図14は、一致度算出の処理手順を示すフローチャートである。 FIG. 14 is a flowchart showing the processing procedure for calculating the degree of matching.

まず、所望の画像と特徴画像との一致度が第１の一致度として算出される（ステップ131）。つづいて所望の画像が車両（第２の車両の一例である）に搭載されたドライブ・レコーダ１から得られたときの車両情報（走行情報の一例である）と特徴画像が得られたときに学習モデル80に入力したフレームが、車両（第１の車両の一例である）に搭載されたドライブ・レコーダ１から得られたときの車両情報との一致度が第２の一致度として算出される（ステップ132）。走行情報は、例えば、車両の速度、車体の姿勢、車両の加速度等の車両の状態を示す物理量を含む。このような物理量は、車両の走行に伴って変化する物理量である。これ以外にも、走行情報は、車両の場所（位置情報）等の、車両の走行に伴って変化する情報を含んでもよい。さらに、第１の一致度と第２の一致度とから総合的な一致度が算出される（ステップ133）。 First, the degree of coincidence between the desired image and the feature image is calculated as a first degree of coincidence (step 131). Next, the degree of coincidence between two sets of vehicle information (an example of traveling information) is calculated as a second degree of coincidence (step 132): the vehicle information at the time the desired image was obtained from the drive recorder 1 mounted on a vehicle (an example of a second vehicle), and the vehicle information at the time the frame that was input to the learning model 80 to yield the feature image was obtained from the drive recorder 1 mounted on a vehicle (an example of a first vehicle). The traveling information includes, for example, physical quantities indicating the state of the vehicle, such as the vehicle speed, the attitude of the vehicle body, and the vehicle acceleration. These are physical quantities that change as the vehicle travels. The traveling information may also include other information that changes as the vehicle travels, such as the location of the vehicle (position information). Finally, an overall degree of coincidence is calculated from the first degree of coincidence and the second degree of coincidence (step 133).

図14に示す処理手順では、所望の画像が得られたときの車両情報が利用されているから、所望の画像が学習済みの学習モデルを利用して生成された仮想的な目標画像の場合には第２の一致度を算出せずに第１の一致度を用いて総合的な一致度とされることとなろう。もっとも、所望の画像が仮想的な目標画像の場合であったとしても、その目標画像を得るのに利用した画像についての車両情報がわかる場合にはその車両情報を利用して平均的な車両情報を算出して第２の一致度を算出してもよい。 Because the processing procedure shown in FIG. 14 uses the vehicle information at the time the desired image was obtained, when the desired image is a virtual target image generated using a trained learning model, the second degree of coincidence would not be calculated and the first degree of coincidence would be used as the overall degree of coincidence. Even when the desired image is a virtual target image, however, if the vehicle information for the images used to obtain that target image is known, average vehicle information may be calculated from it and used to calculate the second degree of coincidence.
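
As a rough illustration of steps 131 to 133, the following Python sketch combines the two degrees of coincidence and falls back to the first one when no vehicle information is available. The source does not say how the two values are combined; the weighted average, the per-quantity `scales`, and all function and key names are hypothetical.

```python
from typing import Mapping, Optional


def second_matching_degree(frame_info: Mapping[str, float],
                           desired_info: Mapping[str, float],
                           scales: Mapping[str, float]) -> Optional[float]:
    """Compare vehicle information (speed, acceleration, ...) key by key.

    Returns None when no quantity is available on both sides, e.g. for a
    virtual target image without vehicle information.
    """
    keys = [k for k in scales if k in frame_info and k in desired_info]
    if not keys:
        return None
    # Each quantity scores 1.0 when identical and 0.0 when it differs by
    # at least its (assumed) scale.
    scores = [1.0 - min(abs(frame_info[k] - desired_info[k]) / scales[k], 1.0)
              for k in keys]
    return sum(scores) / len(scores)


def overall_matching_degree(first: float,
                            second: Optional[float],
                            image_weight: float = 0.7) -> float:
    """Assumed combination rule: weighted average of the two degrees."""
    if second is None:
        return first  # fall back to the image-only degree of coincidence
    return image_weight * first + (1.0 - image_weight) * second
```

Calling `overall_matching_degree(first, None)` corresponds to the virtual-target-image case, where only the first degree of coincidence is used.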

図12に戻って、算出された一致度（総合一致度）がしきい値以上であると（ステップ96でＹＥＳ）、ＲＡＭ20ｃに記録された動画データによって表される動画を構成するフレームのうち、所望の画像シーンとの一致度がしきい値以上のシーンを表すフレームが検出されたとしてコントローラ20（警告制御手段の一例である）からイベント記録指令が発生し、警告が行われる（ステップ97）。警告は、ドライブ・レコーダ1のスピーカ15（警告装置の一例である）からの出力される音声による警告でもよいし、ディスプレイ11（警告装置の一例である）に表示する文字、画像などによる警告、LED（light emitting diode）の発光、点滅などによる警告でもよい。また、ドライブ・レコーダ１が搭載されている車両に設けられているスピーカ、表示画面などを用いて警告してもよい。警告により、車両の運転者は所望の画像のシーンに近いシーンの運転状況になる（運転状況になっている）ことがわかる。たとえば、所望の画像が事故に至る可能性の高い画像であるとすると、より慎重な運転を心がけることができる。また、イベント記録指令が発生するとＳＤカード23のイベント記録領域にイベント記録が行われる（ステップ98）。 Returning to FIG. 12, if the calculated degree of coincidence (overall degree of coincidence) is equal to or higher than the threshold (YES in step 96), it is taken that, among the frames constituting the video represented by the video data recorded in the RAM 20c, a frame has been detected that represents a scene whose degree of coincidence with the desired image scene is equal to or higher than the threshold. The controller 20 (an example of warning control means) then issues an event recording command, and a warning is given (step 97). The warning may be an audio warning output from the speaker 15 of the drive recorder 1 (an example of a warning device), a warning by text, images, or the like shown on the display 11 (an example of a warning device), or a warning by lighting or blinking an LED (light emitting diode). The warning may also be given using a speaker, display screen, or the like provided in the vehicle in which the drive recorder 1 is mounted. The warning tells the driver that the vehicle is entering (or is already in) a driving situation close to the scene of the desired image. For example, if the desired image is one that is likely to lead to an accident, the driver can take care to drive more cautiously. In addition, when the event recording command is issued, event recording is performed in the event recording area of the SD card 23 (step 98).

上述の実施例においては、対象物が一定以上の速度または加速度で移動している場合に（ステップ92でＹＥＳ）、撮影データを学習済みの学習モデルに入力しているが、対象物が一定以上の速度または加速度で移動しているかどうかにかかわらず撮影データを学習済みの学習モデルに入力してもよいし、対象物が一定以上の速度または加速度で移動しているかどうかを判断することなく撮影データを学習済みの学習モデルに入力してもよい。 In the above embodiment, the captured data is input to the trained learning model when an object is moving at or above a certain speed or acceleration (YES in step 92). However, the captured data may be input to the trained learning model regardless of whether an object is moving at or above a certain speed or acceleration, or without determining at all whether an object is moving at or above a certain speed or acceleration.

図15は、ＲＡＭ20ｃから読み取られた動画データによって表される動画を構成する複数のフレームのうちの一つのフレームの一例である。図16は、ＲＡＭ20ｃから読み取られた動画データによって表される動画を構成する複数のフレームを表している。 FIG. 15 is an example of one of a plurality of frames constituting a moving image represented by moving image data read from the RAM 20c. FIG. 16 shows a plurality of frames constituting a moving image represented by moving image data read from the RAM 20c.

図16に示すように、動画データによって表される動画は複数のフレームＦＲ１からＦＲｅによって構成され、図15および図16に示すフレームＦＲｎが学習モデル80に入力したときに得られる一致度（総合一致度）がしきい値以上となったものとする。すると、そのフレームＦＲｎの前の数秒間から数十秒間（その他の時間でもよい）に撮影された動画を表すフレームおよびフレームＦＲｎの後の数秒間から数十秒間（その他の時間でもよい）に撮影された動画を表すフレームをそれぞれ表すデータが、ＳＤカード23のイベント記録領域にＳＤカード・リーダ・ライタ14によって記録される。 As shown in FIG. 16, the video represented by the video data is composed of a plurality of frames FR1 to FRe. Suppose that the degree of coincidence (overall degree of coincidence) obtained when the frame FRn shown in FIGS. 15 and 16 is input to the learning model 80 is equal to or higher than the threshold. In that case, data representing the frames of the video captured during the several seconds to several tens of seconds (or some other length of time) before frame FRn, and the frames of the video captured during the several seconds to several tens of seconds (or some other length of time) after frame FRn, is recorded in the event recording area of the SD card 23 by the SD card reader/writer 14.
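
The selection of frames around FRn might be sketched as follows. The frame rate and the default of 10 seconds on each side are illustrative assumptions (the source only says several seconds to several tens of seconds, or some other length of time), and `event_frame_range` is a hypothetical helper name.

```python
def event_frame_range(hit_index: int,
                      total_frames: int,
                      fps: float = 30.0,
                      seconds_before: float = 10.0,
                      seconds_after: float = 10.0) -> range:
    """Return the indices of the frames to copy to the event recording area.

    hit_index is the zero-based index of the frame (FRn) whose overall
    degree of coincidence reached the threshold. The window is clipped to
    the frames that actually exist.
    """
    if not 0 <= hit_index < total_frames:
        raise ValueError("hit_index is outside the recorded video")
    before = int(round(seconds_before * fps))
    after = int(round(seconds_after * fps))
    start = max(0, hit_index - before)
    stop = min(total_frames, hit_index + after + 1)
    return range(start, stop)
```

The frames in the returned range, including FRn itself, would then be written to the event recording area.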

上述の実施例における警告は、シーンの種類または危険度に応じて内容を変えるようにしてもよい。たとえば、危険度が高いシーンではドライバにより認識されるように、音声と表示との両方で警告したり、音量を大きくしたりする。危険度が低いシーン、たとえば、お気に入りの場所に似たシーンが見つかったような場合には、比較的静かな音声でドライバに知らせたりする。 The content of the warning in the above-described embodiment may be changed depending on the type of scene or the degree of danger. For example, in a highly dangerous scene, a warning is given both audibly and visually, or the volume is increased so that the driver can recognize it. If a low-risk scene is found, for example, a scene similar to a favorite place, the system will notify the driver with a relatively quiet voice.

図17は、図13に示す学習モデル80の変形例を示している。 FIG. 17 shows a modification of the learning model 80 shown in FIG. 13.

図17に示す学習モデル105では、入力層106、中間層107および出力層108が含まれており、出力層108がロジスティック回帰層とされている。 The learning model 105 shown in FIG. 17 includes an input layer 106, an intermediate layer 107, and an output layer 108, and the output layer 108 is a logistic regression layer.

動画を構成する複数のフレームを１フレームずつ学習モデル105に入力させる。たとえば、１フレームを構成する画素Ｐ31からＰ3Nを表すデータを画素配列にしたがって学習モデル105の入力層106に入力させる。このような場合に、所望の画像との一致度が高いフレームのときには、ロジスティック回帰層である出力層108から識別データが出力する。識別データが出力されたことにより所望の画像と学習モデル105に入力したフレームとの一致度がしきい値以上と判断され、イベント記録指令がコントローラ20から発生する。学習モデル105も多数の所望の画像を表す画像データを入力層106から入力し、所望の画像を表す画像データが入力層106から入力した場合に出力層108から識別データが出力するように学習させておくことにより生成された学習済みの学習モデルである。 The frames constituting a video are input to the learning model 105 one frame at a time. For example, data representing pixels P31 to P3N constituting one frame is input to the input layer 106 of the learning model 105 in accordance with the pixel arrangement. When the frame has a high degree of coincidence with the desired image, identification data is output from the output layer 108, which is a logistic regression layer. The output of the identification data is taken to mean that the degree of coincidence between the desired image and the frame input to the learning model 105 is equal to or higher than the threshold, and an event recording command is issued by the controller 20. Like the learning model 80, the learning model 105 is a trained learning model, generated by inputting image data representing a large number of desired images from the input layer 106 and training the model so that identification data is output from the output layer 108 when image data representing a desired image is input from the input layer 106.

また、識別データが出力された場合に学習モデル105に入力したフレームが得られたときの車両情報と所望の画像に対応する車両情報との一致度（上述の第２の一致度）がしきい値以上の場合にイベント記録指令が発生するようにしてもよい。 Alternatively, when the identification data is output, the event recording command may be issued if the degree of coincidence between the vehicle information at the time the frame input to the learning model 105 was obtained and the vehicle information corresponding to the desired image (the second degree of coincidence described above) is equal to or higher than a threshold.
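
The decision made with a logistic regression output layer can be sketched as below. This is only an illustration of the general technique: the weights, bias, thresholds, and the names `logistic_output` and `should_record_event` are assumptions, and treating a probability at or above a threshold as "identification data is output" is one possible reading of the source.

```python
import math
from typing import Optional, Sequence


def logistic_output(hidden: Sequence[float],
                    weights: Sequence[float],
                    bias: float) -> float:
    """Sigmoid of a weighted sum, as a logistic regression output unit."""
    z = sum(h * w for h, w in zip(hidden, weights)) + bias
    # Numerically stable sigmoid for large positive or negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def should_record_event(probability: float,
                        second_degree: Optional[float] = None,
                        prob_threshold: float = 0.5,
                        second_threshold: float = 0.5) -> bool:
    """Issue an event recording command when the frame is identified.

    If a second degree of coincidence (vehicle information) is supplied,
    it must also reach its threshold, as in the variation described above.
    """
    if probability < prob_threshold:
        return False
    if second_degree is None:
        return True
    return second_degree >= second_threshold
```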

加速度センサ18によるイベント記録指令や操作ボタン12からのイベント記録指令が無くとも撮影した動画にもとづいてイベント記録を行うことができる。また、車両情報を利用して総合一致度を算出する場合には撮影した動画だけでなく車両情報も利用しているので、車両の状況に応じてより適切なイベント記録を行うこともできる。 Even without an event recording command from the acceleration sensor 18 or an event recording command from the operation button 12, event recording can be performed based on the captured video. Furthermore, when calculating the overall matching degree using vehicle information, not only the captured video but also the vehicle information is used, so it is possible to record events more appropriately depending on the vehicle situation.

上述の実施例においては、所望の画像に近似したフレームを動画の中から検出した場合にはイベント記録を行い、動画データをＳＤカード23に記録しているが、動画データを記録せずに所望の画像に近似したフレームを識別する識別情報をＳＤカード23その他の記録媒体に記録するようにしてもよい。 In the embodiment described above, when a frame approximating the desired image is detected in the video, event recording is performed and the video data is recorded on the SD card 23. Alternatively, instead of recording the video data, identification information that identifies the frame approximating the desired image may be recorded on the SD card 23 or another recording medium.

以上説明した実施例では、総合一致度を用いてイベント録画の有無が判断されたが、第２の一致度を用いず、第１の一致度がしきい値以上であるか否かに基づいて、イベント録画の有無が判断されてもよい。ただし、車両の走行状態を示す物理量を組み合わせれば、所望のシーンのフレームをより精度良く検出できる効果を期待することができる。 In the embodiments described above, whether to perform event recording is determined using the overall degree of coincidence. However, the determination may instead be made without the second degree of coincidence, based on whether the first degree of coincidence is equal to or higher than the threshold. That said, combining physical quantities that indicate the traveling state of the vehicle can be expected to allow frames of the desired scene to be detected more accurately.

［第３実施例］ [Third Embodiment]
この実施例は、記録されたデータのビューアソフト等における自動再生支援機能を提案する。管理目的の場合のドライブ・レコーダにおける映像データでは、長時間かつ基本的には問題のない場合が多く、イベント等であれば発生ポイントがビューアソフトで表示されるが、イベントとなっていない注視ポイントを管理者等が発見・確認することが苦痛になることがあった。そこで、発明者は、ディープ・ラーニング等により所望の映像状況を学習させておき、録画データ中に希望状況と同じまたは類似した映像があった場合、識別させてイベントとは別に注視ポイントとして検出し、ビューアソフト等に表示したり、早送り再生した途中イベント発生ポイント及び注視ポイントでは通常速度に戻したりすることで、管理者等が確認しやすくするという解決方法を、発明者は考えた。 This embodiment proposes an automatic playback support function for viewer software and the like that plays back recorded data. Video data from a drive recorder used for management purposes is typically long and, for the most part, uneventful. For events, the point of occurrence is displayed in the viewer software, but finding and checking points of interest that were not recorded as events could be a burden for administrators and the like. The inventor therefore devised the following solution: a desired video situation is learned in advance by deep learning or the like; when the recorded data contains video that is the same as or similar to the desired situation, it is identified and detected as a point of interest, separately from events; and the point is displayed in the viewer software or the like, or playback returns to normal speed at event occurrence points and points of interest during fast-forward playback, making the recording easier for administrators and the like to check.
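
The idea of returning to normal speed near event occurrence points and points of interest during fast-forward playback could look roughly like this in Python. The window size, the speed values, and the name `playback_speed` are hypothetical; the source does not state how close to a point playback should slow down.

```python
from typing import Iterable


def playback_speed(frame_index: int,
                   points: Iterable[int],
                   window: int = 60,
                   fast: float = 8.0,
                   normal: float = 1.0) -> float:
    """Return the playback speed multiplier for the given frame.

    points holds the frame indices of event occurrence points and points
    of interest. Within `window` frames of any such point the video plays
    at normal speed; elsewhere it is fast-forwarded.
    """
    near_point = any(abs(frame_index - p) <= window for p in points)
    return normal if near_point else fast
```

A viewer loop would call this once per frame and adjust the frame step or timer interval accordingly.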

併せて、この実施例は、記録媒体に記録した映像等のデータの復旧機能を提案する。 In addition, this embodiment proposes a function for recovering data, such as video, recorded on a recording medium.
例えば後述する記録フォーマットを採用する場合、Ｗｉｎｄｏｗｓ（登録商標）等で使われているＦＡＴ方式ではなく、記憶媒体にシーケンシャルにデータを書き込んでいくため、映像が異常な状態になっていても（欠損部分が存在していても）データ自体は記録されるが、特殊な異常が発生した場合、映像データの所定セクタ分だけ０で消されていたり、他の情報が書かれているために、ビューアソフトで映像が再生できないといった問題があった。この場合、記録媒体の中身を１セクタずつ確認し手作業で復旧することがあるが、ディープ・ラーニング等の方法により記録媒体の状況を学習し、欠損部分の前後のデータから消されたデータの復旧を行うだけでなく、上記記録フォーマットとして正常な状態に復旧させビューアソフト等で確認することができるようにするという解決方法を、発明者は考えた。以下、この実施例の詳細を説明する。 For example, when the recording format described later is adopted, data is written to the storage medium sequentially rather than with the FAT scheme used by Windows (registered trademark) and the like, so the data itself is recorded even if the video is in an abnormal state (even if there are missing portions). However, when a particular kind of abnormality occurs, a given number of sectors of the video data may be overwritten with zeros or with other information, and the video then cannot be played back in the viewer software. In such cases the contents of the recording medium are sometimes checked sector by sector and restored by hand. The inventor instead devised a solution in which the state of the recording medium is learned by a method such as deep learning, so that not only is the erased data restored from the data before and after the missing portion, but the medium is also restored to a normal state in the above recording format so that it can be checked with viewer software or the like. The details of this embodiment are described below.

図18から図29は、第３実施例を示すもので、ドライブ・レコーダ１のような記録装置によって記録された動画の再生時についてのものである。第２実施例においてはドライブ・レコーダ１のような記録装置において動画データを記録している場合に所望の画像との一致度が高いフレームが見つかったときにイベント記録が行われるが、第３実施例においてはＳＤカード23（他の記録媒体でもよい）に記録された常時記録の動画データの再生時に所望の画像との一致度が高いフレームを見つけイベント記録を行うものである。第３実施例では、パーソナル・コンピュータを用いて常時記録の動画データを再生しているがドライブ・レコーダ１のような記録装置が再生機能を有している場合には、そのような記録装置において再生を行い、次の処理を行うこともできる。 FIGS. 18 to 29 show the third embodiment, which concerns playback of video recorded by a recording device such as the drive recorder 1. In the second embodiment, event recording is performed when a frame with a high degree of coincidence with the desired image is found while a recording device such as the drive recorder 1 is recording video data. In the third embodiment, by contrast, a frame with a high degree of coincidence with the desired image is found, and event recording is performed, during playback of constantly recorded video data stored on the SD card 23 (or another recording medium). In the third embodiment a personal computer is used to play back the constantly recorded video data, but if a recording device such as the drive recorder 1 has a playback function, playback and the processing described below can also be performed on that recording device.

図18は、動画を再生するパーソナル・コンピュータ110の電気的構成を示すブロック図である。 FIG. 18 is a block diagram showing the electrical configuration of a personal computer 110 that plays moving images.

パーソナル・コンピュータ110の全体の動作は制御装置115によって統括される。 The entire operation of personal computer 110 is supervised by control device 115.

パーソナル・コンピュータ110には表示装置111が設けられており、表示制御装置112の制御により画像等が表示装置111の表示画面に表示させられる。制御装置115には、インターネットなどと通信するための通信装置113およびメモリ114が接続されている。パーソナル・コンピュータ110にはキーボード、マウスなどの入力装置116が設けられており、入力装置116から与えられる指令は制御装置115に入力する。 The personal computer 110 is provided with a display device 111, and images and the like are displayed on the display screen of the display device 111 under the control of a display control device 112. A communication device 113 and a memory 114 for communicating with the Internet and the like are connected to the control device 115. The personal computer 110 is provided with an input device 116 such as a keyboard and a mouse, and commands given from the input device 116 are input to the control device 115.

さらに、制御装置115にはハード・ディスク118に記録されているデータを読み取り、ハード・ディスク118にデータを記録するハード・ディスク・ドライブ117およびＳＤカードに記録されているデータを読み取り、ＳＤカード23にデータを記録するＳＤカード・リーダ・ライタ119も接続されている。 Also connected to the control device 115 are a hard disk drive 117, which reads data recorded on a hard disk 118 and records data on the hard disk 118, and an SD card reader/writer 119, which reads data recorded on the SD card 23 and records data on the SD card 23.

図19は、ＳＤカード23の記録フォーマットを示している。 FIG. 19 shows the recording format of the SD card 23.

ＳＤカード23の記録領域には、ファイル・システム領域121、常時記録領域122およびイベント記録領域125が形成されている。 A file system area 121, a constant recording area 122, and an event recording area 125 are formed in the recording area of the SD card 23.

ファイル・システム領域121には、常時記録領域122およびイベント記録領域125に記録されているデータを再生する専用ソフトウエアが記録されている。 Dedicated software for reproducing data recorded in the constant recording area 122 and the event recording area 125 is recorded in the file system area 121.

常時記録領域122には管理領域123と記録領域124とが形成されている。管理領域123にはファイル・システム領域121に記録されている専用ソフトウエアによって各種設定情報が記録される。記録領域124には、動画を構成するフレームを表す画像データがフレーム番号順に記録される。記録領域124には、１フレームごとにヘッダ記録領域131、フレーム画像データ記録領域132およびフッタ記録領域133が形成されている。ヘッダにはフレーム番号、アドレス位置、撮影時刻などの付加情報が記録され、フッタにはヘッダに記録される付加情報以外の付加情報が記録される。 A management area 123 and a recording area 124 are formed in the permanent recording area 122. Various setting information is recorded in the management area 123 by dedicated software recorded in the file system area 121. In the recording area 124, image data representing frames constituting a moving image is recorded in the order of frame numbers. In the recording area 124, a header recording area 131, a frame image data recording area 132, and a footer recording area 133 are formed for each frame. Additional information such as a frame number, address position, and shooting time is recorded in the header, and additional information other than the additional information recorded in the header is recorded in the footer.

常時記録領域122の記録領域124に記録されるフレーム１からフレームＥまでが常時記録の一つの動画期間を表す。 Frame 1 to frame E recorded in the recording area 124 of the constant recording area 122 represents one moving image period of constant recording.

イベント記録領域125も常時記録領域122と同様に、管理領域126と記録領域127とが形成されている。管理領域126にもファイル・システム領域121に記録されている専用ソフトウエアによって各種設定情報が記録される。記録領域127には、動画を構成するフレームを表す画像データがフレーム番号順に記録される。記録領域127にも、１フレームごとにヘッダ記録領域131、フレーム画像データ記録領域132およびフッタ記録領域133が形成されている。ヘッダにはフレーム番号、アドレス位置、撮影時刻などの付加情報が記録され、フッタにはヘッダに記録される付加情報以外の付加情報が記録される。 Similar to the constant recording area 122, the event recording area 125 also includes a management area 126 and a recording area 127. Various setting information is also recorded in the management area 126 by the dedicated software recorded in the file system area 121. In the recording area 127, image data representing frames constituting a moving image is recorded in the order of frame numbers. Also in the recording area 127, a header recording area 131, a frame image data recording area 132, and a footer recording area 133 are formed for each frame. Additional information such as a frame number, address position, and shooting time is recorded in the header, and additional information other than the additional information recorded in the header is recorded in the footer.

イベント記録領域125の記録領域127に記録されるフレーム１からフレームＥまでがイベント記録の一つの動画を表す。 Frame 1 to frame E recorded in the recording area 127 of the event recording area 125 represents one moving image of event recording.
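
To make the per-frame layout (header, frame image data, footer) concrete, here is a small Python sketch that packs and unpacks one frame record. The byte layout is entirely hypothetical: the source only says the header holds the frame number, address position, capture time, and similar information, and that the footer holds other additional information. The image length field and the CRC-32 footer are assumptions chosen for illustration.

```python
import struct
import zlib
from typing import Tuple

# Hypothetical little-endian layouts (not taken from the patent):
# header = frame number, start address, capture time in seconds, image length
HEADER = struct.Struct("<IQdI")
# footer = CRC-32 of the image data, standing in for "other additional information"
FOOTER = struct.Struct("<I")


def pack_frame(frame_number: int, address: int,
               capture_time: float, image: bytes) -> bytes:
    """Serialize one frame as header + image data + footer."""
    header = HEADER.pack(frame_number, address, capture_time, len(image))
    footer = FOOTER.pack(zlib.crc32(image))
    return header + image + footer


def unpack_frame(record: bytes) -> Tuple[int, int, float, bytes]:
    """Parse a record produced by pack_frame and verify its footer."""
    frame_number, address, capture_time, length = HEADER.unpack_from(record, 0)
    image = record[HEADER.size:HEADER.size + length]
    (crc,) = FOOTER.unpack_from(record, HEADER.size + length)
    if zlib.crc32(image) != crc:
        raise ValueError("frame image data does not match its footer")
    return frame_number, address, capture_time, image
```

Writing such records back to back, in frame-number order, mirrors the sequential recording described for the recording areas 124 and 127.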

図20は再生ウインドウ150の一例である。 FIG. 20 is an example of the playback window 150.

パーソナル・コンピュータ110にＳＤカード23が装填され、表示装置111の表示画面に現れるＳＤカード23のアイコンがクリックされると、専用ソフトウエアのアイコンが現れる。専用ソフトウエアのアイコンがクリックされると、専用ソフトウエアが起動し、再生ウインドウ150が表示装置111の表示画面に表示される。 When the SD card 23 is loaded into the personal computer 110 and the icon of the SD card 23 appearing on the display screen of the display device 111 is clicked, an icon of dedicated software appears. When the dedicated software icon is clicked, the dedicated software is started and a playback window 150 is displayed on the display screen of the display device 111.

再生ウインドウ150には、選択された動画を表示する映像表示領域151、イベント記録リスト表示領域152、地図画像表示領域153、車両情報等表示領域160、常時記録リスト表示領域161および加速度表示領域163が形成されている。 The playback window 150 includes a video display area 151 for displaying the selected video, an event record list display area 152, a map image display area 153, a vehicle information display area 160, a constant record list display area 161, and an acceleration display area 163. It is formed.

映像表示領域151は、ドライブ・レコーダ１等により記録された映像（常時記録動画、イベント記録動画）を表示する領域である。イベント記録リスト表示領域152は、イベント記録リストを記録時間の形態（記録時間以外の形態でもよい）で表示する領域である。イベント記録リスト表示領域152に表示されるイベント記録リストは、ＳＤカード23に記録されているすべてのイベント記録が表示される。常時記録動画が選択された場合には、選択された常時記録動画の撮影中に記録されたイベント記録動画をイベント記録リスト表示領域152に表示するようにしてもよい。 The video display area 151 is an area for displaying video recorded by the drive recorder 1 or the like (constantly recorded video, event recorded video). The event record list display area 152 is an area for displaying an event record list in a format of recording time (a format other than recording time may be used). In the event record list displayed in the event record list display area 152, all event records recorded in the SD card 23 are displayed. When a constantly recorded video is selected, event recorded videos recorded during the shooting of the selected constantly recorded video may be displayed in the event record list display area 152.

地図画像表示領域153は、映像表示領域151に表示されている映像の撮影場所近傍の地図の画像を表示する領域である。車両の位置を表すデータをパーソナル・コンピュータ110からインターネット上に存在する地図サーバに送信し、地図サーバから車両の位置近傍の地図を表すデータをパーソナル・コンピュータ110に送信することにより、地図画像表示領域153に地図の画像が表示される。車両情報等表示領域160は、映像の記録時における車両の速度などの情報を表示する領域である。車両情報等表示領域160には、映像表示領域151に表示されている映像を記録したドライブ・レコーダ１が取り付けられている車両の状態などを表示する車両状態表示領域158、映像の再生、停止などの操作指令を与える操作ボタン157、車室内の映像を明るく表示する処理の指示を与える室内強調ボタン154、車室内の映像を明るく表示する処理を停止して通常の表示を行う処理の指示を与える通常表示ボタン155、映像表示領域151に表示されている映像の記録時刻を表示する時刻表示領域156、車両の加速度をグラフ化して表示する加速度グラフ表示領域159などが含まれている。 The map image display area 153 displays a map image of the vicinity of the location where the video shown in the video display area 151 was captured. Data representing the vehicle's position is transmitted from the personal computer 110 to a map server on the Internet, and the map server transmits data representing a map of the vicinity of the vehicle's position to the personal computer 110, so that the map image is displayed in the map image display area 153. The vehicle information display area 160 displays information such as the vehicle speed at the time the video was recorded. The vehicle information display area 160 includes a vehicle status display area 158 that displays, among other things, the status of the vehicle fitted with the drive recorder 1 that recorded the video shown in the video display area 151; operation buttons 157 for giving operation commands such as video playback and stop; an interior emphasis button 154 for instructing processing that displays the video of the vehicle interior more brightly; a normal display button 155 for instructing that this brightening be stopped and normal display be resumed; a time display area 156 that displays the recording time of the video shown in the video display area 151; and an acceleration graph display area 159 that displays the vehicle's acceleration as a graph.

常時記録リスト表示領域161は、ＳＤカード23に記録されている常時記録動画をリストで表示する領域である。加速度表示領域162には自動車の画像が表示されている。自動車の画像の前に前方に向く矢印とともに「＋Ｘ」の文字、自動車の画像の横に横方向に向く矢印とともに「＋Ｙ」の文字、および自動車の画像の上に上方向に向く矢印とともに「＋Ｚ」の文字が表示されている。自動車の画像の前方に自動車の前方または後方に対する加速度が表示され、自動車の画像の横方向に自動車の横方向に対する加速度が表示され、自動車の画像の上方向に自動車の垂直方向に対する加速度が表示される。自動車の外観とともに自動車の加速度が表示されるので、どの方向の加速度が自動車にかかっているのかが比較的わかりやすくなる。 The constant record list display area 161 displays a list of the constantly recorded videos stored on the SD card 23. An image of an automobile is displayed in the acceleration display area 162. The characters "+X" are displayed in front of the automobile image together with an arrow pointing forward, the characters "+Y" beside the image together with an arrow pointing sideways, and the characters "+Z" above the image together with an arrow pointing upward. The acceleration in the automobile's front-rear direction is displayed in front of the image, the acceleration in its lateral direction beside the image, and the acceleration in its vertical direction above the image. Because the acceleration is displayed together with the appearance of the automobile, it is relatively easy to see in which direction acceleration is acting on the automobile.

図21は、再生処理手順を示すフローチャート、図22および図23は、再生ウインドウ150の一例を示している。 FIG. 21 is a flowchart showing a playback processing procedure, and FIGS. 22 and 23 show an example of the playback window 150.

再生ウインドウ150のイベント記録リスト表示領域152または常時記録リスト表示領域161に表示されているイベント記録リストまたは常時記録リストの中から所望のイベント記録動画または常時記録動画が選択されると、その選択された動画を表す動画データがＳＤカード・リーダ・ライタ119によってＳＤカード23の常時記録領域122から読み取られる（ステップ141）。この実施例では、常時記録動画が選択されたものとする。 When a desired event-recorded video or constantly recorded video is selected from the event record list or constant record list displayed in the event record list display area 152 or the constant record list display area 161 of the playback window 150, the video data representing the selected video is read from the constant recording area 122 of the SD card 23 by the SD card reader/writer 119 (step 141). In this embodiment, it is assumed that a constantly recorded video has been selected.

常時記録動画が選択されると、再生ウインドウ150の映像表示領域151には、図22に示すように、選択された常時記録動画の代表画像が表示される（ステップ142）。操作ボタン157の中の再生ボタンが押されると（ステップ143でＹＥＳ）、読み取られた動画データの再生処理が制御装置115（再生装置、再生制御手段の一例である）によって行われる（ステップ144）。映像表示領域151には選択された常時記録動画が表示されるようになる。 When a constantly recorded video is selected, a representative image of the selected constantly recorded video is displayed in the video display area 151 of the playback window 150, as shown in FIG. 22 (step 142). When the playback button among the operation buttons 157 is pressed (YES in step 143), the control device 115 (playback device, an example of playback control means) performs playback processing of the read video data (step 144). . The selected constantly recorded video comes to be displayed in the video display area 151.

常時記録動画の再生中に操作ボタン157の中の停止ボタンが押されると（ステップ145でＹＥＳ）、制御装置115によって再生中の動画が停止させられる（ステップ146）。 When the stop button among the operation buttons 157 is pressed during playback of the constantly recorded moving image (YES at step 145), the moving image being played is stopped by the control device 115 (step 146).

イベント記録リストの中から所望のイベント記録動画が選択された場合も、選択されたイベント記録動画を表す動画データがＳＤカード・リーダ・ライタ119によってＳＤカード23のイベント記録領域125から読み取られる（ステップ141）。 Even when a desired event recording video is selected from the event recording list, video data representing the selected event recording video is read from the event recording area 125 of the SD card 23 by the SD card reader/writer 119 (step 141).

イベント記録動画が選択されると、再生ウインドウ150の映像表示領域151には、図23に示すように、選択されたイベント記録動画の代表画像が表示される（ステップ142）。イベント記録動画の代表画像は、イベント記録指令が発生した時点に撮影された画像でもよいし、イベント記録動画の最初の画像でもよい。 When an event recorded video is selected, a representative image of the selected event recorded video is displayed in the video display area 151 of the playback window 150, as shown in FIG. 23 (step 142). The representative image of the event recording video may be an image taken at the time when the event recording command is issued, or may be the first image of the event recording video.

操作ボタン157の中の再生ボタンが押されると（ステップ143でＹＥＳ）、読み取られた動画データの再生処理が制御装置115によって行われる（ステップ144）。映像表示領域151には選択されたイベント記録動画が表示されるようになる。イベント記録動画の再生中に操作ボタン157の中の停止ボタンが押されると（ステップ145でＹＥＳ）、制御装置115によって再生中の動画が停止させられる（ステップ146）。 When the playback button among the operation buttons 157 is pressed (YES at step 143), the control device 115 performs playback processing of the read video data (step 144). The selected event recording video comes to be displayed in the video display area 151. When the stop button among the operation buttons 157 is pressed during playback of the event recorded video (YES in step 145), the control device 115 stops the video being played (step 146).

上述した常時記録動画の再生処理中においても図12ステップ92から98までの処理を行い、所望の画像が検出されると制御装置115によってイベント記録領域125にイベント記録されるようにしてもよい。イベント記録された場合には、イベント記録リスト表示領域152に追加で表示されるようになる。好ましくは再生時に所望の画像が検出された常時記録の動画を構成するフレームのヘッダなどに、所望の画像に関連した画像である旨の情報を記録し、その動画が再生された場合に、見つけられたイベント記録の時間などを表示装置111（報知装置の一例である）の再生ウインドウ150に表示（報知の一例である）するように制御装置115または表示制御装置112（報知制御手段の一例である）によって表示装置111を制御してもよい。 Steps 92 to 98 of FIG. 12 may also be performed during the playback of constantly recorded video described above, so that when the desired image is detected the control device 115 performs event recording in the event recording area 125. When an event is recorded in this way, it is added to the event record list display area 152. Preferably, information indicating that an image is related to the desired image is recorded in, for example, the header of the frame of the constantly recorded video in which the desired image was detected during playback. When that video is subsequently played back, the control device 115 or the display control device 112 (an example of notification control means) may control the display device 111 (an example of a notification device) so that the time of the event record that was found, or similar information, is displayed (an example of notification) in the playback window 150.

図24（Ａ）および図24（Ｂ）は、一つの動画を構成するフレームのフレーム番号とアドレス位置情報との関係を示すテーブルの一例である。 24(A) and 24(B) are examples of tables showing the relationship between frame numbers of frames constituting one moving image and address position information.

フレーム番号およびアドレス位置情報は上述したように常時記録動画、イベント記録動画にかかわらずヘッダ記録領域131に記録されているので図24（Ａ）および図24（Ｂ）に示すテーブルを作成する必要はないが、図24（Ａ）および図24（Ｂ）に示すようなテーブルを生成し、常時記録領域122の管理領域123またはイベント記録領域125の管理領域126に記録してもよい。 As mentioned above, the frame number and address position information are recorded in the header recording area 131 regardless of whether it is a constantly recorded video or an event recorded video, so there is no need to create the tables shown in FIGS. 24(A) and 24(B). However, a table as shown in FIGS. 24(A) and 24(B) may be generated and recorded in the management area 123 of the constant recording area 122 or the management area 126 of the event recording area 125.

上述したように、ヘッダ記録領域131には、常時記録動画またはイベント記録動画を構成するフレームに対応してフレーム番号およびアドレス位置情報（フレームの先頭のアドレスを示す情報）が記録されている。この実施例による常時記録動画またはイベント記録動画を表す動画データは、フレーム番号順に常時記録領域122の記録領域124またはイベント記録領域125の記録領域127に連続して記録される。たとえば、図24（Ａ）に示すように、フレーム１からフレームＥまでのフレームによって構成される一つの動画がフレームごとに常時記録領域122の記録領域124またはイベント記録領域125の記録領域127に記録される。 As described above, the header recording area 131 holds a frame number and address position information (information indicating the address of the beginning of the frame) for each frame constituting a constantly recorded video or an event-recorded video. In this embodiment, the video data representing a constantly recorded video or an event-recorded video is recorded contiguously, in frame-number order, in the recording area 124 of the constant recording area 122 or the recording area 127 of the event recording area 125. For example, as shown in FIG. 24(A), one video composed of frames 1 to E is recorded frame by frame in the recording area 124 of the constant recording area 122 or the recording area 127 of the event recording area 125.

In such a case, at least one of the frame number and the address position information may fail to be recorded correctly and end up in an abnormal state. For example, the frame numbers should be consecutive but are not; the address position information indicates an address far larger than the preceding address position information; or the address position information indicates an address earlier than the preceding address position information. In such cases, that part of the moving image may be unplayable even though the moving image data itself is intact.

For example, in FIG. 24(A) all frame numbers and address position information are recorded correctly, whereas in FIG. 24(B) the frame numbers and address position information of frames 11 to 20 have abnormal values, as indicated by hatching, and the moving image represented by frames 11 to 20 often cannot be played back correctly.

In this embodiment, if at least one of the frame number and the address position information is not recorded correctly, it is rewritten to a correct value. In addition, if the moving image data itself is not recorded correctly, for example because part of it is missing, the missing moving image data is generated and written to the frames that were not recorded correctly.

FIG. 25 is a flowchart showing the repair processing procedure, and FIG. 26 shows the recording format of the SD card 23.

The repair process may be started when the SD card 23 is loaded into the personal computer 110 and the playback software is launched; it may be started for a given moving image when that moving image is designated for playback; or a repair mode button or the like may be provided in the playback window 150, with the process starting when that button is pressed.

When the repair process for a desired moving image is started, the control device 115 reads the frame numbers recorded in the header recording areas 131 of the frames constituting that moving image (step 171). If the frame numbers read are not consecutive, the control device 115 determines that the non-consecutive frame numbers were not recorded correctly (YES in step 172). In addition, as shown in Equation 1, the address position information of a frame plus the amount of moving image data in that frame equals the address position information of the next frame; if this relationship does not hold, the control device 115 determines that the address position information was not recorded correctly (YES in step 172).

Address position information of a frame + amount of moving image data in that frame = address position information of the next frame ... Equation 1
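The checks of steps 171 and 172 can be illustrated with a minimal sketch. It assumes each header has already been parsed into a frame number, a start address, and a data amount; the `FrameHeader` type, its field names, and `find_inconsistencies` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class FrameHeader:
    number: int   # frame number read from the header recording area (assumed field)
    address: int  # address of the beginning of the frame (assumed field)
    size: int     # amount of moving image data in the frame (assumed field)


def find_inconsistencies(headers):
    """Return the indices of frames whose header breaks continuity with the
    preceding frame: a non-consecutive frame number, or a start address that
    violates Equation 1 (address + size of previous frame)."""
    bad = []
    for i in range(1, len(headers)):
        prev, cur = headers[i - 1], headers[i]
        number_ok = cur.number == prev.number + 1
        address_ok = cur.address == prev.address + prev.size  # Equation 1
        if not (number_ok and address_ok):
            bad.append(i)
    return bad
```

Because each frame is compared only with its predecessor, the frame immediately following a corrupted header is flagged as well. A real implementation would need some further rule for deciding which of the two headers is actually at fault.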

If at least one of the frame number and the address position information is not recorded correctly (NO in step 172), the control device 115 repairs the incorrectly recorded frame number or address position information (or both) (step 173). As described above, if the frame numbers are not recorded correctly, the control device 115 rewrites them so that they run consecutively in the order of the frames constituting the moving image. If the address position information is not recorded correctly, the control device 115 rewrites it to values that satisfy the relationship shown in Equation 1.
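The rewriting in step 173 might look like the following sketch. It assumes that the number and start address of the first frame and the data amount of every frame are trustworthy; that assumption, and the function name `repair_headers`, are illustrative and not taken from the specification.

```python
def repair_headers(sizes, first_number, first_address):
    """Rebuild frame numbers and start addresses for a run of frames.

    sizes: data amount of each frame, in recording order (assumed reliable).
    Frame numbers are made consecutive, and each address is derived from the
    previous one with Equation 1: next address = address + data amount.
    """
    numbers = []
    addresses = []
    address = first_address
    for i, size in enumerate(sizes):
        numbers.append(first_number + i)
        addresses.append(address)
        address += size  # Equation 1
    return numbers, addresses
```

For example, `repair_headers([100, 120, 80], 1, 4096)` yields frame numbers 1, 2, 3 at addresses 4096, 4196, 4316.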

Next, the control device 115 determines whether the moving image data recorded in the recording area 124 or 127 is recorded correctly (step 174). If the scene represented by successive frames differs too greatly, for example if an automobile is visible up to a certain frame but the next frame is an image in which no object can be recognized, such as an all-black or all-white image, that frame is judged not to have been recorded correctly. However, even a frame at which the scene changes abruptly is judged to have been recorded correctly if it is determined, from the shooting position information or the like, to belong to a moving image of the same shooting scene. This prevents frames from being misjudged as incorrectly recorded when the scene changes abruptly, as it does when an accident occurs. Alternatively, the user may check visually and identify frames that were not recorded correctly.
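One possible reading of the judgment in step 174 is sketched below. The specification does not say how "all black", "all white", or "same shooting scene" are decided, so everything here is an assumption: frames are flat lists of 8-bit grayscale values, positions are (x, y) coordinates in metres in a local frame, and the thresholds `dark`, `bright`, and `max_jump_m` are hypothetical.

```python
def is_blank(frame, dark=10, bright=245):
    """True if the mean brightness suggests an all-black or all-white image."""
    mean = sum(frame) / len(frame)
    return mean <= dark or mean >= bright


def is_recorded_normally(prev_frame, frame, prev_pos, pos, max_jump_m=50.0):
    """Judge one frame against its predecessor.

    A frame is suspect only if it is blank while the previous frame was not.
    A suspect frame is still accepted when the shooting positions are close
    enough to be treated as the same shooting scene (assumed criterion).
    """
    abrupt_change = is_blank(frame) and not is_blank(prev_frame)
    if not abrupt_change:
        return True
    if prev_pos is None or pos is None:
        return False  # no position information to rescue the frame
    dx = pos[0] - prev_pos[0]
    dy = pos[1] - prev_pos[1]
    return (dx * dx + dy * dy) ** 0.5 <= max_jump_m
```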

If the moving image data is not recorded correctly (NO in step 174), the incorrectly recorded moving image data is generated and the missing portion (missing data) of the moving image is repaired (step 175). The method of repairing the missing portion is described later. Referring to FIG. 26, suppose that, of the moving image data recorded in the recording area 124 of the constant recording area 122, the data recorded in frames 10 to 20 is not recorded correctly, as indicated by hatching; the moving image data of frames 10 to 20 is then repaired. When a missing portion is repaired, it is preferable to record, as additional information in the header recording area 131, the footer recording area 133, or the like, information indicating that each repaired frame is a repaired frame rather than a frame obtained by shooting. This makes it possible to distinguish real scenes actually obtained by shooting from pseudo scenes generated afterwards.

Repairing a missing portion of a moving image may change the amount of data in the repaired frames, so the address position information of the repaired frames is also likely to change. The address position information is therefore corrected in accordance with Equation 1 (step 176).

In this embodiment, a learning model is used to repair the moving image data; however, the moving image data may instead be generated without a learning model, by image interpolation using, for example, the frames before and after the frames to be repaired.
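The specification does not name a particular interpolation method. As one simple possibility, the sketch below blends linearly between the last good frame before the gap and the first good frame after it. Frames are assumed to be equal-length flat lists of pixel values, and `interpolate_frames` is a hypothetical name.

```python
def interpolate_frames(before, after, count):
    """Generate `count` frames between `before` and `after` by linear blending.

    before, after: flat lists of pixel values of equal length (assumed layout).
    The k-th generated frame uses weight t = k / (count + 1) for `after`.
    """
    frames = []
    for k in range(1, count + 1):
        t = k / (count + 1)
        frames.append([round((1 - t) * b + t * a) for b, a in zip(before, after)])
    return frames
```

Linear blending only cross-fades the two images; it does not follow moving objects, which may be one reason the embodiment prefers a learning model.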

FIGS. 27 and 28 show a method of generating the learning model. FIG. 27 shows an example of the learning model, and FIG. 28 shows how the learning model is trained. FIG. 29 is a flowchart showing the learning model generation processing procedure.

In FIGS. 27 and 28, the ten frames 21 to 30 (an example of second moving image data, teacher data, and a first portion) are generated by machine learning using frames 1 to 20, which precede frame 21, and frames 31 to 50, which follow frame 30 (an example of first moving image data, teacher data, and a second image). Frames 1 to 50 are frames actually obtained by shooting in other vehicles, for example by a drive recorder mounted on another vehicle that traveled through the same location or through a different location where the same or a similar view (for example, scenery) can be captured. The machine learning is supervised learning in which frames 1 to 20 and frames 31 to 50 are the input and frames 21 to 30 are the output. Frames other than frames 21 to 30 can be generated in the same way.

The learning model 180 includes an input layer 181, an intermediate layer 182, and an output layer 183. Frames 1 to 20 and frames 31 to 50 are input to the neurons of the input layer 181 in frame-number order, that is, in the order in which they were shot. As shown in FIG. 28, each frame is input to the input layer 181 pixel by pixel in pixel-array order: the data of pixel E1, then the data of pixel E2, and so on. In this way, the moving image data (frames 1 to 20 and frames 31 to 50) excluding one portion of the moving image (frames 21 to 30) is input to the input layer 181 (step 191).

Data input to the input layer 181 passes through the intermediate layer 182 and is output from the output layer 183. The learning model 180 is trained so that the neurons of the output layer 183 output the data corresponding to frames 21 to 30 in pixel-array order. As shown in FIG. 28, training of the learning model is complete when, with frames 1 to 20 and frames 31 to 50 input in frame-number (shooting) order, frames 21 to 30 are output from the output layer 183. In this way, training is performed so that the output is a given portion of the moving image (frames 21 to 30). By training in the same way on a large number of other moving image data (step 192), a learning model is obtained that can generate a defective portion of a moving image from the frames before and after that portion.
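The arrangement of one training sample described above (steps 191 and 192) can be sketched as follows. Only the data layout is shown, not the network or its training. The function name `make_training_pair` and the representation of a frame as a flat list of pixel values are assumptions.

```python
def make_training_pair(frames, gap_start=20, gap_len=10):
    """Split a 50-frame clip into a network input and a teacher output.

    frames: frames in shooting order, each a flat list of pixel values in
            pixel-array order (assumed layout).
    gap_start: 0-based index of the first held-out frame (frame 21 -> index 20).
    Returns (input_vector, target_vector), each flattened frame by frame.
    """
    context = frames[:gap_start] + frames[gap_start + gap_len:]  # frames 1-20, 31-50
    target = frames[gap_start:gap_start + gap_len]               # frames 21-30
    input_vector = [p for frame in context for p in frame]
    target_vector = [p for frame in target for p in frame]
    return input_vector, target_vector
```

With 50 frames of N pixels each, the input vector has 40 × N entries and the target vector 10 × N entries, which would correspond to the widths of the input layer 181 and the output layer 183 under this layout.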

In FIGS. 27 to 29 the input is 40 frames and the output is 10 frames, but the numbers of input and output frames are not limited to these and can be changed. Alternatively, the numbers of input and output frames may both be fixed: if the defective portion to be generated has fewer frames than the output, the output frames may be thinned out to match the number of frames in the defective portion; if it has more frames than the output, the number of frames can be adjusted by, for example, interpolation using the moving image data of the output frames.
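The frame-count adjustment described above can be sketched as follows. The thinning rule (evenly spaced picks) and the interpolation rule (linear blending of neighbouring output frames) are assumptions; the specification only says "thinning" and "interpolation or the like". `fit_frame_count` is a hypothetical name.

```python
def fit_frame_count(frames, wanted):
    """Resize a non-empty list of model output frames to `wanted` frames.

    Fewer frames wanted: keep evenly spaced frames (thinning).
    More frames wanted: linearly interpolate between neighbouring frames.
    Each frame is a flat list of pixel values.
    """
    n = len(frames)
    if wanted == n:
        return [list(f) for f in frames]
    if wanted < n:
        return [list(frames[(i * n) // wanted]) for i in range(wanted)]
    out = []
    for i in range(wanted):
        pos = i * (n - 1) / (wanted - 1)  # fractional position in the source
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        t = pos - lo
        out.append([(1 - t) * a + t * b for a, b in zip(frames[lo], frames[hi])])
    return out
```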

Preferably, a plurality of learning models 180, each suited to a different kind of scene, is generated in advance, and the defective portion is generated using the learning model that matches the scene. For example, a learning model 180 used to generate the defective portion when an automobile ahead has an accident, and a learning model 180 used when a bird, bicycle, pedestrian, or the like suddenly appears ahead, are prepared according to the scene before or after the defective portion, and the learning model 180 that matches the scene is used to generate the defective portion.

Furthermore, when the moving image was shot by a drive recorder or the like mounted on a vehicle, the defective portion may be generated using a learning model 180 that was trained on moving images whose vehicle information closely matches that of the vehicle carrying the drive recorder that shot the defective moving image.

FIG. 30 is a flowchart showing the procedure for generating a missing portion of a moving image.

When a defective portion is found in the moving image data recorded on the SD card 23, the moving image data before and after the defective portion is input, frame by frame and in shooting order, to the trained learning model 180 (step 201).

The trained learning model 180 (an example of a second learning model) is made to learn from the input moving image data (step 202), and the control device 115 (an example of moving image data generation means) inputs the moving image data containing the missing portion (an example of a third image) to the learning model 180 and causes the learning model 180 to output, in frame order, moving image data representing the missing portion (an example of a fourth image) (step 203). The moving image data of the missing portion obtained in this way is recorded in the missing portion of the recording area 124 or 127 by the control device 115 (an example of second recording control means) (step 204).
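The flow of steps 201 to 204 can be outlined with the sketch below. The trained learning model 180 is represented by an arbitrary callable `model` that takes the surrounding frames and returns the generated frames. That interface, the name `repair_gap`, and the per-frame `repaired` flags (standing in for the additional information said to be recorded in the header or footer recording area) are all assumptions.

```python
def repair_gap(frames, gap_start, gap_len, model, n_before=20, n_after=20):
    """Fill frames[gap_start : gap_start + gap_len] using `model`.

    model: callable taking the context frames (before + after, in shooting
           order) and returning at least `gap_len` generated frames
           (hypothetical interface).
    Returns (repaired_frames, repaired_flags); a flag is True for each frame
    that was generated rather than obtained by shooting.
    """
    gap_end = gap_start + gap_len
    if gap_start < n_before or gap_end + n_after > len(frames):
        raise ValueError("not enough frames around the gap")
    context = frames[gap_start - n_before:gap_start] + frames[gap_end:gap_end + n_after]
    generated = model(context)  # corresponds to steps 201 to 203
    if len(generated) < gap_len:
        raise ValueError("model returned too few frames")
    repaired = list(frames)
    flags = [False] * len(frames)
    for k in range(gap_len):    # corresponds to step 204
        repaired[gap_start + k] = generated[k]
        flags[gap_start + k] = True
    return repaired, flags
```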

Recording errors such as incorrect frame numbers on the memory card 23 can thus be repaired, and so can missing portions of a moving image.

In the above description, in response to detecting an image representing the scene, information on that image is displayed in the event record list display area 152, and when the displayed information is selected, the event-recorded moving image containing the image representing the scene is displayed on the basis of the corresponding moving image data. The response is not limited to this: during playback of a moving image recorded on the SD card 23, the selection may instead cause playback to fast-forward to the point at which the image of the scene is displayed, or to jump (cue) to that point.

In each of the embodiments described above, the various processes are performed on moving image data, but they may also be performed on data that is not in a moving image format. Such data may consist of a set of still-image-format data items representing still images shot in chronological order.

In the embodiments described above, the drive recorder 1 has the acceleration sensor 18 for detecting the state of the vehicle (running state and the like); however, if the acceleration sensor 18 is not used for generating the learning model or detecting scenes, it may be omitted. In the second embodiment, the drive recorder 1 may skip the processing of step 91, or may skip only that part of step 91 in which an event is detected by the acceleration sensor 18. In that case, an image of the specific scene is detected on the basis of whether the first degree of coincidence is at or above the threshold, without using the second degree of coincidence. In this case, too, the drive recorder 1 need not have the acceleration sensor 18. Likewise, when the drive recorder 1 detects the state of the vehicle using a sensor other than the acceleration sensor 18, a configuration without that sensor is possible.

Functions implemented using deep learning techniques in the embodiments described above may instead be implemented using other machine learning or other alternative means, where such means exist. For example, pattern matching techniques may be used as the means for detecting images with a high degree of coincidence.

The scope of the present invention is not limited to the configurations explicitly described in the specification, and also includes combinations of the various aspects of the invention disclosed herein. Of the present invention, the configurations for which a patent is sought are specified in the appended claims; however, the applicant intends to claim in the future configurations disclosed in this specification even if they are not currently specified in the claims.

The present invention is not limited to the configurations described in the above embodiments. The components of the embodiments and modifications described above may be selected and combined as desired. Any component of any embodiment or modification may also be combined as desired with any component described in the means for solving the invention, or with a component that embodies any component described there. The applicant intends to acquire rights to these as well through amendment of this application, divisional applications, or the like. Phrases such as "in the case of" or "when" do not limit a configuration to that case or time. Configurations outside those cases or times are also disclosed, and the applicant intends to acquire rights to them. Likewise, passages described in a particular order are not limited to that order. Configurations in which some parts are deleted or the order is changed are also disclosed, and the applicant intends to acquire rights to them.

The applicant also intends to acquire rights to the entire design or a partial design by converting this application into a design registration application. Although the drawings depict the entire device in solid lines, they encompass not only the entire design but also partial designs claimed for parts of the device. For example, they encompass partial designs consisting of some members of the device and, independently of the members, partial designs consisting of some portion of the device. Such a portion may be a member of the device or a part of such a member. The applicant intends to obtain rights not only to the entire design but also to partial designs in which any of the solid-line portions of the drawings are rendered as broken lines. Furthermore, the modules, members, parts, and the like inside the housing of the device that appear in the drawings are each independently tradable, and the applicant likewise intends to obtain rights to them by converting to a design registration application.

1: drive recorder, 3: windshield, 4: rear-view mirror, 5: cigarette lighter socket, 6: power cable, 10: SD card slot, 11: display, 12: operation button, 13: joint rail, 14: SD card reader/writer, 15: speaker, 16: GPS receiver, 17: camera, 18: acceleration sensor, 19: communication circuit, 20: controller, 20c: RAM, 20d: timer, 21: LTE module, 22: database, 23: SD card, 30: server, 31: control device, 33: memory, 34: hard disk drive, 35: hard disk, 40: learning model, 40A: first learning model, 40B: second learning model, 40C: third learning model, 41: input layer, 42: intermediate layer, 43: output layer, 43a: neuron, 50: target image, 51: first target image, 52: second target image, 53: third target image, 71: generator, 72: discriminator, 73: weighting changer, 80: learning model, 81: input layer, 82: intermediate layer, 83: output layer, 105: learning model, 106: input layer, 107: intermediate layer, 108: output layer, 110: personal computer, 111: display device, 112: display control device, 113: communication device, 114: memory, 115: control device, 116: input device, 117: hard disk drive, 118: hard disk, 119: SD card reader/writer, 121: file system area, 122: constant recording area, 123: management area, 124: recording area, 125: event recording area, 126: management area, 127: recording area, 131: header recording area, 132: frame image data recording area, 133: footer recording area, 150: playback window, 151: video display area, 152: event record list display area, 153: map image display area, 154: interior emphasis button, 155: normal display button, 156: time display area, 157: operation button, 158: vehicle state display area, 159: acceleration graph display area, 160: vehicle information display area, 161: record list display area, 162: acceleration display area, 163: acceleration display area, 180: learning model, 181: input layer, 182: intermediate layer, 183: output layer, 421: input layer, E1: pixel, E2: pixel, FR1: frame, FRn: frame, P1: pixel, P21: pixel

Claims

1. A system comprising a function of detecting, on the basis of an image that defines a scene of a specific image that can be captured from a vehicle and of a physical quantity detected by a function of detecting a physical quantity indicating the running state of the vehicle, an image representing the scene of the specific image from among images captured from the vehicle at each of a plurality of time points,
wherein the function of detecting an image representing the scene of the specific image detects the image representing the scene of the specific image from among the images captured from the vehicle at each of the plurality of time points, on the basis of the degree of coincidence between the image defining the scene of the specific image and the image captured from the vehicle, and of the physical quantity at the time the image captured from the vehicle was obtained.

2. The system according to claim 1, wherein the function of detecting an image representing the scene of the specific image detects the image representing the scene of the specific image on the basis of a first degree of coincidence, which is the degree of coincidence between the image defining the scene of the specific image and the image captured from the vehicle, and a second degree of coincidence, which is the degree of coincidence between the physical quantity at the time the image defining the scene of the specific image was obtained and the physical quantity at the time the image captured from the vehicle was obtained.

3. The system according to claim 1, further comprising a function of controlling a warning device to issue a warning in response to detection of an image representing the scene of the specific image.

4. The system according to any one of claims 1 to 3, further comprising a function of controlling a recording device to record an image representing the scene of the specific image on a recording medium.

5. A program for causing a computer to realize the function, of the system according to any one of claims 1 to 4, of detecting an image representing the scene of the specific image.