JP2023176441A

JP2023176441A - System, control method, and program or the like

Info

Publication number: JP2023176441A
Application number: JP2022088719A
Authority: JP
Inventors: 忠俣江; Tadashi Matae; 勇喜清水; Yuki Shimizu
Original assignee: Yupiteru Corp
Current assignee: Yupiteru Corp
Priority date: 2022-05-31
Filing date: 2022-05-31
Publication date: 2023-12-13

Abstract

To provide a system that makes an inference while image capturing, a control method thereof, and a program, which can substantially shorten the time required to obtain an inference result. SOLUTION: A system 5 includes an imaging device 1 installed in a vehicle 4 such as a car to image a traveling direction, a rear direction, etc., of the car, and infers captured video data obtained by image capturing, with an inference model, while image capturing. By inferring the captured video data while image capturing, the time required to obtain an inference result is substantially shortened, and the time from imaging end to inference end is shortened. SELECTED DRAWING: Figure 1

Description

The present invention relates to, for example, a system, a control method, a program, and the like.

In the conventional art, for example, video data obtained by photographing a subject with a vehicle-mounted photographing device or the like is input to a trained model, and feature images are obtained from the trained model (Patent Document 1).

Patent Document 1: Japanese Patent Application Publication No. 2021-164034

For example, when inference is performed with an inference model on captured video data obtained by a vehicle-mounted photographing device, the time required for inference may become longer as the amount of captured video data increases. In particular, inference on continuously recorded video data often takes several times the recording time. For example, if inference is started only after recording ends, the time from the start of recording to the end of inference may become long.

In view of the above problem, one object of the present invention is to provide a technique different from the conventional one, such as performing inference while shooting.

The object of the present invention is not limited to this. The applicant also intends to acquire rights, through divisional applications, amendments, and the like, to configurations whose purpose is to obtain the effects produced by parts of the configurations disclosed in this specification, the drawings, and the like. For example, this specification discloses problems obtained by reading passages such as "can" or "is possible" as "is a problem to be solved." Each problem is described as independent of the others, and the applicant intends to acquire rights individually, through divisional applications, amendments, and the like, to the configuration for solving each problem. Even where a problem is understood only implicitly from the description, the applicant intends to claim part of the configurations described in this specification by amendment or divisional application. Configurations that solve problems combining these independent problems are also disclosed, and the applicant intends to acquire rights to them.

(1) The system according to the present invention preferably has a function of performing inference with an inference model, while shooting with a vehicle-mounted photographing device, using the captured video data obtained by the shooting.

In this way, the time from the end of shooting to the end of inference can be shortened. The inference model is, for example, a trained model that has been trained using training data, and inference is, for example, inputting an arbitrary image to the trained model and obtaining an output. Vehicle-mounted photographing devices preferably include those mounted on vehicles such as automobiles, buses, trucks, and forklifts, as well as those mounted on two-wheeled vehicles such as motorcycles and bicycles.
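As an illustration only, inference while shooting can be sketched as a producer/consumer pair: one thread stands in for the camera loop and another runs the inference model on frames as they arrive. The names `run_pipeline`, `frames`, and `infer` are hypothetical and do not come from the specification, and the queue-based hand-off is an assumed design, not the disclosed implementation.

```python
import queue
import threading

def run_pipeline(frames, infer):
    """Run `infer` on frames while they are still being captured."""
    q = queue.Queue(maxsize=8)   # small buffer between capture and inference
    results = []

    def capture():
        for frame in frames:     # stands in for the camera capture loop
            q.put(frame)
        q.put(None)              # sentinel: capture has finished

    def worker():
        while True:
            frame = q.get()
            if frame is None:
                break
            results.append(infer(frame))

    t_capture = threading.Thread(target=capture)
    t_infer = threading.Thread(target=worker)
    t_capture.start()
    t_infer.start()
    t_capture.join()
    t_infer.join()
    return results
```

Because inference starts with the first frame rather than after the last one, only the frames still waiting in the queue remain to be processed when capture ends.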

(2) The inference model is preferably determined from among a plurality of types.

In this way, suitable inference can be performed. The plurality of types preferably includes inference models of different versions. The inference model is preferably determined in response to a selection command. The selection command may be a command from the user, or may be determined automatically by the system.

(3) The inference model is preferably determined so that an inference model suited to the shooting environment is used.

In this way, inference suited to the shooting environment can be performed. The shooting environment preferably includes one or more of the following: the conditions around the photographing device (brightness, darkness, and so on); the state of the vehicle on which the photographing device is mounted (vehicle speed and so on); and the state of the photographing device itself (the position where it is mounted, whether it is mounted inside the vehicle cabin or exposed outside the vehicle, and whether it photographs the front, the right or left side, the rear, or the entire surroundings as in a celestial sphere image or a circumferential image).

(4) The inference model is preferably determined so that an inference model suited to the shooting location, the weather at the shooting location, or the brightness of the shooting location is used.

In this way, inference suited to the shooting location, the weather at the shooting location, or the brightness of the shooting location can be performed. The kind of shooting location (urban area, suburb, mountain road, and so on) can be determined, for example, from the video data obtained by shooting, or from the latitude and longitude obtained with the GPS (Global Positioning System) function of the photographing device. The weather at the shooting location can be determined, for example, by accessing a weather server from the shooting location, and the brightness can be determined from an illuminance sensor or the like.
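One way to picture this selection is a lookup from the shooting environment to a model identifier. The sketch below is an assumption for illustration: the `Environment` fields, the category strings, and the model names in `MODELS` are all hypothetical and are not defined in the specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Environment:
    place: str     # hypothetical categories: "urban", "suburb", "mountain"
    weather: str   # hypothetical categories: "clear", "rain"

# Hypothetical model identifiers keyed by (place, weather).
MODELS = {
    ("urban", "clear"): "urban_clear_v1",
    ("urban", "rain"): "urban_rain_v1",
    ("mountain", "clear"): "mountain_clear_v1",
}
DEFAULT_MODEL = "general_v1"   # fallback when no specific model is registered

def select_model(env: Environment) -> str:
    """Return the model registered for this environment, or the fallback."""
    return MODELS.get((env.place, env.weather), DEFAULT_MODEL)
```

A plain dictionary keeps the mapping easy to extend, for example with different versions of the same model.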

(5) The inference model to be used is preferably determined so that the darker the shooting location, the higher the precision of the inference model used relative to when it is bright. For example, for object detection, a high-precision inference model whose detection accuracy in the dark is higher than that of the inference model used when it is bright is preferable.

In this way, the darker the shooting location, the higher the precision of the inference that can be performed.

(6) The inference model to be used is preferably determined so that inference is performed with the high-precision inference model at night.

In this way, inference suited to nighttime can be performed. Whether it is nighttime may be determined, for example, from the time of shooting. Nighttime may also be assumed when an illuminance sensor indicates that it is dark.
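The nighttime decision described above can be sketched as follows. The hour boundaries, the illuminance threshold `dark_lux`, and the model names are assumptions chosen for illustration; the specification does not give concrete values.

```python
def is_night(hour, lux=None, dark_lux=50.0, night_start=18, night_end=6):
    """Treat the scene as night if the sensor reads dark or the clock says so.

    All thresholds here are hypothetical placeholders.
    """
    if lux is not None:
        return lux <= dark_lux           # illuminance sensor takes priority
    return hour >= night_start or hour < night_end

def pick_model(hour, lux=None):
    """Use the high-precision model at night, the standard one otherwise."""
    return "high_precision" if is_night(hour, lux) else "standard"
```

In this sketch the illuminance reading, when available, overrides the clock, so a dark tunnel at noon also selects the high-precision model. That priority is a design choice of the example, not a requirement of the text.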

(7) The inference result of the inference model is preferably corrected according to a disturbance applied to the vehicle-mounted photographing device.

In this way, an inference result can be obtained that withstands disturbances applied to the vehicle-mounted photographing device. A disturbance is, for example, a factor external to the photographing device that disrupts its operation. For example, a disturbance may be recognized when the brightness of the place being photographed is at or below a threshold, or when light strikes the photographing device. Disturbances may also include disturbances to the vehicle or the like on which the photographing device is mounted. For example, a disturbance may be recognized when a G sensor provided in the vehicle detects that an acceleration at or above a predetermined threshold has been applied to the vehicle, or when the vehicle speed reported by the vehicle's speedometer is at or above a predetermined threshold.
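The threshold checks in this paragraph can be sketched as a single function that lists which disturbances are currently present. The threshold values and the label strings are hypothetical; only the comparisons (brightness at or below a threshold, acceleration and speed at or above a threshold) follow the text.

```python
def detect_disturbances(lux, accel_g, speed_kmh, *,
                        min_lux=10.0, max_accel_g=0.5, max_speed_kmh=100.0):
    """Return labels for the disturbances indicated by the sensor readings.

    Threshold defaults are placeholder values for illustration.
    """
    found = []
    if lux <= min_lux:                 # scene brightness at or below threshold
        found.append("low_light")
    if accel_g >= max_accel_g:         # G sensor reading at or above threshold
        found.append("acceleration")
    if speed_kmh >= max_speed_kmh:     # speedometer reading at or above threshold
        found.append("speed")
    return found
```

The returned labels could then drive whatever correction of the inference result is appropriate for each kind of disturbance.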

(8) The inference result of the inference model is preferably corrected according to a disturbance applied from the vehicle to the vehicle-mounted photographing device by movement of the vehicle on which the device is installed, or a disturbance applied to the vehicle-mounted photographing device by light striking the device.

In this way, an inference result can be obtained that withstands disturbances applied from the vehicle to the vehicle-mounted photographing device by movement of the vehicle, or disturbances applied to the device by light striking it. A disturbance applied from the vehicle by its movement is, for example, vibration of the vehicle (such as vibration caused by a change in vehicle speed or by an impact on the vehicle). Light striking the vehicle-mounted photographing device includes, for example, glare from reflected sunlight, direct sunlight, backlighting, light from vehicle lamps, and reflected light. Whether light has struck the device may be detected with, for example, a photometer.

(9) It is preferable to notify whether inference with the inference model is possible or not possible given the disturbance applied to the vehicle-mounted photographing device.

In this way, it is possible to tell whether an inference result is missing because inference is not possible. Whether inference is not possible is preferably defined in advance for each type and magnitude of disturbance; the system then determines from the type and magnitude whether inference is not possible and gives notification.
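A predefined table of limits per disturbance type is one way to realize this decision. In the sketch below the disturbance names, the limit values in `LIMITS`, and the notification strings are all hypothetical.

```python
# Hypothetical per-type magnitude limits above which inference is
# treated as not possible.
LIMITS = {
    "vibration_g": 1.0,
    "glare_lux": 50000.0,
}

def can_infer(disturbances):
    """True if every measured disturbance is below its predefined limit.

    `disturbances` maps a disturbance type to its measured magnitude.
    Types without a registered limit never block inference.
    """
    return all(magnitude < LIMITS.get(kind, float("inf"))
               for kind, magnitude in disturbances.items())

def status_message(disturbances):
    """Text to show or announce to the user."""
    if can_infer(disturbances):
        return "inference available"
    return "inference unavailable"
```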

(10) The result of inference with the inference model is preferably adjusted according to the shaking or speed of the vehicle on which the vehicle-mounted photographing device is installed.

In this way, an inference result adjusted according to the shaking or speed of the vehicle can be obtained. The shaking of the vehicle can be determined, for example, from the G sensor of the photographing device, and the speed of the vehicle can be determined from the vehicle's speedometer, a speed sensor of the photographing device, or the like.

(11) The system is preferably adjusted, based on the height at which the vehicle-mounted photographing device is installed, so that the resulting video more closely resembles video obtained by a vehicle-mounted photographing device installed at a reference height.

In this way, even if the height at which the vehicle-mounted photographing device is installed differs, video resembling that shot by a device installed at the reference height can be obtained. The reference height preferably varies for each vehicle or the like on which the photographing device is installed. The reference height for each vehicle or the like may be determined by trial and error, for example, as the height at which the video obtained appears to capture the subject relatively well. It is preferable to obtain images at each height and adjust the system so that they resemble the video obtained by a vehicle-mounted photographing device installed at the reference height.

(12) The vehicle-mounted photographing device preferably captures a circumferential image with the region below the reference height as the shooting direction.

In this way, a circumferential image or celestial sphere image at or below the reference height can be obtained.

(13) It is preferable to input the height at which the vehicle-mounted photographing device is installed, and to adjust inference with the inference model according to the input height.

In this way, an inference result adjusted according to the height can be obtained.

(14) The inference model preferably performs inference that detects, from the video, a person whose moving speed is at or below a predetermined speed.

In this way, a person moving faster than the predetermined speed is not detected. Whether the moving speed is at or below the predetermined speed may be determined, for example, by calculating the person's speed from the person appearing in the captured video.
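As a rough sketch of the speed check, a person's speed can be estimated from the positions of that person in successive frames. The example assumes that some tracker has already produced, for each person, a list of (x, y) image positions in consecutive frames; the tracker, the pixel units, and the function names are assumptions and are not part of the specification.

```python
import math

def track_speed(positions, fps):
    """Mean speed of a tracked point in pixels per second."""
    if len(positions) < 2:
        return 0.0
    travelled = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    return travelled * fps / (len(positions) - 1)

def slow_people(tracks, fps, max_speed):
    """IDs of tracked people whose speed is at or below `max_speed`."""
    return [person_id for person_id, positions in tracks.items()
            if track_speed(positions, fps) <= max_speed]
```

Speed in pixels per second depends on distance from the camera, so a practical system would likely need some scale correction; that step is outside this sketch.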

(15) The inference model preferably performs inference that excludes persons inside a vehicle.

In this way, persons inside a vehicle can be excluded from the inference result. For example, vehicles may be detected from the captured video, and when a person is inside a detected vehicle, that person may be excluded.
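One simple way to approximate "a person inside a detected vehicle" is bounding-box containment. The sketch assumes detections are axis-aligned boxes given as (x1, y1, x2, y2); this representation and the strict containment rule are assumptions for illustration, since the specification does not say how the check is made.

```python
def inside(inner, outer):
    """True if box `inner` lies entirely within box `outer`.

    Boxes are (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def exclude_occupants(people, vehicles):
    """Drop person boxes that are fully contained in any vehicle box."""
    return [p for p in people
            if not any(inside(p, v) for v in vehicles)]
```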

(16) The inference model performs inference that detects objects from the video, and objects reflected in a reflective surface are preferably excluded.

In this way, the detected objects can be limited to those that are not reflections in a reflective surface. Whether an object is a reflection may be determined, for example, by whether the image around the object looks like a mirror surface.

(17) The inference model performs inference that detects an object from each image constituting the video, and the system has a function of displaying, on each image constituting the video, a mark that identifies the part of the detected object. When the time for which the mark is interrupted in the video is within a predetermined time, the mark is preferably interpolated into the images where it is interrupted.

In this way, the mark can also be displayed on images where the mark identifying the part of the detected object is interrupted. For example, it is sufficient that the mark is displayed on at least one of the images, among those constituting the video, in which the object was detected.
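The gap filling described in (17) can be sketched as follows. The example assumes the mark is a bounding box (x1, y1, x2, y2) per frame, with `None` where detection dropped out, and it fills short gaps by linear interpolation between the neighbouring marks. Linear interpolation and the frame-count limit `max_gap` are assumptions; the specification only says the mark is interpolated when the interruption is within a predetermined time.

```python
def fill_gaps(marks, max_gap):
    """Fill runs of missing marks no longer than `max_gap` frames.

    A run is filled only if a mark exists on both sides of it.
    """
    out = list(marks)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:   # find the end of the missing run
            i += 1
        gap = i - start
        if start > 0 and i < n and gap <= max_gap:
            before, after = out[start - 1], out[i]
            for k in range(gap):
                t = (k + 1) / (gap + 1)   # fraction of the way to `after`
                out[start + k] = tuple(a + (b - a) * t
                                       for a, b in zip(before, after))
    return out
```

Gaps at the very start or end of the video are left empty here because there is no mark on one side to interpolate from.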

(18) Event recording of video data is preferably performed in response to an object being detected by inference with the inference model using the captured video data obtained by the vehicle-mounted photographing device.

In this way, event recording can be performed in response to detection of an object. For example, event recording is preferably triggered not only when an impact is applied to the vehicle but also when an object is detected.

(19) The threshold used for detecting an object is preferably lowered in response to detection of the object.

In this way, once an object has been detected, the probability of detecting the object becomes higher. For example, with an algorithm that treats an object as detected when a parameter used for detection is at or above a threshold, lowering the threshold makes it possible to detect objects that were not detected before the threshold was lowered.
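The threshold behaviour in (19) can be sketched with a small stateful detector. The score scale, the two threshold values, and the `reset` method are assumptions added for illustration; the specification does not say when, or whether, the threshold is restored.

```python
class AdaptiveDetector:
    """Detects when a score is at or above a threshold, and lowers the
    threshold once something has been detected."""

    def __init__(self, base=0.6, lowered=0.4):   # placeholder values
        self.base = base
        self.lowered = lowered
        self.threshold = base

    def update(self, score):
        """Return True if `score` counts as a detection for this frame."""
        detected = score >= self.threshold
        if detected:
            self.threshold = self.lowered        # be more sensitive from now on
        return detected

    def reset(self):
        """Restore the original threshold (an assumed extension)."""
        self.threshold = self.base
```

For example, a score of 0.5 is rejected at first, but the same score is accepted after a frame scoring 0.7 has triggered the lower threshold.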

(20) The location where the event recording was performed is preferably displayed on a map.

In this way, the location where event recording was performed can be seen by looking at the map. Data representing the map may be received, for example, from a map server on the Internet. The location where event recording was performed may be obtained using, for example, the GPS function.

(21) An image of the location where the event recording was performed is preferably displayed on the map.

In this way, the situation at the location where event recording was performed becomes easier to understand, because the actual image can be viewed.

(22) A link to the video data of the event recording is preferably embedded at the location of the event recording displayed on the map.

In this way, the video of the location where event recording was performed can be viewed. For example, the link may be embedded by associating data representing the location where event recording was performed with the link and recording them on the medium on which the captured video data is recorded.
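The association between a location and a link can be pictured as a small record stored alongside the video. The JSON layout, the field names, and the example path `EVT/0001.mp4` below are hypothetical; the specification only says that the location data and the link are associated and recorded on the medium.

```python
import json

def make_marker(lat, lon, video_path):
    """One map marker: event location plus a link to its recorded video."""
    return {"lat": lat, "lon": lon, "link": video_path}

def markers_to_json(markers):
    """Serialize markers for storage next to the video files."""
    return json.dumps(markers)

def link_at(markers_json, lat, lon):
    """Return the video link stored for an exact location, or None."""
    for marker in json.loads(markers_json):
        if marker["lat"] == lat and marker["lon"] == lon:
            return marker["link"]
    return None
```

A map viewer could read these records, place a marker at each latitude and longitude, and open the linked video when the marker is selected.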

(23) Where the event recording of video data is performed in response to an object being detected by inference with the inference model, a mark indicating the point of interest of the detected object is preferably displayed on the images constituting the event-recorded video.

In this way, the point of interest used to detect the object can be seen. The mark only needs to identify, for example, the part of the object or the part focused on in identifying it as the object. The mark is preferably a heat map.

(24) It is preferable to acquire, for each vehicle-mounted photographing device, the result of inference performed with the inference model in each of a plurality of vehicle-mounted photographing devices, and to display the similarity of the inference results across the devices.

In this way, the inference results of the individual vehicle-mounted photographing devices can be compared.

(25) It is preferable to present the similarity of the results of inference with the inference model performed using captured video data obtained when conditions such as the position, information, speed, and shooting time of the vehicles on which the photographing devices are installed are similar.

In this way, the relationship between similarity of conditions and similarity of inference results can be understood.
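The specification does not define how similarity between inference results is computed. As one possible illustration, if each device's result is reduced to the set of labels it detected, the Jaccard index gives a similarity between 0 and 1. Both that reduction and the choice of the Jaccard index are assumptions made for this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity of two label sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pairwise_similarity(results):
    """Similarity for every pair of devices.

    `results` maps a device ID to the labels that device detected.
    """
    ids = sorted(results)
    return {(x, y): jaccard(results[x], results[y])
            for i, x in enumerate(ids) for y in ids[i + 1:]}
```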

(26) The method according to the present invention performs inference with the inference model, while shooting with the vehicle-mounted photographing device, using the captured video data obtained by the shooting.

In this way, a technique different from the conventional one can be provided, such as shortening the time from the end of shooting to the end of inference.

(27) The program according to the present invention causes a computer to realize the functions of the above system.

The inventions shown in (1) to (25) above can be combined arbitrarily. For example, at least part of the configuration of at least one of the inventions (2) to (25) may be added to all or part of the configuration of the invention shown in (1). In particular, an invention in which at least part of the configuration of at least one of the inventions (2) to (25) is added to the invention shown in (1) is preferable. Arbitrary configurations may also be extracted from the inventions shown in (1) to (25) and combined. Likewise, arbitrary configurations extracted from the inventions shown in (2) to (25) may be combined with the invention shown in (26), and arbitrary configurations extracted from the inventions shown in (2) to (25) may be combined with the invention shown in (27).

The applicant of the present application intends to acquire rights to inventions that include these configurations. Even where the description says "in the case of" or "when," the configuration is not described as being limited to that case or time. These show examples of preferable configurations, and the applicant intends to acquire rights to configurations outside those cases or times as well. Passages written in a particular order are not limited to that order. Configurations in which some parts are deleted or the order is changed are also disclosed, and the applicant intends to acquire rights to them.

According to the present invention, a technique different from the conventional one can be provided, such as shortening the time from the end of shooting to the end of inference.

The effects of the invention of the present application are not limited to this. Effects produced by parts of the configurations disclosed in this specification, the drawings, and the like are also disclosed, and the applicant intends to acquire rights, through divisional applications, amendments, and the like, to the configurations that produce those effects. For example, passages in this specification such as "can" or "is possible" explicitly state an effect, and there are also passages that indicate an effect without such wording. Even without such a description, there are effects that can be understood from the configuration.

An example of the system.
(A) is a perspective view of the photographing device seen from the front, and (B) is a perspective view of the photographing device seen from the back.
(A) is a view of the bracket 40 seen from the front, diagonally from the upper right; (B) is a view of the bracket seen from the side; and (C) is a view of the bracket seen from the front, diagonally from the lower right.
The electrical configuration of the photographing device.
An example of a trained model.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
An example of a display screen of the photographing device.
An example of a photographed image.
An example of the file structure of a storage medium.
An example of a photographed image.
An example of an image of a person.
An example of an image of a person.
An example of a photographed image.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
An example of the file structure of a storage medium.
An example of the file structure of a storage medium.
An example of a system including an AI module.
A block diagram showing the electrical configuration of the AI module.
An example of a display screen of the photographing device.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
The relationship between the photographing device and an inference server.
A block diagram showing the electrical configuration of the inference server.
A flowchart showing a processing procedure of the photographing device and the inference server.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
An example of an image photographed by the photographing device.
An example of an image photographed by the photographing device.
A flowchart showing a processing procedure of the photographing device.
A flowchart showing a processing procedure of the photographing device.
A view of the photographing device seen from the back.
A block diagram showing the electrical configuration of a personal computer.
A flowchart showing a playback processing procedure.
A flowchart showing a playback processing procedure.
An example of a display screen of the personal computer.
A flowchart showing a playback processing procedure.
An example of an image portion of a person.
A flowchart showing a procedure for displaying the similarity of inference results.
The similarity of inference results.
The similarity of inference results.
A photographing device installed on a forklift.
An example of a photographed image.
An example of a photographed image.
An example of a photographed image.
An example of a photographed image after transformation processing.
An example of a photographed image after transformation processing.

[1. Overall system configuration]
FIG. 1 is a diagram illustrating the configuration of the system according to this embodiment. FIG. 1 shows a schematic view of the vehicle 4 seen from the side. The system 5 includes a photographing device 1 as a first photographing device and a photographing device 2 as a second photographing device, both arranged in the vehicle 4. The vehicle 4 is, for example, a four-wheeled automobile, but is not limited to this and may be any vehicle in which the photographing device 1 and the photographing device 2 can be installed. The vehicle may be, for example, a large transport vehicle with four or more wheels such as an automobile, a bus, or a truck, a two-wheeled vehicle such as a motorcycle or a bicycle, or another kind of vehicle. The vehicle may also be a vehicle of a transportation system such as a train, a monorail, or a linear motor car.

The photographing device 1 is a front camera arranged on the front side of the vehicle 4. The photographing device 1 is attached, for example, at a predetermined position on the front side of the cabin of the vehicle 4 and photographs the area ahead of the vehicle 4 through the windshield. The photographing device 1 is, for example, a drive recorder. Specifically, the photographing device 1 has a function of photographing, a function of recording image data representing a photographed image, and a function of recording image data acquired from the photographing device 2, but it only needs to have at least the function of photographing.

The photographing device 2 is a rear camera arranged on the rear side of the vehicle 4. The photographing device 2 is attached, for example, at a predetermined position on the rear side of the cabin of the vehicle 4 and photographs the area behind the vehicle through the rear window. The photographing device 2 has a function of photographing and a function of outputting image data representing a photographed image to the photographing device 1, but it only needs to have at least the function of photographing.

The photographing device 1 and the photographing device 2 are connected via a cable 3. The cable 3 is a wired communication path that connects the photographing device 1 and the photographing device 2. The cable 3 includes, for example, a power line for supplying operating power from the photographing device 1 to the photographing device 2, and a signal line for transmitting various signals between the photographing device 1 and the photographing device 2. The photographing device 2 operates on power supplied from the photographing device 1 via the cable 3. Note that the photographing device 1 and the photographing device 2 may be connected not by a wired communication path but by a wireless communication path such as Wi-Fi (registered trademark), Bluetooth (registered trademark), or another standard. The photographing device 1 may also be used without a communication connection to the photographing device 2.

[2. External configuration of photographing device 1]
FIG. 2 is a diagram showing an example of the external configuration of the photographing device 1. FIG. 2(A) shows the photographing device 1 viewed from the front, diagonally from the upper right. FIG. 2(B) shows the photographing device 1 viewed from the rear, diagonally from the upper right. The photographing device 1 has a housing 11. The housing 11 is a rectangular parallelepiped that is longer in the left-right direction than in the up-down direction and is relatively thin. The housing 11 has an upper surface 16 that faces upward when the device is attached to the vehicle 4, a first side surface 21, a second side surface 32, a third side surface 20 located opposite the second side surface 32, and a fourth side surface 22 located opposite the first side surface 21.

The upper surface 16 is provided with a joint rail 12 and a camera jack 17. A bracket (for example, the bracket 40 described later) for attaching the photographing device 1 at a predetermined mounting position of the vehicle 4 can be attached to and detached from the joint rail 12. The mounting position may be, for example, the windshield of the vehicle 4 (for example, near its upper end), the rearview mirror of the vehicle 4, or the ceiling of the vehicle interior. When the photographing device 1 is attached to the vehicle 4, the first side surface 21 faces the front of the vehicle 4. In this state, the second side surface 32 faces right as seen from the rear of the vehicle 4, the third side surface 20 faces left as seen from the rear of the vehicle 4, and the fourth side surface 22 faces the rear of the vehicle 4.

The camera jack 17 is a terminal to which one end of the cable 3 is connected. The camera jack 17 conforms to, for example, the USB Type-C standard, and may be a terminal through which the photographing device 1 performs Ethernet-standard communication with the photographing device 2.

The first side surface 21 is provided with an imaging lens 15, a sound emitting hole 13, and a microphone hole 14. The imaging lens 15 is a light-condensing lens of the photographing unit (the photographing unit 67 described later) included in the photographing device 1. The sound emitting hole 13 is provided above the imaging lens 15 and lets sound output by the audio output unit (the audio output unit 66 described later) of the photographing device 1 pass from the inside of the housing 11 to the outside. The microphone hole 14 is provided below the imaging lens 15 and lets external sound pass from the outside of the housing 11 to the inside. Sound that has passed into the housing 11 is input to a microphone (the microphone 61 described later) included in the photographing device 1.

The second side surface 32 is provided with an event recording button 31. The event recording button 31 is an operation means for instructing the start or the end of recording of images photographed by the photographing unit 67. When the user operates the event recording button 31 while event recording is not in progress, the photographing device 1 starts event recording. Event recording is described in detail later. The event recording button 31 is provided on the second side surface 32, which faces the driver's seat, so that the driver of a right-hand-drive vehicle 4 can operate it easily.

The third side surface 20 is provided with a terminal 18 and a storage medium insertion port 19. The terminal 18 is a terminal for receiving power from an external device and is, for example, a DC jack. A connector at one end of a power cord (for example, a cigarette plug cord) is connected to the terminal 18. The connector at the other end of the power cord is connected, for example, to a power supply terminal (for example, a cigarette lighter socket) provided on the vehicle 4 side.

An OBD II adapter that can be connected to the OBD II connector of the vehicle 4 may be connected to the terminal 18 ("II" is the Roman numeral for 2). The OBD II connector, also called a failure diagnosis connector, is a terminal that is connected to the ECU (Engine Control Unit) of the vehicle and outputs various kinds of vehicle information at predetermined intervals (for example, every 0.5 seconds). When the terminal 18 is connected to the OBD II connector through the OBD II adapter, the photographing device 1 can receive operating power and also acquire vehicle information.

The vehicle information is information about the state of the vehicle 4. The vehicle information is preferably at least one of, for example, the speed of the vehicle 4 (vehicle speed), engine speed, engine load factor, throttle opening, ignition timing, remaining fuel percentage, intake manifold pressure, intake air amount (MAF), injection open time, engine coolant temperature, temperature of the air taken into the engine (intake air temperature), air temperature outside the vehicle (outside air temperature), amount of fuel remaining in the fuel tank (remaining fuel amount), fuel flow rate, instantaneous fuel consumption, accelerator opening, turn signal information (ON/OFF operation of the left and right turn signals), brake opening, steering angle of the steering wheel, gear position, and door open/closed state.

The storage medium insertion port 19 is an opening for inserting a storage medium 71, which serves as an external storage means, into the photographing device 1. The storage medium 71 is a storage medium on which images photographed by the photographing device 1 or the photographing device 2 are recorded, and is, for example, an SD card. The SD card may have any form factor, such as an SD memory card, a miniSD card, or a microSD card. The storage medium 71 may further store a viewer program (for example, a dedicated viewer) for reproducing the stored images on an information display terminal such as a personal computer.

The fourth side surface 22 is provided with an operation unit 26, a display surface 23, and a light emitting section 25. The operation unit 26 has a first button 27, a second button 28, a third button 29, and a fourth button 30. The first button 27, the second button 28, the third button 29, and the fourth button 30 are arranged vertically along the right edge of the display surface 23. The functions assigned to these buttons are, for example, as follows.

The first button 27 functions as a button for switching the displayed image when pressed long, and as a button for instructing formatting of the storage medium 71 when pressed short. The image displayed on the display surface 23 is, for example, one or both of the image currently being photographed by the photographing device 1 and the image currently being photographed by the photographing device 2. Formatting the storage medium 71 means initializing it, and is understood as at least one of the following: erasing data such as images stored in the storage medium 71; writing setting information indicating the operation settings to the storage medium 71 so that the photographing device 1 can use it (for example, so that images can be recorded and read out); and putting the storage medium 71 into a specific file state.

The second button 28 is a button for displaying a selection screen for choosing an image to be reproduced by the photographing device 1. The third button 29 is a button for displaying a menu for the settings of the photographing device 1 and the photographing device 2. The fourth button 30 is a button for instructing the start and stop of image recording. For example, if the fourth button 30 is pressed short during recording by the constant recording function described later, that recording is paused. If the fourth button 30 is pressed short during the pause, image recording by the constant recording function resumes. If the fourth button 30 is pressed long, the frame rate used when recording images can be changed.
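The short-press and long-press behaviour described for the fourth button 30 can be pictured as a small state update. The following is a minimal sketch only: the names `RecorderState` and `on_fourth_button`, the 1-second long-press threshold, and the list of selectable frame rates are all assumptions for illustration and do not appear in the description.

```python
from dataclasses import dataclass

LONG_PRESS_SECONDS = 1.0   # assumed threshold; the description gives no value
FRAME_RATES = (30, 15, 5)  # assumed selectable frame rates in frames per second


@dataclass
class RecorderState:
    paused: bool = False       # True while constant recording is paused
    frame_rate_index: int = 0  # index into FRAME_RATES

    @property
    def frame_rate(self) -> int:
        return FRAME_RATES[self.frame_rate_index]


def on_fourth_button(state: RecorderState, press_seconds: float) -> RecorderState:
    """Apply one press of the fourth button 30 to the recorder state."""
    if press_seconds >= LONG_PRESS_SECONDS:
        # Long press: change the recording frame rate (here, cycle through the list).
        state.frame_rate_index = (state.frame_rate_index + 1) % len(FRAME_RATES)
    else:
        # Short press: pause constant recording, or resume it if already paused.
        state.paused = not state.paused
    return state
```

Cycling through a fixed list is just one possible way to "change" the frame rate; a real device might instead open a settings dialog.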

The display surface 23 is the area in which images output by the display unit (the display unit 65 described later) of the photographing device 1 are shown. The display surface 23 is, for example, a rectangular or square area. A touch sensor 24 for detecting touch operations by the user is provided over the display surface 23.

The light emitting section 25 is provided above the first button 27 and emits light in a predetermined color.

Note that the photographing device 2 may also function as a drive recorder and may, for example, have the same configuration as the photographing device 1. In addition, one or more photographing devices that photograph in other directions may be connected to the photographing device 1 by communication, instead of or in addition to the photographing device 2. The other directions include diagonally right rear of the vehicle 4, diagonally left rear, and the vehicle width direction (sideways).

[3. Bracket configuration]
FIG. 3 is a diagram showing the configuration of the bracket 40. FIG. 3(A) shows the bracket 40 viewed from the front, diagonally from the upper right. FIG. 3(B) shows the bracket 40 viewed from the side. FIG. 3(C) shows the bracket 40 viewed from the front, diagonally from the lower right. The bracket 40 is an example of a mounting member for attaching the photographing device 1 to the vehicle 4.

The bracket 40 is a mounting member that uses a ball joint mechanism. In the bracket 40, a flat base portion 41 has a mounting surface 42 to be affixed to the windshield of the vehicle 4. The base portion 41 is inclined at a predetermined angle with respect to the shaft of the ball stud 43. An adhesive member such as double-sided tape is applied to the mounting surface 42, and the bracket is affixed to the windshield or the like of the vehicle 4 through that adhesive member.

The ball stud 43 is a portion that rises from the surface of the base portion 41 opposite the mounting surface 42. The ball portion 44 of the ball stud 43 is fitted into the socket portion 45. The nut 46 is detachably attached around the socket portion 45. A thread groove is formed on the outer periphery of the socket portion 45, and the nut 46, which mates with that thread groove, is fitted onto the outer periphery of the socket portion 45. With the ball portion 44 of the ball stud 43 fitted in the socket portion 45 and before the nut 46 is tightened, the socket portion 45 can rotate in any desired direction along the surface of the ball portion 44, so the posture and position of the base portion 47 can be changed. When the nut 46 is tightened, the posture and position of the base portion 47 are fixed.

The base portion 47 is formed as a pair with the socket portion 45 and is the portion that is attached to the photographing device 1. The base portion 47 has a pair of guide rails 48 on its lower surface. The pair of guide rails 48 is configured to slide along the joint rail 12 of the photographing device 1. A claw-shaped tip portion 49 is provided at the end of the base portion 47. The bracket 40 is attached to the photographing device 1 by hooking the tip portion 49 near the end of the joint rail 12 on the front side of the photographing device 1.

[4. Electrical configuration of photographing device 1]
FIG. 4 is a block diagram showing the electrical configuration of the photographing device 1. The control unit 50 controls each part of the photographing device 1. The control unit 50 is, for example, a computer including a processor 51 and a memory 52. The processor 51 has, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a GPU (Graphics Processing Unit), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), and the like. The memory 52 is a main storage device having, for example, a RAM (Random Access Memory) and a ROM (Read Only Memory). The processor 51 temporarily stores a program read from the ROM of the memory 52 in the RAM. The RAM of the memory 52 provides a work area for the processor 51. The processor 51 performs various kinds of control by carrying out arithmetic processing while temporarily storing data generated during program execution in the RAM. The control unit 50 further includes a clock unit 53 that keeps time. The clock unit 53 is, for example, a real-time clock. The clock unit 53 may be mounted on the motherboard of the processor 51 or may be externally attached to the processor 51.

The input unit 60 accepts input of information from the user. The input unit 60 has, for example, the microphone 61, the event recording button 31, the operation unit 26, and the touch sensor 24 described above. The microphone 61 converts sound entering through the microphone hole 14 or the like into an electrical signal. The microphone 61 is, for example, a condenser microphone. The touch sensor 24 detects the position on the display surface 23 touched by the user. The touch sensor 24 is, for example, a capacitive sensor.

The display unit 65 displays images on the display surface 23. The display unit 65 is, for example, a liquid crystal display (LCD).

The audio output unit 66 outputs sound. Examples of this sound include notification sounds, background music, and voice messages. The audio output unit 66 has, for example, an audio processing circuit and a speaker.

The photographing unit 67 photographs and generates the resulting image data. The photographing unit 67 has, for example, the imaging lens 15 and an image sensor that captures the light condensed by the imaging lens 15. The image sensor is, for example, a CMOS (Complementary MOS) or CCD (Charge Coupled Device) sensor. The photographing unit 67 generates a color (multicolor) image composed of, for example, red (R), green (G), and blue (B) color components.

The communication unit 68 communicates with external devices. The communication unit 68 has a communication circuit for wireless communication with an external device by, for example, Wi-Fi (registered trademark), Bluetooth (registered trademark), or other wireless LAN (Local Area Network) communication or short-range wireless communication. The communication unit 68 may also have a communication circuit for communication conforming to mobile communication system standards such as LTE (Long Term Evolution), 4G, and 5G.

The sensor unit 69 has various sensors. The sensor unit 69 has, for example, at least one of an acceleration sensor, a gyro sensor, an atmospheric pressure sensor, and an illuminance sensor. The acceleration sensor is, for example, a three-axis acceleration sensor that detects acceleration of the vehicle in the front-rear, left-right, and up-down directions. The gyro sensor detects the tilt of the photographing device 1. The acceleration sensor and the gyro sensor may be used, for example, to estimate the position of the vehicle 4 by dead reckoning when signals from GNSS satellites cannot be received. The atmospheric pressure sensor measures atmospheric pressure. It is used, for example, to detect a difference in elevation and thereby determine whether the vehicle is on an expressway or an ordinary road. The illuminance sensor detects, for example, illuminance indicating the brightness of the vehicle interior around the photographing device 1. The illuminance sensor is used, for example, to adjust the display brightness of the display unit 65.
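The description states only that the atmospheric pressure sensor is used to detect a difference in elevation. One way this could be done, as a hedged sketch, is to convert pressure to altitude with the standard international barometric formula and flag a rise above a threshold, for example when the vehicle climbs a ramp onto an elevated expressway. The function names, the 5 m threshold, and the fixed sea-level reference pressure are assumptions for illustration, not details from the description.

```python
SEA_LEVEL_HPA = 1013.25  # standard-atmosphere reference pressure (assumed fixed)


def pressure_to_altitude_m(pressure_hpa: float,
                           sea_level_hpa: float = SEA_LEVEL_HPA) -> float:
    """Approximate altitude in metres using the international barometric formula."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def rose_onto_elevated_road(pressure_before_hpa: float,
                            pressure_after_hpa: float,
                            min_rise_m: float = 5.0) -> bool:
    """Return True if the altitude gain between two readings is at least min_rise_m.

    min_rise_m is a hypothetical threshold; a real device would need tuning
    and probably map data as well.
    """
    rise = (pressure_to_altitude_m(pressure_after_hpa)
            - pressure_to_altitude_m(pressure_before_hpa))
    return rise >= min_rise_m
```

Because only the difference between two nearby readings matters, the choice of sea-level reference has little effect on the result over short time spans.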

The reader/writer 70 functions as a medium holding unit that holds the storage medium 71 inserted into the photographing device 1 through the storage medium insertion port 19. The reader/writer 70 writes data to and reads data from the storage medium 71. The reader/writer 70 may hold only one storage medium 71, or may be configured to hold two or more storage media 71 at the same time.

The terminal unit 72 has terminals for electrical connection to external devices. The terminal unit 72 has the camera jack 17 and the terminal 18 described above. An external battery may be used as a device connected to the terminal unit 72 so that the photographing device 1 and the photographing device 2 can operate even without power from the vehicle 4 side. The device connected to the terminal unit 72 may be, for example, a device having a function of supporting the user's safe driving. Examples of such a device include a device that photographs the driver (for example, the driver's face) and detects and reports driver states such as looking aside and drowsy driving, and a device that detects and reports obstacles around the vehicle 4 (for example, a device used for vehicle detection in a forward vehicle collision warning system (FCWS: Forward vehicle Collision Warning Systems)). The device connected to the terminal unit 72 may also be an in-vehicle device such as a radar detector, a laser detector, a car navigation device, or a display device.

The position information acquisition unit 73 acquires position information indicating the position (more specifically, the current position) of the photographing device 1. The position of the photographing device 1 can be regarded as the same as the position of the vehicle 4 in which the photographing device 1 is placed, the position of the photographing device 2 placed in the vehicle 4, and the position of the driver and any other person riding in the vehicle 4. The position information acquisition unit 73 acquires the position information (latitude information and longitude information) of the photographing device 1 based on, for example, signals from GPS (Global Positioning System), which is one of the GNSS (Global Navigation Satellite System) constellations. The position information acquisition unit 73 may additionally use Michibiki, a QZSS (Quasi-Zenith Satellite System).

The light emitting section 21 emits light in a predetermined color. The light emitting section 21 has, for example, a light emitting diode.

The power supply control unit 75 controls the supply of power to each part of the photographing device 1 and to the photographing device 2. The power supply control unit 75 has, for example, a power switch and a power control circuit. The power supply control unit 75 supplies power received from the vehicle 4 side via the terminal unit 72 to each part of the photographing device 1 and to the photographing device 2. The power supply control unit 75 may further have a secondary battery, a button battery, or an electric double layer capacitor (also called a supercapacitor) as a power storage means.

The video input/output unit 76 outputs video data obtained by photographing with the photographing device 1 to the outside of the photographing device 1, and inputs video data supplied from the outside into the photographing device 1. The data input/output unit 77 outputs desired data (including video data) from the photographing device 1 to the outside, and inputs desired data (including video data) supplied from the outside into the photographing device 1.

The photographing device 1 may further have an auxiliary storage device, such as a flash memory (for example, eMMC or SSD), as an internal storage means. Various storage media, such as optical, magnetic, and semiconductor storage media, can be used as the auxiliary storage device.

[5. Image recording function of photographing device 1]
The photographing device 1 has one or more of the following functions as image recording functions. An image recording function records images photographed by the photographing device 1 as image data in a predetermined file format. The image data is preferably in a moving-image format, for example an MPEG (Moving Picture Experts Group) format (for example, MPEG2 or MPEG4), but may also be AVI, MOV, WMV, or the like.

<5-1. Constant recording function>
The constant recording function (also called the continuous video recording function) continuously (that is, at all times) records images photographed by one or both of the photographing device 1 and the photographing device 2 while the photographing device 1 is operating. While executing the constant recording function, the control unit 50 stores the video data captured from the start of the engine of the vehicle 4 until it stops. The start of the engine is detected, for example, when the accessory power supply of the vehicle 4 turns on, and the stop of the engine is detected when the accessory power supply turns off.

<5-2. Event recording function>
The event recording function records images photographed by one or both of the photographing device 1 and the photographing device 2 in response to the occurrence of a specific event. An event is an occurrence for which images photographed by the photographing device 1 or the photographing device 2 should be recorded, for example a sudden steering or sudden braking operation by the user while the vehicle 4 is traveling, or a collision of the vehicle 4 with another object. The control unit 50 determines that an event has occurred based on, for example, a value measured by the acceleration sensor of the sensor unit 69. Specifically, the control unit 50 determines that an event has occurred when the measured value of the acceleration sensor is equal to or greater than a predetermined threshold or shows a predetermined change over time. The conditions for determining that an event has occurred are not limited to these. The control unit 50 may determine, based on vehicle information, that an event has occurred when the state of the vehicle 4, such as the vehicle speed or the steering state, satisfies a predetermined condition. The control unit 50 may also analyze images photographed by the photographing unit 67 or the photographing device 2 and determine that an event has occurred when it detects dangerous driving by the vehicle 4 or another vehicle (for example, tailgating, approaching, or abnormally close approach). The control unit 50 also determines that an event has occurred when the event recording button 31 is operated.
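The acceleration-based determination described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the units, and the numeric values of `threshold` and `max_delta` are hypothetical, and "a predetermined change over time" is simplified here to the difference between two consecutive samples.

```python
from typing import Sequence


def event_detected(samples: Sequence[float],
                   threshold: float = 2.5,
                   max_delta: float = 1.5) -> bool:
    """Decide whether an event occurred from acceleration samples.

    threshold and max_delta are hypothetical values, not taken from the source.
    """
    if not samples:
        return False
    # Condition 1: the latest measured value is at or above the threshold.
    if abs(samples[-1]) >= threshold:
        return True
    # Condition 2 (simplified): a large change between consecutive samples.
    if len(samples) >= 2 and abs(samples[-1] - samples[-2]) >= max_delta:
        return True
    return False
```

Other triggers mentioned in the text, such as vehicle information, image analysis, or the event recording button 31, would be separate conditions combined with this one.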

When the control unit 50 determines that an event has occurred, it records on the storage medium 71 the images captured during a predetermined period before and after the event (hereinafter referred to as the "event recording period"). For example, the control unit 50 may temporarily record images photographed by the photographing unit 67 and the photographing device 2 in the memory 52 (for example, RAM) and, when it determines that an event has occurred, read the images of the event recording period from the memory 52 and record them on the storage medium 71. The control unit 50 stores, for example, the 20 seconds before the event and the 20 seconds after it, 40 seconds of images in total, as one file. This event recording period is only an example; it may differ depending on the type of event or may be changeable by the user. The control unit 50 may record images consisting of a plurality of files on the storage medium 71 for a single event. The control unit 50 may further record on the storage medium 71, in association with the images, the values measured by the sensor unit 69 during the event recording period (for example, the acceleration along each of three axes) and the position information acquired by the position information acquisition unit 73.
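One way to picture the temporary buffering in the memory 52 is a fixed-length rolling buffer for the pre-event frames, followed by a countdown for the post-event frames. The sketch below is an assumption-laden illustration: the class `EventRecorder`, its attributes, and the use of a Python list as a stand-in for the storage medium 71 are all hypothetical.

```python
from collections import deque


class EventRecorder:
    """Keeps the last pre_s seconds of frames and saves pre + post on an event."""

    def __init__(self, fps: int = 30, pre_s: int = 20, post_s: int = 20):
        self.pre = deque(maxlen=fps * pre_s)  # rolling pre-event buffer (RAM)
        self.post_needed = fps * post_s       # frames to collect after the event
        self.clip = None                      # frames of the event in progress
        self.remaining = 0
        self.files = []                       # stand-in for the storage medium

    def trigger(self) -> None:
        """Call when an event is determined to have occurred."""
        if self.clip is None:
            self.clip = list(self.pre)        # copy the pre-event frames
            self.remaining = self.post_needed

    def push(self, frame) -> None:
        """Feed every captured frame, whether or not an event is in progress."""
        if self.clip is not None:
            self.clip.append(frame)
            self.remaining -= 1
            if self.remaining <= 0:
                self.files.append(self.clip)  # one file per event
                self.clip = None
        self.pre.append(frame)
```

With the defaults, one saved clip holds 20 seconds before and 20 seconds after the trigger at 30 frames/second; the period can be changed through `pre_s` and `post_s`.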

<5-3. Parking monitoring function>
The parking monitoring function records images photographed by one or both of the photographing device 1 and the photographing device 2 while the vehicle 4 is parked. It is a function for monitoring the interior of the parked vehicle 4 or the exterior surroundings of the vehicle 4. While the engine of the vehicle 4 is off, the control unit 50 receives power from an external battery and records images on the storage medium 71. The control unit 50 determines whether the vehicle 4 is parked based on one or more of the following: the accessory power supply has been turned off; the engine has been turned off; power supply from the external battery has started; the vehicle speed is 0 km/h or at or below a predetermined speed; and the position information acquired by the position information acquisition unit 73 matches predetermined position information (for example, the position of a home, workplace, or parking lot).

The parking monitoring function may have a time-lapse mode and a motion detection mode. Specifically, when the user selects the time-lapse mode, the control unit 50 records images at a lower frame rate than the other image recording functions, such as the continuous recording function and the event recording function. For example, whereas the frame rate of the other image recording functions is 20 to 30 frames/second, the frame rate of the time-lapse mode is 1 frame/second. The motion detection mode records images in response to the detection of a moving object. Specifically, when the user has selected the motion detection mode and the control unit 50 detects a moving object from changes in the images captured by the photographing device 1 and the photographing device 2, it records on the storage medium 71 the images photographed during a predetermined period before and after the detection. The frame rate in this mode may be the same as that of the continuous recording function, the event recording function, and so on.
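As a small worked example of the frame-rate thinning in the time-lapse mode, the sketch below keeps every N-th frame so that a 30 frames/second stream becomes 1 frame/second. The function name and the assumption that the source rate is an integer multiple of the target rate are illustrative only.

```python
def thin_frames(frames, source_fps: int = 30, target_fps: int = 1):
    """Keep one frame out of every source_fps // target_fps frames."""
    if target_fps <= 0 or source_fps < target_fps:
        raise ValueError("target_fps must be positive and not exceed source_fps")
    step = source_fps // target_fps
    return frames[::step]
```

Three seconds of video at 30 frames/second (90 frames) is reduced to 3 frames in this sketch.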

Note that photographing devices that capture spherical images, such as full-spherical or hemispherical images, may be used as the photographing devices 1 and 2. A photographing device without the display unit 65 may be used as the photographing device 1. The housing of the photographing device 1 need not be a rectangular parallelepiped and may be, for example, cylindrical.

[6. Learning model]
FIG. 5 shows an example of the learning model 80. While it is being trained with teacher data (training data), the learning model 80 is a learning model; once training is complete, it becomes the trained model 80. A trained model is an example of an inference model.

The learning model 80 shown in FIG. 5 uses deep learning: it receives a plurality of images as input and outputs the probability that an object to be detected is included in the images constituting a moving image. The learning model 80 includes an input layer 81, an intermediate layer 82, and an output layer 83.

Each of the pixels P1 to PN constituting an image judged to contain the object to be detected with a probability of 100% is input to the input layer 81 of the learning model 80, and the model is trained by deep learning through the intermediate layer 82 and the output layer 83. This is repeated for a huge number of such images, which serve as teacher data (teacher images, training data, training images, and so on). In the same way, huge numbers of images judged to contain the object with a probability of 95%, with a probability of 90%, and so on in steps of 5% (that is, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, and 0%; the step need not be 5%) are used as teacher data for deep learning. Because the learning model 80 becomes the trained model 80 when training is complete, the learning model and the trained model are both denoted by the same reference numeral 80.

When an arbitrary image is input to the input layer 81 of the trained model 80, the neuron 83a responds when the probability that the input image contains the object is 100%; likewise, the neuron 83b responds when the probability is 95%, and the neuron 83n responds when the probability is 0%. The same applies to the other probabilities.
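The correspondence between output neurons and probability levels can be illustrated as follows. This is a sketch under assumptions: the output layer is taken to have one neuron per 5% level (21 in total, from 100% down to 0%), the "responding" neuron is taken to be the one with the largest activation, and the function names are hypothetical.

```python
def neuron_probabilities(step_percent: int = 5):
    """Probability levels in percent, ordered like neurons 83a (100%) to 83n (0%)."""
    return list(range(100, -1, -step_percent))


def probability_from_outputs(activations, step_percent: int = 5) -> float:
    """Return the probability (0.0 to 1.0) of the most strongly responding neuron."""
    levels = neuron_probabilities(step_percent)
    if len(activations) != len(levels):
        raise ValueError("one activation per probability level is expected")
    strongest = max(range(len(activations)), key=activations.__getitem__)
    return levels[strongest] / 100.0
```

If the second neuron (the one corresponding to 83b) has the largest activation, the sketch reports a probability of 0.95.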

The output data of the trained model 80 is input to a discrimination circuit (not shown). When the data represents a probability equal to or greater than a threshold, the discrimination circuit outputs data indicating that the object has been detected; when the data represents a probability below the threshold, it outputs data indicating that the object has not been detected, or stops outputting data altogether. When the discrimination circuit outputs data indicating that the object has been detected, it is known that a target, for example a person, has been detected in the captured video, and a warning to the driver or other processing can be performed. In the following the target is assumed to be a person, but it may be something else. Even without a discrimination circuit, software may determine that an image contains a person if the probability represented by the output data of the trained model 80 is equal to or greater than the threshold. Once it is known that the captured video, for example each image constituting the video, contains the target, landmarks of the person are detected in that image to detect the person, and the detected person is surrounded by a frame. Thus, the trained model 80 shown in FIG. 5 outputs the probability that an image contains the target; if the output probability is equal to or greater than the threshold, the image is determined to contain the target, landmarks of the target are detected in the image, and the target is surrounded by a frame. Alternatively, a trained model 80 or a process may be used that takes as input the captured video, or each image constituting it, and outputs an image in which the person in the image is surrounded by a frame. Such processing can be implemented using, for example, OpenCV (https://jellyware.jp/aicorex/contents/out_c08_realtime.html). The trained model 80 may also be used to detect the position of the target in the captured video, and a frame may be drawn around the detected position. In any case, it suffices that the target can be detected from the captured video.

By inputting the captured video data obtained by the photographing device 1 or 2 into the trained model 80 and having it perform inference (learning), the target can be detected in the images constituting the video represented by the captured video data. Inference and learning other than target detection can be handled in the same way: the learning model 80 is trained on captured video data so that it outputs the desired result, and a trained model 80 is thereby constructed. For example, by training on videos in which an accident is about to occur, or videos from just before an accident, as teacher data, a situation in which an accident is likely can be detected when captured video data of such a situation is input to the trained model 80, and the driver of the vehicle 4 can be warned. When a target is present in the captured video, it is not always necessary to surround it with a frame. The determination that a target is present in the captured video can also be used to trigger event recording, as described later.

[7. First Example]
FIGS. 6 to 9 are flowcharts showing the processing procedure of the photographing device 1. The photographing device 2 may perform the same processing.

When the fourth button 30 is pressed, the photographing device 1 starts photographing (step 91 in FIG. 6). The moving image represented by the video data obtained by photographing is displayed on the display surface 23 (step 92 in FIG. 6).

When the user presses the event recording button 31, the normal event recording mode (in this embodiment, the inference event recording mode) is set (YES in step 93 in FIG. 6). With the normal event recording mode set, when inference determines that an event has occurred, video for a fixed period before and after the event is recorded in the event recording area (inference event recording area) of the storage medium 71. In this embodiment, an event is deemed to have occurred, and inference event recording is performed, when the target is detected; however, an event may instead be deemed to have occurred on another trigger, such as some impact being applied to the vehicle 4. As described later, this embodiment provides double event recording in addition to normal event recording (inference event recording). Double event recording is event recording in which both inference event recording and sensor event recording are performed, as described later; normal event recording is, in this embodiment, inference event recording. The term normal event recording (inference event recording) is used to distinguish it from double event recording. Normal event recording (inference event recording) is described below; note that continuous recording is also performed while normal event recording (inference event recording) is set. If the normal event recording (inference event recording) mode is not set (NO in step 93 in FIG. 6), the processing shown in FIG. 6 performs only continuous recording, without normal event recording.

When the normal event recording mode is set (YES in step 93 in FIG. 6), the trained model selection image 120 shown in FIG. 9 is displayed on the display surface 23 of the photographing device 1.

Referring to FIG. 9, the trained model selection image 120 includes an area 121 that the user touches to select trained model A, an area 122 that the user touches to select trained model B, an area 123 that the user touches to select trained model C, an area 124 in which the recommended trained model 80 is displayed, an area 125 that the user touches to confirm the trained model 80 chosen by the user, and an area 126 that the user touches to have the trained model 80 determined automatically.

The memory 52 of the photographing device 1 stores a wide variety of trained models 80, and whether an event has occurred is detected by feeding the video data obtained by photographing to the trained model 80 chosen by the user for inference. In this embodiment, detection of the target constitutes the occurrence of an event, but an event may be triggered in other ways. When an event is to be triggered in another way, a trained model 80 that infers the occurrence of an event according to that trigger is used.

For example, trained model A is not tuned to detect people of a specific age group with high accuracy; it is a trained model 80 obtained by training with teacher data chosen so that people of all age groups (children, adults, and the elderly alike) are detected evenly. Nor is trained model A tuned to detect people with high accuracy specifically in daytime shooting or specifically in nighttime shooting; it is a trained model 80 obtained by training with teacher data chosen so that people can be detected at all brightness levels. Because the person detection accuracy of trained model A is average, it may have difficulty detecting elderly people and children as accurately as, for example, a trained model 80 specialized for detecting elderly people and children. Likewise, trained model A may have difficulty detecting people in video data shot at night as accurately as a trained model 80 specialized for detecting people at night. Trained model B is a trained model 80 trained with video data containing elderly people and children as teacher data, and detects elderly people and children with high accuracy. Trained model C is a trained model 80 trained with video data that was shot at night and contains people as teacher data, and detects people at night with high accuracy.

The area 124 displays the recommended type of trained model 80 together with the reason for the recommendation: for example, trained model A is recommended in general, trained model B is recommended when elderly people and children are to be detected with high accuracy for inference event recording, and trained model C is recommended when people are to be detected with high accuracy at night for inference event recording. The user reads the reasons displayed in the area 124 and selects the desired type of trained model 80 from trained models A to C. When the user touches the area 125, the trained model 80 specified by whichever of the areas 121 to 123 the user touched is determined as the trained model 80 to be used for inference event recording. This touch causes the processor 51 to issue a manual inference model selection command, and the trained model 80 is selected.

FIG. 9 illustrates three types of trained models, A to C, but the selection may be made from a larger number of types of trained models 80.

FIG. 9 shows trained model A, a general-purpose model for person detection, trained model B, which detects children and the elderly with high accuracy, and trained model C, which detects people at night with high accuracy; however, the user may be allowed to select from other types of trained models 80, or the model may be determined automatically. In the case of automatic determination, the processor 51 outputs a selection command to select the automatically determined trained model 80. For example, the shooting environment may be detected or entered, and a trained model 80 suited to the shooting environment may be determined automatically or selected by the user. The shooting environment includes, for example, the shooting location, the weather at the shooting location, and the brightness at the shooting location, and the trained model 80 should be determined so that a model suited to these conditions is used. With regard to the shooting location, in a place with many people, such as an urban area, trained model B, which can detect children and the elderly with high accuracy, may be chosen, whereas in a place with few people and good visibility, such as the suburbs, a trained model 80 with lower detection accuracy may be chosen. With regard to the weather, when the weather at the shooting location is bad, for example when it is raining, a trained model with high detection accuracy, or a trained model 80 with high detection accuracy for shooting in rain, may be chosen, and when the weather is good, a trained model 80 with lower detection accuracy may be chosen. With regard to brightness, the trained model 80 may be determined so that, in darker conditions, inference uses a model of relatively higher accuracy than when it is bright, or a trained model 80 that can detect people with high accuracy in dark places may be chosen. The shooting location, weather, and brightness may be entered by the user, or this information may be acquired and the trained model determined automatically from the result. For example, the shooting location can be identified by determining the current position with the GPS function and accessing a map server. The weather at the shooting location can likewise be found by determining the current position with the GPS function and accessing a weather server. The brightness at the shooting location can be obtained with an illuminance sensor. When the shooting takes place at night, a trained model 80 with high accuracy for nighttime person detection may be used, or it may be decided to use a trained model 80 that is highly accurate regardless of night or day.
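The automatic determination from the shooting environment could look like the sketch below. It is only one possible rule set: the `Environment` fields, the priority of night over location, and the fallback to trained model A are assumptions made for illustration, since the text describes the criteria but not how they are combined.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical summary of the detected or entered shooting environment."""
    is_night: bool = False  # e.g. from the clock or an illuminance sensor
    is_urban: bool = False  # e.g. from GPS position and a map server


def select_model(env: Environment) -> str:
    """Return the label of the trained model to use ("A", "B", or "C")."""
    if env.is_night:
        return "C"  # model trained on nighttime video containing people
    if env.is_urban:
        return "B"  # model specialized for children and the elderly
    return "A"      # general-purpose model (assumed default)
```

Weather could be added as a further field in the same way, for example by preferring a model trained on rainy scenes when one is available.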

When the area 126 is touched, the processor 51 automatically determines the trained model 80 according to the shooting environment, as described above.

Returning to FIG. 6, when one of the areas 121 to 123 in FIG. 9 is touched and the area 125 is touched, the trained model 80 to be used for target detection has been selected by the user (YES in step 94 in FIG. 6). The photographing device 1 is then set to detect the target with the selected trained model 80 (step 97 in FIG. 6).

When the area 126 in FIG. 9 is touched (NO in step 94 in FIG. 6), the trained model 80 is determined automatically according to the shooting environment. The shooting environment is detected as described above (step 95 in FIG. 6), and a trained model 80 suited to the detected shooting environment is selected (step 96 in FIG. 6). For example, if there is a trained model 80 suited to a combination of shooting conditions, such as nighttime and bad weather, that trained model 80 is selected. The photographing device 1 is set to detect the target with the selected trained model 80 (step 97 in FIG. 6).

The captured video data being obtained by photographing is then input to the selected trained model 80, which performs inference (learning) while photographing continues (step 98 in FIG. 7). The image data representing the images of the captured video is input frame by frame to the selected trained model 80 for inference.

It is also determined whether a disturbance has been received during inference (step 99 in FIG. 7). This can be determined from the output of the sensor unit 69. If a disturbance has been received (YES in step 99 in FIG. 7), it is determined whether the selected trained model 80 can still perform inference despite the disturbance (step 100 in FIG. 7). For example, the state is determined to be one in which inference is not possible when the illuminance sensor included in the sensor unit 69 detects that the brightness of the place being photographed by the photographing device 1 is at or below a threshold, when the illuminance sensor detects that light has been shone on the photographing device 1, when the G sensor included in the sensor unit 69 detects that an acceleration at or above a predetermined threshold has been applied to the vehicle, or when the vehicle speed supplied by the vehicle's speedometer is at or above a predetermined threshold. For example, whether inference is possible is defined in advance for each type and magnitude of disturbance, and the state is determined to be one in which inference is not possible when the disturbance is of a predefined type and magnitude. This may also be defined in advance separately for each trained model 80. Once it has been determined whether inference is possible, the determination result is reported (step 101 in FIG. 7). The result may be reported by sound output from the audio output unit 66 or by display on the display surface 23.
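The step 100 check can be sketched as a comparison of sensor readings against predefined limits. Everything concrete here is hypothetical: the field names, the units, and the limit values are placeholders, and detecting "light shone on the device" is simplified to an upper illuminance limit.

```python
from dataclasses import dataclass


@dataclass
class SensorReadings:
    illuminance_lux: float  # from the illuminance sensor
    acceleration_g: float   # from the G sensor
    speed_kmh: float        # from the vehicle speedometer


def can_infer(r: SensorReadings,
              min_lux: float = 5.0,
              max_lux: float = 50000.0,
              max_g: float = 0.5,
              max_kmh: float = 100.0) -> bool:
    """Return False when any predefined disturbance limit is reached."""
    if r.illuminance_lux <= min_lux:  # scene too dark
        return False
    if r.illuminance_lux >= max_lux:  # strong light on the device
        return False
    if r.acceleration_g >= max_g:     # strong acceleration on the vehicle
        return False
    if r.speed_kmh >= max_kmh:        # vehicle too fast
        return False
    return True
```

Per-model limits, as the text allows, could be represented by keeping a separate set of these keyword values for each trained model.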

If it is determined that inference is possible despite the disturbance (YES in step 102 in FIG. 7), the inference result is corrected (step 103 in FIG. 7). For example, where a target is judged to be detected when the probability output from the selected learning model is greater than or equal to a threshold, the inference result (the probability output from the learning model) is corrected by lowering that threshold, so that the target is regarded as detected even at a lower probability than when there is no disturbance (step 103 in FIG. 7). The influence on the inference result may be calculated in advance for each type and degree of disturbance, and the amount of correction may be made larger as that influence is larger. In this way, the inference result may be corrected in accordance with a disturbance given to the photographing device 1 from the vehicle 4 by the movement of the vehicle 4 in which the photographing device 1 is installed, or a disturbance caused by light striking the photographing device 1, for example, light from the headlights of an oncoming vehicle or reflected light from the lights of the vehicle 4 itself. The disturbance may be any event that can affect the accuracy of inference, as exemplified by motion, speed, acceleration, and vehicle shake given to the photographing device 1 from the vehicle 4.
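The threshold correction of step 103 can be pictured with a minimal sketch. The impact table, the cap on the correction, and all function names below are assumptions made for illustration; the description only states that the threshold is lowered and that the correction may grow with the influence of the disturbance.

```python
# Hypothetical table: pre-computed impact (0..1) of each disturbance type
# on the inference result. The values are illustrative assumptions.
IMPACT = {"vibration": 0.2, "headlight_glare": 0.4}
MAX_REDUCTION = 0.3  # assumed upper bound on how far the threshold is lowered


def corrected_threshold(base, disturbance, magnitude):
    """Lower the detection threshold in proportion to the disturbance impact."""
    if disturbance is None:
        return base
    clamped = min(max(magnitude, 0.0), 1.0)
    impact = IMPACT.get(disturbance, 0.0) * clamped
    return base - MAX_REDUCTION * impact


def is_detected(probability, base=0.8, disturbance=None, magnitude=0.0):
    """A target counts as detected when the probability reaches the threshold."""
    return probability >= corrected_threshold(base, disturbance, magnitude)
```

With these assumed numbers, a probability of 0.75 is rejected without disturbance but accepted under strong headlight glare, which is the behaviour the paragraph above describes.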

If the disturbance leaves the system unable to infer (NO in step 102 in FIG. 7), the inference result is not corrected and the process returns to step 98. Whether a disturbance makes inference impossible is determined in advance, for example, for each type and magnitude of disturbance. If no disturbance is received during inference (NO in step 99 in FIG. 7), the processes from step 100 to step 103 are skipped.

If no target is detected by inference (NO in step 104 in FIG. 7), the process returns to step 98. When a target is detected by inference (YES in step 104 in FIG. 7), inference event recording starts (step 105 in FIG. 8).

FIG. 10 shows an example of a photographed image constituting the photographed video obtained by the photographing device 1.

The photographed image 160 is displayed on the display screen 23 of the photographing device 1. The vehicle 4 is traveling through a built-up area, and the photographed image 160 includes a vehicle 170 traveling ahead of the vehicle 4 and, on the left side as seen in the traveling direction of the vehicle 4, a family group consisting of persons 161, 162, and 163. Persons 161 and 163 are adults, and person 162 is a child. When the photographed image 160 is input to the learning model and processed by it, the target persons 161, 162, and 163 are detected, so inference event recording of the photographed video data is performed.

FIG. 11 shows the format of the storage medium 71.

The recording area of the storage medium 71 contains a file system area 131, a constant recording area 132, an inference event recording area 135, and a user recording area 138.

The file system area 131 holds dedicated software for reproducing the data recorded in the constant recording area 132, the inference event recording area 135, and the user recording area 138.

The constant recording area 132 contains a management area 133 and a recording area 134. Various setting information is recorded in the management area 133 by the dedicated software recorded in the file system area 131. In the recording area 134, image data representing the frames that make up a video is recorded in frame-number order. For each frame, the recording area 134 has a header recording area 151, a frame image data recording area 152, and a footer recording area 153. The header recording area 151 holds additional information such as the frame number, address position, and capture time, and the footer recording area 153 holds additional information other than that recorded in the header recording area 151.
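The per-frame layout of header, frame image data, and footer can be sketched as a simple binary record. The field widths, byte order, and length prefixes below are assumptions chosen for illustration; the description names the fields but does not define an on-disk encoding.

```python
import struct

# Assumed header layout: frame number, address position, capture time,
# and the byte length of the frame image data that follows.
HEADER = struct.Struct("<IQdI")
FOOTER_LEN = struct.Struct("<I")  # assumed length prefix for the footer


def pack_frame(frame_no, address, captured_at, image, footer=b""):
    """Serialize one frame as header + image data + footer."""
    header = HEADER.pack(frame_no, address, captured_at, len(image))
    return header + image + FOOTER_LEN.pack(len(footer)) + footer


def unpack_frame(buf, offset=0):
    """Read one frame starting at offset; return (fields, next_offset)."""
    frame_no, address, captured_at, size = HEADER.unpack_from(buf, offset)
    offset += HEADER.size
    image = buf[offset:offset + size]
    offset += size
    (footer_size,) = FOOTER_LEN.unpack_from(buf, offset)
    offset += FOOTER_LEN.size
    footer = buf[offset:offset + footer_size]
    offset += footer_size
    return (frame_no, address, captured_at, image, footer), offset
```

Because every record carries its own lengths, frames written back to back in frame-number order can be read sequentially by feeding the returned offset into the next call.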

Frames 1 to E recorded in the recording area 134 of the constant recording area 132 represent one video period of constant recording. For example, constant recording runs from when the engine of the vehicle 4 is started until it is turned off, and that span forms one video period.

Like the constant recording area 132, the inference event recording area 135 contains a management area 136 and a recording area 137. Various setting information is also recorded in the management area 136 by the dedicated software recorded in the file system area 131. In the recording area 137, image data representing the frames that make up a video is recorded in frame-number order. For each frame, the recording area 137 also has a header recording area 151, a frame image data recording area 152, and a footer recording area 153. The header recording area 151 holds additional information such as the frame number, address position, and capture time, and the footer recording area 153 holds additional information other than that recorded in the header recording area 151.

Frames 1 to E recorded in the recording area 137 of the inference event recording area 135 represent one video of event recording. For example, the period from when a target is detected until it is no longer detected forms one video of event recording. As described later, the target may instead be treated as no longer detected, and one event recording ended, only when the target remains undetected after a certain period has elapsed since detection was lost.

The user recording area 138 is an area in which the user can freely record data. It includes a management area 139 and a user information recording area 140.

The management area 139 manages the data recorded in the user information recording area 140. The management area 139 gives the start and end addresses of each block of data recorded in the user information recording area 140, so the desired data can be read and written.

When event recording starts, recording of the photographed video data into the recording area 137 of the inference event recording area 135 of the storage medium 71 begins. Because constant recording is in progress, the photographed video data is also recorded in the recording area 134 of the constant recording area 132 of the storage medium 71 regardless of whether a target is detected. The photographed video data may be held temporarily in the memory 52 and updated every few frames to every few tens of frames. Event recording can then read the photographed video data from before the target was detected out of the memory 52 and record it in the inference event recording area 135.
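One way to picture the temporary storage in the memory 52 is a fixed-size ring buffer that is flushed into the event recording when a target is first detected. This is a sketch under assumptions: the class name, the capacity, and the simplification that recording never stops are all illustrative, not taken from the description.

```python
from collections import deque


class PreEventBuffer:
    """Keeps only the most recent frames, standing in for the memory 52."""

    def __init__(self, capacity=30):
        self._frames = deque(maxlen=capacity)  # oldest frames drop out automatically

    def push(self, frame):
        self._frames.append(frame)

    def drain(self):
        """Return the buffered frames in order and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames


def record_event(frames, detected, capacity=3):
    """Return the frames written to the event area.

    Recording starts at the first detection and, in this simplified sketch,
    continues to the end of the input.
    """
    buffer = PreEventBuffer(capacity)
    event = []
    recording = False
    for frame, hit in zip(frames, detected):
        if hit and not recording:
            event.extend(buffer.drain())  # frames from before the detection
            recording = True
        if recording:
            event.append(frame)
        else:
            buffer.push(frame)
    return event
```

With a capacity of three, a detection at frame 5 yields an event recording that begins at frame 2, so the moments leading up to the detection are preserved.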

Returning to FIG. 8, when event recording starts, information such as the trained model 80 selected by the user and the inference results of the trained model 80 is recorded in the user information recording area 140 of the user recording area 138 in association with the image in which the target was detected (step 106 in FIG. 8). For example, the type and version of the selected trained model 80, the detection probability of the target obtained as the inference result, the place and the date and time at which the target was detected, the address or link destination on the storage medium 71 of the event-recorded video data or of the constantly recorded video data captured when the target was detected, and the point of interest at the time of detection are recorded in the user information recording area 140 (an example of an invisible area) in association with the image in which the target was detected. An area for recording video captions may be provided on the storage medium 71, and information about the trained model 80, such as its type, version, and inference results, may be recorded in that caption area. Information about the trained model 80 can then be displayed as a caption when an event-recorded video is played back.

When a target is detected, the detected target is surrounded by a frame (step 107 in FIG. 8).

The photographed image 160A shown in FIG. 12 corresponds to the photographed image 160 in FIG. 10.

When the targets 161, 162, and 163 are detected in the photographed image 160A, they are surrounded by frames 164, 165, and 166, respectively. By looking at the frames 164, 165, and 166, the user can tell that the targets 161, 162, and 163 are present. Even if not all of the targets 161, 162, and 163 are detected, any target that is detected is surrounded by a frame.

Image data representing the next frame of the photographed video represented by the photographed video data is input to the trained model 80, and inference by the trained model 80 is performed again (step 108 in FIG. 8). The process from step 98 in FIG. 7 is repeated.

If the target is also detected in the next frame (NO in step 109 in FIG. 8), the detected target is surrounded by a frame, image data representing the following frame is input to the trained model 80, and inference by the trained model 80 is repeated (step 108 in FIG. 8). In this way, inference on the images that make up the video represented by the photographed video data is repeated frame by frame.

In this embodiment, as described later, even when the target is no longer detected in the image (YES in step 109 in FIG. 8), a frame (an example of a mark that identifies the part occupied by the object) continues to be displayed (an example of interpolation) at the place where the frame had been displayed until a predetermined time, for example 0.2 seconds, has elapsed since detection was lost (NO in step 111 in FIG. 8). If the target is still undetected once the predetermined time, for example 0.2 seconds, has elapsed, the frame is erased and event recording is stopped (step 113 in FIG. 8). When a photographing end command is input to the photographing device 1 (YES in step 114 in FIG. 8), the processing shown in FIGS. 6 to 8 ends. If no photographing end command is input (NO in step 114 in FIG. 8), the processing from step 78 in FIG. 7 is repeated.

FIG. 13 shows some of the images that make up the photographed video represented by the photographed video data.

If the targets 161, 162, and 163 are on the far side of a fence 167 in the image of frame N, inference on that image detects them, and they are surrounded by frames 164, 165, and 166. In the image of frame (N+1), however, the targets 161, 162, and 163 may go undetected because of the fence 167 even though they are still on its far side. In this embodiment, therefore, an interpolated image, or at least the frames, is generated by interpolation until a predetermined time, for example 0.2 seconds, has elapsed since the targets stopped being detected, and frames 164a, 165a, and 166a are displayed where the targets had been detected. When the targets 161, 162, and 163 are detected again by inference in the image of frame (N+M), the frames 164, 165, and 166 surrounding them are displayed.

FIG. 14 also shows some of the images that make up the photographed video represented by the photographed video data.

In FIG. 14 as well, if the targets 161, 162, and 163 are on the far side of the fence 167 in the image of frame N, inference detects them and they are surrounded by frames 164, 165, and 166. As in FIG. 13, if in the image of frame (N+1) the targets 161, 162, and 163 go undetected because of the fence 167 even though they are on its far side, an interpolated image is generated until a predetermined time (for example, 0.2 seconds) has elapsed since detection was lost, and frames 164a, 165a, and 166a are displayed where the targets had been detected. If the targets are still not detected in the image of frame (N+L) after the predetermined time (for example, 0.2 seconds) has elapsed, the frames 164a, 165a, and 166a are erased.
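The behaviour shown in FIGS. 13 and 14 amounts to holding the last known frames for a short time after detection is lost. The sketch below assumes timestamps in seconds and a hypothetical `BoxHolder` class; only the 0.2-second example value comes from the description.

```python
HOLD_SECONDS = 0.2  # example value given in the description


class BoxHolder:
    """Keeps drawing the last boxes for a short time after detection is lost."""

    def __init__(self, hold=HOLD_SECONDS):
        self.hold = hold
        self.last_boxes = []
        self.last_seen = None  # time of the most recent real detection

    def update(self, t, boxes):
        """Return (boxes_to_draw, interpolated) for the frame captured at time t."""
        if boxes:
            self.last_boxes = list(boxes)
            self.last_seen = t
            return self.last_boxes, False
        if self.last_seen is not None and t - self.last_seen < self.hold:
            # Detection lost only recently: reuse the previous boxes (FIG. 13).
            return self.last_boxes, True
        # Hold time elapsed without a new detection: erase the boxes (FIG. 14).
        self.last_boxes = []
        self.last_seen = None
        return [], False
```

A caller would stop event recording at the same moment the boxes are erased, which corresponds to step 113 in the flow described earlier.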

FIG. 15 shows an example of one frame image of the photographed video obtained by the photographing device 1.

In the image 160B, as in FIGS. 10 and 12, the targets 161, 162, and 163 are on the left side as seen in the photographing direction. Inputting the image 160B to the trained model 80 for inference detects these targets, and they are surrounded by frames 164, 165, and 166. Besides the vehicle ahead in the photographing direction, there is a vehicle 180 on the right side as seen in the photographing direction, with targets 181 and 182 inside it. When the image 160B is input to the trained model 80, the targets 181 and 182 are also detected, and frames 183 and 184 surrounding them are also displayed.

The purpose of detecting targets, however, is, for example, to detect a target walking along the road and alert the driver of the vehicle 4; it is not necessarily needed to inform the driver of the vehicle 4 about targets riding in another vehicle 180 or the like. In this embodiment, therefore, a trained model 80 is used that detects targets whose speed relative to the vehicle 4 carrying the photographing device 1 is at or below a certain speed and does not detect targets whose relative speed exceeds that speed. For example, when the trained model 80 shown in FIG. 5 is generated, the relative speed of each target is used as training data. If the vehicle 4 carrying the photographing device 1 travels at v1 km/h and the vehicle 180 carrying the detected targets 181 and 182 approaches the vehicle 4 at v2 km/h, the relative speed is (v1 + v2) km/h. Images of targets whose relative speed exceeds the certain speed are excluded from detection, and the trained model 80 is generated from the training data so that it detects images of targets whose relative speed is at or below the certain speed. When such a trained model 80 is used, relative speed data for the targets in the photographed image is input to the trained model 80 for processing. The relative speed can be obtained from the shooting interval between two images and the distance the target moved. Alternatively, the trained model 80 that detects targets as shown in FIG. 5 may be made to infer so that the probability of target-likeness output from the output layer 83 becomes 0% for targets whose relative speed exceeds the certain speed, or the probability may be left unchanged and the inference result corrected so that the target is treated as undetected when the relative speed is at or above the certain speed.
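The relative-speed filtering can be illustrated as a post-processing step on the inference result. This is a sketch under assumptions: the function names, the tuple layout of a detection, and the conversion from metres per second are illustrative, and the description leaves open whether the filtering happens inside the model or as a correction afterwards.

```python
def relative_speed_kmh(distance_m, interval_s):
    """Relative speed from the distance a target moved between two images
    and the shooting interval between those images."""
    if interval_s <= 0:
        raise ValueError("shooting interval must be positive")
    return distance_m / interval_s * 3.6  # m/s to km/h


def filter_by_relative_speed(detections, max_kmh):
    """Keep detections whose relative speed is at or below max_kmh.

    Each detection is an assumed (label, probability, relative_kmh) tuple.
    """
    return [d for d in detections if d[2] <= max_kmh]


# Occupants of an oncoming vehicle: own speed v1 plus oncoming speed v2.
v1, v2 = 40.0, 50.0
detections = [
    ("pedestrian", 0.9, 5.0),
    ("occupant", 0.8, v1 + v2),
]
kept = filter_by_relative_speed(detections, max_kmh=30.0)
```

Here the pedestrian is kept while the occupant of the oncoming vehicle, whose relative speed is v1 + v2, is dropped from the result.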

Similarly, a trained model 80 that does not detect targets inside vehicles may be generated and used to detect targets in photographed images. For targets inside a vehicle, the model may be made to infer, or the result may be corrected, so that the probability becomes 0%. A trained model 80 that detects vehicles may also be used alongside: when a target is detected inside a detected vehicle, the probability given as the inference result for that target may be set to 0%, or the target may be corrected to undetected.
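Combining a person detector with a vehicle detector could be sketched as a box-containment check. The box format (x, y, width, height), the rule that a person box must lie fully inside a vehicle box, and the function names are assumptions for illustration only.

```python
def inside(inner, outer):
    """True when the inner box lies completely within the outer box.

    Boxes are assumed (x, y, width, height) tuples in image coordinates.
    """
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return ix >= ox and iy >= oy and ix + iw <= ox + ow and iy + ih <= oy + oh


def suppress_occupants(persons, vehicles):
    """Set the probability to 0.0 for persons located inside any vehicle box.

    persons is a list of (box, probability); vehicles is a list of boxes.
    """
    return [
        (box, 0.0 if any(inside(box, v) for v in vehicles) else prob)
        for box, prob in persons
    ]
```

A real system would likely need a looser overlap criterion than full containment, since a detected person box can extend slightly past a window or door outline.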

Furthermore, when the wall of a building or some other surface is mirror-like, a target may be reflected in it, and the reflected target may be detected. Inference may be performed with a trained model 80 that can exclude such targets. Such a trained model 80 can be generated by using targets reflected in mirror surfaces as training data so that reflected targets are not detected as targets.

In the embodiment above, a discrimination circuit receives the data representing the probability output from the output layer 83 of the trained model 80 shown in FIG. 5 and determines whether a target has been detected. When inference has been performed and this discrimination circuit outputs data indicating that a target was detected, the photographing device 1 may be controlled so that the threshold of the discrimination circuit is lowered or raised. Lowering the threshold allows even doubtful cases to be detected as targets so that the driver of the vehicle 4 can be alerted; raising it leaves ambiguous objects undetected so that targets are detected more reliably.

[Second Embodiment]
FIGS. 16 to 19 show a second embodiment. The second embodiment can perform both event recording based on the result of inference and event recording based on the output of the sensor unit 69 (referred to here as double event recording).

FIGS. 16 and 17 are flowcharts showing the processing procedure of the photographing device 1.

When the photographing device 1 starts photographing (step 201 in FIG. 16), the video obtained by photographing is displayed on the display screen 23 (step 202 in FIG. 16).

Pressing the third button 29 displays a menu image on the display screen 23, and the double event recording mode can be set from that menu image (step 203 in FIG. 16). If the double event recording mode is not set (NO in step 203 in FIG. 16), the process moves to step 93 in FIG. 6.

When the double event recording mode is set (YES in step 203 in FIG. 16), the inference event recording process shown in FIGS. 6 to 9 (step 204 in FIG. 16) and the sensor event process are performed in parallel. The constant recording process also runs in parallel. The inference event recording process is the same as the normal event process described above, and the process from step 94 in FIG. 6 is performed.
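The effect of running the three recording processes side by side can be sketched as per-frame routing into separate areas. The dictionary keys and the `route_frame` function are hypothetical, and the sketch processes frames sequentially rather than modelling actual parallel execution.

```python
def route_frame(frame, inferred_hit, sensor_hit, areas):
    """Append one frame to every recording area whose condition holds."""
    areas["constant"].append(frame)  # constant recording always runs
    if inferred_hit:
        areas["inference_event"].append(frame)
    if sensor_hit:
        areas["sensor_event"].append(frame)
    return areas


areas = {"constant": [], "inference_event": [], "sensor_event": []}
# Each entry: (frame, detected by inference?, detected by sensor?)
stream = [(1, False, False), (2, True, False), (3, True, True), (4, False, True)]
for frame, by_model, by_sensor in stream:
    route_frame(frame, by_model, by_sensor, areas)
```

Frame 3 lands in all three areas, which is the double event recording case where the model and the sensor detect a target at the same time.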

The sensor event process starts from step 205 in FIG. 16.

Whether a target has been detected is determined on the basis of the output of the sensor unit 69 (step 205 in FIG. 16). For example, an infrared sensor may be provided in the sensor unit 69 so that, in night photography, a target can be detected from the presence of a warm region matching the shape of the target in the output of the infrared sensor. An ultrasonic sensor may also be provided in the sensor unit 69, and the target detected on the basis of its output.

When a target is detected on the basis of the output from the sensor unit 69 (YES in step 205 in FIG. 16), sensor event recording starts (step 206 in FIG. 16). When sensor event recording starts, the position on the image of the target detected from the output of the sensor unit 69, the date and time of detection, the type and version of the sensor that made the detection, and so on are recorded in the storage medium 71.

FIG. 18 shows an example of the format of the storage medium 71.

In FIG. 18, items identical to those shown in FIG. 11 are given the same reference numerals, and their description is omitted.

In the format of the storage medium 71 shown in FIG. 18, a sensor event recording area 221 is provided in addition to the file system area 131, the constant recording area 132, and the inference event recording area 135 described above. The user recording area 138 shown in FIG. 11 may also be provided to record information about the trained model 80, inference results, and the like, or such information may be recorded in the management area 136 of the inference event recording area 135 and the management area 222 of the sensor event recording area 221.

Photographed video data in which a target was detected by inference (an example of first photographed video data) is recorded in the recording area 137 of the inference event recording area 135. The photographed video data may represent a video in which the target is surrounded by a frame or a video in which it is not. When the target is not surrounded by a frame, the target in the video obtained at playback would be surrounded by a frame using the inference result.

Like the inference event recording area 135, the sensor event recording area 221 contains a management area 222 and a recording area 223. Various setting information is also recorded in the management area 222 by the dedicated software recorded in the file system area 131. In the recording area 223, image data representing the frames that make up a video is recorded in frame-number order. For each frame, the recording area 223 also has a header recording area 151, a frame image data recording area 152, and a footer recording area 153. The header recording area 151 holds additional information such as the frame number, address position, and capture time, and the footer recording area 153 holds additional information other than that recorded in the header recording area 151. Information about the trained model 80, inference results, and the like may be recorded in the header recording area 151 and the footer recording area 153.

Frames n to γ recorded in the recording area 223 of the sensor event recording area 221 represent one video of sensor event recording. In this way, photographed video data in which a target was detected on the basis of the output of the sensor unit 69 (an example of second photographed video data) is recorded. The photographed video data may represent a video in which the target is surrounded by a frame or a video in which it is not. When the target is not surrounded by a frame, the target in the video obtained at playback would be surrounded by a frame using the output of the sensor unit 69.

Referring to FIG. 17, when a target is detected on the basis of the output from the sensor unit 69, the target in the images that make up the photographed video represented by the photographed video data is surrounded by a frame, as in inference event recording (step 207 in FIG. 17). While the target continues to be detected on the basis of the output from the sensor unit 69 (YES in step 208 in FIG. 17), the process of displaying the detected target surrounded by a frame is repeated.

Even when the target is no longer detected on the basis of the output from the sensor unit 69 (NO in step 208 in FIG. 17), the place in the image where it had been detected continues to be marked with a frame, as described with reference to FIGS. 13 and 14 (step 209 in FIG. 17). Until a predetermined time, for example 0.2 seconds, has elapsed since detection was lost (NO in step 210 in FIG. 17), the frame continues to be displayed even though the target (person) is no longer detected in the image. Once the predetermined time, for example 0.2 seconds, has elapsed since detection was lost (YES in step 210 in FIG. 17), the frame is erased (step 211 in FIG. 17), and sensor event recording to the sensor event recording area 221 stops (step 212 in FIG. 17). When a photographing end command is input to the photographing device 1, its processing ends (YES in step 213 in FIG. 17); if no end command is input, the sensor event process from step 205 in FIG. 16 is repeated (NO in step 213 in FIG. 17).

In FIG. 18, the inference results may be recorded in the header recording area 151 and the footer recording area 153 of the recording area 137 of the inference event recording area 135 (the header recording area 151 and the footer recording area 153 are recording areas that form part of the video file), and inference-related information other than the inference results may be recorded in the management area 136 of the inference event recording area 135. In this way, for example, the data excluding the inference results can be recorded in a file separate from the video file. The data excluding the inference results may, of course, instead be recorded in the user recording area 138.

図19は記憶媒体71のフォーマットの他の一例である。 FIG. 19 shows another example of the format of the storage medium 71.

図18に示す記憶媒体71のフォーマットでは、たとえば、推論イベント記録領域135には学習済モデル80に撮影動画を構成する画像を入力し学習させることにより対象を検出して、検出した対象を枠で囲んだ画像を表すデータが記録され、センサ・イベント記録領域221にはセンサ部69からの出力にもとづいて対象を検出して、検出した対象を枠で囲んだ画像を表すデータが記録される。これに対し、図19に示す記憶媒体のフォーマットでは、ファイル・システム領域131、常時記録領域132、推論イベント記録領域135およびセンサ・イベント記録領域221の他に、推論結果記録領域224（動画ファイルに記録されている推論の結果を、動画ファイルとは別のファイルに記録する領域の一例である）およびセンサ出力結果記録領域227が形成されている。 In the format of the storage medium 71 shown in FIG. 18, for example, the inference event recording area 135 records data representing images in which a target, detected by inputting the images constituting the captured video to the trained model 80, is surrounded by a frame, and the sensor event recording area 221 records data representing images in which a target, detected based on the output from the sensor unit 69, is surrounded by a frame. In contrast, in the format of the storage medium shown in FIG. 19, an inference result recording area 224 (an example of an area in which the inference results recorded in the video file are recorded in a file separate from the video file) and a sensor output result recording area 227 are formed in addition to the file system area 131, the constant recording area 132, the inference event recording area 135, and the sensor event recording area 221.

推論結果記録領域224も、管理領域225と記録領域226とが形成されている。管理領域225にもファイル・システム領域131に記録されている専用ソフトウエアによって各種設定情報が記録される。記録領域226には、推論イベント記録領域135に記録された推論結果にもとづく画像、対象を枠で囲んでいる画像を表す画像データに対応して推論結果が記録される。たとえば、一つ目の推論イベント記録が行われると、その推論イベント記録を行わせるための推論の結果、たとえば、検出対象の存在位置、存在確率、使用した学習済モデル80の種類、使用した学習済モデル80のバージョン、対象を検出した場所、日時などが、推論イベント記録と対応づけて推論結果記録領域224の記録領域226に記録される。たとえば、推論イベント記録の記録場所を表すアドレスを推論結果と一緒に記録する。 The inference result recording area 224 likewise includes a management area 225 and a recording area 226. Various setting information is recorded in the management area 225 as well, by the dedicated software recorded in the file system area 131. In the recording area 226, inference results are recorded in association with the image data recorded in the inference event recording area 135, that is, image data representing images based on the inference results in which the target is surrounded by a frame. For example, when the first inference event recording is performed, the results of the inference that triggered that recording, such as the position of the detected target, its existence probability, the type and version of the trained model 80 used, the location where the target was detected, and the date and time, are recorded in the recording area 226 of the inference result recording area 224 in association with the inference event record. For example, an address indicating where the inference event record is stored is recorded together with the inference result.

同様に、センサ出力結果記録領域227も、管理領域228と記録領域229とが形成されている。管理領域228にもファイル・システム領域131に記録されている専用ソフトウエアによって各種設定情報が記録される。記録領域229には、センサ・イベント記録領域221に記録された画像、人物を枠で囲んでいる画像、を表す画像データに対応してセンサ出力結果が記録される。たとえば、一つ目のセンサ・イベント記録が行われると、そのセンサ・イベント記録を行わせるためのセンサ出力の結果、たとえば、検出対象の存在位置、存在確率、使用したセンサの種類、使用したセンサのバージョン、対象を検出した場所、日時などが、センサ・イベント記録と対応づけてセンサ出力結果記録領域227の記録領域229に記録される。たとえば、センサ・イベント記録の記録場所を表すアドレスをセンサ出力結果と一緒に記録する。 Similarly, the sensor output result recording area 227 includes a management area 228 and a recording area 229. Various setting information is recorded in the management area 228 as well, by the dedicated software recorded in the file system area 131. In the recording area 229, sensor output results are recorded in association with the image data recorded in the sensor event recording area 221, that is, image data representing images in which a person is surrounded by a frame. For example, when the first sensor event recording is performed, the sensor output results that triggered that recording, such as the position of the detected target, its existence probability, the type and version of the sensor used, the location where the target was detected, and the date and time, are recorded in the recording area 229 of the sensor output result recording area 227 in association with the sensor event record. For example, an address indicating where the sensor event record is stored is recorded together with the sensor output result.
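As a rough illustration of how a result record might be tied to its event record by address, the sketch below models one entry of the inference result recording area. The field names, the JSON-lines serialization, and the `(latitude, longitude)` form of the detection location are all assumptions made for this example; the publication only lists the kinds of information recorded and says the address of the event record is stored with the result.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass
class InferenceResultRecord:
    """One hypothetical entry of the inference result recording area."""

    event_address: int                    # where the linked inference event record is stored
    position: Tuple[int, int, int, int]   # position of the detected target in the image
    probability: float                    # existence probability reported by the model
    model_type: str                       # type of trained model used
    model_version: str                    # version of trained model used
    location: Tuple[float, float]         # place of detection (latitude, longitude assumed)
    detected_at: str                      # date and time, ISO 8601 string assumed


def serialize(records: List[InferenceResultRecord]) -> str:
    """Serialize records as JSON lines (an assumed on-disk form)."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


def find_by_event(records: List[InferenceResultRecord],
                  event_address: int) -> List[InferenceResultRecord]:
    """Return the results recorded for the event stored at `event_address`."""
    return [r for r in records if r.event_address == event_address]


records = [
    InferenceResultRecord(4096, (120, 80, 60, 140), 0.91, "person", "1.2",
                          (35.0, 137.0), "2022-05-31T10:15:00"),
    InferenceResultRecord(8192, (300, 90, 50, 120), 0.74, "person", "1.2",
                          (35.0, 137.0), "2022-05-31T10:20:00"),
]
```

A sensor output result record could follow the same shape, with the sensor type and version in place of the model type and version.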

［第３実施例］ [Third Embodiment]
図20から図24は、第３実施例を示すものである。この実施例では、撮影装置１とは異なるＡＩ（Artificial Intelligence）モジュールにおいて推論が行われる。 FIGS. 20 to 24 show a third embodiment. In this embodiment, inference is performed in an AI (Artificial Intelligence) module separate from the imaging device 1.

図20は、ＡＩモジュール230を備えたシステム240の概要を示している。 FIG. 20 shows an overview of a system 240 with an AI module 230.

システム240には、撮影装置１、ＡＩモジュール230、推論モジュールの一例である、および表示装置239が含まれている。撮影装置１とＡＩモジュール230とは互いに通信可能であり、ＡＩモジュール230は表示装置239に動画データを出力できる。 The system 240 includes the imaging device 1, the AI module 230 (an example of an inference module), and a display device 239. The imaging device 1 and the AI module 230 can communicate with each other, and the AI module 230 can output video data to the display device 239.

図21は、ＡＩモジュール230の電気的構成を示すブロック図である。 FIG. 21 is a block diagram showing the electrical configuration of the AI module 230.

ＡＩモジュール230の全体の動作は、ＣＰＵ（Central Processing Unit）231によって統括される。 The entire operation of the AI module 230 is controlled by a CPU (Central Processing Unit) 231.

ＡＩモジュール230には映像インターフェイス232およびデータ・インターフェイス233が含まれている。映像インターフェイス232は撮影装置１の映像入出力部76と接続されている。データ・インターフェイス233は撮影装置１のデータ入出力部77と接続されている。 AI module 230 includes a video interface 232 and a data interface 233. The video interface 232 is connected to the video input/output section 76 of the photographing device 1. The data interface 233 is connected to the data input/output section 77 of the imaging device 1.

また、ＡＩモジュール230には学習済モデル記憶装置234が含まれており、多種多様な学習済モデル80が記憶されている。ＡＩモジュール230には、メモリ・カード238などの記録媒体にデータを書き込み、かつメモリ・カード238などの記録媒体に所望のデータを記憶するメモリ・カード・リーダ・ライタ235、および所定のデータを記憶するメモリ236も含まれている。さらに、ＡＩモジュール230には、表示装置239に学習済み撮影動画データおよびオリジナル撮影動画データを出力する映像インターフェイス237も含まれている。 The AI module 230 also includes a trained model storage device 234, which stores a wide variety of trained models 80. The AI module 230 further includes a memory card reader/writer 235, which writes data to a recording medium such as a memory card 238 and stores desired data on it, and a memory 236 that stores predetermined data. In addition, the AI module 230 includes a video interface 237 that outputs both the captured video data processed by the trained model and the original captured video data to the display device 239.

図22は、撮影装置１の表示面23に表示される学習済モデル情報251の一例である。 FIG. 22 is an example of learned model information 251 displayed on the display screen 23 of the photographing device 1.

撮影装置１とＡＩモジュール230とが接続され、撮影装置１の表示面23にメニューが表示され、そのメニューに含まれるＡＩモジュール230の学習済モデル情報表示指令がタッチされるとＡＩモジュール230の学習済モデル記憶装置234に記憶されているＡＩモジュールについての学習済みモデルの種類についての情報252、学習済モデル80のバージョンについての情報253などが学習済モデル記憶装置234から読み取られ、学習済モデル情報として図22に示すように表示される。推論モデルの種類、推論モデルのバージョンなど、どのような推論モデルかを示す情報の報知としての表示の一例である。学習済モデル80がファイル形式で学習済モデル記憶装置234に記憶されていれば、それぞれのファイルのヘッダなどに学習済モデル80の種類、バージョンなどの情報が記録されているので、そのヘッダなどから情報を読み取り、表示できる。 When the imaging device 1 and the AI module 230 are connected, a menu is displayed on the display screen 23 of the imaging device 1. When the trained model information display command for the AI module 230 included in that menu is touched, information 252 about the types of trained models stored in the trained model storage device 234 of the AI module 230, information 253 about the versions of the trained models 80, and the like are read from the trained model storage device 234 and displayed as trained model information, as shown in FIG. 22. This is an example of a display that serves as notification of what kind of inference model is involved, such as the type and version of the inference model. If the trained models 80 are stored in the trained model storage device 234 as files, information such as the type and version of each trained model 80 is recorded in, for example, the header of each file, so the information can be read from the header and displayed.

図23は、ＡＩモジュール230の処理手順を示すフローチャートである。 FIG. 23 is a flowchart showing the processing procedure of the AI module 230.

図22に示したように撮影装置１の表示面23に学習済モデル80についての情報および学習済モデル80の名称を表示させ、利用する学習済モデル80を撮影装置１のユーザに選択させる。学習モデルが選択されると選択指令が撮影装置１のデータ入出力部77から出力され、ＡＩモジュール230に入力し、ＡＩモジュール230のＣＰＵ231に入力する。ＡＩモジュール230のＣＰＵ231において学習済モデル80の選択指令を入力すると（図23ステップ241）、選択指令に応じた学習済モデル80で撮影動画データを推論させるように学習済モデル80が設定される（図23ステップ242）。 As shown in FIG. 22, information about the trained models 80 and their names are displayed on the display screen 23 of the imaging device 1, and the user of the imaging device 1 selects the trained model 80 to use. When a trained model is selected, a selection command is output from the data input/output unit 77 of the imaging device 1, input to the AI module 230, and passed to its CPU 231. When the CPU 231 of the AI module 230 receives the selection command for the trained model 80 (step 241 in FIG. 23), the trained model 80 specified by the command is set as the model that performs inference on the captured video data (step 242 in FIG. 23).

撮影装置１において撮影が開始され、撮影動画データ（学習済モデル80により推論されていない撮影動画データをオリジナル撮影動画データということにする）が撮影装置１の映像入出力部76からＡＩモジュールに送信される。オリジナル撮影動画データがＡＩモジュール230に入力すると（図23ステップ243でＹＥＳ）、設定された学習済モデル80にオリジナル撮影動画データが１フレームの画像ずつ入力され、推論させられる（図23ステップ244）。 Imaging starts in the imaging device 1, and the captured video data is sent from the video input/output unit 76 of the imaging device 1 to the AI module (captured video data on which the trained model 80 has not performed inference is referred to as original captured video data). When the original captured video data is input to the AI module 230 (YES in step 243 in FIG. 23), it is input to the selected trained model 80 one frame at a time, and inference is performed (step 244 in FIG. 23).

上述のように、推論により撮影動画を構成する画像から対象が検出されると（図23ステップ245でＹＥＳ）、検出した対象がスーパー・インポーズなどによって枠で囲まれる（図23ステップ246）。推論により対象が枠で囲まれている撮影動画を表す推論済撮影動画データは映像インターフェイス237から表示装置239に出力される。車載の撮影装置１とは別のデバイスに出力することの一例である。（図23ステップ247）。表示装置239の表示画面には対象が枠で囲まれた撮影動画が表示されるようになる。対象が検出されていると（図23ステップ248でＮＯ）、ステップ246および247の処理が繰り返される。推論済み撮影動画データをデータ・インターフェイス233から撮影装置１に送信し、撮影装置１の記憶媒体71に推論済撮影動画データを記録するようにしてもよい。 As described above, when a target is detected by inference from the images constituting the captured video (YES in step 245 in FIG. 23), the detected target is surrounded by a frame, for example by superimposition (step 246 in FIG. 23). Inferred captured video data, which represents the captured video with the target surrounded by a frame, is output from the video interface 237 to the display device 239 (step 247 in FIG. 23). This is an example of output to a device other than the vehicle-mounted imaging device 1. The display screen of the display device 239 then shows the captured video with the target surrounded by a frame. While the target continues to be detected (NO in step 248 in FIG. 23), the processing of steps 246 and 247 is repeated. The inferred captured video data may also be sent from the data interface 233 to the imaging device 1 and recorded on the storage medium 71 of the imaging device 1.

対象が検出されなくなると（図23ステップ248でＹＥＳ）、枠が消去される（図23ステップ249）。撮影装置１からデータ・インターフェイス233を介してＡＩモジュール230に終了指令が入力すると（図23ステップ250でＹＥＳ）、ＡＩモジュール230の処理は終了する。終了指令が入力しなければ（図23ステップ250でＮＯ）、ステップ244からの処理が繰り返される。 When the object is no longer detected (YES in step 248 in FIG. 23), the frame is erased (step 249 in FIG. 23). When a termination command is input from the photographing device 1 to the AI module 230 via the data interface 233 (YES in step 250 in FIG. 23), the processing of the AI module 230 is terminated. If the termination command is not input (NO in step 250 in FIG. 23), the processing from step 244 is repeated.

メモリ・カード・リーダ・ライタ235を用いて推論済撮影動画データをメモリ・カード238に記録してもよい。また、推論結果、対象を検出した画像のフレーム番号、撮影日時、確率、対象を検出した画像における対象の位置など、を図19などのフォーマットにしたがってメモリ・カード238に記録してもよいし、データ・インターフェイス233から撮影装置１に送信して撮影装置１において記憶媒体71に記録するようにしてもよい。 The inferred captured video data may be recorded on the memory card 238 using the memory card reader/writer 235. Further, the inference result, the frame number of the image in which the object was detected, the shooting date and time, the probability, the position of the object in the image in which the object was detected, etc. may be recorded in the memory card 238 according to a format such as that shown in FIG. The data may be transmitted from the data interface 233 to the photographing device 1 and recorded in the storage medium 71 in the photographing device 1.

図24は、ＡＩモジュール230の処理手順の一部を示すフローチャートである。 FIG. 24 is a flowchart showing part of the processing procedure of the AI module 230.

図24に示す処理は、図23のステップ243の処理からつづく。 The processing shown in FIG. 24 continues from step 243 of FIG. 23.

ＡＩモジュール230にオリジナル撮影動画データが入力すると、設定された学習済モデル80で推論させられる（図24ステップ244）。対象が検出されると（図24ステップ245でＹＥＳ）、検出された対象が枠で囲まれる（図24ステップ246）。 When the original photographed video data is input to the AI module 230, it is caused to perform inference using the set trained model 80 (step 244 in FIG. 24). When an object is detected (YES in step 245 in FIG. 24), the detected object is surrounded by a frame (step 246 in FIG. 24).

また、オリジナル撮影動画データは、選択された学習済モデル80での推論および枠の表示に必要な時間だけＣＰＵ231によって遅延させられる（図24ステップ251）。遅延させられたオリジナル撮影動画データと推論済撮影動画データとが映像インターフェイス237から表示装置239に出力される（図24ステップ252）。表示装置239には、オリジナル撮影動画データと推論済撮影動画データとが入力するので、推論済の撮影動画と推論していないオリジナルの撮影動画とを表示して比較できる。 Further, the original photographed video data is delayed by the CPU 231 by the time necessary for inference with the selected learned model 80 and display of the frame (step 251 in FIG. 24). The delayed original shot video data and inferred shot video data are output from the video interface 237 to the display device 239 (step 252 in FIG. 24). Since the original shot video data and the inferred shot video data are input to the display device 239, the inferred shot video and the original shot video that has not been inferred can be displayed and compared.
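The delay applied to the original video in step 251 can be pictured as a fixed-length first-in, first-out buffer. The sketch below is an assumption-laden illustration: the names `FrameDelay` and `delay_in_frames`, the idea of expressing the delay as a whole number of frames, and the frame rate and latency used in the usage note are not taken from the publication.

```python
import math
from collections import deque
from typing import Deque, Generic, Optional, TypeVar

T = TypeVar("T")


def delay_in_frames(latency_seconds: float, fps: float) -> int:
    """Convert a processing latency into a whole number of frames, rounding up."""
    # round() first so that floating-point noise such as 3.0000000000000004
    # is not rounded up to an extra frame.
    return math.ceil(round(latency_seconds * fps, 6))


class FrameDelay(Generic[T]):
    """Delays a stream of frames by a fixed number of frames."""

    def __init__(self, delay_frames: int) -> None:
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self._delay = delay_frames
        self._buffer: Deque[T] = deque()

    def push(self, frame: T) -> Optional[T]:
        """Add the newest frame; return the frame from `delay_frames` ago, if any."""
        self._buffer.append(frame)
        if len(self._buffer) > self._delay:
            return self._buffer.popleft()
        return None  # still filling the buffer, nothing to output yet
```

For example, if inference and frame drawing were assumed to take 0.1 s at 30 frames per second, `FrameDelay(delay_in_frames(0.1, 30))` would hold each original frame back by three frames so that it lines up with the corresponding inferred frame on the display device.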

対象が検出されていると（図24ステップ253でＮＯ）、ステップ246からの処理が繰り返される。対象が検出されなくなると（図24ステップ253でＹＥＳ）、枠が消去され、撮影装置１からＡＩモジュール230に終了指令が入力すると（図24ステップ255でＹＥＳ）、ＡＩモジュール230の処理は終了する。ＡＩモジュール230に終了指令が入力しなければ（図24ステップ255でＮＯ）、ステップ244からの処理が繰り返される。 While the target continues to be detected (NO in step 253 in FIG. 24), the processing from step 246 is repeated. When the target is no longer detected (YES in step 253 in FIG. 24), the frame is erased. When an end command is input from the imaging device 1 to the AI module 230 (YES in step 255 in FIG. 24), the processing of the AI module 230 ends. If no end command is input to the AI module 230 (NO in step 255 in FIG. 24), the processing from step 244 is repeated.

また、撮影装置１の映像入出力部76から第１のラインを通してＡＩモジュール230の映像インターフェイス232にオリジナル撮影動画データを入力し、同様に、撮影装置１のデータ入出力部77から第２のラインを通してＡＩモジュール230のデータ・インターフェイス233にオリジナル撮影動画データを入力してもよい。第１のラインを通してＡＩモジュール230に入力したオリジナル撮影動画データを、学習済モデル80を用いて推論し、対象を検出することにより対象を囲む枠を表示する撮影動画を表す推論済撮影動画データを、映像インターフェイス237から表示装置239に出力し、あるいは映像インターフェイス232から撮影装置１に出力し、第２のラインを通してＡＩモジュール230に入力したオリジナル撮影動画データを、学習済モデル80を用いて推論し、対象を枠で囲む処理に必要な時間だけ遅延させて映像インターフェイス237から表示装置239に出力し、あるいはデータ・インターフェイス233から撮影装置１に出力してもよい。このようにすることで、オリジナル撮影動画データによって表される撮影動画と推論済撮影動画データによって表される撮影動画とを対比して表示できる。 In addition, the original captured video data may be input from the video input/output unit 76 of the imaging device 1 to the video interface 232 of the AI module 230 through a first line and, likewise, from the data input/output unit 77 of the imaging device 1 to the data interface 233 of the AI module 230 through a second line. Inference is performed with the trained model 80 on the original captured video data input through the first line, and the resulting inferred captured video data, which represents a captured video displaying a frame around each detected target, is output from the video interface 237 to the display device 239 or from the video interface 232 to the imaging device 1. The original captured video data input through the second line is delayed by the time needed to perform inference with the trained model 80 and to surround the target with a frame, and is then output from the video interface 237 to the display device 239 or from the data interface 233 to the imaging device 1. In this way, the captured video represented by the original captured video data and the captured video represented by the inferred captured video data can be displayed side by side for comparison.

［第４実施例］ [Fourth Embodiment]
図25から図27は、第４実施例を示している。 FIGS. 25 to 27 show a fourth embodiment.

図25は、推論用サーバ260を備えたシステム269を示している。 FIG. 25 shows a system 269 that includes an inference server 260.

撮影装置１と推論用サーバ260とはいずれもインターネットに接続可能である。撮影装置１は通信部68によってインターネットに接続可能であり、推論用サーバ260は後述する通信回路262によってインターネットに接続できる。 Both the imaging device 1 and the inference server 260 can be connected to the Internet. The photographing device 1 can be connected to the Internet through a communication unit 68, and the inference server 260 can be connected to the Internet through a communication circuit 262, which will be described later.

図26は、推論用サーバ260の電気的構成を示すブロック図である。 FIG. 26 is a block diagram showing the electrical configuration of the inference server 260.

推論用サーバ260の全体の動作は、制御装置261によって統括される。 The entire operation of the inference server 260 is supervised by the control device 261.

推論用サーバ260には、上述したようにインターネットに接続するための通信回路262、データを一時的に記憶するメモリ263、多種多様な学習済モデル、たとえば、撮影装置１の学習済モデルよりも高精度の学習済モデル、80を記憶している学習済モデル記憶装置264、所定のデータを記憶するハードディスク266、ハードディスク266にデータを書き込み、かつハードディスク266に書き込まれているデータを読み取るハードディスク・ドライブ265および推論用サーバ260に指令等を与える入力装置267が含まれている。 As described above, the inference server 260 includes the communication circuit 262 for connecting to the Internet; a memory 263 that temporarily stores data; a trained model storage device 264 that stores a wide variety of trained models 80, for example trained models more accurate than the trained model of the imaging device 1; a hard disk 266 that stores predetermined data; a hard disk drive 265 that writes data to the hard disk 266 and reads data written on it; and an input device 267 for giving commands and the like to the inference server 260.

図27は、撮影装置１と推論用サーバ260との処理手順を示すフローチャートである。 FIG. 27 is a flowchart showing the processing procedure of the imaging device 1 and the inference server 260.

上述したように撮影装置１において記憶媒体71に常時記録撮影動画データおよび推論イベント記録撮影動画データが記録されると、それらの常時記録撮影動画データおよび推論イベント記録撮影動画データが撮影装置１から推論用サーバ260に送信される（図27ステップ271）。これらのデータの推論用サーバ260への送信タイミングは、撮影装置１のユーザからメニューを利用した送信コマンドが撮影装置１に与えられたときでもよいし、推論用サーバ260からの送信コマンドが撮影装置１に与えられたときでもよく、任意のタイミングでよい。 As described above, when the constantly recorded video data and the inferred event recorded video data are recorded in the storage medium 71 in the imaging device 1, the constantly recorded video data and the inferred event recorded video data are inferred from the imaging device 1. server 260 (FIG. 27 step 271). The timing of transmitting these data to the inference server 260 may be when the user of the imaging device 1 gives a transmission command using a menu to the imaging device 1, or when the transmission command from the inference server 260 is sent to the imaging device 1. 1, or at any arbitrary timing.

撮影装置１から送信された常時記録撮影動画データおよび推論イベント記録撮影動画データが推論用サーバ260において受信されると（図27ステップ281でＹＥＳ）、常時記録データを用いて対象を検出する高精度な推論が推論用サーバ260において行われる（図27ステップ282）。 When the constantly recorded video data and the inference event recorded video data transmitted from the imaging device 1 are received by the inference server 260 (YES in step 281 in FIG. 27), high accuracy detection of the target using the constantly recorded data is performed. Inference is performed in the inference server 260 (step 282 in FIG. 27).

高精度な学習済モデル80における推論により対象が検出されると（図27ステップ283でＹＥＳ）、推論用サーバ260において検出された対象の画像が撮影装置１の推論イベント記録でも記録されているか、撮影装置１の推論イベント記録でも対象が検出されているかどうかが確認される（図27ステップ284）。推論イベント記録でも対象が検出されていると、対象が検出されたのは撮影装置１において高精度の推論が行われたかどうかが確認される（図27ステップ285）。撮影装置１において高精度の推論が行われたかどうかについては、撮影装置１から推論用サーバ260に撮影装置１における推論で使用された学習済モデル80の種類、バージョンなどを確認すればよい。 When a target is detected by inference with the highly accurate trained model 80 (YES in step 283 in FIG. 27), it is checked whether the image of the target detected in the inference server 260 was also recorded by the inference event recording of the imaging device 1, that is, whether the target was also detected in the inference event recording of the imaging device 1 (step 284 in FIG. 27). If the target was also detected in the inference event recording, it is checked whether it was detected because highly accurate inference was performed in the imaging device 1 (step 285 in FIG. 27). Whether highly accurate inference was performed in the imaging device 1 can be determined by checking the type, version, and the like of the trained model 80 that the imaging device 1 used for inference, as reported from the imaging device 1 to the inference server 260.

推論用サーバ260において対象を検出できたが（図27ステップ283でＹＥＳ）、撮影装置１において行われた推論イベント記録では対象を検出できなかったときには（図27ステップ284でＮＯ）、推論用サーバ260から撮影装置１に、撮影装置１において行われる推論よりやや高精度な学習済モデル80（推論用サーバ260での推論よりも精度は低いが比較的高精度の推論ができるような学習済モデル80）を使用して推論を行うことができるように、再学習指令および教師データが推論用サーバ260から撮影装置１に送信される（図27ステップ286）。教師データは、たとえば、推論用サーバ260において検出した対象を含む画像を表すデータである。推論用サーバ260において対象を検出でき（図27ステップ283でＹＥＳ）、かつ撮影装置１においても対象を検出したが（図27ステップ284でＹＥＳ）、撮影装置１において高精度の推論（推論用サーバ260で行われるような高精度の推論）が行われているときには（図27ステップ285でＹＥＳ）、低い精度での学習済モデル80を使用して推論を行うように、再学習指令および教師データが推論用サーバ260から撮影装置１に送信される（図27ステップ286）。 When the inference server 260 detected the target (YES in step 283 in FIG. 27) but the inference event recording in the imaging device 1 did not (NO in step 284 in FIG. 27), a retraining command and training data are sent from the inference server 260 to the imaging device 1 (step 286 in FIG. 27) so that the imaging device 1 can perform inference with a trained model 80 slightly more accurate than the one it currently uses (a trained model 80 that is less accurate than the inference in the inference server 260 but still capable of relatively accurate inference). The training data is, for example, data representing images that contain the target detected by the inference server 260. When the inference server 260 detected the target (YES in step 283 in FIG. 27) and the imaging device 1 also detected it (YES in step 284 in FIG. 27), but the imaging device 1 is performing highly accurate inference (inference as accurate as that performed in the inference server 260) (YES in step 285 in FIG. 27), a retraining command and training data are sent from the inference server 260 to the imaging device 1 so that it performs inference with a less accurate trained model 80 (step 286 in FIG. 27).

やや高精度な学習済モデル80を使用して推論を行うように、再学習指令および教師データが撮影装置１において受信されると（図27ステップ272でＹＥＳ）、使用されていた学習済モデル80がやや高精度となるように、たとえば、推論用サーバ260における推論よりは低いが対象を検出できる確率が上がるように、推論用サーバ260から送信された教師データを用いて再学習される（図27ステップ273）。教師データは、たとえば、推論用サーバ260において検出した対象を含む画像を表すデータである。低い精度での学習済モデル80を使用して推論を行うように、再学習指令および教師データが撮影装置１において受信されると（図27ステップ272でＹＥＳ）、使用されていた学習済モデル80は撮影装置１においては高精度すぎるため、低い精度となるように、高精度の学習済モデル80を用いた推論用サーバ260において検出できた対象までは検出できないように、推論用サーバ260から送信された教師データを用いて再学習される（図27ステップ273）。 When the imaging device 1 receives a retraining command and training data instructing it to perform inference with a slightly more accurate trained model 80 (YES in step 272 in FIG. 27), the trained model 80 that was in use is retrained with the training data sent from the inference server 260 so that it becomes slightly more accurate, for example so that its probability of detecting the target rises while remaining below that of the inference in the inference server 260 (step 273 in FIG. 27). The training data is, for example, data representing images that contain the target detected by the inference server 260. When the imaging device 1 receives a retraining command and training data instructing it to perform inference with a less accurate trained model 80 (YES in step 272 in FIG. 27), the trained model 80 that was in use is too accurate for the imaging device 1, so it is retrained with the training data sent from the inference server 260 so that its accuracy becomes lower, that is, so that it no longer detects every target that the inference server 260 detects with its highly accurate trained model 80 (step 273 in FIG. 27).

撮影装置１は終了指令が与えられるまで（図27ステップ274）、ステップ272および273の処理が繰り返され、推論用サーバ260は終了指令が与えられるまで（図27ステップ287）、ステップ282からの処理が繰り返される。撮影装置１における推論は高精度すぎず、低精度すぎず中くらいの精度のものが実行されるようになる。 The imaging device 1 repeats the processes of steps 272 and 273 until a termination command is given (step 274 in FIG. 27), and the inference server 260 repeats the processes from step 282 until a termination command is given (step 287 in FIG. 27). is repeated. The inference in the photographing device 1 is executed with medium precision, not too high precision, and not too low precision.

上述の実施例において、撮影装置１における推論で対象が検出されずに推論用サーバ260において対象が検出されたときに、撮影装置１における推論の確率を入力して対象の有無を弁別するためのしきい値を下げて対象が検出される確率を上げてもよいし、撮影装置１における推論が高精度のために対象が検出されて推論用サーバ260においても高精度の推論で対象が検出されたときに、撮影装置１における推論の確率を入力して対象の有無を弁別するためのしきい値を上げて対象が検出される確率を下げてもよい。 In the embodiment described above, when a target is not detected by the inference in the imaging device 1 but is detected in the inference server 260, the threshold that is compared with the inference probability in the imaging device 1 to decide whether a target is present may be lowered so that targets are detected more readily. Conversely, when a target is detected because the inference in the imaging device 1 is highly accurate and the target is also detected by the highly accurate inference in the inference server 260, that threshold may be raised so that targets are detected less readily.
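The comparison in steps 283 to 285 and the threshold alternative described above can be summarized in a small decision function. The sketch below is illustrative only: the `Feedback` enum, the function names, the step size of 0.05, and the clamping range are assumptions introduced here, and the publication does not specify how far the threshold moves.

```python
from enum import Enum


class Feedback(Enum):
    NONE = "none"                      # nothing to send to the imaging device
    RAISE_ACCURACY = "raise_accuracy"  # device missed a target the server found
    LOWER_ACCURACY = "lower_accuracy"  # device is more accurate than it needs to be


def decide_feedback(server_detected: bool,
                    device_detected: bool,
                    device_is_high_accuracy: bool) -> Feedback:
    """Mirror the branches of steps 283, 284 and 285 in FIG. 27."""
    if not server_detected:
        return Feedback.NONE            # NO in step 283
    if not device_detected:
        return Feedback.RAISE_ACCURACY  # NO in step 284
    if device_is_high_accuracy:
        return Feedback.LOWER_ACCURACY  # YES in step 285
    return Feedback.NONE


def adjust_threshold(threshold: float, feedback: Feedback,
                     step: float = 0.05,
                     lowest: float = 0.05, highest: float = 0.95) -> float:
    """Move the detection threshold instead of retraining (hypothetical step size)."""
    if feedback is Feedback.RAISE_ACCURACY:
        threshold -= step  # lower threshold: targets are detected more readily
    elif feedback is Feedback.LOWER_ACCURACY:
        threshold += step  # higher threshold: targets are detected less readily
    return min(highest, max(lowest, threshold))
```

Keeping the decision separate from the adjustment means the same `Feedback` value could drive either a retraining command or a threshold change, which matches the two alternatives the text describes.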

［第５実施例］ [Fifth Embodiment]
図28から図31は、第５実施例を示している。 FIGS. 28 to 31 show a fifth embodiment.

図28および図29は、車両４に設けられている撮影装置１の処理手順を示すフローチャートである。 28 and 29 are flowcharts showing the processing procedure of the photographing device 1 provided in the vehicle 4.

この実施例による撮影装置１は、前後、左右、車内の全周囲の360度を撮影できるものである。図30は、撮影装置１を基準の高さ（たとえば、地上1.3ｍ）に設置して得られる撮影画像、たとえば、天球画像、円周画像310の一例であり、図31は、撮影装置１を基準の高さよりも低い位置（たとえば、地上１ｍ）に設置して得られる撮影画像320の一例である。撮影装置１は、基準の高さよりも下方を撮影方向として画像を撮影してもよい。 The imaging device 1 of this embodiment can capture the full 360 degrees around it: front and rear, left and right, and the vehicle interior. FIG. 30 is an example of a captured image 310 (for example, a celestial sphere image or a circular image) obtained with the imaging device 1 installed at a reference height (for example, 1.3 m above the ground), and FIG. 31 is an example of a captured image 320 obtained with the imaging device 1 installed lower than the reference height (for example, 1 m above the ground). The imaging device 1 may capture images with the imaging direction pointing below the reference height.

図30に示すように撮影装置１を基準の高さに設置して得られる撮影画像310と、図31に示すように撮影装置１を基準の高さよりも低い位置または高い位置に設置して得られる撮影画像320とは、同じ場所を撮影していたとしても異なる。この実施例では、撮影装置１が設置されている高さを入力し、得られる撮影画像320が基準の高さに設置されているときに得られる撮影画像310と同じようになるように撮影画像320が調整される。調整された撮影画像320を表す撮影動画データについて図５に示すような学習済モデル80に入力されて推論が行われる。 The captured image 310 obtained with the imaging device 1 installed at the reference height, as shown in FIG. 30, differs from the captured image 320 obtained with the imaging device 1 installed lower or higher than the reference height, as shown in FIG. 31, even when the same place is captured. In this embodiment, the height at which the imaging device 1 is installed is input, and the captured image 320 is adjusted so that it resembles the captured image 310 that would be obtained at the reference height. The captured video data representing the adjusted captured image 320 is input to a trained model 80 such as the one shown in FIG. 5, and inference is performed.

上述のように撮影装置１により撮影が開始すると（図28ステップ291）、撮影によって得られた動画が表示面23に表示される（図28ステップ292）。撮影装置１のユーザは、撮影装置１の第３のボタン29を押して表示面23にメニューを表示させ、そのメニューの中から設置の高さを入力するメニューを選択して撮影装置１の設置の高さを入力する。撮影装置１の販売店、ユーザが車両４に撮影装置１を設置するときに設置の高さを入力しておき、その高さをメモリ52に記憶させておいてもよい。メモリ52には撮影装置１の高さと撮影によって得られる撮影動画データの画像の倍率との関係が記憶されており、ユーザなどから入力された高さに対応する撮影動画データの画像の倍率を読み取ることができる。入力された高さに対応する画像の倍率を用いて撮影動画データの画像を拡大または縮小することで、得られた画像は、撮影装置１が基準の高さに設置されているときの画像と同様となる。 When the imaging device 1 starts imaging as described above (step 291 in FIG. 28), the video obtained by imaging is displayed on the display screen 23 (step 292 in FIG. 28). The user of the imaging device 1 presses the third button 29 of the imaging device 1 to display a menu on the display screen 23, selects the item for entering the installation height from that menu, and enters the installation height of the imaging device 1. Alternatively, the dealer or the user may enter the installation height when installing the imaging device 1 in the vehicle 4, and the height may be stored in the memory 52. The memory 52 stores the relationship between the height of the imaging device 1 and the magnification to be applied to the images of the captured video data, so the magnification corresponding to the height entered by the user or another person can be read out. Enlarging or reducing the images of the captured video data by the magnification corresponding to the entered height yields images similar to those obtained when the imaging device 1 is installed at the reference height.
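The height-to-magnification lookup held in the memory 52 might look like the sketch below. Everything numeric here is invented for illustration: the table values, the linear interpolation between entries, and the clamping outside the table are assumptions, since the publication states only that a relationship between height and magnification is stored and that the reference height is, for example, 1.3 m.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical table of (installation height in metres, image magnification).
# Only the 1.3 m reference height comes from the text; the magnifications are invented.
HEIGHT_TO_MAGNIFICATION: List[Tuple[float, float]] = [
    (1.0, 1.30),
    (1.3, 1.00),  # reference height: no scaling
    (1.6, 0.80),
]


def magnification_for(height_m: float,
                      table: List[Tuple[float, float]] = HEIGHT_TO_MAGNIFICATION) -> float:
    """Look up the magnification for a height, interpolating between table entries."""
    heights = [h for h, _ in table]
    if height_m <= heights[0]:
        return table[0][1]   # clamp below the table
    if height_m >= heights[-1]:
        return table[-1][1]  # clamp above the table
    i = bisect_left(heights, height_m)
    (h0, m0), (h1, m1) = table[i - 1], table[i]
    t = (height_m - h0) / (h1 - h0)
    return m0 + t * (m1 - m0)


def scaled_size(width: int, height: int, magnification: float) -> Tuple[int, int]:
    """Pixel size of the image after enlarging or reducing by `magnification`."""
    return round(width * magnification), round(height * magnification)
```

The resulting size would then be passed to whatever image-resizing routine the device uses before the frame is handed to the trained model 80.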

この実施例においても推論イベント記録と常時記録とは並行して行われる。メニューを用いて撮影装置１のユーザが推論イベント記録、たとえば、通常イベント記録、ダブル・イベント記録を設定すると（図28ステップ294でＹＥＳ）、入力された高さまたはあらかじめメモリ52に記憶されている高さに対応する画像の倍率がメモリ52から読み取られ、読み取られた倍率で撮影動画データによって表される画像が拡大または縮小させられる調整処理が行われる（図28ステップ295）。 Also in this embodiment, inference event recording and constant recording are performed in parallel. When the user of the imaging device 1 uses the menu to set inferred event recording, for example, normal event recording, double event recording (YES in step 294 of FIG. 28), the input height or the height previously stored in the memory 52 is set. The magnification of the image corresponding to the height is read from the memory 52, and an adjustment process is performed in which the image represented by the captured video data is enlarged or reduced by the read magnification (step 295 in FIG. 28).

画像が拡大または縮小させられた撮影動画データが図５に示すような学習済モデル80に入力し推論させられる（図28ステップ296）。 The photographed video data, in which the image has been enlarged or reduced, is input to the learned model 80 as shown in FIG. 5, and is caused to make inferences (step 296 in FIG. 28).

対象が検出されると（図28ステップ297でＹＥＳ）、推論イベント記録が開始され（図29ステップ298）、その対象が枠で囲まれて表示面23に表示される（図29ステップ299）。対象が検出されなくなると（図29ステップ300でＹＥＳ）、枠が消去され（図29ステップ301）、推論イベント記録が停止する（図29ステップ302）。 When the object is detected (YES in step 297 of FIG. 28), inference event recording is started (step 298 of FIG. 29), and the object is displayed on the display surface 23 surrounded by a frame (step 299 of FIG. 29). When the target is no longer detected (YES in step 300 of FIG. 29), the frame is erased (step 301 of FIG. 29), and inference event recording is stopped (step 302 of FIG. 29).

撮影装置１に撮影の終了指令が与えられれば（図29ステップ303でＹＥＳ）、図28および図29に示す処理は終了し、終了指令が与えられなければ（図29ステップ303でＮＯ）、撮影が続けられて図28ステップ296の処理から図29ステップ303の処理が繰り返される。 If an end command is given to the imaging device 1 (YES in step 303 in FIG. 29), the processing shown in FIGS. 28 and 29 ends; if no end command is given (NO in step 303 in FIG. 29), imaging continues and the processing from step 296 in FIG. 28 to step 303 in FIG. 29 is repeated.

［第６実施例］ [Sixth Embodiment]
図32から図34は、第６実施例を示している。第６実施例は、撮影装置１が再生機能を有するものである。撮影装置１に装填される記憶媒体71のファイル・システム領域131に再生用ビューワ・ソフトウエアをインストールしておくことにより、そのソフトウエアを読みだして再生できる。 FIGS. 32 to 34 show a sixth embodiment. In the sixth embodiment, the photographing device 1 has a playback function. By installing playback viewer software in advance in the file system area 131 of the storage medium 71 loaded into the photographing device 1, the software can be read out and used for playback.

上述のように撮影装置１において撮影が開始され（図32ステップ331）、撮影装置１の表示面23に動画が表示される（図32ステップ332）。第３のボタン29が押されることにより表示面23にメニューが表示され、そのメニューの中から再生モードが設定されると（図32ステップ333でＹＥＳ）、表示面23には再生する動画ファイル名が表示されるので、ユーザは再生する動画ファイルを選択する（図32ステップ334）。再生モードが設定されないと（図32ステップ333でＮＯ）、図６などに示した動画記録処理に移行する。 As described above, photographing is started in the photographing device 1 (step 331 of FIG. 32), and a moving image is displayed on the display screen 23 of the photographing device 1 (step 332 of FIG. 32). When the third button 29 is pressed, a menu is displayed on the display screen 23. When the playback mode is set from that menu (YES in step 333 of FIG. 32), the names of the video files available for playback are displayed on the display screen 23, and the user selects the video file to be played (step 334 of FIG. 32). If the playback mode is not set (NO in step 333 of FIG. 32), the process shifts to the moving image recording process shown in FIG. 6 and the like.

再生する動画ファイルが選択されると、その選択された動画ファイルが推論イベント記録のものかどうかが確認される（図32ステップ335）。推論イベント記録に限らず通常のイベント記録でもよい。推論イベント記録のものでなければ（図32ステップ335でＮＯ）、常時記録の再生処理が行われる。推論イベント記録の動画ファイルについて選択され、表示面23に表示される再生開始指令ボタンがタッチされると撮影装置１に再生開始指令が与えられる（図32ステップ336でＹＥＳ）。すると、指定された動画ファイルが再生される（図33ステップ337）。 When a video file to be played is selected, it is checked whether the selected video file is an inference event record (step 335 in FIG. 32). It is not limited to an inference event record, but may be a normal event record. If it is not an inference event record (NO in step 335 in FIG. 32), a constant recording reproduction process is performed. When the video file of the inference event record is selected and the playback start command button displayed on the display screen 23 is touched, a playback start command is given to the photographing device 1 (YES at step 336 in FIG. 32). Then, the specified video file is played back (step 337 in FIG. 33).

図34は、撮影装置１の表示面23を示している。図34において、図２（Ｂ）に示すものと同一物については同一符号を付して説明を省略する。 FIG. 34 shows the display surface 23 of the photographing device 1. In FIG. 34, the same components as those shown in FIG. 2(B) are given the same reference numerals, and the description thereof will be omitted.

表示面23には再生された動画が表示されている。表示面23の下方には、「イベント記録があっていますか」の文字列350、ＯＫボタン351およびＮＧボタン352が表示されている。 The played video is displayed on the display screen 23. At the bottom of the display screen 23, a character string 350 reading "Is the event record correct?", an OK button 351, and an NG button 352 are displayed.

ユーザは、表示面23に表示されている推論イベント記録の動画を見ながら、推論イベント記録が正しく行われているか誤っているかを確認する。図34の表示面23の右側には枠353が表示されているが、その枠353の中には対象が存在しない。これは、対象が存在しないのもかかわらず撮影動画データの推論により対象を検出し、枠353を表示させてしまったと考えられる。推論イベント記録が誤っているので、ユーザはＮＧボタン352をタッチする。仮に枠353の中に対象がいれば存在する対象を撮影動画データの推論により検出できたこととなるので、推論イベント記録は正しいこととなる。ユーザはＯＫボタン351をタッチする。 The user checks whether the inference event recording is performed correctly or incorrectly while watching the moving image of the inference event record displayed on the display screen 23. Although a frame 353 is displayed on the right side of the display surface 23 in FIG. 34, no object exists within the frame 353. This is thought to be because the object was detected by inference from the captured video data and the frame 353 was displayed even though the object did not exist. Since the inference event record is incorrect, the user touches the NG button 352. If there is an object within the frame 353, this means that the existing object has been detected by inference from the captured video data, and therefore the inference event recording is correct. The user touches the OK button 351.

図33に戻って、推論イベント記録の動画ファイルを再生しているときに推論イベント記録が誤りであることを示すＮＧボタン352が押されると（図33ステップ338でＹＥＳ）、その推論イベント記録の動画を構成する画像、たとえば、ＮＧボタン352が押されたときに表示されていた画像のヘッダに再学習が必要である旨、推論イベント記録が誤りであることなどが記録される（図33ステップ339）。再学習が必要である旨などを一つの画像のヘッダに記録するのではなく、一つの動画ファイルのヘッダや、一つの動画ファイルの最初の画像のヘッダなどに記録するようにしてもよい。 Returning to FIG. 33, when the NG button 352, which indicates that the inference event record is incorrect, is pressed while the video file of the inference event record is being played (YES in step 338 of FIG. 33), information such as the fact that relearning is necessary and that the inference event record is incorrect is recorded in the header of an image making up the video of that inference event record, for example, the image that was displayed when the NG button 352 was pressed (step 339 of FIG. 33). Instead of recording this information in the header of a single image, it may be recorded in the header of the video file, in the header of the first image of the video file, or the like.

推論イベント記録の動画ファイルを再生しているときに推論イベント記録が正しいことを示すＯＫボタン351が押されると（図33ステップ338でＮＯ）、ＯＫボタン351が押されたときに表示されていた画像のヘッダに推論イベント記録があっていることが記録される（図33ステップ340）。 When the OK button 351, which indicates that the inference event record is correct, is pressed while the video file of the inference event record is being played (NO in step 338 of FIG. 33), the fact that the inference event record is correct is recorded in the header of the image that was displayed when the OK button 351 was pressed (step 340 of FIG. 33).

ユーザから再生終了指令が与えられるまで、ステップ337から340までの処理が繰り返される（図33ステップ341）。 The processes from steps 337 to 340 are repeated until the user gives a reproduction end command (step 341 in FIG. 33).

ＮＧボタン352が押されると、ＮＧボタン352が押されたときに表示されていた画像のヘッダに再学習が必要である旨、推論イベント記録が誤りであることなどが記録されると、その推論イベント記録に用いられた学習済モデル80、異なる学習済モデル80、撮影装置１に記憶されている学習済モデル80以外の学習済モデル80などで、再学習が必要である旨などが記録された画像について再学習が行われる。ＯＫボタン351が押されたときに表示されていた画像のヘッダに推論イベント記録が正しいことが記録されたときにも、その推論イベント記録に用いられた学習済モデル80、異なる学習済モデル80、撮影装置１に記憶されている学習済モデル80以外の学習済モデル80などで、その画像を用いて再学習が行われるようにしてもよい。 When the NG button 352 is pressed and the header of the image that was displayed at that moment records that relearning is necessary, that the inference event record is incorrect, and so on, relearning is performed on the image so marked. The relearning uses the trained model 80 that was used for that inference event recording, a different trained model 80, a trained model 80 other than the one stored in the photographing device 1, or the like. Even when the header of the image that was displayed when the OK button 351 was pressed records that the inference event record is correct, relearning may likewise be performed using that image with any of these trained models 80.
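The OK/NG feedback of steps 338 to 340 amounts to writing a label into an image header and later collecting the labelled images for relearning. The sketch below illustrates that bookkeeping only. The `Frame` class, the header keys `event_record_correct` and `needs_relearning`, and the `include_confirmed` switch are hypothetical names introduced for illustration; the source does not define a header layout.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One image of an inference event record video (hypothetical structure)."""
    index: int
    header: dict = field(default_factory=dict)


def record_feedback(frame, event_record_correct):
    """Write the user's OK (True) or NG (False) judgement into the image header."""
    frame.header["event_record_correct"] = event_record_correct
    frame.header["needs_relearning"] = not event_record_correct


def frames_for_relearning(frames, include_confirmed=False):
    """Return indices of frames to feed to relearning.

    NG-marked frames are always selected. OK-marked frames are selected only
    when include_confirmed is True, mirroring the optional variant in the text.
    """
    selected = []
    for frame in frames:
        if "event_record_correct" not in frame.header:
            continue  # the user gave no feedback for this image
        if frame.header["needs_relearning"] or include_confirmed:
            selected.append(frame.index)
    return selected


if __name__ == "__main__":
    frames = [Frame(0), Frame(1), Frame(2)]
    record_feedback(frames[0], False)  # NG button 352
    record_feedback(frames[2], True)   # OK button 351
    print(frames_for_relearning(frames))
```

The relearning step itself is left out because the source does not describe the training procedure.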

[第７実施例] [Seventh Embodiment]
図35から図38は、第７実施例を示しており、パーソナル・コンピュータにおいて再生用ビューワ・ソフトウエアを用いた再生処理についてのものである。 FIGS. 35 to 38 show a seventh embodiment, which concerns playback processing using playback viewer software on a personal computer.

図35は、タブレット型パーソナル・コンピュータ、たとえば、再生装置の一例である、の電気的構成を示すブロック図である。 FIG. 35 is a block diagram showing the electrical configuration of a tablet personal computer, for example, an example of a playback device.

タブレット型パーソナル・コンピュータ（以下，パーソナル・コンピュータという）360の全体の動作は，ＣＰＵ(Central Processing Unit)361によって統括される。 The entire operation of the tablet personal computer (hereinafter referred to as personal computer) 360 is controlled by a CPU (Central Processing Unit) 361.

パーソナル・コンピュータ360には，表示装置363が設けられている。この表示装置363は，ＣＰＵ361によって制御される表示制御装置362によって制御される。また，パーソナル・コンピュータ360には，加速度センサ364が設けられており，加速度センサ364からの出力信号は，ＣＰＵ361に入力する。さらに，パーソナル・コンピュータ360には，ＣＰＵ361によってアクセスされるＳＳＤ（solid state drive）365，記憶媒体71に記録されているデータ等を読み取り，かつ記憶媒体71にデータ等を書き込むメモリ・カード・リーダ・ライタ366が含まれている。 The personal computer 360 is provided with a display device 363. This display device 363 is controlled by a display control device 362 which is controlled by a CPU 361. Further, the personal computer 360 is provided with an acceleration sensor 364, and an output signal from the acceleration sensor 364 is input to the CPU 361. Furthermore, the personal computer 360 includes an SSD (solid state drive) 365 that is accessed by the CPU 361, and a memory card reader that reads data recorded in the storage medium 71 and writes data etc. to the storage medium 71. Writer 366 is included.

さらに，パーソナル・コンピュータ360には，キーボード，マウスなどの入力装置367，ＲＡＭなどのメモリ368およびインターネットなどのネットワークと接続する通信回路369が含まれている。 Furthermore, the personal computer 360 includes an input device 367 such as a keyboard and a mouse, a memory 368 such as a RAM, and a communication circuit 369 connected to a network such as the Internet.

後述する動作プログラムは，インターネットを介してパーソナル・コンピュータ360の通信回路369によって受信され，パーソナル・コンピュータ360にインストールされる。インストールされたプログラムをＣＰＵ361が読み出して実行することにより，ＣＰＵ361が各部を制御する。記憶媒体71などに動作プログラムが格納されており，そのような記憶媒体71から動作プログラムが読み取られて，パーソナル・コンピュータ360にインストールされてもよい。 An operating program, which will be described later, is received by the communication circuit 369 of the personal computer 360 via the Internet and installed on the personal computer 360. The CPU 361 controls each section by reading and executing the installed program. An operating program is stored in a storage medium 71 or the like, and the operating program may be read from such storage medium 71 and installed in the personal computer 360.

図36および図37は、パーソナル・コンピュータ360の再生処理手順を示すフローチャート、図38は、表示装置363の表示画面の一例である。 36 and 37 are flowcharts showing the reproduction processing procedure of the personal computer 360, and FIG. 38 is an example of a display screen of the display device 363.

パーソナル・コンピュータ360にインストールされている再生用ソフトウエアが起動させられることにより図36に示す再生処理が開始する。 When the playback software installed on the personal computer 360 is activated, the playback process shown in FIG. 36 starts.

図38を参照して、表示装置363の表示画面には、再生動画を表示する再生動画表示領域390、後述するように推論イベント記録の動画を表示する領域391、常時記録の動画ファイルのリストを表示するプレイ・リスト領域392、地図を表示する領域393、再生動画表示領域390に表示される再生動画の撮影時に得られる車両４の各種情報を表示する情報表示領域402、その他の情報を表示する領域403および撮影時に車両に加わっている加速度の方向を示す領域404が含まれている。 Referring to FIG. 38, the display screen of the display device 363 includes a playback video display area 390 that displays a playback video, an area 391 that displays a video of an inference event record as described later, and a list of constantly recorded video files. A play list area 392 to display, an area 393 to display a map, an information display area 402 to display various information about the vehicle 4 obtained when shooting the playback video displayed in the playback video display area 390, and other information. It includes a region 403 and a region 404 indicating the direction of acceleration applied to the vehicle at the time of photographing.

地図領域393には、推論イベント記録が行われた場所を示す矢印394および推論イベント記録が行われた場所のサムネイル画像395も表示される。また、情報表示領域402には、再生動画の記録日時を示す情報398、再生動画の開始、停止、一時停止、早戻し、早送りなどの指令を与えるボタン領域399、再生動画を記録した場所を示す情報396、車両の速度情報401などが表示される。 Also displayed in the map area 393 are an arrow 394 indicating the location where the inference event recording took place and a thumbnail image 395 of the location where the inference event recording took place. The information display area 402 also includes information 398 indicating the recording date and time of the playback video, a button area 399 for giving commands such as start, stop, pause, fast rewind, fast forward, etc. of the playback video, and a location where the playback video was recorded. Information 396, vehicle speed information 401, etc. are displayed.

パーソナル・コンピュータには記憶媒体7１が装填され、その記憶媒体71に記憶されているデータなどが読み取られる。図38を参照して、プレイ・リスト領域392には、常時記録の動画ファイルのファイル名が一覧で表示されている。ユーザは、プレイ・リスト領域392に表示されている常時記録の動画ファイルのファイル名の中から所望のファイル名の動画ファイルを、入力装置367を用いて選択する（図36ステップ371）。常時記録の動画ファイルが選択されると、選択された動画ファイルに対応する推論イベント記録の動画ファイルが記憶媒体71の中から見つけ出され、推論イベント記録の動画ファイルから推論イベント記録の情報が読み取られ（図36ステップ372）、メモリ368に一時的に記憶される。推論イベント記録の情報には、推論イベント記録が行われた場所、推論イベント記録の動画ファイルのリンク、推論イベント記録の動画ファイルによって表される動画を構成する画像の格納場所などがある。 A storage medium 71 is loaded into the personal computer, and data stored in the storage medium 71 is read. Referring to FIG. 38, play list area 392 displays a list of file names of constantly recorded video files. The user selects a video file with a desired file name from among the file names of constantly recorded video files displayed in the play list area 392 using the input device 367 (step 371 in FIG. 36). When a constantly recorded video file is selected, an inference event record video file corresponding to the selected video file is found in the storage medium 71, and inference event record information is read from the inference event record video file. (step 372 in FIG. 36) and temporarily stored in memory 368. The information on the inference event record includes the location where the inference event record was performed, a link to the video file of the inference event record, and the storage location of images forming the video represented by the video file of the inference event record.

ボタン領域399に含まれる再生開始ボタンが押されると再生開始指令が入力されたこととなり（図36ステップ373でＹＥＳ）、選択された常時記録の動画ファイルの再生が開始される（図36ステップ374）。すると、再生動画表示領域390に常時記録の動画が表示される（図36ステップ375）。 When the playback start button included in the button area 399 is pressed, a playback start command is regarded as having been input (YES in step 373 of FIG. 36), and playback of the selected constantly recorded video file is started (step 374 of FIG. 36). The constantly recorded video is then displayed in the playback video display area 390 (step 375 of FIG. 36).

読み取られた推論イベント記録の情報から、再生動画表示領域390に表示されている場所に近い推論イベント記録が行われた場所を示すデータが読み取られ、そのデータが地図サーバ（図示略）に送信される。地図サーバから、推論イベント記録が行われた場所の近傍の地図データがパーソナル・コンピュータ360に送信され、地図表示領域393に推論イベント記録が行われた場所の近傍の地図が表示される（図36ステップ376）。また、推論イベント記録の動画ファイルの中から、推論イベント記録が行われた場所の画像が読み取られ、サムネイル画像395として地図表示領域393に表示されている地図上に表示されるとともに推論イベント記録が行われた場所に矢印394が表示される。これらの矢印394およびサムネイル画像395には、推論イベント記録の動画ファイルへのリンク（このリンクは推論イベント記録の情報の一つとして読み取られたものである）が埋め込まれる。 From the inference event record information that was read, data indicating a location where inference event recording took place near the location shown in the playback video display area 390 is read, and that data is transmitted to a map server (not shown). The map server transmits map data for the vicinity of the location where the inference event recording took place to the personal computer 360, and a map of that vicinity is displayed in the map display area 393 (step 376 of FIG. 36). In addition, an image of the location where the inference event recording took place is read from the video file of the inference event record and displayed as a thumbnail image 395 on the map shown in the map display area 393, and an arrow 394 is displayed at the location where the inference event recording took place. A link to the video file of the inference event record (this link was read as one item of the inference event record information) is embedded in the arrow 394 and the thumbnail image 395.
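Selecting the inference event locations that are near the location currently shown in the playback video can be done with a great-circle distance check. The sketch below is one possible way to do this and is not taken from the source: the 500 m radius, the use of the haversine formula, and the names `distance_m` and `nearby_event_locations` are assumptions, and locations are assumed to be (latitude, longitude) pairs in degrees.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearby_event_locations(event_locations, current, radius_m=500.0):
    """Return event locations within radius_m of current, nearest first."""
    ranked = sorted((distance_m(current, location), location)
                    for location in event_locations)
    return [location for distance, location in ranked if distance <= radius_m]


if __name__ == "__main__":
    events = [(35.001, 137.0), (35.1, 137.0), (35.0, 137.002)]
    print(nearby_event_locations(events, (35.0, 137.0)))
```

The selected locations would then be sent to the map server and used to place the arrow 394 and the thumbnail image 395; that part depends on the map service and is not sketched.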

地図表示領域393に表示されている矢印394またはサムネイル画像395がパーソナル・コンピュータ360のユーザによってクリックされると（図37ステップ377でＹＥＳ）、推論イベント記録の動画ファイルへのリンクを用いて推論イベント記録の動画ファイルが記憶媒体71から読み取られる。推論イベント記録の動画ファイルによって表される推論イベント記録の動画の再生が開始され（図37ステップ378）、推論イベント記録の動画が領域391に表示される（図37ステップ379）。 When the arrow 394 or the thumbnail image 395 displayed in the map display area 393 is clicked by the user of the personal computer 360 (YES in step 377 of FIG. 37), the video file of the inference event record is read from the storage medium 71 using the link to that video file. Playback of the inference event record video represented by that file is started (step 378 of FIG. 37), and the inference event record video is displayed in the area 391 (step 379 of FIG. 37).

推論イベント記録の動画の再生が終了するまで（図37ステップ380）、領域391には推論イベント記録の動画が表示され、領域390には常時記録の動画が表示される。推論イベント記録の動画の再生が終了すると（図37ステップ380でＹＥＳ）、常時記録の動画の再生が終了したかどうかが確認される（図37ステップ381）。常時記録の動画の再生が終了していなければ（図37ステップ381でＮＯ）、図37ステップ377からの処理が繰り返される。常時記録の動画の再生が終了すると（図37ステップ381でＹＥＳ）、図36および図37の処理は終了する。 Until the reproduction of the video of the inference event record is completed (step 380 in FIG. 37), the video of the inference event record is displayed in the area 391, and the video of the constant recording is displayed in the area 390. When the reproduction of the moving image of the inference event record is completed (YES in step 380 in FIG. 37), it is confirmed whether the reproduction of the constantly recorded moving image has been completed (step 381 in FIG. 37). If the reproduction of the constantly recorded moving image has not been completed (NO in step 381 in FIG. 37), the processing from step 377 in FIG. 37 is repeated. When the reproduction of the constantly recorded moving image is completed (YES in step 381 in FIG. 37), the processing in FIGS. 36 and 37 ends.

上述の実施例においては、常時記録動画のファイルを指定して推論イベント記録の動画ファイルを見つけ、常時記録動画と推論イベント記録の動画とを表示しているが、推論イベント記録の動画ファイルの中から所望の推論イベント記録の動画を選択し、選択された推論イベント記録に対応する常時記録の動画を見つけ、推論イベント記録の動画と常時記録の動画とを表示するようにしてもよい。また、常時記録の動画を表示せずに推論イベント記録の動画を表示するようにしてもよい。たとえば、推論イベント記録の動画ファイルを選択し、その推論イベント記録が行われた場所の近傍の地図を地図表示領域393に表示し、矢印394またはサムネイル画像395がクリックされたことにより、領域391に推論イベント記録の動画が表示されるようにしてもよい。 In the above embodiment, the constantly recorded video file is specified, the video file of the inference event record is found, and the constantly recorded video and the video of the inference event record are displayed. The moving image of the desired inference event record may be selected from the list, the constantly recorded moving image corresponding to the selected inferential event record may be found, and the moving image of the inferential event record and the constantly recorded moving image may be displayed. Furthermore, the video of the inference event record may be displayed instead of the video of the constant record. For example, when a video file of an inference event record is selected, a map of the vicinity of the place where the inference event recording was performed is displayed in the map display area 393, and the arrow 394 or thumbnail image 395 is clicked, the area 391 is displayed. A video of the inference event record may be displayed.

［第８実施例］ [Eighth Embodiment]
図39および図40は、第８実施例を示すもので、対象を検出したときの着目箇所を知らせるものである。図39は、再生処理手順を示すフローチャート、図40は、対象を表す画像部分の一例である。 FIGS. 39 and 40 show an eighth embodiment, which indicates the point of interest used when a target was detected. FIG. 39 is a flowchart showing the playback processing procedure, and FIG. 40 is an example of an image portion representing targets.

記憶媒体71をパーソナル・コンピュータ360に装填し、再生用ソフトウエアを起動させると、図38に示すようなウインドウが表示される。その状態で対象を検出したときの着目箇所を知らせる処理を行わせるためのコマンドが入力装置367から入力される。すると、図39に示す再生処理が開始する。 When the storage medium 71 is loaded into the personal computer 360 and the playback software is started, a window as shown in FIG. 38 is displayed. In this state, a command is input from the input device 367 to perform a process of notifying the target location when the target is detected. Then, the playback process shown in FIG. 39 starts.

パーソナル・コンピュータ360の表示装置363の表示画面には、推論イベント記録の動画ファイルが一覧で表示され、その中から所望の推論イベント記録ファイルが入力装置367を用いてユーザによって選択される（ステップ411）。入力装置367から再生開始指令が入力されると（ステップ412でＹＥＳ）、表示装置363の表示画面には推論イベント記録の動画が表示される（ステップ413）。 A list of inference event record video files is displayed on the display screen of the display device 363 of the personal computer 360, and the user selects a desired inference event record file from the list using the input device 367 (step 411). When a playback start command is input from the input device 367 (YES in step 412), the inference event record video is displayed on the display screen of the display device 363 (step 413).

入力装置367を用いてイベント着目表示コマンドが入力されると（ステップ414でＹＥＳ）、対象の着目箇所のデータが、再生されている推論イベント記録の動画についての推論イベント記録情報の中から読み取られる（ステップ415）。すると、着目箇所を示すマークが推論イベント記録動画上に表示される（ステップ416）。 When the event focus display command is input using the input device 367 (YES at step 414), data of the target point of interest is read from the inference event record information about the moving image of the inference event record being played. (Step 415). Then, a mark indicating the point of interest is displayed on the inference event recording video (step 416).

図40を参照して、推論イベント記録の動画を構成する画像部分424の一例である。 Referring to FIG. 40, this is an example of an image portion 424 that constitutes a moving image of an inference event record.

対象161、162および163のそれぞれの顔の部分がヒートマップ420（着目箇所を示すマークの一例である）によって表されている。たとえば、ヒートマップ420の中央の円421は赤色であり、中央の円421の周りの円環422は黄色であり、円環422の周りの円環は青色である。カラー表示でなく濃淡で表してもよい。ヒートマップ420を見ることにより、検出された対象である対象161、162および163はそれぞれ顔（とくに、鼻、目、口）に着目して検出されていることが分かる。対象の着目箇所のデータは、たとえば、VQA(Visual Question Answering)を利用して質問を「対象はどれか」とすることで得ることができる（https://jellyware.jp/aicorex/contents/out_c08_realtime.html）。対象の着目箇所のデータは推論イベント記録時に推論イベント記録情報として記憶媒体71に記憶してもよいし、再生時に生成してもよい。 The face of each of the targets 161, 162, and 163 is overlaid with a heat map 420 (an example of a mark indicating a point of interest). For example, the central circle 421 of the heat map 420 is red, the ring 422 around the central circle 421 is yellow, and the ring around the ring 422 is blue. Shading may be used instead of color. The heat map 420 shows that the detected targets 161, 162, and 163 were each detected with attention on the face (in particular, the nose, eyes, and mouth). Data on the point of interest of a target can be obtained, for example, by using VQA (Visual Question Answering) with the question "Which is the target?" (https://jellyware.jp/aicorex/contents/out_c08_realtime.html). The point-of-interest data may be stored in the storage medium 71 as inference event record information at the time of inference event recording, or may be generated at the time of playback.
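The concentric coloring of the heat map 420 can be illustrated by classifying each pixel by its distance from the point of interest. This is only a sketch of the display rule described above: the equal ring widths, the `inner_radius` parameter, and the function name `heatmap_color` are assumptions, and the way the point of interest itself is obtained (for example by VQA) is outside the sketch.

```python
import math


def heatmap_color(x, y, center, inner_radius):
    """Return the heat map color for pixel (x, y), or None outside the mark.

    The central circle is red, the first ring yellow and the second ring blue.
    Each ring is assumed to be as wide as the central circle's radius.
    """
    distance = math.hypot(x - center[0], y - center[1])
    if distance <= inner_radius:
        return "red"
    if distance <= 2 * inner_radius:
        return "yellow"
    if distance <= 3 * inner_radius:
        return "blue"
    return None


if __name__ == "__main__":
    center = (100, 80)  # hypothetical point of interest, e.g. a nose position
    for x in (100, 115, 125, 140):
        print(x, heatmap_color(x, 80, center, inner_radius=10))
```

Replacing the color names with gray levels gives the shaded variant mentioned in the text.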

図39に戻って、イベント着目表示コマンドが入力されなければ（ステップ414でＮＯ）、ステップ415および416の処理はスキップされる。推論イベント記録の動画の再生が終了しなければ（ステップ417でＮＯ）、ステップ413からの処理が繰り返される。推論イベント記録の動画の再生が終了すると（ステップ417でＹＥＳ）、図39に示す処理が終了する。 Returning to FIG. 39, if the event focus display command is not input (NO in step 414), the processes in steps 415 and 416 are skipped. If the reproduction of the moving image of the inference event record is not completed (NO in step 417), the processing from step 413 is repeated. When the reproduction of the moving image of the inference event record ends (YES in step 417), the process shown in FIG. 39 ends.

［第９実施例］ [Ninth Embodiment]
図41から図43は、第９実施例を示すもので、推論結果を表示するものである。 FIGS. 41 to 43 show a ninth embodiment, which displays inference results.

図41は、推論結果の類似度表示の処理手順を示すフローチャートである。たとえば、図35に示したパーソナル・コンピュータ360において実施する。 FIG. 41 is a flowchart showing the processing procedure for displaying the similarity of inference results. The processing is performed, for example, on the personal computer 360 shown in FIG. 35.

図35に示したパーソナル・コンピュータ360において推論の結果が撮影装置１ごとに読み取られる（ステップ431）。たとえば、撮影装置１に装填されて推論の結果が記憶されている記憶媒体71が撮影装置1から取り外され、取り外された記憶媒体が撮影装置１ごとにパーソナル・コンピュータ360に装填される。パーソナル・コンピュータ360において、推論の結果が撮影装置１に関連付けて撮影装置１ごとに読み取られる。推論の結果の類似度がパーソナル・コンピュータ360において算出され、算出された推論の結果の類似度が撮影装置１の識別データに関連づけられて表示装置363の表示画面に表示される（ステップ432）。識別データは、推論の結果とともに撮影装置1から読み取られる。 The inference results are read for each imaging device 1 in the personal computer 360 shown in FIG. 35 (step 431). For example, the storage medium 71 loaded into the photographing device 1 and storing the inference results is removed from the photographing device 1, and the removed storage medium is loaded into the personal computer 360 for each photographing device 1. In the personal computer 360, the result of the inference is read for each photographing device 1 in association with the photographing device 1. The similarity of the inference result is calculated in the personal computer 360, and the calculated similarity of the inference result is displayed on the display screen of the display device 363 in association with the identification data of the photographing device 1 (step 432). The identification data is read from the imaging device 1 along with the result of the inference.

図42は、表示装置363の表示画面に表示される推論の結果の類似度の一覧表の一例である。 FIG. 42 is an example of a list of similarities of inference results displayed on the display screen of the display device 363.

例えば、識別データID0001で特定される撮影装置１の推論の結果と類似度が90％以上の推論の結果をもつ撮影装置１の識別データはID0007、ID0010などであり、識別データID0001で特定される撮影装置１の推論の結果と類似度が80％以上の推論の結果をもつ撮影装置１の識別データはID0003、ID0008などである。また、識別データID0002で特定される撮影装置１の推論の結果と類似度が90％以上の推論の結果をもつ撮影装置１の識別データは、ID0005、ID0006などである。 For example, the photographing devices 1 whose inference results have a similarity of 90% or more to the inference results of the photographing device 1 identified by identification data ID0001 have identification data ID0007, ID0010, and so on, and those whose inference results have a similarity of 80% or more to the inference results of the photographing device 1 identified by ID0001 have identification data ID0003, ID0008, and so on. Likewise, the photographing devices 1 whose inference results have a similarity of 90% or more to the inference results of the photographing device 1 identified by identification data ID0002 have identification data ID0005, ID0006, and so on.

たとえば、識別データID0001の撮影装置1の推論の結果と識別データID0007、ID0010などの撮影装置1の推論の結果とは90％の類似度があることがわかる。より正確にいえば、識別データID0001の撮影装置1での推論に用いられた学習済モデル80での推論の結果と識別データID0007、ID0010などの撮影装置1での推論に用いられた学習済モデル80での推論の結果とは90％の類似度があるということが言える。撮影装置１（学習済モデル80）ごとの推論の結果の類似度が分かるので、撮影装置１（学習済モデル80）ごとの推論の結果の相違が分かるようになる。 For example, it can be seen that the inference results of the photographing device 1 with identification data ID0001 and those of the photographing devices 1 with identification data ID0007, ID0010, and so on have a similarity of 90% or more. More precisely, the inference results of the trained model 80 used for inference in the photographing device 1 with ID0001 and the inference results of the trained models 80 used for inference in the photographing devices 1 with ID0007, ID0010, and so on have a similarity of 90% or more. Because the similarity of the inference results is known for each photographing device 1 (trained model 80), the differences in inference results between photographing devices 1 (trained models 80) can be understood.
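The source does not state how the similarity between the inference results of two photographing devices is calculated. As one hedged illustration, the sketch below assumes that each device's results are labels keyed by a shared scene identifier and defines similarity as the fraction of shared scenes with identical labels. The metric, the data layout, and the names `agreement_similarity` and `similar_devices` are all assumptions.

```python
def agreement_similarity(results_a, results_b):
    """Fraction of scenes present in both results that carry the same label."""
    shared = results_a.keys() & results_b.keys()
    if not shared:
        return 0.0
    matches = sum(1 for scene in shared if results_a[scene] == results_b[scene])
    return matches / len(shared)


def similar_devices(results_by_device, reference_id, threshold):
    """Return IDs of devices whose similarity to reference_id is >= threshold."""
    reference = results_by_device[reference_id]
    return sorted(
        device_id
        for device_id, results in results_by_device.items()
        if device_id != reference_id
        and agreement_similarity(reference, results) >= threshold
    )


if __name__ == "__main__":
    scenes = range(10)
    results_by_device = {
        "ID0001": {s: "person" for s in scenes},
        "ID0007": {s: "person" if s < 9 else "none" for s in scenes},
        "ID0003": {s: "person" if s < 8 else "none" for s in scenes},
        "ID0009": {s: "person" if s < 5 else "none" for s in scenes},
    }
    for threshold in (0.9, 0.8):
        print(threshold, similar_devices(results_by_device, "ID0001", threshold))
```

With this definition a device that passes the 90% threshold also passes the 80% threshold, so the two columns of a table like the one in FIG. 42 would not be mutually exclusive; whether the actual table is built that way is not stated in the source.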

図41に戻って、パーソナル・コンピュータ360において撮影時の状況が近似しているときに得られた推論の結果が撮影装置１ごとに撮影装置１から取り外された記憶媒体71から読み取られる（ステップ433）。すると、表示装置363の表示画面には図43に示す、撮影時が近似しているときの推論の結果の類似度の一覧表が表示される。撮影時が近似しているときには、たとえば、撮影日時、撮影時の天候、撮影時の明るさ、撮影場所、たとえば、市街地か郊外かなどがある。これらの撮影時の状況を表すデータは推論イベント記録の動画ファイルに関連づけて記憶媒体71に記憶されている。 Returning to FIG. 41, the inference results obtained under similar photographing conditions are read into the personal computer 360, for each photographing device 1, from the storage medium 71 removed from that photographing device 1 (step 433). The display screen of the display device 363 then shows the list of FIG. 43, which gives the similarity of inference results obtained under similar photographing conditions. The photographing conditions that are compared include, for example, the date and time of photographing, the weather and brightness at the time of photographing, and the photographing location, for example, whether it is an urban area or a suburb. The data representing these photographing conditions is stored in the storage medium 71 in association with the video file of the inference event record.

例えば、撮影時の状況が近似しているときに、識別データID0001で特定される撮影装置１の推論の結果と類似度が90％以上の推論の結果をもつ撮影装置１の識別データはID0007、ID0023などであり、識別データID0001で特定される撮影装置１の推論の結果と類似度が80％以上の推論の結果をもつ撮影装置１の識別データはID0003、ID0007などである。また、識別データID0002で特定される撮影装置１の推論の結果と類似度が90％以上の推論の結果をもつ撮影装置１の識別データは、ID0006、ID0012などである。 For example, under similar photographing conditions, the photographing devices 1 whose inference results have a similarity of 90% or more to the inference results of the photographing device 1 identified by identification data ID0001 have identification data ID0007, ID0023, and so on, and those whose inference results have a similarity of 80% or more to the inference results of the photographing device 1 identified by ID0001 have identification data ID0003, ID0007, and so on. Likewise, the photographing devices 1 whose inference results have a similarity of 90% or more to the inference results of the photographing device 1 identified by identification data ID0002 have identification data ID0006, ID0012, and so on.

たとえば、撮影時の状況が近似しているときであれば、識別データID0001の撮影装置1の推論の結果と識別データID0007、ID0023などの撮影装置1の推論の結果とは90％の類似度があることがわかる。より正確にいえば、識別データID0001の撮影装置1での推論に用いられた学習済モデル80での推論の結果と識別データID0007、ID0023などの撮影装置1での推論に用いられた学習済モデル80での推論の結果とは90％の類似度があるということが言える。撮影時の状況が近似しているときの撮影装置１ごとの推論の結果の類似度が分かるので、撮影状況が近似しているときの撮影装置１ごとの推論の結果の相違が分かるようになる。このため、撮影装置１に記憶されている学習済みモデル80ごとの推論の結果の相違がわかるようになる。 For example, under similar photographing conditions, it can be seen that the inference results of the photographing device 1 with identification data ID0001 and those of the photographing devices 1 with identification data ID0007, ID0023, and so on have a similarity of 90% or more. More precisely, the inference results of the trained model 80 used for inference in the photographing device 1 with ID0001 and the inference results of the trained models 80 used for inference in the photographing devices 1 with ID0007, ID0023, and so on have a similarity of 90% or more. Because the similarity of the inference results of each photographing device 1 under similar photographing conditions is known, the differences in inference results between photographing devices 1 under such conditions can be understood. As a result, the differences in inference results between the trained models 80 stored in the photographing devices 1 can be understood.
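Deciding whether two recordings were made under similar photographing conditions requires some comparison rule, which the source leaves open. The sketch below shows one possible rule over the conditions named in the text (time, weather, brightness, urban or suburban location). The `Conditions` fields, the tolerances, and the function name `conditions_similar` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conditions:
    """Photographing conditions stored with an inference event record (assumed)."""
    hour: int          # hour of day, 0-23
    weather: str       # e.g. "clear", "rain"
    brightness: float  # normalized to 0.0-1.0
    area: str          # "urban" or "suburban"


def conditions_similar(a, b, max_hour_gap=1, max_brightness_gap=0.1):
    """Return True when two sets of photographing conditions are close.

    Weather and area must match exactly. Hour of day is compared on a 24 hour
    circle so that 23:00 and 00:00 count as one hour apart.
    """
    hour_gap = abs(a.hour - b.hour)
    hour_gap = min(hour_gap, 24 - hour_gap)
    return (hour_gap <= max_hour_gap
            and a.weather == b.weather
            and a.area == b.area
            and abs(a.brightness - b.brightness) <= max_brightness_gap)


if __name__ == "__main__":
    night_a = Conditions(hour=23, weather="clear", brightness=0.20, area="urban")
    night_b = Conditions(hour=0, weather="clear", brightness=0.25, area="urban")
    noon = Conditions(hour=12, weather="clear", brightness=0.90, area="urban")
    print(conditions_similar(night_a, night_b), conditions_similar(night_a, noon))
```

Only inference results whose conditions pass such a check would then be compared between devices to build a list like the one in FIG. 43.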

［第10実施例］ [Tenth Embodiment]
図44から図49は、第10実施例を示している。 FIGS. 44 to 49 show a tenth embodiment.

図44は、車両としてフォークリフト400が採用されている。必ずしもフォークリフト400でなくともよく、自動車などでもよい。 In FIG. 44, a forklift 400 is used as the vehicle. It does not necessarily have to be a forklift 400, and may be a car or the like.

フォークリフト400の天井に下方向に前後左右の円周画像、たとえば、天球画像を撮影する撮影装置１が設けられている。また、フォークリフト400の後方に、フォークリフト400の後方を撮影する撮影装置２が設けられている。 A photographing device 1 is provided on the ceiling of the forklift 400, facing downward, to capture circumferential images covering the front, rear, left, and right, for example, celestial sphere images. In addition, a photographing device 2 that photographs the area behind the forklift 400 is provided at the rear of the forklift 400.

図45は、フォークリフト400の天井に設けられている撮影装置１の撮影によって得られた円周画像の一例である。 FIG. 45 is an example of a circumferential image obtained by photographing with the photographing device 1 installed on the ceiling of the forklift 400.

図45の円周画像460において奥側が前方であり、左側が左方であり、手前側が後方であり、右側が右方である。 In the circumferential image 460 of FIG. 45, the back side is the front, the left side is the left side, the near side is the back side, and the right side is the right side.

図46は、フォークリフト400の後方に設けられている撮影装置２の撮影によって得られた画像の一例である。 FIG. 46 is an example of an image obtained by photographing with the photographing device 2 provided at the rear of the forklift 400.

撮影装置２はフォークリフト400の後方に設けられ、かつフォークリフト400の後方を撮影するから、図46に示す画像470はフォークリフト400の後方を表している。 Since the photographing device 2 is provided behind the forklift 400 and photographs the rear of the forklift 400, the image 470 shown in FIG. 46 represents the rear of the forklift 400.

図45に示す円周画像460と図46に示す画像470とがあるときにおいて上述した推論を行うときに、これらの円周画像460と画像470とのすべてについて、たとえば、図５に示す学習済モデル80などを用いて推論しなくとも円周画像460の前方の画像部分461と後方の画像470とを推論すればよい。円周画像460の後方部分の画像、たとえば、前方の画像部分461以外の部分の画像と後方の画像470とは重複していると考えられるからである。このように円周画像460の一部を学習済モデル80などで学習することで推論に要する時間を短くできる。円周画像460は円周画像に限らない。 When the inference described above is performed with both the circumferential image 460 shown in FIG. 45 and the image 470 shown in FIG. 46 available, it is not necessary to run inference, for example with the trained model 80 shown in FIG. 5, on all of the circumferential image 460 and the image 470. It is sufficient to run inference on the front image portion 461 of the circumferential image 460 and on the rear image 470. This is because the rear part of the circumferential image 460, for example, the part other than the front image portion 461, is considered to overlap with the rear image 470. Handling only part of the circumferential image 460 with the trained model 80 or the like in this way shortens the time required for inference. The image 460 need not be a circumferential image.

FIG. 47 is an example of a circumferential image 480 obtained by photographing with the photographing device 1.

Because the circumferential image 480 captures the front, rear, left, and right around its center, it differs from a normal image, in which the downward direction of the image represents the downward direction in real space and the upward direction of the image represents the upward direction in real space. In the circumferential image 480, a target may appear upside down or appear to be standing sideways. Therefore, if the circumferential image 480 is fed as is to, for example, the trained model 80 shown in FIG. 5 to detect a target in the circumferential image 480, detection accuracy may decrease or detection may take longer.

Consider the image portion 481 of the circumferential image 480 that represents the front. If this image portion 481 is converted into a normal image as shown in FIG. 48 (an image close to a normal image, in which the vertical direction of the image matches the vertical direction of real space), then a decrease in detection accuracy and an increase in detection time can be suppressed even when, for example, the trained model 80 shown in FIG. 5 is made to detect a target in that image.

For this purpose, in this embodiment, a first transformation process transforms the circumferential image 480 shown in FIG. 47 into an image 500 as shown in FIG. 49, and a second transformation process is applied to a partial image portion 501 of the image 500 so that it becomes the image 490 shown in FIG. 48. It was found that this makes it possible to create, from the image portion 481 of the circumferential image 480 shown in FIG. 47, an image 490 whose vertical direction matches the vertical direction of real space, as shown in FIG. 48. The image portion 501 is the portion corresponding to the image 490 of FIG. 48. With the circumferential image 480 shown in FIG. 47 as the input image and the image 490 shown in FIG. 48 as the output image, the output image 490 is obtained by applying the first process and the second process to the circumferential image 480.

The first and second transformation processes can be applied not only to the circumferential image 480 shown in FIG. 47 but also to other circumferential images to obtain an image whose vertical direction matches the vertical direction of real space, as shown in FIG. 48. Performing inference with the trained model 80 on such images can improve the accuracy of the inference result and shorten the inference time.
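The specification does not state the exact form of the first and second transformation processes. As a minimal sketch only, the example below assumes that the first process is a polar-to-rectangular unwrap of the circular image and that the second process is a crop of the columns around the front direction. The function names `unwrap_circular` and `front_portion`, the nearest-neighbour sampling, and the convention that the front lies at the top of the circular image are all assumptions made for illustration.

```python
import math

def unwrap_circular(image, out_w, out_h, r_min=0.0):
    """Assumed 'first process': nearest-neighbour polar unwrap of a square image.

    image is a list of rows. The top output row samples the outer rim and the
    bottom output row samples radius r_min, so a person whose feet are nearer
    the image center comes out upright.
    """
    size = len(image)
    cx = cy = (size - 1) / 2.0
    r_max = (size - 1) / 2.0
    out = []
    for v in range(out_h):
        r = r_max - (r_max - r_min) * v / max(out_h - 1, 1)
        row = []
        for u in range(out_w):
            theta = 2.0 * math.pi * u / out_w  # theta = 0 points to the front
            x = cx + r * math.sin(theta)
            y = cy - r * math.cos(theta)
            row.append(image[int(round(y))][int(round(x))])
        out.append(row)
    return out

def front_portion(panorama, half_width):
    """Assumed 'second process': keep the columns centered on the front direction."""
    width = len(panorama[0])
    cols = [u % width for u in range(-half_width, half_width + 1)]
    return [[row[u] for u in cols] for row in panorama]

# Tiny 5x5 test image whose pixels record their own (y, x) coordinates.
image = [[(y, x) for x in range(5)] for y in range(5)]
pano = unwrap_circular(image, out_w=4, out_h=3)
front = front_portion(pano, half_width=1)
```

In practice the second step would more likely include a perspective correction than a plain crop; the crop is used here only to keep the sketch short.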

[Modifications]
Inference may be performed continuously while an edge terminal or drive recorder is recording, and the inference results may be stored as data within the video data. Possible inference results include the position of a bounding box (BB), a classification tag name, and a probability. As for the storage area, the results may be stored in the subtitle stream of the video data together with GPS data, acceleration data, and the like.
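As a rough illustration of storing inference results in a subtitle stream, the sketch below packs one record into a subtitle-style text cue as JSON. The specification does not define a record format; the field names, the JSON encoding, and the helper names `make_cue` and `parse_cue` are assumptions.

```python
import json

def make_cue(start_s, end_s, detections, gps, accel):
    """Build one subtitle-style cue whose text carries the inference results."""
    payload = {
        "detections": detections,  # each: {"bb": [x, y, w, h], "tag": str, "prob": float}
        "gps": gps,                # {"lat": float, "lon": float}
        "accel": accel,            # [ax, ay, az]
    }
    return {"start": start_s, "end": end_s, "text": json.dumps(payload, sort_keys=True)}

def parse_cue(cue):
    """Recover the stored record from a cue, for example in a viewer."""
    return json.loads(cue["text"])

cue = make_cue(
    12.0, 12.5,
    detections=[{"bb": [100, 80, 40, 90], "tag": "person", "prob": 0.87}],
    gps={"lat": 35.0, "lon": 139.0},
    accel=[0.0, 0.1, 9.8],
)
record = parse_cue(cue)
```

Keeping the record as text in a timed cue means it stays synchronized with the frames and can be exported without decoding the video.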

When a detection target is detected, event recording may be performed and the recording saved in a dedicated event recording directory. This removes the need to run inference over all continuous recordings afterward and cut out the relevant parts. In addition, by running inference again on the uploaded data at an aggregation terminal, errors in the edge terminal's inference results can be fed back to improve the model. The edge terminal may capture broadly, the aggregation terminal may narrow the results down further, and the results may be returned to the edge model for retraining.

With this arrangement, almost no inference needs to be performed afterward, so aggregating and visualizing data in the management system becomes easy. The subtitle stream data can be exported and plotted on a map for visualization, and a thumbnail and a playback link for the relevant video can be attached at each point. An administrator who wants to know more can simply watch that video. Accumulating the data also allows later filtering to narrow down the location information and the corresponding footage that match given conditions. This approach can also be used to add AI functionality to an existing DVR or surveillance camera after the fact.
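The filtering and map plotting described above could look roughly like the following sketch. The record fields, the thumbnail naming scheme, and the function name `to_markers` are hypothetical; the specification only states that exported data is plotted on a map with a thumbnail and a playback link.

```python
def to_markers(records, tag, min_prob):
    """Select records matching a condition and turn them into map markers."""
    markers = []
    for rec in records:
        if rec["tag"] == tag and rec["prob"] >= min_prob:
            markers.append({
                "lat": rec["lat"],
                "lon": rec["lon"],
                "thumbnail": rec["video"] + ".jpg",  # hypothetical naming scheme
                "link": rec["video"],                # playback link for the viewer
            })
    return markers

records = [
    {"tag": "person", "prob": 0.91, "lat": 35.00, "lon": 139.00, "video": "event_0001.mp4"},
    {"tag": "person", "prob": 0.40, "lat": 35.01, "lon": 139.01, "video": "event_0002.mp4"},
    {"tag": "vehicle", "prob": 0.95, "lat": 35.02, "lon": 139.02, "video": "event_0003.mp4"},
]
markers = to_markers(records, tag="person", min_prob=0.5)
```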

Implementing this as an AI module that can be attached to existing products makes it usable for general purposes. Because the module is retrofitted, the only required input is video data. Two-way communication with the terminal is desirable. For output, it is sufficient to be able to output the inference results to an external monitor with bounding boxes (BB) overlaid. An AI module for a DVR need not have an external output. The inference results may instead be returned to the terminal via communication.

By swapping the model installed in the module, the trained model or inference model can be changed to suit the use case. If the terminal has a data upload function, data including the inference results can also be uploaded to the cloud.

A drive recorder, or a settings app for a drive recorder, may be provided with a selection function that lets the user select the model used for inference. Information about the model (for example, a data structure convertible to a format such as ONNX) may be recorded in the video file together with the inference results. One set of model information may be stored per file, and a separate file may be started when the model changes. A function may also be provided for recording, in a distinguishable manner, video of a type based only on the inference results of the image recognition AI and video of a type based on both the inference results of the image recognition AI and the outputs of other sensors and the like. For the latter type, a function may be provided for recording information such as the outputs of the other sensors in association with the inference results. Software may be provided that has a function of extracting at least one of the inference result history and the model information from the video data and recording it as a separate file. Software may also be provided that has a function of extracting the video data with at least one of the inference result history and the model information removed and recording it as a separate video file.

While video recorded as a result of AI recognition is being played back, whether on the drive recorder itself, in an app, in a PC viewer, or the like, a GUI (for example, buttons) or a VUI (for example, speech recognition of "wrong" and "correct") may be provided for the user to input whether the recognition is correct. Based on this correctness information, the source video, the information on the model on which the determination was based, the inference results, and the like, predetermined processing may be performed, for example, recording these as retraining information, retraining based on them, or changing a threshold.

In a system having a function of acquiring, from a plurality of drive recorders, video data recorded in response to AI inference results, information that uniquely identifies the drive recorder, such as an ID, may be recorded in the video data together with the inference result information. The system may have a function of displaying the similarity, for example the correlation, of the inference results in video data recorded by drive recorders with different IDs within a range that is close in terms of position, time, vehicle information, speed, or the like. It may also have a function of displaying items of high similarity side by side and items of low similarity side by side, or of displaying these groups in contrast with each other.
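One way to picture the similarity display is sketched below. The specification names correlation only as an example and does not fix a measure, so this sketch swaps in the Jaccard index over detected tag sets. The record layout, the use of local coordinates in metres, and the function names are assumptions.

```python
import math
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two tag collections (1.0 when both are empty)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def nearby_similarities(records, max_dist_m, max_dt_s):
    """Compare records from different recorder IDs that are close in space and time."""
    out = []
    for r1, r2 in combinations(records, 2):
        if r1["id"] == r2["id"]:
            continue
        dist = math.hypot(r1["x_m"] - r2["x_m"], r1["y_m"] - r2["y_m"])
        if dist <= max_dist_m and abs(r1["t_s"] - r2["t_s"]) <= max_dt_s:
            out.append((r1["id"], r2["id"], jaccard(r1["tags"], r2["tags"])))
    # Highest similarity first, so similar pairs can be listed side by side.
    return sorted(out, key=lambda item: item[2], reverse=True)

records = [
    {"id": "ID0001", "x_m": 0.0, "y_m": 0.0, "t_s": 100.0, "tags": ["person", "forklift"]},
    {"id": "ID0002", "x_m": 3.0, "y_m": 4.0, "t_s": 102.0, "tags": ["person"]},
    {"id": "ID0007", "x_m": 500.0, "y_m": 0.0, "t_s": 101.0, "tags": ["person"]},
]
pairs = nearby_similarities(records, max_dist_m=10.0, max_dt_s=5.0)
```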

A drive recorder viewer, the drive recorder itself, or the like may have a function of acquiring video data recorded in response to AI inference results and of displaying an image indicating the inference status together with the type of AI inference, for example the trigger type, and preferably together with the video. In particular, a function of visually displaying the points the inference focused on in the image is desirable, for example as a heat map.

People inside oncoming vehicles may be detected and displayed with bounding boxes, so when a person inside a vehicle is detected, the detection should be rejected in post-processing. For example, a detection with a high moving speed is judged to be a person inside a vehicle, and a detection with a low moving speed is judged to be a person to be detected. Mirror-like objects may also be excluded from the targets, because if the vehicle ahead has a mirror-like surface, reflections in it may be detected.
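The speed-based rejection could be sketched as follows. How the moving speed of a detection is estimated is not described in the specification, so the `speed_kmh` field, the threshold value, and the function name `reject_fast_persons` are assumptions.

```python
def reject_fast_persons(detections, max_speed_kmh):
    """Drop 'person' detections moving faster than the threshold.

    A fast-moving person is assumed to be an occupant of a vehicle.
    Detections with other tags pass through unchanged.
    """
    return [
        d for d in detections
        if not (d["tag"] == "person" and d["speed_kmh"] > max_speed_kmh)
    ]

detections = [
    {"tag": "person", "speed_kmh": 4.0},    # walking pace: kept
    {"tag": "person", "speed_kmh": 45.0},   # likely inside a vehicle: rejected
    {"tag": "vehicle", "speed_kmh": 45.0},  # not a person: kept
]
kept = reject_fast_persons(detections, max_speed_kmh=10.0)
```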

Furthermore, because inference takes time, a bounding box may be offset from its target, and a person on the far side of a roadside fence may fail to be recognized because of the fence. As a result, the bounding box disappears in some frames. Such detections should be treated as continuous. Post-processing may also boost the bounding-box probability according to the shaking or speed of the vehicle, and, because object detection is difficult at night, the model may be switched between daytime driving and nighttime driving. Glare, direct sunlight, and backlighting can lower the recognition rate. At night, lamps and reflections have an effect, so calibration is performed. The recognizable or unrecognizable state is displayed on the drive recorder, so that the user can tell when a failure to recognize an object is due to the unrecognizable state.
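Treating briefly interrupted detections as continuous can be sketched as gap filling on a per-frame track. Linear interpolation between the boxes on either side of the gap is an assumption, as are the `max_gap` parameter (in frames) and the function name `fill_gaps`. The specification only says that interrupted boxes should be treated as continuous.

```python
def fill_gaps(track, max_gap):
    """Fill runs of missing boxes (None) no longer than max_gap frames.

    track holds one (x, y, w, h) tuple or None per frame. Gaps at the start or
    end of the track, and gaps longer than max_gap, are left untouched.
    """
    filled = list(track)
    last = None  # index of the most recent frame that has a box
    for i, box in enumerate(track):
        if box is None:
            continue
        if last is not None:
            gap = i - last - 1
            if 0 < gap <= max_gap:
                for k in range(1, gap + 1):
                    t = k / (gap + 1)
                    filled[last + k] = tuple(
                        a + (b - a) * t for a, b in zip(track[last], box)
                    )
        last = i
    return filled

track = [(0, 0, 10, 10), None, (20, 0, 10, 10), None, None, None, (50, 0, 10, 10)]
filled = fill_gaps(track, max_gap=2)
```

Here the one-frame gap is filled with the midpoint box, while the three-frame gap exceeds `max_gap` and stays empty.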

The video differs considerably depending on the installation height of the camera, so, for example, the user may be asked to input the installation height and the inference processing may be adjusted accordingly. For example, inference may assume the typical installation height on a forklift, and at inference time the input image may be resized (enlarged or reduced) by a magnification derived from the difference between the actual installation height and the typical height before being input to the inference unit.
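The specification does not give the formula for the magnification. As a sketch under a stated assumption, the example below uses the simple ratio of actual to standard height, on the reasoning that objects on the floor look smaller from a higher camera and therefore need enlarging. The function names and the example values are hypothetical.

```python
def height_scale(actual_height_m, standard_height_m):
    """Assumed magnification: ratio of actual to standard installation height."""
    if actual_height_m <= 0 or standard_height_m <= 0:
        raise ValueError("heights must be positive")
    return actual_height_m / standard_height_m

def resized_size(width, height, scale):
    """Pixel dimensions of the input image after applying the magnification."""
    return (round(width * scale), round(height * scale))

scale = height_scale(actual_height_m=3.0, standard_height_m=2.0)
new_size = resized_size(640, 480, scale)
```

A real system would probably calibrate this factor from sample footage rather than rely on a fixed ratio.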

Note that the scope of the present invention is not limited to the configurations explicitly described in the specification, and also includes combinations of the various aspects of the invention disclosed herein. Of the present invention, the configurations for which a patent is sought are specified in the appended claims. However, the applicant intends to claim in the future any configuration disclosed in this specification, even if it is not currently specified in the claims.

The present invention is not limited to the configurations described in the embodiments above. The components of the embodiments and modifications described above may be selected and combined as desired. Any component of any embodiment or modification may also be combined as desired with any component described in the means for solving the problem, or with a component that embodies any component described there. The applicant intends to acquire rights to these as well, through amendment of the present application, divisional applications, or the like. Descriptions such as "in the case of ..." or "when ..." are not intended to limit a configuration to that case or that time. Configurations outside those cases or times are also disclosed, and the applicant intends to acquire rights to them. Passages described in a particular order are likewise not limited to that order. Configurations in which some parts are deleted or the order is changed are also disclosed, and the applicant intends to acquire rights to them.

The applicant also intends to acquire rights to the whole design or a partial design by converting the application to a design registration application. Although the drawings depict the entire device in solid lines, they cover not only the whole design but also partial designs claimed for parts of the device. For example, the drawings cover partial designs consisting of some members of the device, as well as partial designs consisting of some part of the device irrespective of its members. The part of the device may be a member of the device or a part of such a member. The applicant intends to obtain rights not only to the whole design but also to partial designs in which any of the solid-line portions of the drawings is rendered in broken lines. Modules, members, parts, and the like inside the housing of the device that are shown in the drawings are each independently tradable, and the applicant likewise intends to obtain rights to them by conversion to a design registration application.

1: Photographing device, 2: Photographing device, 3: Cable, 4: Vehicle, 5: System, 11: Housing, 12: Joint rail, 13: Sound emission hole, 14: Microphone hole, 15: Imaging lens, 16: Top surface, 17: Camera jack, 18: Terminal, 19: Storage medium insertion slot, 20: Third side surface, 21: First side surface, 22: Fourth side surface, 23: Display surface, 24: Touch sensor, 25: Light emitting unit, 26: Operation unit, 27: First button, 28: Second button, 29: Third button, 30: Fourth button, 31: Event recording button, 32: Second side surface, 40: Bracket, 41: Base part, 42: Mounting surface, 43: Ball stud, 44: Ball part, 45: Socket part, 46: Nut, 47: Base part, 48: Guide rail, 49: Tip part, 50: Control unit, 51: Processor, 52: Memory, 53: Timekeeping unit, 60: Input unit, 61: Microphone, 65: Display unit, 66: Audio output unit, 67: Photographing unit, 68: Communication unit, 69: Sensor unit, 70: Reader/writer, 71: Storage medium, 72: Terminal unit, 73: Position information acquisition unit, 75: Power supply control unit, 76: Video input/output unit, 77: Data input/output unit, 80: Learning model (trained model), 80B: Trained model, 81: Input layer, 82: Intermediate layer, 83: Output layer, 83a: Neuron, 83b: Neuron, 83n: Neuron, 120: Trained model selection image, 121: System area, 122-126: Area, 131: System area, 132: Recording area, 133: Management area, 134: Recording area, 135: Inference event recording area, 136: Management area, 137: Recording area, 138: User recording area, 139: Management area, 140: User information recording area, 151: Header recording area, 152: Frame image data recording area, 153: Footer recording area, 160: Captured image, 160A: Captured image, 160B: Image, 161-163: Target, 164: Frame, 164a: Frame, 165: Frame, 165a: Frame, 167: Fence, 170: Vehicle, 180: Vehicle, 181: Target, 182: Target, 183: Frame, 221: Sensor event recording area, 222: Management area, 223: Recording area, 224: Inference result recording area, 225: Management area, 226: Recording area, 227: Sensor output result recording area, 228: Management area, 229: Recording area, 230: AI module, 231: CPU, 232: Video interface, 233: Data interface, 234: Trained model storage device, 235: Memory card reader/writer, 236: Memory, 237: Video interface, 238: Memory card, 239: Display device, 240: System, 251: Trained model information, 260: Inference server, 261: Control device, 262: Communication circuit, 263: Memory, 264: Trained model storage device, 265: Hard disk drive, 266: Hard disk, 267: Input device, 269: System, 310: Captured image, 320: Captured image, 350: Character string, 351: OK button, 352: NG button, 353: Frame, 360: Computer, 361: CPU, 362: Display control device, 363: Display device, 364: Acceleration sensor, 366: Memory card reader/writer, 367: Input device, 368: Memory, 369: Communication circuit, 390: Playback video display area, 391: Area, 392: List area, 393: Map display area, 394: Arrow, 395: Thumbnail image, 396: Information, 398: Information, 399: Button area, 400: Forklift, 401: Speed information, 402: Information display area, 403: Area, 404: Area, 420: Heat map, 421: Circle, 422: Ring, 424: Image portion, 460: Circumferential image, 461: Image portion, 470: Image, 480: Circumferential image, 481: Image portion, 490: Output image, 500: Image, 501: Image portion, ID0001, ID0002, ID0007, ID0010, ID0023: Identification data

Claims

1. A system having a function of performing inference with an inference model on captured video data obtained by shooting with an in-vehicle photographing device.

2. The system according to claim 1, having a function of determining the inference model from among a plurality of types.

3. The system according to claim 2, having a function of determining, as the inference model to be used, an inference model suited to the shooting environment.

4. The system according to claim 3, having a function of determining, as the inference model to be used, an inference model suited to the shooting location, the weather at the shooting location, or the brightness of the shooting location.

5. The system according to claim 4, having a function of determining the inference model to be used so that a more accurate inference model is used when the shooting location is darker than when it is brighter.

6. The system according to claim 5, having a function of determining the inference model so that the more accurate inference model is used for inference at night.

7. The system according to claim 6, having a function of correcting the inference result of the inference model in accordance with a disturbance applied to the in-vehicle photographing device.

8. The system according to claim 7, having a function of correcting the inference result obtained using the inference model in accordance with a disturbance applied to the in-vehicle photographing device from the vehicle in which it is installed as a result of movement of that vehicle, or a disturbance applied to the in-vehicle photographing device by light striking it.

9. The system according to claim 8, having a function of notifying whether inference using the inference model is possible in view of a disturbance applied to the in-vehicle photographing device.

10. The system according to claim 9, having a function of adjusting the inference result of the inference model according to the shaking or speed of the vehicle in which the in-vehicle photographing device is installed.

11. The system according to claim 10, having a function of performing adjustment, based on the height at which the in-vehicle photographing device is installed, so as to approximate video obtained by photographing with the in-vehicle photographing device installed at a standard height.

12. The system according to claim 11, wherein the in-vehicle photographing device has a function of photographing a circumferential image with a photographing direction below a reference height.

13. The system according to claim 1, having a function of accepting input of the height at which the in-vehicle photographing device is installed and adjusting inference by the inference model according to the input height.

14. The system according to claim 13, wherein the inference model has a function of performing inference to detect a person whose moving speed is less than or equal to a predetermined speed from a video.

15. The system according to claim 14, wherein the inference model has a function of making an inference that excludes a person inside the vehicle.

16. The system according to claim 15, wherein the inference model performs inference to detect an object from a moving image, and has a function of excluding an object reflected on a reflective surface.

17. The system according to claim 16, wherein the inference model performs inference to detect a target object from each image constituting a video, the system having a function of displaying, on each image constituting the video, a mark identifying the detected part of the object, and a function of interpolating the mark into images where the mark is interrupted when the duration of the interruption in the video is within a predetermined time.

18. The system according to claim 17, having a function of event-recording video data in response to detection of an object by inference using the inference model on captured video data obtained by photographing with the in-vehicle photographing device.

19. The system according to claim 18, having a function of lowering a threshold value used for detecting a target object in response to detecting the target object.

20. The system according to claim 19, having a function of displaying the location where the event recording was performed on a map.

21. The system according to claim 20, having a function of displaying an image of the location where the event recording was performed on a map.

22. The system according to claim 21, further comprising a function of embedding a link to video data of the event record at a location where the event record was performed, which is displayed on a map.

23. The system according to claim 22, wherein the event recording of the video data is performed in response to detection of the target object by inference using the inference model, the system having a function of displaying, in the images constituting the event-recorded video, a mark indicating the point of focus for the detected target.

24. The system according to claim 23, having a function of acquiring, for each of a plurality of in-vehicle imaging devices, the inference results obtained by performing inference using the inference model in that device, and displaying the similarity of the inference results across the in-vehicle imaging devices.

25. The system according to claim 24, having a function of representing the similarity of inference results of the inference model that were obtained from captured video data taken when the position, vehicle information, speed, shooting time, or the like of the vehicles in which the in-vehicle imaging devices are installed are close to one another.

26. A method for controlling a system having a function of performing inference using the inference model on captured video data while shooting with the in-vehicle imaging device.

27. A program for causing a computer to implement the functions of the system according to claims 1 to 12.