JP2021128796A

JP2021128796A - Object recognition system and object recognition method

Info

Publication number: JP2021128796A
Application number: JP2021083156A
Authority: JP
Inventors: 洋一高野; Yoichi Takano
Original assignee: Daifuku Co Ltd
Current assignee: Daifuku Co Ltd
Priority date: 2018-06-01
Filing date: 2021-05-17
Publication date: 2021-09-02
Anticipated expiration: 2038-06-01
Also published as: JP6913303B2; JP2019211921A; JP7199645B2

Abstract

To provide an object recognition system that makes it easy to recognize, with high accuracy, a second object accompanying a first object in a moving image while suppressing the operating load of the system. SOLUTION: An object recognition system comprises a moving image object recognition part that recognizes a first object from a moving image, a still picture extraction part that extracts still pictures of multiple frames including the first object from the moving image, and a still picture object recognition part that recognizes, from the still pictures, a second object accompanying the first object. SELECTED DRAWING: Figure 6

Description

The present invention relates to an object recognition system and an object recognition method that perform processing such as object recognition.

Object recognition methods for recognizing an object from an image have been proposed in the past. Such methods are well suited mainly to systems that handle captured images, and can help automate various processes and make them more efficient.

As an example, Patent Document 1 discloses an object recognition device that detects targets around a vehicle and recognizes objects based on the detected targets. Patent Document 2 discloses an object recognition device that extracts, from an image captured by an imaging device mounted on a vehicle, a target area in which the shape of a recognition object exists, and selectively executes recognition processing on the extracted target area out of the entire area of the image.

As specific means for performing object recognition, YOLO, which is suitable for object recognition from moving images, and TENSORFLOW (registered trademark; the same applies hereinafter), which is suitable for object recognition from still images, have been developed. Artificial intelligence (AI) technology is applied to these means, and machine learning can improve the accuracy of object recognition. In machine learning, advanced deep learning tends to be adopted, and the accuracy and speed of object recognition have been improving.

Patent Document 1: JP-A-2018-041396
Patent Document 2: JP-A-2017-130155

When recognizing another object (a second object) that accompanies a certain object (a first object) in a moving image, there are cases in which it is not easy to recognize the second object accurately and directly from the moving image. An "object" in the present application is a concept corresponding to annotation data in machine learning, object recognition, and the like. For example, when trying to recognize the number shown on a license plate (an example of the second object) from a moving image shot by a camera that monitors vehicles (an example of the first object), recognizing the number requires more precise recognition processing than recognizing the vehicle. Attempting to recognize the number directly from the moving image may therefore lower the recognition accuracy.

To solve this problem, one conceivable approach is to extract still images from the moving image and recognize the number by object recognition on the extracted still images. However, if the still image of every frame is extracted uniformly from the moving image, for example, unnecessary still images that contain no vehicle are also extracted, and the operating load of the system may become excessive. In addition, if the extracted still images include many such unnecessary still images, the speed and accuracy of object recognition may decrease.

In view of the above problems, an object of the present invention is to provide an object recognition system and an object recognition method that make it easy to accurately recognize a second object accompanying a first object in a moving image while suppressing the operating load of the system.

The object recognition system according to the present invention includes a moving image object recognition unit that recognizes a first object from a moving image, a still image extraction unit that extracts still images of a plurality of frames including the first object from the moving image, and a still image object recognition unit that recognizes, from the still images, a second object accompanying the first object. This configuration makes it easy to accurately recognize the second object accompanying the first object in the moving image while suppressing the operating load of the system. Here, "accompanying" is not limited to the case in which the second object is included in the first object, and also covers other cases in which the times at which the two appear in the moving image are closely related.
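The three units described above can be pictured as a simple pipeline. The following is a minimal sketch, not the implementation of the invention: the function name and the two callables `contains_first_object` and `recognize_second_object` are hypothetical stand-ins for the moving image object recognition unit and the still image object recognition unit, and a "frame" can be any Python object.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Frame = TypeVar("Frame")


def recognize_second_objects(
    frames: Iterable[Frame],
    contains_first_object: Callable[[Frame], bool],
    recognize_second_object: Callable[[Frame], Optional[str]],
) -> List[str]:
    """Run still-image recognition only on frames that contain the first object."""
    # Moving image recognition plus still image extraction:
    # keep only the frames in which the first object was recognized.
    stills = [frame for frame in frames if contains_first_object(frame)]

    # Still image recognition: look for the second object in each kept still.
    results: List[str] = []
    for still in stills:
        result = recognize_second_object(still)
        if result is not None:
            results.append(result)
    return results
```

Because frames without the first object never reach the second recognizer, the more expensive still-image step runs on fewer images, which is the load reduction the text describes.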

More specifically, the above configuration may further include an extraction unit that selects, from among the still images extracted from the moving image, those that satisfy a predetermined condition regarding how high the degree of exposure of at least one of the first object and the second object is, and the still image object recognition unit may recognize, from the selected still images, the second object included in the first object. With this configuration, the second object included in the first object can be recognized more efficiently from the still images.

More specifically, the above configuration may execute: a first process of selecting, from among the still images extracted from the moving image, the one in which a specific object serving as the second object has the highest degree of exposure; a second process of recognizing display information within the second object from the selected still image; and a third process of, until the recognition succeeds, selecting the still image with the next highest degree of exposure of the specific object and repeating the second process. Here, "display information" means characters, figures, symbols, or a combination of these, and may correspond to, for example, the number (automobile registration number) in the present application.
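The first to third processes amount to trying the still images in descending order of exposure until one of them can be read. A minimal sketch follows, assuming each still image is represented by a hypothetical `(image_id, exposure)` pair and that `read_display_info` is a hypothetical recognizer returning `None` on failure; neither name comes from the source.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical representation of a still image: (identifier, exposure degree).
Still = Tuple[str, float]


def recognize_with_retries(
    stills: Sequence[Still],
    read_display_info: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try stills from highest to lowest exposure until recognition succeeds."""
    # First process: rank the stills so the highest exposure comes first.
    ranked = sorted(stills, key=lambda still: still[1], reverse=True)
    for image_id, _exposure in ranked:
        # Second process: attempt to read the display information.
        text = read_display_info(image_id)
        if text is not None:
            return text
        # Third process: continue with the next-highest exposure.
    return None  # no still could be read
```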

More specifically, when the recognition in the second process has not succeeded even after the number of repetitions reaches a predetermined number, the above configuration may execute the first to third processes with the specific object treated as the first object.

More specifically, in the above configuration, the still image extraction unit may adjust at least one of the resolution of the still images to be extracted and the time interval at which the still images are extracted, according to the degree of exposure of the first object in the moving image.
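One way to picture this adjustment is a lookup from exposure degree to extraction settings. The sketch below is purely illustrative: the source does not state the direction or size of the adjustment, so the thresholds, the resolutions, the intervals, and the assumption that a more exposed object warrants denser, higher-resolution stills are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionSettings:
    width: int         # horizontal resolution of extracted stills, in pixels
    interval_s: float  # time between extracted stills, in seconds


def settings_for_exposure(exposure: float) -> ExtractionSettings:
    """Map an exposure degree in [0, 1] to hypothetical extraction settings."""
    if not 0.0 <= exposure <= 1.0:
        raise ValueError("exposure must be between 0 and 1")
    if exposure >= 0.8:
        # Assumed: object clearly visible, so extract dense, high-resolution stills.
        return ExtractionSettings(width=1920, interval_s=0.1)
    if exposure >= 0.4:
        # Assumed: partially visible, so use moderate settings.
        return ExtractionSettings(width=1280, interval_s=0.2)
    # Assumed: barely visible, so sample cheaply.
    return ExtractionSettings(width=640, interval_s=0.5)
```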

With this configuration, highly important still images can be extracted efficiently while keeping the load on the system as low as possible. More specifically, the above configuration may apply correction processing suitable for deep learning to the still image in which the second object has the highest degree of exposure among the plurality of extracted still images.

More specifically, in the above configuration, the moving image may be video shot with a camera, a distance detection unit that detects the distance between the camera and the first object may be provided, and the still image extraction unit may start extracting the still images when the distance becomes equal to or less than a predetermined value. With this configuration, extraction of unclear still images in which the first object appears small can be suppressed as far as possible.
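The distance-based start condition can be sketched as a gate over per-frame distance samples. This is a minimal illustration under stated assumptions: the `(frame_index, distance_m)` input format, the function name, and the threshold parameter are hypothetical, and the sketch covers only when extraction starts, not when it stops.

```python
from typing import Iterable, Iterator, Tuple


def frames_after_approach(
    samples: Iterable[Tuple[int, float]],
    threshold_m: float,
) -> Iterator[int]:
    """Yield frame indices starting from the first frame within the threshold."""
    started = False
    for frame_index, distance_m in samples:
        # Extraction starts once the distance is at or below the threshold.
        if not started and distance_m <= threshold_m:
            started = True
        if started:
            yield frame_index
```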

More specifically, the above configuration may include a direction detection unit that detects the orientation of the first object, and the still image extraction unit may start extracting the still images when the orientation satisfies a predetermined condition. With this configuration, extraction of still images in which the orientation of the first object is problematic can be suppressed as far as possible.

More specifically, the above configuration may hold the information on the first object and the information on the second object in association with each other. With this configuration, the first object and the second object can be managed together. Furthermore, in the above configuration, the moving image object recognition unit and the still image object recognition unit may use different object recognition methods. With this configuration, object recognition can be performed efficiently by using the method best suited to each of object recognition from moving images and object recognition from still images.

The object recognition method according to the present invention includes a moving image object recognition step of recognizing a first object from a moving image, a still image extraction step of extracting still images including the first object from the moving image, and a still image object recognition step of recognizing, from the still images, a second object accompanying the first object.

The object recognition system and the object recognition method according to the present invention make it easy to accurately recognize a second object accompanying a first object in a moving image while suppressing the operating load of the system.

FIG. 1 is a block diagram of the configuration of the vehicle management system 1 according to the present embodiment.
FIG. 2 is an explanatory diagram showing how the smartphones 11 are installed on a site.
FIG. 3 is a block diagram of the functional configuration of the information processing device 14.
FIG. 4 is an explanatory diagram of the video from a smartphone 11.
FIG. 5 is an explanatory diagram of a local license plate.
FIG. 6 is a flowchart of the flow of the pre-stage processing.
FIG. 7 is an explanatory diagram showing how the distance D changes as the vehicle moves.
FIG. 8 is a timing chart of the extraction of still images from a moving image.
FIG. 9 is a flowchart of the flow of the post-stage processing.
FIG. 10 is an explanatory diagram of information such as check results.

A vehicle management system according to an embodiment of the present invention (one form of the object recognition system according to the present invention) is described below with reference to the drawings.

1. Configuration of the Vehicle Management System
FIG. 1 is a block diagram showing a schematic configuration of the vehicle management system 1 according to the present embodiment. As shown in the figure, the vehicle management system 1 includes an approach-road front-side smartphone 11a, an approach-road rear-side smartphone 11b, an exit-road front-side smartphone 11c, an exit-road rear-side smartphone 11d, a communication network 12, an edge server 13, an information processing device 14, and a management server 15. In the following description, the smartphones 11a to 11d may be collectively referred to as the "smartphones 11". In the drawings, a smartphone (SmartPhone) may be abbreviated as "SP".

In the present embodiment, as an example, a plurality of smartphones 11 are installed on each of a plurality of sites (sites 1 to 3 in the example shown in FIG. 1). A "site" in the present embodiment is a place that vehicles may enter and leave with the permission of the business operator or other party that manages the site (hereinafter referred to as the "manager"); a parking lot owned by the manager is one example. When a site has multiple vehicle entrances and exits, installing smartphones 11 near all of them makes it possible to monitor every vehicle entering or leaving the site without omission. A "vehicle" in the present embodiment is an automobile fitted with license plates, and a "number" is the automobile registration number shown on the license plate. License plates are generally provided on both the front and the rear of a vehicle.

The edge server 13, the information processing device 14, and the management server 15, on the other hand, are installed together in a management center. The "management center" in the present embodiment is the place where vehicles entering and leaving the sites are managed; a room in a building owned by the manager is one example. The vehicle management system 1 automatically monitors the vehicles entering each site and manages them collectively.

The management server 15 is a server used to manage the vehicles and the like that enter each site. The numbers of all vehicles permitted to enter the sites (hereinafter referred to as "permitted vehicles" for convenience) are registered in the management server 15 as a database. Each time the manager or another person inputs the number of a new permitted vehicle, the management server 15 stores this information in the database. The management server 15 may be provided in a data center and accessed via the Internet.
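The permitted-vehicle database can be pictured as a set of registered numbers that grows with each new registration. The following is a minimal in-memory sketch; the class name, its methods, and the whitespace and case normalization are hypothetical choices made for illustration and are not described in the source.

```python
class PermittedVehicleRegistry:
    """In-memory stand-in for the database of permitted vehicle numbers."""

    def __init__(self) -> None:
        self._numbers: set[str] = set()

    @staticmethod
    def _normalize(number: str) -> str:
        # Assumed normalization: drop whitespace and ignore letter case.
        return "".join(number.split()).upper()

    def register(self, number: str) -> None:
        """Store the number of a newly permitted vehicle."""
        self._numbers.add(self._normalize(number))

    def is_permitted(self, number: str) -> bool:
        """Return True if the recognized number belongs to a permitted vehicle."""
        return self._normalize(number) in self._numbers
```

A recognized number would then be checked with `registry.is_permitted(recognized_number)`.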

Each smartphone 11 has a camera function for shooting a subject to obtain a moving image (video), and a function for measuring the distance from the device to the subject (a ranging function). The ranging function is realized by a "stereo camera" in which the smartphone 11 carries a plurality of lenses. It may instead be realized by providing a ranging sensor or the like in place of the stereo camera. The approach-road front-side smartphone 11a photographs the front of vehicles entering the site, the approach-road rear-side smartphone 11b photographs the rear of vehicles entering the site, the exit-road front-side smartphone 11c photographs the front of vehicles leaving the site, and the exit-road rear-side smartphone 11d photographs the rear of vehicles leaving the site. A standalone camera or another device having a camera may be used in place of the smartphone 11.

FIG. 2 illustrates how the smartphones 11 are installed on a site. As shown in the figure, the smartphones 11 are installed so that the roads used by vehicles entering and leaving the site are in frame. As a result, when a vehicle travels along the approach road (the road for entering the site) or the exit road (the road for leaving the site), the front and rear of the vehicle can be captured by the smartphones 11. Each smartphone 11 is desirably installed at a position from which it can capture the vehicle's license plate (or the number shown on it), the driver, the seat belt worn by the driver, dirt and damage on the vehicle (including dents), and predetermined equipment (hereinafter these may be collectively referred to as the "license plate and the like"). The "equipment" here is, for example, equipment that permitted vehicles are required to carry and that can be captured by the smartphone 11.

For example, as shown in FIG. 2, each smartphone 11 is desirably provided near the guard room and installed at a position (a post in the example of the figure) from which the entire vehicle can be photographed from diagonally above. In the example of the figure, the approach-road front-side smartphone 11a and the approach-road rear-side smartphone 11b are installed almost directly above the approach road with their backs facing each other. The smartphone 11a is positioned to photograph the entire front of a vehicle traveling along the approach road from diagonally above and in front, and the smartphone 11b is positioned to photograph the entire rear of that vehicle from diagonally above and behind.

Likewise, the exit-road front-side smartphone 11c and the exit-road rear-side smartphone 11d are installed almost directly above the exit road with their backs facing each other. The smartphone 11c is positioned to photograph the entire front of a vehicle traveling along the exit road from diagonally above and in front, and the smartphone 11d is positioned to photograph the entire rear of that vehicle from diagonally above and behind.

In this way, each smartphone 11 is desirably placed at a position that overlaps the vehicle when viewed from above, which makes it possible to simplify or omit various corrections in the vehicle width direction for the image data obtained by the smartphone 11. In the following description, each combination of smartphones 11 that photograph the same vehicle from the front and the rear, that is, the approach-road front-side smartphone 11a with the corresponding approach-road rear-side smartphone 11b, and the exit-road front-side smartphone 11c with the corresponding exit-road rear-side smartphone 11d, may be referred to as "a pair of smartphones 11".

For example, as illustrated in FIG. 4, the video from the exit-road front-side smartphone 11c shows the front license plate of the vehicle C1 directly, and shows the driver and the seat belt worn by the driver through the windshield. In both the moving images and the still images that show a vehicle, the license plate and the like are included in the vehicle and accompany it. Because the smartphone 11 has a ranging function, once the position of the vehicle in the frame is identified, the distance D from the smartphone 11 to the vehicle can be obtained. This distance D is detected by the distance detection unit 43 described later. A smartphone 11 may also be provided in the guard room in order to limit temperature rise and fall and aging of the smartphone 11 and to photograph the side of the vehicle more accurately. Collating the data obtained from the smartphones that photograph entering vehicles (11a, 11b) with the data from the smartphones that photograph exiting vehicles (11c, 11d) makes entry and exit management possible.

The communication network 12 is a network used for communication between each smartphone 11 and the information processing device 14. The communication network 12 may be either a wired or a wireless network. The Internet or the like may also be used as the communication network 12.

The edge server 13 is interposed between the communication network 12 and the information processing device 14 and stores, for example, an environment in which deep learning can be executed and various values used in deep learning (trained hyperparameters of the artificial intelligence, hyperparameters that serve as model structure information, weight data obtained when the training data is learned, and the reward function of a reinforcement learning model). Software for an environment in which deep learning can be executed (Python, anaconda, jupyter, opencv, TENSORFLOW, YOLO, etc.) is installed on the edge server 13.

The information processing device 14 is a server with higher performance than the edge server 13. In addition to object recognition from moving images and still images, it executes processing related to new deep learning training (reinforcement learning, additional learning) for vehicle monitoring and management. The information processing device 14 has a moving image processing engine 14a that performs processing such as image recognition on moving images, and a still image processing engine 14b that performs processing such as image recognition on still images.

The moving image processing engine 14a employs algorithms such as YOLO (You Only Look Once) and OPENCV (Open Source Computer Vision Library) and excels at recognizing objects from moving images in real time. Through machine learning, the moving image processing engine 14a can accurately and quickly recognize vehicles as "vehicles" even when their appearance (tilt, size, orientation) differs. This keeps missed recognition of vehicles in the moving image to a minimum. "Machine learning" is a technique for autonomously discovering laws and rules by learning iteratively from given information. The specific configuration of the moving image processing engine 14a is not limited to the above example, and other means suitable for object recognition from moving images may be adopted instead of YOLO and the like.

The still image processing engine 14b, on the other hand, employs TENSORFLOW, a machine learning library, and excels at recognizing objects from still images quickly and accurately. In particular, TENSORFLOW is a library capable of deep learning and can process multidimensional data structures smoothly. "Deep learning" is machine learning that uses a multilayer neural network (an information processing model that imitates the workings of the neural system of the human brain).

The still image processing engine 14b can recognize the license plate and the like with high accuracy from a still image that includes a vehicle, and can also recognize the number shown on the license plate. The specific configuration of the still image processing engine 14b is not limited to the above example, and other means suitable for object recognition from still images may be adopted instead of TENSORFLOW. The edge server 13 and the information processing device 14 may also be realized by the same server.

FIG. 3 is a block diagram of the main functional configuration of the information processing device 14. As shown in the figure, the information processing device 14 has a control unit 40, a communication unit 41, a moving image object recognition unit 42, a distance detection unit 43, a speed detection unit 44, a still image extraction unit 45, an exposure degree detection unit 46, an extraction unit 47, an image processing unit 48, a still image object recognition unit 49, a check execution unit 50, and an abnormality signal output unit 51.

The control unit 40 controls the functional units 41 to 51 appropriately so that the information processing device 14 operates normally. The main operations of the information processing device 14 are described in detail later. The communication unit 41 communicates with external devices including the smartphones 11 and the management server 15.

The moving image object recognition unit 42 recognizes a vehicle in a moving image using the moving-image object recognition function of the moving image processing engine 14a. When several vehicles appear in a moving image at the same time, the moving image object recognition unit 42 can recognize each of them separately. For example, when two vehicles enter the field of view of one smartphone 11, the two vehicles are recognized separately, and the information processing device 14 can process each of them in parallel. This object recognition is performed mainly in step S10, described later.

The distance detection unit 43 uses the distance measurement function of the smartphone 11 to detect the distance D (see FIG. 2) between the recognized vehicle and the smartphone 11. This distance detection is performed mainly in step S12, described later.

The speed detection unit 44 detects the speed of the recognized vehicle. The speed may be detected, for example, with a speed sensor installed near the smartphone 11 or from the movement of the vehicle in the moving image. This speed detection is performed mainly in step S11, described later.

The still image extraction unit 45 extracts a plurality of still images (frame images; for example, 30 images at 0.1-second intervals) from the moving image and holds them temporarily in a storage area. The still image extraction unit 45 can change, as appropriate, both the resolution of the extracted still images and the time interval at which they are extracted. This still image extraction is performed mainly in step S13, described later.

The exposure degree detection unit 46 detects the exposure degree of the recognized vehicle in one frame of the moving image (equivalent to one still image); this is hereinafter called the "first exposure degree". The first exposure degree may be the ratio of the size (area) of the vehicle to the size of the frame (the exposure ratio relative to the frame), or it may be the size of the vehicle itself. The size of the vehicle may be the area inside the vehicle's contour, or another quantity that indicates that size, such as the area inside the rectangle drawn with a broken line in FIG. 4 (a rectangle whose four sides touch the vehicle). Any other value that allows the vehicle's exposure to be compared between frames may also be treated as the first exposure degree. The first exposure degree is detected mainly in step S14, described later. The exposure degree detection unit 46 also detects the exposure degree of the recognized license plate in one frame of the moving image (equivalent to one still image); this is hereinafter called the "second exposure degree". The second exposure degree may be the ratio of the size of the license plate to the size of the vehicle (the exposure ratio relative to the vehicle), the ratio of the size of the license plate to the size of the frame (the exposure ratio relative to the frame), or the size of the license plate itself. As with the first exposure degree, any value that allows the license plate's exposure to be compared between frames may be treated as the second exposure degree. The second exposure degree is detected mainly in step S20, described later.
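As a minimal sketch of the area-ratio variants described above, the following Python code computes the first exposure degree as the ratio of the vehicle's bounding-rectangle area to the frame area, and the second exposure degree as the ratio of the license plate area to the vehicle area. The type `Box` and the function names are hypothetical helpers introduced only for illustration; the description permits other definitions (contour area, absolute size, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding rectangle in pixels (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self) -> int:
        # Negative sizes are treated as empty rectangles.
        return max(self.width, 0) * max(self.height, 0)


def first_exposure(vehicle: Box, frame_width: int, frame_height: int) -> float:
    """Ratio of the vehicle's bounding-rectangle area to the frame area."""
    frame_area = frame_width * frame_height
    if frame_area <= 0:
        raise ValueError("frame must have a positive area")
    return vehicle.area / frame_area


def second_exposure(plate: Box, vehicle: Box) -> float:
    """Ratio of the license plate's area to the vehicle's area."""
    if vehicle.area == 0:
        return 0.0
    return plate.area / vehicle.area
```

Because both values are plain ratios, they can be compared directly between frames, which is the only property the description requires of an exposure degree.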

From the still images extracted by the still image extraction unit 45, the extraction unit 47 extracts those that satisfy a predetermined condition on at least one of the first exposure degree and the second exposure degree (hereinafter called the "exposure degree condition" for convenience). The exposure degree condition may be, for example, that the exposure degree is at or above a predetermined value, that it is the highest, or that it ranks within a predetermined number counted from the highest. The specific exposure degree condition used in this embodiment is described in detail later. The higher the exposure degree, the more likely the still image object recognition unit 49 is to recognize the number easily, so using still images that satisfy the exposure degree condition makes this recognition more favorable. Instead of the exposure degree condition, another condition on how easily the still image object recognition unit 49 can perform object recognition may be set. This extraction is performed mainly in step S20, described later.
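The three example forms of the exposure degree condition (at or above a predetermined value, the highest, within a predetermined number from the highest) can be sketched as simple filters. This is an illustrative sketch only; `Still`, `at_least`, `highest`, and `top_n` are hypothetical names, and a single `exposure` field stands in for whichever exposure degree is being tested.

```python
from typing import List, NamedTuple, Optional


class Still(NamedTuple):
    """A still image reduced to its frame index and an exposure degree."""
    frame_id: int
    exposure: float


def at_least(stills: List[Still], minimum: float) -> List[Still]:
    """Condition: the exposure degree is at or above a predetermined value."""
    return [s for s in stills if s.exposure >= minimum]


def highest(stills: List[Still]) -> Optional[Still]:
    """Condition: the exposure degree is the highest (None if no stills)."""
    return max(stills, key=lambda s: s.exposure) if stills else None


def top_n(stills: List[Still], n: int) -> List[Still]:
    """Condition: within a predetermined number counted from the highest."""
    return sorted(stills, key=lambda s: s.exposure, reverse=True)[:n]
```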

The image processing unit 48 applies threshold processing, edge processing (edge detection), and tilt correction to a still image, in that order. The image processing unit 48 may perform only one or two of these processes, and may additionally perform other image processing that makes object recognition from a still image more favorable. This image processing is performed mainly in step S21, described later. These processes can also be regarded as correction suited to deep learning.

Here, "threshold processing" converts an image into a binary image (a single-channel image). For example, when converting to a black-and-white binary image, pixels whose channel value exceeds a predetermined threshold become white and pixels whose channel value does not exceed it become black. In a thresholded image, it is easy to pick out regions of differing brightness.
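The thresholding rule just described can be written in a few lines. The sketch below assumes an 8-bit grayscale image stored as a list of rows; the function name `threshold` and the default white/black values of 255 and 0 are illustrative assumptions.

```python
from typing import List


def threshold(gray: List[List[int]], level: int,
              white: int = 255, black: int = 0) -> List[List[int]]:
    """Binarize a grayscale image.

    Pixels whose value exceeds `level` become `white`; all other pixels
    (including those exactly equal to `level`) become `black`.
    """
    return [[white if value > level else black for value in row]
            for row in gray]
```

For example, `threshold([[10, 200], [128, 129]], 128)` yields `[[0, 255], [0, 255]]`, since a value equal to the threshold does not exceed it.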

"Edge processing" detects places (edges) in an image where the brightness (shading) or color changes abruptly. Because shading generally changes sharply at the contours and lines of objects in an image, edge processing can detect those contours and lines. Edges are important information that reflects the structure of an object, and edge processing is extremely useful for object recognition from still images. It is usually helpful to apply threshold processing to the image beforehand so that edge processing works more effectively.

The Canny edge detector may be adopted as the algorithm for edge processing. Compared with other algorithms (such as the Sobel filter or the Laplacian filter), edge processing with this algorithm (Canny processing) misses fewer contours and produces fewer false detections, detects a single contour at each point, and more readily detects the parts that are true edges. Canny processing smooths the image with a Gaussian filter, computes the gradient magnitude and direction from the derivatives of the smoothed image, and then applies non-maximum suppression and hysteresis thresholding.

"Tilt correction" rotates an image so as to remove the inclination when a straight line or similar feature detected in the image is tilted from the horizontal (or vertical) direction. For example, applying tilt correction so that the horizontally extending edge of the license plate in the image aligns with the horizontal direction lines up the characters of the number horizontally and makes the number easier to recognize. It is usually helpful to perform edge processing beforehand so that straight lines in the image are easier to detect.
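The geometry behind tilt correction can be illustrated with the two endpoints of a detected plate edge: the tilt is the angle of that edge from the horizontal, and rotating by the opposite angle makes the edge level. The sketch below works on point coordinates only and assumes the edge endpoints have already been found; rotating the full image by the same angle is left to whatever imaging library is in use. All function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def tilt_angle_deg(p0: Point, p1: Point) -> float:
    """Angle of the line p0 -> p1 measured from the horizontal axis."""
    return math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0]))


def rotate_point(p: Point, center: Point, angle_deg: float) -> Point:
    """Rotate point p about center by angle_deg."""
    a = math.radians(angle_deg)
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))


def level_edge(p0: Point, p1: Point) -> Tuple[Point, Point]:
    """Rotate the edge about p0 so that it becomes horizontal."""
    return p0, rotate_point(p1, p0, -tilt_angle_deg(p0, p1))
```

After `level_edge`, both endpoints share the same vertical coordinate, which corresponds to the characters of the number lining up horizontally.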

The still image object recognition unit 49 uses the still-image object recognition function of the still image processing engine 14b to recognize, from a still image, the vehicle's license plate, the number (the information shown on the license plate), the seat belt worn by the driver, dirt and scratches on the vehicle, and equipment. Object recognition from a still image is a concept that also covers recognizing information displayed in the still image, as when a vehicle's number is recognized. This object recognition is performed mainly in step S22, described later; however, the license plate itself is recognized in step S20 so that the second exposure degree described above can be detected. As noted above, in this embodiment the moving image object recognition unit 42 and the still image object recognition unit 49 use different object recognition methods. Compared with using the same method for both, this allows the method best suited to each of moving-image and still-image recognition to be used, so that object recognition is performed efficiently.

The still image object recognition unit 49 can recognize the number not only from an ordinary license plate without a design but also from a license plate bearing a so-called local number. FIG. 5 shows an example of such a recognizable local license plate. In addition to the numerals and symbols that identify the vehicle, the plate carries a design (here, a wave pattern). Even for such a plate, the still image object recognition unit 49 can use primitive shape determination to extract only the characters and symbols registered as the number and use them as the vehicle's identification number. In the example of FIG. 5, the number "Sumida-ku s1234" (墨田区ｓ１２３４) is therefore extracted.

The check execution unit 50 performs predetermined checks on the recognized number and other items. These checks are performed in step S24, described later, and their content is explained in detail later.

The abnormality signal output unit 51 outputs an abnormality signal that notifies an administrator or other responsible person of an abnormal check result. The signal can be, for example, an alert sound (an auditory signal) or a warning lamp (a visual signal). The abnormality signal is output mainly in step S26, described later.

2. Operation of the vehicle management system
Next, an overview of the operation of the vehicle management system 1 is given. The vehicle management system 1 first executes a series of processes whose main purpose is to extract still images from a moving image (hereinafter called "pre-stage processing" for convenience). The flow of this pre-stage processing is described below with reference to the flowchart in FIG. 6.

(1) Pre-stage processing
Each smartphone 11 installed on the premises shoots its field of view continuously and sends the moving image to the information processing device 14 in real time. The information processing device 14, in turn, continuously performs vehicle object recognition on this moving image. As a result, when a vehicle appears in the field of view of any smartphone 11, in other words when a vehicle enters the premises and travels along the passage within the field of view, the information processing device 14 recognizes that vehicle (step S10). In this way the information processing device 14 monitors vehicles entering the premises.

The shooting mode of each smartphone 11 may vary with conditions such as weather, time of day, and season. For example, in backlit or dark conditions, the HDR (High Dynamic Range) function of each smartphone 11 may be enabled automatically. This makes it possible to obtain a moving image that is as clear as possible under the conditions at the time.

When a vehicle is recognized (Yes in step S10), the information processing device 14 performs the subsequent processing (steps S11 to S18) for that vehicle. When several vehicles are recognized at the same time, that is, when several vehicles appear simultaneously in the field of view of the same smartphone 11 or vehicles appear simultaneously in the fields of view of several smartphones 11, all of these vehicles are recognized and the subsequent processing is performed individually for each vehicle.

First, the information processing device 14 detects the speed of the recognized vehicle (step S11). The detected vehicle speed is recorded in the management server 15 by the processing of step S24, described later. The information processing device 14 also monitors for the moment at which the distance D between the vehicle and the smartphone 11 falls to or below a predetermined threshold (step S12).

FIG. 7 illustrates how the distance D changes as the vehicle moves. As the figure shows, the distance D is smaller once the vehicle has advanced to a position where it appears larger and clearer than when it first came into view of the smartphone 11. The distance threshold is set to match the distance D at which the vehicle is expected to appear reasonably large. By performing step S12, the information processing device 14 can therefore detect the moment at which the vehicle begins to appear reasonably large.

When the distance D falls to or below the threshold (Yes in step S12), the information processing device 14 starts the still image extraction process (step S13). From then on, the information processing device 14 extracts still images from the moving image one after another until the still image extraction process ends. Because the extraction is held off until the distance D falls to or below the threshold, the extraction of unclear still images in which the vehicle appears small is kept to a minimum.

Instead of starting the still image extraction process when the distance D falls to or below the threshold, as in this embodiment, the process may be started, for example, when the orientation of the vehicle satisfies a predetermined condition. In that case, the information processing device 14 is provided with a functional unit that detects the orientation of the vehicle (a direction detection unit), and the still image extraction process starts when the detected orientation satisfies the predetermined condition. The orientation of the vehicle can be recognized from the state of the vehicle in the moving image. As the predetermined condition, taking as the reference direction the state in which the vehicle faces the smartphone 11 head-on, that is, the state in which the face of the front license plate points straight at the smartphone 11, the difference between the vehicle's orientation and the reference direction may be required to be at or below a predetermined value (for example, 30° or less). Because the extraction is then held off until the orientation satisfies the condition, the extraction of still images in which the vehicle's orientation is problematic (and number recognition is likely to be hindered) is kept to a minimum.
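The orientation condition can be expressed as an angular difference test. The sketch below assumes the direction detection unit reports the vehicle's heading in degrees, with 0° meaning that the vehicle faces the smartphone head-on; that convention, the function names, and the wrap-around handling are assumptions made for illustration, while the 30° limit is the example value given above.

```python
def heading_difference_deg(vehicle_heading_deg: float,
                           reference_heading_deg: float = 0.0) -> float:
    """Smallest absolute angle between two headings, in the range 0..180."""
    diff = (vehicle_heading_deg - reference_heading_deg) % 360.0
    return min(diff, 360.0 - diff)


def may_start_extraction(vehicle_heading_deg: float,
                         max_difference_deg: float = 30.0) -> bool:
    """True when the heading is within the allowed difference of head-on."""
    return heading_difference_deg(vehicle_heading_deg) <= max_difference_deg
```

Taking the difference modulo 360° means that headings of 340° and 20° are both treated as 20° away from the reference direction.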

After the still image extraction process has started, the information processing device 14 detects the first exposure degree of the vehicle for the latest frame of the moving image (step S14). The higher the first exposure degree, the more likely the vehicle is to appear clearly in the still image and the more likely the image is to help in recognizing the license plate and other items on the vehicle, so such a still image can be regarded as more important. The first exposure degree may be detected directly from the moving image or from the most recently extracted still image.

Next, the information processing device 14 adjusts the resolution of the still images to be extracted according to the detected first exposure degree (step S15). Specifically, the higher the first exposure degree, the higher the resolution of the extracted still images. Still images of high importance are thereby obtained preferentially, which makes the license plate and other items easier to recognize from them. Obtaining high-resolution still images at all times would require handling many still images of large data size and would tend to burden the system; adjusting the resolution according to the vehicle's exposure degree, as in this embodiment, mitigates this problem as far as possible.

The information processing device 14 further adjusts the time interval at which still images are extracted according to the detected first exposure degree (step S16). Specifically, the higher the first exposure degree, the shorter the extraction interval and the more still images extracted per unit time. Still images of high importance are thereby obtained preferentially, which makes the license plate and other items easier to recognize from them. Keeping the extraction interval short at all times would require handling a very large number of still images and would tend to burden the system; adjusting the interval according to the vehicle's exposure degree, as in this embodiment, mitigates this problem as far as possible.

The processing of steps S14 to S16 described above is repeated until the vehicle is no longer recognized (that is, until the vehicle has passed out of the range visible to the smartphone 11) (step S17). The resolution of the still images and the interval at which they are extracted are thus feedback-controlled according to the first exposure degree, so that still images of high importance are extracted efficiently.
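One way to picture the feedback control of steps S14 to S16 is a function that maps the first exposure degree to a resolution scale and an extraction interval. The description only requires that a higher exposure degree give a higher resolution and a shorter interval; the linear mapping, the parameter ranges, and the name `extraction_settings` below are assumptions chosen for illustration, with the exposure degree taken as a frame-relative ratio between 0 and 1.

```python
from typing import Tuple


def extraction_settings(first_exposure: float,
                        min_scale: float = 0.25,
                        max_scale: float = 1.0,
                        min_interval_s: float = 0.1,
                        max_interval_s: float = 1.0) -> Tuple[float, float]:
    """Return (resolution_scale, interval_seconds) for one control cycle.

    A higher first exposure degree gives a larger resolution scale (S15)
    and a shorter extraction interval (S16). Inputs outside 0..1 are clamped.
    """
    e = min(max(first_exposure, 0.0), 1.0)
    scale = min_scale + (max_scale - min_scale) * e
    interval = min_interval_s + (max_interval_s - min_interval_s) * (1.0 - e)
    return scale, interval
```

Calling this function once per loop iteration with the latest first exposure degree reproduces the repeat-until-the-vehicle-leaves structure of step S17.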

The timing chart in FIG. 8 illustrates when still images are extracted from the moving images obtained by a pair of smartphones 11 (the smartphones that shoot the front and the rear of the same vehicle). The colored portions of the chart indicate when still images are extracted. As the chart shows, still images of the front of the vehicle are acquired while the vehicle is in the field of view of the front-side smartphone (11a or 11c). The vehicle then moves on, and still images of its rear are acquired while it is in the field of view of the rear-side smartphone (11b or 11d). As FIG. 8 also shows, the higher the first exposure degree, the more high-resolution still images are acquired.

In this embodiment, one of the processes of steps S14 and S15 may be omitted, or the processes of steps S14 to S16 may be omitted altogether. When the vehicle is no longer recognized (Yes in step S17), the still image extraction process for that vehicle ends (step S18). Once the pre-stage processing for each smartphone 11 of a pair has been executed, still images of a plurality of frames have been obtained for one vehicle. The group of still images obtained in this way is used in the series of processes of steps S20 to S26, described later (hereinafter called "post-stage processing" for convenience).

(2) Post-stage processing
Next, the flow of the post-stage processing is described with reference to the flowchart in FIG. 9. The post-stage processing is executed each time the pre-stage processing for one vehicle finishes. To obtain still images for number recognition and other purposes, the information processing device 14 extracts, from the still images obtained by the pre-stage processing, those that satisfy the exposure degree condition described above (step S20).

In this embodiment, the following first to fifth conditions are set as the exposure degree condition, in descending order of priority.
First condition: the second exposure degree is the highest
Second condition: the second exposure degree is the second highest
Third condition: the second exposure degree is the third highest
Fourth condition: the first exposure degree is the highest
Fifth condition: the first exposure degree is the second highest
For the fourth and fifth conditions, however, still images that satisfy any of the first to third conditions are excluded. Conditions following the fifth condition may also be set as appropriate.
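The five prioritized conditions amount to an ordered candidate list: the three still images with the highest second exposure degree, followed by the two with the highest first exposure degree among the remainder. The sketch below builds that list; `Still` and `candidates_in_priority_order` are hypothetical names, and the counts of three and two are the example values of this embodiment.

```python
from typing import List, NamedTuple


class Still(NamedTuple):
    """A still image reduced to its frame index and two exposure degrees."""
    frame_id: int
    first_exposure: float   # vehicle exposure degree
    second_exposure: float  # license plate exposure degree


def candidates_in_priority_order(stills: List[Still],
                                 n_second: int = 3,
                                 n_first: int = 2) -> List[Still]:
    """Order candidates as conditions 1 to 5 would select them."""
    # Conditions 1 to 3: highest second exposure degree first.
    by_second = sorted(stills, key=lambda s: s.second_exposure,
                       reverse=True)[:n_second]
    chosen = {s.frame_id for s in by_second}
    # Conditions 4 and 5: highest first exposure degree, excluding
    # stills already selected under conditions 1 to 3.
    remaining = [s for s in stills if s.frame_id not in chosen]
    by_first = sorted(remaining, key=lambda s: s.first_exposure,
                      reverse=True)[:n_first]
    return by_second + by_first
```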

That is, the first time step S20 is performed, the first condition is in effect as the exposure degree condition. If number recognition then fails in step S23, the second condition is in effect the next time step S20 is performed. Likewise, the third condition is in effect for the following execution of step S20, the fourth condition for the one after that, and the fifth condition for the one after that.

Thus, in this embodiment, a predetermined number of still images (three in this example) are extracted with top priority in descending order of the second exposure degree, and then a predetermined number of still images (two in this example) are extracted in descending order of the first exposure degree. These numbers are only examples, and other values may be used. Because this embodiment shoots moving vehicles in particular, the license plate may be partly cut off or not properly visible in the moving image of the vehicle. Such situations are likely to be fatal to number recognition, but extracting still images with a high second exposure degree first, as in this embodiment, keeps them to a minimum. The first exposure degree also strongly affects the success rate of number recognition, so this embodiment gives weight to the first exposure degree next after the second when deciding which still images to extract. If the vehicle were shot from one direction only, the license plate might not be captured properly because of backlighting. In this embodiment, however, a pair of smartphones shoots the vehicle from the front and the rear, yielding both a moving image showing the front license plate and one showing the rear license plate, so this problem is avoided.

After step S20, the information processing device 14 applies the image processing described earlier to the extracted still image (step S21) and performs object recognition of the license plate and the like on the processed still image (step S22). For the license plate, the number (display information) shown on it is recognized. If the number is recognized successfully (Yes in step S23), the next step S24 is performed; if it is not (No in step S23), step S20 is performed again.

When step S20 is performed again, the second condition is applied instead of the first condition, as described earlier, and when step S20 is performed yet again, the third condition is applied. Thus, in this embodiment, the following are executed: a first process of extracting, from the still images taken from the video, the one with the highest second exposure degree; a second process of recognizing the number from the extracted still image; and a third process of, when that recognition does not succeed, extracting the still image with the next highest second exposure degree and recognizing the number again. The third process is repeated until the recognition succeeds.

Furthermore, in this embodiment, when the recognition does not succeed even after the third process has been repeated a predetermined number of times, the following are executed: a fourth process of extracting, from the still images taken from the video, the one with the highest first exposure degree; a fifth process of recognizing the number from the extracted still image; and a sixth process of, when that recognition does not succeed, extracting the still image with the next highest first exposure degree and recognizing the number again. The sixth process is repeated until the recognition succeeds.
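The extraction order described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the names `Still`, `candidate_order`, and `recognize_number` are hypothetical, the sketch assumes that the first and second exposure degrees refer to the vehicle and the license plate respectively, and it uses the example counts of three and two stills. It also does not skip a still in the second phase that was already tried in the first, since the text does not specify this.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional, Sequence


@dataclass(frozen=True)
class Still:
    frame_id: int
    first_exposure: float   # assumed: exposure degree of the vehicle
    second_exposure: float  # assumed: exposure degree of the license plate


def candidate_order(stills: Sequence[Still],
                    n_second: int = 3,
                    n_first: int = 2) -> Iterator[Still]:
    """Yield stills in the order of the first to fifth conditions."""
    # First to third conditions: highest second exposure degree first.
    by_second = sorted(stills, key=lambda s: s.second_exposure, reverse=True)
    yield from by_second[:n_second]
    # Fourth and fifth conditions: highest first exposure degree first.
    by_first = sorted(stills, key=lambda s: s.first_exposure, reverse=True)
    yield from by_first[:n_first]


def recognize_number(stills: Sequence[Still],
                     recognize: Callable[[Still], Optional[str]]) -> Optional[str]:
    """Try candidates in priority order until recognition succeeds."""
    for still in candidate_order(stills):
        number = recognize(still)  # hypothetical recognizer, None on failure
        if number is not None:
            return number
    return None  # every candidate failed
```

Because `candidate_order` is a generator, later candidates are only sorted and examined when the earlier ones have failed, which mirrors the retry loop between steps S20 and S23.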

Next, the information processing device 14 checks the number and other items based on the result of the object recognition and records the result in the management server 15 (step S24). More specifically, the information processing device 14 collates the recognized number against the database stored in the management server 15 (the numbers of all permitted vehicles). If the number matches that of any permitted vehicle, it is determined to be normal; otherwise it is determined to be abnormal. For the seatbelt worn by the driver and for the equipment, the information processing device 14 determines the result to be normal if the object is correctly recognized (that is, if it is correctly worn or equipped) and abnormal otherwise. For the vehicle speed already detected in step S11, the information processing device 14 determines the result to be normal if the speed does not exceed a predetermined allowable upper limit (for example, 30 km/h) and abnormal otherwise. For dirt and scratches on the vehicle, the information processing device 14 determines the result to be abnormal if dirt or scratches exceeding a predetermined standard are recognized, and normal otherwise.
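As an illustration of the checks in step S24, the following Python sketch maps one vehicle's recognition results to a normal/abnormal flag per item. The names `Observation`, `run_checks`, and `has_abnormality` are hypothetical, the permitted-vehicle database is reduced to an in-memory set, and the 30 km/h limit is only the example value given above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

SPEED_LIMIT_KMH = 30.0  # example upper limit from the description


@dataclass(frozen=True)
class Observation:
    number: Optional[str]         # recognized plate number, None if unreadable
    seatbelt_recognized: bool     # seatbelt correctly recognized as worn
    equipment_recognized: bool    # equipment correctly recognized
    speed_kmh: float              # vehicle speed detected earlier (step S11)
    excess_dirt_or_damage: bool   # dirt or scratches above the standard


def run_checks(obs: Observation,
               permitted_numbers: Set[str],
               speed_limit_kmh: float = SPEED_LIMIT_KMH) -> Dict[str, bool]:
    """Return True (normal) or False (abnormal) for each checked item."""
    return {
        "number": obs.number in permitted_numbers,
        "seatbelt": obs.seatbelt_recognized,
        "equipment": obs.equipment_recognized,
        "speed": obs.speed_kmh <= speed_limit_kmh,
        "dirt_or_damage": not obs.excess_dirt_or_damage,
    }


def has_abnormality(results: Dict[str, bool]) -> bool:
    """True when at least one item was determined to be abnormal."""
    return not all(results.values())
```

Keeping each item as a separate flag, rather than a single pass/fail value, lets the stored result show which check failed.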

Further, the information processing device 14 records these check results (determination results), the recognized number, the shooting date and time (the current date and time), the identification number of the smartphone 11 used for shooting, and the vehicle speed in the management server 15, associated with each vehicle. FIG. 10 illustrates the check results and other information recorded in the management server 15. In the example shown in the figure, a control number is assigned to each vehicle, and the information for each item is recorded. The management server 15 also stores the videos and still images used for object recognition, and holds the vehicle information in association with the information on the license plate and the like of that vehicle (including the driver and the equipment as well as the number). This makes it possible to manage the vehicle and the license plate and the like together. For the driver (person), the facial expression and similar information is also linked and held.
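The per-vehicle record described above might be modeled as in the following Python sketch. This is an assumption-laden illustration, not the schema of the management server 15: `VehicleRecord` and `RecordStore` are hypothetical names, the store is an in-memory list, and the control number format simply imitates the "No.00001" style shown for FIG. 10.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class VehicleRecord:
    control_number: str      # e.g. "No.00001", one per vehicle
    plate_number: str        # recognized number
    shot_at: datetime        # shooting date and time
    smartphone_id: str       # identification number of the smartphone used
    speed_kmh: float         # detected vehicle speed
    checks: Dict[str, bool]  # True = normal, False = abnormal


class RecordStore:
    """Hypothetical in-memory stand-in for the records on the server."""

    def __init__(self) -> None:
        self._records: List[VehicleRecord] = []

    def add(self, plate_number: str, shot_at: datetime, smartphone_id: str,
            speed_kmh: float, checks: Dict[str, bool]) -> VehicleRecord:
        # Assign sequential control numbers in the "No.00001" style.
        control_number = f"No.{len(self._records) + 1:05d}"
        record = VehicleRecord(control_number, plate_number, shot_at,
                               smartphone_id, speed_kmh, checks)
        self._records.append(record)
        return record

    def abnormal_records(self) -> List[VehicleRecord]:
        """Records in which at least one check was abnormal."""
        return [r for r in self._records if not all(r.checks.values())]
```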

After that, if none of the above check results is abnormal (No in step S25), the information processing device 14 ends the post-stage processing. If any check result is abnormal (Yes in step S25), the information processing device 14 outputs an abnormality signal (step S26), and the post-stage processing then ends. In the example shown in FIG. 10, the seatbelt check result is abnormal for control number No.00001 and the vehicle speed check result is abnormal for control number No.00002, so an abnormality signal is output for these results.

When an abnormality signal is output, the administrator who notices it can confirm the check result and take appropriate action. The user of the vehicle may also be notified that the check result was abnormal. The abnormality signal may be output continuously until the administrator performs a predetermined operation. In that case, if the abnormality signal continues to be output for a certain period of time or longer, a message reporting the abnormality (for example, an e-mail) may be sent to a terminal or the like carried by the administrator or the user.

3. Others
As described above, the vehicle management system 1 includes a video object recognition unit 42 that recognizes a vehicle (an example of a first object) from a video, a still image extraction unit 45 that takes still images including the vehicle from the video, and a still image object recognition unit 49 that recognizes, from the still images, a license plate and the like accompanying the vehicle (an example of a second object).

Therefore, the vehicle management system 1 makes it easy to recognize the license plate and the like accurately while keeping the operating load of the system low. That is, attempting to recognize the license plate and the like directly from the video raises concerns such as reduced recognition accuracy, whereas recognizing them from still images makes accurate object recognition easy. Furthermore, when the video contains no vehicle, the still image extraction is not performed, which eliminates wasted processing and reduces the operating load of the system.
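The two-stage flow summarized above can be sketched in Python as follows. This is only an illustration of the gating idea: `process_video` and its callable parameters are hypothetical stand-ins for the video object recognition, still image extraction, and still image object recognition stages, and frames are treated as opaque values.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Result = TypeVar("Result")


def process_video(frames: Sequence[Frame],
                  contains_vehicle: Callable[[Frame], bool],
                  extract_stills: Callable[[List[Frame]], List[Frame]],
                  recognize_plate: Callable[[Frame], Result]) -> List[Result]:
    """Run still-image recognition only when a vehicle appears in the video."""
    # Stage 1: video-level recognition of the first object (the vehicle).
    vehicle_frames = [f for f in frames if contains_vehicle(f)]
    if not vehicle_frames:
        # No vehicle in the video: skip still extraction and recognition.
        return []
    # Stage 2: take stills from frames that contain the vehicle.
    stills = extract_stills(vehicle_frames)
    # Stage 3: recognize the second object (license plate etc.) per still.
    return [recognize_plate(s) for s in stills]
```

The early return is where the load reduction comes from: the more expensive still-image stages are never invoked for footage without a vehicle.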

Furthermore, the vehicle management system 1 includes an extraction unit 47 that extracts, from the plurality of still images taken from the video, those that satisfy the exposure degree condition, and the still image object recognition unit 49 recognizes the license plate and the like included in the vehicle from the extracted still images. This allows the license plate and the like included in the vehicle to be recognized from still images more efficiently.

Although an embodiment of the present invention has been described above, the configuration of the present invention is not limited to that embodiment, and various modifications can be made without departing from the gist of the invention. The technical scope of the present invention is defined not by the description of the above embodiment but by the claims, and should be understood to include all modifications within the meaning and scope equivalent to the claims.

The present invention is applicable to systems and the like that perform object recognition from video.

1 Vehicle management system
11 Smartphone
11a Smartphone for photographing the front side on the approach road
11b Smartphone for photographing the rear side on the approach road
11c Smartphone for photographing the front side on the exit road
11d Smartphone for photographing the rear side on the exit road
12 Communication network
13 Edge server
14 Information processing device
14a Video processing engine
14b Still image processing engine
15 Management server
40 Control unit
41 Communication unit
42 Video object recognition unit
43 Distance detection unit
44 Speed detection unit
45 Still image extraction unit
46 Exposure degree detection unit
47 Extraction unit
48 Image processing unit
49 Still image object recognition unit
50 Check execution unit
51 Abnormality signal output unit

Claims

An image recognition system comprising:
an image object recognition unit that recognizes a first object from image information including a plurality of frames;
a still image extraction unit that extracts, from the image information, still images of a plurality of frames including the first object; and
a still image object recognition unit that recognizes, from the still images, a second object accompanying the first object,
wherein the still image extraction unit shortens the time interval at which the still images are extracted as the degree of exposure of the first object in the image information increases.

The image recognition system according to claim 1, wherein the first object is a vehicle, the second object is a license plate of the vehicle,
and the image information is an image captured by a camera.

The image recognition system according to claim 2, wherein a front camera that photographs the front side of a vehicle and a rear camera that photographs the rear side of the same vehicle are provided as the camera,
the image information includes both a front image captured by the front camera and a rear image captured by the rear camera, and
the still image object recognition unit recognizes the license plate on the front side of the vehicle from the still images extracted from the front image,
and recognizes the license plate on the rear side of the vehicle from the still images extracted from the rear image.

The image recognition system according to any one of claims 1 to 3, further comprising an extraction unit that extracts, from the still images extracted from the image information, those that satisfy a predetermined condition regarding how high the degree of exposure of at least one of the first object and the second object is,
wherein the still image object recognition unit recognizes, from the still images extracted by the extraction unit, the second object included in the first object.

The object recognition system according to claim 4, wherein the following are executed: a first process of extracting, from the extracted still images, the one in which a specific object, being the second object, has the highest degree of exposure;
a second process of recognizing display information on the second object from the extracted still image; and
a third process of, until the recognition succeeds, extracting from the still images the one in which the specific object has the next highest degree of exposure and repeating the second process,
and wherein, when the recognition in the second process does not succeed even after the number of repetitions reaches a predetermined number, the first to third processes are executed with the first object as the specific object.

The image recognition system according to claim 3, wherein the still image extraction unit makes the time interval at which the still images are extracted from the front image shorter than the time interval at which the still images are extracted from the rear image.