JP7199645B2

JP7199645B2 - Object recognition system and object recognition method

Info

Publication number: JP7199645B2
Application number: JP2021083156A
Authority: JP
Inventors: 洋一高野
Original assignee: Daifuku Co Ltd
Current assignee: Daifuku Co Ltd
Priority date: 2018-06-01
Filing date: 2021-05-17
Publication date: 2023-01-06
Anticipated expiration: 2038-06-01
Also published as: JP6913303B2; JP2019211921A; JP2021128796A

Description

The present invention relates to an object recognition system and an object recognition method that perform processing such as object recognition.

Conventionally, object recognition techniques for recognizing objects from images have been proposed. Such techniques are well suited to systems that handle captured images, and they can help automate various processes and make them more efficient.

As an example, Patent Literature 1 discloses an object recognition device that detects targets around a vehicle and recognizes an object based on the detected targets. Patent Literature 2 discloses an object recognition device that extracts, from an image captured by an imaging device mounted on a vehicle, a target region in which the shape of a recognition target exists, and selectively executes recognition processing for the recognition target on the extracted target region out of the entire image.

As specific means for object recognition, YOLO, which is suited to object recognition from moving images, and TENSORFLOW (registered trademark; the same applies hereinafter), which is suited to object recognition from still images, among others, have been developed. These means apply artificial intelligence (AI) technology, and machine learning can improve the accuracy of object recognition. Machine learning increasingly adopts advanced deep learning, and the accuracy and speed of object recognition have been improving.

Patent Literature 1: JP 2018-041396 A
Patent Literature 2: JP 2017-130155 A

When performing object recognition of another object (a second object) that accompanies a certain object (a first object) in a moving image, there are cases in which it is not easy to recognize the second object directly from the moving image with high accuracy. Note that an "object" in the present application is a concept corresponding to annotation data in machine learning, object recognition, and the like. For example, when trying to recognize the number shown on a license plate (an example of the second object) from a moving image captured by a camera that monitors a vehicle (an example of the first object), recognizing the number requires more precise recognition processing than recognizing the vehicle itself. Attempting to recognize the number directly from the moving image may therefore lower the recognition accuracy.

To solve this problem, one conceivable approach is to extract still images from the moving image and recognize the number by object recognition on the extracted still images. However, if, for example, the still image of every frame is extracted uniformly from the moving image, unnecessary still images that do not contain a vehicle are also extracted, and the operating load on the system may become excessive. In addition, if the extracted still images include many such unnecessary images, the speed and accuracy of object recognition may decrease.

In view of the above problems, an object of the present invention is to provide an object recognition system and an object recognition method that make it easy to accurately recognize a second object accompanying a first object in a moving image while suppressing the operating load on the system.

An object recognition system according to the present invention comprises: a moving image object recognition unit that recognizes a first object from a moving image; a still image extraction unit that extracts, from the moving image, still images of a plurality of frames including the first object; and a still image object recognition unit that recognizes, from the still images, a second object accompanying the first object. This configuration makes it easy to accurately recognize the second object accompanying the first object in the moving image while suppressing the operating load on the system. Note that "accompanying" here is not limited to the case in which the second object is included in the first object; it also covers other cases in which the times at which the two objects appear in the moving image are closely related.

More specifically, the above configuration may include an extraction unit that selects, from among the still images extracted from the moving image, those that satisfy a predetermined condition regarding the degree of exposure of at least one of the first object and the second object, and the still image object recognition unit may recognize the second object included in the first object from the selected still images. This configuration makes it possible to recognize the second object included in the first object from the still images more efficiently.

More specifically, the above configuration may execute: a first process of selecting, from among the extracted still images, the one in which the degree of exposure of a specific object, which is the second object, is highest; a second process of recognizing display information within the second object from the selected still image; and a third process of selecting the still image with the next-highest degree of exposure of the specific object and repeating the second process until the recognition succeeds. "Display information" here means characters, figures, symbols, or a combination of these, and may correspond, for example, to the number (vehicle registration number) in the present application.

More specifically, when the recognition in the second process does not succeed even after the number of repetitions reaches a predetermined number, the above configuration may execute the first through third processes with the specific object treated as the first object.
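The first through third processes, together with the retry limit, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation disclosed in the patent: the function name `recognize_by_exposure`, the `(still_id, exposure)` tuple layout, and the `read_display_info` callback are hypothetical, and the fallback of treating the specific object as the first object is left to the caller.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical representation: (still-image identifier, exposure of the specific object)
Still = Tuple[str, float]


def recognize_by_exposure(
    stills: Sequence[Still],
    read_display_info: Callable[[str], Optional[str]],
    max_attempts: int,
) -> Optional[str]:
    """Try stills in descending order of exposure until recognition succeeds.

    Returns the recognized display information, or None if every attempt
    (up to max_attempts) fails.
    """
    # First process: rank the stills so the highest exposure comes first.
    ranked = sorted(stills, key=lambda s: s[1], reverse=True)
    for still_id, _exposure in ranked[:max_attempts]:
        # Second process: attempt to read the display information.
        text = read_display_info(still_id)
        if text is not None:
            return text
        # Third process: fall through to the still with the next-highest exposure.
    return None
```

A `None` result corresponds to the case where the predetermined number of attempts is exhausted, at which point a caller could restart the procedure with the specific object treated as the first object.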

More specifically, in the above configuration, the still image extraction unit may adjust at least one of the resolution of the extracted still images and the time interval at which the still images are extracted, according to the degree of exposure of the first object in the moving image.

This configuration makes it possible to efficiently extract still images of high importance while keeping the load on the system as low as possible. More specifically, the above configuration may also apply correction processing suited to deep learning to the still image, among the plurality of extracted still images, in which the degree of exposure of the second object is highest.

More specifically, in the above configuration, the moving image may be video captured with a camera, the system may include a distance detection unit that detects the distance between the camera and the first object, and the still image extraction unit may start extracting the still images when the distance becomes equal to or less than a predetermined value. This configuration minimizes the extraction of unclear still images in which the first object appears small.
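The distance-based trigger can be illustrated with a short sketch. It assumes that one distance reading is available per frame and that distances are expressed in metres; the function name `extraction_start_index` and both parameter names are hypothetical.

```python
from typing import Optional, Sequence


def extraction_start_index(
    distances_m: Sequence[float], threshold_m: float
) -> Optional[int]:
    """Return the index of the first frame whose camera-to-object distance
    is at or below the threshold, or None if the object never gets that close."""
    for index, distance in enumerate(distances_m):
        if distance <= threshold_m:
            return index
    return None
```

For example, with per-frame distances of 12.0, 9.5, 7.9, and 6.0 and a threshold of 8.0, extraction would begin at the third frame (index 2).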

More specifically, the above configuration may include a direction detection unit that detects the orientation of the first object, and the still image extraction unit may start extracting the still images when the orientation satisfies a predetermined condition. This configuration minimizes the extraction of still images in which the orientation of the first object is problematic.

More specifically, the above configuration may hold information on the first object and information on the second object in association with each other. This configuration makes it possible to manage the first object and the second object together. Furthermore, in the above configuration, the moving image object recognition unit and the still image object recognition unit may use different object recognition techniques. This configuration allows object recognition to be performed efficiently by using the technique best suited to each of object recognition from moving images and object recognition from still images.

An object recognition method according to the present invention includes: a moving image object recognition step of recognizing a first object from a moving image; a still image extraction step of extracting, from the moving image, a still image including the first object; and a still image object recognition step of recognizing, from the still image, a second object accompanying the first object.

The object recognition system and the object recognition method according to the present invention make it easy to accurately recognize a second object accompanying a first object in a moving image while suppressing the operating load on the system.

FIG. 1 is a block diagram of the configuration of a vehicle management system 1 according to the present embodiment.
FIG. 2 is an explanatory diagram showing how the smartphones 11 are installed within a site.
FIG. 3 is a block diagram of the functional configuration of an information processing device 14.
FIG. 4 is an explanatory diagram of an image captured by a smartphone 11.
FIG. 5 is an explanatory diagram of local license plates.
FIG. 6 is a flowchart of the pre-stage processing.
FIG. 7 is an explanatory diagram showing how the distance D changes as the vehicle moves.
FIG. 8 is a timing chart of the extraction of still images from a moving image.
FIG. 9 is a flowchart of the post-stage processing.
FIG. 10 is an explanatory diagram of information such as check results.

A vehicle management system according to an embodiment of the present invention (one form of the object recognition system according to the present invention) is described below with reference to the drawings.

1. Configuration of Vehicle Management System
FIG. 1 is a block diagram showing the schematic configuration of the vehicle management system 1 according to the present embodiment. As shown in the figure, the vehicle management system 1 includes an approach-road front-side photographing smartphone 11a, an approach-road rear-side photographing smartphone 11b, an exit-road front-side photographing smartphone 11c, an exit-road rear-side photographing smartphone 11d, a communication network 12, an edge server 13, an information processing device 14, and a management server 15. In the following description, the smartphones 11a to 11d may be collectively referred to as "smartphones 11". In the drawings, a smartphone may be abbreviated as "SP".

In the present embodiment, as an example, a plurality of smartphones 11 are installed in each of a plurality of sites (sites 1 to 3 in the example shown in FIG. 1). A "site" in the present embodiment is a place that vehicles can enter and leave with the permission of the business operator or other party that manages the site (hereinafter referred to as the "administrator"); an example is a parking lot owned by the administrator. When a site has multiple vehicle entrances and exits, installing smartphones 11 near all of them makes it possible to monitor every vehicle entering or leaving the site without omission. In the present embodiment, a "vehicle" is an automobile equipped with a license plate, and the "number" is the vehicle registration number shown on that license plate. License plates are generally provided on both the front and the rear of a vehicle.

The edge server 13, the information processing device 14, and the management server 15, on the other hand, are installed together in a management center. The "management center" in the present embodiment is the place where vehicles entering and leaving the sites are managed; an example is a room in a building owned by the administrator. The vehicle management system 1 automatically monitors vehicles entering each site and manages them collectively.

The management server 15 is a server used to manage, among other things, the vehicles that enter each site. The numbers of all vehicles permitted to enter the sites (hereinafter referred to as "permitted vehicles" for convenience) are registered in the management server 15 as a database. Each time an administrator or other user enters the number of a new permitted vehicle, the management server 15 stores this information in the database. The management server 15 may also be provided in a data center accessed via the Internet.

The smartphone 11 has a camera function for photographing a subject to obtain a moving image (video), and also has a function for measuring the distance from itself to the subject (a ranging function). The ranging function is realized by a "stereo camera", in which the smartphone 11 is equipped with a plurality of lenses. The ranging function may instead be realized by providing a ranging sensor or the like in place of the stereo camera. The approach-road front-side photographing smartphone 11a photographs the front of vehicles entering the site, the approach-road rear-side photographing smartphone 11b photographs the rear of vehicles entering the site, the exit-road front-side photographing smartphone 11c photographs the front of vehicles leaving the site, and the exit-road rear-side photographing smartphone 11d photographs the rear of vehicles leaving the site. A standalone camera or another device equipped with a camera may be used in place of the smartphone 11.

FIG. 2 illustrates how the smartphones 11 are installed within a site. As shown in the figure, the smartphones 11 are installed so that the roads used by vehicles entering and leaving the site are within their field of view. As a result, when a vehicle travels along the approach road (the road for entering the site) or the exit road (the road for leaving the site), the front and rear of the vehicle can be captured by the smartphones 11. Each smartphone 11 is desirably installed at a position from which it can capture the vehicle's license plate (or the number shown on it), the driver, the seat belt worn by the driver, dirt and damage on the vehicle (including dents), and predetermined equipment (hereinafter, these may be collectively referred to as the "license plate and the like"). The "equipment" here is, for example, something that permitted vehicles are required to carry and that can be captured by the smartphone 11.

For example, as shown in FIG. 2, each smartphone 11 is provided near the guardhouse and is desirably installed at a position (a support post in the example of this figure) from which the entire vehicle can be photographed obliquely from above. In the example of this figure, the approach-road front-side photographing smartphone 11a and the approach-road rear-side photographing smartphone 11b are installed almost directly above the approach road with their rear portions facing each other. The smartphone 11a is positioned to photograph the entire front of a vehicle traveling along the approach road from obliquely above and in front, and the smartphone 11b is positioned to photograph the entire rear of that vehicle from obliquely above and behind.

Similarly, the exit-road front-side photographing smartphone 11c and the exit-road rear-side photographing smartphone 11d are installed almost directly above the exit road with their rear portions facing each other. The smartphone 11c is positioned to photograph the entire front of a vehicle traveling along the exit road from obliquely above and in front, and the smartphone 11d is positioned to photograph the entire rear of that vehicle from obliquely above and behind.

In this way, each smartphone 11 is desirably placed at a position that overlaps the vehicle when viewed from above, which makes it possible to simplify or omit various corrections in the vehicle width direction for the image data obtained by the smartphone 11. In the following description, each combination of smartphones 11 that photograph the same vehicle from the front and the rear, that is, the combination of the approach-road front-side photographing smartphone 11a and its corresponding approach-road rear-side photographing smartphone 11b, and the combination of the exit-road front-side photographing smartphone 11c and its corresponding exit-road rear-side photographing smartphone 11d, may be referred to as a "pair of smartphones 11".

For example, as illustrated in FIG. 4, the video from the exit-road front-side photographing smartphone 11c shows the front license plate of a vehicle C1 directly, and shows the driver and the seat belt worn by the driver through the windshield. In both the moving images and the still images in which the vehicle appears, the license plate and the like are included in the vehicle and accompany it. Because the smartphone 11 has a ranging function, once the position of the vehicle within the field of view is identified, the distance D from the smartphone 11 to the vehicle can be obtained. This distance D is detected by a distance detection unit 43, described later. The smartphone 11 may also be installed in the guardhouse in order to suppress temperature rises and drops and aging degradation of the smartphone 11 and to photograph the side of the vehicle more accurately. Collating the data obtained from the smartphones that photograph entering vehicles (11a, 11b) with the data obtained from the smartphones that photograph exiting vehicles (11c, 11d) makes it possible to manage entry and exit.

The communication network 12 is the network used for communication between each smartphone 11 and the information processing device 14. The communication network 12 may be either a wired or a wireless network. The Internet or the like may also be used as the communication network 12.

The edge server 13 is interposed between the communication network 12 and the information processing device 14 and stores, for example, an environment in which deep learning can be executed and various values used in deep learning (trained hyperparameters of the artificial intelligence, hyperparameters that serve as structural information of the model, weight data obtained when the training data is learned, and the reward function of a reinforcement learning model). Software that provides an environment for executing deep learning (Python, anaconda, jupyter, opencv, TENSORFLOW, YOLO, etc.) is installed on the edge server 13.

The information processing device 14 is a server with higher performance than the edge server 13. In addition to object recognition from moving images and still images, it executes processing related to new deep learning training (reinforcement learning and additional learning) for vehicle monitoring and management. The information processing device 14 has a moving image processing engine 14a, which performs processing such as image recognition on moving images, and a still image processing engine 14b, which performs processing such as image recognition on still images.

The moving image processing engine 14a employs algorithms such as YOLO (You Only Look Once) and OpenCV (Open Source Computer Vision Library) and excels at recognizing objects from moving images in real time. Through machine learning, the moving image processing engine 14a can accurately and quickly recognize vehicles that differ in appearance (inclination, size, orientation) as "vehicles". This minimizes the number of vehicles in a moving image that go unrecognized. "Machine learning" is a technique for autonomously discovering laws and rules by learning iteratively from given information. The specific configuration of the moving image processing engine 14a is not limited to the above example; other means suited to object recognition from moving images may be employed in place of YOLO and the like.

The still image processing engine 14b, on the other hand, employs the machine learning library TENSORFLOW and excels at recognizing objects from still images quickly and accurately. In particular, TENSORFLOW is a library that supports deep learning and can process multidimensional data structures smoothly. "Deep learning" is machine learning that uses a multi-layered neural network (an information processing model that imitates the workings of the human nervous system).

The still image processing engine 14b can recognize the license plate and the like from a still image containing a vehicle with high accuracy, and can also recognize the number shown on the license plate. The specific configuration of the still image processing engine 14b is not limited to the above example; other means suited to object recognition from still images may be employed in place of TENSORFLOW. The edge server 13 and the information processing device 14 may also be implemented on the same server.

FIG. 3 is a block diagram of the main functional configuration of the information processing device 14. As shown in the figure, the information processing device 14 has a control unit 40, a communication unit 41, a moving image object recognition unit 42, a distance detection unit 43, a speed detection unit 44, a still image extraction unit 45, an exposure detection unit 46, an extraction unit 47, an image processing unit 48, a still image object recognition unit 49, a check execution unit 50, and an abnormality signal output unit 51.

The control unit 40 appropriately controls the functional units 41 to 51 so that the information processing device 14 operates normally. The main operations of the information processing device 14 are described in detail later. The communication unit 41 communicates with external devices, including each smartphone 11 and the management server 15.

The moving image object recognition unit 42 recognizes vehicles from a moving image by using the moving image object recognition function of the moving image processing engine 14a. For a moving image in which a plurality of vehicles appear at the same time, the moving image object recognition unit 42 can recognize each of them separately. For example, when two vehicles come into the field of view of one smartphone 11, the two vehicles can be recognized separately, and the information processing device 14 can proceed in parallel with the processing for each. This object recognition is performed mainly in step S10, described later.

The distance detection unit 43 uses the ranging function of the smartphone 11 to detect the distance D (see FIG. 2) between the recognized vehicle and the smartphone 11. This distance detection is performed mainly in step S12, described later.

The speed detection unit 44 detects the speed of the recognized vehicle. The speed may be detected, for example, by using a speed sensor installed near the smartphone 11 or by estimating it from the motion of the vehicle in the moving image. This speed detection is performed mainly in step S11, described later.

The still image extraction unit 45 extracts a plurality of still images from the moving image (frame images, for example 30 images at 0.1-second intervals) and holds them temporarily in a storage area. The still image extraction unit 45 can change, as appropriate, both the resolution of the still images it extracts and the time interval at which it extracts them. This still image extraction is performed mainly in step S13, described later.

The exposure detection unit 46 detects the degree of exposure of the recognized vehicle in one frame of the moving image (equivalent to one still image); this is hereinafter called the "first exposure". The first exposure may be the ratio of the size (area) of the vehicle in the frame to the size of the frame (the exposure ratio relative to the frame), or it may be the size of the vehicle itself. The size of the vehicle may be taken as the area inside the vehicle's contour, or as another quantity that indicates that size, for example the area inside the rectangle shown by broken lines in FIG. 4 (a rectangle whose four sides touch the vehicle). Any other value that allows the degree of vehicle exposure to be compared between frames may also be treated as the first exposure. The first exposure is detected mainly in step S14, described later.

The exposure detection unit 46 also detects the degree of exposure of the recognized license plate in one frame of the moving image (equivalent to one still image); this is hereinafter called the "second exposure". The second exposure may be the ratio of the size of the license plate in the frame to the size of the vehicle (the exposure ratio relative to the vehicle), the ratio of the size of the license plate to the size of the frame (the exposure ratio relative to the frame), or the size of the license plate itself. As with the first exposure, any value that allows the degree of license plate exposure to be compared between frames can be treated as the second exposure. The second exposure is detected mainly in step S20, described later.
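The ratio-based definitions of the first and second exposure can be illustrated with a minimal sketch. It assumes the bounding-rectangle variant (the rectangle whose four sides touch the object) and uses a hypothetical `Box` type; none of these names come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding rectangle in pixels (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def area(self) -> float:
        # Negative sizes are treated as an empty rectangle.
        return max(self.width, 0.0) * max(self.height, 0.0)


def first_exposure(vehicle: Box, frame_width: int, frame_height: int) -> float:
    """Vehicle rectangle area divided by frame area (exposure ratio to the frame)."""
    frame_area = frame_width * frame_height
    if frame_area <= 0:
        raise ValueError("frame must have a positive area")
    return vehicle.area / frame_area


def second_exposure(plate: Box, vehicle: Box) -> float:
    """Plate rectangle area divided by vehicle rectangle area (exposure ratio to the vehicle)."""
    if vehicle.area <= 0:
        return 0.0  # no visible vehicle, so the plate cannot be exposed relative to it
    return plate.area / vehicle.area
```

For a 640 x 360 vehicle rectangle in a 1920 x 1080 frame, `first_exposure` gives 1/9; a 64 x 36 plate rectangle on that vehicle gives a `second_exposure` of 0.01. Because only comparisons between frames matter, any monotonic measure of size would serve equally well.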

From the plurality of still images extracted by the still image extraction unit 45, the extraction unit 47 extracts those that satisfy a predetermined condition on at least one of the first exposure and the second exposure (hereinafter called the "exposure condition" for convenience). The exposure condition may be, for example, that the exposure value is at least a predetermined value, that the exposure is the highest, or that the still image ranks within a predetermined number counted from the highest exposure. The specific exposure condition used in this embodiment is described in detail later. The higher the exposure, the more likely it is that the still image object recognition unit 49 can recognize the number easily, so using still images that satisfy the exposure condition makes this object recognition more favorable. Instead of the exposure condition, another condition relating to how easily the still image object recognition unit 49 can perform object recognition may be set. This extraction is performed mainly in step S20, described later.

The image processing unit 48 applies threshold processing, edge processing (edge detection), and tilt correction processing to a still image, in that order. The image processing unit 48 may instead perform only one or two of these processes, and may additionally perform other image processing that makes object recognition from the still image more favorable. This image processing is performed mainly in step S21, described later. These processes can also be regarded as correction processing suited to deep learning.

Here, "threshold processing" converts an image into a binary image (a single-channel image). For example, when converting to a black-and-white binary image, pixels whose channel value exceeds a predetermined threshold become white pixels, and pixels whose channel value does not exceed the threshold become black pixels. After threshold processing, it is easier to pick out portions of the image that differ in brightness.
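The thresholding rule described above can be sketched in a few lines. This is only an illustration on a grayscale image stored as nested lists; the function name and the default level of 128 are assumptions, not values from the patent.

```python
def threshold(gray, level=128):
    """Binarize a grayscale image given as a list of rows of 0-255 values.

    Pixels strictly above `level` become white (255); all others become black (0).
    """
    return [[255 if px > level else 0 for px in row] for row in gray]


binary = threshold([[0, 128, 129], [255, 100, 200]])
# binary == [[0, 0, 255], [255, 0, 255]]
```

A pixel exactly equal to the level does not exceed it and therefore maps to black, matching the "exceeds / does not exceed" wording of the description.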

"Edge processing" detects places in an image where brightness (shading) or color changes abruptly (edges). Because shading generally changes sharply along the contours and lines of objects in an image, edge processing can detect those contours and lines. Edges carry important information about the structure of an object, so edge processing is very useful for object recognition from still images. To make edge processing more effective, it is usually helpful to apply threshold processing to the image beforehand.

The Canny edge detector may be used as the algorithm for edge processing. Compared with other algorithms (such as the Sobel filter or the Laplacian filter), edge processing with this algorithm (Canny processing) misses fewer contours, produces fewer false detections, yields a single contour at each point, and more readily detects portions that are true edges. Canny processing smooths the image with a Gaussian filter, computes the gradient magnitude and direction from the derivatives of the smoothed image, and then applies non-maximum suppression and hysteresis thresholding.
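As an illustration of one stage of the Canny pipeline, the sketch below computes the gradient magnitude and direction with 3 x 3 Sobel kernels. It is not a full Canny detector: Gaussian smoothing, non-maximum suppression, and hysteresis thresholding are omitted, and the function name and list-of-rows image format are assumptions.

```python
import math

# 3x3 Sobel kernels for horizontal (x) and vertical (y) derivatives.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def gradient(gray):
    """Return (magnitude, angle_deg) grids for a grayscale image (list of rows).

    Border pixels are left at zero because the 3x3 window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx))
    return mag, ang
```

For an image with a vertical step edge (dark columns on the left, bright columns on the right), the gradient at interior pixels next to the step points along the x axis, so the angle is 0 degrees and the magnitude is large. The later Canny stages would thin and link such responses into single-pixel contours.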

"Tilt correction processing" rotates the image to remove the tilt when a straight line or similar feature detected in the image is inclined from the horizontal (or vertical) direction. For example, applying tilt correction so that the horizontally extending edge of the license plate in the image coincides with the horizontal direction lines up the character string of the number horizontally, which makes the number easier to recognize. To make straight lines in the image easier to detect, it is usually helpful to perform edge processing beforehand.
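The geometry behind tilt correction can be sketched as follows, assuming the two end points of the plate's horizontal edge have already been found (for example from the edge image). The function names are hypothetical, and only point rotation is shown rather than resampling of the whole image.

```python
import math


def tilt_angle_deg(p_left, p_right):
    """Angle of the segment p_left -> p_right relative to the horizontal axis."""
    dx = p_right[0] - p_left[0]
    dy = p_right[1] - p_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate `point` about `center` by `angle_deg` using the standard 2D rotation matrix."""
    a = math.radians(angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(a) - y * math.sin(a),
            center[1] + x * math.sin(a) + y * math.cos(a))


left, right = (0.0, 0.0), (100.0, 100.0)
angle = tilt_angle_deg(left, right)            # 45 degrees of tilt
leveled = rotate_point(right, left, -angle)    # rotate by the opposite angle
```

Rotating by the negative of the measured angle brings the edge back onto the horizontal: `leveled` has a y coordinate of approximately 0 and keeps the original edge length. Applying the same rotation to every pixel would level the plate's character string.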

The still image object recognition unit 49 uses the still-image object recognition function of the still image processing engine 14b to recognize, from a still image, the vehicle's license plate, its number (the information shown on the license plate), the seat belt worn by the driver, dirt and scratches on the vehicle, and equipment. Object recognition from a still image is a concept that also covers recognizing information displayed in the still image, as when a vehicle number is recognized. This object recognition is performed mainly in step S22, described later, but the license plate itself is recognized in step S20 so that the second exposure described above can be detected. As noted above, in this embodiment the moving image object recognition unit 42 and the still image object recognition unit 49 use different object recognition techniques. Compared with using the same technique for both, this allows a technique suited to each of moving-image recognition and still-image recognition to be used, so that object recognition is performed efficiently.

The still image object recognition unit 49 can recognize the number not only on an ordinary license plate without a design but also on a license plate bearing a so-called local number. FIG. 5 shows an example of a recognizable license plate bearing a local number (a local license plate). In addition to the numerals and symbols that identify the vehicle, a design (here, a wave pattern) is drawn on the plate. Even for such a plate, the still image object recognition unit 49 can use primitive shape judgment to extract only the characters and symbols registered as the number and use them as the vehicle's identification number. In the example shown in FIG. 5, the number "Sumida-ku s1234" is therefore extracted.

The check execution unit 50 performs predetermined checks on the recognized number and other items. These checks are performed in step S24, described later, and their content is explained in detail there.

The abnormality signal output unit 51 outputs an abnormality signal that notifies an administrator or other responsible person of an abnormal check result. The signal can be, for example, an alert sound (auditory signal) or a warning lamp (visual signal). The abnormality signal is output mainly in step S26, described later.

2. Operation of the vehicle management system
Next, an outline of the operation of the vehicle management system 1 is described. The vehicle management system 1 first executes a series of processes whose main purpose is to extract still images from the moving image (hereinafter called "pre-stage processing" for convenience). The flow of this pre-stage processing is described below with reference to the flowchart in FIG. 6.

(1) Pre-stage processing
Each smartphone 11 installed on the site shoots its field of view continuously, and the resulting moving image is sent to the information processing device 14 in real time. The information processing device 14, in turn, continuously performs vehicle object recognition on this moving image. Consequently, when a vehicle appears in the field of view of any smartphone 11, in other words when a vehicle enters the site and passes along the road within that field of view, the information processing device 14 can recognize the vehicle (step S10). In this way the information processing device 14 monitors vehicles entering the site.

The shooting mode of each smartphone 11 may be varied according to conditions such as weather, time of day, and season. For example, in backlit or dark conditions, the HDR (High Dynamic Range) function of each smartphone 11 may be enabled automatically. This yields a moving image that is as clear as possible under the prevailing conditions.

When a vehicle is recognized (Yes in step S10), the information processing device 14 performs the subsequent processing (steps S11 to S18) for that vehicle. When several vehicles are recognized at the same time, that is, when several vehicles appear simultaneously in the field of view of the same smartphone 11 or vehicles appear simultaneously in the fields of view of several smartphones 11, the information processing device 14 recognizes all of them and performs the subsequent processing individually for each vehicle.

First, the information processing device 14 detects the speed of the recognized vehicle (step S11). The detected vehicle speed is recorded in the management server 15 by the processing of step S24, described later. The information processing device 14 also monitors for the moment at which the distance D between the vehicle and the smartphone 11 becomes equal to or less than a predetermined threshold (step S12).

FIG. 7 illustrates how the distance D changes as the vehicle moves. As shown in the figure, the distance D is smaller when the vehicle has advanced to a position where it appears larger and clearer than when it first came into view of the smartphone 11. The distance threshold is set to match the distance D at which the vehicle is expected to appear reasonably large. By performing step S12, the information processing device 14 can therefore detect the moment at which the vehicle begins to appear reasonably large.

When the distance D becomes equal to or less than the threshold (Yes in step S12), the information processing device 14 starts the still image extraction process (step S13). From then on, the information processing device 14 extracts still images from the moving image one after another until the still image extraction process ends. Deferring the still image extraction process until the distance D is at or below the threshold keeps the extraction of unclear still images, in which the vehicle appears small, to a minimum.

Instead of starting when the distance D becomes equal to or less than the threshold, as in this embodiment, the still image extraction process may start, for example, when the orientation of the vehicle satisfies a predetermined condition. In that case, the information processing device 14 is provided with a functional unit that detects the orientation of the vehicle (a direction detection unit), and the still image extraction process starts when the detected orientation satisfies the predetermined condition. The orientation of the vehicle can be recognized from the state of the vehicle in the moving image. Taking as the reference direction the state in which the vehicle directly faces the smartphone 11, that is, the state in which the face of the front license plate points straight at the smartphone 11, the predetermined condition may be, for example, that the difference between the vehicle's orientation and the reference direction is at most a predetermined value (for example, 30° or less). Deferring the still image extraction process until the orientation satisfies the condition keeps the extraction of still images with a problematic vehicle orientation (which tends to hinder number recognition) to a minimum.
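The two alternative start conditions for still image extraction (distance at or below a threshold, or vehicle orientation within a tolerance of the reference direction) can be sketched as simple predicates. The function names and the representation of orientation as a heading in degrees are assumptions made for illustration; only the 30° example value comes from the description.

```python
def should_start_by_distance(distance, threshold):
    """Start extraction once the vehicle is at or inside the distance threshold."""
    return distance <= threshold


def heading_difference_deg(a, b):
    """Smallest absolute difference between two headings, in the range 0 to 180 degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def should_start_by_orientation(vehicle_heading_deg, reference_heading_deg,
                                max_diff_deg=30.0):
    """Start extraction once the vehicle faces the camera closely enough."""
    return heading_difference_deg(vehicle_heading_deg,
                                  reference_heading_deg) <= max_diff_deg
```

The wrap-around handling matters for the orientation test: headings of 350° and 10° differ by 20°, not 340°, so a vehicle at 350° satisfies a 30° tolerance around a 10° reference.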

After the still image extraction process has started, the information processing device 14 detects the first exposure of the vehicle in the latest frame of the moving image (step S14). The higher the first exposure, the more likely it is that the vehicle appears clearly in that still image and that the image helps in recognizing the license plate and other items on the vehicle, so the still image can be regarded as more important. The first exposure may be detected directly from the moving image or from the most recently extracted still image.

The information processing device 14 then adjusts the resolution of the still images to be extracted according to the detected first exposure (step S15). More specifically, the higher the first exposure, the higher the information processing device 14 sets the resolution of the extracted still images. Important still images are thereby obtained preferentially, which makes it easier to recognize the license plate and other items from them. Always extracting high-resolution still images would require handling many still images with a large data size, which tends to burden the system; adjusting the resolution according to the vehicle's exposure, as in this embodiment, largely avoids this problem.

The information processing device 14 further adjusts the time interval at which still images are extracted according to the detected first exposure (step S16). More specifically, the higher the first exposure, the shorter the information processing device 14 makes the extraction interval, increasing the number of still images extracted per unit time. Important still images are thereby obtained preferentially, which makes it easier to recognize the license plate and other items from them. Always keeping the extraction interval short would require handling a very large number of still images, which tends to burden the system; adjusting the interval according to the vehicle's exposure, as in this embodiment, largely avoids this problem.

The series of steps S14 to S16 described above is repeated until the vehicle is no longer recognized, that is, until the vehicle has passed out of the range captured by the smartphone 11 (step S17). In this way, the resolution of the still images and the interval at which they are extracted are feedback-controlled according to the first exposure, and important still images can be extracted efficiently.
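One possible mapping from the first exposure to the extraction settings is sketched below. The patent states only the direction of the adjustment (higher exposure gives higher resolution and a shorter interval); the resolution tiers, the interval bounds, the linear mapping, and the assumption that the first exposure is a ratio between 0 and 1 are all hypothetical.

```python
# Hypothetical resolution tiers, from lowest to highest.
RESOLUTIONS = ((640, 360), (1280, 720), (1920, 1080))


def extraction_settings(first_exposure, min_interval_s=0.1, max_interval_s=0.5):
    """Map a first exposure in [0, 1] to (resolution, interval in seconds).

    Higher exposure selects a higher resolution tier and a shorter interval.
    """
    e = min(max(first_exposure, 0.0), 1.0)  # clamp out-of-range inputs
    tier = min(int(e * len(RESOLUTIONS)), len(RESOLUTIONS) - 1)
    interval = max_interval_s - e * (max_interval_s - min_interval_s)
    return RESOLUTIONS[tier], interval
```

Calling this function once per iteration of steps S14 to S16 with the latest exposure value would give the feedback behavior described: the settings track the exposure as the vehicle approaches and then recedes. In practice the exposure would likely need rescaling first, since a vehicle rarely fills the whole frame.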

The timing chart in FIG. 8 illustrates when still images are extracted from the moving images obtained by a pair of smartphones 11 (the smartphones that shoot the front and rear of the same vehicle). The shaded portions of the chart indicate when still images are extracted. As shown in the figure, still images of the front of the vehicle are acquired while the vehicle is within the field of view of the front-shooting smartphone (11a or 11c). After the vehicle moves on, still images of the rear of the vehicle are acquired while the vehicle is within the field of view of the rear-shooting smartphone (11b or 11d). As FIG. 8 also shows, the higher the first exposure, the more high-resolution still images are acquired.

In this embodiment, one of steps S14 and S15 may be omitted, or steps S14 to S16 may all be omitted. When the vehicle is no longer recognized (Yes in step S17), the still image extraction process for that vehicle ends (step S18). Once the pre-stage processing has been executed for each of the pair of smartphones 11, still images of a plurality of frames have been obtained for one vehicle. The group of still images obtained in this way is used in the series of steps S20 to S26 described later (hereinafter called "post-stage processing" for convenience).

(2) Post-stage processing
Next, the flow of the post-stage processing is described with reference to the flowchart in FIG. 9. The post-stage processing is executed each time the pre-stage processing for one vehicle is completed. To obtain a still image for number recognition and related tasks, the information processing device 14 extracts, from the still images obtained by the pre-stage processing, one that satisfies the exposure condition described above (step S20).

In this embodiment, the following first to fifth conditions are set as the exposure condition, in descending order of priority.
First condition: the second exposure is the highest
Second condition: the second exposure is the second highest
Third condition: the second exposure is the third highest
Fourth condition: the first exposure is the highest
Fifth condition: the first exposure is the second highest
For the fourth and fifth conditions, however, still images that satisfy any of the first to third conditions are excluded. Further conditions following the fifth condition may also be set as appropriate.

That is, the first time step S20 is performed, the first condition is in effect as the exposure condition. If number recognition does not succeed in the subsequent step S23 and step S20 is performed again, the second condition is in effect. Likewise, the third condition is in effect the next time step S20 is performed, the fourth condition the time after that, and the fifth condition the time after that.
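The priority order defined by the five conditions can be sketched as a single ranking function: the top three still images by second exposure, followed by the top two by first exposure among the rest. The dictionary keys and function name are hypothetical; the counts 3 and 2 are the example values of this embodiment.

```python
def extraction_order(stills, n_second=3, n_first=2):
    """Return still-image ids in the order in which step S20 would try them.

    `stills` is a list of dicts with the keys 'id', 'first_exposure',
    and 'second_exposure' (hypothetical field names).
    """
    # Conditions 1 to 3: highest second exposure first.
    by_second = sorted(stills, key=lambda s: s["second_exposure"],
                       reverse=True)[:n_second]
    chosen = {s["id"] for s in by_second}
    # Conditions 4 and 5: highest first exposure among images not already chosen.
    remaining = [s for s in stills if s["id"] not in chosen]
    by_first = sorted(remaining, key=lambda s: s["first_exposure"],
                      reverse=True)[:n_first]
    return [s["id"] for s in by_second + by_first]


stills = [
    {"id": "a", "first_exposure": 0.30, "second_exposure": 0.010},
    {"id": "b", "first_exposure": 0.25, "second_exposure": 0.020},
    {"id": "c", "first_exposure": 0.20, "second_exposure": 0.015},
    {"id": "d", "first_exposure": 0.35, "second_exposure": 0.005},
    {"id": "e", "first_exposure": 0.10, "second_exposure": 0.001},
    {"id": "f", "first_exposure": 0.28, "second_exposure": 0.002},
]
order = extraction_order(stills)  # ['b', 'c', 'a', 'd', 'f']
```

Image "a" has the second-highest first exposure overall, but it is already selected under the third condition, so the fourth and fifth conditions fall to "d" and "f". This mirrors the exclusion rule for still images that satisfy any of the first to third conditions.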

In this embodiment, therefore, a predetermined number of still images (three in this example) are extracted with top priority in descending order of second exposure, and then a predetermined number of still images (two in this example) are extracted preferentially in descending order of first exposure. These numbers are only examples, and other values may be used. Because this embodiment shoots moving vehicles in particular, the license plate may be cut off or may not be properly visible in the captured moving image. Such a situation is very likely to be fatal to number recognition, but extracting the still images with a high second exposure first, as in this embodiment, keeps it to a minimum. Because the first exposure also strongly affects the success rate of number recognition, this embodiment gives weight to the first exposure next after the second exposure when deciding which still images to extract. If the vehicle were shot from one direction only, the license plate might not be captured properly because of backlighting. In this embodiment, however, a pair of smartphones shoots the vehicle from the front and from the rear, so both a moving image showing the front license plate and a moving image showing the rear license plate are obtained, and this problem is avoided.

After step S20, the information processing device 14 applies the image processing described above to the extracted still image (step S21) and performs object recognition of the license plate and other items on the processed still image (step S22). For the license plate, the number shown on it (the displayed information) is recognized. If the number is recognized successfully (Yes in step S23), processing proceeds to step S24; if it is not (No in step S23), step S20 is performed again.

As described above, when step S20 is performed again, the second condition is applied instead of the first condition, and when it is performed yet again, the third condition is applied. Thus, in this embodiment, the following are executed: a first process of extracting, from the extracted still images, the one with the highest second exposure; a second process of recognizing the number from that still image; and, if the recognition does not succeed, a third process of extracting the still image with the next highest second exposure and attempting number recognition again. The third process is repeated until the recognition succeeds.

Furthermore, in this embodiment, if the recognition does not succeed even after the third process has been repeated a predetermined number of times, the following are executed: a fourth process of extracting, from the still images taken from the video, the one with the highest first degree of exposure; a fifth process of recognizing the number from the extracted still image; and, if that recognition does not succeed, a sixth process of extracting the still image with the next-highest first degree of exposure and recognizing the number again. The sixth process is repeated until the recognition succeeds.
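
The first to sixth processes can be sketched roughly as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names `StillImage`, `recognize_with_fallback`, and the `recognize` callback are hypothetical, the second degree of exposure is assumed to refer to the license plate and the first to the vehicle (as the claims suggest), and the retry limits of three and two are taken from the example counts given earlier, although the text itself describes the sixth process as repeating until recognition succeeds.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class StillImage:
    """One still image taken from the video (hypothetical structure)."""
    frame_id: int
    vehicle_exposure: float  # "first degree of exposure" (assumed: vehicle)
    plate_exposure: float    # "second degree of exposure" (assumed: license plate)


def recognize_with_fallback(
    stills: Sequence[StillImage],
    recognize: Callable[[StillImage], Optional[str]],
    max_plate_tries: int = 3,    # example count from the text
    max_vehicle_tries: int = 2,  # example count from the text
) -> Optional[str]:
    """Try stills in order of plate exposure, then in order of vehicle exposure.

    `recognize` stands in for steps S21-S22 and returns the number,
    or None when recognition fails.
    """
    # First to third processes: highest second degree of exposure first.
    by_plate = sorted(stills, key=lambda s: s.plate_exposure, reverse=True)
    for still in by_plate[:max_plate_tries]:
        number = recognize(still)
        if number is not None:
            return number

    # Fourth to sixth processes: highest first degree of exposure first.
    # Whether stills already tried above are skipped is not stated in the
    # text; this sketch does not skip them.
    by_vehicle = sorted(stills, key=lambda s: s.vehicle_exposure, reverse=True)
    for still in by_vehicle[:max_vehicle_tries]:
        number = recognize(still)
        if number is not None:
            return number

    return None  # every attempt failed
```

Bounding both loops is a design choice of this sketch so that the function always terminates; a caller can raise the limits to approximate "repeat until recognition succeeds".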

Next, the information processing device 14 checks the number and other items on the basis of the object recognition results and records the outcome in the management server 15 (step S24). More specifically, the information processing device 14 compares the recognized number against the database stored in the management server 15 (the numbers of all permitted vehicles). If the number matches that of any permitted vehicle, it is judged normal; otherwise it is judged abnormal. For the seat belt worn by the driver and for the equipment, the information processing device 14 judges the item normal if it is correctly recognized as an object (that is, if it is correctly worn or carried) and abnormal otherwise. For the vehicle speed already detected in step S11, the information processing device 14 judges it normal if it does not exceed a predetermined allowable upper limit (for example, 30 km/h) and abnormal otherwise. For dirt and damage on the vehicle, the information processing device 14 judges the vehicle abnormal if dirt or damage exceeding a predetermined standard is recognized as an object and normal otherwise.
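
The checks of step S24 amount to a handful of independent pass/fail decisions. The sketch below shows one possible way to express them; `CheckResult`, `run_checks`, and all parameter names are hypothetical, and the inputs are assumed to be already produced by the object recognition and speed detection steps. Only the 30 km/h example limit comes from the text.

```python
from dataclasses import dataclass
from typing import Optional, Set

SPEED_LIMIT_KMH = 30.0  # example allowable upper limit from the text


@dataclass(frozen=True)
class CheckResult:
    """True means "normal", False means "abnormal" for each item."""
    number_ok: bool
    seat_belt_ok: bool
    equipment_ok: bool
    speed_ok: bool
    condition_ok: bool  # dirt and damage

    @property
    def has_abnormality(self) -> bool:
        return not (
            self.number_ok
            and self.seat_belt_ok
            and self.equipment_ok
            and self.speed_ok
            and self.condition_ok
        )


def run_checks(
    recognized_number: Optional[str],
    permitted_numbers: Set[str],
    seat_belt_recognized: bool,
    equipment_recognized: bool,
    speed_kmh: float,
    dirt_or_damage_over_standard: bool,
    speed_limit_kmh: float = SPEED_LIMIT_KMH,
) -> CheckResult:
    return CheckResult(
        # Normal only if the number matches some permitted vehicle.
        number_ok=recognized_number in permitted_numbers,
        # Normal only if the item was correctly recognized as an object.
        seat_belt_ok=seat_belt_recognized,
        equipment_ok=equipment_recognized,
        # Normal if the speed does not exceed the limit.
        speed_ok=speed_kmh <= speed_limit_kmh,
        # Abnormal if dirt or damage above the standard was recognized.
        condition_ok=not dirt_or_damage_over_standard,
    )
```

A speed exactly equal to the limit is treated as normal here, following the wording "does not exceed".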

Furthermore, the information processing device 14 records these check results (the results of the judgments), the recognized number, the date and time of capture (the current date and time), the identification number of the smartphone 11 used for capture, and the vehicle speed in the management server 15, associated with each vehicle. FIG. 10 illustrates the check results and other information recorded in the management server 15. In the example shown in the figure, a management number is assigned to each vehicle and the information for each item is recorded. The management server 15 also stores the videos and still images used for object recognition, and holds the information on each vehicle in association with the information on the license plate and related items for that vehicle (in addition to the number, this includes the driver, equipment, and so on). This makes it possible to manage vehicles together with their license plates and related items. For the driver (a person), the facial expression and similar information are also stored in association.
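
A per-vehicle record of the kind illustrated in FIG. 10 could be modeled as below. This is a sketch under stated assumptions: `VehicleRecord`, `ManagementStore`, the five-digit sequential management number, and the in-memory dictionary are all hypothetical stand-ins, and the text does not describe how the management server 15 actually stores its data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class VehicleRecord:
    management_no: str             # e.g. "00001" (format assumed)
    recognized_number: str
    captured_at: datetime
    smartphone_id: str             # which smartphone 11 captured the video
    speed_kmh: float
    check_results: Dict[str, bool]  # item name -> True if normal


class ManagementStore:
    """In-memory stand-in for the records kept by the management server."""

    def __init__(self) -> None:
        self._records: Dict[str, VehicleRecord] = {}

    def add(
        self,
        recognized_number: str,
        captured_at: datetime,
        smartphone_id: str,
        speed_kmh: float,
        check_results: Dict[str, bool],
    ) -> VehicleRecord:
        # Assign the next sequential management number to this vehicle.
        management_no = f"{len(self._records) + 1:05d}"
        record = VehicleRecord(
            management_no, recognized_number, captured_at,
            smartphone_id, speed_kmh, dict(check_results),
        )
        self._records[management_no] = record
        return record

    def abnormal_items(self, management_no: str) -> List[str]:
        """List the check items recorded as abnormal for one vehicle."""
        results = self._records[management_no].check_results
        return [item for item, ok in results.items() if not ok]
```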

After that, if none of the above check results shows an abnormality (No in step S25), the information processing device 14 ends the post-processing. If any check result shows an abnormality (Yes in step S25), the information processing device 14 outputs an abnormality signal (step S26) and then ends the post-processing. In the example shown in FIG. 10, the seat belt check result for management number No. 00001 is abnormal and the vehicle speed check result for management number No. 00002 is abnormal, so an abnormality signal is output for these results.

When the abnormality signal is output, the person in charge of management who notices it can confirm the check results and take appropriate measures. The user of the vehicle may also be notified that a check result was abnormal. The abnormality signal may be output continuously until the person in charge of management performs a predetermined operation. In that case, if the output of the abnormality signal continues for a certain period of time or longer, a message (for example, an e-mail) reporting the abnormality may be sent to a terminal or the like carried by the person in charge of management or by the user.
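
The optional behavior described here, a signal that stays on until acknowledged and triggers a message after a certain time, can be sketched as a small state holder. The class `AbnormalitySignal`, its method names, and the 300-second default are assumptions made for illustration; the text specifies neither a duration nor a mechanism.

```python
from dataclasses import dataclass


@dataclass
class AbnormalitySignal:
    raised_at: float                  # time the signal started, in seconds
    escalate_after_s: float = 300.0   # assumed value; the text says "a certain period"
    acknowledged: bool = False
    message_sent: bool = False

    def acknowledge(self) -> None:
        """The predetermined operation by the person in charge of management."""
        self.acknowledged = True

    def is_active(self) -> bool:
        """The signal keeps being output until it is acknowledged."""
        return not self.acknowledged

    def poll(self, now: float) -> bool:
        """Return True exactly once, when a notification message is due."""
        if self.acknowledged or self.message_sent:
            return False
        if now - self.raised_at >= self.escalate_after_s:
            self.message_sent = True
            return True
        return False
```

Passing the current time into `poll` rather than reading a clock inside it keeps the sketch easy to exercise with fixed timestamps.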

3. Others
As described above, the vehicle management system 1 includes a moving image object recognition unit 42 that recognizes a vehicle (an example of a first object) from a video, a still image extraction unit 45 that takes still images including the vehicle from the video, and a still image object recognition unit 49 that recognizes, from the still images, a license plate or the like accompanying the vehicle (an example of a second object).

The vehicle management system 1 therefore makes it easy to recognize the license plate and the like accurately while keeping the operating load on the system low. That is, recognizing a license plate or the like directly from a video raises concerns such as reduced recognition accuracy, whereas recognizing it from a still image makes highly accurate object recognition easy. Moreover, the still image extraction process is not performed when the video does not contain a vehicle, which eliminates wasted processing and reduces the operating load on the system.

The vehicle management system 1 further includes an extraction unit 47 that extracts, from the plurality of still images taken from the video, those that satisfy an exposure-degree condition, and the still image object recognition unit 49 recognizes the license plate or the like included in the vehicle from the extracted still images. The license plate or the like included in the vehicle can therefore be recognized from still images more efficiently.

Although embodiments of the present invention have been described above, the configuration of the present invention is not limited to these embodiments, and various modifications can be made without departing from the gist of the invention. The technical scope of the present invention is defined by the claims rather than by the description of the above embodiments, and should be understood to include all modifications that fall within the meaning and scope equivalent to the claims.

The present invention can be used in systems and the like that perform object recognition from video.

1 vehicle management system
11 smartphone
11a smartphone for photographing the front side on the approach road
11b smartphone for photographing the rear side on the approach road
11c smartphone for photographing the front side on the exit road
11d smartphone for photographing the rear side on the exit road
12 communication network
13 edge server
14 information processing device
14a video processing engine
14b still image processing engine
15 management server
40 control unit
41 communication unit
42 moving image object recognition unit
43 distance detection unit
44 speed detection unit
45 still image extraction unit
46 exposure detection unit
47 extraction unit
48 image processing unit
49 still image object recognition unit
50 check execution unit
51 abnormality signal output unit

Claims

1. An image recognition system comprising:
an image object recognition unit that recognizes a vehicle as an object from image information including a plurality of frames;
a still image extraction unit that extracts, from the image information, still images of a plurality of frames including the vehicle, wherein the higher the degree of exposure of the vehicle in the image information, the shorter the time interval at which the still images are extracted; and
a still image object recognition unit that recognizes, from the still images, a license plate associated with the vehicle,
wherein a front-side camera that photographs the front side of the same vehicle and a rear-side camera that photographs its rear side are provided,
the image information includes both a front-side image captured with the front-side camera and a rear-side image captured with the rear-side camera, and
the front-side camera and the rear-side camera are arranged at positions overlapping the vehicle as viewed from above, the front-side camera being arranged to photograph the entire front side of the vehicle from obliquely above and in front, and the rear-side camera being arranged to photograph the entire rear side of the vehicle from obliquely above and behind.

2. The image recognition system according to claim 1, wherein the still image object recognition unit recognizes the front license plate of the vehicle from the still images extracted from the front-side image, and recognizes the rear license plate of the vehicle from the still images extracted from the rear-side image.

3. The image recognition system according to claim 1, further comprising an extraction unit that extracts, from among the extracted still images, those that satisfy a predetermined condition regarding the degree of exposure of at least one of the vehicle and the license plate,
wherein the still image object recognition unit recognizes the license plate included in the vehicle from the still images extracted by the extraction unit.

4. The object recognition system according to claim 3, wherein the system executes:
a first process of extracting, from among the extracted still images, the one with the highest degree of exposure of a specific object, the specific object being the license plate;
a second process of recognizing display information on the license plate from the extracted still image; and
a third process of extracting the still image with the next-highest degree of exposure of the specific object and repeating the second process until the recognition succeeds,
and wherein, when the recognition in the second process has not succeeded even after the number of repetitions reaches a predetermined number, the first to third processes are executed with the vehicle as the specific object.