JP2019096179A - Behavior monitoring system

Info

Publication number: JP2019096179A
Application number: JP2017226573A
Authority: JP
Inventors: 浩生菊池; Hiroo Kikuchi; 真希楠見; Maki Kusumi
Original assignee: Hochiki Corp
Current assignee: Hochiki Corp
Priority date: 2017-11-27
Filing date: 2017-11-27
Publication date: 2019-06-20
Anticipated expiration: 2037-11-27
Also published as: JP7108395B2; JP2022169507A

Abstract

PROBLEM TO BE SOLVED: To make it possible to monitor whether an abnormality caused by fraudulent behavior has occurred, by inputting images of a monitoring area captured by a surveillance camera into a multilayer neural network.

SOLUTION: A surveillance camera 14 captures images of an office room 12 to be monitored. A behavior determination unit 16 of a behavior monitoring system 10 cuts out, from the monitored image, the image of each monitoring section in which equipment such as a desk or shelf is located, inputs the cut-out image into a multilayer neural network trained on images showing permitted behavior, in which a user authorized to use the equipment handles it, and determines abnormal behavior by a third party. When it determines abnormal behavior by a third party, the behavior determination unit 16 transmits an alarm signal to a client device 24 of an entry/exit management facility to give notification of the abnormal behavior, and causes a moving image of the monitoring area, covering a prescribed time before and after the point of determination, to be played back from the recordings of a recording device 20 and displayed.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a behavior monitoring system that uses a neural network to monitor suspicious behavior of room occupants from images of a room in a facility captured by a surveillance camera.

Conventionally, facilities such as office buildings are equipped with entry/exit management facilities. A reading terminal such as a card reader, installed outside the door at a room entrance, reads user ID information from a card held by the user. When it determines that this information matches pre-registered user identification information (hereinafter "user ID"), it sends an authentication signal to an entry/exit management control device. A control signal from the entry/exit management control device then unlocks the electric lock of the door, so that third parties other than authorized persons are kept out of the facility (Patent Document 1).

Surveillance camera systems that use person flow-line tracking technology to monitor the behavior of users who have entered a facility are also known. Such a system stores, in a flow-line database, flow-line data tracking the movement of persons within the monitoring area, and stores, in a camera image database, camera image data of those persons captured by a camera. From the stored flow-line data it detects the section in which a person's behavior matched condition data representing a specific behavior, then extracts and plays back the camera image data showing that person in the detected section, so that the actions of a wrongdoer can be checked (Patent Document 2).

Patent Document 1: JP 2003-101999 A
Patent Document 2: JP 2009-284167 A

However, such conventional entry/exit management facilities suffer from the problem of tailgating, in which another person enters together with a user who unlocks the electric lock with a card. A third party other than the registered users may thus get into the room, and because it is difficult to monitor that person's behavior, confidential information may leak.

In places where many users work, such as office rooms, authorized managers and staff routinely use information terminal equipment such as personal computers for tasks such as bank settlement transactions, and work with files of confidential information kept on bookshelves. If an authorized manager or staff member temporarily leaves their seat, for example for a meeting, an unauthorized third party may operate the information terminal equipment fraudulently, or may view or remove the confidential files kept on the bookshelf.

One conceivable way to prevent such fraudulent acts is to watch the images of the monitoring area captured by a surveillance camera. However, this requires a person to watch the images constantly, fraudulent acts are difficult to spot, and the problem is often noticed only after a fraudulent bank transaction or a leak of confidential information comes to light.

An object of the present invention is to provide a behavior monitoring system that inputs images of a monitoring area captured by a surveillance camera into a multilayer neural network and thereby makes it possible to monitor whether an abnormality caused by fraudulent behavior has occurred.

(Behavior monitoring system)
The present invention provides a behavior monitoring system comprising:
an imaging unit that images a monitoring area to be monitored; and
a behavior determination unit that identifies a person in the monitoring area, and determines and outputs abnormal behavior of the identified person,
wherein the behavior determination unit is configured by a multilayer neural network trained on images showing permitted behavior of persons normally present in the monitoring area.

(Person identification unit)
The behavior monitoring system further comprises a person identification unit that identifies the person whose abnormal behavior was determined by the behavior determination unit and outputs person identification information.

(Functional configuration 1 of the multilayer neural network)
The multilayer neural network of the behavior determination unit consists of a feature extraction unit and a recognition unit.
The feature extraction unit is a convolutional neural network that extracts feature quantities corresponding to the person from the input image of the person and outputs them.
The recognition unit is a fully connected neural network with a plurality of fully connected layers; it receives the feature quantities output from the convolutional neural network and estimates whether the person is performing permitted behavior.
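As an illustration of functional configuration 1, the following minimal sketch chains a convolution stage (feature extraction) with two fully connected layers (recognition). It is not the patented implementation: the layer sizes, the single 3x3 kernel, the random weights and all function names are assumptions made for the example, and a practical system would use a trained network with many more layers.

```python
import math
import random


def conv2d_relu(image, kernel):
    """Valid 2-D convolution (cross-correlation) followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def max_pool2(fmap):
    """2x2 max pooling with stride 2 (an odd trailing row/column is dropped)."""
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, len(fmap[0]) - 1, 2)]
            for r in range(0, len(fmap) - 1, 2)]


def dense(x, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * v for w, v in zip(ws, x)) + b)
            for ws, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_params(n_feats, n_hidden, seed=0):
    """Hypothetical untrained weights, only so that the sketch can run."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(-0.5, 0.5)

    return {
        "kernel": [[u() for _ in range(3)] for _ in range(3)],
        "w1": [[u() for _ in range(n_feats)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[u() for _ in range(n_hidden)]],
        "b2": [0.0],
    }


def permitted_score(image, params):
    """Feature extraction (conv + pool), then recognition (two dense layers).

    Returns a value between 0 and 1, read here as the estimate that the
    image shows permitted behavior.
    """
    feats = [v for row in max_pool2(conv2d_relu(image, params["kernel"])) for v in row]
    hidden = dense(feats, params["w1"], params["b1"], relu)
    return dense(hidden, params["w2"], params["b2"], sigmoid)[0]
```

With an 8x8 input, the 3x3 convolution yields a 6x6 map and pooling reduces it to 3x3, so `random_params(9, 4)` matches the nine extracted features.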

(Learning control for functional configuration 1 of the multilayer neural network)
The behavior determination unit is provided with:
a learning information storage unit in which images showing permitted behavior of persons normally present in the monitoring area are stored in advance; and
a learning control unit that reads the images stored in the learning information storage unit, inputs them into the convolutional neural network as supervised learning images, and trains the fully connected neural network and the convolutional neural network by backpropagation based on the error between the estimated value output from the fully connected neural network and a predetermined expected value.
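The error-driven update described above can be sketched as follows. This is a simplified illustration, not the patent's training procedure: it backpropagates a squared error between the estimate and the expected value through a small fully connected network only, on hand-made feature vectors. The network size, learning rate, sample data and the use of both "permitted" (expected value 1) and "not permitted" (expected value 0) samples are assumptions for the example; in the described system the gradient would also flow back into the convolutional layers.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Small random weights for a one-hidden-layer network (hypothetical sizes)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return (hidden activations, estimated value)."""
    h = [sigmoid(sum(w * v for w, v in zip(ws, x)) + b)
         for ws, b in zip(p["w1"], p["b1"])]
    y = sigmoid(sum(w * v for w, v in zip(p["w2"], h)) + p["b2"])
    return h, y


def train_step(p, x, expected, lr=0.5):
    """One backpropagation step on the squared error 0.5 * (y - expected)**2."""
    h, y = forward(p, x)
    delta_out = (y - expected) * y * (1.0 - y)
    # Hidden-layer deltas are computed with the output weights before they change.
    delta_hid = [delta_out * w * hv * (1.0 - hv) for w, hv in zip(p["w2"], h)]
    for j, hv in enumerate(h):
        p["w2"][j] -= lr * delta_out * hv
    p["b2"] -= lr * delta_out
    for j, d in enumerate(delta_hid):
        for i, xv in enumerate(x):
            p["w1"][j][i] -= lr * d * xv
        p["b1"][j] -= lr * d


def mean_loss(p, samples):
    return sum(0.5 * (forward(p, x)[1] - t) ** 2 for x, t in samples) / len(samples)


def train(p, samples, epochs=500):
    for _ in range(epochs):
        for x, expected in samples:
            train_step(p, x, expected)
```

Calling `train` on a list of `(feature_vector, expected_value)` pairs should reduce `mean_loss`, which is the behavior the learning control unit relies on.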

(Functional configuration 2 of the multilayer neural network)
The multilayer neural network of the behavior determination unit consists of an image analysis unit and a behavior recognition unit.
The image analysis unit consists of:
a convolutional neural network that extracts feature quantities corresponding to the person from the input image of the person and outputs them from a predetermined intermediate layer; and
a recurrent neural network that receives the feature quantities output from the convolutional neural network, and generates and outputs an image description of the image of the monitoring area.
The behavior recognition unit consists of:
a dictionary in which words indicating predetermined abnormal behavior are registered; and
a determiner that determines abnormal behavior of the person by comparing the words making up the image description output from the image analysis unit with the words registered in the dictionary.
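The behavior recognition step of functional configuration 2 amounts to a word lookup, which can be sketched as below. The caption text, the dictionary entries and the function names are hypothetical, and the image description is assumed to be already available as an English sentence; generating it with the convolutional and recurrent networks is outside this sketch.

```python
import re

# Hypothetical dictionary of words indicating abnormal behavior.
ABNORMAL_WORDS = {"stranger", "unauthorized", "taking"}


def caption_words(caption):
    """Split an image description into lower-case words."""
    return re.findall(r"[a-z']+", caption.lower())


def judge_abnormal(caption, dictionary):
    """Return (is_abnormal, matched_words) for one image description."""
    hits = [w for w in caption_words(caption) if w in dictionary]
    return bool(hits), hits
```

For example, `judge_abnormal("A stranger is operating the PC on the desk", ABNORMAL_WORDS)` flags the description because it contains a registered word, whereas a description with no registered word is not flagged.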

(Determination of abnormal behavior by person identification)
The image analysis unit identifies the person from the image of the person and outputs person identification information.
The words indicating predetermined abnormal behavior are registered for each person normally present in the monitoring area.
The determiner determines abnormal behavior of the person by comparing the words making up the image description with the words indicating abnormal behavior for the person given by the person identification information.
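Registering abnormal-behavior words per person can be pictured as a lookup table keyed by the person identification information. The sketch below assumes hypothetical person IDs, word lists and a fallback list for persons who are not registered; none of these values come from the source.

```python
# Hypothetical per-person dictionaries: words that indicate behavior
# NOT permitted for that person.
ABNORMAL_WORDS_BY_PERSON = {
    "user_a": {"cabinet", "file"},   # user_a may use the PC but not the files
    "user_b": {"pc", "keyboard"},    # user_b may use the files but not the PC
}

# Assumed fallback for a person with no registered entry.
DEFAULT_ABNORMAL_WORDS = {"pc", "keyboard", "cabinet", "file"}


def judge_for_person(person_id, words, table=ABNORMAL_WORDS_BY_PERSON,
                     default=DEFAULT_ABNORMAL_WORDS):
    """Return the sorted description words that are abnormal for this person."""
    dictionary = table.get(person_id, default)
    return sorted(set(words) & dictionary)
```

The same description can therefore be normal for one person and abnormal for another, which is the point of tying the dictionary to the identified person.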

(Learning control for functional configuration 2 of the multilayer neural network)
The behavior determination unit is provided with:
a learning information storage unit in which pairs of an image showing permitted behavior of a person normally present in the monitoring area and a predetermined image description summarizing the image are stored in advance; and
a learning control unit that reads the images stored in the learning information storage unit, inputs them into the convolutional neural network as unsupervised learning images and trains it by backpropagation, then inputs each learning image into the trained convolutional neural network and inputs the resulting feature quantities, together with the image description paired with that learning image, into the recurrent neural network as unsupervised learning information to train it by backpropagation.

(Linkage with the entry/exit management facility)
The system is further provided with:
an entry/exit management facility that identifies a user entering the monitoring area in the facility and, when the user's identification information matches pre-registered user identification information, performs control to unlock the electric lock on the entrance door; and
an entry imaging unit that images the entering user when the entry/exit management facility identifies the user.
The person identification unit stores the user image captured by the entry imaging unit in conjunction with the unlocking control by the entry/exit management facility, collates the stored user images with the image of the person whose abnormal behavior was determined by the behavior determination unit, and outputs the user information of the matching user as the person identification information.
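One way to picture the collation step is a registry that stores an entry record each time the electric lock is unlocked and later looks up the closest match. The source does not say how images are compared; the sketch below assumes each image has already been reduced to a feature vector and uses cosine similarity with a hypothetical threshold. The class and method names are illustrative only.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class EntryRegistry:
    """Stores (user ID, entry-image features) pairs recorded at unlock time."""

    def __init__(self):
        self._entries = []

    def on_unlock(self, user_id, features):
        # Called in conjunction with the unlock control of the electric lock.
        self._entries.append((user_id, list(features)))

    def identify(self, features, threshold=0.9):
        """Return the best-matching user ID, or None if nothing is similar enough."""
        best_id, best_sim = None, threshold
        for user_id, stored in self._entries:
            sim = cosine(stored, features)
            if sim >= best_sim:
                best_id, best_sim = user_id, sim
        return best_id
```

Returning `None` when no stored entry is similar enough leaves room for the case where the person in the image never passed the card reader, for example after tailgating.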

(Abnormal behavior alarm and recording)
The system is further provided with a recording device that records moving images of the monitoring area captured by the imaging unit.
When the behavior determination unit determines abnormal behavior of a person, it transmits an abnormality alarm signal to a predetermined external device to give notification, and causes the moving image of the monitoring area captured by the imaging unit for a predetermined time before and after the point at which the abnormal behavior was determined to be played back from the recording device and displayed.
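The playback interval is simply the determination time widened by a margin on each side. The sketch below assumes a 30-second margin before and after; the source only says "a predetermined time", so the values and the function name are illustrative.

```python
from datetime import datetime, timedelta


def playback_window(detected_at, before=timedelta(seconds=30),
                    after=timedelta(seconds=30)):
    """Return (start, end) of the recording to play back around a detection."""
    return detected_at - before, detected_at + after
```

The returned pair could be passed to whatever interface the recording device offers for retrieving a clip by time range.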

(Multi-monitoring)
The behavior determination unit sets, in the image of the monitoring area captured by the imaging unit, one or more monitoring sections in which equipment to be monitored is located, cuts out the image of each monitoring section and inputs it into the multilayer neural network, and thereby determines and outputs abnormal behavior of a person for each monitoring section.
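Cutting out monitoring sections can be sketched as rectangular slicing of the frame. The example assumes the image is a list of pixel rows and each section is given as `(x, y, width, height)`; the section names and coordinates are hypothetical.

```python
def crop(image, section):
    """Cut a rectangular section (x, y, width, height) out of a row-major image."""
    x, y, w, h = section
    return [row[x:x + w] for row in image[y:y + h]]


def crop_sections(image, sections):
    """Return one cut-out image per named monitoring section."""
    return {name: crop(image, rect) for name, rect in sections.items()}
```

Each cut-out image would then be judged separately, so that a result is produced per monitoring section.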

(Periodic behavior monitoring)
At each predetermined period, the behavior determination unit inputs the image of the monitoring section and determines and outputs abnormal behavior of a person.
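Periodic monitoring means that only one frame per period is passed to the network rather than every frame of the video. A minimal sketch, assuming a constant frame rate and a period given in seconds (both parameters are assumptions of the example):

```python
def sample_frame_indices(total_frames, fps, period_seconds):
    """Indices of the frames to evaluate, one per monitoring period."""
    step = int(round(fps * period_seconds))
    if step < 1:
        raise ValueError("period must cover at least one frame")
    return list(range(0, total_frames, step))
```

At 30 frames per second and a 60-second period, one frame in every 1800 is evaluated, which is where the reduction in processing load comes from.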

(Size normalization of user learning images)
The behavior determination unit normalizes the image of the monitoring section to an image of a predetermined size before inputting it.
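Size normalization can be illustrated with nearest-neighbor resampling, which maps every output pixel to the closest source pixel. The source does not specify the interpolation method, so nearest-neighbor is an assumption chosen for brevity.

```python
def resize_nearest(image, out_w, out_h):
    """Resize a row-major image to out_w x out_h by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]
```

Applying this to every cut-out section gives the network inputs of identical height and width regardless of how far the section is from the camera.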

(Basic effect)
The behavior monitoring system of the present invention is provided with an imaging unit that images a monitoring area to be monitored, a behavior determination unit that identifies a person in the monitoring area and determines and outputs abnormal behavior of the identified person, and a person identification unit that identifies the person whose abnormal behavior was determined and outputs person identification information, and the behavior determination unit is configured by a multilayer neural network trained on images showing permitted behavior of persons normally present in the monitoring area. As a result, when the output of the multilayer neural network shows that the input matches only some of the features of the learned images of an authorized user handling the equipment, or none of them, the system determines abnormal behavior, meaning that the equipment is not being handled by an authorized user. Abnormal behavior caused by a third party's fraudulent acts can thus be monitored.

(Effect of the person identification unit)
Because the behavior monitoring system further comprises the person identification unit that identifies the person whose abnormal behavior was determined by the behavior determination unit and outputs person identification information, it is possible to establish who that person is and to respond to them quickly and appropriately.

(Effect of functional configuration 1 of the multilayer neural network)
The multilayer neural network of the behavior determination unit consists of a feature extraction unit, which is a convolutional neural network that extracts and outputs feature quantities corresponding to the person from the input image of the person, and a recognition unit, which is a fully connected neural network with a plurality of fully connected layers that receives those feature quantities and estimates whether the person is performing permitted behavior. The convolutional neural network therefore extracts the features of the image of a user handling the equipment automatically, without preprocessing of the input to extract personal features such as the contours of the eyes, mouth and ears of the face. The recognition unit that follows can then estimate with high accuracy abnormal behavior in which a third party, different from the learned persons of permitted behavior, is handling the equipment.

(Effect of learning control for functional configuration 1 of the multilayer neural network)
The behavior determination unit is provided with a learning information storage unit in which images showing permitted behavior of persons normally present in the monitoring area are stored in advance, and a learning control unit that reads those images, inputs them into the convolutional neural network as supervised learning images, and trains the fully connected neural network and the convolutional neural network by backpropagation based on the error between the estimated value output from the fully connected neural network and a predetermined expected value. For example, before system operation begins, images of the one or more monitoring sections set as monitoring targets are cut out of the images captured by the surveillance camera during the working hours of the office room, for example at a 1-minute period, and stored as learning images. Repeating this for about one month yields a sufficient quantity of learning images for training the multilayer neural network. By training the multilayer neural network on these monitoring-section images, the trained network can accurately determine abnormal behavior by a third party in a monitoring section once system operation begins.
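To get a feel for the quantity of learning images this collection scheme produces, the count can be estimated as below. The 1-minute period and the one-month duration come from the text; the 8 working hours per day and 20 working days per month are assumptions made only for this estimate.

```python
def training_image_count(hours_per_day, working_days, period_minutes=1, sections=1):
    """Number of cut-out learning images collected over the sampling campaign."""
    per_day = hours_per_day * 60 // period_minutes
    return per_day * working_days * sections
```

Under these assumptions, one monitoring section yields 8 x 60 x 20 = 9600 images, and three sections yield 28800.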

(Effect of functional configuration 2 of the multilayer neural network)
The multilayer neural network of the behavior determination unit consists of an image analysis unit and a behavior recognition unit. The image analysis unit consists of a convolutional neural network that extracts feature quantities corresponding to the person from the input image of the person and outputs them from a predetermined intermediate layer, and a recurrent neural network that receives those feature quantities and generates and outputs an image description of the image of the monitoring area. The behavior recognition unit consists of a dictionary in which words indicating predetermined abnormal behavior are registered, and a determiner that determines abnormal behavior of the person by comparing the words making up the image description with the words registered in the dictionary. Accordingly, by analyzing the image of the monitoring area captured by the surveillance camera, for example the images of one or more monitoring sections, the features of the equipment present in the monitoring section and of the user handling it are extracted and an image description is generated. The words extracted from the generated image description are compared with the predetermined words in the dictionary that indicate abnormal behavior by a third party, and when they match or are similar, abnormal behavior by a third party in the monitoring section is determined and notification can be given.

(Effect of determining abnormal behavior by person identification)
The image analysis unit identifies the person from the image of the person and outputs person identification information, the words indicating predetermined abnormal behavior are registered for each person normally present in the monitoring area, and the determiner determines abnormal behavior by comparing the words making up the image description with the words indicating abnormal behavior for the identified person. It therefore becomes possible to monitor whether the behavior is permitted for the identified person. Behavior other than what is permitted, such as operating a PC without authorization to use it or removing a file that must not be taken out, can be determined as abnormal behavior and an alarm raised.

(Effect of learning control for functional configuration 2 of the multilayer neural network)
The behavior determination unit is provided with a learning information storage unit in which pairs of an image showing permitted behavior of a person normally present in the monitoring area and a predetermined image description summarizing the image are stored in advance, and with a learning control unit that inputs the stored images into the convolutional neural network as unsupervised learning images and trains it by backpropagation, then inputs each learning image into the trained convolutional neural network and inputs the resulting feature quantities, together with the image description paired with that learning image, into the recurrent neural network as unsupervised learning information to train it by backpropagation. As with the effect of learning control for functional configuration 1, training the multilayer neural network with learning information consisting of learning images and their image descriptions stored before system operation begins, and of learning images and their descriptions stored during system operation, allows abnormal behavior by a third party to be determined accurately and notification to be given.

(Effect of linkage with the entry/exit management facility)
The system is further provided with an entry/exit management facility that identifies a user entering the monitoring area in the facility and, when the user's identification information matches pre-registered user identification information, performs control to unlock the electric lock on the entrance door, and with an entry imaging unit that images the entering user when the entry/exit management facility identifies the user. The person identification unit stores the user image captured by the entry imaging unit in conjunction with the unlocking control, collates the stored user images with the image of the person whose abnormal behavior was determined by the behavior determination unit, and outputs the user information of the matching user as the person identification information. The person whose abnormal behavior was determined is thereby identified as a user who entered the room when the entry/exit management facility unlocked the electric lock, for example on reading an IC card, and who was imaged by the entry imaging unit. That person's entry/exit management information becomes the person identification information, so it is possible to establish which employee or which visitor the person is, and to respond to them quickly and appropriately.

In addition, the occupancy management information of the entry/exit management facility can be used to detect that no user has entered the room serving as the monitoring area, so that the determination operation of the behavior determination unit can be suspended at night and on holidays. The multilayer neural network of the behavior determination unit can then be trained during these idle periods.

Furthermore, the management information of the entry/exit management facility can be used to detect that the user of a monitoring section subject to behavior monitoring is in the room. When the start of use of the user's terminal is detected, for example from the terminal's network connection to the LAN, learning images of the monitoring section are stored for a predetermined time, and the multilayer neural network of the behavior determination unit can be trained on them during idle periods.

(Effect of abnormal behavior alarm and recording playback)
The system is further provided with a recording device that records moving images of the monitoring area captured by the imaging unit, and when the behavior determination unit determines abnormal behavior of a person, it transmits an alarm signal to a predetermined external device to give notification and causes the moving image of the monitoring area for a predetermined time before and after the point of determination to be played back from the recording device and displayed. Therefore, when abnormal behavior by a third party is determined, an alarm signal is sent, for example, to a predetermined management device or to the terminal of the user authorized to handle the monitored equipment, and an alarm is displayed to report the person's abnormal behavior. The situation can then be checked by playing back the moving image on which the determination was based, so that the necessary action can be taken.

(Effect of multi-monitoring)
The behavior determination unit sets, in the image of the monitoring area captured by the imaging unit, one or more monitoring sections in which equipment to be monitored is located, cuts out the image of each monitoring section and inputs it into the multilayer neural network, and thereby determines and outputs abnormal behavior of a person for each monitoring section. For example, in an office room where many users carry out their work using information terminal equipment and files, multiple monitoring sections, for example rectangular ones, can be set as needed at the users' seats and at the places where equipment appears in the monitoring image captured by the surveillance camera, so that abnormal behavior of persons can be monitored for each monitoring section in parallel.

(Effect of periodic behavior monitoring)
The behavior determination unit inputs the image of the monitoring section into the multilayer neural network at each predetermined period and determines and outputs abnormal behavior of a person. A frame image is therefore read from the moving image captured by the surveillance camera at a predetermined period, for example every 1 to 2 minutes, chosen so that behavior is not missed, and input into the multilayer neural network. This enables behavior monitoring with a reduced processing load on the multilayer neural network.

(Effect of size normalization of user learning images)
In addition, the behavior determination unit normalizes the image of the monitoring section to an image of a predetermined size before inputting it to the multilayer neural network. The image size of a monitoring section varies with the distance from the monitoring camera to that section, but normalizing the cut-out images to the same vertical and horizontal size improves the determination accuracy when they are input to the multilayer neural network.

FIG. 1: Explanatory diagram showing an outline of the behavior monitoring system together with the entry/exit management facility
FIG. 2: Explanatory diagram showing an example of an office layout to be monitored
FIG. 3: Explanatory diagram showing the setting of monitoring sections on a monitoring image
FIG. 4: Block diagram showing the functional configuration of the first embodiment of the behavior monitoring device of FIG. 1
FIG. 5: Explanatory diagram showing the functional configuration of the multilayer neural network of FIG. 4
FIG. 6: Flowchart showing behavior monitoring control by the behavior determination unit of FIG. 4
FIG. 7: Flowchart showing learning control by the learning control unit of FIG. 4
FIG. 8: Block diagram showing the functional configuration of the second embodiment of the behavior monitoring device of FIG. 1
FIG. 9: Block diagram showing the functional configuration of the convolutional neural network and the recurrent neural network of FIG. 8
FIG. 10: Flowchart showing behavior monitoring control by the behavior determination unit of FIG. 8
FIG. 11: Flowchart showing learning control by the learning control unit of FIG. 8

[Behavior monitoring system]
(System overview)
FIG. 1 is an explanatory diagram showing an outline of the behavior monitoring system together with the entry/exit management facility. As shown in FIG. 1, a monitoring camera 14 functioning as an imaging unit is installed in an office room 12 of a facility such as a building, and captures a moving image of the inside of the office room 12, in which a plurality of users are present.

The monitoring camera 14 captures RGB color images at, for example, 30 frames per second and outputs them as a moving image. One frame has a pixel arrangement of, for example, 4056 × 4056 pixels (height × width).

A behavior monitoring device 10 is installed for the office room 12, for example in the management center of the facility. The behavior monitoring device 10 includes a behavior determination unit 16, a person identification unit 17, a learning control unit 18, and a recording device 20, and the signal line from the monitoring camera 14 installed in the office room 12 is input to each of them in parallel.

The behavior determination unit 16 includes a multilayer neural network. The multilayer neural network is trained in advance on images of equipment, such as information terminal devices (for example, user terminals) and bookshelves, located in one or more monitoring sections set in the image of the office room 12 serving as the monitoring area, and on images in which a user permitted to use the equipment to be monitored (hereinafter, a "user with equipment use authority") handles that equipment.

During system operation, the behavior determination unit 16 performs the behavior determination operation, for example, while users are present in the office room 12. It inputs the image of the monitoring area captured by the monitoring camera 14 to the multilayer neural network and determines abnormal behavior of a third party (person) other than the user with equipment use authority, that is, whether or not the behavior is the permitted behavior of a person normally present in the monitoring area. When abnormal behavior is determined, it transmits an alarm signal to predetermined management devices, including the user terminal of the user with equipment use authority, for notification, and saves the moving image of the monitoring area recorded by the recording device 20 for a predetermined time before and after the time of the determination so that it can be played back.

The person identification unit 17 identifies the person whose abnormal behavior has been determined by the behavior determination unit 16 and outputs person identification information. For example, the person identification unit 17 stores user images captured by the entry camera 25 in conjunction with the control, by the entry/exit management facility described later, that unlocks the electric lock 28. It collates these user images with the image of the person whose abnormal behavior was determined by the behavior determination unit 16, and outputs the user information of the matching user as person identification information.

Before the system starts operation, the learning control unit 18 cuts out, for example at a one-minute cycle, the images of the one or more monitoring sections set as monitoring targets from the images of the monitoring area captured by the monitoring camera 14 during, for example, the working hours of the office room 12, and stores them as learning images. Repeating this for about one month, for example, collects the amount of learning images needed to train the multilayer neural network. The monitoring section images are then input to the multilayer neural network, which is trained by deep learning, so that once operation starts the trained multilayer neural network can accurately determine abnormal behavior of a third party in a monitoring section.

During system operation, the learning control unit 18 also stores learning images of the monitoring sections, as before the start of operation, for a predetermined time when, for example, a user with equipment use authority starts using the user terminal at the start of work. In idle time such as nights and holidays, when no user is present in the office room 12 and the behavior determination operation of the behavior determination unit 16 is paused, it inputs the stored learning images to the multilayer neural network of the behavior determination unit 16 for training. Even if the clothes, hairstyle, or the like of a user with equipment use authority change, the change is thus reflected in the multilayer neural network, and abnormal behavior of a third party in a monitoring section can still be determined accurately.

(Overview of the entry/exit management system)
The facility in which the behavior monitoring system of this embodiment is installed is also provided with an entry/exit management facility. As shown in FIG. 1, the entry/exit management facility includes a door 30 at the entrance of the office room 12 whose occupants are monitored, an electric lock 28 provided on the door 30, and a card reader 26 arranged near the door 30.

The card reader 26 and the electric lock 28 are connected to the entry/exit management control device 22 by transmission lines. An entry/exit management control device 22 is arranged, for example, on each floor of the building. A center device 23 and a client device 24 are arranged in the management center of the facility or the like; they are connected to each other by a LAN line 21 and to the entry/exit management control device 22 on each floor.

An entry camera 25 functioning as an entry imaging unit is installed at the entrance of the office room 12 and images each user entering the room in conjunction with the control that unlocks the electric lock 28. The captured user image is transmitted to the client device 24 via the entry/exit management control device 22 and stored in association with the user management information. The user image is also sent, for example, to the person identification unit 17 provided in the behavior monitoring device 10, where it is collated with the image of a person whose abnormal behavior has been determined by the behavior determination unit 16, and the user information of the matching user is output as person identification information.

Management control by the entry/exit management facility is as follows. The card reader 26 reads personal information such as a user ID from, for example, a magnetic card or contactless IC card carried by the user and collates it with pre-registered personal information. When the collation matches and authentication succeeds, the card reader 26 transmits an authentication signal to the entry/exit management control device 22.

When the entry/exit management control device 22 receives the authentication signal from the card reader 26, it outputs a control signal to the electric lock 28 of the door 30 at the corresponding entrance to unlock it and permit entry. Leaving the room does not require authentication by the card reader 26; the electric lock 28 can be unlocked, for example, by operating a switch button provided on the inside of the door 30.

The center device 23 is a personal computer with a display and shows management information such as a map of the building in which the entry/exit management facility is installed. The client device 24 is also a personal computer with a display and is connected to the entry/exit management control device 22 via the LAN line 21. Through the entry/exit management control device 22, it performs settings and processing for the card reader 26, such as registering and deleting personal information associated with magnetic cards and contactless IC cards and searching the history. It also counts the occupants of the office room 12 at each authentication based on a card read signal from the card reader 26, and manages the number and names (user IDs) of the users present.

The behavior determination unit 16 provided in the behavior monitoring device 10 is connected to the LAN line 21. In this embodiment, the client device 24 of the entry/exit management facility is used as the management device of the behavior monitoring system. When the behavior determination unit 16 determines, from the image of a monitoring section, abnormal behavior of a third party other than the user with equipment use authority, it therefore transmits an alarm signal to the client device 24 via the LAN line 21 so that an alarm indicating the abnormal behavior is issued, and makes the moving image of the monitoring area recorded by the recording device 20 for a predetermined time before and after the time of the determination available for playback.

A user terminal 32 installed in the office room 12 and subject to behavior monitoring is also connected to the LAN line 21; the user terminal 32 is a personal computer with a display. Although the user terminal 32 is located inside the office room 12, it is drawn outside the office room 12 to show its connection to the LAN line 21.

The behavior determination unit 16 also controls pausing and starting (resuming) of the behavior determination operation using the occupancy management information for the office room 12 managed by the client device 24 of the entry/exit management facility.

That is, the behavior determination unit 16 pauses the behavior determination operation when the occupancy management information of the client device 24 shows that no user is present in the monitored office room 12, for example at night or on a holiday, and resumes it when the presence of a user is detected during the pause.
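The pause and resume control can be illustrated with a small state holder. This is a sketch under assumptions: the class name `DeterminationScheduler`, the `update` method, and the use of a plain occupant count as the occupancy management information are all hypothetical and are not taken from the specification.

```python
class DeterminationScheduler:
    """Tracks whether the behavior determination operation is running or paused."""

    def __init__(self) -> None:
        self.running = False

    def update(self, occupants: int) -> str:
        """Apply the latest occupant count and report what changed."""
        if occupants > 0 and not self.running:
            self.running = True
            return "resume"
        if occupants == 0 and self.running:
            self.running = False
            return "pause"  # idle time that could be used for retraining
        return "no change"


scheduler = DeterminationScheduler()
events = [scheduler.update(n) for n in (0, 1, 3, 0, 0, 2)]
print(events)
```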

When the behavior determination operation is paused in this way during system operation, the learning control unit 18 uses the resulting idle time to train the multilayer neural network of the behavior determination unit 16 with the learning images of the monitoring sections stored during operation, in the same way as before the start of operation.

(Overview of office room monitoring)
FIG. 2 is an explanatory diagram showing an example of an office layout to be monitored, and FIG. 3 is an explanatory diagram showing the setting of monitoring sections on a monitoring image.

As shown in FIG. 2, the office room 12 monitored by the behavior monitoring system contains a plurality of desks and chairs, and equipment such as personal computers is placed on the desks. The desks are arranged, for example by department or section, in two rows running vertically in the figure, with the managers' seats at the top of the figure.

The monitoring camera 14 is installed near the ceiling in the lower-left corner of the office room 12 in the figure so that it overlooks the room and the whole room appears in the monitoring image. If a single monitoring camera 14 cannot image the whole room, a plurality of monitoring cameras 14 may be installed as needed. A wide-angle monitoring camera 14 with an imaging range of 180° or 360° may also be used.

A card reader 26 is arranged outside the door 30 provided at the entrance of the office room 12. To enter, a user has the card reader 26 read a card, which unlocks the electric lock.

Monitoring sections 34-1 to 34-8, shown as dotted rectangles, are set in the office room 12 as the targets of the behavior monitoring system. For example, monitoring section 34-1 is set at the seat of department manager A, who holds a managerial position. The desk in monitoring section 34-1 carries a user terminal 32-1, which department manager A is authorized to use. Monitoring section 34-1 therefore monitors the behavior of any third party other than department manager A, who has the authority to use the user terminal 32-1.

Monitoring section 34-2 is set at a bookshelf 35-2, which stores files of confidential documents that department manager A is authorized to use. Monitoring section 34-2 therefore monitors behavior, such as taking out files, by any third party other than department manager A, who has the authority to use the bookshelf 35-2.

As shown in FIG. 3, monitoring sections are set for the office room 12 using a monitoring image 36 of the monitoring area captured by the monitoring camera 14. The monitoring image 36 in FIG. 3 is only an example and does not correspond to the office room 12 in FIG. 2.

As shown in FIG. 3, the monitoring image 36 is displayed on the client device 24 of the entry/exit management facility in FIG. 1, which functions as the management device for behavior monitoring control, and monitoring sections 34-10 to 34-12, shown for example as dotted rectangles, are set on the screen.

For example, monitoring section 34-10 is set at the seat of department manager X. The desk in monitoring section 34-10 carries a user terminal 32-10, which department manager X is authorized to use. Monitoring section 34-10 therefore monitors the behavior of any third party other than department manager X, who has the authority to use the user terminal 32-10.

Monitoring section 34-11 is set at the seat of employee Y, who handles confidential information. The desk in monitoring section 34-11 carries a user terminal 32-11, which employee Y is authorized to use. Monitoring section 34-11 therefore monitors the behavior of any third party other than employee Y, who has the authority to use the user terminal 32-11.

For the monitoring sections 34-10 to 34-12 set on the monitoring image 36 in this way, in addition to behavior monitoring control, the learning control unit 18 shown in FIG. 1 performs learning control: it cuts out the images of the monitoring sections, stores them as learning images, and trains the multilayer neural network of the behavior determination unit 16 with the stored learning images.

[First embodiment of the behavior determination device]
FIG. 4 is a block diagram showing the functional configuration of the first embodiment of the behavior monitoring device of FIG. 1.

As shown in FIG. 4, the behavior determination unit 16 consists of an image cutout unit 44, a multilayer neural network 46, and a time-series determination unit 48. The learning control unit 18 consists of an image cutout unit 50, a control unit 52, a learning image storage unit 54 functioning as a learning information storage unit, an operation unit 55 such as a mouse and keyboard, and a display unit 56 with a display. These functions are realized by a program executed by the CPU of a computer circuit suited to neural network processing.

(Function of the behavior determination unit)
The multilayer neural network 46 of the behavior determination unit 16 has been trained by the learning control unit 18 on images of the equipment in the monitoring sections and on images in which a user with equipment use authority handles the equipment.

The behavior determination unit 16 starts the behavior determination operation when the occupancy management information for the office room 12 managed by the client device 24 shows that someone is present. At every predetermined cycle needed for behavior monitoring, set for example to about one minute, it has the image cutout unit 44 hold the monitoring image of the office room 12 captured by the monitoring camera 14, cuts out the images of the one or more monitoring sections appearing in the held monitoring image, and inputs them to the multilayer neural network 46, which outputs an estimate indicating the behavior of the user with equipment use authority.

Taking the monitoring image 36 of FIG. 3 as an example, with the monitoring image 36 held in the image cutout unit 44, the behavior determination unit 16 cuts out the images of monitoring sections 34-10 to 34-12 set in the monitoring image 36 and inputs them one after another to the multilayer neural network 46. For the cut-out image of monitoring section 34-10, for example, the network outputs an estimate indicating the behavior of department manager X, who has the authority to use the user terminal 32-10.

If department manager X does not appear in the cut-out image of monitoring section 34-10 and only the user terminal 32-10 is shown, the multilayer neural network 46 outputs an estimate indicating the presence of the user terminal 32-10.

The estimate that the multilayer neural network 46 outputs for a cut-out monitoring section image takes a value from 0 to 1: it is 1 or close to 1 for an image of the user with equipment use authority, and 0 or close to 0 for an image of a third party.

The multilayer neural network 46 therefore uses a determination threshold set to, for example, 0.5; when the estimate is at or below the threshold of 0.5, it determines abnormal behavior by a third party and outputs the result to the time-series determination unit 48.
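The threshold comparison can be written out as follows. This is only an illustrative sketch: the name `is_abnormal` is hypothetical, and 0.5 is the example threshold given in the description.

```python
DETERMINATION_THRESHOLD = 0.5  # example value from the description


def is_abnormal(estimate: float, threshold: float = DETERMINATION_THRESHOLD) -> bool:
    """Treat an estimate at or below the threshold as abnormal behavior."""
    if not 0.0 <= estimate <= 1.0:
        raise ValueError("estimate must lie between 0 and 1")
    return estimate <= threshold


for value in (0.97, 0.51, 0.5, 0.03):
    print(value, is_abnormal(value))
```

An estimate exactly equal to the threshold counts as abnormal, following the "at or below" wording.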

The time-series determination unit 48 confirms the determination of abnormal behavior by a third party when the multilayer neural network 46 determines abnormal behavior a plurality of times in succession or continuously for a predetermined time, and outputs the result to the person identification unit 17.
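One way to picture the confirmation step is a counter of consecutive abnormal results. This is a sketch under assumptions: the class name `TimeSeriesJudge` and the required count of 3 are hypothetical, since the description only says "a plurality of times" or "a predetermined time".

```python
class TimeSeriesJudge:
    """Confirms abnormal behavior only after enough consecutive abnormal results."""

    def __init__(self, required_consecutive: int = 3) -> None:
        self.required = required_consecutive
        self.count = 0

    def update(self, abnormal: bool) -> bool:
        """Feed one per-cycle result; return True once the abnormality is confirmed."""
        self.count = self.count + 1 if abnormal else 0  # any normal result resets the run
        return self.count >= self.required


judge = TimeSeriesJudge(required_consecutive=3)
results = [judge.update(x) for x in (True, True, False, True, True, True, True)]
print(results)
```

Requiring a run of results in this way suppresses alarms caused by a single misclassified frame.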

The person identification unit 17 stores the user images captured by the entry camera 25 in conjunction with the control, by the entry/exit management facility shown in FIG. 1, that unlocks the electric lock 28. It collates the stored user images with the image of the person whose abnormal behavior was determined by the behavior determination unit 16, and obtains the user information of the matching user from the client device 24 of the entry/exit management facility as person identification information. It then transmits an alarm signal containing the image of that person and the person identification information to, for example, the client device 24 of the entry/exit management facility, so that the third party's abnormal behavior is reported by an alarm, and makes the moving image of the monitoring area recorded by the recording device 20 for a predetermined time before and after the time of the determination available for playback.

The alarm signal from the person identification unit 17 is also transmitted to the user terminal 32 of the monitoring section in which the abnormal behavior was determined. When the user who had left the seat returns and operates the user terminal 32, the alarm for the third party's abnormal behavior is issued and the moving image of the monitoring area recorded by the recording device 20 for a predetermined time before and after the time of the determination can be played back, so that the user can check the determined abnormal behavior and take any necessary action.
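The playback range "a predetermined time before and after the time of the determination" amounts to a simple time window. In this sketch the function name `playback_window` and the 30-second margins are assumptions; the description does not give the length of the predetermined time.

```python
from datetime import datetime, timedelta


def playback_window(
    determined_at: datetime,
    before: timedelta = timedelta(seconds=30),  # assumed margin, not from the description
    after: timedelta = timedelta(seconds=30),   # assumed margin, not from the description
) -> tuple[datetime, datetime]:
    """Return the start and end of the recorded clip to offer for playback."""
    return determined_at - before, determined_at + after


start, end = playback_window(datetime(2017, 11, 27, 10, 0, 0))
print(start.time(), end.time())
```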

When an input monitoring section image shows only the equipment and not the user with equipment use authority, the multilayer neural network 46 of the behavior determination unit 16 outputs 1 or a value close to 1 as the estimate indicating the presence of the monitored equipment.

In contrast, if the monitored equipment has been taken away without authorization, an image of the monitoring section without the equipment is input to the multilayer neural network 46, and the estimate indicating the presence of the equipment becomes 0 or close to 0. Because this is at or below the threshold of 0.5, an alarm signal is output to report the abnormality, and the moving image recorded by the recording device 20 for a predetermined time before and after the occurrence can be played back. Theft of monitored equipment can thus also be monitored.

When the behavior determination unit 16 cuts out a monitoring section image from the monitoring image, it also normalizes the image to a predetermined size before inputting it to the multilayer neural network.

As shown in FIG. 3, the monitoring sections 34-10 to 34-12 set in the monitoring image 36 captured by the monitoring camera 14 appear smaller the farther they are from the monitoring camera 14, so the cut-out images of the monitoring sections differ in vertical and horizontal size. If they were input to the multilayer neural network 46 as they are, determination accuracy would be lower for the smaller images; to prevent this, the cut-out images are normalized to the same size.

Resizing a cut-out monitoring section image to the predetermined size uses known techniques: pixels are thinned out to reduce the image size and interpolated to enlarge it.
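Cutting out and size normalization can be sketched in plain Python as follows. The helper names `crop` and `normalize_size` are hypothetical, images are represented as lists of pixel rows for simplicity, and nearest-neighbor resampling is used here as the simplest stand-in for the thinning and interpolation mentioned above; the description does not name a specific resampling method.

```python
def crop(frame, top, left, height, width):
    """Cut a rectangular monitoring section out of a frame (a list of pixel rows)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def normalize_size(image, out_h, out_w):
    """Nearest-neighbor resize: drops pixels when shrinking, repeats them when enlarging."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A toy 6 x 6 "frame" whose pixel value encodes its position (10 * row + column).
frame = [[10 * r + c for c in range(6)] for r in range(6)]
near = crop(frame, top=0, left=0, height=4, width=4)  # large section close to the camera
far = crop(frame, top=4, left=4, height=2, width=2)   # small section far from the camera

near_norm = normalize_size(near, 3, 3)
far_norm = normalize_size(far, 3, 3)
print(near_norm)
print(far_norm)
```

Both sections end up as 3 x 3 images regardless of their original size, which is the property the network input requires.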

Likewise, when the learning control unit 18 cuts out monitoring section images and stores them as learning images, the cut-out images are normalized to the predetermined size before being stored.

(Multilayer neural network)
FIG. 5 is an explanatory diagram showing the functional configuration of the multilayer neural network shown in FIG. 4; FIG. 5(A) shows an outline and FIG. 5(B) schematically shows the details.

図５（Ａ）に示すように、本実施形態の多層式ニューラルネットワーク４６は、特徴抽出部５８と認識部６０で構成される。特徴抽出部５８は畳み込みニューラルネットワークであり、認識部６０は全結合ニューラルネットワークである。 As shown in FIG. 5(A), the multilayer neural network 46 of the present embodiment is composed of a feature extraction unit 58 and a recognition unit 60. The feature extraction unit 58 is a convolutional neural network, and the recognition unit 60 is a fully connected neural network.

多層式ニューラルネットワーク４６は、深層学習（ディープラーニング）を行うニューラルネットワークであり、中間層を複数つなぎ合わせた深い階層をもつニューラルネットワークであり、特徴抽出となる表現学習を行う。 The multilayer neural network 46 is a neural network that performs deep learning; it has a deep hierarchy in which a plurality of intermediate layers are connected, and performs representation learning, which serves as feature extraction.

通常のニューラルネットワークは、画像から通常在室している利用者を判定するための特徴抽出には、人為的な試行錯誤による作業を必要とするが、多層式ニューラルネットワーク４６では、特徴抽出部５８として畳み込みニューラルネットワークを用いることで、画像の画素値を入力し、学習により最適な特徴を抽出し、認識部６０の全結合ニューラルネットワークに入力して備品使用権限をもつ利用者の行動を示す推定値を出力する。 With an ordinary neural network, extracting features for determining from an image the users who are normally in the room requires manual trial and error. In the multilayer neural network 46, by contrast, a convolutional neural network is used as the feature extraction unit 58, so that the pixel values of the image are input, optimal features are extracted through learning, and these are input to the fully connected neural network of the recognition unit 60, which outputs an estimated value indicating the behavior of a user with equipment use authority.

認識部６０の全結合ニューラルネットワークは、図５（Ｂ）に模式的に示すように、入力層６６、全結合６８、中間層７０と全結合６８の繰り返し、及び出力層７２で構成されている。 As schematically shown in FIG. 5(B), the fully connected neural network of the recognition unit 60 is composed of an input layer 66, full connections 68, repetitions of an intermediate layer 70 and full connections 68, and an output layer 72.

（畳み込みニューラルネットワーク） (Convolutional neural network)
図５（Ｂ）は特徴抽出部５８を構成する畳み込みニューラルネットワークの構造を模式的に示している。 FIG. 5(B) schematically shows the structure of the convolutional neural network that constitutes the feature extraction unit 58.

畳み込みニューラルネットワークは、通常のニューラルネットワークとは少し特徴が異なり、視覚野から生物学的な構造を取り入れている。視覚野には、視野の小区域に対し敏感な小さな細胞の集まりとなる受容野が含まれており、受容野の挙動は、行列の形で重み付けを学習することで模倣できる。この行列は重みフィルタ（カーネル）呼ばれ、生物学的に受容野が果たす役割と同様に、ある画像の類似した小区域に対して敏感になる。 Convolutional neural networks are slightly different in characteristics from ordinary neural networks and adopt biological structures from visual cortex. The visual cortex contains a receptive field that is a collection of small cells that are sensitive to a small area of the visual field, and the behavior of the receptive field can be mimicked by learning weights in the form of a matrix. This matrix is called a weight filter (kernel) and is sensitive to similar subregions of an image, as well as the role played by the receptive field in biological terms.

畳み込みニューラルネットワークは、畳み込み演算により、重みフィルタと小区域との間の類似性を表すことでき、この演算を通して、画像の適切な特徴を抽出することができる。 A convolutional neural network can express the similarity between the weight filter and the small area by convolution, through which the appropriate features of the image can be extracted.

畳み込みニューラルネットワークは、図５（Ｂ）に示すように、まず、入力画像６２に対し重みフィルタ６３により畳み込み処理を行う。例えば、重みフィルタ６３は縦横３×３の所定の重み付けがなされた行列フィルタであり、入力画像６２の各画素にフィルタ中心を位置合わせしながら畳み込み演算を行うことで、入力画像６２の９画素を小区域となる特長マップ６４ａの１画素に畳み込み、複数の特徴マップ６４ａが生成される。 As shown in FIG. 5(B), the convolutional neural network first performs convolution processing on the input image 62 using the weight filter 63. For example, the weight filter 63 is a 3 × 3 matrix filter with predetermined weights; by performing the convolution operation while aligning the filter center with each pixel of the input image 62, nine pixels of the input image 62 are convolved into one pixel of a feature map 64a corresponding to that subregion, and a plurality of feature maps 64a are generated.
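The 3 × 3 filtering step can be sketched as follows. The function name `convolve3x3` and the zero padding at the image border are assumptions for illustration; the description does not say how border pixels are handled. As is usual in convolutional networks, the filter is applied without flipping.

```python
def convolve3x3(image, kernel):
    """Slide a 3x3 weight filter over a 2-D image.

    The filter center is aligned with each pixel in turn, and the nine
    covered pixels are reduced to one feature-map pixel by a weighted sum.
    Pixels outside the image are treated as 0 (an assumption).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += image[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc
    return out


ones_image = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
box_filter = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
feature_map = convolve3x3(ones_image, box_filter)
```

Applying several different filters to the same input image yields the plurality of feature maps mentioned in the text, one per filter.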

続いて、畳み込み演算により得られた特徴マップ６４ａに対しプーリングの演算を行う。プーリングの演算は、識別に不必要な特徴量を除去し、識別に必要な特徴量を抽出する処理である。 Subsequently, pooling operation is performed on the feature map 64 a obtained by the convolution operation. The pooling operation is a process of removing feature amounts unnecessary for identification and extracting feature amounts necessary for identification.
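One common form of the pooling operation is 2 × 2 max pooling, sketched below. The choice of max pooling, the 2 × 2 window, and the name `max_pool2x2` are assumptions; the description does not specify which pooling operation is used.

```python
def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block.

    A trailing odd row or column is dropped. Max pooling is an assumed
    choice used only to illustrate the idea of discarding weak responses.
    """
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c],
                feature_map[r][c + 1],
                feature_map[r + 1][c],
                feature_map[r + 1][c + 1],
            )
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


pooled = max_pool2x2(
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
)
```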

続いて、重みフィルタ６５ａ，６５ｂを使用した畳み込み演算とプーリングの演算を多段に繰り返して特徴マップ６４ｂ，６４ｃが得られ、最後の層の特徴マップ６４ｃを認識部６０に入力し、通常の全結合ニューラルネットワークを用いた認識部６０により備品使用権限をもつ利用者の行動を推定する。 Subsequently, the convolution operation using the weight filters 65a and 65b and the pooling operation are repeated in multiple stages to obtain the feature maps 64b and 64c. The feature map 64c of the last layer is input to the recognition unit 60, which uses an ordinary fully connected neural network to estimate the behavior of a user with equipment use authority.

なお、畳み込みニューラルネットワークにおけるプーリングの演算は、備品使用権限をもつ利用者の識別に不必要な特徴量が必ずしも明確でなく、必要な特徴量を削除する可能性があることから、プーリングの演算は行わないようにしても良い。 Note that, because the feature amounts that are unnecessary for identifying a user with equipment use authority are not always clear and the pooling operation may delete feature amounts that are needed, the pooling operation in the convolutional neural network may be omitted.

（学習制御部の機能） (Function of learning control unit)
図４に示す学習制御部１８の制御部５２は、システムの運用開始前に、例えばオフィスルーム１２の勤務時間帯に監視カメラ１４により撮像された監視領域の動画の中から監視対象として設定した１又は複数の監視区画のフレーム画像を例えば１分周期で１又は複数切出して学習画像として学習画像記憶部５４に記憶する。このような学習画像の記憶を例えば１ケ月程度繰り返すことで多層式ニューラルネットワーク４６の学習に必要な十分の量の学習画像が収集される。 Before the system starts operation, the control unit 52 of the learning control unit 18 shown in FIG. 4 cuts out, for example at one-minute intervals, one or more frame images of the one or more monitoring sections set as monitoring targets from the moving image of the monitoring area captured by the monitoring camera 14, for example during the working hours of the office room 12, and stores them in the learning image storage unit 54 as learning images. By repeating such storage of learning images for, for example, about one month, a sufficient quantity of learning images for training the multilayer neural network 46 is collected.

このようにして学習画像記憶部５４に記憶される学習画像についても、行動判定部１６による監視区画の切出し画像と同様に、画像サイズの正規化が行われている。 In this way, normalization of the image size is also performed on the learning image stored in the learning image storage unit 54 as in the case of the cutout image of the monitoring section by the action determination unit 16.

また、制御部５２は、システム運用中に監視領域の画像を切出して学習画像記憶部５４に記憶する制御を行う。このため制御部５２は、クライアント装置２４の在室管理情報から監視区画の備品使用権限をもつ利用者の在室を検出し、且つ、利用者端末の使用開始を例えばＬＡＮ回線２１に対する利用者端末のネットワーク接続により検出した場合、例えば１０〜１５分といった所定時間のあいだ、例えば１分周期で監視画像から監視区域の画像を切出して学習画像記憶部５４に記憶する制御を行う。 In addition, the control unit 52 performs control to cut out images of the monitoring area during system operation and store them in the learning image storage unit 54. To this end, when the control unit 52 detects from the occupancy management information of the client device 24 that a user with authority to use the equipment of the monitoring section is in the room, and also detects the start of use of the user terminal, for example from the user terminal's network connection to the LAN line 21, it performs control to cut out images of the monitoring section from the monitoring image, for example at one-minute intervals, for a predetermined time such as 10 to 15 minutes, and to store them in the learning image storage unit 54.

このようにシステム運用中の始業時に学習画像記憶部５４に記憶された学習画像は、行動判定部１６の行動判定動作が休止している夜間や休日等の空き時間に、学習制御部１８の制御部５２により学習画像記憶部５４から順次読出されて多層式ニューラルネットワーク４６に入力することで学習する制御が行われる。 The learning images stored in the learning image storage unit 54 in this way at the start of work during system operation are sequentially read out from the learning image storage unit 54 by the control unit 52 of the learning control unit 18 and input to the multilayer neural network 46 for training during idle periods, such as nights and holidays, when the behavior determination operation of the behavior determination unit 16 is paused.

これによりシステム運用中に、監視区画の画像に映される備品使用権限をもつ利用者に服装や髪形等に変化があっても、この変化した監視区画の画像により多層式ニューラルネットワーク４６の学習が行われ、備品使用権限をもつ利用者の行動を示す推定値の低下が抑制され、備品使用権限をもつ利用者の行動を第三者の異常行動と誤って判定してしまうことを未然に防止可能とする。 As a result, even if the clothes, hairstyle, or the like of a user with equipment use authority appearing in the image of the monitoring section changes during system operation, the multilayer neural network 46 is trained with the changed images of the monitoring section. This suppresses a decrease in the estimated value indicating the behavior of the user with equipment use authority and prevents that user's behavior from being erroneously determined to be the abnormal behavior of a third party.

（多層式ニューラルネットワークの学習） (Learning of multilayer neural network)
図４に示した学習制御部１８による多層式ニューラルネットワーク４６の学習は次のようにして行われる。 The learning of the multilayer neural network 46 by the learning control unit 18 shown in FIG. 4 is performed as follows.

入力層、複数の中間層及び出力層で構成されるニューラルネットワークは、各層に複数のユニットを設けて他の層の複数のユニットと結合し、各ユニットにはウェイト（重み）とバイアス値が設定され、複数の入力値とウェイトとのベクトル積を求めてバイアス値を加算して総和を求め、これを所定の活性化関数に通して次の層のユニットに出力するようにしており、最終層に到達するまで値が伝播するフォワードプロパゲーションが行われる。 In a neural network composed of an input layer, a plurality of intermediate layers, and an output layer, each layer is provided with a plurality of units that are connected to a plurality of units in other layers, and a weight and a bias value are set for each unit. The vector product of the plurality of input values and the weights is calculated and the bias value is added to obtain a sum, which is passed through a predetermined activation function and output to the units of the next layer. Forward propagation is thus performed, in which values propagate until they reach the final layer.
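Forward propagation as described above can be sketched in a few lines. The sigmoid activation and the names `layer_forward` and `forward` are assumptions; the description says only that a "predetermined activation function" is used.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice of activation function)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute one layer: weighted sum plus bias, then activation, per unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Propagate values layer by layer until the final layer is reached."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values


# Two inputs -> two hidden units -> one output unit, all parameters zero.
network = [
    ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
    ([[0.0, 0.0]], [0.0]),
]
estimate = forward([1.0, 2.0], network)
```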

このようなニューラルネットワークのウェイトやバイアスを変更するには、バックプロパゲーションとして知られている学習アルゴリズムを使用する。バックプロパゲーションでは、入力値ｘと期待される出力値（期待値）ｙというデータセットをネットワークに与えた場合の教師ありの学習と、入力値ｘのみをネットワークに与えた場合の教師なしの学習があり、本実施形態は、教師ありの学習を行う。 To change the weights and biases of such a neural network, a learning algorithm known as backpropagation is used. Backpropagation includes supervised learning, in which a data set consisting of an input value x and an expected output value (expected value) y is given to the network, and unsupervised learning, in which only the input value x is given to the network; the present embodiment performs supervised learning.

教師ありの学習でバックプロパゲーションを行う場合は、ネットワークを通ってきたフォワードプロパゲーションの結果である推定値ｙ＊と期待値yの値を比較する誤差として、例えば、平均二乗誤差の関数を使用する。 When backpropagation is performed in supervised learning, a function such as the mean squared error is used as the error for comparing the estimated value y*, which is the result of forward propagation through the network, with the expected value y.
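For reference, the mean squared error can be written as follows. The count N of values being averaged and the index i are notation introduced here for illustration; the description names only the estimated value y* and the expected value y.

```latex
E = \frac{1}{N} \sum_{i=1}^{N} \left( y^{*}_{i} - y_{i} \right)^{2}
```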

バックプロパゲーションでは、推定値ｙ＊と期待値ｙの誤差の大きさを使い、ネットワークの後方から前方までウェイトとバイアスを補正しながら値を伝播させる。各ウェイトとバイアスについて補正した量は、誤差への寄与として扱われ、最急降下法で計算され、ウェイトとバイアスの値を変更することにより、誤差関数の値を最小化する。 In back propagation, using the magnitude of the error between the estimated value y * and the expected value y, the value is propagated from the back to the front of the network while correcting the weight and bias. The amount corrected for each weight and bias is treated as a contribution to the error, calculated by the steepest descent method, and the value of the weight and bias is changed to minimize the value of the error function.

ニューラルネットワークに対するバックプロパゲーションによる学習の手順は次にようになる。 The procedure for learning by backpropagation in a neural network is as follows.
（１）入力値ｘをニューラルネットワークに入力して、フォワードプロパゲーションを行い推定値ｙ＊を求める。 (1) The input value x is input to the neural network, and forward propagation is performed to obtain the estimated value y*.
（２）推定値ｙ＊と期待値ｙに基づき誤差関数で誤差を計算する。 (2) The error is calculated with the error function based on the estimated value y* and the expected value y.
（３）ウェイトとバイアスを更新しながら、ネットワークにて、バックプロパゲーションを行う。 (3) Backpropagation is performed through the network while the weights and biases are updated.

この手順は、ニューラルネットワークのウェイトとバイアスの誤差が可能な限り最小になるまで、異なる入力値ｘと期待値ｙの組み合わせを使って繰り返し、誤差関数の値を最小化する。 This procedure is repeated with different combinations of input value x and expected value y until the error attributable to the weights and biases of the neural network is as small as possible, thereby minimizing the value of the error function.
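The three-step procedure and its repetition can be sketched for the simplest possible case, a single unit with one weight and one bias. The sigmoid activation, the squared-error function 0.5·(y* − y)², the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration and are not taken from the description.

```python
import math


def predict(w, b, x):
    """Forward propagation through one sigmoid unit."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train_unit(samples, learning_rate=0.5, epochs=2000):
    """Fit one weight and one bias by steepest descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            y_est = predict(w, b, x)               # (1) forward propagation
            error = y_est - y                      # (2) derivative of 0.5*(y_est - y)**2
            grad = error * y_est * (1.0 - y_est)   # chain rule through the sigmoid
            w -= learning_rate * grad * x          # (3) update weight ...
            b -= learning_rate * grad              #     ... and bias
    return w, b


# Toy data set: input 0 should give an estimate near 0, input 1 near 1.
w, b = train_unit([(0.0, 0.0), (1.0, 1.0)])
low, high = predict(w, b, 0.0), predict(w, b, 1.0)
```

In a multilayer network the same chain-rule step is applied layer by layer from the output back toward the input, which is what gives backpropagation its name.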

［行動判定制御］ [Behavior determination control]
図６は図４の行動判定部による行動判定制御を示したフローチャートである。図６に示すように、行動判定部１６はステップＳ１で監視カメラ１４により撮像された監視領域の画像に対し、例えば図３に示したように、監視区画を設定する。 FIG. 6 is a flowchart showing the behavior determination control performed by the behavior determination unit of FIG. 4. As shown in FIG. 6, in step S1 the behavior determination unit 16 sets monitoring sections, for example as shown in FIG. 3, on the image of the monitoring area captured by the monitoring camera 14.

続いてステップＳ２に進み、行動判定部１６は、クライアント装置２４で管理しているオフィスルーム１２の在室管理情報から在室者の有無を判別し、在室者なしが判別された場合はステップＳ３に進んで行動判定動作を休止し、一方、ステップＳ２で在室者ありが判別された場合はステップＳ４に進んで行動判定動作を開始又は継続し、ステップＳ５に進む。 Subsequently, the process proceeds to step S2, and the behavior determination unit 16 determines the presence or absence of a room occupant from the room occupancy management information of the office room 12 managed by the client device 24. When it is determined that there is no room occupant The process proceeds to step S3 to suspend the action determination operation. On the other hand, when the presence of a room occupant is determined in step S2, the process proceeds to step S4 to start or continue the action determination operation, and the process proceeds to step S5.

行動判定部１６は、ステップＳ５で例えば１分周期となる所定の監視タイミングへの到達を判別するとステップＳ６に進み、そのとき監視カメラ１４により撮像されている動画のフレーム画像を監視画像として画像切出部４４に保持し、監視画像から監視区画の画像を切り出して画像サイズを正規化し、ステップＳ７で多層式ニューラルネットワーク４６に入力し、備品使用権限をもつ利用者の行動を示す推定値を出力する。 When the behavior determination unit 16 determines in step S5 that a predetermined monitoring timing, for example at one-minute intervals, has been reached, the process proceeds to step S6, where the frame image of the moving image being captured by the monitoring camera 14 at that time is held in the image cutout unit 44 as a monitoring image, and the image of the monitoring section is cut out from the monitoring image and normalized in image size. In step S7, the image is input to the multilayer neural network 46, which outputs an estimated value indicating the behavior of a user with equipment use authority.

続いて行動判定部１６は、ステップＳ８で監視区画の画像の入力により多層式ニューラルネットワーク４６から出力された備品使用権限をもつ利用者の行動を示す推定値が所定の閾値以下（又は閾値未満）か否か判別し、備品使用権限をもつ利用者の行動であれば閾値超え（又は閾値以上）となることから、これを判別した場合はステップＳ２に戻って同様な処理を繰り返す。 Subsequently, the action determination unit 16 determines that the estimated value indicating the action of the user having the right to use the equipment output from the multilayer neural network 46 by the input of the image of the monitoring section in step S8 is equal to or less than a predetermined threshold (or less than the threshold) If it is the action of the user who has the right to use the equipment, the threshold value is exceeded (or more than the threshold value), so when it is determined, the process returns to step S2 and the same process is repeated.

これに対し動判定器１６がステップＳ８で監視区画の画像の入力により多層式ニューラルネットワーク４６から出力される備品使用権限をもつ利用者の行動を示す推定値が所定の閾値以下（又は閾値未満）となることを判別した場合は、ステップＳ９に進んで第三者の異常行動と判定し、ステップＳ１０で例えば入退室管理設備のクライアント装置２４や異常行動が判定された監視区画の利用者端末等に警報信号を送信して警報を報知させ、更に、ステップＳ１１に進んで録画装置２０に対し、異常行動の判定時点を含む例えば前後５分間の監視動画の再生を指示し、クライアント装置２４又は利用者端末に送信して表示させる。 In contrast, when the behavior determination unit 16 determines in step S8 that the estimated value indicating the behavior of a user with equipment use authority, output from the multilayer neural network 46 in response to the input image of the monitoring section, is at or below (or below) the predetermined threshold, it proceeds to step S9 and determines abnormal behavior by a third party. In step S10, it transmits an alarm signal to, for example, the client device 24 of the entry/exit management facility or the user terminal of the monitoring section in which the abnormal behavior was determined, so that an alarm is reported. It then proceeds to step S11 and instructs the recording device 20 to play back the monitoring video for, for example, five minutes before and after the time at which the abnormal behavior was determined, which is transmitted to the client device 24 or the user terminal and displayed.
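Steps S8 to S11 amount to a threshold test followed by the computation of a playback window, which can be sketched as follows. The function name `judge_behavior`, the returned dictionary, and its keys are hypothetical; the threshold of 0.5 and the five-minute window follow the example values given in the description.

```python
from datetime import datetime, timedelta


def judge_behavior(estimate, judged_at, threshold=0.5, window_minutes=5):
    """Return None for permitted behavior, otherwise an alarm record.

    An estimate at or below the threshold is treated as abnormal behavior
    by a third party (S9). The record carries the playback window around
    the determination time (S11). The record layout is hypothetical.
    """
    if estimate > threshold:
        return None  # behavior of a user with equipment use authority
    window = timedelta(minutes=window_minutes)
    return {
        "alarm": True,
        "playback_start": judged_at - window,
        "playback_end": judged_at + window,
    }


record = judge_behavior(0.1, datetime(2017, 11, 27, 9, 0))
```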

［学習制御］ [Learning control]
図７は図４の学習制御部による在室者学習制御を示したフローチャートである。図７に示すように、学習制御部１８の制御部５２は、ステップＳ２１でクライアント装置２４の在室管理情報から監視区域に対する備品使用権限をもつ利用者が在室か否か判別し、在室を判別するとステップＳ２２に進んで監視領域に配置されている利用者端末のＬＡＮ回線２１への接続の有無を判別し、接続ありを判別すると利用者端末の使用開始を認識してステップＳ２３に進む。このようなステップＳ２１，Ｓ２２の処理により、制御部５２は、監視領域の備品使用権限のある利用者が出勤して業務を開始する始業時の行動を検出している。 FIG. 7 is a flowchart showing the occupant learning control performed by the learning control unit of FIG. 4. As shown in FIG. 7, in step S21 the control unit 52 of the learning control unit 18 determines, from the occupancy management information of the client device 24, whether a user with equipment use authority for the monitoring section is in the room. If the user is determined to be in the room, the process proceeds to step S22, where the control unit determines whether a user terminal located in the monitoring area is connected to the LAN line 21; if a connection is found, the control unit recognizes that use of the user terminal has started and proceeds to step S23. Through the processing of steps S21 and S22, the control unit 52 detects the start-of-work behavior in which a user with authority to use the equipment in the monitoring area arrives at work and begins work.

ステップＳ２３に進んだ制御部５２は、例えば１分周期となる所定の切出しタイミングへの到達を判別するとステップＳ２４に進み、監視区画の画像を切り出して画像サイズの正規化を行った後に、ステップＳ２５で学習画像として学習画像記憶部４４に記憶させ、これをステップＳ２６で就業開始から１０〜１５分といった所定の学習時間の経過を判別するまで繰り返す。 On proceeding to step S23, when the control unit 52 determines that a predetermined cut-out timing, for example at one-minute intervals, has been reached, it proceeds to step S24, cuts out the image of the monitoring section and normalizes the image size, and then stores the image in the learning image storage unit 54 as a learning image in step S25. This is repeated until the elapse of a predetermined learning time, such as 10 to 15 minutes from the start of work, is determined in step S26.
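The collection schedule of steps S21 to S26 can be sketched as follows. The function names `should_collect` and `capture_times` are hypothetical, and the 10-minute learning period is one value from the 10 to 15 minute range given as an example.

```python
from datetime import datetime, timedelta


def should_collect(user_in_room, terminal_connected):
    """S21/S22: collect only when the authorized user is present and the
    user terminal has connected to the LAN (start of work)."""
    return user_in_room and terminal_connected


def capture_times(work_start, learning_minutes=10, cycle_minutes=1):
    """S23-S26: cut-out timings at a fixed cycle within the learning time."""
    step = timedelta(minutes=cycle_minutes)
    count = learning_minutes // cycle_minutes
    return [work_start + i * step for i in range(count)]


start = datetime(2017, 11, 27, 9, 0)
schedule = capture_times(start) if should_collect(True, True) else []
```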

続いて、制御部５２は、ステップＳ２７で夜間や休日等などにおける行動判定部１６の休止中を判別するとステップＳ２８に進み、学習画像記憶部５４に記憶している監視区画の画像を読み出して多層式ニューラルネットワーク４６に入力し、備品使用権限をもつ利用者の行動を示す推定値が１となるようにバックプロパゲーションにより多層式ニューラルネットワーク４６のウェイトとバイアスを調整する学習制御をステップＳ２９で行動判定部１６の休止解除が判別されるまで繰り返す。 Subsequently, when the control unit 52 determines in step S27 that the behavior determination unit 16 is paused, for example at night or on a holiday, it proceeds to step S28, reads out the images of the monitoring section stored in the learning image storage unit 54, and inputs them to the multilayer neural network 46. Learning control that adjusts the weights and biases of the multilayer neural network 46 by backpropagation, so that the estimated value indicating the behavior of a user with equipment use authority becomes 1, is repeated until release of the pause of the behavior determination unit 16 is determined in step S29.

続いて、制御部５２は、ステップＳ２９で行動判定部１６における行動判定動作の休止解除（起動）を判別するとステップＳ３０に進んで多層式ニューラルネットワーク４６の学習を中断又は終了し、ステップＳ２１からの処理を繰り返す。 Subsequently, when the control unit 52 determines in step S29 that the pause of the behavior determination operation in the behavior determination unit 16 has been released (the operation has been activated), it proceeds to step S30 to interrupt or end the training of the multilayer neural network 46, and repeats the processing from step S21.

［行動監視装置の第２実施形態］ [Second embodiment of behavior monitoring device]
図８は図１の行動監視装置の第２実施形態を機能構成により示したブロック図である。 FIG. 8 is a block diagram showing the functional configuration of a second embodiment of the behavior monitoring device of FIG. 1.

図８に示すように、第２実施形態の行動監視装置１０に設けられた行動判定部１６は、画像切出部４４、画像解析部７４、行動認識部７６及び時系列判定部４８で構成され、画像解析部７４には畳み込みニューラルネットワーク７８と再帰型ニューラルネットワーク８０が設けられ、また、行動認識部７６には判定器８２とシソーラス辞書８４が設けられている。シソーラス辞書８４には備品使用権限をもたない第三者を識別するための異常行動判定単語が記憶されている。 As shown in FIG. 8, the behavior determination unit 16 provided in the behavior monitoring device 10 according to the second embodiment is composed of an image cutout unit 44, an image analysis unit 74, a behavior recognition unit 76, and a time-series determination unit 48. The image analysis unit 74 is provided with a convolutional neural network 78 and a recurrent neural network 80, and the behavior recognition unit 76 is provided with a determiner 82 and a thesaurus dictionary 84. The thesaurus dictionary 84 stores abnormal behavior determination words for identifying a third party without equipment use authority.

また、学習制御部１８は、画像切出部５０、制御部５２、学習情報記憶部として機能する学習データセット記憶部１００、マウスやキーボード等の操作部５５及びディスプレイを備えた表示部５６で構成される。これらの機能は、ニューラルネットワークの処理に対応したコンピュータ回路のＣＰＵによるプログラムの実行により実現される。 The learning control unit 18 is composed of an image cutout unit 50, a control unit 52, a learning data set storage unit 100 functioning as a learning information storage unit, an operation unit 55 such as a mouse and keyboard, and a display unit 56 provided with a display. These functions are realized by a CPU of a computer circuit suited to neural network processing executing a program.

（行動判定部の機能） (Function of behavior determination unit)
図８に示すように、行動判定部１６の画像切出部４４は監視カメラ１４で撮像されたオフィスルーム１２などの監視画像を所定周期毎に読み込んで保持し、保持した監視画像に設定された１又は複数の監視区画の画像を切り出して画像サイズを正規化した後に画像解析部７４に出力する。 As shown in FIG. 8, the image cutout unit 44 of the behavior determination unit 16 reads and holds, at each predetermined cycle, a monitoring image of the office room 12 or the like captured by the monitoring camera 14, cuts out the images of the one or more monitoring sections set in the held monitoring image, normalizes their image size, and outputs them to the image analysis unit 74.

画像解析部７４に設けられた畳み込みニューラルネットワーク７８は入力した監視区画の画像の特徴量を抽出して出力する。再帰型ニューラルネットワーク８０は畳み込みニューラルネットワーク７８から出力された特徴量を入力し、備品使用権限をもつ利用者又は第三者の行動を説明する入力画像の行動説明文（「画像説明文」ともいう）を生成して出力する。 The convolutional neural network 78 provided in the image analysis unit 74 extracts and outputs the feature amounts of the input image of the monitoring section. The recurrent neural network 80 receives the feature amounts output from the convolutional neural network 78, and generates and outputs a behavior description (also called an "image description") of the input image that describes the behavior of a user with equipment use authority or of a third party.

行動認識部７６の判定器８２は、画像解析部７４の再帰型ニューラルネットワーク８０から出力された行動説明文を構成する１又は複数の単語と、シソーラス辞書８４に記憶されている備品使用権限をもたない第三者の行動を示す異常行動判定単語とを比較し、行動説明文の単語がシソーラス辞書７２の異常行動判定単語に一致又は類似した場合に備品使用権限をもたない第三者の行動を示す異常行動を判定し、時系列判定部４８に出力する。 The determiner 82 of the behavior recognition unit 76 compares the one or more words making up the behavior description output from the recurrent neural network 80 of the image analysis unit 74 with the abnormal behavior determination words, stored in the thesaurus dictionary 84, that indicate the behavior of a third party without equipment use authority. When a word of the behavior description matches or is similar to an abnormal behavior determination word in the thesaurus dictionary 84, the determiner determines abnormal behavior indicating the behavior of a third party without equipment use authority and outputs the result to the time-series determination unit 48.
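The comparison performed by the determiner can be sketched as a lookup against a small thesaurus. The dictionary contents, the representation of "similar" words as listed synonyms, and the name `is_abnormal` are all hypothetical; the description does not define how similarity is measured.

```python
# Hypothetical thesaurus: each abnormal behavior determination word maps
# to a set of words treated as similar to it.
THESAURUS = {
    "suspicious person": {"intruder", "stranger", "unknown person"},
}


def is_abnormal(description_words, thesaurus):
    """Return True if any word matches or is similar to a dictionary word."""
    for word in description_words:
        for headword, similar_words in thesaurus.items():
            if word == headword or word in similar_words:
                return True
    return False


permitted = is_abnormal(["director X", "user terminal", "operate"], THESAURUS)
abnormal = is_abnormal(["stranger", "user terminal", "operate"], THESAURUS)
```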

時系列判定部４８は、行動認識部７６による異常行動の判定結果が複数回又は所定時間に亘り連続した場合に、第三者による異常行動との判定を確定させ、図１に示した人物特定部１７に出力する。 When the determination result of abnormal behavior by the behavior recognition unit 76 continues a plurality of times or over a predetermined time, the time-series determination unit 48 confirms the determination of abnormal behavior by a third party and outputs it to the person identification unit 17 shown in FIG. 1.
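The time-series confirmation can be sketched as a check for a run of consecutive abnormal results. The function name `confirm_abnormal` and the required run length of three are assumptions; the description says only "a plurality of times or over a predetermined time".

```python
def confirm_abnormal(results, required_consecutive=3):
    """Confirm abnormal behavior only after enough consecutive detections.

    `results` is a sequence of per-cycle determinations (True = abnormal).
    A single normal result resets the run, which filters out isolated
    misjudgments. The run length of 3 is an assumed value.
    """
    run = 0
    for abnormal in results:
        run = run + 1 if abnormal else 0
        if run >= required_consecutive:
            return True
    return False


isolated = confirm_abnormal([True, False, True, False, True])
sustained = confirm_abnormal([False, True, True, True])
```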

人物特定部１７には、入退室管理設備による電気錠２８を解錠する制御に連動して入室カメラ２５により撮像された利用者画像が記憶されており、記憶している利用者画像と行動認識部７６で異常行動が判定された人物の画像と照合し、一致した利用者の利用者情報を入退室管理設備のクライアント装置２４から取得して人物特定情報とし、異常行動が判定された人物の画像、人物特定情報を含む警報信号例えば入退室管理設備のクライアント装置２４に送信して警報信号を送信して第三者の異常行動の警報を報知させると共に、録画装置２０で録画している異常行動を判定した時点を含む前後所定時間の監視領域の動画を再生してクライアント装置２４などに表示させる。 The person identification unit 17 stores user images captured by the entry camera 25 in conjunction with the control by which the entry/exit management facility unlocks the electric lock 28. It collates the stored user images with the image of the person for whom the behavior recognition unit 76 determined abnormal behavior, acquires the user information of the matching user from the client device 24 of the entry/exit management facility, and uses it as person identification information. It then transmits an alarm signal including the image of the person determined to have behaved abnormally and the person identification information to, for example, the client device 24 of the entry/exit management facility to report an alarm of abnormal behavior by a third party, and also causes the moving image of the monitoring area recorded by the recording device 20, for a predetermined time before and after the time at which the abnormal behavior was determined, to be played back and displayed on the client device 24 or the like.

（学習制御部の機能） (Function of learning control unit)
行動判定部１６の画像解析部７４に設けられた折り畳みニューラルネットワーク７８と再帰型ニューラルネットワーク８０は、学習制御部１８の学習データセット記憶部１００に予め記憶された学習画像とその行動説明文のペアからなる多数の学習データセットを使用して制御部５２により学習されている。 The convolutional neural network 78 and the recurrent neural network 80 provided in the image analysis unit 74 of the behavior determination unit 16 are trained by the control unit 52 using a large number of learning data sets, each consisting of a pair of a learning image and its behavior description, stored in advance in the learning data set storage unit 100 of the learning control unit 18.

学習データセット記憶部１００に記憶されている学習画像は、例えば、図４の第１実施形態と同様に、監視カメラ１４により撮像された例えば図３に示した監視区画３４−１０〜３４−１２により切出されて画像サイズが正規化された画像である。 The learning image stored in the learning data set storage unit 100 is, for example, the monitoring section 34-10 to 34-12 shown in FIG. 3, for example, captured by the monitoring camera 14 as in the first embodiment of FIG. It is an image which has been cut out and the image size has been normalized.

また、学習データセット記憶部１００には、記憶された多数の学習画像に対応して行動説明文が準備され、学習画像と行動説明文のペアからなる多数の学習データセットとして記憶されている。 Also, in the learning data set storage unit 100, action explanatory sentences are prepared corresponding to the stored many learning images, and are stored as a large number of learning data sets consisting of pairs of learning images and action explanatory sentences.

例えば、図３に示した監視区画３４−１０から切出された学習画像に対しては、例えば「部長Ｘが利用者端末を操作している」といった行動説明文が準備され、また、監視区画３４−１０から切出された部長Ｘが席を外している学習画像には、例えば「利用者端末が机の上に置いてある」といった行動説明文（画像説明文）が準備され、それぞれ学習画像とペアとなって学習データセット記憶部１００に記憶されている。 For example, for a learning image cut out from the monitoring section 34-10 shown in FIG. 3, a behavior description such as "Director X is operating the user terminal" is prepared, and for a learning image cut out from the monitoring section 34-10 in which director X is away from the desk, a behavior description (image description) such as "The user terminal is on the desk" is prepared. Each is stored in the learning data set storage unit 100 paired with its learning image.

制御部５２は、入力層、複数の中間層及び全結合の出力層で構成された学習用の畳込みニューラルネットワーク、即ち図５に示した特徴抽出部５５の畳み込みニューラルネットワークと認識部６０の全結合ニューラルネットワークで構成された多層式ニューラルネットワークを学習用に準備し、まず、学習データセット記憶部１００に記憶されている多数の学習画像を読み出し、学習用の多層式ニューラルネットワークに学習画像として入力し、バックプロパゲーション法（逆伝播法）により学習させる。 The control unit 52 prepares, for training, a convolutional neural network for learning composed of an input layer, a plurality of intermediate layers, and a fully connected output layer, that is, a multilayer neural network composed of the convolutional neural network of the feature extraction unit 58 and the fully connected neural network of the recognition unit 60 shown in FIG. 5. It first reads out the many learning images stored in the learning data set storage unit 100, inputs them to the multilayer neural network for learning as learning images, and trains it by the backpropagation method.

続いて、制御部５２は、学習済みの畳込みニューラルネットワークに得られたウェイトとバイアスを、図９に示した出力層を持たない畳み込みニューラルネットワーク７８にセットして学習済みとし、学習済みの畳込みニューラルネットワーク７８に学習画像を入力して特徴量を抽出し、抽出した特徴量と入力した学習画像とペアになっている行動説明文を再帰型ニューラルネットワーク８０に入力し、バックプロパゲーション法により学習させる。 Subsequently, the control unit 52 sets the weights and biases obtained in the trained convolutional neural network into the convolutional neural network 78 shown in FIG. 9, which has no output layer, so that it is regarded as trained. It then inputs a learning image to the trained convolutional neural network 78 to extract feature amounts, inputs the extracted feature amounts and the behavior description paired with the input learning image to the recurrent neural network 80, and trains it by the backpropagation method.

このように学習制御部１８により学習された畳み込みニューラルネットワーク７８と再帰型ニューラルネットワーク８０で構成された画像解析部７８に、例えば図３に示した備品使用権限をもつ部長Ｘが利用者端末３２−１０を操作している監視区画３４−１０の画像が入力されると、例えば「部長Ｘが利用者端末を操作している」といった行動説明文が出力される。この場合には、行動認識部７６により異常行動判定単語との不一致が判定され、第三者の異常行動とは認識されない。 When, for example, an image of the monitoring section 34-10 in which director X, who has equipment use authority as shown in FIG. 3, is operating the user terminal 32-10 is input to the image analysis unit 74, which is composed of the convolutional neural network 78 and the recurrent neural network 80 trained by the learning control unit 18 in this way, a behavior description such as "Director X is operating the user terminal" is output. In this case, the behavior recognition unit 76 determines that there is no match with the abnormal behavior determination words, and the behavior is not recognized as abnormal behavior by a third party.

これに対し、画像解析部７８に、例えば部長Ｘ以外の第三者が利用者端末３２−１０を操作している監視区画３４−１０の画像が入力されると、例えば「不審者が利用者端末を操作している」といった行動説明文が出力され、行動認識部７６で例えば「不審者」、「利用端末」、「操作」といった異常行動判定単語との一致が判定され、第三者の異常行動を認識して報知させることができる。 In contrast, when an image of the monitoring section 34-10 in which, for example, a third party other than director X is operating the user terminal 32-10 is input to the image analysis unit 74, a behavior description such as "A suspicious person is operating the user terminal" is output. The behavior recognition unit 76 determines a match with abnormal behavior determination words such as "suspicious person", "user terminal", and "operation", so that the abnormal behavior of the third party can be recognized and reported.

［画像解析部の多層式ニューラルネットワーク］ [Multilayer neural network of image analysis unit]
図９は図８の画像解析部に設けられた畳み込みニューラルネットワークと再帰型ニューラルネットワークの機能構成を示した説明図である。 FIG. 9 is an explanatory diagram showing the functional configurations of the convolutional neural network and the recurrent neural network provided in the image analysis unit of FIG. 8.

（畳み込みニューラルネットワーク） (Convolutional neural network)
図９に示すように、畳み込みニューラルネットワーク７８は入力層８５、複数の中間層８６で構成されている。通常の畳み込みニューラルネットワークは最後の中間層８６の後に、入力層、複数の中間層及び出力層を全結合して画像の特徴量から出力を推定する図５に示したように全結合ニューラルネットワークを設けているが、本実施形態は、入力画像の特徴量を抽出するだけで良いことから、後段の全結合ニューラルネットワークは設けていない。 As shown in FIG. 9, the convolutional neural network 78 is composed of an input layer 85 and a plurality of intermediate layers 86. An ordinary convolutional neural network is provided, after the last intermediate layer 86, with a fully connected neural network as shown in FIG. 5, in which an input layer, a plurality of intermediate layers, and an output layer are fully connected to estimate an output from the feature amounts of the image. In the present embodiment, however, it is sufficient to extract the feature amounts of the input image, so no fully connected neural network is provided at the subsequent stage.

畳み込みニューラルネットワーク７８は、図５に示したと同様、通常のニューラルネットワークとは少し特徴が異なり、視覚野から生物学的な構造を取り入れている。畳み込みニューラルネットワーク７８は、畳み込み演算により、重みフィルタと小区域との間の類似性を表すことでき、この演算を通して、画像の適切な特徴を抽出することができる。 The convolutional neural network 78, as shown in FIG. 5, is slightly different in characteristics from a normal neural network, and takes in a biological structure from the visual cortex. The convolutional neural network 78 can represent the similarity between the weight filter and the small area by convolution and through this operation the appropriate features of the image can be extracted.

畳み込みニューラルネットワーク７８は、入力層８５に入力した入力画像に対し重みフィルタにより畳み込み処理を行い、中間層８６に特徴マップが生成される。続いて、畳み込み演算により得られた中間層８６の特徴マップに対しプーリングの演算を行う。 The convolutional neural network 78 performs a convolution process on the input image input to the input layer 85 using a weight filter, and a feature map is generated on the intermediate layer 86. Subsequently, pooling operation is performed on the feature map of the intermediate layer 86 obtained by the convolution operation.

The convolution with a weight filter and the pooling operation are then repeated for each intermediate layer 86, so that feature maps are generated up to the last intermediate layer 86. In the present embodiment, the feature map generated in an arbitrary intermediate layer 86 is input to the recurrent neural network 80 as the feature quantity of the input image.
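The repeated convolution and pooling described above, with the feature map of a chosen intermediate layer taken as the output, can be sketched as follows. This is a minimal single-channel illustration in plain Python, not the implementation of the convolutional neural network 78. The function names, the use of max pooling, the 2x2 pooling size, and the absence of an activation function are all assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the weight filter over the image (stride 1, no padding).

    As in most deep-learning frameworks, the filter is not flipped,
    so this is strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete border blocks are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[y * size + i][x * size + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def extract_features(image: Matrix, kernels: List[Matrix], tap_layer: int) -> List[float]:
    """Apply convolution + pooling layer by layer and return the flattened
    feature map of the layer with index tap_layer (hypothetical parameter)."""
    fmap = image
    for layer_index, kernel in enumerate(kernels):
        fmap = max_pool(conv2d_valid(fmap, kernel))
        if layer_index == tap_layer:
            break
    return [value for row in fmap for value in row]
```

Because no fully connected stage follows, the flattened feature map is the only output; in the embodiment it would be handed to the LSTM input layer 87.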

The convolutional neural network 78 is trained by the learning control unit 18 shown in FIG. 8, which inputs the learning images stored in the learning data set storage unit 100. Through this training it can generate images having clustered feature quantities that group similar images together.

(Recurrent neural network)
The recurrent neural network 80 shown in FIG. 9 predicts an action description by taking as input the image feature quantities extracted by the convolutional neural network 78 together with word vectors.

The recurrent neural network 80 of the present embodiment uses an LSTM-LM (Long Short-Term Memory Language Model), a deep learning model suited to time-series data.

An ordinary recurrent neural network model consists of an input layer, a hidden layer, and an output layer, and performs time-series analysis that draws on past history by feeding the hidden-layer information into the next time step. The LSTM model, in contrast, computes the probability that each word is chosen as the t-th word given the t-1 words of the preceding context. That is, the LSTM model takes three inputs: the hidden state of the previous time step, which carries the word information of times 1 to t-1; the word at time t-1, which is the prediction result of the previous time step; and external information. It generates a sentence by repeatedly predicting the next word.
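The three-input recurrence and the resulting word probability can be written compactly as below. The notation is an assumption introduced for illustration: S_t is the t-th word, I is the external information (here the image feature quantity), and h_t is the hidden state at time t.

```latex
% Assumed notation: S_t = t-th word, I = external information, h_t = hidden state.
h_{t} = \mathrm{LSTM}\bigl(h_{t-1},\; S_{t-1},\; I\bigr), \qquad
p_t(S_t) = P\bigl(S_t \mid S_1, \dots, S_{t-1},\, I\bigr)

% Probability of a whole sentence of N words as a product of per-step probabilities.
P\bigl(S_1, \dots, S_N \mid I\bigr) = \prod_{t=1}^{N} p_t(S_t)
```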

The recurrent neural network 80 of FIG. 9 comprises: an LSTM input layer 87 that converts the feature vector of the image extracted by the convolutional neural network 78 into a matrix to be input to the LSTM hidden layer 88; a vector conversion unit 92 that converts the words S_0 to S_{N-1}, stored word by word in the register 90, into vectors W_e S_0 to W_e S_{N-1}; N-1 stages of LSTM hidden layers 88; a probability conversion unit 94 that converts the outputs of the LSTM hidden layers 88 into appearance probabilities p_1 to p_N; and a cost calculation unit 96 that calculates a cost from the word output probabilities using the cost functions log p_1(S_1) to log p_N(S_N) and minimizes that cost.

(Training of the recurrent neural network)
The parts of the recurrent neural network 80 to be trained are the vector conversion unit 92 and the LSTM hidden layer 88. For feature extraction, the already learned parameters (weights and biases) of the convolutional neural network 78 are used unchanged.

The training data consist of a learning image I and the word sequence {S_t} (t = 0, ..., N) of its action description, and training proceeds as follows.
(1) Input the image I to the convolutional neural network 78 and take the output of a specific intermediate layer 86 as the feature vector.
(2) Input the feature vector to the LSTM hidden layer 88.
(3) Input the words S_t in order from t = 0 to t = N-1, obtaining the probability p_{t+1} at each step.
(4) Minimize the cost obtained from the probability p_{t+1}(S_{t+1}) of outputting the word S_{t+1}.
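Steps (3) and (4) can be illustrated with the cost of a single training sentence. The sketch below assumes the cost is the negative sum of log p_{t+1}(S_{t+1}), which is the usual reading of a log-probability cost; the text only states that the cost is derived from these probabilities. The function name, the dictionary representation of a distribution, and the small floor value are hypothetical.

```python
import math
from typing import Dict, Sequence


def sentence_cost(step_probs: Sequence[Dict[str, float]],
                  words: Sequence[str],
                  floor: float = 1e-12) -> float:
    """Return -sum_t log p_{t+1}(S_{t+1}) for one action description.

    words      : S_0 ... S_N (S_0 is the start symbol)
    step_probs : step_probs[t] is the distribution p_{t+1} obtained after
                 feeding S_t, mapping each vocabulary word to a probability
    floor      : assumed lower bound that avoids log(0)
    """
    if len(step_probs) != len(words) - 1:
        raise ValueError("need exactly one distribution per input word S_0..S_{N-1}")
    cost = 0.0
    for t, probs in enumerate(step_probs):
        target = words[t + 1]                      # the word that should come next
        cost -= math.log(max(probs.get(target, 0.0), floor))
    return cost
```

Training would adjust the parameters of the vector conversion unit 92 and the LSTM hidden layer 88 so that this value decreases over the learning data set.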

(Generation of the image description)
To generate an action description of an input image using the trained convolutional neural network 78 and recurrent neural network 80, the image cut out from the monitoring section is input to the convolutional neural network 78, the resulting feature vector is input to the recurrent neural network 80, and word sequences are ranked in descending order of the product of their word appearance probabilities to generate the action description. The procedure is as follows.

(1) Input the image to the convolutional neural network 78 and take the output of a specific intermediate layer 86 as the feature vector.
(2) Input the feature vector from the LSTM input layer 87 to the LSTM hidden layer 88.
(3) Convert the sentence start symbol <S> into a vector using the vector conversion unit 92 and input it to the LSTM hidden layer 88.
(4) The output of the LSTM hidden layer 88 gives the appearance probability of each word, so select the top M words (for example, M = 20).
(5) Convert the word output in the previous step into a vector using the vector conversion unit 92 and input it to the LSTM hidden layer 88.
(6) From the output of the LSTM hidden layer 88, compute the product of the probabilities of the words output so far, and select the top M word sequences.
(7) Repeat steps (5) and (6) until the output word is the end symbol.
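Steps (3) to (7) amount to a beam search of width M. The following is a minimal sketch under stated assumptions: `next_word_probs` is a hypothetical stand-in for the vector conversion unit 92, the LSTM hidden layer 88, and the probability conversion unit 94 combined, and it receives the whole word prefix instead of carrying a hidden state. The end symbol `</S>` and the length limit are also assumptions.

```python
import math
from typing import Callable, Dict, List, Tuple

StepFn = Callable[[List[str]], Dict[str, float]]
Hypothesis = Tuple[List[str], float]  # (words so far, log of probability product)


def beam_search(next_word_probs: StepFn,
                beam_width: int = 20,
                start: str = "<S>",
                end: str = "</S>",
                max_len: int = 30) -> List[Hypothesis]:
    """Keep the beam_width most probable word sequences at every step."""
    beams: List[Hypothesis] = [([start], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        candidates: List[Hypothesis] = []
        for words, logp in beams:
            for word, p in next_word_probs(words).items():
                if p > 0.0:
                    # Adding logs is equivalent to multiplying probabilities.
                    candidates.append((words + [word], logp + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for words, logp in candidates[:beam_width]:
            if words[-1] == end:
                finished.append((words, logp))   # sentence is complete
            else:
                beams.append((words, logp))      # keep extending
        if not beams:
            break
    return sorted(finished, key=lambda c: c[1], reverse=True)
```

The first element of the returned list is the word sequence with the highest probability product, which would serve as the action description passed on for determination.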

In this way, the behavior determination unit 16 cuts out and analyzes the image of each monitoring section set in the monitoring image captured by the monitoring camera 14, and can thereby determine and report abnormal behavior by a third party who has no authority to use the equipment in that monitoring section.

Note that the convolutional neural network 78 and the recurrent neural network 80 of the present embodiment may be trained by either unsupervised or supervised learning.

[Behavior determination control]
FIG. 10 is a flowchart showing behavior determination control by the behavior determination unit of FIG. 8. As shown in FIG. 10, in step S31 the behavior determination unit 16 sets monitoring sections on the image of the monitoring area captured by the monitoring camera 14, for example as shown in FIG. 3.

The process then proceeds to step S32, where the behavior determination unit 16 determines whether anyone is in the room from the occupancy management information for the office room 12 managed by the client device 24. If no occupant is present, it proceeds to step S33 and suspends the behavior determination operation. If an occupant is present, it proceeds to step S34, starts the behavior determination operation, and proceeds to step S35.

When the behavior determination unit 16 determines in step S35 that a predetermined monitoring timing, for example a one-minute cycle, has been reached, it proceeds to step S36, holds the frame image of the moving image being captured by the monitoring camera 14 at that time in the image cutout unit 44 as the monitoring image, cuts out the image of the monitoring section from the monitoring image, and normalizes the image size. In step S37 it inputs the image to the multilayer neural network 46, which outputs an action description to the determiner 82 of the action recognition unit 76.
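The cutout and size normalization in step S36 can be illustrated as follows. This is only a sketch: the rectangle parameters, the nearest-neighbour resampling, and the function names are assumptions, since the text does not state how the normalization is performed.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of luminance values


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut the rectangle of one monitoring section out of the monitoring image."""
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Normalize to a fixed size by nearest-neighbour sampling (assumed method)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def cut_out_section(image: Image, top: int, left: int, height: int, width: int,
                    out_h: int, out_w: int) -> Image:
    """Crop a monitoring section and bring it to the network's input size."""
    return resize_nearest(crop(image, top, left, height, width), out_h, out_w)
```

Normalizing every section to the same size lets sections of different shapes share one network input format.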

In step S38, the determiner 82 compares the words making up the action description with the abnormal-behavior determination words registered in the thesaurus dictionary 84. If some or all of them match, it proceeds to step S40 and determines that a third party has behaved abnormally. In step S41 it transmits an alarm signal to, for example, the client device 24 of the entry/exit management facility or the user terminal of the monitoring section in which the abnormal behavior was determined, so that an alarm is issued. It then proceeds to step S42 and instructs the recording device 20 to play back the monitoring video for, for example, five minutes before and after the time of the determination, and the video is transmitted to the client device 24 or the user terminal for display.
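The word comparison of step S38 and the playback window of step S42 can be sketched as below. The thesaurus layout (an entry word mapped to its synonyms), the English tokenization by regular expression, and the function names are assumptions for illustration; a Japanese action description would need a morphological analyzer instead of the simple split used here.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, Set, Tuple


def matched_abnormal_words(description: str,
                           thesaurus: Dict[str, Set[str]]) -> Set[str]:
    """Return the dictionary entries whose word or synonym occurs in the description."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return {
        entry
        for entry, synonyms in thesaurus.items()
        if words & (synonyms | {entry})
    }


def playback_window(detected_at: datetime,
                    margin: timedelta = timedelta(minutes=5)) -> Tuple[datetime, datetime]:
    """Start and end of the recorded video to replay around the determination time."""
    return detected_at - margin, detected_at + margin
```

A non-empty result from `matched_abnormal_words` corresponds to the branch to step S40; an empty result means no abnormal behavior was determined for that frame.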

[Learning control]
FIG. 11 is a flowchart showing occupant learning control by the learning control unit of FIG. 8. As shown in FIG. 11, when the control unit 52 of the learning control unit 18 determines in step S51, from the occupancy management information of the client device 24, that a user who has authority to use the equipment in the monitoring section (an authorized user) is in the room, it proceeds to step S52 and determines whether the user terminal placed in the monitoring section is connected to the LAN line 21. If it is connected, the control unit recognizes that use of the user terminal has started and proceeds to step S53. Through steps S51 and S52, the control unit 52 detects the start-of-work behavior in which a user with equipment use authority arrives and begins work.

Having proceeded to step S53, the control unit 52, on determining that a predetermined cutout timing, for example a one-minute cycle, has been reached, proceeds to step S54, cuts out the image of the monitoring area and normalizes the image size, and then determines in step S55 whether a person is present in the cut-out image.

The presence of a person in the cut-out image in step S55 may be determined, for example, by storing in advance a background image taken when no person is present, taking the difference between it and the cut-out image, and judging that a person is present when the sum of the luminance values of the difference image is equal to or greater than a predetermined threshold.
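The background-difference test of step S55 can be sketched as follows. The use of the absolute per-pixel difference and the function name are assumptions; the text specifies only that the luminance values of the difference image are summed and compared with a threshold.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of luminance values


def person_present(background: Image, frame: Image, threshold: int) -> bool:
    """Judge that a person is present when the summed luminance of the
    difference image reaches the threshold."""
    diff_sum = sum(
        abs(f - b)                       # assumed: absolute difference per pixel
        for bg_row, fr_row in zip(background, frame)
        for b, f in zip(bg_row, fr_row)
    )
    return diff_sum >= threshold
```

Both images are assumed to have the same size, which holds when they are cut out from the same monitoring section and normalized in the same way.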

If the control unit 52 determines in step S55 that a person is present in the cut-out image, it proceeds to step S56 and generates an action description for the authorized user, for example "An authorized user is operating the user terminal." In step S58 it stores the pair of monitoring section image and action description in the learning data set storage unit 100 as a learning data set.

If the control unit 52 determines in step S55 that no person is present in the cut-out image, it proceeds to step S57 and generates an action description for the case where no authorized user is present, for example "The user terminal is placed on the desk." In step S58 it stores the pair of monitoring section image and action description in the learning data set storage unit 100 as a learning data set.

The control unit 52 repeats this storing of learning information in steps S53 to S58 until it determines in step S59 that a predetermined learning time, for example 10 to 15 minutes, has elapsed.
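The collection loop of steps S53 to S59 can be sketched as below. Frames are modelled as (timestamp in seconds, image) pairs and the person check is passed in as a callable; both are assumptions made so the example runs without a camera. The two caption strings follow the examples given in the text, and the default period and duration use the one-minute and ten-minute figures mentioned there.

```python
from typing import Any, Callable, Iterable, List, Tuple

CAPTION_WITH_USER = "An authorized user is operating the user terminal"
CAPTION_NO_USER = "The user terminal is placed on the desk"


def collect_learning_pairs(frames: Iterable[Tuple[float, Any]],
                           has_person: Callable[[Any], bool],
                           period_s: float = 60.0,
                           duration_s: float = 600.0) -> List[Tuple[Any, str]]:
    """Pair section images with fixed captions at every cutout timing."""
    dataset: List[Tuple[Any, str]] = []
    start = None
    next_sample = 0.0
    for timestamp, image in frames:
        if start is None:
            start = next_sample = timestamp
        if timestamp - start >= duration_s:      # S59: learning time elapsed
            break
        if timestamp >= next_sample:             # S53: cutout timing reached
            caption = CAPTION_WITH_USER if has_person(image) else CAPTION_NO_USER
            dataset.append((image, caption))     # S58: store the pair
            next_sample += period_s
    return dataset
```

Because the captions are generated automatically from the person check, no manual labelling is needed while the authorized user is at work.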

Next, when the control unit 52 determines in step S60 that the behavior determination unit 16 is suspended, for example at night or on a holiday, it proceeds to step S61, reads the data sets of monitoring section images and action descriptions stored in the learning data set storage unit 100, and inputs them to the multilayer neural network composed of the convolutional neural network 78 and the recurrent neural network 80. It repeats learning control that adjusts the weights and biases of the multilayer neural network so that the action description is estimated from the input image, until release of the suspension of the behavior determination unit 16 is determined in step S62.

When the control unit 52 determines in step S62 that the suspension of the behavior determination operation in the behavior determination unit 16 has been released (the operation has been activated), it proceeds to step S63, interrupts or ends the training of the multilayer neural network, and repeats the processing from step S51.

[Modifications of the present invention]
(Abnormality determination according to the occupant)
In the second embodiment described above, the person determined to be behaving abnormally is identified by the person identification unit 17, but the embodiment is not limited to this. After a person is identified, it may be determined whether the behavior is one permitted for that person. An action dictionary of abnormal-behavior determination words, that is, the actions to be judged abnormal, is created for each person normally present in the area and for persons not normally present in the area. During operation, the person is identified by the image analysis unit or by comparing the user image with the monitoring image. It is then determined whether the action of the person output by the image analysis unit is one to be judged abnormal for the identified person. This makes it possible to monitor whether the action is one permitted for that person. For example, operating a PC is not a problem for a regular occupant, but an alarm is issued when the person is not a regular occupant because confidential information may leak. Likewise, an alarm is issued when an employee tries to take out a file that a department manager may take out but employees may not.
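A per-person action dictionary of this kind can be sketched as a mapping from a person identifier to the set of action words judged abnormal for that person. The identifiers, the action words, the `UNKNOWN` fallback key, and the function name are all hypothetical and chosen to mirror the PC and file examples above.

```python
from typing import Dict, Iterable, Set

UNKNOWN = "unknown"  # hypothetical key for persons not normally present in the area


def is_abnormal_for(person_id: str,
                    action_words: Iterable[str],
                    abnormal_by_person: Dict[str, Set[str]]) -> bool:
    """True when any recognized action word is registered as abnormal
    for the identified person; unidentified persons use the UNKNOWN entry."""
    forbidden = abnormal_by_person.get(person_id, abnormal_by_person[UNKNOWN])
    return not forbidden.isdisjoint(action_words)


# Example dictionaries mirroring the cases in the text (illustrative values).
ACTION_DICTIONARIES: Dict[str, Set[str]] = {
    "manager": set(),                              # may take the file out
    "employee": {"take_out_file"},                 # may use the PC only
    UNKNOWN: {"operate_pc", "take_out_file"},      # neither is permitted
}
```

With these values, `is_abnormal_for("employee", ["take_out_file"], ACTION_DICTIONARIES)` raises an alarm condition while the same action by `"manager"` does not.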

(Learning function)
The room occupancy monitoring apparatus described in the above embodiments is, by way of example, provided with a learning control unit that trains the multilayer neural network of the behavior determination unit. However, the multilayer neural network may instead be trained on separate computer equipment that has a learning function, and the resulting trained multilayer neural network may be installed in the behavior determination unit for use.

(Feature extraction)
In the above embodiments, an image is input to the convolutional neural network to extract the features of users who are normally in the room. Alternatively, without using a convolutional neural network, preprocessing that extracts features such as contours and shading from the input image may be performed to extract predetermined features, and the feature-extracted image may be input to a fully connected neural network or a recurrent neural network functioning as the recognition unit to estimate whether the person is a user who is normally in the room, or to estimate an image description. This can reduce the processing load of image feature extraction.

(Learning method)
The above embodiments perform learning by backpropagation, but the learning method for the multilayer neural network is not limited to this.

(Cooperation with the entry/exit management facility)
The above embodiments control the operation and suspension of the behavior determination unit using the occupancy management information managed by the entry/exit management facility. However, the invention is not limited to this; a human presence sensor installed in the room to be monitored may be used to determine whether anyone is in the room and thereby control the operation and suspension of the behavior determination unit.

(Storage of learning information during system operation)
In the above embodiments, when the start of use of the user terminal by a user with equipment use authority in the monitoring section is detected, images of the monitoring section are cut out for a predetermined time and stored as learning information. However, the invention is not limited to this; images of the monitoring section may be cut out and stored as learning information during weekday working hours in response to a manual operation from a user terminal by, for example, the person in charge of security for the monitoring area.

(Others)
The present invention is not limited to the above embodiments, includes appropriate modifications that do not impair its objects and advantages, and is not limited by the numerical values given in the above embodiments.

10: behavior monitoring device
12: office room
14: monitoring camera
16: behavior determination unit
17: person identification unit
18: learning control unit
20: recording device
21: LAN line
22: entry/exit management control device
23: center device
24: client device
25: entry camera
26: card reader
28: electric lock
30: door
32: user terminal
34-1 to 34-12: monitoring sections
36: monitoring image
44, 50: image cutout unit
46: multilayer neural network
48: time-series determination unit
52: control unit
54: learning image storage unit
55: operation unit
56: display unit
58: feature extraction unit
60: recognition unit
62: input image
63, 65a, 65b: weight filter
64a, 64b, 64c: feature map
66, 85: input layer
68: connection layer
70, 86: intermediate layer
72: output layer
74: image analysis unit
76: action recognition unit
78: convolutional neural network
80: recurrent neural network
82: determiner
84: thesaurus dictionary
87: LSTM input layer
88: LSTM hidden layer
90: word register
92: word vector conversion unit
94: probability conversion unit
96: cost calculation unit
100: learning data set storage unit

Claims

A behavior monitoring system comprising:
an imaging unit that images a monitoring area to be monitored; and
a behavior determination unit that determines and outputs abnormal behavior of a person in the monitoring area,
wherein the behavior determination unit is configured by a multilayer neural network trained with images showing permitted behavior of persons normally present in the monitoring area.

The behavior monitoring system according to claim 1,
further comprising a person identification unit that identifies the person whose abnormal behavior has been determined by the behavior determination unit and outputs person identification information.

The behavior monitoring system according to claim 1, wherein
the multilayer neural network of the behavior determination unit comprises a feature extraction unit and a recognition unit,
the feature extraction unit is a convolutional neural network that receives an image of the person and extracts and outputs feature quantities corresponding to the person, and
the recognition unit is a fully connected neural network that has a plurality of fully connected layers, receives the feature quantities output from the convolutional neural network, and estimates whether or not the person is a person exhibiting the permitted behavior.

The behavior monitoring system according to claim 3,
wherein the behavior determination unit comprises:
a learning information storage unit in which images showing permitted behavior of persons normally present in the monitoring area are stored in advance; and
a learning control unit that reads the images stored in the learning information storage unit, inputs them to the convolutional neural network as supervised learning images, and trains the fully connected neural network and the convolutional neural network by backpropagation based on the error between the estimated value output from the fully connected neural network and a predetermined expected value.

The behavior monitoring system according to claim 1, wherein
the multilayer neural network of the behavior determination unit comprises an image analysis unit and an action recognition unit,
the image analysis unit comprises:
a convolutional neural network that receives an image of the person, extracts feature quantities corresponding to the person, and outputs them from a predetermined intermediate layer; and
a recurrent neural network that receives the feature quantities output from the convolutional neural network and generates and outputs an image description that describes the image of the monitoring area, and
the action recognition unit comprises:
a dictionary in which words indicating predetermined abnormal behavior are registered; and
a determiner that compares the words making up the image description output from the image analysis unit with the words registered in the dictionary to determine abnormal behavior of the person.

The behavior monitoring system according to claim 5, wherein
the image analysis unit identifies the person from the image of the person and outputs person identification information,
the words indicating the predetermined abnormal behavior are registered for each person normally present in the monitoring area, and
the determiner determines abnormal behavior of the person by comparing the words making up the image description with the words indicating abnormal behavior for the person given by the person identification information.

The behavior monitoring system according to claim 5,
wherein the behavior determination unit comprises:
a learning information storage unit in which pairs of an image showing permitted behavior of a person normally present in the monitoring area and a predetermined image description outlining that image are stored in advance; and
a learning control unit that reads the images stored in the learning information storage unit, inputs them to the convolutional neural network as unsupervised learning images and trains it by backpropagation, and then inputs to the recurrent neural network, as unsupervised learning information, the feature quantities output when the learning images are input to the trained convolutional neural network together with the image descriptions paired with those learning images, and trains it by backpropagation.

The behavior monitoring system according to claim 2, further comprising:
an entry/exit management facility that identifies a user entering the monitoring area in the facility and, when the user identification information matches user identification information registered in advance, performs control to unlock an electric lock provided on the door at the entrance; and
an entry imaging unit that images the user entering the room when the user is identified by the entry/exit management facility,
wherein the person identification unit stores the user image captured by the entry imaging unit in conjunction with the control by which the entry/exit management facility unlocks the electric lock, collates the user image with the image of the person whose abnormal behavior has been determined by the behavior determination unit, and outputs the user information of the matching user as the person identification information.

The behavior monitoring system according to claim 1,
further comprising a recording device that records moving images of the monitoring area captured by the imaging unit,
wherein, when the behavior determination unit determines abnormal behavior of the person, it transmits an abnormality alarm signal to a predetermined external device for notification, and causes the recording device to play back and display the moving image of the monitoring area captured by the imaging unit for a predetermined time before and after the point at which the abnormal behavior was determined.

The behavior monitoring system according to claim 1, wherein the behavior determination unit sets, in the image of the monitoring area captured by the imaging unit, one or more monitoring sections in which equipment to be monitored is arranged, cuts out and inputs the image of each monitoring section, and determines abnormal behavior of the person for each monitoring section.

The behavior monitoring system according to claim 1, wherein the behavior determination unit receives an image of the monitoring section at a predetermined cycle and determines and outputs abnormal behavior of the person.

The behavior monitoring system according to claim 1, wherein the behavior determination unit normalizes the image of the monitoring section to an image of a predetermined size before inputting it.