JP6943105B2

JP6943105B2 - Information processing systems, information processing devices, and programs

Info

Publication number: JP6943105B2
Application number: JP2017177903A
Authority: JP
Inventors: チョンフィファン; 増田　誠; 誠増田
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2017-09-15
Filing date: 2017-09-15
Publication date: 2021-09-29
Anticipated expiration: 2037-09-15
Also published as: JP2019053581A

Description

The present invention relates to an information processing system, an information processing device, and a program.

Surveillance cameras are now installed in a wide variety of places, and their number is increasing. Network-connected surveillance cameras are also becoming widespread, and various applications using images (still images or moving images) obtained from surveillance cameras via a network are conceivable.

For example, Patent Document 1 below proposes a system in which a captured image obtained by a surveillance camera is transmitted to a server via a network and the server performs the necessary processing, so that the image can be used for crime prevention and similar purposes. Patent Document 1 also describes maintaining real-time performance by sharing the processing among the surveillance camera and a plurality of servers.

Japanese Patent No. 3612220

Sachin Sudhakar Farfade and 2 others, "Multi-view Face Detection Using Deep Convolutional Neural Networks", In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval (ICMR), June 2015, pp. 643-650

Florian Schroff and 2 others, "FaceNet: A Unified Embedding for Face Recognition and Clustering", In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), June 2015

However, in the technique described in Patent Document 1, the process of extracting a person candidate image from the captured image is performed by the surveillance camera, so real-time performance could not be maintained when the surveillance camera lacked the performance to carry out that process in real time. For systems that share processing in this way, a mechanism capable of sharing the processing in finer units has been desired.

The present invention has been made in view of the above problem, and an object of the present invention is to provide a new and improved information processing system, information processing device, and program capable of sharing processing in finer units.

In order to solve the above problem, according to one aspect of the present invention, there is provided an information processing system that performs neural network processing using a neural network composed of at least n layers, where n is an integer of 2 or more, the system including: a first neural network processing unit that, where k is an integer of 1 or more and n-1 or less, takes input data as input, performs the neural network processing using the first through k-th layers of the neural network, and outputs a k-th layer output value; a first conversion unit that converts the k-th layer output value into a k-th layer output value for communication; a first transmission unit that transmits the k-th layer output value for communication to a first communication network; a first reception unit that receives the k-th layer output value for communication from the first communication network; a second conversion unit that converts the k-th layer output value for communication received by the first reception unit into the k-th layer output value; a second neural network processing unit that takes the k-th layer output value as input and performs the neural network processing using at least the (k+1)-th layer of the neural network; and a processing sharing determination unit that, where q is an integer of 1 or more and n or less, determines the value of k in the neural network processing that takes as input second input data different from first input data, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as input.

The processing sharing determination unit may determine the value of k that indicates the processing sharing boundary between the first neural network processing unit and the second neural network processing unit.

The first input data and the second input data may be sensing data acquired by sensing, and the information processing system may further include a resolution determination unit that determines, based on the q-th layer output value, the resolution of the sensing used to acquire the second input data.

When the resolution determination unit determines a higher value as the resolution, the processing sharing determination unit may determine the value of k so that the processing load becomes smaller on whichever of the first neural network processing unit and the second neural network processing unit has the lower processing performance.
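The relationship between the q-th layer output value, the resolution, and k described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the specification: the names `NextInputPlan` and `decide_next_plan`, the detection threshold, the two resolutions, and the rule of falling back to a mid-network boundary when nothing is detected.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class NextInputPlan:
    """Settings to apply when acquiring and processing the second input data."""
    resolution: Tuple[int, int]  # (width, height) used for the next sensing
    k: int                       # processing sharing boundary, 1 <= k <= n-1


def decide_next_plan(
    q_layer_output: Sequence[float],
    n: int,
    first_unit_is_weaker: bool = True,
    detection_threshold: float = 0.5,   # hypothetical value
) -> NextInputPlan:
    """Choose resolution and k for the next input from a q-th layer output.

    Assumption: the q-th layer output is a list of detection scores, and a
    score at or above the threshold means an object was detected.
    """
    if n < 2:
        raise ValueError("the network needs at least 2 layers")
    detected = max(q_layer_output, default=0.0) >= detection_threshold
    if detected:
        # Higher resolution: shift layers away from the weaker unit.
        # A smaller k gives the first unit fewer layers; a larger k gives
        # the second unit fewer layers.
        k = 1 if first_unit_is_weaker else n - 1
        return NextInputPlan(resolution=(1920, 1080), k=k)
    # Lower resolution: an even split is used here purely as a placeholder.
    return NextInputPlan(resolution=(640, 360), k=max(1, n // 2))
```

With a six-layer network and a weaker first unit, a detection moves the boundary to k = 1 so that the first unit runs only the first layer on the higher-resolution input.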

The first input data and the second input data may be image data, the neural network may be a neural network for recognizing an object included in the image data, and the q-th layer output value may include information on a detection result of the object.

Where n is an integer of 3 or more and m is an integer of k+1 or more and n-1 or less, the second neural network processing unit may perform the neural network processing using the (k+1)-th through m-th layers of the neural network and output an m-th layer output value, and the information processing system may further include: a third conversion unit that converts the m-th layer output value into an m-th layer output value for communication; a second transmission unit that transmits the m-th layer output value for communication to a second communication network; a second reception unit that receives the m-th layer output value for communication from the second communication network; a fourth conversion unit that converts the m-th layer output value for communication received by the second reception unit into the m-th layer output value; and a third neural network processing unit that takes the m-th layer output value as input and performs the neural network processing using at least the (m+1)-th layer of the neural network.
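The three-stage arrangement with boundaries k and m can be sketched as follows. This is an illustrative sketch only: the byte layout used as the "output value for communication" (a length prefix followed by big-endian doubles) and all function names are assumptions, and the two communication networks are replaced by passing the byte payload directly between stages.

```python
import struct
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def to_communication_format(values: List[float]) -> bytes:
    """Conversion unit: pack a layer output as a count plus big-endian doubles."""
    return struct.pack(f">I{len(values)}d", len(values), *values)


def from_communication_format(payload: bytes) -> List[float]:
    """Inverse conversion unit: recover the layer output from the payload."""
    (count,) = struct.unpack_from(">I", payload, 0)
    return list(struct.unpack_from(f">{count}d", payload, 4))


def run_layers(layers: Sequence[Layer], values: List[float]) -> List[float]:
    """Feed values through the given layers in order."""
    for layer in layers:
        values = layer(values)
    return values


def three_stage_inference(
    layers: Sequence[Layer], k: int, m: int, input_data: List[float]
) -> List[float]:
    """Run layers 1..k, k+1..m, and m+1..n on three separate processing units."""
    n = len(layers)
    if n < 3 or not (1 <= k < m <= n - 1):
        raise ValueError("need n >= 3 and 1 <= k < m <= n - 1")
    # First unit: layers 1..k, then convert for the first network.
    payload_k = to_communication_format(run_layers(layers[:k], input_data))
    # Second unit: restore the k-th layer output, run layers k+1..m.
    kth_output = from_communication_format(payload_k)
    payload_m = to_communication_format(run_layers(layers[k:m], kth_output))
    # Third unit: restore the m-th layer output, run layers m+1..n.
    mth_output = from_communication_format(payload_m)
    return run_layers(layers[m:], mth_output)
```

Only intermediate layer outputs cross the stage boundaries in this sketch; the input data itself is never serialized.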

Further, in order to solve the above problem, according to another aspect of the present invention, there is provided an information processing device including: a first neural network processing unit that, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, performs neural network processing using the first through k-th layers of a neural network composed of at least n layers and outputs a k-th layer output value; a first conversion unit that converts the k-th layer output value into a k-th layer output value for communication; a first transmission unit that transmits the k-th layer output value for communication to a first communication network; and a processing sharing determination unit that, where q is an integer of 1 or more and n or less, determines the value of k in the neural network processing that takes as input second input data different from first input data, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as input.

Further, in order to solve the above problem, according to another aspect of the present invention, there is provided a program for causing a computer to realize: a function of, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, performing neural network processing using the first through k-th layers of a neural network composed of at least n layers and outputting a k-th layer output value; a function of converting the k-th layer output value into a k-th layer output value for communication; a function of transmitting the k-th layer output value for communication to a first communication network; and a function of, where q is an integer of 1 or more and n or less, determining the value of k in the neural network processing that takes as input second input data different from first input data, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as input.

Further, in order to solve the above problem, according to another aspect of the present invention, there is provided an information processing system including: a first reception unit that, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, receives from a first communication network a k-th layer output value for communication obtained by converting a k-th layer output value output by neural network processing using the first through k-th layers of a neural network composed of at least n layers; a second conversion unit that converts the k-th layer output value for communication received by the first reception unit into the k-th layer output value; and a second neural network processing unit that takes the k-th layer output value as input and performs the neural network processing using at least the (k+1)-th layer of the neural network, wherein, where q is an integer of 1 or more and n or less, the value of k in the neural network processing that takes as input second input data different from first input data is determined based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as input.

Further, in order to solve the above problem, according to another aspect of the present invention, there is provided a program for causing a computer to realize: a function of, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, receiving from a first communication network a k-th layer output value for communication obtained by converting a k-th layer output value output by neural network processing using the first through k-th layers of a neural network composed of at least n layers; a function of converting the k-th layer output value for communication received by the first reception unit into the k-th layer output value; and a function of taking the k-th layer output value as input and performing the neural network processing using at least the (k+1)-th layer of the neural network, wherein, where q is an integer of 1 or more and n or less, the value of k in the neural network processing that takes as input second input data different from first input data is determined based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as input.

As described above, according to the present invention, processing can be shared in finer units.

FIG. 1 is an explanatory diagram for explaining a schematic configuration of the monitoring system 900 common to each embodiment of the present invention.
FIG. 2 is an explanatory diagram for explaining an outline of the first embodiment of the present invention.
FIG. 3 is a block diagram showing a configuration example of the monitoring system 900-1 according to the same embodiment.
FIG. 4 is a sequence diagram showing a processing flow of the monitoring system 900-1 according to the same embodiment.
FIG. 5 is an explanatory diagram for explaining an outline of the second embodiment of the present invention.
FIG. 6 is a block diagram showing a configuration example of the monitoring system 900-2 according to the same embodiment.
FIG. 7 is a sequence diagram showing a processing flow of the monitoring system 900-2 according to the same embodiment.
FIG. 8 is an explanatory diagram for explaining an outline of the third embodiment of the present invention.
FIG. 9 is a block diagram showing a configuration example of the monitoring system 900-3 according to the same embodiment.
FIG. 10 is a table showing an example of how the determination unit 253 determines the integer k, the integer m, and the resolution (frame rate and image resolution).
FIG. 11 is a sequence diagram showing a processing flow of the monitoring system 900-3 according to the same embodiment.
FIG. 12 is an explanatory diagram showing a hardware configuration example.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and duplicate description is omitted.

In this specification and the drawings, a plurality of components having substantially the same functional configuration may also be distinguished by appending different letters to the same reference numeral. However, when there is no particular need to distinguish such components from one another, only the common reference numeral is used.

<< 1. Overview >>
<1-1. Background>
Surveillance cameras are now installed throughout cities, and their number is increasing. Many surveillance cameras are also connected to a network, and various applications using images (still images or moving images) obtained from surveillance cameras via the network are conceivable.

For example, Patent Document 1 proposes a technique in which a captured image obtained by a surveillance camera is transmitted to a server via a network and the server performs the necessary processing, so that the image can be used for crime prevention and similar purposes. Patent Document 1 also describes maintaining real-time performance by sharing the processing among the surveillance camera and a plurality of servers.

However, in the technique described in Patent Document 1, the process of detecting a person in the captured image and extracting a person candidate image is performed by the surveillance camera, so real-time performance cannot be maintained when the surveillance camera lacks the performance to carry out that process in real time. To maintain real-time performance, it is possible, for example, to reduce the frame rate (temporal resolution) or the image resolution (spatial resolution), but doing so may lower the accuracy of person detection. A mechanism capable of sharing the processing in finer units has therefore been desired.

In addition, with the technique of Patent Document 1, an image showing a person may be transmitted from the surveillance camera to the server. If the communication between the surveillance camera and the server were intercepted, a person viewing the image could learn when and where the pictured person was and what that person was doing, and the pictured person's privacy could be infringed.

<1-2. Basic configuration>
The background of the embodiments of the present invention has been described above. The present inventors arrived at the embodiments of the present invention with the above circumstances as one point of focus.

Hereinafter, the basic configuration of the monitoring system 900 common to each embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is an explanatory diagram for explaining a schematic configuration of the monitoring system 900 common to each embodiment of the present invention. As shown in FIG. 1, the monitoring system 900 includes a surveillance camera 1, an intermediate server 2, a recognition server 3, a communication network 5A, and a communication network 5B.

As shown in FIG. 1, the surveillance camera 1 and the intermediate server 2 are connected via the communication network 5A, and the intermediate server 2 and the recognition server 3 are connected via the communication network 5B.

The communication network 5A and the communication network 5B are wired or wireless transmission paths for information transmitted from devices or systems connected to the communication network 5A and the communication network 5B, respectively. For example, the communication network 5 (the communication network 5A and the communication network 5B) may include public networks such as the Internet, telephone networks, and satellite communication networks, as well as various LANs (Local Area Networks), including Ethernet (registered trademark), and WANs (Wide Area Networks). The communication network 5 may also include a dedicated network such as an IP-VPN (Internet Protocol-Virtual Private Network).

The monitoring system 900 performs neural network processing using a neural network in order to recognize the face of a person (an example of an object) included in the image data obtained by the surveillance camera 1. A neural network for face recognition can be generated, for example, by a method such as that of Non-Patent Document 1 or Non-Patent Document 2 above. Neural network processing using a neural network can be specified by neural network parameters.

For example, the neural network processing performed by the monitoring system 900 may include face detection processing, which finds a person's face in the image data, and face authentication processing, which checks whose face the detected face is. However, face detection processing and face authentication processing are not necessarily clearly separated in the neural network processing performed by the monitoring system 900.

A neural network is a mathematical model that aims to reproduce some characteristics of brain function through computer simulation. For example, a neural network is composed of many layers, and those layers can perform processing such as extracting features from input data and classifying the extracted features.

The monitoring system 900 according to each embodiment of the present invention divides the processing at the level of neural network layers and assigns the divided processing to at least two of the surveillance camera 1, the intermediate server 2, and the recognition server 3. With this configuration, the processing can be shared in fine units, and real-time performance is easier to maintain even when imaging is performed at a higher resolution and a higher frame rate.

Further, in the second and third embodiments described later, image data is not transmitted between the devices. Instead, output values, which are the processing results of the stages assigned to each device, are transmitted, so privacy is unlikely to be infringed even if the communication is intercepted.

The basic configuration of the monitoring system 900 common to each embodiment of the present invention has been described above. Each embodiment of the present invention that realizes the effects described above will now be described in detail in turn.

<< 2. Detailed description of each embodiment >>
<2-1. First Embodiment>
(Overview)
First, an outline of the monitoring system 900 according to the first embodiment of the present invention will be described. In the following, the monitoring system 900 according to the first embodiment of the present invention is referred to as the monitoring system 900-1, and the surveillance camera 1, the intermediate server 2, and the recognition server 3 of the monitoring system 900-1 are referred to as the surveillance camera 1-1, the intermediate server 2-1, and the recognition server 3-1, respectively.

FIG. 2 is an explanatory diagram for explaining an outline of the first embodiment of the present invention. FIG. 2 shows the neural network NN1 used by the monitoring system 900-1 according to the present embodiment. As shown in FIG. 2, the neural network NN1 is a neural network composed of n layers. In the present embodiment, n is an integer of 2 or more.

In the neural network NN1 shown in FIG. 2, the output value of each layer is input to the next layer (the layer to its right). The output value of each layer is not limited to a scalar value and may be, for example, a vector value. Hereinafter, as shown in FIG. 2, the layers constituting the neural network NN1 according to the present embodiment are referred to, in order from the left, as the first layer L_1, the second layer L_2, the third layer L_3, ..., and the n-th layer L_n.

The monitoring system 900-1 according to the present embodiment shares the neural network processing using the neural network NN1 between the intermediate server 2-1 and the recognition server 3-1. In the example shown in FIG. 2, the intermediate server 2-1 is in charge of the neural network processing of the first layer L_1 through the k-th layer L_k, and the recognition server 3-1 is in charge of the neural network processing of the (k+1)-th layer L_(k+1) through the n-th layer L_n. Here, k is an integer of 1 or more and n-1 or less, and indicates the processing sharing boundary between the intermediate server 2-1 and the recognition server 3-1 (more precisely, as described later, between the processing units of the intermediate server 2-1 and the recognition server 3-1).
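The division at the boundary k can be illustrated with a minimal sketch. The toy layers and the names `run_layers`, `make_toy_network`, and `split_inference` are hypothetical and stand in for the neural network NN1 only for illustration; both servers are simulated within a single process.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_layers(layers: Sequence[Layer], values: List[float]) -> List[float]:
    """Feed values through the given layers in order and return the last output."""
    for layer in layers:
        values = layer(values)
    return values


def make_toy_network(n: int) -> List[Layer]:
    """Hypothetical stand-in for NN1: layer i scales by (i + 1), shifts, applies ReLU."""
    def make_layer(i: int) -> Layer:
        return lambda xs: [max(0.0, (i + 1) * x - 1.0) for x in xs]
    return [make_layer(i) for i in range(n)]


def split_inference(layers: Sequence[Layer], k: int, input_data: List[float]) -> List[float]:
    """Run L_1..L_k on one unit and L_(k+1)..L_n on another."""
    n = len(layers)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    kth_output = run_layers(layers[:k], input_data)   # intermediate server side
    return run_layers(layers[k:], kth_output)         # recognition server side
```

Because each layer consumes only the previous layer's output, any k in the range 1 to n-1 yields the same final result as running all n layers on one device; only the placement of the work changes.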

In the present embodiment, the integer k indicating the processing sharing boundary between the intermediate server 2-1 and the recognition server 3-1 may be a preset value. For example, the integer k may be set by the user so as to satisfy the required processing time, taking into account the processing performance of the intermediate server 2-1 and the recognition server 3-1 and the frame rate, resolution, and so on of the surveillance camera 1-1. In the present embodiment, face detection processing and face authentication processing may or may not be clearly separated in the neural network NN1. Even when they are clearly separated in the neural network NN1, the integer k indicating the processing sharing boundary between the intermediate server 2-1 and the recognition server 3-1 can be set independently of the boundary between face detection processing and face authentication processing. More generally, the integer k may be set regardless of what kind of processing each layer of the neural network NN1 performs. With this configuration, the processing can be shared between the intermediate server 2-1 and the recognition server 3-1 in finer units.
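One way a user might arrive at a preset k from the processing performance and frame rate mentioned above is sketched below. The cost model is an assumption made for illustration and is not taken from the specification: each layer has a fixed cost in arbitrary operations, each server has a throughput in operations per second, the two servers work as a pipeline, and communication time is ignored. The name `choose_k` is hypothetical.

```python
from typing import Optional, Sequence


def choose_k(
    layer_costs: Sequence[float],   # assumed cost of each layer L_1..L_n
    first_speed: float,             # assumed throughput of the first server
    second_speed: float,            # assumed throughput of the second server
    frame_rate: float,              # frames per second from the camera
) -> Optional[int]:
    """Return the k whose slower stage is fastest, or None if no k keeps up.

    The frame period 1 / frame_rate is used as the required processing time
    per stage, which assumes the two servers process frames as a pipeline.
    """
    n = len(layer_costs)
    if n < 2:
        raise ValueError("the network needs at least 2 layers")
    period = 1.0 / frame_rate
    best_k, best_time = None, float("inf")
    for k in range(1, n):
        first_time = sum(layer_costs[:k]) / first_speed
        second_time = sum(layer_costs[k:]) / second_speed
        bottleneck = max(first_time, second_time)
        if bottleneck < best_time:
            best_k, best_time = k, bottleneck
    return best_k if best_time <= period else None
```

A higher frame rate shortens the frame period, and a higher resolution would typically raise the layer costs; under this model either change can make every candidate k miss the required time, in which case the function returns None.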

(Configuration example)
The outline of the monitoring system 900-1 according to the present embodiment has been described above. Next, a configuration example of the monitoring system 900-1 according to the present embodiment will be described in more detail. FIG. 3 is a block diagram showing a configuration example of the monitoring system 900-1 according to the present embodiment.

As shown in FIG. 3, the surveillance camera 1-1 includes an imaging unit 111 and a communication interface unit 120.

The imaging unit 111 is a camera module that acquires image data (an example of sensing data) by imaging (an example of sensing). For example, the imaging unit 111 captures the surrounding real space using an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) sensor, converting light into an electric signal to generate image data. The imaging unit 111 provides the image data to the communication interface unit 120.

The communication interface unit 120 mediates communication between the surveillance camera 1-1 and other devices. The communication interface unit 120 supports any wireless or wired communication protocol and establishes a communication connection with another device either via the communication network 5A or directly. As shown in FIG. 3, the communication interface unit 120 includes a conversion unit 122 and a communication unit 124.

The conversion unit 122 converts data into data in a format that the communication unit 124 can transmit (communication data). For example, the conversion unit 122 converts the image data provided by the imaging unit 111 into communication image data and provides it to the communication unit 124.

The communication unit 124 transmits communication data to another device, or receives communication data from another device, either via the communication network 5A or directly. For example, the communication unit 124 transmits the communication image data provided by the conversion unit 122 to the communication network 5A. In the present embodiment, the communication image data is transmitted from the surveillance camera 1-1 to the intermediate server 2-1 via the communication network 5A, so the communication network 5A is desirably a highly secure communication network.

As shown in FIG. 3, the intermediate server 2-1 is an information processing device including a communication interface unit 220, a processing unit 231, and a storage unit 240.

The communication interface unit 220 mediates communication between the intermediate server 2-1 and other devices. The communication interface unit 220 supports any wireless or wired communication protocol and establishes a communication connection with another device via the communication network 5A or the communication network 5B, or directly. As shown in FIG. 3, the communication interface unit 220 includes a conversion unit 222 and a communication unit 224.

The conversion unit 222 converts (inversely converts) the communication data received by the communication unit 224 into data that the processing unit 231 and the storage unit 240 can handle, and provides it to the processing unit 231 and the storage unit 240. For example, the conversion unit 222 converts the communication image data that the communication unit 224 received from the communication network 5A into image data (the input data in the present embodiment) and provides it to the processing unit 231.

The conversion unit 222 also converts data into data in a format that the communication unit 224 can transmit (communication data). For example, the conversion unit 222 functions as the first conversion unit in the present embodiment: it converts the k-th layer output value of the neural network NN1, output from the processing unit 231 described later, into a communication k-th layer output value and provides it to the communication unit 224.

The communication unit 224 transmits communication data to another device, or receives communication data from another device, via the communication network 5A, via the communication network 5B, or directly. For example, the communication unit 224 functions as the first transmission unit in the present embodiment and transmits the communication k-th layer output value provided by the conversion unit 222 to the communication network 5B (the first communication network in the present embodiment). The communication unit 224 also receives, from the communication network 5A, the communication image data transmitted by the surveillance camera 1-1.

The processing unit 231 performs neural network processing. The neural network processing performed by the processing unit 231 can be specified, for example, by the neural network parameters stored in the storage unit 240. The neural network parameters stored in the storage unit 240 may be parameters corresponding to the entire neural network NN1, or parameters corresponding to the first layer L_1 through the k-th layer L_k of the neural network NN1. The integer k indicating the processing sharing boundary between the processing unit 231 and the processing unit 331 of the recognition server 3-1, described later, may be stored in the storage unit 240.

The processing unit 231 functions as the first neural network processing unit in the present embodiment: it takes the image data as input and performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN1 shown in FIG. 2. The processing unit 231 then outputs the k-th layer output value, which is the output value of the k-th layer L_k, to the communication interface unit 220.

The storage unit 240 stores programs and data used for the operation of the intermediate server 2-1. The storage unit 240 also stores the neural network parameters related to the neural network NN1.

As shown in FIG. 3, the recognition server 3-1 is an information processing device including a communication interface unit 320, a processing unit 331, and a storage unit 340.

The communication interface unit 320 mediates communication between the recognition server 3-1 and other devices. The communication interface unit 320 supports any wireless or wired communication protocol and establishes a communication connection with another device either via the communication network 5B or directly. As shown in FIG. 3, the communication interface unit 320 includes a conversion unit 322 and a communication unit 324.

The conversion unit 322 converts (inversely converts) the communication data received by the communication unit 324 into data that the processing unit 331 and the storage unit 340 can handle, and provides it to the processing unit 331 and the storage unit 340. For example, the conversion unit 322 functions as the second conversion unit in the present embodiment: it converts the communication k-th layer output value that the communication unit 324 received from the communication network 5B (the first communication network in the present embodiment) into the k-th layer output value and provides it to the processing unit 331.

The communication unit 324 transmits communication data to another device, or receives communication data from another device, either via the communication network 5B or directly. For example, the communication unit 324 functions as the first reception unit in the present embodiment and receives, from the communication network 5B, the communication k-th layer output value transmitted by the intermediate server 2-1.
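The conversion performed by the conversion unit 222 and the inverse conversion performed by the conversion unit 322 can be pictured with the following minimal sketch. The embodiment does not specify a wire format, so the layout used here (a header carrying the layer index k and the element count, followed by 32-bit floats in network byte order) and the function names `to_communication_data` and `from_communication_data` are purely hypothetical.

```python
import struct
from typing import List, Tuple


def to_communication_data(k: int, values: List[float]) -> bytes:
    """Pack a k-th layer output vector into bytes (hypothetical format).

    Layout: unsigned 32-bit k, unsigned 32-bit element count, then the
    elements as 32-bit floats, all in network (big-endian) byte order.
    """
    return struct.pack(f"!II{len(values)}f", k, len(values), *values)


def from_communication_data(payload: bytes) -> Tuple[int, List[float]]:
    """Inverse conversion: recover k and the output vector from the bytes."""
    k, count = struct.unpack_from("!II", payload, 0)
    values = list(struct.unpack_from(f"!{count}f", payload, 8))
    return k, values


if __name__ == "__main__":
    # Values chosen to be exactly representable as 32-bit floats.
    payload = to_communication_data(2, [0.5, -1.25, 3.0])
    print(len(payload), from_communication_data(payload))
```

Note that 32-bit packing rounds values that are not exactly representable, so a real system would have to decide whether that precision is acceptable for the layers that follow.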

The processing unit 331 performs neural network processing. The neural network processing performed by the processing unit 331 can be specified, for example, by the neural network parameters stored in the storage unit 340. The neural network parameters stored in the storage unit 340 may be parameters corresponding to the entire neural network NN1, or parameters corresponding to the (k+1)-th layer L_{k+1} through the n-th layer L_n of the neural network NN1. The integer k indicating the processing sharing boundary between the processing unit 231 of the intermediate server 2-1 and the processing unit 331 may be stored in the storage unit 340.

The processing unit 331 functions as the second neural network processing unit in the present embodiment: it takes the k-th layer output value as input and performs neural network processing using the (k+1)-th layer L_{k+1} through the n-th layer L_n of the neural network NN1 shown in FIG. 2. The processing unit 331 outputs the n-th layer output value, which is the output value of the n-th layer L_n, and may, for example, store it in the storage unit 340. Alternatively, the n-th layer output value output by the processing unit 331 may be displayed on a display unit (not shown), or may be converted into communication data by the conversion unit 322 and then transmitted to another device by the communication unit 324. As described above, the neural network NN1 according to the present embodiment is a neural network for face recognition, and the n-th layer output value may include, for example, information indicating whose face appears in the image captured by the surveillance camera 1-1.
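The division of labor between the processing unit 231 and the processing unit 331 can be sketched as follows. The four toy layers, the helper `run_layers`, and the sample input are assumptions made for illustration; they stand in for the layers of the neural network NN1 and are not a face recognition network.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_layers(layers: Sequence[Layer], first: int, last: int, x: Vector) -> Vector:
    """Apply layers L_first..L_last (1-indexed, inclusive) in order."""
    for layer in layers[first - 1:last]:
        x = layer(x)
    return x


def scale(factor: float) -> Layer:
    return lambda v: [factor * e for e in v]


def relu(v: Vector) -> Vector:
    return [max(0.0, e) for e in v]


def total(v: Vector) -> Vector:
    return [sum(v)]


# Toy stand-in for NN1 with n = 4 layers.
NN1: List[Layer] = [scale(2.0), relu, scale(0.5), total]
n = len(NN1)
k = 2  # processing sharing boundary

image = [1.0, -3.0, 4.0]
kth_output = run_layers(NN1, 1, k, image)           # role of processing unit 231
nth_output = run_layers(NN1, k + 1, n, kth_output)  # role of processing unit 331

if __name__ == "__main__":
    print(kth_output, nth_output)
```

Since each layer consumes only the output of the previous layer, running layers 1 to k on one device and layers k+1 to n on another yields the same n-th layer output as running all n layers in one place, whichever k is chosen.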

The storage unit 340 stores programs and data used for the operation of the recognition server 3-1. The storage unit 340 also stores the neural network parameters related to the neural network NN1.

(Operation example)
The configuration example of the monitoring system 900-1 according to the first embodiment of the present invention has been described above. Next, an operation example of the present embodiment is described with reference to FIG. 4. FIG. 4 is a sequence diagram showing the processing flow of the monitoring system 900-1 according to the present embodiment.

As shown in FIG. 4, the surveillance camera 1-1 first acquires image data through imaging by the imaging unit 111 (S102). Next, the conversion unit 122 of the surveillance camera 1-1 converts the image data into communication image data (S106). Then the communication unit 124 of the surveillance camera 1-1 transmits the communication image data to the communication network 5A, and the communication unit 224 of the intermediate server 2-1 receives the communication image data from the communication network 5A (S108).

Next, the conversion unit 222 of the intermediate server 2-1 converts the communication image data received in step S108 into image data (S110). The processing unit 231 of the intermediate server 2-1 then takes the image data as input, performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN1, and outputs the k-th layer output value (S112).

Next, the conversion unit 222 of the intermediate server 2-1 converts the k-th layer output value into a communication k-th layer output value (S114). The communication unit 224 of the intermediate server 2-1 then transmits the communication k-th layer output value to the communication network 5B, and the communication unit 324 of the recognition server 3-1 receives the communication k-th layer output value from the communication network 5B (S116).

Next, the conversion unit 322 of the recognition server 3-1 converts the communication k-th layer output value received in step S116 into the k-th layer output value (S118). The processing unit 331 of the recognition server 3-1 then takes the k-th layer output value as input, performs neural network processing using the (k+1)-th layer L_{k+1} through the n-th layer L_n of the neural network NN1, and outputs the n-th layer output value (S120).
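The sequence of steps S102 to S120 can be traced with the following self-contained simulation. The in-memory queues standing in for the communication networks 5A and 5B, the JSON encoding, the functions `first_part` and `second_part`, and the labels they return are all hypothetical stand-ins chosen for this sketch; the step numbers in the comments follow FIG. 4.

```python
import json
from collections import deque
from typing import List, Tuple


def first_part(pixels: List[int]) -> List[float]:
    """Toy stand-in for layers L_1..L_k (not a real network)."""
    return [sum(pixels) / len(pixels), float(max(pixels) - min(pixels))]


def second_part(kth_output: List[float]) -> str:
    """Toy stand-in for layers L_{k+1}..L_n (not a real network)."""
    return "registered person" if kth_output[0] > 100 else "unknown"


def process_frame(pixels: List[int]) -> Tuple[str, List[bytes], List[bytes]]:
    """Run one frame through the flow; also return what crossed 5A and 5B."""
    net_5a, net_5b = deque(), deque()
    log_5a, log_5b = [], []

    # S102: the camera acquires image data (passed in here as `pixels`).
    # S106: conversion unit 122 converts it into communication image data.
    payload = json.dumps({"image": pixels}).encode("utf-8")
    # S108: sent over network 5A and received by the intermediate server.
    net_5a.append(payload)
    log_5a.append(payload)

    # S110: conversion unit 222 converts it back into image data.
    image = json.loads(net_5a.popleft())["image"]
    # S112: processing unit 231 runs layers 1..k.
    kth_output = first_part(image)
    # S114: conversion unit 222 builds the communication k-th layer output.
    payload = json.dumps({"layer_k_output": kth_output}).encode("utf-8")
    # S116: sent over network 5B and received by the recognition server.
    net_5b.append(payload)
    log_5b.append(payload)

    # S118: conversion unit 322 recovers the k-th layer output value.
    received = json.loads(net_5b.popleft())["layer_k_output"]
    # S120: processing unit 331 runs layers k+1..n.
    return second_part(received), log_5a, log_5b


if __name__ == "__main__":
    result, _, over_5b = process_frame([120, 130, 110])
    print(result, over_5b)
```

In this trace the image itself appears only in the traffic on network 5A, while network 5B carries only the k-th layer output value.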

(Effect)
The first embodiment of the present invention has been described above. According to the present embodiment, the neural network processing that takes as input the image data acquired through imaging by the surveillance camera 1-1 is shared between the intermediate server 2-1 and the recognition server 3-1. As described above, the integer k indicating the processing sharing boundary between the intermediate server 2-1 and the recognition server 3-1 can be set independently of the processing performed by each layer, so the processing can be shared between the intermediate server 2-1 and the recognition server 3-1 in finer units. Furthermore, no image data is communicated between the intermediate server 2-1 and the recognition server 3-1; only the communication k-th layer output value is communicated. With this configuration, even if the content of the communication between the intermediate server 2-1 and the recognition server 3-1 over the communication network 5B were intercepted, privacy is unlikely to be infringed.

<2-2. Second embodiment>
(Overview)
The first embodiment described an example in which the neural network processing is shared between two devices, the intermediate server 2-1 and the recognition server 3-1, but the neural network processing can also be shared among three or more devices. An example in which the neural network processing is shared among three devices, the surveillance camera 1, the intermediate server 2, and the recognition server 3, is described below as a second embodiment of the present invention. In the following, the monitoring system 900 according to the second embodiment of the present invention is referred to as the monitoring system 900-2, and the surveillance camera 1, the intermediate server 2, and the recognition server 3 of the monitoring system 900-2 are referred to as the surveillance camera 1-2, the intermediate server 2-2, and the recognition server 3-2, respectively.

FIG. 5 is an explanatory diagram for explaining the outline of the second embodiment of the present invention. FIG. 5 shows the neural network NN2 used by the monitoring system 900-2 according to the present embodiment. As shown in FIG. 5, the neural network NN2 is a neural network composed of n layers. In the present embodiment, n is an integer of 3 or more.

The neural network NN2 shown in FIG. 5 is a neural network in which the output value of each layer is input to the next layer (the layer to its right). The output value of each layer is not limited to a scalar value and may be, for example, a vector value. Hereinafter, as shown in FIG. 5, the layers constituting the neural network NN2 according to the present embodiment are referred to, in order from the left, as the first layer L_1, the second layer L_2, the third layer L_3, ..., and the n-th layer L_n.

In the monitoring system 900-2 according to the present embodiment, the neural network processing using the neural network NN2 is shared among the surveillance camera 1-2, the intermediate server 2-2, and the recognition server 3-2. In the example shown in FIG. 5, the surveillance camera 1-2 is in charge of the neural network processing of the first layer L_1 through the k-th layer L_k, the intermediate server 2-2 is in charge of the (k+1)-th layer L_{k+1} through the m-th layer L_m, and the recognition server 3-2 is in charge of the (m+1)-th layer L_{m+1} through the n-th layer L_n. Here, k is an integer of 1 or more and m-1 or less, and indicates the processing sharing boundary between the surveillance camera 1-2 and the intermediate server 2-2 (more precisely, between their respective processing units, as described later). Likewise, m is an integer of k+1 or more and n-1 or less, and indicates the processing sharing boundary between the intermediate server 2-2 and the recognition server 3-2 (more precisely, between their respective processing units, as described later).
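The constraints on k and m stated above can be checked mechanically. The following sketch validates a pair (k, m) for an n-layer network and returns the layer numbers assigned to each device; the function name `partition` and the dictionary keys are illustrative choices, not terms from the embodiment.

```python
from typing import Dict


def partition(n: int, k: int, m: int) -> Dict[str, range]:
    """Return the 1-indexed layer numbers handled by each device.

    Enforces n >= 3, 1 <= k <= m-1, and k+1 <= m <= n-1, so that each of
    the three devices is assigned at least one layer.
    """
    if n < 3:
        raise ValueError("n must be an integer of 3 or more")
    if not 1 <= k <= m - 1:
        raise ValueError("k must satisfy 1 <= k <= m-1")
    if not k + 1 <= m <= n - 1:
        raise ValueError("m must satisfy k+1 <= m <= n-1")
    return {
        "surveillance camera 1-2": range(1, k + 1),      # L_1 .. L_k
        "intermediate server 2-2": range(k + 1, m + 1),  # L_{k+1} .. L_m
        "recognition server 3-2": range(m + 1, n + 1),   # L_{m+1} .. L_n
    }


if __name__ == "__main__":
    for device, layers in partition(n=6, k=2, m=4).items():
        print(device, list(layers))
```

The smallest valid case is n = 3 with k = 1 and m = 2, in which each device handles exactly one layer.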

In the present embodiment, the integer k indicating the processing sharing boundary between the surveillance camera 1-2 and the intermediate server 2-2, and the integer m indicating the processing sharing boundary between the intermediate server 2-2 and the recognition server 3-2, may each be a preset value. For example, the integers k and m may be set by the user so as to satisfy the required processing time, taking into account the processing performance of the surveillance camera 1-2, the intermediate server 2-2, and the recognition server 3-2, the frame rate and resolution of the surveillance camera 1-2, and the like. In the present embodiment, the face detection process and the face recognition process may or may not be clearly distinguished within the neural network NN2. Even when they are clearly distinguished in the neural network NN2, the integers k and m can be set independently of the boundary between the face detection process and the face recognition process. Moreover, the integers k and m may be set regardless of what kind of processing each layer of the neural network NN2 is for. With this configuration, the processing can be shared among the surveillance camera 1-2, the intermediate server 2-2, and the recognition server 3-2 in finer units.

(Configuration example)
The outline of the monitoring system 900-2 according to the present embodiment has been described above. Next, a configuration example of the monitoring system 900-2 according to the present embodiment is described in more detail. FIG. 6 is a block diagram showing a configuration example of the monitoring system 900-2 according to the present embodiment. Because the monitoring system 900-2 according to the present embodiment partly shares its configuration with the monitoring system 900-1 according to the first embodiment, the common parts are omitted from the description where appropriate.

As shown in FIG. 6, the surveillance camera 1-2 is an information processing device including an imaging unit 112, a processing unit 132, a storage unit 140, and a communication interface unit 170.

Like the imaging unit 111 described with reference to FIG. 3, the imaging unit 112 is a camera module that acquires image data (an example of sensing data) by imaging (an example of sensing). However, the imaging unit 112 according to the present embodiment differs from the imaging unit 111 described with reference to FIG. 3 in that it provides the image data (the input data in the present embodiment) to the processing unit 132.

The processing unit 132 performs neural network processing. The neural network processing performed by the processing unit 132 can be specified, for example, by the neural network parameters stored in the storage unit 140. The neural network parameters stored in the storage unit 140 may be parameters corresponding to the entire neural network NN2, or parameters corresponding to the first layer L_1 through the k-th layer L_k of the neural network NN2. The integer k indicating the processing sharing boundary between the processing unit 132 and the processing unit 232 of the intermediate server 2-2, described later, may be stored in the storage unit 140.

The processing unit 132 functions as the first neural network processing unit in the present embodiment: it takes the image data provided by the imaging unit 112 as input and performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN2 shown in FIG. 5. The processing unit 132 then outputs the k-th layer output value, which is the output value of the k-th layer L_k, to the communication interface unit 170.

The storage unit 140 stores programs and data used for the operation of the surveillance camera 1-2. The storage unit 140 also stores the neural network parameters related to the neural network NN2.

Like the communication interface unit 120 described with reference to FIG. 3, the communication interface unit 170 mediates communication between the surveillance camera 1-2 and other devices. The communication interface unit 170 supports any wireless or wired communication protocol and establishes a communication connection with another device either via the communication network 5A or directly. As shown in FIG. 6, the communication interface unit 170 includes a conversion unit 172 and a communication unit 174.

The conversion unit 172 converts data into data in a format that the communication unit 174 can transmit (communication data). For example, the conversion unit 172 functions as the first conversion unit in the present embodiment: it converts the k-th layer output value of the neural network NN2 output from the processing unit 132 into a communication k-th layer output value and provides it to the communication unit 174.

The communication unit 174 transmits communication data to another device, or receives communication data from another device, either via the communication network 5A or directly. For example, the communication unit 174 functions as the first transmission unit in the present embodiment and transmits the communication k-th layer output value provided by the conversion unit 172 to the communication network 5A (the first communication network in the present embodiment).

As shown in FIG. 6, the intermediate server 2-2 is an information processing device including a processing unit 232, a storage unit 240, and a communication interface unit 270. The function of the storage unit 240 shown in FIG. 6 is the same as that of the storage unit 240 described with reference to FIG. 3, so its description is omitted.

The processing unit 232 performs neural network processing. The neural network processing performed by the processing unit 232 can be specified, for example, by the neural network parameters stored in the storage unit 240. In the present embodiment, the neural network parameters stored in the storage unit 240 may be parameters corresponding to the entire neural network NN2, or parameters corresponding to the (k+1)-th layer L_{k+1} through the m-th layer L_m of the neural network NN2. In addition, the integer k indicating the processing sharing boundary between the processing unit 132 of the surveillance camera 1-2 and the processing unit 232, and the integer m indicating the processing sharing boundary between the processing unit 232 and the processing unit 332 of the recognition server 3-2, described later, may be stored in the storage unit 240.
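The two storage options described above (parameters for the entire network, or only for the layers a device is in charge of) can be illustrated with the following sketch for the intermediate server 2-2. The `StorageUnit` class, the `build_intermediate_storage` function, and the representation of parameters as a dictionary keyed by layer number are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StorageUnit:
    """Hypothetical contents of the storage unit 240 in the second embodiment."""
    params: Dict[int, List[float]]  # layer number -> parameters of that layer
    k: int                          # boundary with the surveillance camera 1-2
    m: int                          # boundary with the recognition server 3-2


def build_intermediate_storage(
    all_params: Dict[int, List[float]], k: int, m: int, full_copy: bool = False
) -> StorageUnit:
    """Keep either every layer's parameters or only those of L_{k+1}..L_m."""
    if full_copy:
        params = dict(all_params)
    else:
        params = {i: all_params[i] for i in range(k + 1, m + 1)}
    return StorageUnit(params=params, k=k, m=m)


if __name__ == "__main__":
    # Hypothetical parameters for a 5-layer NN2.
    nn2_params = {1: [0.1], 2: [0.2], 3: [0.3], 4: [0.4], 5: [0.5]}
    storage = build_intermediate_storage(nn2_params, k=1, m=3)
    print(sorted(storage.params))
```

Storing only the subset reduces what each device must hold, whereas storing the full set would let k and m be changed later without redistributing parameters; the embodiment permits either.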

The processing unit 232 functions as the second neural network processing unit in the present embodiment: it takes the k-th layer output value as input and performs neural network processing using the (k+1)-th layer L_{k+1} through the m-th layer L_m of the neural network NN2 shown in FIG. 5. The processing unit 232 then outputs the m-th layer output value, which is the output value of the m-th layer L_m, to the communication interface unit 270.

Like the communication interface unit 220 described with reference to FIG. 3, the communication interface unit 270 mediates communication between the intermediate server 2-2 and other devices. The communication interface unit 270 supports any wireless or wired communication protocol and establishes a communication connection with another device via the communication network 5A or the communication network 5B, or directly. As shown in FIG. 6, the communication interface unit 270 includes a conversion unit 272 and a communication unit 274.

The conversion unit 272 converts (inversely converts) the communication data received by the communication unit 274 into data that the processing unit 232 and the storage unit 240 can handle, and provides it to the processing unit 232 and the storage unit 240. For example, the conversion unit 272 functions as the second conversion unit in the present embodiment: it converts the communication k-th layer output value that the communication unit 274 received from the communication network 5A into the k-th layer output value and provides it to the processing unit 232.

The conversion unit 272 also converts data into a format that the communication unit 274 can transmit (communication data). For example, the conversion unit 272 functions as the third conversion unit in the present embodiment: it converts the m-th layer output value of the neural network NN2 output from the processing unit 232 into a communication m-th layer output value and provides it to the communication unit 274.
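The two directions of conversion described above can be illustrated with a minimal sketch. The wire format used here (a layer index and an element count as two unsigned 32-bit integers, followed by 32-bit floats in network byte order) and the function names `to_communication_data` and `from_communication_data` are assumptions made purely for illustration; the description does not specify the format of the communication data.

```python
import struct
from typing import List, Tuple


def to_communication_data(layer_index: int, values: List[float]) -> bytes:
    """Pack a layer output vector into bytes (hypothetical wire format)."""
    # Header: layer index and element count, both unsigned 32-bit, big-endian.
    header = struct.pack("!II", layer_index, len(values))
    # Body: each element as a 32-bit float, big-endian.
    body = struct.pack(f"!{len(values)}f", *values)
    return header + body


def from_communication_data(payload: bytes) -> Tuple[int, List[float]]:
    """Inverse conversion: recover the layer index and the output vector."""
    layer_index, count = struct.unpack_from("!II", payload, 0)
    values = list(struct.unpack_from(f"!{count}f", payload, 8))
    return layer_index, values
```

In this sketch, the receiving side calls `from_communication_data` on the bytes produced by `to_communication_data` and obtains the same layer index and values, up to 32-bit float precision.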

The communication unit 274 transmits communication data to other devices, or receives communication data from other devices, via the communication network 5A, via the communication network 5B, or directly. For example, the communication unit 274 functions as the first reception unit in the present embodiment and receives, from the communication network 5A, the communication k-th layer output value transmitted by the surveillance camera 1-2. The communication unit 274 also functions as the second transmission unit in the present embodiment and transmits the communication m-th layer output value provided by the conversion unit 272 to the communication network 5B (the second communication network in the present embodiment).

As shown in FIG. 6, the recognition server 3-2 is an information processing device including a processing unit 332, a storage unit 340, and a communication interface unit 370. The function of the storage unit 340 shown in FIG. 6 is the same as that of the storage unit 340 described with reference to FIG. 3, so its description is omitted.

The processing unit 332 performs neural network processing. The neural network processing performed by the processing unit 332 can be specified by, for example, the neural network parameters stored in the storage unit 340. In the present embodiment, the neural network parameters stored in the storage unit 340 may correspond to the entire neural network NN2, or only to the (m+1)-th layer L_{m+1} through the n-th layer L_n of the neural network NN2. The integer m, which indicates the processing sharing boundary between the processing unit 232 of the intermediate server 2-2 and the processing unit 332, may also be stored in the storage unit 340.

The processing unit 332 functions as the third neural network processing unit in the present embodiment. It takes the m-th layer output value as input and performs neural network processing using the (m+1)-th layer L_{m+1} through the n-th layer L_n of the neural network NN2 shown in FIG. 5. The n-th layer output value, which is the output value of the n-th layer L_n output by the processing unit 332, may be stored in the storage unit 340, displayed on a display unit (not shown), or converted into communication data by the conversion unit 372 and then transmitted to another device by the communication unit 374.

The communication interface unit 370 mediates communication between the recognition server 3-2 and other devices. The communication interface unit 370 supports any wireless or wired communication protocol and establishes communication connections with other devices via the communication network 5B or directly. As shown in FIG. 6, the communication interface unit 370 includes a conversion unit 372 and a communication unit 374.

The conversion unit 372 converts (inversely converts) the communication data received by the communication unit 374 into data that the processing unit 332 and the storage unit 340 can handle, and provides it to the processing unit 332 and the storage unit 340. For example, the conversion unit 372 functions as the fourth conversion unit in the present embodiment: it converts the communication m-th layer output value that the communication unit 374 received from the communication network 5B (the second communication network in the present embodiment) into the m-th layer output value and provides it to the processing unit 332.

The communication unit 374 transmits communication data to other devices, or receives communication data from other devices, via the communication network 5B or directly. For example, the communication unit 374 functions as the second reception unit in the present embodiment and receives, from the communication network 5B, the communication m-th layer output value transmitted by the intermediate server 2-2.

(Operation example)
The configuration example of the monitoring system 900-2 according to the second embodiment of the present invention has been described above. Next, an operation example of the present embodiment is described with reference to FIG. 7. FIG. 7 is a sequence diagram showing the processing flow of the monitoring system 900-2 according to the present embodiment.

As shown in FIG. 7, the surveillance camera 1-2 first acquires image data through imaging by the imaging unit 112 (S202). Next, the processing unit 132 of the surveillance camera 1-2 takes the image data as input, performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN2, and outputs the k-th layer output value (S204).

Next, the conversion unit 172 of the surveillance camera 1-2 converts the k-th layer output value into a communication k-th layer output value (S206). The communication unit 174 of the surveillance camera 1-2 then transmits the communication k-th layer output value to the communication network 5A, and the communication unit 274 of the intermediate server 2-2 receives it from the communication network 5A (S208).

Next, the conversion unit 272 of the intermediate server 2-2 converts the communication k-th layer output value received in step S208 into the k-th layer output value (S210). The processing unit 232 of the intermediate server 2-2 then takes the k-th layer output value as input, performs neural network processing using the (k+1)-th layer L_{k+1} through the m-th layer L_m of the neural network NN2, and outputs the m-th layer output value (S212).

Next, the conversion unit 272 of the intermediate server 2-2 converts the m-th layer output value into a communication m-th layer output value (S214). The communication unit 274 of the intermediate server 2-2 then transmits the communication m-th layer output value to the communication network 5B, and the communication unit 374 of the recognition server 3-2 receives it from the communication network 5B (S216).

Next, the conversion unit 372 of the recognition server 3-2 converts the communication m-th layer output value received in step S216 into the m-th layer output value (S218). The processing unit 332 of the recognition server 3-2 then takes the m-th layer output value as input, performs neural network processing using the (m+1)-th layer L_{m+1} through the n-th layer L_n of the neural network NN2, and outputs the n-th layer output value (S220).
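The three processing steps S204, S212, and S220 can be sketched as three slices of one sequential network. This is a minimal illustration, not the implementation of the embodiment: the toy layers, the function names, and the check `1 <= k < m < n` are assumptions, and the conversion and transmission steps (S206 to S210, S214 to S218) are omitted.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_layers(layers: Sequence[Layer], first: int, last: int,
               x: List[float]) -> List[float]:
    """Apply layers L_first..L_last (1-indexed, inclusive) to x."""
    for layer in layers[first - 1:last]:
        x = layer(x)
    return x


def make_toy_network(n: int) -> List[Layer]:
    """Hypothetical stand-in for NN2: layer i adds i to every element."""
    return [lambda v, i=i: [e + i for e in v] for i in range(1, n + 1)]


def split_inference(layers: Sequence[Layer], k: int, m: int,
                    image: List[float]) -> List[float]:
    """Run the network in three parts, split at boundaries k and m."""
    n = len(layers)
    if not (1 <= k < m < n):  # assumed validity condition for the sketch
        raise ValueError("expected 1 <= k < m < n")
    kth_output = run_layers(layers, 1, k, image)           # camera (S204)
    mth_output = run_layers(layers, k + 1, m, kth_output)  # intermediate server (S212)
    return run_layers(layers, m + 1, n, mth_output)        # recognition server (S220)
```

Because each device only applies a contiguous range of layers, the split result equals the result of running all n layers on a single device, whatever values of k and m are chosen within the allowed range.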

(Effect)
The second embodiment of the present invention has been described above. According to the present embodiment, neural network processing that takes as input the image data acquired through imaging by the surveillance camera 1-2 is shared among three devices: the surveillance camera 1-2, the intermediate server 2-2, and the recognition server 3-2. As described above, the integer k indicating the processing sharing boundary between the surveillance camera 1-2 and the intermediate server 2-2, and the integer m indicating the processing sharing boundary between the intermediate server 2-2 and the recognition server 3-2, can be set independently of the processing associated with each layer, so the processing can be divided in finer units. Furthermore, image data itself is not communicated between the devices; only the communication k-th layer output value or the communication m-th layer output value is communicated. With this configuration, privacy is unlikely to be infringed even if the content of the communication between the surveillance camera 1-2 and the intermediate server 2-2 via the communication network 5A, or between the intermediate server 2-2 and the recognition server 3-2 via the communication network 5B, is intercepted.

Although the second embodiment describes an example in which the neural network processing is shared among three devices, the present invention is not limited to this example, and the neural network processing may be shared among four or more devices.

<2-3. Third Embodiment>
(Overview)
In the first and second embodiments described above, the integer k and the integer m indicating the processing sharing boundaries between devices are set in advance, but the processing sharing boundaries can also be determined dynamically. An example in which the processing sharing boundaries are determined dynamically is described below as the third embodiment of the present invention. In the following, the monitoring system 900 according to the third embodiment of the present invention is referred to as the monitoring system 900-3, and the surveillance camera 1, the intermediate server 2, and the recognition server 3 of the monitoring system 900-3 are referred to as the surveillance camera 1-3, the intermediate server 2-3, and the recognition server 3-3, respectively.

FIG. 8 is an explanatory diagram outlining the third embodiment of the present invention. FIG. 8 shows the neural network NN3 used by the monitoring system 900-3 according to the present embodiment. As shown in FIG. 8, the neural network NN3 is a neural network composed of n layers. In the present embodiment, n is an integer of 3 or more.

The neural network NN3 shown in FIG. 8 is a neural network in which the output value of each layer is input to the next layer (the layer to its right). The output value of each layer is not limited to a scalar value and may be, for example, a vector value. As shown in FIG. 8, the layers constituting the neural network NN3 according to the present embodiment are referred to, in order from the left, as the first layer L_1, the second layer L_2, the third layer L_3, ..., and the n-th layer L_n.

In the present embodiment, the layers of the neural network NN3 are clearly divided between face detection processing and face authentication processing, as shown in FIG. 8. In the neural network NN3, the first layer L_1 through the q-th layer L_q correspond to the face detection processing, and the (q+1)-th layer L_{q+1} through the n-th layer L_n correspond to the face authentication processing. In other words, the q-th layer output value, which is the output value of the q-th layer L_q, includes information about the face detection result. Here, q is an integer from 1 to n-1.

Like the monitoring system 900-2 according to the second embodiment, the monitoring system 900-3 according to the present embodiment shares the neural network processing using the neural network NN3 among the surveillance camera 1-3, the intermediate server 2-3, and the recognition server 3-3. In the example shown in FIG. 8, the surveillance camera 1-3 is in charge of the neural network processing of the first layer L_1 through the k-th layer L_k, the intermediate server 2-3 is in charge of the (k+1)-th layer L_{k+1} through the m-th layer L_m, and the recognition server 3-3 is in charge of the (m+1)-th layer L_{m+1} through the n-th layer L_n. Here, k is an integer from 1 to q-1 and indicates the processing sharing boundary between the surveillance camera 1-3 and the intermediate server 2-3 (more precisely, between their respective processing units, as described later). Likewise, m is an integer from q+1 to n-1 and indicates the processing sharing boundary between the intermediate server 2-3 and the recognition server 3-3 (more precisely, between their respective processing units, as described later).

In the present embodiment, the integer k indicating the processing sharing boundary between the surveillance camera 1-3 and the intermediate server 2-3, and the integer m indicating the processing sharing boundary between the intermediate server 2-3 and the recognition server 3-3, can change dynamically. The method for determining k and m is described in detail later, but as stated above, k is determined in the range from 1 to q-1 and m in the range from q+1 to n-1. With this configuration, the intermediate server 2-3 always performs the neural network processing that uses the q-th layer L_q. The intermediate server 2-3 can therefore obtain the q-th layer output value, which includes information about the result of the face detection processing performed on the image data of the current frame (first input data). Based on that q-th layer output value, the intermediate server 2-3 determines the resolution used for imaging (sensing) the image data of the next frame (second input data) and the processing sharing boundaries of the neural network processing that takes the image data of the next frame as input. This configuration makes it possible, for example, to perform more accurate face recognition while sharing the processing so that the required processing time is met.
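The reason the intermediate server always holds the q-th layer follows directly from the two ranges, as the following minimal sketch shows. The function names are hypothetical, and the ranges are the ones stated in this overview (k from 1 to q-1, m from q+1 to n-1).

```python
def valid_boundaries(k: int, m: int, q: int, n: int) -> bool:
    """Check the ranges stated in the overview for k and m."""
    return n >= 3 and 1 <= k <= q - 1 and q + 1 <= m <= n - 1


def intermediate_layers(k: int, m: int) -> range:
    """Layer indices handled by the intermediate server: k+1 .. m."""
    return range(k + 1, m + 1)


def always_contains_q(n: int) -> bool:
    """Exhaustively confirm, for a given n, that every valid (k, m, q)
    leaves layer q on the intermediate server."""
    return all(
        q in intermediate_layers(k, m)
        for q in range(1, n)
        for k in range(1, n)
        for m in range(1, n)
        if valid_boundaries(k, m, q, n)
    )
```

Since k + 1 <= q and q < m, the index q always lies in the range k+1 .. m, so the face detection result is available on the intermediate server for every allowed choice of boundaries.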

(Configuration example)
The outline of the monitoring system 900-3 according to the present embodiment has been described above. Next, a configuration example of the monitoring system 900-3 according to the present embodiment is described in more detail. FIG. 9 is a block diagram showing a configuration example of the monitoring system 900-3 according to the present embodiment. Because the monitoring system 900-3 shares part of its configuration with the monitoring system 900-1 according to the first embodiment and the monitoring system 900-2 according to the second embodiment, overlapping descriptions are omitted as appropriate.

As shown in FIG. 9, the surveillance camera 1-3 is an information processing device including an imaging unit 113, a processing unit 133, a storage unit 140, a processing control unit 163, and a communication interface unit 180. The function of the storage unit 140 shown in FIG. 9 is the same as that of the storage unit 140 described with reference to FIG. 6, so its description is omitted.

Like the imaging unit 112 described with reference to FIG. 6, the imaging unit 113 is a camera module that acquires image data (an example of sensing data) by imaging (an example of sensing). However, the imaging unit 113 according to the present embodiment differs from the imaging unit 112 in that it performs imaging at a frame rate (temporal resolution) and a resolution (spatial resolution) set under the control of the processing control unit 163 described later.

The processing unit 133 performs neural network processing. The neural network processing performed by the processing unit 133 can be specified by, for example, the neural network parameters stored in the storage unit 140. The neural network parameters stored in the storage unit 140 may correspond to the entire neural network NN3, or only to the first layer L_1 through the q-th layer L_q of the neural network NN3.

The processing unit 133 functions as the first neural network processing unit in the present embodiment. It takes the image data provided from the imaging unit 113 as input and performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN3 shown in FIG. 8. It then outputs the k-th layer output value, which is the output value of the k-th layer L_k, to the communication interface unit 180.

As described above, the integer k indicating the processing sharing boundary between the processing unit 133 and the processing unit 233 of the intermediate server 2-3 (described later) can change dynamically. For example, the processing unit 133 performs neural network processing using the first layer L_1 through the k-th layer L_k under the control of the processing control unit 163 described later.

The processing control unit 163 controls the imaging unit 113 and the processing unit 133 based on the processing control information provided from the communication interface unit 180 described later. The processing control information provided from the communication interface unit 180 to the processing control unit 163 may include, for example, information on the frame rate (temporal resolution) and the resolution (spatial resolution) used for imaging (an example of sensing) by the imaging unit 113. The processing control unit 163 may change the frame rate (temporal resolution) and resolution (spatial resolution) settings of the imaging unit 113 based on the processing control information. The processing control information may also include the value of the integer k indicating the processing sharing boundary between the processing unit 133 and the processing unit 233 of the intermediate server 2-3 described later.
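As a rough illustration of how such processing control information might be represented and applied on the camera side, consider the sketch below. The class names, field names, and units are hypothetical; the description only states that the information may include the frame rate, the resolution, and the value of k. Each field is optional here to mirror that wording.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProcessingControlInfo:
    """Hypothetical container; a field left as None is not updated."""
    frame_rate_fps: Optional[float] = None
    resolution: Optional[Tuple[int, int]] = None  # (width, height) in pixels
    k: Optional[int] = None                       # camera/server boundary


@dataclass(frozen=True)
class CameraSettings:
    """Hypothetical current settings of the imaging and processing units."""
    frame_rate_fps: float
    resolution: Tuple[int, int]
    k: int


def apply_control(settings: CameraSettings,
                  info: ProcessingControlInfo) -> CameraSettings:
    """Return new settings with every field present in info overwritten."""
    updates = {name: value for name, value in vars(info).items()
               if value is not None}
    return replace(settings, **updates)
```

For example, applying `ProcessingControlInfo(frame_rate_fps=30.0, k=2)` to settings of 15 fps, 640x480, and k = 3 changes only the frame rate and k and leaves the resolution as it was.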

The communication interface unit 180 mediates communication between the surveillance camera 1-3 and other devices. The communication interface unit 180 supports any wireless or wired communication protocol and establishes communication connections with other devices via the communication network 5A or directly. As shown in FIG. 9, the communication interface unit 180 includes a conversion unit 182 and a communication unit 184.

The conversion unit 182 converts data into a format that the communication unit 184 can transmit (communication data). For example, the conversion unit 182 functions as the first conversion unit in the present embodiment: it converts the k-th layer output value of the neural network NN3 output from the processing unit 133 into a communication k-th layer output value and provides it to the communication unit 184.

The conversion unit 182 also converts (inversely converts) the communication data received by the communication unit 184 into data that the processing unit 133, the storage unit 140, and the processing control unit 163 can handle, and provides it to them. For example, the conversion unit 182 converts the communication processing control information that the communication unit 184 received from the communication network 5A into processing control information and provides it to the processing control unit 163.

The communication unit 184 transmits communication data to other devices, or receives communication data from other devices, via the communication network 5A or directly. For example, the communication unit 184 transmits the communication k-th layer output value provided by the conversion unit 182 to the communication network 5A (the first communication network in the present embodiment). The communication unit 184 also receives, from the communication network 5A, the communication processing control information transmitted by the intermediate server 2-3.

As shown in FIG. 9, the intermediate server 2-3 is an information processing device including a processing unit 233, a storage unit 240, a determination unit 253, a processing control unit 263, and a communication interface unit 280. The function of the storage unit 240 shown in FIG. 9 is the same as that of the storage unit 240 described with reference to FIG. 3, so its description is omitted.

The processing unit 233 performs neural network processing. The neural network processing performed by the processing unit 233 can be specified by, for example, the neural network parameters stored in the storage unit 240. In the present embodiment, the neural network parameters stored in the storage unit 240 may correspond to the entire neural network NN3.

The processing unit 233 functions as the second neural network processing unit in the present embodiment. It takes the k-th layer output value as input and performs neural network processing using the (k+1)-th layer L_{k+1} through the m-th layer L_m of the neural network NN3 shown in FIG. 8. It then outputs the m-th layer output value, which is the output value of the m-th layer L_m, to the communication interface unit 280.

As described above, the integer k indicating the processing sharing boundary between the processing unit 133 of the surveillance camera 1-3 and the processing unit 233, and the integer m indicating the processing sharing boundary between the processing unit 233 and the processing unit 333 of the recognition server 3-3 (described later), can change dynamically. For example, the processing unit 233 performs neural network processing using the (k+1)-th layer L_{k+1} through the m-th layer L_m under the control of the processing control unit 263 described later.

The determination unit 253 functions as a processing sharing determination unit and determines the value of the integer k indicating the processing sharing boundary between the processing unit 133 and the processing unit 233, and the value of the integer m indicating the processing sharing boundary between the processing unit 233 and the processing unit 333. For example, based on the q-th layer output value obtained in the neural network processing of the image data of the current frame (first input data), the determination unit 253 may determine the values of k and m to be used in the neural network processing of the image data of subsequent frames (second input data).

The determination unit 253 also functions as a resolution determination unit and determines the resolution (frame rate and image resolution) used for imaging by the imaging unit 113 of the surveillance camera 1-3. For example, based on the q-th layer output value obtained in the neural network processing of the image data of the current frame (first input data), the determination unit 253 may determine the resolution used for imaging the image data of subsequent frames (second input data).

As described above, the q-th layer output value includes information indicating the result of face detection; for example, the determination unit 253 can determine from the q-th layer output value whether a face has been detected. The determination unit 253 can therefore determine the value of k, the value of m, and the resolution based on the face detection result.

FIG. 10 is a table showing an example of how the determination unit 253 determines the integer k, the integer m, and the resolution (frame rate and image resolution). In FIG. 10, p, the initial value of k, is an integer from 1 to q, and r, the initial value of m, is an integer from q to n-1; each may, for example, be set in advance.

As shown in FIG. 10, when a face is detected, the determination unit 253, acting as the resolution determination unit, may set the resolution (frame rate and image resolution) higher so that face recognition is performed with higher accuracy. In that case, however, the overall processing load also increases, so the processing may not finish within the required processing time unless the processing division is changed. Therefore, when determining a higher resolution, the determination unit 253 may determine the value of k so that the processing load becomes smaller on whichever of the processing unit 133 and the processing unit 233 has the lower processing performance.

For example, in the present embodiment the processing unit 233 of the intermediate server 2-3 is assumed to have higher processing performance than the processing unit 133 of the surveillance camera 1-3. Accordingly, in the example shown in FIG. 10, when a face is detected and the frame rate and image resolution are set to high values, k is set to a value smaller than the initial value p. Similarly, the processing unit 333 of the recognition server 3-3 may have higher processing performance than the processing unit 233 of the intermediate server 2-3. Accordingly, in the example shown in FIG. 10, when a face is detected and the frame rate and image resolution are set to high values, m is set to a value smaller than the initial value r.

Also as shown in FIG. 10, when no face is detected, the determination unit 253, acting as the resolution determination unit, may set the resolution (frame rate and image resolution) lower in order to reduce the processing load. In that case the overall processing load also decreases, so the processing division can be changed so that the processing unit 133 takes on as much of the processing as possible while still meeting the required processing time, which further reduces the processing load on the downstream processing units 233 and 333. Accordingly, in the example shown in FIG. 10, when no face is detected and the frame rate and image resolution are set to low values, k is set to a value that is at least the initial value p and less than q. Likewise, in the example shown in FIG. 10, when no face is detected, m is set to a value that is at least the initial value r and less than n.

The method by which the determination unit 253 determines the value of the integer k, the value of the integer m, and the resolution has been described above with reference to FIG. 10. FIG. 10 shows only one example, and the method is not limited to it. For example, the determination unit 253 may determine the value of the integer k, the value of the integer m, and the resolution in a larger number of steps according to the number of detected faces.
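The two-way rule of FIG. 10 can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the names `ControlInfo` and `decide`, the `step` parameter, the clamping at the range limits, and the concrete frame rates and image sizes are all assumptions introduced here. Only the direction of each change (smaller k and m with higher resolution when a face is detected, larger k and m with lower resolution otherwise) comes from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlInfo:
    k: int             # boundary between camera and intermediate server
    m: int             # boundary between intermediate and recognition server
    frame_rate: int    # frames per second (illustrative value)
    resolution: tuple  # (width, height) in pixels (illustrative value)


def decide(face_detected: bool, p: int, r: int, q: int, n: int,
           step: int = 1) -> ControlInfo:
    """Choose k, m and the imaging resolution for the next frame.

    p and r are the initial values of k and m; q is the face-detection
    layer; n is the total number of layers.
    """
    if not (1 <= p <= q <= r <= n - 1):
        raise ValueError("expected 1 <= p <= q <= r <= n - 1")
    if face_detected:
        # Higher resolution: move work toward the faster downstream devices.
        k = max(1, p - step)   # clamped so the camera keeps at least layer 1
        m = max(q, r - step)   # clamped so the server still produces layer q
        return ControlInfo(k, m, frame_rate=30, resolution=(1920, 1080))
    # Lower resolution: let the upstream devices take on more layers.
    k = max(p, min(q - 1, p + step))
    m = max(r, min(n - 1, r + step))
    return ControlInfo(k, m, frame_rate=5, resolution=(640, 360))
```

A multi-step variant, as mentioned for the number of detected faces, could make `step` depend on the face count instead of using a fixed value.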

The determination unit 253 generates processing control information that includes the determined value of the integer k, the determined value of the integer m, and the resolution information, and provides it to the processing control unit 263 and the communication interface unit 280.

Returning to FIG. 9, the description continues. The processing control unit 263 controls the processing unit 233 based on the values of the integers k and m included in the processing control information provided by the determination unit 253. The processing control information provided from the determination unit 253 to the processing control unit 263 may include the information (values) of the integers k and m.

Like the communication interface unit 220 described with reference to FIG. 3, the communication interface unit 280 mediates communication between the intermediate server 2-3 and other devices. The communication interface unit 280 supports any wireless or wired communication protocol and establishes communication connections with other devices via the communication network 5A or the communication network 5B, or directly. As shown in FIG. 9, the communication interface unit 280 includes a conversion unit 282 and a communication unit 284.

The conversion unit 282 converts (inversely converts) communication data received by the communication unit 284 into data that the processing unit 233 and the storage unit 240 can handle, and provides the converted data to them. For example, the conversion unit 282 functions as the second conversion unit in the present embodiment: it converts the k-th layer output value for communication, which the communication unit 284 received from the communication network 5A, into the k-th layer output value and provides it to the processing unit 233.

The conversion unit 282 also converts data into a format that the communication unit 284 can transmit (communication data). For example, the conversion unit 282 functions as the third conversion unit in the present embodiment: it converts the m-th layer output value of the neural network NN3 output from the processing unit 233 into the m-th layer output value for communication and provides it to the communication unit 284. In addition, the conversion unit 282 according to the present embodiment converts the processing control information provided by the determination unit 253 into processing control information for communication and provides it to the communication unit 284.
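The description does not specify the communication format, so the following is only a sketch of what a conversion and its inverse might look like. It assumes a hypothetical wire layout (layer index and element count as unsigned 32-bit integers, followed by 32-bit floats, all in network byte order); the function names are likewise invented for illustration.

```python
import struct


def to_communication_data(layer_index: int, values: list) -> bytes:
    """Pack a layer output value into bytes (assumed layout, see lead-in)."""
    header = struct.pack("!II", layer_index, len(values))
    body = struct.pack(f"!{len(values)}f", *values)
    return header + body


def from_communication_data(payload: bytes) -> tuple:
    """Inverse conversion: recover the layer index and the output values."""
    layer_index, count = struct.unpack_from("!II", payload, 0)
    values = list(struct.unpack_from(f"!{count}f", payload, 8))
    return layer_index, values
```

Note that packing as 32-bit floats is lossy for values that are not exactly representable in that width; a real system would choose the element type to match the precision of the layer outputs.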

The communication unit 284 transmits communication data to, or receives communication data from, other devices via the communication network 5A, via the communication network 5B, or directly. For example, the communication unit 284 functions as the first reception unit in the present embodiment and receives, from the communication network 5A, the k-th layer output value for communication transmitted by the surveillance camera 1-3. The communication unit 284 also functions as the second transmission unit in the present embodiment and transmits the m-th layer output value for communication provided by the conversion unit 282 to the communication network 5B (the second communication network in the present embodiment). In addition, the communication unit 284 according to the present embodiment transmits the processing control information for communication provided by the conversion unit 282 to the communication network 5A and the communication network 5B.

As shown in FIG. 9, the recognition server 3-3 is an information processing device including a processing unit 333, a storage unit 340, and a communication interface unit 380. The function of the storage unit 340 shown in FIG. 9 is the same as that of the storage unit 340 described with reference to FIG. 3, so its description is omitted.

The processing unit 333 performs neural network processing. The neural network processing performed by the processing unit 333 can be specified, for example, by neural network parameters stored in the storage unit 340. In the present embodiment, the neural network parameters stored in the storage unit 340 may correspond to the entire neural network NN3, or only to the (q+1)-th layer L_{q+1} through the n-th layer L_n of the neural network NN3.

The processing unit 333 functions as the third neural network processing unit in the present embodiment: it takes the m-th layer output value as input and performs neural network processing using the (m+1)-th layer L_{m+1} through the n-th layer L_n of the neural network NN3 shown in FIG. 8. The n-th layer output value, which is the output value of the n-th layer L_n output by the processing unit 333, may be stored in the storage unit 340, may be displayed on a display unit (not shown), or may be converted into communication data by the conversion unit 382 and then transmitted to another device by the communication unit 384.

As described above, the integer m indicating the processing sharing boundary between the processing unit 233 of the intermediate server 2-3 and the processing unit 333 can change dynamically. For example, the processing unit 333 performs neural network processing using the (m+1)-th layer L_{m+1} through the n-th layer L_n under the control of the processing control unit 363 described later.
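Running only a contiguous range of layers, with the range chosen at run time, can be sketched as below. This is an illustration under simplifying assumptions: layers are modelled as plain Python callables on lists of floats, and `run_layers` and the toy layers are hypothetical names, not part of the embodiment.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]


def run_layers(layers: Sequence[Layer], first: int, last: int,
               x: List[float]) -> List[float]:
    """Apply layers L_first .. L_last (1-based, inclusive) to x.

    first == last + 1 denotes an empty range and returns x unchanged.
    """
    if not (1 <= first <= last + 1 <= len(layers) + 1):
        raise ValueError("invalid layer range")
    for layer in layers[first - 1:last]:
        x = layer(x)
    return x


# Toy four-layer network (n = 4) standing in for NN3.
LAYERS: List[Layer] = [
    lambda x: [2.0 * v for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [max(0.0, v) for v in x],
    lambda x: [sum(x)],
]


def split_run(x: List[float], k: int, m: int) -> List[float]:
    """Run the network in three parts split at boundaries k and m."""
    part1 = run_layers(LAYERS, 1, k, x)           # layers L_1 .. L_k
    part2 = run_layers(LAYERS, k + 1, m, part1)   # layers L_{k+1} .. L_m
    return run_layers(LAYERS, m + 1, len(LAYERS), part2)
```

Because each part simply continues from the previous part's output, the final result does not depend on where the boundaries k and m are placed; only the distribution of work changes.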

The processing control unit 363 controls the processing unit 333 based on the processing control information provided from the communication interface unit 380 described later. The processing control information provided from the communication interface unit 380 to the processing control unit 363 may include the information (value) of the integer m indicating the processing sharing boundary between the processing unit 233 and the processing unit 333.

The communication interface unit 380 mediates communication between the recognition server 3-3 and other devices. The communication interface unit 380 supports any wireless or wired communication protocol and establishes communication connections with other devices via the communication network 5B or directly. As shown in FIG. 9, the communication interface unit 380 includes a conversion unit 382 and a communication unit 384.

The conversion unit 382 converts (inversely converts) communication data received by the communication unit 384 into data that the processing unit 333 and the storage unit 340 can handle, and provides the converted data to them. For example, the conversion unit 382 functions as the fourth conversion unit in the present embodiment: it converts the m-th layer output value for communication, which the communication unit 384 received from the communication network 5B (the second communication network in the present embodiment), into the m-th layer output value and provides it to the processing unit 333. The conversion unit 382 also converts the processing control information for communication, which the communication unit 384 received from the communication network 5B, into the processing control information and provides it to the processing control unit 363.

The communication unit 384 transmits communication data to, or receives communication data from, other devices via the communication network 5B or directly. For example, the communication unit 384 functions as the second reception unit in the present embodiment and receives, from the communication network 5B, the m-th layer output value for communication transmitted by the intermediate server 2-3. The communication unit 384 also receives, from the communication network 5B, the processing control information for communication transmitted by the intermediate server 2-3.

(Operation example)
The configuration example of the monitoring system 900-3 according to the third embodiment of the present invention has been described above. Next, an operation example of the present embodiment is described with reference to FIG. 11. FIG. 11 is a sequence diagram showing the processing flow of the monitoring system 900-3 according to the present embodiment.

As shown in FIG. 11, the surveillance camera 1-3 first acquires image data through imaging by the imaging unit 113 (S302). Next, the processing unit 133 of the surveillance camera 1-3 takes the image data as input, performs neural network processing using the first layer L_1 through the k-th layer L_k of the neural network NN3, and outputs the k-th layer output value (S304).

Next, the conversion unit 182 of the surveillance camera 1-3 converts the k-th layer output value into the k-th layer output value for communication (S306). The communication unit 184 of the surveillance camera 1-3 then transmits the k-th layer output value for communication to the communication network 5A, and the communication unit 284 of the intermediate server 2-3 receives it from the communication network 5A (S308).

Next, the conversion unit 282 of the intermediate server 2-3 converts the k-th layer output value for communication received in step S308 into the k-th layer output value (S310). The processing unit 233 of the intermediate server 2-3 then takes the k-th layer output value as input, performs neural network processing using the (k+1)-th layer L_{k+1} through the q-th layer L_q of the neural network NN3, and outputs the q-th layer output value (S312).

Next, based on the q-th layer output value indicating the face detection result, the determination unit 253 of the intermediate server 2-3 determines the value of the integer k, the value of the integer m, and the resolution, and generates processing control information that includes these values (S314). The conversion unit 282 of the intermediate server 2-3 then converts the processing control information into processing control information for communication (S316). The communication unit 284 of the intermediate server 2-3 transmits the processing control information for communication to the communication network 5A, and the communication unit 184 of the surveillance camera 1-3 receives it from the communication network 5A (S318).

Next, the conversion unit 182 of the surveillance camera 1-3 converts the processing control information for communication into the processing control information (S320). The processing control unit 163 of the surveillance camera 1-3 then changes the resolution setting for image acquisition (imaging) by the imaging unit 113 based on the processing control information (S322).

Next, the processing unit 233 of the intermediate server 2-3 performs neural network processing using the (q+1)-th layer L_{q+1} through the m-th layer L_m of the neural network NN3 and outputs the m-th layer output value (S324).

The conversion unit 282 of the intermediate server 2-3 then converts the m-th layer output value into the m-th layer output value for communication (S326). The communication unit 284 of the intermediate server 2-3 transmits the processing control information for communication and the m-th layer output value for communication to the communication network 5B, and the communication unit 384 of the recognition server 3-3 receives both from the communication network 5B (S328).

Next, the conversion unit 382 of the recognition server 3-3 converts the processing control information for communication and the m-th layer output value for communication received in step S328 into the processing control information and the m-th layer output value, respectively (S330). The processing unit 333 of the recognition server 3-3 then takes the m-th layer output value as input, performs neural network processing using the (m+1)-th layer L_{m+1} through the n-th layer L_n of the neural network NN3, and outputs the n-th layer output value (S332).

Steps S302 to S332 described above may be repeated as appropriate or as needed. The next frame is then captured at the resolution set in step S322, and the neural network processing that takes the image data of the next frame as input is controlled so that it is shared according to the new processing sharing boundaries determined in step S314.
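The per-frame loop of FIG. 11, in which the boundaries decided on one frame take effect on the next, can be sketched as a single-process simulation. Everything here is an illustrative assumption: the three devices are collapsed into one function, layers are scalar callables, communication and conversion steps are omitted, and `process_frames` and `decide` are hypothetical names.

```python
def run(layers, lo, hi, x):
    """Apply layers L_{lo+1} .. L_hi (1-based numbering) to x."""
    for layer in layers[lo:hi]:
        x = layer(x)
    return x


def process_frames(frames, layers, q, k, m, decide):
    """Return one (k, m, output) tuple per frame.

    decide(q_out) returns the (k, m) pair to use for the NEXT frame.
    """
    results = []
    for frame in frames:
        k_out = run(layers, 0, k, frame)            # S304: camera
        q_out = run(layers, k, q, k_out)            # S312: intermediate server
        next_k, next_m = decide(q_out)              # S314: determination unit
        m_out = run(layers, q, m, q_out)            # S324: intermediate server
        n_out = run(layers, m, len(layers), m_out)  # S332: recognition server
        results.append((k, m, n_out))
        k, m = next_k, next_m                       # takes effect next frame
    return results


def add_one(x):
    return x + 1


# Toy setup: n = 6 identical layers, face-detection layer q = 3, and a
# stand-in decision rule that treats a large q-th layer output as "face".
TOY_LAYERS = [add_one] * 6


def toy_decide(q_out):
    return (1, 3) if q_out > 10 else (2, 5)
```

With these toy values, the boundaries used for each frame are the ones computed from the previous frame's q-th layer output, while the final output of each frame is unaffected by where the boundaries sit.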

(Effects)
The third embodiment of the present invention has been described above. According to the present embodiment, the neural network processing that takes the image data acquired through imaging by the surveillance camera 1-3 as input is shared among three devices: the surveillance camera 1-3, the intermediate server 2-3, and the recognition server 3-3. In addition, as described above, the integer k indicating the processing sharing boundary between the surveillance camera 1-3 and the intermediate server 2-3 and the integer m indicating the processing sharing boundary between the intermediate server 2-3 and the recognition server 3-3 are determined dynamically, so the processing can be shared more flexibly. Furthermore, as described above, controlling the resolution used for imaging by the surveillance camera 1-3 makes it possible to perform face recognition with higher accuracy.

<<3. Modifications>>
The first, second, and third embodiments of the present invention have been described above. Several modifications of these embodiments are described below. Each modification described below may be applied to each embodiment on its own or in combination with other modifications. Each modification may also be applied in place of the configuration described in each embodiment, or in addition to it.

<3-1. Modification 1>
In the above embodiments, an example was described in which the monitoring system 900 includes one surveillance camera 1, one intermediate server 2, and one recognition server 3, but the present technology is not limited to this example. The monitoring system 900 may include a plurality of each device; for example, a plurality of intermediate servers 2 may correspond to one recognition server 3, and a plurality of surveillance cameras 1 may correspond to one intermediate server 2. With such a configuration, the third embodiment described above can reduce the processing load on the intermediate server 2 and the recognition server 3 while no face is detected, so processing resources can be used efficiently.

<3-2. Modification 2>
The above embodiments were described using, as an example, neural network processing with a neural network in which the output value of each layer is input to the layer with the next number (in order), but the present technology is not limited to this example. For example, the present technology is also applicable to neural network processing using a neural network in which the outputs of a plurality of layers are input to one layer, or in which the output of one layer is input to a plurality of layers. In such a case, the layers may be numbered so that the number of the layer that receives an output value is larger than the number of the layer that outputs it.
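One way to obtain a numbering with this property is a topological ordering of the layer graph. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later); the function name and the four-layer branching graph are assumptions for illustration, and the description itself does not say how the numbers are assigned.

```python
from graphlib import TopologicalSorter


def number_layers(inputs_of: dict) -> dict:
    """Assign 1-based numbers so every layer follows all of its inputs.

    inputs_of maps each layer name to the names of the layers whose
    outputs it consumes.
    """
    order = TopologicalSorter(inputs_of).static_order()
    return {name: index for index, name in enumerate(order, start=1)}


# Illustrative graph: one layer feeds two branches that merge again.
GRAPH = {
    "conv1": [],
    "branch_a": ["conv1"],
    "branch_b": ["conv1"],
    "merge": ["branch_a", "branch_b"],
}
NUMBERS = number_layers(GRAPH)
```

The relative order of `branch_a` and `branch_b` is not fixed by the graph, so either may receive number 2; what the ordering guarantees is that every receiving layer has a larger number than each layer it reads from.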

<3-3. Modification 3>
In the above embodiments, an example was described in which the neural network parameters are stored in advance in the storage unit 140, the storage unit 240, and the storage unit 340, but the present technology is not limited to this example. For example, the neural network parameters may be stored only in the storage unit of one device, and the other devices may receive the neural network parameters from that device.

<3-4. Modification 4>
In the third embodiment, an example was described in which the integers k and m indicating the processing sharing boundaries are determined based on the q-th layer output value indicating the face detection result, but the present technology is not limited to this example. For example, the determination unit 253 may determine the processing sharing boundaries based on the communication status, the current resource status of each processing unit, and the like, instead of or in addition to the q-th layer output value.
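As one possible reading of this modification, the boundaries could be chosen by exhaustive search so that the slowest stage of the pipeline is as fast as possible. The sketch below is an assumption throughout: the cost model (per-layer cost divided by device speed, layer output size divided by link bandwidth), the bottleneck objective, and all names are introduced here for illustration and are not taken from the description.

```python
from itertools import product


def choose_boundaries(cost, out_size, speed, bandwidth, q):
    """Return (k, m) minimising the slowest pipeline stage.

    cost[i]      : processing cost of layer L_{i+1}
    out_size[i]  : size of the output of layer L_{i+1}
    speed        : (camera, intermediate server, recognition server)
    bandwidth    : (network 5A, network 5B)
    q            : face-detection layer, with 1 <= k <= q <= m <= n - 1
    """
    n = len(cost)
    best = None
    for k, m in product(range(1, q + 1), range(q, n)):
        stage_times = (
            sum(cost[:k]) / speed[0],        # camera: L_1 .. L_k
            out_size[k - 1] / bandwidth[0],  # transfer of k-th layer output
            sum(cost[k:m]) / speed[1],       # intermediate: L_{k+1} .. L_m
            out_size[m - 1] / bandwidth[1],  # transfer of m-th layer output
            sum(cost[m:]) / speed[2],        # recognition: L_{m+1} .. L_n
        )
        bottleneck = max(stage_times)
        if best is None or bottleneck < best[0]:
            best = (bottleneck, k, m)
    return best[1], best[2]
```

Measured values for the speeds and bandwidths would stand in for the "resource status" and "communication status" mentioned above; the search space is small (at most q times n - q candidates), so brute force is adequate for a sketch.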

<<4. Hardware configuration example>>
The embodiments of the present invention have been described above. The information processing described above, such as the neural network processing, the sharing determination processing, and the resolution determination processing, is realized through cooperation between software and the hardware of the surveillance camera 1, the intermediate server 2, and the recognition server 3. The hardware configuration of an information processing device 1000 is described below as an example of the hardware configuration of the surveillance camera 1, the intermediate server 2, and the recognition server 3, which are information processing devices according to the embodiments of the present invention.

FIG. 12 is an explanatory diagram showing the hardware configuration of the information processing device 1000 according to the embodiments of the present invention. As shown in FIG. 12, the information processing device 1000 includes a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, a RAM (Random Access Memory) 1003, an input device 1004, an output device 1005, a storage device 1006, and a communication device 1007.

The CPU 1001 functions as an arithmetic processing device and a control device, and controls the overall operation of the information processing device 1000 according to various programs. The CPU 1001 may be a microprocessor. The ROM 1002 stores programs, calculation parameters, and the like used by the CPU 1001. The RAM 1003 temporarily stores programs used in execution by the CPU 1001 and parameters that change as appropriate during that execution. These components are connected to one another by a host bus composed of a CPU bus or the like. The functions of, for example, the processing units 132, 133, 231, 232, 233, 331, 332, and 333, the determination unit 253, and the processing control units 163, 263, and 363 are realized mainly through cooperation between the CPU 1001, the ROM 1002, the RAM 1003, and software.

The input device 1004 includes input means with which a user inputs information, such as a mouse, a keyboard, a touch panel, buttons, a microphone, switches, and levers, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 1001. By operating the input device 1004, the user of the information processing device 1000 can input various data to the information processing device 1000 and instruct it to perform processing operations.

The output device 1005 includes a display device such as a liquid crystal display (LCD) device, an OLED device, or a lamp. The output device 1005 further includes an audio output device such as a speaker or headphones. For example, the display device displays captured images, generated images, and the like, while the audio output device converts audio data and the like into sound and outputs it.

The storage device 1006 is a device for storing data. The storage device 1006 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 1006 stores programs executed by the CPU 1001 and various data. The storage device 1006 corresponds to the storage units 140, 240, and 340.

The communication device 1007 is a communication interface composed of, for example, a communication device for connecting to a communication network. The communication device 1007 may include a wireless LAN (Local Area Network) compatible communication device, an LTE (Long Term Evolution) compatible communication device, a wired communication device that performs communication by wire, or a Bluetooth (registered trademark) communication device. The communication device 1007 corresponds to the communication interface units 120, 170, 180, 220, 270, 280, 320, 370, and 380.

<<5. Conclusion>>
As described above, according to the embodiments of the present invention, processing can be shared in finer units. This makes it easier to meet the required processing time and, for example, easier to maintain the real-time performance of the processing.

Although preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention belongs could conceive of various changes or modifications within the scope of the technical ideas described in the claims, and these are naturally understood to belong to the technical scope of the present invention as well.

For example, the above embodiment describes an example in which the monitoring system 900 has the communication network 5A and the communication network 5B, but the present invention is not limited to this example. For example, the communication network 5A and the communication network 5B may be the same communication network.

Further, in the sequence diagrams of the above embodiment, the series of processes is started from the surveillance camera 1, but the series of processes may instead be started by the recognition server 3 accessing the surveillance camera 1 to initiate processing.

Further, the steps in the above embodiment do not necessarily have to be processed in time series in the order shown in the sequence diagrams. For example, the steps in the processing of the above embodiment may be processed in an order different from the order shown in the sequence diagrams, or may be processed in parallel.

Further, according to the above embodiment, it is also possible to provide a computer program for causing hardware such as the CPU 1001, the ROM 1002, and the RAM 1003 to exhibit functions equivalent to those of the components of the surveillance camera 1, the intermediate server 2, and the recognition server 3 described above. A recording medium on which the computer program is recorded is also provided.

1 Surveillance camera
2 Intermediate server
3 Recognition server
5 Communication network
111 Imaging unit
120 Communication interface unit
122 Conversion unit
124 Communication unit
132 Processing unit
140 Storage unit
163 Processing control unit
220 Communication interface unit
222 Conversion unit
224 Communication unit
231 Processing unit
240 Storage unit
253 Determination unit
263 Processing control unit
320 Communication interface unit
322 Conversion unit
324 Communication unit
331 Processing unit
340 Storage unit
363 Processing control unit
900 Monitoring system

Claims

An information processing system that performs neural network processing using a neural network composed of at least n layers, where n is an integer of 2 or more, the information processing system comprising:
a first neural network processing unit that, where k is an integer of 1 or more and n-1 or less, takes input data as an input, performs the neural network processing using the first through k-th layers of the neural network, and outputs a k-th layer output value;
a first conversion unit that converts the k-th layer output value into a k-th layer output value for communication;
a first transmitting unit that transmits the k-th layer output value for communication to a first communication network;
a first receiving unit that receives the k-th layer output value for communication from the first communication network;
a second conversion unit that converts the k-th layer output value for communication received by the first receiving unit into the k-th layer output value;
a second neural network processing unit that takes the k-th layer output value as an input and performs the neural network processing using at least the (k+1)-th layer of the neural network; and
a processing sharing determination unit that, where q is an integer of 1 or more and n or less, determines, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes first input data as an input, the value of k in the neural network processing that takes as an input second input data different from the first input data.

The information processing system according to claim 1, wherein the processing sharing determination unit determines the value of k, which indicates the boundary of processing sharing between the first neural network processing unit and the second neural network processing unit.

The information processing system according to claim 1 or 2, wherein
the first input data and the second input data are sensing data acquired by sensing, and
the information processing system further comprises a resolution determination unit that determines, based on the q-th layer output value, a resolution related to the sensing for acquiring the second input data.

The information processing system according to claim 3, wherein, when the resolution determination unit determines a higher value as the resolution, the processing sharing determination unit determines the value of k so that the processing load becomes smaller on whichever of the first neural network processing unit and the second neural network processing unit has the lower processing performance.

The information processing system according to any one of claims 1 to 4, wherein
the first input data and the second input data are image data,
the neural network is a neural network for recognizing an object included in the image data, and
the q-th layer output value includes information on a detection result of the object.

The information processing system according to any one of claims 1 to 5, wherein
n is an integer of 3 or more,
where m is an integer of k+1 or more and n-1 or less, the second neural network processing unit performs the neural network processing using the (k+1)-th through m-th layers of the neural network and outputs an m-th layer output value, and
the information processing system further comprises:
a third conversion unit that converts the m-th layer output value into an m-th layer output value for communication;
a second transmitting unit that transmits the m-th layer output value for communication to a second communication network;
a second receiving unit that receives the m-th layer output value for communication from the second communication network;
a fourth conversion unit that converts the m-th layer output value for communication received by the second receiving unit into the m-th layer output value; and
a third neural network processing unit that takes the m-th layer output value as an input and performs the neural network processing using at least the (m+1)-th layer of the neural network.

An information processing device comprising:
a first neural network processing unit that, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, performs neural network processing using the first through k-th layers of a neural network composed of at least n layers and outputs a k-th layer output value;
a first conversion unit that converts the k-th layer output value into a k-th layer output value for communication;
a first transmitting unit that transmits the k-th layer output value for communication to a first communication network; and
a processing sharing determination unit that, where q is an integer of 1 or more and n or less, determines, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes first input data as an input, the value of k in the neural network processing that takes as an input second input data different from the first input data.

A program for causing a computer to realize:
a function of performing, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less, neural network processing using the first through k-th layers of a neural network composed of at least n layers, and outputting a k-th layer output value;
a function of converting the k-th layer output value into a k-th layer output value for communication;
a function of transmitting the k-th layer output value for communication to a first communication network; and
a function of determining, where q is an integer of 1 or more and n or less, based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes first input data as an input, the value of k in the neural network processing that takes as an input second input data different from the first input data.

An information processing system comprising:
a first receiving unit that receives, from a first communication network, a k-th layer output value for communication obtained by converting a k-th layer output value that is output by neural network processing using the first through k-th layers of a neural network composed of at least n layers, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less;
a second conversion unit that converts the k-th layer output value for communication received by the first receiving unit into the k-th layer output value; and
a second neural network processing unit that takes the k-th layer output value as an input and performs the neural network processing using at least the (k+1)-th layer of the neural network,
wherein, where q is an integer of 1 or more and n or less, the value of k in the neural network processing that takes as an input second input data different from first input data is determined based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as an input.

A program for causing a computer to realize:
a function of receiving, from a first communication network, a k-th layer output value for communication obtained by converting a k-th layer output value that is output by neural network processing using the first through k-th layers of a neural network composed of at least n layers, where n is an integer of 2 or more and k is an integer of 1 or more and n-1 or less;
a function of converting the k-th layer output value for communication into the k-th layer output value; and
a function of taking the k-th layer output value as an input and performing the neural network processing using at least the (k+1)-th layer of the neural network,
wherein, where q is an integer of 1 or more and n or less, the value of k in the neural network processing that takes as an input second input data different from first input data is determined based on a q-th layer output value output by the q-th layer of the neural network in the neural network processing that takes the first input data as an input.
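
The determination of the resolution and of the value of k recited in the claims above can be illustrated with a minimal sketch. It is one hypothetical policy, not the rule used in the embodiment: it assumes that the q-th layer output value has already been reduced to a boolean "object detected" flag, that a detection triggers sensing at a higher resolution for the next input, and that at high resolution the boundary k is moved so that the unit with the lower processing performance handles fewer layers. The function names, the two resolutions, and the pixel threshold are all assumptions made for illustration.

```python
def decide_resolution(detection_found, low=(640, 360), high=(1920, 1080)):
    """Hypothetical rule: sense the next input at the higher resolution
    once an object has been detected in the current input."""
    return high if detection_found else low

def decide_k(n, resolution, first_is_weaker, high_pixels=1920 * 1080):
    """Pick the boundary k (1 <= k <= n-1) for the next input.

    At low resolution the layers are split roughly in half. At high
    resolution the boundary moves so that the weaker unit runs fewer layers.
    """
    if n < 2:
        raise ValueError("n must be 2 or more")
    width, height = resolution
    is_high = width * height >= high_pixels
    balanced = max(1, min(n - 1, n // 2))
    if not is_high:
        return balanced
    # The first unit runs layers 1..k, so a small k lightens the first unit
    # and a large k lightens the second unit.
    return 1 if first_is_weaker else n - 1
```

For example, with n = 6 and a first unit that is the weaker of the two, a frame without a detection keeps k at 3, while a frame with a detection raises the resolution and lowers k to 1 for the next input.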