JP2018125777A

JP2018125777A - Imaging apparatus, learning server, and imaging system

Info

Publication number: JP2018125777A
Application number: JP2017017950A
Authority: JP
Inventors: 裕司川合; Yuji Kawai
Original assignee: Panasonic Intellectual Property Management Co Ltd
Current assignee: Panasonic Intellectual Property Management Co Ltd
Priority date: 2017-02-02
Filing date: 2017-02-02
Publication date: 2018-08-09

Abstract

PROBLEM TO BE SOLVED: To provide a technique for transmitting required information while restraining an increase in traffic. SOLUTION: An imaging part 50 captures an image. A receiving part 56 receives, from a learning server 40, the learning results of a neural network trained with images that include a processing object. A determination part 60 inputs the image captured by the imaging part 50 to the neural network reflecting the learning results received by the receiving part 56, and determines, based on the output from the neural network, whether or not a processing object is included in the image captured by the imaging part 50. When the determination part 60 determines that the processing object is included in the image, a transmission part 58 transmits the image captured by the imaging part 50. The receiving part 56 receives, from the learning server 40, the learning results of the neural network trained with the image transmitted from the transmission part 58. SELECTED DRAWING: Figure 1

Description

The present invention relates to imaging technology, and in particular to an imaging device, a learning server, and an imaging system that transmit captured video.

A drive recorder usually includes a video capturing unit, an acceleration sensor, a GPS (Global Positioning System), a memory unit, a communication module, and the like. Such a drive recorder automatically records images within a fixed period before and after an accident, situations in which there was a risk of an accident, and the like. Meanwhile, there is a demand for road information and the like to be transmitted easily, either automatically or at will, as the automobile travels, and for users to make accurate use of that abundant road information among themselves. For this reason, a data center and each user's drive recorder are connected via a communication network, and road images, traveling speeds, traveling routes, and the like from the drive recorders are accumulated in the data center and used (see, for example, Patent Document 1).

Japanese Patent Laid-Open No. 2015-210713

If data, particularly video, is transmitted continuously from the drive recorder to the data center, the amount of communication becomes large. If, on the other hand, the timing at which video is transmitted is restricted, the amount of communication decreases, but the necessary video may not be transmitted.

The present invention has been made in view of these circumstances, and its object is to provide a technique for transmitting necessary information while suppressing an increase in the amount of communication.

To solve the above problem, an imaging device according to one aspect of the present invention includes: an imaging unit that captures an image; a receiving unit that receives, from a learning server, a learning result of a neural network trained with images that include a processing target; a determination unit that inputs the image captured by the imaging unit to a neural network reflecting the learning result received by the receiving unit and determines, based on the output from that neural network, whether the image captured by the imaging unit includes the processing target; and a transmission unit that transmits the image captured by the imaging unit when the determination unit determines that the image includes the processing target. The receiving unit receives, from the learning server, a learning result of the neural network trained with the image transmitted by the transmission unit.

Another aspect of the present invention is a learning server. This learning server can communicate with an imaging device that captures an image, and includes: a learning unit that trains a neural network with images that include a processing target; a transmission unit that transmits the learning result of the learning unit to the imaging device; and a receiving unit that receives an image captured by the imaging device when, in the imaging device, the captured image has been input to a neural network reflecting the learning result transmitted by the transmission unit and the image has been determined, based on the output from that neural network, to include the processing target. The learning unit trains the neural network with the image received by the receiving unit.

Yet another aspect of the present invention is an imaging system. This imaging system includes a learning server that trains a neural network with images that include a processing target and transmits the learning result, and an imaging device that captures an image and receives the learning result from the learning server. The imaging device includes a determination unit that inputs the captured image to a neural network reflecting the received learning result and determines, based on the output from that neural network, whether the captured image includes the processing target, and a transmission unit that transmits the captured image when the determination unit determines that the image includes the processing target. The learning server trains the neural network with the image transmitted by the transmission unit.

Any combination of the above constituent elements, and any conversion of the expression of the present invention between a method, an apparatus, a system, a recording medium, a computer program, and the like, is also valid as an aspect of the present invention.

According to the present invention, necessary information can be transmitted while suppressing an increase in the amount of communication.

FIG. 1 is a diagram showing the configuration of the imaging system according to the embodiment.
FIG. 2 is a diagram showing the data structure of the learning result transmitted from the learning server of FIG. 1.
FIG. 3 is a sequence diagram showing the transmission procedure of the imaging system of FIG. 1.
FIG. 4 is a flowchart showing the transmission procedure of the imaging device of FIG. 1.

Before the embodiment of the present invention is described in detail, an outline is given. The embodiment relates to an imaging system that includes an imaging device, such as a drive recorder mounted on a vehicle, and a storage server connected to the imaging device via a network. In the imaging system, the imaging device transmits video to the storage server, and the storage server stores the video. If the imaging device transmits video continuously, the amount of network traffic increases, and so do the load on the network and the load on the storage server. Communication charges also increase. If the timing at which the imaging device transmits video is restricted in order to prevent this, the necessary video may not be transmitted. The object of this embodiment is to transmit the necessary video while reducing the amount of communication.

The imaging system according to this embodiment further includes a learning server, and the learning server trains a neural network with images that include a processing target. Here, processing targets include objects such as vehicles, pedestrians, bicycles, and motorcycles, and scenes such as crossing collisions, rear-end collisions, and dart-outs. The learning result at the learning server is therefore set so that images including these processing targets can be detected. The learning server transmits the learning result to the imaging device. The imaging device updates its neural network to reflect the received learning result and inputs each image contained in the captured video to the neural network, thereby detecting images that include a processing target. When the imaging device detects an image that includes a processing target, it transmits video of a fixed period containing that image to the storage server via the network. The learning server, in turn, repeats the training described above using the video accumulated in the storage server and sequentially transmits the learning results to the imaging device. The imaging device updates its neural network again to reflect each received learning result and repeats the processing described above.

FIG. 1 shows the configuration of an imaging system 100 according to the embodiment. The imaging system 100 includes a vehicle 10, an imaging device 20, a storage server 30, and a learning server 40. The imaging device 20 includes an imaging unit 50, a processing unit 52, a storage unit 54, a receiving unit 56, and a transmission unit 58, and the processing unit 52 includes a determination unit 60. The learning server 40 includes a receiving unit 70, a learning unit 72, and a transmission unit 74. Here, the imaging device 20 is a drive recorder mounted on the vehicle 10.

The learning server 40 can communicate with the imaging device 20, which captures video composed of a combination of images. The learning unit 72 of the learning server 40 trains a neural network with images that include a processing target. Processing targets are classified into target scenes (hereinafter sometimes simply called "scenes") and target objects (hereinafter sometimes called "objects"). As noted above, scenes include crossing collisions, rear-end collisions, dart-outs, and the like, and objects include vehicles, pedestrians, bicycles, motorcycles, and the like. That is, because the imaging device 20 is assumed to be a drive recorder, the processing targets are scenes and objects in which a traffic accident has occurred or is highly likely to occur.

The neural network is trained by deep learning, one of the techniques of machine learning, which is based on a method modeled on the neural circuits of the human brain. In a neural network, perceptrons imitating the brain's neurons are stacked and combined in three or more layers, and the network learns the features of data from various angles and at various stages, so that it automatically acquires a deep understanding of them. A neural network is therefore suited to image discrimination, which cannot be done without grasping complex features. Known techniques may be used for the neural network and its training, so their description is omitted here.

The learning unit 72 decomposes the video of each scene, such as a crossing collision, a rear-end collision, or a dart-out, into per-frame images and inputs them in chronological order to a neural network constructed in advance. The neural network thereby learns the features of each scene as a time series. The learning unit 72 improves the learning accuracy by repeating such training tens of thousands to hundreds of thousands of times. The learning unit 72 likewise decomposes the video of each object, such as a vehicle, a pedestrian, a bicycle, or a motorcycle, into per-frame images and performs the same processing. As a result, the learning unit 72 obtains the learning result of the neural network for the scenes and the objects.
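The frame-by-frame, time-ordered input described above can be pictured with a minimal Python sketch. It is an illustration only: the `Frame` type, the label strings, and the function names are assumptions introduced here, and the actual training step of the learning unit 72 is not shown because the source leaves it to known techniques.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one decoded image of a video."""
    timestamp: float  # seconds from the start of the clip (assumed unit)
    pixels: bytes     # placeholder for the image data


def frames_in_time_order(clip: Iterable[Frame]) -> List[Frame]:
    """Return the frames of one clip sorted chronologically."""
    return sorted(clip, key=lambda frame: frame.timestamp)


def training_samples(
    labelled_clips: Iterable[Tuple[str, Iterable[Frame]]]
) -> Iterator[Tuple[Frame, str]]:
    """Yield (frame, label) pairs clip by clip, frames in time order.

    The label is the scene or object the clip shows, for example
    "rear_end_collision" or "pedestrian" (illustrative names).
    """
    for label, clip in labelled_clips:
        for frame in frames_in_time_order(clip):
            yield frame, label
```

A training loop would consume `training_samples(...)` and pass each pair to whichever network implementation is in use; repeating this over many clips corresponds to the tens of thousands of repetitions mentioned in the text.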

The transmission unit 74 transmits the learning result of the learning unit 72 to the imaging device 20 via the network. FIG. 2 shows the data structure of the learning result transmitted from the learning server 40, that is, the parameters contained in the learning result. The optimization function includes a function name such as Adam and its parameters. The network model includes a base such as VGG, layer settings such as convolution, the number of max-pooling layers, and fully connected layers, and BatchNormalization (yes or no). The activation function includes a function such as relu. The dropout includes parameters such as dropout (yes or no) and ratio, and the loss calculation includes softmax_cross_entropy and the like. The description now returns to FIG. 1.
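As a rough illustration of the data structure of FIG. 2, the five items listed above could be carried in a record like the one below. This is a sketch under assumptions: the field names, the JSON encoding, and every concrete value (layer counts, `alpha`, the dropout ratio) are invented for the example and are not stated in the source.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class LearningResult:
    """Hypothetical container mirroring the items listed for FIG. 2."""
    optimizer: Dict[str, object]      # function name (e.g. Adam) and its parameters
    network_model: Dict[str, object]  # base, layer settings, batch normalization
    activation: str                   # e.g. "relu"
    dropout: Dict[str, object]        # dropout yes/no and ratio
    loss: str                         # e.g. "softmax_cross_entropy"

    def to_json(self) -> str:
        """Serialize for transmission (JSON is an assumed wire format)."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "LearningResult":
        """Rebuild the record on the receiving side."""
        return cls(**json.loads(text))


# Illustrative values only; none of these numbers come from the source.
example = LearningResult(
    optimizer={"name": "Adam", "params": {"alpha": 0.001}},
    network_model={
        "base": "VGG",
        "conv_layers": 13,
        "max_pool_layers": 5,
        "fully_connected_layers": 3,
        "batch_normalization": True,
    },
    activation="relu",
    dropout={"enabled": True, "ratio": 0.5},
    loss="softmax_cross_entropy",
)
```

On the imaging device side, `LearningResult.from_json(...)` would give the determination unit 60 the settings it needs to update its network.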

The receiving unit 56 of the imaging device 20 receives the learning result from the learning server 40 and outputs it to the processing unit 52. The imaging unit 50 captures video. The captured video consists of a plurality of images arranged in time series, so the imaging unit 50 can also be said to capture images. The imaging unit 50 converts the captured video and images into digital data. In the following, video and images converted into digital data are also referred to simply as video and images. The imaging unit 50 outputs the video to the processing unit 52.

The storage unit 54 stores the video captured by the imaging unit 50 in accordance with an instruction from the processing unit 52. At least some of the images contained in the video include information for specifying time, such as a time stamp (hereinafter called "time information"), and the storage unit 54 stores the time information as well.

The determination unit 60 updates the neural network to reflect the learning result received by the receiving unit 56. The determination unit 60 also inputs each image stored in the storage unit 54, that is, each image captured by the imaging unit 50, to the neural network. Based on the output from the neural network, the determination unit 60 determines whether each image includes a processing target. This corresponds to the trained neural network judging each image and detecting scenes and objects. In other words, the determination unit 60 determines whether each image includes any of the scenes and objects described above.

When the determination unit 60 detects an image that includes both a scene and an object, it acquires video of a fixed period containing that image from the storage unit 54. The fixed period is defined, for example, to run from a time before the image to a time after it. The time information described above is used to acquire the video of this fixed period. Here, the video of the fixed period is acquired when an image including both a scene and an object is detected, but it may instead be acquired when an image including either one of them is detected.
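Selecting the fixed-period video around a detection can be sketched as a simple filter on the stored time information. The function below is a minimal illustration; the `(timestamp, frame_id)` representation and the default lengths of the period before and after the detection are assumptions, since the source does not give concrete durations.

```python
from typing import List, Sequence, Tuple

# (timestamp in seconds, frame identifier); an assumed representation
TimedFrame = Tuple[float, str]


def clip_around(
    frames: Sequence[TimedFrame],
    detected_at: float,
    before: float = 10.0,  # assumed length of the period before the detection
    after: float = 10.0,   # assumed length of the period after the detection
) -> List[TimedFrame]:
    """Return the stored frames whose time stamp lies in the window
    [detected_at - before, detected_at + after]."""
    start = detected_at - before
    end = detected_at + after
    return [frame for frame in frames if start <= frame[0] <= end]
```

For example, with one stored frame per second, `clip_around(frames, 30.0, before=2.0, after=3.0)` keeps the frames stamped 28 through 33 seconds.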

When the determination unit 60 determines that an image includes a processing target, the transmission unit 58 transmits the video of the fixed period to the storage server 30 via the network. This corresponds to transmitting the image captured by the imaging unit 50. The storage server 30 receives the video from the imaging device 20 and accumulates it. The learning server 40 is connected to the storage server 30 via the network. The storage server 30 and the learning server 40 may also be configured as a single unit.

The receiving unit 70 of the learning server 40 receives the video from the storage server 30. This corresponds to receiving video that was captured by the imaging device 20 and that contains an image determined by the imaging device 20 to include a processing target. The receiving unit 70 outputs the video to the learning unit 72. The learning unit 72 trains the neural network described above with the video received by the receiving unit 70. That is, the learning unit 72 decomposes the received video into per-frame images and inputs them in chronological order to the neural network constructed in advance. The subsequent processing is the same as before, so its description is omitted here. The learning result obtained in this way is also transmitted from the transmission unit 74 to the imaging device 20.

The receiving unit 56 of the imaging device 20 newly receives, from the learning server 40, the learning result of the neural network trained with the video transmitted by the transmission unit 58. The determination unit 60 updates the neural network to reflect the learning result newly received by the receiving unit 56. The subsequent processing is the same as before, so its description is omitted here. In this way, the training of the neural network at the learning server 40 and the determination by the neural network at the imaging device 20 are executed repeatedly in a loop.

Only one imaging device 20 mounted on one vehicle 10 is shown here. However, the imaging system 100 may include a plurality of vehicles 10 each carrying an imaging device 20. In such a configuration, the learning server 40 receives the video captured by each of the plurality of imaging devices 20, and the learning unit 72 trains the neural network with the video captured by each of them. This increases the number of images used to train the neural network.

In terms of hardware, this configuration can be realized by the CPU, memory, and other LSIs of any computer, and in terms of software, by a program loaded into memory or the like; what is drawn here are the functional blocks realized by their cooperation. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms, by hardware alone or by a combination of hardware and software.

The operation of the imaging system 100 configured as above is now described. FIG. 3 is a sequence diagram showing the transmission procedure of the imaging system 100. The learning server 40 trains the neural network (S10). The learning server 40 transmits the learning result (S12). The imaging device 20 updates the neural network (S14). The imaging device 20 captures video (S16). The imaging device 20 detects a processing target in an image contained in the video (S18). The imaging device 20 transmits video of a fixed period (S20). The learning server 40 trains the neural network with the images contained in the video (S22). The learning server 40 transmits the learning result (S24). The imaging device 20 updates the neural network (S26).
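The sequence S10 to S26 can be traced with a toy simulation. The sketch below is not an implementation of the system: the two classes, the integer "version" standing in for a learning result, and the string-prefix check standing in for the neural network's judgment are all assumptions made so that the order of the steps can be followed in a few lines.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LearningServer:
    """Toy stand-in for the learning server 40."""
    training_images: List[str] = field(default_factory=list)
    version: int = 0

    def learn(self, images: List[str]) -> int:
        """S10 / S22: add images to the training set.

        The returned integer stands in for the learning result
        that is transmitted in S12 / S24.
        """
        self.training_images.extend(images)
        self.version += 1
        return self.version


@dataclass
class ImagingDevice:
    """Toy stand-in for the imaging device 20."""
    network_version: int = 0

    def update_network(self, result: int) -> None:
        """S14 / S26: reflect the received learning result."""
        self.network_version = result

    def capture_and_select(self, video: List[str]) -> List[str]:
        """S16 to S20: keep only frames flagged as containing a target.

        The prefix test is a placeholder for the network's output.
        """
        return [frame for frame in video if frame.startswith("target")]


server = LearningServer()
device = ImagingDevice()

device.update_network(server.learn(["seed-1", "seed-2"]))  # S10 to S14
sent = device.capture_and_select(
    ["road", "target-a", "road", "target-b"]
)                                                          # S16 to S20
device.update_network(server.learn(sent))                  # S22 to S26
```

Only the two flagged frames leave the device, and they are exactly the frames that enlarge the server's training set in the second round, which is the loop the sequence diagram describes.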

FIG. 4 is a flowchart showing the transmission procedure of the imaging device 20. The receiving unit 56 receives the learning result (S50). The determination unit 60 updates the neural network (S52). When the determination unit 60 detects a target scene (Y in S54) and detects a target object (Y in S56), the transmission unit 58 transmits the video from before and after the detection time (S58). When the determination unit 60 does not detect a target scene (N in S54) or does not detect a target object (N in S56), the processing ends.
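The branching of FIG. 4 can be written out as a short function. This is a sketch only: each unit is represented by a caller-supplied callable, and those parameter names are hypothetical. The step labels in the comments are the ones used in the flowchart.

```python
from typing import Callable


def run_transmission_flow(
    receive_result: Callable[[], object],
    update_network: Callable[[object], None],
    scene_detected: Callable[[], bool],
    object_detected: Callable[[], bool],
    send_clip: Callable[[], None],
) -> bool:
    """Follow the flowchart once; return True if video was transmitted."""
    result = receive_result()   # S50: receive the learning result
    update_network(result)      # S52: update the neural network
    if not scene_detected():    # S54: N, so the processing ends
        return False
    if not object_detected():   # S56: N, so the processing ends
        return False
    send_clip()                 # S58: transmit video around the detection time
    return True
```

Replacing the two checks with a single `scene_detected() or object_detected()` test would give the alternative mentioned in the description, in which detecting either a scene or an object is enough to trigger transmission.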

According to this embodiment, an image is transmitted when a neural network trained with images that include a processing target detects an image that includes the processing target, so the necessary video can be transmitted for a limited period only. Because the necessary video is transmitted only for that period, necessary information can be transmitted while suppressing an increase in the amount of communication. Because a neural network trained with images that include a target scene is used, images that include the target scene can be detected. Likewise, because a neural network trained with images that include an object is used, images that include the object can be detected.

In addition, the imaging device is made to use a neural network trained with images that include a processing target and to transmit an image when it detects an image that includes the processing target, so necessary information can be received while suppressing an increase in the amount of communication. Because the neural network is trained with the received images, the learning accuracy can be improved. Because the neural network is trained with images captured by each of a plurality of imaging devices, the number of images used for training can be increased, and because that number increases, the learning accuracy can be improved.

An outline of one aspect of the present invention is as follows. An imaging device according to one aspect of the present invention includes: an imaging unit that captures an image; a receiving unit that receives, from a learning server, a learning result of a neural network trained with images that include a processing target; a determination unit that inputs the image captured by the imaging unit to a neural network reflecting the learning result received by the receiving unit and determines, based on the output from that neural network, whether the image captured by the imaging unit includes the processing target; and a transmission unit that transmits the image captured by the imaging unit when the determination unit determines that the image includes the processing target. The receiving unit receives, from the learning server, a learning result of the neural network trained with the image transmitted by the transmission unit.

According to this aspect, an image is transmitted when a neural network trained with images that include a processing target detects an image that includes the processing target, so necessary information can be transmitted while suppressing an increase in the amount of communication.

The receiving unit may receive, from the learning server, a learning result of a neural network trained with images that include a target scene, and the determination unit may determine whether the image captured by the imaging unit includes the target scene. In this case, because a neural network trained with images that include the target scene is used, images that include the target scene can be detected.

The receiving unit may receive, from the learning server, a learning result of a neural network trained with images that include a target object, and the determination unit may determine whether the image captured by the imaging unit includes the target object. In this case, because a neural network trained with images that include the target object is used, images that include the target object can be detected.

According to this aspect, the imaging device is made to use a neural network trained with images that include a processing target and to transmit an image when it detects an image that includes the processing target, so necessary information can be received while suppressing an increase in the amount of communication.

The receiving unit may receive images captured by each of a plurality of imaging devices, and the learning unit may train the neural network with the images that are received by the receiving unit and were captured by each of the plurality of imaging devices. In this case, because the neural network is trained with images captured by each of the plurality of imaging devices, the learning accuracy can be improved.

The present invention has been described above based on an embodiment. The embodiment is illustrative, and those skilled in the art will understand that various modifications are possible in the combinations of its constituent elements and processing steps, and that such modifications also fall within the scope of the present invention.

In this embodiment, the imaging device 20 is mounted on the vehicle 10. However, this is not a limitation; for example, the imaging device 20 may be installed somewhere other than the vehicle 10 and used like a surveillance camera. This modification broadens the range of application.

DESCRIPTION OF SYMBOLS: 10 vehicle, 20 imaging device, 30 storage server, 40 learning server, 50 imaging unit, 52 processing unit, 54 storage unit, 56 receiving unit, 58 transmission unit, 60 determination unit, 70 receiving unit, 72 learning unit, 74 transmission unit, 100 imaging system.

Claims

An imaging device comprising:
an imaging unit that captures an image;
a receiving unit that receives, from a learning server, a learning result of a neural network trained with an image including a processing target;
a determination unit that inputs the image captured by the imaging unit to a neural network reflecting the learning result received by the receiving unit, and determines, based on an output from the neural network, whether or not the image captured by the imaging unit includes the processing target; and
a transmission unit that transmits the image captured by the imaging unit when the determination unit determines that the image includes the processing target,
wherein the receiving unit receives, from the learning server, a learning result of the neural network trained with the image transmitted by the transmission unit.

The imaging device according to claim 1, wherein
the receiving unit receives, from the learning server, a learning result of a neural network trained with an image including a target scene, and
the determination unit determines whether or not the image captured by the imaging unit includes the target scene.

The imaging device according to claim 1, wherein
the receiving unit receives, from the learning server, a learning result of a neural network trained with an image including a target object, and
the determination unit determines whether or not the image captured by the imaging unit includes the target object.

A learning server capable of communicating with an imaging device that captures an image, the learning server comprising:
a learning unit that trains a neural network with an image including a processing target;
a transmission unit that transmits a learning result of the learning unit to the imaging device; and
a receiving unit that receives an image captured by the imaging device when, in the imaging device, the captured image has been input to a neural network reflecting the learning result transmitted by the transmission unit and the image has been determined, based on an output from the neural network, to include the processing target,
wherein the learning unit trains the neural network with the image received by the receiving unit.

The learning server according to claim 4, wherein
the receiving unit receives images captured by each of a plurality of imaging devices, and
the learning unit trains the neural network with the images that are received by the receiving unit and were captured by each of the plurality of imaging devices.

An imaging system comprising:
a learning server that trains a neural network with an image including a processing target and transmits a learning result; and
an imaging device that captures an image and receives the learning result from the learning server,
wherein the imaging device includes:
a determination unit that inputs the captured image to a neural network reflecting the received learning result, and determines, based on an output from the neural network, whether or not the captured image includes the processing target; and
a transmission unit that transmits the captured image when the determination unit determines that the image includes the processing target, and
wherein the learning server trains the neural network with the image transmitted by the transmission unit.