JP2021168459A

JP2021168459A - Apparatus, system, imaging apparatus, mobile, program and method

Info

Publication number: JP2021168459A
Application number: JP2020071850A
Authority: JP
Inventors: 数史佐藤; Kazufumi Sato
Original assignee: SZ DJI Technology Co Ltd
Current assignee: SZ DJI Technology Co Ltd
Priority date: 2020-04-13
Filing date: 2020-04-13
Publication date: 2021-10-21

Abstract

To reduce the data volume of middle layer data generated by multilayer neural network processing of image data that constitutes a moving image. SOLUTION: An imaging apparatus is configured to: generate first and second middle layer data 601-612... by subjecting the image data of each of first and second moving image component images 400, 401 that constitute a moving image to processing up to a specific middle layer in a multilayer neural network; generate motion data between the first middle layer data and the second middle layer data based on motion information between the first moving image component image and the second moving image component image; generate residual data between the first middle layer data and the second middle layer data by performing motion compensation between the first middle layer data and the second middle layer data based on the motion data; generate, based on the residual data, middle layer data to be transmitted to another apparatus that performs the processing subsequent to the specific middle layer in the multilayer neural network; and transmit the middle layer data and the motion data to the other apparatus. SELECTED DRAWING: Figure 6

Description

The present invention relates to an apparatus, a system, an imaging apparatus, a mobile body, a program, and a method.

Non-Patent Document 1 describes a technique in which part of the processing of a neural network is executed at the mobile edge and the remainder of the neural network is executed on a cloud server.
[Prior art literature]
[Patent Document]
[Non-Patent Document 1] Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, Lingjia Tang, "Neurosurgeon: Collaborative intelligence between the cloud and mobile edge", ACM SIGARCH Computer Architecture News, April 2017

An apparatus according to a first aspect of the present invention includes a circuit configured to generate first intermediate layer data by performing processing up to a specific intermediate layer of a multilayer neural network on the image data of a first moving image constituent image constituting a moving image. The circuit is configured to generate second intermediate layer data by performing processing up to the specific intermediate layer of the multilayer neural network on the image data of a second moving image constituent image constituting the moving image. The circuit is configured to generate motion data between the first intermediate layer data and the second intermediate layer data based on motion information between the first moving image constituent image and the second moving image constituent image. The circuit is configured to generate residual data between the first intermediate layer data and the second intermediate layer data by performing motion compensation between them based on the motion data. The circuit is configured to generate, based on the residual data, intermediate layer data to be transmitted to another apparatus that performs the processing subsequent to the specific intermediate layer of the multilayer neural network. The circuit is configured to transmit the intermediate layer data and the motion data to the other apparatus.

The circuit is configured to generate quantized data by quantizing the residual data. The circuit may be configured to generate the intermediate layer data based on the quantized data.

The circuit may be configured to generate the intermediate layer data by entropy coding the quantized data.

The multilayer neural network is a convolutional neural network and may include one or more intermediate layers for performing a convolution operation on the first moving image constituent image and the second moving image constituent image.

The circuit may be configured to generate the motion data by scaling the motion information based on the sizes of the first and second moving image constituent images and the sizes of the first and second intermediate layer data.

The multilayer neural network may be a neural network for performing super-resolution processing of a moving image or image recognition processing on a moving image.

In the multilayer neural network, the neural network parameters from the input layer, which precedes the specific intermediate layer, to the output layer, which follows the specific intermediate layer, have been learned using training data. The circuit may be configured to store the neural network parameters from the input layer to the specific intermediate layer.

The multilayer neural network may have a plurality of intermediate layers. The circuit may be configured to select the specific intermediate layer from among the plurality of intermediate layers based on at least one of the communication line capacity available to the apparatus, the load state of the apparatus, and the power supply state of the apparatus.
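The selection of the specific intermediate layer can be illustrated with a small sketch. The text only states which inputs the selection may depend on, so the policy below is an assumption made for illustration: the names `LayerProfile` and `select_split_layer`, the compute budget formula, and the per-layer figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LayerProfile:
    index: int          # position of the candidate intermediate layer
    output_bytes: int   # size of that layer's output per frame
    device_cost: float  # relative on-device compute needed to reach the layer (0..1)


def select_split_layer(layers, link_bytes_per_frame, load, on_battery):
    """Hypothetical policy: prefer the cheapest layer whose output fits the link."""
    # Assumed compute budget: what the current load leaves over, halved on battery.
    budget = (1.0 - load) * (0.5 if on_battery else 1.0)
    affordable = [p for p in layers if p.device_cost <= budget]
    if not affordable:
        # Nothing fits the budget: fall back to the cheapest layer to reach.
        affordable = [min(layers, key=lambda p: p.device_cost)]
    fitting = [p for p in affordable if p.output_bytes <= link_bytes_per_frame]
    if fitting:
        return min(fitting, key=lambda p: p.device_cost).index
    # No affordable layer fits the link: send the smallest output available.
    return min(affordable, key=lambda p: p.output_bytes).index


# Illustrative profiles only; deeper layers cost more compute but emit less data.
layers = [
    LayerProfile(1, 800_000, 0.1),
    LayerProfile(2, 400_000, 0.3),
    LayerProfile(3, 100_000, 0.7),
]

fast_link = select_split_layer(layers, 500_000, load=0.2, on_battery=False)
slow_link = select_split_layer(layers, 150_000, load=0.2, on_battery=False)
slow_link_battery = select_split_layer(layers, 150_000, load=0.2, on_battery=True)
busy_device = select_split_layer(layers, 1_000_000, load=0.95, on_battery=False)
```

With these made-up profiles, a slower link pushes the split deeper into the network, while battery operation or a heavy load pulls it back toward the input layer.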

An apparatus according to a second aspect of the present invention communicates with another apparatus that performs processing up to a specific intermediate layer of a multilayer neural network on a first moving image constituent image and a second moving image constituent image constituting a moving image, and performs the processing subsequent to the specific intermediate layer of the multilayer neural network. The apparatus includes a circuit configured to receive, from the other apparatus, (i) first intermediate layer data generated by performing processing up to the specific intermediate layer of the multilayer neural network on the first moving image constituent image, (ii) motion data between the first intermediate layer data and second intermediate layer data, the second intermediate layer data being generated by performing processing up to the specific intermediate layer of the multilayer neural network on the second moving image constituent image, and (iii) residual data between the first intermediate layer data and the second intermediate layer data, the residual data being generated by performing motion compensation based on the motion data. The circuit is configured to generate the second intermediate layer data based on the motion data and the residual data. The circuit is configured to perform the processing subsequent to the specific intermediate layer of the multilayer neural network on the first intermediate layer data and the second intermediate layer data.

A system according to a third aspect of the present invention includes the apparatus according to the first aspect and the apparatus according to the second aspect.

An imaging apparatus according to a fourth aspect of the present invention includes the above apparatus and an image sensor that generates an image.

A mobile body according to a fifth aspect of the present invention includes the above imaging apparatus and moves.

The mobile body may be an unmanned aerial vehicle.

A program according to a sixth aspect of the present invention causes a computer to function as the above apparatus. The program may be recorded on a non-transitory recording medium.

A method according to a seventh aspect of the present invention includes generating first intermediate layer data by performing processing up to a specific intermediate layer of a multilayer neural network on the image data of a first moving image constituent image constituting a moving image. The method includes generating second intermediate layer data by performing processing up to the specific intermediate layer of the multilayer neural network on the image data of a second moving image constituent image constituting the moving image. The method includes generating motion data between the first intermediate layer data and the second intermediate layer data based on motion information between the first moving image constituent image and the second moving image constituent image. The method includes generating residual data between the first intermediate layer data and the second intermediate layer data by performing motion compensation between them based on the motion data. The method includes generating, based on the residual data, intermediate layer data to be transmitted to another apparatus that performs the processing subsequent to the specific intermediate layer of the multilayer neural network. The method includes transmitting the intermediate layer data and the motion data to the other apparatus.

A method according to an eighth aspect of the present invention communicates with another apparatus that performs processing up to a specific intermediate layer of a multilayer neural network on a first moving image constituent image and a second moving image constituent image constituting a moving image, and performs the processing subsequent to the specific intermediate layer of the multilayer neural network. The method includes receiving, from the other apparatus, (i) first intermediate layer data generated by performing processing up to the specific intermediate layer of the multilayer neural network on the first moving image constituent image, (ii) motion data between the first intermediate layer data and second intermediate layer data, the second intermediate layer data being generated by performing processing up to the specific intermediate layer of the multilayer neural network on the second moving image constituent image, and (iii) residual data between the first intermediate layer data and the second intermediate layer data, the residual data being generated by performing motion compensation based on the motion data. The method includes generating the second intermediate layer data based on the motion data and the residual data. The method includes performing the processing subsequent to the specific intermediate layer of the multilayer neural network on the first intermediate layer data and the second intermediate layer data.

According to the above aspects of the present invention, the amount of neural network intermediate layer data transmitted to another apparatus can be reduced.

The above summary of the invention does not enumerate all of the necessary features of the present invention. Sub-combinations of these feature groups may also constitute inventions.

A diagram showing an example of an external perspective view of the imaging apparatus 100 according to the present embodiment.
A diagram showing the functional blocks of the imaging apparatus 100 according to the present embodiment.
A schematic overall view of the system 310 including the imaging apparatus 100 and the server 180.
A schematic view of the moving image constituent images that make up a moving image captured by the imaging apparatus 100.
A schematic view of moving image constituent images being processed by the multilayer neural network 580.
A schematic outline of the compression processing of intermediate layer data in the imaging apparatus 100.
A schematic view of the block configuration of the control unit 110.
A schematic view of the block configuration of the server 180.
A flowchart showing the flow of the processing by which the imaging apparatus 100 generates intermediate layer data.
A flowchart showing the flow of the processing executed by the server 180.
A diagram showing an example of an unmanned aerial vehicle (UAV).
A diagram showing an example of a computer 1200 in which a plurality of aspects of the present invention may be embodied in whole or in part.

Hereinafter, the present invention will be described through embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention. It will be apparent to those skilled in the art that various changes or improvements can be made to the following embodiments. It is clear from the description of the claims that forms incorporating such changes or improvements may also be included in the technical scope of the present invention.

The claims, description, drawings, and abstract contain matter that is subject to copyright protection. The copyright holder does not object to reproduction of these documents by any person as they appear in the files or records of the Patent Office. In all other cases, however, all copyrights are reserved.

Various embodiments of the present invention may be described with reference to flowcharts and block diagrams, in which a block may represent (1) a stage of a process in which an operation is performed or (2) a "unit" of an apparatus responsible for performing the operation. Specific stages and "units" may be implemented by programmable circuits and/or processors. Dedicated circuits may include digital and/or analog hardware circuits, and may include integrated circuits (ICs) and/or discrete circuits. Programmable circuits may include reconfigurable hardware circuits. Reconfigurable hardware circuits may include logical AND, logical OR, logical XOR, logical NAND, logical NOR, and other logical operations, as well as memory elements such as flip-flops, registers, field-programmable gate arrays (FPGAs), and programmable logic arrays (PLAs).

A computer-readable medium may include any tangible device capable of storing instructions to be executed by a suitable device. As a result, a computer-readable medium having instructions stored thereon constitutes a product that includes instructions that can be executed to create means for performing the operations specified in the flowcharts or block diagrams. Examples of computer-readable media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, semiconductor storage media, and the like. More specific examples of computer-readable media may include floppy® disks, diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), electrically erasable programmable read-only memory (EEPROM), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), Blu-ray® discs, memory sticks, integrated circuit cards, and the like.

Computer-readable instructions may include either source code or object code written in any combination of one or more programming languages. The computer-readable instructions may be assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, or state-setting data. The programming languages may include object-oriented programming languages such as Smalltalk®, JAVA®, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. Computer-readable instructions may be provided to a processor or programmable circuit of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet. The processor or programmable circuit may execute the computer-readable instructions to create means for performing the operations specified in the flowcharts or block diagrams. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, microcontrollers, and the like.

FIG. 1 is a diagram showing an example of an external perspective view of the imaging apparatus 100 according to the present embodiment. FIG. 2 is a diagram showing the functional blocks of the imaging apparatus 100 according to the present embodiment.

The imaging apparatus 100 includes an imaging unit 102 and a lens unit 200. The imaging unit 102 includes an image sensor 120, a control unit 110, a memory 130, an instruction unit 162, a display unit 160, and a communication unit 170.

The image sensor 120 may be a CCD or CMOS sensor. The image sensor 120 receives light through the lens 210 of the lens unit 200. The image sensor 120 outputs image data of the optical image formed through the lens 210 to the control unit 110.

The control unit 110 may be composed of a microprocessor such as a CPU or MPU, a microcontroller such as an MCU, or the like. The memory 130 may be a computer-readable recording medium and may include at least one of SRAM, DRAM, EPROM, EEPROM, and flash memory such as USB memory. The control unit 110 corresponds to the circuit. The memory 130 stores programs and other data that the control unit 110 needs in order to control the image sensor 120 and other components. The memory 130 may be provided inside the housing of the imaging apparatus 100. The memory 130 may be provided so as to be removable from the housing of the imaging apparatus 100.

The instruction unit 162 is a user interface that receives instructions for the imaging apparatus 100 from a user. The display unit 160 displays images captured by the image sensor 120 and processed by the control unit 110, various setting information of the imaging apparatus 100, and the like. The display unit 160 may be a touch panel.

The control unit 110 controls the lens unit 200 and the image sensor 120. For example, the control unit 110 controls the focal position and focal length of the lens 210. The control unit 110 controls the lens unit 200 by outputting control commands to the lens control unit 220 of the lens unit 200 based on information indicating instructions from the user.

The lens unit 200 includes one or more lenses 210, a lens driving unit 212, a lens control unit 220, and a memory 222. In the present embodiment, the one or more lenses 210 are collectively referred to as the "lens 210". The lens 210 may include a focus lens and a zoom lens. At least some or all of the lenses included in the lens 210 are arranged so as to be movable along the optical axis of the lens 210. The lens unit 200 may be an interchangeable lens that is detachably attached to the imaging unit 102.

The lens driving unit 212 moves at least some or all of the lens 210 along the optical axis of the lens 210. In accordance with a lens control command from the imaging unit 102, the lens control unit 220 drives the lens driving unit 212 to move the entire lens 210, or the zoom lens or focus lens included in the lens 210, along the optical axis, thereby performing at least one of a zoom operation and a focus operation. Lens control commands are, for example, zoom control commands and focus control commands.

The lens driving unit 212 may include a voice coil motor (VCM) that moves at least some or all of the plurality of lenses 210 in the optical axis direction. The lens driving unit 212 may include an electric motor such as a DC motor, a coreless motor, or an ultrasonic motor. The lens driving unit 212 may transmit power from the electric motor to at least some or all of the plurality of lenses 210 via mechanical members such as a cam ring and a guide shaft, thereby moving at least some or all of the lens 210 along the optical axis.

The memory 222 stores control values for the focus lens and zoom lens that are moved via the lens driving unit 212. The memory 222 may include at least one of SRAM, DRAM, EPROM, EEPROM, and flash memory such as USB memory.

The control unit 110 controls the image sensor 120, including its imaging operation, by outputting control commands to the image sensor 120 based on information indicating user instructions acquired through the instruction unit 162 or the like. The control unit 110 acquires images captured by the image sensor 120. The control unit 110 performs image processing on the images acquired from the image sensor 120 and stores them in the memory 130.

The communication unit 170 handles communication with external devices. The communication unit 170 transmits information generated by the control unit 110 to the outside through a communication network. The communication unit 170 provides the control unit 110 with information received from the outside through the communication network.

FIG. 3 schematically shows an overall view of a system 310 including the imaging apparatus 100 and a server 180. The imaging apparatus 100 uses a multilayer neural network to process the moving image data that it has captured. Specifically, the imaging apparatus 100 performs processing up to a specific intermediate layer of the multilayer neural network and generates the data of that intermediate layer. The imaging apparatus 100 compresses the data of the intermediate layer using motion information of the moving image to generate intermediate layer data. The communication unit 170 of the imaging apparatus 100 transmits the intermediate layer data and the motion data to the server 180 through a communication network 190.

Upon receiving the intermediate layer data and the motion data, the server 180 decompresses the intermediate layer data using the motion data to obtain the data of the intermediate layer. The server 180 then applies the data of the intermediate layer to the specific intermediate layer of the multilayer neural network and performs the processing subsequent to the specific intermediate layer. The server 180 transmits a processing result based on the information of the output layer of the multilayer neural network to the imaging apparatus 100 through the communication network 190. For example, when the multilayer neural network is a neural network for image recognition, the server 180 transmits the image recognition result to the imaging apparatus 100.
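The server-side decompression can be sketched as motion-compensated prediction followed by addition of the residual. This is a minimal sketch under stated assumptions, not the patented implementation: it assumes a single-channel feature map, one global integer motion vector per frame, and zero filling at the borders, and the function names `shift` and `reconstruct_second` are hypothetical.

```python
def shift(feature_map, dy, dx):
    """Shift a 2D feature map by (dy, dx); uncovered positions are zero-filled."""
    height, width = len(feature_map), len(feature_map[0])
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = feature_map[src_y][src_x]
    return shifted


def reconstruct_second(first, residual, dy, dx):
    """Rebuild the second feature map from the first one, the motion data and the residual."""
    predicted = shift(first, dy, dx)
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(predicted, residual)]


# Toy data: the content moves one column to the right and one value changes.
first = [[1, 2, 0],
         [3, 4, 0]]
residual = [[0, 0, 0],
            [0, 0, 5]]
second = reconstruct_second(first, residual, dy=0, dx=1)
```

The reconstructed `second` would then be fed into the layers after the split point, in the same way as `first`.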

An outline of the neural network processing executed by the imaging apparatus 100 will now be described. The control unit 110 generates first intermediate layer data by performing processing up to a specific intermediate layer of the multilayer neural network on the image data of a first moving image constituent image constituting a moving image. The control unit 110 generates second intermediate layer data by performing processing up to the specific intermediate layer of the multilayer neural network on the image data of a second moving image constituent image constituting the moving image. The control unit 110 generates motion data between the first intermediate layer data and the second intermediate layer data based on motion information between the first moving image constituent image and the second moving image constituent image. The control unit 110 generates residual data between the first intermediate layer data and the second intermediate layer data by performing motion compensation between them based on the motion data. Based on the residual data, the control unit 110 generates intermediate layer data to be transmitted to another apparatus that performs the processing subsequent to the specific intermediate layer of the multilayer neural network. The communication unit 170 transmits the intermediate layer data and the motion data to the other apparatus. For example, the communication unit 170 transmits the intermediate layer data and the motion data to the server 180 through the communication network 190.

The control unit 110 generates quantized data by quantizing the residual data. The control unit 110 generates the intermediate layer data based on the quantized data. For example, the control unit 110 generates the intermediate layer data by entropy coding the quantized data.
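The quantization step can be illustrated with a uniform scalar quantizer. This is only a minimal sketch: the description does not specify the quantizer, so the step size `step` and the function names `quantize` and `dequantize` are assumptions made for illustration.

```python
def quantize(residual, step):
    """Map each residual value to the index of its nearest multiple of `step`.

    `step` is a hypothetical quantization step size; the description does
    not state how the quantizer is parameterized.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return [[int(round(v / step)) for v in row] for row in residual]


def dequantize(indices, step):
    """Approximate the original residual from the quantization indices."""
    return [[q * step for q in row] for row in indices]


# Small residual values collapse to index 0, which later favors entropy coding.
residual = [[0.04, -0.02, 0.51], [0.0, 1.26, -0.49]]
indices = quantize(residual, step=0.5)
restored = dequantize(indices, step=0.5)
print(indices)
print(restored)
```

A larger `step` drives more residual values to zero at the cost of a coarser reconstruction on the receiving side.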

The control unit 110 may generate the motion data by scaling the motion information based on the sizes of the first and second moving image constituent images and the sizes of the first and second intermediate layer data.

The multi-layer neural network may be a convolutional neural network. The multi-layer neural network may include one or more intermediate layers that perform a convolution operation on the first and second moving image constituent images. The multi-layer neural network may be a neural network for performing super-resolution processing of a moving image, or a neural network for performing image recognition processing on a moving image. The applications of the multi-layer neural network are not limited to super-resolution processing and image recognition processing. The multi-layer neural network may be any neural network that takes an image as input.

多層ニューラルネットワークは、多層ニューラルネットワークの特定の中間層より前段の入力層から特定の中間層より後段の出力層までのニューラルネットワークパラメータが、学習データを用いて学習されていてよい。制御部１１０は、入力層から特定の中間層までのニューラルネットワークパラメータを記憶する。例えば、制御部１１０は、入力層から特定の中間層までのニューラルネットワークパラメータを予めメモリ１３０に記憶する。なお、制御部１１０は、中間層より後段のニューラルネットワークパラメータを記憶しなくてよい。 In the multi-layer neural network, the neural network parameters from the input layer before the specific intermediate layer of the multi-layer neural network to the output layer after the specific intermediate layer may be learned by using the training data. The control unit 110 stores neural network parameters from the input layer to a specific intermediate layer. For example, the control unit 110 stores in advance the neural network parameters from the input layer to the specific intermediate layer in the memory 130. The control unit 110 does not have to store the neural network parameters in the stage after the intermediate layer.

The multi-layer neural network may have a plurality of intermediate layers. The control unit 110 selects the specific intermediate layer from among the plurality of intermediate layers of the multi-layer neural network based on at least one of the communication line capacity available to the image pickup apparatus 100, the load state of the image pickup apparatus 100, and the power supply state of the image pickup apparatus 100. For example, the smaller the communication line capacity available to the image pickup apparatus 100, the earlier the layer that the control unit 110 may select as the specific intermediate layer. The higher the load on the image pickup apparatus 100, the earlier the layer that the control unit 110 may select as the specific intermediate layer. The lower the remaining power supply capacity of the image pickup apparatus 100, the earlier the layer that the control unit 110 may select as the specific intermediate layer.
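The selection of the specific intermediate layer could be sketched as follows. The description states only the tendencies (less bandwidth, higher load, or lower remaining power each favor an earlier layer); the thresholds, the one-layer-per-condition stepping rule, and the function name `select_split_layer` are all hypothetical.

```python
def select_split_layer(num_layers, bandwidth_mbps, load, battery,
                       min_bandwidth_mbps=10.0, max_load=0.8, min_battery=0.2):
    """Return the 1-based index of the intermediate layer at which to split.

    Starts from the deepest intermediate layer and moves one layer toward
    the input for each constrained resource. All thresholds and the
    stepping rule are illustrative assumptions, not taken from the source.
    """
    if num_layers < 1:
        raise ValueError("need at least one intermediate layer")
    layer = num_layers
    if bandwidth_mbps < min_bandwidth_mbps:  # little line capacity available
        layer -= 1
    if load > max_load:                      # device is heavily loaded
        layer -= 1
    if battery < min_battery:                # remaining power is low
        layer -= 1
    return max(1, layer)


# Unconstrained device: split at the deepest of four intermediate layers.
relaxed = select_split_layer(4, bandwidth_mbps=50.0, load=0.3, battery=0.9)
# Constrained on all three axes: fall back to the first intermediate layer.
constrained = select_split_layer(4, bandwidth_mbps=5.0, load=0.9, battery=0.1)
print(relaxed, constrained)
```

Whatever rule is used, the receiving side needs to know which layer was chosen so that it can resume processing at the matching point.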

Next, an outline of the processing executed by the server 180 is described. The server 180 is an apparatus that performs the processing following the specific intermediate layer of the multi-layer neural network. The server 180 receives from the image pickup apparatus 100: (i) first intermediate layer data generated by processing a first moving image constituent image up to the specific intermediate layer of the multi-layer neural network; (ii) motion data between the first intermediate layer data and second intermediate layer data, the latter generated by processing a second moving image constituent image up to the specific intermediate layer; and (iii) residual data between the first intermediate layer data and the second intermediate layer data, generated by performing motion compensation based on the motion data. The server 180 generates the second intermediate layer data based on the motion data and the residual data. The server 180 then performs, on the first and second intermediate layer data, the processing following the specific intermediate layer of the multi-layer neural network.

図４は、撮像装置１００により撮像される動画を構成する動画構成画像を模式的に示す。動画は、動画構成画像４００、動画構成画像４０１、動画構成画像４０２、動画構成画像４０３・・・のように時系列の複数の動画構成画像を含んで構成される。 FIG. 4 schematically shows a moving image constituent image constituting a moving image captured by the imaging device 100. The moving image is composed of a plurality of time-series moving image constituent images such as the moving image constituent image 400, the moving image constituent image 401, the moving image constituent image 402, the moving image constituent image 403, and so on.

FIG. 5 schematically shows how the multi-layer neural network 580 processes the moving image constituent images. The multi-layer neural network 580 includes an input layer 500, an intermediate layer 501, an intermediate layer 502, an intermediate layer 503, an intermediate layer 508, and an output layer 509. The multi-layer neural network 580 is, for example, a convolutional neural network (CNN) for performing image recognition processing on the moving image constituent images or super-resolution processing of the moving image constituent images. The moving image constituent images 400, 401, 402, and so on are input to the input layer 500 one after another in a predetermined order. A moving image constituent image input to the input layer 500 is processed in sequence by the plurality of intermediate layers, including the intermediate layers 501, 502, 503, and 508, and is output from the output layer 509. The multi-layer neural network 580 generates a processing result based on the data of the output layer 509.

本実施形態において、多層ニューラルネットワーク５８０において入力層５００から中間層５０２までの処理を撮像装置１００が担当し、中間層５０２より後段の出力層５０９までの処理をサーバ１８０が担当する。 In the present embodiment, the image pickup apparatus 100 is in charge of the processing from the input layer 500 to the intermediate layer 502 in the multilayer neural network 580, and the server 180 is in charge of the processing from the intermediate layer 502 to the output layer 509 in the subsequent stage.

図６は、撮像装置１００における中間層データの圧縮処理の概要を模式的に示す。制御部１１０は、複数の動画構成画像の間の動きを検出する。例えば、制御部１１０は、ブロックマッチング等の手法により動きを検出してよい。図６は、動画構成画像４００を参照画像として、動画構成画像４０１に関する動きとして動き情報Ｍ０が検出された状態を示す。 FIG. 6 schematically shows an outline of the intermediate layer data compression process in the image pickup apparatus 100. The control unit 110 detects the movement between the plurality of moving image constituent images. For example, the control unit 110 may detect the movement by a method such as block matching. FIG. 6 shows a state in which motion information M0 is detected as a motion related to the moving image constituent image 401 with the moving image constituent image 400 as a reference image.
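Block matching, mentioned above as one possible motion detection method, can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The cost function, the search range, and the sign convention of the returned vector (displacement from the current block to its match in the reference image) are assumptions for illustration; the description does not fix any of them.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between a block in `ref` and one in `cur`."""
    return sum(abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
               for i in range(size) for j in range(size))


def block_match(ref, cur, cy, cx, size, search):
    """Find the displacement (dy, dx) into `ref` that best matches the
    size x size block of `cur` whose top-left corner is (cy, cx)."""
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(ref, cur, cy, cx, cy, cx, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if ry < 0 or rx < 0 or ry + size > h or rx + size > w:
                continue  # candidate block would leave the reference image
            cost = sad(ref, cur, ry, rx, cy, cx, size)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best


def frame_with_square(top, left, size=6):
    """Build a dark test frame containing one bright 2 x 2 square."""
    img = [[0] * size for _ in range(size)]
    for i in range(2):
        for j in range(2):
            img[top + i][left + j] = 9
    return img


ref = frame_with_square(1, 1)   # plays the role of reference image 400
cur = frame_with_square(2, 3)   # plays the role of image 401
mv = block_match(ref, cur, cy=2, cx=3, size=2, search=2)
print(mv)
```

Here the square moved one row down and two columns right between the frames, so the best match for the current block lies one row up and two columns left in the reference image.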

中間層データ６０１は、動画構成画像４００を入力層５００に入力することによって得られる中間層５０１の状態を示し、中間層データ６０２は、動画構成画像４００を入力層５００に入力することによって得られる中間層５０２の状態を示す。中間層データ６１１は、動画構成画像４０１を入力層５００に入力することによって得られる中間層５０１の状態を示し、中間層データ６１２は、動画構成画像４０１を入力層５００に入力することによって得られる中間層５０２の状態を示す。 The intermediate layer data 601 shows the state of the intermediate layer 501 obtained by inputting the moving image constituent image 400 into the input layer 500, and the intermediate layer data 602 is obtained by inputting the moving image constituent image 400 into the input layer 500. The state of the intermediate layer 502 is shown. The intermediate layer data 611 shows the state of the intermediate layer 501 obtained by inputting the moving image constituent image 401 to the input layer 500, and the intermediate layer data 612 is obtained by inputting the moving image constituent image 401 to the input layer 500. The state of the intermediate layer 502 is shown.

まず、制御部１１０は、中間層データ６０２に対し量子化６３０及び符号化６４０を行う。通信部１７０は、中間層データ６０２の量子化６３０及び符号化６４０によって得られたデータを、サーバ１８０に送信する。 First, the control unit 110 performs quantization 630 and coding 640 on the intermediate layer data 602. The communication unit 170 transmits the data obtained by the quantization 630 and the coding 640 of the intermediate layer data 602 to the server 180.

Next, the control unit 110 acquires motion information M1 based on the motion information M0 obtained from the moving image constituent images 400 and 401. The control unit 110 acquires the motion information M1 by scaling the motion information M0 based on the size of the moving image constituent images 400 and 401 and the size of the intermediate layer 502. When the moving image constituent images 400 and 401 are the same size as the intermediate layer 502, the motion information M0 may be used as the motion information M1 without change. When the intermediate layer 502 is smaller than the moving image constituent images 400 and 401 because of convolution or pooling performed before the intermediate layer 502 in the multi-layer neural network 580, the control unit 110 generates the motion information M1 by scaling the motion information M0 according to the reduction ratio.
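The scaling from M0 to M1 can be sketched as multiplying each vector component by the ratio of the layer size to the image size. Rounding to integer feature-map positions is an assumption here, as is the function name `scale_motion`; the description says only that M0 is scaled according to the reduction ratio.

```python
def scale_motion(mv, image_size, layer_size):
    """Scale an image-domain motion vector (dy, dx) to the feature-map grid.

    `image_size` and `layer_size` are (height, width) tuples. Rounding to
    the nearest integer position is an illustrative choice.
    """
    sy = layer_size[0] / image_size[0]
    sx = layer_size[1] / image_size[1]
    return (round(mv[0] * sy), round(mv[1] * sx))


# A 256 x 256 image reduced to a 64 x 64 feature map: vectors shrink by 4.
m0 = (8, -12)
m1 = scale_motion(m0, image_size=(256, 256), layer_size=(64, 64))
# Same size on both sides: M0 is used as M1 unchanged.
same = scale_motion((5, 7), image_size=(64, 64), layer_size=(64, 64))
print(m1, same)
```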

The control unit 110 uses the motion information M1 to generate prediction data for the intermediate layer data 612 from the intermediate layer data 602, and generates residual information 620 between the intermediate layer data 612 and the prediction data. The control unit 110 applies quantization 650 to the residual information 620 and applies coding 660 to the quantized result. For example, the control unit 110 encodes the quantized residual information by entropy coding. The communication unit 170 transmits the entropy-coded information to the server 180.
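The prediction and residual steps can be sketched for a single-channel feature map with one global motion vector. Real intermediate layer data has many channels and the motion may vary per block, so this is a deliberately simplified illustration; the border clamping and the names `predict` and `residual` are assumptions.

```python
def predict(ref_map, mv):
    """Motion-compensated prediction: sample `ref_map` at positions shifted
    by mv = (dy, dx), clamping coordinates at the borders."""
    h, w = len(ref_map), len(ref_map[0])
    dy, dx = mv
    return [[ref_map[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]


def residual(cur_map, pred):
    """Element-wise difference between the current map and its prediction."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(cur_map, pred)]


# ref_map plays the role of intermediate layer data 602, cur_map that of 612.
ref_map = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
cur_map = [[1, 1, 2], [4, 4, 5], [7, 7, 8]]   # content shifted right by one

with_motion = residual(cur_map, predict(ref_map, (0, -1)))
without_motion = residual(cur_map, predict(ref_map, (0, 0)))
print(with_motion)
print(without_motion)
```

With the correct motion vector the residual is all zeros, whereas a plain frame difference leaves nonzero values; this is the effect that makes the subsequent quantization and entropy coding efficient.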

FIG. 7 schematically shows the block configuration of the control unit 110. The first CNN unit 710 corresponds to the part of the multi-layer neural network 580 from the input layer 500 to the intermediate layer 502. The moving image constituent images of the input moving image are processed by the first CNN unit 710, and the result is stored in the intermediate data buffer 720.

The motion detection unit 730 detects motion information between a plurality of moving image constituent images. The motion mapping unit 750 acquires the motion information in the intermediate layer 502 by scaling the motion information detected by the motion detection unit 730 to the size of the intermediate layer 502. The intermediate data prediction unit 740 generates prediction data of the intermediate layer data based on the motion information in the intermediate layer 502 acquired by the motion mapping unit 750 and the intermediate layer data stored in the intermediate data buffer 720, and generates residual information between the intermediate layer data and the prediction data. The quantization / entropy coding unit 760 quantizes the residual information generated by the intermediate data prediction unit 740 and applies entropy coding to the quantized residual information. The communication unit 170 generates transmission data 780, which includes the coded residual information obtained by the quantization / entropy coding unit 760 and the motion information detected from the moving image constituent images by the motion detection unit 730, and transmits it to the server 180.

静止被写体が写っている領域や大域的な動きがある領域については、中間層データの残差情報は小さな値になることが期待できる。そのため、量子化処理によって残差情報の値が０になる場合が多いことが期待できる。そのため、ランレングス符号化等の符号化処理を行うことにより、中間層データの圧縮率を高めることが期待できる。なお、符号化として算術符号化を適用してもよい。量子化後の０及び１の値の発生確率には偏りが生じることが期待できるため、算術符号化を利用することによっても圧縮率を高めることが期待できる。 It can be expected that the residual information of the intermediate layer data will be a small value in the area where the still subject is shown or the area where there is global movement. Therefore, it can be expected that the value of the residual information becomes 0 in many cases due to the quantization process. Therefore, it can be expected that the compression rate of the intermediate layer data is increased by performing a coding process such as run-length coding. Arithmetic coding may be applied as the coding. Since it is expected that the probability of occurrence of 0 and 1 values after quantization will be biased, it can be expected that the compression rate can be increased by using arithmetic coding.
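Run-length coding of a quantized residual can be sketched as follows. Flattening the residual into a one-dimensional sequence and the (value, run length) pair format are assumptions; the description names run-length coding only as one example of a coding process.

```python
def run_length_encode(values):
    """Encode a flat sequence as (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the flat sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


# A quantized residual dominated by zeros compresses into a few runs.
quantized = [0, 0, 0, 0, 1, 0, 0, -1, 0, 0, 0]
runs = run_length_encode(quantized)
print(runs)
print(run_length_decode(runs) == quantized)
```

The more zeros quantization produces, the longer the runs and the fewer pairs need to be transmitted.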

FIG. 8 schematically shows the block configuration of the server 180. The second CNN unit 810 corresponds to the part of the multi-layer neural network 580 from the intermediate layer 502 to the output layer 509. The entropy decoding / quantization unit 830 generates the residual information of the intermediate layer data by performing entropy decoding and inverse quantization on the intermediate layer data received from the image pickup apparatus 100. The motion mapping unit 850 maps the motion information received from the image pickup apparatus 100 to the intermediate layer 502 to acquire the motion information in the intermediate layer 502. The intermediate data reconstruction unit 840 generates prediction data of the intermediate layer data based on the intermediate layer data stored in the intermediate data buffer 820 and the motion information acquired by the motion mapping unit 850. The intermediate data reconstruction unit 840 reconstructs the state of the intermediate layer 502 by adding the residual (difference) information of the intermediate layer data to the prediction data. The second CNN unit 810 performs the processing following the intermediate layer 502 and outputs the state of the output layer 509 as the processing result.

図９は、撮像装置１００が中間層データを生成する処理の流れを示すフローチャートを示す。Ｓ９１０において、制御部１１０は、対象の動画構成画像を第１ＣＮＮ部７１０に入力する。Ｓ９２０において、中間層５０２の中間層データを生成する。Ｓ９３０において、動き検出部７３０は、対象の動画構成画像と参照する動画構成画像との間の動き情報を検出する。Ｓ９４０において、Ｓ９３０で検出した動き情報を、中間層５０２の動き情報にスケーリングする。Ｓ９５０において、中間データ予測部７４０は、Ｓ９４０で取得した動き情報を用いて中間層５０２における予測データを生成し、中間層５０２における中間層データと予測データとの残差情報を生成する。 FIG. 9 shows a flowchart showing a flow of processing in which the image pickup apparatus 100 generates intermediate layer data. In S910, the control unit 110 inputs the target moving image configuration image to the first CNN unit 710. In S920, the intermediate layer data of the intermediate layer 502 is generated. In S930, the motion detection unit 730 detects motion information between the target moving image constituent image and the referenced moving image constituent image. In S940, the motion information detected in S930 is scaled to the motion information of the intermediate layer 502. In S950, the intermediate data prediction unit 740 generates prediction data in the intermediate layer 502 using the motion information acquired in S940, and generates residual information between the intermediate layer data and the prediction data in the intermediate layer 502.

Ｓ９６０において、量子化・エントロピー符号化部７６０は、Ｓ９５０で生成した残差情報に対し量子化及びエントロピー符号化を行うことによって、送信用の中間層データを生成する。Ｓ９７０において、通信部１７０は、Ｓ９６０で生成した中間層データと、Ｓ９３０で検出した動き情報とを含む送信データをサーバ１８０へ送信する。 In S960, the quantization / entropy coding unit 760 generates intermediate layer data for transmission by performing quantization and entropy coding on the residual information generated in S950. In S970, the communication unit 170 transmits the transmission data including the intermediate layer data generated in S960 and the motion information detected in S930 to the server 180.

図１０は、サーバ１８０が実行する処理の流れを示すフローチャートを示す。Ｓ１０１０において、撮像装置１００から送信された中間層データを取得する。なお、本フローチャートでは、中間層データの残差情報を取得した場合を説明する。 FIG. 10 shows a flowchart showing the flow of processing executed by the server 180. In S1010, the intermediate layer data transmitted from the image pickup apparatus 100 is acquired. In this flowchart, the case where the residual information of the intermediate layer data is acquired will be described.

Ｓ１０２０において、中間層データに対しエントロピー復号化及び逆量子化を行う。Ｓ１０３０において、サーバ１８０は、動画構成画像間の動き情報を取得する。Ｓ１０４０において、Ｓ１０３０で取得した動き情報を、中間層５０２の動き情報にスケーリングする。Ｓ１０５０において、Ｓ１０４０で取得した動き情報を用いて中間層５０２における予測データを生成し、Ｓ１０１０で取得した中間層データを予測データに加算することにより、中間層５０２の中間層データを再構築する。 In S1020, entropy decoding and dequantization are performed on the intermediate layer data. In S1030, the server 180 acquires motion information between the moving image constituent images. In S1040, the motion information acquired in S1030 is scaled to the motion information of the intermediate layer 502. In S1050, the prediction data in the intermediate layer 502 is generated using the motion information acquired in S1040, and the intermediate layer data acquired in S1010 is added to the prediction data to reconstruct the intermediate layer data in the intermediate layer 502.
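The reconstruction in S1050 can be sketched for a single-channel feature map with one global motion vector, taking as input quantization indices that have already been entropy decoded. The uniform quantization step `step`, the border clamping, and the function name `reconstruct` are assumptions for illustration only.

```python
def reconstruct(ref_map, mv, q_residual, step):
    """Rebuild the current intermediate layer data on the receiving side.

    ref_map:    previously reconstructed intermediate layer data (buffered)
    mv:         motion vector (dy, dx) already scaled to the layer size
    q_residual: quantization indices of the residual (after entropy decoding)
    step:       hypothetical uniform quantization step used by the sender
    """
    h, w = len(ref_map), len(ref_map[0])
    dy, dx = mv
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ry = min(max(y + dy, 0), h - 1)   # clamp at the borders
            rx = min(max(x + dx, 0), w - 1)
            prediction = ref_map[ry][rx]
            row.append(prediction + q_residual[y][x] * step)
        out.append(row)
    return out


ref_map = [[1.0, 2.0], [3.0, 4.0]]
# No motion, small residual corrections.
a = reconstruct(ref_map, (0, 0), [[0, 1], [-1, 0]], step=0.5)
# Pure shift with an all-zero residual.
b = reconstruct(ref_map, (0, -1), [[0, 0], [0, 0]], step=0.5)
print(a)
print(b)
```

Because the sender predicts from the same reference data and motion vector, adding the dequantized residual to the prediction recovers the intermediate layer data up to the quantization error.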

Ｓ１０６０において、第２ＣＮＮ部８１０による処理を行う。Ｓ１０７０において、第２ＣＮＮ部８１０の出力層５０９の状態を取得して、多層ニューラルネットワーク５８０の処理結果として撮像装置１００に送信する。 In S1060, the process by the second CNN unit 810 is performed. In S1070, the state of the output layer 509 of the second CNN unit 810 is acquired and transmitted to the image pickup apparatus 100 as the processing result of the multilayer neural network 580.

In the embodiment described above, the communication unit 170 transmits the motion information detected from the moving image constituent images to the server 180. As another example of motion information, the communication unit 170 may instead transmit the moving image constituent images themselves to the server 180. The server 180 may then acquire the motion information between the moving image constituent images by performing motion detection on the moving image constituent images received from the image pickup apparatus 100, using the same algorithm as the motion detection algorithm of the control unit 110. In this case, the communication unit 170 does not have to transmit the motion information detected from the moving image constituent images to the server 180.

The embodiment described above covers the case in which the communication unit 170 transmits the residual information of the intermediate layer data of the intermediate layer 502 to the server 180. However, the communication unit 170 may also transmit the residual information of the intermediate layer data of the intermediate layer 501. When the processing following the intermediate layer 502 in the multi-layer neural network 580 requires information from an intermediate layer preceding the intermediate layer 502 or from the input layer, the residual information of every layer required by the later-stage processing may be transmitted to the server 180.

In the embodiment described above, the image pickup apparatus 100 processes the layers from the input layer 500 to the intermediate layer 502 of the multi-layer neural network 580. The layers processed by the image pickup apparatus 100 may be selected according to the communication capacity available to the image pickup apparatus 100, the remaining capacity of its power supply, its processing capability, and its current load. For example, when the remaining capacity of the power supply of the image pickup apparatus 100 is smaller than a predetermined value, the image pickup apparatus 100 may handle only the layers up to the intermediate layer 501. In this case, the communication unit 170 may transmit information identifying the selected intermediate layer to the server 180.

In the embodiment described above, the motion detection unit 730 acquires the motion information by performing motion detection on the moving image constituent images. The motion detection unit 730 may instead acquire motion information calculated for another purpose, such as noise removal or image compression. The motion detection unit 730 may also use information obtained from a gyro sensor as the motion information. When motion vectors or the like calculated for image compression are used as the motion information, the prediction data of the intermediate layer data, and from it the residual information, may be generated using not only past but also future moving image constituent images, in accordance with the GOP structure of the image compression information.

以上に説明した実施形態によれば、撮像装置１００及びサーバ１８０が一つの多層ニューラルネットワーク５８０の処理を分担することができる。上述したように、制御部１１０は、動画構成画像間の動き情報を用いて中間層データの動き補償を行う。そのため、撮像装置１００からサーバ１８０に送信される中間層データのデータ量を削減することができる。これにより、ネットワーク負荷を軽減することができる。また、撮像装置１００とサーバ１８０とで多層ニューラルネットワーク５８０の処理を分担することができる。これにより、例えば通信帯域やサーバ１８０における過負荷が処理速度のボトルネックとなってしまう可能性を低減することができる。そのため、撮像装置１００は、多層ニューラルネットワーク５８０の処理結果を速やかに取得できることが期待できる。 According to the embodiment described above, the image pickup apparatus 100 and the server 180 can share the processing of one multilayer neural network 580. As described above, the control unit 110 compensates for the motion of the intermediate layer data by using the motion information between the moving image constituent images. Therefore, the amount of intermediate layer data transmitted from the image pickup apparatus 100 to the server 180 can be reduced. As a result, the network load can be reduced. Further, the image pickup apparatus 100 and the server 180 can share the processing of the multilayer neural network 580. This makes it possible to reduce the possibility that, for example, the communication band or the overload on the server 180 becomes a bottleneck in the processing speed. Therefore, it can be expected that the image pickup apparatus 100 can quickly acquire the processing result of the multilayer neural network 580.

撮像装置１００の一部又は全ての機能は、携帯電話等の移動端末に組み込まれてよい。撮像装置１００は、監視カメラであってよい。撮像装置１００はビデオカメラ、等であってよい。撮像装置１００の一部又は全ての機能は、動画を撮像することができる任意の装置に組み込まれてよい。 Some or all the functions of the image pickup apparatus 100 may be incorporated into a mobile terminal such as a mobile phone. The image pickup apparatus 100 may be a surveillance camera. The image pickup device 100 may be a video camera or the like. Some or all of the functions of the imaging device 100 may be incorporated into any device capable of capturing moving images.

The image pickup apparatus 100 described above may be mounted on a moving body. The image pickup apparatus 100 may be mounted on an unmanned aerial vehicle (UAV) as shown in FIG. 11. The UAV 10 may include a UAV main body 20, a gimbal 50, a plurality of image pickup devices 60, and the image pickup apparatus 100. The gimbal 50 and the image pickup apparatus 100 are an example of an imaging system. The UAV 10 is an example of a moving body propelled by a propulsion unit. The term moving body covers, in addition to UAVs, flying objects such as other aircraft that move through the air, vehicles that move on the ground, ships that move on water, and the like.

ＵＡＶ本体２０は、複数の回転翼を備える。複数の回転翼は、推進部の一例である。ＵＡＶ本体２０は、複数の回転翼の回転を制御することでＵＡＶ１０を飛行させる。ＵＡＶ本体２０は、例えば、４つの回転翼を用いてＵＡＶ１０を飛行させる。回転翼の数は、４つには限定されない。また、ＵＡＶ１０は、回転翼を有さない固定翼機でもよい。 The UAV main body 20 includes a plurality of rotor blades. The plurality of rotor blades are an example of a propulsion unit. The UAV main body 20 flies the UAV 10 by controlling the rotation of a plurality of rotor blades. The UAV body 20 flies the UAV 10 using, for example, four rotor blades. The number of rotor blades is not limited to four. Further, the UAV 10 may be a fixed-wing aircraft having no rotor blades.

撮像装置１００は、所望の撮像範囲に含まれる被写体を撮像する撮像用のカメラである。ジンバル５０は、撮像装置１００を回転可能に支持する。ジンバル５０は、支持機構の一例である。例えば、ジンバル５０は、撮像装置１００を、アクチュエータを用いてピッチ軸で回転可能に支持する。ジンバル５０は、撮像装置１００を、アクチュエータを用いて更にロール軸及びヨー軸のそれぞれを中心に回転可能に支持する。ジンバル５０は、ヨー軸、ピッチ軸、及びロール軸の少なくとも１つを中心に撮像装置１００を回転させることで、撮像装置１００の姿勢を変更してよい。 The imaging device 100 is an imaging camera that captures a subject included in a desired imaging range. The gimbal 50 rotatably supports the imaging device 100. The gimbal 50 is an example of a support mechanism. For example, the gimbal 50 rotatably supports the image pickup device 100 on a pitch axis using an actuator. The gimbal 50 further rotatably supports the image pickup device 100 around each of the roll axis and the yaw axis by using an actuator. The gimbal 50 may change the posture of the image pickup device 100 by rotating the image pickup device 100 around at least one of the yaw axis, the pitch axis, and the roll axis.

The plurality of image pickup devices 60 are sensing cameras that image the surroundings of the UAV 10 in order to control the flight of the UAV 10. Two image pickup devices 60 may be provided on the front, that is, the nose, of the UAV 10. Two further image pickup devices 60 may be provided on the bottom surface of the UAV 10. The two image pickup devices 60 on the front side may form a pair and function as a so-called stereo camera. The two image pickup devices 60 on the bottom side may also form a pair and function as a stereo camera. Three-dimensional spatial data of the surroundings of the UAV 10 may be generated based on the images captured by the plurality of image pickup devices 60. The number of image pickup devices 60 included in the UAV 10 is not limited to four. The UAV 10 need only include at least one image pickup device 60. The UAV 10 may include at least one image pickup device 60 on each of the nose, tail, side surfaces, bottom surface, and top surface of the UAV 10. The angle of view that can be set for the image pickup devices 60 may be wider than the angle of view that can be set for the image pickup apparatus 100. The image pickup devices 60 may have a single focus lens or a fisheye lens.

遠隔操作装置３００は、ＵＡＶ１０と通信して、ＵＡＶ１０を遠隔操作する。遠隔操作装置３００は、ＵＡＶ１０と無線で通信してよい。遠隔操作装置３００は、ＵＡＶ１０に上昇、下降、加速、減速、前進、後進、回転などのＵＡＶ１０の移動に関する各種命令を示す指示情報を送信する。指示情報は、例えば、ＵＡＶ１０の高度を上昇させる指示情報を含む。指示情報は、ＵＡＶ１０が位置すべき高度を示してよい。ＵＡＶ１０は、遠隔操作装置３００から受信した指示情報により示される高度に位置するように移動する。指示情報は、ＵＡＶ１０を上昇させる上昇命令を含んでよい。ＵＡＶ１０は、上昇命令を受け付けている間、上昇する。ＵＡＶ１０は、上昇命令を受け付けても、ＵＡＶ１０の高度が上限高度に達している場合には、上昇を制限してよい。 The remote control device 300 communicates with the UAV 10 to remotely control the UAV 10. The remote control device 300 may wirelessly communicate with the UAV 10. The remote control device 300 transmits to the UAV 10 instruction information indicating various commands related to the movement of the UAV 10, such as ascending, descending, accelerating, decelerating, advancing, reversing, and rotating. The instruction information includes, for example, instruction information for raising the altitude of the UAV 10. The instruction information may indicate the altitude at which the UAV 10 should be located. The UAV 10 moves so as to be located at an altitude indicated by the instruction information received from the remote control device 300. The instruction information may include an ascending instruction to ascend the UAV 10. The UAV10 rises while accepting the rise order. Even if the UAV10 accepts the ascending command, the ascending may be restricted if the altitude of the UAV10 has reached the upper limit altitude.

図１２は、本発明の複数の態様が全体的または部分的に具現化されてよいコンピュータ１２００の一例を示す。コンピュータ１２００にインストールされたプログラムは、コンピュータ１２００に、本発明の実施形態に係る装置に関連付けられるオペレーションまたは当該装置の１または複数の「部」として機能させることができる。例えば、コンピュータ１２００にインストールされたプログラムは、コンピュータ１２００に、制御部１１０として機能させることができる。または、当該プログラムは、コンピュータ１２００に当該オペレーションまたは当該１または複数の「部」の機能を実行させることができる。当該プログラムは、コンピュータ１２００に、本発明の実施形態に係るプロセスまたは当該プロセスの段階を実行させることができる。そのようなプログラムは、コンピュータ１２００に、本明細書に記載のフローチャート及びブロック図のブロックのうちのいくつかまたはすべてに関連付けられた特定のオペレーションを実行させるべく、ＣＰＵ１２１２によって実行されてよい。 FIG. 12 shows an example of a computer 1200 in which a plurality of aspects of the present invention may be embodied in whole or in part. The program installed on the computer 1200 can cause the computer 1200 to function as an operation associated with the device according to an embodiment of the present invention or as one or more "parts" of the device. For example, a program installed on a computer 1200 can cause the computer 1200 to function as a control unit 110. Alternatively, the program may cause the computer 1200 to perform the operation or the function of the one or more "parts". The program can cause a computer 1200 to perform a process or a step of the process according to an embodiment of the present invention. Such a program may be run by the CPU 1212 to cause the computer 1200 to perform certain operations associated with some or all of the blocks in the flowcharts and block diagrams described herein.

The computer 1200 according to this embodiment includes a CPU 1212 and a RAM 1214, which are interconnected by a host controller 1210. The computer 1200 also includes a communication interface 1222 and an input/output unit, which are connected to the host controller 1210 via an input/output controller 1220. The computer 1200 also includes a ROM 1230. The CPU 1212 operates according to programs stored in the ROM 1230 and the RAM 1214, thereby controlling each unit.

The communication interface 1222 communicates with other electronic devices via a network. A hard disk drive may store programs and data used by the CPU 1212 in the computer 1200. The ROM 1230 stores a boot program or the like executed by the computer 1200 at activation, and/or programs that depend on the hardware of the computer 1200. Programs are provided via a computer-readable recording medium such as a CD-ROM, a USB memory, or an IC card, or via a network. The programs are installed in the RAM 1214 or the ROM 1230, which are also examples of computer-readable recording media, and are executed by the CPU 1212. The information processing described in these programs is read by the computer 1200 and brings about cooperation between the programs and the various types of hardware resources described above. A device or method may be constituted by realizing operations on or processing of information through use of the computer 1200.

For example, when communication is performed between the computer 1200 and an external device, the CPU 1212 may execute a communication program loaded into the RAM 1214 and, based on the processing described in the communication program, instruct the communication interface 1222 to perform communication processing. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 1214 or a USB memory and transmits the read transmission data to the network, or writes reception data received from the network into a reception buffer area or the like provided on the recording medium.

In addition, the CPU 1212 may cause all or a necessary portion of a file or database stored in an external recording medium such as a USB memory to be read into the RAM 1214, and may perform various types of processing on the data in the RAM 1214. The CPU 1212 may then write the processed data back to the external recording medium.

Various types of information, such as various types of programs, data, tables, and databases, may be stored in a recording medium and subjected to information processing. On data read from the RAM 1214, the CPU 1212 may perform various types of processing described throughout this disclosure and specified by the instruction sequence of a program, including various types of operations, information processing, conditional judgment, conditional branching, unconditional branching, and information search/replacement, and writes the results back to the RAM 1214. The CPU 1212 may also search for information in files, databases, and the like in the recording medium. For example, when a plurality of entries, each having an attribute value of a first attribute associated with an attribute value of a second attribute, are stored in the recording medium, the CPU 1212 may search the plurality of entries for an entry whose attribute value of the first attribute matches a specified condition, read the attribute value of the second attribute stored in that entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.
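The entry search described in the preceding paragraph can be sketched as follows. This is an illustration under assumptions: the function name `lookup_second_attribute`, the representation of an entry as a (first attribute, second attribute) pair, and the sample entries are all hypothetical and do not appear in this specification.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

A = TypeVar("A")  # type of the first attribute value
B = TypeVar("B")  # type of the second attribute value


def lookup_second_attribute(
    entries: Iterable[Tuple[A, B]],
    condition: Callable[[A], bool],
) -> Optional[B]:
    """Return the second attribute of the first entry whose first
    attribute satisfies the condition, or None if no entry matches."""
    for first, second in entries:
        if condition(first):
            return second
    return None


# Hypothetical entries: (first attribute value, second attribute value).
entries = [("a", 1), ("b", 2), ("c", 3)]
found = lookup_second_attribute(entries, lambda first: first == "b")
missing = lookup_second_attribute(entries, lambda first: first == "z")
```

A linear scan is used here for clarity; when the condition is an exact match on the first attribute, a dictionary keyed by that attribute would serve the same purpose.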

The programs or software modules described above may be stored in a computer-readable storage medium on or near the computer 1200. In addition, a recording medium such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet can be used as the computer-readable storage medium, whereby the programs are provided to the computer 1200 via the network.

Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in the above embodiments. It is apparent to those skilled in the art that various changes or improvements can be made to the above embodiments. It is apparent from the claims that forms incorporating such changes or improvements can also fall within the technical scope of the present invention.

It should be noted that the operations, procedures, steps, stages, and other processing in the devices, systems, programs, and methods shown in the claims, the specification, and the drawings may be executed in any order, unless the order is explicitly indicated by expressions such as "before" or "prior to", or unless the output of an earlier process is used in a later process. Even where operation flows in the claims, the specification, and the drawings are described using "first", "next", and the like for convenience, this does not mean that they must be carried out in that order.

10 UAV
20 UAV main body
50 Gimbal
60 Imaging device
100 Imaging device
102 Imaging unit
110 Control unit
120 Image sensor
130 Memory
160 Display unit
162 Instruction unit
170 Communication unit
180 Server
190 Communication network
200 Lens unit
210 Lens
212 Lens drive unit
220 Lens control unit
222 Memory
300 Remote control device
310 System
400, 401, 402, 403 Moving image constituent image
500 Input layer
501, 502, 503, 508 Intermediate layer
509 Output layer
580 Multilayer neural network
601 Intermediate layer data
602 Intermediate layer data
611 Intermediate layer data
612 Intermediate layer data
620 Residual information
630 Quantization
640 Encoding
650 Quantization
660 Encoding
710 First CNN unit
720 Intermediate data buffer
730 Motion detection unit
740 Intermediate data prediction unit
750 Mapping unit
760 Quantization/entropy encoding unit
780 Transmission data
810 Second CNN unit
820 Intermediate data buffer
830 Entropy decoding/quantization unit
840 Intermediate data reconstruction unit
850 Motion mapping unit
1200 Computer
1210 Host controller
1212 CPU
1214 RAM
1220 Input/output controller
1222 Communication interface
1230 ROM

Claims

A device comprising a circuit configured to:
generate first intermediate layer data by performing, on image data of a first moving image constituent image constituting a moving image, processing up to a specific intermediate layer in a multilayer neural network;
generate second intermediate layer data by performing, on image data of a second moving image constituent image constituting the moving image, the processing up to the specific intermediate layer in the multilayer neural network;
generate motion data between the first intermediate layer data and the second intermediate layer data based on motion information between the first moving image constituent image and the second moving image constituent image;
generate residual data between the first intermediate layer data and the second intermediate layer data by performing motion compensation between the first intermediate layer data and the second intermediate layer data based on the motion data;
generate, based on the residual data, intermediate layer data to be transmitted to another device that performs processing subsequent to the specific intermediate layer in the multilayer neural network; and
transmit the intermediate layer data and the motion data to the other device.

The device according to claim 1, wherein the circuit is configured to:
generate quantized data by quantizing the residual data; and
generate the intermediate layer data based on the quantized data.

The device according to claim 2, wherein the circuit is configured to generate the intermediate layer data by entropy coding the quantized data.

The device according to claim 1 or 2, wherein the multilayer neural network is a convolutional neural network and includes one or more intermediate layers that perform a convolution operation on the first moving image constituent image and the second moving image constituent image.

The device according to claim 4, wherein the circuit is configured to generate the motion data by scaling the motion information based on the sizes of the first moving image constituent image and the second moving image constituent image and the sizes of the first intermediate layer data and the second intermediate layer data.

The device according to claim 1 or 2, wherein the multilayer neural network is a neural network for performing super-resolution processing of the moving image or image recognition processing on the moving image.

The device according to claim 1 or 2, wherein
neural network parameters of the multilayer neural network, from an input layer preceding the specific intermediate layer to an output layer following the specific intermediate layer, are learned using training data, and
the circuit is configured to store the neural network parameters from the input layer to the specific intermediate layer.

The device according to claim 1 or 2, wherein
the multilayer neural network has a plurality of intermediate layers, and
the circuit is configured to select the specific intermediate layer from among the plurality of intermediate layers based on at least one of a communication line capacity of the device, a load state of the device, and a power state of the device.

A device that communicates with another device, the other device performing, on a first moving image constituent image and a second moving image constituent image constituting a moving image, processing up to a specific intermediate layer in a multilayer neural network, and that performs processing subsequent to the specific intermediate layer in the multilayer neural network, the device comprising a circuit configured to:
receive, from the other device,
(i) first intermediate layer data generated by performing the processing up to the specific intermediate layer in the multilayer neural network on the first moving image constituent image,
(ii) motion data between the first intermediate layer data and second intermediate layer data generated by performing the processing up to the specific intermediate layer in the multilayer neural network on the second moving image constituent image, and
(iii) residual data between the first intermediate layer data and the second intermediate layer data, the residual data being generated by performing motion compensation based on the motion data;
generate the second intermediate layer data based on the motion data and the residual data; and
perform, on the first intermediate layer data and the second intermediate layer data, the processing subsequent to the specific intermediate layer in the multilayer neural network.

A system comprising:
the device according to claim 1; and
the device according to claim 9.

An imaging device comprising:
the device according to claim 1 or 2; and
an image sensor that generates the moving image.

A moving body that moves with the imaging device according to claim 11.

The moving body according to claim 12, wherein the moving body is an unmanned aerial vehicle.

A program for operating a computer as the device according to claim 1 or 2.

A method comprising:
a step of generating first intermediate layer data by performing, on image data of a first moving image constituent image constituting a moving image, processing up to a specific intermediate layer in a multilayer neural network;
a step of generating second intermediate layer data by performing, on image data of a second moving image constituent image constituting the moving image, the processing up to the specific intermediate layer in the multilayer neural network;
a step of generating motion data between the first intermediate layer data and the second intermediate layer data based on motion information between the first moving image constituent image and the second moving image constituent image;
a step of generating residual data between the first intermediate layer data and the second intermediate layer data by performing motion compensation between the first intermediate layer data and the second intermediate layer data based on the motion data;
a step of generating, based on the residual data, intermediate layer data to be transmitted to another device that performs processing subsequent to the specific intermediate layer in the multilayer neural network; and
a step of transmitting the intermediate layer data and the motion data to the other device.

A method of communicating with another device, the other device performing, on a first moving image constituent image and a second moving image constituent image constituting a moving image, processing up to a specific intermediate layer in a multilayer neural network, and of performing processing subsequent to the specific intermediate layer in the multilayer neural network, the method comprising:
a step of receiving, from the other device,
(i) first intermediate layer data generated by performing the processing up to the specific intermediate layer in the multilayer neural network on the first moving image constituent image,
(ii) motion data between the first intermediate layer data and second intermediate layer data generated by performing the processing up to the specific intermediate layer in the multilayer neural network on the second moving image constituent image, and
(iii) residual data between the first intermediate layer data and the second intermediate layer data, the residual data being generated by performing motion compensation based on the motion data;
a step of generating the second intermediate layer data based on the motion data and the residual data; and
a step of performing, on the first intermediate layer data and the second intermediate layer data, the processing subsequent to the specific intermediate layer in the multilayer neural network.
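
The transmitting-side and receiving-side processing recited in the claims can be sketched, under strong simplifying assumptions, as follows. All function names are hypothetical. The sketch assumes a single feature channel, one global motion vector per frame rather than block-wise motion, the same scaling ratio on both axes, border clamping, and uniform scalar quantization. Entropy coding is omitted. None of these choices is stated in this specification.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]  # one channel of intermediate layer data


def scale_motion(mv_px: Tuple[int, int], image_w: int, feature_w: int) -> Tuple[int, int]:
    """Scale a pixel-domain motion vector to the feature-map resolution."""
    ratio = feature_w / image_w
    return (round(mv_px[0] * ratio), round(mv_px[1] * ratio))


def motion_compensate(ref: FeatureMap, mv: Tuple[int, int]) -> FeatureMap:
    """Predict the current map by shifting the reference map by mv = (dx, dy),
    clamping coordinates at the borders."""
    h, w = len(ref), len(ref[0])
    dx, dy = mv
    return [[ref[min(max(y - dy, 0), h - 1)][min(max(x - dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]


def encode(cur: FeatureMap, ref: FeatureMap, mv: Tuple[int, int], step: float) -> List[List[int]]:
    """Transmitting side: quantize the motion-compensated residual."""
    pred = motion_compensate(ref, mv)
    return [[round((c - p) / step) for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur, pred)]


def decode(q: List[List[int]], ref: FeatureMap, mv: Tuple[int, int], step: float) -> FeatureMap:
    """Receiving side: rebuild the second map from the first map, the motion
    data, and the dequantized residual."""
    pred = motion_compensate(ref, mv)
    return [[p + v * step for v, p in zip(q_row, pred_row)]
            for q_row, pred_row in zip(q, pred)]


# Made-up data: the second map is roughly the first map shifted right by one.
first = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]
second = [[0.2, 0.0, 1.1, 1.9], [4.0, 4.3, 5.0, 6.6]]
mv = scale_motion((4, 0), image_w=16, feature_w=4)
quantized = encode(second, first, mv, step=0.5)
reconstructed = decode(quantized, first, mv, step=0.5)
```

With uniform quantization, each reconstructed value differs from the original second map by at most half the quantization step, so the step size trades transmitted data volume against accuracy of the later-stage processing.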