JP2021532498A

JP2021532498A - Video memory processing methods, devices and recording media based on convolutional neural networks

Info

Publication number: JP2021532498A
Application number: JP2021506309A
Authority: JP
Inventors: モンチャン; イーチュンタン; ポンカオ; チアンチョン; クオトンシエ
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-06-10
Filing date: 2019-11-14
Publication date: 2021-11-25
Anticipated expiration: 2039-11-14
Also published as: CN110377342B; WO2020248499A1; JP7174831B2; CN110377342A

Abstract

The present application relates to the field of neural networks and provides a video memory processing method, apparatus, and recording medium based on a convolutional neural network. The video memory processing method includes: creating temporary storage spaces, which are storage spaces for temporarily storing input data, output data, input errors, and output errors; calling the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reading the data to be processed into the called temporary storage space; performing predetermined processing on the data to be processed within the called temporary storage space; and writing the data in the called temporary storage space to a designated external storage space according to the type and direction of the processed data. The present application can save a large amount of video memory and improve the parallelism of GPU computation. [Selected drawing] FIG. 5

Description

This application claims priority to Chinese Patent Application No. 201910497396.8, filed on June 10, 2019 and entitled "Video memory processing method, apparatus and recording medium based on convolutional neural network".

The present application relates to the technical field of convolutional neural networks (CNN), and in particular to a video memory processing method, apparatus, and recording medium based on a convolutional neural network.

Video memory is the temporary memory of the GPU display core and temporarily stores the data that the core is to process. Its role for the GPU is the same as that of main memory for the CPU. The capacity of the video memory determines how much data it can temporarily store; when the core of the video card is sufficiently powerful, a large video memory can reduce the number of data reads and reduce latency. The applicant has observed that, in the current training process of convolutional neural network models, the input and output data of the model are repeatedly stored in different video memory spaces. This causes unnecessary video memory overhead and reduces the batch size available for training, which in turn affects the training accuracy of the model.

For example, the Concat layer and the Addition layer are commonly used in current deep learning classification networks and object detection networks. The Concat layer merges multiple inputs along the feature dimension, and the Addition layer accumulates multiple inputs. Conventional deep learning training frameworks such as Caffe and TensorFlow do not optimize video memory for the Concat and Addition layers, so input and output data are repeatedly stored in different video memory spaces. This causes unnecessary video memory overhead and reduces the training batch size, which affects the training accuracy of the model. In addition, the available video memory also limits, for example, the search space of optimization methods in automated machine learning (AutoML).

The main object of the present application is to provide a video memory processing method, apparatus, and computer-readable recording medium based on a convolutional neural network that create shared temporary storage spaces and read or write data in the corresponding temporary storage space according to the type and direction of the data to be processed. Compared with conventional frameworks, this allows the user to combine various modules freely to form new CNN structures, saves a large amount of video memory, and improves the parallelism of GPU computation.

To achieve the above object, the video memory processing method based on a convolutional neural network according to the present application, applied to an electronic device, includes: creating temporary storage spaces, which are storage spaces for temporarily storing input data, output data, input errors, and output errors; calling the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reading the data to be processed into the called temporary storage space; performing predetermined processing on the data to be processed in the called temporary storage space; and writing the data in the called temporary storage space to a designated external storage space according to the type and direction of the processed data.

The video memory processing system based on a convolutional neural network according to the present application includes: a space creation unit that creates temporary storage spaces, which are storage spaces for temporarily storing input data, output data, input errors, and output errors; a data calling unit that calls the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reads the data to be processed into the called temporary storage space; a preprocessing unit that performs predetermined processing on the data to be processed in the called temporary storage space; and a data writing unit that writes the data in the called temporary storage space to a designated external storage space according to the type and direction of the processed data.

To achieve the above object, the electronic device according to the present application includes a memory and a processor. The memory contains a video memory processing program based on a convolutional neural network which, when executed by the processor, implements the steps of the video memory processing method described above.

To achieve the above object, the computer-readable recording medium according to the present application contains a video memory processing program based on a convolutional neural network which, when executed by a processor, implements the steps of the video memory processing method described above.

The video memory processing method, system, electronic device, and computer-readable recording medium based on a convolutional neural network according to the present application set up shared temporary storage spaces, call the corresponding temporary storage space according to the type and direction of the data to be processed, and read or write the data in that temporary storage space to perform the computation. They are therefore applicable to CNN algorithms. Compared with other frameworks, they can be combined freely with Dense, Residual, and Inception modules to form new CNN structures, save about half of the video memory, and improve the parallelism of GPU computation.

The achievement of the object, the functional features, and the advantages of the present application are further described with reference to the drawings in combination with the embodiments.
FIG. 1 is a schematic diagram of the application environment of the video memory processing method based on a convolutional neural network according to an embodiment of the present application.
FIG. 2 is a schematic diagram of the modules of a specific embodiment of the video memory processing program based on a convolutional neural network in FIG. 1.
FIG. 3 is a schematic diagram of a partial structure of a conventional CNN.
FIG. 4 is a schematic diagram of the partial structure of FIG. 3 after video memory optimization.
FIG. 5 is a flowchart of the video memory processing method based on a convolutional neural network according to an embodiment of the present application.
FIG. 6 is a schematic diagram of the logical structure of the video memory processing system based on a convolutional neural network according to an embodiment of the present application.

The specific embodiments described here serve only to explain the present application and do not limit it.

(Example 1)
The present application provides a video memory processing method based on a convolutional neural network, applied to an electronic device 1. FIG. 1 is a schematic diagram of the application environment of a preferred embodiment of this method.

In this embodiment, the electronic device 1 may be a terminal device with computing capability, such as a server, a smartphone, a tablet computer, a portable computer, or a desktop computer.

The electronic device 1 includes a processor 12, a memory 11, a network interface 14, and a communication bus 15.

The memory 11 includes at least one type of readable recording medium, which may be a non-volatile recording medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the readable recording medium may be an internal storage unit of the electronic device 1, for example the hard disk of the electronic device 1. In other embodiments, the readable recording medium may be an external memory of the electronic device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1.

In this embodiment, the readable recording medium of the memory 11 generally stores, among other things, the video memory processing program 10 based on a convolutional neural network that is installed on the electronic device 1. The memory 11 can also temporarily store data that has been output or is about to be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), a microprocessor, or another data processing chip. It runs the program code or processes the data stored in the memory 11, for example by executing the video memory processing program 10 based on a convolutional neural network.

The network interface 14 may include a standard wired interface and a wireless interface (for example, a Wi-Fi interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.

The communication bus 15 provides the connections and communication between these components.

Although FIG. 1 shows only the electronic device 1 with components 11 to 15, it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead.

The electronic device 1 may include a user interface, a display, and a touch sensor. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and an audio output device such as speakers or headphones. The display may be an LED display, a liquid crystal display, a touch-type liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The touch sensor may be a resistive touch sensor, a capacitive touch sensor, or the like. The touch sensor may be a contact-type touch sensor or a proximity-type touch sensor. It may be a single sensor or a plurality of sensors arranged, for example, in an array.

Preferably, the electronic device 1 may further include a radio frequency (RF) circuit, sensors, an audio circuit, and the like, which are not described further here.

In the device embodiment shown in FIG. 1, the memory 11, which is a computer recording medium, may contain an operating system, the video memory processing program 10 based on a convolutional neural network, and the like. When the processor 12 executes the video memory processing program 10 stored in the memory 11, it carries out the following steps: step 1, creating temporary storage spaces, which are storage spaces for temporarily storing input data, output data, input errors, and output errors; step 2, calling the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reading the data to be processed into the called temporary storage space; step 3, performing predetermined processing on the data to be processed in the called temporary storage space; and step 4, writing the data in the called temporary storage space to a designated external storage space according to the type and direction of the processed data.

In step 1, the temporary storage spaces are storage spaces for temporarily storing input data, output data, input errors, and output errors. They comprise an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space. The temporary storage spaces may be set up in video memory. Video memory stores the model and the data, so the larger the video memory, the larger the network that can be run. Common video cards mainly come in the following types.

The main storage units of video memory are:
1 Byte = 8 bit,
1 K = 1024 Byte,
1 KB = 1000 Byte,
1 M = 1024 K,
1 MB = 1000 KB,
1 G = 1024 M,
1 GB = 1000 MB,
10 K = 10 * 1024 Byte, and
10 KB = 10000 Byte.
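The unit list above can be sketched as a small conversion table. This is only an illustration: it follows the convention stated in this description (K, M, G as powers of 1024; KB, MB, GB as powers of 1000), and the names `BYTES_PER_UNIT`, `to_bytes`, and `to_bits` are hypothetical helpers, not part of the described method.

```python
# Bytes per unit, following the convention stated in the description:
# K/M/G are binary multiples, KB/MB/GB are decimal multiples.
BYTES_PER_UNIT = {
    "Byte": 1,
    "K": 1024,
    "KB": 1000,
    "M": 1024 ** 2,
    "MB": 1000 ** 2,
    "G": 1024 ** 3,
    "GB": 1000 ** 3,
}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * BYTES_PER_UNIT[unit]


def to_bits(value, unit):
    """Convert a quantity in the given unit to bits (1 Byte = 8 bit)."""
    return to_bytes(value, unit) * 8


print(to_bytes(10, "K"))   # 10240
print(to_bytes(10, "KB"))  # 10000
```

The two printed values reproduce the last two entries of the list, which differ only because of the binary versus decimal multiplier.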

Common numeric types and their sizes are shown in the table below.

In the above table, int is an integer type, long is a long integer type, and float is a floating-point type (single is single-precision floating point and double is double-precision floating point).

In step 2, when the type of the data to be processed is error and its direction is output, the corresponding output error temporary storage space is called for the output error data, and the output error can be read into that output error temporary storage space and processed there.

The step of performing predetermined processing on the data to be processed includes performing at least one of convolution, superposition, multiplication, and integration on the data to be processed.

For example, convolving data mainly means multiplying two variables over a certain range and then summing the products. If the variables being convolved are the sequences x(n) and h(n), the convolution result is given by the following equation.

$$y(n) = x(n) * h(n) = \sum_{i=-\infty}^{\infty} x(i)\,h(n-i)$$

In the equation, * denotes convolution. For n = 0, the sequence h(-i) is h(i) with its index i reversed; the reversal flips h(i) by 180 degrees about the vertical axis. This method of multiplying and then summing is therefore called the convolution sum, or convolution for short. Here n is the amount by which h(-i) is shifted, and different values of n give different convolution results.
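The multiply-then-sum procedure described above can be sketched for finite sequences as follows. This is a minimal illustration, assuming that both sequences are zero outside the listed samples; the function name `convolve` is a hypothetical helper.

```python
def convolve(x, h):
    """Full discrete convolution y(n) = sum over i of x(i) * h(n - i).

    x and h are finite sequences, treated as zero outside their samples.
    """
    y = [0] * max(len(x) + len(h) - 1, 0)
    for n in range(len(y)):
        for i in range(len(x)):
            k = n - i  # index into h after reversal and a shift by n
            if 0 <= k < len(h):
                y[n] += x[i] * h[k]
    return y


print(convolve([1, 2, 3], [0, 1, 2]))  # [0, 1, 4, 7, 6]
```

Each output sample corresponds to one shift n, which is why different values of n give different results.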

If the variables being convolved are the two functions x(t) and h(t), the convolution is calculated according to the following equation.

$$y(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(p)\,h(t-p)\,dp$$

Here p is the integration variable (integration is also a form of summation), t is the amount by which the function h(-p) is shifted, and * denotes convolution.
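Because integration is itself a summation, the continuous case can be approximated numerically by a Riemann sum. The sketch below is only an illustration under stated assumptions: the signal `box`, the integration limits, and the step count are arbitrary choices, and `convolve_at` is a hypothetical helper.

```python
def box(t):
    """Unit box signal: 1 on [0, 1), 0 elsewhere (an illustrative choice)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def convolve_at(x, h, t, lo=-5.0, hi=5.0, steps=100_000):
    """Approximate y(t) = integral of x(p) * h(t - p) dp by a midpoint sum.

    The limits lo and hi must cover the region where the integrand is nonzero.
    """
    dp = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        p = lo + (k + 0.5) * dp  # midpoint of the k-th sub-interval
        total += x(p) * h(t - p)
    return total * dp


# Convolving two unit boxes gives a triangle that peaks at t = 1.
print(round(convolve_at(box, box, 0.5), 2))
print(round(convolve_at(box, box, 1.0), 2))
```

For this pair of signals the exact result is t on [0, 1] and 2 - t on [1, 2], so the two printed values should be close to 0.5 and 1.0.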

To save video memory, all of the above operations can be performed within the temporary storage spaces.

In step 4, the step of writing the data in the called temporary storage space to the designated external storage space includes writing the processed data in the temporary storage space to the designated external storage space using a configured write mode. The write modes include an Addition mode and a Concat mode.

The data types include input data, output data, input error, and output error, and the data directions include input and output.

Specifically, data can be written to the designated memory space in different ways depending on the write mode (Addition or Concat) set by the user. When the user sets the Addition mode, the data in the corresponding temporary storage space is written to the designated storage space cumulatively. When the user sets the Concat mode, the data in the corresponding temporary storage space is written to the designated storage space sequentially, at intervals determined by the data length information set by the user.
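The two write modes can be sketched as follows. This is one possible reading, not the implementation of the present application: plain Python lists stand in for video memory, Concat mode is interpreted as copying each block to an offset derived from the user-supplied lengths, and the function names are hypothetical.

```python
def write_addition(temp, dest):
    """Addition mode: accumulate the temporary data into dest element-wise."""
    if len(temp) != len(dest):
        raise ValueError("Addition mode needs buffers of equal length")
    for i, value in enumerate(temp):
        dest[i] += value


def write_concat(temp, dest, offset, length):
    """Concat mode: copy `length` elements of temp into dest at `offset`.

    Returns the offset at which the next block should be written.
    """
    if length > len(temp) or offset + length > len(dest):
        raise ValueError("block does not fit")
    dest[offset:offset + length] = temp[:length]
    return offset + length


# Addition: two inputs are summed into the same destination.
summed = [0, 0, 0, 0]
write_addition([1, 2, 3, 4], summed)
write_addition([10, 20, 30, 40], summed)
print(summed)  # [11, 22, 33, 44]

# Concat: two inputs are laid out one after the other.
merged = [0] * 5
next_offset = write_concat([1, 2], merged, 0, 2)
next_offset = write_concat([3, 4, 5], merged, next_offset, 3)
print(merged)  # [1, 2, 3, 4, 5]
```

In both cases the same temporary buffer can be reused for every input, since its contents are consumed by the write before the next input arrives.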

The execution of the video memory processing program based on a convolutional neural network of the present application is described in detail below, taking a convolutional neural network as an example.

To obtain the video memory usage of the output of each layer of a neural network, the shape of the feature map of each layer must be calculated, and the gradients must be saved for backpropagation. Video memory usage is directly proportional to the batch size. The video memory usage of the whole network is: model video memory + batch size * video memory usage per sample. When the model is small, this is approximately equal to batch size * video memory usage per sample.
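The estimate above can be written out as a short calculation. This is a rough sketch under stated assumptions: values are 4-byte floats, saved gradients are assumed to take as much space as the feature maps themselves, and the function names and example shapes are illustrative only.

```python
def per_sample_bytes(feature_map_shapes, bytes_per_value=4, keep_gradients=True):
    """Bytes needed for one sample's feature maps across all layers.

    keep_gradients=True doubles the figure, assuming one gradient value
    is saved per feature-map value (an assumption of this sketch).
    """
    values = 0
    for shape in feature_map_shapes:
        count = 1
        for dim in shape:
            count *= dim
        values += count
    factor = 2 if keep_gradients else 1
    return values * bytes_per_value * factor


def total_bytes(model_bytes, batch_size, sample_bytes):
    """Total usage = model video memory + batch size * usage per sample."""
    return model_bytes + batch_size * sample_bytes


shapes = [(32, 32, 3), (32, 32, 16)]  # illustrative feature-map shapes
forward_only = per_sample_bytes(shapes, keep_gradients=False)
print(forward_only)                     # 77824
print(total_bytes(0, 5, forward_only))  # 389120
```

Because the second term scales with the batch size, any reduction in per-sample usage directly raises the batch size that fits in a given video memory.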

To save video memory during the training of a convolutional neural network model, video memory optimization can be applied to layers such as Concat and Addition, for example by merging multiple inputs within the corresponding temporary storage space and by accumulating multiple inputs within the corresponding temporary storage space.

For example, FIG. 3 shows a partial structure of a conventional CNN to which no video memory optimization has been applied.

As shown in FIG. 3, setting backpropagation aside for the moment and taking forward propagation as an example, the input data of the convolution layer has size 32 * 32 * 3. With a batch size of 5, the input to this layer has size 32 * 32 * 3 * 5. The sizes of all other input and output data are calculated in the same way. Therefore, when the data are stored as float, this portion of the CNN uses 1980 KB of video memory without optimization.

Video memory optimization is applied to the above portion using the video memory processing method based on a convolutional neural network of the present application. The structure after optimization is shown in FIG. 4.

Because backpropagation is set aside for the moment, only the output data temporary storage space needs to be considered. The size of this temporary storage space is set to the maximum output size of the convolution layers in the CNN, which is 32 * 32 * 16 in this embodiment. None of the convolution-layer outputs inside the dashed box in FIG. 4 is allocated actual video memory; each calls the output data temporary storage space instead.
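The idea of sizing one shared output space for the largest layer output and handing it to every layer can be sketched as follows. This is an illustration only: host memory from Python's `array` module stands in for video memory, the class `SharedOutputSpace` is hypothetical, and the second layer size (32 * 32 * 8) is an invented example.

```python
from array import array


class SharedOutputSpace:
    """One preallocated float buffer that every layer output borrows."""

    def __init__(self, max_elements):
        # Allocate once, sized for the largest output that will be requested.
        self._storage = array("f", [0.0]) * max_elements

    @property
    def nbytes(self):
        return len(self._storage) * self._storage.itemsize

    def acquire(self, n_elements):
        """Return a writable view of the first n_elements; nothing is allocated."""
        if n_elements > len(self._storage):
            raise ValueError("requested output is larger than the shared space")
        return memoryview(self._storage)[:n_elements]


batch_size = 5
space = SharedOutputSpace(32 * 32 * 16 * batch_size)  # largest output in the example

out_a = space.acquire(32 * 32 * 16 * batch_size)  # first layer output
out_a[0] = 1.5
out_b = space.acquire(32 * 32 * 8 * batch_size)   # a smaller, hypothetical layer
print(out_b[0])      # 1.5, because both views share the same storage
print(space.nbytes)  # 327680
```

The shared storage means each output must be consumed, or written out to its designated space, before the next layer overwrites it.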

Consequently, with a batch size of 5 and data stored as float, this portion of the CNN uses 1340 KB of video memory after optimization, a saving of 32.3%.
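The figures in this example can be checked with a few lines of arithmetic. The sketch assumes 4-byte floats and 1 KB = 1024 bytes, under which the stated sizes come out as whole numbers. The decomposition into a count of 320 KB tensors is an inference that happens to match the reported totals; it is not stated in the description.

```python
BYTES_PER_FLOAT = 4
BATCH_SIZE = 5


def tensor_kb(height, width, channels, batch_size=BATCH_SIZE):
    """Size of one float tensor in KB, assuming 1 KB = 1024 bytes."""
    return height * width * channels * batch_size * BYTES_PER_FLOAT / 1024


input_kb = tensor_kb(32, 32, 3)    # layer input
output_kb = tensor_kb(32, 32, 16)  # largest convolution output
print(input_kb, output_kb)         # 60.0 320.0

before_kb = 1980  # reported usage without optimization
after_kb = 1340   # reported usage with optimization
saving = (before_kb - after_kb) / before_kb
print(f"{saving:.1%}")             # 32.3%

# Inferred decomposition consistent with the reported totals.
print(input_kb + 6 * output_kb)    # 1980.0
print(input_kb + 4 * output_kb)    # 1340.0
```

The difference of 640 KB equals two tensors of the largest output size, which fits the description of layer outputs being served from the shared output space instead of separate allocations.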

The electronic device 1 according to the above embodiment sets up shared temporary storage spaces, calls the corresponding temporary storage space according to the type and direction of the data to be processed, and reads or writes the data in that temporary storage space to perform the computation. It is therefore applicable to CNN algorithms. Compared with other frameworks, it can be combined freely with Dense, Residual, and Inception modules to form new CNN structures, save about half of the video memory, and improve the parallelism of GPU computation.

In other embodiments, the video memory processing program 10 based on a convolutional neural network may further be provided with a shared temporary storage space manager. The manager contains the temporary storage spaces that temporarily store input data, output data, input errors, and output errors, and it provides several submodules for obtaining and operating on the corresponding temporary storage space. One or more modules are stored in the memory 11 and executed by the processor 12 to carry out the present application. A module, as used in the present application, is a series of computer program instruction segments capable of performing a specific function. FIG. 2 is a diagram of the program modules of a preferred embodiment of the video memory processing program 10 based on a convolutional neural network in FIG. 1. The video memory processing program 10 can be divided into the following submodules 210 to 230.

The temporary space acquisition submodule 210 calls and outputs the corresponding temporary storage space according to the type (data or error) and direction (input or output) of the data supplied to the submodule.
For example, if "error and output" is supplied to the temporary space acquisition submodule, the submodule calls and outputs the corresponding output error temporary storage space.

The data reading submodule 220 reads the data in a designated storage space into the corresponding temporary storage space according to the type (data or error) and direction (input or output) of the data supplied to the submodule, and outputs that temporary storage space.

For example, if "error and output" is supplied to the data reading submodule, the submodule reads the data in the designated storage space into the output error temporary storage space and outputs the output error temporary storage space.

The designated space here mainly refers to the storage space in which the data to be processed currently resides; the data to be processed is read from the designated space into the temporary storage space and processed there. The same applies below.

The data writing submodule 230 writes the data in the corresponding temporary storage space to a designated storage space according to the type (data or error) and direction (input or output) of the data supplied to the submodule.

For example, if "error and input" is supplied to the data writing submodule, the submodule writes the data in the input error temporary storage space to the designated storage space.

なお、該データ書き込みサブモジュールはまたユーザにより設定された書き込み方式（Ａｄｄｉｔｉｏｎ￥Ｃｏｎｃａｔ）に応じて、異なる方式でデータを指定されたメモリ空間内に書き込む場合もある。例えば、ユーザがＡｄｄｉｔｉｏｎモードを設定した場合、該データ書き込みサブモジュールは、対応する一時記憶空間内のデータを指定された記憶空間に累積的に書き込み、ユーザがＣｏｎｃａｔモードを設定した場合、該データ書き込みサブモジュールは、ユーザにより設定されたデータ長情報に基づいて、対応する一時記憶空間内のデータを指定された記憶空間に間隔をあけて順次書き込む。 The data writing submodule may also write data in the designated memory space by a different method according to the writing method (Addition \ Concat) set by the user. For example, when the user sets the Addition mode, the data write submodule cumulatively writes the data in the corresponding temporary storage space to the specified storage space, and when the user sets the Concat mode, the data write. The submodule sequentially writes data in the corresponding temporary storage space to the specified storage space at intervals based on the data length information set by the user.
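The behaviour of submodules 210 to 230 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the present application: the class name `TempSpaces`, its method names, and the use of plain Python lists in place of regions of video memory are all hypothetical.

```python
# Hypothetical sketch: plain lists stand in for regions of video memory.
class TempSpaces:
    """Four temporary storage spaces keyed by (type, direction)."""

    def __init__(self, size):
        self.spaces = {
            (kind, direction): [0.0] * size
            for kind in ("data", "error")
            for direction in ("input", "output")
        }

    def acquire(self, kind, direction):
        # Submodule 210: return the space matching the type and direction.
        return self.spaces[(kind, direction)]

    def read(self, kind, direction, source):
        # Submodule 220: copy from the designated space into the temporary
        # space, then return the temporary space.
        space = self.acquire(kind, direction)
        space[:len(source)] = source
        return space

    def write(self, kind, direction, target):
        # Submodule 230: copy from the temporary space into the designated space.
        space = self.acquire(kind, direction)
        target[:] = space[:len(target)]


spaces = TempSpaces(size=4)
out_err = spaces.read("error", "output", [0.1, 0.2, 0.3, 0.4])
result = [0.0] * 4
spaces.write("error", "output", result)
```

Keying the spaces by a (type, direction) pair mirrors the "error and output" style of request used in the examples above; a real implementation would allocate these spaces on the GPU.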

(Example 2)
The present application further provides a video memory processing method based on a convolutional neural network. FIG. 5 is a flowchart of a preferred embodiment of the video memory processing method based on a convolutional neural network according to the present application. The method may be executed by a device, and the device may be implemented in software and/or hardware.

In this embodiment, the video memory processing method based on a convolutional neural network includes the following steps S110 to S140.

In S110, a temporary storage space is created, which is a storage space for temporarily storing input data, output data, input errors, and output errors.

In this step, the temporary storage space temporarily stores input data, output data, input errors, and output errors; correspondingly, it includes an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space.

The temporary storage space may be set up in video memory. Video memory stores the model or data; the larger the video memory, the larger the network that can be run. Common video cards mainly fall into the following types.

Common numerical types and their sizes are shown in the table below.

In S120, the temporary storage space corresponding to the data to be processed is called according to the type and direction of that data, and the data to be processed is read into the called temporary storage space.

For example, if the type of the data to be processed is error and its direction is output, the output error temporary storage space is called for the output error data, and the output error can be read into that space and processed there.

In S130, predetermined processing is performed on the data to be processed within the called temporary storage space.

For example, convolution processing of data mainly multiplies two variables over a certain range and then sums the products. If the variables of the convolution are the sequences x(n) and h(n), the convolution result is given by the following equation.

$$y(n) = \sum_{i=-\infty}^{\infty} x(i)\,h(n-i) = x(n) * h(n)$$

If the variables of the convolution are two functions x(t) and h(t), the convolution is computed according to the following equation.

$$y(t) = \int_{-\infty}^{\infty} x(p)\,h(t-p)\,dp = x(t) * h(t)$$

Similarly, all of the above operations can be performed within the temporary storage space in order to save video memory.
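For the discrete case, the multiply-then-sum operation of S130 can be illustrated with a short Python sketch. The function name `convolve` and the sample sequences are assumptions made for the example; finite lists are used, so terms outside the stored range are treated as zero.

```python
def convolve(x, h):
    """Discrete convolution y(n) = sum over i of x(i) * h(n - i)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for i in range(len(x)):
            # Only indices that fall inside h contribute; the rest are zero.
            if 0 <= n - i < len(h):
                y[n] += x[i] * h[n - i]
    return y


y = convolve([1.0, 2.0, 3.0], [0.0, 1.0, 0.5])
print(y)  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

Each output index n corresponds to a different shift of the reversed sequence h, which is why different values of n give different convolution results.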

In S140, the data in the called temporary storage space is written into a designated external storage space according to the type and direction of the processed data.

In this step, writing the data in the called temporary storage space into the designated external storage space includes writing the processed data in the temporary storage space into the designated external storage space using a configured write mode, where the write modes include Addition mode and Concat mode.

Specifically, data can be written into the designated memory space in different ways depending on the write mode (Addition or Concat) set by the user. For example, if the user sets Addition mode, the data in the corresponding temporary storage space is cumulatively written into the designated storage space; if the user sets Concat mode, the data in the corresponding temporary storage space is written into the designated storage space sequentially at intervals, based on the data length information set by the user.
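The two write modes of S140 can be sketched as follows. This is a minimal illustration, not the implementation of the present application: the function name `write_out`, the `offset` parameter, and the reading of "at intervals based on the data length information" as writing each block at an offset derived from the preceding block lengths are assumptions.

```python
def write_out(temp, target, mode, offset=0):
    """Write processed data from a temporary space into the designated space.

    mode "addition": accumulate element-wise into target.
    mode "concat":   copy temp into target starting at `offset`.
    """
    if mode == "addition":
        for i, value in enumerate(temp):
            target[i] += value
    elif mode == "concat":
        target[offset:offset + len(temp)] = temp
    else:
        raise ValueError(f"unknown write mode: {mode}")


# Addition: two branch outputs are summed in place.
total = [0.0, 0.0, 0.0]
write_out([1.0, 2.0, 3.0], total, "addition")
write_out([10.0, 20.0, 30.0], total, "addition")

# Concat: two branch outputs are placed one after another,
# with the offset taken from the length of the earlier block.
merged = [0.0] * 5
write_out([1.0, 2.0], merged, "concat", offset=0)
write_out([3.0, 4.0, 5.0], merged, "concat", offset=2)
```

In both modes the branches write directly into one destination, so no separate intermediate copy of each branch output has to be kept.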

The video memory processing method based on a convolutional neural network according to the present application is described in detail below, taking a convolutional neural network as an example.

To reduce video memory usage during training of a convolutional neural network model, video memory optimization can be applied to layers such as the concat and addition layers, for example by merging multiple input data within the corresponding temporary storage space and by accumulating multiple input data within the corresponding temporary storage space.

As shown in FIG. 3, setting backpropagation aside for the moment and taking forward propagation as an example, the input data of the convolutional layer has size 32*32*3; with a batch size of 5, the input data of that layer has size 32*32*3*5, and the size of every other input and output is calculated in the same way. Therefore, when the data is represented as float, the video memory used by this part of the CNN without optimization is 1980 KB.
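The size calculation for a single tensor can be checked with a few lines of Python. The helper name `tensor_bytes` is hypothetical, and a 4-byte float is assumed; the 1980 KB total depends on the full set of layers in FIG. 3 and is not recomputed here.

```python
def tensor_bytes(height, width, channels, batch_size, bytes_per_element=4):
    """Memory footprint in bytes of one dense tensor (4-byte float assumed)."""
    return height * width * channels * batch_size * bytes_per_element


size = tensor_bytes(32, 32, 3, 5)  # 32*32*3*5 float values
size_kib = size / 1024
print(size, size_kib)  # 61440 60.0
```

Under these assumptions the 32*32*3*5 input alone occupies 61440 bytes, which is 60 KB when 1 KB is taken as 1024 bytes.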

Video memory optimization is applied to this part using the video memory processing method based on a convolutional neural network according to the present application; the structure after optimization is shown in FIG. 4.

The video memory processing method based on a convolutional neural network according to the above embodiment sets up shared temporary storage spaces, calls the corresponding temporary storage space according to the type and instruction of the data to be processed, and reads the data into, or writes it to, the corresponding temporary storage space to perform computation. It can therefore be applied to CNN algorithms and, compared with other frameworks, can be freely combined with Dense, Residual, and Inception modules to form new CNN structures, saving about half of the video memory while improving the parallelism of GPU computation.

(Example 3)
Corresponding to the video memory processing method based on a convolutional neural network of Example 2 above, the present application further provides a video memory processing system based on a convolutional neural network. FIG. 6 shows the logical structure of the video memory processing system based on a convolutional neural network according to this embodiment.

As shown in FIG. 6, the video memory processing system 600 based on a convolutional neural network according to this embodiment includes a space creation unit 610, a data calling unit 620, a preprocessing unit 630, and a data writing unit 640. The functions implemented by the space creation unit 610, the data calling unit 620, the preprocessing unit 630, and the data writing unit 640 correspond one-to-one to the corresponding steps of the video memory processing method based on a convolutional neural network in Example 2.

Specifically, the space creation unit 610 creates a temporary storage space, which is a storage space for temporarily storing input data, output data, input errors, and output errors. The space creation unit 610 can create the temporary storage space in video memory. Video memory stores the model or data; the larger the video memory, the larger the network that can be run. The created temporary storage space may include an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space.

The data calling unit 620 calls the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reads the data to be processed into the called temporary storage space. For example, if the type of the data to be processed is error and its direction is output, the output error temporary storage space is called for the output error data, and the output error can be read into that space and processed there.

The preprocessing unit 630 performs predetermined processing on the data to be processed within the temporary storage space called by the data calling unit 620. The predetermined processing may include at least one of convolution, superposition, multiplication, or integration performed on the data to be processed.

For example, when the preprocessing unit 630 performs convolution on data, it mainly multiplies two variables over a certain range and then sums the products. If the variables of the convolution are the sequences x(n) and h(n), the convolution result is given by the following equation.

$$y(n) = \sum_{i=-\infty}^{\infty} x(i)\,h(n-i) = x(n) * h(n)$$

In the equation, * denotes convolution. When n = 0, the sequence h(−i) is the result of reversing the index i of h(i); reversing the index flips h(i) by 180 degrees about the vertical axis. This multiply-then-sum computation is therefore called a convolution sum, or convolution for short. In addition, n is the amount by which h(−i) is shifted, and different values of n correspond to different convolution results.

The data writing unit 640 writes the data in the called temporary storage space into a designated external storage space according to the type and direction of the processed data.

The data writing unit 640 can write the processed data in the temporary storage space into the designated external storage space using a configured write mode, where the write modes include Addition mode and Concat mode. If the user sets Addition mode, the data in the corresponding temporary storage space is cumulatively written into the designated storage space; if the user sets Concat mode, the data in the corresponding temporary storage space is written into the designated storage space sequentially at intervals, based on the data length information set by the user.
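How the four units cooperate, and how one temporary storage space is reused across successive layers, can be sketched as follows. The function names `create_spaces` and `run_step`, the element-wise `process` callback, and the use of Python lists in place of video memory are all assumptions made for illustration.

```python
def create_spaces(size):
    # Unit 610: one reusable buffer per (type, direction) pair.
    return {(kind, direction): [0.0] * size
            for kind in ("data", "error")
            for direction in ("input", "output")}


def run_step(spaces, kind, direction, source, process, target):
    buf = spaces[(kind, direction)]            # unit 620: call the space
    buf[:len(source)] = source                 # unit 620: read the data into it
    for i in range(len(source)):               # unit 630: process in place
        buf[i] = process(buf[i])
    target[:len(source)] = buf[:len(source)]   # unit 640: write the result out
    return buf


spaces = create_spaces(size=3)
layer1_out = [0.0] * 3
layer2_out = [0.0] * 3
buf_a = run_step(spaces, "data", "input", [1.0, -2.0, 3.0],
                 lambda v: max(v, 0.0), layer1_out)
buf_b = run_step(spaces, "data", "input", layer1_out,
                 lambda v: v * 2.0, layer2_out)
```

Both steps operate on the same buffer object, so the working memory does not grow with the number of layers in this sketch.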

The video memory processing system based on a convolutional neural network according to the above embodiment sets up shared temporary storage spaces, calls the corresponding temporary storage space according to the type and instruction of the data to be processed, and reads the data into, or writes it to, the corresponding temporary storage space to perform computation. It can therefore be applied to CNN algorithms and, compared with other frameworks, can be freely combined with Dense, Residual, and Inception modules to form new CNN structures, saving about half of the video memory while improving the parallelism of GPU computation.

(Example 4)
The computer-readable recording medium according to an embodiment of the present application contains a video memory processing program based on a convolutional neural network which, when executed by a processor, implements: an operation of creating a temporary storage space, which is a storage space for temporarily storing input data, output data, input errors, and output errors; an operation of calling the temporary storage space corresponding to the data to be processed according to the type and direction of that data, and reading the data to be processed into the called temporary storage space; an operation of performing predetermined processing on the data to be processed within the called temporary storage space; and an operation of writing the data in the called temporary storage space into a designated external storage space according to the type and direction of the processed data.

Preferably, the temporary storage space includes an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space.

Preferably, the step of performing predetermined processing on the data to be processed includes a step of performing at least one of convolution, superposition, multiplication, or integration on the data to be processed.

Preferably, the step of writing the data in the called temporary storage space into the designated external storage space includes a step of writing the processed data in the temporary storage space into the designated external storage space using a configured write mode, where the write modes include Addition mode and Concat mode.

Preferably, the type of the data includes input data, output data, input error, and output error, and the direction of the data includes input and output.

The specific embodiments of the computer-readable recording medium of the present application are substantially the same as those of the video memory processing method, system, and electronic device based on a convolutional neural network described above, and their description is therefore omitted here.

Note that, in this specification, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that includes the element.

The numbering of the above embodiments is for description only and does not indicate their relative merits. From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the preferable implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The computer software product is stored in a recording medium as described above (for example, ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.

The above are merely preferred embodiments of the present application and do not limit its scope of protection. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the present application.

Claims

A video memory processing method based on a convolutional neural network, applied to an electronic device, the method comprising:
a step of creating a temporary storage space, which is a storage space for temporarily storing input data, output data, input errors, and output errors;
a step of calling the temporary storage space corresponding to data to be processed according to the type and direction of the data to be processed, and reading the data to be processed into the called temporary storage space;
a step of performing predetermined processing on the data to be processed within the called temporary storage space; and
a step of writing the data in the called temporary storage space into a designated external storage space according to the type and direction of the processed data.

The video memory processing method based on a convolutional neural network according to claim 1, wherein the temporary storage space includes an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space.

The video memory processing method based on a convolutional neural network according to claim 1, wherein the temporary storage space is set in a video memory.

The video memory processing method based on a convolutional neural network according to claim 1, wherein the step of performing predetermined processing on the data to be processed includes a step of performing at least one of convolution, superposition, multiplication, and integration on the data to be processed.

The video memory processing method based on a convolutional neural network according to claim 1, wherein the predetermined processing performed on the data to be processed is convolution processing that multiplies two variables over a certain range and then sums the products.

The video memory processing method based on a convolutional neural network according to claim 5, wherein, if the variables of the convolution are the sequences x(n) and h(n), the convolution result is given by the following equation:

$$y(n) = \sum_{i=-\infty}^{\infty} x(i)\,h(n-i) = x(n) * h(n)$$

where * denotes convolution, n is the amount by which h(−i) is shifted, different values of n correspond to different convolution results, and, when n = 0, the sequence h(−i) is the result of reversing the index i of h(i), the reversal flipping h(i) by 180 degrees about the vertical axis.

The video memory processing method based on a convolutional neural network according to claim 5, wherein, if the variables of the convolution are two functions x(t) and h(t), the convolution result is given by the following equation:

$$y(t) = \int_{-\infty}^{\infty} x(p)\,h(t-p)\,dp = x(t) * h(t)$$

where * denotes convolution, t is the amount by which the function h(−p) is shifted, p is the integration variable, and integration is also a summation.

The video memory processing method based on a convolutional neural network according to claim 1, wherein the step of writing the data in the called temporary storage space into the designated external storage space includes a step of writing the processed data in the temporary storage space into the designated external storage space using a configured write mode, the write modes including an Addition mode and a Concat mode.

The video memory processing method based on a convolutional neural network according to claim 8, wherein, when the Addition mode is set, the data in the corresponding temporary storage space is cumulatively written into the external storage space, and, when the Concat mode is set, the data in the corresponding temporary storage space is written into the external storage space sequentially at intervals based on the set data length information.

The video memory processing method based on a convolutional neural network according to any one of claims 1 to 9, wherein the type of the data includes input data, output data, input error, and output error, and the direction of the data includes input and output.

A video memory processing system based on a convolutional neural network, comprising:
a space creation unit that creates a temporary storage space, which is a storage space for temporarily storing input data, output data, input errors, and output errors;
a data calling unit that calls the temporary storage space corresponding to data to be processed according to the type and direction of the data to be processed, and reads the data to be processed into the called temporary storage space;
a preprocessing unit that performs predetermined processing on the data to be processed within the called temporary storage space; and
a data writing unit that writes the data in the called temporary storage space into a designated external storage space according to the type and direction of the processed data.

The video memory processing system based on a convolutional neural network according to claim 11, wherein the temporary storage space created by the space creation unit includes an input data temporary storage space, an output data temporary storage space, an input error temporary storage space, and an output error temporary storage space.

The video memory processing system based on the convolutional neural network according to claim 11, wherein the space creation unit creates the temporary storage space in the video memory.

The video memory processing system based on a convolutional neural network according to claim 11, wherein the predetermined processing performed by the preprocessing unit on the data to be processed includes at least one of convolution, superposition, multiplication, and integration performed on the data to be processed.

The video memory processing system based on a convolutional neural network according to claim 11, wherein the predetermined processing performed by the preprocessing unit on the data to be processed is convolution processing that multiplies two variables over a certain range and then sums the products, and, if the variables of the convolution are the sequences x(n) and h(n), the convolution result is given by the following equation:

$$y(n) = \sum_{i=-\infty}^{\infty} x(i)\,h(n-i) = x(n) * h(n)$$

where * denotes convolution, n is the amount by which h(−i) is shifted, different values of n correspond to different convolution results, and, when n = 0, the sequence h(−i) is the result of reversing the index i of h(i), the reversal flipping h(i) by 180 degrees about the vertical axis.

The video memory processing system based on a convolutional neural network according to claim 11, wherein the predetermined processing performed by the preprocessing unit on the data to be processed is convolution processing that multiplies two variables over a certain range and then sums the products, and, if the variables of the convolution are two functions x(t) and h(t), the convolution result is given by the following equation:

$$y(t) = \int_{-\infty}^{\infty} x(p)\,h(t-p)\,dp = x(t) * h(t)$$

where * denotes convolution, t is the amount by which the function h(−p) is shifted, p is the integration variable, and integration is also a summation.

The video memory processing system based on a convolutional neural network according to claim 11, wherein the data writing unit writes the processed data in the temporary storage space into the designated external storage space using a configured write mode, the write modes including an Addition mode and a Concat mode.

The video memory processing system based on a convolutional neural network according to claim 11, wherein, when the Addition mode is set, the data writing unit cumulatively writes the data in the corresponding temporary storage space into the external storage space, and, when the Concat mode is set, the data writing unit writes the data in the corresponding temporary storage space into the external storage space sequentially at intervals based on the set data length information.

An electronic device comprising a memory and a processor, wherein the memory contains a video memory processing program based on a convolutional neural network which, when executed by the processor, implements the steps of the video memory processing method based on a convolutional neural network according to any one of claims 1 to 11.

A computer-readable recording medium containing a video memory processing program based on a convolutional neural network which, when executed by a processor, implements the steps of the video memory processing method based on a convolutional neural network according to any one of claims 1 to 11.