JP2021196893A

JP2021196893A - Device, method, and program

Info

Publication number: JP2021196893A
Application number: JP2020103187A
Authority: JP
Inventors: 壮太井上; Sota Inoue
Original assignee: Fuji Electric Co Ltd
Current assignee: Fuji Electric Co Ltd
Priority date: 2020-06-15
Filing date: 2020-06-15
Publication date: 2021-12-27

Abstract

To reduce the time required to execute image recognition processing that uses machine learning models in multiple stages. SOLUTION: A device according to one embodiment executes recognition processing implemented by arranging the inference processing of machine learning models in multiple stages, and includes: a recognition unit that obtains an m-th stage recognition result for input data through inference processing of the m-th stage machine learning model, using the parameters of that model; an analysis unit that obtains an m-th stage recognition result for the input data through analysis processing of a lightweight analysis algorithm; an expansion unit that loads parameters of (m+1)-th stage machine learning models into memory in accordance with the m-th stage recognition result obtained by the analysis unit; and a selection unit that selects, from the (m+1)-th stage parameters loaded into memory by the expansion unit, the parameters of the (m+1)-th stage machine learning model corresponding to the m-th stage recognition result obtained by the recognition unit. SELECTED DRAWING: Figure 3

Description

The present invention relates to a device, a method, and a program.

In recent years, image recognition processing using machine learning models such as neural networks has been put into practical use. When complex image recognition processing is performed (for example, classifying products in an image into hundreds of categories), multiple machine learning models are used in multiple stages.

For example, a neural network that recognizes the major classification of a product in an image is used in the first stage, and a plurality of neural networks, each of which recognizes the minor classification of products in a specific major classification, are used in the second stage. After the first-stage neural network recognizes the major classification of the product in the image, the second-stage neural network corresponding to this recognition result recognizes the minor classification of the product.

As a conventional technique for image processing using machine learning models, a technique is known in which an appropriate machine learning model is selected from a plurality of machine learning models and information about the input image is estimated by the selected model (for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 2019-87229

However, when image recognition processing that uses multiple neural networks in multiple stages is executed on an embedded device, limited RAM (Random Access Memory) capacity may make it impossible to load the second-stage and subsequent neural networks until the first-stage neural network has output its recognition result. As a result, the execution time of the image recognition processing may become long.

An object of one embodiment of the present invention is to reduce the execution time of image recognition processing that uses a plurality of machine learning models in multiple stages.

To achieve the above object, a device according to one embodiment is a device that executes recognition processing implemented by arranging the inference processing of machine learning models in multiple stages, and includes: a recognition unit that obtains an m-th stage recognition result for input data through inference processing of the m-th stage machine learning model, using the parameters of the m-th stage machine learning model; an analysis unit that obtains the m-th stage recognition result for the input data through analysis processing of a lightweight analysis algorithm; an expansion unit that expands (loads) parameters of (m+1)-th stage machine learning models into memory in accordance with the m-th stage recognition result obtained by the analysis unit; and a selection unit that selects, from among the parameters of the (m+1)-th stage machine learning models expanded into memory by the expansion unit, the parameters of the (m+1)-th stage machine learning model corresponding to the m-th stage recognition result obtained by the recognition unit.

The execution time of image recognition processing that uses a plurality of machine learning models in multiple stages can be reduced.

FIG. 1 is a diagram showing an example of a time chart of conventional image recognition processing.
FIG. 2 is a diagram showing an example of RAM usage in conventional image recognition processing.
FIG. 3 is a diagram showing an example of a time chart of the image recognition processing according to this embodiment.
FIG. 4 is a diagram showing an example of RAM usage in the image recognition processing according to this embodiment.
FIG. 5 is a diagram showing an example of the hardware configuration of the embedded device according to this embodiment.
FIG. 6 is a diagram showing an example of the functional configuration of the embedded device according to this embodiment.
FIG. 7 is a diagram showing an example of a flowchart of the image recognition processing according to this embodiment.

An embodiment of the present invention is described below. This embodiment assumes neural networks as the machine learning models and describes a case in which an embedded device executes image recognition processing that uses a plurality of neural networks in multiple stages. However, a neural network is only one example of a machine learning model; all or some of the machine learning models used for the image recognition processing may be machine learning models other than neural networks.

An embedded device is a device or system that is built into, for example, industrial equipment or a home appliance and implements a specific function (in this embodiment, an image recognition function). Compared with a PC (personal computer) or the like, an embedded device generally has limited memory capacity such as RAM (that is, a small memory capacity).

However, this embodiment is not limited to the case in which an embedded device executes the image recognition processing; it is equally applicable when any device or apparatus (in particular, for example, a wearable device with limited memory capacity such as RAM) executes the image recognition processing.

<Conventional image recognition processing>
First, a case in which an embedded device executes conventional image recognition processing using a plurality of neural networks in multiple stages is described. In the following example, a neural network that recognizes the category representing the major classification of a product in the input image serves as the first stage, and a plurality of neural networks that each recognize the category representing the minor classification of products in a specific major classification serve as the second stage. The major and minor classifications of the product in the input image are recognized (identified) by image recognition processing made up of first-stage neural network processing and second-stage neural network processing. The image is assumed to contain only one product (object), so identifying the category of the object in the image is synonymous with identifying the category of the image.

Major classifications of products include, for example, "rice balls" (onigiri), "sweets", and "beverages". Minor classifications of the major classification "rice balls" include, for example, "tuna mayo", "ume (pickled plum)", and "okaka (bonito flakes)". Similarly, minor classifications of "sweets" include, for example, "chocolate", "snacks", and "gum", and minor classifications of "beverages" include, for example, "coffee", "juice", and "alcoholic drinks".

The second stage uses one neural network per major classification: for example, a neural network that recognizes the minor classifications of the major classification "rice balls", a neural network that recognizes the minor classifications of "sweets", a neural network that recognizes the minor classifications of "beverages", and so on.

In this case, as shown in FIG. 1, the first-stage neural network is loaded when the program that causes the embedded device to execute the image recognition processing starts, and the first-stage neural network processing is then executed with the image to be recognized as input. Next, of the plurality of neural networks used in the second stage, the one corresponding to the processing result of the first-stage neural network (that is, the major classification estimated by the first-stage neural network) is loaded, and the second-stage neural network processing is executed with the same image as input. This yields the processing result of the second-stage neural network (that is, the minor classification), so that both the major and minor classifications of the product in the image are recognized. Here, loading a neural network means expanding its parameters (for example, the weights and biases of fully connected layers, and the filter (kernel) parameters and biases of convolutional layers) onto RAM.
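The strictly sequential order of FIG. 1 can be sketched as follows. This is only an illustration: the `Model` type, the loader functions, and the toy category table are hypothetical stand-ins for real networks and are not part of the original description.

```python
from typing import Callable, Tuple

# Hypothetical: a "model" maps an image (here just a string) to a category name.
Model = Callable[[str], str]


def run_conventional(
    image: str,
    load_first: Callable[[], Model],
    load_second: Callable[[str], Model],
) -> Tuple[str, str]:
    """Run the four steps of the conventional flow one after another."""
    first = load_first()         # T11: load first-stage parameters into RAM
    major = first(image)         # T12: first-stage inference
    second = load_second(major)  # T21: cannot start before `major` is known
    minor = second(image)        # T22: second-stage inference
    return major, minor


# Toy stand-ins for real networks (assumptions for illustration only).
def toy_load_first() -> Model:
    return lambda image: "rice balls"


def toy_load_second(major: str) -> Model:
    table = {"rice balls": "tuna mayo", "beverages": "coffee"}
    return lambda image: table[major]


result = run_conventional("photo-of-onigiri", toy_load_first, toy_load_second)
```

Because `load_second` depends on `major`, the second load sits on the critical path, which is the delay the embodiment aims to remove.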

Therefore, letting T₁₁ be the load time of the first-stage neural network, T₁₂ the processing time of the first-stage neural network, T₂₁ the load time of the second-stage neural network, and T₂₂ the processing time of the second-stage neural network, the processing time (execution time) of the image recognition processing is T₁₁ + T₁₂ + T₂₁ + T₂₂.

Also, letting L₁ be the parameter size of the first-stage neural network and L₂ the parameter size of the second-stage neural network, the RAM usage is L₁ after the first-stage neural network is loaded and L₁ + L₂ after the second-stage neural network is loaded, as shown in FIG. 2. For simplicity, FIG. 2 shows only the RAM used by the neural networks.

As described above, when an embedded device executes conventional image recognition processing, only the second-stage neural network that corresponds to the processing result of the first-stage neural network is loaded. This is because loading all of the neural networks used in the second stage could push RAM usage beyond the upper limit. For example, the RAM capacity (that is, the upper limit) of an embedded device is generally about 1 GB to 4 GB. If the parameter size of the first-stage neural network is 300 MB, the parameter size of each second-stage neural network is 90 MB, and there are 60 major classifications (so that 60 neural networks are used in the second stage), loading everything would require 300 + 60 × 90 = 5700 MB of RAM, which exceeds the upper limit.
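The arithmetic in this example can be checked with a few lines of Python. The sizes are the ones given in the text; the function names and the conversion of 1 GB to 1024 MB are assumptions made only for this sketch.

```python
FIRST_STAGE_MB = 300   # parameter size of the first-stage network (from the text)
SECOND_STAGE_MB = 90   # parameter size of each second-stage network (from the text)
NUM_MAJOR = 60         # number of major classifications (from the text)


def ram_all_loaded_mb(first_mb: int, second_mb: int, n_second: int) -> int:
    """RAM needed if every second-stage network is loaded up front."""
    return first_mb + n_second * second_mb


def ram_conventional_mb(first_mb: int, second_mb: int) -> int:
    """RAM needed when only the one matching second-stage network is loaded."""
    return first_mb + second_mb


total_mb = ram_all_loaded_mb(FIRST_STAGE_MB, SECOND_STAGE_MB, NUM_MAJOR)
conventional_mb = ram_conventional_mb(FIRST_STAGE_MB, SECOND_STAGE_MB)

# Assumption: 1 GB is treated as 1024 MB when comparing against device capacity.
fits = {capacity_gb: total_mb <= capacity_gb * 1024 for capacity_gb in (1, 4)}
```

With these figures, loading everything fits in neither a 1 GB nor a 4 GB device, while the conventional one-at-a-time approach needs only 390 MB.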

<Image recognition processing according to this embodiment>
Next, a case in which an embedded device executes the image recognition processing according to this embodiment is described. As above, the example used is one in which a neural network that recognizes the category representing the major classification of a product in the input image serves as the first stage, a plurality of neural networks that each recognize the category representing the minor classification of products in a specific major classification serve as the second stage, and the major and minor classifications of the product in the input image are recognized (identified) by image recognition processing made up of first-stage neural network processing and second-stage neural network processing.

In this case, as shown in FIG. 3, when the program that causes the embedded device to execute the image recognition processing according to this embodiment starts, the loading of the first-stage neural network and the loading of the module (for example, a class, function, or program module) that executes a lightweight image analysis algorithm are executed in parallel. Here, a lightweight image analysis algorithm is an algorithm that analyzes the input image to recognize the object in it (in this embodiment, to identify the major classification of the product in the image) at a relatively low cost (that is, with low memory usage and a small amount of computation) compared with image processing using a neural network. Examples of lightweight image analysis algorithms include an algorithm that recognizes the first category (here, the major classification) by analyzing shape through template matching, and an algorithm that recognizes the first category by detecting size through corner detection. However, a lightweight image analysis algorithm generally has lower recognition accuracy than the first-stage neural network.

Next, with the image to be recognized as input, the first-stage neural network processing is executed in parallel with the lightweight image analysis algorithm processing and the subsequent loading of the second-stage neural network candidates. Here, the second-stage neural network candidates are those neural networks used in the second stage that correspond to the processing result (recognition result) of the lightweight image analysis algorithm. For example, if the recognition result of the lightweight image analysis algorithm is the major classifications "rice balls" and "beverages", there are two second-stage neural network candidates: the neural network corresponding to "rice balls" and the neural network corresponding to "beverages".

Then, from among the second-stage neural network candidates, the neural network corresponding to the processing result of the first-stage neural network (that is, the major classification estimated by the first-stage neural network) is selected as the one to execute in the second stage, and the processing of the selected neural network is executed with the same image as input. This yields the processing result of the second-stage neural network (that is, the minor classification), so that both the major and minor classifications of the product in the image are recognized.
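A minimal sketch of this overlapped flow, using Python's standard `concurrent.futures` thread pool, is shown below. All function names and toy models are hypothetical. The fallback that loads a network on demand when the first-stage result is not among the preloaded candidates is an assumption of this sketch; the passage above does not describe that case.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]  # hypothetical: image -> category name


def run_proposed(
    image: str,
    first: Model,
    analyze: Callable[[str], List[str]],
    load_second: Callable[[str], Model],
) -> Tuple[str, str]:
    def preload() -> Dict[str, Model]:
        # Lightweight analysis first, then load the candidates it suggests.
        return {major: load_second(major) for major in analyze(image)}

    with ThreadPoolExecutor(max_workers=2) as pool:
        major_future = pool.submit(first, image)  # first-stage inference
        candidates_future = pool.submit(preload)  # runs alongside it
        major = major_future.result()
        candidates = candidates_future.result()

    second = candidates.get(major)
    if second is None:
        # Assumption: on a miss, fall back to loading on demand.
        second = load_second(major)
    return major, second(image)


# Toy stand-ins (assumptions for illustration only).
loaded: List[str] = []


def toy_load_second(major: str) -> Model:
    loaded.append(major)
    table = {"rice balls": "tuna mayo", "beverages": "coffee"}
    return lambda image: table[major]


result = run_proposed(
    "photo-of-onigiri",
    first=lambda image: "rice balls",
    analyze=lambda image: ["rice balls", "beverages"],
    load_second=toy_load_second,
)
```

In CPython, threads do not speed up CPU-bound work, so this sketch only illustrates the ordering of the steps; a real device would overlap storage reads with inference on the processor or a dedicated unit.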

Therefore, letting T₁₁ be the load time of the first-stage neural network, S₁ (≤ T₁₁) the load time of the lightweight image analysis algorithm, T₁₂ + Δ₁ the processing time of the first-stage neural network, S₂ the processing time of the lightweight image analysis algorithm, T' the load time of the second-stage neural network candidates, and T₂₂ the processing time of the second-stage neural network, the execution time of the image recognition processing according to this embodiment is T₁₁ + (T₁₂ + Δ₁) + T₂₂, provided that T₁₂ + Δ₁ ≥ S₂ + T'. Here, Δ₁ is the additional time incurred by executing the first-stage neural network processing in parallel with other processing, and T' is the sum of the load times of all second-stage neural network candidates (including any additional time due to parallel execution).

In general, the additional time Δ₁ incurred by executing the first-stage neural network processing in parallel is very short compared with the load time of a neural network (that is, Δ₁ < T₂₁), so T₁₁ + (T₁₂ + Δ₁) + T₂₂ < T₁₁ + T₁₂ + T₂₁ + T₂₂. Thus the image recognition processing according to this embodiment has a shorter execution time than the conventional image recognition processing.
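The two execution-time expressions can be compared numerically. In the sketch below the formulas follow the text, while the millisecond values are purely hypothetical and chosen only so that the precondition T₁₂ + Δ₁ ≥ S₂ + T' holds.

```python
def conventional_time(t11: int, t12: int, t21: int, t22: int) -> int:
    """T11 + T12 + T21 + T22: every step runs one after another."""
    return t11 + t12 + t21 + t22


def proposed_time(t11: int, t12: int, delta1: int, t22: int,
                  s2: int, t_prime: int) -> int:
    """T11 + (T12 + delta1) + T22, valid only if T12 + delta1 >= S2 + T'."""
    if t12 + delta1 < s2 + t_prime:
        raise ValueError("precondition T12 + delta1 >= S2 + T' does not hold")
    return t11 + (t12 + delta1) + t22


# Hypothetical timings in milliseconds (not taken from the source).
T11, T12, T21, T22 = 800, 500, 300, 200
DELTA1, S2, T_PRIME = 20, 50, 450

before = conventional_time(T11, T12, T21, T22)
after = proposed_time(T11, T12, DELTA1, T22, S2, T_PRIME)
saving = before - after  # equals T21 - DELTA1
```

The saving is always T₂₁ − Δ₁, so the embodiment is faster exactly when the parallel-execution overhead is smaller than the second-stage load time it hides.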

Also, let L₁ be the parameter size of the first-stage neural network and K the size required to load the module that executes the lightweight image analysis algorithm, and suppose there are two second-stage neural network candidates with parameter sizes L₂₁ and L₂₂. Then, as shown in FIG. 4, the RAM usage is L₁ + K after the first-stage neural network is loaded and L₁ + K + L₂₁ + L₂₂ after the second-stage neural network candidates are loaded. Furthermore, once the processing result of the first-stage neural network is obtained, the parameters of every second-stage candidate other than the one corresponding to that result can be released from memory, so the RAM usage becomes L₁ + K + L₂. Here, L₂ is the parameter size of the candidate that corresponds to the processing result of the first-stage neural network (either L₂₁ or L₂₂ in the example of FIG. 4). For simplicity, FIG. 4 shows only the RAM used by the neural networks and the lightweight image analysis algorithm.
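The three RAM levels of FIG. 4 can be reproduced with a small helper. The 300 MB and 90 MB sizes reuse the earlier example in the text; the 10 MB value for K and the function name are assumptions made for illustration.

```python
from typing import Dict, Tuple


def ram_timeline(l1: int, k: int, candidate_sizes: Dict[str, int],
                 chosen: str) -> Tuple[int, int, int]:
    """Return RAM usage at the three points described for FIG. 4."""
    after_first_load = l1 + k                                  # L1 + K
    after_candidates = after_first_load + sum(candidate_sizes.values())
    after_release = after_first_load + candidate_sizes[chosen]  # L1 + K + L2
    return after_first_load, after_candidates, after_release


# K = 10 MB is a hypothetical size for the lightweight analysis module.
timeline = ram_timeline(
    l1=300,
    k=10,
    candidate_sizes={"rice balls": 90, "beverages": 90},
    chosen="rice balls",
)
```

The peak is the middle value, reached while all candidates are resident; it drops again as soon as the unneeded candidates are released.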

Thus, although the image recognition processing according to this embodiment uses more RAM than the conventional image recognition processing, it can reduce the execution time without exceeding the RAM capacity of the embedded device, provided the number of categories output by the lightweight image analysis algorithm is appropriate. The appropriate number of categories depends on the RAM capacity of the embedded device, but may be, for example, a few categories.
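One way to make "appropriate" concrete is to compute how many candidates fit in the available RAM. The sketch below is an illustration under simplifying assumptions not stated in the source: every second-stage network has the same size, and memory used by anything other than the networks and the analysis module is ignored, as in the figures.

```python
def max_candidates(ram_capacity_mb: int, l1_mb: int, k_mb: int, l2_mb: int) -> int:
    """Largest number of equally sized second-stage candidates that fit in RAM."""
    free_mb = ram_capacity_mb - l1_mb - k_mb
    return max(0, free_mb // l2_mb)


# 1 GB device (taken as 1024 MB), 300 MB first stage, 90 MB per second-stage
# network as in the text; the 10 MB analysis module is a hypothetical value.
k_small_device = max_candidates(1024, 300, 10, 90)
k_too_small = max_candidates(256, 300, 10, 90)
```

Under these assumptions a 1 GB device has room for seven candidates, which is consistent with the "a few categories" guidance above.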

In the following, the embedded device that executes the image recognition processing according to this embodiment is referred to as the "embedded device 10" and is described in detail. As described above, the image recognition processing executed by the embedded device 10 is made up of lightweight image analysis algorithm processing, first-stage neural network processing, and second-stage neural network processing. However, this embodiment is equally applicable to image recognition processing that includes neural network processing in a third or later stage.

<Hardware configuration of embedded device 10>
Next, the hardware configuration of the embedded device 10 according to this embodiment is described with reference to FIG. 5. FIG. 5 is a diagram showing an example of the hardware configuration of the embedded device 10 according to this embodiment.

As shown in FIG. 5, the embedded device 10 according to this embodiment has an input/output I/F 11, a ROM (Read Only Memory) 12, a RAM 13, and a processor 14. These hardware components are communicably connected to one another via a bus 15.

The input/output I/F 11 is an input/output interface to other devices or apparatuses (for example, the device or apparatus into which the embedded device 10 is built). Through the input/output I/F 11, the embedded device 10 can exchange various data (for example, images) with other devices or apparatuses.

The ROM 12 is a non-volatile storage device that stores various data (for example, the image recognition program 100 described later and the neural network parameters). The RAM 13 is a volatile storage device that temporarily holds various data (for example, neural network parameters and the module that executes the lightweight image analysis algorithm), and is used as the main storage device of the embedded device 10. The processor 14 is an arithmetic unit such as an MPU (Micro Processing Unit) or a CPU (Central Processing Unit).

With the hardware configuration shown in FIG. 5, the embedded device 10 according to this embodiment can implement the image recognition processing described later. The hardware configuration shown in FIG. 5 is only an example, and the embedded device 10 may have a different hardware configuration. For example, the embedded device 10 may have an auxiliary storage device such as flash memory, or a dedicated processing unit such as a GPU (Graphics Processing Unit).

<Functional configuration of embedded device 10>
Next, the functional configuration of the embedded device 10 according to this embodiment is described with reference to FIG. 6. FIG. 6 is a diagram showing an example of the functional configuration of the embedded device 10 according to this embodiment.

As shown in FIG. 6, the embedded device 10 according to this embodiment has an expansion unit 101, a first recognition unit 102, a plurality of second recognition units 103, an image analysis unit (lightweight algorithm) 104, and a selection unit 105. Each of these units is implemented, for example, by processing that the image recognition program 100 stored in the ROM 12 causes the processor 14 to execute.

The embedded device 10 according to this embodiment also has a parameter storage unit 200, which is implemented by, for example, the ROM 12.

The parameter storage unit 200 stores the parameters of each neural network (for example, the weights and biases of fully connected layers, and the filter (kernel) parameters and biases of convolutional layers). These parameters are assumed to have been trained in advance (that is, their values have been adjusted so that the desired image recognition processing can be performed).

The expansion unit 101 expands (loads) neural network parameters, the module that executes the lightweight image analysis algorithm, and the like onto the RAM 13. The expansion unit 101 also releases (unloads) neural network parameters and the like that are no longer needed for the image recognition processing.
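The load and unload roles of the expansion unit can be pictured with a small class. This is a hedged sketch only: the class name, its methods, and the use of dictionaries to stand in for the parameter storage unit 200 and the RAM 13 are all assumptions, and byte counts stand in for megabytes.

```python
from typing import Dict, Iterable, List


class ExpansionUnit:
    """Hypothetical model of expansion unit 101: loads and releases parameters."""

    def __init__(self, parameter_store: Dict[str, bytes]) -> None:
        self._store = parameter_store     # stands in for parameter storage unit 200
        self._ram: Dict[str, bytes] = {}  # stands in for RAM 13

    def load(self, name: str) -> None:
        """Expand the named parameters from storage into RAM."""
        self._ram[name] = bytes(self._store[name])

    def unload(self, name: str) -> None:
        """Release parameters that are no longer needed."""
        self._ram.pop(name, None)

    def keep_only(self, names: Iterable[str]) -> None:
        """Release everything except the named entries."""
        keep = set(names)
        for name in list(self._ram):
            if name not in keep:
                self.unload(name)

    def loaded(self) -> List[str]:
        return sorted(self._ram)

    def ram_usage(self) -> int:
        return sum(len(blob) for blob in self._ram.values())


# Toy parameter blobs: one byte here represents one megabyte in the text's example.
store = {"stage1": b"\0" * 300, "rice balls": b"\0" * 90, "beverages": b"\0" * 90}
unit = ExpansionUnit(store)
for name in ("stage1", "rice balls", "beverages"):
    unit.load(name)
peak = unit.ram_usage()
unit.keep_only({"stage1", "rice balls"})
steady = unit.ram_usage()
```

Releasing the unused candidate brings usage back down from the peak to the level of one first-stage and one second-stage network.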

The first recognition unit 102 is implemented by the first-stage neural network and uses the parameters of that neural network to recognize the first category of the image input as the recognition target (the input image); for example, it recognizes the category representing the major classification of the product in the image. That is, the first recognition unit 102 executes the inference processing of the neural network using its trained parameters and recognizes the first category of the input image.

The second recognition unit 103 is implemented by a second-stage neural network and uses the parameters of that neural network to recognize the second category of the input image (for example, the category representing the minor classification of the product in the image). That is, the second recognition unit 103 executes the inference processing of the neural network using its trained parameters and recognizes the second category of the input image. One second recognition unit 103 exists for each first category (that is, one second-stage neural network exists for each first category). Hereinafter, with N denoting the number of first categories, the individual second recognition units 103 are written as "second recognition unit 103-1", ..., "second recognition unit 103-N" when they need to be distinguished.

The image analysis unit 104 analyzes the input image with a lightweight image analysis algorithm and recognizes its first category (that is, a category representing the major classification of a product in the image). The image analysis unit 104 outputs as its recognition result, for example, a preset number of top-ranked first categories (possibly just one), taken in descending order of recognition accuracy.
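The top-ranked output described above can be illustrated with a minimal sketch. The function name `top_k_categories`, the score dictionary, and the category labels are hypothetical; the specification does not say how the lightweight algorithm scores categories, so per-category scores are simply assumed to be available.

```python
import heapq


def top_k_categories(scores, k=3):
    """Return the k first-category labels with the highest scores, best first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # heapq.nlargest returns (label, score) pairs sorted by descending score.
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [label for label, _ in best]


# Hypothetical scores from a lightweight analysis (assumed, not from the source).
scores = {"beverage": 0.61, "snack": 0.27, "noodles": 0.08, "dairy": 0.04}
print(top_k_categories(scores, k=2))  # prints ['beverage', 'snack']
```

Setting `k=1` corresponds to the case in which only a single first category is output.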

The selection unit 105 selects one of the plurality of second recognition units 103 according to the recognition result of the first recognition unit 102. That is, from among the plurality of neural networks used in the second stage, the selection unit 105 selects the neural network corresponding to the processing result of the first-stage neural network as the neural network to be executed in the second stage.

The example shown in FIG. 6 is a case in which the embedded device 10 has the first recognition unit 102 and the second recognition units 103; if, for example, the image recognition process also includes processing by a third-stage neural network, the embedded device 10 has one or more third recognition units. More generally, if the image recognition process includes processing by an m-th stage neural network for m = 1, ..., M, the embedded device 10 has one or more m-th recognition units. The distinction between the first to M-th recognition units is only for convenience; for example, the first to M-th recognition units may all be referred to simply as "recognition units".

<Flow of image recognition processing according to this embodiment>
Next, the flow of the image recognition process executed by the embedded device 10 according to the present embodiment will be described with reference to FIG. 7. FIG. 7 is a diagram showing an example of a flowchart of the image recognition process according to the present embodiment. Note that steps S101 to S103 and steps S201 to S204 in FIG. 7 are executed in parallel.

The expansion unit 101 loads the parameters of the first-stage neural network onto the RAM 13 (step S101). The expansion unit 101 also loads a module or the like that executes the lightweight image analysis algorithm onto the RAM 13 (step S201).

Following step S101, the first recognition unit 102 receives the image to be recognized as input (step S102). Likewise, following step S201, the image analysis unit 104 receives as input the same image that was input to the first recognition unit 102 in step S102 (step S202).

Following step S102, the first recognition unit 102 recognizes the first category of the input image using the parameters of the first-stage neural network (that is, it recognizes a category representing the major classification of a product in the image) (step S103). Likewise, following step S202, the image analysis unit 104 recognizes the first category of the input image with the lightweight image analysis algorithm (that is, it recognizes a category representing the major classification of a product in the image) (step S203).

In step S203 above, the image analysis unit 104 may recognize the first category of the input image with each of a plurality of types of lightweight image analysis algorithms. Executing a plurality of types of lightweight image analysis algorithms, however, lengthens the processing time of steps S203 to S204; for this not to affect the execution time of the image recognition process as a whole, the processing of step S204 must be completed before the processing of step S103 ends.
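This timing condition can be written as an inequality. The symbols are not in the original and are introduced only for illustration: t_S denotes the processing time of step S, K the number of lightweight image analysis algorithms, and t_S203^(k) the time taken by the k-th algorithm. It is assumed that both branches start at the same moment and that the K algorithms run one after another.

```latex
\[
  t_{\mathrm{S201}} + t_{\mathrm{S202}}
  + \sum_{k=1}^{K} t_{\mathrm{S203}}^{(k)}
  + t_{\mathrm{S204}}
  \;\le\;
  t_{\mathrm{S101}} + t_{\mathrm{S102}} + t_{\mathrm{S103}}
\]
```

Under these assumptions, adding a further algorithm is free in terms of total execution time only while the left-hand side stays within the right-hand side.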

Then, the expansion unit 101 loads onto the RAM 13, from among the neural networks used in the second stage, the parameters of the neural network corresponding to the first category recognized in step S203 above (step S204). If a plurality of first categories were recognized in step S203 (for example, if the top several first categories in descending order of recognition accuracy were recognized), the parameters of the neural networks corresponding to each of these first categories are all loaded onto the RAM 13.

Following steps S103 and S204, the selection unit 105 selects, from among the plurality of second recognition units 103, the second recognition unit 103 corresponding to the first category recognized in step S103 above (step S301). That is, the selection unit 105 selects, from among the plurality of second recognition units 103, the second recognition unit 103 realized by the neural network that recognizes the minor classifications (second categories) belonging to the major classification represented by that first category.

Next, the expansion unit 101 unloads from the RAM 13 the parameters of every second-stage neural network other than the one that realizes the second recognition unit 103 selected in step S301 above (step S302).
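Steps S301 and S302 can be sketched as follows, assuming the preloaded second-stage parameters are held in a dictionary keyed by first category. The dictionary layout and the name `select_and_unload` are hypothetical, and deleting a dictionary entry merely stands in for whatever unloading from the RAM 13 means on the actual device.

```python
def select_and_unload(loaded, selected_category):
    """Keep only the selected second-stage parameters and drop the rest."""
    if selected_category not in loaded:
        raise KeyError(f"parameters for {selected_category!r} are not loaded")
    # Iterate over a copy of the keys because the dict is modified in the loop.
    for category in list(loaded):
        if category != selected_category:
            del loaded[category]  # stands in for unloading from RAM (step S302)
    return loaded[selected_category]  # parameters selected in step S301


# Hypothetical preloaded parameters (plain lists stand in for real weights).
loaded = {"beverage": [0.1, 0.2], "snack": [0.3, 0.4], "dairy": [0.5, 0.6]}
params = select_and_unload(loaded, "snack")
print(params, sorted(loaded))  # prints [0.3, 0.4] ['snack']
```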

Subsequently, the second recognition unit 103 selected in step S301 above receives as input the same image that was input to the first recognition unit 102 in step S102 above (step S303). This second recognition unit 103 then recognizes the second category of the input image using the parameters of the second-stage neural network that realizes it (that is, it recognizes a category representing the minor classification of a product in the image) (step S304).

As described above, the embedded device 10 according to the present embodiment executes the processing of the first-stage neural network in parallel with the estimation processing by the lightweight image analysis algorithm and the loading of the second-stage neural network corresponding to that estimation result. As a result, in the embedded device 10 according to the present embodiment, the second-stage neural network is already loaded on the RAM 13 by the time the estimation result of the first-stage neural network is obtained, which makes it possible to reduce the execution time of the image recognition process as a whole.
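The parallel structure summarized above can be sketched with two worker threads. All function names, return values, and sleep durations below are hypothetical stand-ins: the sleeps imitate the first-stage inference and the parameter load, and the lightweight analysis simply returns fixed candidates.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stage1_inference(image):
    time.sleep(0.05)  # stands in for the heavier first-stage network (S103)
    return "beverage"


def lightweight_analysis(image):
    return ["beverage", "snack"]  # assumed top candidates (S203)


def load_stage2_params(category):
    time.sleep(0.01)  # stands in for reading parameters into RAM (S204)
    return {"category": category, "weights": [0.0]}


def prefetch(image):
    """Steps S203 and S204: analyze, then load one parameter set per candidate."""
    return {c: load_stage2_params(c) for c in lightweight_analysis(image)}


def recognize_first_stage(image):
    with ThreadPoolExecutor(max_workers=2) as pool:
        stage1 = pool.submit(stage1_inference, image)  # branch S101 to S103
        loaded = pool.submit(prefetch, image)          # branch S201 to S204
        category = stage1.result()
        params = loaded.result()
    # None signals that the prefetch missed the first-stage result.
    return category, params.get(category)


category, stage2_params = recognize_first_stage("image.png")
print(category, stage2_params is not None)  # prints beverage True
```

In CPython, threads mainly overlap I/O-bound work such as reading parameters from storage; overlapping CPU-bound inference would need a runtime that releases the interpreter lock, or separate processes.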

In the present embodiment, a case in which the image recognition process is performed by the processing of a first-stage neural network and the processing of a second-stage neural network has been described as an example, but the same approach is applicable when the image recognition process also includes processing by neural networks of the third and subsequent stages. In that case, in general, the estimation processing by the lightweight image analysis algorithm and the loading of the (m+1)-th stage neural network corresponding to that estimation result are executed while the m-th stage neural network is being processed, and the (m+1)-th stage processing is then executed with the neural network corresponding to the estimation result of the m-th stage neural network.
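The generalization to M stages can be sketched as a loop in which the prefetch for stage m+1 runs while stage m is being inferred. The callables `infer`, `analyze`, and `load`, their signatures, and the toy category names are all assumptions made for this illustration; the specification does not define such interfaces.

```python
from concurrent.futures import ThreadPoolExecutor


def run_cascade(image, first_params, infer, analyze, load, num_stages):
    """Run num_stages stages, prefetching stage m+1 parameters during stage m."""

    def prefetch(stage):
        # Lightweight analysis guesses the stage result; load matching models.
        return {c: load(stage + 1, c) for c in analyze(stage, image)}

    params, results = first_params, []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for m in range(1, num_stages + 1):
            future = pool.submit(prefetch, m) if m < num_stages else None
            result = infer(m, image, params)  # m-th stage inference
            results.append(result)
            if future is not None:
                loaded = future.result()
                # Use the prefetched parameters on a hit, load on demand on a miss.
                params = loaded[result] if result in loaded else load(m + 1, result)
    return results


# Toy stand-ins: each parameter set simply stores the answer it will produce.
def infer(stage, image, params):
    return params["answer"]


def analyze(stage, image):
    return {1: ["beverage", "snack"], 2: ["tea"]}[stage]


def load(stage, category):
    answers = {(2, "beverage"): "tea", (2, "snack"): "chips", (3, "tea"): "green tea"}
    return {"answer": answers[(stage, category)]}


print(run_cascade("image.png", {"answer": "beverage"}, infer, analyze, load, 3))
# prints ['beverage', 'tea', 'green tea']
```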

However, depending on the recognition accuracy of the lightweight image analysis algorithm, the (m+1)-th stage neural network corresponding to the estimation result of the m-th stage neural network may not have been loaded. In that case the required (m+1)-th stage neural network can simply be loaded at that point, but the reduction in the execution time of the image recognition process as a whole is then lost. For this reason, the lightweight image analysis algorithm preferably has at least enough recognition accuracy that the estimation result of the m-th stage neural network is included among its top several estimation results.
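One way to check this accuracy requirement is to measure, on validation images, how often the neural network's result appears among the lightweight algorithm's top candidates. The function name and the sample data below are hypothetical, and the specification gives no numerical threshold, so none is assumed here.

```python
def prefetch_hit_rate(stage_results, candidate_lists):
    """Fraction of samples whose network result is among the prefetched candidates."""
    if len(stage_results) != len(candidate_lists):
        raise ValueError("both sequences must have the same length")
    if not stage_results:
        raise ValueError("at least one sample is required")
    hits = sum(1 for result, candidates in zip(stage_results, candidate_lists)
               if result in candidates)
    return hits / len(stage_results)


# Hypothetical validation data: network results and top-2 lightweight candidates.
stage_results = ["beverage", "snack", "dairy", "snack"]
candidate_lists = [
    ["beverage", "snack"],
    ["snack", "noodles"],
    ["beverage", "snack"],  # miss: "dairy" was not prefetched
    ["snack", "dairy"],
]
print(prefetch_hit_rate(stage_results, candidate_lists))  # prints 0.75
```

Each miss corresponds to an image for which the next-stage parameters have to be loaded after the inference has finished.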

The present invention is not limited to the embodiment specifically disclosed above, and various modifications, changes, combinations with known techniques, and the like are possible without departing from the scope of the claims.

10 Embedded device
11 Input/output I/F
12 ROM
13 RAM
14 Processor
15 Bus
100 Image recognition program
101 Expansion unit
102 First recognition unit
103 Second recognition unit
104 Image analysis unit
105 Selection unit
200 Parameter storage unit

Claims

A device that executes recognition processing realized by configuring inference processing of machine learning models in multiple stages, the device comprising:
a recognition unit that obtains an m-th stage recognition result for input data by inference processing of an m-th stage machine learning model, using parameters of the m-th stage machine learning model;
an analysis unit that obtains an m-th stage recognition result for the input data by analysis processing of a lightweight analysis algorithm;
an expansion unit that expands parameters of (m+1)-th stage machine learning models on a memory in accordance with the m-th stage recognition result obtained by the analysis unit; and
a selection unit that selects, from among the parameters of the (m+1)-th stage machine learning models expanded on the memory by the expansion unit, the parameters of the (m+1)-th stage machine learning model corresponding to the m-th stage recognition result obtained by the recognition unit.

The device according to claim 1, wherein the analysis unit executes the analysis processing by the lightweight analysis algorithm in parallel with the inference processing of the m-th stage machine learning model by the recognition unit, to obtain the m-th stage recognition result for the input data.

The device according to claim 1 or 2, wherein the expansion unit expands the parameters of the (m+1)-th stage machine learning models on the memory in parallel with the inference processing of the m-th stage machine learning model by the recognition unit.

The device according to any one of claims 1 to 3, wherein the recognition unit obtains an (m+1)-th stage recognition result for the input data by inference processing of the (m+1)-th stage machine learning model, using the parameters of the (m+1)-th stage machine learning model selected by the selection unit.

The device according to any one of claims 1 to 4, wherein the expansion unit deletes from the memory, among the parameters of the (m+1)-th stage machine learning models expanded on the memory, the parameters other than those of the (m+1)-th stage machine learning model selected by the selection unit.

The device according to any one of claims 1 to 5, wherein the machine learning model is a neural network, the input data is image data, and the lightweight analysis algorithm is an algorithm that analyzes the image data with lower memory usage and a lower amount of computation than the neural network to obtain the m-th stage recognition result.

A method in which a device that executes recognition processing realized by configuring inference processing of machine learning models in multiple stages executes:
a recognition procedure of obtaining an m-th stage recognition result for input data by inference processing of an m-th stage machine learning model, using parameters of the m-th stage machine learning model;
an analysis procedure of obtaining an m-th stage recognition result for the input data by analysis processing of a lightweight analysis algorithm;
an expansion procedure of expanding parameters of (m+1)-th stage machine learning models on a memory in accordance with the m-th stage recognition result obtained by the analysis procedure; and
a selection procedure of selecting, from among the parameters of the (m+1)-th stage machine learning models expanded on the memory by the expansion procedure, the parameters of the (m+1)-th stage machine learning model corresponding to the m-th stage recognition result obtained by the recognition procedure.

A program that causes a device that executes recognition processing realized by configuring inference processing of machine learning models in multiple stages to execute:
a recognition procedure of obtaining an m-th stage recognition result for input data by inference processing of an m-th stage machine learning model, using parameters of the m-th stage machine learning model;
an analysis procedure of obtaining an m-th stage recognition result for the input data by analysis processing of a lightweight analysis algorithm;
an expansion procedure of expanding parameters of (m+1)-th stage machine learning models on a memory in accordance with the m-th stage recognition result obtained by the analysis procedure; and
a selection procedure of selecting, from among the parameters of the (m+1)-th stage machine learning models expanded on the memory by the expansion procedure, the parameters of the (m+1)-th stage machine learning model corresponding to the m-th stage recognition result obtained by the recognition procedure.