JP2023080908A

JP2023080908A - Information processing apparatus and control method of information processing apparatus

Info

Publication number: JP2023080908A
Application number: JP2021194472A
Authority: JP
Inventors: 心高木; Shin Takagi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-11-30
Filing date: 2021-11-30
Publication date: 2023-06-09

Abstract

To provide an information processing apparatus that can reduce the amount of calculation with little effect on inference accuracy when identifying an object in a moving image.

SOLUTION: The information processing apparatus includes acquisition means that acquires successive frames of a moving image, and identification means that identifies an object from each frame of the moving image by using a calculator including a plurality of hierarchies. In the calculator, a degree of influence, which is the sum of products of the outputs from the individual nodes of a first hierarchy to a prescribed node of a second hierarchy and the weights corresponding to those outputs, is calculated and input to the prescribed node. Based on a first degree of influence input to the prescribed node for a first frame and a partial result of the calculation of a second degree of influence to be input to the prescribed node for a second frame after the first frame, the identification means determines whether to interrupt the calculation of the second degree of influence.

SELECTED DRAWING: Figure 10

Description

The present invention relates to an information processing apparatus and a control method for the information processing apparatus.

In recent years, research on training and inference with deep learning, and its application to the real world, have advanced, and large-scale neural networks have come into use. Multi-class classifiers capable of identifying about 1000 types of objects have also been realized.

For example, a multi-class classifier that can identify about 1000 types of objects calculates the probability that each object is present in an image. In the real world, however, the number of object types present in a single image at one time is generally limited. As the number of identifiable object types increases, the share of the classifier's total computation that is spent on objects not present in the image grows, so identifying the objects in the image becomes less efficient.

As a method for reducing the amount of computation, Patent Document 1, for example, discloses a method that distinguishes useful information from useless information in the later layers of a neural network and replaces the feature values or weights of the portions judged useless with zero.

Patent Document 1: JP 2019-200648 A

Non-Patent Document 1: S. Haykin, "Neural Networks: A Comprehensive Foundation", 2nd Edition, Prentice Hall, pp. 156-255, July 1998

The technique described in Patent Document 1 does not take correlation in the time direction into account, so it cannot be expected to sufficiently reduce the amount of computation for temporally continuous images such as moving images. In addition, even if the amount of computation is reduced in the later stages of the neural network, when a multi-class classifier is used this cannot be expected to suppress the growth in computation that accompanies an increase in the number of object types.

Accordingly, an object of the present invention is to provide an information processing apparatus capable of reducing the amount of computation while suppressing the effect on inference accuracy in processing that identifies objects in a moving image.

The information processing apparatus of the present invention comprises:
acquisition means for acquiring successive frames of a moving image; and
identification means for identifying an object from each frame of the moving image by using a calculator that includes a plurality of hierarchies, the calculator calculating a degree of influence, which is the sum of products of the outputs from the nodes included in a first hierarchy to a predetermined node of a second hierarchy and the weights corresponding to the respective outputs, and supplying the calculated degree of influence as an input to the predetermined node,
wherein the identification means determines whether to interrupt the calculation of a second degree of influence, which is to be input to the predetermined node for a second frame after a first frame, based on a first degree of influence input to the predetermined node for the first frame and on a partial result of the calculation of the second degree of influence.

According to the present invention, the amount of computation can be reduced while suppressing the effect on inference accuracy in processing that identifies objects in a moving image.

FIG. 1 is a block diagram illustrating the configuration of an information processing apparatus.
FIG. 2 is a diagram showing an example structure of a neural network that identifies subject classes.
FIG. 3 is a diagram explaining the feature detection processing and feature integration processing of a CNN.
FIG. 4 is a schematic diagram of the model structure of a neural network.
FIG. 5 is a flowchart illustrating inference processing using an inference model.
FIG. 6 is a flowchart illustrating the inference operation for the image of the first frame.
FIG. 7 is a diagram explaining the computing units used at the nodes of the inference model.
FIG. 8 is a diagram illustrating the configuration of a product-sum calculator.
FIG. 9 is a graph showing an example of subject class identification results.
FIG. 10 is a flowchart illustrating the inference operation for frames other than the first frame.
FIG. 11 is a graph showing an example of changes in the degree of influence in Modification 1.

<Embodiment>
Embodiments of the present invention are described below with reference to the drawings. However, the present invention is not limited to the following embodiments. In the present embodiment, the information processing apparatus suppresses wasted computation when a neural network model is used for multi-class classification.

FIG. 1 is a block diagram illustrating the configuration of an information processing apparatus 100. The information processing apparatus 100 includes a CPU 101, a ROM 102, a memory 103, an image acquisition unit 104, and an identification unit 105. The components of the information processing apparatus 100 are connected to a system bus 110 and can exchange data with one another via the system bus 110.

The CPU 101 controls the image acquisition unit 104 and the identification unit 105 by reading programs stored in the ROM 102 into the memory 103 and executing them.

The ROM 102 stores various programs and the like with which the CPU 101 operates. These programs need not be stored in the ROM 102; they may be stored on a hard disk, for example.

The memory 103 is, for example, a RAM, and is used as a work memory when the CPU 101 executes the various programs. The CPU 101 can read a program stored in the ROM 102 into the memory 103 and execute it.

The image acquisition unit 104 acquires successive frame images of a moving image in which an object (subject) is to be identified. The image acquisition unit 104 can acquire the moving image from, for example, an imaging unit (not shown) included in the information processing apparatus 100. The image acquisition unit 104 may also acquire the moving image from an imaging apparatus connected to the information processing apparatus 100.

The identification unit 105 identifies (detects) an object in a frame image acquired by the image acquisition unit 104, using a neural network inference model (trained model). Based on the computation result for the previous frame and a partial computation result for the current frame, the identification unit 105 determines whether to interrupt the computation for the current frame. Because the computation that would follow the interruption is not executed, the amount of computation needed to identify the object is reduced.
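The idea of interrupting a product-sum partway through can be sketched as follows. This is only an illustration: the stopping rule used here (both the previous frame's influence and the current partial sum lie below a threshold) and the names `check_after` and `low_threshold` are assumptions made for the example and are not taken from this description.

```python
from typing import Sequence, Tuple


def influence_with_early_stop(
    outputs: Sequence[float],
    weights: Sequence[float],
    prev_influence: float,
    check_after: int,
    low_threshold: float,
) -> Tuple[float, int]:
    """Accumulate sum(x_i * w_i) for one node, optionally stopping early.

    Hypothetical rule (an assumption for illustration): if the influence
    stored for the previous frame was below `low_threshold` and the partial
    sum after `check_after` terms is also below it, the remaining terms are
    skipped and the previous influence is reused.

    Returns (influence, number_of_multiplications_performed).
    """
    partial = 0.0
    for count, (x, w) in enumerate(zip(outputs, weights), start=1):
        partial += x * w
        if (
            count == check_after
            and prev_influence < low_threshold
            and partial < low_threshold
        ):
            return prev_influence, count  # interrupted
    return partial, min(len(outputs), len(weights))  # completed
```

With such a rule, a node that was clearly inactive in the previous frame costs only `check_after` multiplications in the current frame instead of the full number of incoming edges.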

The basic configuration of a neural network inference model is described with reference to FIGS. 2 and 3, taking a CNN (Convolutional Neural Network) as an example. FIG. 2 shows the basic configuration of a CNN that identifies subject classes from an input image, which is two-dimensional image data.

A CNN includes a plurality of hierarchical levels, and each level includes two layers called a feature detection layer (S layer) and a feature integration layer (C layer). In the example of FIG. 2, an input image supplied to the CNN is processed in order from the first level to the X-th level.

In the CNN, the S layer first detects features of the input image based on the features detected at the preceding level. The features detected in the S layer are then integrated in the C layer and passed to the next level as the detection result of the current level.

The S layer includes a plurality of feature detection cell planes, each of which detects a different feature. The C layer includes a plurality of feature integration cell planes and pools the detection results of the feature detection cell planes of the S layer. In the example of FIG. 2, the output layer (X-th level), which is the final level, consists of an S layer only, with no C layer. Feature detection cell planes and feature integration cell planes are also referred to collectively as cell planes.

The feature detection processing in the feature detection cell planes and the feature integration processing in the feature integration cell planes are described in detail with reference to FIG. 3. In FIG. 3, each rectangle represents a cell plane. The S layer of the L-th level includes a plurality of feature detection cell planes, and the C layers of the (L-1)-th level and the L-th level each include a plurality of feature integration cell planes.

A feature detection cell plane consists of a plurality of feature detection neurons, and each feature detection neuron is connected in a predetermined structure to the C layer of the preceding level. A feature integration cell plane consists of a plurality of feature integration neurons, and each feature integration neuron is connected in a predetermined structure to the S layer of the same level.

Let $y_m^{LS}(\xi,\zeta)$ denote the output value of the feature detection neuron at position $(\xi,\zeta)$ in the $m$-th cell plane of the S layer of the $L$-th level, and let $y_m^{LC}(\xi,\zeta)$ denote the output value of the feature integration neuron at position $(\xi,\zeta)$ in the $m$-th cell plane of the C layer of the $L$-th level. With the coupling coefficients of the respective neurons written as $w_m^{LS}(n,u,v)$ and $w_m^{LC}(u,v)$, the output values can be expressed as follows.

$$y_m^{LS}(\xi,\zeta) = f\left(u_m^{LS}(\xi,\zeta)\right) = f\left(\sum_{n,u,v} w_m^{LS}(n,u,v)\, y_n^{(L-1)C}(\xi+u,\zeta+v)\right) \tag{1}$$

$$y_m^{LC}(\xi,\zeta) = u_m^{LC}(\xi,\zeta) = \sum_{u,v} w_m^{LC}(u,v)\, y_m^{LS}(\xi+u,\zeta+v) \tag{2}$$

In Equation 1, $f$ is an activation function. Any sigmoid function, such as a logistic function or a hyperbolic tangent function, may be used; it is implemented by the tanh function, for example. $u_m^{LS}(\xi,\zeta)$ is the internal state of the feature detection neuron at position $(\xi,\zeta)$ in the $m$-th cell plane of the S layer of the $L$-th level. The output value of the feature detection neuron given by Equation 1 is calculated by transforming the internal state $u_m^{LS}(\xi,\zeta)$ with the activation function $f$.

The output value of the feature integration neuron given by Equation 2 is calculated, without an activation function, as a simple linear sum of the coupling coefficients $w_m^{LC}(u,v)$ and the output values of the $m$-th cell plane of the S layer of the $L$-th level. Because no activation function is used, the internal state $u_m^{LC}(\xi,\zeta)$ of the feature integration neuron equals its output value $y_m^{LC}(\xi,\zeta)$. In addition, $y_n^{(L-1)C}(\xi+u,\zeta+v)$ in Equation 1 and $y_m^{LS}(\xi+u,\zeta+v)$ in Equation 2 are the output values of the feature integration neuron and the feature detection neuron at the connection destination, respectively, and are referred to as connection-destination output values.

The symbols $\xi$, $\zeta$, $u$, $v$, and $n$ in Equations 1 and 2 are explained next. The position $(\xi,\zeta)$ corresponds to position coordinates in the input image. When $y_m^{LS}(\xi,\zeta)$ is higher than the output values of the feature detection neurons at other positions, the feature detected by the $m$-th cell plane of the S layer of the $L$-th level is likely to be present at pixel position $(\xi,\zeta)$ of the input image.

In Equation 1, $n$ denotes the $n$-th cell plane of the C layer of the (L-1)-th level and is referred to as the integration-destination feature number. The internal state $u_m^{LS}(\xi,\zeta)$ of the feature detection neuron is calculated by a product-sum operation, over every cell plane in the C layer of the (L-1)-th level, of the coupling coefficients $w_m^{LS}(n,u,v)$ and the connection-destination output values $y_n^{(L-1)C}(\xi+u,\zeta+v)$ of the feature integration neurons.

$(u,v)$ are the relative position coordinates of a coupling coefficient, and the product-sum operation is performed over a finite range of $(u,v)$ that depends on the size of the feature to be detected. This finite range of $(u,v)$ is called the receptive field. The size of the receptive field is expressed as the number of horizontal pixels × the number of vertical pixels in the connected range, and is hereinafter referred to as the receptive field size.

In Equation 1, for $L = 1$, that is, for the very first S layer, $y_n^{(L-1)C}(\xi+u,\zeta+v)$ is the input image $y^{in\_image}(\xi+u,\zeta+v)$ or the input position map $y^{in\_posi\_map}(\xi+u,\zeta+v)$. The distributions of neurons and pixels are discrete, and the connection-destination feature number is also discrete. Therefore $\xi$, $\zeta$, $u$, $v$, and $n$ take discrete values rather than being continuous variables. Here, $\xi$ and $\zeta$ are non-negative integers, $n$ is a natural number, and $u$ and $v$ are integers, each within a finite range.
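The product-sum of Equation 1 and the linear sum of Equation 2 can be sketched for a single neuron as follows. This is a minimal illustration under stated assumptions: cell planes are nested Python lists indexed as `plane[zeta][xi]`, the offsets $(u,v)$ run from 0 to the kernel size minus 1, tanh is used as the activation function, and no boundary handling is done. The function and parameter names are chosen for the example.

```python
import math
from typing import Sequence

Plane = Sequence[Sequence[float]]


def s_layer_output(prev_c_planes: Sequence[Plane],
                   w_s: Sequence[Plane],
                   xi: int, zeta: int) -> float:
    """Equation 1 for one feature detection neuron.

    prev_c_planes[n] is the n-th cell plane of the preceding C layer
    (or the input image when L = 1); w_s[n][v][u] holds the coupling
    coefficients over the receptive field.
    """
    internal_state = 0.0
    for n, plane in enumerate(prev_c_planes):
        for v, kernel_row in enumerate(w_s[n]):
            for u, w in enumerate(kernel_row):
                internal_state += w * plane[zeta + v][xi + u]
    return math.tanh(internal_state)  # activation function f


def c_layer_output(s_plane: Plane, w_c: Plane, xi: int, zeta: int) -> float:
    """Equation 2 for one feature integration neuron (no activation)."""
    return sum(w * s_plane[zeta + v][xi + u]
               for v, kernel_row in enumerate(w_c)
               for u, w in enumerate(kernel_row))
```

Calling `s_layer_output` for every valid `(xi, zeta)` yields one feature detection cell plane; applying `c_layer_output` to that plane yields the corresponding feature integration cell plane.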

$w_m^{LS}(n,u,v)$ in Equation 1 is the distribution of coupling coefficients for detecting a predetermined feature. Adjusting the coupling coefficients to appropriate values makes it possible to detect the predetermined feature. In constructing (training) the CNN, various test patterns are presented and the coupling coefficients are adjusted repeatedly so that $y_m^{LS}(\xi,\zeta)$ takes an appropriate output value.

$w_m^{LC}(u,v)$ in Equation 2 is expressed by Equation 3 using a two-dimensional Gaussian function.
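The body of Equation 3 does not appear in this text. One form consistent with the description, assuming an isotropic Gaussian centred on the receptive field with the usual normalization constant (the constant is an assumption), would be:

```latex
w_m^{LC}(u,v) = \frac{1}{2\pi\sigma^{2}}
  \exp\!\left(-\frac{u^{2}+v^{2}}{2\sigma^{2}}\right) \tag{3}
```

Here $\sigma$ is the feature size factor referred to in the description.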

$(u,v)$ again spans a finite range. As in the description of the feature detection neuron, this finite range is called the receptive field, and its size is called the receptive field size. The receptive field size may be set to an appropriate value according to the size of the feature detected by the $m$-th cell plane of the S layer of the $L$-th level. In Equation 3, $\sigma$ is a feature size factor, which is set to an appropriate constant according to the receptive field size. Specifically, the feature size factor $\sigma$ is preferably set so that the coupling coefficient can be regarded as approximately 0 at the outermost edge of the receptive field.
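The relationship between the receptive field size and the feature size factor can be checked numerically. The sketch below assumes a normalized isotropic two-dimensional Gaussian centred on the receptive field; the normalization constant and the function name are assumptions made for the example.

```python
import math
from typing import List


def gaussian_coefficients(size: int, sigma: float) -> List[List[float]]:
    """Return a size x size grid of Gaussian coupling coefficients.

    The grid index [v][u] is measured from the top-left corner, and the
    Gaussian is centred on the middle of the receptive field.
    """
    half = (size - 1) / 2.0
    norm = 2.0 * math.pi * sigma ** 2
    return [[math.exp(-((u - half) ** 2 + (v - half) ** 2)
                      / (2.0 * sigma ** 2)) / norm
             for u in range(size)]
            for v in range(size)]
```

For a 5 × 5 receptive field with `sigma = 0.8`, the corner coefficient is about 0.2% of the centre coefficient, which is the kind of setting the description calls preferable: the coefficients are close to 0 at the outermost edge.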

Performing the computations described above at each level of the neural network makes it possible to identify the subject class in the S layer of the final level.

A specific training method for the neural network is described next. In the present embodiment, the coupling coefficients are adjusted by supervised learning. In supervised learning, a test pattern is presented, the output values of the neurons are actually obtained, and the coupling coefficients $w_m^{LS}(n,u,v)$ are corrected based on the relationship between the actual neuron output values and the teacher signal (the desired output value that the neuron should produce). For example, the coupling coefficients can be corrected by the least squares method in the feature detection layer of the final level and by error backpropagation in the feature detection layers of the intermediate levels. Known techniques, such as those disclosed in Non-Patent Document 1, can be used to correct the coupling coefficients by the least squares method, error backpropagation, and the like.

When the neural network is trained in advance, many specific patterns that should be detected and many patterns that should not be detected are prepared as test patterns for training. When the activation function is the tanh function and a specific pattern to be detected is presented, a teacher signal is given so that the output value becomes 1 for the neurons, on the feature detection cell plane of the final level, in the region where the specific pattern is present. Conversely, when a pattern that should not be detected is presented, a teacher signal is given so that the output value becomes -1 for the neurons in the region where that pattern is present.

The method described above constructs a neural network that identifies subject classes from an input image. Computation in the actual detection (identification) process is performed using the coupling coefficients $w_m^{LS}(n,u,v)$ adjusted by training. If the output value of a neuron on a feature detection cell plane of the final level is equal to or greater than a predetermined value, it is determined that a subject of the subject class to be identified is present at the position (region) corresponding to that neuron.
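The final thresholding step can be sketched as follows. The function name and the list-of-lists plane layout (`final_plane[zeta][xi]`) are assumptions made for the example; the threshold corresponds to the "predetermined value" in the description.

```python
from typing import List, Sequence, Tuple


def detected_positions(final_plane: Sequence[Sequence[float]],
                       threshold: float) -> List[Tuple[int, int]]:
    """Return the (xi, zeta) positions whose neuron output is at or above
    the threshold on one feature detection cell plane of the final level."""
    return [(xi, zeta)
            for zeta, row in enumerate(final_plane)
            for xi, y in enumerate(row)
            if y >= threshold]
```

With tanh outputs trained toward 1 for target patterns and -1 for non-target patterns, a threshold between those two values separates the regions where the subject is judged to be present.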

The model structure of the neural network is described with reference to FIG. 4. FIG. 4 is a schematic diagram of the model structure of a neural network. In the description of FIG. 4, a neuron is called a node, a connection between neurons is called an edge, and the coupling coefficient representing the strength of a connection between neurons is called an edge weight. Nodes are drawn as circles, and the edges connecting nodes are drawn as straight lines.

The inference model 400 (calculator) shown in FIG. 4 has a model structure that includes an input layer 401, a hidden layer 403, and an output layer 405. The input layer 401 has N nodes, the hidden layer 403 has M nodes, and the output layer 405 has 10 nodes; the inference model 400 is a multi-class classifier that can identify 10 types of subject classes. The nodes of the hidden layer 403 perform computation with an arbitrary activation function. The group of edges connecting the input layer 401 and the hidden layer 403 is called the first layer, and the group of edges connecting the hidden layer 403 and the output layer 405 is called the second layer.

The inference processing for identifying subject classes is described with reference to FIGS. 5 to 9. FIG. 5 is a flowchart illustrating inference processing using the inference model 400. In step S501, the CPU 101 determines whether the image (frame image) acquired by the image acquisition unit 104 is the first frame.

If the CPU 101 determines that the acquired image is the first frame, the processing proceeds to step S503. If the CPU 101 determines that the acquired image is not the first frame, the processing proceeds to step S505.

In the following description, the CPU 101 switches processing depending on whether the image is frame 1 (the first frame), but the processing is not limited to this. For example, the CPU 101 may determine whether the subject has changed based on the difference in pixel values between the previous frame (the first frame) and the current frame (the second frame). The CPU 101 may then proceed to step S503 if it determines that the subject has changed, and to step S505 if it determines that the subject has not changed. Because the reference value used to decide whether to interrupt the computation is updated in response to changes in the subject, the CPU 101 can further reduce wasted computation.
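The branch between step S503 and step S505, including the pixel-difference variant, can be sketched as follows. The mean absolute difference metric, the threshold parameter, and the returned labels are assumptions made for the example; the description only says that a difference in pixel values "or the like" may be used.

```python
from typing import Optional, Sequence

Frame = Sequence[Sequence[float]]


def subject_changed(prev_frame: Frame, cur_frame: Frame,
                    threshold: float) -> bool:
    """Return True if the mean absolute pixel difference exceeds threshold."""
    total = 0.0
    count = 0
    for prev_row, cur_row in zip(prev_frame, cur_frame):
        for p, c in zip(prev_row, cur_row):
            total += abs(p - c)
            count += 1
    return count > 0 and total / count > threshold


def select_inference(frame_index: int, prev_frame: Optional[Frame],
                     cur_frame: Frame, threshold: float) -> str:
    """Choose the full inference (S503) or the interruptible one (S505)."""
    if (frame_index == 0 or prev_frame is None
            or subject_changed(prev_frame, cur_frame, threshold)):
        return "inference_1"  # step S503: compute and store all influences
    return "inference_2"      # step S505: reuse the stored influences
```

Running the full inference again after a scene change refreshes the stored influences, so later interruption decisions are made against values that match the current subject.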

In step S503, the CPU 101 executes inference operation 1. The flow of inference operation 1 is described with reference to FIG. 6. FIG. 6 is a flowchart illustrating inference operation 1 for the image of the first frame.

In step S601, the CPU 101 executes inference operation 11. Inference operation 11 first computes the input (degree of influence) to each node of the hidden layer 403, and then passes each computed degree of influence through the activation function to obtain the output value of the node.

First, the CPU 101 computes the input to each node included in the hidden layer 403 of the inference model 400 in FIG. 4. The input $a_j^{(1)}$ ($j = 1, 2, 3, \ldots, M$) to each node of the hidden layer 403 is described with reference to FIGS. 7 and 8. Here $j$ is the number of a node included in the hidden layer 403. In the example of FIG. 4, the hidden layer 403 has M nodes. The superscript "(1)" in $a_j^{(1)}$ indicates that $a_j^{(1)}$ is a computation result of the first layer. The input to the node 421 is written $a_1^{(1)}$.

FIG. 7 is a diagram explaining the computing units used at the nodes of the inference model 400. FIG. 7 shows an example of the computation for a node included in the hidden layer 403. The CPU 101 first uses the product-sum calculator 701 to compute the input $a_j^{(1)}$ ($j = 1, 2, 3, \ldots, M$) for each node. When computing the input $a_j^{(1)}$ to the $j$-th node of the hidden layer 403, the inputs to the product-sum calculator 701 are the output values $x_i^{(1)}$ ($i = 1, 2, 3, \ldots, N$) from the N nodes of the input layer 401 and the edge weights $w_{j,i}^{(1)}$ ($i = 1, 2, 3, \ldots, N$). The product-sum calculator 701 computes the sum of products of $x_i^{(1)}$ and $w_{j,i}^{(1)}$ as $a_j^{(1)}$. For example, for the node 421 ($j = 1$), the product-sum calculator 701 computes the sum of products of $x_i^{(1)}$ and $w_{1,i}^{(1)}$ as $a_1^{(1)}$.

FIG. 8 is a diagram illustrating the configuration of the product-sum calculator 701 shown in FIG. 7. For each node of the hidden layer 403, the CPU 101 repeatedly multiplies $x_i^{(1)}$ by $w_{j,i}^{(1)}$ and adds the product to the running total, thereby computing the input $a_j^{(1)}$ ($j = 1, 2, 3, \ldots, M$) to each node. The input $a_j^{(1)}$ to a node obtained by the product-sum calculator 701 is stored in the memory 103 as the degree of influence.
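The repeated multiply-and-add of the product-sum calculator can be sketched as follows. The function names and the row-per-node layout of the weights (`w1[j][i]`) are assumptions made for the example.

```python
from typing import List, Sequence


def multiply_accumulate(outputs: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Sum of products of one node's incoming outputs and edge weights."""
    acc = 0.0
    for x, w in zip(outputs, weights):
        acc += x * w  # one multiplication and one addition per edge
    return acc


def hidden_layer_influences(x: Sequence[float],
                            w1: Sequence[Sequence[float]]) -> List[float]:
    """Degree of influence for every hidden node; w1[j] holds the N edge
    weights into hidden node j. The returned list plays the role of the
    influences kept in memory for later frames."""
    return [multiply_accumulate(x, w_j) for w_j in w1]
```

Each hidden node therefore costs N multiplications, which is why stopping a node's accumulation partway through saves work in proportion to the number of skipped edges.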

Next, the CPU 101 obtains the output values $x_j^{(2)}$ by passing the input $a_j^{(1)}$ to each node of the hidden layer 403 through the activation function calculator 702. For example, the output value of the node 421 is $x_1^{(2)}$, obtained by passing $a_1^{(1)}$ through the activation function calculator 702. The activation function calculator 702 uses a sigmoid function, the tanh function, or the like as the activation function. The CPU 101 passes the input $a_j^{(1)}$ to each node included in the hidden layer 403 through the activation function calculator 702 and obtains $x_j^{(2)}$ ($j = 1, 2, 3, \ldots, M$).

Although FIG. 4 shows an example with a single hidden layer 403, there may be a plurality of hidden layers. When there are a plurality of hidden layers, the CPU 101 executes inference operation 11 for each hidden layer. In this way, the input to each node included in the plurality of hidden layers is obtained.
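Inference operation 11 applied to one or more hidden layers can be sketched as follows. This is a minimal illustration assuming tanh as the activation function and one weight matrix per hidden layer; the function name `inference_11` and the data layout are chosen for the example.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def inference_11(x: Sequence[float],
                 hidden_weights: Sequence[Matrix]
                 ) -> Tuple[List[float], List[List[float]]]:
    """Run every hidden layer in order.

    hidden_weights[l][j][i] is the edge weight from node i of the previous
    layer to node j of hidden layer l. Returns the outputs of the last
    hidden layer and the influences (pre-activation inputs) of every layer.
    """
    outputs = list(x)
    all_influences: List[List[float]] = []
    for weights in hidden_weights:
        influences = [sum(o * w for o, w in zip(outputs, row))
                      for row in weights]
        all_influences.append(influences)
        outputs = [math.tanh(a) for a in influences]
    return outputs, all_influences
```

Keeping `all_influences` mirrors the description's step of storing each node's input in memory as its degree of influence.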

In step S603 of FIG. 6, the CPU 101 executes inference operation 12. Inference operation 12 first computes the input (degree of influence) to each node of the output layer 405, and then passes each computed degree of influence through the activation function to obtain the output value of the node.

First, the CPU 101 computes the input $a_k^{(2)}$ ($k = 1, 2, 3, \ldots, 10$) to each node included in the output layer 405 of the inference model 400 in FIG. 4, using the product-sum calculator 701 described with reference to FIGS. 7 and 8.

When computing the input $a_k^{(2)}$ to the $k$-th node of the output layer 405, the inputs to the product-sum calculator 701 are the output values $x_j^{(2)}$ ($j = 1, 2, 3, \ldots, M$) from the M nodes of the hidden layer 403 and the edge weights $w_{k,j}^{(2)}$ ($j = 1, 2, 3, \ldots, M$). The product-sum calculator 701 computes the sum of products of $x_j^{(2)}$ and $w_{k,j}^{(2)}$ as $a_k^{(2)}$. For example, for the node 423, the product-sum calculator 701 computes the sum of products of $x_j^{(2)}$ and $w_{1,j}^{(2)}$ as $a_1^{(2)}$. The input $a_k^{(2)}$ to a node obtained by the product-sum calculator 701 is stored in the memory 103 as the degree of influence.

The CPU 101 passes the input $a_k^{(2)}$ to each node through the activation function calculator 702. In the output layer 405, the activation function calculator 702 is, for example, the Softmax function. The Softmax function is given by Equation 4 below, where $e$ is Napier's number.

$$y_k = \frac{e^{a_k^{(2)}}}{\sum_{i=1}^{10} e^{a_i^{(2)}}} \tag{4}$$
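The Softmax function can be sketched as follows. Subtracting the maximum input before exponentiating is a common numerical-stability measure added for this example; it does not change the result and is not part of the description.

```python
import math
from typing import List, Sequence


def softmax(a: Sequence[float]) -> List[float]:
    """Convert output-layer influences into probabilities that sum to 1."""
    shift = max(a)  # stability shift; cancels out in the ratio
    exps = [math.exp(value - shift) for value in a]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the exponential is monotonic, the node with the largest degree of influence always receives the largest probability.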

FIG. 9 is a graph showing an example of subject class identification results. The output y_k from each node of the output layer 405 is shown as a bar graph. The output y_k from the k-th node of the output layer 405 (hereinafter referred to as node k) is the existence probability of a subject of the subject class corresponding to node k. In FIG. 9, the output y₆ from node 6 of the output layer 405 is higher than the outputs from the other nodes, so a subject of the subject class corresponding to node 6 is more likely to exist than a subject of the class corresponding to any other node.

In FIG. 9, after the output y₆ from node 6, the output y₂ from node 2 and the output y₇ from node 7 are the next highest, and the outputs from the remaining nodes are lower than y₂ and y₇. Accordingly, the existence probability of a subject of the corresponding subject class is highest for node 6 and decreases in the order of node 2, node 7, and the other nodes.

In step S505 of FIG. 5, the CPU 101 executes inference operation 2, which is the processing performed when the image acquired by the image acquisition unit 104 is determined not to be the first frame. The flow of inference operation 2 is described with reference to FIG. 10. FIG. 10 is a flowchart illustrating inference operation 2 for frames other than the first frame.

In step S1001, the CPU 101 executes inference operation 11. Inference operation 11 in step S1001 is the same as inference operation 11 in step S601 of FIG. 6, so its description is omitted.

In step S1003, the CPU 101 executes inference operation 21. Inference operation 21 calculates the input a_k⁽²⁾ (k = 1, 2, 3, ..., 10) to the output layer 405 (the second hierarchy) only partway. As shown in Equation 5 below, a_k⁽²⁾ is obtained as the product sum of the output values x_j⁽²⁾ of the M nodes of the hidden layer 403 (the first hierarchy) and the corresponding weights w_k,j⁽²⁾. In inference operation 21, the calculation of Equation 5 is split into two terms as shown in Equation 6, and the CPU 101 executes the calculation of the first term of Equation 6.

Here, m = M/2 when M is even and m = (M - 1)/2 when M is odd. Inference operation 21 is the process of calculating the first term of Equation 6, which is the product sum over some of the nodes of the hidden layer 403. For example, inference operation 21 for the input a₁⁽²⁾ to node 423 is the product sum of m of the M output values x_j⁽²⁾ of the hidden layer 403 and the corresponding weights w_1,j⁽²⁾.
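Equations 5 and 6 are not reproduced in this text. Based on the surrounding description (a product sum over M hidden-layer outputs, split at m), they would take the following form. Two points are assumptions made here: that the first term covers the nodes j = 1, ..., m in index order, and that no bias term appears, since the text describes only a product sum.

```latex
\begin{align}
a_k^{(2)} &= \sum_{j=1}^{M} w_{k,j}^{(2)} \, x_j^{(2)} \tag{5} \\
a_k^{(2)} &= \sum_{j=1}^{m} w_{k,j}^{(2)} \, x_j^{(2)} + \sum_{j=m+1}^{M} w_{k,j}^{(2)} \, x_j^{(2)} \tag{6}
\end{align}
```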

In steps S1005 and S1007, the CPU 101 determines whether to execute the calculation of the second term of Equation 6. By interrupting the calculation of the degree of influence, and thus skipping calculations that do not affect the subject identification result, the CPU 101 reduces the amount of calculation.

In step S1005, the CPU 101 determines whether the difference between a reference value and a partial result of the calculation of the degree of influence for the current frame (the second degree of influence) is within a predetermined range. This partial result is hereinafter also called the partial degree of influence. The reference value corresponds to the product sum over some of the nodes of the hidden layer 403 (the first term of Equation 6) in the first frame; for example, it can be 0.5 times the a_k⁽²⁾ (the first degree of influence) stored in the memory 103 for the first frame.

The degree of influence is a_k⁽²⁾ calculated by Equation 6, and the partial degree of influence is the result of calculating the first term of Equation 6. In this case, the CPU 101 determines whether the difference between 0.5 times the stored a_k⁽²⁾ (the reference value) and the result of the first term of Equation 6 (the partial degree of influence) is within the predetermined range.

When temporally consecutive frames, as in a moving image, are input to the image acquisition unit 104, the correlation between consecutive frames is generally high. Therefore, a subject detected in the immediately preceding frame is likely to be detected in the current frame as well, and a subject not detected in the preceding frame is likely not to be detected in the current frame either.

If the difference between the reference value and the result of the first term of Equation 6 is within the predetermined range, the CPU 101 can determine that the first frame and the current frame capture the same subject features. The predetermined range is set, for example, based on the degree of influence in the first frame (the a_k⁽²⁾ stored in the memory 103). Specifically, the predetermined range may be 3% of the degree of influence or 5% of the reference value. Widening the predetermined range increases the number of cases in which the second term of Equation 6 is not calculated, which further reduces the amount of calculation.

The value of m that separates the first and second terms of Equation 6 is about half of M in the above description, but it is not limited to this and may be about 1/α of M (0 < 1/α < 1). Likewise, the reference value is 0.5 times the a_k⁽²⁾ of the first frame when m is about half of M; when m is about 1/α of M, the reference value can be set to about 1/α of a_k⁽²⁾ to match m.
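The step S1005 check, generalized to a split of about 1/α, can be sketched as below. The function names, the truncating choice of m, the inclusive comparison, and the default range of 3% of the stored degree of influence are assumptions for illustration; the description only says the range "may be" 3% of the degree of influence or 5% of the reference value.

```python
def split_point(num_nodes, alpha=2.0):
    """Number of terms m in the first term of Equation 6, about num_nodes / alpha."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 so that 0 < 1/alpha < 1")
    # Truncation gives M/2 for even M and (M-1)/2 for odd M when alpha == 2.
    return int(num_nodes / alpha)


def reference_value(stored_influence, alpha=2.0):
    """Reference value: about 1/alpha of the stored first-frame degree of influence."""
    return stored_influence / alpha


def within_predetermined_range(partial, stored_influence, alpha=2.0, range_ratio=0.03):
    """Step S1005: is the partial degree of influence close to the reference value?"""
    allowed = range_ratio * abs(stored_influence)  # assumed: range tied to stored influence
    return abs(partial - reference_value(stored_influence, alpha)) <= allowed
```

For example, with a stored degree of influence of 8.0 the reference value is 4.0 and the allowed difference is 0.24, so a partial result of 4.1 passes the check while 5.0 does not.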

If the CPU 101 determines in step S1005 that the difference between the reference value and the partial degree of influence is within the predetermined range, the process proceeds to step S1007. If the CPU 101 determines that the difference is not within the predetermined range, the process proceeds to step S1009. In the latter case, the CPU 101 determines that there is no correlation between the first frame and the current frame, and calculates a_k⁽²⁾ including the second term of Equation 6.

In step S1007, the CPU 101 determines whether the partial degree of influence is equal to or less than a threshold. The partial degree of influence is, for example, the result of the first term of Equation 6. Because the degree of influence becomes large when the features of the subject are captured well, the CPU 101 also executes the calculation of the second term of Equation 6 when the partial degree of influence is greater than the threshold.

Conversely, the degree of influence becomes small when the features of the subject are not detected, so the CPU 101 does not calculate the second term of Equation 6 when the partial degree of influence is equal to or less than the threshold. When the calculation is interrupted in this way, a_k⁽²⁾ may be set to the value of a_k⁽²⁾ from the first frame (the preceding frame), to the value of the first term (the result at the time of interruption), or to a predetermined value such as zero. The CPU 101 then inputs the value set for a_k⁽²⁾ to node k of the output layer 405 as the degree of influence.

When the calculation is interrupted and a_k⁽²⁾ is set to the value of the first term of Equation 6 or to the predetermined value, the a_k⁽²⁾ stored in the memory 103 by inference operation 1 for the first frame may be left unchanged and used again as the degree of influence for determining the reference value in the next frame. In this case, the a_k⁽²⁾ stored in the memory 103 is not updated until inference operation 1 (step S503) is next executed.

The threshold to be compared with the partial degree of influence is set, for example, as follows. The weights w_k,j⁽²⁾ (j = 1, 2, 3, ..., M), the node outputs x_j⁽²⁾, and so on are normalized to the range 0 to 1 so that the maximum value of the input a_k⁽²⁾ from the M nodes of the hidden layer 403 to node k is M. The threshold can then be set, for example, to a value of 10% or more of M, the maximum value of a_k⁽²⁾.

What percentage of M to use as the threshold may be set according to, for example, the ratio of the number of product-sum terms in the first term of Equation 6 to the number of terms M of a_k⁽²⁾. It may also be set based on the accuracy and speed of subject class identification. Furthermore, the threshold is not limited to being based on the maximum value M obtained when the weights and the like are normalized, and may be set based on the maximum value that a_k⁽²⁾ can take.
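The threshold arithmetic is simple enough to state directly. The sketch below assumes outputs and weights normalized to the range 0 to 1, so that the product sum over M nodes cannot exceed M; the function names and the default ratio of 10% are illustrative choices taken from the example in the text.

```python
def max_influence(num_nodes):
    """Upper bound of a_k when every output and weight lies in the range 0 to 1."""
    return float(num_nodes)


def influence_threshold(num_nodes, ratio=0.10):
    """Threshold expressed as a fraction of the maximum possible a_k."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return ratio * max_influence(num_nodes)
```

With M = 128 hidden nodes and a ratio of 10%, the threshold is 12.8. A larger ratio makes the interruption condition of step S1007 easier to satisfy, trading accuracy for speed.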

If the CPU 101 determines in step S1007 that the partial degree of influence is equal to or less than the threshold, the process proceeds to step S1011. If the CPU 101 determines that the partial degree of influence is greater than the threshold, the process proceeds to step S1009. In the latter case, the CPU 101 determines that the output y_k (the existence probability of the corresponding subject) will be high, and calculates a_k⁽²⁾ including the second term of Equation 6.

In step S1009, the CPU 101 executes inference operation 22. Inference operation 22 calculates the second term of Equation 6 to obtain the input a_k⁽²⁾ to the node of the output layer 405, and passes the result through the activation function to obtain the output value from the node. The CPU 101 passes the a_k⁽²⁾ obtained by calculating the second term of Equation 6 through the activation function in the same way as in inference operation 12 of step S603.

The CPU 101 may store the obtained a_k⁽²⁾ in the memory 103 as the degree of influence for determining the reference value in the next frame. Alternatively, the CPU 101 may not store a_k⁽²⁾ in inference operation 22 of step S1009, and may instead determine the reference value using the degree of influence a_k⁽²⁾ already stored in the memory 103 until inference operation 1 (step S503) is next executed.

In step S1011, the CPU 101 determines whether the calculation of the input a_k⁽²⁾ to each node of the output layer 405 has finished. For example, the CPU 101 can determine that the calculation has finished when the input a_k⁽²⁾ has been calculated for every node. If the CPU 101 determines that the calculation has finished, inference operation 2 shown in FIG. 10 ends. Otherwise, the process returns to step S1003.
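Steps S1003 to S1011 can be put together in one loop over the output nodes. The following is a sketch under stated assumptions rather than the flow of FIG. 10 itself: m is fixed at half of M, the range is 3% of the stored degree of influence, an interrupted node reuses the stored first-frame value (one of the three options the description allows), and the Softmax is applied once after all inputs are known. All function and parameter names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable Softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def inference_operation_2(hidden_out, weights, stored, threshold, range_ratio=0.03):
    """Return (class probabilities, per-node 'interrupted' flags)."""
    m = len(hidden_out) // 2
    inputs, interrupted = [], []
    for w_row, prev in zip(weights, stored):
        # S1003: first term of the product sum (partial degree of influence).
        partial = sum(w * x for w, x in zip(w_row[:m], hidden_out[:m]))
        # S1005: compare with the reference value (half of the stored influence).
        close = abs(partial - 0.5 * prev) <= range_ratio * abs(prev)
        # S1007: interrupt only if the partial result is also at or below the threshold.
        if close and partial <= threshold:
            inputs.append(prev)  # assumed fallback: reuse the stored value
            interrupted.append(True)
        else:
            # S1009: compute the second term and complete the product sum.
            rest = sum(w * x for w, x in zip(w_row[m:], hidden_out[m:]))
            inputs.append(partial + rest)
            interrupted.append(False)
    # S1011 corresponds to the loop finishing; then apply the activation.
    return softmax(inputs), interrupted
```

A call such as `inference_operation_2([0.1, 0.1, 0.9, 0.9], [[0.5] * 4, [1.0, 1.0, 0.0, 0.0]], [0.2, 2.0], 0.4)` interrupts the first node, whose partial result matches its reference value and is small, and fully evaluates the second, whose partial result is far from its reference value.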

In step S1005 of inference operation 2 in FIG. 10, the reference value is set based on the a_k⁽²⁾ (degree of influence) stored in the memory 103 for the first frame, but the reference value is not limited to this. It may be set based on the a_k⁽²⁾ stored in the memory 103 for the frame immediately preceding the current frame.

In the present embodiment, the a_k⁽²⁾ of the first frame is stored as the degree of influence, but the CPU 101 may execute inference operation 1 (step S503) every predetermined number of frames, in the same way as for the first frame. When inference operation 1 is executed, the CPU 101 takes the a_k⁽²⁾ calculated in inference operation 12 (step S603) as the new degree of influence and updates the degree of influence stored in the memory 103.

Equations 5 and 6 describe an example in which part of the calculation of the input a_k⁽²⁾ to each node of the output layer 405 (the calculation in the second layer of FIG. 4) is interrupted to reduce the amount of calculation, but the amount of calculation can be reduced in the first layer in the same way. In the calculation of the input a_j⁽¹⁾ to each node of the hidden layer 403 (the calculation in the first layer of FIG. 4), the information processing apparatus 100 divides a_j⁽¹⁾ into a first term and a second term as in Equation 6 and executes processing similar to steps S1003 to S1009 of FIG. 10. The information processing apparatus 100 can reduce the amount of calculation by not calculating the second term of a_j⁽¹⁾ for some nodes.

When there are multiple hidden layers, part of the calculation of the input to each node of the hidden layer following the hidden layer 403 may be interrupted to reduce the amount of calculation, in the same way as for the input to each node of the output layer 405.

According to the above embodiment, the information processing apparatus 100 determines whether to interrupt the calculation of the degree of influence based on the degree of influence in a frame preceding the current frame and a partial result of the calculation of the degree of influence in the current frame. This allows the information processing apparatus 100 to reduce the amount of calculation while limiting the effect on inference accuracy. By interrupting early the calculation for nodes whose degree of influence is equal to or less than the threshold, the information processing apparatus 100 can shorten the inference calculation time.

(Modification 1)
In inference operation 12 in step S603 of FIG. 6, a_k⁽²⁾ is stored as the degree of influence. In Modification 1, the degree of influence may instead be configured to increase monotonically, like the degree of influence b_k⁽²⁾ shown in Equation 7, by treating the output x_j⁽²⁾ from each node of the hidden layer 403 and the edge weight w_k,j⁽²⁾ as unsigned values.

In this case, the inference model 400 also uses unsigned values for x_j⁽²⁾ and w_k,j⁽²⁾ in the learning stage. When the degree of influence is configured as in Equation 7, the value of b_k⁽²⁾ increases monotonically as the calculation proceeds, as shown in FIG. 11, so the value never decreases after exceeding the threshold. In the example of FIG. 11, the CPU 101 can determine that the threshold has been exceeded before executing m product-sum operations, which is about half of the M product-sum operations. Therefore, in step S1007 of FIG. 10, the CPU 101 can determine more quickly whether the partial result of the degree-of-influence calculation (the partial degree of influence) has reached the threshold.
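Because every product is non-negative in Modification 1, the running sum can be tested after each term and the loop can stop as soon as the threshold is crossed. The sketch below illustrates that idea only; Equation 7 is not reproduced in this text, so the function name, the strict `>` comparison, and the rejection of negative inputs are assumptions.

```python
def exceeds_threshold_early(outputs, weights, threshold):
    """Accumulate non-negative products and stop once the sum exceeds the threshold.

    Returns (exceeded, terms_used).
    """
    if len(outputs) != len(weights):
        raise ValueError("outputs and weights must have the same length")
    total = 0.0
    for used, (x, w) in enumerate(zip(outputs, weights), start=1):
        if x < 0 or w < 0:
            raise ValueError("Modification 1 assumes unsigned outputs and weights")
        total += x * w  # monotonically non-decreasing running sum
        if total > threshold:
            return True, used  # no later term can bring the sum back down
    return False, len(outputs)
```

With four outputs of 1.0, four weights of 0.5, and a threshold of 0.9, the decision is reached after two of the four terms.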

(Modification 2)
The input a_k⁽²⁾ to each node of the output layer 405, for example node 423, may be calculated based on the magnitude of the input a_j⁽¹⁾ (degree of influence) to each node of the hidden layer 403 in the first frame (or in a frame preceding the current frame). Among the nodes of the hidden layer 403, the CPU 101 performs the product-sum operation giving priority to nodes whose degree of influence in the first frame is larger than that of other nodes. This shortens the time until the threshold is exceeded, so the CPU 101 can determine sooner whether to execute inference operation 22.

For example, in inference operation 21 in step S1003 of FIG. 10, when the intermediate result of a_k⁽²⁾ exceeds the threshold, the CPU 101 determines that the degree of influence will be large, proceeds to step S1009, and executes inference operation 22. The threshold here can be determined in the same way as the threshold of step S1007.

Instead of the determinations in steps S1005 and S1007, the CPU 101 calculates a_k⁽²⁾ giving priority to those nodes of the hidden layer 403 whose degree of influence in the preceding frame is larger than that of other nodes. This lets the CPU 101 determine sooner whether the threshold has been exceeded.

The nodes of the hidden layer 403 need not be sorted strictly in descending order of degree of influence; it is sufficient that nodes with a larger degree of influence are given priority over other nodes. For example, the CPU 101 may classify the nodes into four groups according to degree of influence and perform the calculation in order, starting from the nodes of the group with the largest degree of influence. In this case, the relative degree of influence within a group need not be considered. The grouping can be, for example, into four groups covering the top 100% to 75%, 75% to 50%, 50% to 25%, and 25% to 0% of the degree of influence. The number of groups is not limited to four and may be determined according to the processing load of grouping or sorting.
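One way to realize the grouping without a full sort is to bucket each node by its degree of influence relative to the largest one. The text does not say whether the percentages refer to value or to rank, so the value-based reading below is an assumption, as are the function name and the treatment of non-positive values (placed in the last group).

```python
def grouped_order(influences, num_groups=4):
    """Return node indices ordered by influence group, highest group first.

    Within a group, nodes keep their original index order (no sorting).
    """
    if not influences:
        return []
    top = max(influences)
    buckets = [[] for _ in range(num_groups)]
    for j, value in enumerate(influences):
        if top <= 0:
            group = num_groups - 1
        else:
            ratio = max(value, 0.0) / top  # 1.0 for the most influential node
            group = min(int((1.0 - ratio) * num_groups), num_groups - 1)
        buckets[group].append(j)
    return [j for bucket in buckets for j in bucket]
```

For the stored values `[0.1, 0.9, 0.3, 1.0, 0.6]`, the resulting order is `[1, 3, 4, 2, 0]`: nodes 1 and 3 fall in the top group and are processed first, in index order. The single pass costs O(M), which is the point of grouping instead of sorting.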

The CPU 101 performs the product-sum operation giving priority to nodes of the hidden layer 403 whose degree of influence a_j⁽¹⁾ is larger than that of other nodes, but the degree of influence is not limited to a_j⁽¹⁾. In Modification 2, the degree of influence used to determine the order of operations for the input a_k⁽²⁾ to a node of the output layer 405 may be the product of the output value x_j⁽²⁾ in the first frame and the edge weight w_k,j⁽²⁾. In this case, in the first frame, the CPU 101 stores in the memory 103 the product of the output value x_j⁽²⁾ from each node of the hidden layer 403 and the corresponding weight w_k,j⁽²⁾. Using the product of the output value and the weight in the first frame as the degree of influence, the CPU 101 calculates the input a_k⁽²⁾ to the node of the output layer 405 in the current frame, giving priority to nodes whose degree of influence is larger than that of other nodes.

(Modification 3)
When features are extracted by applying a filter to the whole frame image, the information processing apparatus 100 may thin out part of the frame image before inputting it to the inference model 400. For example, after applying the filter to lines 0-2 of the frame image, the CPU 101 may skip the scans whose first row is any of lines 1 to 5 (lines 1-3 and so on) and apply the filter next to lines 6-8. After applying the filter to the end of the frame image while thinning in this way, the CPU 101 returns to lines 1-3 and applies the filter. The frame image is not limited to being thinned line by line in the vertical direction; it may be thinned in the horizontal direction or in both directions.
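The thinned scan order in the example can be generated as follows. The text specifies only the first pass (lines 0-2, then 6-8, and so on) and the return to lines 1-3; that the second and later passes also advance by six lines is an assumption here, as are the function name and parameters.

```python
def thinned_scan_order(num_lines, filter_height=3, step=6):
    """Return (first_line, last_line) windows in thinned, multi-pass order."""
    last_start = num_lines - filter_height
    order = []
    for offset in range(step):  # pass 0 starts at line 0, pass 1 at line 1, ...
        for start in range(offset, last_start + 1, step):
            order.append((start, start + filter_height - 1))
    return order
```

For a 12-line image, the order begins `(0, 2), (6, 8), (1, 3), (7, 9)`. The first pass alone gives a coarse view of the whole image, and each window position is eventually visited exactly once.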

By scanning the filter over a thinned frame image in this way, the CPU 101 can capture the features of the whole image coarsely. Even with such a coarse image, the CPU 101 can interrupt the calculation early and reduce the amount of calculation, based on the degree of influence in the preceding frame and a partial result of the degree-of-influence calculation in the current frame.

When an image signal is handled as the input, the input x_i⁽¹⁾ may be an unsigned 8-bit RGB signal or a YUV signal with an offset. A YUV signal represents color information as a combination of a luminance signal (Y), the difference between the luminance signal and the blue component (U), and the difference between the luminance signal and the red component (V).

Preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist. The features described in the embodiments may also be combined as appropriate.

<Other embodiments>
The present invention can also be realized by supplying a program that implements one or more functions of the above embodiments to a system or apparatus via a network or a storage medium, and having one or more processors in a computer of that system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

100: information processing apparatus, 101: CPU, 104: image acquisition unit, 105: identification unit, 400: inference model

Claims

1. An information processing apparatus comprising:
acquisition means for acquiring consecutive frames of a moving image; and
identification means for identifying an object from each frame of the moving image using a calculator that includes a plurality of hierarchies, calculates a degree of influence that is the product sum of the outputs from the nodes included in a first hierarchy to a predetermined node in a second hierarchy and the weights corresponding to the respective outputs, and inputs the calculated degree of influence to the predetermined node,
wherein the identification means determines whether to interrupt the calculation of a second degree of influence, which is input to the predetermined node in a second frame after a first frame, based on a first degree of influence input to the predetermined node in the first frame and a partial result of the calculation of the second degree of influence.

2. The information processing apparatus according to claim 1, wherein the identification means interrupts the calculation of the second degree of influence when the difference between the product sum for some nodes of the first hierarchy in the second frame and a reference value, which is a value corresponding to the product sum for those nodes in the first frame, is within a predetermined range, and the product sum for those nodes of the first hierarchy in the second frame is equal to or less than a threshold.

3. The information processing apparatus according to claim 2, wherein said predetermined range is set based on said first degree of influence.

4. The information processing apparatus according to claim 2, wherein the threshold is set based on the maximum value that can be taken by the product sum of the outputs from the nodes included in the first hierarchy to the predetermined node in the second hierarchy and the weights corresponding to the respective outputs.

5. The information processing apparatus according to any one of claims 1 to 4, wherein the identification means inputs, to the predetermined node as the second degree of influence, either the calculation result at the time the calculation of the second degree of influence was interrupted or a predetermined value.

6. The information processing apparatus according to claim 1, wherein the identification means calculates the product sum for the second degree of influence in the second frame giving priority to nodes of the first hierarchy whose degree of influence in the first frame is larger than that of other nodes.

7. The information processing apparatus according to claim 6, wherein the identification means classifies the nodes of the first hierarchy into a plurality of groups based on the degree of influence in the first frame, and calculates the product sum for the second degree of influence in the second frame in order, starting from the nodes of the group whose degree of influence is larger than that of other groups.

8. The information processing apparatus according to any one of claims 1 to 7, wherein the identification means uses unsigned values as the weights used in calculating the first degree of influence and the second degree of influence.

9. The information processing apparatus according to claim 1, wherein said identifying means thins out a part of frame images of said moving image and inputs it to said calculator.

10. The information processing apparatus according to any one of claims 1 to 9, wherein said identifying means updates said first degree of influence every predetermined number of frames of said moving image.

11. A control method of an information processing apparatus, the method comprising:
an acquisition step of acquiring consecutive frames of a moving image; and
an identification step of identifying an object from each frame of the moving image using a calculator that includes a plurality of hierarchies, calculates a degree of influence that is the product sum of the outputs from the nodes included in a first hierarchy to a predetermined node in a second hierarchy and the weights corresponding to the respective outputs, and inputs the calculated degree of influence to the predetermined node,
wherein, in the identification step, whether to interrupt the calculation of a second degree of influence, which is input to the predetermined node in a second frame after a first frame, is determined based on a first degree of influence input to the predetermined node in the first frame and a partial result of the calculation of the second degree of influence.

12. A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 10.