JP2020107138A

JP2020107138A - Classification device, classification system, classification method and program

Info

Publication number: JP2020107138A
Application number: JP2018246147A
Authority: JP
Inventors: Akira Nemoto (根本 亮)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2018-12-27
Filing date: 2018-12-27
Publication date: 2020-07-09
Anticipated expiration: 2038-12-27
Also published as: JP7263770B2; WO2020137227A1

Abstract

To provide a classification device, a classification system, a classification method, and a program capable of further improving classification accuracy while further suppressing an increase in processing amount. SOLUTION: A classification device comprises: a processing unit that acquires non-negative evaluation data in a predetermined data area from time-series data; a storage unit that stores a supervised coefficient matrix generated from non-negative first supervised data corresponding to the predetermined data area, and a classification model parameter generated based on a first class mapped to the first supervised data; a feature extraction unit that generates an evaluation coefficient matrix from the evaluation data based on a supervised basis matrix generated from the first supervised data; a classification unit that classifies the evaluation coefficient matrix into the first class based on the classification model parameter; and an area specification unit that specifies the predetermined data area. SELECTED DRAWING: Figure 8

Description

The present invention relates to a classification device, a classification system, a classification method, and a program.

Conventionally, techniques using non-negative matrix factorization have been studied for analyzing and classifying data such as acoustic signals, which are one example of continuously generated time-series data such as signals obtained by sensing (see Patent Documents 1 to 4 and Non-Patent Documents 1 and 2 below).

Patent Document 1: JP 2014-137389 A
Patent Document 2: JP 2015-079110 A
Patent Document 3: JP 2011-133780 A
Patent Document 4: JP 2017-151872 A

Non-Patent Document 1: D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature (UK), Oct. 1999, vol. 401, pp. 788-791.
Non-Patent Document 2: D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," Advances in Neural Information Processing Systems, Dec. 2001, pp. 556-562.

The analysis and classification techniques described above can analyze and classify time-series data, but in recent years there has been a strong demand for higher analysis and classification accuracy. At the same time, in order to avoid longer processing times and a heavier load on the processing device, there is also a demand to further reduce the processing amount (the amount of data processed) in such analysis and classification.

The present invention has been made in view of the above situation, and an object of the present invention is to provide a new and improved classification device, classification system, classification method, and program capable of further improving classification accuracy while further suppressing an increase in processing amount.

In order to solve the above problems, according to one aspect of the present invention, there is provided a classification device that classifies time-series data into a target first class, the classification device including: a processing unit that acquires non-negative evaluation data of a predetermined data region from the time-series data expressed in at least a frequency domain or a time domain; a storage unit that stores a teacher coefficient matrix generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data region, and a classification model parameter generated based on the first class associated with the first teacher data; a feature extraction unit that generates an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data; a classification unit that classifies the calculated evaluation coefficient matrix into the first class based on the classification model parameter; and a region specifying unit that specifies the predetermined data region. The region specifying unit divides each piece of non-negative second teacher data associated with each of the first classes in the frequency domain or the time domain, and calculates, based on the observation matrix of each piece of second teacher data for each divided region, a first evaluation index indicating the degree of separation between the first classes for each divided region; divides each piece of non-negative third teacher data associated with each of non-target second classes in the frequency domain or the time domain, and calculates, based on the observation matrix of each piece of third teacher data for each divided region, a second evaluation index indicating the degree of separation between the second classes for each divided region; specifies a first region by extracting the divided regions in which the first evaluation index is higher than a preset first threshold; specifies a second region by extracting the divided regions in which the second evaluation index is higher than a preset second threshold; and specifies the predetermined data region based on the specified first and second regions.
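The region-specifying steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the passage does not define how the degree of separation is computed, so the sketch assumes a Fisher-style ratio (variance of the class means divided by the mean within-class variance) applied to one scalar feature per sample and region. The names `separation_index` and `select_regions`, the region names, the class names, the sample values, and the thresholds are all hypothetical. The passage also does not state how the first and second regions are combined into the predetermined data region, so the sketch stops once both regions are obtained.

```python
from statistics import mean, pvariance


def separation_index(samples_by_class):
    """Assumed index: variance of class means / mean within-class variance."""
    class_means = [mean(samples) for samples in samples_by_class.values()]
    between = pvariance(class_means)
    within = mean(pvariance(samples) for samples in samples_by_class.values())
    return between / (within + 1e-12)  # small constant avoids division by zero


def select_regions(features_by_region, threshold):
    """Return the set of divided regions whose index exceeds the threshold."""
    return {
        region
        for region, samples_by_class in features_by_region.items()
        if separation_index(samples_by_class) > threshold
    }


# Hypothetical per-region features for the target (first) classes.
target = {
    "band0": {"normal": [1.0, 1.1, 0.9], "abnormal": [1.0, 1.2, 0.8]},
    "band1": {"normal": [1.0, 1.1, 0.9], "abnormal": [3.0, 3.1, 2.9]},
    "band2": {"normal": [0.5, 0.6, 0.4], "abnormal": [2.0, 2.1, 1.9]},
}
# Hypothetical per-region features for the non-target (second) classes.
non_target = {
    "band0": {"hammer_a": [1.0, 1.1, 0.9], "hammer_b": [2.0, 2.1, 1.9]},
    "band1": {"hammer_a": [2.0, 2.2, 1.8], "hammer_b": [2.0, 2.1, 1.9]},
    "band2": {"hammer_a": [0.5, 0.6, 0.4], "hammer_b": [1.5, 1.6, 1.4]},
}

first_region = select_regions(target, threshold=1.0)       # first evaluation index
second_region = select_regions(non_target, threshold=1.0)  # second evaluation index
```

With these made-up values, the target classes separate well in `band1` and `band2`, while the non-target classes separate well in `band0` and `band2`.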

In order to solve the above problems, according to another aspect of the present invention, there is provided a classification system including a classification device that classifies time-series data into a target first class and a sensor that acquires the time-series data, wherein the classification device includes: a processing unit that acquires non-negative evaluation data of a predetermined data region from the time-series data expressed in at least a frequency domain or a time domain; a storage unit that stores a teacher coefficient matrix generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data region, and a classification model parameter generated based on the first class associated with the first teacher data; a feature extraction unit that generates an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data; a classification unit that classifies the calculated evaluation coefficient matrix into the first class based on the classification model parameter; and a region specifying unit that specifies the predetermined data region. The region specifying unit divides each piece of non-negative second teacher data associated with each of the first classes in the frequency domain or the time domain, and calculates, based on the observation matrix of each piece of second teacher data for each divided region, a first evaluation index indicating the degree of separation between the first classes for each divided region; divides each piece of non-negative third teacher data associated with each of non-target second classes in the frequency domain or the time domain, and calculates, based on the observation matrix of each piece of third teacher data for each divided region, a second evaluation index indicating the degree of separation between the second classes for each divided region; specifies a first region by extracting the divided regions in which the first evaluation index is higher than a preset first threshold; specifies a second region by extracting the divided regions in which the second evaluation index is higher than a preset second threshold; and specifies the predetermined data region based on the specified first and second regions.

In order to solve the above problems, according to still another aspect of the present invention, there is provided a classification method for classifying time-series data into a target first class, the method including: acquiring non-negative evaluation data of a predetermined data region from the time-series data expressed in at least a frequency domain or a time domain; storing a teacher coefficient matrix generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data region, and a classification model parameter generated based on the first class associated with the first teacher data; generating an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data; classifying the calculated evaluation coefficient matrix into the first class based on the classification model parameter; and specifying the predetermined data region. Specifying the predetermined data region includes: dividing each piece of non-negative second teacher data associated with each of the first classes in the frequency domain or the time domain, and calculating, based on the observation matrix of each piece of second teacher data for each divided region, a first evaluation index indicating the degree of separation between the first classes for each divided region; dividing each piece of non-negative third teacher data associated with each of non-target second classes in the frequency domain or the time domain, and calculating, based on the observation matrix of each piece of third teacher data for each divided region, a second evaluation index indicating the degree of separation between the second classes for each divided region; specifying a first region by extracting the divided regions in which the first evaluation index is higher than a preset first threshold; specifying a second region by extracting the divided regions in which the second evaluation index is higher than a preset second threshold; and specifying the predetermined data region based on the specified first and second regions.

In order to solve the above problems, according to still another aspect of the present invention, there is provided a program for classifying time-series data into a target first class, the program causing a computer to realize: a function of acquiring non-negative evaluation data of a predetermined data region from the time-series data expressed in at least a frequency domain or a time domain; a function of storing a teacher coefficient matrix generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data region, and a classification model parameter generated based on the first class associated with the first teacher data; a function of generating an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data; a function of classifying the calculated evaluation coefficient matrix into the first class based on the classification model parameter; and a function of specifying the predetermined data region. The function of specifying the predetermined data region includes: dividing each piece of non-negative second teacher data associated with each of the first classes in the frequency domain or the time domain, and calculating, based on the observation matrix of each piece of second teacher data for each divided region, a first evaluation index indicating the degree of separation between the first classes for each divided region; dividing each piece of non-negative third teacher data associated with each of non-target second classes in the frequency domain or the time domain, and calculating, based on the observation matrix of each piece of third teacher data for each divided region, a second evaluation index indicating the degree of separation between the second classes for each divided region; specifying a first region by extracting the divided regions in which the first evaluation index is higher than a preset first threshold; specifying a second region by extracting the divided regions in which the second evaluation index is higher than a preset second threshold; and specifying the predetermined data region based on the specified first and second regions.

As described above, according to the present invention, classification accuracy can be further improved while an increase in processing amount is further suppressed.

An explanatory diagram showing an example of a schematic configuration of the information processing system 10 according to an embodiment of the present invention.
A block diagram for explaining a configuration example of the information processing device 200 according to the embodiment of the present invention.
An explanatory diagram schematically showing an example of NMF processing according to the embodiment of the present invention.
A schematic diagram for explaining dimension reduction by NMF.
A schematic diagram showing an example in which NMF is applied to the whole of the time/frequency/amplitude data.
A schematic diagram showing an example in which NMF is applied to a part of the time/frequency/amplitude data.
An explanatory diagram showing an example of data division according to the embodiment of the present invention.
An explanatory diagram (part 1) showing an example of acquisition of teacher data and evaluation data according to the embodiment of the present invention.
An explanatory diagram (part 2) showing an example of acquisition of teacher data and evaluation data according to the embodiment of the present invention.
A flowchart showing an operation example of the embodiment of the present invention in the learning phase.
A flowchart showing an operation example of the embodiment of the present invention in the classification phase.
An explanatory diagram showing a hardware configuration example.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted. Similar components may also be distinguished by appending different letters to the same reference numeral. However, when there is no particular need to distinguish between similar components, only the common reference numeral is used.

<<Background>>
First, before describing the embodiments of the present invention, the background that led the present inventor to create them will be described.

As described above, techniques using non-negative matrix factorization have conventionally been studied for analyzing and classifying data such as acoustic signals, which are one example of continuously generated time-series data such as signals obtained by sensing.

Such techniques are an effective means of analyzing and classifying time-series data. In recent years, however, there has been a strong demand for higher classification accuracy even for the various kinds of real-world time-series data that cannot be called well suited to analysis and classification. Moreover, because such real-world time-series data are often data in which various parameters are combined in a complex manner, processing them has shown a marked tendency to increase both the processing time and the load on the processing device.

In view of the above situation, the present inventor has created embodiments of the present invention that can further improve classification accuracy while further suppressing an increase in processing amount. These embodiments are described in detail below.

<<Embodiment>>
In the embodiment described below, the time-series data acquired from the sensor may be, for example, an acoustic signal obtained by picking up the tapping sound produced when a measurement object (for example, concrete) is struck with a striking tool such as a hammer. The following description takes as an example an information processing device that has a function of automatically classifying the measurement object, based on the acoustic signal, into two classes (the target first classes) indicating whether it is normal or abnormal; however, the present invention is not limited to this example.

In the following description, data representing the relationship among time, frequency, and amplitude for the time-series data from the sensor, such as a power spectrum obtained by a Fourier transform, a spectrogram obtained by a short-time Fourier transform, or a scalogram obtained by a wavelet transform, are collectively referred to as time/frequency/amplitude data. In other words, the time/frequency/amplitude data are data expressed in at least the frequency domain or the time domain, and can be expressed, for example, as data indicating amplitude values in the frequency domain and the time domain.
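As one illustration of time/frequency/amplitude data, the following sketch computes a non-negative power spectrogram from a sampled signal with a short-time Fourier transform. It is a minimal sketch under stated assumptions: the function names `power_spectrum` and `spectrogram`, the frame length, the hop size, and the test signal are all hypothetical, a naive discrete Fourier transform is used for clarity rather than speed, and no window function is applied.

```python
import cmath
import math


def power_spectrum(frame):
    """Non-negative power |X[k]|^2 of one real-valued frame (naive DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):  # non-redundant bins for a real input
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def spectrogram(signal, frame_len, hop):
    """Rows are frequency bins, columns are time frames (observation vectors)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    columns = [power_spectrum(signal[s:s + frame_len]) for s in starts]
    return [list(row) for row in zip(*columns)]  # transpose to bins x frames


# Hypothetical test signal: a sinusoid with 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 16) for t in range(64)]
V = spectrogram(signal, frame_len=16, hop=8)
```

Every entry of `V` is a squared magnitude and therefore non-negative, which is the property that non-negative matrix factorization requires of an observation matrix. For this test signal the energy concentrates in frequency bin 4 of every frame.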

In the following description, data to be classified are called evaluation data, and data that are associated in advance with correct labels (classes) and used to build the model needed for classification, before the evaluation data are classified, are called teacher data. A correct label is a label indicating the class to which each observation vector in the teacher observation matrix (non-negative teacher data), obtained by converting the teacher data, belongs. In addition, the basis matrix and coefficient matrix obtained by non-negative matrix factorization of the teacher observation matrix are called the teacher basis matrix and the teacher coefficient matrix, respectively. Likewise, the basis matrix and coefficient matrix obtained by non-negative matrix factorization of the evaluation observation matrix (non-negative evaluation data), which is obtained by converting the evaluation data, are called the evaluation basis matrix and the evaluation coefficient matrix, respectively.
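The relationship between these matrices can be illustrated with a small sketch of non-negative matrix factorization, V ≈ WH, where W is the basis matrix and H the coefficient matrix. The sketch uses the multiplicative update rules for the Euclidean cost from Non-Patent Document 2; whether the embodiment uses this cost function is not stated here, so that choice is an assumption. The function names, the rank, the iteration count, and the sample matrices are hypothetical. The `encode` function shows one possible way to obtain an evaluation coefficient matrix from a fixed teacher basis matrix, namely by updating only H.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def update_h(v, w, h, eps=1e-9):
    """One multiplicative update of the coefficient matrix H."""
    wt = transpose(w)
    num, den = matmul(wt, v), matmul(matmul(wt, w), h)
    return [[h_ij * n / (d + eps) for h_ij, n, d in zip(*rows)] for rows in zip(h, num, den)]


def update_w(v, w, h, eps=1e-9):
    """One multiplicative update of the basis matrix W."""
    ht = transpose(h)
    num, den = matmul(v, ht), matmul(w, matmul(h, ht))
    return [[w_ij * n / (d + eps) for w_ij, n, d in zip(*rows)] for rows in zip(w, num, den)]


def nmf(v, rank, iters=200, seed=0):
    """Factorize non-negative V into non-negative W (basis) and H (coefficients)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    for _ in range(iters):
        h = update_h(v, w, h)
        w = update_w(v, w, h)
    return w, h


def encode(v_eval, w_teacher, iters=200, seed=1):
    """Coefficient matrix for v_eval with the teacher basis held fixed (assumed approach)."""
    rng = random.Random(seed)
    h = [[rng.random() + 0.1 for _ in v_eval[0]] for _ in range(len(w_teacher[0]))]
    for _ in range(iters):
        h = update_h(v_eval, w_teacher, h)
    return h


def reconstruction_error(v, w, h):
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


# Hypothetical 3 x 3 teacher observation matrix (columns are observation vectors).
V_teacher = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [0.0, 3.0, 3.0]]
W_teacher, H_teacher = nmf(V_teacher, rank=2)
# Hypothetical evaluation observation matrix with a single observation vector.
H_eval = encode([[2.0], [4.0], [3.0]], W_teacher)
```

Here each column of `H_teacher` or `H_eval` is a rank-2 feature vector that replaces a 3-dimensional observation vector, which is the sense in which the factorization reduces dimension before classification.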

<Schematic configuration of information processing system 10>
First, a schematic configuration of the information processing system (classification system) 10 according to the embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is an explanatory diagram showing an example of a schematic configuration of the information processing system 10 according to the present embodiment.

The information processing system 10 according to the present embodiment is a classification system that, for example, acquires time-series data (acoustic data) of the tapping sound from a measurement object and, based on the acquired time-series data, classifies the measurement object as normal or abnormal. As shown in FIG. 1, the information processing system 10 mainly includes a sensor 100 for acquiring the time-series data and an information processing device (classification device) 200 that performs the classification. An outline of each device included in the information processing system 10 is given below.

(Sensor 100)
The sensor 100 detects a change in the state of the observation target as a physical change and outputs the detected time-series data to the information processing device 200. For example, the sensor 100 may be an acoustic sensor such as a microphone that acquires ambient sound as an acoustic signal, an acceleration sensor that acquires acceleration data, a temperature sensor that acquires temperature data, or the like; it is not particularly limited as long as it can acquire time-series data.

(Information processing device 200)
The information processing device 200 analyzes and classifies the time-series data output from the sensor 100. Specifically, it functions both as a parameter generation device that generates classification model parameters for classification into a plurality of classes using continuously generated time-series data (teacher data), such as signals obtained by the sensor 100, and as a classification device that classifies time-series data (evaluation data) into the plurality of classes. For example, as shown in FIG. 1, the information processing device 200 can be a cloud 200a, an IoT gateway 200b, an edge terminal 200c, or the like. The detailed configuration of the information processing device 200 will be described later. The information processing system 10 according to the present embodiment only needs to include at least one information processing device 200 such as the cloud 200a, the IoT gateway 200b, or the edge terminal 200c.

<Detailed configuration of information processing device 200>
The schematic configuration of the information processing system 10 according to the embodiment of the present invention has been described above. Next, a detailed configuration of an example of the information processing device 200 according to the embodiment of the present invention will be described with reference to FIG. 2. FIG. 2 is a block diagram for explaining a configuration example of the information processing device 200 according to the embodiment of the present invention. As shown in FIG. 2, the information processing device 200 according to the present embodiment mainly includes a data acquisition unit 202, an operation unit 204, a processing unit 210, a teacher data processing unit 220, a parameter generation unit 222, a storage unit 224, a feature extraction unit 226, a classification unit 228, and an output unit 230. Each functional unit of the information processing device 200 is described below.

(Data acquisition unit 202)
The data acquisition unit 202 acquires, from the sensor 100, time-series data expressed in at least the frequency domain or the time domain, and outputs the acquired time-series data to the processing unit 210 described later.

(Operation unit 204)
The operation unit 204 receives user input and outputs the input information to the processing unit 210. For example, the user may operate the operation unit 204 to input information indicating whether teacher data or evaluation data is to be acquired. The user may also operate the operation unit 204 to input information indicating the class (correct label) corresponding to the time-series data currently being acquired by the data acquisition unit 202. The operation unit 204 may be realized by, for example, a mouse, a keyboard, a touch panel, a button, a switch, a lever, or a dial.

(Processing unit 210)
The processing unit 210 acquires, based on the time-series data acquired by the data acquisition unit 202, teacher data for the classification model parameter generation processing and evaluation data for the classification processing. Specifically, as shown in FIG. 2, the processing unit 210 according to the present embodiment functions as a conversion unit 212, an associating unit 214, a data dividing unit 216, and a separation degree evaluation unit (region specifying unit) 218. Each function of the processing unit 210 is described below.

~Conversion unit 212~
The conversion unit 212 performs a Fourier transform, a short-time Fourier transform, or a wavelet transform on the time-series data (teacher data and evaluation data) from the data acquisition unit 202 to generate time/frequency/amplitude data (specifically, a non-negative observation matrix containing non-negative observation vectors, for example data converted into a frequency-domain representation). As described above, the time/frequency/amplitude data may be, for example, a power spectrum, a spectrogram, or a scalogram. The conversion unit 212 may also perform a Fourier transform or the like on the time-series data of the region (target region) specified by the separation degree evaluation unit 218 described later to generate time/frequency/amplitude data. In the present embodiment, using data expressed in the frequency domain or the like, such as a power spectrum, spectrogram, or scalogram, allows the features of the class to be classified to be extracted with higher accuracy in the processing of the teacher data processing unit 220, the feature extraction unit 226, and the like described later.
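The conversion step can be sketched as follows. This is a minimal illustration, assuming a short-time Fourier transform with a Hann window; the function name `power_spectrogram`, the frame length, the hop size, and the test tone are all hypothetical choices, not values from the embodiment. Each row of the result is one non-negative observation vector.

```python
import numpy as np

def power_spectrogram(signal, frame_len=256, hop=128):
    """Return a non-negative (frames x frequency bins) power spectrogram."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping, windowed frames (one per row).
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps the non-redundant half of the spectrum of a real signal.
    spectrum = np.fft.rfft(frames, axis=1)
    # The squared magnitude is never negative, as NMF requires.
    return np.abs(spectrum) ** 2

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz for one second.
fs = 8000
t = np.arange(fs) / fs
Y = power_spectrogram(np.sin(2 * np.pi * 1000 * t))
```

Here `Y` plays the role of an observation matrix: rows are time frames and columns are frequency bins.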

~Association unit 214~
The association unit 214 associates the time-series data (teacher data) from the data acquisition unit 202 with the correct-answer label (class) input by the user via the operation unit 204. The time-series data linked to the correct-answer label is then converted by the conversion unit 212 described above and output to the teacher data processing unit 220 described later.

~Data division unit 216~
The data division unit 216 divides, in the frequency domain or the time domain, the time/frequency/amplitude data of the teacher data or the evaluation data generated by the conversion in the conversion unit 212. The data division unit 216 can divide the data according to a division rule determined in advance by the user (for example, how many regions to divide into, or how large each region should be). By adjusting the division rule appropriately, the user can have the features of the class to be classified extracted with higher accuracy and have the classification performed with higher accuracy.
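As a rough sketch of such a division rule, the following splits a time/frequency array into a grid of rectangular regions. The function name `split_regions` and the equal-size grid rule are assumptions made for illustration; the embodiment leaves the rule to the user.

```python
import numpy as np

def split_regions(tf_data, n_time, n_freq):
    """Split a (time x frequency) array into n_time * n_freq rectangular regions.

    Returns a dict mapping a region index (i, j) to a pair of slices.
    """
    t_edges = np.linspace(0, tf_data.shape[0], n_time + 1, dtype=int)
    f_edges = np.linspace(0, tf_data.shape[1], n_freq + 1, dtype=int)
    regions = {}
    for i in range(n_time):
        for j in range(n_freq):
            regions[(i, j)] = (slice(t_edges[i], t_edges[i + 1]),
                               slice(f_edges[j], f_edges[j + 1]))
    return regions

# Hypothetical 6 x 8 time/frequency array divided into four regions.
tf_data = np.arange(6 * 8, dtype=float).reshape(6, 8)
regions = split_regions(tf_data, n_time=2, n_freq=2)
block = tf_data[regions[(1, 0)]]   # later time half, lower frequency half
```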

~Separation degree evaluation unit 218~
The separation degree evaluation unit 218 evaluates, for each region produced by the data division unit 216, a degree of separation that indicates how well the classes can be discriminated in that region. As an index of the degree of separation, for example, the within-class/between-class variance ratio, the Mahalanobis distance, or the classification accuracy rate can be used. Specifically, for each region divided by the data division unit 216, the separation degree evaluation unit 218 evaluates the degree of separation (first evaluation index) of the target class (first class) and extracts the regions with a high degree of separation (first regions); it also evaluates the degree of separation (second evaluation index) of the non-target class (second class) and extracts the regions with a high degree of separation (second regions). Based on the extracted regions, it specifies the target region (predetermined region) from which the teacher data and the evaluation data are acquired. More specifically, the separation degree evaluation unit 218 specifies the target region by, for example, removing the regions with a high degree of separation for the non-target class from the regions with a high degree of separation for the target class. In other words, the present embodiment does not generate the classification parameters from the whole of the teacher data, nor does it classify using the whole of the evaluation data. Instead, the classification parameters are generated from the teacher data of a target region in which the degree of separation of the target class is high and that of the non-target class is not high, and classification is performed using the evaluation data of that target region. As a result, the features of the target class can be extracted with higher accuracy, and data can be classified into the target class with higher accuracy, without interference from the features of the non-target class.
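The region selection described above can be sketched as a set difference. This is only an illustration under several assumptions: each region is summarised by one scalar feature per sample, the degree of separation is a commonly used between-class/within-class variance ratio (not necessarily the exact definition used in the embodiment), and "high" means exceeding a hypothetical threshold. All names and numbers are made up for the example.

```python
import numpy as np

def variance_ratio(features, labels):
    """Between-class variance divided by within-class variance (scalar feature)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean()
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        x = features[labels == c]
        within += ((x - x.mean()) ** 2).sum()
        between += len(x) * (x.mean() - overall_mean) ** 2
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)

def select_target_regions(region_feats, target_labels, nontarget_labels, threshold):
    """Keep regions that separate the target classes well but not the non-target ones."""
    high_target = {r for r, f in region_feats.items()
                   if variance_ratio(f, target_labels) > threshold}
    high_nontarget = {r for r, f in region_feats.items()
                      if variance_ratio(f, nontarget_labels) > threshold}
    return high_target - high_nontarget

# Four hypothetical samples: target label = normal/abnormal, non-target label = machine A/B.
target_labels = [0, 0, 1, 1]
nontarget_labels = [0, 1, 0, 1]
region_feats = {
    "A1": [1.0, 1.1, 3.0, 3.1],   # separates only the target classes
    "A2": [1.0, 3.0, 1.1, 3.1],   # separates only the non-target classes
    "A3": [0.0, 1.0, 1.0, 2.0],   # separates both, so it is removed
    "A4": [1.0, 1.0, 1.0, 1.0],   # separates neither
}
selected = select_target_regions(region_feats, target_labels, nontarget_labels, threshold=0.5)
```

In this toy data only region "A1" survives, because it is the only region whose feature distinguishes the target classes without also distinguishing the non-target classes.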

Furthermore, the time/frequency/amplitude data of the teacher data and of the evaluation data in the target region specified by the separation degree evaluation unit 218 are output to the teacher data processing unit 220 and the feature extraction unit 226, respectively, which are described later.

In addition to the processing described above, the processing unit 210 may, as necessary, perform processing such as subtracting an offset value from the time-series data before conversion by the conversion unit 212 and frequency filtering of the converted time/frequency/amplitude data. Details of the operation of the processing unit 210 are described later.

(Teacher data processing unit 220)
The teacher data processing unit 220 performs non-negative matrix factorization on the time/frequency/amplitude data (non-negative first teacher data; specifically, the teacher observation matrix) of the target region specified by the separation degree evaluation unit 218 and received from the processing unit 210, thereby generating a teacher basis matrix and a teacher coefficient matrix (details are described later). The teacher data processing unit 220 then associates the generated teacher basis matrix with the correct-answer label (class) and outputs it to the parameter generation unit 222 and the storage unit 224 described later. Details of the operation of the teacher data processing unit 220 are described later.

(Parameter generation unit 222)
The parameter generation unit 222 generates, based on the teacher coefficient matrix and the correct-answer labels described above, classification model parameters for classifying time-series data into the target class. The classification model parameters generated by the parameter generation unit 222 may be parameters suited to the classification model used in the classification unit 228 described later. For example, when the classification unit 228 performs threshold discrimination, the classification model parameter may be a threshold. When the classification unit 228 performs linear discrimination, the classification model parameters may be the coefficients of a linear discriminant function that determines the class boundary, and when it performs quadratic discrimination, they may be the coefficients of a quadratic discriminant function that determines the class boundary. The parameter generation unit 222 outputs the classification model parameters generated in this way to the storage unit 224, where they are stored.

(Storage unit 224)
The storage unit 224 stores the programs and parameters with which each functional unit of the information processing device 200 operates. For example, the storage unit 224 stores the teacher basis matrix generated when the teacher data processing unit 220 performs non-negative matrix factorization on the time/frequency/amplitude data (non-negative first teacher data) of the target region, and the classification model parameters generated by the parameter generation unit 222.

(Feature extraction unit 226)
The feature extraction unit 226 generates, based on the teacher basis matrix stored in the storage unit 224, an evaluation coefficient matrix from the time/frequency/amplitude data (non-negative evaluation data) of the target region output by the processing unit 210, and outputs the generated evaluation coefficient matrix to the classification unit 228 described later. Details of the operation of the feature extraction unit 226 are described later.

(Classification unit 228)
The classification unit 228 classifies the evaluation coefficient matrix into the target class based on the classification model parameters stored in the storage unit 224. For example, the classification unit 228 can treat the coefficient vectors contained in the evaluation coefficient matrix as feature vectors and, using any of various classification models, classify them based on the classification model parameters that correspond to that model. Specifically, the classification unit 228 can classify using, for example, a classification model based on threshold discrimination, linear discrimination, quadratic discrimination, or an SVM (Support Vector Machine).
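The simplest of these models, threshold discrimination, might look like the sketch below. It assumes a single scalar coefficient per sample, two classes, and a threshold placed midway between the class means; that rule and the names `fit_threshold` and `classify` are hypothetical and stand in for whatever the parameter generation unit actually computes.

```python
import numpy as np

def fit_threshold(coeffs, labels):
    """Hypothetical parameter generation: midpoint between the two class means."""
    coeffs = np.asarray(coeffs, dtype=float)
    labels = np.asarray(labels)
    return 0.5 * (coeffs[labels == 0].mean() + coeffs[labels == 1].mean())

def classify(coeffs, threshold):
    """Assign class 1 above the threshold, class 0 otherwise.

    Assumes class 1 has the larger mean coefficient in the teacher data.
    """
    return (np.asarray(coeffs, dtype=float) > threshold).astype(int)

# Made-up teacher coefficients and labels.
train_coeffs = [0.1, 0.2, 0.9, 1.0]
train_labels = [0, 0, 1, 1]
threshold = fit_threshold(train_coeffs, train_labels)   # stored as the model parameter
predicted = classify([0.15, 0.95], threshold)           # applied to evaluation coefficients
```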

(Output unit 230)
The output unit 230 outputs the classification result produced by the classification unit 228. The output unit 230 can be, for example, a display device such as a display that shows the classification result, or a speaker that outputs the classification result as audio.

<Overview>
The detailed configuration of the information processing device 200 according to the embodiment of the present invention has been described above. Next, before an operation example according to the present embodiment is described, an overview of the present embodiment is given with reference to FIGS. 3 and 4. FIG. 3 is an explanatory diagram schematically showing an example of NMF processing according to the present embodiment, and FIG. 4 is a schematic diagram for explaining dimension reduction.

(Non-negative matrix factorization)
In the present embodiment, the conversion by the conversion unit 212 described above yields a non-negative observation matrix containing non-negative observation vectors (data converted into a frequency-domain representation). The observation matrix is a matrix in which one or more observation vectors, each obtained from one target section, are arranged. In the following description, as mentioned earlier, an observation matrix acquired as teacher data may be called a teacher observation matrix and an observation matrix acquired as evaluation data may be called an evaluation observation matrix to distinguish the two.

Non-negative matrix factorization (hereinafter, NMF) is an algorithm that decomposes one non-negative matrix Y into the product of two non-negative matrices W and H (for example, WH). Decomposing Y into W and H makes the latent elements (features) of the original matrix Y explicit. Because it is difficult to obtain W and H analytically, methods have been proposed that start from initial values and iteratively seek an approximate solution such that the error between Y and WH reaches a local optimum (for example, Non-Patent Documents 1 and 2 above). In general, the local optimum depends on the initial values.
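One well-known iterative scheme of this kind is the multiplicative update rule for the squared (Frobenius) error. The sketch below uses it purely as an illustration; whether the cited documents or the embodiment use this particular rule is an assumption. The shapes assumed here are Y: n×m, W: n×k, H: k×m, and the function name `nmf`, the iteration count, and the random seed are hypothetical.

```python
import numpy as np

def nmf(Y, k, n_iter=1000, seed=0, eps=1e-9):
    """Approximate a non-negative Y (n x m) as W (n x k) @ H (k x m)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    # Random non-negative initial values; the result depends on them.
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Made-up rank-2 non-negative data.
rng = np.random.default_rng(1)
Y = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(Y, k=2)
relative_error = np.linalg.norm(Y - W @ H) / np.linalg.norm(Y)
```

Changing `seed` changes the initial values and, in general, the local optimum that is reached.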

As shown in FIG. 3, NMF decomposes, for example, a non-negative observation matrix Y (a teacher observation matrix or an evaluation observation matrix) into the product of a non-negative coefficient matrix W and a non-negative basis matrix H.

Further, as shown in FIG. 3, the observation matrix Y according to the present embodiment is, for example, an n×m matrix (n rows, m columns) whose rows are m-dimensional observation vectors y, one row for each of the n target sections in the time-series data.

The basis matrix H according to the present embodiment is, for example, a k×m matrix whose rows are the m-dimensional basis vectors h obtained by applying NMF to the observation matrix Y, one row for each of the k basis vectors.

The coefficient matrix W according to the present embodiment is, for example, an n×k matrix whose rows are k-dimensional coefficient vectors w, one row for each of the n observation vectors. Here, a coefficient vector w is a row vector of weights indicating how much of each basis vector in the basis matrix H is contained in a given observation vector.

As shown in FIG. 3, the above matrices satisfy the relationship of the following expression (1).
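Expression (1) is not reproduced in this text. From the dimensions given above (Y is n×m, W is n×k, H is k×m), it presumably takes the following form. This is a reconstruction under that assumption, and the original may use an equality sign rather than an approximation.

```latex
Y \approx W H \tag{1}
```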

As described above, the basis matrix and the coefficient matrix obtained by decomposing the teacher observation matrix Y_L are the teacher basis matrix H_L and the teacher coefficient matrix W_L, respectively.

In the present embodiment, the teacher data processing unit 220 described above can obtain the teacher basis matrix H_L and the teacher coefficient matrix W_L by applying NMF separately for each class of the correct-answer labels associated with the teacher observation matrix Y_L.

For example, the teacher data processing unit 220 first applies NMF to each of the data sets (observation vectors or observation matrices) Y_C1, Y_C2, ..., Y_CN of the classes (c1, c2, ..., cN) of the teacher observation matrix Y_L. From the vectors or matrices H_C1, H_C2, ..., H_CN obtained by decomposing the respective data sets, the teacher data processing unit 220 then generates the teacher basis matrix H_L = {H_C1, H_C2, ..., H_CN}.
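The per-class procedure can be sketched as follows. The sketch assumes that the set {H_C1, ..., H_CN} is realised by stacking the per-class bases row-wise, that every class gets the same number of basis vectors, and that NMF is done with multiplicative updates; none of these choices is confirmed by the text, and the function names are hypothetical.

```python
import numpy as np

def nmf_basis(Y, k, n_iter=200, seed=0, eps=1e-9):
    """Return a (k x m) non-negative basis for Y via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((Y.shape[0], k)) + eps
    H = rng.random((k, Y.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return H

def build_teacher_basis(Y_L, labels, k_per_class):
    """Apply NMF to each class's rows separately and stack the resulting bases."""
    labels = np.asarray(labels)
    bases = [nmf_basis(Y_L[labels == c], k_per_class) for c in np.unique(labels)]
    # Row-wise stacking stands in for H_L = {H_C1, H_C2, ..., H_CN}.
    return np.vstack(bases)

# Made-up teacher observation matrix: 12 observation vectors, 16 frequency bins, 2 classes.
rng = np.random.default_rng(0)
Y_L = rng.random((12, 16))
labels = np.array([0] * 6 + [1] * 6)
H_L = build_teacher_basis(Y_L, labels, k_per_class=2)
```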

Then, letting the teacher coefficient matrix W_L be the coefficient matrix generated from the teacher observation matrix Y_L based on the teacher basis matrix H_L, it follows from expression (1) that the relationship among the teacher observation matrix Y_L, the teacher coefficient matrix W_L, and the teacher basis matrix H_L can be expressed by the following expression (2).

According to expression (2), obtaining the teacher coefficient matrix W_L requires the inverse of the teacher basis matrix H_L. However, the teacher basis matrix H_L is not necessarily a regular (invertible) matrix and may therefore have no inverse. The teacher data processing unit 220 may therefore generate the teacher coefficient matrix W_L based on, for example, the pseudo-inverse (Moore-Penrose pseudo-inverse) of the teacher basis matrix. Because the teacher basis matrix H_L is a k×m matrix and, in general, k < m, its pseudo-inverse H_L⁺ can be expressed by the following expression (3).

Using the pseudo-inverse H_L⁺ obtained from expression (3), the teacher coefficient matrix W_L can be expressed by the following expression (4).
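Expressions (2) to (4) are not reproduced in this text. Based on the surrounding description, and assuming that H_L has full row rank (k linearly independent basis vectors with k < m), they presumably take the following standard forms. Treat this as a reconstruction rather than the original notation.

```latex
\begin{align}
Y_L &\approx W_L H_L \tag{2}\\
H_L^{+} &= H_L^{\top}\left(H_L H_L^{\top}\right)^{-1} \tag{3}\\
W_L &= Y_L H_L^{+} \tag{4}
\end{align}
```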

NMF generally tends to generate bases whose features correspond to data that has a large amplitude and occurs frequently. Consequently, when NMF is applied to all data sets at once and some class contains small-amplitude or infrequent data, a basis with features corresponding to that class's data may be hard to obtain. In the present embodiment, applying NMF per class makes it easier to generate bases with features corresponding to small-amplitude or infrequent data as well, which further improves classification accuracy. Note that, in the present embodiment, NMF may also be applied collectively to the data sets of several classes or of all classes.

The teacher data processing unit 220 described above links the generated teacher basis matrix H_L to the correct-answer label (class) associated by the association unit 214 and outputs it to the parameter generation unit 222 and the storage unit 224.

Next, in the present embodiment, the feature extraction unit 226 described above generates an evaluation coefficient matrix W_T from the evaluation observation matrix Y_T of the evaluation data, based on the teacher basis matrix H_L stored in the storage unit 224. As described above, the basis matrix and the coefficient matrix obtained by decomposing the evaluation observation matrix Y_T are the evaluation basis matrix H_T and the evaluation coefficient matrix W_T, respectively. An example of how the feature extraction unit 226 generates the evaluation coefficient matrix W_T is described below.

Based on expression (1), the relationship among the teacher basis matrix H_L, the evaluation observation matrix Y_T, and the evaluation coefficient matrix W_T can be expressed by the following expression (5).

Using the pseudo-inverse H_L⁺ obtained in the same way as in expression (3), the evaluation coefficient matrix W_T can be expressed by the following expression (6).
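Expressions (5) and (6) are not reproduced in this text; from the description they presumably read Y_T ≈ W_T H_L and W_T = Y_T H_L⁺. The following sketch computes the evaluation coefficients that way, assuming H_L has full row rank so that the pseudo-inverse equals H_Lᵀ(H_L H_Lᵀ)⁻¹. All matrices are made-up toy values.

```python
import numpy as np

# Hypothetical teacher basis matrix H_L (k=2 basis vectors, m=4 frequency bins).
H_L = np.array([[1.0, 0.5, 0.0, 0.0],
                [0.0, 0.0, 0.5, 1.0]])

# Hypothetical evaluation observation matrix Y_T (n=2 observation vectors),
# built from known coefficients so the result can be checked.
true_W = np.array([[2.0, 0.0],
                   [0.5, 3.0]])
Y_T = true_W @ H_L

# Right pseudo-inverse of a full-row-rank H_L with k < m (m x k matrix).
H_pinv = H_L.T @ np.linalg.inv(H_L @ H_L.T)

# Evaluation coefficient matrix: each row is a k-dimensional feature vector.
W_T = Y_T @ H_pinv
```

In this toy case `W_T` reproduces `true_W`, and `H_pinv` agrees with `np.linalg.pinv(H_L)`.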

Then, in the present embodiment, the feature extraction unit 226 outputs the evaluation coefficient matrix W_T generated as described above to the classification unit 228, and the classification unit 228 classifies the evaluation coefficient matrix W_T based on the classification model parameters stored in the storage unit 224.

In the present embodiment, classifying with the coefficient vectors, which indicate the weights on the basis matrix, as feature vectors in this way improves classification accuracy compared with, for example, classifying directly on the magnitude of the power at a specific frequency.

Also, in the present embodiment, performing the classification processing with the evaluation coefficient matrix W_T generated from the evaluation data, rather than with the evaluation data (time-series data) itself, reduces the dimension of the feature vector in the classification processing and thus reduces the processing amount of the classification processing.

For example, as shown in FIG. 4, NMF allows the evaluation data G20, which is a power spectrum, to be expressed as the sum of the product of the basis vector G21 and the coefficient W_1 and the product of the basis vector G22 and the coefficient W_2. In the example of FIG. 4, classifying with the evaluation data G20 itself as the feature vector requires processing in a number of dimensions determined by the frequency resolution of G20. The coefficient vector in the example of FIG. 4, by contrast, has only two dimensions. Because the coefficient vector has fewer dimensions than the evaluation data G20, classifying with the evaluation coefficient matrix W_T reduces the dimension of the feature vector in the classification processing and thus reduces the processing amount, as described above.
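The dimension reduction can be illustrated numerically. In this sketch the two basis spectra, the number of frequency bins (128), and the coefficient values are invented stand-ins for G21, G22, W_1, and W_2; the point is only that a 128-dimensional spectrum is represented by a 2-dimensional coefficient vector.

```python
import numpy as np

m = 128                                   # hypothetical number of frequency bins
freq = np.arange(m)
# Two hypothetical basis spectra (stand-ins for G21 and G22): peaks at bins 20 and 80.
g21 = np.exp(-0.5 * ((freq - 20) / 3.0) ** 2)
g22 = np.exp(-0.5 * ((freq - 80) / 3.0) ** 2)
H = np.vstack([g21, g22])                 # 2 x 128 basis matrix

w = np.array([0.7, 0.3])                  # coefficient vector (W_1, W_2)
g20 = w @ H                               # 128-dimensional power spectrum

# Recover the 2-dimensional feature vector from the 128-dimensional spectrum.
w_recovered = g20 @ np.linalg.pinv(H)
```

A classifier that works on `w_recovered` handles 2 values per observation instead of 128.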

(Target region)
However, although using NMF can reduce the processing amount of the classification processing, spectrograms and scalograms are data with two axes, time and frequency, so when NMF is applied to a spectrogram or the like, there may be a limit to how far the processing amount can be reduced.

In addition, as described above, NMF generally tends to select bases that have large amplitude values or appear frequently. Therefore, when NMF is applied to the whole of the time/frequency/amplitude data, then depending on the data, a basis with low class-separation ability but large amplitude may be selected, while a basis with high class-separation ability but small amplitude is not selected. In such cases the separation accuracy can decrease, so there has sometimes been a limit to how far classification accuracy can be improved.

The possibility of such a limit on classification accuracy is explained with reference to FIG. 5. FIG. 5 is a schematic diagram showing an example in which NMF is applied to the whole of the time/frequency/amplitude data. The spectrograms G61 and G62 shown in FIG. 5 represent the features of class 1 and class 2, respectively. In the spectrograms G61 and G62, the darkness of the black shading indicates the magnitude of the amplitude.

In the example shown in FIG. 5, the amplitude in the large-amplitude region R1 appears in both the spectrogram G61, which represents the features of class 1, and the spectrogram G62, which represents the features of class 2, and is assumed not to be a feature useful for classification. On the other hand, in the example shown in FIG. 5, the amplitudes in the small-amplitude regions R2 and R3 are assumed to be features useful for distinguishing class 1 from class 2.

However, if NMF is applied to the entire data of the spectrograms G61 and G62, the bases G611 and G621 corresponding to the large-amplitude region R1 are selected, and the small-amplitude bases that represent the features useful for distinguishing class 1 from class 2 may not be selected. In that case, the coefficient G613 obtained from the spectrogram G61 and the coefficient G623 obtained from the spectrogram G62 become identical, which makes it difficult to classify class 1 and class 2 accurately.

Therefore, in the present embodiment, to further improve classification accuracy, NMF is applied partially, restricted to the time region and frequency region that are useful for class classification. Such an application example is described with reference to FIG. 6. FIG. 6 is a schematic diagram showing an example in which NMF is applied to a part of the time/frequency/amplitude data.

The spectrograms G71 and G72 shown in FIG. 6, like the spectrograms G61 and G62 shown in FIG. 5, represent the features of class 1 and class 2, respectively. As shown in FIG. 6, when NMF is applied partially, restricted to the region R4 of the spectrograms G71 and G72, the bases G711 and G721 corresponding to the region R2 and the bases G712 and G722 corresponding to the region R3 are selected. As a result, the coefficients G713 and G714 obtained from the spectrogram G71 differ greatly from the coefficients G723 and G724 obtained from the spectrogram G72, and class 1 and class 2 can be classified easily. In other words, applying NMF partially, restricted to the time region and frequency region useful for class classification, further improves classification accuracy. In addition, in the present embodiment, NMF needs to be applied only to the region R4, so the processing amount can be reduced further.
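The effect of restricting the region can be seen even without running NMF. The sketch below builds two made-up 8×8 spectrograms that share a large-amplitude band and differ only in small-amplitude patches, and compares their cosine similarity over the whole array and over a restricted region. The array sizes, amplitudes, and region bounds are hypothetical and only loosely mirror R1 to R4.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two arrays, treated as flat vectors."""
    return float(a.ravel() @ b.ravel() / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 8 x 8 spectrograms (time x frequency) for class 1 and class 2.
g1 = np.zeros((8, 8))
g2 = np.zeros((8, 8))
g1[0:4, :] = 10.0          # large-amplitude region shared by both classes (like R1)
g2[0:4, :] = 10.0
g1[5:7, 0:3] = 1.0         # small-amplitude feature of class 1 (like R2)
g2[5:7, 5:8] = 1.0         # small-amplitude feature of class 2 (like R3)

region = (slice(4, 8), slice(0, 8))   # restricted region (like R4)

similarity_full = cosine(g1, g2)                      # dominated by the shared band
similarity_region = cosine(g1[region], g2[region])    # only the class-specific patches
```

Over the whole array the two classes look almost identical, while inside the restricted region they do not overlap at all, which is the situation in which a factorization restricted to that region can yield clearly different coefficients.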

For example, hammering-sound data generally occupies only a limited time span, and it may be sufficient to analyze, for the purpose at hand, only the part of the time span or frequency band in which a given amplitude appears in the time/frequency/amplitude data. Likewise, for data such as the vibration noise of a machine that emits sound in a specific frequency band, it may be sufficient to analyze only that frequency band.

Furthermore, in the present embodiment, to further improve classification accuracy, NMF is applied partially not only by restricting it to the time region and frequency region useful for the target class classification, but also by excluding the time region and frequency region useful for the non-target class classification. That is, in the present embodiment, NMF is applied only to regions in which the degree of separation of the target class is high and that of the non-target class is not high. This allows the features of the target class to be extracted with higher accuracy, and data to be classified into the target class with higher accuracy, without interference from the features of the non-target class. As a result, classification accuracy can be further improved. In addition, because NMF needs to be applied only to a restricted region, the processing amount can be reduced further.

In the following description, the target class classification means, for example, classifying into two classes that indicate whether the measurement object is normal or abnormal (classes relating to the state of the measurement object (machine)). The non-target class classification means, for example, classifying by classes relating to individual differences between measurement objects (machines) or by classes relating to differences in the environment in which the measurement object (machine) is installed. In other words, in the present embodiment, applying NMF partially as described above suppresses, for example, the influence of individual differences and environmental differences of the measurement object, so that the features of the target classes, which indicate whether the measurement object is normal or abnormal, can be extracted with high accuracy.

(Degree of separation)
In the present embodiment, the region to which NMF is applied (the target region) is limited based on the degree of separation of the target class classification and the degree of separation of the non-target class classification. The limitation of the target region to which NMF is applied is described with reference to FIGS. 7 to 9. FIG. 7 is an explanatory diagram showing an example of data division according to the present embodiment, specifically an example of data division by the data division unit 216 and of the acquisition of teacher data and evaluation data by the separation degree evaluation unit 218. FIGS. 8 and 9 are explanatory diagrams showing examples of the acquisition of teacher data and evaluation data according to the present embodiment.

As shown in FIG. 7, the data division unit 216 divides, for example, the region A0 of the time/frequency/amplitude data obtained from the teacher data (the observation matrix of the non-negative second teacher data; A0 corresponds to the region R80 in FIGS. 8 and 9) into four regions A1, A2, A3, and A4. The number of divisions performed by the data division unit 216 may be any number of two or more.

Then, for each of the divided regions A1, A2, A3, and A4, the separation degree evaluation unit 218 calculates, for example, the within-class/between-class variance ratio as the degree of separation (first evaluation index) between the classes of the target class (first class). First, the within-class variance σ_W² and the between-class variance σ_B² can be expressed by the following formulas (7) and (8).

In formulas (7) and (8), n is the total number of data items, n_i is the number of data items in class i, x is a feature vector, m_i is the mean of the feature vectors of class i, and m is the mean of all feature vectors. Using the within-class variance σ_W² and the between-class variance σ_B², the within-class/between-class variance ratio J is given by the following formula (9).
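
Formulas (7) to (9) appear as images in the publication and are not carried in this text. Assuming the standard definitions of within-class and between-class variance, which are consistent with the symbols n, n_i, x, m_i, and m defined above, they would take roughly the following form. The symbols c (the number of classes) and \chi_i (the set of feature vectors belonging to class i) are introduced here for illustration and do not come from the source.

```latex
\begin{align}
\sigma_W^2 &= \frac{1}{n} \sum_{i=1}^{c} \sum_{x \in \chi_i} (x - m_i)^{\top} (x - m_i) \tag{7} \\
\sigma_B^2 &= \frac{1}{n} \sum_{i=1}^{c} n_i \, (m_i - m)^{\top} (m_i - m) \tag{8} \\
J &= \frac{\sigma_B^2}{\sigma_W^2} \tag{9}
\end{align}
```

Under this form, J grows when the class means move apart relative to the scatter inside each class, which matches the statement that a larger J indicates a higher degree of separation.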

The larger the within-class/between-class variance ratio J, the higher the degree of separation, and the better the feature can be considered to be. The index indicating the degree of separation is not limited to the above; for example, the Mahalanobis distance may also be used.

The separation degree evaluation unit 218 calculates, for example, the degree of separation for the target class in each of the four regions A1, A2, A3, and A4 as described above. The separation degree evaluation unit 218 then extracts, for example, the region (first region) R10 (see FIG. 8) in which the degree of separation calculated from the four regions A1, A2, A3, and A4 is equal to or greater than a preset threshold L_1 (first threshold).
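
The per-region evaluation and thresholding described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the usual between-class over within-class variance ratio, and the function names, the sample features, and the threshold value are all hypothetical.

```python
from statistics import fmean


def variance_ratio(features, labels):
    """Return J = between-class variance / within-class variance.

    features: list of equal-length feature vectors (lists of floats)
    labels:   class label for each feature vector
    """
    n = len(features)
    dim = len(features[0])
    overall_mean = [fmean(x[d] for x in features) for d in range(dim)]

    # Group the feature vectors by class label.
    by_class = {}
    for x, label in zip(features, labels):
        by_class.setdefault(label, []).append(x)

    within = 0.0
    between = 0.0
    for members in by_class.values():
        class_mean = [fmean(x[d] for x in members) for d in range(dim)]
        within += sum(
            (x[d] - class_mean[d]) ** 2 for x in members for d in range(dim)
        )
        between += len(members) * sum(
            (class_mean[d] - overall_mean[d]) ** 2 for d in range(dim)
        )
    within /= n
    between /= n
    return between / within if within > 0 else float("inf")


def regions_at_or_above(scores, threshold):
    """Return the names of the regions whose score is >= threshold."""
    return {name for name, score in scores.items() if score >= threshold}


# Hypothetical one-dimensional features for two regions, two samples per class.
labels = ["normal", "normal", "abnormal", "abnormal"]
region_features = {
    "A1": [[1.0], [1.2], [5.0], [5.2]],  # classes well separated
    "A2": [[1.0], [5.0], [1.2], [5.2]],  # classes overlap
}
scores = {name: variance_ratio(f, labels) for name, f in region_features.items()}
r10 = regions_at_or_above(scores, threshold=10.0)  # threshold L_1, value assumed
```

In this toy data only region A1 passes the threshold, so it alone would make up R10.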

Similarly, the separation degree evaluation unit 218 calculates, for example, the degree of separation (second evaluation index) for the non-target class (second class) for each of the regions A1, A2, A3, and A4 of the divided time/frequency/amplitude data obtained from the teacher data (the observation matrix of the non-negative third teacher data). The separation degree evaluation unit 218 then extracts, for example, the region (second region) R12 (see FIG. 8) in which the degree of separation calculated from the four regions A1, A2, A3, and A4 is equal to or greater than a preset threshold L_2 (second threshold). There may be two or more non-target classes. In that case, the separation degree evaluation unit 218 calculates, for example, the degree of separation for the other non-target class for each of the divided regions A1, A2, A3, and A4, and extracts the region R14 (see FIG. 9) in which the degree of separation calculated from the four regions is equal to or greater than a preset threshold L_3.

Further, the separation degree evaluation unit 218 identifies the target region (predetermined region) to which NMF is applied based on the extracted regions R10, R12, and R14. For example, the separation degree evaluation unit 218 identifies the target region based on any of the sum, difference, or product of the extracted regions R10, R12, and R14. Specifically, as shown in FIG. 8, the separation degree evaluation unit 218 identifies the region G810 to which NMF is applied by removing the region R12, whose degree of separation is equal to or greater than the threshold L_2, from the region R10, whose degree of separation is equal to or greater than the threshold L_1. When there are two non-target classes, as shown in FIG. 9, the separation degree evaluation unit 218 identifies the region G820 to which NMF is applied by removing from the region R10 both the region R12 (threshold L_2 or more) and the region R14 (threshold L_3 or more). That is, in the present embodiment, selecting a region in which the degree of separation of the target class is high and the degree of separation of the non-target class is not high makes it possible to efficiently find a target region that is effective for classifying the target class.
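
The region difference described above maps naturally onto set operations. The sketch below assumes that each region is represented as a set of (time index, frequency index) cells; that representation, the helper names, and the rectangle coordinates are illustrative assumptions rather than details from the source.

```python
def rectangle(t0, t1, f0, f1):
    """Return the set of (time, frequency) cells in a half-open rectangle."""
    return {(t, f) for t in range(t0, t1) for f in range(f0, f1)}


def target_region(target_high, non_target_high_list):
    """Remove every non-target high-separation region from the target one."""
    result = set(target_high)
    for region in non_target_high_list:
        result -= region
    return result


r10 = rectangle(0, 4, 0, 4)  # target class separation >= L_1
r12 = rectangle(2, 4, 0, 2)  # first non-target class separation >= L_2
r14 = rectangle(0, 2, 2, 4)  # second non-target class separation >= L_3

g810 = target_region(r10, [r12])       # one non-target class (FIG. 8 case)
g820 = target_region(r10, [r12, r14])  # two non-target classes (FIG. 9 case)
```

Representing regions as sets keeps the sum, difference, and product mentioned in the text available as the `|`, `-`, and `&` operators.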

As described above, in the present embodiment, applying NMF to the target region obtained by removing the regions with a high degree of separation for the non-target class from the region with a high degree of separation for the target class yields features that have a higher degree of separation and are effective for more accurate classification.

In the learning phase, the separation degree evaluation unit 218 acquires the data in the time domain and frequency domain of the target region identified as described above as teacher data (the teacher observation matrix of the non-negative first teacher data). Further, the separation degree evaluation unit 218 specifies the time domain and frequency domain identified as described above as the data region (target region) from which evaluation data is acquired, and acquires the time/frequency/amplitude data of that data region as evaluation data (the evaluation observation matrix of the non-negative evaluation data).

In the present embodiment, the classification accuracy rate may also be used as the degree of separation. Specifically, the separation degree evaluation unit 218 applies NMF to each of the divided data items shown in FIG. 7 and acquires basis matrices and coefficient matrices at a plurality of ranks. The time/frequency/amplitude data may be converted into a one-dimensional column vector in order to apply NMF.

The following example describes the case where there are two classes, class C1 and class C2. First, from the regions A1, A2, A3, and A4 of a plurality of time/frequency/amplitude data items belonging to class C1, the basis matrices H_{C1,A1}, H_{C1,A2}, H_{C1,A3}, H_{C1,A4} and the coefficient matrices W_{C1,A1}, W_{C1,A2}, W_{C1,A3}, W_{C1,A4} can be obtained. Next, from the regions A1, A2, A3, and A4 of a plurality of time/frequency/amplitude data items belonging to class C2, the basis matrices H_{C2,A1}, H_{C2,A2}, H_{C2,A3}, H_{C2,A4} and the coefficient matrices W_{C2,A1}, W_{C2,A2}, W_{C2,A3}, W_{C2,A4} can be obtained in the same way. The rank, which indicates the number of bases into which the data is decomposed, ranges from 1 to the number of dimensions of the original data.

Here, as described above, the classification accuracy rate of class C1 or class C2 is used as the index indicating the degree of separation of class C1 or class C2. First, the separation degree evaluation unit 218 computes a common basis H_{C1-C2,A1} = {H_{C1,A1}, H_{C2,A1}} by combining the bases H_{C1,A1} and H_{C2,A1} obtained from classes C1 and C2, respectively. By multiplying the original class C1 data Y_{C1,A0} and class C2 data Y_{C2,A0} by the pseudo-inverse H_{C1-C2,A1}^+ of H_{C1-C2,A1}, the coefficient matrices W_{C1,A0} and W_{C2,A0} of the respective classes can be expressed by the following formulas (10) and (11).
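
Formulas (10) and (11) are likewise images in the publication and are not carried in this text. As a hedged reconstruction, assuming the factorization convention Y ≈ HW with the basis matrix on the left (an assumption; this section does not state the convention), they would read:

```latex
\begin{align}
W_{C1,A0} &= H_{C1\text{-}C2,A1}^{+} \, Y_{C1,A0} \tag{10} \\
W_{C2,A0} &= H_{C1\text{-}C2,A1}^{+} \, Y_{C2,A0} \tag{11}
\end{align}
```

If the source instead uses the opposite convention, with the basis matrix on the right, the pseudo-inverse would multiply the data from the right.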

The separation degree evaluation unit 218 classifies the coefficient matrices W_{C1,A0} and W_{C2,A0} as the feature vectors of class C1 and class C2, respectively, and calculates the classification accuracy rate. The classification method used by the separation degree evaluation unit 218 here may be the same as, for example, that of the classification unit 228.

The data division unit 216 may further divide the data recursively in each of the four regions A1, A2, A3, and A4 shown in FIG. 7. For example, the region A1 shown in FIG. 7 may be divided into four regions A11, A12, A13, and A14. The separation degree evaluation unit 218 then calculates the degree of separation for each of the divided data items and for data obtained by combining divided data items. The recursive division may be repeated until the time domain and frequency domain reach their minimum units, or until the size of the divided data reaches a predetermined value.
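
The recursive division can be sketched as a quadtree-style split over time and frequency indices. The halving rule, the half-open index ranges, and the `min_size` stopping parameter are assumptions made for illustration; the text only requires that division stop at a minimum unit or a predetermined size.

```python
def split_recursively(t0, t1, f0, f1, min_size=1):
    """Yield (t0, t1, f0, f1) for a region and all of its recursive quarters.

    Bounds are half-open index ranges over time frames and frequency bins.
    Splitting stops once a half would be smaller than min_size.
    """
    yield (t0, t1, f0, f1)
    t_mid = (t0 + t1) // 2
    f_mid = (f0 + f1) // 2
    if t_mid - t0 < min_size or f_mid - f0 < min_size:
        return
    quarters = (
        (t0, t_mid, f0, f_mid),
        (t0, t_mid, f_mid, f1),
        (t_mid, t1, f0, f_mid),
        (t_mid, t1, f_mid, f1),
    )
    for sub in quarters:
        yield from split_recursively(*sub, min_size=min_size)


# A 4 x 4 region A0 yields A0 itself, 4 quarters, and 16 single cells.
regions = list(split_recursively(0, 4, 0, 4, min_size=1))
```

Each yielded rectangle is a candidate region for which a degree of separation could then be computed.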

The above description used recursive division as an example of efficiently extracting time-domain and frequency-domain regions with a high degree of separation, but the present invention is not limited to this example. For example, the degree of separation may be evaluated for some or all of the time-domain and frequency-domain regions, or regions with a high degree of separation may be extracted by another method.

<Operation example>
The outline of the present embodiment has been described above. Next, an example of the operation (classification method) of the present embodiment is described separately for the learning phase and the classification phase, with reference to FIGS. 10 and 11, respectively. FIG. 10 is a flowchart showing an operation example of the present embodiment in the learning phase, and FIG. 11 is a flowchart showing an operation example of the present embodiment in the classification phase.

(Operation example of the learning phase)
Next, the operation example of the learning phase according to the present embodiment is described in detail. As shown in FIG. 10, the learning phase includes, for example, a plurality of steps from step S101 to step S125. Each step included in the learning phase is described in detail below.

~ Step S101 ~
The data acquisition unit 202 acquires time-series data from the sensor 100. At this time, the processing unit 210 may subtract an offset value from the acquired time-series data. Alternatively, the processing unit 210 may perform frequency filtering or similar processing on the time/frequency/amplitude data converted in step S107, described later. Doing so can further improve the accuracy of the time/frequency/amplitude data to which NMF is applied and the usefulness of the features for classification.

~ Step S103 ~
Next, the association unit 214 associates the time-series data acquired in step S101 with the correct-answer labels (target class and non-target class) input by the user via the operation unit 204.

~ Step S105 ~
The processing unit 210 determines whether acquisition of the time-series data used in the learning phase has been completed. This determination may be made based on, for example, input from the operation unit 204. If acquisition of the time-series data used in the learning phase has not been completed, the process returns to step S101. If it has been completed, the process proceeds to step S107.

~ Step S107 ~
The conversion unit 212 converts the time-series data acquired in step S103 into time/frequency/amplitude data.
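
This section does not state how the conversion is performed. One common choice, assumed here purely for illustration, is the magnitude of a short-time Fourier transform, which yields the non-negative values that NMF requires. The function name, the Hann window, and the frame parameters below are hypothetical, and the direct DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


def spectrogram(samples, frame_len, hop):
    """Return a list of frames, each a list of non-negative amplitudes.

    Each frame is Hann-windowed and transformed with a direct DFT; only the
    bins 0 .. frame_len // 2 are kept.
    """
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
        for i in range(frame_len)
    ]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        amplitudes = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            amplitudes.append(abs(coeff))  # magnitude is always >= 0
        frames.append(amplitudes)
    return frames


# Hypothetical test signal: a sine with 4 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 4 * i / 16) for i in range(64)]
spec = spectrogram(signal, frame_len=16, hop=8)
```

Here `spec[t][k]` is the amplitude at time frame `t` and frequency bin `k`, so the result can serve as time/frequency/amplitude data for the later division steps.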

~ Step S109 ~
Next, the data division unit 216 divides the time/frequency/amplitude data acquired in step S107 in the time domain and the frequency domain.

~ Step S111 ~
Next, the separation degree evaluation unit 218 evaluates the degree of separation of the target class for each region divided by the data division unit 216 in step S109.

~ Step S113 ~
From the plurality of regions divided in step S109, the separation degree evaluation unit 218 extracts the regions for which the degree of separation evaluated in step S111 is high.

~ Step S115 ~
Next, the separation degree evaluation unit 218 evaluates the degree of separation of the non-target class for each region divided by the data division unit 216 in step S109.

~ Step S117 ~
From the plurality of regions divided in step S109, the separation degree evaluation unit 218 extracts the regions for which the degree of separation evaluated in step S115 is high.

~ Step S119 ~
The separation degree evaluation unit 218 identifies the target region by excluding the regions with a high degree of separation for the non-target class, extracted in step S117, from the regions with a high degree of separation for the target class, extracted in step S113.

~ Step S121 ~
The teacher data processing unit 220 acquires the time/frequency/amplitude data of the target region identified in step S119 (the teacher observation matrix of the teacher data).

~ Step S123 ~
The teacher data processing unit 220 applies NMF to the time/frequency/amplitude data of the target region acquired in step S121 (the teacher observation matrix of the teacher data) to obtain a teacher basis matrix and a teacher coefficient matrix.
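
As a rough illustration of this step, the sketch below factorizes a small non-negative matrix with the well-known multiplicative update rules for the squared-error cost. The choice of update rule, the basis-on-the-left convention, and all names and parameters are assumptions; this section does not specify which NMF algorithm the embodiment uses.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(y, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative y (rows x cols) as basis @ coeff.

    basis is rows x rank and coeff is rank x cols. Both stay non-negative
    because each update multiplies by a ratio of non-negative terms.
    """
    rng = random.Random(seed)
    rows, cols = len(y), len(y[0])
    basis = [[rng.random() + eps for _ in range(rank)] for _ in range(rows)]
    coeff = [[rng.random() + eps for _ in range(cols)] for _ in range(rank)]
    for _ in range(iterations):
        # coeff <- coeff * (basis^T y) / (basis^T basis coeff)
        bt = transpose(basis)
        num = matmul(bt, y)
        den = matmul(matmul(bt, basis), coeff)
        coeff = [[c * n / (d + eps) for c, n, d in zip(cr, nr, dr)]
                 for cr, nr, dr in zip(coeff, num, den)]
        # basis <- basis * (y coeff^T) / (basis coeff coeff^T)
        ct = transpose(coeff)
        num = matmul(y, ct)
        den = matmul(basis, matmul(coeff, ct))
        basis = [[b * n / (d + eps) for b, n, d in zip(br, nr, dr)]
                 for br, nr, dr in zip(basis, num, den)]
    return basis, coeff


# Hypothetical rank-1 observation matrix standing in for the target region.
y = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
basis, coeff = nmf(y, rank=1)
approx = matmul(basis, coeff)
```

In practice a numerical library would replace the list-based helpers; they are written out here only so that the sketch runs on its own.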

~ Step S125 ~
Further, the parameter generation unit 222 generates classification model parameters based on the teacher coefficient matrix obtained in step S123 and the correct-answer label (target class) associated in step S103.

The operation example of the present embodiment in the learning phase has been described above. Next, an operation example of the present embodiment in the classification phase, which uses the classification model parameters generated in the learning phase, is described with reference to FIG. 11.

(Operation example of the classification phase)
Next, the operation example of the classification phase is described in detail. As shown in FIG. 11, the classification phase according to the present embodiment includes, for example, a plurality of steps from step S201 to step S215. Each step included in the classification phase is described in detail below.

~ Step S201 ~
The data acquisition unit 202 acquires time-series data. At this time, the processing unit 210 may subtract an offset value from the acquired time-series data. Alternatively, the processing unit 210 may perform frequency filtering or similar processing on the time/frequency/amplitude data converted in step S203, described later. Doing so can further improve the accuracy of the time/frequency/amplitude data to which NMF is applied and the accuracy of classification.

~ Step S203 ~
The conversion unit 212 converts the time-series data acquired in step S201 into time/frequency/amplitude data.

~ Step S205 ~
Next, the data division unit 216 divides the time/frequency/amplitude data acquired in step S203 in the time domain and the frequency domain.

~ Step S207 ~
Next, from the time/frequency/amplitude data divided in step S205, the separation degree evaluation unit 218 identifies the same region as the target region identified in step S119 of the learning phase described with reference to FIG. 10.

~ Step S209 ~
Further, the feature extraction unit 226 acquires the time/frequency/amplitude data of the target region identified in step S207 (the evaluation observation matrix of the evaluation data).

~ Step S211 ~
Next, based on the teacher basis matrix obtained in step S123 of the learning phase described above, the feature extraction unit 226 generates an evaluation coefficient matrix from the time/frequency/amplitude data of the target region acquired in step S209 (the evaluation observation matrix of the evaluation data).
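
This section does not state how the evaluation coefficient matrix is computed from the teacher basis matrix. One plausible approach, shown here only as an assumption, keeps the basis fixed and runs the multiplicative update for the coefficients alone; a pseudo-inverse projection would be another option. All names and values below are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def coefficients_for_fixed_basis(y, basis, iterations=200, eps=1e-9):
    """Estimate non-negative coeff with y ~= basis @ coeff, basis held fixed."""
    rank, cols = len(basis[0]), len(y[0])
    coeff = [[1.0] * cols for _ in range(rank)]
    bt = transpose(basis)
    num = matmul(bt, y)       # constant, since basis and y do not change
    gram = matmul(bt, basis)  # basis^T basis, also constant
    for _ in range(iterations):
        den = matmul(gram, coeff)
        coeff = [[c * n / (d + eps) for c, n, d in zip(cr, nr, dr)]
                 for cr, nr, dr in zip(coeff, num, den)]
    return coeff


teacher_basis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3 x 2 basis
evaluation = [[2.0], [3.0], [5.0]]                     # one observed column
coeff = coefficients_for_fixed_basis(evaluation, teacher_basis)
```

Because the basis is not updated, the resulting coefficients live in the same feature space as the teacher coefficient matrix, which is what allows the classification model parameters from the learning phase to be reused.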

~ Step S213 ~
Next, the classification unit 228 classifies the evaluation coefficient matrix generated in step S211 based on the classification model parameters generated in step S125 of the learning phase described above.

~ Step S215 ~
Finally, the output unit 230 outputs the classification result of step S213.

<<Summary>>
As described above, in the embodiment of the present invention, to improve classification accuracy, NMF is applied only partially: the data is not only limited to the time domain and frequency domain that are effective for the target class classification, but the time domain and frequency domain that are effective for the non-target class classification are also excluded. That is, according to the present embodiment, by applying NMF only to regions where the degree of separation of the target class is high and the degree of separation of the non-target class is not high, features that are effective for classifying the target class can be extracted with higher accuracy (in other words, bases that are effective for classifying the target class are more readily selected), and data can be classified into the target class with higher accuracy, without interference from the features of the non-target class. As a result, classification accuracy can be further improved. In addition, because NMF only needs to be applied to a limited region, the processing amount can be further suppressed.

In the embodiment described above, an example was described in which a measurement object (for example, concrete) is classified into two classes (the target first class) indicating whether it is normal or abnormal, based on an acoustic signal (time-series data) obtained by picking up the sound produced when the measurement object is struck with a striking tool such as a hammer. However, embodiments of the present invention are not limited to this example. For example, the data may be classified into three or more classes, or the state or capability of a measurement object (for example, a machine tool) may be classified into a plurality of classes based on time-series data of its vibration. Furthermore, the time-series data to be processed is not particularly limited. It may be image data, ambient temperature, flow rate, or other data that changes over time; electrical data that changes over time (for example, voltage or current values); motion data of a measurement object (for example, a building, a machine, a person, a ball, or a moving body such as an automobile); or various kinds of biological information obtained from the body of a human or the like (for example, pulse, heartbeat, body temperature, or brain waves). That is, embodiments of the present invention can be applied to the diagnosis of various machines, humans, and the like.

<<Hardware configuration>>
The embodiments of the present invention have been described above. The information processing described above, such as preprocessing, teacher data processing, parameter generation, feature extraction, and classification, is realized through cooperation between software and the hardware of the information processing apparatus 200. The following describes a hardware configuration example of the information processing apparatus 1000 as a hardware configuration example of the information processing apparatus 200 according to the embodiment of the present invention.

FIG. 12 is an explanatory diagram showing a hardware configuration example of the information processing apparatus 1000 according to the embodiment of the present invention. As shown in FIG. 12, the information processing apparatus 1000 includes a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, a RAM (Random Access Memory) 1003, an input device 1004, an output device 1005, a storage device 1006, and a communication device 1007.

The CPU 1001 functions as an arithmetic processing unit and a control unit, and controls overall operation within the information processing apparatus 1000 in accordance with various programs. The CPU 1001 may be a microprocessor. The ROM 1002 stores programs, calculation parameters, and the like used by the CPU 1001. The RAM 1003 temporarily stores programs used in execution by the CPU 1001 and parameters that change as appropriate during that execution. These components are connected to one another by a host bus composed of a CPU bus or the like. The functions of, for example, the processing unit 210, the teacher data processing unit 220, the parameter generation unit 222, the feature extraction unit 226, and the classification unit 228 are realized mainly through cooperation between the CPU 1001, the ROM 1002, the RAM 1003, and software.

The input device 1004 includes input means with which the user inputs information, such as a mouse, keyboard, touch panel, buttons, microphone, switches, and levers, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 1001. By operating the input device 1004, the user of the information processing apparatus 1000 can input various data to the information processing apparatus 1000 and instruct it to perform processing operations. The input device 1004 corresponds to the operation unit 204.

The output device 1005 includes, for example, display devices such as a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, and a lamp. The output device 1005 further includes audio output devices such as a speaker and headphones. For example, the display device displays captured images, generated images, and the like, while the audio output device converts audio data and the like into sound and outputs it. The output device 1005 corresponds to the output unit 230.

The storage device 1006 is a device for storing data. The storage device 1006 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 1006 stores programs executed by the CPU 1001 and various data. The storage device 1006 corresponds to the storage unit 224.

The communication device 1007 is, for example, a communication interface composed of a communication device or the like for connecting to a communication network. The communication device 1007 may include a wireless LAN (Local Area Network) compatible communication device, an LTE (Long Term Evolution) compatible communication device, a wired communication device that performs communication by wire, or a Bluetooth (registered trademark) communication device.

<<Supplement>>
The preferred embodiments of the present invention have been described above in detail with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention belongs could conceive of various changes or modifications within the scope of the technical idea described in the claims, and these are naturally understood to belong to the technical scope of the present invention as well.

For example, the above embodiment describes a case in which the same device (the information processing device 200) acquires both the teacher data and the evaluation data, but the present invention is not limited to this example. The teacher data and the evaluation data may be acquired by separate devices, and the teacher data and the correct answer labels may be stored in the storage unit 224 in advance. In other words, the above embodiment describes a case in which the information processing device 200 functions both as a parameter generation device that generates classification model parameters and as a classification device that classifies time series data into a plurality of classes, but the present invention is not limited to this example. For example, in the present embodiment, a parameter generation device that performs the classification model parameter generation process and a classification device that performs the classification process using the classification model parameters generated by the parameter generation device may be provided as different devices.
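
As an informal illustration of the split into two devices described above, the following Python sketch has one function stand in for the parameter generation device and one class stand in for the classification device, with the parameters handed over as JSON text. The JSON hand-off, the nearest-centroid model used as the classification model parameters, and the names `generate_parameters`, `ClassificationDevice`, "normal" and "abnormal" are all assumptions made for illustration; the embodiment does not specify them.

```python
import json


def generate_parameters(coefficients, labels):
    """Hypothetical parameter generation device.

    coefficients: per-sample non-negative feature vectors (for example,
    columns of a teacher coefficient matrix).
    labels: the class associated with each sample.
    Returns JSON text holding one mean vector per class (an assumed
    stand-in for the classification model parameters).
    """
    sums, counts = {}, {}
    for vec, label in zip(coefficients, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [value / counts[label] for value in acc]
        for label, acc in sums.items()
    }
    return json.dumps({"centroids": centroids})


class ClassificationDevice:
    """Hypothetical classification device that only consumes parameters."""

    def __init__(self, payload):
        # The device never sees the teacher data, only the generated parameters.
        self.centroids = json.loads(payload)["centroids"]

    def classify(self, vec):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        def distance(label):
            return sum((a - b) ** 2 for a, b in zip(vec, self.centroids[label]))

        return min(self.centroids, key=distance)


# Example hand-off: parameters are generated on one side and loaded on the other.
payload = generate_parameters(
    [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]],
    ["normal", "normal", "abnormal", "abnormal"],
)
device = ClassificationDevice(payload)
print(device.classify([0.9, 0.1]))
```

Because only the serialized parameters cross the boundary, the generation side could in principle run on a server while the classification side runs on a smaller device; that deployment choice is likewise an assumption of this sketch.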

Further, the above-described embodiment may include, for example, a program for causing a computer to function as the information processing device 200 according to the present embodiment (that is, for carrying out the classification method according to the present embodiment), and a non-transitory tangible medium on which the program is recorded. The program may also be distributed via a communication line (including wireless communication) such as the Internet.

Further, the processing steps in the operation (classification method) of the present embodiment do not necessarily have to be executed in time series in the order described in the flowcharts. For example, in the present embodiment, the processing steps may be executed in an order different from the order shown in the flowcharts, or may be executed in parallel.

10 information processing system
100 sensor
200, 1000 information processing device
200a cloud
200b IoT gateway
200c edge terminal
202 data acquisition unit
204 operation unit
210 processing unit
212 conversion unit
214 associating unit
216 data division unit
218 separation degree evaluation unit
220 teacher data processing unit
222 parameter generation unit
224 storage unit
226 feature extraction unit
228 classification unit
230 output unit
1001 CPU
1002 ROM
1003 RAM
1004 input device
1005 output device
1006 storage device
1007 communication device

Claims

A classification device for classifying time series data into a target first class, comprising:
a processing unit that acquires non-negative evaluation data in a predetermined data area from the time series data expressed in at least the frequency domain or the time domain;
a storage unit that stores classification model parameters generated based on a teacher coefficient matrix, which is generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data area, and on the first class associated with the first teacher data;
a feature extraction unit that generates an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data;
a classification unit that classifies the generated evaluation coefficient matrix into the first class based on the classification model parameters; and
an area specifying unit that specifies the predetermined data area,
wherein the area specifying unit:
divides, in the frequency domain or the time domain, each piece of non-negative second teacher data associated with each of the first classes;
calculates, for each of the divided areas, a first evaluation index indicating the degree of separation between the first classes, based on the observation matrix of each piece of the second teacher data for that divided area;
divides, in the frequency domain or the time domain, each piece of non-negative third teacher data associated with each of target second classes;
calculates, for each of the divided areas, a second evaluation index indicating the degree of separation between the second classes, based on the observation matrix of each piece of the third teacher data for that divided area;
specifies a first area by extracting the divided areas in which the first evaluation index is higher than a preset first threshold value;
specifies a second area by extracting the divided areas in which the second evaluation index is higher than a preset second threshold value; and
specifies the predetermined data area based on the specified first and second areas.
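
As an informal illustration of the area specifying steps recited above, the following Python sketch divides each observation matrix into frequency bands, scores every band with a separation index, and keeps the bands whose score exceeds a threshold. The Fisher-style ratio used as the evaluation index, the use of per-frame band means as samples, the union used to combine the first and second areas, and all function names are assumptions made for illustration; the claim does not specify them.

```python
from statistics import mean, pvariance


def band_samples(matrix, band_size):
    """Divide an observation matrix (rows = frequency bins, columns = time
    frames) into bands of `band_size` rows; return per-frame means per band."""
    bands = []
    for start in range(0, len(matrix), band_size):
        rows = matrix[start:start + band_size]
        frames = len(rows[0])
        bands.append([mean(row[t] for row in rows) for t in range(frames)])
    return bands


def separation_index(samples_by_class):
    """Assumed index: variance of the class means divided by the mean
    within-class variance (larger means the classes are easier to separate)."""
    between = pvariance([mean(samples) for samples in samples_by_class])
    within = mean(pvariance(samples) for samples in samples_by_class)
    return between / (within + 1e-12)


def select_areas(matrices_by_class, band_size, threshold):
    """Return the indices of the divided areas whose index exceeds threshold."""
    per_class = [band_samples(m, band_size) for m in matrices_by_class]
    selected = set()
    for band in range(len(per_class[0])):
        score = separation_index([bands[band] for bands in per_class])
        if score > threshold:
            selected.add(band)
    return selected


def specify_data_area(second_teacher, third_teacher, band_size, th1, th2):
    """Combine the first and second areas; the union is an assumption."""
    first_area = select_areas(second_teacher, band_size, th1)
    second_area = select_areas(third_teacher, band_size, th2)
    return sorted(first_area | second_area)


# Toy data: 4 frequency bins x 4 frames per class, two bins per band.
LOW = [1.0, 2.0, 1.0, 2.0]
MID = [5.0, 6.0, 5.0, 6.0]
HIGH = [9.0, 10.0, 9.0, 10.0]
second_teacher = [[LOW, LOW, MID, MID], [HIGH, HIGH, MID, MID]]  # differ in band 0
third_teacher = [[MID, MID, LOW, LOW], [MID, MID, HIGH, HIGH]]   # differ in band 1
print(specify_data_area(second_teacher, third_teacher, 2, 1.0, 1.0))
```

Restricting later processing to the selected bands shrinks the matrices passed to non-negative matrix factorization, which is one way to read the stated aim of limiting the processing amount while keeping the informative areas.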

A classification system comprising:
a classification device that classifies time series data into a target first class; and
a sensor that acquires the time series data,
wherein the classification device includes:
a processing unit that acquires non-negative evaluation data in a predetermined data area from the time series data expressed in at least the frequency domain or the time domain;
a storage unit that stores classification model parameters generated based on a teacher coefficient matrix, which is generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data area, and on the first class associated with the first teacher data;
a feature extraction unit that generates an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data;
a classification unit that classifies the generated evaluation coefficient matrix into the first class based on the classification model parameters; and
an area specifying unit that specifies the predetermined data area,
and wherein the area specifying unit:
divides, in the frequency domain or the time domain, each piece of non-negative second teacher data associated with each of the first classes;
calculates, for each of the divided areas, a first evaluation index indicating the degree of separation between the first classes, based on the observation matrix of each piece of the second teacher data for that divided area;
divides, in the frequency domain or the time domain, each piece of non-negative third teacher data associated with each of target second classes;
calculates, for each of the divided areas, a second evaluation index indicating the degree of separation between the second classes, based on the observation matrix of each piece of the third teacher data for that divided area;
specifies a first area by extracting the divided areas in which the first evaluation index is higher than a preset first threshold value;
specifies a second area by extracting the divided areas in which the second evaluation index is higher than a preset second threshold value; and
specifies the predetermined data area based on the specified first and second areas.

A classification method for classifying time series data into a target first class, the method comprising:
acquiring non-negative evaluation data in a predetermined data area from the time series data expressed in at least the frequency domain or the time domain;
storing classification model parameters generated based on a teacher coefficient matrix, which is generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data area, and on the first class associated with the first teacher data;
generating an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data;
classifying the generated evaluation coefficient matrix into the first class based on the classification model parameters; and
specifying the predetermined data area,
wherein specifying the predetermined data area includes:
dividing, in the frequency domain or the time domain, each piece of non-negative second teacher data associated with each of the first classes;
calculating, for each of the divided areas, a first evaluation index indicating the degree of separation between the first classes, based on the observation matrix of each piece of the second teacher data for that divided area;
dividing, in the frequency domain or the time domain, each piece of non-negative third teacher data associated with each of target second classes;
calculating, for each of the divided areas, a second evaluation index indicating the degree of separation between the second classes, based on the observation matrix of each piece of the third teacher data for that divided area;
specifying a first area by extracting the divided areas in which the first evaluation index is higher than a preset first threshold value;
specifying a second area by extracting the divided areas in which the second evaluation index is higher than a preset second threshold value; and
specifying the predetermined data area based on the specified first and second areas.
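
As an informal illustration of the feature extraction and classification steps recited above, the following Python sketch factorizes a teacher observation matrix X into a basis matrix W and a coefficient matrix H (X ≈ WH), then keeps W fixed and estimates only the coefficient matrix for evaluation data. The Lee-Seung multiplicative update rule, the nearest-centroid model standing in for the classification model parameters, the majority vote over frames, and all function names are assumptions made for illustration; the claim does not specify them.

```python
import random


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def update_h(x, w, h, eps=1e-9):
    """Multiplicative update of H for the Euclidean cost; keeps H non-negative."""
    wt = transpose(w)
    num, den = matmul(wt, x), matmul(matmul(wt, w), h)
    return [[h[k][t] * num[k][t] / (den[k][t] + eps) for t in range(len(h[0]))]
            for k in range(len(h))]


def update_w(x, w, h, eps=1e-9):
    """Multiplicative update of W for the Euclidean cost; keeps W non-negative."""
    ht = transpose(h)
    num, den = matmul(x, ht), matmul(w, matmul(h, ht))
    return [[w[f][k] * num[f][k] / (den[f][k] + eps) for k in range(len(w[0]))]
            for f in range(len(w))]


def nmf(x, rank, iters=300, seed=0):
    """Factorize teacher data: returns (teacher basis, teacher coefficients)."""
    rng = random.Random(seed)
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in x]
    h = [[rng.random() + 0.1 for _ in x[0]] for _ in range(rank)]
    for _ in range(iters):
        h = update_h(x, w, h)
        w = update_w(x, w, h)
    return w, h


def encode(x, w, iters=300, seed=1):
    """Evaluation coefficient matrix: W stays fixed, only H is updated."""
    rng = random.Random(seed)
    h = [[rng.random() + 0.1 for _ in x[0]] for _ in range(len(w[0]))]
    for _ in range(iters):
        h = update_h(x, w, h)
    return h


def fit_centroids(h, labels):
    """Assumed classification model parameters: mean coefficient vector per class."""
    cols = transpose(h)
    model = {}
    for label in set(labels):
        members = [c for c, lab in zip(cols, labels) if lab == label]
        model[label] = [sum(v) / len(members) for v in zip(*members)]
    return model


def classify(h_eval, model):
    """Assign each frame to the nearest centroid, then take a majority vote."""
    votes = []
    for col in transpose(h_eval):
        votes.append(min(model, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(col, model[lab]))))
    return max(set(votes), key=votes.count)


# Toy teacher data: 4 frequency bins x 6 frames, three frames per class.
x_teacher = [[4.0, 4.0, 4.0, 0.0, 0.0, 0.0],
             [4.0, 4.0, 4.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 4.0, 4.0, 4.0],
             [0.0, 0.0, 0.0, 4.0, 4.0, 4.0]]
frame_labels = ["a", "a", "a", "b", "b", "b"]
w_teacher, h_teacher = nmf(x_teacher, rank=2)
model = fit_centroids(h_teacher, frame_labels)
x_eval = [[3.0, 4.0, 5.0], [3.0, 4.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(classify(encode(x_eval, w_teacher), model))
```

Fixing the basis at evaluation time means only the coefficient update runs per input, so evaluation data is expressed in the same components as the teacher data and the resulting coefficients can be compared directly with the stored parameters.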

A program for classifying time series data into a target first class, the program causing a computer to realize:
a function of acquiring non-negative evaluation data in a predetermined data area from the time series data expressed in at least the frequency domain or the time domain;
a function of storing classification model parameters generated based on a teacher coefficient matrix, which is generated by performing non-negative matrix factorization on non-negative first teacher data corresponding to the predetermined data area, and on the first class associated with the first teacher data;
a function of generating an evaluation coefficient matrix from the evaluation data based on a teacher basis matrix generated by performing non-negative matrix factorization on the first teacher data;
a function of classifying the generated evaluation coefficient matrix into the first class based on the classification model parameters; and
a function of specifying the predetermined data area,
wherein the function of specifying the predetermined data area includes:
dividing, in the frequency domain or the time domain, each piece of non-negative second teacher data associated with each of the first classes;
calculating, for each of the divided areas, a first evaluation index indicating the degree of separation between the first classes, based on the observation matrix of each piece of the second teacher data for that divided area;
dividing, in the frequency domain or the time domain, each piece of non-negative third teacher data associated with each of target second classes;
calculating, for each of the divided areas, a second evaluation index indicating the degree of separation between the second classes, based on the observation matrix of each piece of the third teacher data for that divided area;
specifying a first area by extracting the divided areas in which the first evaluation index is higher than a preset first threshold value;
specifying a second area by extracting the divided areas in which the second evaluation index is higher than a preset second threshold value; and
specifying the predetermined data area based on the specified first and second areas.