JP2008257084A - HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal

Publication number
JP2008257084A
Authority
JP
Japan
Prior art keywords
hlac
dimensional
signal
binary
feature quantity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2007101279A
Other languages
Japanese (ja)
Other versions
JP4840819B2 (en)
Inventor
Akira Saso
晃 佐宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Institute of Advanced Industrial Science and Technology AIST
Original Assignee
National Institute of Advanced Industrial Science and Technology AIST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Institute of Advanced Industrial Science and Technology AIST
Priority to JP2007101279A
Publication of JP2008257084A
Application granted
Publication of JP4840819B2
Expired - Fee Related

Abstract

PROBLEM TO BE SOLVED: To provide an HLAC (higher-order local autocorrelation) feature quantity extraction method and feature quantity extraction device that binarize a one-dimensional signal and are based on binary HLAC, which allows high-speed processing and is well suited to hardware implementation.
SOLUTION: The one-dimensional signal is binarized by pulse width modulation (PWM), and the HLAC feature quantity is calculated by applying one-dimensional binary HLAC to the resulting one-dimensional binary signal. Alternatively, each amplitude value of the one-dimensional signal is binarized by linear quantization, μ-Law quantization, or Gray code quantization, and two-dimensional binary HLAC is applied to the binary image generated as the time series of those values.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an HLAC feature quantity extraction method and feature quantity extraction device that are effective for realizing processing such as anomaly detection and the recognition, retrieval, or counting of specific signals from various one-dimensional signals: acoustic signals such as speech, music, and environmental sounds, electrocardiogram and seismic waveforms, and signals containing still higher frequencies.

Techniques that extract features from moving images using the higher-order local autocorrelation function (HLAC) to detect abnormal motion and track moving objects in real time have already been proposed in Patent Documents 1 to 3 below.
Patent Document 1: JP 2005-092346 A
Patent Document 2: JP 2006-079272 A
Patent Document 3: JP 2006-163452 A

In Patent Documents 1 to 3, features are extracted from an image (two-dimensional) or a moving image (three-dimensional) using higher-order local autocorrelation (HLAC).
By contrast, no technique has yet been established for extracting HLAC features from a one-dimensional signal such as an acoustic signal.
Conventional HLAC feature extraction from (moving) images uses binarized images, so the higher-order correlation values never overflow, no matter how many pixel values are multiplied together, and the calculation is stable. A one-dimensional signal, however, has a wide dynamic range: when each sample is quantized with 16 bits, for example, values range from roughly −30,000 to +30,000. Computing higher-order correlations of such values without overflow increases the load on the hardware or software of the arithmetic unit. As a result, real-time processing becomes difficult when extracting HLAC features from ultrasonic-band signals or from one-dimensional signals containing even higher frequency components. In addition, the values of high-order and low-order correlations differ greatly in magnitude, which makes stable numerical computation difficult when, for example, principal component analysis is performed on the features.
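To make the overflow concern concrete, the following minimal Python sketch estimates the accumulator width needed for one feature. It assumes signed two's-complement samples, a worst case in which every sample has maximum magnitude, and a frame length of 1024 samples; the function name `accumulator_bits` and these values are illustrative and do not come from the source.

```python
def accumulator_bits(order, frame_len, sample_bits):
    """Bits needed to hold the worst-case sum of `frame_len` products
    of (order + 1) samples, each `sample_bits` wide.

    For sample_bits == 1 the samples are treated as 0/1 (binary HLAC);
    otherwise as signed two's-complement values (an assumption).
    """
    if sample_bits == 1:
        max_abs = frame_len                      # each AND output is 0 or 1
        return max_abs.bit_length()              # unsigned counter
    max_sample = 2 ** (sample_bits - 1)          # e.g. 32768 for 16 bits
    max_abs = frame_len * max_sample ** (order + 1)
    return max_abs.bit_length() + 1              # +1 for the sign bit


for order in range(0, 4):
    print(order,
          accumulator_bits(order, 1024, 16),     # multi-bit samples
          accumulator_bits(order, 1024, 1))      # binarized samples
```

Under these assumptions the multi-bit accumulator grows by 15 bits with each additional order, while the binary accumulator stays at 11 bits regardless of order.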

In view of the above problems, an object of the present invention is to provide an HLAC feature quantity extraction method and feature quantity extraction device that binarize a one-dimensional signal and are based on binary HLAC, which is capable of high-speed processing and suited to hardware implementation.

Hereinafter, "signal" means a one-dimensional signal unless otherwise noted.
To extract features from an analog signal with one-dimensional binary HLAC, the amplitude values of the analog signal must be binarized. The present invention uses two binarization methods: (1) a method using pulse width modulation, and (2) a method that converts amplitude values to binary notation using Gray code or the like and arranges them in time series to generate a two-dimensional binary image.
(1) Method using pulse width modulation (PWM):
PWM is used as one means of binarizing the analog input signal, and one-dimensional binary HLAC features are extracted from the binarized signal.
PWM is a modulation scheme that increases or decreases the pulse width in proportion to the amplitude of the analog input signal. The resulting PWM signal is a binary signal taking a positive and a negative value. FIG. 1 shows a circuit diagram of hardware that realizes this modulation function.
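The comparator-based PWM described here can be sketched in discrete time as follows. This is a minimal illustration, not the patented circuit: the function name `pwm_binarize`, the use of 0/1 output levels instead of positive/negative levels, the input range of [-1, 1], and the carrier period expressed in samples are all assumptions made for the example.

```python
def pwm_binarize(samples, carrier_period):
    """Compare each sample (expected in [-1, 1]) with a triangular carrier.

    `carrier_period` is the carrier period in samples. Returns a list of
    0/1 values: 1 where the input exceeds the carrier, else 0.
    """
    bits = []
    for n, x in enumerate(samples):
        phase = (n % carrier_period) / carrier_period   # 0 <= phase < 1
        carrier = 4.0 * abs(phase - 0.5) - 1.0          # +1 -> -1 -> +1
        bits.append(1 if x > carrier else 0)
    return bits


# A constant input of 0.5 yields a duty cycle of about 75 percent.
bits = pwm_binarize([0.5] * 100, carrier_period=100)
print(sum(bits) / len(bits))
```

A larger input amplitude keeps the comparator output high for a larger fraction of each carrier period, which is the pulse-width behaviour the text describes.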

FIG. 1 is a circuit diagram of the one-dimensional binary HLAC feature quantity calculation circuit of the present invention. The one-dimensional binary HLAC feature quantity calculation circuit 1 of FIG. 1 comprises a comparator 2, sampling means 3, a register 4, a counter 5, a connection matrix circuit 9 formed by the output lines of the shift register and the input lines of the AND circuits, an AND circuit 6, a cumulative addition circuit 7, and a latch circuit 8.
The comparator 2 compares the observed input signal with a reference triangular wave signal. The PWM output of the comparator 2 is sampled by the sampling means 3, using an arbitrary sampling frequency as the reference clock, and stored sequentially in the register 4. The contents of the register 4 pass through the connection matrix circuit 9 to the AND circuit 6; the output of the AND circuit 6 is summed by the cumulative addition circuit 7, and the sum is taken from the latch circuit 8 as the "PWM+BinHLAC signal". The connection matrix circuit 9 defines the masks used to calculate HLAC.
In this one-dimensional binary HLAC feature quantity calculation circuit 1, the PWM signal is obtained simply as the output of the comparator 2, which compares the reference triangular wave with the analog input signal after it has passed through a band-limiting filter, so hardware implementation is easy. Note that the frequency of the triangular wave must be set sufficiently higher than the maximum frequency of the analog input signal.

FIG. 2 shows the process of binarizing an analog input signal by PWM.
FIG. 2(a) is a waveform of the analog input signal. FIG. 2(b) is a waveform of the pulse signal obtained by PWM processing of the analog input signal in FIG. 2(a). FIG. 2(c) is a waveform of the difference of the PWM signal in FIG. 2(b) (described in detail below), computed to emphasize the changes in the PWM signal.
The PWM signal output by the comparator 2 in FIG. 1 is a continuous-time binary signal. The sampling means 3 samples this binary signal at a given sampling frequency and outputs each sample as one bit. Each 1-bit sample is stored in the shift register 4, which shifts in synchronization with the sampling frequency. The shift register 4 in FIG. 1 shifts bits from top to bottom, so the topmost bit represents the sample at the current time and the bottommost bit represents the oldest sample.

The difference signal of the PWM signal shown in FIG. 2(c) is obtained by taking the exclusive OR (XOR) of adjacent bits of the shift register 4.
The following describes how the one-dimensional binary HLAC feature quantity is calculated from the PWM signal or its difference signal.
The one-dimensional binary HLAC of a discrete-time signal is given by the following equation.

Figure 2008257084 (equation image)

Here f(m) is the discrete-time signal, N is the order, M is the number of samples in a frame, and (a0, a1, a2, …, aN) is the mask pattern, where a0 = 0. The set of mask patterns is uniquely determined by the register width W of the shift register (the number of bits stored) and the correlation order N.
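The equation itself survives only as an image reference. As a hedged reconstruction from the symbol definitions given here, the conventional form of an N-th order local autocorrelation over one frame would be as follows. The summation limits and the sign of the displacements are assumptions (the circuit of FIG. 1 correlates the current sample with past samples, which would correspond to negative displacements), and the symbol x_N is introduced only for this illustration.

```latex
x_N(a_1, \dots, a_N) = \sum_{m=1}^{M} f(m + a_0)\, f(m + a_1) \cdots f(m + a_N),
\qquad a_0 = 0
```

If the binary signal takes the values 0 and 1, each product reduces to a logical AND, so every feature is an integer between 0 and M.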

FIG. 3 shows the mask patterns of the present invention for register width W = 4.
The positions filled in black indicate the bits used to calculate the correlation. In each mask pattern the leftmost position is a0, followed to the right by a1, a2, and so on. For example, the first-order masks have a0 fixed at 0 while a1 takes the values 1, 2, and 3, giving three mask patterns.
In the circuit of FIG. 1, the mask patterns are implemented by the wiring of the connection matrix circuit 9, which is formed by the output lines of the shift register and the input lines of the AND circuits. The dedicated hardware uses a device such as an FPGA (field programmable gate array), and this wiring is made dynamically changeable so that mask patterns of any order can be implemented. The wiring shown in FIG. 1 corresponds to the mask patterns of FIG. 3, that is, register width 4 with mask orders up to 2.
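The statement that the mask patterns follow from the register width W and the order N can be illustrated with a short enumeration. This sketch assumes that the displacements a1 … aN are distinct, nonzero register positions (for a 0/1 signal, repeating a position adds nothing because f·f = f) and that the order-0 mask is included. The function name `binary_masks_1d` is hypothetical. The assumption reproduces the three first-order masks stated for W = 4 and the 26 dimensions the document later reports for 6 mask points with maximum order 3.

```python
from itertools import combinations


def binary_masks_1d(width, max_order):
    """Enumerate 1D binary HLAC masks as tuples (a0, a1, ..., aN).

    a0 is always 0; the remaining displacements are distinct positions
    1 .. width-1 in increasing order (an assumption, see lead-in).
    """
    masks = []
    for order in range(max_order + 1):
        for rest in combinations(range(1, width), order):
            masks.append((0,) + rest)
    return masks


masks = binary_masks_1d(width=4, max_order=2)
print(len(masks))   # 7 masks: 1 of order 0, 3 of order 1, 3 of order 2
print(masks)
```

In the hardware, each tuple would correspond to one AND gate whose inputs are wired to the listed register positions.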

Each AND circuit outputs the correlation value for its mask pattern. This output is fed to the cumulative addition circuit 7 and accumulated in synchronization with the clock signal.
The clock signal is also fed to the counter circuit 5, which counts the number of additions performed by the cumulative addition circuit 7. When the number of additions equals the number of samples in a frame, the counter circuit 5 outputs a control signal, and the output value of each cumulative addition circuit 7 at that moment is stored in the latch circuit 8. The cumulative addition circuit 7 is then reset to zero in preparation for the feature calculation of the next frame. The control signal from the counter circuit 5 also serves as an interrupt signal that notifies external circuits that the HLAC feature quantity of one frame has been determined.
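The shift register, AND gates, cumulative adders, counter, and latch described above can be emulated in software roughly as follows. This is a sketch under stated assumptions rather than the circuit itself: bits are 0/1, the register starts cleared, position 0 holds the newest sample, and the names `hlac_1d_binary` and `xor_difference` are hypothetical.

```python
def xor_difference(bits):
    """Difference signal: XOR of each pair of adjacent samples."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def hlac_1d_binary(bits, masks, frame_len):
    """Accumulate one feature vector per frame of `frame_len` samples."""
    width = max(max(mask) for mask in masks) + 1
    register = [0] * width            # register[0] holds the newest bit
    sums = [0] * len(masks)           # cumulative adders
    frames = []                       # latched outputs
    count = 0                         # counter
    for bit in bits:
        register = [bit] + register[:-1]          # shift in the new sample
        for k, mask in enumerate(masks):
            sums[k] += int(all(register[a] for a in mask))   # AND gate
        count += 1
        if count == frame_len:        # frame complete: latch and reset
            frames.append(sums)
            sums = [0] * len(masks)
            count = 0
    return frames


signal = [1, 1, 0, 1, 1, 1, 0, 0]
print(hlac_1d_binary(signal, [(0,), (0, 1), (0, 2)], frame_len=8))
print(xor_difference(signal))
```

Because every per-sample contribution is a single bit, the accumulators only ever count, which is what makes the scheme attractive for hardware.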

(2) Method of converting amplitude values to binary notation using Gray code or the like and arranging them in time series to generate a two-dimensional binary image:
FIG. 4 shows the procedure for calculating feature quantities by this method.
The flowchart of FIG. 4 is described below. In the figure, S stands for step.

Start
(1) Acquire the one-dimensional analog input signal (S1):
The observed one-dimensional analog input signal is analog-to-digital (A/D) converted.
(2) Convert the sample values (μ-Law, Gray code, etc.) (S2):
The quantized values of the digital signal obtained in S1 are converted using typical linear quantization, μ-Law quantization, which is widely used for audio signal compression, Gray code quantization, or the like.
(3) Generate the binary image (S3):
Binary image data is generated from the time series of discrete-time quantized sample values obtained in S2. For example, if each sample value is quantized with 8 bits and the 8-bit patterns are arranged in time order, a binary image is generated as shown in FIG. 5.
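Steps S2 and S3 can be sketched as follows. The helper names, the input range of [-1, 1], the 8-bit code width, and the row/column layout of the image are assumptions for illustration. `mu_law_compress` uses the standard continuous μ-law companding curve with μ = 255, not any particular codec's bit format, and `to_gray` is the usual binary-reflected Gray code.

```python
import math


def linear_code(x, bits=8):
    """Map x in [-1, 1] to an unsigned integer code of `bits` bits."""
    levels = 2 ** bits
    return min(int((x + 1.0) / 2.0 * levels), levels - 1)


def mu_law_compress(x, mu=255.0):
    """Standard continuous mu-law companding of x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def to_gray(code):
    """Binary-reflected Gray code of a non-negative integer."""
    return code ^ (code >> 1)


def binary_image(codes, bits=8):
    """Rows are bit planes (MSB first), columns are time steps."""
    return [[(c >> (bits - 1 - row)) & 1 for c in codes]
            for row in range(bits)]


samples = [0.0, 0.01, -0.01, 0.5]
codes = [to_gray(linear_code(x)) for x in samples]
for row in binary_image(codes):
    print("".join(str(b) for b in row))
```

With plain binary codes, a step from 127 to 128 flips all eight bits, whereas the corresponding Gray codes differ in a single bit. This property may be relevant to the stability the document reports for the Gray code variant, although the source does not state the reason.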

FIG. 5 illustrates how features are calculated by applying two-dimensional (2D) binary HLAC to the binary image. Each sample value of the function X(t) of time t (the black dots on the analog signal) is expressed in binary, as indicated by the arrows, to generate a binary image pattern.
(4) Calculate the two-dimensional binary HLAC (S4):
Two-dimensional binary HLAC is applied to the binary image data generated in this way to calculate the HLAC feature quantity.
(5) Output the HLAC feature quantity (S5):
The HLAC feature quantity calculated in S4 is stored in storage means and read out for processing such as anomaly detection and the recognition, retrieval, or counting of specific signals.
End
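Step S4 can be sketched as below. The source does not give the 2D mask set, so this example assumes the conventional construction for a 3×3 window: every set of up to three pixels that contains the centre, keeping one representative per translation-equivalent shape. That construction yields 25 masks for orders 0 to 2, the dimensionality the document reports for its 2D binary HLAC features. The function names, the unit mask spacing, and the skipping of border pixels are assumptions.

```python
from itertools import combinations


def masks_2d(max_order=2):
    """3x3 masks as tuples of (dy, dx) offsets, one per translation class."""
    neighbours = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dy, dx) != (0, 0)]
    seen, masks = set(), []
    for order in range(max_order + 1):
        for combo in combinations(neighbours, order):
            points = ((0, 0),) + combo
            top = min(p[0] for p in points)
            left = min(p[1] for p in points)
            shape = frozenset((y - top, x - left) for y, x in points)
            if shape not in seen:          # skip translated duplicates
                seen.add(shape)
                masks.append(points)
    return masks


def hlac_2d_binary(image, masks):
    """Count, per mask, the positions where all masked pixels are 1."""
    height, width = len(image), len(image[0])
    features = [0] * len(masks)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            for k, mask in enumerate(masks):
                if all(image[y + dy][x + dx] for dy, dx in mask):
                    features[k] += 1
    return features


masks = masks_2d()
print(len(masks))
print(hlac_2d_binary([[1, 1, 1, 1]] * 3, masks))
```

Here the image rows would be the bit planes and the columns the time steps; the experiments in the document additionally vary the mask spacing along both axes, which this sketch fixes at one pixel.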

Specifically, the means for solving the problem are as follows.
(1) The HLAC feature quantity extraction method binarizes a one-dimensional signal by PWM and applies one-dimensional binary HLAC to the resulting one-dimensional binary signal to obtain the HLAC feature quantity.
(2) The HLAC feature quantity extraction method binarizes each amplitude value of the one-dimensional signal by any one of linear quantization, μ-Law quantization, and Gray code quantization, and applies two-dimensional binary HLAC to the binary image generated as the time series of those values.
(3) The HLAC feature quantity extraction circuit, which binarizes a one-dimensional signal by PWM and applies one-dimensional binary HLAC to the resulting one-dimensional binary signal to obtain the HLAC feature quantity, allows the mask pattern used for the HLAC calculation to be changed to an arbitrary pattern shape when the one-dimensional binary HLAC is applied.

HLAC features, which until now have been used for feature extraction from two-dimensional images and three-dimensional moving images, can be applied to a variety of one-dimensional signals such as speech and acoustic signals, electrocardiogram waveforms, and seismic waveforms.
Binarizing the one-dimensional analog input signal enables high-speed feature extraction. In particular, for the feature extraction method that combines PWM with one-dimensional binary HLAC, building the hardware with the circuit of the present invention yields a highly versatile device that is capable of high-speed processing and whose mask patterns can be changed dynamically.
As the other binarization method, converting the analog input signal to a digital signal and applying 2D binary HLAC to the binary image generated using Gray code dramatically improves the stability of feature extraction.

Embodiments of the present invention are described in detail below with reference to the drawings.

Each method of binarizing the one-dimensional analog input signal is described separately below.
(1) Method using pulse width modulation (PWM):
Words uttered by the same male speaker are randomly concatenated to generate 10 kinds of speech data. HLAC features extracted from the PWM signal of the speech signal and from its difference signal are each used to count which words appear in the generated speech data and how many times. The number of words used in the experiment is increased from 60 to 100 in steps of 10, and the counting accuracy obtained with the proposed features is examined.
The speech signal used in the experiment, sampled at 16 kHz, was upsampled by a factor of 10 and compared with a triangular wave with a fundamental frequency of 16 kHz to generate the PWM signal that binarizes the analog signal.
For the one-dimensional binary HLAC calculation, the number of mask points was 8, the mask point interval was 100 samples, and the maximum order was 5.

Table 1 shows, for the 10 kinds of speech data generated using 60 to 100 words, the number of correct results, that is, the number of speech data items for which the counts of all words were correct.

Table 1. Number of correct results for each method

Figure 2008257084 (table image)

(2) Method of converting amplitude values to binary notation using Gray code or the like and arranging them in time series to generate a two-dimensional binary image:
Words uttered by the same male speaker are randomly concatenated to generate 100 speech data items. Which words appear in the generated speech data, and how many times, is counted in a word counting experiment using the following four approaches:
(a) applying 1D grayscale HLAC to the amplitude-normalized one-dimensional signal;
(b) applying 1D binary HLAC to the PWM signal;
(c) converting the time series of sample values of the one-dimensional signal into a binary image pattern by linear quantization and applying 2D binary HLAC;
(d) converting the time series of sample values of the one-dimensional signal into a binary image pattern by Gray code and applying 2D binary HLAC.

The number of words used in each experiment is set equal to the dimensionality of the corresponding HLAC feature. The dimensionality of each HLAC feature is as follows.
1. 1D grayscale HLAC:
number of mask points = 6, maximum order = 2 (28-dimensional feature)
2. PWM + 1D binary HLAC:
number of mask points = 6, maximum order = 3 (26-dimensional feature)
3. Linear quantization + 2D binary HLAC:
mask = 3×3, maximum order = 2 (25-dimensional feature)
4. Gray code + 2D binary HLAC:
mask = 3×3, maximum order = 2 (25-dimensional feature)
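The 28- and 26-dimensional figures for the two 1D features can be reproduced with a simple counting argument. The counting rule below is an assumption that happens to match the listed values, not something the source spells out: for grayscale signals the displacements may repeat (including 0), while for binary signals they are distinct and nonzero. The function names are hypothetical.

```python
from math import comb


def dims_1d_grayscale(points, max_order):
    """Displacements may repeat: multisets of size N from `points` positions."""
    return sum(comb(points + n - 1, n) for n in range(max_order + 1))


def dims_1d_binary(points, max_order):
    """Displacements are distinct and nonzero: subsets of size N."""
    return sum(comb(points - 1, n) for n in range(max_order + 1))


print(dims_1d_grayscale(6, 2))   # 28
print(dims_1d_binary(6, 3))      # 26
```

The binary feature reaches a comparable dimensionality only by going one order higher, because repeated displacements carry no extra information for a 0/1 signal.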

In principle, it is impossible to distinguish more words than the dimensionality of the feature. If the feature represents different word utterances appropriately, as distinct feature vectors, it should be possible to distinguish as many word utterances as there are dimensions. If the feature is not appropriate, the feature vectors obtained from different word utterances become linearly dependent, and that many word utterances cannot be distinguished.
Changing the mask point interval of the mask patterns changes the performance of the feature considerably. The optimal mask point interval depends fundamentally on the signal being analyzed. For methods 1 and 2, experiments are run while varying the mask point interval from 0.0625 ms to 25 ms. For methods 3 and 4, experiments are run while varying the mask point interval along the bit pattern axis from 1 to 7 bits and the mask point interval along the time axis from 0.0625 ms to 25 ms.

The performance of each feature is evaluated by whether as many word utterances as there are dimensions can be counted simultaneously, and by how stably they can be counted regardless of the mask point interval.
The numbers of word utterances are counted simultaneously for the 100 generated samples, and a run is scored as a counting success only when the counting results for all samples are correct. If the result for even one of the 100 samples is wrong, the run is a counting failure.
For each feature, counting experiments are run with different mask point intervals, and the counting success rate is computed as the ratio of the number of counting successes to the total number of experiments. A higher counting success rate indicates that words can be counted stably without depending on the mask point interval, which means higher performance as a feature.
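The evaluation metric described above can be written down directly. In this sketch the data layout (a dictionary from mask point interval to per-sample outcomes), the function name `counting_success_rate`, and the example values are all hypothetical; only the all-or-nothing scoring rule comes from the text.

```python
def counting_success_rate(results):
    """`results` maps each mask point interval to a list of booleans,
    one per sample, True when that sample's word counts were all correct.
    Returns the percentage of intervals for which every sample succeeded.
    """
    successes = sum(1 for per_sample in results.values() if all(per_sample))
    return 100.0 * successes / len(results)


# Hypothetical outcomes for three mask point intervals (in ms).
example = {
    0.0625: [True, True, True],
    1.0: [True, False, True],
    25.0: [True, True, True],
}
print(counting_success_rate(example))
```

A single miscounted sample turns the whole run for that interval into a failure, so the metric rewards features that work across many intervals rather than at one tuned setting.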

(Experimental results)
FIG. 6 shows the counting success rate of each method.
As shown in FIG. 6, the counting success rate [%] is 47.37 for "1D Gray (grayscale) HLAC", 48.68 for "PWM + 1D Bin (binary) HLAC", 0.19 for "Linear (linear quantization) + 2D Bin (binary) HLAC", and 57.89 for "GrayCode (Gray code) + 2D Bin (binary) HLAC".
For every method in FIG. 6 the counting success rate is nonzero, which shows that the number of word utterances can be counted correctly if the mask point interval is set appropriately.
The "PWM + 1D Bin (binary) HLAC" method can be said to perform better as a feature than the "1D Gray (grayscale) HLAC" method. For the "Linear (linear quantization) + 2D Bin (binary) HLAC" method, performance depends strongly on the mask point interval, so the characteristics of the signal to be analyzed must be examined carefully in advance. With the "GrayCode (Gray code) + 2D Bin (binary) HLAC" method, by contrast, the dependence on the mask point interval is smaller and the performance as a feature is greatly improved.

FIG. 1 is a circuit diagram of the one-dimensional binary HLAC feature quantity calculation circuit of the present invention.
FIG. 2 shows the process of binarizing an analog input signal by PWM.
FIG. 3 shows the mask patterns of the present invention for register width W = 4.
FIG. 4 shows the procedure for calculating feature quantities by the method of converting amplitude values to binary notation using Gray code or the like and arranging them in time series to generate a two-dimensional binary image.
FIG. 5 illustrates the method of calculating features by applying two-dimensional binary HLAC to the binary image of the present invention.
FIG. 6 shows the counting success rate of each method.

Explanation of symbols

1: one-dimensional binary HLAC feature quantity calculation circuit
2: comparator
3: sampling means
4: shift register
5: counter
6: AND circuit
7: cumulative addition circuit
8: latch circuit
9: matrix circuit

Claims (3)

1. An HLAC feature quantity extraction method characterized in that a one-dimensional signal is binarized by PWM, and an HLAC feature quantity is obtained by applying one-dimensional binary HLAC to the resulting one-dimensional binary signal.
2. The HLAC feature quantity extraction method according to claim 1, characterized in that each amplitude value of the one-dimensional signal is binarized by any one of linear quantization, μ-Law quantization, and Gray code quantization, and two-dimensional binary HLAC is applied to the binary image generated as the time series of those values.
3. An HLAC feature quantity extraction circuit that binarizes a one-dimensional signal by PWM and applies one-dimensional binary HLAC to the resulting one-dimensional binary signal to obtain an HLAC feature quantity, characterized in that the mask pattern used for the HLAC calculation can be changed to an arbitrary pattern shape when the one-dimensional binary HLAC is applied.
JP2007101279A 2007-04-09 2007-04-09 HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal Expired - Fee Related JP4840819B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007101279A JP4840819B2 (en) 2007-04-09 2007-04-09 HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007101279A JP4840819B2 (en) 2007-04-09 2007-04-09 HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal

Publications (2)

Publication Number Publication Date
JP2008257084A true JP2008257084A (en) 2008-10-23
JP4840819B2 JP4840819B2 (en) 2011-12-21

Family

ID=39980702

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007101279A Expired - Fee Related JP4840819B2 (en) 2007-04-09 2007-04-09 HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal

Country Status (1)

Country Link
JP (1) JP4840819B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011095531A (en) * 2009-10-30 2011-05-12 National Institute Of Advanced Industrial Science & Technology High order autocorrelation (hlac) feature quantity extracting method, failure detecting method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006329979A (en) * 2005-05-20 2006-12-07 Tektronix Inc Measuring equipment, autocorrelation trigger generation method and generator
JP2008185845A (en) * 2007-01-30 2008-08-14 National Institute Of Advanced Industrial & Technology Method and device of hlac feature extraction from conversion value of one-dimensional signal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
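JPN6010069196, 大 聖一郎、児島 宏明, "HLAC尺度に依存した非定常信号処理" [Non-stationary signal processing dependent on the HLAC measure], IEICE Technical Report, Nov. 2006, SP2006-70, pp. 7-10, JP, The Institute of Electronics, Information and Communication Engineers (IEICE) *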

Also Published As

Publication number Publication date
JP4840819B2 (en) 2011-12-21

Similar Documents

Publication Publication Date Title
US10679643B2 (en) Automatic audio captioning
JP6766374B2 (en) Classification device, classification method, program, and parameter generator
CN106653056B (en) Fundamental frequency extraction model and training method based on LSTM recurrent neural network
US8050910B2 (en) Pitch extraction with inhibition of harmonics and sub-harmonics of the fundamental frequency
EP0134238A1 (en) Signal processing and synthesizing method and apparatus
WO2012064408A2 (en) Method for tone/intonation recognition using auditory attention cues
TW201246183A (en) Extraction and matching of characteristic fingerprints from audio signals
JP4606800B2 (en) System for detecting non-stationary signal components and method used in a system for detecting non-stationary signal components
CN109448746B (en) Voice noise reduction method and device
CN115587321B (en) Electroencephalogram signal identification and classification method and system and electronic equipment
US4388491A (en) Speech pitch period extraction apparatus
JP4840819B2 (en) HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal
JP4705480B2 (en) How to find the fundamental frequency of a harmonic signal
Okawa et al. Audio classification of bit-representation waveform
TW569180B (en) Speech recognition method and device, speech synthesis method and device, recording medium
JP2005084244A (en) Method for restoration of target speech based upon speech section detection under stationary noise
CN116626631A (en) Automatic radar model identification method and system combining intra-pulse and inter-pulse characteristics
CN115311688A (en) Pedestrian detection method and device, electronic equipment and storage medium
CN108962389A (en) Method and system for indicating risk
Qaisar et al. An event-driven approach for time-domain recognition of spoken English letters
Abimbola et al. Time signature detection: a survey
CN111008356A (en) WTSVD algorithm-based background-subtracted gamma energy spectrum set analysis method
Azad et al. An efficient way to convert 1D signal to 2D digital image using energy values
Samiotis et al. Hybrid Annotation Systems for Music Transcription
Watts DeepPitch: wide-range monophonic pitch estimation using deep convolutional neural networks

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20090319

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20101124

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20101207

RD02 Notification of acceptance of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7422

Effective date: 20101211

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20101213

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20110130

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20110920

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20110928

R150 Certificate of patent or registration of utility model

Ref document number: 4840819

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20141014

Year of fee payment: 3

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

S533 Written request for registration of change of name

Free format text: JAPANESE INTERMEDIATE CODE: R313533

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

LAPS Cancellation because of no payment of annual fees