JP5342629B2 - Male and female voice identification method, male and female voice identification device, and program - Google Patents


Info

Publication number: JP5342629B2
Application number: JP2011223680A
Authority: JP (Japan)
Prior art keywords: voice, speech, time length, male, signal
Legal status: Active (granted)
Other versions: JP2013083796A (publication of the application)
Other languages: Japanese (ja)
Inventors: 光昭 磯貝, 哲 小橋川, 秀之 水野
Original and current assignee: Nippon Telegraph and Telephone Corp
Application JP2011223680A filed by Nippon Telegraph and Telephone Corp
Publication of JP2013083796A (application); publication of JP5342629B2 (grant)


Abstract

PROBLEM TO BE SOLVED: To accurately identify the gender of the speaker of a speech signal even when the input speech signal is extremely short.
SOLUTION: A male/female voice identification method extracts a speech feature from an input speech signal and identifies the gender of the speaker based on likelihoods obtained by matching the feature against a male acoustic model and a female acoustic model. When the duration of the speech signal is less than a predetermined length L, the signal is repeatedly extended until its duration reaches L or more; the speech feature is extracted from the extended signal, and the matching and identification are performed using a recognition grammar corresponding to the repetition.

Description

The present invention relates to a male/female voice identification method, a male/female voice identification device, and a program for identifying the gender of the speaker of an input speech signal.

Male/female voice identification, which determines the gender of a speaker from an input speech signal, is an important technique not only for identifying gender as such but also, for example, for improving the accuracy of speech recognition.

Conventionally, to decide whether an input speech signal is a male or a female voice, a speech feature is extracted from the signal and matched against a male acoustic model and a female acoustic model created with statistical modeling such as a GMM (Gaussian Mixture Model) to obtain likelihoods, and the voice is classified as male or female based on those likelihoods.

Patent Document 1 describes matching a speech feature extracted from an input speech signal in this way against a male acoustic model and a female acoustic model to obtain likelihoods; male and female voices can be identified based on these likelihoods.

FIG. 11 shows an example configuration of a male/female voice identification device that identifies the gender of the speaker of an input speech signal by the method described above. The device comprises a male/female voice identification processing unit 10, a male acoustic model 20, and a female acoustic model 30. The male acoustic model 20 includes a speech-segment model 21 and a non-speech-segment model 22; likewise, the female acoustic model 30 includes a speech-segment model 31 and a non-speech-segment model 32.

In this example, the male/female voice identification processing unit 10 comprises a speech feature extraction unit 11, a recognition grammar setting unit 12, and an identification unit 13. The speech feature extraction unit 11 extracts a speech feature from the input speech signal (a digital signal obtained by A/D conversion). The recognition grammar setting unit 12 sets the recognition grammar used when the feature is matched against the male acoustic model 20 and the female acoustic model 30 to obtain likelihoods. The identification unit 13 uses the recognition grammar to match the feature against the two models, obtains the likelihoods, and identifies the gender of the speaker of the signal from the obtained likelihoods. The identification processing unit 10 outputs the identification result.

Since male/female identification must be performed per utterance, the recognition grammar set by the recognition grammar setting unit 12 is generally recognition grammar (1) below. The grammar is written in an extended BNF notation.

・Recognition grammar (1)
$[p] = pause;
$[g] = garbage;
$START = $p $g $p;
Here, $[xxx] = declares a symbol; on the right-hand side, pause is a symbol representing non-speech such as silence, and garbage is a symbol representing speech. $START is the start symbol representing the entire sentence. The symbol = denotes a definition, [ ] specifies a word notation, and ; marks the end of a definition.

Recognition grammar (1) assumes that the input appears in the order non-speech → speech → non-speech.

Patent Document 1: JP 2011-13543 A

Incidentally, when male/female identification is performed on a very short speech signal, for example one second or less, the following problems arise.

That is, when a speech feature is extracted from an input speech signal, a normalization step such as CMN (Cepstral Mean Normalization) or CVN (Cepstral Variance Normalization) is generally applied to remove speaker-dependent bias in the feature. However, because such normalization relies on a statistical analysis of the feature, a statistically reliable result cannot be obtained unless a speech signal of a certain length is available, and the normalization may consequently fail to work correctly.
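For background, a generic sketch of CMN and CVN over a feature matrix (not code from the patent; the function names and array shapes are my own):

```python
import numpy as np

def cmn(features: np.ndarray) -> np.ndarray:
    """Cepstral Mean Normalization: subtract the per-coefficient mean
    taken over all frames. `features` has shape (n_frames, n_coeffs)."""
    return features - features.mean(axis=0, keepdims=True)

def cvn(features: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Cepstral Variance Normalization: CMN followed by scaling each
    coefficient to unit variance."""
    centered = cmn(features)
    return centered / (centered.std(axis=0, keepdims=True) + eps)

# With only a handful of frames, the estimated mean and variance are
# noisy -- which is exactly why very short utterances normalize poorly.
few_frames = np.array([[1.0, 2.0], [3.0, 4.0]])
print(cmn(few_frames))
```

Both statistics are estimated from the frames of the utterance itself, so their reliability grows with the amount of input speech.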

Therefore, when a very short utterance such as "yes" or "no" is input, the speaker-dependent bias remaining in the feature extracted and normalized from the signal affects the likelihoods against the male and female acoustic models, and identification accuracy may be degraded.

In view of these problems, an object of the present invention is to provide a male/female voice identification method and a male/female voice identification device that can accurately identify the gender of the speaker of a speech signal even when the input signal is very short.

According to the invention of claim 1, in a male/female voice identification method that extracts a speech feature from an input speech signal and identifies the gender of the speaker based on likelihoods obtained by matching the feature against a male acoustic model and a female acoustic model, when the duration of the speech signal is less than a predetermined length L, the signal is repeatedly extended until its duration reaches L or more, the speech feature is extracted from the extended signal, and the matching and identification are performed using a recognition grammar corresponding to the repetition.

According to the invention of claim 2, in the invention of claim 1, when performing the repetition, a speech segment of the signal is detected and only that segment is repeated to extend the signal.

According to the invention of claim 3, in the invention of claim 2, when the length of the detected speech segment is less than a threshold T, the segment is rejected and the identification is not performed.

According to the invention of claim 4, in any one of claims 1 to 3, the length L can be set externally.

According to the invention of claim 5, in any one of claims 1 to 3, the length L is calculated from the computational load of the machine performing the male/female voice identification and the required response time.

According to the invention of claim 6, a male/female voice identification device comprises: a speech length determination unit that determines whether the duration of an input speech signal is less than a predetermined length L, outputs the signal to a speech extension unit if it is less than L, and outputs it to a male/female voice identification processing unit if it is L or more; a speech extension unit that repeatedly extends the signal received from the speech length determination unit until its duration reaches L or more and outputs the extended signal to the identification processing unit; and a male/female voice identification processing unit that extracts a speech feature from the signal received from the speech length determination unit or the speech extension unit, matches the feature against a male acoustic model and a female acoustic model using a recognition grammar corresponding to the signal from which the feature was extracted, identifies the gender of the speaker from the resulting likelihoods, and outputs the result.

According to the invention, when the input speech signal is short, less than the predetermined length L, the signal is repeated and a recognition grammar corresponding to the repetition is used. The feature normalization is thereby stabilized and performed correctly, which makes it possible to accurately identify the gender of the speaker.

FIG. 1 is a block diagram showing the functional configuration of a male/female voice identification device that executes the first embodiment of the male/female voice identification method according to the invention.
FIG. 2 is a flowchart showing the processing flow of the first embodiment.
FIG. 3 illustrates an example of speech extension.
FIG. 4 illustrates the effect of speech extension on feature normalization.
FIG. 5 is a block diagram showing the functional configuration of a device that executes the second embodiment.
FIG. 6 is a flowchart showing the processing flow of the second embodiment.
FIG. 7 illustrates an example of speech extension.
FIG. 8 is a flowchart showing the processing flow of the third embodiment.
FIG. 9 is a block diagram showing the functional configuration of a device that executes the fourth embodiment.
FIG. 10 is a block diagram showing the functional configuration of a device that executes the fifth embodiment.
FIG. 11 is a block diagram showing the functional configuration of a device that executes a conventional male/female voice identification method.

Embodiments of the invention will now be described by way of examples with reference to the drawings.

FIG. 1 shows the functional configuration of the male/female voice identification device of the first embodiment, and FIG. 2 shows its processing flow.

In this example, the device adds a speech length determination unit 40 and a speech extension unit 50 to the conventional male/female voice identification device shown in FIG. 11.

The speech signal whose speaker's gender is to be identified is input to the speech length determination unit 40 (step S1). The unit 40 determines whether the duration of the input signal is less than the predetermined length L (step S2); if it is less than L, the unit outputs the signal to the speech extension unit 50, and if it is L or more, the unit outputs the signal to the male/female voice identification processing unit 10.

The speech extension unit 50 repeatedly extends the signal received from the speech length determination unit 40 until its duration reaches L or more (step S3), and outputs the extended signal to the identification processing unit 10.

The length L can be any value greater than 0. It may be determined, for example, by experimentally finding a value effective for improving identification accuracy on a speech set for the task to which the identification is applied. Here, as an example, L = 2 seconds.

The speech extension in the speech extension unit 50 is performed as follows. In this example, the unit 50 has a buffer 51, into which the frames of the input signal are copied one at a time, starting from the first frame. When the last frame of the input signal is reached, copying resumes from the first frame. This is repeated until the length of the frames held in the buffer 51 reaches L or more. The copying may be stopped as soon as the buffered length equals (or exceeds) L (FIG. 3 shows an example of an extended signal produced this way, together with the input signal), or it may continue until the last frame of the input signal is reached after the buffered length has exceeded L.
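A minimal sketch of this buffer-copy extension, treating the signal as a list of frames (the function name and the stop_at_exact flag are my own; the two stopping rules are the two variants described above):

```python
def extend_frames(frames: list, min_frames: int, stop_at_exact: bool = True) -> list:
    """Repeat the input frame sequence until the buffer holds at least
    min_frames frames. With stop_at_exact=True, copying stops the moment
    the target length is reached; with False, the final pass runs through
    to the last input frame before stopping."""
    if not frames:
        return []
    buf = []
    while len(buf) < min_frames:
        for f in frames:
            buf.append(f)
            if stop_at_exact and len(buf) >= min_frames:
                return buf
    return buf

# 0.6 s of audio (60 frames at 100 frames/s) extended to L = 2 s (200 frames):
short = list(range(60))
print(len(extend_frames(short, 200)))          # 200 (stop exactly at L)
print(len(extend_frames(short, 200, False)))   # 240 (finish the last pass)
```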

In general speech recognition, such extension would be undesirable, because the recognition result (the transcription of the speech) would differ from the input. In male/female voice identification, however, the spoken content (what is being said) does not need to be identified, so this extension can be applied.

Speech signals are input to the identification processing unit 10 from the speech length determination unit 40 and the speech extension unit 50. The speech feature extraction unit 11 extracts their speech features (step S4). Because an extended signal contains speech and non-speech alternately, possibly repeated several times, a recognition grammar corresponding to this repetition must be used. The recognition grammar setting unit 12 therefore sets the grammar according to whether the signal came from the speech length determination unit 40 or from the speech extension unit 50: grammar (1) above in the former case, and grammar (2) below in the latter. Grammar (2), like grammar (1), is written in extended BNF notation.

・Recognition grammar (2)
$[p] = pause;
$[g] = garbage;
$START = <$p|$g>;
Here, the symbols < > denote one or more repetitions, and | denotes alternation (parallel connection).

Recognition grammar (2) assumes that non-speech and speech appear alternately.
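As an informal illustration (my own analogy, not part of the patent), the two grammars can be approximated by regular expressions over a per-frame label string, where 'p' marks a pause (non-speech) frame and 'g' a garbage (speech) frame. A label sequence produced by repeating a short utterance fails grammar (1) but fits grammar (2):

```python
import re

# Grammar (1): exactly pause, then speech, then pause.
G1 = re.compile(r"p+g+p+")
# Grammar (2): one or more runs of pause or speech, in any alternation.
G2 = re.compile(r"(?:p+|g+)+")

extended = "pggp" * 3  # pause/speech pattern repeated by signal extension
print(bool(G1.fullmatch(extended)))  # False
print(bool(G2.fullmatch(extended)))  # True
```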

The identification unit 13 uses the grammar set by the recognition grammar setting unit 12 to match the speech feature against the male acoustic model 20 and the female acoustic model 30 and obtains likelihoods (step S5), then identifies the gender of the speaker from the obtained likelihoods (step S6). The identification processing unit 10 outputs the identification result.
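A toy illustration of likelihood-based gender classification, using single Gaussians over a one-dimensional feature in place of the patent's GMM-based acoustic models and grammar-driven matching (all names and parameters here are invented for the sketch):

```python
import numpy as np

def log_likelihood(features: np.ndarray, mean: float, var: float) -> float:
    """Total log-likelihood of the features under a 1-D Gaussian model."""
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var)
                                + (features - mean) ** 2 / var)))

# Stand-ins for the male/female acoustic models.
male_mean, male_var = -1.0, 1.0
female_mean, female_var = 1.0, 1.0

rng = np.random.default_rng(0)
features = rng.normal(female_mean, 1.0, size=100)  # pretend female speech

ll_m = log_likelihood(features, male_mean, male_var)
ll_f = log_likelihood(features, female_mean, female_var)
print("female" if ll_f > ll_m else "male")
```

The decision rule is the same shape as in the patent: whichever model assigns the higher likelihood wins.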

In this example, as described above, when the input signal is shorter than the predetermined length L, the signal is repeated to extend its duration, which increases the number of segments over which the feature mean used for identifying the speaker's gender can be obtained.

FIG. 4(B) shows this; for comparison, FIG. 4(A) shows the conventional case without speech extension.

When normalization such as sequential CMN is performed using the mean of the features over a window of N seconds (the past N seconds), the conventional case of FIG. 4(A) has little data available for computing the mean, so the benefit of CMN is not fully obtained. In FIGS. 4(A) and (B), the double-headed arrows indicate the N-second window; among them, the arrows drawn entirely in solid lines indicate segments where the N-second feature mean is available.

In FIG. 4(A), the N-second feature mean is not available for the initial segments S1 and S2; it is available only for the three segments S3 to S5. In FIG. 4(B), with speech extension, the mean is available for the seven segments S3 to S9. The normalization effect of CMN and the like is thus fully obtained, improving the accuracy of male/female identification.

N is, for example, about 0.8 seconds. If N is too long, the mean is taken over a wide span, so the normalization effect weakens and identification accuracy drops. It is therefore undesirable, for example, simply to normalize using the features of the entire signal; N is set to about 0.8 seconds as described above.
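A sketch of the windowed (sequential) CMN described here, with the window expressed in frames rather than seconds (e.g. N = 0.8 s corresponds to 80 frames at a 10 ms frame shift; the function name and the handling of the first frames are my own assumptions):

```python
import numpy as np

def sliding_cmn(features: np.ndarray, window: int) -> np.ndarray:
    """Sequential CMN: from each frame, subtract the mean of the
    preceding `window` frames (including the frame itself). Frames
    near the start use whatever shorter history exists."""
    out = np.empty_like(features)
    for t in range(len(features)):
        start = max(0, t - window + 1)
        out[t] = features[t] - features[start:t + 1].mean(axis=0)
    return out
```

Repeating a short signal gives more frames a full N-second history, which is the effect FIG. 4(B) depicts.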

FIG. 5 shows the functional configuration of the male/female voice identification device of the second embodiment, and FIG. 6 shows its processing flow.

In this example, a speech segment detection unit 60 is added to the device of the first embodiment shown in FIG. 1; relative to the flow of the first embodiment shown in FIG. 2, speech segment detection (step S11) is performed before the extension (step S3).

A speech signal recorded in an ordinary environment contains segments that are not speech, such as noise or silence (non-speech segments). This is true even of very short signals, part of which will be non-speech. When a relatively short signal is input, the non-speech portion may be about as long as the speech portion, or the speech portion may even be shorter.

Feature normalization, on the other hand, makes no particular distinction between speech and non-speech segments. Noise contained in the non-speech segments therefore affects the statistical analysis of the features, and correct normalization may become impossible. In a very short signal containing noise, normalization errors caused by the noise can thus affect the likelihoods against the male and female acoustic models and reduce identification accuracy.

The second embodiment solves this problem. A signal judged by the speech length determination unit 40 to be shorter than the predetermined length L is input to the speech segment detection unit 60, which detects the speech segment of the signal (step S11) and outputs only that segment to the speech extension unit 50. The extension unit 50 repeatedly extends only the input speech segment until its duration reaches L or more (step S3). FIG. 7 shows a speech segment being detected in an input signal and then repeated to produce an extended signal.

Any existing speech segment detection method can be used by the speech segment detection unit 60, for example the speech signal segment estimation method described in Japanese Patent No. 4691079.
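As a crude stand-in for such a detector (this energy-threshold toy is my own and does not implement the cited patented method), speech frames can be separated from silence by their energy:

```python
import numpy as np

def detect_speech_frames(frames: np.ndarray, threshold: float) -> np.ndarray:
    """Toy energy-based voice activity detection: keep the frames whose
    mean energy exceeds `threshold`. `frames` has shape
    (n_frames, samples_per_frame)."""
    energy = (frames ** 2).mean(axis=1)
    return frames[energy > threshold]

silence = np.zeros((30, 160))
speech = 0.5 * np.ones((20, 160))
signal = np.vstack([silence, speech, silence])
print(len(detect_speech_frames(signal, 0.01)))  # 20
```

The retained frames would then be passed to the extension step in place of the full signal.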

In this example, normalization is performed only on the speech segment, which stabilizes the normalization and enables more accurate male/female identification.

FIG. 8 shows the processing flow of the third embodiment, which adds steps S12 and S13 to the flow of the second embodiment shown in FIG. 6.

If the speech segment contained in the input signal is extremely short, there is a high risk that male/female identification cannot be performed with sufficient accuracy even after the extension. Such input may also be a misutterance, or not a speech signal at all, and rejecting it may be desirable.

The third embodiment performs this rejection: after the speech segment is detected (step S11), its length is compared with a threshold T (step S12); if it is less than T, the segment is rejected (step S13) and no male/female identification is performed. T is a value greater than 0 and may be determined by a method such as experimentally finding an appropriate value. Here, as an example, T = 0.2 seconds.

FIG. 9 shows the functional configuration of the male/female voice identification device of the fourth embodiment.

In this example, an extension length input unit 70 is added to the device of the first embodiment shown in FIG. 1.

The longer the signal subjected to male/female identification, the longer the identification takes. When the processing speed requirement is strict and the result must be output within a fixed time even at some cost in accuracy, it is convenient to be able to set the extension length externally to a value L' (L' < L).

To make such external setting of the extension length possible, this example provides the extension length input unit 70, so that the time required for identification (its response time) can be controlled according to the processing speed requirement.

FIG. 10 shows the functional configuration of the male/female voice identification device of the fifth embodiment.

In this example, an extension length calculation unit 80 replaces the extension length input unit 70 of the fourth embodiment shown in FIG. 9.

For example, when the male/female voice identification technique is applied to a system that recognizes an input speech signal and replies with synthesized speech of the same gender as the speaker of the input speech signal, the gender of the input speech signal must be identified within roughly the same time as the speech recognition. The time that is requested externally in this way and is available for processing is called the request response time R.

The extension time length calculation unit 80 calculates the extension time length each time from the request response time R and externally input computer load information (for example, load average information obtainable from the OS), and outputs it to the speech extension unit 50.
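As one concrete way to obtain the load information mentioned above (an assumption for illustration; the patent does not prescribe a particular API), Python's `os.getloadavg()` returns the system load averages on POSIX systems:

```python
import os

# 1-, 5-, and 15-minute system load averages on POSIX systems.
# The 1-minute value could serve as the computer load W that is fed
# to the extension time length calculation.
one_min, five_min, fifteen_min = os.getloadavg()
```

Note that `os.getloadavg()` raises `OSError` on platforms (such as Windows) where the load average is unavailable.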

Here, with the computer load denoted W and the time length to be calculated denoted L'', the time length L'' can be computed, for example, as
L'' = A × (R / W)
where A is a constant that can be determined experimentally in advance so that the desired response time is obtained for given W and L''.

Suppose, for example, that for a given system configuration an experiment shows that with computer load W = 1.0 and time length L'' = 2.0 seconds, a response can be returned in a response time of 0.5 seconds. The constant A is then
A = L'' × (W / R)
 = 2.0 × (1.0 / 0.5)
 = 4.0.
The constant A can be determined in this way.

With the constant A set to an appropriate value, for example 4.0, an input of computer load W = 1.5 and request response time R = 0.5 gives a time length L'' of
L'' = 4.0 × (0.5 / 1.5)
 ≈ 1.33 (seconds).
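The calibration of A and the computation of L'' above can be sketched as follows (function and variable names are illustrative assumptions):

```python
# Sketch of the extension-length calculation of the fifth embodiment:
# L'' = A * (R / W), with A calibrated once from a measured configuration.

def calibrate_constant_a(time_length: float, load: float, response_time: float) -> float:
    """Derive A = L'' * (W / R) from one experimentally measured setup."""
    return time_length * (load / response_time)

def extension_length(a: float, response_time: float, load: float) -> float:
    """Compute L'' = A * (R / W) for the current load and deadline."""
    return a * (response_time / load)

# Reproducing the worked example from the text:
a = calibrate_constant_a(time_length=2.0, load=1.0, response_time=0.5)  # A = 4.0
l2 = extension_length(a, response_time=0.5, load=1.5)                   # L'' ≈ 1.33 s
```

A heavier load W thus shortens L'', trading identification accuracy for meeting the same response-time requirement.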

The time length L'' obtained in this way is then used for speech extension in the same manner as in the first embodiment. That is, the speech extension unit 50 has a buffer 51 into which the input speech signal is copied one frame at a time, starting from its first frame. When the final frame of the input speech signal is reached, copying resumes from the first frame of the input speech signal. This process is repeated until the length of the frames held in the buffer 51 is equal to or greater than the time length L''; the repeated copying stops at the point where the buffered frame length equals (or exceeds) L''.
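The frame-by-frame extension performed by the speech extension unit 50 and buffer 51 can be sketched as follows (names are illustrative assumptions; each list element stands for one fixed-length audio frame):

```python
# Sketch of the cyclic frame copying: the input frames are copied into
# the buffer one at a time, wrapping back to the first frame after the
# last, until the buffer holds at least target_frames frames.

def extend_frames(frames: list, target_frames: int) -> list:
    """Repeat the input frames cyclically until the buffer reaches target_frames."""
    if not frames:
        raise ValueError("input speech signal has no frames")
    buffer = []
    i = 0
    while len(buffer) < target_frames:
        buffer.append(frames[i % len(frames)])  # wrap to the first frame at the end
        i += 1
    return buffer

# A 3-frame signal extended to an 8-frame target:
extended = extend_frames(["f0", "f1", "f2"], 8)
```

The copying stops exactly when the target length is reached, so the last repetition of the signal may be truncated mid-way, as in the example above.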

Thus, in this example, male/female voice identification that meets the request response time can be performed while taking fluctuations of the computer load into account.

Various embodiments have been described above. This invention is characterized in that, when the input speech signal is very short, the speech signal is repeated and thereby extended. Techniques that extend data by combining part of the input data with the input data itself have, by contrast, been used before. For example, when part of transmitted data is missing, there are techniques that interpolate the missing part using the parts that are not missing. When matching data against a template, there are also techniques that copy the data at an end of the data to the outside of that end so that the end can be included in the matching. However, neither technique repeatedly copies the entire data for the purpose of extending it to an arbitrary length.

Techniques that perform recognition using an acoustic model also exist, for example speech recognition. Speech recognition, however, is processing that places importance on the recognized utterance content; if the input speech is repeated to extend its time length, the utterance itself becomes something different.

Accordingly, there has been no prior idea of combining a technique that performs recognition using an acoustic model with a technique that extends the time length of the input speech by repeating it.

In the male/female voice identification of this invention, by contrast, the utterance content need not be recognized. What is identified using the acoustic models is the gender of the speaker of the input speech signal, and gender identification requires only the speech features extracted from the input speech signal. Therefore, by performing male/female voice identification on data obtained by repeatedly extending the input speech signal, the accuracy of the identification can be improved.

The male/female voice identification apparatus and method described above can be realized by a computer and a program installed in the computer. The installed program is interpreted by the computer's CPU and causes the computer to execute the male/female voice identification method described above.

DESCRIPTION OF REFERENCE SYMBOLS
10 male/female voice identification processing unit
11 speech feature extraction unit
12 recognition grammar setting unit
13 identification unit
20 male voice acoustic model
30 female voice acoustic model
40 speech length determination unit
50 speech extension unit
60 speech segment detection unit
70 extension time length input unit
80 extension time length calculation unit

Claims (7)

1. A male/female voice identification method that extracts a speech feature from an input speech signal and identifies the gender of the speaker of the speech signal based on likelihoods obtained by matching the speech feature against a male voice acoustic model and a female voice acoustic model, wherein
when the time length of the speech signal is less than a predetermined time length L, the speech signal is repeatedly extended until it reaches the time length L or more, and
the speech feature is extracted from the extended speech signal, and the matching and the identification are performed using a recognition grammar corresponding to the repetition.
2. The male/female voice identification method according to claim 1, wherein, when the repetition is performed, a speech segment of the speech signal is detected and only that speech segment is repeated to extend the speech signal.
3. The male/female voice identification method according to claim 2, wherein, when the length of the detected speech segment is less than a threshold T, the speech segment is rejected and the identification is not performed.
4. The male/female voice identification method according to any one of claims 1 to 3, wherein the time length L can be set externally.
5. The male/female voice identification method according to any one of claims 1 to 3, wherein the time length L is calculated from the computer load for performing the male/female voice identification and a request response time.
6. A male/female voice identification apparatus comprising:
a speech length determination unit that determines whether the time length of an input speech signal is less than a predetermined time length L, outputs the speech signal to a speech extension unit when the time length is determined to be less than L, and outputs the speech signal to a male/female voice identification processing unit when the time length is determined to be L or more;
the speech extension unit, which repeatedly extends the speech signal input from the speech length determination unit until it reaches the time length L or more, and outputs the extended speech signal to the male/female voice identification processing unit; and
the male/female voice identification processing unit, which extracts speech features from the speech signal input from the speech length determination unit and from the speech signal input from the speech extension unit, matches the speech features against a male voice acoustic model and a female voice acoustic model using a recognition grammar corresponding to the speech signal from which the speech features were extracted, and, based on the matching likelihoods, identifies and outputs the gender of the speaker of the speech signal from which the speech features were extracted.
7. A program for causing a computer to execute the male/female voice identification method according to any one of claims 1 to 5.
JP2011223680A 2011-10-11 2011-10-11 Male and female voice identification method, male and female voice identification device, and program Active JP5342629B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011223680A JP5342629B2 (en) 2011-10-11 2011-10-11 Male and female voice identification method, male and female voice identification device, and program


Publications (2)

Publication Number Publication Date
JP2013083796A JP2013083796A (en) 2013-05-09
JP5342629B2 true JP5342629B2 (en) 2013-11-13





Legal Events

2013-07-25  A977  Report on retrieval (JAPANESE INTERMEDIATE CODE: A971007)
2013-07-30  TRDD  Decision of grant or rejection written
2013-07-30  A01   Written decision to grant a patent or to grant a registration (utility model) (JAPANESE INTERMEDIATE CODE: A01)
2013-08-09  A61   First payment of annual fees (during grant procedure) (JAPANESE INTERMEDIATE CODE: A61)
            R150  Certificate of patent or registration of utility model (Ref document number: 5342629; Country of ref document: JP; JAPANESE INTERMEDIATE CODE: R150)
2013-09-18  A977  Report on retrieval (JAPANESE INTERMEDIATE CODE: A971007)
            S531  Written request for registration of change of domicile (JAPANESE INTERMEDIATE CODE: R313531)
            R350  Written notification of registration of transfer (JAPANESE INTERMEDIATE CODE: R350)