JPS63173100A

JPS63173100A - Keyword extractor

Info

Publication number: JPS63173100A
Application number: JP62006724A
Authority: JP
Inventors: 浩明服部
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-01-13
Filing date: 1987-01-13
Publication date: 1988-07-16
Anticipated expiration: 2009-05-02
Also published as: JPH0634193B2

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of industrial application]

The present invention relates to a device for extracting keywords from continuously uttered speech.

[Conventional technology]

When a person listens to speech, they are thought to predict what the speaker will say by estimating the "topic". If a speech recognition device could likewise estimate the topic, then even without predicting the utterance itself it could preselect words by choosing, from among several word sets, the set that fits the topic. This makes a recognition device with a high recognition rate achievable.

To estimate the topic, it is enough to detect words in the sentence that identify the topic (such words are referred to below as keywords).

The technique of detecting a specific word in a sentence and locating it is called word spotting. A known word-spotting method is the continuous DP (dynamic programming) method (Ryuichi Oka, "Continuous word recognition using continuous DP", Speech Research Group Material S78-20). FIG. 2 shows an example of the continuous DP method. In the figure, 11 is the pattern of continuously uttered input speech, 12 is the keyword pattern, 13 is the distance between the keyword pattern and the input speech pattern obtained by continuous DP, and 14 is the matching plane. In the continuous DP method, word spotting is performed by tracing the path on matching plane 14 backward from a point whose distance is at or below a set threshold.
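The following Python sketch illustrates the idea of word spotting by continuous DP. It is a simplified illustration under stated assumptions, not the formulation of the cited reference: features are single numbers rather than vectors, the local distance is an absolute difference, the path constraints are the plain three-neighbour ones, and the names `continuous_dp` and `spot` are hypothetical. Instead of tracing the path backward, it carries the start frame of each path forward, which recovers the start of the best path without storing the whole matching plane.

```python
from typing import List, Optional, Sequence, Tuple

INF = float("inf")


def continuous_dp(keyword: Sequence[float],
                  speech: Sequence[float]) -> Tuple[List[float], List[int]]:
    """For every input frame t, return the length-normalised distance of the
    best path that ends at t, and the input frame where that path starts."""
    n = len(keyword)
    if n == 0:
        raise ValueError("keyword pattern must not be empty")
    # prev[j] = (cumulative cost, start frame) for keyword frame j at time t-1
    prev: List[Tuple[float, int]] = [(INF, 0)] * n
    distances: List[float] = []
    starts: List[int] = []
    for t, x in enumerate(speech):
        cur: List[Tuple[float, int]] = []
        for j in range(n):
            local = abs(x - keyword[j])  # assumed local distance
            if j == 0:
                # Free start point: a new path may begin at every input frame.
                best = (0.0, t)
            else:
                # Predecessors: input-only step, diagonal step, keyword-only step.
                best = min(prev[j], prev[j - 1], cur[j - 1])
            cur.append((best[0] + local, best[1]))
        prev = cur
        distances.append(cur[-1][0] / n)
        starts.append(cur[-1][1])
    return distances, starts


def spot(keyword: Sequence[float], speech: Sequence[float],
         threshold: float) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, distance) of the best match at or below the
    threshold, or None when no frame reaches the threshold."""
    distances, starts = continuous_dp(keyword, speech)
    if not distances:
        return None
    end = min(range(len(distances)), key=distances.__getitem__)
    if distances[end] > threshold:
        return None
    return starts[end], end, distances[end]
```

Note that the inner loop runs once per input frame and per keyword frame, which is the cost the invention sets out to reduce.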

[Problem that the invention seeks to solve]

However, the continuous DP method computes distances for every input frame, so the amount of computation is large. If the section of the sentence that contains a keyword can be identified, the amount of computation can be reduced and keywords can be found efficiently.

[Means for solving problems]

The keyword extraction device of the present invention comprises: pitch extraction means for extracting pitch information from continuously uttered input speech; storage means for storing a standard pattern of a keyword used to estimate a topic, together with a pitch change pattern indicating that the keyword has been emphasized; section extraction means for comparing the pitch information with the pitch change pattern and extracting a section that contains the keyword; feature extraction means for converting the input speech into a sequence of feature vectors; matching means for matching the extracted section against the standard pattern; and determining means for determining the keyword from the result of the matching.

[Operation]

When people utter a sentence, they emphasize the words that carry the information they want to convey. Words emphasized in a sentence are therefore likely to be the words that identify its topic, that is, keywords. If the emphasized parts of a sentence can be extracted, keywords can be found efficiently. In Japanese, it has been reported that the pitch pattern changes when a word in a sentence is emphasized (see Iwata, "On control rules for fundamental frequency patterns in conversational sentences", Speech Research Group Material S85-42). Iwata states that the change in pitch pattern depends on the accent type (which syllable of a word carries the accent) of the emphasized word and of the words before and after it, but basically the pitch rises on the emphasized word. The emphasized parts of a sentence can therefore be extracted by capturing the pitch change pattern.
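As a rough illustration of extracting emphasized parts from a pitch contour, the Python sketch below marks runs of frames whose pitch lies clearly above the utterance's median. This is a simplifying assumption standing in for the comparison with a stored pitch change pattern; the function name `emphasized_sections`, the `ratio` of 1.2, and the minimum run length are hypothetical, and unvoiced frames are assumed to be encoded as 0.

```python
from statistics import median
from typing import List, Sequence, Tuple


def emphasized_sections(pitch_hz: Sequence[float], ratio: float = 1.2,
                        min_frames: int = 3) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) for each run of at least `min_frames`
    frames whose pitch exceeds `ratio` times the median voiced pitch."""
    voiced = [f for f in pitch_hz if f > 0]
    if not voiced:
        return []
    limit = ratio * median(voiced)
    sections: List[Tuple[int, int]] = []
    start = None
    # A trailing 0.0 acts as a sentinel so that a run reaching the end is closed.
    for i, f in enumerate(list(pitch_hz) + [0.0]):
        if f > limit:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_frames:
                sections.append((start, i - 1))
            start = None
    return sections
```

A median baseline only works when the emphasized word is a small part of the utterance; a per-keyword pattern, as described in the text, would not have that limitation.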

[Example]

Next, an embodiment of the present invention is described with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of the present invention.

FIGS. 3, 4, and 5 are diagrams for explaining the operation of an embodiment of the present invention.

Referring to the figures, a standard pattern for each keyword and a pitch change pattern indicating that the keyword has been emphasized are registered in advance in keyword dictionary (storage unit) 3. The feature parameters that make up the standard pattern are not limited to directly acoustic features such as band-pass filter outputs or mel-cepstral coefficients; patterns converted to symbols by vector quantization or the like may also be used. One or more keywords are registered for each topic. Suppose that "today" (今日) is registered as one of the keywords for the topic "season", and that the dictionary records that the pitch pattern of "today" rises when the word is emphasized.

FIG. 3(a) shows the energy of the input speech "The weather is very nice today" (今日はとてもよい天気です). When speech is input, pitch extraction unit 1 first extracts the pitch. Various pitch extraction methods can be used, for example one that derives the pitch from the autocorrelation of the error signal of linear predictive analysis. FIG. 3(b) shows the result of pitch extraction: the solid line is the case where the word "today" is uttered without emphasis, and the dotted line is the case where it is uttered with emphasis. With emphasis, the pitch rises over "today" (今日は).

Next, section extraction unit 2 compares the extracted pitch change pattern with the pitch change pattern of "today" registered in keyword dictionary 3, and extracts the section where the pitch rises, "today" (今日は), as a keyword candidate section. Feature extraction unit 4 then computes the feature parameters described above for the input speech. Matching unit 5 retrieves the standard pattern of "today" from keyword dictionary 3 and matches it against the extracted keyword candidate section. Various matching methods can be used, for example the continuous DP method described above.
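The pitch extraction step can be sketched in Python as follows. The text names the autocorrelation of the linear-prediction error signal; for brevity this sketch applies the autocorrelation directly to the waveform and skips the linear-prediction step, which is a simplification. The function names, the 60 to 400 Hz search range, and the voicing threshold of 0.3 are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def estimate_pitch(frame: Sequence[float], sample_rate: int,
                   f_min: float = 60.0, f_max: float = 400.0,
                   voicing: float = 0.3) -> float:
    """Estimate the pitch of one frame in Hz from its autocorrelation peak.
    Returns 0.0 when the frame is judged unvoiced or silent."""
    n = len(frame)
    r0 = sum(s * s for s in frame)  # autocorrelation at lag 0 (frame energy)
    if r0 <= 0.0:
        return 0.0
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), n - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    # Treat the frame as unvoiced when the peak is weak relative to the energy.
    if best_lag == 0 or best_r / r0 < voicing:
        return 0.0
    return sample_rate / best_lag


def synth_tone(freq_hz: float, sample_rate: int, length: int) -> List[float]:
    """Generate a pure sine tone, used here only to exercise the estimator."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(length)]
```

Applying `estimate_pitch` to successive frames of an utterance gives a pitch contour of the kind plotted in FIG. 3(b), with 0.0 marking unvoiced frames.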

FIGS. 4 and 5 show an example of matching by the continuous DP method. FIG. 4 shows the distance obtained from continuous DP between keyword candidate section 21 and standard pattern 22 shown in FIG. 5. If the matching yields a value smaller than threshold α, determination unit 6 judges that the keyword "today" is present. Furthermore, if t is the time at which the distance is smallest, the position of the keyword "today" can be determined by tracing the path on matching plane 23 that ends at time t.
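The decision step can be sketched in Python as below. It assumes that matching over the candidate section has already produced, for each frame of the section, a distance and the start frame of the corresponding path; the function name `decide_keyword` and the `section_offset` parameter, which maps section-relative frames back to the whole utterance, are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def decide_keyword(distances: Sequence[float], path_starts: Sequence[int],
                   alpha: float,
                   section_offset: int = 0) -> Optional[Tuple[int, int]]:
    """Return the (start, end) frames of the keyword in the whole utterance,
    or None when no distance in the candidate section is smaller than alpha.

    distances[i]   -- matching distance of the best path ending at frame i
    path_starts[i] -- section-relative frame where that path begins
    """
    if not distances:
        return None
    # t is the time at which the distance takes its minimum value.
    t = min(range(len(distances)), key=distances.__getitem__)
    if distances[t] >= alpha:
        return None  # keyword judged absent
    return section_offset + path_starts[t], section_offset + t
```

For example, `decide_keyword([2.0, 1.0, 0.2, 0.9], [0, 0, 1, 1], alpha=0.5, section_offset=10)` locates the keyword at frames 11 to 12 of the utterance.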

[Effect of the invention]

As described above, according to the present invention, searching for keywords only in the sections of a sentence that are emphasized reduces the amount of computation and allows keywords to be found efficiently. A continuous speech recognition device can therefore use the invention to estimate topics and scenes, enabling highly accurate recognition.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing an embodiment of the present invention, FIG. 2 is a diagram explaining the prior art, and FIGS. 3, 4, and 5 are diagrams for explaining the operation of an embodiment of the present invention.

1: pitch extraction unit, 2: section extraction unit, 3: keyword dictionary, 4: feature extraction unit, 5: matching unit, 6: determination unit, 21: keyword candidate section, 22: standard pattern, 23: matching plane.

Claims

[Claims]

1. A keyword extraction device comprising: pitch extraction means for extracting pitch information from continuously uttered input speech; storage means for storing a standard pattern of a keyword for estimating a topic and a pitch change pattern indicating that the keyword has been emphasized; section extraction means for comparing the pitch information with the pitch change pattern and extracting a section that contains the keyword; feature extraction means for converting the input speech into a sequence of feature vectors; matching means for matching the extracted section against the standard pattern; and determining means for determining the keyword from the result of the matching.