JPS61294591A - Information input device

Info

Publication number: JPS61294591A
Application number: JP60135311A
Authority: JP
Inventors: Hiroyuki Tsuboi (坪井 宏之); Yoichi Takebayashi (竹林 洋一)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1985-06-21
Filing date: 1985-06-21
Publication date: 1986-12-25
Also published as: JPH0469960B2

Abstract

PURPOSE: To reduce the amount of processing and shorten the processing time when input information is recognized, by statistically restricting the recognition candidates according to their occurrence frequencies and the like, and recognizing only candidates with a high probability of occurrence. CONSTITUTION: A spoken word is supplied to a pattern input part 1, which extracts a feature vector. A pattern recognizing part 3 compares this vector with the standard feature vectors of the recognition targets registered in advance in a recognition dictionary 2, and arranges the resulting candidates in descending order of similarity. This ordered list is delivered to a result selection part 4 as the pattern recognition result. In performing recognition, the part 3 restricts which of the recognition targets stored in the recognition dictionary 2 are matched, according to statistical frequency information, such as occurrence frequencies, held for those targets in a frequency table 6. The part 4 then outputs only the candidates that have high occurrence frequencies.

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to an information input device that recognizes an input pattern given as speech or as characters and inputs the target information, such as linguistic information, indicated by that pattern into an information processing system.

[Technical background of the invention and its problems]

With the development of information processing technology, various language processing systems such as Japanese word processors have been developed.

This type of information processing system is generally configured so that a kana character sequence is entered from a keyboard, the entered sequence undergoes linguistic processing to obtain linguistic information such as words and phrases, and that information is input into the system.

However, entering character strings from a keyboard requires practice and has several problems: it is hard for general users, the input form is unnatural, it is tiring, and a higher input speed cannot be expected.

Against this background, various information input devices using character and speech recognition technology have been studied. Such devices take handwritten characters themselves as input, or accept information by voice. They are expected to allow natural, simple, and fast information input.

However, pattern recognition technology is still incomplete in some respects and cannot always recognize input patterns correctly, so errors must be corrected whenever misrecognition occurs. Moreover, pattern recognition is a very sophisticated technology, and the equipment is very expensive.

There is therefore a strong demand for an inexpensive, high-performance pattern recognition device and for a high-performance information input device that uses such pattern recognition technology.

[Purpose of the invention]

The present invention has been made in view of these circumstances. Its object is to provide an inexpensive information input device that can recognize input patterns with high accuracy and effectively input the information they indicate.

[Summary of the invention]

The present invention takes into account advances in the speed and performance of semiconductor and pattern recognition technology, and the fact that the kind (field) of information to be entered and the frequency with which it is entered differ from one individual to another. It is characterized as follows. When an input pattern consisting of syllables, kana characters, or the like is recognized by referring to input target information, and information consisting of words, phrases, or the like indicated by the input pattern or a sequence of input patterns is input into an information processing system, the input target information subjected to recognition processing is restricted according to statistical frequency information about the user's input target information, such as occurrence frequency information and concatenation frequency information. For example, items whose statistical frequency is at or below a predetermined threshold are excluded from the recognition targets before the input pattern is recognized. The statistical frequency information about the input target information is then updated successively according to the recognition results.

[Effect of the invention]

Thus, according to the present invention, the input target information considered for what the user is entering is restricted on the basis of occurrence frequency and concatenation frequency information, so both the speed and the accuracy of recognition of input patterns can be improved.

That is, information with a low occurrence frequency for the given user is excluded, and the input pattern is recognized with only high-frequency information as possible recognition results. Processing speed therefore improves, and because unwanted recognition candidates do not appear, recognition accuracy improves as well.

Consequently, an economical, high-performance information input device can be realized, and substantial practical benefits are obtained, such as allowing the user to enter the desired information easily.

[Embodiments of the invention]

An embodiment of the present invention is described below with reference to the drawings.

Fig. 1 is a schematic configuration diagram of an embodiment that recognizes spoken words and inputs their linguistic information into a computer.

This device basically comprises a pattern input unit 1 that receives a speech pattern, a pattern recognition unit 3 that recognizes the input pattern by referring to a recognition dictionary 2, and a result selection unit 4 that selects and outputs the pattern recognition results. Reference numeral 5 denotes a system control unit that controls the operation of the entire device.

The pattern input unit 1 analyzes the input spoken word, performs preprocessing such as determining the start and end points of the word, and then extracts feature vectors such as phonemic features. The pattern recognition unit 3 matches these feature vectors against the standard feature vectors of the recognition targets (categories such as words) registered in advance in the recognition dictionary 2 to obtain recognition results (categories), for example by computing the composite similarity between the vectors. The recognition candidates (categories) are then arranged in descending order of similarity and output to the result selection unit 4 as the pattern recognition result.
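The matching and ranking step can be sketched as follows. This is only an illustration under stated assumptions: plain cosine similarity stands in for the composite similarity named in the text, and the dictionary entries, vectors, and function names are invented for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (a stand-in measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_candidates(feature, dictionary):
    """Return (category, similarity) pairs, highest similarity first."""
    scored = [(cat, cosine_similarity(feature, ref)) for cat, ref in dictionary.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical recognition dictionary: category -> standard feature vector.
recognition_dictionary = {
    "nyuuryoku": [1.0, 0.0, 0.2],
    "shutsuryoku": [0.0, 1.0, 0.2],
    "nyuuryuu": [0.9, 0.1, 0.6],
}

# Hypothetical feature vector extracted from one spoken word.
ranked = rank_candidates([1.0, 0.0, 0.3], recognition_dictionary)
```

The ordered list `ranked` plays the role of the pattern recognition result handed to the result selection unit.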

The pattern recognition processing in the pattern recognition unit 3 is performed by restricting which recognition targets (categories) in the recognition dictionary 2 are matched, for example according to statistical frequency information, such as occurrence frequencies, accumulated for the recognition targets in a frequency table 6 described later.

That is, the recognition targets registered in the recognition dictionary 2 generally include much information outside the field (range of linguistic information) that the user actually enters.

For this reason, the user's input history shows which information is entered often and which is hardly ever entered. The frequency table 6 tallies such occurrence frequency information for each recognition target registered in the recognition dictionary 2. According to the frequency information accumulated in the frequency table 6, the recognition targets offered for pattern matching are restricted, for example by excluding targets with a low occurrence frequency from matching against the input pattern.
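A minimal sketch of this restriction, assuming the frequency table is a simple mapping from category to count. The entries, counts, threshold value, and the choice to exclude items at or below the threshold are illustrative assumptions, not details of the device.

```python
def restrict_dictionary(dictionary, frequency_table, threshold):
    """Drop entries whose occurrence frequency is at or below the threshold."""
    return {
        category: reference
        for category, reference in dictionary.items()
        if frequency_table.get(category, 0) > threshold
    }

# Hypothetical data: three dictionary entries and one user's input counts.
recognition_dictionary = {"入力": [1.0, 0.0], "出力": [0.0, 1.0], "乳液": [0.9, 0.1]}
frequency_table = {"入力": 120, "出力": 45, "乳液": 0}

# Only the entries that survive are matched against the input pattern.
active = restrict_dictionary(recognition_dictionary, frequency_table, threshold=10)
```

Matching against `active` instead of the full dictionary is what reduces the amount of comparison work.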

In this way, the pattern recognition unit 3 performs matching only against the restricted recognition targets and obtains the recognition result for the input pattern.

The result selection unit 4 then selects and outputs, under the control of a frequency management unit 7, only the frequently occurring candidates from among the recognition candidates obtained by the pattern recognition unit 3.

That is, the result selection unit 4 compares, for example, the frequency threshold set by the frequency management unit 7 with the frequency information stored in the frequency table 6 for each recognition target, and determines whether the frequency of each extracted candidate is higher than the threshold. Candidates whose frequency value is lower than the threshold are excluded, and only the remaining candidates are output.

The recognition candidates (categories) selected in this way are presented to the user, who selects the correct recognition result if necessary. The candidate selected in this way is given to the computer system as input information.

At this point, the frequency management unit 7 receives information on the recognition candidate that the user selected as input to the computer system, and updates the frequency information stored in the frequency table 6 for the recognition candidates.

As a result, the frequency information stored in the frequency table 6 for each recognition target is updated according to the input history and is used for subsequent input.
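The update step might look like the sketch below. The text does not say how the frequency value changes on each input; incrementing a counter by one per confirmed selection is an assumption, as are the names and counts.

```python
from collections import Counter

def record_selection(frequency_table, chosen):
    """Increment the count of the candidate the user confirmed (assumed +1 rule)."""
    frequency_table[chosen] += 1
    return frequency_table[chosen]

# Hypothetical starting counts for one user.
frequency_table = Counter({"入力": 120, "出力": 45})

record_selection(frequency_table, "出力")  # user confirmed 出力
record_selection(frequency_table, "乳液")  # first time this entry is chosen
```

Because the counts persist between inputs, later thresholding reflects what this particular user has actually entered.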

The input unit 8 can be used to enter information that sets the frequency threshold of the frequency management unit 7. Lowering the threshold used in frequency thresholding is equivalent to raising the frequency information relative to it, and raising the threshold is equivalent to lowering the frequency information.
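This equivalence can be written as a pair of inequalities. The symbols are introduced here for illustration only: $f_i$ is the stored frequency of item $i$, $\theta$ is the threshold, and $\Delta > 0$ is the size of the adjustment.

```latex
f_i \ge \theta - \Delta \iff f_i + \Delta \ge \theta,
\qquad
f_i \ge \theta + \Delta \iff f_i - \Delta \ge \theta .
```

Lowering the threshold by $\Delta$ therefore retains exactly the items that would be retained if every frequency were raised by $\Delta$, and conversely for raising it.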

Setting this frequency threshold makes it possible to forcibly set the range of recognition targets that the processing described above excludes from the recognition candidates. For example, if the threshold is set lower than every frequency value, no recognition target is excluded at all.

At the same time, the frequency management unit 7 updates, as appropriate and in step with this update processing, the frequency threshold that sets the range of recognition targets to be excluded.

Furthermore, the input unit 8 serves to enter information that forcibly sets the selection criteria for recognition targets. When the frequency thresholding described above is applied, information the user wants to enter may be excluded from recognition because its frequency value is low. When such low-frequency information is to be entered, the input unit 8 allows the frequency information for that item to be raised from outside as a preset.

By setting the frequency information in this way, recognition targets that the processing described above would exclude are restored as recognition candidates. Information that is normally excluded from input can therefore still be entered when needed.
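One way to picture the preset is sketched below. The function names, the `margin` parameter, and the rule of raising the stored value just above the threshold are assumptions made for the example; the text only says that the frequency is raised from outside.

```python
def is_candidate(frequency_table, item, threshold):
    """True if the item's stored frequency is not below the threshold."""
    return frequency_table.get(item, 0) >= threshold

def preset_frequency(frequency_table, item, threshold, margin=1):
    """Raise the item's stored frequency so it clears the threshold.

    `margin` is an assumed parameter; an existing higher value is kept.
    """
    frequency_table[item] = max(frequency_table.get(item, 0), threshold + margin)

# Hypothetical counts: 乳液 is rarely entered by this user.
frequency_table = {"入力": 120, "乳液": 2}
threshold = 50

before = is_candidate(frequency_table, "乳液", threshold)
preset_frequency(frequency_table, "乳液", threshold)
after = is_candidate(frequency_table, "乳液", threshold)
```

After the preset, the rarely used entry takes part in recognition again without changing the threshold for everything else.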

Thus, according to this device, input target information is selected according to its frequency information before being used in recognition of the input pattern, so pattern recognition can be made faster.

In addition, because unwanted input target information is excluded from the recognition candidates for the input pattern, recognition accuracy can be improved. As a result, accurate information input becomes possible overall, and the efficiency of information input can also be improved.

Fig. 2 shows an example in which the present invention is applied to a voice word processor.

A spoken input pattern is analyzed by a monosyllable detection and recognition unit 11, which performs predetermined preprocessing and extracts feature vectors. The input speech is then recognized by referring to a recognition dictionary 12. This recognition is carried out in the same way as in the device shown in Fig. 1.

The monosyllable detection and recognition unit 11 outputs the recognition result for each input monosyllable as one or more recognition candidates.

A concatenation processing unit 13 tests the connection relationships between the syllables in the syllable candidate sequence obtained in this way, using a concatenation dictionary 14. The test is performed by checking, against the concatenation dictionary 14, whether the concatenation relationship holds for every combination of syllables in the candidate sequence.

This concatenation processing extracts the syllable candidate sequences whose concatenation relationships are possible.
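The concatenation test can be sketched as a filter over the candidate lattice. The data layout is an assumption made for this example: each position holds a list of syllable candidates, and the concatenation dictionary is a set of permitted (previous, next) pairs. Enumerating every combination is the simplest way to show the idea, not necessarily how the device does it.

```python
from itertools import product

def valid_sequences(lattice, allowed_pairs):
    """Yield syllable sequences in which every adjacent pair is permitted."""
    for sequence in product(*lattice):
        if all(pair in allowed_pairs for pair in zip(sequence, sequence[1:])):
            yield sequence

# Hypothetical lattice: each position lists the recognizer's syllable candidates.
lattice = [["に", "み"], ["ゆ"], ["う"]]

# Hypothetical concatenation dictionary: permitted (previous, next) pairs.
allowed_pairs = {("に", "ゆ"), ("ゆ", "う")}

sequences = list(valid_sequences(lattice, allowed_pairs))
```

Here the sequence starting with み is dropped because the pair (み, ゆ) is not in the assumed dictionary.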

For the syllable candidate sequences extracted in this way, a kana-kanji conversion unit 15 performs kana-kanji conversion by referring to a kana-kanji conversion dictionary 16, and obtains candidate character strings of mixed kana and kanji (phrase candidates) corresponding to the syllable candidate sequences. These phrase candidates are displayed on a result selection display unit 17, and the phrase candidate that correctly represents the input speech is selected according to instructions from an input unit 18.

The distinguishing feature of this device is that the concatenation processing and the kana-kanji conversion processing are each controlled, under the management of a frequency management unit 19, according to the frequency information registered in a concatenation frequency table 20 and a kana-kanji conversion dictionary frequency table 21. That is, the entries of the concatenation dictionary 14 used in concatenation processing are selected according to the frequency value of each concatenation relationship in the concatenation frequency table 20 and supplied to the concatenation processing unit 13. Likewise, the entries of the conversion dictionary 16 used in kana-kanji conversion are selected according to the frequency values in the kana-kanji conversion dictionary frequency table 21 and supplied to the kana-kanji conversion unit 15.

Through this dictionary selection, the concatenation processing unit 13 and the kana-kanji conversion unit 15 each perform their processing quickly and accurately within the selected range of target information.

Reference numeral 22 in the figure denotes a system control unit that controls the operation of the entire device.

To explain the control of concatenation processing according to frequency information: the frequency management unit 19, insofar as it controls concatenation processing, is configured as shown, for example, in Fig. 3.

A frequency management processing unit 23 sets, in a threshold register 24, the threshold that serves as the selection criterion for syllable pairs (concatenation pairs), and a comparison unit 25 compares the occurrence frequency value of each concatenation pair registered in the concatenation frequency table 20 with that threshold, pair by pair.

The occurrence frequency values of the concatenation pairs in the concatenation frequency table 20 are initialized in advance to general-purpose values. As concatenation processing proceeds, each time a concatenation relationship in a syllable sequence is tested, the frequency value for that pair is updated under the control of the frequency management processing unit 23. Likewise, the contents of the threshold register 24 are updated as needed under the control of the frequency management processing unit 23.

In the concatenation dictionary 14, a selection flag is set for each registered concatenation pair (category) according to the result of this frequency comparison. Whether a pair is used in the concatenation processing described above is decided by referring to this flag.

In other words, the concatenation processing unit 13 tests the concatenation relationships of the input syllable sequence using only the concatenation pairs that have been flagged under the set threshold.

Specifically, when, as shown in Fig. 4, the frequency for the concatenation pair (あ→あ) is given as (0), the frequency for the concatenation pair (あ→あ) as (100), and so on, comparison against a threshold of (50) removes the pair (あ→あ) with frequency (0) from concatenation processing, while for the pair (あ→あ) with frequency (100) a flag (1) is set and the pair is accepted as a target of concatenation processing. Similarly, the pair (あ→い) is accepted as a target of concatenation processing because its frequency is (5000).
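The flag-setting example can be worked through in a few lines. The threshold 50 and the frequencies 0 and 100 come from the text; reading the third, garbled value as 5000 is an assumption, and because the source prints the first two pairs identically, the second pair is given the placeholder label ("x", "y") here. The function name is hypothetical.

```python
def set_selection_flags(concatenation_frequencies, threshold):
    """Return {pair: flag}; flag 1 marks a pair used in concatenation processing."""
    return {
        pair: int(frequency > threshold)
        for pair, frequency in concatenation_frequencies.items()
    }

# ("x", "y") is a placeholder label for the pair whose frequency is 100.
concatenation_frequencies = {
    ("あ", "あ"): 0,
    ("x", "y"): 100,
    ("あ", "い"): 5000,
}

flags = set_selection_flags(concatenation_frequencies, threshold=50)
```

Only the pairs whose flag is 1 would then be consulted during the concatenation test.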

Using only the information (concatenation dictionary entries) of the pairs accepted in this way, the concatenation processing unit 13 tests the concatenation relationships of the input syllable sequence.

This selection of concatenation pairs lets the concatenation processing unit 13 work quickly and efficiently, and also accurately, because it makes effective use of only the frequently occurring pairs.

The value of the threshold register 24 can be set forcibly from the input unit 18, which forcibly sets the range of recognition targets. To enter a special concatenation relationship, a frequency exceeding the threshold can be preset for that pair from the input unit 18, so that the pair is used in concatenation processing together with the pairs selected on the basis of the statistical frequency information.

The frequency management unit 19, insofar as it controls kana-kanji conversion according to frequency information, is configured as shown, for example, in Fig. 5.

A frequency management processing unit 26 sets, in a threshold register 27, the threshold that serves as the selection criterion for kana-kanji conversion targets (categories), and a comparison unit 28 compares the occurrence frequency value of each conversion target registered in the frequency table 21 with that threshold, target by target. The occurrence frequency values of the conversion targets in the frequency table 21 are initialized in advance to general-purpose values. As kana-kanji conversion proceeds, the frequency value of each target that has been converted is updated under the control of the frequency management processing unit 26.

The kana-kanji dictionary 16 actually consists of an independent-word dictionary and an adjunct-word dictionary; accordingly, the frequency table 21 also consists of an independent-word frequency table and an adjunct-word frequency table.

The kana-kanji conversion unit 15 thus receives only the conversion targets restricted according to the frequency information, and converts the syllable sequence that has passed the concatenation test into kana-kanji.

Specifically, as shown in Fig. 6, suppose that for the input utterance 「にゆうりよくがあるならば」 ("if there is input") the syllable sequence 「にゆ（みゆ）うりよく（ぐ）で（が）あ（ば）る（ぶ）な（ま）らば」, with alternative candidates in parentheses, is obtained as the recognition result. This sequence is subjected to an independent-word test and an adjunct-word test and converted to kana-kanji. Because the independent word 「入力」 ("input") has been selected as a conversion candidate with a high occurrence frequency, the conversion result 「入力」 is obtained.

For the adjunct-word part, both 「である」 and 「がある」 are obtained as candidates, so the conversion results 「入力であるならば」 ("if it is input") and 「入力があるならば」 ("if there is input") are obtained as candidates.

Referring to the occurrence frequency information for this adjunct part shows that the adjunct words 「が」 and 「ある」 each occur more frequently than the adjunct word 「である」. On the basis of this information, the result selection display unit 17 judges that this speaker (user) prefers the adjunct 「がある」 and outputs it as the first candidate of the recognition result.
It is determined that the adjunct ``aru'' is preferred, and this is output as the first candidate of the recognition result.

If the processing threshold for adjunct words is set to, for example, (500), the conversion output 「である」 is blocked from the adjunct conversion targets. In that case only the kana-kanji conversion result 「入力があるならば」 is output.
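The effect of the adjunct threshold can be sketched as follows. Only the threshold value 500 and the two candidate adjuncts come from the text; the per-user frequencies, the tuple representation of each candidate, and the rule that every component word must meet the threshold are assumptions for illustration.

```python
def filter_adjuncts(candidates, adjunct_frequencies, threshold):
    """Keep adjunct candidates whose every component word meets the threshold."""
    return [
        words
        for words in candidates
        if all(adjunct_frequencies.get(word, 0) >= threshold for word in words)
    ]

# Hypothetical per-user adjunct frequencies.
adjunct_frequencies = {"である": 120, "が": 900, "ある": 750}

# The two adjunct readings from the example: 「である」 and 「が」+「ある」.
candidates = [("である",), ("が", "ある")]

kept = filter_adjuncts(candidates, adjunct_frequencies, threshold=500)
```

With these assumed counts only the reading 「が」+「ある」 survives, which matches the single result described in the text.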

By restricting the conversion candidates for independent words and adjunct words in advance according to their frequency information, needless kana-kanji conversion work is avoided and ambiguity in the conversion results is removed. Recognition accuracy improves correspondingly, and processing speed improves as well.

In this case too, independent words and adjunct words that were excluded from conversion by the frequency-based selection can be restored as conversion targets, for example by presetting their frequency values. The frequency threshold can also be set. Therefore, when information different from what is usually entered is to be input, this threshold-setting function makes it possible to enter that information accurately.

As described above, according to the present invention, when input information (a pattern) is recognized (converted), the recognition candidates (conversion candidates) are statistically restricted by their occurrence frequency and the like, and only those with a high probability of occurrence are recognized (converted). This reduces the amount of processing and shortens the processing time. Moreover, because the recognition (conversion) result is drawn from candidates with a high probability, recognition (conversion) accuracy can be improved.

When information excluded from recognition (conversion) by the frequency information needs to be included again, it is enough to preset the frequency information for that item, so a wide variety of input information can be handled flexibly. This facilitates information input and yields substantial practical benefits, such as allowing the necessary information to be entered simply and accurately.

The present invention is not limited to the embodiments described above. Speech input of Japanese text was used as the example here, but the invention applies equally to the input of handwritten characters. Also, frequency information was used here to control candidate selection in each of input pattern recognition, concatenation processing of the recognition results, and kana-kanji conversion, but each of these controls can also be applied on its own. The invention can be implemented with various other modifications without departing from its gist.

[Brief explanation of drawings]

FIG. 1 is a schematic configuration diagram of a basic embodiment of the present invention. FIG. 2 is a schematic configuration diagram of an embodiment applied to a voice word processor. FIG. 3 is a diagram showing a configuration example of the frequency management section used for concatenation processing. FIG. 4 is a diagram showing a configuration example of the concatenation frequency table. FIG. 5 is a diagram showing a configuration example of the frequency management section used for kana-kanji conversion processing. FIG. 6 is a diagram showing an example of kana-kanji conversion processing.

1... pattern input section, 2... recognition dictionary, 3... pattern recognition section, 4... result selection section, 5... system control section, 6... frequency table, 7... frequency management section, 8... input section, 11... monosyllable detection and recognition section, 12... recognition dictionary, 13... concatenation processing section, 14... concatenation dictionary, 15... kana-kanji conversion section, 16... kana-kanji conversion dictionary, 17... result selection display section, 18... input section, 19... frequency management section, 20... concatenation frequency table, 21... kana-kanji conversion frequency table, 22... system control section.

Applicant's representative: Patent attorney Takehiko Suzue

FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, FIG. 6

Claims

[Claims]

(1) An information input device that recognizes an input pattern by referring to input target information and inputs the information indicated by the input pattern into an information processing system, characterized in that the input pattern is recognized by restricting the input target information subjected to recognition processing for the input pattern according to statistical frequency information regarding the input target information, and in that the statistical frequency information regarding the input target information is sequentially updated according to the recognition processing result.

(2) The information input device according to claim 1, wherein the statistical frequency information regarding the input target information includes appearance frequency information, concatenation frequency information, and the like.

(3) The information input device according to claim …, wherein the restriction of the input target information subjected to recognition processing is performed by excluding, from the recognition targets for the input pattern, input target information whose statistical frequency information is below a predetermined threshold value.

(4) The information input device according to claim 1, wherein the input pattern consists of syllables or kana characters, and the input target information consists of words or phrases consisting of the series of the syllables or kana characters.

(5) The information input device according to claim 1, wherein the restriction on input target information to be subjected to recognition processing is lifted as necessary.