JPS60251484A - Word recognition system - Google Patents

Word recognition system

Info

Publication number
JPS60251484A
Authority
JP
Japan
Prior art keywords
recognition
word
recognition unit
input
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP59108798A
Other languages
Japanese (ja)
Other versions
JPH0711821B2 (en)
Inventor
Kenichi Maeda
賢一 前田
Yoshitaka Okazawa
岡沢 好高
Shunji Ariyoshi
俊二 有吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to JP59108798A
Publication of JPS60251484A
Publication of JPH0711821B2
Anticipated expiration
Expired - Lifetime

Landscapes

  • Character Discrimination (AREA)

Abstract

PURPOSE: To obtain a recognition result simply and efficiently by scanning the input pattern over the length of the word to be recognized and recognizing the word while extracting candidate recognition unit sequences from the input pattern. CONSTITUTION: A word recognition section 6 extracts, from the input character string, candidate character strings having the same number of characters as a recognition target word registered in a recognition dictionary 5. The set of similarities calculated at mutually corresponding character positions between the character patterns of a candidate string and the dictionary patterns of the characters of the target word is taken as the word recognition result for that candidate. The word recognition results obtained for the several candidates extracted from the input character string are compared with one another, and the candidate that is most probable as the target word is recognized as the part of the input character string corresponding to the target word.

Description

[Detailed Description of the Invention]

[Technical Field of the Invention]

The present invention relates to the recognition of words in a continuously written character string.

[Technical Background of the Invention and Its Problems]

In recent years, with the development of information processing systems, various pattern recognition methods that play an important role as man-machine interfaces have been developed.

Among these, word recognition processing can recognize an input pattern as meaningful information (words), and can therefore be regarded as a particularly good man-machine interface.

However, conventional word recognition methods, as exemplified by Japanese Patent Application No. 57-164936, presuppose that the input pattern is supplied already divided into word units. There is no guarantee, however, that an input pattern submitted for word recognition is divided into words. For example, when a character string indicating the destination written on a piece of mail is recognized and the mail is sorted according to the recognition result, the character string is not necessarily written clearly divided into "prefecture name", "ward/city/county name", "town/village name", "street number", and so on. Therefore, when the word recognition methods proposed so far are used to recognize words in input patterns such as character strings or speech, the range (interval) in which the target word exists in the input pattern has had to be determined by some means. It has been extremely difficult, however, to determine the range of the target word from a continuously written character string or continuously uttered speech.

[Purpose of the Invention]

The present invention has been made in view of these circumstances, and its object is to provide a word recognition system that can recognize a target word in an input pattern simply and effectively even when the words in the input pattern are not delimited.

[Summary of the Invention]

The present invention segments an input pattern consisting of a continuously written character string, continuously uttered speech, or the like into predetermined recognition units (characters, or phonemes or syllables) to obtain the recognition unit sequence of the input pattern. From that sequence it extracts candidate recognition unit sequences having the same number of recognition units as the recognition unit sequence constituting the recognition target word. Each recognition-unit input pattern of a candidate is matched against the corresponding recognition-unit dictionary pattern of the target word to obtain a recognition result for that candidate, and the recognition results are compared with one another to determine which candidate in the input pattern corresponds to the target word.

For example, the similarity between each recognition-unit input pattern of a candidate extracted from the input pattern and the corresponding recognition-unit dictionary pattern of the recognition target word is calculated for each recognition unit. The recognition result for the candidate is obtained as the sum (or product) of these similarities, or as the sum (or product) of the similarities after predetermined weighting, and the recognition results are compared with one another to determine which candidate in the input pattern corresponds to the target word.
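As a notational sketch only (the symbols below do not appear in the original and are introduced here for illustration): let a candidate and the target word both consist of N recognition units, let s_i be the similarity calculated at the i-th unit position, and let w_i be the predetermined weights. The four forms of recognition result mentioned above can then be written as follows, and the candidate c that maximizes the chosen R is taken as the match.

```latex
R_{\mathrm{sum}} = \sum_{i=1}^{N} s_i, \qquad
R_{\mathrm{prod}} = \prod_{i=1}^{N} s_i, \qquad
R_{\mathrm{wsum}} = \sum_{i=1}^{N} w_i\, s_i, \qquad
R_{\mathrm{wprod}} = \prod_{i=1}^{N} w_i\, s_i,
\qquad
\hat{c} = \arg\max_{c} R(c)
```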

[Effect of the Invention]

Thus, according to the present invention, even for an input pattern in which words are not delimited, such as a continuously written character string or continuously uttered speech, candidate recognition unit sequences having the same number of recognition units as the recognition target word are extracted. The recognition result of each candidate with respect to the target word is obtained from the similarities between the recognition-unit input patterns of the candidate and the recognition-unit dictionary patterns of the target word, and the word in the input pattern is recognized from these results. Word recognition on such input patterns can therefore be performed simply and effectively.

In other words, the input pattern is scanned over the length of the word to be recognized (its recognition unit sequence), and word recognition is performed while candidate recognition unit sequences are extracted from the input pattern, so the recognition result can be obtained simply and efficiently.

[Embodiments of the Invention]

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 is a schematic block diagram of an embodiment apparatus that recognizes the address (destination) indicated by a character string written continuously on a piece of mail or the like.

The character string written on a piece of mail or the like is photoelectrically converted and read in via an input section 1, and is written into an image memory 2 as an input character string image. A segmentation section 3 segments the input character string image stored in the image memory 2 one character at a time, the character being the basic unit of recognition processing, and thereby obtains the character sequence of the input pattern. This segmentation is performed using known techniques as appropriate, for example by detecting the boundaries between characters from frame or pitch information, or by detecting each character using timing marks that indicate where the characters are written. The character pitch can be detected, for example, by detecting changes in the projected density of the character string pattern.
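The projected-density idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the segmentation circuit of the embodiment: the image is assumed to be a binary raster (a list of rows of 0/1 values), and the function names `column_projection` and `split_on_gaps` and the `gap_threshold` parameter are hypothetical.

```python
def column_projection(image):
    """Count dark pixels in each column of a binary image (list of rows of 0/1)."""
    if not image:
        return []
    return [sum(row[x] for row in image) for x in range(len(image[0]))]


def split_on_gaps(projection, gap_threshold=0):
    """Return (start, end) column ranges whose density exceeds gap_threshold.

    The end index is exclusive. Columns at or below the threshold are
    treated as gaps between characters.
    """
    spans = []
    start = None
    for x, density in enumerate(projection):
        if density > gap_threshold and start is None:
            start = x                  # a character region begins
        elif density <= gap_threshold and start is not None:
            spans.append((start, x))   # the region ended at the previous column
            start = None
    if start is not None:              # region running to the right edge
        spans.append((start, len(projection)))
    return spans


# Tiny hypothetical raster: two blobs separated by one empty column.
image = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
]
print(split_on_gaps(column_projection(image)))  # [(0, 2), (3, 5)]
```

Splitting on empty columns alone would break apart characters that contain internal vertical gaps, which is presumably why the text speaks of using the projection to estimate the character pitch rather than to cut characters directly.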

The character sequence information of the input pattern obtained by this segmentation is supplied to a character recognition section 4, where it is matched against the dictionary patterns, registered in advance in a recognition dictionary 5, of the characters of the recognition target words, and character recognition is performed for each input character. This character recognition is carried out by calculating, for each character pattern of the input character string, its similarity to the dictionary pattern of each character of the recognition target words registered in the recognition dictionary 5, or a measure such as a distance that is equivalent to the similarity.

For this character recognition it is appropriate to use, for example, the well-known multiple similarity method (複合類似度法).
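The patent names the similarity method but gives no formula. The sketch below uses a form of the measure as it is commonly stated in the pattern recognition literature, in which each category is described by a few reference axes with associated eigenvalues and the similarity is a weighted sum of squared normalized projections. The function name, the argument layout, and the assumption that the axes and eigenvalues have already been estimated from training samples are all illustrative, not taken from the source.

```python
def multiple_similarity(pattern, axes, eigenvalues):
    """Weighted sum of squared normalized projections onto reference axes.

    pattern     -- feature vector of the input character (list of floats)
    axes        -- reference vectors of one dictionary category
    eigenvalues -- matching eigenvalues, largest first (assumed positive)
    """
    norm_sq = sum(v * v for v in pattern)
    if norm_sq == 0:
        return 0.0                      # an empty pattern matches nothing
    lead = eigenvalues[0]
    total = 0.0
    for axis, lam in zip(axes, eigenvalues):
        axis_norm_sq = sum(a * a for a in axis)
        dot = sum(p * a for p, a in zip(pattern, axis))
        # (lambda_k / lambda_1) * cos^2 of the angle between pattern and axis
        total += (lam / lead) * dot * dot / (norm_sq * axis_norm_sq)
    return total


# Hypothetical two-dimensional category with two orthogonal axes.
axes = [[1.0, 0.0], [0.0, 1.0]]
eigenvalues = [1.0, 0.5]
print(multiple_similarity([2.0, 0.0], axes, eigenvalues))  # 1.0
print(multiple_similarity([0.0, 3.0], axes, eigenvalues))  # 0.5
```

With a single axis this reduces to the squared cosine between the input and one reference pattern, so the value lies between 0 and 1 and is independent of the input's overall intensity.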

The word recognition section 6 extracts from the input character string candidate strings having the same number of characters as a recognition target word registered in the recognition dictionary 5. For each candidate, the set of similarities calculated at mutually corresponding character positions between the character patterns of the candidate and the dictionary patterns of the characters of the target word is taken as the word recognition result of that candidate for the target word. The word recognition results, each consisting of such a set of similarities and obtained for each of the candidates extracted from the input character string, are then compared with one another, and the candidate that is most probable as the target word is recognized as the part of the input character string corresponding to the target word.

Specifically, the word recognition section 6 recognizes a target word in the input character string as follows. For simplicity, an example is described in which the word "品川区" (Shinagawa-ku) is recognized in an input character string indicating an address.

In this case, dictionary patterns for the characters "品", "川", and "区" of the recognition target word "品川区" are prepared in advance in the recognition dictionary 5. The information required for word recognition is the similarity of each character pattern of the input character string to each of the dictionary patterns "品", "川", and "区". Suppose that, as shown in FIG. 2, a character string A indicating the address "東京都品川区小山1-2-3" (Tokyo-to Shinagawa-ku Koyama 1-2-3) is input, and that the similarities C between these input character patterns and the dictionary patterns of the characters of the recognition target word B registered in the recognition dictionary 5 are obtained as the numerical values shown in FIG. 2. These values are given as similarities whose maximum value is 1.0.

Since the recognition target word "品川区" is a three-character string, three-character candidate strings are extracted from the segmented input character string. The candidates are extracted, for example, by first taking the three characters starting at the first character position of the input character string and then shifting the extraction start position one character at a time. Specifically, the similarity to the word "品川区" is first obtained for the three-character string "東京都". The similarity of the string "東京都" as a word is obtained as a set of three values: the similarity calculated by character recognition between the character pattern of the first character "東" and the dictionary pattern of "品", the similarity between the second character "京" and "川", and the similarity between the third character "都" and "区". The recognition result for the word represented by this set of similarities is obtained, for example, as the sum (or average) of the individual similarities. The recognition result may instead be obtained as the product of the similarities, or as the sum or product of the similarities after each has been multiplied by a predetermined coefficient. Using the product of the similarities as the evaluation value is effective for rejecting a candidate string from recognition when the character recognition result of even one of its characters is doubtful.
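The difference between the sum and the product as evaluation values can be illustrated with a short sketch. The function names `score_sum` and `score_product` and all numerical values below are hypothetical; they are not the FIG. 2 values.

```python
from math import prod


def score_sum(similarities, weights=None):
    """Sum of per-character similarities, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(similarities)
    return sum(w * s for w, s in zip(weights, similarities))


def score_product(similarities, weights=None):
    """Product of per-character similarities, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(similarities)
    return prod(w * s for w, s in zip(weights, similarities))


# Hypothetical similarity sets for two three-character candidates.
uniform = [0.8, 0.8, 0.8]        # every character matches moderately well
one_doubtful = [1.0, 1.0, 0.4]   # two perfect characters, one doubtful

# The sums are (nearly) equal, but the product penalizes the doubtful character.
print(score_sum(uniform), score_sum(one_doubtful))
print(score_product(uniform), score_product(one_doubtful))
```

Under the sum the two candidates are indistinguishable, while under the product the candidate containing one doubtful character scores lower, which matches the rejection behavior described above.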

As noted above, multiplying each character recognition result (similarity) by a predetermined coefficient is effective when, for example, recognition performance differs from character to character. That is, the evaluation value for the word can be obtained with the recognition results for simple characters and those for complex characters each taken into account.

The three-character string extracted from the input character string is then shifted one character at a time, and the similarity as a word is calculated for each of these strings in the same way. That is, after the recognition result has been obtained for the candidate "東京都", the candidates "京都品", "都品川", "品川区", and "川区小" are extracted in turn, and the recognition result for each is obtained in the same way. In the example shown in FIG. 2, the similarities of these candidates to the recognition target word (the sums of the similarities obtained by character recognition) are (1.4), (1.[3) [sic], (1.4), (2.6), and (1.3), respectively. These similarities are compared with one another, and the candidate that is most probable as the recognition target word is recognized as the word corresponding to it. In this example, therefore, the fourth candidate extracted from the input character string, "品川区", is recognized as the word corresponding to the recognition target word registered in the recognition dictionary 5.
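The scan described above can be sketched as a sliding window over the segmented input. This is a minimal illustration, assuming a toy similarity that returns 1.0 for identical characters and 0.0 otherwise in place of real pattern similarities; the names `exact_match`, `window_scores`, and `best_window` are hypothetical.

```python
def exact_match(input_unit, dictionary_unit):
    """Toy similarity: 1.0 for identical characters, else 0.0 (an assumption)."""
    return 1.0 if input_unit == dictionary_unit else 0.0


def window_scores(units, word, similarity=exact_match):
    """Score every window of len(word) units; index i is the window starting at i."""
    n = len(word)
    return [
        sum(similarity(units[start + k], word[k]) for k in range(n))
        for start in range(len(units) - n + 1)
    ]


def best_window(scores):
    """Return (start, score) of the highest-scoring window, or None if there is none."""
    if not scores:
        return None
    start = max(range(len(scores)), key=lambda i: scores[i])  # first one on ties
    return start, scores[start]


address = "東京都品川区小山1-2-3"
scores = window_scores(address, "品川区")
start, score = best_window(scores)
print(address[start:start + 3], start, score)  # 品川区 3 3.0
```

With zero-based indexing, start position 3 is the fourth window, which agrees with the fourth candidate being selected in the example.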

Which candidate in the input character string is recognized as the word may be decided by an algorithm such as the following. For example, a candidate string may be recognized as the target word when its similarity to "品川区" exceeds a certain threshold. Alternatively, when it is known that the word "品川区" always appears exactly once in the input character string, the candidate with the highest similarity may be recognized as the target word.
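The two decision rules can be sketched as follows. The function names are hypothetical, the scores are made-up illustrative values, and accepting the first candidate that exceeds the threshold is one possible reading of the threshold rule, which the text does not spell out.

```python
def first_above_threshold(scores, threshold):
    """Return the start index of the first candidate whose score exceeds
    the threshold, or None if no candidate does."""
    for start, score in enumerate(scores):
        if score > threshold:
            return start
    return None


def highest_scoring(scores):
    """Return the start index of the best candidate; suitable when the word
    is known to occur exactly once. Returns None for an empty list."""
    if not scores:
        return None
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical word-level scores for four consecutive candidates.
scores = [0.9, 1.2, 2.5, 1.1]
print(first_above_threshold(scores, 2.0))  # 2
print(first_above_threshold(scores, 3.0))  # None
print(highest_scoring(scores))             # 2
```

The threshold rule can report that the word is absent, whereas the maximum rule always returns some candidate, so it fits only inputs in which the word is known to be present.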

In this way, the word "品川区" can be recognized simply and effectively within a continuous character string indicating an address. Needless to say, words such as "東京都" (Tokyo-to) and "小山" (Koyama) can be recognized in the same way.

When, after the word "品川区" has been recognized, the following word "小山" is to be recognized, the recognition processing described above need only be applied to the character string that begins at the character following those for which the word recognition result was obtained.
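Restarting the scan after a recognized word can be sketched as below. As before, the exact-match similarity is a stand-in assumption for real pattern similarities, and `find_word` and `find_in_order` are hypothetical names.

```python
def toy_similarity(a, b):
    """Stand-in similarity: 1.0 for identical characters, else 0.0."""
    return 1.0 if a == b else 0.0


def find_word(units, word, start=0, similarity=toy_similarity):
    """Return the start index of the best-scoring window at or after `start`,
    or None if no window of len(word) fits."""
    best_start, best_score = None, float("-inf")
    for s in range(start, len(units) - len(word) + 1):
        score = sum(similarity(units[s + k], word[k]) for k in range(len(word)))
        if score > best_score:
            best_start, best_score = s, score
    return best_start


def find_in_order(units, words):
    """Recognize words one after another, each scan starting at the character
    following the previously recognized word."""
    position = 0
    found = []
    for word in words:
        hit = find_word(units, word, position)
        if hit is None:
            break
        found.append((word, hit))
        position = hit + len(word)   # continue right after this word
    return found


print(find_in_order("東京都品川区小山1-2-3", ["品川区", "小山"]))
# [('品川区', 3), ('小山', 6)]
```

Restricting the second scan to the remainder of the string shortens it and prevents a later word from being matched inside an earlier part of the address.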

When there are several recognition target words at the same time, such as "品川区" and "大田区" (Ota-ku), the similarities for each target word are obtained in parallel. The similarities of the most probable candidate strings obtained for each target word are then compared with one another, both to select the recognition target word and to find the corresponding word in the input character string for the selected target word.
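Selection among several target words can be sketched as follows: find the best candidate for each word, then compare those best scores. The exact-match similarity and the names `best_match` and `select_word` are assumptions made for illustration, and the loop is sequential rather than parallel.

```python
def char_similarity(a, b):
    """Stand-in similarity: 1.0 for identical characters, else 0.0."""
    return 1.0 if a == b else 0.0


def best_match(units, word, similarity=char_similarity):
    """Return (score, start) of the best window for one target word, or None."""
    candidates = [
        (sum(similarity(units[s + k], word[k]) for k in range(len(word))), s)
        for s in range(len(units) - len(word) + 1)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])   # first one on ties


def select_word(units, words, similarity=char_similarity):
    """Pick the target word whose best candidate scores highest.

    Returns (word, start, score), or None if no word fits the input."""
    results = {}
    for word in words:
        hit = best_match(units, word, similarity)
        if hit is not None:
            results[word] = hit
    if not results:
        return None
    chosen = max(results, key=lambda w: results[w][0])
    score, start = results[chosen]
    return chosen, start, score


print(select_word("東京都品川区小山1-2-3", ["品川区", "大田区"]))
# ('品川区', 3, 3.0)
```

Raw sums are directly comparable here because both target words have three characters. For target words of different lengths some normalization, such as the average mentioned earlier, would presumably be needed; the text does not address this case.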

As described above, according to the present invention, a target word in an input pattern can be recognized even when the input pattern is not divided into word units, which is a very significant benefit.

The present invention is not limited to the embodiment described above. For example, it can also be applied to the recognition of words in continuously uttered speech. FIG. 3 shows such an example: continuous speech is input via an input section 1a, and the input speech pattern is stored in an input speech memory 2a. A segmentation section 3a divides this speech pattern into, for example, individual syllables (or phonemes) to obtain a syllable sequence (phoneme sequence). A syllable recognition section 4a then refers to a recognition dictionary 5a and calculates the similarity between each syllable pattern and each syllable dictionary pattern of the recognition target words by a method such as the multiple similarity method. A word recognition section 6a then performs word recognition in the same manner as in the preceding embodiment. Compared with the address recognition embodiment, the input section 1a and the segmentation section 3a must be configured specifically for speech processing, but the syllable recognition section 4a and the word recognition section 6a may have basically the same configuration as the character recognition section 4 and the word recognition section 6 of the preceding embodiment. Needless to say, the recognition dictionary 5a is configured differently.

Moreover, it is not necessary to calculate all the per-pattern similarities in advance. For example, only the pattern similarities needed to calculate the similarity of a word may be calculated, in synchronization with the movement of the position at which the word is being sought. To do this, as shown in FIG. 4 for example, the ID numbers of the required patterns are fed back to the character recognition section 4, and only the similarities for those patterns are input to the word recognition section 6.
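The on-demand variant can be sketched as a similarity source that computes a value only when it is requested for a given input position and dictionary pattern ID, and remembers it afterwards. The class name `LazySimilarity`, the use of the character itself as the pattern ID, and the exact-match similarity are all assumptions made for illustration.

```python
class LazySimilarity:
    """Compute similarities on request and cache them by (position, pattern ID)."""

    def __init__(self, units, compute):
        self._units = units        # segmented input units
        self._compute = compute    # function (input_unit, pattern_id) -> similarity
        self._cache = {}
        self.calls = 0             # number of similarities actually computed

    def __call__(self, position, pattern_id):
        key = (position, pattern_id)
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._compute(self._units[position], pattern_id)
        return self._cache[key]


def scan(lazy, input_len, word):
    """Word-level score for every window, requesting only what each window needs."""
    return [
        sum(lazy(start + k, word[k]) for k in range(len(word)))
        for start in range(input_len - len(word) + 1)
    ]


address = "東京都品川区小山1-2-3"
lazy = LazySimilarity(address, lambda unit, pid: 1.0 if unit == pid else 0.0)
scores = scan(lazy, len(address), "品川区")
print(scores.index(max(scores)), lazy.calls)  # 3 33
```

For this 13-character input and three-character word, 33 similarities are computed instead of the 39 needed to compare every input character with every dictionary pattern of the word. The saving would be larger when the dictionary holds patterns for many target words and only one word is being sought.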

In short, the present invention can be implemented with various modifications without departing from its gist.

[Brief Description of the Drawings]

FIG. 1 is a schematic block diagram of an apparatus according to an embodiment of the present invention, FIG. 2 is a diagram for explaining the word recognition processing in the embodiment, and FIGS. 3 and 4 are schematic block diagrams of apparatuses according to other embodiments of the present invention.

1, 1a: input section; 2: image memory; 2a: speech memory; 3, 3a: segmentation section; 4: character recognition section; 4a: syllable recognition section; 5, 5a: recognition dictionary; 6, 6a: word recognition section.

Applicant's agent: Patent attorney Takehiko Suzue

Claims (6)

[Claims]

(1) A word recognition system comprising: means for segmenting an input pattern into predetermined recognition units to obtain a recognition unit sequence of the input pattern; means for extracting, from the recognition unit sequence of the input pattern, candidate recognition unit sequences each having the same number of recognition units as the recognition unit sequence constituting a recognition target word; means for matching each recognition-unit input pattern of each candidate recognition unit sequence against the corresponding recognition-unit dictionary pattern of the recognition target word to obtain a recognition result for the candidate; and means for comparing with one another the recognition results obtained for the respective candidates extracted from the recognition unit sequence of the input pattern and determining the candidate in the input pattern that corresponds to the recognition target word.
(2) The word recognition system according to claim 1, wherein the recognition result for a candidate recognition unit sequence is obtained as the sum of the similarities calculated, for each recognition unit, between the recognition-unit input patterns of the candidate and the recognition-unit dictionary patterns of the recognition target word, or as the sum of values obtained by applying predetermined weights to those similarities.
(3) The word recognition system according to claim 1, wherein the recognition result for a candidate recognition unit sequence is obtained as the product of the similarities calculated, for each recognition unit, between the recognition-unit input patterns of the candidate and the recognition-unit dictionary patterns of the recognition target word, or as the product of values obtained by applying predetermined weights to those similarities.
(4) The word recognition system according to claim 1, wherein the input pattern consists of a continuously written character string and the recognition unit is a character.
(5) The word recognition system according to claim 1, wherein the input pattern consists of continuously uttered speech and the recognition unit is a phoneme or a syllable.
(6) The word recognition system according to claim 1, wherein the candidate recognition unit sequences are extracted by shifting one recognition unit at a time, in order, from a predetermined recognition unit position in the recognition unit sequence of the input pattern.
JP59108798A 1984-05-29 1984-05-29 Word recognizer Expired - Lifetime JPH0711821B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59108798A JPH0711821B2 (en) 1984-05-29 1984-05-29 Word recognizer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59108798A JPH0711821B2 (en) 1984-05-29 1984-05-29 Word recognizer

Publications (2)

Publication Number Publication Date
JPS60251484A true JPS60251484A (en) 1985-12-12
JPH0711821B2 JPH0711821B2 (en) 1995-02-08

Family

ID=14493746

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59108798A Expired - Lifetime JPH0711821B2 (en) 1984-05-29 1984-05-29 Word recognizer

Country Status (1)

Country Link
JP (1) JPH0711821B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01177180A (en) * 1988-01-04 1989-07-13 Oki Electric Ind Co Ltd Character recognizing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS57146380A (en) * 1981-03-04 1982-09-09 Nec Corp Address reader
JPS58154899A (en) * 1982-03-10 1983-09-14 日本電気株式会社 Word recognition equipment

Also Published As

Publication number Publication date
JPH0711821B2 (en) 1995-02-08

Similar Documents

Publication Publication Date Title
CA2158064C (en) Speech processing
KR100312920B1 (en) Method and apparatus for connected speech recognition
KR101217524B1 (en) Utterance verification method and device for isolated word nbest recognition result
Wshah et al. Statistical script independent word spotting in offline handwritten documents
US4769844A (en) Voice recognition system having a check scheme for registration of reference data
CN109948144B (en) Teacher utterance intelligent processing method based on classroom teaching situation
US20110106814A1 (en) Search device, search index creating device, and search system
CN111128128B (en) Voice keyword detection method based on complementary model scoring fusion
EP0074769A1 (en) Recognition of speech or speech-like sounds using associative memory
JPS60251484A (en) Word recognition system
CN111429886A (en) Voice recognition method and system
JPH049320B2 (en)
JPS6325366B2 (en)
JP2000099084A (en) Voice recognition method and device therefor
JPH067346B2 (en) Voice recognizer
Zhang et al. A study on tone statistics in Chinese names
JP2979912B2 (en) Voice recognition device
JPH0632014B2 (en) Word detection method
EP0692134B1 (en) Speech processing
JP2760096B2 (en) Voice recognition method
JPS60164800A (en) Voice recognition equipment
JPH04291399A (en) Voice recognizing method
JP2005332271A (en) Device, method, and program for determining question type classification
Fu et al. A robust C/V segmentation algorithm for Cantonese
JPS60217490A (en) Character recognizing device

Legal Events

Date Code Title Description
EXPY Cancellation because of completion of term