JPH11316597A

JPH11316597A - Voice recognition device

Info

Publication number: JPH11316597A
Application number: JP10122048A
Authority: JP
Inventors: Satoru Oishi; 哲大石; Kenichi Yamamoto; 健一山本; Takahide Takahashi; 隆英高橋
Original assignee: Toshiba TEC Corp
Current assignee: Toshiba TEC Corp
Priority date: 1998-05-01
Filing date: 1998-05-01
Publication date: 1999-11-16

Abstract

PROBLEM TO BE SOLVED: To decrease the number of registered codes needed to code voice-recognized words and phrases. SOLUTION: This device is provided with a voice input part 11 for inputting a speaker's voice; a voice recognition resource 12 which stores, for each word or phrase, a language element code obtained by combining a classification code indicating its classification with an individual code; a voice recognition part 13 which recognizes words or phrases from the voice input through the voice input part and, when the recognized words or phrases include a word or phrase registered in advance for recognition, extracts the corresponding language element code from the voice recognition resource and outputs it; and a code reconstitution part 14 which generates a meaning code by rearranging the language element codes output from the voice recognition part into a predetermined order of classification codes and connecting only their individual codes.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech recognition apparatus that recognizes words and phrases from input speech and encodes them.

[0002]

2. Description of the Related Art

As shown in FIG. 11, a conventional speech recognition apparatus comprises: a voice input unit 1 having a microphone 1a for inputting speech and an A/D converter 1b that converts the speech from the microphone into a digital signal; a speech recognition resource 2, which is a collection of language element codes defined for the phrases to be recognized; a speech recognition unit 3 that recognizes a phrase based on the output of the voice input unit 1 and extracts the code corresponding to that phrase from the speech recognition resource 2; and an application program 4 that uses the code extracted by the speech recognition unit 3 as speech recognition data.

[0003] The speech recognition resource 2 consists of the phrases to be recognized, as shown in FIG. 12, and the codes associated with them. Each phrase to be recognized lumps together nouns, modifiers, numerals, and the like, and a different code is assigned to every such phrase.

[0004] In such an apparatus, when a clause consisting of nouns, modifiers, numerals, and the like uttered by a speaker is input through the voice input unit 1, the speech recognition unit 3 recognizes it as a single predefined phrase, and the code corresponding to that phrase is output to the application program 4 as speech recognition data.

[0005]

Problems to Be Solved by the Invention

However, because such a speech recognition apparatus recognizes a clause consisting of nouns, modifiers, numerals, and the like as a single phrase, the speech recognition resource 2 had to be created by associating a code with every one of these phrases. As a result, the number of phrases to be recognized becomes enormous, and creating the speech recognition resource 2 is extremely laborious.

[0006] Moreover, the number of registered phrases and codes is enormous, and the registered content depends on the business to which the speech recognition apparatus is applied, so the resource has to be created separately for each apparatus. This has hindered the application of the technology to a wide range of fields.

[0007] Accordingly, an object of the present invention is to provide a speech recognition apparatus that can reduce the number of registered codes required to encode speech-recognized words and phrases.

[0008]

Means for Solving the Problems

The invention of claim 1 is a speech recognition apparatus comprising: voice input means for inputting a speaker's voice; speech recognition means for recognizing words and phrases from the voice input through the voice input means; language element code storage means for classifying and storing a plurality of words and phrases to be recognized, and for storing, in association with each word or phrase, a language element code formed by combining an individual code with a code relating to its classification; language element code output means for extracting from the language element code storage means, and outputting, the code corresponding to each word or phrase when the words and phrases recognized by the speech recognition means include words and phrases to be recognized; and meaning code creation means for creating a meaning code from the plurality of language element codes output by the language element code output means, by rearranging only the individual codes into an order based on the codes relating to the classifications and concatenating them.

[0009] The invention of claim 2 is a speech recognition apparatus comprising: voice input means for inputting a speaker's voice; speech recognition means for recognizing words and phrases from the voice input through the voice input means; language element code storage means for classifying and storing a plurality of words and phrases to be recognized, and for storing, in association with each word or phrase, a language element code formed by combining an individual code with a classification code indicating its classification; language element code output means for extracting from the language element code storage means, and outputting, the language element code corresponding to each word or phrase when the words and phrases recognized by the speech recognition means include words and phrases to be recognized; and meaning code creation means for creating a meaning code from the plurality of language element codes output by the language element code output means, by rearranging only the individual codes into a predetermined order of classification codes and concatenating them.

[0010] The invention of claim 3 is a speech recognition apparatus comprising: voice input means for inputting a speaker's voice; speech recognition means for recognizing words and phrases from the voice input through the voice input means; language element code storage means for classifying and storing a plurality of words and phrases to be recognized, and for storing, in association with each word or phrase, a language element code formed by combining an individual code with an order code indicating the sort position of its classification; language element code output means for extracting from the language element code storage means, and outputting, the language element code corresponding to each word or phrase when the words and phrases recognized by the speech recognition means include words and phrases to be recognized; and meaning code creation means for creating a meaning code from the plurality of language element codes output by the language element code output means, by arranging only the individual codes in the order given by the order codes and concatenating them.

[0011]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

A first embodiment, in which the present invention is applied to a business processing device such as an electronic cash register or POS terminal that performs merchandise sales code registration and similar processing, will now be described with reference to FIGS. 1 to 3.

[0012] FIG. 1 is a functional block diagram showing the configuration of the business processing device according to this embodiment. The device comprises: a voice input unit 11 having a microphone (voice input means) 11a that inputs speech as an analog signal and an A/D converter 11b that converts the speech from the microphone 11a into a digital signal; a speech recognition resource (language element code storage means) 12, which is a collection of classified codes (language element codes), described below, that are defined to include the classification of each word or phrase to be recognized; a speech recognition unit 13 that recognizes the words and phrases corresponding to the input speech based on the output of the voice input unit 11 (speech recognition means), and extracts from the speech recognition resource 12 and outputs the classified code corresponding to each of them (language element code output means); a code reconstruction unit 14, serving as the meaning code creation means, that generates and outputs, from the classified codes extracted by the speech recognition unit 13, a code string (meaning code) conveying information such as which product and how many (the meaning of the sentence); and an application program 15 that uses the meaning code from the code reconstruction unit 14.

[0013] The speech recognition resource 12 is held in a storage device such as a hard disk drive. Specifically, as shown in FIG. 2, it consists of: the classification names obtained when the phrases to be recognized are classified word by word; the words belonging to each classification; and a classified code associated with each word, formed from a classification code that is the same for every word in a classification and an individual code specific to the word. For example, the word "paper", whose classification name is "thing", is associated with the classified code "O-PA", formed from the classification code "O" and the individual code "PA". Likewise, the word "large", whose classification name is "size", is associated with the classified code "S-L", formed from the classification code "S" and the individual code "L".

[0014] The speech recognition unit 13, the code reconstruction unit 14, and the application program 15 run on a personal computer or the like equipped with a CPU (central processing unit), ROM (read-only memory), and RAM (random access memory). Specifically, they are software programs stored in a storage device such as a hard disk drive or in a memory such as a ROM, and readable by the CPU of the personal computer.

[0015] Of these, the speech recognition unit 13 performs speech recognition by detecting, based on the output of the voice input unit 11, the similarity or closeness between the input speech and the words whose speech feature data are defined (associated) in advance in the speech recognition resource 12. For example, speech feature data for several different phrasings that mean the same word are associated with that word in the speech recognition resource 12, and the uttered word is identified by recognizing the input speech against these data. The unit then extracts from the speech recognition resource 12, and outputs, the classified code corresponding to each recognized word.

[0016] Although not shown, each word to be recognized may be stored in association with speech feature data of a standard speaker prepared in advance (speaker-independent type), or with speech feature data obtained by having the user actually utter the word (speaker-dependent type).

[0017] The code reconstruction unit 14 stores only the individual code of each classified code output from the speech recognition unit 13 into memory areas allocated in a predetermined order of classifications, namely thing value, size value, and quantity value. As a result, the classified codes are rearranged by classification, and a meaning code is generated in which the individual codes are concatenated with the classification codes removed.

[0018] For example, when the utterance is "three large papers" as shown in FIG. 3(a), the speech recognition unit 13 outputs the classified codes in the order "S-L", "O-PA", "N-0003" as shown in FIG. 3(b). The code reconstruction unit 14 stores only the individual codes "PA", "L", and "0003" in the thing value, size value, and quantity value memory areas allocated to the respective classifications as shown in FIG. 3(c), and outputs the resulting "PAL0003" as the meaning code.

[0019] The application program 15 is a software program that performs predetermined business processing, such as registering merchandise sales codes and calculating prices, based on the meaning code from the code reconstruction unit 14. When a product name and quantity are uttered, the code reconstruction unit 14 outputs the corresponding meaning code. The application program 15 uses this meaning code to select and display the corresponding product name on a screen such as a display, and to carry out the subsequent accounting processing such as registering the product code and calculating the price.

[0020] In this embodiment, when the user of the device utters, for example, "three large papers" as shown in FIG. 3(a), the voice input unit 11 converts the speech into a digital signal and supplies it to the speech recognition unit 13. The speech recognition unit 13 refers to the speech recognition resource 12 and detects the similarity or closeness between the input speech and the words defined in advance in the resource. As shown in FIG. 3(b), it outputs, in order, the classified code "S-L" for "large", "O-PA" for "paper", and "N-0003" for "three". These classified codes are passed to the code reconstruction unit 14, which stores only the individual codes "PA", "L", and "0003" in the thing value, size value, and quantity value memory areas allocated to the respective classifications as shown in FIG. 3(c). This generates the meaning code "PAL0003", which is passed to the application program 15. To the application program 15, this meaning code means "three large papers". In other words, the words "large", "paper", and "three" are each recognized independently and converted into their associated classified codes, and the individual codes are then combined into the meaning code.
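The reconstruction described in this walkthrough can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `build_meaning_code`, the `CATEGORY_ORDER` table, the hyphen-separated string form of the classified codes, and the handling of classifications that were not uttered are all assumptions rather than part of the disclosure. The code values are those of the FIG. 3 example.

```python
# Illustrative sketch; names and the "X-YYYY" string format are assumptions.
# Predetermined classification order: thing (O), size (S), quantity (N).
CATEGORY_ORDER = ["O", "S", "N"]


def build_meaning_code(classified_codes):
    """Place each individual code in its classification slot, then concatenate."""
    slots = {}
    for code in classified_codes:
        classification, individual = code.split("-", 1)
        slots[classification] = individual  # keep only the individual code
    # Assumption: classifications that were not uttered are simply skipped.
    return "".join(slots[c] for c in CATEGORY_ORDER if c in slots)


# "three large papers": the recognizer outputs codes in utterance order.
print(build_meaning_code(["S-L", "O-PA", "N-0003"]))  # PAL0003
```

The dictionary plays the role of the per-classification memory areas of FIG. 3(c): writing into a slot discards the classification code, and reading the slots in a fixed order produces the concatenated meaning code.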

[0021] In this way, simply by classifying words on a word-by-word basis, independently of the phrases to be recognized, and storing in association with each word a classified code consisting of a classification code and an individual code, as many meaning codes can be generated as the product of the numbers of words registered in the classifications.

[0022] Therefore, when creating the speech recognition resource 12, there is no need to register a code for each whole sentence one by one as in the prior art. It suffices to register (store), for the words belonging to each classification, a classified code combining an individual code with the classification code. This greatly reduces the number of codes stored in the speech recognition resource 12. Consequently, the speech recognition resource 12 can be created easily, and because lookup also becomes faster, speech information can be encoded faster.

[0023] In addition, because the speech recognition resource 12 stores the codes by classification, adding, deleting, and correcting classified codes becomes easy.

[0024] Furthermore, when sentences with the same meaning are uttered, even if the utterance order or phrasing of their constituent words differs, the codes are rearranged into the predetermined classification order to create the meaning code that is passed to the application program 15. Sentences whose words differ in utterance order or phrasing can therefore be recognized as sentences with the same meaning.
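The order independence claimed here can be checked exhaustively for a small utterance. The following is a minimal sketch, assuming classified codes are represented as hyphen-separated strings and that each classification code has a fixed rank (the `RANK` table and the function name `meaning_code` are hypothetical). It feeds every permutation of the first embodiment's three classified codes through the rearrangement and collects the distinct results.

```python
from itertools import permutations

# Assumed rank of each classification code: thing (O), size (S), quantity (N).
RANK = {"O": 0, "S": 1, "N": 2}


def meaning_code(classified_codes):
    """Sort classified codes by classification rank and join the individual codes."""
    ordered = sorted(classified_codes, key=lambda c: RANK[c.split("-", 1)[0]])
    return "".join(c.split("-", 1)[1] for c in ordered)


utterance = ["S-L", "O-PA", "N-0003"]  # codes from the first embodiment
results = {meaning_code(list(p)) for p in permutations(utterance)}
print(results)  # {'PAL0003'}
```

All six utterance orders collapse to a single meaning code, which is why the application program sees them as the same sentence.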

[0025] Next, a second embodiment, in which the present invention is applied to a business processing device such as an electronic cash register or POS terminal that performs merchandise sales code registration and similar processing, will be described with reference to FIGS. 4 and 5. Parts identical to those of the first embodiment are given the same reference numerals and their detailed description is omitted. The functional block diagram of the business processing device in this embodiment is the same as that shown in FIG. 1, so its detailed description is also omitted.

[0026] The speech recognition resource 12 in this embodiment differs from that of the first embodiment in that "amount" is registered (stored) as a classification name in place of "size", as shown in FIG. 4.

[0027] In this embodiment, when the user of the device utters, for example, "two 50-yen boxes" as shown in FIG. 5(a), the voice input unit 11 converts the speech into a digital signal and supplies it to the speech recognition unit 13. The speech recognition unit 13 refers to the speech recognition resource 12 and detects the similarity or closeness between the input speech and the words defined in advance in the resource. As shown in FIG. 5(b), it outputs, in order, the classified code "K-0050" for "50 yen", "O-BO" for "box", and "N-0002" for "two". These classified codes are passed to the code reconstruction unit 14, which stores only the individual codes "BO", "0050", and "0002" in the thing value, amount value, and quantity value memory areas allocated to the respective classifications as shown in FIG. 5(c). This generates the meaning code "BO00500002", which is passed to the application program 15. To the application program 15, this meaning code means "two 50-yen boxes". In other words, the words "50 yen", "box", and "two" are each recognized independently and converted into their associated classified codes, and the individual codes are then combined into the meaning code.

[0028] In this way, simply by classifying words on a word-by-word basis, independently of the phrases to be recognized, and storing in association with each word a classified code consisting of a classification code and an individual code, as many meaning codes can be generated as the product of the numbers of words registered in the classifications, just as in the first embodiment. The same effects as in the first embodiment are therefore obtained.

[0029] Furthermore, because amount information is registered as a classification in this embodiment, the monetary amount of a product, in addition to its characteristics, can be obtained easily by referring to the "amount" portion of the meaning code generated from the classified codes.

[0030] Next, a third embodiment, in which the present invention is applied to a business processing device such as an electronic cash register or POS terminal that performs merchandise sales code registration and similar processing, will be described with reference to FIGS. 6 and 7. Parts identical to those of the first embodiment are given the same reference numerals and their detailed description is omitted. The functional block diagram of the business processing device in this embodiment is the same as that shown in FIG. 1, so its detailed description is also omitted.

[0031] The speech recognition resource 12 in this embodiment differs from that of the first embodiment in that "color" and "amount" are added as classification names, as shown in FIG. 6.

[0032] In this embodiment, when the user of the device utters, for example, "one large blue 100-yen box" as shown in FIG. 7(a), the voice input unit 11 converts the speech into a digital signal and supplies it to the speech recognition unit 13. The speech recognition unit 13 refers to the speech recognition resource 12 and detects the similarity or closeness between the input speech and the words defined in advance in the resource. As shown in FIG. 7(b), it outputs, in order, the classified code "K-0100" for "100 yen", "C-B" for "blue", "S-L" for "large", "O-BO" for "box", and "N-0001" for "one". These classified codes are passed to the code reconstruction unit 14, which stores only the individual codes "BO", "L", "B", "0100", and "0001" in the thing value, size value, color value, amount value, and quantity value memory areas allocated to the respective classifications as shown in FIG. 7(c). This generates the meaning code "BOLB01000001", which is passed to the application program 15. To the application program 15, this meaning code means "one large blue 100-yen box". In other words, the words "100 yen", "blue", "large", "box", and "one" are each recognized independently and converted into their associated classified codes, and the individual codes are then combined into the meaning code.
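As a rough illustration of how the five-classification case might look end to end, the sketch below models a small subset of the speech recognition resource as a dictionary from recognized word to (classification code, individual code). The dictionary `RESOURCE`, the English word keys, the `ORDER` list, and the function `encode` are assumptions made for illustration. Only the code values and the classification order come from the FIG. 7 example.

```python
# Hypothetical in-memory stand-in for a subset of speech recognition resource 12.
# Each recognized word maps to (classification code, individual code).
RESOURCE = {
    "100 yen": ("K", "0100"),  # amount
    "blue": ("C", "B"),        # color
    "large": ("S", "L"),       # size
    "box": ("O", "BO"),        # thing
    "one": ("N", "0001"),      # quantity
}

# Predetermined classification order: thing, size, color, amount, quantity.
ORDER = ["O", "S", "C", "K", "N"]


def encode(words):
    """Look up each recognized word and build the meaning code."""
    slots = dict(RESOURCE[w] for w in words if w in RESOURCE)
    return "".join(slots[c] for c in ORDER if c in slots)


print(encode(["100 yen", "blue", "large", "box", "one"]))  # BOLB01000001
```

Adding a classification in this sketch only requires new entries in `RESOURCE` and one more element in `ORDER`, which mirrors how the third embodiment extends the first.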

[0033] In this way, simply by classifying words on a word-by-word basis, independently of the phrases to be recognized, and storing in association with each word a classified code consisting of a classification code and an individual code, as many meaning codes can be generated as the product of the numbers of words registered in the classifications, just as in the first embodiment. The same effects as in the first embodiment are therefore obtained.

[0034] In particular, the speech recognition resource 12 in this embodiment has classified codes registered for four words in the "thing" classification, three in "size", four in "color", three in "amount", and three in "quantity", so that 4 × 3 × 4 × 3 × 3 = 432 different meaning codes can be handled in total. Handling these 432 meaning codes conventionally required registering 432 codes, whereas in this embodiment the speech recognition resource 12 can be created by registering only 4 + 3 + 4 + 3 + 3 = 17 classified codes. In other words, the number of codes stored in the speech recognition resource 12 can be reduced still further.
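The arithmetic behind this comparison is a product versus a sum over the per-classification word counts. The short sketch below reproduces it. The dictionary name `REGISTERED` and the variable names are illustrative, while the counts are those stated for this embodiment.

```python
from math import prod

# Number of words registered in each classification (third embodiment).
REGISTERED = {"thing": 4, "size": 3, "color": 4, "amount": 3, "quantity": 3}

conventional = prod(REGISTERED.values())  # one code per whole phrase
classified = sum(REGISTERED.values())     # one classified code per word

print(conventional, classified)  # 432 17
```

Because the conventional count grows multiplicatively and the classified count only additively, the saving widens each time a classification or a word is added.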

[0035] Next, a fourth embodiment, in which the present invention is applied to a business processing device as described above, will be described with reference to FIGS. 8 to 10. Parts identical to those of the first embodiment are given the same reference numerals and their detailed description is omitted.

[0036] As shown in FIG. 8, the business processing device according to this embodiment comprises the voice input unit 11, a speech recognition resource 12′, the speech recognition unit 13, the code reconstruction unit 14, and the application program 15 that uses the meaning code from the code reconstruction unit 14, and additionally a sorting unit 21 that sorts the language element codes before the code reconstruction unit 14 creates the meaning code.

[0037] The speech recognition unit 13, the code reconstruction unit 14, the application program 15, and the sorting unit 21 in this embodiment run on a personal computer or the like equipped with a CPU (central processing unit), ROM (read-only memory), and RAM (random access memory). Specifically, they are software programs stored in a storage device such as a hard disk drive or in a memory such as a ROM, and readable by the CPU of the personal computer.

[0038] Unlike in the first embodiment, the speech recognition resource 12′ in this embodiment consists of the words and phrases to be recognized, as shown in FIG. 9, and the language element codes associated with them. Each language element code is a combination of a key value indicating the order of the classification and a code (individual code) indicating the actual code number. Here, as in the first embodiment, the words and phrases to be recognized are classified, in this case into "product name" and "quantity". Words in the same classification are given the same key value, and each word is given its own code.

[0039] For example, the phrase "product A" is associated with a language element code consisting of the key value "1" and the code "01", and the phrase "1 piece" is associated with a language element code consisting of the key value "2" and the code "01". Because the phrases are classified and each classification is assigned a key value indicating its order, sorting the language element codes by key value yields the same code sequence regardless of the order in which the language element codes were output.
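The encoding described in this paragraph can be sketched in Python as follows. This is only an illustrative sketch, not the implementation disclosed here: the names `RESOURCE`, `split_code`, and `sort_by_key` are hypothetical, and the layout of one key digit followed by a two-digit individual code is an assumption based on the example values given for "product A" and "1 piece".

```python
# Hypothetical in-memory stand-in for speech recognition resource 12'.
# Assumed layout: one-digit key value (classification order) followed by
# a two-digit individual code.
RESOURCE = {
    "product A": "101",  # key value 1 (product name), individual code 01
    "1 piece": "201",    # key value 2 (quantity), individual code 01
}


def split_code(language_element_code: str) -> tuple[int, str]:
    """Split a language element code into (key value, individual code)."""
    return int(language_element_code[0]), language_element_code[1:]


def sort_by_key(codes: list[str]) -> list[str]:
    """Sort language element codes by their key value."""
    return sorted(codes, key=lambda code: split_code(code)[0])


# The same two phrases, recognized in either utterance order.
name_first = [RESOURCE["product A"], RESOURCE["1 piece"]]
quantity_first = [RESOURCE["1 piece"], RESOURCE["product A"]]

# After sorting by key value, both orders give the same sequence.
print(sort_by_key(name_first) == sort_by_key(quantity_first))
```

Storing the key value inside the code itself means the sorting step needs no separate table of classifications; it only reads the leading digit.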

[0040] Although not illustrated, each phrase to be recognized may be stored in association with speech feature data of a standard speaker prepared in advance (speaker-independent type), or in association with speech feature data obtained by having the user actually utter the phrase (speaker-dependent type), as in the first embodiment.

[0041] When the code reconstruction unit 14 in this embodiment receives two language element codes from the speech recognition unit 13, it passes them to the reordering unit 21. The reordering unit 21 sorts the received language element codes by key value and returns them to the code reconstruction unit 14 in sorted order. The code reconstruction unit 14 then removes the key value from each language element code, concatenates the remaining individual codes into a four-digit semantic code, and passes it to the application program 15. As a result, as in the first embodiment, the code reconstruction unit 14 outputs the same semantic code even if the product name and the quantity are uttered in a different order.

[0042] With this configuration, when the user of the apparatus utters "product A, three pieces" as shown in FIG. 10(a), the speech is converted into a digital signal by the speech input unit 11 and supplied to the speech recognition unit 13. The speech recognition unit 13 refers to the speech recognition resource 12′ and detects the similarity between the input speech and the phrases defined in advance in the resource. It outputs the language element code "101" for "product A" and the language element code "203" for "three pieces", in that order. These language element codes are passed to the code reconstruction unit 14, and the reordering unit 21 sorts them according to the key values defined in the language element codes. That is, "product A" has the key value "1" and is placed first, and "three pieces" has the key value "2" and is placed second.

[0043] After this sorting, the language element codes are returned to the code reconstruction unit 14, which removes each key value and concatenates the remaining individual codes to generate the semantic code "0103". This semantic code is passed to the application program 15, for which it means "three pieces of product A".

[0044] In contrast, when the user of the apparatus utters "three pieces of product A" as shown in FIG. 10(b), the speech is supplied to the speech recognition unit 13 via the speech input unit 11 in the same way as above. The speech recognition unit 13 outputs the language element code "203" for "three pieces" and the language element code "101" for "product A", in that order. These language element codes are passed to the code reconstruction unit 14, and the reordering unit 21 sorts them according to the key values defined in the language element codes. That is, "product A" has the key value "1" and is placed first, and "three pieces" has the key value "2" and is placed second.

[0045] After this sorting, the language element codes are returned to the code reconstruction unit 14, which removes each key value and concatenates the remaining individual codes to generate the semantic code "0103". This semantic code is passed to the application program 15, for which it means "three pieces of product A". In other words, the phrases "product A" and "three pieces" are each recognized independently, each is converted into its associated language element code, and the language element codes are combined into a semantic code.
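The two utterance orders of FIG. 10(a) and FIG. 10(b) can be traced with a short Python sketch. It is an illustration under stated assumptions rather than the disclosed implementation: the function names `reorder` and `reconstruct` are hypothetical stand-ins for the reordering unit 21 and the code reconstruction unit 14, and each language element code is assumed to be a string whose first digit is the key value.

```python
def reorder(codes):
    """Hypothetical stand-in for reordering unit 21: sort by key value.

    Assumes the first character of each language element code is the
    key value and the remainder is the individual code.
    """
    return sorted(codes, key=lambda code: int(code[0]))


def reconstruct(codes):
    """Hypothetical stand-in for code reconstruction unit 14.

    Hands the codes to reorder(), strips the key value from each
    returned code, and concatenates the individual codes.
    """
    ordered = reorder(codes)
    return "".join(code[1:] for code in ordered)


# FIG. 10(a): "product A" (101) recognized before "three pieces" (203).
print(reconstruct(["101", "203"]))  # prints 0103

# FIG. 10(b): "three pieces" (203) recognized before "product A" (101).
print(reconstruct(["203", "101"]))  # prints 0103
```

Both calls yield the same four-digit semantic code, so an application program receiving it would not need to know the order in which the phrases were spoken.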

[0046] In this way, merely by classifying the phrases to be recognized independently, word by word, and storing a language element code consisting of an order code and an individual code in association with each phrase (word), as many semantic codes can be generated as the product of the numbers of words registered in the classifications, as in the first embodiment. The same effects as in the first embodiment are therefore obtained.
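The relationship between registered entries and generable semantic codes can be checked with a small Python sketch. The vocabulary below is hypothetical apart from "product A", "1 piece", and "3 pieces", which follow the examples in this description; the variable names are likewise assumptions made for illustration.

```python
from itertools import product

# Hypothetical vocabulary: individual codes per classification.
product_names = {"product A": "01", "product B": "02", "product C": "03"}
quantities = {
    "1 piece": "01",
    "2 pieces": "02",
    "3 pieces": "03",
    "4 pieces": "04",
}

# Entries that must be registered: one per word, so the counts add.
registered_entries = len(product_names) + len(quantities)

# Semantic codes that can be generated: one per (name, quantity) pair,
# so the counts multiply.
semantic_codes = {
    name_code + quantity_code
    for name_code, quantity_code in product(
        product_names.values(), quantities.values()
    )
}

print(registered_entries, len(semantic_codes))  # prints 7 12
```

With three product names and four quantities, seven registered entries cover twelve distinct semantic codes, and the gap widens as each classification grows.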

[0047] Furthermore, when sentences with the same meaning are uttered, the same semantic code "0103" is created and passed to the application program 15 even if the utterance order or wording of the constituent phrases differs. As in the first embodiment, the number of codes stored in the speech recognition resource 12′ can therefore be greatly reduced, and utterances are recognized as sentences with the same meaning even when the order or wording of their constituent phrases differs.

[Effects of the Invention]

[0048] As described in detail above, according to the present invention, when the speech recognition resource (language element code storage means) is created, there is no need to register a code for each sentence one by one as in the prior art. It suffices to register, for each classification, a classified code for each phrase belonging to that classification, formed by combining a classification code with an individual code. This greatly reduces the number of codes registered (stored) in the speech recognition resource, which makes the resource easier to create and also speeds up lookup, so that speech information is coded faster.

[0049] In addition, because the codes are stored in the speech recognition resource by classification, tasks such as adding, deleting, and modifying classified codes become easier.

[0050] Furthermore, when a meaningful sentence is uttered, the same semantic code is created even if the utterance order or wording of its constituent phrases differs, so the utterances are recognized as sentences with the same meaning.

[Brief description of the drawings]

FIG. 1 is a functional block diagram showing the configuration of a business processing apparatus according to a first embodiment of the present invention.

FIG. 2 is a diagram showing the configuration of the speech recognition resource shown in FIG. 1.

FIG. 3 is a diagram illustrating the operation of this embodiment.

FIG. 4 is a diagram showing the configuration of the speech recognition resource of a business processing apparatus according to a second embodiment of the present invention.

FIG. 5 is a diagram illustrating the operation of this embodiment.

FIG. 6 is a diagram showing the configuration of the speech recognition resource of a business processing apparatus according to a third embodiment of the present invention.

FIG. 7 is a diagram illustrating the operation of this embodiment.

FIG. 8 is a functional block diagram showing the configuration of a business processing apparatus according to a fourth embodiment of the present invention.

FIG. 9 is a diagram showing the configuration of the speech recognition resource shown in FIG. 8.

FIG. 10 is a diagram illustrating the operation of this embodiment.

FIG. 11 is a functional block diagram showing the configuration of a business processing apparatus to which a conventional speech recognition device is applied.

FIG. 12 is a diagram showing the configuration of the speech recognition resource shown in FIG. 11.

[Explanation of symbols]

11: speech input unit
11a: microphone
11b: A/D converter
12: speech recognition resource
13: speech recognition unit
14: code reconstruction unit
15: application program
21: reordering unit

Claims

1. A speech recognition device comprising: speech input means for inputting a speaker's speech; speech recognition means for recognizing phrases from the speech input from the speech input means; language element code storage means for classifying and storing a plurality of phrases to be recognized in advance, and for storing, in association with each phrase, a language element code formed by combining a code relating to each classification with an individual code; language element code output means for extracting from the language element code storage means, and outputting, the code corresponding to each phrase when the phrases recognized by the speech recognition means include a phrase to be recognized in advance; and semantic code creating means for creating a semantic code by rearranging and concatenating only the individual codes of the plurality of language element codes output from the language element code output means, in an order based on the code relating to each classification.

2. A speech recognition device comprising: speech input means for inputting a speaker's speech; speech recognition means for recognizing phrases from the speech input from the speech input means; language element code storage means for classifying and storing a plurality of phrases to be recognized in advance, and for storing, in association with each phrase, a language element code formed by combining a classification code indicating each classification with an individual code; language element code output means for extracting from the language element code storage means, and outputting, the language element code corresponding to each phrase when the phrases recognized by the speech recognition means include a phrase to be recognized in advance; and semantic code creating means for creating a semantic code by rearranging and concatenating only the individual codes of the plurality of language element codes output from the language element code output means, in a predetermined order of the classification codes.

3. A speech recognition device comprising: speech input means for inputting a speaker's speech; speech recognition means for recognizing phrases from the speech input from the speech input means; language element code storage means for classifying and storing a plurality of phrases to be recognized in advance, and for storing, in association with each phrase, a language element code formed by combining an order code indicating the rearrangement order of each classification with an individual code; language element code output means for extracting from the language element code storage means, and outputting, the language element code corresponding to each phrase when the phrases recognized by the speech recognition means include a phrase to be recognized in advance; and semantic code creating means for creating a semantic code by arranging and concatenating only the individual codes of the plurality of language element codes output from the language element code output means, in the order of the order codes.