JP2009163528A

JP2009163528A - Commodity sales data processing apparatus, program thereof, and commodity data input apparatus and program thereof

Info

Publication number: JP2009163528A
Application number: JP2008000892A
Authority: JP
Inventors: Naoki Sekine; 直樹関根; Masanori Takeuchi; 雅則竹内; Yasuo Hayashi; 康夫林; Rui Ozaki; 瑠依尾崎; Atsushi Nakamoto; 篤志中本
Original assignee: Toshiba TEC Corp
Current assignee: Toshiba TEC Corp
Priority date: 2008-01-08
Filing date: 2008-01-08
Publication date: 2009-07-23
Anticipated expiration: 2028-01-08
Also published as: JP5015806B2

Abstract

PROBLEM TO BE SOLVED: To provide a commodity sales data processing apparatus that, even when a voice is erroneously recognized, extracts without omission the commodity categories or commodity character strings that sound similar to the recognized voice, and that outputs and displays, without omission, the commodity data set in correspondence with those commodity category candidates and commodity character string candidates.

SOLUTION: The commodity sales data processing apparatus includes: a commodity category/commodity data dictionary 1131 and a commodity character string/commodity data dictionary 1132, which store commodity categories and commodity character strings, respectively, each set in correspondence with commodity data for identifying individual commodities; a voice pattern data extraction means 1111 for extracting a commodity category or commodity name character string that matches or sounds similar to voice data uttered by an operator; a commodity category candidate storage means 1141 and a commodity character string candidate storage means 1142, which store the extracted candidates; and a commodity data reading means 1113 for reading the commodity data set in correspondence with the stored candidates.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a product sales data processing apparatus that uses voice recognition technology and a program that causes the apparatus to function, and to a product data input apparatus and a program that causes that apparatus to function.

In recent years, data input using voice recognition technology, in which voice is used instead of a keyboard, has been put to practical use in various settings, and product sales data processing apparatuses, product data input apparatuses, and the like that use this technology are known. These apparatuses include keyword storage means in which a keyword specifying the group to which each product belongs is set and stored in advance. When the operator inputs a specific keyword by voice, the product names set in correspondence with that keyword are extracted and displayed on the screen (see, for example, Patent Document 1).
Patent Document 1: JP 2000-293754 A

However, with current voice recognition technology, misrecognition may occur unless the correct name is spoken accurately, and recognition may also fail under the influence of ambient noise or other people's conversation. Consequently, when a product name is long or when similar product names are spoken, the burden on the user of getting the voice recognized correctly is large. In addition, the apparatus described above has no countermeasure for misrecognized input voice, so when the wrong keyword is recognized, product names belonging to the wrong group may be output and displayed on the screen. Candidates may therefore be omitted, in that the product name desired by the operator is not displayed, and sales data processing may take time or may not be possible at all.

The present invention has been made in view of these circumstances. Because a misrecognized voice can be expected to be at least phonetically similar to the intended one, the invention aims to provide a product sales data processing apparatus that extracts phonetically similar candidates without omission and outputs and displays the product data corresponding to those candidates, so that the candidate desired by the operator is output and displayed even when the input voice is misrecognized.

A product sales data processing apparatus comprising: an acoustic dictionary that stores voice feature amounts of voices created in advance in association with voice pattern data; a voice dictionary that stores the voice pattern data in association with product categories; a product category/product data dictionary that stores product data identifying products and the product categories set in correspondence with the product data; voice input means for inputting voice; voice recognition means for referring to the acoustic dictionary, comparing the voice feature amount of the voice input by the voice input means with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount; voice pattern data extraction means for referring to the voice dictionary and extracting the product category as a candidate on the basis of the voice pattern data output by the voice recognition means; product category candidate storage means for storing the product category extracted by the voice pattern data extraction means as a product category candidate; product data reading means for reading, from the product category/product data dictionary, the product data set in correspondence with the product category candidate; output display means for outputting and displaying the product data read by the product data reading means; and product sales data processing means for performing product sales data processing when any of the product data output and displayed by the output display means is selected.

According to the present invention, when a product category or part of the character string of a product name is input by voice, a plurality of voice candidates judged to be similar to the input voice, such as product category candidates and product character string candidates, are extracted. Because each of these voice candidates is set in correspondence with individual product data, the product data corresponding to the product category candidates and product character string candidates can be output and displayed without omission. This prevents candidates from being omitted when the input voice is misrecognized. Furthermore, sales processing or input processing can be performed by speaking a product category or part of a product character string. The apparatus can therefore be operated even when the operator does not know the official product name during the sales input operation.

Embodiments of the present invention are described below with reference to the drawings. In these embodiments, the invention is applied to a processing apparatus used for product sales in restaurants.

The first embodiment is a product sales data processing apparatus in which, when the operator inputs a product category by voice, product category candidates that phonetically match or resemble the voice pattern data of that product category are looked up and the product data set in correspondence with those candidates is output and displayed. It is described with reference to FIGS. 1 to 9. In this embodiment, a product category is an attribute that specifies, for each product, the group to which the product belongs. FIG. 1 is a perspective view showing the appearance of the product sales data processing apparatus according to the first embodiment. The product sales data processing apparatus 1 is provided with a microphone 13 as voice input means for inputting the operator's voice. The output display means 12 is of a touch panel type, consisting of, for example, a liquid crystal display 121 and a touch panel sensor 122. The microphone 13 need not be provided separately from the main body as illustrated and may instead be built into the output display means 12. The output display means 12 uses a touch panel in this embodiment but is not limited to one.

Next, FIG. 2 shows the internal configuration of the apparatus main body 11. The apparatus main body 11 includes: a CPU (Central Processing Unit) 111 that controls each device and performs the core functions of the computer; a ROM (Read Only Memory) 112 in which fixed data, such as the program that controls the operation of the CPU 111, is stored in advance; a RAM (Random Access Memory) 113 to and from which the CPU 111 writes and reads data directly; an HDD (Hard Disk Drive) 114, a large-capacity storage medium that can store sales data and various other data; a communication interface 115 that controls data communication with electronic devices connected via a communication network such as a LAN (Local Area Network); a display controller 116 that controls the screen display of the display; a touch panel interface 117 to which touch detection signals from the touch panel sensor 122 are input; an A/D converter 118 that converts analog voice data input from the microphone 13 into digital voice data; and a voice recognition engine 119 that performs voice recognition on the digital voice data output from the A/D converter 118. The CPU 111 is electrically connected to the ROM 112, RAM 113, HDD 114, communication interface 115, display controller 116, touch panel interface 117, and voice recognition engine 119 by bus lines such as an address bus and a data bus.

The voice input means captures the operator's voice, from the start to the end of voice input, via the microphone 13, and corresponds to steps ST9-1 to ST9-4 in FIG. 9, described later. When a signal from the touch panel sensor 122 indicates that the voice recognition key 37 in FIG. 3 has been touched, the CPU 111 has the analog voice data converted into digital voice data by A/D conversion and has the converted voice data taken into the voice recognition engine 119. Capture of the voice data continues until the finger is released from the voice recognition key 37 on the display 121 and the key is turned off. Noise processing is performed after the A/D conversion.

The noise processing method of the voice input means is as follows. Available noise processing methods include the delay-and-sum method, which uses directional microphones, and the HMM composition method, which performs noise processing with an acoustic dictionary 1134 whose voice pattern data has had noise mixed in beforehand. The present invention uses noise processing based on the spectral subtraction method (hereinafter, the SS method).

The spectral subtraction method (SS method) achieves noise suppression by subtracting the amplitude spectrum of the noise from the amplitude spectrum of the voice data in which noise is mixed with the operator's speech, or by subtracting the power spectrum of the noise from the power spectrum of the noisy voice data. The power spectrum is the square of the amplitude spectrum, and the output of the SS method is a noise-suppressed amplitude spectrum or power spectrum. The noise processing used when implementing the present invention is not limited to the SS method; any method that can process the noise and extract the voice data may be used.
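For illustration only, the bin-by-bin subtraction described above can be sketched in Python as follows. This is not the implementation of the apparatus: the function name `spectral_subtract`, the `floor` parameter, and the sample spectra are assumptions made for this sketch. The description does not say how bins that would become negative are handled, so the sketch simply clamps them to a floor value.

```python
def spectral_subtract(noisy, noise, power=False, floor=0.0):
    """Subtract a noise spectrum from a noisy-speech spectrum, bin by bin.

    noisy, noise: amplitude spectra (non-negative floats) of equal length.
    power=False subtracts amplitudes; power=True squares both spectra
    first and returns a noise-suppressed power spectrum.
    """
    if len(noisy) != len(noise):
        raise ValueError("spectra must have the same number of bins")
    if power:
        noisy = [a * a for a in noisy]
        noise = [n * n for n in noise]
    # Clamp at `floor` so no bin goes negative (an assumption of this sketch).
    return [max(a - n, floor) for a, n in zip(noisy, noise)]


noisy_amp = [5.0, 3.0, 1.0]   # made-up amplitude spectrum of noisy speech
noise_amp = [1.0, 1.0, 2.0]   # made-up amplitude spectrum of the noise
print(spectral_subtract(noisy_amp, noise_amp))              # [4.0, 2.0, 0.0]
print(spectral_subtract(noisy_amp, noise_amp, power=True))  # [24.0, 8.0, 0.0]
```

In the flow described for the apparatus, the noise spectrum would presumably come from the ambient sound picked up before the operator speaks; how that estimate is formed is outside this sketch.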

Next, the voice recognition means is the means by which the CPU 111 operates the voice recognition engine 119 to refer to the acoustic dictionary 1134 on the basis of the voice feature amount of the voice captured while the voice recognition key 37 is turned on, and to output matching or similar voice pattern data. Here, voice pattern data means the phoneme pattern string output from the acoustic dictionary 1134, described later. Specifically, the voice feature amount is obtained by performing linear predictive analysis on the input sound. Linear predictive analysis is a technique for obtaining the spectral envelope from the input sound and is a widely known voice feature extraction technique that reflects the vocal tract characteristics of the speech production mechanism (see Kiyohiro Shikano and four others, "Speech Recognition System", Ohmsha, 1st edition (May 2001), pp. 1-13). Using the acoustic dictionary 1134, which stores the voice feature amounts and voice patterns of voices created in advance, the voice recognition means outputs voice pattern data. For example, the HMM method is used to output the voice pattern data (see Seiichi Nakagawa, "Speech Recognition by Probabilistic Models", Institute of Electronics, Information and Communication Engineers, pp. 29-80).
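The idea of outputting every "matching or similar" pattern can be illustrated with a deliberately simplified sketch. The embodiment uses linear predictive analysis and the HMM method; the code below swaps in a plain Euclidean distance with a threshold, purely to show how a single utterance can yield more than one phoneme pattern. The feature vectors, the threshold value, and the names `ACOUSTIC_DICTIONARY` and `recognize` are assumptions, not taken from the description.

```python
import math

# Hypothetical acoustic dictionary: (feature vector, phoneme pattern string).
ACOUSTIC_DICTIONARY = [
    ((1.0, 0.0, 0.5), "anpan"),
    ((1.0, 0.2, 0.5), "anman"),
    ((0.0, 1.0, 0.0), "shokupan"),
]


def recognize(feature, threshold=0.5):
    """Return every phoneme pattern whose stored feature lies within `threshold`.

    Results are ordered from the closest match to the farthest.
    An empty list means nothing matched or was similar.
    """
    scored = []
    for stored, pattern in ACOUSTIC_DICTIONARY:
        distance = math.dist(feature, stored)  # Euclidean distance (stand-in for HMM scoring)
        if distance <= threshold:
            scored.append((distance, pattern))
    return [pattern for _, pattern in sorted(scored)]


print(recognize((1.0, 0.05, 0.5)))  # ['anpan', 'anman']
print(recognize((5.0, 5.0, 5.0)))   # []
```

Returning all patterns under the threshold, instead of only the best one, is what keeps a misheard but similar-sounding candidate in play.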

The voice pattern data extraction means 1111 extracts product categories as product category candidates on the basis of the voice pattern data output by the voice recognition means. The candidates used to read out product data are extracted by this means and stored in the product category candidate storage means 1141, described later.

The CPU 111 has product data reading means 1113, which reads the product data set in correspondence with the product category candidates from the product category/product data dictionary 1131, described later, and product sales data processing means 1114, which performs sales processing such as input processing or registration processing of the product data of sold products.

The product data read by the product data reading means 1113 is displayed on the display 121. FIG. 3 shows an example of the product sales processing screen 38 displayed on the display 121. As illustrated, the product sales processing screen 38 displays a numeric keypad 33 with keys such as "0" to "9" as well as touch keys 35 such as a confirm key and a cancel key. At the upper left of the product sales processing screen 38 is a table 31 that displays the product, quantity, and price, and there is also a table 32 that displays entered numeric values. The operator speaks while pressing the voice recognition key 37 on the touch panel, and the voice is captured. After voice capture ends, the product data obtained from the recognized voice candidates is displayed in a table 36 located in the center of the product sales processing screen 38. The product data item selected by the operator is displayed in a display section 34 located at the upper center of the screen.

Next, FIG. 4 shows an example in which product category candidates have been output on the product sales processing screen 38. In this embodiment, up to about ten items of product data obtained from the voice candidates are output on the display 121. The invention is not necessarily limited to this form, however: all product data set in correspondence with the extracted product category candidates may be output on the product sales processing screen 38, or the product data may be displayed in sequence in a scrolling format. As for the display format, only the product name may be shown, or a product image may be displayed together with the name. Additional information such as product sales information may also be added to show more detailed product information; the display format is not limited to that of this embodiment.
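One way to picture the "up to about ten items, or scroll through the rest" behaviour is to split the product data into fixed-size pages. The sketch below is only an illustration; the function name `paginate`, the default page size of 10, and the placeholder product names are assumptions.

```python
def paginate(items, page_size=10):
    """Split the product data into pages of at most `page_size` entries."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


products = [f"product {n}" for n in range(1, 24)]  # 23 placeholder names
pages = paginate(products)
print(len(pages))   # 3
print(pages[-1])    # ['product 21', 'product 22', 'product 23']
```

Showing only `pages[0]` corresponds to limiting the display to about ten items, while stepping through all pages corresponds to the scrolling variant.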

Next, when the product data read by the product data reading means 1113 is displayed on the display 121, the output display control means 1161 ranks and sorts it on the basis of information held for each product. For example, when the product data information is the number of times a product has been selected and input, the products are sorted in descending order of the number of selections. When the product data information is the product price, the products are sorted in descending order of price. Applying a fixed ordering rule in this way makes the operator's processing operations more convenient.
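The two ordering rules mentioned here, most-selected first and most-expensive first, can be sketched as a single sort with a selectable key. The `Product` class, the function `order_for_display`, and all prices and counts below are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    price: int            # illustrative price
    selection_count: int  # illustrative number of times the item was selected


def order_for_display(products, key="selection_count"):
    """Return the products sorted in descending order of the chosen attribute."""
    if key not in ("selection_count", "price"):
        raise ValueError(f"unsupported sort key: {key}")
    return sorted(products, key=lambda p: getattr(p, key), reverse=True)


items = [
    Product("tsubu-anpan", 120, 40),
    Product("koshi-anpan", 130, 25),   # placeholder product name
    Product("anman", 100, 60),
]
print([p.name for p in order_for_display(items)])
# ['anman', 'tsubu-anpan', 'koshi-anpan']
print([p.name for p in order_for_display(items, key="price")])
# ['koshi-anpan', 'tsubu-anpan', 'anman']
```

Because `sorted` is stable, products with equal counts or prices keep the order in which they were read from the dictionary.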

The dictionaries stored in the RAM 113 are described next. The RAM 113 holds a product category/product data dictionary 1131, a product data/product character string dictionary 1132, a voice dictionary 1133, and an acoustic dictionary 1134. The acoustic dictionary 1134 shown in FIG. 5 associates voice feature amounts with voice pattern data. For example, when the word spoken by the operator is "anpan" (あんぱん), the voice feature vector of that sound is as indicated by 51 in FIG. 5, and the phoneme string "anpan" is stored in association with it as the voice pattern data 52. The voice dictionary 1133 shown in FIG. 6 stores voice pattern data 52 that describes the reading of the voice when a given word or phrase is spoken. Here, the voice pattern data 52 is the reading of each product category. The voice pattern data 52 and the product category 61 are stored in association with each other, and the voice dictionary 1133 is used to recognize which product category 61 the input voice corresponds to.

The product category/product data dictionary 1131 stores product data that identifies products, together with the product categories set in correspondence with that product data. FIG. 7 shows the data stored in the product category/product data dictionary 1131. As shown in the figure, a product category 61, a menu code 72, and a product name 73 serving as product data are stored, and product names 73 that are classified into the same product category 61 have the same product category 61 set for them. On the basis of the product category 61, the product data reading means 1113 reads the product names 73 for which the same product category 61 is set. The product data is not limited to the product name 73 and may also be image information, sales information, or the like.

For example, "anpan" is set as the product category 61 corresponding to the product name 73 "tsubu-anpan" (つぶあんぱん). In this case, image information for "tsubu-anpan" may be set as product data, together with the product name 73, in correspondence with the product category 61 "anpan". Product categories may also be classified by appearance features such as the color, shape, and pattern of the product. In the case of bread, for example, in addition to categories based on general product names such as "anpan" and "shokupan" (食パン, white loaf bread), there may be color categories such as "black" and "white", shape categories such as "round", "square", and "triangular", and pattern categories such as "mottled" and "lattice". The product categories 61 are not limited to the classifications given here; anything that can classify products may be used. The product data/product character string dictionary 1132 is described in the second embodiment, below.
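As a rough sketch of how such a dictionary might be queried, the rows below pair a product category with a menu code and a product name, and a lookup returns every product name set for any of the candidate categories. The row contents, the menu codes, and the names `CATEGORY_PRODUCT_DICTIONARY` and `read_product_data` are assumptions for illustration; the actual layout is the one shown in FIG. 7.

```python
# Hypothetical rows: (product category, menu code, product name).
CATEGORY_PRODUCT_DICTIONARY = [
    ("anpan", "001", "tsubu-anpan"),
    ("anpan", "002", "koshi-anpan"),   # placeholder product
    ("anman", "003", "anman"),
    ("round", "001", "tsubu-anpan"),   # appearance-based category
]


def read_product_data(category_candidates):
    """Return the product names set for any of the candidate categories.

    The order follows the dictionary, and a product reachable through
    several categories is listed only once.
    """
    wanted = set(category_candidates)
    names = []
    for category, _menu_code, name in CATEGORY_PRODUCT_DICTIONARY:
        if category in wanted and name not in names:
            names.append(name)
    return names


print(read_product_data(["anpan"]))           # ['tsubu-anpan', 'koshi-anpan']
print(read_product_data(["anpan", "round"]))  # ['tsubu-anpan', 'koshi-anpan']
print(read_product_data(["melonpan"]))        # []
```

Registering one product under both a name-based and an appearance-based category, as in the last row, is one way to let the operator reach it by saying either word.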

The dictionaries described above need not be in the RAM 113. They may be provided on the HDD 114 in the product sales data processing apparatus, with the voice pattern data read from the HDD 114, or they may be provided on a server, with the data read from the server via a telecommunication line.

The HDD 114 has product category candidate storage means 1141, product character string candidate storage means 1142, and product sales data processing count information storage means 1143. The product category candidate storage means 1141 stores the phonetically similar product category candidates extracted by the voice pattern data extraction means 1111. FIG. 8 shows the data stored in the product category candidate storage means 1141; the product category candidates extracted by the voice pattern data extraction means 1111 are stored in the product category candidate area 81. The product sales data processing count information storage means 1143 stores product sales data processing count information. The product sales data processing count may be, for example, simply the number of times a product has been selected and input, or the number of sales registrations; any data relating to the number of times product sales processing has been performed may be used. Like the other dictionaries and storage means, these storage means may be provided over a telecommunication line.

First, when the power is turned on, the product sales data processing apparatus starts up, and the CPU 111 displays on the display 121 the product sales processing screen 38 shown in FIG. 3, which uses voice recognition during product sales processing. Once this screen is displayed, voice input starts and capture of ambient noise begins from the microphone 13 attached to the apparatus main body 1 (ST9-1). When the operator speaks while pressing the voice recognition key 37 on the product sales processing screen 38, the operator's voice is captured (ST9-2). The voice captured via the microphone 13 is A/D converted from analog voice data to digital voice data by the A/D converter 118 (ST9-3). The digital voice data after A/D conversion is subjected to noise processing by the SS method (ST9-4), which removes the noise contained in the digital voice data.

In this embodiment, input section detection means 1191, which detects the start and end of voice input, is used to improve the recognition rate of voice recognition when the operator's voice is captured. With this means, voice acquisition starts when the operator presses the voice recognition key 37 displayed on the product sales processing screen 38 at the same time as starting to speak. When the touch panel sensor 122 detects that the voice recognition key 37 in FIG. 3 has been touched, the voice input by the operator is converted into digital voice data by the A/D converter 118, and the CPU 111 has this digital voice data taken into the voice recognition engine 119 (ST9-2). Capture of the digital voice data continues until the finger is released from the voice recognition key 37 and the key is turned off. When the CPU 111 detects that the voice recognition key 37 has been turned off, voice capture ends.
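The press-and-hold behaviour of the voice recognition key can be modelled as a small state holder that keeps audio frames only while the key is down. This is a sketch under stated assumptions: the class `PushToTalkCapture`, its method names, and the string "frames" are invented for illustration and do not come from the description.

```python
class PushToTalkCapture:
    """Collect audio frames only while the voice recognition key is held."""

    def __init__(self):
        self.active = False
        self.frames = []

    def key_down(self):
        # Key touched: start a new utterance.
        self.active = True
        self.frames = []

    def feed(self, frame):
        # Frames arriving while the key is not held are discarded.
        if self.active:
            self.frames.append(frame)

    def key_up(self):
        # Finger released: stop capturing and hand back the utterance.
        self.active = False
        return list(self.frames)


capture = PushToTalkCapture()
capture.feed("noise")   # ignored, key not pressed yet
capture.key_down()
capture.feed("a")
capture.feed("n")
utterance = capture.key_up()
capture.feed("late")    # ignored, key already released
print(utterance)        # ['a', 'n']
```

Bounding the utterance by the key press means that speech from other people before or after the press never reaches the recognizer.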

After voice capture ends, the voice recognition engine 119 extracts the features of the voice captured while the voice recognition key 37 was turned on (ST9-5), refers to the acoustic dictionary 1134 to compare the voice feature amount of the input voice with the voice feature amounts created in advance, and determines whether they match (ST9-6). If the input does not match or resemble any voice feature amount created in advance (NO in ST9-6), voice recognition ends (ST9-17). If it matches or resembles one (YES in ST9-6), the acoustic dictionary 1134 is referred to on the basis of that voice feature amount, and the voice pattern data stored in association with the matching or similar voice feature amount is output (ST9-7).

Next, the voice dictionary 1133 is referred to on the basis of the output voice pattern data, and it is determined whether there is a product category stored in association with that voice pattern data (ST9-8). For example, if the voice pattern data output by the voice recognition means is "anpan" and "anman", this voice pattern data is compared with the voice pattern data stored in the voice dictionary 1133 (ST9-8). If the output voice pattern data is not present among the voice pattern data in the voice dictionary 1133 (NO in ST9-8), voice recognition ends (ST9-17). Alternatively, the operator may be warned, for example by displaying an error message such as "Please input a different product category by voice." or by sounding a warning tone. If, on the other hand, the voice pattern data output by the voice recognition means is present in the voice dictionary 1133 (YES in ST9-8), the voice pattern data extraction means 1111 extracts product categories on the basis of that voice pattern data (ST9-9). For example, if the voice pattern data "anpan" and "anman" mentioned above is stored in advance in the voice dictionary 1133 and matches, the two product categories "anpan" (あんぱん) and "anman" (あんまん) associated with the voice pattern data "anpan" and "anman" are extracted as candidates by the voice pattern data extraction means 1111.
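Steps ST9-8 and ST9-9 amount to a dictionary lookup from phoneme patterns to product categories, with an early exit when nothing is found. The sketch below is illustrative only; the contents of `VOICE_DICTIONARY`, the function name `extract_category_candidates`, and the choice to skip unknown patterns when at least one other pattern is found are assumptions.

```python
# Hypothetical voice dictionary: phoneme pattern -> product category.
VOICE_DICTIONARY = {
    "anpan": "Anpan",
    "anman": "Anman",
    "shokupan": "Shokupan",
}


def extract_category_candidates(patterns):
    """Map recognized phoneme patterns to product category candidates.

    Patterns missing from the voice dictionary are skipped. An empty
    result corresponds to the NO branch of ST9-8.
    """
    candidates = []
    for pattern in patterns:
        category = VOICE_DICTIONARY.get(pattern)
        if category is not None and category not in candidates:
            candidates.append(category)
    return candidates


candidates = extract_category_candidates(["anpan", "anman"])
if candidates:
    print("candidates:", candidates)   # candidates: ['Anpan', 'Anman']
else:
    # One of the warning options mentioned in the description.
    print("Please input a different product category by voice.")
```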

Next, the product category candidates extracted in ST9-9 are temporarily stored in the product category candidate storage means 1141 (ST9-10). Here, the extracted candidates such as “Anpan” and “Anman” are held in the category candidate area 81 of the product category candidate storage means 1141 shown in FIG. 8. Based on these temporarily stored candidates, the product data reading means 1113 reads, from the product category / product data dictionary 1131, the product data set corresponding to each product category (ST9-11). The CPU 111 then acquires the product sales data processing count information needed to rank the product data (ST9-12).

Next, the output display control means 1161 ranks the product data on the basis of the product sales data processing count information and rearranges it into the desired order (ST9-13). For example, product data with a high sales processing count may be listed first, from the top. Because the items then appear in descending order of sales processing count, those most likely to be sold are shown first and the operator can work efficiently. After the product data has been rearranged in this way, it is displayed on the display 121 (ST9-14). The information used for ranking by the output display control means 1161 is not limited to the product sales data processing count information; the order may instead follow, for example, the Japanese syllabary (gojūon) order of the product names, product price, product popularity, or recommended products. As shown in FIG. 4, the product data appears on the central display 36. The number of product data items displayed may be limited in advance, or all items may be shown at once in a scrollable, selectable list. In addition to the product name, sales information such as a product image, the price, or the number of units sold may be displayed together with the product name 73.
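The ranking step in ST9-13 can be sketched as follows. This is only an illustration under stated assumptions: the `Product` record, its field names, the sample products, and the `order_for_display` helper are hypothetical, and only two of the orderings mentioned in the description (sales processing count and price) are shown.

```python
from typing import NamedTuple, Optional


class Product(NamedTuple):
    # Hypothetical record layout; the description does not define one.
    code: str
    name: str
    price: int
    sales_count: int  # number of times sales processing was performed


def order_for_display(products, key="sales_count", limit: Optional[int] = None):
    """Rank product data for display and optionally cap the number of items shown."""
    if key == "sales_count":
        # Most frequently sold items first.
        ordered = sorted(products, key=lambda p: p.sales_count, reverse=True)
    elif key == "price":
        # Cheapest items first.
        ordered = sorted(products, key=lambda p: p.price)
    else:
        raise ValueError(f"unsupported ordering: {key}")
    return ordered if limit is None else ordered[:limit]


products = [
    Product("001", "Koshi-an pan", 120, 15),
    Product("002", "Tsubu-an pan", 130, 42),
    Product("003", "Nikuman", 150, 7),
]

by_sales = [p.name for p in order_for_display(products)]
top_two = [p.name for p in order_for_display(products, limit=2)]
by_price = [p.name for p in order_for_display(products, key="price")]
```

The `limit` argument corresponds to restricting the number of items displayed in advance; passing `None` corresponds to showing everything in a scrollable list.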

Next, the operator selects the desired item from the product data displayed on the display 121 (ST9-15). Once product data has been selected, the CPU 111 performs sales processing on it (ST9-16), which completes the sales process (ST9-17).

Thus, according to this embodiment, when a product category suggested by the product name or the product's appearance is input by voice, multiple phonetically similar product category candidates are extracted. Because the product category candidates are set in correspondence with individual product data and several candidates are extracted, omissions in product data extraction can be prevented even when the voice is misrecognized. The operator's burden is also reduced, since sales processing can be performed without uttering the formal product name. Furthermore, when output is shown on the display 121, the items are ranked on the basis of the product sales data processing count information and shown in the form the operator wants, so even someone unfamiliar with sales processing, or a beginner, can perform the operation easily. Finally, because the desired product candidates are shown on the product sales processing screen 38 by voice input alone, fewer touch operations are needed during sales processing, which helps prevent erroneous operation and shortens input time, so sales processing can be completed quickly.

Next, as a second embodiment, a case in which a search by match or similarity of product character strings using voice recognition is applied to the product sales data processing apparatus will be described with reference to FIGS. 10 to 12. The same constituent elements as in the first embodiment are denoted by the same reference numerals, and detailed description is omitted.

First, in the second embodiment, the voice dictionary 1133 shown in FIG. 6 stores in advance standard voice pattern data obtained by quantifying the voice features produced when predetermined words are uttered. This voice pattern data is voice data for the character strings of product names, and the voice dictionary 1133 is used to recognize which product character string the input voice corresponds to.

The product data / product character string dictionary 1132 is storage means that stores product data in association with the product character strings, stored in the voice dictionary 1133, of the product names serving as product data. Its contents are shown in FIG. 10. The table stores product data 1002 and the product character strings 1001 set corresponding to that product data 1002. A product character string here is a character string that forms part or all of a product name. Within the dictionary, the character strings of product names are stored in the product character string area 1001, the product code 1003 assigned to each item of product data 1002 is stored in the menu code area, and the product name 73 that is output to the display 121 after voice recognition is set in the product name area. In this example the product name 73 “Bread, 8 slices” (read “shokupan hachimai”) has the product character strings “sho”, “shoku”, “shokupa”, “shokupan”, “shokupanha”, and so on. If the operator says “Bread, 8 slices” and the input is recognized as “shokupanhachi”, then “Bread, 8 slices” is a candidate. If the input is recognized as “shokupan”, then besides the product name 73 “Bread, 8 slices”, the product name “Bread, 6 slices”, which is set for the same product character string “shokupan”, is also a candidate.
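The table of FIG. 10 registers successive leading fragments of each product name's reading as product character strings. A minimal sketch of how such a table might be built is given below; it is an assumption for illustration, not the construction used in the description. In particular, the romanized readings, the letter-by-letter (rather than kana-by-kana) fragmenting, and the `min_len` parameter are all hypothetical.

```python
from collections import defaultdict

# Hypothetical romanized readings of two product names.
PRODUCT_READINGS = {
    "Bread, 8 slices": "shokupanhachimai",
    "Bread, 6 slices": "shokupanrokumai",
}


def build_string_dictionary(readings, min_len=3):
    """Map every leading fragment of a reading to the product names that share it."""
    table = defaultdict(list)
    for name, reading in readings.items():
        for end in range(min_len, len(reading) + 1):
            table[reading[:end]].append(name)
    return dict(table)


STRING_DICTIONARY = build_string_dictionary(PRODUCT_READINGS)

shared = STRING_DICTIONARY["shokupan"]        # fragment common to both products
specific = STRING_DICTIONARY["shokupanhachi"]  # fragment unique to one product
```

A short fragment such as "shokupan" maps to both products, while a longer one such as "shokupanhachi" narrows the result to a single product, which mirrors the behavior described for the two bread items.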

The voice pattern data extraction means 1111 refers, on the basis of the voice pattern data output by the voice recognition means, to the voice pattern data of the product character strings stored in the voice dictionary 1133 and extracts product character string candidates. It need not restrict extraction to candidates whose voice pattern data matches at the beginning (prefix match); candidates whose voice pattern data matches in the middle or at the end may also be extracted. The product character string candidates extracted by the voice pattern data extraction means 1111 are stored in the product character string candidate storage means 1142.
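The three matching modes can be compared with a short sketch. The stored strings and the `match_strings` helper are hypothetical, and "middle" matching is assumed here to mean that the pattern occurs anywhere in the stored string, which also covers prefix and suffix hits.

```python
def match_strings(pattern, stored_strings, mode="prefix"):
    """Return the stored strings that match the recognized pattern in the given mode."""
    tests = {
        "prefix": lambda s: s.startswith(pattern),  # match at the beginning
        "middle": lambda s: pattern in s,           # match anywhere (assumed meaning)
        "suffix": lambda s: s.endswith(pattern),    # match at the end
    }
    if mode not in tests:
        raise ValueError(f"unknown mode: {mode}")
    return [s for s in stored_strings if tests[mode](s)]


# Hypothetical romanized product character strings.
STORED = ["shokupan", "anpan", "panko", "meronpanrasuku"]

prefix_hits = match_strings("pan", STORED, "prefix")
suffix_hits = match_strings("pan", STORED, "suffix")
middle_hits = match_strings("pan", STORED, "middle")
```

Allowing middle and suffix matches widens the candidate set, which suits an operator who remembers only a later part of a product name, at the cost of more items to choose from.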

FIG. 11 shows the product character string candidates stored in the product character string candidate storage means 1142. The product character strings extracted by the voice pattern data extraction means 1111 are stored in the product character string candidate area 1101 of the product character string candidate storage means 1142. The product data reading means 1113 then reads, on the basis of the product character strings stored in the product character string candidate storage means 1142, the product character string candidates set corresponding to those product character strings. Like the other dictionaries and storage means, this storage means may be provided in the HDD 114 or in a server.

The operation of the second embodiment will be described with reference to FIG. 12. Steps ST12-1 to ST12-7 are the same as ST9-1 to ST9-7 of the first embodiment. When the power is turned on, the product sales data processing apparatus starts up and the CPU 111 displays the product sales processing screen 38 using voice recognition shown in FIG. 3 (ST12-1). In this screen state, capture of ambient noise from the microphone 13 begins, and capture of the operator's speech is started and stopped by turning the voice recognition key 37 on and off (ST12-2). The captured voice is converted from analog voice data to digital voice data by the A/D converter 118 (ST12-3). The digital voice data after A/D conversion is subjected to noise processing by the SS method (ST12-4). The CPU 111 then runs the voice recognition engine 119 to extract features from the captured voice pattern data (ST12-5). Because the voice dictionary 1133 stores product names and the voice pattern data of product character strings in association with each other, the operator may, when voice data is captured in ST12-2, speak either the whole product name or only part of it.

Next, the voice feature quantity of the input voice is compared with the voice feature quantities created in advance to determine whether they match (ST12-6). If it neither matches nor resembles any previously created voice feature quantity (NO in ST12-6), voice recognition ends (ST12-17). If it matches or resembles one (YES in ST12-6), the acoustic dictionary is referenced on the basis of the matched or similar voice feature quantity, and the voice pattern data stored in association with it is output (ST12-7).
Next, the voice dictionary 1133 is referenced on the basis of the output voice pattern data to determine whether a product character string is stored in association with that voice pattern data (ST12-8). If no product character string is stored in association with the output voice pattern data (NO in ST12-8), voice recognition ends (ST12-17). Alternatively, the operator may be warned, for example by displaying an error message such as “Please input a different product character string by voice” or by sounding an alarm. If, on the other hand, a product character string candidate is stored in association with the output voice pattern data (YES in ST12-8), the voice pattern data extraction means 1111 extracts product character strings on the basis of the voice pattern data output by the voice recognition means (ST12-9). For example, if the operator says “shokupan” (bread) and the voice recognition engine 119 outputs “shokupan”, “shoku”, and “choco” as matching or similar voice pattern data, the product character strings stored in association with them, such as “shokupan”, “shoku”, and “choko”, are extracted by the voice pattern data extraction means 1111.

Next, the product character string candidates extracted in ST12-9 are temporarily stored in the product character string candidate storage means 1142 (ST12-10). At this point the candidates are held in the product character string candidate storage area 1101 shown in FIG. 11. Based on these temporarily stored candidates, the product data reading means 1113 reads, from the product data / product character string dictionary 1132, the product data set corresponding to the product character string candidates (ST12-11). Product sales data processing count information for the read product data is then acquired (ST12-12). The product sales information used here may be any information relating to the number of times product sales data processing has been performed.
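Steps ST12-10 to ST12-12 can be pictured with the sketch below. It rests on assumptions: the product codes, the dictionary contents, the sales counts, and the `read_product_data` helper are hypothetical. The sketch also removes duplicates when several candidate strings point to the same product, a detail the description does not address.

```python
# Hypothetical stand-in for the product data / product character string dictionary 1132:
# product character string -> product codes.
STRING_DICTIONARY = {
    "shokupan": ["P001", "P002"],
    "shoku": ["P001", "P002", "P003"],
    "choko": ["P004"],
}

# Hypothetical stand-in for the sales processing count storage 1143.
SALES_COUNTS = {"P001": 30, "P002": 12, "P003": 5, "P004": 50}


def read_product_data(candidates):
    """Collect each product once, with its sales processing count, for all candidates."""
    result = {}
    for candidate in candidates:
        for code in STRING_DICTIONARY.get(candidate, []):
            # setdefault keeps the first occurrence, so overlapping candidates
            # do not produce duplicate products.
            result.setdefault(code, SALES_COUNTS.get(code, 0))
    return result


product_data = read_product_data(["shokupan", "shoku", "choko"])
```

The resulting mapping of product code to count is what a later ranking step would sort before display.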

Next, the CPU 111 uses the output display control means 1161 to rank the product data on the basis of the product sales data processing count information and rearrange it into the desired order (ST12-13). For example, product data with a high sales processing count may be listed first, from the top. After the product data has been rearranged in this way, it is displayed on the display 121 (ST12-14). The number of product data items displayed may be limited in advance, or all product names may be shown at once in a scrollable, selectable list. In addition to the product name, sales information such as a product image, the price, or the number of units sold may be displayed at the same time.

Next, the operator selects the desired product data from the product data displayed on the display 121 (ST12-15). After the selection, the CPU 111 performs sales processing on the product data (ST12-16), which completes the sales process (ST12-17). The category search function of the first embodiment and the product character string search function of the second embodiment need not be implemented separately; the two embodiments may be combined.

According to the second embodiment of the present invention, product data is output on the product sales processing screen 38 when part of a product name is input by voice, so the operation can be performed even if the operator does not remember the product name 73 exactly. Because product sales processing is carried out through voice recognition, even an operator unfamiliar with the operation can perform it easily. In addition, when part or all of a product name is input by voice, phonetically similar product data is output even if the input is misrecognized, which prevents omissions in product data extraction. Moreover, because product data can be extracted and displayed by prefix, middle, or suffix match of the product character string, the operator can perform sales processing with only a vague memory of the product name, and since the product name need not be input correctly, the burden of sales processing is reduced.

As a third embodiment, a product data input apparatus using voice recognition will be described. The same constituent elements as in the first and second embodiments are denoted by the same reference numerals, and detailed description is omitted.

The input processing means 1112 performs input processing when any item of product data is selected from the product data output and displayed in response to voice input. The input count information storage means 1144 stores the number of times product data has been input by the input processing means 1112. The product sales data processing apparatus of the first embodiment executes processing such as input and registration, whereas the product data input apparatus of the third embodiment executes input processing only; in this respect it differs from the first embodiment.

Next, the flowchart of the product data input process will be described with reference to FIG. 13. The processing of ST13-1 to ST13-5 in this embodiment, from voice input through extraction of voice candidates, is the same as ST9-1 to ST9-5 of the first embodiment. The processing of ST13-6 to ST13-11, from extraction of product category candidates through output and display of product data on the display 121 based on those candidates, follows the same procedure as ST9-6 to ST9-11 of the first embodiment. In this embodiment, a product category is input by voice, and the acoustic dictionary 1134 is referenced on the basis of the voice feature quantity of the input voice. The voice recognition means outputs the matching or similar voice pattern data. The voice dictionary 1133 is referenced on the basis of this voice pattern data, and the voice pattern data extraction means 1111 extracts product categories. The product data set corresponding to the extracted product category candidates is read from the product category / product data dictionary 1131 by the product data reading means 1113 (ST13-1 to ST13-12). The read product data is displayed on the display 121. The operator then makes a selection from the displayed product data, the input processing means 1112 performs input processing (ST13-13), and the input process ends (ST13-14).
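The selection step ST13-13 and the counting role of the input count information storage means 1144 can be sketched as below. The `InputProcessor` class, its method names, and the sample product names are hypothetical, and using the stored input counts to rank the next display is an assumption about how the counts might be applied.

```python
from collections import Counter


class InputProcessor:
    """Hypothetical stand-in for the input processing means 1112 plus count storage 1144."""

    def __init__(self):
        self.input_counts = Counter()  # product -> number of times it was input
        self.entered = []              # products input so far, in order

    def select(self, displayed, index):
        """Input the product the operator picked from the displayed list."""
        product = displayed[index]
        self.entered.append(product)
        self.input_counts[product] += 1
        return product

    def ranked(self, products):
        """Order products so that the most frequently input ones come first."""
        return sorted(products, key=lambda p: self.input_counts[p], reverse=True)


processor = InputProcessor()
displayed = ["Koshi-an pan", "Tsubu-an pan", "Nikuman"]
processor.select(displayed, 1)
processor.select(displayed, 1)
processor.select(displayed, 2)
next_display = processor.ranked(displayed)
```

Unlike the sales processing apparatus, nothing here registers a sale; the sketch only records what was input and how often.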

According to the third embodiment of the present invention, when a product category is input by voice, multiple phonetically similar product category candidates are extracted. Extracting multiple candidates in this way makes it possible to display the product data set corresponding to the product category candidates without omission. When the product data is output to the display 121, it is shown in an appropriately ranked order, so the operator can perform input processing quickly. As a further effect of the ranking, product data that is likely to be input is displayed preferentially, which reduces the number of input operations and helps prevent input errors.

Next, a fourth embodiment will be described. The same constituent elements as in the first, second, and third embodiments are denoted by the same reference numerals, and detailed description is omitted. In the fourth embodiment, part of a product name is input by voice, and product character string candidates that match or resemble the product character string of that product name are extracted. The apparatus is a product data input apparatus that extracts product data on the basis of the extracted product character string candidates and outputs and displays it.

The processing of the fourth embodiment will be described with reference to FIG. 14. ST14-1 to ST14-11 perform the same processing as in the second embodiment. In this embodiment, a product name is input by voice, and product character string candidates that match or resemble the product character string voice data of the product name are extracted. The product data set corresponding to these product character string candidates is read by the product data reading means 1113 (ST14-12). The read product data is displayed on the display 121, the operator makes a selection from the displayed product data, and the input processing means 1112 performs input processing (ST14-13).

According to this embodiment, when part of a product name is input by voice, product data candidates are output on the display 121 on the basis of the character string candidates of the product name. Even an operator who is unfamiliar with input processing and has not memorized the formal product names can therefore perform input processing easily and quickly. In addition, product data that is likely to be input is displayed preferentially, which helps prevent input errors. These effects make it possible to support the input operations of an inexperienced operator.

The present invention is not limited to the embodiments described above as they stand; at the implementation stage, the constituent elements may be modified and embodied without departing from the gist of the invention, and various inventions may be derived by appropriately combining the constituent elements disclosed in the embodiments.

FIG. 1 is an external perspective view of a product sales data processing apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the main configuration of the product sales data processing apparatus according to the present invention.
FIG. 3 is a diagram of a processing screen of the product sales data processing apparatus according to the present invention.
FIG. 4 is a diagram of a screen on which product category candidates have been output by the voice recognition means of the product sales data processing apparatus according to the present invention.
FIG. 5 is a diagram showing the data stored in the acoustic dictionary of the product sales data processing apparatus according to the present invention.
FIG. 6 is a diagram showing the data stored in the voice dictionary of the product sales data processing apparatus according to the present invention.
FIG. 7 is a diagram showing the data stored in the product data / product category dictionary of the product sales data processing apparatus according to the present invention.
FIG. 8 is a diagram showing the data stored in the product category candidate storage means of the product sales data processing apparatus according to the present invention.
FIG. 9 is a flowchart showing the processing of the first embodiment in the product sales data processing apparatus according to the present invention.
FIG. 10 is a diagram showing the data stored in the product data / product character string dictionary of the product sales data processing apparatus according to the present invention.
FIG. 11 is a diagram showing the data stored as product character string candidates in the product sales data processing apparatus according to the present invention.
FIG. 12 is a flowchart showing the processing of the second embodiment in the product sales data processing apparatus according to the present invention.
FIG. 13 is a flowchart showing the processing of the third embodiment in the product sales data processing apparatus according to the present invention.
FIG. 14 is a flowchart showing the processing of the fourth embodiment in the product sales data processing apparatus according to the present invention.

Explanation of symbols

1111 Voice pattern data extraction means
1112 Input processing means
1113 Product data reading means
1114 Product sales data processing means
1131 Product category / product data dictionary
1132 Product data / product character string dictionary
1133 Voice dictionary
1134 Acoustic dictionary
1141 Product category candidate storage means
1142 Product character string candidate storage means
1143 Product sales data processing count information storage means
1144 Input count information storage means
12 Output display means

Claims

An acoustic dictionary that stores voice feature data and voice pattern data created in advance in association with each other;
A voice dictionary storing the voice pattern data and product categories in association with each other;
Product data for identifying products, and a product category / product data dictionary storing the product categories set corresponding to the product data;
Voice input means for inputting voice;
Voice recognition means for comparing, with reference to the acoustic dictionary, the voice feature quantity of the voice input by the voice input means with the voice feature quantities created in advance, and outputting the voice pattern data stored in association with the matched or similar voice feature quantity;
Voice pattern data extracting means for referring to the voice dictionary and extracting the product category as a candidate based on the voice pattern data output by the voice recognition means;
Commodity category candidate storage means for storing the commodity category extracted by the voice pattern data extraction means as commodity category candidates;
Product data reading means for reading the product data set corresponding to the product category candidate from the product category / product data dictionary;
Output display means for outputting and displaying the product data read by the product data reading means;
Product sales data processing means for performing product sales data processing when any of the product data is selected from the product data output and displayed by the output display means;
A product sales data processing apparatus characterized by comprising:

A product sales data processing apparatus comprising:
an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other;
a voice dictionary that stores the voice pattern data and product character strings in association with each other;
a product data / product character string dictionary that stores product data for identifying products and the product character strings set corresponding to the product data;
voice input means for inputting voice;
voice recognition means for referring to the acoustic dictionary, comparing the voice feature amount of the voice input by the voice input means with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
voice pattern data extracting means for referring to the voice dictionary and extracting a product character string as a candidate based on the voice pattern data output by the voice recognition means;
product character string candidate storage means for storing the product character string extracted by the voice pattern data extracting means as a product character string candidate;
product data reading means for reading, from the product data / product character string dictionary, the product data set corresponding to the product character string candidate;
output display means for outputting and displaying the product data read by the product data reading means; and
product sales data processing means for performing product sales data processing when any of the product data output and displayed by the output display means is selected.
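
For illustration, the lookup chain recited in the two apparatus claims above (acoustic dictionary, then voice dictionary, then product dictionary) can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the dictionary contents, the Euclidean distance, the `max_distance` threshold, and every function name are hypothetical, since the claims do not specify data formats or a similarity measure. The sketch follows the product category variant; the product character string variant would swap in the corresponding dictionaries.

```python
# Hypothetical acoustic dictionary: voice feature amount -> voice pattern data.
ACOUSTIC_DICTIONARY = {
    (0.9, 0.1, 0.3): "ringo",
    (0.8, 0.2, 0.3): "bingo",
    (0.1, 0.9, 0.7): "mikan",
}

# Hypothetical voice dictionary: voice pattern data -> product category.
VOICE_DICTIONARY = {"ringo": "apple", "bingo": "game", "mikan": "mandarin"}

# Hypothetical product category / product data dictionary.
CATEGORY_PRODUCT_DICTIONARY = {
    "apple": ["Fuji apple", "Jonagold apple"],
    "game": ["Bingo card set"],
    "mandarin": ["Unshu mandarin"],
}


def recognize(features, max_distance=0.2):
    """Return every voice pattern whose stored features match or are similar.

    Similarity is assumed here to mean Euclidean distance <= max_distance.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    return [
        pattern
        for stored, pattern in ACOUSTIC_DICTIONARY.items()
        if distance(features, stored) <= max_distance
    ]


def extract_category_candidates(patterns):
    """Look up each recognized voice pattern in the voice dictionary."""
    return [VOICE_DICTIONARY[p] for p in patterns if p in VOICE_DICTIONARY]


def read_product_data(category_candidates):
    """Read the product data set for every stored category candidate."""
    products = []
    for category in category_candidates:
        products.extend(CATEGORY_PRODUCT_DICTIONARY.get(category, []))
    return products


def products_for_utterance(features):
    """Chain recognition, candidate extraction, and product data reading."""
    patterns = recognize(features)
    candidates = extract_category_candidates(patterns)  # candidate storage
    return read_product_data(candidates)
```

Because every similar pattern is kept as a candidate rather than only the best match, an utterance that sits between two stored patterns yields the products of both categories, and the operator then selects the intended product from the displayed list.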

The product sales data processing apparatus according to claim 1 or 2, further comprising output display control means for controlling the order of the product data read by the product data reading means and outputting and displaying the product data.

The product sales data processing apparatus according to claim 3, wherein the product sales data processing means has product sales data processing count information storage means for storing the number of times product sales data processing has been performed on the product data, and the output display control means controls the order of the product data based on the product sales data processing count information stored in the product sales data processing count information storage means, and outputs and displays the product data.
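
The count-based ordering in the claim above can be sketched as follows. This is a minimal sketch under assumptions: the claim states only that the order is controlled based on the stored count information, so listing the most frequently processed products first is an assumed policy, and `processing_counts`, `record_sale`, and `order_for_display` are hypothetical names.

```python
from collections import Counter

# Hypothetical in-memory stand-in for the product sales data processing
# count information storage means.
processing_counts = Counter()


def record_sale(product):
    """Increment the stored count each time sales processing is performed."""
    processing_counts[product] += 1


def order_for_display(products):
    """Order the read product data by stored count, highest first (assumed).

    sorted() is stable, so products with equal counts keep their read order.
    """
    return sorted(products, key=lambda p: processing_counts[p], reverse=True)
```

With this policy, products that the operator selects often move toward the top of the candidate list, while products that have never been sold keep the order in which they were read.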

A product data input device comprising:
an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other;
a voice dictionary that stores the voice pattern data and product categories in association with each other;
a product category / product data dictionary that stores product data for identifying products and the product categories set corresponding to the product data;
voice input means for inputting voice;
voice recognition means for referring to the acoustic dictionary, comparing the voice feature amount of the voice input by the voice input means with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
voice pattern data extracting means for referring to the voice dictionary and extracting a product category as a candidate based on the voice pattern data output by the voice recognition means;
product category candidate storage means for storing the product category extracted by the voice pattern data extracting means as a product category candidate;
product data reading means for reading, from the product category / product data dictionary, the product data set corresponding to the product category candidate;
output display means for outputting and displaying the product data read by the product data reading means; and
input processing means for performing input processing when any of the product data output and displayed by the output display means is selected.

A product data input device comprising:
an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other;
a voice dictionary that stores the voice pattern data and product character strings in association with each other;
a product data / product character string dictionary that stores product data for identifying products and the product character strings set corresponding to the product data;
voice input means for inputting voice;
voice recognition means for referring to the acoustic dictionary, comparing the voice feature amount of the voice input by the voice input means with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
voice pattern data extracting means for referring to the voice dictionary and extracting a product character string as a candidate based on the voice pattern data output by the voice recognition means;
product character string candidate storage means for storing the product character string extracted by the voice pattern data extracting means as a product character string candidate;
product data reading means for reading, from the product data / product character string dictionary, the product data set corresponding to the product character string candidate;
output display means for outputting and displaying the product data read by the product data reading means; and
input processing means for performing input processing when any of the plurality of product data output and displayed by the output display means is selected.

The product data input device according to claim 1, further comprising output display control means for controlling the order of the product data read by the product data reading means and outputting and displaying the product data.

The product data input device according to claim 3, wherein the input processing means has input count information storage means for storing the number of times the selected product data has been input, and the output display control means controls the order of the product data based on the input count information stored in the input count information storage means, and outputs and displays the product data.

A program for causing a product sales data processing apparatus, which is provided with an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other, a voice dictionary that stores the voice pattern data and product categories in association with each other, and a product category / product data dictionary that stores product data for identifying products and the product categories set corresponding to the product data, to realize:
a voice input function for inputting voice;
a voice recognition function for referring to the acoustic dictionary, comparing the voice feature amount of the input voice with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
a product category extraction function for referring to the voice dictionary based on the output voice pattern data and extracting a product category as a candidate;
a product category storage function for storing the product category extracted as a candidate;
a product data reading function for reading, from the product category / product data dictionary, the product data set corresponding to the stored product category;
an output display function for outputting and displaying the read product data; and
a product sales data processing function for performing product sales data processing when any of the output and displayed product data is selected.

A program for causing a product sales data processing apparatus, which is provided with an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other, a voice dictionary that stores the voice pattern data and product character strings in association with each other, and a product data / product character string dictionary that stores product data for identifying products and the product character strings set corresponding to the product data, to realize:
a voice input function for inputting voice;
a voice recognition function for referring to the acoustic dictionary, comparing the voice feature amount of the input voice with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
a product character string extraction function for referring to the voice dictionary based on the output voice pattern data and extracting a product character string as a candidate;
a product character string storage function for storing the product character string extracted as a candidate;
a product data reading function for reading, from the product data / product character string dictionary, the product data set corresponding to the stored product character string;
an output display function for outputting and displaying the read product data; and
a product sales data processing function for performing product sales data processing when any of the plurality of output and displayed product data is selected.

A program for causing a product data input device, which is provided with an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other, a voice dictionary that stores the voice pattern data and product categories in association with each other, and a product category / product data dictionary that stores product data for identifying products and the product categories set corresponding to the product data, to realize:
a voice input function for inputting voice;
a voice recognition function for referring to the acoustic dictionary, comparing the voice feature amount of the input voice with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
a product category extraction function for referring to the voice dictionary based on the output voice pattern data and extracting a product category as a candidate;
a product category storage function for storing the product category extracted as a candidate;
a product data reading function for reading, from the product category / product data dictionary, the product data set corresponding to the stored product category;
an output display function for outputting and displaying the read product data; and
an input processing function for performing input processing when any of the output and displayed product data is selected.

A program for causing a product data input device, which is provided with an acoustic dictionary that stores voice feature amounts created in advance and voice pattern data in association with each other, a voice dictionary that stores the voice pattern data and product character strings in association with each other, and a product data / product character string dictionary that stores product data for identifying products and the product character strings set corresponding to the product data, to realize:
a voice input function for inputting voice;
a voice recognition function for referring to the acoustic dictionary, comparing the voice feature amount of the input voice with the voice feature amounts created in advance, and outputting the voice pattern data stored in association with a matching or similar voice feature amount;
a product character string extraction function for referring to the voice dictionary based on the output voice pattern data and extracting a product character string as a candidate;
a product character string storage function for storing the product character string extracted as a candidate;
a product data reading function for reading, from the product data / product character string dictionary, the product data set corresponding to the stored product character string;
an output display function for outputting and displaying the read product data; and
an input processing function for performing input processing when any of the plurality of output product data is selected.