JP2010197669A - Portable terminal, editing guiding program, and editing device

Info

Publication number: JP2010197669A
Application number: JP2009042028A
Authority: JP
Inventors: Mitsufumi Yoshimoto; 光文吉本
Original assignee: Kyocera Corp
Current assignee: Kyocera Corp
Priority date: 2009-02-25
Filing date: 2009-02-25
Publication date: 2010-09-09

Abstract

PROBLEM TO BE SOLVED: To provide a portable terminal that improves the efficiency of document creation by voice recognition, and an editing guide program applied to a processor of the portable terminal.

SOLUTION: The portable terminal 10 includes a second microphone 16b that captures the user's voice, and recognizes the voice input to the second microphone 16b to generate character strings. When a character string is generated by voice recognition, the calculated likelihood is used as the reliability of the recognition, and the generated character string and its corresponding reliability are recorded in a reliability table. Character strings whose reliability is at or below a threshold are identified from the reliability table, and each identified low-reliability character string is displayed on an LCD monitor 26 with a blue background. Because candidates for misrecognized character strings can be identified at a glance, the user can easily judge whether editing is needed and can create documents efficiently using voice recognition.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a portable terminal, and more particularly to a portable terminal that inputs character strings by, for example, voice recognition.

Portable terminals that input character strings by, for example, voice recognition are conventionally known, and an example of this type of device is disclosed in Patent Document 1. The voice recognition device of this background art performs recognition one syllable at a time. When the reliability of the recognition result is high, it displays the character image of the syllable as is. When the reliability is low, it displays the character image of the vowel with a "?" image beside it, indicating that the consonant could not be recognized. When the reliability is lower still, it displays "*", which prompts input of the next syllable.

The pronunciation practice support system disclosed in Patent Document 2 can supply pronunciation practice content, such as English conversation, through a mobile phone. When the learner inputs a voice signal to the mobile phone, the signal is transmitted from the mobile phone to a line call control device, converted there into a voice data signal, and then transmitted to a pronunciation rating server. The pronunciation rating server matches the learner's spoken word or sentence against a data pattern to rate how closely the fundamental frequency pattern of the utterance resembles it. When the rating result is transmitted to a database server, content is edited according to the rating result and displayed on the screen of the mobile phone. The learner is thereby notified of the correctness of the pronunciation in the input voice signal simply by inputting the voice signal to the mobile phone.

JP-A-2005-128130 [G10L 15/22, G10L 15/28]
JP-A-2005-31207 [G09B 19/06, G09B 5/06, G09B 19/04, G10L 15/00]

However, with the voice recognition device of Patent Document 1, the user must speak one syllable at a time, which makes the device unsuitable for inputting long sentences. Even if a long sentence is input, low reliability for individual syllables yields a character string interspersed with "*" and "?", which is difficult for the user to read as a sentence.

The pronunciation practice support system of Patent Document 2 accepts voice input in units of words, but it does not disclose functions such as composing text by voice recognition.

A main object of the present invention is therefore to provide a novel portable terminal and a novel editing guide program applied to a processor of such a portable terminal.

Another object of the present invention is to provide a portable terminal, and an editing guide program applied to a processor of such a portable terminal, that can increase the efficiency of composing text by voice recognition.

To solve the above problems, the present invention employs the following configurations. The reference numerals and supplementary explanations in parentheses indicate correspondence with the embodiments described later to aid understanding of the invention, and do not limit the invention in any way.

The first invention is a portable terminal having capturing means for capturing a voice signal and voice recognition means for generating character strings from the voice signal captured by the capturing means, the portable terminal comprising: recording means for recording data indicating the character strings generated by the voice recognition means and their reliabilities; specifying means for specifying, with reference to the data, a character string whose reliability is at or below a predetermined value; and display means for displaying the character string specified by the specifying means in a form different from that of other character strings.

In the first invention, the portable terminal (10) has capturing means (16b), such as a microphone for voice recognition, and the voice recognition means (20a, 20b, 30) generates character strings from the voice signal captured by the capturing means. The voice recognition means is composed of, for example, a CPU (20a), a DSP (20b), and a ROM (30) that stores voice dictionary data.

The recording means (20a, S147) takes the likelihood calculated when the voice recognition means generates a character string as the reliability of the recognition, and records the generated character string and its corresponding reliability as data (336). In that data, a character string whose reliability is at or below a predetermined value (threshold) is specified by the specifying means (20a, S161) as a candidate for a misrecognized character string. The display means (20a, 26, S163, S165) then guides the user toward editing by displaying only the specified character strings differently, for example with a blue background.

According to the first invention, candidates for misrecognized character strings are displayed so that they can be identified at a glance, which makes it easier for the user to judge whether a candidate needs editing. The user can therefore efficiently create text using voice recognition.
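The recording, specifying, and display steps of the first invention can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Entry` class, the 0-to-1 reliability scale, the sample values, and the `[[...]]` markers that stand in for the blue background are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str           # character string produced by voice recognition
    reliability: float  # likelihood reported by the recognizer (assumed 0..1 scale)


def find_low_reliability(table, threshold):
    """Return the indices of entries whose reliability is at or below the threshold."""
    return [i for i, e in enumerate(table) if e.reliability <= threshold]


def render(table, threshold):
    """Join all strings, wrapping low-reliability ones in [[...]] (stand-in for a blue background)."""
    low = set(find_low_reliability(table, threshold))
    return "".join(f"[[{e.text}]]" if i in low else e.text for i, e in enumerate(table))


# Hypothetical reliability table for one dictated sentence.
table = [
    Entry("I will ", 0.92),
    Entry("meat", 0.41),
    Entry(" you at ", 0.88),
    Entry("noon", 0.55),
]

print(render(table, threshold=0.6))
```

Keeping the reliability next to each string, rather than discarding it after recognition, is what lets the threshold be changed later without re-running recognition.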

The second invention is dependent on the first invention, and further comprises cursor display means for displaying a cursor that selects only the specified character strings.

In the second invention, the cursor (CUb) is displayed by the cursor display means (20a, S201, S289) so as to select only the specified character strings.

According to the second invention, the cursor can select only low-reliability character strings, which improves the operability of the user's editing operations.

The third invention is dependent on the second invention, and further comprises operation means for receiving an operation to move the cursor, wherein the cursor selects a character string according to the result of the operation on the operation means.

In the third invention, the cursor moves according to the result of an operation on the operation means (22d), for example direction keys that can input the up, down, left, and right directions.

According to the third invention, the cursor can be operated with the direction keys, so the user can perform reliable cursor operations.
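A cursor that visits only the specified (low-reliability) character strings, as in the second and third inventions, might be modeled as below. The function name `next_target` and the choice to stop at either end instead of wrapping around are assumptions made for this sketch; the text above does not say which behavior the terminal uses.

```python
def next_target(low_indices, current, step):
    """Move the cursor among low-reliability entries only.

    low_indices: sorted indices of the specified character strings
    current:     index currently selected (must be one of low_indices)
    step:        +1 for a "right/down" key press, -1 for "left/up"
    """
    pos = low_indices.index(current)
    # Clamp at both ends (assumption: the cursor does not wrap around).
    pos = max(0, min(len(low_indices) - 1, pos + step))
    return low_indices[pos]


low = [1, 3, 6]                    # hypothetical positions of low-reliability strings
print(next_target(low, 1, +1))     # skips entry 2 and lands on entry 3
```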

The fourth invention is dependent on the second invention, and further comprises search means for searching for a character string that matches a new character string recognized by voice after the character strings were generated, wherein the cursor selects the character string found by the search means.

In the fourth invention, after the character strings have been generated, the search means (20a, S331) searches the specified character strings for the character string newly recognized from re-input voice. The cursor then selects a character string based on the search result.

According to the fourth invention, the user can operate the cursor simply by uttering the erroneous character string. The user is thus provided with a cursor operation method well suited to creating documents by voice recognition.
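The voice-driven cursor selection of the fourth invention reduces to a lookup restricted to the specified character strings. The sketch below assumes exact string equality as the matching rule and uses hypothetical names (`select_by_voice`, `strings`, `low_indices`).

```python
def select_by_voice(strings, low_indices, spoken):
    """Return the index of the first specified string equal to `spoken`, or None.

    Only low-reliability entries are searched, so a high-reliability string
    with the same text is never selected.
    """
    for i in low_indices:
        if strings[i] == spoken:
            return i
    return None


strings = ["I will", "meat", "you at", "noon"]
print(select_by_voice(strings, [1, 3], "noon"))
```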

The fifth invention is dependent on any one of the second to fourth inventions, and further comprises voice editing means (20a, S149, S319) for editing the character string selected by the cursor based on a character string newly generated by the voice recognition means.

In the fifth invention, the voice editing means (20a, S149, S319), for example, replaces the character string selected by the cursor with the character string newly recognized from re-input voice.

According to the fifth invention, the user can easily edit a misrecognized character string using voice recognition. The user is thus provided with an editing operation well suited to editing text by voice recognition.

The sixth invention is dependent on the fifth invention, and further comprises list display means for displaying a list of candidate character strings generated by the voice recognition means, wherein, when a candidate displayed by the list display means is selected, the voice editing means performs the edit using the selected candidate as the newly generated character string.

In the sixth invention, the list display means (20a, S315) displays, for example, all character strings whose likelihood calculated by the voice recognition means is at or above a predetermined value. The character string selected by the cursor is then replaced with the candidate that the user selects from the list.

According to the sixth invention, the voice recognition candidates are displayed as a list, so the user can edit even when the recognition accuracy of the re-input voice is not high.
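The candidate list of the sixth invention and the replacement step of the fifth invention can be illustrated together. The sketch assumes the recognizer returns several `(text, likelihood)` hypotheses for the re-input voice; the function names, the best-first ordering, and the sample likelihoods are hypothetical.

```python
def candidates(hypotheses, min_likelihood):
    """Return the texts whose likelihood is at or above min_likelihood, best first."""
    kept = [h for h in hypotheses if h[1] >= min_likelihood]
    return [text for text, _ in sorted(kept, key=lambda h: h[1], reverse=True)]


def replace_selected(strings, cursor, chosen):
    """Return a copy of `strings` with the cursor-selected entry replaced by `chosen`."""
    out = list(strings)
    out[cursor] = chosen
    return out


# Hypothetical recognition hypotheses for the re-input voice.
hyps = [("meat", 0.65), ("meet", 0.70), ("mitt", 0.20)]
shown = candidates(hyps, min_likelihood=0.5)
print(shown)

# The user picks the first candidate from the list.
print(replace_selected(["I will", "meat", "you at", "noon"], 1, shown[0]))
```

Showing every hypothesis above the cutoff, not just the top one, is what allows a correct edit even when the best-scoring hypothesis is wrong.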

The seventh invention is dependent on any one of the second to fourth inventions, and further comprises character input means for inputting a character string, and character editing means for editing the character string selected by the cursor based on the character string input by the character input means.

In the seventh invention, the character input means (22e) is, for example, character input keys for inputting hiragana, and the character editing means (20a, S211, S215) converts the input hiragana into kanji, katakana, or the like and replaces the character string selected by the cursor with the result.

According to the seventh invention, in an environment unsuited to voice recognition, such as on a train or in noisy surroundings, the user can edit text using the character input keys. The user can also perform reliable editing operations with the character input keys.

The eighth invention is dependent on the first invention, and further comprises similarity search means for searching for a character string similar to a character string newly generated by the voice recognition means, and replacement means for replacing the character string found by the similarity search means with the newly generated character string.

In the eighth invention, the similarity search means (20a, S283, S285) uses the voice from which the new character string was recognized to search the specified character strings for one similar to the newly generated character string. The replacement means (20a, S293) then replaces that similar character string with the newly generated character string.

According to the eighth invention, the user can edit a character string simply by uttering the replacement text. The user is thus provided with an editing operation well suited to editing text by voice recognition.

The ninth invention is dependent on the second invention, and further comprises similarity search means for searching for a character string similar to a character string newly generated by the voice recognition means, wherein the cursor selects the character string found by the similarity search means.

In the ninth invention, the cursor selects a character string similar to the newly generated character string.

According to the ninth invention, the user can select the intended character string regardless of the recognition result for the re-input voice. As with the fourth invention, the user is thus provided with a cursor operation method well suited to creating documents by voice recognition.

The tenth invention is dependent on the ninth invention, and further comprises voice editing means for editing the character string selected by the cursor based on a character string newly generated by the voice recognition means.

According to the tenth invention, as with the fifth invention, the user is provided with an editing operation well suited to editing text by voice recognition.

The eleventh invention is dependent on the tenth invention, and further comprises list display means for displaying a list of candidate character strings generated by the voice recognition means, wherein, when a candidate displayed by the list display means is selected, the voice editing means performs the edit using the selected candidate as the newly generated character string.

According to the eleventh invention, as with the sixth invention, the voice recognition candidates are displayed as a list, so the user can edit correctly even when the recognition accuracy of the re-input voice is not high.

The twelfth invention is dependent on the ninth invention, and further comprises character input means for inputting a character string, and character editing means for editing the character string selected by the cursor based on the character string input by the character input means.

According to the twelfth invention, as with the seventh invention, in an environment unsuited to voice recognition, such as on a train or in noisy surroundings, the user can edit text using the character input keys.

The thirteenth invention is dependent on any one of the eighth to twelfth inventions, and further comprises voice dictionary recording means for recording, as a voice dictionary, the voice captured by the capturing means and the character string generated from that voice, wherein the similarity search means searches for a similar character string by calculating a correlation value between each voice recorded by the voice dictionary recording means and newly input voice.

In the thirteenth invention, the voice dictionary recorded by the voice dictionary recording means (20a, S251) contains the specified character strings and, for each, the corresponding voice or the feature pattern of its voice data. The similarity search means searches for a similar character string by calculating a correlation value between the feature patterns of the voice data in the voice dictionary and the feature pattern of the newly input voice.

According to the thirteenth invention, a correlation function can be used to search for similar character strings.
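One way to picture the correlation-based similarity search of the thirteenth invention is shown below. The text does not specify which correlation function is used, so this sketch assumes the Pearson correlation coefficient over fixed-length feature vectors; real utterances differ in length and would first need time alignment, which is omitted. All names and feature values are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector carries no shape information
    return cov / math.sqrt(var_a * var_b)


def most_similar(voice_dictionary, new_pattern):
    """voice_dictionary: list of (text, feature_pattern). Return (text, score) of the best match."""
    scored = ((text, correlation(pattern, new_pattern)) for text, pattern in voice_dictionary)
    return max(scored, key=lambda r: r[1])


# Hypothetical voice dictionary built from the low-reliability strings.
voice_dictionary = [
    ("meat", [1.0, 2.0, 3.0, 4.0]),
    ("noon", [4.0, 3.0, 2.0, 1.0]),
]
text, score = most_similar(voice_dictionary, [1.1, 2.0, 2.9, 4.2])
print(text, round(score, 3))
```

Because the comparison is made on the voice features and not on the recognized text, the search can still find the intended string when the re-input voice is itself misrecognized.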

The fourteenth invention is dependent on the first to thirteenth inventions, and further comprises display means for displaying at least some of a plurality of character strings, and scroll means for scrolling the display position of the character strings displayed by the display means.

In the fourteenth invention, when only some of the character strings generated by the voice recognition means are displayed on the display means (26), such as an LCD monitor, the scroll means (20a, S201, S237, S261) scrolls the display position so that character strings not currently displayed come into view.

According to the fourteenth invention, the user can look for the specified character strings by scrolling the displayed text.

The fifteenth invention is an editing guide program that causes a processor (20a) of a portable terminal (10), which has capturing means (16a, 16b) for capturing a voice signal and voice recognition means (20a, 20b, 30) for generating character strings from the voice signal captured by the capturing means, to function as: recording means (S147) for recording data indicating the character strings generated by the voice recognition means and their reliabilities; specifying means (S161) for specifying, with reference to the data, a character string whose reliability is at or below a predetermined value; and display means (26, S163, S165) for displaying the character string specified by the specifying means in a form different from that of other character strings.

With the fifteenth invention, as with the first invention, the user can efficiently create text using voice recognition.

The sixteenth invention is an editing device having capturing means (16a, 16b) for capturing a voice signal and voice recognition means (20a, 20b, 30) for generating character strings from the voice signal captured by the capturing means, the editing device comprising: recording means (20a, S147) for recording data (a reliability table) indicating the character strings generated by the voice recognition means and their reliabilities; specifying means (20a, S161) for specifying, with reference to the data, a character string whose reliability is at or below a predetermined value; and display means (20a, 26, S163, S165) for displaying the character string specified by the specifying means in a form different from that of other character strings.

With the sixteenth invention, as with the first invention, the user can efficiently create text using voice recognition.

According to the present invention, candidates for misrecognized character strings are displayed so that they can be identified at a glance, so the user can efficiently create text using voice recognition.

The above object, other objects, features, and advantages of the present invention will become more apparent from the following detailed description of embodiments, given with reference to the drawings.

FIG. 1 is a block diagram showing a portable terminal of the present invention.
FIG. 2 is an illustrative view showing the appearance of the portable terminal shown in FIG. 1.
FIG. 3 is an illustrative view showing a display example of a GUI displayed on the LCD monitor shown in FIG. 1.
FIG. 4 is an illustrative view showing another display example of the GUI displayed on the LCD monitor shown in FIG. 1.
FIG. 5 is an illustrative view showing an example of the reliability table stored in the RAM shown in FIG. 1.
FIG. 6 is an illustrative view showing a further display example of the GUI displayed on the LCD monitor shown in FIG. 1.
FIG. 7 is an illustrative view showing still another display example of the GUI displayed on the LCD monitor shown in FIG. 1.
FIG. 8 is an illustrative view showing an example of a memory map of the RAM shown in FIG. 1.
FIG. 9 is an illustrative view showing an example of the data storage area in the memory map shown in FIG. 8.
FIG. 10 is a flowchart showing the outgoing mail creation process of the CPU shown in FIG. 1.
FIG. 11 is a flowchart showing the submenu process of the CPU shown in FIG. 1.
FIG. 12 is a flowchart showing the reliability threshold setting process of the CPU shown in FIG. 1.
FIG. 13 is a flowchart showing the document editing process of the CPU shown in FIG. 1.
FIG. 14 is a flowchart showing the arbitrary cursor editing process of the CPU shown in FIG. 1.
FIG. 15 is a flowchart showing the voice recognition input process of the CPU shown in FIG. 1.
FIG. 16 is a flowchart showing the low-reliability part editing process of the CPU shown in FIG. 1.
FIG. 17 is a flowchart showing the cursor designation process of the CPU shown in FIG. 1.
FIG. 18 is a flowchart showing the voice designation process of the CPU shown in FIG. 1.
FIG. 19 is a flowchart showing the voice search process of the CPU shown in FIG. 1.
FIG. 20 is a flowchart showing the conversion part search process of the CPU shown in FIG. 1.
FIG. 21 is a flowchart showing the voice recognition input process of the CPU shown in FIG. 1 in another embodiment.
FIG. 22 is a flowchart showing the conversion part search process of the CPU shown in FIG. 1 in another embodiment.

Referring to FIG. 1, the portable terminal 10 includes a control unit 20 and a key input device 22, and the control unit 20 includes a CPU (sometimes called a processor or a computer) 20a and a DSP (Digital Signal Processor) 20b. When a call origination operation is performed on the key input device 22, the CPU 20a in the control unit 20 controls a wireless communication circuit 14, which supports the CDMA scheme, to output a call origination signal. The output signal is sent from an antenna 12 and transmitted to a mobile communication network that includes base stations. When the other party performs a response operation, a call-ready state is established.

When a call end operation is performed on the key input device 22 after the transition to the call-ready state, the CPU 20a controls the wireless communication circuit 14 to transmit a call end signal to the other party. After transmitting the call end signal, the CPU 20a ends the call process. The CPU 20a also ends the call process when a call end signal is received first from the other party, and likewise when a call end signal is received from the mobile communication network independently of the other party.

When the antenna 12 picks up a call origination signal from another party while the portable terminal 10 is running, the wireless communication circuit 14 notifies the CPU 20a of the incoming call. The CPU 20a controls the LCD monitor 26, which serves as the display means, through an LCD driver 24 and causes it to display the caller information described in the incoming call notification. The CPU 20a also outputs a ring tone from an incoming call notification speaker (not shown).

In the call-ready state, the following processing is executed. The modulated voice signal (high-frequency signal) sent from the other party is received by the antenna 12. The received signal is demodulated and decoded by the wireless communication circuit 14, and the resulting received voice signal is output from a speaker 18. Conversely, the transmitted voice signal captured by a first microphone 16a, which serves as capturing means, is encoded and modulated by the wireless communication circuit 14. The resulting modulated voice signal is transmitted to the other party through the antenna 12, as described above.

The portable terminal 10 also has a document editing function for inputting and deleting character strings, and supports character input by voice recognition. That is, when the user reads text aloud into the second microphone 16b, the recognized text is displayed on the LCD monitor 26. Specifically, the voice signal captured by the second microphone 16b is converted into voice data by the DSP 20b, and the CPU 20a and the DSP 20b extract a feature pattern (feature quantity) from the voice data. The CPU 20a and the DSP 20b also either read the reference voice data that makes up the voice dictionary for voice recognition from the ROM 32 and extract its feature patterns (hereinafter, "reference patterns"), or read the reference patterns directly from the ROM 32. The CPU 20a and the DSP 20b then compare the feature pattern with each reference pattern, using either a recognition method based on feature pattern matching or a recognition method based on statistical decision theory, to identify the reference voice data that matches the voice data. Because the voice dictionary associates each item of reference voice data with the character string it represents, the CPU 20a and the DSP 20b convert the voice data into a character string by reading out the character string associated with the identified reference voice data.

In the voice recognition method based on feature pattern matching, the CPU 20a and the DSP 20b calculate the likelihood between each reference pattern and the feature pattern by a technique such as the multi-template method, the NN (Nearest Neighbor) classification method, or the k-NN classification method, and identify the reference pattern with the highest likelihood.
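The nearest-neighbor variant of this matching step can be sketched as follows. This is only an illustration: the two-dimensional feature vectors, the `DICTIONARY` contents, and the mapping from Euclidean distance to a score in (0, 1] are assumptions made for the example, not details given in the specification.

```python
import math

def likelihood(feature, reference):
    """Assumed scoring: map Euclidean distance to (0, 1]; 1.0 means identical patterns."""
    return 1.0 / (1.0 + math.dist(feature, reference))

def recognize(feature, dictionary):
    """Return (character string, likelihood) of the best-matching reference pattern."""
    best_text, best_score = None, -1.0
    for text, reference in dictionary.items():
        score = likelihood(feature, reference)
        if score > best_score:
            best_text, best_score = text, score
    return best_text, best_score

# Hypothetical voice dictionary: character string -> reference pattern.
DICTIONARY = {"経済": (1.0, 0.0), "医術": (0.0, 1.0)}

text, score = recognize((0.9, 0.1), DICTIONARY)
```

The score returned alongside the string is the value that the embodiment later records as the recognition reliability.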

In the voice recognition method based on statistical decision theory, the likelihood between each item of reference voice data and the feature pattern is calculated using an HMM (Hidden Markov Model), which is widely used for voice recognition. As in the method based on feature pattern matching, the CPU 20a and the DSP 20b then identify the reference voice data with the highest likelihood.

In this embodiment, the likelihood described above is used as the reliability of recognition in voice recognition. In the portable terminal 10, the CPU 20a, the DSP 20b, and the ROM 32 function as voice recognition means.

The portable terminal 10 has a mail function and can send and receive mail through data communication with a mail server (not shown). During data communication, the antenna 12 and the wireless communication circuit 14 function as communication means, and the mail server and the like are connected to the network by wire or wirelessly.

FIG. 2 is an illustrative view showing the appearance of the portable terminal 10. Referring to FIG. 2, the portable terminal 10 has a case C formed in a plate shape. The antenna 12 is a telescopic antenna that can be extended and retracted, and is provided so as to protrude from the upper side surface of the case C. The antenna 12 may instead be an internal antenna built into the case C.

The first microphone 16a and the speaker 18, which are not shown in FIG. 2, are built into the case C. The opening op1 leading to the built-in first microphone 16a is provided on the main surface of the case C at one end in the length direction, and the opening op2 leading to the built-in speaker 18 is provided on the main surface of the case C at the other end in the length direction. The second microphone 16b, also not shown in FIG. 2, is likewise built into the case C. The opening op3 leading to the built-in second microphone 16b is provided on the main surface of the case C at the same end in the length direction as the opening op1, alongside the opening op1.

That is, the caller inputs the transmitted voice to the first microphone 16a through the opening op1 and listens to the received voice from the speaker 18 through the opening op2. The user inputs voice for voice recognition to the second microphone 16b through the opening op3. The first microphone 16a can also be used for voice recognition. In addition, the difference between the audio signals of the first microphone 16a and the second microphone 16b can be used to cancel ambient noise from distant sound sources, which contributes to an improved voice recognition rate.

The key input device 22 includes a first menu key 22a, a second menu key 22b, an enter key 22c, a direction key 22d (also referred to as operation means), a plurality of character input keys 22e (also referred to as character input means), and the like, and is provided on the main surface of the case C. The LCD monitor 26 is mounted so that the monitor screen is exposed on the main surface of the case C. The key input device 22 also includes a call key and an end-call key.

The first menu key 22a, the second menu key 22b, and the enter key 22c each correspond to a soft key displayed on the LCD monitor 26, and are used to operate those soft keys. The direction key 22d is used to perform up/down or left/right input operations on a GUI (Graphical User Interface) displayed on the LCD monitor 26. The enter key 22c is also used to confirm the result of a GUI operation made with the direction key 22d.

While the document editing function is running, the user can key in a character string using the plurality of character input keys 22e. For example, the character input keys 22e consist of character keys for the "あ" (a) row through the "わ" (wa) row, and the character (hiragana) to be input is specified by the number of times a key is operated. Specifically, the "あ"-row character key is used to input the characters "あ, い, う, え, お": pressing it once inputs "あ", and pressing it once more inputs "い". The user can also convert hiragana into kanji or katakana by using the direction key 22d and the enter key 22c.
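The press-count selection described above can be illustrated with a short sketch. The `KANA_ROWS` table and the `multitap` helper are hypothetical names, only two rows are listed, and the wrap-around after the fifth press is an assumption that the text does not state.

```python
# Hypothetical key table: row key -> hiragana reachable by repeated presses.
KANA_ROWS = {
    "a": "あいうえお",
    "ka": "かきくけこ",
}

def multitap(row_key, presses):
    """Return the hiragana selected by pressing one row key `presses` times."""
    row = KANA_ROWS[row_key]
    # Assumption: the selection wraps back to the first character of the row.
    return row[(presses - 1) % len(row)]

first = multitap("a", 1)   # one press of the "あ"-row key
second = multitap("a", 2)  # a second press of the same key
```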

By operating the GUI of the document editing function, the user can switch the characters assigned to the character input keys to alphanumeric characters or symbols and input them. The character input keys 22e can also be used as the numeric keys 0 to 9; for example, the "あ"-row character key corresponds to the number "1" and the "か"-row character key to the number "2". That is, when making a call, the user can enter the telephone number using the character input keys 22e and place the call with the call key.

The document editing function, which allows character strings to be input by voice recognition, will now be described in detail with reference to the images, GUI elements, and the like displayed on the LCD monitor 26.

Referring to FIG. 3(A), a status display area 50, a function display area 52, and a key display area 54 are set on the LCD monitor 26. The status display area 50 is set at the top of the LCD monitor 26 and displays the radio wave reception state of the antenna 12, the remaining capacity of the rechargeable battery, the current date and time, and the like. The content of the function display area 52 changes according to the function being executed; here, the body editing screen for an outgoing mail is displayed, together with a cursor CUa indicating the current character input position.

The key display area 54 displays a plurality of soft keys, and its display state changes according to the function being executed; for functions that do not need soft keys, the key display area 54 is not displayed. For example, when the document editing function is running in order to edit the body of an outgoing mail, a normal input key 56a, a completion key 56b, and a submenu key 56c are displayed. The normal input key 56a corresponds to the first menu key 22a, the completion key 56b to the enter key 22c, and the submenu key 56c to the second menu key 22b, so the user can operate each soft key by pressing the first menu key 22a, the enter key 22c, or the second menu key 22b, respectively.

In the other display examples (drawings) as well, the soft key displayed on the left corresponds to the first menu key 22a, the soft key displayed in the center corresponds to the enter key 22c, and the soft key displayed on the right corresponds to the second menu key 22b.

First, operating the normal input key 56a switches the character string input mode. The current input mode is shown in the mode display 58; in the state of FIG. 3(A), it is set to "voice input mode". When the normal input key 56a is operated in this state, the mode switches to "normal input mode", in which character strings are input using the character input keys 22e, and a voice input key 56d is displayed in place of the normal input key 56a, as shown in FIG. 3(B). The normal input mode is a mode in which characters are input by key input on the character input keys 22e, and the voice input mode is a mode in which character strings are input by voice recognition. By operating the normal input key 56a and the voice input key 56d, the user can switch freely between the input modes. The voice input mode also switches to the normal input mode when no input is detected for a predetermined time (2 seconds) in the voice input mode.
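The mode switching described above amounts to a small two-state machine. The sketch below is one possible reading: the class and method names are hypothetical, the initial mode follows the state of FIG. 3(A), and treating exactly 2 seconds of silence as a timeout is an assumption.

```python
VOICE = "voice input mode"
NORMAL = "normal input mode"
NO_INPUT_TIMEOUT_S = 2.0  # predetermined no-input time from the text

class InputMode:
    def __init__(self):
        self.mode = VOICE  # assumed starting state, as in FIG. 3(A)

    def soft_key_label(self):
        """The left soft key always offers the mode that is NOT active."""
        return "normal input" if self.mode == VOICE else "voice input"

    def press_left_soft_key(self):
        """Normal input key 56a / voice input key 56d toggles the mode."""
        self.mode = NORMAL if self.mode == VOICE else VOICE

    def on_no_input(self, seconds):
        """Fall back to key input after a period without voice input."""
        if self.mode == VOICE and seconds >= NO_INPUT_TIMEOUT_S:
            self.mode = NORMAL

state = InputMode()
state.on_no_input(2.0)       # silence in voice input mode
state.press_left_soft_key()  # user returns to voice input
```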

Next, when the completion key 56b is operated, body editing of the outgoing mail by the document editing function ends, and a GUI for entering the destination and subject of the outgoing mail is displayed. When the submenu key 56c is operated, a submenu for creating the outgoing mail or changing the initial settings of the document editing function is displayed.

When voice representing a sentence is input to the second microphone 16b while the voice input mode is set, the recognized sentence is displayed in the function display area 52, as shown in FIG. 3(B). Character strings with low reliability (hereinafter referred to as low-reliability character strings or low-reliability portions) are displayed with a blue background. That is, because a low-reliability character string is likely to be a misrecognized character string (hereinafter referred to as a misrecognized character string), it is displayed in a form different from the other character strings in order to guide the user toward editing it.

Furthermore, to let the user choose the means of editing, the portable terminal 10 displays a window Wa over the function display area 52, as shown in FIG. 3(C). The window Wa contains an edit menu for selecting one of four modes: "1. Cursor designation mode", "2. Voice designation mode", "3. Voice search mode", and "4. Normal input mode". Each edit menu item is selected by operating the corresponding numeric key, using the character input keys 22e.

For example, when the "あ"-row character key corresponding to "1" is pressed, one of the low-reliability character strings is displayed in a state designated by an edit cursor CUb, as shown in FIG. 3(D), and the mode display 58 shows "cursor designation mode". In FIG. 3(D), an edit key 56e is displayed as the center soft key, and the edit cursor CUb has selected "経済" ("economy"), which is a low-reliability character string. Because the edit cursor CUb can select only low-reliability character strings, the operability of editing is improved.

The edit key 56e is described later, so a detailed description is omitted here for brevity.

An outline of each of the four modes follows. First, the cursor designation mode is a mode in which each low-reliability character string is designated (selected) with the edit cursor CUb (see FIG. 3(D)), which is operated with the direction key 22d, and the designated character string is then edited. For example, when a rightward input is made on the direction key 22d in the state of FIG. 3(D), "医術" ("medical art") is selected instead of "経済" ("economy"). When a further rightward input is made, "いたない" (itanai) is selected instead of "医術". When a leftward input is then made while "いたない" is selected, "医術" is selected again. Up/down operations may be accepted in place of left/right operations, with the edit cursor CUb moving so that down corresponds to right and up corresponds to left.
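The cursor movement described above can be sketched as an index that steps only over the low-reliability character strings. The `EditCursor` class is a hypothetical name, and stopping at the first and last strings (rather than wrapping around) is an assumption, since the text does not describe the behavior at the ends.

```python
class EditCursor:
    """Edit cursor CUb restricted to low-reliability character strings."""

    STEPS = {"right": 1, "down": 1, "left": -1, "up": -1}

    def __init__(self, low_reliability_strings):
        self.targets = list(low_reliability_strings)
        self.pos = 0

    def selected(self):
        return self.targets[self.pos]

    def move(self, direction):
        """Step to the neighbouring low-reliability string and return it."""
        new_pos = self.pos + self.STEPS[direction]
        # Assumption: the cursor stops at either end instead of wrapping.
        self.pos = min(max(new_pos, 0), len(self.targets) - 1)
        return self.selected()

cursor = EditCursor(["経済", "医術", "いたない"])
sequence = [cursor.selected(), cursor.move("right"),
            cursor.move("right"), cursor.move("left")]
```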

Because the edit cursor CUb can be operated with the direction key 22d in this way, the user can perform highly reliable cursor operations.

Next, in the voice designation mode, when a low-reliability character string is a misrecognized character string and the voice representing the intended character string is input again, the character string that resembles the one represented by the re-input voice is selected. Designating this similar character string is made possible by dividing the initially input audio data into morphemes and storing it. Specifically, from the divided audio data, a low-reliability voice dictionary is created from each item of audio data corresponding to a low-reliability character string and the corresponding low-reliability character string. The CPU 20a then selects a low-reliability character string by identifying, with the voice recognition method based on feature pattern matching, the reference voice data that has the highest likelihood with respect to the re-input audio data. The selected character string is replaced with the character string obtained by recognizing the re-input voice. That is, the selected misrecognized character string is replaced with the newly recognized character string. In this way, the user can correct a misrecognized character string simply by speaking the corrected character string, which is an editing operation well suited to editing text by voice recognition.
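A rough sketch of the voice designation step follows. It assumes that each low-reliability morpheme keeps a feature vector of its original audio (standing in for the low-reliability voice dictionary) and that similarity is measured by Euclidean distance; the function name, the vectors, and the replacement string "技術" are hypothetical.

```python
import math

def replace_by_voice(morphemes, low_features, reinput_feature, reinput_text):
    """Replace the low-reliability morpheme whose stored audio is closest
    to the re-input audio with the newly recognized text.

    low_features maps a morpheme index to the stored feature vector of the
    audio that originally produced that low-reliability string.
    """
    target = min(
        low_features,
        key=lambda i: math.dist(reinput_feature, low_features[i]),
    )
    edited = list(morphemes)
    edited[target] = reinput_text
    return edited

morphemes = ["経済", "の", "医術"]
low_features = {0: (1.0, 0.0), 2: (0.0, 1.0)}  # hypothetical stored features

# The user repeats the word that was misrecognized as "医術".
edited = replace_by_voice(morphemes, low_features, (0.1, 0.9), "技術")
```

Keying the stored features by morpheme index rather than by string keeps two identical low-reliability strings distinguishable, which is a design choice of this sketch.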

In the voice search mode, as in the voice designation mode, when a low-reliability character string is a misrecognized character string, the user re-inputs the voice representing the misrecognized character string, and the misrecognized character string corresponding to the re-input voice is searched for among the misrecognized character strings. The search result is indicated by the edit cursor CUb. The misrecognized character string may be located using the voice recognition method based on feature pattern matching, as in the voice designation mode, or by designating the character string that matches the recognition result of the re-input voice. In this way, the user can operate the edit cursor CUb simply by speaking the erroneous character string, which is a cursor operation well suited to creating documents by voice recognition.

The normal input mode shown in FIG. 3(C) is the normal input mode described earlier: for a character string (sentence) input by voice recognition, the user freely chooses the editing position by moving the cursor CUa with the direction key 22d and inputs characters with the character input keys 22e.

Next, the operation of editing a low-reliability character string will be described. Referring to FIG. 4(A), the mode display 58 shows "cursor designation mode", and the low-reliability character string "多少" ("somewhat") is designated (selected) by the edit cursor CUb. In the key display area 54, the normal input key 56a is displayed on the left, the edit key 56e in the center, and an end key 56f on the right. When the edit key 56e is operated, the terminal accepts character input for the portion under the edit cursor CUb, either by voice recognition or by the character input keys 22e or the like.

For example, when the character string "箇所" ("place") is input with the character input keys 22e, the designated low-reliability character string "多少" is replaced with "箇所", as shown in FIG. 4(B). That is, in an environment unsuited to voice recognition, such as on a train or in noisy surroundings, the user can edit the text using the character input keys 22e. The user can also use the character input keys 22e to perform a highly reliable editing operation.

If the character input keys 22e are not pressed, a character string can be input by voice recognition using the second microphone 16b; when voice representing "箇所" is input, the designated low-reliability character string "多少" is replaced with "箇所", just as with key input. That is, the user can easily edit using voice recognition.

When the confirm key 56g is operated after editing, the re-input character string is displayed with the same background color as the other character strings and with an underline added. A character string underlined in this way is referred to as a confirmed character string. A confirmed character string becomes editable again if it is selected with the edit cursor CUb and the edit key 56e is operated once more.

A low-reliability character string that is not actually an input error can be made a confirmed character string by selecting it with the edit cursor CUb and then operating the edit key 56e and the confirm key 56g in succession. Furthermore, when the end key 56f is selected while low-reliability character strings are displayed, the current mode (here, the cursor designation mode) ends and another mode can be selected. For example, when the end key 56f is operated on any of the screens of FIG. 3(D), the window Wa shown in FIG. 3(C) is displayed.

When all low-reliability character strings have been replaced with confirmed character strings, a window Wb is displayed, as shown in FIG. 4(C). The window Wb displays the message "Perform normal input?" together with "1. YES" and "2. NO".

For example, when the numeric key "1" is operated while the window Wb is displayed, the underlines added to the confirmed character strings are removed, so that they are displayed like the other character strings. Characters can then be input with the character input keys 22e at the display position of the cursor CUa. When the numeric key "2" is operated, body editing of the outgoing mail ends, and the display transitions to a screen with a GUI for entering the destination and subject of the outgoing mail.

All low-reliability character strings may be made confirmed character strings at once by a long press of the enter key 22c, which corresponds to the edit key 56e or the confirm key 56g. Although the cursor designation mode has been described, the voice search mode differs only in how the edit cursor CUb is moved; the editing operation on a low-reliability character string is the same.

Next, the reliability table used to display low-reliability character strings in a form different from the other character strings will be described. Referring to FIG. 5, the reliability table consists of a character string column, which records each recognized character string, and a reliability column, which records the reliability corresponding to each character string. For example, the character string column stores the sentence uttered by the user divided into morpheme units, with entries such as "経済" ("economy"), "の" (a particle), and "医術" ("medical art"). The reliability column records the reliability as a percentage, based on the result of voice recognition. That is, if the reliability of recognizing "経済" is 50%, "50%" is recorded in the row for "経済". Likewise, if the reliability of "の" is 80%, "80%" is recorded in the row for "の", and if the reliability of "医術" is 40%, "40%" is recorded in the row for "医術".

Character strings whose value in the reliability column of the reliability table is 60% or less are displayed as low-reliability character strings. That is, because the reliabilities corresponding to "経済" and "医術" are 60% or less, "経済" and "医術" are displayed with a blue background, as shown in FIG. 3(B) and elsewhere.
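The table lookup described above can be sketched with the three example rows from the text and the 60% threshold. Representing the table as a list of (character string, reliability) pairs and the function name `low_reliability_strings` are assumptions of this sketch.

```python
# Rows of the reliability table: (character string, reliability in percent).
RELIABILITY_TABLE = [("経済", 50), ("の", 80), ("医術", 40)]

def low_reliability_strings(table, threshold=60):
    """Return the strings whose reliability is at or below the threshold,
    i.e. the strings to be shown with a blue background."""
    return [text for text, reliability in table if reliability <= threshold]

highlighted = low_reliability_strings(RELIABILITY_TABLE)
```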

Next, another embodiment for editing a low-reliability character string by voice input will be described. Referring to FIG. 6, when the low-reliability character string "いたない" (itanai) is designated by the edit cursor CUb and the user inputs new voice, the candidate character strings generated by voice recognition are displayed as a list in a pull-down PD. The list in the pull-down PD is ordered from the top by descending likelihood (reliability), so the character string with the highest likelihood appears at the top. Here, the character strings "満たない" ("does not reach") and "汚い" ("dirty") are displayed in the pull-down PD as recognition candidates. When the number corresponding to a displayed character string is selected, the designated low-reliability character string is replaced with the selected character string. For example, when the numeric key "1" is operated, "いたない" is replaced with "満たない". The selection may also be made by moving a cursor with the direction key 22d and operating the confirm key 56g.

By listing the voice recognition candidates in this way, the user can edit correctly even when the recognition accuracy of the re-input voice is not high.

Editing a low-reliability character string with the pull-down PD is not limited to the cursor designation mode; it can also be performed in the voice search mode and the voice designation mode. Character strings whose reliability is at or below a fixed value (for example, 40%) are not displayed in the pull-down PD.
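Building the pull-down list can be sketched as a filter followed by a sort. The candidate scores below are invented for illustration (the text gives only the two displayed candidates, not their likelihoods), and the third candidate "至らない" is a hypothetical low-scoring entry added to show the cutoff.

```python
def pulldown_entries(candidates, floor=40):
    """Return [(number key, character string)] for the pull-down PD.

    Candidates at or below `floor` percent are dropped; the rest are
    numbered from 1 in descending order of likelihood.
    """
    shown = [c for c in candidates if c[1] > floor]
    shown.sort(key=lambda c: c[1], reverse=True)
    return [(number, text) for number, (text, _) in enumerate(shown, start=1)]

# Hypothetical recognition results for the re-input voice.
candidates = [("汚い", 55), ("満たない", 70), ("至らない", 30)]
entries = pulldown_entries(candidates)

# Pressing numeric key "1" selects the top entry as the replacement.
replacement = dict(entries)[1]
```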

Next, the case where the submenu key 56c shown in FIG. 3(A) to FIG. 3(C) is operated will be described. Referring to FIG. 7(A), a window Wc is displayed in the function display area 52, and three submenu items are displayed in the window Wc: "1. Save as new", "2. Review edited content", and "3. Reliability threshold". As with the other menus, each item can be selected by operating a numeric key or the like.

For example, when the "Save as new" item is selected, processing to save (store) the outgoing mail data in the RAM 30 is executed. When the "Review edited content" item is selected, processing to display a screen for checking the destination, subject, body, and so on of the created outgoing mail together is executed.

When the "Reliability threshold" item is selected, a GUI for changing the threshold below which a character string is judged to be low-reliability is displayed. That is, when the numeric key "3" is operated, a window Wd shown in FIG. 7(B) is displayed, containing the threshold menu items "1. High", "2. Normal", and "3. Low". The user can change the threshold by selecting any of these items. For example, selecting "High" sets the threshold to 70%, selecting "Normal" sets it to 60%, and selecting "Low" sets it to 50%. When the return key 56h in FIGS. 7(A) and 7(B) is operated, the submenu processing ends and the display returns to the screen shown in FIG. 3(A) or the like.
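The effect of the threshold menu can be sketched by mapping each menu key to its percentage and re-evaluating which strings count as low-reliability. The names `THRESHOLD_MENU` and `flagged` are hypothetical, and the fourth table row ("学会", 65%) is an invented entry added only to show a string whose status changes with the setting.

```python
# Threshold menu of window Wd: numeric key -> (label, threshold in percent).
THRESHOLD_MENU = {"1": ("High", 70), "2": ("Normal", 60), "3": ("Low", 50)}

def flagged(table, menu_key):
    """Strings treated as low-reliability under the selected menu item."""
    _, threshold = THRESHOLD_MENU[menu_key]
    return [text for text, reliability in table if reliability <= threshold]

# First three rows follow the reliability table example; the last is invented.
TABLE = [("経済", 50), ("の", 80), ("医術", 40), ("学会", 65)]

with_high = flagged(TABLE, "1")
with_normal = flagged(TABLE, "2")
```

A higher threshold flags more strings for review, while a lower one flags fewer.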

The reliability threshold is not limited to three levels; it may have two levels, or four or more. The reliability threshold may also be specified as an arbitrary numerical value. The submenu processing can be executed not only when composing the body of an outgoing mail but also when entering the destination or subject.

FIG. 8 is an illustrative view showing the memory map of the RAM 30. Referring to FIG. 8, the memory map 300 of the RAM 30 includes a program storage area 302 and a data storage area 304. Part of the programs and data is read from the flash memory 28, either all at once or partially and sequentially as needed, stored in the RAM 30, and then processed by the CPU 20 or the like.

The program storage area 302 stores programs for operating the mobile terminal 10. These programs include a mail function program 310 and a document editing program 312. The mail function program 310 is a program for creating outgoing mails and replies and for displaying received mails, and in turn includes an outgoing mail creation program 310a and a submenu program 310b. The outgoing mail creation program 310a is a program for creating (entering) the destination, title, and body of an outgoing mail, and the submenu program 310b is a program for saving the outgoing mail.

The document editing program 312 is executed when the body of an outgoing mail or the like is edited. It consists of a reliability threshold setting program 312a, an arbitrary cursor editing program 312b, a voice recognition input program 312c, a low-reliability part editing program 312d, a cursor designation program 312e, a voice designation program 312f, a voice search program 312g, and a conversion part search program 312h.

The reliability threshold setting program 312a is a program that lets the user set the reliability threshold as desired. The arbitrary cursor editing program 312b is a program for editing text, that is, entering a character string, by key input or voice input at the position determined by the cursor CUa. The voice recognition input program 312c is a program for entering a character string by voice recognition, and is executed in the voice input mode and the like.

The low-reliability part editing program 312d is a program for editing a low-reliability character string by key input, voice input, or the like. The cursor designation program 312e is a program for selecting a low-reliability character string with the edit cursor CUb and editing it. The voice designation program 312f is a program for editing the character string of the part that correlates highly with newly input voice. The voice search program 312g is a program for editing a low-reliability character string by operating the edit cursor CUb with newly input voice. The conversion part search program 312h is a subroutine of the voice designation program 312f and the voice search program 312g, and searches for a low-reliability character string based on newly input voice.

Although not shown, the programs for operating the mobile terminal 10 also include a program for making telephone calls, a program for acquiring mail data through a network, and the like.

Next, referring to FIG. 9, the data storage area 304 is provided with a voice recognition buffer 330 and an input character buffer 332. It also stores setting threshold data 334, reliability table data 336, low-reliability voice dictionary data 338, and mail data 340, and is provided with a cursor designation flag 342, a voice designation flag 344, and a voice search flag 346.

The voice recognition buffer 330 is a buffer used when the voice recognition process is executed; for example, it temporarily stores the voice data for voice recognition converted by the DSP 20b. The input character buffer 332 is a buffer that temporarily stores the character string being edited (created) by execution of the document editing program 312. The character string is displayed on the LCD monitor 26 using the data stored in the input character buffer 332. The setting threshold data 334 is the threshold data determined by the reliability threshold setting program 312a, and consists of a numeric string representing, for example, “60%” or “70%”. The reliability table data 336 is the data of the reliability table shown in FIG. 5.

The low-reliability voice dictionary data 338 consists of, out of the voice data input by the user, the voice data corresponding to low-reliability character strings together with those low-reliability character strings. It is used to search for a character string similar to the character string represented by voice input in the voice designation mode, the voice search mode, and the like. The mail data 340 consists of the body (character string) data of the outgoing mail and data such as received mails, sent mails, and unsent mails.

The cursor designation flag 342 is a flag for determining whether or not the terminal is in the cursor designation mode. For example, the cursor designation flag 342 is a 1-bit register. When the cursor designation flag 342 is turned on (established), the data value “1” is set in the register; when it is turned off (not established), the data value “0” is set. The voice designation flag 344 is a flag for determining whether or not the terminal is in the voice designation mode, and the voice search flag 346 is a flag for determining whether or not the terminal is in the voice search mode. The voice designation flag 344 and the voice search flag 346 have the same configuration as the cursor designation flag 342, so a detailed description of their configuration is omitted.

Although not shown, the data storage area 304 also stores data for displaying the images, character strings, and the like shown in the status display area 50, and is provided with other counters and flags necessary for the operation of the mobile terminal 10.

Under the control of an RTOS (real-time operating system) such as “Linux” or “REX”, the CPU 20a executes a plurality of tasks in parallel, including the outgoing mail creation process shown in FIG. 10, the submenu process shown in FIG. 11, the reliability threshold setting process shown in FIG. 12, the document editing process shown in FIG. 13, the arbitrary cursor editing process shown in FIG. 14, the voice recognition input process shown in FIG. 15, the low-reliability part editing process shown in FIG. 16, the cursor designation process shown in FIG. 17, the voice designation process shown in FIG. 18, the voice search process shown in FIG. 19, and the conversion part search process shown in FIG. 20.

FIG. 10 is a flowchart showing the outgoing mail creation process. For example, when the user performs an operation to create an outgoing mail, the CPU 20a determines in step S1 whether an end operation has been performed, that is, whether the operation is one for ending creation of the outgoing mail. If “YES” in step S1, the outgoing mail creation process ends and processing returns to the mail function process, which is the higher-level process. If “NO” in step S1, it is determined in step S3 whether a send operation has been performed, that is, whether the operation is one for sending the outgoing mail to the network. If “YES” in step S3, the send process is executed in step S5 and the outgoing mail creation process ends. That is, in step S5 the data of the outgoing mail is sent to the network.

If “NO” in step S3, that is, if the operation is not a send operation, it is determined in step S7 whether the submenu has been requested, that is, whether the submenu key 56c displayed in the key display area 54 has been operated. If “YES” in step S7, that is, if the submenu key 56c has been operated, the submenu process is executed in step S9 and processing returns to step S1. The submenu process executed in step S9 is described later, so a detailed description is omitted here. If “NO” in step S7, that is, if the submenu key 56c has not been operated, it is determined in step S11 whether the title is to be edited, that is, whether the operation is one for editing the title of the outgoing mail.

If “YES” in step S11, that is, if the operation is one for editing the title, the document editing process is executed in step S13, the title is set in step S15, and processing returns to step S1. If “NO” in step S11, that is, if the operation is not one for editing the title, it is determined in step S17 whether the body is to be edited, that is, whether the operation is one for editing the body of the outgoing mail. If “YES” in step S17, the document editing process is executed in step S19 in the same way as in step S13, the body is set in step S21, and processing returns to step S1. If “NO” in step S17, that is, if the operation is not one for editing the body, it is determined in step S23 whether a destination is to be set. The document editing process executed in step S13 or step S19 is described in detail with reference to the flowchart of FIG. 13.

If “YES” in step S23, that is, if the operation is one for setting a destination, the destination setting process is executed in step S25 and processing returns to step S1. If “NO” in step S23, that is, if the operation is not one for setting a destination, it is determined in step S27 whether data is to be attached, that is, whether an operation for attaching data to the outgoing mail has been performed. If “YES” in step S27, that is, if the operation is one for attaching data, the data attachment process is executed in step S29 and processing returns to step S1. If “NO” in step S27, that is, if the operation is not one for attaching data, processing returns to step S1.
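The branching of FIG. 10 can be summarized as a dispatch loop. The following is a minimal sketch under stated assumptions: operations are modeled as plain strings, the function name `create_outgoing_mail` and the process names it returns are hypothetical, and the actual key handling of the terminal is not reproduced.

```python
def create_outgoing_mail(operations):
    """Walk through user operations in the order tested in FIG. 10.

    Returns the names of the processes that would be executed.
    Each loop iteration corresponds to returning to step S1.
    """
    executed = []
    for op in operations:
        if op == "end":                 # S1: end operation, leave the process
            break
        if op == "send":                # S3 -> S5: send, then leave the process
            executed.append("send")
            break
        if op == "submenu":             # S7 -> S9
            executed.append("submenu")
        elif op == "title":             # S11 -> S13, S15
            executed += ["edit_document", "set_title"]
        elif op == "body":              # S17 -> S19, S21
            executed += ["edit_document", "set_body"]
        elif op == "destination":       # S23 -> S25
            executed.append("set_destination")
        elif op == "attach":            # S27 -> S29
            executed.append("attach_data")
        # Any other operation falls through and the loop returns to S1.
    return executed
```

Note that both the title branch and the body branch invoke the same document editing process, which matches the statement that step S19 is executed in the same way as step S13.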

FIG. 11 is a flowchart showing the submenu process executed in step S9 (see FIG. 10). When the process of step S9 is executed, the CPU 20a displays the window Wc on the LCD monitor 26 as shown in FIG. 7A and determines in step S41 whether a return operation has been performed, that is, whether the return key 56h has been operated. If “YES” in step S41, that is, if the return key 56h has been operated, the submenu process ends and processing returns to the outgoing mail creation process. If “NO” in step S41, that is, if the return key 56h has not been operated, it is determined in step S43 whether a new-save operation has been performed, for example whether the numeric key “1” has been operated. If “YES” in step S43, the process of saving the mail being created is executed in step S45, and processing then returns to step S41. That is, in step S45 the outgoing mail is saved (stored) in the RAM 30 as an unsent mail.

If “NO” in step S43, that is, if the operation is not a new-save operation, it is determined in step S47 whether an operation for confirming the edited content has been performed, that is, whether the numeric key “2” has been operated.

If “YES” in step S47, the confirmation display process for the outgoing mail is executed in step S49 and processing returns to step S41. That is, step S49 executes a confirmation display in which the destination, title, and body of the outgoing mail can all be checked on the same screen. If “NO” in step S47, it is determined in step S51 whether an operation for setting the reliability threshold has been performed, that is, whether the numeric key “3” has been operated. If “YES” in step S51, the reliability threshold setting process is executed in step S53 and processing returns to step S41. The process of step S53 is described later, so a detailed description is omitted here. If “NO” in step S51, processing returns directly to step S41.

The submenu process may be executed in parallel with the process of creating the body of the outgoing mail, and may be made executable whenever the submenu key 56c is displayed.

FIG. 12 is a flowchart showing the reliability threshold setting process executed in step S53 (see FIG. 11). In step S71, the CPU 20a displays the reliability setting screen; for example, it displays the window Wd as shown in FIG. 7B. Subsequently, in step S73, it is determined whether a return operation has been performed, that is, whether the return key 56h has been operated. If “YES” in step S73, that is, if the return key 56h has been operated, the reliability threshold setting process ends and processing returns to the submenu process. If “NO” in step S73, that is, if the return key 56h has not been operated, it is determined in step S75 whether an operation for changing the reliability has been performed, for example whether any one of the numeric keys “1” to “3” has been operated. If “NO” in step S75, that is, if no change operation has been performed, processing returns to step S73. If a change operation has been performed in step S75, the reliability is set in step S77 according to that operation, and the reliability threshold setting process ends. For example, the reliability threshold is set to 70% (high) if the numeric key “1” is operated, to 60% (normal) if the numeric key “2” is operated, and to 50% (low) if the numeric key “3” is operated. The data indicating the set reliability threshold is stored in the RAM 30 as the setting threshold data 334.

FIG. 13 is a flowchart showing the document editing process executed in step S13 or step S19 (see FIG. 10). When the process of step S13 or step S19 is executed, the CPU 20a determines in step S91 whether a completion operation has been performed, that is, whether the completion key 56b shown in FIG. 3A and the like has been operated. If “YES” in step S91, that is, if the completion key 56b has been operated, the document editing process ends and processing returns to the outgoing mail creation process. If “NO” in step S91, that is, if the completion key 56b has not been operated, it is determined in step S93 whether there is a low-reliability part. That is, the reliability table data 336 is referenced to determine whether a reliability equal to or lower than the threshold indicated by the setting threshold data 334 is recorded.
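The check for a low-reliability part amounts to a single comparison over the reliability table. The sketch below assumes the table is a list of (character string, reliability in percent) pairs; that layout, the function name `has_low_reliability_part`, and the sample values are illustrative assumptions, not the format of FIG. 5.

```python
def has_low_reliability_part(reliability_table, threshold):
    """Return True if any entry has a reliability at or below the threshold.

    reliability_table: list of (text, reliability_percent) pairs (assumed layout).
    threshold: the value held as setting threshold data, in percent.
    """
    return any(reliability <= threshold for _text, reliability in reliability_table)


# Illustrative values only.
sample_table = [("今日の", 90), ("経済", 55), ("は", 80)]
```

With the sample table, a threshold of 60% reports a low-reliability part, while a threshold of 50% does not, because the comparison is "equal to or lower than" the threshold.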

If “NO” in step S93, that is, if there is no low-reliability part, the arbitrary cursor editing process is executed in step S95 and processing returns to step S91. If “YES” in step S93, that is, if there is a low-reliability part, the low-reliability part editing process is executed in step S97. The processes of steps S95 and S97 are described later, so a detailed description is omitted here.

Subsequently, in step S99, it is determined whether a mode reselection operation has been performed, that is, whether the end key 56f (see FIG. 4A) has been operated while a low-reliability character string is displayed. If “YES” in step S99, that is, if a mode reselection operation has been performed, processing returns to step S97. If “NO” in step S99, that is, if no mode reselection operation has been performed, it is determined in step S101 whether arbitrary cursor editing is to be performed. This is determined from the result of the operation selecting “YES” or “NO” in the window Wb shown in FIG. 4C. If “YES” in step S101, that is, if “YES” is selected, processing proceeds to step S95. If “NO” in step S101, that is, if “NO” is selected, the document editing process ends and processing returns to the outgoing mail creation process.

FIG. 14 is a flowchart showing the arbitrary cursor editing process executed in step S95 (see FIG. 13). When the process of step S95 is executed, the CPU 20a determines in step S111 whether the input is in a confirmed state, that is, whether the unconverted hiragana has been confirmed. If “YES” in step S111, that is, if the hiragana has been converted and confirmed, the arbitrary cursor editing process ends and processing returns to the document editing process. If “NO” in step S111, that is, if the unconfirmed character string has not been confirmed, the character string is displayed in step S113. That is, the character string stored in the input character buffer 332 is read out and displayed in the function display area 52. If no character string is stored in the input character buffer 332, only the cursor CUa is displayed.

Subsequently, in step S115, it is determined whether a voice recognition operation has been performed, that is, whether the voice input key 56d has been operated in the normal input mode. If “YES” in step S115, that is, if the voice input key 56d has been operated, the voice recognition input process is executed in step S117 and processing returns to step S111. The process of step S117 is described later, so a detailed description is omitted here.

If “NO” in step S115, that is, if the voice input key 56d has not been operated, it is determined in step S119 whether a direction key operation has been performed, that is, whether the direction key 22d has been operated to move the cursor CUa. While only the cursor CUa is displayed in the function display area 52, the display position of the cursor CUa does not change. If “YES” in step S119, that is, if the direction key 22d has been operated, the cursor movement process is executed in step S121 and processing returns to step S111. If “NO” in step S119, that is, if the direction key 22d has not been operated, it is determined in step S123 whether a character input operation has been performed, that is, whether any one of the plurality of character input keys 22e has been operated.

If “YES” in step S123, that is, if a character input operation has been performed, the character input process is executed in step S125 and processing returns to step S111. That is, in step S125, hiragana is displayed according to the pressed character key, and the data of the displayed hiragana is stored in the input character buffer 332. If “NO” in step S123, that is, if no character input operation has been performed, it is determined in step S127 whether a conversion operation has been performed, that is, whether an operation for converting unconfirmed hiragana has been performed. If “YES” in step S127, that is, if a conversion operation has been performed, the character conversion process is executed in step S129. If “NO” in step S127, that is, if no conversion operation has been performed, processing returns to step S111.

FIG. 15 is a flowchart showing the voice recognition input process executed in step S117 (see FIG. 14), in step S207 described later (see FIG. 17), or in step S269 described later (see FIG. 19). When any of step S117, step S207, or step S263 is executed, the CPU 20a determines in step S141 whether voice has been input, that is, whether voice has been input to the second microphone 16b. If “NO” in step S141, that is, if no voice has been input to the second microphone 16b, the process of step S141 is repeated. If “YES” in step S141, that is, if voice has been input to the second microphone 16b, the input voice is converted into voice data in step S143. That is, the voice input to the second microphone 16b is converted into voice data by the DSP 20b, and the voice data is stored in the voice recognition buffer 330.

Subsequently, in step S145, the character string corresponding to the voice data is extracted from the voice dictionary. That is, the voice data is converted into a character string by identifying, in the voice dictionary stored in the ROM 32, the reference voice data that corresponds to the voice data stored in the voice recognition buffer 330. The reference voice data corresponding to the voice data is identified using either the voice recognition method based on feature pattern matching or the voice recognition method based on statistical decision theory, both described earlier.

Subsequently, in step S147, a reliability table corresponding to each of the extracted character strings is created. That is, each character string converted in morpheme units and the likelihood corresponding to each character string are stored in the RAM 30 as the reliability table data 336. The CPU 20a executing the process of step S147 functions as recording means.
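The recording step can be pictured as pairing each morpheme-level character string with its likelihood. The following is a minimal sketch, assuming the recognizer already yields (character string, reliability in percent) pairs; the class `ReliabilityEntry`, the function `build_reliability_table`, and the sample values are hypothetical and do not reproduce the table of FIG. 5.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityEntry:
    text: str           # character string for one morpheme
    reliability: float  # likelihood used as the reliability, in percent


def build_reliability_table(recognized):
    """Record each recognized string with its reliability, in input order.

    recognized: iterable of (text, reliability_percent) pairs, assumed to be
    the morpheme-level output of the recognition step.
    """
    return [ReliabilityEntry(text, reliability) for text, reliability in recognized]


# Illustrative values only.
table = build_reliability_table([("今日の", 90), ("経済", 55), ("は", 80)])
```

Keeping the entries in input order preserves the position of each morpheme, which later steps need in order to locate the corresponding text in the input character buffer.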

Subsequently, in step S149, the extracted character strings are displayed based on the position of the cursor CUa (or the edit cursor CUb). That is, in step S149, each converted character string is stored in the input character buffer 332 based on the character string indicated by the cursor CUa. Subsequently, in step S151, the low-reliability voice dictionary data 338 is created, the voice recognition input process ends, and processing returns to the main routine. That is, when converting to character strings, the CPU 20a executing the process of step S151 selects, from the voice data corresponding to the character strings divided in morpheme units, only the voice data corresponding to character strings whose reliability is equal to or lower than the threshold, and stores the selected voice data and the corresponding character strings in the RAM 30 as the low-reliability voice dictionary data 338. The CPU 20a executing the process of step S151 functions as voice dictionary recording means.
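The selection performed in step S151 can be sketched as a filter over morpheme-level segments. The data layout is an assumption: each segment is modeled as a (character string, reliability in percent, voice data) triple, with the voice data represented as raw bytes. The function name `build_low_reliability_dictionary` is hypothetical.

```python
def build_low_reliability_dictionary(segments, threshold):
    """Keep only the segments whose reliability is at or below the threshold.

    segments: list of (text, reliability_percent, voice_data) triples (assumed layout).
    Returns a list of (voice_data, text) pairs, mirroring the description that
    the dictionary holds the voice data together with its character string.
    """
    return [
        (voice_data, text)
        for text, reliability, voice_data in segments
        if reliability <= threshold
    ]


# Illustrative values only; the byte strings stand in for real voice data.
segments = [("今日の", 90, b"\x01\x02"), ("経済", 55, b"\x03\x04"), ("は", 80, b"\x05")]
low_dictionary = build_low_reliability_dictionary(segments, 60)
```

Storing only the low-reliability segments keeps the dictionary small, so a later search against newly input voice only has to consider the parts that are candidates for correction.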

FIG. 16 is a flowchart showing the low-reliability part editing process executed in step S97 (see FIG. 13). When the process of step S97 is executed, the CPU 20a identifies the low-reliability character strings in step S161. That is, it identifies the character strings in the reliability table data 336 whose reliability is equal to or lower than the threshold. The CPU 20a executing the process of step S161 functions as specifying means.

Subsequently, in step S163, the identified character strings are displayed with a changed background color. That is, among the character strings stored in the input character buffer 332, those whose reliability in the reliability table is equal to or lower than the threshold are identified, and the image data displayed on the LCD monitor 26 is changed. For example, if the character string whose reliability is equal to or lower than the threshold is “経済” (“economy”), the character string “経済” stored in the input character buffer 332 is identified and the image data for displaying it is changed. Subsequently, in step S165, the mode selection GUI is displayed. For example, as shown in FIG. 3C, a window Wa is displayed that lets the user select the cursor designation mode, the voice designation mode, the voice search mode, or the normal input mode with the corresponding numeric keys. The CPU 20a executing the processes of steps S163 and S165 functions as editing guidance means.
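Steps S161 and S163 can be illustrated by marking the low-reliability strings and then rendering them with a different background. This sketch is not the terminal's display code: it uses ANSI escape sequences for a blue background purely as a stand-in for changing the image data on the LCD monitor, and the function names `mark_low_reliability` and `render` are hypothetical.

```python
BLUE_BACKGROUND = "\x1b[44m"  # ANSI escape sequence: blue background
RESET = "\x1b[0m"             # ANSI escape sequence: reset attributes


def mark_low_reliability(reliability_table, threshold):
    """S161 (sketch): flag each string whose reliability is at or below the threshold."""
    return [(text, reliability <= threshold) for text, reliability in reliability_table]


def render(marked):
    """S163 (sketch): wrap flagged strings in a blue background, leave the rest as-is."""
    return "".join(
        f"{BLUE_BACKGROUND}{text}{RESET}" if is_low else text
        for text, is_low in marked
    )


# Illustrative values only.
marked = mark_low_reliability([("今日の", 90), ("経済", 55), ("は", 80)], 60)
```

Printing `render(marked)` on a terminal that supports ANSI colors would show only “経済” on a blue background, which is the at-a-glance cue the editing guidance is meant to provide.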

続いて、ステップＳ１６７では、カーソル指定モードか否かを判断する。つまり、カーソル指定モードと対応する数字キーが操作されたか否かを判断する。ステップＳ１６７で“ＹＥＳ”であれば、つまりカーソル指定モードを選択する操作であれば、カーソル指定フラグ３４２をオンにし、ステップＳ１６９でカーソル指定処理を実行する。さらに、ステップＳ１６９の処理が終了すると、カーソル指定フラグ３４２をオフにし、低信頼度部位編集処理を終了して、文書編集処理に戻る。また、ステップＳ１６９の処理は後述するため、ここでの詳細な説明は省略する。 Subsequently, in step S167, it is determined whether or not the cursor designation mode is set. That is, it is determined whether or not a numeric key corresponding to the cursor designation mode has been operated. If “YES” in the step S167, that is, if the operation is for selecting the cursor designation mode, the cursor designation flag 342 is turned on, and the cursor designation process is executed in a step S169. Further, when the process of step S169 is completed, the cursor designation flag 342 is turned off, the low reliability part editing process is terminated, and the process returns to the document editing process. Further, since the process of step S169 will be described later, detailed description thereof is omitted here.

If “NO” in step S167, that is, if the operation does not select the cursor designation mode, it is determined in step S171 whether or not the voice designation mode is selected. That is, it is determined whether or not the numeric key corresponding to the voice designation mode has been operated. If “YES” in step S171, that is, if an operation selecting the voice designation mode is performed, the voice designation flag 344 is turned on, and the voice designation process is executed in step S173. When the process of step S173 ends, the voice designation flag 344 is turned off, and the low-reliability part editing process is ended. The process of step S173 is described later, so a detailed description is omitted here.

また、ステップＳ１７１で“ＮＯ”であれば、つまり音声指定モードを選択する操作がされなければ、ステップＳ１７５で音声検索モードか否かを判断する。つまり、音声検索モードに対応する数字キーが操作されたか否かを判断する。ステップＳ１７５で“ＹＥＳ”であれば、つまり音声検索モードを選択する操作がされれば、音声検索フラグ３４６をオンにし、ステップＳ１７７で音声検索処理を実行する。さらに、ステップＳ１７７の処理が終了すると、音声検索フラグ３４６をオフにし、低信頼度部位編集処理を終了する。また、このステップＳ１７７の処理は後述するため、ここでの詳細な説明は省略する。 If “NO” in the step S171, that is, if an operation for selecting the voice designation mode is not performed, it is determined whether or not the voice search mode is set in a step S175. That is, it is determined whether or not a numeric key corresponding to the voice search mode has been operated. If “YES” in the step S175, that is, if an operation for selecting the voice search mode is performed, the voice search flag 346 is turned on, and the voice search process is executed in a step S177. Furthermore, when the process of step S177 is finished, the voice search flag 346 is turned off, and the low-reliability part editing process is finished. In addition, since the process of step S177 will be described later, detailed description thereof is omitted here.

If “NO” in step S175, that is, if no operation selecting the voice search mode is performed, it is determined in step S179 whether or not the normal input mode is selected. That is, it is determined whether or not the numeric key for selecting the normal input mode has been operated. If “YES” in step S179, that is, if the normal input mode is selected, the reliability table is updated in step S181, and the low-reliability part editing process is ended. That is, every reliability recorded in the reliability table that is equal to or lower than the threshold is changed to 100%. Thus, if the user judges that none of the low-reliability character strings has been misrecognized, the user can resume normal character input by selecting the normal input mode. If “NO” in step S179, that is, if the operation does not select the normal input mode, the process returns to step S167.
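The branch structure of steps S167 to S181 can be sketched as a small dispatcher. The key assignment, the list-based table layout, the handler callbacks and the sample values are hypothetical; the description only says that each mode is chosen with a corresponding numeric key.

```python
MODE_KEYS = {"1": "cursor_designation", "2": "voice_designation",
             "3": "voice_search", "4": "normal_input"}  # hypothetical key assignment


def low_reliability_editing(key_presses, reliability_table, threshold, handlers):
    """Wait for a mode key (S167-S179), then run the selected mode once."""
    flags = {"cursor_designation": False, "voice_designation": False,
             "voice_search": False}
    for key in key_presses:
        mode = MODE_KEYS.get(key)
        if mode is None:
            continue  # S179 "NO": return to S167 and keep waiting
        if mode == "normal_input":
            # S181: raise every at-or-below-threshold reliability to 100 %
            for row in reliability_table:
                if row[1] <= threshold:
                    row[1] = 100
        else:
            flags[mode] = True   # e.g. cursor designation flag 342 turned on
            handlers[mode]()     # S169 / S173 / S177
            flags[mode] = False  # flag turned off when the sub-process returns
        return mode
    return None


# Illustrative values only.
table = [["経済", 40], ["医術", 55], ["現在", 90]]
calls = []
handlers = {name: (lambda n=name: calls.append(n))
            for name in ("cursor_designation", "voice_designation", "voice_search")}
first = low_reliability_editing(["9", "3"], table, 60, handlers)
second = low_reliability_editing(["4"], table, 60, handlers)
```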

図１７はステップＳ１６９（図１６参照）で実行されるカーソル指定処理を示すフロー図である。なお、ステップＳ２０７の処理については、ステップＳ１１７と同様であり、ステップＳ２１１−Ｓ２１５の処理については、ステップＳ１２５−Ｓ１２９と同様であるため、詳細な説明は省略する。ＣＰＵ２０ａは、ステップＳ１６９の処理が実行されると、ステップＳ１９１で確定操作か否かを判断する。たとえば、図４（Ａ）に示す編集キー５６ｅが長押しされた、あるいは確定キー５６ｇが操作されたか否かを判断する。ステップＳ１９１で“ＮＯ”であれば、つまり確定操作がされていなければ、ステップＳ１９７に進む。一方、ステップＳ１９１で“ＹＥＳ”であれば、つまり確定操作がされていれば、ステップＳ１９３で信頼度テーブルを更新する。 FIG. 17 is a flowchart showing the cursor designation process executed in step S169 (see FIG. 16). The process of step S207 is the same as that of step S117, and the process of steps S211 to S215 is the same as that of steps S125 to S129, and thus detailed description thereof is omitted. When the process of step S169 is executed, the CPU 20a determines whether or not the confirmation operation is performed in step S191. For example, it is determined whether or not the edit key 56e shown in FIG. 4A has been pressed for a long time or the enter key 56g has been operated. If “NO” in the step S191, that is, if the confirming operation is not performed, the process proceeds to a step S197. On the other hand, if “YES” in the step S191, that is, if a confirming operation is performed, the reliability table is updated in a step S193.

For example, the reliability of the character string selected by the edit cursor CUb is changed to 100%. When the edit key 56e or the enter key 56g is pressed and held, all values in the reliability column of the reliability table are changed to 100%, regardless of which character string the edit cursor CUb selects. Subsequently, in step S195, it is determined whether or not a low-reliability part remains. That is, it is determined whether or not a reliability equal to or lower than the threshold is recorded in the reliability table data 336. If “NO” in step S195, that is, if no reliability equal to or lower than the threshold is recorded, the cursor designation process is ended, and the process returns to the low-reliability part editing process.
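The confirmation handling of steps S193 and S195 can be sketched as follows, assuming a table of `[text, reliability]` rows; the function names, the threshold and the sample values are illustrative, not taken from the description.

```python
def confirm(reliability_table, selected_row, long_press=False):
    """S193: mark the selected string, or every string on a long press, as 100 %."""
    if long_press:
        for row in reliability_table:
            row[1] = 100
    else:
        reliability_table[selected_row][1] = 100


def has_low_reliability(reliability_table, threshold):
    """S195: True while any reliability at or below the threshold remains."""
    return any(reliability <= threshold for _, reliability in reliability_table)


# Illustrative values only.
table = [["経済", 40], ["医術", 55]]
confirm(table, 0)
remaining_after_single = has_low_reliability(table, 60)
confirm(table, 0, long_press=True)
remaining_after_long_press = has_low_reliability(table, 60)
```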

また、ステップＳ１９５で“ＹＥＳ”であれば、つまり閾値以下の信頼度が記録されていれば、ステップＳ１９７で終了操作か否かを判断する。つまり、図４（Ａ）、図４（Ｂ）に示す終了キー５６ｆが操作されたか否かを判断する。ステップＳ１９７で“ＹＥＳ”であれば、つまり終了キー５６ｆが操作されていれば、カーソル指定処理を終了する。一方、ステップＳ１９７で“ＮＯ”であれば、つまり終了キー５６ｆが操作されていなければ、ステップＳ１９９で方向キー操作か否かを判断する。つまり、方向キー２２ｄが操作されたか否かを判断する。 If “YES” in the step S195, that is, if a reliability equal to or lower than the threshold is recorded, it is determined whether or not the end operation is performed in a step S197. That is, it is determined whether or not the end key 56f shown in FIGS. 4A and 4B has been operated. If “YES” in the step S197, that is, if the end key 56f is operated, the cursor designation processing is ended. On the other hand, if “NO” in the step S197, that is, if the end key 56f is not operated, it is determined whether or not the direction key is operated in a step S199. That is, it is determined whether or not the direction key 22d has been operated.

ステップＳ１９９で“ＹＥＳ”であれば、つまり方向キー２２ｄが操作されればステップＳ２０１で編集カーソルＣＵｂの表示位置を更新し、ステップＳ１９１に戻る。つまり、ステップＳ２０１では、信頼度テーブルデータ３３６を参照し、入力された方向に応じて、他の低信頼度文字列を選択する。たとえば、図３（Ｄ）を参照して、「経済」が現在選択されている低信頼度文字列であり、右方向（または下方向）の操作がされると、「経済」の次に記録されている低信頼度文字列、つまり「医術」が編集カーソルＣＵｂによって選択された状態となる。また、「医術」が現在選択されている低信頼度文字列であり、左方向（または上方向）の操作がされると、「医術」の前に記録されている低信頼度文字列、つまり「経済」が編集カーソルＣＵｂによって選択された状態となる。 If “YES” in the step S199, that is, if the direction key 22d is operated, the display position of the editing cursor CUb is updated in a step S201, and the process returns to the step S191. That is, in step S201, the reliability table data 336 is referred to, and another low reliability character string is selected according to the input direction. For example, referring to FIG. 3D, when “Economy” is the currently selected low-reliability character string and an operation in the right direction (or downward direction) is performed, it is recorded after “Economy”. The low-reliability character string, that is, “medical technique” is selected by the editing cursor CUb. In addition, “medicine” is the currently selected low-reliability character string, and if a leftward (or upward) operation is performed, the low-reliability character string recorded before “medicine”, that is, “Economy” is selected by the editing cursor CUb.

If an upward operation is performed while the low-reliability character string recorded at the top of the reliability table is selected, the display position of the edit cursor CUb may be left unchanged, or the low-reliability character string recorded at the bottom of the reliability table may be selected. If the low-reliability character string selected by the edit cursor CUb is not displayed in the function display area 52, the displayed character strings are updated so that the edit cursor CUb and the selected low-reliability character string are displayed.
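The cursor movement of step S201, including both options for the top-of-table case, can be sketched as below. The `wrap` switch is a hypothetical parameter, and wrapping from the bottom back to the top is an assumption by symmetry; the description only mentions the upward case.

```python
def move_edit_cursor(low_strings, current, direction, wrap=False):
    """Return the index of the low-reliability string selected after a key press.

    low_strings: low-reliability strings in the order recorded in the table.
    current:     index of the currently selected string.
    direction:   "right"/"down" moves forward, "left"/"up" moves backward.
    """
    step = 1 if direction in ("right", "down") else -1
    new = current + step
    if 0 <= new < len(low_strings):
        return new
    if wrap:
        return new % len(low_strings)  # jump to the other end of the table
    return current                     # leave the cursor where it is


low_strings = ["経済", "医術"]
after_right = move_edit_cursor(low_strings, 0, "right")
after_left = move_edit_cursor(low_strings, 1, "left")
stay_at_top = move_edit_cursor(low_strings, 0, "up")
wrap_to_bottom = move_edit_cursor(low_strings, 0, "up", wrap=True)
```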

また、ステップＳ１９９で“ＮＯ”であれば、つまり方向キー２２ｄが操作されていなければ、ステップＳ２０３で編集操作か否かを判断する。つまり、編集キー５６ｅが操作されたか否かを判断する。ステップＳ２０３で“ＮＯ”であればステップＳ１９１に戻る。一方、ステップＳ２０３で“ＹＥＳ”であれば、ステップＳ２０５で音声認識操作か否かを判断する。たとえば、編集キー５６ｅが操作された後に、音声入力の有無を判断する。ステップＳ２０５で“ＹＥＳ”であれば、つまり音声認識操作であれば、以降、音声認識モードであることを記憶して、ステップＳ２０７で音声認識入力処理を実行した後に、ステップＳ１９１に戻る。たとえば、ＣＰＵ２０ａは、音声認識モードであることを記憶するために、音声認識モードフラグ（図９では図示せず）をオンにする。 If “NO” in the step S199, that is, if the direction key 22d is not operated, it is determined whether or not the editing operation is performed in a step S203. That is, it is determined whether or not the edit key 56e has been operated. If “NO” in the step S203, the process returns to the step S191. On the other hand, if “YES” in the step S203, it is determined whether or not a voice recognition operation is performed in a step S205. For example, after the edit key 56e is operated, the presence / absence of voice input is determined. If “YES” in the step S205, that is, if it is a voice recognition operation, thereafter, the voice recognition mode is stored, the voice recognition input process is executed in a step S207, and then the process returns to the step S191. For example, the CPU 20a turns on a voice recognition mode flag (not shown in FIG. 9) in order to store the voice recognition mode.

また、ステップＳ２０５で“ＮＯ”であれば、つまり音声認識操作でなければ、ステップＳ２０９で文字入力操作か否かを判断する。たとえば、編集キー５６ｅが操作された後に、文字入力キー２２ｅが操作されたか否かを判断する。ステップＳ２０９で“ＹＥＳ”であれば、以降、文字入力モードであることを記憶し、ステップＳ２１１で文字の入力処理を実行した後に、ステップＳ１９１に戻る。たとえば、ＣＰＵ２０ａは文字入力モードであることを記憶するために、文字入力モードフラグ（図９では図示せず）をオンにする。 If “NO” in the step S205, that is, if it is not a voice recognition operation, it is determined whether or not a character input operation is performed in a step S209. For example, it is determined whether or not the character input key 22e has been operated after the edit key 56e has been operated. If “YES” in the step S209, the character input mode is stored thereafter, and after the character input process is executed in a step S211, the process returns to the step S191. For example, the CPU 20a turns on a character input mode flag (not shown in FIG. 9) in order to store that it is in the character input mode.

一方、ステップＳ２０９で“ＮＯ”であれば、つまり文字入力操作でなければ、ステップＳ２１３で変換操作か否かを判断する。ステップＳ２１３で“ＹＥＳ”であれば、ステップＳ２１５で文字の変換処理を実行して、ステップＳ１９１に戻る。一方、ステップＳ２１３で“ＮＯ”であれば、そのままステップＳ１９１に戻る。 On the other hand, if “NO” in the step S209, that is, if it is not a character input operation, it is determined whether or not it is a conversion operation in a step S213. If “YES” in the step S213, a character conversion process is executed in a step S215, and the process returns to the step S191. On the other hand, if “NO” in the step S213, the process returns to the step S191 as it is.

なお、ステップＳ２１１およびステップＳ２１５を実行するＣＰＵ２０ａは文字編集手段として機能する。 In addition, CPU20a which performs step S211 and step S215 functions as a character editing means.

FIG. 18 is a flowchart showing the voice designation process executed in step S173 (see FIG. 16). The processes of steps S231 and S233 are the same as those of steps S197 and S195, so a detailed description is omitted. When the process of step S173 is executed, the CPU 20a determines in step S231 whether or not an end operation has been performed. If “YES” in step S231, the voice designation process is ended, and the process returns to the low-reliability part editing process. On the other hand, if “NO” in step S231, it is determined in step S233 whether or not a low-reliability part remains. If “NO” in step S233, the voice designation process is ended.

一方、ステップＳ２３３で“ＹＥＳ”であれば、ステップＳ２３５で方向キー操作か否かを判断する。つまり、方向キー２２ｄが操作されたか否かを判断する。ステップＳ２３５で“ＹＥＳ”であれば、つまり方向キー２２ｄが操作されれば、ステップＳ２３７で表示のスクロール処理を実行し、ステップＳ２３１に戻る。つまり、ステップＳ２３７では、入力された方向に応じて、機能表示領域５２内に表示される文字列をスクロール（変化）させる。たとえば、下方向の入力がされれば、まだ表示されていない送信メールの本文を表示するように下方向にスクロールする。また、上方向の入力がされれば、すでに表示された送信メールの本文を表示するように上方向にスクロールする。 On the other hand, if “YES” in the step S233, it is determined whether or not a direction key operation is performed in a step S235. That is, it is determined whether or not the direction key 22d has been operated. If “YES” in the step S235, that is, if the direction key 22d is operated, a display scrolling process is executed in a step S237, and the process returns to the step S231. That is, in step S237, the character string displayed in the function display area 52 is scrolled (changed) in accordance with the input direction. For example, if an input is made in the downward direction, the screen is scrolled downward so that the text of the outgoing mail that has not yet been displayed is displayed. Further, if an upward input is made, the scroll is performed upward so as to display the text of the already displayed outgoing mail.

このように、使用者は、ＬＣＤモニタ２６に表示される文字列の内容をスクロールさせることで、容易に他の低信頼度文字列を探すことができるようになる。 As described above, the user can easily search for other low-reliability character strings by scrolling the contents of the character strings displayed on the LCD monitor 26.

If no undisplayed character strings remain, the display in the function display area 52 does not change even when the direction key 22d is operated. Likewise, the display in the function display area 52 does not change when a left or right direction key is operated.

また、ステップＳ２３５で“ＮＯ”であれば、つまり方向キー２２ｄが操作されなければ、ステップＳ２３９で音声が入力されたか否かを判断する。つまり、第２マイク１６ｂに音声が入力されたか否かを判断する。ステップＳ２３９で“ＹＥＳ”であれば、つまり音声が入力されれば、ステップＳ２４１で変換部位検索処理を実行した後に、ステップＳ２３１に戻る。このステップＳ２４１の処理は後述するため、ここでの詳細な説明は省略する。一方、ステップＳ２３９で“ＮＯ”であれば、つまり音声が入力されなければ、ステップＳ２３１に戻る。 If “NO” in the step S235, that is, if the direction key 22d is not operated, it is determined whether or not a sound is input in a step S239. That is, it is determined whether or not sound is input to the second microphone 16b. If “YES” in the step S239, that is, if a voice is input, the conversion part search process is executed in a step S241, and then the process returns to the step S231. Since the process of step S241 will be described later, a detailed description thereof is omitted here. On the other hand, if “NO” in the step S239, that is, if no voice is input, the process returns to the step S231.

図１９はステップＳ１７７（図１６参照）で実行される音声検索処理を示すフロー図である。なお、ステップＳ２５１−Ｓ２５７の処理はステップＳ１９１−Ｓ１９７の処理と同じであり、ステップＳ２５９，Ｓ２６１の処理はステップＳ２３５，Ｓ２３７と同じであり、ステップＳ２６５の処理はステップＳ２４１と同じであり、ステップＳ２６９の処理はステップＳ１１７またはステップＳ２０７と同じであるため、詳細な説明は省略する。 FIG. 19 is a flowchart showing the voice search process executed in step S177 (see FIG. 16). Note that the processing of steps S251 to S257 is the same as the processing of steps S191 to S197, the processing of steps S259 and S261 is the same as steps S235 and S237, the processing of step S265 is the same as step S241, and step S269. Since this process is the same as step S117 or step S207, detailed description thereof is omitted.

ステップＳ１７７の処理が実行されると、ＣＰＵ２０ａは、ステップＳ２５１で、確定操作か否かを判断する。ステップＳ２５１で“ＮＯ”であれば、ステップＳ２５７に進み、一方、“ＹＥＳ”であれば、ステップＳ２５３で信頼度テーブルを更新する。続いて、ステップＳ２５５では、低信頼度部位があるか否かを判断し、“ＮＯ”であれば音声検索処理を終了して、低信頼度編集処理に戻る。一方、ステップＳ２５５で“ＹＥＳ”であれば、ステップＳ２５７で、終了操作か否かを判断する。ステップＳ２５７で“ＹＥＳ”であれば、音声検索処理を終了し、“ＮＯ”であればステップＳ２５９で方向キー操作か否かを判断する。 When the process of step S177 is executed, the CPU 20a determines in step S251 whether or not it is a confirmation operation. If “NO” in the step S251, the process proceeds to a step S257, and if “YES”, the reliability table is updated in a step S253. Subsequently, in step S255, it is determined whether or not there is a low-reliability part. If “NO”, the voice search process is terminated and the process returns to the low-reliability editing process. On the other hand, if “YES” in the step S255, it is determined whether or not an end operation is performed in a step S257. If “YES” in the step S257, the voice search process is ended, and if “NO”, it is determined whether or not the direction key operation is performed in a step S259.

ステップＳ２５９で“ＹＥＳ”であれば、ステップＳ２６１で表示のスクロール処理を実行して、ステップＳ２５１に戻る。一方、ステップＳ２５９で“ＮＯ”であれば、ステップＳ２６３で音声が入力されたか否かを判断する。つまり、第２マイク１６ｂによって、音声が入力されたか否かを判断する。ステップＳ２６３で“ＹＥＳ”であれば、つまり音声が入力されれば、ステップＳ２６５で変換部位検索処理を実行して、ステップＳ２５１に戻る。たとえば、任意の低信頼度文字列を表わす音声が入力されれば、ステップＳ２６５の処理が終了すると、任意の低信頼度文字列が編集カーソルＣＵｂによって選択された状態となる。 If “YES” in the step S259, the display scroll process is executed in a step S261, and the process returns to the step S251. On the other hand, if “NO” in the step S259, it is determined whether or not a voice is input in a step S263. That is, it is determined whether or not sound is input by the second microphone 16b. If “YES” in the step S263, that is, if a voice is inputted, a conversion part search process is executed in a step S265, and the process returns to the step S251. For example, if a voice representing an arbitrary low-reliability character string is input, an arbitrary low-reliability character string is selected by the editing cursor CUb when the process of step S265 ends.

また、ステップＳ２６３で“ＮＯ”であれば、つまり音声が入力されなければ、ステップＳ２６７で入力操作か否かを判断する。たとえば、編集キー５６ｅが操作されたか否かを判断する。ステップＳ２６７で“ＹＥＳ”であれば、つまり入力操作がされれば、ステップＳ２６９で音声認識入力処理を実行して、ステップＳ２５１に戻る。また、ステップＳ２６７で“ＮＯ”であれば、ステップＳ２５１に戻る。たとえば、ステップＳ２６９の処理が終了すると、編集キー５６ｅが操作された後に入力された音声が文字列に変換されて、選択されている低信頼度文字列と置き換えられる。 If “NO” in the step S263, that is, if no sound is input, it is determined whether or not an input operation is performed in a step S267. For example, it is determined whether or not the edit key 56e has been operated. If “YES” in the step S267, that is, if an input operation is performed, a voice recognition input process is executed in a step S269, and the process returns to the step S251. If “NO” in the step S267, the process returns to the step S251. For example, when the process of step S269 ends, the voice input after the editing key 56e is operated is converted into a character string, and is replaced with the selected low-reliability character string.

After the edit key 56e is operated, the low-reliability character string may be edited by normal input instead of voice input. That is, after “YES” is determined in step S267, the processes of steps S205 to S215 may be executed instead of step S269. In that case, the process returns to step S251 after step S207, S211, or S215 ends.

図２０はステップＳ２４１（図１８参照）またはステップＳ２６５（図１９参照）で実行される変換部位検索処理を示すフロー図である。ＣＰＵ２０ａは、ステップＳ２４１またはステップＳ２６５が実行されると、ステップＳ２８１で、入力された音声を音声データに変換する。つまり、入力された音声はＤＳＰ２０ａによって音声データに変換される。続いて、ステップＳ２８３では、低信頼度音声辞書データ３３８を読み込む。つまり、低信頼度音声辞書データ３３８を構成する各音声データを、参照音声データとして読み込む。 FIG. 20 is a flowchart showing conversion site search processing executed in step S241 (see FIG. 18) or step S265 (see FIG. 19). When step S241 or step S265 is executed, the CPU 20a converts the input voice into voice data in step S281. That is, the input voice is converted into voice data by the DSP 20a. Subsequently, in step S283, the low reliability speech dictionary data 338 is read. That is, each piece of voice data constituting the low reliability voice dictionary data 338 is read as reference voice data.

Subsequently, in step S285, a search for a highly correlated part is performed. Specifically, a plurality of feature patterns that change at fixed time intervals are acquired from the reference voice data and from the input voice data, and a correlation value is calculated from the feature patterns of each piece of reference voice data and those of the input voice data. Then, the character string represented by the reference voice data with the largest correlation value is extracted, and the reliability table data 336 is searched for the low-reliability character string that matches the extracted character string. In this way, this embodiment can use a correlation function to search for a similar character string. The CPU 20a that executes the processes of steps S283 and S285 functions as a similarity search unit.
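The search in step S285 can be sketched as follows. The description does not say which feature patterns or which correlation function are used, so per-frame energy and a Pearson correlation are substituted here purely as stand-ins; the frame length, the sample values and all function names are hypothetical.

```python
import math


def frame_energies(samples, frame_len=4):
    """One feature value per fixed-length frame (assumed feature: frame energy)."""
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def correlation(a, b):
    """Pearson correlation over the common length; 0.0 if it is undefined."""
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def search_most_correlated(input_samples, low_dict):
    """Return the text whose reference voice data correlates best with the input."""
    features = frame_energies(input_samples)
    _, text = max(low_dict,
                  key=lambda entry: correlation(features, frame_energies(entry[0])))
    return text


# Illustrative sample sequences only; low_dict must not be empty.
low_dict = [
    ([0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 3, 3], "経済"),
    ([3, 3, 3, 3, 1, 1, 1, 1, 0, 0, 0, 0], "医術"),
]
best = search_most_correlated([0, 0, 0, 0, 1, 1, 1, 2, 3, 3, 3, 3], low_dict)
```

Because the comparison is made against the stored voice data of the earlier utterance, the match does not depend on the re-spoken word being recognized correctly.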

Subsequently, in step S287, it is determined whether or not the voice search mode is set. That is, it is determined whether or not the voice search flag 346 is on. If “YES” in step S287, that is, in the voice search mode, the display position of the edit cursor CUb is updated in step S289 according to the search result, the conversion part search process is ended, and the process returns to the voice search process. For example, if the similar character string is “medical technique” (医術), “medical technique” is selected by the edit cursor CUb. In this way, the low-reliability character string intended by the user can be selected regardless of the speech recognition result for the re-input voice. As described above, the user can thus perform cursor operations that are convenient for creating a document by voice recognition.

また、ステップＳ２８７で“ＮＯ”であれば、つまり音声検索フラグ３４６がオフであり、かつ音声指定フラグ３４４がオンであれば、ステップＳ２９１で音声辞書から音声データに対応する文字列を抽出する。つまり、ステップＳ１４５と同様に、ＲＯＭ３２に記憶された音声辞書から文字列を抽出する。 If “NO” in the step S287, that is, if the voice search flag 346 is turned off and the voice designation flag 344 is turned on, a character string corresponding to the voice data is extracted from the voice dictionary in a step S291. That is, a character string is extracted from the speech dictionary stored in the ROM 32 as in step S145.

Subsequently, in step S293, the extracted character string is displayed based on the highly correlated part. That is, the character string indicated by the search result of step S285 is identified in the input character buffer 332 and replaced with the character string extracted from the speech dictionary. For example, if the low-reliability character string found by the search is “economy” (経済) and the character string extracted from the speech dictionary is “present” (現在), the character string “economy” in the function display area 52 is replaced with the character string “present”. Subsequently, in step S295, the reliability table is updated, the conversion part search process is ended, and the process returns to the voice designation process. For example, in step S295, “economy” recorded in the reliability table is replaced with “present”, and the likelihood calculated when “present” was recognized is recorded as its reliability.
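Steps S293 and S295 can be sketched as one replacement applied to both the buffer and the table. The list-based layouts, the function name and the reliability values are hypothetical; the strings follow the example in the text.

```python
def replace_low_reliability(input_buffer, reliability_table, target, new_text, new_reliability):
    """S293: replace the string in the buffer; S295: overwrite its table row."""
    input_buffer[input_buffer.index(target)] = new_text
    for row in reliability_table:
        if row[0] == target:
            row[0], row[1] = new_text, new_reliability
            break


# Illustrative values only.
input_buffer = ["経済", "医術"]
reliability_table = [["経済", 40], ["医術", 55]]
replace_low_reliability(input_buffer, reliability_table, "経済", "現在", 90)
```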

The CPU 20a that executes the process of step S201 or step S289 functions as a cursor display unit. The CPU 20a that executes the process of step S293 functions as a replacement unit. Further, the CPU 20a that executes the processes of steps S201, S237, and S261 functions as a scroll unit.

Here, editing of a character string using the pull-down PD shown in FIG. 6 will be described in detail with reference to the flowchart of the speech recognition process shown in FIG. 21. The processes of steps S141 to S151 have already been described in detail, so a detailed description is omitted here, and the description starts from the process of step S311.

ステップＳ３１１では、低信頼度文字列の編集か否かを判断する。つまり、カーソル指定フラグ３４２、音声指定フラグ３４４または音声検索フラグ３４６のいずれかがオンであるか否かを判断する。ステップＳ３１１で“ＮＯ”であれば、つまり低信頼度文字列の編集でなければ、ステップＳ１４５以下の処理を実行する。一方、ステップＳ３１１で“ＹＥＳ”であれば、音声辞書から音声データに対応する複数の文字列を抽出する。つまり、尤度が最も高い文字列だけでなく、一定値以上の尤度の文字列を全て抽出する。 In step S311, it is determined whether or not the low-reliability character string is to be edited. That is, it is determined whether any of the cursor designation flag 342, the voice designation flag 344, or the voice search flag 346 is on. If “NO” in the step S311, that is, if the low-reliability character string is not edited, the processing in the step S145 and the subsequent steps is executed. On the other hand, if “YES” in the step S311, a plurality of character strings corresponding to the voice data are extracted from the voice dictionary. That is, not only the character string having the highest likelihood but also all character strings having a likelihood equal to or greater than a certain value are extracted.

続いて、ステップＳ３１５では、プルダウンメニューを表示する。つまり、図６に示すプルダウンＰＤを表示し、そのプルダウンＰＤ内に、抽出した複数の文字列を表示する。なお、ステップＳ３１５の処理を実行するＣＰＵ２０ａは一覧表示手段として機能する。続いて、ステップＳ３１７では、選択された文字列に対応する信頼度を記録する。つまり、信頼度テーブルに選択された文字列の尤度、つまり信頼度を記録する。なお、信頼度を記録する際には、編集カーソルＣＵｂによって選択されている文字列および対応する信頼度を上書きする。 In step S315, a pull-down menu is displayed. That is, the pull-down PD shown in FIG. 6 is displayed, and a plurality of extracted character strings are displayed in the pull-down PD. The CPU 20a that executes the process of step S315 functions as a list display unit. Subsequently, in step S317, the reliability corresponding to the selected character string is recorded. That is, the likelihood of the selected character string, that is, the reliability is recorded in the reliability table. When recording the reliability, the character string selected by the editing cursor CUb and the corresponding reliability are overwritten.

Subsequently, in step S319, the selected character string is displayed, and the speech recognition input process is ended. That is, the low-reliability character string stored in the input character buffer 332 is replaced with the selected character string. For example, referring to FIG. 6, if “満たない” (“does not reach”) is selected in the pull-down PD, the misrecognized “いたない” is replaced with “満たない”.
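The pull-down flow, from candidate extraction through steps S315 to S319, can be sketched as follows. The minimum likelihood of 40, the likelihood values and the third candidate are invented for illustration; only “満たない” and “いたない” come from the example in the text.

```python
def extract_candidates(scored, min_likelihood):
    """Return every (text, likelihood) pair at or above min_likelihood, best first."""
    kept = [(text, lk) for text, lk in scored if lk >= min_likelihood]
    return sorted(kept, key=lambda c: c[1], reverse=True)


def apply_selection(input_buffer, reliability_table, cursor_row, choice):
    """S317/S319: overwrite the selected table row and the buffered string."""
    old_text = reliability_table[cursor_row][0]
    new_text, new_reliability = choice
    reliability_table[cursor_row] = [new_text, new_reliability]
    input_buffer[input_buffer.index(old_text)] = new_text


# Illustrative values only.
scored = [("いたない", 45), ("満たない", 70), ("痛くない", 20)]
candidates = extract_candidates(scored, min_likelihood=40)  # entries of the pull-down PD
input_buffer = ["いたない"]
reliability_table = [["いたない", 45]]
apply_selection(input_buffer, reliability_table, 0, candidates[0])
```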

また、音声検索モードにおいて、類似する文字列を検索するのではなく、新たに音声認識された文字列と一致する低信頼度文字列を検索する処理について、図２２を用いて詳細に説明する。なお、ステップＳ１４９およびステップＳ３１９の処理を実行するＣＰＵ２０ａは音声編集手段として機能する。 In addition, a process for searching for a low-reliability character string that matches a newly recognized character string instead of searching for a similar character string in the voice search mode will be described in detail with reference to FIG. Note that the CPU 20a that executes the processes of steps S149 and S319 functions as a voice editing unit.

図２２を参照して、他の実施例では、変換部位検索処理におけるステップＳ２８１−Ｓ２８７，Ｓ２９１−Ｓ２９５における処理内容は同じであるため、詳細な説明は省略する。 Referring to FIG. 22, in another embodiment, the processing contents in steps S281-S287 and S291-S295 in the conversion site search processing are the same, and thus detailed description thereof is omitted.

ＣＰＵ２０ａは、ステップＳ２８１で入力された音声を音声データに変換し、次にステップＳ２９１で音声辞書から音声データに対応する文字列を抽出する。そして、ステップＳ２９１の処理が終了すると、ステップＳ２８７で音声検索モードか否かを判断する。 The CPU 20a converts the voice input in step S281 into voice data, and then extracts a character string corresponding to the voice data from the voice dictionary in step S291. When the process of step S291 ends, it is determined whether or not the voice search mode is set in step S287.

If “NO” in step S287, that is, in the voice designation mode rather than the voice search mode, the processes of steps S283, S285, S293, and S295 are executed in this order, and the conversion part search process is ended. On the other hand, if “YES” in step S287, that is, in the voice search mode, a low-reliability character string that matches the extracted character string is searched for in step S331. That is, the character string column of the reliability table is searched for the character string extracted in step S291. The CPU 20a that executes the process of step S331 functions as a search unit.

Subsequently, in step S289, the display position of the edit cursor CUb is updated according to the search result of step S331. For example, if the recognition result of the newly input voice is “economy” (経済), the low-reliability character string “economy” becomes the search result and is selected by the edit cursor CUb.
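The exact-match search of step S331 can be sketched as a scan over the character string column of the reliability table. The tuple layout, the threshold and the reliability values are assumptions for illustration.

```python
def find_matching_low_reliability(reliability_table, recognized, threshold):
    """S331: index of the low-reliability row whose text equals the new recognition result."""
    for index, (text, reliability) in enumerate(reliability_table):
        if text == recognized and reliability <= threshold:
            return index
    return None  # no match: the edit cursor CUb would stay where it is (assumption)


# Illustrative values only.
reliability_table = [("現在", 95), ("経済", 40), ("医術", 55)]
hit = find_matching_low_reliability(reliability_table, "経済", 60)
miss = find_matching_low_reliability(reliability_table, "現在", 60)
```

Unlike the correlation-based search, this variant depends on the re-spoken word being recognized as the same character string as before.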

以上の説明から分かるように、携帯端末１０は、使用者の音声を取り込む第２マイク１６ｂを含み、第２マイク１６ｂに入力された音声を音声認識して文字列を生成する。また、音声認識によって文字列を生成する際には、算出される尤度を音声認識の信頼度とし、生成する文字列とその文字列に対応する信頼度とを信頼度テーブルに記録する。そして、信頼度テーブルに基づいて、閾値以下の信頼度である文字列が特定され、特定された低信頼度文字列の背景色は、青色に彩色されて、ＬＣＤモニタ２６に表示される。 As can be understood from the above description, the mobile terminal 10 includes the second microphone 16b that captures the user's voice, and recognizes the voice input to the second microphone 16b to generate a character string. When a character string is generated by speech recognition, the calculated likelihood is used as the reliability of speech recognition, and the generated character string and the reliability corresponding to the character string are recorded in the reliability table. Then, based on the reliability table, a character string having a reliability equal to or lower than the threshold is specified, and the background color of the specified low reliability character string is colored blue and displayed on the LCD monitor 26.

As a result, candidates for misrecognized character strings are displayed so that they can be identified at a glance, so the user can easily judge whether editing is necessary and can efficiently create text using voice recognition.
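The recording, identification, and highlighting summarized above can be sketched as follows. This is an illustrative sketch only: the function names, the threshold value 0.6, and the use of ANSI escape sequences as a stand-in for the blue background on the LCD monitor 26 are assumptions, not details taken from the embodiment.

```python
from typing import List, Tuple

THRESHOLD = 0.6          # assumed value; the description gives no number
BLUE_BG = "\x1b[44m"     # ANSI blue background, stand-in for the LCD coloring
RESET = "\x1b[0m"

# Hypothetical reliability table: (character string, reliability) pairs.
ReliabilityTable = List[Tuple[str, float]]


def record(table: ReliabilityTable, text: str, likelihood: float) -> None:
    """Record a recognized string with its likelihood as the reliability."""
    table.append((text, likelihood))


def low_reliability_indices(table: ReliabilityTable,
                            threshold: float = THRESHOLD) -> List[int]:
    """Identify entries whose reliability is at or below the threshold."""
    return [i for i, (_, r) in enumerate(table) if r <= threshold]


def render(table: ReliabilityTable, threshold: float = THRESHOLD) -> str:
    """Build the display text, marking low-reliability strings in blue."""
    parts = []
    for text, reliability in table:
        if reliability <= threshold:
            parts.append(f"{BLUE_BG}{text}{RESET}")
        else:
            parts.append(text)
    return " ".join(parts)


table: ReliabilityTable = []
record(table, "today", 0.9)
record(table, "economy", 0.4)
record(table, "news", 0.8)
display_text = render(table)
```

Printing `display_text` on an ANSI-capable terminal shows "economy" on a blue background while the other strings appear unchanged.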

Note that if a function (formula) for calculating likelihood is set for each piece of reference speech data constituting the dictionary data used in voice recognition (including the speech dictionary and the low-reliability speech dictionary data stored in the ROM 32), these functions may be used to search for similar character strings. Specifically, a function ID identifying each function is assigned, and the function ID used to calculate the reliability of each low-reliability character string is recorded. Then, by searching the recorded function IDs for the function ID used when recognizing newly input voice, a similar low-reliability character string can be found.
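The function-ID lookup described above can be sketched as follows. The class `FunctionIdIndex`, its method names, and the use of integer IDs and integer text positions are hypothetical; the description states only that function IDs are recorded and later searched.

```python
from collections import defaultdict
from typing import DefaultDict, List


class FunctionIdIndex:
    """Hypothetical index from a likelihood-function ID to the positions
    of the low-reliability character strings scored with that function."""

    def __init__(self) -> None:
        self._positions: DefaultDict[int, List[int]] = defaultdict(list)

    def record(self, function_id: int, position: int) -> None:
        """Record the function ID used for a low-reliability string."""
        self._positions[function_id].append(position)

    def find_similar(self, function_id: int) -> List[int]:
        """Return positions of strings scored with the same function ID
        as the newly recognized voice; empty if none were recorded."""
        return list(self._positions.get(function_id, []))


index = FunctionIdIndex()
index.record(17, 3)   # string at position 3 was scored by function 17
index.record(42, 5)
index.record(17, 8)
similar = index.find_similar(17)  # positions sharing function ID 17
```

Treating a shared function ID as the similarity criterion follows the description; how ties between several positions are resolved is left open here.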

The microphone used for voice recognition is not limited to the second microphone 16b and may be the first microphone 16a. Furthermore, even a portable terminal 10 that has no microphone may obtain the effects of this embodiment by attaching a commercially available microphone and installing a program capable of executing the processes shown in FIGS. 10 to 22.

The document editing function of this embodiment is not limited to editing the body of an outgoing mail and is applicable to any function for inputting character strings, such as a memo pad function.

The menus selectable in the windows Wa to Wd may be selected with a dedicated menu-selection cursor instead of with the numeric keys.

The communication method of the mobile terminal 10 is not limited to the CDMA method and may be a W-CDMA method, a TDMA method, a PHS method, a GSM method, or the like. In addition, the processes executed by the CPU 20a of the mobile terminal 10 in this embodiment may be executed not only by the mobile terminal 10 but also by a portable information terminal such as a PDA (Personal Digital Assistant) or by a personal computer (PC).

DESCRIPTION OF SYMBOLS
10 ... Portable terminal
16a ... First microphone
16b ... Second microphone
20a ... CPU
20b ... DSP
22 ... Key input device
26 ... LCD monitor
30 ... RAM
32 ... ROM

Claims

A portable terminal having capturing means for capturing a speech signal and speech recognition means for generating a character string from the speech signal captured by the capturing means, the portable terminal comprising:
recording means for recording the character strings generated by the speech recognition means and data indicating their reliability;
specifying means for specifying, with reference to the data, a character string whose reliability is equal to or lower than a predetermined value; and display means for displaying the character string specified by the specifying means in a form different from other character strings.

The mobile terminal according to claim 1, further comprising a cursor display unit that displays a cursor for selecting only the specified character string.

The portable terminal according to claim 2, further comprising operation means for receiving an operation for moving the cursor,
wherein the cursor selects a character string according to the result of the operation received by the operation means.

The mobile terminal according to claim 2, further comprising search means for searching for the character string that matches a new character string newly voice-recognized after the character string is generated,
wherein the cursor selects the character string found by the search means.

The mobile terminal according to claim 2, further comprising voice editing means for editing a character string selected by the cursor based on a character string newly generated by the voice recognition means.

The portable terminal according to claim 5, further comprising list display means for displaying a list of character string candidates generated by the voice recognition means,
wherein, when a candidate displayed by the list display means is selected, the voice editing means performs editing using the selected candidate as the newly generated character string.

The character input means for inputting a character string, and the character edit means for editing the character string selected by the cursor based on the character string input by the character input means. The portable terminal as described in.

The mobile terminal according to claim 1, further comprising: similarity search means for searching for a character string similar to the character string newly generated by the voice recognition means; and replacement means for replacing the character string found by the similarity search means with the character string newly generated by the voice recognition means.

The portable terminal according to claim 2, further comprising similarity search means for searching for a character string similar to the character string newly generated by the voice recognition means,
wherein the cursor selects the character string found by the similarity search means.

The mobile terminal according to claim 9, further comprising voice editing means for editing a character string selected by the cursor based on a character string newly generated by the voice recognition means.

The portable terminal according to claim 10, further comprising list display means for displaying a list of character string candidates generated by the voice recognition means,
wherein, when a candidate displayed by the list display means is selected, the voice editing means performs editing using the selected candidate as the newly generated character string.

The portable terminal according to claim 9, further comprising: a character input unit that inputs a character string; and a character editing unit that edits the character string selected by the cursor based on the character string input by the character input unit.

Voice dictionary recording means for recording the voice captured by the capturing means and the character string generated from the voice as a voice dictionary;
The similarity search means searches for a similar character string by calculating a correlation value between each of the voices recorded by the voice dictionary recording means and the newly input voice. The mobile terminal according to Crab.

The portable terminal according to claim 1, further comprising: a display unit that displays at least a part of the plurality of character strings; and a scroll unit that scrolls a display position of the character string displayed by the display unit.

An editing guidance program causing a processor of a portable terminal, the portable terminal having capturing means for capturing a speech signal and speech recognition means for generating a character string from the speech signal captured by the capturing means, to function as:
recording means for recording the character strings generated by the speech recognition means and data indicating their reliability;
specifying means for specifying, with reference to the data, a character string whose reliability is equal to or lower than a predetermined value; and display means for displaying the character string specified by the specifying means in a form different from other character strings.

An editing apparatus having capturing means for capturing a speech signal and speech recognition means for generating a character string from the speech signal captured by the capturing means, the editing apparatus comprising:
recording means for recording the character strings generated by the speech recognition means and data indicating their reliability;
specifying means for specifying, with reference to the data, a character string whose reliability is equal to or lower than a predetermined value; and display means for displaying the character string specified by the specifying means in a form different from other character strings.