JPS6184771A - Voice input device - Google Patents
Info
- Publication number
- JPS6184771A (application JP59206238A)
- Authority
- JP
- Japan
- Prior art keywords
- voice
- input
- input device
- character
- confirmation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Document Processing Apparatus (AREA)
Description
DETAILED DESCRIPTION OF THE INVENTION

[Field of Application of the Invention]

The present invention relates to a voice input device, and more particularly to a method for checking input results in a voice typewriter.
In a voice input device, particularly a so-called voice typewriter that converts input speech into a character string, it is important to confirm whether the input speech has been converted into the correct character string. Conventional devices employ various excellent schemes: confirming each syllable as it is input, as described in Japanese Examined Utility Model Publication No. Sho 44-552G; reading the input speech against the displayed characters by temporally associating playback with the display, as described in JP-A-54-136134; synthesizing speech from the converted characters by rule-based synthesis and playing it back alternately with the input speech for comparison; and outputting the input speech with its sound changed, for example to an echo, according to character-type designation input information.
In a system that performs kana-kanji conversion, that is, conversion into character strings containing homophones written with different characters, it is more reasonable to check the converted character string against the input speech than to output speech from the converted string by rule-based synthesis. However, a considerable number of people dislike hearing their own voice played back, for example from a tape recorder. It is therefore desirable to play the speech back in a voice in which the personal characteristics of the timbre have been modified.
Furthermore, the optimal form of read-back confirmation differs with the degree of completion of the target text, but no input system with a configuration flexible enough to handle these variations has been realized.
There is also the problem at input time that checking results while inputting interrupts the user's train of thought.
SUMMARY OF THE INVENTION

An object of the present invention is to solve the above problems and to provide input-result confirmation means suited to a flexible, easy-to-use voice input system that meets the user's particular requirements.
To achieve the above object, the present invention first provides means for recording the input speech collectively, in a form that can be reproduced as speech. This recording means records, in association with the speech, signals indicating word or phrase boundaries, either input externally at the same time as the speech or determined and output automatically during recognition or during the post-recognition Japanese-language processing for kana-kanji conversion. The input speech is then recognized according to an externally designated mode, either while it is being input or collectively afterward as directed by the user; after the recognition results undergo kana-kanji conversion and the like, means are provided for recording the converted results with symbols marking word or phrase boundaries attached.
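The recording scheme just described — speech stored in reproducible form, with word/phrase boundary signals recorded in association with it so that the same boundaries can later be attached to the converted text — can be sketched roughly as follows. All names and the frame representation are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class SpeechRecord:
    frames: list = field(default_factory=list)      # analysis frames of the input speech
    boundaries: list = field(default_factory=list)  # frame indices of word/phrase boundaries

    def append_frame(self, frame):
        self.frames.append(frame)

    def mark_boundary(self):
        # called by an external switch pressed by the speaker, or automatically
        # by the recognizer / kana-kanji converter when it finds a boundary
        self.boundaries.append(len(self.frames))

    def segments(self):
        """Yield (start, end) frame ranges, one per word or phrase."""
        start = 0
        for end in self.boundaries:
            yield (start, end)
            start = end
        if start < len(self.frames):
            yield (start, len(self.frames))

rec = SpeechRecord()
for i in range(10):
    rec.append_frame(f"frame{i}")
    if i in (3, 6):          # boundaries after the 4th and 7th frames
        rec.mark_boundary()

print(list(rec.segments()))  # [(0, 4), (4, 7), (7, 10)]
```

Because the boundaries are stored as indices into the same frame sequence that is played back, each segment can later be synthesized and displayed word by word or phrase by phrase.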
For playing back the input speech during read-back confirmation, the system is given: means for reproducing the speech with its timbre changed according to commands the user issues externally; means for changing the playback speed as designated; control means that, according to command, either plays back word by word or phrase by phrase, or plays back the entire input continuously up to a point specified by a separate command; and display aids (a cursor, blinking, color changes, and the like) that make the position of the character or character string corresponding to the word or phrase being played back apparent at a glance.
By providing modes that execute arbitrary combinations of the functions of these means, and configuring the system so that the user can freely select among them, input results can be confirmed in an easy-to-use manner matched to the user's requirements at any given time.
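As a rough illustration, the freely combinable confirmation modes can be modeled as a small set of independent options (timbre change, playback speed, per-phrase versus continuous playback, display aid) bundled into user-selectable presets. The option names and preset values here are hypothetical, not from the patent.

```python
from dataclasses import dataclass

@dataclass
class ConfirmMode:
    alter_timbre: bool = False     # play back with modified voice quality
    speed: float = 1.0             # playback rate relative to the input
    per_phrase: bool = True        # pause after each word/phrase vs. continuous
    highlight: str = "cursor"      # display aid: "cursor", "blink", or "color"

# the user freely selects among preset combinations of the above functions
MODES = {
    "careful": ConfirmMode(alter_timbre=True, speed=0.8, per_phrase=True),
    "skim":    ConfirmMode(alter_timbre=True, speed=1.5, per_phrase=False,
                           highlight="color"),
}

mode = MODES["skim"]
print(mode.per_phrase, mode.speed)  # False 1.5
```

The point of the structure is that each option varies independently, so any combination the user requests is a valid mode rather than a special case.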
An embodiment of the present invention will now be described with reference to FIG. 1.
Input speech 1 is converted into a digital signal by A/D converter 2, analyzed by analysis unit 3, and then recorded in input speech recording unit 4. Any analysis method may be used so long as it is consistent with the recognition method of the recognition unit. Since the analysis results are also used for synthesis during read-back, a method suitable for synthesis is preferable, although the two kinds of processing may of course be performed separately. Using the same analysis for both is desirable for efficiency, but in many cases the sound-source analysis required for the synthesized speech is not used for recognition and therefore must be performed for synthesis only.

Recognition unit 5, upon recognition command signal 7 from control unit 6, requests the input speech (8), recognizes the analyzed input speech, and writes the recognition result into recognition result recording unit 9. Many configurations for recognition unit 5 are publicly known, and since its internal details are not essential to the present invention whichever method is adopted, their description is omitted. Kana-kanji conversion processing unit 10, in response to kana-kanji conversion command signal 11 from control unit 6, issues recognition result request signal 12, takes in the recognition result, converts the input into a sentence containing kanji while determining word or phrase boundaries through processing such as morphological analysis, and writes the output into kana-kanji conversion output recording unit 13; at the same time, it records symbols 14 indicating the positions of the determined word or phrase boundaries at the corresponding predetermined locations in input speech recording unit 4 and kana-kanji conversion output recording unit 13. Known techniques may likewise be used for converting kana character strings into kanji, so no description is given here. The word or phrase boundaries may, of course, also be entered by the speaker at voice input time, in synchronization with the input, using a separate switch or the like.
The processes of inputting the speech, recognizing it, and converting it into kana, kanji, and so on may be executed in parallel, pipeline fashion, or batch fashion for each part. Clearly, the selection between these modes can easily be changed in response to a user command by controlling the output timing of the command signals (7, 11, and so on) from control unit 6.
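A minimal sketch of the two execution orders mentioned above: the three stages (input/analysis, recognition, kana-kanji conversion) can run batch-style, with each stage finishing over all phrases before the next starts, or pipeline-style, with each phrase flowing through all stages in turn. The stage functions are stand-ins for illustration only, not the patent's actual processing.

```python
def analyze(p):   return f"feat({p})"    # stand-in for A/D conversion + analysis
def recognize(f): return f"kana({f})"    # stand-in for the recognition unit
def convert(k):   return f"kanji({k})"   # stand-in for kana-kanji conversion

def batch(phrases):
    feats = [analyze(p) for p in phrases]    # stage 1 over the whole input
    kanas = [recognize(f) for f in feats]    # then stage 2
    return [convert(k) for k in kanas]       # then stage 3

def pipeline(phrases):
    # each phrase passes through all stages before the next is taken up,
    # so recognition of phrase n can overlap (in time) with input of n+1
    return [convert(recognize(analyze(p))) for p in phrases]

assert batch(["a", "b"]) == pipeline(["a", "b"])  # same result, different timing
print(pipeline(["a"]))  # ['kanji(kana(feat(a)))']
```

The two orderings produce identical output; switching between them is purely a matter of when the control unit issues each stage's command, which is why the patent treats the choice as a user-selectable mode.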
Next, the read-back operation will be described. On command from control unit 6, the speech synthesis unit and character display unit 19 each request one word's or one phrase's worth of speech-synthesis and character-display data from input speech recording unit 4 and kana-kanji conversion output recording unit 13, respectively; the speech is synthesized, D/A-converted, and output as read-back speech, and the text is displayed as a mixed kanji-kana sentence. At this point, to change the timbre of the synthesized speech in response to a user command, control unit 6 causes modification signal generation unit 16, which modifies the synthesis parameters, to output coefficients that transform those parameters. For example, when the speech synthesis unit is an LSP synthesizer, well known to those skilled in the art, the voice quality and speaking rate can be changed by multiplying each LSP parameter by a fixed value to shift the formant positions of the synthesized speech, by multiplying the pitch frequency by a fixed value to change the pitch of the voice, or by supplying the synthesis parameters to the synthesis unit at intervals different from the analysis interval. The user enters confirmation/correction information 21 from key input unit 22. When confirmation information is entered, control unit 6 outputs the next word or phrase; when correction information is entered, the corresponding portion of kana-kanji conversion output recording unit 13 is corrected and the process moves on.
When a word or phrase boundary is corrected, the position of the corresponding boundary symbol in input speech recording unit 4 is corrected as well, so that no mismatch arises between the read-back and the character display.
It is also easy to provide a mode in which, at the user's direction, speech playback and character display proceed continuously, regardless of word or phrase boundaries, until an error-correction input 21 is entered; output then stops at the end of the word or phrase being output at that moment and resumes after the correction has been processed. Alternatively, the text can be displayed in advance in units of sentences or paragraphs, with the word or phrase corresponding to the read-back speech currently being output indicated by a cursor, blinking, color display, or the like.
Kanji include many homophones written with different characters, and errors are easier to spot when the characters are displayed within the whole sentence in this way.
Although the above embodiment has been described for Japanese, it goes without saying that an equivalent configuration is possible for other languages in which words with the same pronunciation have different spellings.
As described above, the present invention makes it possible, when text containing homophones is input by voice, to check and correct the text against the input speech with great ease.
FIG. 1 is a block diagram illustrating one embodiment of the present invention.
Claims (1)

[Claims]

1. A voice input device comprising: means for recording input speech; means for recognizing the input speech; means for converting the recognition result into characters or the like; character display means for displaying the converted character string in units of words, phrases, or longer units; means for inputting confirmation and correction information on the content displayed by the character display means; and means for playing back the speech in the input speech recording means divided into word or phrase units.

2. The voice input device according to claim 1, wherein the speech playback means comprises timbre modification means such that the reproduced sound has a timbre different from that of the input speech.

3. The voice input device according to claim 1, wherein the speech playback means is controlled to output sequentially in units of words or phrases each time information is entered from the confirmation and correction information input means, and the character display means has display aid means that makes clearly visible the position of the characters corresponding to the speech being output from the speech playback means.

4. The voice input device according to claim 1, wherein the speech playback means has speed changing means such that the reproduced sound has a speaking rate different from that of the input speech.

5. The voice input device according to claim 1, wherein the input speech recording means has sufficient capacity, and storage means of sufficient capacity is provided to record the output of the character conversion means, so that the time at which speech is input, the time at which the recognition operation is performed, and the time at which confirmation and correction are performed can be separated and processed independently.

6. The voice input device according to claim 1, wherein the confirmation and correction means normally stands in the state in which confirmation information has been entered, and further has a mode in which correction information need be entered only when correction is necessary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP59206238A JPS6184771A (en) | 1984-10-03 | 1984-10-03 | Voice input device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP59206238A JPS6184771A (en) | 1984-10-03 | 1984-10-03 | Voice input device |
Publications (2)
Publication Number | Publication Date |
---|---|
JPS6184771A true JPS6184771A (en) | 1986-04-30 |
JPH0554960B2 JPH0554960B2 (en) | 1993-08-13 |
Family
ID=16520030
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP59206238A Granted JPS6184771A (en) | 1984-10-03 | 1984-10-03 | Voice input device |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS6184771A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6358283U (en) * | 1986-10-04 | 1988-04-18 | ||
- JP2003518266A (en) * | 1999-12-20 | 2003-06-03 | Koninklijke Philips Electronics N.V. | Speech reproduction for text editing of speech recognition system |
- JP2004529381A (en) * | 2001-03-29 | 2004-09-24 | Koninklijke Philips Electronics N.V. | Character editing during synchronized playback of recognized speech |
US7392194B2 (en) | 2002-07-05 | 2008-06-24 | Denso Corporation | Voice-controlled navigation device requiring voice or manual user affirmation of recognized destination setting before execution |
US8117034B2 (en) | 2001-03-29 | 2012-02-14 | Nuance Communications Austria Gmbh | Synchronise an audio cursor and a text cursor during editing |
- JP2019532318A (en) * | 2016-09-22 | 2019-11-07 | Zhejiang Geely Holding Group Co., Ltd. | Audio processing method and apparatus |
- 1984-10-03: application JP59206238A filed in Japan; patent JPS6184771A, active (Granted)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6358283U (en) * | 1986-10-04 | 1988-04-18 | ||
- JP2003518266A (en) * | 1999-12-20 | 2003-06-03 | Koninklijke Philips Electronics N.V. | Speech reproduction for text editing of speech recognition system |
- JP2004529381A (en) * | 2001-03-29 | 2004-09-24 | Koninklijke Philips Electronics N.V. | Character editing during synchronized playback of recognized speech |
US8117034B2 (en) | 2001-03-29 | 2012-02-14 | Nuance Communications Austria Gmbh | Synchronise an audio cursor and a text cursor during editing |
US8380509B2 (en) | 2001-03-29 | 2013-02-19 | Nuance Communications Austria Gmbh | Synchronise an audio cursor and a text cursor during editing |
US8706495B2 (en) | 2001-03-29 | 2014-04-22 | Nuance Communications, Inc. | Synchronise an audio cursor and a text cursor during editing |
US7392194B2 (en) | 2002-07-05 | 2008-06-24 | Denso Corporation | Voice-controlled navigation device requiring voice or manual user affirmation of recognized destination setting before execution |
- JP2019532318A (en) * | 2016-09-22 | 2019-11-07 | Zhejiang Geely Holding Group Co., Ltd. | Audio processing method and apparatus |
US11011170B2 (en) | 2016-09-22 | 2021-05-18 | Zhejiang Geely Holding Group Co., Ltd. | Speech processing method and device |
Also Published As
Publication number | Publication date |
---|---|
JPH0554960B2 (en) | 1993-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3142803B2 (en) | A text-to-speech synthesizer | |
US6778962B1 (en) | Speech synthesis with prosodic model data and accent type | |
US8155958B2 (en) | Speech-to-text system, speech-to-text method, and speech-to-text program | |
JPS6184771A (en) | Voice input device | |
JPH07181992A (en) | Device and method for reading document out | |
JP2580565B2 (en) | Voice information dictionary creation device | |
JP3060276B2 (en) | Speech synthesizer | |
JPH07160289A (en) | Voice recognition method and device | |
JP2612030B2 (en) | Text-to-speech device | |
JPH0527787A (en) | Music reproduction device | |
JPS5991497A (en) | Voice synthesization output unit | |
JPH02251998A (en) | Voice synthesizing device | |
JP3414326B2 (en) | Speech synthesis dictionary registration apparatus and method | |
JP3034554B2 (en) | Japanese text-to-speech apparatus and method | |
JPS613241A (en) | Speech recognition system | |
JPH11259094A (en) | Regular speech synthesis device | |
JP2584236B2 (en) | Rule speech synthesizer | |
JP2570214B2 (en) | Performance information input device | |
JP2547612B2 (en) | Writing system | |
JP2647873B2 (en) | Writing system | |
JPS62113264A (en) | Speech document creating device | |
JP2647872B2 (en) | Writing system | |
JPH113096A (en) | Method and system of speech synthesis | |
JPS58154900A (en) | Sentence voice converter | |
JPS5913634Y2 (en) | language dictionary device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
LAPS | Cancellation because of no payment of annual fees |