JPH02230199A - Voice converting device

Info

Publication number: JPH02230199A
Application number: JP1051120A
Authority: JP
Inventors: Toshiyuki Ogura; 小倉　敏行
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1989-03-02
Filing date: 1989-03-02
Publication date: 1990-09-12

Abstract

PURPOSE: To allow voice information to be output in block units, and so improve operability for the user, by dividing the voice information into a plurality of voice blocks, converting the blocks into text information one by one, and storing each voice block and its text as a pair keyed by an identification code. CONSTITUTION: When the user specifies a conversion request and inputs voice information at a voice input/output device 1, the control unit 14 of the voice conversion device 3, acting on the signal from a voice input/output unit 10, has a voice dividing unit 15 divide the input voice information into a plurality of voice blocks and has a voice conversion unit 12 convert the blocks into text blocks. A voice/text management unit 16 then stores each pair of voice block and text block in a storage unit 13 and records their storage positions in a voice/text correspondence table unit 17, keyed by the identification code. When the user requests a read of information from a text input/output device 2, the control unit 14 displays the corresponding information on the text input/output device 2 according to the key.

Description

[Detailed Description of the Invention]

[Industrial Application Field]

Conventionally, a voice conversion device of this type has had difficulty converting voice information into text information with 100% accuracy in a single pass, and misrecognition or failure to recognize can occur. For this reason, a user of a voice conversion device often converts the voice information into text information, compares the text with the original voice information to check it, makes corrections where necessary, and only then treats the result as the finalized text information.

FIG. 4 is a block diagram of a representative conventional voice conversion device. A voice input/output device 1 and a text input/output device 2 are connected to a voice conversion device 3. The voice conversion device 3 comprises: a voice input/output unit 10 that exchanges voice information with the voice input/output device 1; a text input/output unit 11 that exchanges text information with the text input/output device 2; a voice conversion unit 12 that receives voice information from the voice input/output unit 10 and converts it into text information; a storage unit 13 that stores the voice information and the text information; and a control unit 14 that receives operation requests and the like from the voice input/output device 1 or the text input/output device 2 and controls the operation of the voice conversion device 3.

Next, the operation of the device in FIG. 4 will be described. To convert voice information into text information using the voice conversion device 3, the user specifies a voice conversion request and inputs the voice information from the voice input/output device 1. The control unit 14 of the voice conversion device 3 receives the conversion request via the voice input/output unit 10 and then has the voice input/output unit 10 receive the voice information. It stores this voice information in the storage unit 13, has the voice conversion unit 12 convert it into text information, and stores that text information in the storage unit 13 as well.

When, after conversion is complete, the user requests read-out of the converted text information from the text input/output device 2, the control unit 14 receives the request via the text input/output unit 11, reads the text information from the storage unit 13, and sends it to the text input/output device 2 through the text input/output unit 11. The user checks the conversion while viewing the text information displayed on the text input/output device 2.

If the user needs to check the corresponding voice information, for example because part of the text was not converted correctly, the user requests read-out of the voice information from the text input/output device 2. The control unit 14 receives this request from the text input/output unit 11, reads out all of the stored voice information from the storage unit 13, and sends it to the voice input/output device 1 via the voice input/output unit 10. The user then listens to the voice information to check the part that corresponds to the text in question.

[Problem to be solved by the invention]

The conventional voice conversion device described above merely stores the received voice information and the text information converted from it as separate items. Consequently, even when the user only wants to check the voice information corresponding to one arbitrary position in the converted text, the user has to listen to the input voice information again from the beginning, which is cumbersome.

[Means for Solving the Problems]

The voice conversion device of the present invention is a voice conversion device that converts received voice information into text information and stores the voice information and the text information, and comprises: voice dividing means for dividing the received voice information into a plurality of voice blocks; voice/text management means for setting and causing to be stored the storage positions of the voice blocks obtained by the voice dividing means and of the text information corresponding to those voice blocks, together with their identification codes, and for attaching the identification code to the text information when it is read out; and voice/text correspondence storage means for storing the storage positions of the voice blocks and the text information using the identification code set by the voice/text management means as a key.

[Operation]

With the above configuration, the voice conversion device divides the received voice information into a plurality of voice blocks, converts the voice information into text information one voice block at a time, and stores each voice block and its text information as a pair keyed by an identification code. The voice information can therefore be output in units of voice blocks.

[Embodiment]

Next, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention. In the figure, a voice input/output device 1 and a text input/output device 2 are connected to a voice conversion device 3. As in the conventional example, the voice conversion device 3 has a voice input/output unit 10, a text input/output unit 11, a voice conversion unit 12, a storage unit 13, and a control unit 14. It additionally has: a voice dividing unit 15 that divides the voice information into a plurality of voice blocks; a voice/text management unit 16 that receives each voice block together with the text information obtained by converting that voice block in the voice conversion unit 12, stores them in the storage unit 13, and has their storage positions within the storage unit 13 recorded with an identification code as the key; and a voice/text correspondence table unit 17 that stores, for each voice block, the correspondence between the storage positions of the voice block and of its text information, keyed by the identification code.

Next, the operation of the device in FIG. 1 will be described. To convert voice information into text information using the voice input/output device 1 and the text input/output device 2, the user first specifies a voice conversion request and then inputs the voice information from the voice input/output device 1.

When the control unit 14 of the voice conversion device 3 receives, via the voice input/output unit 10, the voice conversion request from the voice input/output device 1, it has the voice input/output unit 10 receive the voice information that follows, has the voice dividing unit 15 divide this voice information into a plurality of voice blocks, and has the voice conversion unit 12 convert each voice block into a text block. The control unit 14 further has the voice/text management unit 16 store each voice block, and the text block obtained by converting it, in the storage unit 13, and record the storage positions of the voice block and the text block in the storage unit 13 in the voice/text correspondence table unit 17, keyed by the identification code.
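The storing sequence just described can be illustrated with a minimal Python sketch. It is not taken from the patent: the class name `VoiceConverter`, the `recognize` callback (a stand-in for the voice conversion unit 12), the use of dictionaries for the storage unit 13 and the correspondence table unit 17, and the sequential integer storage positions and block numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class VoiceConverter:
    # Hypothetical stand-in for the voice conversion unit 12 (speech to text).
    recognize: Callable[[bytes], str]
    # Storage unit 13, modeled as position -> stored item (an assumption).
    storage: Dict[int, object] = field(default_factory=dict)
    # Correspondence table unit 17: identification code -> (voice pos, text pos).
    table: Dict[int, Tuple[int, int]] = field(default_factory=dict)
    _next_pos: int = 0

    def _store(self, item: object) -> int:
        """Store one item and return the position it was stored at."""
        pos = self._next_pos
        self.storage[pos] = item
        self._next_pos += 1
        return pos

    def ingest(self, voice_blocks: List[bytes]) -> List[int]:
        """Convert each voice block, store the pair, and record both positions."""
        ids = []
        for block in voice_blocks:
            text = self.recognize(block)
            voice_pos = self._store(block)
            text_pos = self._store(text)
            block_id = len(self.table) + 1  # assumed numbering scheme: 1, 2, 3, ...
            self.table[block_id] = (voice_pos, text_pos)
            ids.append(block_id)
        return ids
```

Here the voice blocks are assumed to have been produced already by the dividing step; each call to `ingest` returns the identification codes under which the pairs were recorded.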

When, after voice conversion is complete, the user requests read-out of the text information from the text input/output device 2, the control unit 14 of the voice conversion device 3 receives the request via the text input/output unit 11, reads the text information and its identification codes from the storage unit 13, and sends the text information and identification codes to the text input/output device 2 through the text input/output unit 11. The user checks the conversion while viewing the text information displayed on the text input/output device 2.

For any part of the converted text that the user wishes to check, for example because it was not converted correctly, the user designates its identification code and issues a voice confirmation request. The control unit 14 receives the confirmation request from the text input/output device 2, has the voice/text management unit 16 look up in the voice/text correspondence table unit 17 the storage position of the voice block with the designated identification code, has that voice block read from the storage unit 13, and sends it to the voice input/output device 1 via the voice input/output unit 10. The user can thus check the content by listening only to the voice block for the part in question.

To re-input part of the voice information and convert it again, the user designates the identification code of the voice block to be reconverted and inputs the voice information from the voice input/output device 1. The control unit 14 of the voice conversion device 3, recognizing that this is a re-input of voice, has the voice information received via the voice input/output unit 10 and, in the same way as in the original conversion, stores the resulting voice block and its corresponding text block in the storage unit 13 in place of the designated voice block.

FIG. 2 illustrates the operation of the voice dividing unit 15 and the voice conversion unit 12. The voice dividing unit 15 divides the input voice information into a plurality of voice blocks. As the means of division, the example shown here detects the voiced and silent portions of the voice and assigns each voiced section lying between silent portions as one voice block. The voice information thus divided is converted into text information block by block by the voice conversion unit 12, and the results are stored as text blocks.
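The voiced/silent division can be sketched as follows. This is a minimal illustration in Python, not the patent's method in detail: the patent only states that voiced and silent portions are detected, so the amplitude `threshold`, the `min_silence` run length that ends a block, and the function name are assumptions.

```python
from typing import List, Sequence, Tuple


def split_into_voice_blocks(
    samples: Sequence[int],
    threshold: int = 100,   # assumed amplitude at or above which a sample is "voiced"
    min_silence: int = 3,   # assumed number of silent samples that closes a block
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one per voiced section."""
    blocks: List[Tuple[int, int]] = []
    start = None      # index where the current voiced section began
    silent_run = 0    # consecutive silent samples seen inside that section
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # Close the block at the last voiced sample before the silence.
                blocks.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Input ended inside a voiced section; trim any trailing silence.
        blocks.append((start, len(samples) - silent_run))
    return blocks
```

Short pauses below `min_silence` stay inside a block, so a brief gap within a word does not split it; a practical detector would more likely work on frame energy than on single samples.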

FIG. 3 shows the memory layout of the voice/text correspondence table unit 17. In this figure, the voice block assigned block number 1 as its identification code is stored at storage position 1000 of the storage unit 13, and the text block resulting from conversion of that voice block is stored at storage position 2000 of the storage unit 13. If the storage unit 13 is a disk device, for example, the storage position value corresponds to the sector and track numbers.
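As an illustration of how such a table supports block-level playback and re-input, the following Python sketch uses the FIG. 3 values (block number 1, positions 1000 and 2000). The names `TableEntry`, `read_voice_block`, and `replace_block`, the dictionary model of the storage unit 13, and the choice to overwrite the existing positions on re-input are assumptions; the patent says only that the new blocks are stored in place of the designated one.

```python
from typing import Dict, NamedTuple


class TableEntry(NamedTuple):
    voice_pos: int  # storage position of the voice block
    text_pos: int   # storage position of the corresponding text block


# Storage unit 13 modeled as position -> content (placeholder contents).
storage: Dict[int, object] = {1000: b"<voice block 1>", 2000: "text block 1"}

# Correspondence table unit 17: identification code (block number) -> positions.
table: Dict[int, TableEntry] = {1: TableEntry(voice_pos=1000, text_pos=2000)}


def read_voice_block(block_id: int) -> object:
    """Look up the voice block's position by identification code and read it."""
    return storage[table[block_id].voice_pos]


def replace_block(block_id: int, new_voice: bytes, new_text: str) -> None:
    """Re-input: store the new pair in place of the designated block (assumed overwrite)."""
    entry = table[block_id]
    storage[entry.voice_pos] = new_voice
    storage[entry.text_pos] = new_text
```

Because both operations go through the identification code, only the designated block is read or replaced and the other blocks are left untouched.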

This embodiment shows an example in which the voice input/output device and the text input/output device are connected to the voice conversion device, but a telephone that handles voice and a data terminal that handles text may instead be connected over a communication network as the voice input/output device and the text input/output device, respectively. Furthermore, a composite terminal device that can handle both voice information and text information may be connected to the voice conversion device via a communication network.

[Effects of the Invention]

As described above, the voice conversion device of the present invention divides voice information into a plurality of voice blocks, converts it into text information one voice block at a time, and stores each voice block and its text information as a pair keyed by an identification code. When the user wants to check the voice information corresponding to an arbitrary part of the converted text, only the voice block corresponding to that text can be played back, which removes the burden of listening to the entire voice input again. In addition, when the user wants to redo part of the voice input, designating the identification code of the voice block to be re-input before inputting it allows only that part to be re-input, which greatly improves operability for the user.

13: storage unit; 14: control unit; 15: voice dividing unit; 16: voice/text management unit; 17: voice/text correspondence table unit.

Claims

[Claims]

A voice conversion device that converts received voice information into text information and stores the voice information and the text information, the voice conversion device comprising: voice dividing means for dividing the received voice information into a plurality of voice blocks; voice/text management means for setting and causing to be stored the storage positions of the voice blocks obtained by the voice dividing means and of the text information corresponding to those voice blocks, together with their identification codes, and for attaching the identification code to the text information when it is read out; and voice/text correspondence storage means for storing the storage positions of the voice blocks and the text information using the identification code set by the voice/text management means as a key.