JPS63189933A - Device for reading sentence aloud - Google Patents

Device for reading sentence aloud

Info

Publication number
JPS63189933A
Authority
JP
Japan
Prior art keywords
reading
kanji
words
word
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP62022909A
Other languages
Japanese (ja)
Inventor
Masahiko Uchiyama
昌彦 内山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1987-02-02
Filing date
1987-02-02
Publication date
1988-08-05
Application filed by Fujitsu Ltd
Priority to JP62022909A
Publication of JPS63189933A
Legal status: Pending

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To continue reading sentences aloud without interruption by providing an unknown-word reading conversion unit and a kanji dictionary unit, and assigning an on-yomi (Sino-Japanese reading) or kun-yomi (native Japanese reading) to the kanji of any unknown word that is not registered in the text analysis unit. CONSTITUTION: A text reading device is provided with an unknown-word reading conversion unit 5 and a kanji (Chinese character) dictionary unit 6, which are connected to a text analysis unit 21. For an unknown word that is not registered in the analysis unit 21, the kanji of a word consisting of one kanji and kana (Japanese syllabary) is given its kun-yomi by consulting the dictionary unit 6, and the kanji of a word consisting of multiple kanji are given their on-yomi; the result is output to a speech synthesis unit 3. Consequently, the text reading device can continue reading sentences aloud without being interrupted by unknown words.

Description

[Detailed Description of the Invention]

[Summary]

The present invention relates to the reading given to a word when the text data input to a text reading device contains a word that is not registered in the text analysis unit. An unknown-word reading conversion unit and a kanji dictionary unit are provided; the kanji of a word consisting of one kanji and kana is given its kun-yomi, and the kanji of a word consisting of multiple kanji are given their on-yomi, so that unregistered words are assigned readings and can be read aloud.

[Field of Industrial Application]

The present invention relates to a text reading device, and in particular to the reading given to a word when the input text data contains a word that is not registered in the text analysis unit.

A text reading device, which reads aloud text data composed of character codes using synthesized speech, is used at newspaper companies and the like for read-through proofreading, in which a manuscript is checked against the text data created from it.

That is, the text reading device reads the text data aloud in electronically synthesized speech while the proofreader follows the manuscript and proofreads it.

In this case, the text data often contains words that are not registered in the text analysis unit, such as coined words and new words. If the text reading device stops each time, checking that portion is troublesome and the proofreading work takes extra man-hours.

There has therefore been a demand for a text reading device that can read unregistered words by assigning them some reading.

[Prior Art]

FIG. 4 is a configuration block diagram of a conventional text reading device.

In the figure, text data composed of character codes is input to a text input unit 1 from a file device or over a transmission line.

The text analysis unit 2 segments the text data output from the text input unit 1 into words and checks them against its built-in word dictionary. To each segmented word it adds the elements needed to pronounce the word, such as reading, accent, and pitch, as well as the intonation of the word when read as part of a sentence, and outputs the result to the speech synthesis unit 3.

The speech synthesis unit 3 converts the digital signals of these pronunciation elements into an analog speech signal, which is output from the speaker 4.

In such a text reading device, if a word is not registered in the word dictionary of the text analysis unit 2, that is, if it is an unknown word that cannot be analyzed into pronunciation elements, speech synthesis is impossible and the reading of the text stops at that point.
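The stopping behavior of the conventional device can be illustrated with a minimal Python sketch. The dictionary contents, the names `WORD_DICT`, `UnknownWordError`, and `read_aloud_conventional`, and the assumption that the text has already been segmented into words are all hypothetical; the patent does not specify an implementation.

```python
# Hypothetical word dictionary of the text analysis unit: surface form -> reading.
WORD_DICT = {"新聞": "シンブン", "を": "ヲ", "読む": "ヨム"}


class UnknownWordError(Exception):
    """Raised when a word has no entry in the word dictionary."""


def read_aloud_conventional(words):
    """Return the readings of pre-segmented words, halting at the first unknown word."""
    spoken = []
    for word in words:
        if word not in WORD_DICT:
            # Conventional device: no pronunciation elements, so reading stops here.
            raise UnknownWordError(word)
        spoken.append(WORD_DICT[word])
    return spoken
```

In this sketch a single unregistered word aborts the whole read-through, which is the situation the invention sets out to avoid.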

In that case, the character codes are printed or shown on a display so that they can be checked against the manuscript.

[Problems to Be Solved by the Invention]

With this conventional method, new words, coined words, abbreviations, and the like appear frequently in text such as that used at newspaper companies, so unknown words are numerous and the reading of the text is interrupted each time.

The reading is interrupted by unknown words, and the effort of checking them seriously hinders manuscript proofreading.

The present invention was devised in view of these points, and its object is to provide a text reading device that assigns some reading to unknown words and does not interrupt the reading of the text.

[Means for Solving the Problems]

To achieve the above object, the text reading device is provided with an unknown-word reading conversion unit and a kanji dictionary unit, which are connected to the text analysis unit.

For an unknown word that is not registered in the text analysis unit, the kanji of a word consisting of one kanji and kana is looked up in the kanji dictionary unit and given its kun-yomi, and the kanji of an unknown word consisting of multiple kanji are given their on-yomi.
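The selection rule can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, kanji are detected by a simple Unicode range test (an assumption), and a lone kanji with no kana is treated like the one-kanji case because the source does not address it.

```python
def is_kanji(ch: str) -> bool:
    """True for characters in the CJK Unified Ideographs block (assumed kanji test)."""
    return "\u4e00" <= ch <= "\u9fff"


def choose_reading_type(word: str) -> str:
    """Return "kun" for a word with exactly one kanji and "on" for several kanji."""
    kanji_count = sum(1 for ch in word if is_kanji(ch))
    if kanji_count == 0:
        raise ValueError("word contains no kanji")
    # One kanji (typically followed by kana) -> kun-yomi; multiple kanji -> on-yomi.
    return "kun" if kanji_count == 1 else "on"
```

For example, `choose_reading_type("行う")` selects the kun-yomi, while `choose_reading_type("行革審")` selects the on-yomi.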

[Operation]

An unknown word that the text analysis unit cannot analyze is output to the unknown-word reading conversion unit. The unknown-word reading conversion unit looks up the word's kanji in the kanji dictionary unit, gives each kanji its kun-yomi or on-yomi according to the composition of the unknown word, and outputs the result to the speech synthesis unit.

With this handling of unknown words, the text reading device is no longer interrupted by them.

Moreover, the word dictionary of the text analysis unit need not contain every word. Registration can be limited to commonly used words, and unregistered words can be left, as unknown words, to the readings produced by the unknown-word reading conversion unit.

Reducing the number of entries in the word dictionary in this way shortens the average word lookup time in the text analysis unit, so the text can be read aloud smoothly.

[Embodiment]

FIG. 1 is a configuration block diagram of one embodiment of the text reading device of the present invention.

The same reference numerals denote the same objects throughout the figures.

The text analysis unit 21 is given the function of identifying words that are not registered in the word dictionary as unknown words and outputting them to the unknown-word reading conversion unit 5.

Unknown words in the input text data are therefore passed through the text analysis unit to the unknown-word reading conversion unit 5.

The unknown-word reading conversion unit 5 looks up readings in the kanji dictionary unit 6, assigns a reading to the unknown word, and outputs it to the speech synthesis unit 3.

In the first registration scheme for kanji readings, the kanji dictionary unit 6 holds one on-yomi and one kun-yomi for each kanji. If the unknown word consists of a kanji and kana, the kun-yomi of that kanji is retrieved; if the unknown word consists of multiple kanji, each kanji is given its on-yomi.

This scheme does not necessarily expect the unknown word to be read correctly; it is enough that the reading spoken during read-aloud identifies the single kanji it corresponds to.
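A minimal Python sketch of the first registration scheme is given here for illustration. The dictionary entries, the names `KANJI_DICT`, `to_katakana`, and `read_unknown_word`, and the Unicode range test for kanji are assumptions; the kun-yomi stems listed for 革 and 審 are illustrative only, and a kanji missing from the dictionary is not handled.

```python
# Hypothetical kanji dictionary: exactly one on-yomi and one kun-yomi per kanji.
KANJI_DICT = {
    "行": {"on": "ギョウ", "kun": "オコナ"},
    "革": {"on": "カク", "kun": "カワ"},
    "審": {"on": "シン", "kun": "ツマビ"},
}


def is_kanji(ch: str) -> bool:
    """True for characters in the CJK Unified Ideographs block (assumed kanji test)."""
    return "\u4e00" <= ch <= "\u9fff"


def to_katakana(ch: str) -> str:
    """Map a hiragana character to katakana; leave other characters unchanged."""
    if "\u3041" <= ch <= "\u3096":
        return chr(ord(ch) + 0x60)
    return ch


def read_unknown_word(word: str) -> str:
    """Assign a reading: kun-yomi for a one-kanji word, on-yomi for several kanji."""
    kanji_count = sum(1 for ch in word if is_kanji(ch))
    kind = "kun" if kanji_count == 1 else "on"
    parts = []
    for ch in word:
        # Kanji take the selected dictionary reading; kana are passed through.
        parts.append(KANJI_DICT[ch][kind] if is_kanji(ch) else to_katakana(ch))
    return "".join(parts)
```

With these entries, `read_unknown_word("行う")` gives オコナウ and `read_unknown_word("行革審")` gives ギョウカクシン. The word 行く comes out as オコナク, which is not its correct reading but still lets the listener identify the kanji, in line with the stated aim of this scheme.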

FIG. 2 is a configuration block diagram of another embodiment of the text reading device of the present invention, and FIG. 3 is a diagram explaining the second registration scheme for kanji readings.

The text analysis unit 22 has the function of outputting unknown words to the unknown-word reading conversion unit 5 and also of setting markers in the marker column c for kanji readings in the kanji dictionary unit 6.

In the second registration scheme, the kanji readings used with the text analysis unit 22 are stored as shown in FIG. 3: for each kanji (a) there is a reading column (b), and a marker column (c) for each reading.

For example, for 行 the dictionary stores readings such as okona(u) [kun], gyō [on], and kō [on], and for 政 it stores matsurigoto [kun] and sei [on].

Each time the text analysis unit 22 assigns a reading to a kanji, the marker in column c is moved to the on-yomi that was used.

For the kanji of an unknown word, the unknown-word reading conversion unit 5 uses the reading in the kanji dictionary unit 6 that carries the marker.

In this way, an unknown word is given the reading most recently used by the text analysis unit 22.

For example, suppose the unknown word is 行革審, a word consisting of multiple kanji. Even if these kanji have more than one on-yomi, when the word 行政改革審議会 appears earlier in the text data and has been read as ギョウセイカイカクシンギカイ (gyōsei kaikaku shingikai), 行革審 will be read as ギョウカクシン (gyōkakushin) even though it is treated as an unknown word.
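The marker mechanism of the second registration scheme can be sketched in Python as follows. This is an illustrative sketch only: the data layout, the names `KANJI_DICT`, `WORD_DICT`, `read_known_word`, and `read_unknown_word`, and the assumption that the word dictionary stores one reading per kanji are all hypothetical.

```python
# Hypothetical kanji dictionary: several readings per kanji plus the index of the
# reading that currently carries the marker (initially the first entry).
KANJI_DICT = {
    "行": {"readings": ["オコナ", "ギョウ", "コウ"], "marked": 0},
    "政": {"readings": ["マツリゴト", "セイ"], "marked": 0},
    "革": {"readings": ["カワ", "カク"], "marked": 0},
    "審": {"readings": ["シン"], "marked": 0},
}

# Hypothetical word dictionary: the reading of each registered word, split per kanji.
WORD_DICT = {
    "行政改革審議会": ["ギョウ", "セイ", "カイ", "カク", "シン", "ギ", "カイ"],
}


def read_known_word(word: str) -> str:
    """Read a registered word and move each kanji's marker to the reading used."""
    readings = WORD_DICT[word]
    for ch, reading in zip(word, readings):
        entry = KANJI_DICT.get(ch)
        if entry is not None and reading in entry["readings"]:
            entry["marked"] = entry["readings"].index(reading)
    return "".join(readings)


def read_unknown_word(word: str) -> str:
    """Read an unknown all-kanji word using the currently marked reading of each kanji."""
    return "".join(
        KANJI_DICT[ch]["readings"][KANJI_DICT[ch]["marked"]] for ch in word
    )
```

Calling `read_known_word("行政改革審議会")` first and then `read_unknown_word("行革審")` yields ギョウカクシン, because the markers for 行, 革, and 審 were moved when the longer word was read.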

The reading assigned to an unknown word can also easily be registered in the word dictionary of the text analysis unit so that it is available when the word appears in subsequent text data.
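Registering the assigned reading might look like the following Python sketch. The function name `read_word` and its parameters are hypothetical, and `fallback` stands in for whatever the unknown-word reading conversion unit does.

```python
def read_word(word, word_dict, fallback):
    """Return the reading of a word, registering a fallback reading on first sight."""
    if word not in word_dict:
        # Unknown word: obtain a reading once and store it in the word dictionary.
        word_dict[word] = fallback(word)
    return word_dict[word]
```

After the first call, the word is found in the dictionary, so the fallback is not invoked again for the same word.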

[Effects of the Invention]

As described above, according to the present invention, even an unknown word that is not registered in the text analysis unit of the text reading device is given a reading, so reading can continue without interruption, which is extremely useful in practice.

[Brief Description of the Drawings]

FIG. 1 is a configuration block diagram of one embodiment of the text reading device of the present invention.
FIG. 2 is a configuration block diagram of another embodiment of the present invention.
FIG. 3 is a diagram explaining the kanji dictionary of FIG. 2.
FIG. 4 is a configuration block diagram of the conventional example.

In the figures, 1 is the text data input unit; 2, 21, and 22 are text analysis units; 3 is the speech synthesis unit; 4 is the speaker; 5 is the unknown-word reading conversion unit; and 6 is the kanji dictionary unit.

Claims (1)

[Claims]

A text reading device in which a text analysis unit (2) analyzes input text data, checks the words of the text data against pre-registered word readings, and outputs a reading for each word, and a speech synthesis unit (3) synthesizes speech and reads the text aloud in synthesized speech, characterized in that:
an unknown-word reading conversion unit (5) connected to the text analysis unit (21, 22) and a kanji dictionary unit (6) connected to the unknown-word reading conversion unit (5) are provided;
a word of the text data that is not registered in the text analysis unit (21, 22) is output to the unknown-word reading conversion unit (5) for processing; and
among such unregistered words, the kanji of a word consisting of one kanji and kana is given its kun-yomi and the kanji of a word consisting of multiple kanji are given their on-yomi, and the resulting reading is input to the speech synthesis unit (3).
JP62022909A 1987-02-02 1987-02-02 Device for reading sentence aloud Pending JPS63189933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP62022909A JPS63189933A (en) 1987-02-02 1987-02-02 Device for reading sentence aloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP62022909A JPS63189933A (en) 1987-02-02 1987-02-02 Device for reading sentence aloud

Publications (1)

Publication Number Publication Date
JPS63189933A (en) 1988-08-05

Family

ID=12095766

Family Applications (1)

Application Number Title Priority Date Filing Date
JP62022909A Pending JPS63189933A (en) 1987-02-02 1987-02-02 Device for reading sentence aloud

Country Status (1)

Country Link
JP (1) JPS63189933A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01321557A (en) * 1988-06-23 1989-12-27 Ricoh Co Ltd Text voice synthesizing device
WO2004013763A3 (en) * 2002-07-31 2004-05-21 Koninkl Philips Electronics Nv Determining the reading of a kanji word

Similar Documents

Publication Publication Date Title
US5164900A (en) Method and device for phonetically encoding Chinese textual data for data processing entry
EP0691023B1 (en) Text-to-waveform conversion
US20090150157A1 (en) Speech processing apparatus and program
GB2158776A (en) Method of computerised input of Chinese words in keyboards
JPS63189933A (en) Device for reading sentence aloud
JPH06282290A (en) Natural language processing device and method thereof
JPS634206B2 (en)
JP6998017B2 (en) Speech synthesis data generator, speech synthesis data generation method and speech synthesis system
Gakuru et al. Development of a Kiswahili text to speech system.
JP2006030384A (en) Device and method for text speech synthesis
JP2000352990A (en) Foreign language voice synthesis apparatus
JPH0210957B2 (en)
JP2612030B2 (en) Text-to-speech device
RU2113726C1 (en) Computer equipment for reading of printed text
Olabe et al. Real time text-to-speech conversion system for spanish
JP2658476B2 (en) Document Braille device
JP2502101B2 (en) Sentence proofreading device
JPH03245192A (en) Method for determining pronunciation of foreign language word
JPS6288054A (en) Text read up device
Gaved Pronunciation and text normalisation in applied text-to-speech systems.
Granström et al. A danish text-to-speech system using a text normalizer based on morph analysis.
Olaszi Analysis of Written and Spoken Form of Hungarian Numbers for TTS Applications
JP2801601B2 (en) Text-to-speech synthesizer
JPH054676B2 (en)
JP2614912B2 (en) Text-to-speech device