JPH11167398A

JPH11167398A - Voice synthesizer

Info

Publication number: JPH11167398A
Application number: JP9334079A
Authority: JP
Inventors: Koichi Shiraki; 宏一白木; Yasushi Ishikawa; 泰石川; Akito Nagai; 明人永井
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1997-12-04
Filing date: 1997-12-04
Publication date: 1999-06-22

Abstract

PROBLEM TO BE SOLVED: To eliminate the need for key operation by the user when skip-reading a document with a voice synthesizer that converts character information into voice, and to enable the voice synthesizer to output as voice only the headlines of the document or the key points of each paragraph. SOLUTION: The voice synthesizer is provided with a format conversion part 10, which analyzes the structure of an input document split into plural sentences and creates and outputs a format-converted document in which structure information that allows headlines and paragraphs to be discriminated is added to the headlines and the paragraphs that follow them in the input document, and a selecting part 20, which selects predetermined character information in the format-converted document and creates and outputs a final format-converted document in which this selection information is added to the format-converted document. The voice synthesizer further comprises a read sentence creating part 30, which creates and outputs, as a read sentence, the character information of the sentences to which the selection information has been added in the final format-converted document, and a voice synthesis part 40, which converts the read sentence into speech and outputs it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a speech synthesizer that converts document information into speech and outputs it, and in particular to a skip-reading (skimming) technique that converts only part of a document into speech and outputs it.

[0002]

[Prior Art] FIG. 31 shows a flowchart of a conventional speech synthesizer with a so-called skimming or skip-reading function (Japanese Unexamined Patent Publication No. 7-114537). With this speech synthesizer, the user can skip ahead while speech is being output by pressing a skip key that instructs skipping. It is described below with reference to FIG. 31.

[0003] First, in step S1, the speech synthesizer reads a character of the character string. In step S2, it determines whether the user has pressed the skip key. If the key has not been pressed, the process proceeds to step S3, where the character string is analyzed, and in step S4 synthesized speech corresponding to the analysis result is output. The process then returns to step S1 and repeats. If the user presses the skip key while speech is being output (step S2), the speech synthesizer proceeds to step S5.

[0004] In step S5, the synthesizer determines whether the character just read is a delimiter. If it is, the process returns to step S1; if not, the process proceeds to step S6. Here, a delimiter is a character such as "。", "．", "、" or "，" that the user has set in advance. The loop of steps S5 and S6 skips characters until a delimiter appears among the characters read. When a delimiter appears, the process returns to step S1, and conversion of the character string into speech resumes from the character following the delimiter.
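As an illustration only, the conventional flow of steps S1 to S6 could be sketched in Python as follows. The function name speak_with_skip and the parameter skip_pressed_at (the character positions at which the user is assumed to press the skip key) are assumptions made for this sketch and do not appear in the publication; the role given to step S6 is likewise inferred from the loop described above.

```python
def speak_with_skip(text, skip_pressed_at, delimiters="。．、，"):
    """Return the characters that would be voiced.

    skip_pressed_at is a hypothetical set of character positions at which
    the user is assumed to press the skip key.
    """
    spoken = []
    skipping = False
    for position, ch in enumerate(text):   # S1 (and, presumably, S6): read a character
        if position in skip_pressed_at:    # S2: has the skip key been pressed?
            skipping = True
        if skipping:
            if ch in delimiters:           # S5: a delimiter ends the skip
                skipping = False
            continue                       # skipped characters are not voiced
        spoken.append(ch)                  # S3, S4: analyze and output speech
    return "".join(spoken)


print(speak_with_skip("abc、def、ghi", {1}))
```

In this sketch, pressing the key at the second character drops everything up to and including the first delimiter, and voicing resumes with the character after it.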

[0005]

[Problems to Be Solved by the Invention] With the conventional speech synthesizer, the user had to operate the skip key every time a passage was to be skipped, and so could not take a hand off the skip key. Nor was it possible, for the purpose of quickly grasping the outline of a document, to output as speech only heading items such as chapter and section titles or only the key points of each paragraph.

[0006] An object of the present invention is to provide a speech synthesizer that requires no key operation by the user for skip reading, and a speech synthesizer that outputs as speech only the heading items of a document or the key points of each paragraph.

[0007]

[Means for Solving the Problems] The first invention comprises: a format conversion unit that analyzes the structure of an input document divided into a plurality of sentences and generates and outputs a format-converted document consisting of the character information making up each sentence of the heading items in the input document and of the paragraphs following those heading items, together with structure information by which the heading items and paragraphs can be identified; a selection unit that selects predetermined sentences in the format-converted document and generates and outputs a final format-converted document in which this selection information is added to the format-converted document; a read-aloud text generation unit that generates and outputs, as read-aloud text, the character information of the sentences to which the selection information has been added in the final format-converted document; and a speech synthesis unit that converts the read-aloud text into speech and outputs it.

[0008] The second invention comprises: a heading discrimination unit that identifies heading items, on the basis of document structure tags, in an input document having document structure tags by which heading items and paragraphs can be identified, and outputs structure information by which the heading items can be identified; a paragraph discrimination unit that identifies paragraphs in the input document on the basis of the document structure tags and outputs structure information by which the paragraphs can be identified; and a format-converted document output unit that extracts the character information from the input document and generates and outputs a format-converted document consisting of the extracted character information and the structure information by which the heading items and paragraphs can be identified.

[0009] The third invention comprises: a heading format analysis unit that examines the input document for the format in which headings are written and outputs this format; a heading discrimination unit that identifies, on the basis of this format, the character strings representing heading items and outputs structure information by which the heading items can be identified; a paragraph discrimination unit that identifies paragraphs beginning with an indent in the text following a heading item and outputs structure information by which the paragraphs can be identified; and a format-converted document output unit that generates and outputs a format-converted document consisting of the character information in the input document and the structure information by which the heading items and paragraphs can be identified.

[0010] The fourth invention comprises a selection unit that selects, from an input document having a plurality of hierarchically organized heading items, the heading items down to a predetermined level of the hierarchy, and generates and outputs a final format-converted document in which this selection information is added to the format-converted document.

[0011] The fifth invention comprises: a heading selection unit that selects predetermined heading items in the format-converted document and adds this selection information to the format-converted document; a paragraph selection unit that selects predetermined paragraphs in the level below the selected heading items and adds this selection information to the format-converted document; and a sentence selection unit that selects predetermined sentences in the selected paragraphs and adds this selection information to the format-converted document.

[0012] The sixth invention comprises a sentence selection unit that selects the opening and closing sentences of each paragraph selected by the paragraph selection unit and adds this selection information to the format-converted document.

[0013] The seventh invention comprises a sentence selection unit that, when the number of characters in the sentences to be skipped between one selected sentence and the next selected sentence exceeds a predetermined value, selects the sentence located in the middle of those skipped sentences and adds this selection information to the format-converted document.
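By way of example only, the rule of the seventh invention might be sketched in Python as below. The function name add_middle_sentences, the 0-based indices, and the tie-breaking choice for an even number of skipped sentences (the earlier of the two central sentences) are assumptions of this sketch; the publication states only that the sentence located in the middle is selected.

```python
def add_middle_sentences(sentences, selected, max_skipped_chars):
    """Add the middle sentence of every over-long skipped run.

    sentences: list of sentence strings.
    selected: 0-based indices of the sentences already selected.
    """
    chosen = sorted(selected)
    extra = []
    for left, right in zip(chosen, chosen[1:]):
        gap = list(range(left + 1, right))            # sentences that would be skipped
        skipped_chars = sum(len(sentences[i]) for i in gap)
        if gap and skipped_chars > max_skipped_chars:
            extra.append(gap[(len(gap) - 1) // 2])    # assumed tie-break: earlier middle
    return sorted(set(chosen) | set(extra))


sentences = ["aaaa", "bbbb", "cccc", "dddd", "eeee"]
print(add_middle_sentences(sentences, {0, 4}, 10))
```

Here the three skipped sentences total 12 characters, which exceeds the threshold of 10, so the central sentence is added to the selection.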

[0014] The eighth invention comprises: a sentence boundary keyword detection unit that detects, among the sentences making up a paragraph selected by the paragraph selection unit, a sentence boundary keyword marking a break in a run of sentences, and outputs the position of that keyword; and a paragraph addition unit that divides the paragraph at the position of the detected sentence boundary keyword to add a new paragraph. The sentence selection unit selects the new paragraph added by the paragraph addition unit and adds this selection information to the format-converted document.

[0015] The ninth invention comprises a keyword extraction unit that examines how often each independent word (content word) appears in the sentences of the paragraphs to which the selection information has been added, and outputs the frequently occurring independent words as read-aloud text.

[0016] The tenth invention comprises a modifier clause processing unit that detects adnominal modifier clauses in the sentences of the paragraphs to which the selection information has been added, and outputs, as read-aloud text, the sentences with those clauses removed.

[0017] The eleventh invention comprises a skipped-character-count notification unit that, when the number of characters in the sentences skipped between one sentence to which the selection information has been added and the next such sentence exceeds a predetermined value, adds to the read-aloud text a skip notification sentence announcing the number of characters skipped.
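A minimal Python sketch of the notification rule of the eleventh invention is given below for illustration. The function name with_skip_notices, the wording of the notice, and its placement immediately before the next selected sentence are assumptions of this sketch; the publication specifies only that a notification of the skipped character count is added when the count exceeds a predetermined value.

```python
def with_skip_notices(sentences, selected, threshold):
    """Build read-aloud text with a notice wherever a long run was skipped.

    sentences: list of sentence strings.
    selected: set of 0-based indices of the selected sentences.
    """
    out = []
    skipped_chars = 0
    seen_selected = False
    for index, sentence in enumerate(sentences):
        if index in selected:
            if seen_selected and skipped_chars > threshold:
                # Hypothetical wording for the skip notification sentence.
                out.append(f"({skipped_chars} characters skipped)")
            out.append(sentence)
            skipped_chars = 0
            seen_selected = True
        else:
            skipped_chars += len(sentence)
    return out


text = ["Intro.", "x" * 30, "y" * 25, "End."]
print(with_skip_notices(text, {0, 3}, 50))
```

Only runs between two selected sentences are counted in this sketch, matching the wording "between one sentence and the next".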

[0018] The twelfth invention comprises: a control unit that outputs the character information in the read-aloud text output from the read-aloud text generation unit as a read-aloud character string and also issues a cue sound generation command at each sentence boundary in the read-aloud text; a text-to-speech conversion unit that converts the read-aloud character string output from the control unit into synthesized speech; a cue sound generation unit that outputs a cue sound in response to the generation command; and an adder that adds the synthesized speech and the cue sound to generate and output the output speech.

[0019]

[Embodiments of the Invention] Embodiment 1. In Embodiment 1, only the header and headings of a structured document such as an HTML document are converted into speech and output, so that the listener can learn the outline of the document. FIG. 1 is an overall configuration diagram of the speech synthesizer of Embodiment 1, FIG. 2 is a configuration diagram of the format conversion unit 10, and FIG. 3 is a configuration diagram of the speech synthesis unit 40. The input document 50, the first format-converted document 60, the final format-converted document 70, and the read-aloud text 80, which are elements of FIG. 1, are shown in FIGS. 4 to 7, respectively. The input document 50 in Embodiment 1 is a structured document, such as an HTML document, that contains document structure tags indicating the structure of the document; in the following, the input document 50 is assumed to be an HTML document. First, the format conversion unit 10 converts the input document 50 into a format that the selection unit 20 and the read-aloud text generation unit 30, which follow the format conversion unit 10, can handle. The processing is described in detail below.

[0020] In an HTML document, document structure tags such as <HEAD>, <H1>, <H2>, <IMG>, and <P> are used to indicate the document header, headings, figures and tables, and paragraphs in the body text. The hierarchy of headings is expressed by the document structure tags <H1>, <H2>, ..., <H6>. On the basis of these document structure tags, the heading discrimination unit 11 identifies heading items, such as the header and headings, in the HTML input document 50 shown in FIG. 4, and outputs the position information and hierarchy information of each heading item in the input document 50 to the paragraph discrimination unit 12. The hierarchy information output by the heading discrimination unit 11 is the number that indicates the heading level in the document structure tags <H1>, <H2>, ..., <H6>. The document header, however, is treated as the topmost level, above any heading.

[0021] On the basis of the document structure tag <P>, which marks a paragraph boundary, the paragraph discrimination unit 12 identifies the paragraph boundaries in the parts of the input document that are not heading items, and outputs position information 52, such as the line number of each paragraph boundary in the input document, to the format-converted document output unit 13. The format-converted document output unit 13 generates the character information of the first format-converted document 60 by deleting figures, tables, and document structure tags from the input document 50. Then, as shown in FIG. 5, it adds hierarchy information in the form of symbols such as H0 and H1, indicating the level of a heading item, to the left of each heading item (document title, heading, and so on) in that character information, and adds position information in the form of the symbol P, indicating the start of a paragraph, to the left of the first line of each paragraph. In this way it generates and outputs the first format-converted document 60. The symbol H0 denotes the document header; H1, H2, ..., H6 denote headings, as in the HTML document; and P denotes the start of a paragraph.
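The conversion described in paragraphs [0020] and [0021] can be illustrated with a short Python sketch built on the standard html.parser module. This is a sketch under stated assumptions, not the implementation of the publication: the class name FormatConverter, the function convert, the list-of-tuples representation of the first format-converted document, and the use of the <TITLE> text as the header item H0 are all choices made here for illustration. Sentence splitting and line numbers are omitted.

```python
from html.parser import HTMLParser


class FormatConverter(HTMLParser):
    """Collect (marker, text) pairs: H0 header, H1..H6 headings, P paragraphs."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.lines = []     # the converted document
        self.marker = None  # marker for the text currently being collected
        self.buffer = []

    def _flush(self):
        text = "".join(self.buffer).strip()
        if text:
            self.lines.append((self.marker or "", text))
        self.marker, self.buffer = None, []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag == "title":              # assumption: title text stands for the header
            self._flush()
            self.marker = "H0"
        elif tag in self.HEADINGS:
            self._flush()
            self.marker = "H" + tag[1]
        elif tag == "p":                # <P> marks a paragraph boundary
            self._flush()
            self.marker = "P"
        # every other tag (for example <IMG>) is simply dropped

    def handle_endtag(self, tag):
        if tag == "title" or tag in self.HEADINGS:
            self._flush()

    def handle_data(self, data):
        self.buffer.append(data)


def convert(html):
    parser = FormatConverter()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any text still buffered at the end of input
    return parser.lines


sample = ("<HTML><HEAD><TITLE>Report</TITLE></HEAD><BODY><H1>Intro</H1>"
          "<P>First sentence. Second.<IMG SRC='a.gif'><P>Next paragraph."
          "<H2>Details</H2><P>More.</BODY></HTML>")
for marker, text in convert(sample):
    print(marker, text)
```

Each output pair corresponds to one marked line of the kind described for FIG. 5, with the tags and the figure removed.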

[0022] Next, the selection unit 20, which follows the format conversion unit 10, is described. On the basis of the hierarchy information of the heading items written in the first format-converted document 60, the selection unit 20 selects from that document only the heading items in the top K levels of the hierarchy and adds this selection information to the first format-converted document 60, thereby generating and outputting the final format-converted document 70 shown in FIG. 6. Here, K is a preset integer, which is 2 in this embodiment. The selection information is the numeral 1 to the right of H0, H2, and H3 in FIG. 6: a 1 indicates that the item is selected, and the absence of a numeral indicates that it is not.
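The selection step can be sketched in Python as follows. The publication does not spell out how the "top K levels" are counted, so this sketch adopts one possible reading as an explicit assumption: the header H0 is always kept, and in addition the K shallowest heading levels that actually occur in the document are kept. The function name select_headings and the tuple layout are likewise hypothetical.

```python
def select_headings(doc, k):
    """doc: list of (marker, text). Returns a list of (marker, flag, text).

    flag is "1" for a selected heading item and "" otherwise.
    Assumed rule: keep H0, plus the k shallowest heading levels present.
    """
    heading_levels = sorted(
        {int(m[1:]) for m, _ in doc if m.startswith("H") and m != "H0"}
    )
    keep = {0, *heading_levels[:k]}
    result = []
    for marker, text in doc:
        selected = marker.startswith("H") and int(marker[1:]) in keep
        result.append((marker, "1" if selected else "", text))
    return result


doc = [("H0", "Report"), ("H2", "Intro"), ("P", "Body."),
       ("H3", "Detail"), ("H4", "Fine print"), ("P", "More.")]
for row in select_headings(doc, 2):
    print(row)
```

Under this assumed rule and K = 2, a document whose headings use levels 2, 3, and 4 has H0, H2, and H3 flagged, while H4 and the paragraphs are left unflagged.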

[0023] Next, the read-aloud text generation unit 30, which follows the selection unit 20, is described. The read-aloud text generation unit 30 uses the final format-converted document 70 to generate and output the read-aloud text 80 that is input to the speech synthesis unit 40. It generates, as the read-aloud text, the character information of only those heading items in the final format-converted document 70 that are designated by the selection information. It also adds to the character information of the read-aloud text boundary information indicating that each piece of character information belongs to a separate heading item, and so generates and outputs the read-aloud text 80 shown in FIG. 7. The boundary information is the symbol T at the left edge in FIG. 7.
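For illustration, the generation of the read-aloud text can be sketched in a few lines of Python. The function name make_read_aloud_text and the (marker, flag, text) tuple layout of the final format-converted document are assumptions of this sketch; only the boundary symbol T is taken from the description.

```python
def make_read_aloud_text(final_doc):
    """Keep only the items flagged "1" and prefix each with the boundary symbol T."""
    return [f"T {text}" for _marker, flag, text in final_doc if flag == "1"]


final_doc = [("H0", "1", "Report"), ("H2", "1", "Intro"),
             ("P", "", "Body."), ("H4", "", "Fine print")]
print(make_read_aloud_text(final_doc))
```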

[0024] Finally, the speech synthesis unit 40, which follows the read-aloud text generation unit 30, is described with reference to FIG. 3. First, when the control unit 41 detects boundary information in the read-aloud text 80, it issues a cue sound generation command 82 to the cue sound generation unit 43. On receiving the generation command 82, the cue sound generation unit 43 outputs a cue sound 92 to the adder 44. After the cue sound 92 has been output, the control unit 41 sends the character information of the first heading item in the read-aloud text 80 to the text-to-speech conversion unit 42 as a read-aloud character string 81. The text-to-speech conversion unit 42 converts the received character information of the heading item into synthesized speech 91 and outputs it to the adder 44. The adder 44 adds the synthesized speech 91 to the cue sound 92 and outputs the output speech 90. The control unit 41, the cue sound generation unit 43, and the text-to-speech conversion unit 42 then process the second and subsequent heading items in the same way. As a result, the speech synthesis unit 40 outputs the heading items in the top K levels of the hierarchy as output speech 90, with a cue sound preceding each heading item.
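The interplay of the control unit 41, the cue sound generation unit 43, the text-to-speech conversion unit 42, and the adder 44 might be modeled as in the following Python sketch. Audio is represented here as plain lists of sample values, and the tts argument is a hypothetical stand-in for a real text-to-speech engine; the function name synthesize and these representations are assumptions made for illustration only.

```python
def synthesize(read_aloud_text, tts, cue):
    """Return output samples: a cue sound followed by speech for each item.

    read_aloud_text: lines of the form "T <text>" (T is the boundary symbol).
    tts: hypothetical callable mapping text to a list of samples.
    cue: list of samples for the cue sound.
    """
    cue_stream, speech_stream = [], []
    for line in read_aloud_text:
        if not line.startswith("T "):       # control unit: look for boundary information
            continue
        speech = tts(line[2:])              # text-to-speech conversion unit
        # The cue sound is emitted first; the speech channel is silent meanwhile.
        cue_stream += cue + [0.0] * len(speech)
        speech_stream += [0.0] * len(cue) + speech
    # Adder: sum the two time-aligned streams sample by sample.
    return [c + s for c, s in zip(cue_stream, speech_stream)]


fake_tts = lambda text: [1.0] * len(text)   # one dummy sample per character
print(synthesize(["T ab", "T c"], fake_tts, [0.5, 0.5]))
```

Because the cue sound finishes before the speech for the same item starts, the sum in this sketch never mixes overlapping sounds; it simply merges the two channels into one output.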

[0025] As described above, in this embodiment the user can listen to only the heading items of the input document 50 as speech, without operating a skip key or any other key, and can grasp the content of the input document 50 in a short time.

[0026] Embodiment 2. In Embodiment 2, the opening and closing sentences of paragraphs are converted into speech and output in addition to the document header and headings, so that the listener can learn the outline of the document in more detail than in Embodiment 1. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The format conversion unit 10 is configured as shown in FIG. 2, the selection unit 20 as shown in FIG. 8, and the speech synthesis unit 40 as shown in FIG. 3. The first format-converted document 60, the second format-converted document 61, the third format-converted document 62, and the final format-converted document 70, which are the input and output documents of the selection unit 20, are shown in FIGS. 9 to 12, respectively, and the read-aloud text 80 output by the read-aloud text generation unit 30 is shown in FIG. 13. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted.

[0027] Next, the selection unit 20, which follows the format conversion unit 10, is described with reference to FIG. 8 and FIGS. 9 to 13. In this embodiment, the selection unit 20 consists of a heading selection unit 21, a paragraph selection unit 22, and a sentence selection unit 23, as shown in FIG. 8. On the basis of the position information and hierarchy information of the heading items written in the first format-converted document 60 shown in FIG. 9, the heading selection unit 21 selects from that document only the heading items in the top K levels of the hierarchy and adds this selection information to the first format-converted document 60, thereby generating the second format-converted document 61 shown in FIG. 10, which it outputs to the paragraph selection unit 22. As in Embodiment 1, K is a preset integer, and it is again 2 in this embodiment.

[0028] On the basis of the selection information of the heading items and the position information of the paragraphs written in the second format-converted document 61, the paragraph selection unit 22 selects from that document the paragraphs in the level below the heading items designated by the selection information and adds this selection information to the second format-converted document 61, thereby generating the third format-converted document 62 shown in FIG. 11, which it outputs to the sentence selection unit 23. In FIG. 11, the selection information 1-6 indicates that the first to sixth sentences below "Heading 1", from "Sentence 1 ..." to "Sentence 6 ...", form one paragraph, and the selection information 7-11 indicates that the seventh to eleventh sentences below "Heading 1", from "Sentence 7 ..." to "Sentence 11 ...", form one paragraph.

[0029] The sentence selection unit 23 selects the first L sentences of each paragraph designated by the selection information in the third format-converted document 62 as opening sentences and the last M sentences as closing sentences, and adds this selection information to the third format-converted document 62, thereby generating and outputting the final format-converted document 70 shown in FIG. 12. Here, L and M are preset integers of about 1 to 3. FIG. 12 shows an example in which L is 2 and M is 1: the selection information 1,2,6 indicates that the first and second sentences below "Heading 1" were selected as opening sentences and the sixth sentence as the closing sentence, and the selection information 7,8,11 indicates that the seventh and eighth sentences below "Heading 1" were selected as opening sentences and the eleventh sentence as the closing sentence.
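The sentence selection rule can be reproduced with a short Python sketch. The function name select_opening_closing and the handling of paragraphs shorter than L + M sentences (duplicates are simply dropped) are assumptions of this sketch; the numbering and the values L = 2, M = 1 follow the FIG. 12 example described above.

```python
def select_opening_closing(first, last, l, m):
    """Return the selected sentence numbers for one paragraph.

    first, last: 1-based numbers of the paragraph's first and last sentences.
    l, m: how many opening and closing sentences to select.
    """
    numbers = list(range(first, last + 1))
    opening = numbers[:l]
    closing = numbers[-m:] if m > 0 else []
    # Assumed behavior for short paragraphs: do not list a sentence twice.
    return opening + [n for n in closing if n not in opening]


print(select_opening_closing(1, 6, 2, 1))    # paragraph of sentences 1 to 6
print(select_opening_closing(7, 11, 2, 1))   # paragraph of sentences 7 to 11
```

With L = 2 and M = 1 the two calls yield the sentence numbers 1, 2, 6 and 7, 8, 11, the same selection information as in the example of FIG. 12.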

[0030] Next, the read-aloud text generation unit 30, which follows the selection unit 20, is described. The read-aloud text generation unit 30 generates, as the read-aloud text, the character information of the heading items designated by the selection information in the final format-converted document 70 and of the paragraphs reduced to their opening and closing sentences. It also adds boundary information for the heading items, opening sentences, and closing sentences to the character information of the read-aloud text, and so generates and outputs the read-aloud text 80 shown in FIG. 13.

[0031] Finally, the speech synthesis unit 40, which follows the read-aloud text generation unit 30, is described with reference to FIG. 3. The speech synthesis unit 40 is configured as in Embodiment 1, but the content of the read-aloud text 80 it receives is different, so in the following description the heading items, opening sentences, and closing sentences in the read-aloud text 80 are referred to collectively as items.

[0032] First, when the control unit 41 detects boundary information in the read-aloud text 80, it issues a cue sound generation command 82 to the cue sound generation unit 43. On receiving the generation command 82, the cue sound generation unit 43 outputs a cue sound 92 to the adder 44. After the cue sound 92 has been output, the control unit 41 sends the character information of the first heading item in the read-aloud text 80 to the text-to-speech conversion unit 42 as a read-aloud character string 81. The text-to-speech conversion unit 42 converts the received character information of the heading item into synthesized speech 91 and outputs it to the adder 44. The control unit 41, the cue sound generation unit 43, and the text-to-speech conversion unit 42 then process the second and subsequent items, that is, the opening and closing sentences, in the same way. As a result, the speech synthesis unit 40 outputs the heading items in the top K levels of the hierarchy and the opening and closing sentences of the paragraphs below them as output speech 90, each preceded by a cue sound.

[0033] As described above, in this embodiment the user can listen to the heading items in the top K levels of the hierarchy and the opening and closing sentences of the paragraphs below them as speech, without operating a skip key or any other key, and can grasp the content of the input document 50 in a short time.

[0034] Embodiment 3. In Embodiment 3, to let the listener learn the outline of a document that consists only of character information and has no special tags for structuring, the hierarchical structure of the input document 50, such as its headings and paragraphs, is determined according to predetermined criteria, and the headings and the opening and closing sentences of paragraphs are converted into speech and output. The distinguishing feature of this embodiment lies in the format conversion unit 10.

【００３５】The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 14, that of the selection unit 20 in FIG. 8, and that of the speech synthesis unit 40 in FIG. 3. The input document 50 and the first format-converted document 60, which are the input and output documents of the format conversion unit 10, are shown in FIG. 15 and FIG. 16, respectively. The selection unit 20, the read-out sentence generation unit 30, and the speech synthesis unit 40 have the same configuration and operation as in Embodiment 2, so their description is omitted and only the format conversion unit 10 is described.

【００３６】As shown in FIG. 14, the format conversion unit 10 comprises a heading format analysis unit 14, a heading determination unit 11, a paragraph determination unit 12, and a format-converted document output unit 13. First, the heading format analysis unit 14 analyzes the heading format from all or part of the input document 50. Specifically, it checks whether the headings of the document take the form "Chapter 1, Chapter 2, Chapter 3, ..." (第１章, 第２章, 第３章, ...), the form "1., 2., 3., ...", or some other form, and outputs a heading format 53.
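One way to picture the heading format analysis is to test a few candidate patterns against the lines of the document and keep the one that matches most often. The sketch below is an assumption made for illustration: the patent does not say how the format is decided, and the names `CANDIDATE_FORMATS` and `analyze_heading_format` are hypothetical. Only the two formats named in the text are included as candidates.

```python
import re

# Candidate heading formats (only the two examples named in the text).
CANDIDATE_FORMATS = {
    "chapter": re.compile(r"^第[0-9０-９一二三四五六七八九十]+章"),
    "numbered": re.compile(r"^[0-9０-９]+[.．]"),
}


def analyze_heading_format(lines):
    """Return the name of the candidate format matching the most lines.

    Returns None when no line matches any candidate. On a tie, the
    candidate listed first in CANDIDATE_FORMATS wins.
    """
    counts = {
        name: sum(1 for line in lines if pattern.match(line))
        for name, pattern in CANDIDATE_FORMATS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


if __name__ == "__main__":
    print(analyze_heading_format(["1. Intro", " body text", "2. Method"]))
```

Counting matches over the whole document (or a leading part of it) mirrors the statement that the analysis uses "all or part" of the input document.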

【００３７】Next, the heading determination unit 11 uses the heading format 53 supplied by the heading format analysis unit 14 to identify the heading items in the input document 50. In the example shown in FIG. 15, character strings of the form "1., 2., 3., ..." at the head of a line are identified as heading items. The line number and hierarchy level of each such character string, that is, of each heading item, in the input document are then output as the position information and hierarchy information of the heading item.

【００３８】The paragraph determination unit 12 uses the hierarchy information of the heading items supplied by the heading determination unit 11 to identify paragraph boundaries in the parts of the input document 50 other than the heading items. Specifically, as shown in FIG. 15, a line of the input document 50 that is indented and is not a heading item is taken as the beginning of a paragraph. The line number of each paragraph boundary in the input document is then output as position information. As shown in FIG. 16, the format-converted document output unit 13 adds, to the left of each heading item of the input document 50 such as the document title or a heading, hierarchy information in the form of a symbol such as ■H0■ or ■H1■ indicating the hierarchy level of that heading item. It also adds, to the left of the first line of each paragraph of the input document 50, position information in the form of the symbol ■P■ indicating the beginning of a paragraph. In this way it generates and outputs the character information of the first format-converted document 60.
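The heading and paragraph tagging described above can be sketched as a single pass over the lines of a plain-text document. This is a simplified illustration, not the patented procedure: it assumes that the first line is the document title (level 0), that headings use the "1., 2., 3., ..." format at level 1 only, and that a paragraph begins on any indented line. The function name `convert_format` is hypothetical.

```python
import re

HEADING = re.compile(r"^\d+\.")   # assumed heading format: "1.", "2.", ...
INDENT = (" ", "\u3000", "\t")    # half-width space, full-width space, tab


def convert_format(lines):
    """Prefix headings with ■Hn■ and paragraph-initial lines with ■P■."""
    out = []
    for i, line in enumerate(lines):
        if HEADING.match(line):
            out.append("■H1■" + line)   # numbered heading, level 1
        elif i == 0:
            out.append("■H0■" + line)   # assumption: first line is the title
        elif line[:1] in INDENT:
            out.append("■P■" + line)    # indented non-heading line
        else:
            out.append(line)            # continuation of a paragraph
    return out


if __name__ == "__main__":
    doc = [
        "Sample Document",
        "1. Background",
        "\u3000First sentence of a paragraph.",
        "Second line of the same paragraph.",
    ]
    print("\n".join(convert_format(doc)))
```

Because the output carries the same ■H…■ and ■P■ markers as a tagged document would after conversion, the later stages do not need to know whether the input had structure tags.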

【００３９】The first format-converted document 60 generated in this way has the same format as the first format-converted document 60 that is input to the selection unit 20 in Embodiment 2. The selection unit 20, the read-out sentence generation unit 30, and the speech synthesis unit 40 that follow the format conversion unit 10 of this embodiment therefore operate as described for Embodiment 2. Consequently, with the speech synthesizer of this embodiment, even when the input document 50 consists only of character information without special structuring tags, the user can, without key operations such as pressing a skip key, listen to the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them as speech, and can grasp the content of the input document 50 in a short time.

【００４０】Embodiment 4. Embodiment 4 prevents the number of characters skipped between the end of the opening sentence and the start of the closing sentence in Embodiment 2 from becoming too large. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 2, that of the selection unit 20 in FIG. 17, and that of the speech synthesis unit 40 in FIG. 3. The fourth format-converted document 63 and the final format-converted document 70, which are among the input and output documents within the selection unit 20, are shown in FIG. 18 and FIG. 19, respectively, and the read-out sentence 80, which is the output document of the read-out sentence generation unit 30 following the selection unit 20, is shown in FIG. 20. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted.

【００４１】As shown in FIG. 17, the selection unit 20 comprises a heading selection unit 21, a paragraph selection unit 22, a sentence selection unit 23, and a skip limiting unit 24. The selection unit 20 of this embodiment is identical to that of Embodiment 2 except that the skip limiting unit 24 has been added, so only the skip limiting unit 24 is described.

【００４２】The skip limiting unit 24 receives the fourth format-converted document 63 shown in FIG. 18 and counts the number of characters skipped between the end of the opening sentence and the start of the closing sentence. When the number of skipped characters exceeds a predetermined value and the skipped text spans a plurality of sentences, it selects the middle sentence of the skipped sentences as an intermediate sentence. FIG. 19 shows the final format-converted document 70 output by the skip limiting unit 24. Because the number of skipped characters exceeded the predetermined value in the first paragraph beneath "Heading 1" and in the first paragraph beneath "Heading 2", the fourth sentence beneath "Heading 1", "Sentence 4 ...", and the fourth sentence beneath "Heading 2", "Sentence 15 ...", have been selected as intermediate sentences.
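The skip limiting rule can be sketched for a single paragraph as follows. The function name `select_with_skip_limit` and the parameter `max_skipped_chars` are hypothetical, and two details are assumptions because the text does not specify them: sentence length is measured with `len`, and when the number of skipped sentences is even, the later of the two middle sentences is chosen.

```python
def select_with_skip_limit(sentences, max_skipped_chars):
    """Return the indices of the sentences to read from one paragraph.

    The first and last sentences are always selected. If the text in
    between exceeds max_skipped_chars and spans more than one sentence,
    the middle skipped sentence is selected as an intermediate sentence.
    """
    if len(sentences) <= 2:
        return list(range(len(sentences)))
    skipped = range(1, len(sentences) - 1)
    selected = [0, len(sentences) - 1]
    skipped_chars = sum(len(sentences[i]) for i in skipped)
    if skipped_chars > max_skipped_chars and len(skipped) > 1:
        selected.append(skipped[len(skipped) // 2])
    return sorted(selected)


if __name__ == "__main__":
    paragraph = ["x" * 10] * 7  # seven sentences of ten characters each
    print(select_with_skip_limit(paragraph, max_skipped_chars=20))
```

With seven sentences, the skipped sentences are the second through sixth, so the intermediate sentence is the fourth, which matches the position of "Sentence 4" in the example of FIG. 19 if that paragraph has seven sentences (the figure itself is not reproduced here).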

【００４３】The read-out sentence generation unit 30 generates the read-out sentence 80 of FIG. 20 from the final format-converted document 70 of FIG. 19. The figure shows that the intermediate sentences selected by the selection unit 20 have had boundary information added and have been written into the read-out sentence 80.

【００４４】Finally, the speech synthesis unit 40 is described. Its configuration is the same as that described in Embodiment 2. The difference from Embodiment 2 lies in the read-out sentence 80 that is input to the speech synthesis unit 40: whereas in Embodiment 2 it is as shown in FIG. 13, in this embodiment intermediate sentences are added on the 4th and 11th lines, as shown in FIG. 20. Accordingly, in Embodiment 2, regardless of the number of characters skipped between the end of the opening sentence and the start of the closing sentence, only the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them are converted into output speech 90, each preceded by a signal sound, and output. In this embodiment, by contrast, when the number of characters skipped between the end of the opening sentence and the start of the closing sentence is large, the intermediate sentences are also converted into output speech 90 accompanied by a signal sound and output, in addition to the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them.

【００４５】As described above, in this embodiment the user can, without key operations such as pressing a skip key, listen to the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them as speech. In addition, when the number of characters skipped between the end of the opening sentence and the start of the closing sentence exceeds a predetermined value, the user can also listen to an intermediate sentence between the opening and closing sentences. The content of the input document 50 can therefore be grasped in a short time, and reliably so even for large paragraphs containing many characters.

【００４６】Embodiment 5. Embodiment 5 adds new paragraph boundaries using sentence boundary keywords, which mark a break between coherent groups of sentences, as cues. Sentence boundary keywords are expressions such as "まず" (first of all), "第一に" (firstly), and "次に" (next) that mainly signal a change of topic. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 2, that of the selection unit 20 in FIG. 21, and that of the speech synthesis unit 40 in FIG. 3. The third format-converted document 62, the fourth format-converted document 63, and the final format-converted document 70, which are the input and output documents of the selection unit 20, are shown in FIG. 22. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted.

【００４７】Next, the selection unit 20 is described. As shown in FIG. 21, the selection unit 20 comprises a heading selection unit 21, a paragraph selection unit 22, a sentence boundary keyword detection unit 25, a paragraph addition unit 26, and a sentence selection unit 23. The selection unit 20 of this embodiment is identical to that of Embodiment 2 except that the sentence boundary keyword detection unit 25 and the paragraph addition unit 26 have been added, so only these two units are described.

【００４８】The sentence boundary keyword detection unit 25 receives the third format-converted document 62 shown in FIG. 22(a) and checks for a sentence boundary keyword at the head of each sentence beneath a heading item whose selection information is ■1■. When it detects a sentence boundary keyword, it outputs position information 64 of the sentence boundary keyword in the third format-converted document 62. When the position of a sentence boundary keyword does not coincide with a paragraph boundary indicated by the selection information of the third format-converted document 62, the paragraph addition unit 26 adds the position of the sentence boundary keyword as a new paragraph boundary and outputs the fourth format-converted document 63 shown in FIG. 22(b). FIG. 22(b) shows that, because there is a sentence boundary keyword at the head of "Sentence 4 ....", one paragraph beneath "Heading 1" has been divided into two paragraphs.
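The paragraph splitting step can be sketched as below for one paragraph given as a list of sentences. The keyword tuple contains only the three examples named in the text; the function name `split_paragraph` is hypothetical, and a plain prefix test stands in for whatever matching the actual device uses.

```python
BOUNDARY_KEYWORDS = ("まず", "第一に", "次に")  # examples named in the text


def split_paragraph(sentences, keywords=BOUNDARY_KEYWORDS):
    """Split one paragraph wherever a sentence starts with a keyword.

    The first sentence never triggers a split, because its position
    already coincides with an existing paragraph boundary.
    """
    if not sentences:
        return []
    prefixes = tuple(keywords)
    paragraphs = [[]]
    for i, sentence in enumerate(sentences):
        if i > 0 and sentence.startswith(prefixes):
            paragraphs.append([])  # new paragraph boundary
        paragraphs[-1].append(sentence)
    return paragraphs


if __name__ == "__main__":
    paragraph = ["文1。", "文2。", "文3。", "次に文4。", "文5。", "文6。"]
    for part in split_paragraph(paragraph):
        print(part)
```

In the sample, the keyword at the head of the fourth sentence divides the paragraph into two, as in the description of FIG. 22(b).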

【００４９】For each of the divided paragraphs, the sentence selection unit 23 selects the first L sentences of the paragraph designated by the selection information of the fourth format-converted document 63 as opening sentences and the last M sentences as closing sentences. It adds this selection information to the fourth format-converted document 63, thereby generating and outputting the final format-converted document 70 shown in FIG. 22(c). In this embodiment, L and M are both 1. The read-out sentence generation unit 30 and the speech synthesis unit 40 that follow the selection unit 20 operate as in Embodiment 2, so their description is omitted.
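Selecting the first L and last M sentences of a paragraph can be sketched as below. The function and parameter names are hypothetical; the handling of short paragraphs, where the two ranges overlap, is an assumption (each sentence is selected at most once).

```python
def select_opening_closing(sentences, num_opening=1, num_closing=1):
    """Return indices of the first num_opening and last num_closing sentences.

    Overlapping ranges in short paragraphs are merged so that no
    sentence is selected twice.
    """
    n = len(sentences)
    opening = range(min(num_opening, n))
    closing = range(max(n - num_closing, 0), n)
    return sorted(set(opening) | set(closing))


if __name__ == "__main__":
    print(select_opening_closing(["s1", "s2", "s3", "s4", "s5"]))
```

With the default values, which correspond to L = M = 1 in this embodiment, only the first and last sentences of each paragraph are selected.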

【００５０】As described above, in the selection unit 20 of this embodiment, when the sentence boundary keyword detection unit 25 detects a sentence boundary keyword at the head of a sentence within a paragraph, the paragraph addition unit 26 divides the paragraph, and the sentence selection unit 23 selects an opening sentence and a closing sentence from each of the divided paragraphs.

【００５１】Therefore, when the structure tags that denote paragraphs in an HTML input document, or the indentation that denotes the beginning of a paragraph in an input document without structure tags, are missing at positions where sentences form a semantic unit or where the topic changes, the paragraph can be divided using sentence boundary keywords as cues, and sentences in each of the divided paragraphs can be converted into output speech 90 and output. Even when the input document 50 consists of large paragraphs, the user can grasp the content of the document accurately and in a short time.

【００５２】Embodiment 6. In Embodiment 6, instead of reading out the sentences in a paragraph, independent words (content words) that appear frequently in the paragraph are taken as keywords, and only the keywords are read out. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 2, that of the selection unit 20 in FIG. 23, that of the read-out sentence generation unit 30 in FIG. 24, and that of the speech synthesis unit 40 in FIG. 3. The final format-converted document 70 and the read-out sentence 80, which are the input and output documents of the read-out sentence generation unit 30, are shown in FIG. 25. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted.

【００５３】Next, the selection unit 20 is described. As shown in FIG. 23, the selection unit 20 comprises a heading selection unit 21 and a paragraph selection unit 22. This configuration is that of the selection unit 20 of Embodiment 2 shown in FIG. 8 with the sentence selection unit 23 removed. The heading selection unit 21 and the paragraph selection unit 22 are the same as in Embodiment 2, except that the output of the paragraph selection unit 22, which was called the third format-converted document 62 in Embodiment 2, is called the final format-converted document 70 in this embodiment. The description of their operation is therefore omitted.

【００５４】FIG. 25(a) shows the final format-converted document 70 output by the selection unit 20. To illustrate the main point of this embodiment, the figure shows only one heading and the one paragraph beneath it. At the left end of the first line is the symbol ■H2■, indicating that the line is a heading item, and to its right is the symbol ■1■, indicating that the heading item has been selected. At the left end of the second line is the symbol ■P■, indicating a paragraph, and to its right is the symbol ■1-6■, indicating that the first through sixth sentences beneath the heading item constitute the range of the paragraph.

【００５５】Next, the read-out sentence generation unit 30 is described. As shown in FIG. 24, the read-out sentence generation unit 30 comprises a keyword extraction unit 32 and a read-out sentence output unit 31. First, the keyword extraction unit 32 receives the final format-converted document 70 described above and examines the frequency of the independent words in the sentences of the paragraph indicated by selection information such as ■1-6■. It then extracts the high-frequency independent words as the keywords 71 of the paragraph and outputs them. Next, the read-out sentence output unit 31 generates the character information of a read-out sentence consisting only of the heading items designated by the selection information and the keywords 71 output by the keyword extraction unit 32. The read-out sentence generation unit 30 further adds, to the character information of the read-out sentence, boundary information for the heading items and boundary information for the paragraph to which the extracted keywords 71 belonged, and generates and outputs the read-out sentence 80 shown in FIG. 25(b). As in the other embodiments, the symbol ■T■ at the left end of the figure is the boundary information.
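Frequency-based keyword extraction can be sketched as below. Identifying independent words in Japanese requires morphological analysis, which is outside the scope of a short example, so this sketch makes a loudly stated substitution: it works on English text, splits words with a regular expression, and treats anything not in a small hand-written function-word list as an independent word. The names `FUNCTION_WORDS`, `extract_keywords`, and `top_n` are hypothetical, and the patent does not state how many keywords are kept.

```python
import re
from collections import Counter

# Stand-in for morphological analysis: a tiny function-word list (assumed).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "and", "it", "that"}


def extract_keywords(sentences, top_n=3):
    """Return the top_n most frequent non-function words in a paragraph."""
    words = []
    for sentence in sentences:
        for word in re.findall(r"[a-z]+", sentence.lower()):
            if word not in FUNCTION_WORDS:
                words.append(word)
    return [word for word, _ in Counter(words).most_common(top_n)]


if __name__ == "__main__":
    paragraph = [
        "Speech synthesis converts text to speech.",
        "The synthesis unit outputs speech.",
        "Text is read aloud.",
    ]
    print(extract_keywords(paragraph))
```

The returned keywords would then take the place of the opening and closing sentences in the read-out sentence, each paragraph's keywords following its heading item.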

【００５６】Finally, the speech synthesis unit 40 is described. Its configuration is the same as that described in Embodiment 2. The difference from Embodiment 2 is that, in the read-out sentence 80 input to the speech synthesis unit 40, the parts that were opening and closing sentences are keywords in this embodiment. Accordingly, whereas in Embodiment 2 the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them are converted into output speech 90, each preceded by a signal sound, and output, in this embodiment the heading items of the top K levels of the hierarchy and the keywords of the paragraphs beneath them are converted into output speech 90, each preceded by a signal sound, and output.

【００５７】As described above, in this embodiment the user can, without key operations such as pressing a skip key, listen to the heading items of the top K levels of the hierarchy and the keywords of the paragraphs beneath them as speech, and can therefore grasp the content of the input document 50 in a short time.

【００５８】Embodiment 7. In Embodiment 7, instead of reading out the sentences in a paragraph as they are, the adnominal modifier clauses of each sentence, that is, the clauses that modify nominals, are deleted before the sentence is read out. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 2, that of the selection unit 20 in FIG. 23, that of the read-out sentence generation unit 30 in FIG. 26, and that of the speech synthesis unit 40 in FIG. 3. The final format-converted document 70 and the read-out sentence 80, which are the input and output documents of the read-out sentence generation unit 30, are shown in FIG. 27. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted. The selection unit 20 has the same configuration and operation as in Embodiment 6, so its description is also omitted.

【００５９】Next, the read-out sentence generation unit 30 is described. As shown in FIG. 26, the read-out sentence generation unit 30 comprises a modifier clause processing unit 33 and a read-out sentence output unit 31. First, the modifier clause processing unit 33 receives the final format-converted document 70 shown in FIG. 27(a). It analyzes the modification relationships between the phrases (bunsetsu) in the sentences that selection information such as ■1-4■ indicates belong to a paragraph, and detects the adnominal modifier clauses that modify nominals. It then outputs sentences 72 from which the adnominal modifier clauses have been deleted.
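The deletion step can be illustrated on a sentence that has already been analyzed. Detecting adnominal modifier clauses requires dependency analysis, which this sketch does not attempt: it assumes that an earlier stage has split the sentence into segments and flagged each segment that modifies a nominal. The `Segment` class and `drop_adnominal` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    adnominal: bool  # True if a parser flagged this segment as modifying a nominal


def drop_adnominal(segments):
    """Rebuild the sentence text without the flagged adnominal segments."""
    return "".join(seg.text for seg in segments if not seg.adnominal)


if __name__ == "__main__":
    # "昨日買った本を読んだ。" with "昨日買った" flagged as modifying "本"
    sentence = [
        Segment("昨日買った", True),
        Segment("本を", False),
        Segment("読んだ。", False),
    ]
    print(drop_adnominal(sentence))
```

In the sample, removing the clause that modifies the noun shortens the sentence while keeping its main predicate and object.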

【００６０】Next, the read-out sentence output unit 31 generates the character information of a read-out sentence consisting only of the heading items designated by the selection information and the sentences 72, output by the modifier clause processing unit 33, from which the adnominal modifier clauses have been deleted. The read-out sentence generation unit 30 further adds, to the character information of the read-out sentence, boundary information for the heading items and boundary information for the paragraphs to which the modifier-deleted sentences belong, and generates and outputs the read-out sentence 80. FIG. 27(b) shows the read-out sentence 80: the symbol ■T■ at the left end is the boundary information, and the text to its right is the character information of the sentences 72 from which the adnominal modifier clauses have been deleted.

【００６１】Finally, the speech synthesis unit 40 is described. Its configuration is the same as that described in Embodiment 2. The difference from Embodiment 2 is that, in the read-out sentence 80 input to the speech synthesis unit 40, the parts that were opening and closing sentences in Embodiment 2 are, in this embodiment, sentences from which the modifier clauses have been deleted. Accordingly, whereas in Embodiment 2 the heading items of the top K levels of the hierarchy and the opening and closing sentences of the paragraphs beneath them are converted into output speech 90, each preceded by a signal sound, and output, in this embodiment the heading items of the top K levels of the hierarchy and the sentences of the paragraphs beneath them, with their adnominal modifier clauses deleted, are converted into output speech 90, each preceded by a signal sound, and output.

【００６２】As described above, in this embodiment the user can, without key operations such as pressing a skip key, listen to the heading items of the top K levels of the hierarchy and the sentences of the paragraphs beneath them, with their adnominal modifier clauses deleted, as speech, and can therefore grasp the content of the input document 50 in a short time.

【００６３】Embodiment 8. In Embodiment 8, when the opening and closing sentences of a paragraph are read out, the number of skipped characters is announced by voice. The overall configuration of the speech synthesizer of this embodiment is the same as in FIG. 1. The configuration of the format conversion unit 10 of this embodiment is shown in FIG. 2, that of the selection unit 20 in FIG. 8, that of the read-out sentence generation unit 30 in FIG. 28, and that of the speech synthesis unit 40 in FIG. 3. The final format-converted document 70 and the read-out sentence 80, which are the input and output documents of the read-out sentence generation unit 30, are shown in FIG. 29 and FIG. 30, respectively. The format conversion unit 10 has the same configuration and operation as in Embodiment 1, so its description is omitted. The selection unit 20 has the same configuration and operation as in Embodiment 2, so its description is also omitted.

【００６４】Next, the read-out sentence generation unit 30 is described. As shown in FIG. 28, the read-out sentence generation unit 30 comprises a skipped character count notification unit 34 and a read-out sentence output unit 31. First, the skipped character count notification unit 34 counts the number of characters skipped between the end of the opening sentence and the start of the closing sentence in the final format-converted document 70 shown in FIG. 29. When the number of skipped characters exceeds a predetermined value, it creates a notification sentence announcing the number of skipped characters. The skip notification sentence 73 takes a form such as "８２文字の読み飛ばしです。" ("82 characters skipped.").
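Generating the skip notification can be sketched for one paragraph as below. The function name `skip_notice` and the parameter `threshold` are hypothetical; the notification wording is an English rendering of the example in the text, and counting characters with `len` is an assumption.

```python
def skip_notice(sentences, threshold):
    """Return a skip notification for one paragraph, or None.

    The skipped text is everything between the first and the last
    sentence. A notice is produced only when its length exceeds
    threshold.
    """
    skipped = sum(len(sentence) for sentence in sentences[1:-1])
    if skipped > threshold:
        return f"{skipped} characters skipped."
    return None


if __name__ == "__main__":
    paragraph = ["Opening.", "x" * 82, "Closing."]
    print(skip_notice(paragraph, threshold=50))
```

The notice would be placed between the opening and closing sentences in the read-out sentence, so that the listener hears it at the point where text was skipped.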

【００６５】Next, the read-out sentence output unit 31 generates the character information of the read-out sentence by inserting the skip notification sentence 73 output by the skipped character count notification unit 34 among the heading items and the opening and closing sentences selected by the selection unit 20. The read-out sentence output unit 31 also adds boundary information to the character information of the read-out sentence, and generates and outputs the read-out sentence 80 shown in FIG. 30.

【００６６】Finally, the speech synthesis unit 40 is described. Its configuration is the same as that described in Embodiment 2. The difference from Embodiment 2 is that, as shown in FIG. 30, skip notification sentences 73 are added on the 4th and 11th lines of the read-out sentence 80 that is input to the speech synthesis unit 40.

【００６７】従って実施の形態２では、冒頭文の後から
末尾文の前までの読み飛ばされる文字数に関係なく、上
位のK段階までの階層の見出し項目とその下位階層の段
落の冒頭文と末尾文のみが、それぞれの前に合図音を伴
った出力音声９０に変換されて出力される。これに対し
て、本実施の形態では、冒頭文の後から末尾文の前まで
の読み飛ばされる文字数が多い場合には、上位のK段階
までの階層の見出し項目とその下位階層の段落の冒頭文
と末尾文に加えて、読み飛ばし通知文が合図音を伴った
出力音声９０に変換されて出力される。Therefore, in Embodiment 2, regardless of the number of characters skipped between the first sentence and the last sentence, only the heading items in the top K hierarchy levels and the first and last sentences of the paragraphs below them are converted into the output speech 90, each preceded by a signal sound, and output. In the present embodiment, by contrast, when the number of characters skipped between the first and last sentences is large, the skip notification sentence is also converted into the output speech 90 accompanied by a signal sound and output, in addition to those heading items and first and last sentences.

【００６８】以上説明したように、本実施の形態では、
ユーザーはスキップキー等のキー操作を行うことなく、
上位のK段階までの階層の見出し項目とその下位階層の
段落の冒頭文と末尾文を音声として聴くことができる。
さらに、冒頭文の後から末尾文の前までの読み飛ばされ
る文字数が所定の値を超えた場合には、その文字数を音
声として聴くことができるので、入力文書５０の内容を
短時間で内容を把握することができ、しかも読み飛ばし
文字数が多いために内容が正確に把握できていない可能
性のある箇所と読み飛ばし文字数を、読み飛ばし通知文
により知ることができる。As described above, in the present embodiment the user can listen to the heading items in the top K hierarchy levels and the first and last sentences of the paragraphs below them without performing key operations such as pressing a skip key. Furthermore, when the number of characters skipped between the first and last sentences exceeds a predetermined value, that number is read out as speech. The user can therefore grasp the content of the input document 50 in a short time and, from the skip notification sentence, learn both the places where the content may not have been grasped accurately because many characters were skipped and the number of characters skipped there.

【００６９】[0069]

【発明の効果】本発明は、以上説明したように構成され
ているので、以下に示すような効果を奏する。Since the present invention is configured as described above, it has the following effects.

【００７０】第１の発明では、複数の文に区切られた入
力文書から、見出し項目及び段落のそれぞれの文字情報
と、これらが識別できる構造情報と、所定の文字情報を
選択する選択情報とからなる書式変換文書を生成し、こ
の書式変換文書に基づいて所定の文字情報を読み上げ文
として生成し音声出力するようにしたので、ユーザーが
スキップキー等のキー操作を行うことなく、入力文書の
一部のみを音声として聴くことができ、入力文書の内容
を短時間で把握できる。In the first invention, a format conversion document is generated from an input document divided into a plurality of sentences. It consists of the character information of each heading item and paragraph, structure information by which these can be identified, and selection information that selects predetermined character information. The predetermined character information is then generated as a reading sentence on the basis of this format conversion document and output as speech. The user can therefore listen to only a part of the input document without performing key operations such as pressing a skip key, and can grasp the content of the input document in a short time.

【００７１】第２の発明では、見出し項目と段落が識別
可能なドキュメント構造タグを有する入力文書から、見
出し項目と段落を判別するようにしたので、入力文書が
ＨＴＭＬ文書のような構造タグを持つ構造化文書の一部
のみを音声として聴くことができ、入力文書の内容を短
時間で把握できる。In the second invention, heading items and paragraphs are discriminated in an input document that has document structure tags by which they can be identified. Consequently, when the input document is a structured document with structure tags, such as an HTML document, the user can listen to only a part of it and grasp its content in a short time.
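
As an illustration of tag-based discrimination, the sketch below uses Python's standard `html.parser` module to label headings and paragraphs. The choice of `h1` to `h6` and `p` as the structure tags, the class name `StructureParser`, and the `(kind, level, text)` tuple layout are assumptions made for this example; the specification only states that HTML-like structure tags are used.

```python
from html.parser import HTMLParser


class StructureParser(HTMLParser):
    """Collect (kind, level, text) items for headings and paragraphs."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.items = []
        self._kind = None   # "heading", "paragraph", or None when outside both
        self._level = 0     # heading level 1..6; 0 for paragraphs
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._kind, self._level, self._buf = "heading", int(tag[1]), []
        elif tag == "p":
            self._kind, self._level, self._buf = "paragraph", 0, []

    def handle_data(self, data):
        if self._kind:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._kind and (tag in self.HEADINGS or tag == "p"):
            self.items.append((self._kind, self._level, "".join(self._buf).strip()))
            self._kind = None


def parse_structure(html):
    """Return the structure items found in an HTML string."""
    parser = StructureParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

The heading level recorded here is the kind of structure information a later selection step could use to keep only headings down to a desired level.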

【００７２】第３の発明では、入力文書から見出しの形
式を表す書式を調べ、この書式に基づいて見出し項目と
段落を表す文字列を判別するようにしたので、入力文書
に構造タグによる文書構造が与えられていない文書の一
部のみを音声として聴くことができ、入力文書の内容を
短時間で把握できる。In the third invention, the input document is examined for a format that represents the form of its headings, and the character strings representing heading items and paragraphs are discriminated on the basis of that format. Consequently, even for an input document whose structure is not given by structure tags, the user can listen to only a part of it and grasp its content in a short time.

【００７３】第４の発明では、階層化された複数の見出
し項目を有する入力文書より、所定の階層の見出し項目
を選定するようにしたので、入力文書の上位から所望の
階層までの見出し項目のみを音声として聴くことがで
き、入力文書の内容を短時間で把握できる。In the fourth invention, heading items of predetermined hierarchy levels are selected from an input document having a plurality of hierarchical heading items. The user can therefore listen to only the heading items from the top level of the input document down to a desired level, and grasp the content of the input document in a short time.

【００７４】第５の発明では、入力文書の書式変換文書
から見出し項目と段落を選択し、この段落中の所定の文
をさらに選択するようにしたので、入力文書の上位から
所望の階層までの見出し項目と、その下位階層の段落の
一部の文を音声として聴くことができ、入力文書の要点
を短時間で把握できる。In the fifth invention, heading items and paragraphs are selected from the format conversion document of the input document, and predetermined sentences within those paragraphs are further selected. The user can therefore listen to the heading items from the top level of the input document down to a desired level, together with some of the sentences of the paragraphs below them, and grasp the main points of the input document in a short time.

【００７５】第６の発明では、入力文書から見出し項目
と段落を選択し、この段落の冒頭文及び末尾文をさらに
選択するようにしたので、入力文書の上位から所望の階
層までの見出し項目と、その下位階層の段落の冒頭文及
び末尾文のみを音声として聴くことができ、入力文書の
要点を短時間で把握できる。In the sixth invention, heading items and paragraphs are selected from the input document, and the first and last sentences of those paragraphs are further selected. The user can therefore listen to only the heading items from the top level of the input document down to a desired level and the first and last sentences of the paragraphs below them, and grasp the main points of the input document in a short time.

【００７６】第７の発明では、選択された文の後から次
の選択文の前までの、読み飛ばされる複数の文の文字数
が所定の値を超えるときには、複数の読み飛ばされる文
の中間に位置する文を選択するようにしたので、段落内
で読み飛ばされる文字数が所定の値を超えた場合には、
読み飛ばされる文の中間の文も音声として聴くことがで
き、入力文書の内容を短時間で、しかも大きな段落でも
確実に内容を把握できる。In the seventh invention, when the number of characters in the plurality of skipped sentences between one selected sentence and the next selected sentence exceeds a predetermined value, a sentence located in the middle of those skipped sentences is also selected. Consequently, when the number of characters skipped within a paragraph exceeds the predetermined value, the user can also listen to a sentence from the middle of the skipped sentences, and can grasp the content of the input document in a short time and reliably, even for a large paragraph.
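
One way to picture this skip restriction is the sketch below, which returns the indices of the sentences to be read. The function name, the default threshold of 100 characters, and the rule for picking the middle sentence (the element at the midpoint of the skipped range) are assumptions for illustration; the specification only requires that a sentence located in the middle be selected.

```python
def select_sentences(sentences, threshold=100):
    """Return indices of the first and last sentences, plus a middle one
    when the skipped characters exceed `threshold` (an assumed value)."""
    if len(sentences) <= 2:
        return list(range(len(sentences)))
    selected = [0]
    middle = range(1, len(sentences) - 1)            # indices that would be skipped
    skipped = sum(len(sentences[i]) for i in middle)
    if skipped > threshold:
        selected.append(middle[len(middle) // 2])    # sentence at the midpoint
    selected.append(len(sentences) - 1)
    return selected
```

With five sentences whose three inner sentences total 180 characters and a threshold of 100, the sketch selects indices 0, 2 and 4.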

【００７７】第８の発明では、選択された段落を構成す
る複数の文から、一連の文の区切りを表す文境界キーワ
ードを検出し、この文境界キーワードの位置で上記段落
を分割して新たな段落を追加するようにしたので、段落
中の文頭の文境界キーワードで分割されてできる新たな
段落中の一部の文を音声として聴くことができ、入力文
書の内容を短時間で、しかも大きな段落でも確実に内容
を把握できる。In the eighth invention, a sentence boundary keyword, which marks a break in a series of sentences, is detected among the sentences constituting the selected paragraph, and the paragraph is divided at the position of that keyword to add a new paragraph. The user can therefore listen to some of the sentences of the new paragraph created by dividing at a sentence-initial sentence boundary keyword, and can grasp the content of the input document in a short time and reliably, even for a large paragraph.
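
The paragraph division can be sketched as below. The keyword list (`"Next,"`, `"On the other hand,"`) is a hypothetical English stand-in chosen for the example; the actual sentence boundary keywords are not listed in this part of the specification.

```python
def split_paragraph(sentences, keywords=("Next,", "On the other hand,")):
    """Split a paragraph (list of sentences) before every sentence that
    begins with one of the assumed sentence boundary keywords."""
    keywords = tuple(keywords)
    paragraphs = [[]]
    for sentence in sentences:
        # Start a new paragraph only if the current one already has content.
        if paragraphs[-1] and sentence.startswith(keywords):
            paragraphs.append([])
        paragraphs[-1].append(sentence)
    return [p for p in paragraphs if p]
```

Each resulting paragraph can then be passed to the usual sentence selection, so that its own first and last sentences are read.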

【００７８】第９の発明では、段落の文から自立語の出
現頻度を調べ、出現頻度の高い自立語を読み上げ文とし
て出力するようにしたので、選択された段落の文の代わ
りにキーワードを音声として聴くことができ、入力文書
の内容を短時間で把握できる。In the ninth invention, the appearance frequency of independent words in the sentences of a paragraph is examined, and the independent words with a high appearance frequency are output as the reading sentence. The user can therefore listen to keywords in place of the sentences of the selected paragraph, and grasp the content of the input document in a short time.
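
A rough sketch of frequency-based keyword extraction follows. Identifying independent words in Japanese text requires morphological analysis, which is outside the scope of this example; here English text, a regular-expression tokenizer, and a small hand-written stopword list are used as an assumed stand-in, and the function name and the parameter `n` are likewise illustrative.

```python
import re
from collections import Counter

# Assumed stand-in for "not an independent word": a tiny English stopword list.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "by", "as"}


def top_keywords(sentences, n=3):
    """Return the n most frequent non-stopword tokens in the sentences."""
    words = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        words += [w for w in tokens if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]
```

The returned words would be read out in place of the paragraph's sentences.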

【００７９】第１０の発明では、段落の文から連体修飾
節を検出し、この連体修飾節を削除した文を読み上げ文
として出力するようにしたので、選択された段落の連体
修飾節を削除した文を音声として聴くことができ、入力
文書の内容を短時間で把握できる。In the tenth invention, adnominal modifier clauses are detected in the sentences of a paragraph, and the sentences with those clauses deleted are output as the reading sentence. The user can therefore listen to the sentences of the selected paragraph with their adnominal modifier clauses removed, and grasp the content of the input document in a short time.

【００８０】第１１の発明では、選択された文の後から
次に選択された文の前までの読み飛ばされる文の文字数
が、所定の値を超える時には、読み飛ばされる文の文字
数を通知するようにしたので、段落内で読み飛ばされる
文字数が所定の値を超えた場合には、その文字数を音声
として聴くことができ、読み飛ばし文字数が多いために
内容が正確に把握できていない可能性のある箇所を、音
声による文字数の通知により知ることができる。In the eleventh invention, when the number of characters in the sentences skipped between one selected sentence and the next selected sentence exceeds a predetermined value, the number of skipped characters is announced. Consequently, when the number of characters skipped within a paragraph exceeds the predetermined value, the user hears that number as speech and can learn, from this spoken notification, the places where the content may not have been grasped accurately because many characters were skipped.

【００８１】第１２の発明では、読み上げ文を音声で出
力するとき、読み上げ文中の各文の境界に従って合図音
を発生するようにしたので、音声に変換された読み上げ
文の境界を合図音で知ることができる。In the twelfth invention, when the reading sentence is output as speech, a signal sound is generated in accordance with the boundaries of the sentences in the reading sentence. The user can therefore recognize, from the signal sound, the boundaries of the reading sentences that have been converted into speech.
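
The combination of synthesized speech and signal sound by an adder can be pictured with the sketch below. Audio is represented as plain lists of float samples, and the per-sentence sample lists are hypothetical placeholders for the output of a text-to-speech converter; the function name and the track layout are assumptions for illustration only.

```python
def mix(sentence_audio, cue):
    """Place the cue before each sentence and add the two tracks sample by sample.

    sentence_audio: list of sample lists, one per sentence (placeholder TTS output).
    cue: sample list for the signal sound.
    """
    speech_track, cue_track = [], []
    for audio in sentence_audio:
        # The cue sounds at the sentence boundary while the speech track is silent.
        cue_track += cue
        speech_track += [0.0] * len(cue)
        # The sentence follows while the cue track is silent.
        speech_track += audio
        cue_track += [0.0] * len(audio)
    # The adder: element-wise sum of the two equal-length tracks.
    return [s + c for s, c in zip(speech_track, cue_track)]
```

Because each track is silent while the other is active, the sum reproduces the cue followed by the sentence at every boundary.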

[Brief description of the drawings]

【図１】実施の形態１の音声合成装置の全体構成図であ
る。FIG. 1 is an overall configuration diagram of a speech synthesis device according to a first embodiment.

【図２】実施の形態１の書式変換部の構成図である。FIG. 2 is a configuration diagram of a format conversion unit according to the first embodiment.

【図３】実施の形態１の音声合成部の構成図である。FIG. 3 is a configuration diagram of a speech synthesis unit according to the first embodiment.

【図４】実施の形態１の書式変換文書に入力される入力
文書である。FIG. 4 is an input document input to the format conversion document according to the first embodiment.

【図５】実施の形態１の書式変換文書から出力される第
１の書式変換文書である。FIG. 5 is a first format conversion document output from the format conversion document according to the first embodiment.

【図６】実施の形態１の選択部から出力される最終書式
変換文書である。FIG. 6 illustrates a final format conversion document output from the selection unit according to the first embodiment.

【図７】実施の形態１の読み上げ文生成部から出力され
る読み上げ文である。FIG. 7 is a reading sentence output from a reading sentence generation unit according to the first embodiment.

【図８】実施の形態２の選択部の構成図である。FIG. 8 is a configuration diagram of a selection unit according to the second embodiment.

【図９】実施の形態２の書式変換部から出力される第１
の書式変換文書である。FIG. 9 is a first format conversion document output from the format conversion unit according to the second embodiment.

【図１０】実施の形態２の見出し選択部から出力される
第２の書式変換文書である。FIG. 10 illustrates a second format conversion document output from the heading selection unit according to the second embodiment.

【図１１】実施の形態２の段落選択部から出力される第
３の書式変換文書である。FIG. 11 shows a third format conversion document output from the paragraph selection unit according to the second embodiment.

【図１２】実施の形態２の文選択部から出力される最終
書式変換文書である。FIG. 12 illustrates a final format conversion document output from the sentence selection unit according to the second embodiment.

【図１３】実施の形態２の読み上げ文生成部から出力さ
れる読み上げ文である。FIG. 13 is a reading sentence output from a reading sentence generation unit according to the second embodiment.

【図１４】実施の形態３の書式変換部の構成図である。FIG. 14 is a configuration diagram of a format conversion unit according to the third embodiment.

【図１５】実施の形態３の書式変換部に入力される入力
文書である。FIG. 15 illustrates an input document input to the format conversion unit according to the third embodiment.

【図１６】実施の形態３の書式変換部から出力される第
１の書式変換文書である。FIG. 16 illustrates a first format conversion document output from a format conversion unit according to the third embodiment.

【図１７】実施の形態４の選択部の構成図である。FIG. 17 is a configuration diagram of a selection unit according to the fourth embodiment.

【図１８】実施の形態４の文選択部から出力される第４
の書式変換文書である。FIG. 18 is a fourth format conversion document output from the sentence selection unit according to the fourth embodiment.

【図１９】実施の形態４の読み飛ばし制限部から出力さ
れる最終書式変換文書である。FIG. 19 illustrates a final format conversion document output from the skipping restriction unit according to the fourth embodiment.

【図２０】実施の形態４の読み上げ文生成部から出力さ
れる読み上げ文である。FIG. 20 is a reading sentence output from a reading sentence generation unit according to the fourth embodiment.

【図２１】実施の形態５の選択部の構成図である。FIG. 21 is a configuration diagram of a selection unit according to the fifth embodiment.

【図２２】実施の形態５の選択部で入出力される第３の
書式変換文書、第４の書式変換文書、最終書式変換文書
である。FIG. 22 shows a third format conversion document, a fourth format conversion document, and a final format conversion document input and output by the selection unit according to the fifth embodiment.

【図２３】実施の形態６の選択部の構成図である。FIG. 23 is a configuration diagram of a selection unit according to the sixth embodiment.

【図２４】実施の形態６の読み上げ文生成部の構成図で
ある。FIG. 24 is a configuration diagram of a reading sentence generation unit according to the sixth embodiment.

【図２５】実施の形態６の選択部で入出力される最終書
式変換文書、読み上げ文である。FIG. 25 illustrates a final format conversion document and a read-aloud sentence input / output by the selection unit according to the sixth embodiment.

【図２６】実施の形態７における読み上げ文生成部の構
成図である。FIG. 26 is a configuration diagram of a reading sentence generation unit according to the seventh embodiment.

【図２７】実施の形態７の読み上げ文生成部で入出力さ
れる最終書式変換文書、読み上げ文である。FIG. 27 shows a final format conversion document and a read-aloud sentence that are input and output by the read-aloud sentence generation unit according to the seventh embodiment.

【図２８】実施の形態８の読み上げ文生成部の構成図で
ある。FIG. 28 is a configuration diagram of a reading sentence generation unit according to the eighth embodiment.

【図２９】実施の形態８の読み上げ文生成部に入力され
る最終書式変換文書である。FIG. 29 shows a final format conversion document input to the reading sentence generating unit according to the eighth embodiment.

【図３０】実施の形態８の読み上げ文生成部から出力さ
れる読み上げ文である。FIG. 30 shows a read-aloud sentence output from a read-aloud sentence generator according to the eighth embodiment.

【図３１】従来の音声合成装置の読み飛ばし処理のフロ
ーチャートである。FIG. 31 is a flowchart of a skipping process of a conventional speech synthesizer.

[Explanation of symbols]

１０書式変換部、１１見出し判別部、１２段落判
別部、１３書式変換文書出力部、１４見出し書式解
析部、２０選択部、２１見出し選択部、２２段落
選択部、２３文選択部、２４読み飛ばし制限部、２
５文境界キーワード検出部、２６段落追加部、３０
読み上げ文生成部、３１読み上げ文出力部、３２
キーワード抽出部、３３修飾節処理部、３４読み飛
ばし文字数通知部、４０音声合成部、４１制御部、
４２テキスト音声変換部、４３合図音発生部、４４
加算器、５０入力文書、５３見出し書式、６０第
１の書式変換文書、６１第２の書式変換文書、６２
第３の書式変換文書、６３第４の書式変換文書、６４
文境界キーワードの位置情報、７０最終書式変換文
書、７１キーワード、７２連体修飾節を削除した
文、７３読み飛ばし通知文、８０読み上げ文、８１
読み上げ文字列、８２発生指令、９０出力音声、９
１合成音声、９２合図音。10 format conversion unit, 11 heading discrimination unit, 12 paragraph discrimination unit, 13 format conversion document output unit, 14 heading format analysis unit, 20 selection unit, 21 heading selection unit, 22 paragraph selection unit, 23 sentence selection unit, 24 skipping restriction unit, 25 sentence boundary keyword detection unit, 26 paragraph addition unit, 30 reading sentence generation unit, 31 reading sentence output unit, 32 keyword extraction unit, 33 modifier clause processing unit, 34 skipped-character-count notification unit, 40 speech synthesis unit, 41 control unit, 42 text-to-speech conversion unit, 43 signal sound generation unit, 44 adder, 50 input document, 53 heading format, 60 first format conversion document, 61 second format conversion document, 62 third format conversion document, 63 fourth format conversion document, 64 position information of sentence boundary keyword, 70 final format conversion document, 71 keyword, 72 sentence from which the adnominal modifier clause has been deleted, 73 skip notification sentence, 80 reading sentence, 81 reading character string, 82 generation command, 90 output speech, 91 synthesized speech, 92 signal sound.

Claims

1. A speech synthesizer comprising: a format conversion unit that analyzes an input document divided into a plurality of sentences, and generates and outputs a format conversion document consisting of the character information of each sentence of the heading items and of the paragraphs following them in the input document, together with structure information by which the heading items and the paragraphs can be identified; a selection unit that selects predetermined sentences in the format conversion document, and generates and outputs a final format conversion document in which this selection information is added to the format conversion document; a reading sentence generation unit that generates and outputs, as a reading sentence, the character information of the sentences to which the selection information has been added in the final format conversion document; and a speech synthesis unit that converts the reading sentence into speech and outputs it.

2. The speech synthesizer according to claim 1, wherein the format conversion unit comprises: a heading discrimination unit that, from an input document having document structure tags by which heading items and paragraphs can be identified, discriminates the heading items on the basis of the document structure tags and outputs structure information by which the heading items can be identified; a paragraph discrimination unit that discriminates the paragraphs in the input document on the basis of the document structure tags and outputs structure information by which the paragraphs can be identified; and a format conversion document output unit that extracts the character information from the input document, and generates and outputs a format conversion document consisting of the extracted character information and the structure information by which the heading items and the paragraphs can be identified.

3. The speech synthesizer according to claim 1, wherein the format conversion unit comprises: a heading format analysis unit that examines the input document for a format representing the form of its headings and outputs that format; a heading discrimination unit that discriminates character strings representing heading items on the basis of the format and outputs structure information by which the heading items can be identified; a paragraph discrimination unit that discriminates, among the sentences following a heading item, a paragraph beginning with an indentation and outputs structure information by which the paragraph can be identified; and a format conversion document output unit that generates and outputs a format conversion document consisting of the character information in the input document and the structure information by which the heading items and the paragraphs can be identified.

4. The speech synthesizer according to claim 1, wherein the selection unit selects heading items down to a predetermined hierarchy level from an input document having a plurality of hierarchical heading items, and generates and outputs a final format conversion document in which this selection information is added to the format conversion document.

5. The speech synthesizer according to claim 1, wherein the selection unit comprises: a heading selection unit that selects predetermined heading items in the format conversion document and adds this selection information to the format conversion document; a paragraph selection unit that selects predetermined paragraphs constituting the hierarchy below the selected heading items and adds this selection information to the format conversion document; and a sentence selection unit that selects predetermined sentences in the selected paragraphs and adds this selection information to the format conversion document.

6. The speech synthesizer according to claim 5, wherein the sentence selection unit selects the first sentence and the last sentence of the paragraph selected by the paragraph selection unit, and adds this selection information to the format conversion document.

7. The speech synthesizer according to claim 5, wherein, when the number of characters in a plurality of skipped sentences between a selected sentence and the next selected sentence exceeds a predetermined value, the sentence selection unit selects a sentence located in the middle of the plurality of skipped sentences, and adds this selection information to the format conversion document.

8. The speech synthesizer according to claim 5, further comprising: a sentence boundary keyword detection unit that detects, among the plurality of sentences constituting the paragraph selected by the paragraph selection unit, a sentence boundary keyword indicating a break in a series of sentences, and outputs the position of the sentence boundary keyword; and a paragraph addition unit that divides the paragraph at the position of the detected sentence boundary keyword and adds a new paragraph, wherein the sentence selection unit performs its selection on the new paragraph added by the paragraph addition unit and adds this selection information to the format conversion document.

9. The speech synthesizer according to claim 1, wherein the reading sentence generation unit comprises a keyword extraction unit that examines the appearance frequency of independent words in the sentences of the paragraph to which the selection information has been added, and outputs independent words with a high appearance frequency as the reading sentence.

10. The speech synthesizer according to claim 1, wherein the reading sentence generation unit comprises a modifier clause processing unit that detects an adnominal modifier clause in a sentence of the paragraph to which the selection information has been added, and outputs the sentence with the adnominal modifier clause deleted as the reading sentence.

11. The speech synthesizer according to claim 1, wherein the reading sentence generation unit comprises a skipped-character-count notification unit that, when the number of characters in the sentences skipped between a sentence to which the selection information has been added and the next sentence to which the selection information has been added exceeds a predetermined value, adds to the reading sentence a skip notification sentence announcing the number of characters in the skipped sentences.

12. The speech synthesizer according to claim 1, wherein the speech synthesis unit comprises: a control unit that outputs the character information in the reading sentence output from the reading sentence generation unit as a reading character string, and further outputs a generation command for a signal sound in accordance with the boundaries of the sentences in the reading sentence; a text-to-speech conversion unit that converts the reading character string output from the control unit into synthesized speech; a signal sound generation unit that outputs a signal sound in accordance with the generation command; and an adder that adds the synthesized speech and the signal sound to generate and output the output speech.