JP3712325B2 - Document reading device - Google Patents
- Publication number
- JP3712325B2
- Authority
- JP
- Japan
- Prior art keywords
- document
- sound
- item
- voice
- pitch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Description
【0001】
【発明の属する技術分野】
本発明は、文書読み上げ装置、より詳細には、文書構造に沿って文書を読み上げる際に、テキスト文書中の包含関係や並列関係の構造が音声合成で読み上げさせた時に分かるように、各構造の部分またはその区切り部分にガイド用音声や音響を付加して、聞き手が文書中の話題の変化に追従して内容を理解できるようにしたものである。
【0002】
【従来の技術】
通常、文書は視覚的なレイアウトを施すことによって、複数の部分に分離されている。そして、各部分の位置関係によって、包含関係や並列関係の構造が作られ、その関連の強さが表現されている。一方、現在のテキスト音声合成では、文書を単調に読み上げるだけで、せいぜい文字修飾によって音質を変えて読み上げたり、或いは、電子メール等で明示的に信号等で示された引用範囲に対して音質を変えて読み上げる程度であり、レイアウト的に表現されている文書間の関連情報を表現することはできていない。
【0003】
【発明が解決しようとする課題】
本発明は、上述のごとき実情に鑑みてなされたもので、レイアウト情報により構造化された文書をテキスト音声合成で読み上げる際に、対象文書をその包含関係等に従って階層化して、聞き手が文書構造を認識することを助け、文書内容の理解を支援すること、更には、画面上に表示されない構造情報も利用して聞き手が内容の理解に役立つようにすることを目的としてなされたものである。
【0004】
【課題を解決するための手段】
請求項1の発明は、構造化文書を規則音声合成によって読み上げる文書読み上げ装置において、ガイド音声及び/又は音響データ記憶装置、及び、音声合成装置及び/又は音響信号発生装置を有し、前記構造化文書の文書構造の区切りを、ガイド用の音声で表現するようにしたことを特徴としたものである。
【0014】
【発明の実施の形態】
図1は、本発明が適用される文書読み上げ装置の一例を説明するための要部ブロック図で、図中、1は構造化文書データが入力されるデータ入力装置、2はデータ入力装置1より入力されたデータの構造を分析するデータ構造分析装置、3はデータ記憶装置、4は音声出力制御装置、5はガイド音声/音響データ記憶装置、6はユーザによって操作される入力装置、7は音声合成装置、8は音響信号発生装置、9はスピーカ,ブザー等の音を発声する出力装置で、本発明は、データ入力装置1より入力された構造化文書を、文書構造に沿って読み上げて、出力装置より音声出力する際に、入力装置よりの要求により、構造の区切りにガイド音や音響を入れ、構造文書の区切りを明確にしたものである。
【0015】
上述のように、本発明は、文書構造に沿って文書を読み上げる際に、構造化文書の区切りを明確にして、聞き手が文書構造を認識し、文書内容をよく理解できるようにしたものであるが、文書構造を音響又は音声で伝える方法としては、例えば、文書構造の区切りを明確化する方法,項目間の関連性を表現する方法等がある。
【0016】
文書構造の区切りを明確にする方法:
・構造の区切りに、音響を入れる。
「ピー」とか「ブー」などの音を連続または単発で一回ないし複数回入れる。
・無音部分を挿入する。
・各話題項目の隔たりの程度に応じた案内用の音声を入れる。
・項目全体の背景に音響信号を重ねる。
ある階層の項目に対して背景音を重ね合わせ、項目が変わると背景の音も変更することで読み上げ項目の変更を知る。
【0017】
項目間の関連性を表現する方法:
・区切りの音響の変化であらわす。
関連性の程度で音響の長さを変える。例えば、関係の薄いものは、時間を長くする。
音響の回数を変化させる。短い音を複数回入れて、関係性が薄いと回数を多くする。
・音響の間隔を変える。例えば、一定回数の音を入れることにしておき、関係性が薄くなると間隔を長くする。
・音響の音程を変える。前後の関係性が低いと音も低くする。
・無音部分を入れる。前後の関係性の程度により無音時間を変える。
・以上の組み合わせ。
【0018】
・ガイド用の音声で表現する。
本文を読んでいる音声と別の音質のものを用意しておき、話題が大きく変わった時にガイドする。例えば、「話は変わって」、や「次の話題です」などの語句を階層毎に決めておき、階層中の項目が変わった時に対応する語句を挿入する。
箇条書き項目で、項目番号が数字表記されていない場合や、複数の段落がある場合に、何番目とガイドする。
【0019】
・BGMや音響によって大体の階層の深度を知る。
例えば、階層の深さによって音程や速度を変える。例えば、階層が深くなると、BGMや音響の速さを変える。同じBGMなので読み上げの範囲は変わっていないが、詳細な項目に移動していく時に利用できる。
音響信号の音程を変える。
【0020】
(実施例1)
図2は、本発明が適用される文書構造の一例(新聞記事の例)を示す図で、以下、音響信号の音程や長狭で異動方向をあらわす例を説明するが、以下において、( )内は音響、〈 〉内は信号の意味である。
政治( ピ) 〈下位の階層へ移動〉
選挙活動 (ピ)〈下位の階層へ移動〉
自民党の活動 (プ) 民主党の活動 (プ)〈並列的な項目が並んでいる〉
社民党の活動 (プ) 共産党の活動( ブー)〈上位の階層へ移動〉
政府の動き (ブー)〈上位の階層へ〉
経済 (ピ)〈下位の階層へ〉
円安の動向 (プ)〈同じ階層の項目が並ぶ〉
株価動向 (ブー)〈上位の階層へ〉
スポーツ
【0021】
(実施例2)
図3は、本発明が適用される文書構造の他の例(カタログの例)を示す図で、以下、音響の数で区切りや異動方向をあらわす例を説明する。
食料品(ピピ)〈下位の階層へ移動〉
清涼飲料(ピピ)〈下の階層へ〉
炭酸飲料(ピ)〈並列された項目が並ぶ〉
果汁飲料(ピピピピ)〈上位の階層へ〉
お菓子(ピピピピ)〈上位の階層へ〉
日用品(ピピ)〈下の階層へ移動〉
台所用品(ピ)
洗濯用品(ピ)
お風呂用品(ピピピピ)〈上位の階層へ〉
衣類(ピピ)〈下の階層へ移動〉
子供服(ピ)
婦人服(ピ)
紳士服(ピピピピ)
【0022】
(実施例3)
図3に示したカタログの例を用いて、背景音楽(BGM)の種類や音程で階層の移動をあらわす例を説明する。
食料品(ピピ)(同じ分野の間は同一の音楽が背景に流れる)
清涼飲料(背景音楽の音程が高くなる)〈下位の階層へ〉
炭酸飲料(ピ)〈並列された項目が並ぶ〉
果汁飲料(背景音楽の音程が低くなる)〈上位の階層へ〉
お菓子(背景音楽の音程が最初に戻る)〈上位の階層へ〉
日用品(ピピ)(背景音楽が別のものに変わる)
台所用品(背景音楽の音程が高くなる)〈下の階層へ移動〉
洗濯用品(ピ)
お風呂用品(背景音楽の音程が低くなる)〈上位の階層へ〉
衣類(ピピ)(背景音楽が別のものに変わる)
【0023】
(実施例4)
図4は、本発明が適用される文書構造の他の例(雑誌の記事の例)を示す図で、以下、ガイド音声と音響を用いた例について説明するが、( )内はガイド用音声,音響である。
(最初の話題は) 政治
(最初の項目は) 各政党の公約
自民党の公約(ピピ)、民主党の公約(ピピ)、共産党の公約(ブー)
(2番目の項目は) 獲得議席予測
(話は変わって)
スポーツ情報
(最初の項目は) ワールドカップ
(2番目の項目は) プロ野球
○○連敗
(3番目の項目は) 相撲
(話は変わって)
芸能情報
【0024】
【発明の効果】
以上の説明から明らかなように、本発明によると、構造化文書を文書構造に沿って読み上げる際に、文書構造の区切りを音響や音声によって明確にし、或いは、項目間の関係性を音響やガイド音声によって明確にすることができる。
【図面の簡単な説明】
【図1】 本発明が適用される文書読み上げ装置の一例を説明するための要部ブロック図である。
【図2】 本発明が適用される文書構造の一例(新聞記事の例)を示す図である。
【図3】 本発明が適用される文書構造の他の例(カタログの例)を示す図である。
【図4】 本発明が適用される文書構造の他の例(雑誌の記事の例)を示す図である。
【符号の説明】
1…データ入力装置、2…データ構造分析装置、3…データ記憶装置、4…音声出力制御装置、5…ガイド音声/音響データ記憶装置、6…入力装置、7…音声合成装置、8…音響信号発生装置、9…出力装置。
[0001]
[Technical field of the invention]
The present invention relates to a document reading device. More specifically, when a document is read aloud along its document structure, the device adds guide voice or sound to each structural part or to the boundaries between parts, so that the inclusion and parallel relationships in the text document remain apparent when it is read out by speech synthesis and the listener can follow topic changes in the document and understand its content.
[0002]
[Prior art]
Usually, a document is divided into multiple parts by its visual layout. The positional relationships among the parts create structures of inclusion and parallel relationships and express how strongly the parts are related. Current text-to-speech synthesis, by contrast, merely reads a document monotonously. At best, it changes the voice quality according to character decoration, or changes the voice quality for quoted ranges that are explicitly indicated by signals or similar markers, as in e-mail. It cannot express the relational information between document parts that the layout conveys.
[0003]
[Problems to be solved by the invention]
The present invention has been made in view of the above circumstances. When a document structured by layout information is read aloud by text-to-speech synthesis, its purpose is to hierarchize the target document according to its inclusion relationships and the like, thereby helping the listener recognize the document structure and supporting understanding of the document content. A further purpose is to use structural information that is not displayed on the screen to help the listener understand the content.
[0004]
[Means for Solving the Problems]
The invention of claim 1 is a document reading device that reads a structured document aloud by rule-based speech synthesis, comprising a guide voice and/or sound data storage device and a speech synthesis device and/or sound signal generation device, characterized in that boundaries in the document structure of the structured document are expressed by guide voice.
[0014]
[Embodiments of the invention]
FIG. 1 is a block diagram of the main parts of an example document reading device to which the present invention is applied. In the figure, 1 is a data input device to which structured document data is input; 2 is a data structure analysis device that analyzes the structure of the data input from the data input device 1; 3 is a data storage device; 4 is a voice output control device; 5 is a guide voice/sound data storage device; 6 is an input device operated by the user; 7 is a speech synthesis device; 8 is a sound signal generation device; and 9 is an output device, such as a speaker or buzzer, that produces sound. When the structured document input from the data input device 1 is read aloud along its document structure and output as audio from the output device, the present invention inserts guide voice or sound at structural boundaries, upon request from the input device, to make the boundaries of the structured document clear.
[0015]
As described above, the present invention clarifies the boundaries of a structured document when reading it aloud along its document structure, so that the listener can recognize the document structure and understand the content well. Methods for conveying document structure by sound or voice include, for example, clarifying the boundaries of the document structure and expressing the relationships between items.
[0016]
Methods for clarifying document structure boundaries:
・Insert a sound at structural boundaries.
Insert a sound such as "pii" or "boo" once or several times, either continuously or as single shots.
・Insert a silent interval.
・Insert a guide voice that reflects how far apart the topic items are.
・Overlay a sound signal on the background of an entire item.
A background sound is superimposed on the items of a given level, and when the item changes the background sound also changes, so the listener knows that the item being read has changed.
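The boundary cues listed above can be sketched as raw audio samples. The following is a minimal illustration, not the patent's implementation: the sample rate, amplitudes, default durations, and the function names `tone`, `silence`, `beeps`, and `overlay` are all assumptions introduced here.

```python
import math

SAMPLE_RATE = 16_000  # samples per second (assumed value)


def tone(freq_hz: float, duration_ms: int, amplitude: float = 0.5) -> list:
    """Return a sine tone as float samples in [-amplitude, amplitude]."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


def silence(duration_ms: int) -> list:
    """Return a silent interval of the given length."""
    return [0.0] * (SAMPLE_RATE * duration_ms // 1000)


def beeps(freq_hz: float, count: int, beep_ms: int = 80, gap_ms: int = 60) -> list:
    """Return `count` short beeps separated by silent gaps."""
    out = []
    for k in range(count):
        if k:  # no gap before the first beep
            out += silence(gap_ms)
        out += tone(freq_hz, beep_ms)
    return out


def overlay(foreground: list, background: list, gain: float = 0.2) -> list:
    """Mix a looping background sound under the foreground samples."""
    if not background:
        return list(foreground)
    return [s + gain * background[i % len(background)]
            for i, s in enumerate(foreground)]
```

A boundary could then be rendered as, for example, `beeps(880.0, 2) + silence(300)`, and synthesized speech samples for an item could be passed through `overlay` with that item's background sound.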
[0017]
Methods for expressing relationships between items:
・Express them by varying the boundary sound.
Vary the length of the sound according to the degree of relatedness. For example, a weakly related item gets a longer sound.
Vary the number of sounds. Insert short sounds several times, using more repetitions when the relationship is weaker.
・Vary the interval between sounds. For example, fix the number of sounds and lengthen the interval as the relationship becomes weaker.
・Vary the pitch of the sound. The weaker the relationship between the preceding and following items, the lower the pitch.
・Insert a silent interval. Vary the silence duration according to the degree of relatedness between the preceding and following items.
・Combinations of the above.
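As a rough sketch of the "combination" case, the mapping below turns a relatedness score into cue parameters, following the directions stated above (weaker relationship: longer sound, more repetitions, wider spacing, lower pitch, longer silence). The score range, every numeric value, and the names `BoundaryCue` and `boundary_cue` are assumptions made for illustration; the patent does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryCue:
    """Parameters of the sound inserted between two items."""
    beep_ms: int      # length of each beep
    beep_count: int   # number of beeps
    gap_ms: int       # pause between consecutive beeps
    pitch_hz: float   # beep frequency
    silence_ms: int   # silent interval after the cue


def boundary_cue(relatedness: float) -> BoundaryCue:
    """Map relatedness in [0, 1] (1 = closely related) to cue parameters.

    All numeric ranges are illustrative assumptions, not values from the patent.
    """
    if not 0.0 <= relatedness <= 1.0:
        raise ValueError("relatedness must be within [0, 1]")
    weakness = 1.0 - relatedness
    return BoundaryCue(
        beep_ms=round(100 + 400 * weakness),     # weaker relation: longer sound
        beep_count=1 + round(3 * weakness),      # weaker relation: more beeps
        gap_ms=round(50 + 250 * weakness),       # weaker relation: wider spacing
        pitch_hz=880.0 - 440.0 * weakness,       # weaker relation: lower pitch
        silence_ms=round(200 + 800 * weakness),  # weaker relation: longer silence
    )
```

A real device would more likely vary only one or two of these parameters at a time; varying all five together is simply the most compact way to show each rule.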
[0018]
・Express them with guide voice.
Prepare a voice whose quality differs from the voice reading the body text, and use it to guide the listener when the topic changes substantially. For example, phrases such as "changing the subject" or "next topic" are assigned to each level of the hierarchy, and the corresponding phrase is inserted when the item at that level changes.
For bulleted items whose item numbers are not written as numerals, or when there are multiple paragraphs, the guide announces the ordinal position of each.
[0019]
・Indicate the approximate depth in the hierarchy through BGM or sound.
For example, vary the pitch or tempo according to the depth of the hierarchy. For example, change the tempo of the BGM or sound as the hierarchy gets deeper. Because the BGM stays the same, the listener knows the reading scope has not changed, which is useful when moving into more detailed items.
Vary the pitch of the sound signal.
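One way to make depth audible is to shift pitch by a fixed musical interval and scale tempo by a fixed ratio per level. The sketch below assumes that deeper levels are higher and faster; this paragraph of the patent only says that pitch and tempo vary with depth. The base values, step sizes, and function names are likewise assumptions.

```python
def pitch_for_depth(depth: int, base_hz: float = 440.0,
                    semitones_per_level: int = 2) -> float:
    """Return the cue or BGM pitch for a hierarchy depth (0 = top level).

    Uses equal temperament: each semitone multiplies frequency by 2**(1/12).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return base_hz * 2 ** (depth * semitones_per_level / 12)


def tempo_for_depth(depth: int, base_bpm: float = 90.0,
                    ratio_per_level: float = 1.1) -> float:
    """Return the BGM tempo for a hierarchy depth (0 = top level)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return base_bpm * ratio_per_level ** depth
```

With the default of two semitones per level, six levels of nesting raise the pitch by exactly one octave, so a shallow hierarchy stays within a comfortable range.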
[0020]
(Example 1)
FIG. 2 shows an example of a document structure to which the present invention is applied (a newspaper article). An example in which the pitch and length of a sound signal indicate the direction of movement is described below. In what follows, ( ) encloses the sound and < > encloses the meaning of the signal.
Politics (pi) <move to lower level>
Election campaigning (pi) <move to lower level>
LDP activities (pu) Democratic Party activities (pu) <parallel items are listed>
Social Democratic Party activities (pu) Communist Party activities (boo) <move to higher level>
Government moves (boo) <to higher level>
Economy (pi) <to lower level>
Yen depreciation trends (pu) <items on the same level are listed>
Stock price trends (boo) <to higher level>
Sports
[0021]
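The cue sequence of Example 1 follows from a simple rule: after each item, compare its depth with the depth of the next item. The sketch below encodes that rule. The depth values in `NEWSPAPER` are inferred here from the <...> annotations, since FIG. 2 itself is not reproduced in the text, and the names `transition_cue` and `annotate` are hypothetical.

```python
from typing import List, Optional, Tuple

Outline = List[Tuple[int, str]]  # (depth, title) pairs in reading order

# Depths are an inference from the annotations of Example 1.
NEWSPAPER: Outline = [
    (0, "Politics"),
    (1, "Election campaigning"),
    (2, "LDP activities"),
    (2, "Democratic Party activities"),
    (2, "Social Democratic Party activities"),
    (2, "Communist Party activities"),
    (1, "Government moves"),
    (0, "Economy"),
    (1, "Yen depreciation trends"),
    (1, "Stock price trends"),
    (0, "Sports"),
]


def transition_cue(depth: int, next_depth: Optional[int]) -> Optional[str]:
    """Pick the sound played after an item, based on where reading goes next."""
    if next_depth is None:
        return None   # last item: nothing follows
    if next_depth > depth:
        return "pi"   # moving to a lower level
    if next_depth == depth:
        return "pu"   # parallel item follows
    return "boo"      # moving to a higher level


def annotate(outline: Outline) -> List[Tuple[str, Optional[str]]]:
    """Pair every title with the cue that follows it."""
    result = []
    for i, (depth, title) in enumerate(outline):
        next_depth = outline[i + 1][0] if i + 1 < len(outline) else None
        result.append((title, transition_cue(depth, next_depth)))
    return result
```

Running `annotate(NEWSPAPER)` yields the cues pi, pi, pu, pu, pu, boo, boo, pi, pu, boo and no cue after the final item, matching the listing in Example 1.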
(Example 2)
FIG. 3 shows another example of a document structure to which the present invention is applied (a catalog). An example in which the number of sounds indicates boundaries and the direction of movement is described below.
Food (pipi) <move to lower level>
Soft drinks (pipi) <to lower level>
Carbonated drinks (pi) <parallel items are listed>
Fruit juice drinks (pipipipi) <to higher level>
Confectionery (pipipipi) <to higher level>
Daily necessities (pipi) <move to lower level>
Kitchenware (pi)
Laundry goods (pi)
Bath goods (pipipipi) <to higher level>
Clothing (pipi) <move to lower level>
Children's clothing (pi)
Women's clothing (pi)
Men's clothing (pipipipi)
[0022]
(Example 3)
Using the catalog example shown in FIG. 3, an example in which the type and pitch of background music (BGM) indicate movement through the hierarchy is described below.
Food (pipi) (the same music plays in the background throughout the same category)
Soft drinks (background music pitch rises) <to lower level>
Carbonated drinks (pi) <parallel items are listed>
Fruit juice drinks (background music pitch falls) <to higher level>
Confectionery (background music pitch returns to its initial value) <to higher level>
Daily necessities (pipi) (background music changes to a different piece)
Kitchenware (background music pitch rises) <move to lower level>
Laundry goods (pi)
Bath goods (background music pitch falls) <to higher level>
Clothing (pipi) (background music changes to a different piece)
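A simplified model of the BGM behaviour is sketched below: each top-level category selects a different background piece, and the pitch offset within that piece tracks the depth of the item being read. This is only one possible reading. It ignores the "pipi" and "pi" cues that Example 3 mixes in and does not try to reproduce every annotation in the listing. The depth values and the names `BgmState` and `bgm_plan` are assumptions.

```python
from typing import List, NamedTuple, Tuple


class BgmState(NamedTuple):
    title: str
    track: int       # index of the background piece that is playing
    pitch_step: int  # pitch offset relative to the piece's initial pitch


def bgm_plan(outline: List[Tuple[int, str]]) -> List[BgmState]:
    """Assign a BGM track and pitch offset to every (depth, title) item."""
    track = -1
    plan = []
    for depth, title in outline:
        if depth == 0:
            track += 1  # a new top-level category switches the music
        if track < 0:
            raise ValueError("outline must start with a top-level item")
        plan.append(BgmState(title, track, depth))  # deeper item: higher pitch
    return plan


# Depths are an inference from the catalog example, not taken from FIG. 3.
CATALOG = [
    (0, "Food"), (1, "Soft drinks"), (2, "Carbonated drinks"),
    (2, "Fruit juice drinks"), (1, "Confectionery"),
    (0, "Daily necessities"), (1, "Kitchenware"), (1, "Laundry goods"),
    (1, "Bath goods"), (0, "Clothing"),
]
```

In this model the listener hears the same piece for everything under "Food", at a pitch that rises on entering "Soft drinks" and again on entering its sub-items, and a different piece starts at "Daily necessities".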
[0023]
(Example 4)
FIG. 4 shows another example of a document structure to which the present invention is applied (a magazine article). An example using guide voice and sound is described below; ( ) encloses the guide voice or sound.
(The first topic is) Politics
(The first item is) Each party's campaign pledges
LDP's pledges (pipi), Democratic Party's pledges (pipi), Communist Party's pledges (boo)
(The second item is) Seat projections
(Changing the subject)
Sports news
(The first item is) World Cup
(The second item is) Professional baseball
XX losing streak
(The third item is) Sumo
(Changing the subject)
Entertainment news
[0024]
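The guide-voice part of Example 4 can be generated from the outline alone: the first top-level topic is announced as such, later topics get a "changing the subject" phrase, and second-level items are announced by ordinal position, restarting under each topic. The sketch below covers only the guide voice and leaves out the "pipi" and "boo" sounds. The depth values, the English phrase wording, and the names `item_guide` and `guide_script` are assumptions.

```python
from typing import List, Optional, Tuple

ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def item_guide(n: int) -> str:
    """Return the guide phrase announcing the n-th item (1-based)."""
    if 1 <= n <= len(ORDINALS):
        return f"The {ORDINALS[n - 1]} item is"
    return f"Item number {n} is"


def guide_script(outline: List[Tuple[int, str]]) -> List[Tuple[Optional[str], str]]:
    """Pair each (depth, title) item with the guide phrase spoken before it."""
    script = []
    topics = 0
    items = 0
    for depth, title in outline:
        if depth == 0:
            topics += 1
            items = 0  # item numbering restarts under each topic
            guide = "The first topic is" if topics == 1 else "Changing the subject"
        elif depth == 1:
            items += 1
            guide = item_guide(items)
        else:
            guide = None  # deeper items get no guide voice in this sketch
        script.append((guide, title))
    return script


# Depths are an inference from the listing of Example 4.
MAGAZINE = [
    (0, "Politics"), (1, "Each party's campaign pledges"),
    (2, "LDP's pledges"), (2, "Democratic Party's pledges"),
    (2, "Communist Party's pledges"), (1, "Seat projections"),
    (0, "Sports news"), (1, "World Cup"), (1, "Professional baseball"),
    (2, "XX losing streak"), (1, "Sumo"), (0, "Entertainment news"),
]
```

The guide phrases would be rendered in a voice distinct from the one reading the titles, as the description of the guide-voice method suggests.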
[Effects of the invention]
As is clear from the above description, according to the present invention, when a structured document is read aloud along its document structure, the boundaries of the document structure can be made clear by sound or voice, or the relationships between items can be made clear by sound or guide voice.
[Brief description of the drawings]
FIG. 1 is a block diagram of the main parts of an example document reading device to which the present invention is applied.
FIG. 2 is a diagram showing an example of a document structure to which the present invention is applied (a newspaper article).
FIG. 3 is a diagram showing another example of a document structure to which the present invention is applied (a catalog).
FIG. 4 is a diagram showing another example of a document structure to which the present invention is applied (a magazine article).
[Explanation of symbols]
1 ... Data input device, 2 ... Data structure analysis device, 3 ... Data storage device, 4 ... Voice output control device, 5 ... Guide voice/sound data storage device, 6 ... Input device, 7 ... Speech synthesis device, 8 ... Sound signal generation device, 9 ... Output device.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP25615498A JP3712325B2 (en) | 1998-09-10 | 1998-09-10 | Document reading device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP25615498A JP3712325B2 (en) | 1998-09-10 | 1998-09-10 | Document reading device |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2000089777A JP2000089777A (en) | 2000-03-31 |
JP3712325B2 true JP3712325B2 (en) | 2005-11-02 |
Family
ID=17288668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP25615498A Expired - Fee Related JP3712325B2 (en) | 1998-09-10 | 1998-09-10 | Document reading device |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP3712325B2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4678946B2 (en) * | 2000-12-28 | 2011-04-27 | 富士通株式会社 | Voice interactive information processing apparatus and recording medium |
JP4668542B2 (en) * | 2004-03-17 | 2011-04-13 | 株式会社リコー | Status notification device, electronic device, and status notification method |
JP2006137033A (en) * | 2004-11-10 | 2006-06-01 | Toppan Forms Co Ltd | Voice message transmission sheet |
JP6441177B2 (en) * | 2015-07-29 | 2018-12-19 | 日本電信電話株式会社 | PAUSE LENGTH DETERMINING DEVICE, PAUSE LENGTH DETERMINING METHOD, AND PROGRAM |
EP3441873A4 (en) * | 2016-04-05 | 2019-03-20 | Sony Corporation | Information processing apparatus, information processing method, and program |
CN111415650A (en) * | 2020-03-25 | 2020-07-14 | 广州酷狗计算机科技有限公司 | Text-to-speech method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
JP2000089777A (en) | 2000-03-31 |
Legal Events
Code | Title | Description |
---|---|---|
A977 | Report on retrieval | Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20050406 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20050607 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20050728 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20050816 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20050816 |
R150 | Certificate of patent or registration of utility model | Free format text: JAPANESE INTERMEDIATE CODE: R150 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20080826; Year of fee payment: 3 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20090826; Year of fee payment: 4 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20090826; Year of fee payment: 4 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20100826; Year of fee payment: 5 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20100826; Year of fee payment: 5 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20110826; Year of fee payment: 6 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20110826; Year of fee payment: 6 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120826; Year of fee payment: 7 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120826; Year of fee payment: 7 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20130826; Year of fee payment: 8 |
LAPS | Cancellation because of no payment of annual fees | |