JP3712325B2 - Document reading device - Google Patents

Document reading device

Info

Publication number
JP3712325B2
Authority
JP
Japan
Prior art keywords
document
sound
item
voice
pitch
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP25615498A
Other languages
Japanese (ja)
Other versions
JP2000089777A (en)
Inventor
義文 櫻又
潤一郎 藤本
博雄 北川
哲也 酒寄
敬 有吉
裕一 小島
淳一 鷹見
彬 呂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1998-09-10
Filing date
1998-09-10
Publication date
2005-11-02
Application filed by Ricoh Co Ltd
Priority to JP25615498A
Publication of JP2000089777A
Application granted
Publication of JP3712325B2

Description

[0001]
[Technical field of the invention]
The present invention relates to a document reading device. More specifically, when a document is read aloud along its document structure, guide voice or sound is added to each structural part or to the breaks between parts, so that the inclusion and parallel relationships in the text document can be understood when it is read out by speech synthesis, and so that the listener can follow topic changes in the document and understand its content.
[0002]
[Prior art]
Usually, a document is divided into multiple parts by its visual layout. The positional relationship between the parts creates structures of inclusion and parallel relationships and expresses how strongly the parts are related. Current text-to-speech synthesis, on the other hand, merely reads a document in a monotone. At best it changes the voice quality according to character decoration, or changes the voice quality for quoted ranges that are explicitly marked, as in e-mail. It cannot express the relationships between passages that the layout conveys.
[0003]
[Problems to be solved by the invention]
The present invention has been made in view of the above circumstances. Its purpose is, when a document structured by layout information is read out by text-to-speech synthesis, to arrange the target document into a hierarchy according to its inclusion relationships and the like, so as to help the listener recognize the document structure and support understanding of the document content. A further purpose is to use structural information that is not displayed on the screen to help the listener understand the content.
[0004]
[Means for solving the problems]
The invention of claim 1 is a document reading device that reads out a structured document by rule-based speech synthesis, comprising a guide voice and/or sound data storage device and a speech synthesis device and/or sound signal generation device, and characterized in that breaks in the document structure of the structured document are expressed by a guide voice.
[0014]
[Embodiments of the invention]
FIG. 1 is a block diagram of the main parts of an example document reading device to which the present invention is applied. In the figure, 1 is a data input device to which structured document data is input; 2 is a data structure analysis device that analyzes the structure of the data input from data input device 1; 3 is a data storage device; 4 is a voice output control device; 5 is a guide voice/sound data storage device; 6 is an input device operated by the user; 7 is a speech synthesis device; 8 is a sound signal generation device; and 9 is an output device, such as a speaker or buzzer, that emits sound. When the structured document input from data input device 1 is read aloud along its document structure and output as speech from the output device, the invention inserts guide voice or sound at structural breaks, on request from the input device, to make the breaks in the structured document clear.
[0015]
As described above, when a document is read aloud along its document structure, the present invention makes the breaks in the structured document clear so that the listener can recognize the document structure and understand the content well. Methods for conveying document structure by sound or voice include, for example, making the breaks in the document structure clear and expressing the relationships between items.
[0016]
Methods for making breaks in the document structure clear:
・ Insert a sound at each structural break.
Insert a sound such as "pii" (a beep) or "buu" (a buzz), sustained or as a single short tone, once or several times.
・ Insert a silent interval.
・ Insert a guide voice suited to how far apart the topic items are.
・ Overlay a sound signal as background to an entire item.
A background sound is overlaid on the items of a given level, and the background sound changes when the item changes, so the listener knows that the item being read has changed.
[0017]
Methods for expressing the relationships between items:
・ Vary the sound inserted at the break.
Change the length of the sound according to the degree of relatedness. For example, make the sound longer for weakly related items.
Change the number of sounds. Insert short sounds several times, with more repetitions when the relationship is weak.
・ Change the interval between sounds. For example, fix the number of sounds and lengthen the interval as the relationship becomes weaker.
・ Change the pitch of the sound. When the preceding and following items are weakly related, lower the pitch.
・ Insert a silent interval. Change the length of the silence according to how closely the preceding and following items are related.
・ Combine the above.
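The mappings listed above can be sketched as a single function that turns a relatedness score into cue parameters. This is only an illustrative sketch: the `BreakCue` type, the 0 to 1 `relatedness` scale, and every numeric range are assumptions, since the patent gives directions of change (weaker relationship means longer, more numerous, more widely spaced, lower-pitched sounds) but no concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakCue:
    """Parameters of the sound inserted at a structural break (hypothetical type)."""
    duration_ms: int   # length of each tone
    repeats: int       # how many tones are played
    gap_ms: int        # interval between tones
    pitch_hz: float    # pitch of the tones


def cue_for_relatedness(relatedness: float) -> BreakCue:
    """Map relatedness (1.0 = closely related, 0.0 = unrelated) to a cue.

    All ranges below are assumed values chosen for illustration.
    """
    if not 0.0 <= relatedness <= 1.0:
        raise ValueError("relatedness must be between 0.0 and 1.0")
    weakness = 1.0 - relatedness
    return BreakCue(
        duration_ms=round(100 + 400 * weakness),  # weaker relation: longer tone
        repeats=1 + round(3 * weakness),          # weaker relation: more tones
        gap_ms=round(50 + 250 * weakness),        # weaker relation: wider spacing
        pitch_hz=880.0 - 440.0 * weakness,        # weaker relation: lower pitch
    )
```

Combining all four parameters in one cue corresponds to the last bullet ("combine the above"); a real device might vary only one of them.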
[0018]
・ Express with a guide voice.
Prepare a voice whose quality differs from the one reading the body text, and use it to guide the listener when the topic changes substantially. For example, phrases such as "Changing the subject" or "Here is the next topic" are decided for each level of the hierarchy, and the corresponding phrase is inserted when the item at that level changes.
For bulleted items whose item numbers are not written as numerals, or when there are multiple paragraphs, the guide announces the position of each one in the sequence.
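One way to picture the guide voice is as a second voice tag on the segments handed to the synthesizer. The sketch below assumes a hypothetical segment format of `(voice, text)` pairs and a hypothetical per-level phrase table; neither the voice names nor the level numbering comes from the patent.

```python
from typing import Dict, List, Tuple

BODY_VOICE = "body"    # hypothetical name for the voice that reads the text
GUIDE_VOICE = "guide"  # hypothetical name for the differently voiced guide

# Assumed phrase table: one guide phrase per hierarchy level (0 = top level).
LEVEL_PHRASES: Dict[int, str] = {
    0: "Changing the subject.",
    1: "Here is the next topic.",
}


def segments_for_item(text: str, level: int, item_changed: bool) -> List[Tuple[str, str]]:
    """Return (voice, text) segments, prefixing a guide phrase when the item changes."""
    segments: List[Tuple[str, str]] = []
    phrase = LEVEL_PHRASES.get(level) if item_changed else None
    if phrase is not None:
        segments.append((GUIDE_VOICE, phrase))
    segments.append((BODY_VOICE, text))
    return segments
```

Levels without an entry in the table simply get no guide phrase, which keeps deep, detailed items from being interrupted.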
[0019]
・ Indicate the approximate depth in the hierarchy with background music (BGM) or sound.
For example, change the pitch or tempo according to the depth of the hierarchy; as the hierarchy gets deeper, change the tempo of the BGM or sound. Because the BGM stays the same, the listener knows that the range being read has not changed, which is useful when moving into more detailed items.
Change the pitch of the sound signal.
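As a rough illustration of depth-dependent background music, the sketch below raises the pitch by a fixed number of semitones and the tempo by a fixed step per level. The function name, the base values, the step sizes, and the choice to make deeper levels faster are all assumptions; the text only says that pitch and tempo change with depth.

```python
from typing import Tuple


def bgm_settings(
    depth: int,
    base_pitch_hz: float = 440.0,
    base_tempo_bpm: float = 100.0,
    semitones_per_level: int = 2,
    tempo_step_bpm: float = 10.0,
) -> Tuple[float, float]:
    """Return (pitch_hz, tempo_bpm) for background music at a given depth.

    depth 0 is the top level. Each level shifts the pitch up by
    semitones_per_level equal-tempered semitones (a factor of 2**(1/12) each).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    pitch_hz = base_pitch_hz * 2 ** (semitones_per_level * depth / 12)
    tempo_bpm = base_tempo_bpm + tempo_step_bpm * depth
    return pitch_hz, tempo_bpm
```

Because only pitch and tempo change while the tune stays the same, a listener can tell that the reading is still inside the same top-level section.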
[0020]
(Example 1)
FIG. 2 shows an example of a document structure to which the present invention is applied (a newspaper article). The following is an example in which the pitch and length of the sound signal indicate the direction of movement. Below, ( ) encloses the sound and < > encloses the meaning of the signal.
Politics (pi) <move to lower level>
Election campaigning (pi) <move to lower level>
Liberal Democratic Party activities (pu) Democratic Party activities (pu) <parallel items are listed>
Social Democratic Party activities (pu) Communist Party activities (buu) <move to upper level>
Government moves (buu) <to upper level>
Economy (pi) <to lower level>
Yen depreciation trends (pu) <items at the same level are listed>
Stock price trends (buu) <to upper level>
Sports
[0021]
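The newspaper listing in Example 1 can be reproduced by comparing each item's depth with the depth of the item that follows it. The sketch below is one possible reading, not the patented implementation: the `(title, depth)` outline format is an assumption, as is repeating "buu" once per level when the outline climbs more than one level (Example 1 only ever climbs one level at a time).

```python
from typing import List, Tuple

DOWN, SAME, UP = "pi", "pu", "buu"  # sounds used in Example 1


def cues_for_outline(items: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Pair each title with the sound played after it.

    items holds (title, depth) in reading order; the last item gets no cue.
    """
    result: List[Tuple[str, str]] = []
    for index, (title, depth) in enumerate(items):
        if index + 1 == len(items):
            cue = ""  # nothing follows the final item
        else:
            step = items[index + 1][1] - depth
            if step > 0:
                cue = DOWN
            elif step == 0:
                cue = SAME
            else:
                cue = " ".join([UP] * (-step))  # assumption: one "buu" per level climbed
        result.append((title, cue))
    return result


# The newspaper outline of Example 1, with depths inferred from its cues.
NEWSPAPER = [
    ("Politics", 0),
    ("Election campaigning", 1),
    ("Liberal Democratic Party activities", 2),
    ("Democratic Party activities", 2),
    ("Social Democratic Party activities", 2),
    ("Communist Party activities", 2),
    ("Government moves", 1),
    ("Economy", 0),
    ("Yen depreciation trends", 1),
    ("Stock price trends", 1),
    ("Sports", 0),
]
```

Calling `cues_for_outline(NEWSPAPER)` yields the same sequence of sounds as the listing in Example 1.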
(Example 2)
FIG. 3 shows another example of a document structure to which the present invention is applied (a catalog). The following is an example in which the number of sounds indicates breaks and the direction of movement.
Food (pipi) <move to lower level>
Soft drinks (pipi) <to lower level>
Carbonated drinks (pi) <parallel items are listed>
Fruit juice drinks (pipipipi) <to upper level>
Sweets (pipipipi) <to upper level>
Daily necessities (pipi) <move to lower level>
Kitchenware (pi)
Laundry goods (pi)
Bath goods (pipipipi) <to upper level>
Clothing (pipi) <move to lower level>
Children's clothing (pi)
Women's clothing (pi)
Men's clothing (pipipipi)
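Seen from the listener's side, the beep counts in Example 2 are enough to recover the depth of every item. The sketch below decodes a sequence of beep counts into depths, assuming (as the listing suggests) that two beeps mean one level down, one beep means the same level, and four beeps mean one level up, and that reading starts at depth 0. The function and table names are hypothetical.

```python
from typing import Dict, List

# Number of beeps heard after an item -> change in depth for the next item.
BEEPS_TO_STEP: Dict[int, int] = {2: +1, 1: 0, 4: -1}


def depths_from_beeps(beep_counts: List[int]) -> List[int]:
    """Return the depth of each item given the beep count heard after it."""
    depth = 0
    depths: List[int] = []
    for count in beep_counts:
        depths.append(depth)
        depth += BEEPS_TO_STEP[count]  # raises KeyError for an unknown count
    return depths


# Beep counts of the catalog in Example 2, in reading order
# (Food, Soft drinks, ..., Men's clothing).
CATALOG_BEEPS = [2, 2, 1, 4, 4, 2, 1, 1, 4, 2, 1, 1, 4]
```

Applied to `CATALOG_BEEPS`, the decoder places Food, Daily necessities, and Clothing at the top level and Carbonated drinks and Fruit juice drinks two levels down.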
[0022]
(Example 3)
Using the catalog example shown in FIG. 3, the following is an example in which the type and pitch of the background music (BGM) indicate movement through the hierarchy.
Food (pipi) (the same music plays in the background throughout the same category)
Soft drinks (background music pitch rises) <to lower level>
Carbonated drinks (pi) <parallel items are listed>
Fruit juice drinks (background music pitch falls) <to upper level>
Sweets (background music pitch returns to its initial value) <to upper level>
Daily necessities (pipi) (background music changes to a different piece)
Kitchenware (background music pitch rises) <move to lower level>
Laundry goods (pi)
Bath goods (background music pitch falls) <to upper level>
Clothing (pipi) (background music changes to a different piece)
[0023]
(Example 4)
FIG. 4 shows another example of a document structure to which the present invention is applied (a magazine article). The following is an example that uses guide voice and sound; ( ) encloses the guide voice or sound.
(The first topic is) Politics
(The first item is) Pledges of each party
Liberal Democratic Party pledges (pipi), Democratic Party pledges (pipi), Communist Party pledges (buu)
(The second item is) Predicted seats won
(Changing the subject)
Sports news
(The first item is) World Cup
(The second item is) Professional baseball
XX losing streak
(The third item is) Sumo
(Changing the subject)
Entertainment news
[0024]
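The guide phrases of Example 4 follow a simple pattern: the first topic is announced as such, later topics are introduced with a change-of-subject phrase, and items within a topic are announced by position. The sketch below generates such a script from a hypothetical list of `(topic, items)` pairs; it leaves out the third-level entries (the party pledges and the losing-streak headline), and the English wording of the phrases is an assumption.

```python
from typing import List, Tuple

ORDINAL_WORDS = ["first", "second", "third", "fourth", "fifth"]


def ordinal_word(position: int) -> str:
    """Return an ordinal word for small positions, falling back to 'number N'."""
    if 1 <= position <= len(ORDINAL_WORDS):
        return ORDINAL_WORDS[position - 1]
    return f"number {position}"


def guided_script(topics: List[Tuple[str, List[str]]]) -> List[str]:
    """Build script lines with guide phrases in parentheses before each title."""
    lines: List[str] = []
    for topic_index, (topic, items) in enumerate(topics):
        guide = "(The first topic is)" if topic_index == 0 else "(Changing the subject)"
        lines.append(f"{guide} {topic}")
        for position, item in enumerate(items, start=1):
            lines.append(f"(The {ordinal_word(position)} item is) {item}")
    return lines


# The magazine outline of Example 4, top two levels only.
MAGAZINE = [
    ("Politics", ["Pledges of each party", "Predicted seats won"]),
    ("Sports news", ["World Cup", "Professional baseball", "Sumo"]),
    ("Entertainment news", []),
]
```

In a device like the one in FIG. 1, the parenthesized parts would be spoken in the guide voice and the titles in the body voice.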
[Effect of the invention]
As is clear from the above description, according to the present invention, when a structured document is read aloud along its document structure, the breaks in the document structure can be made clear by sound or voice, and the relationships between items can be made clear by sound or guide voice.
[Brief description of the drawings]
FIG. 1 is a block diagram of the main parts of an example document reading device to which the present invention is applied.
FIG. 2 is a diagram showing an example of a document structure to which the present invention is applied (a newspaper article).
FIG. 3 is a diagram showing another example of a document structure to which the present invention is applied (a catalog).
FIG. 4 is a diagram showing another example of a document structure to which the present invention is applied (a magazine article).
[Explanation of symbols]
1 ... data input device, 2 ... data structure analysis device, 3 ... data storage device, 4 ... voice output control device, 5 ... guide voice/sound data storage device, 6 ... input device, 7 ... speech synthesis device, 8 ... sound signal generation device, 9 ... output device.

Claims (1)

A document reading device that reads out a structured document by rule-based speech synthesis, comprising a guide voice and/or sound data storage device and a speech synthesis device and/or sound signal generation device, wherein breaks in the document structure of the structured document are expressed by a guide voice.
JP25615498A 1998-09-10 1998-09-10 Document reading device Expired - Fee Related JP3712325B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP25615498A JP3712325B2 (en) 1998-09-10 1998-09-10 Document reading device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP25615498A JP3712325B2 (en) 1998-09-10 1998-09-10 Document reading device

Publications (2)

Publication Number Publication Date
JP2000089777A JP2000089777A (en) 2000-03-31
JP3712325B2 JP3712325B2 (en) 2005-11-02

Family

ID=17288668

Family Applications (1)

Application Number Title Priority Date Filing Date
JP25615498A Expired - Fee Related JP3712325B2 (en) 1998-09-10 1998-09-10 Document reading device

Country Status (1)

Country Link
JP (1) JP3712325B2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4678946B2 (en) * 2000-12-28 2011-04-27 富士通株式会社 Voice interactive information processing apparatus and recording medium
JP4668542B2 (en) * 2004-03-17 2011-04-13 株式会社リコー Status notification device, electronic device, and status notification method
JP2006137033A (en) * 2004-11-10 2006-06-01 Toppan Forms Co Ltd Voice message transmission sheet
JP6441177B2 (en) * 2015-07-29 2018-12-19 日本電信電話株式会社 PAUSE LENGTH DETERMINING DEVICE, PAUSE LENGTH DETERMINING METHOD, AND PROGRAM
EP3441873A4 (en) * 2016-04-05 2019-03-20 Sony Corporation Information processing apparatus, information processing method, and program
CN111415650A (en) * 2020-03-25 2020-07-14 广州酷狗计算机科技有限公司 Text-to-speech method, device, equipment and storage medium

Also Published As

Publication number Publication date
JP2000089777A (en) 2000-03-31

Legal Events

Date Code Title Description
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20050406

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20050607

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20050728

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20050816

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20050816

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080826

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090826

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100826

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110826

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120826

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130826

Year of fee payment: 8

LAPS Cancellation because of no payment of annual fees