TWM575595U - E-book apparatus with audible narration


Info

Publication number: TWM575595U
Application number: TW107210876U
Authority: TW (Taiwan)
Prior art keywords: content, text, dynamic, mark, sentence
Other languages: Chinese (zh)
Inventors: 洪士哲, 吳宗銘, 陳秀華, 雷珵麟, 鄧旭敦, 施詠禎, 蔡忠婷, 廖秀美, 吳淑琴
Original Assignee: 台灣大哥大股份有限公司 (Taiwan Mobile Co., Ltd.)
Application filed by 台灣大哥大股份有限公司
Priority to TW107210876U
Publication of TWM575595U

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to an automatic reading apparatus that receives multimedia content including text content. The reading apparatus includes a display having a display region for the multimedia content; an input interface that receives an input signal associated with a location identification and/or with a change of the multimedia content in the display region; and a reading and highlight unit that generates audio content as well as one or more dynamic highlights associated with the text content, the dynamic highlight skipping from a first portion of the text content to a second portion of the text content in response to the input signal.

Description

電子書語音朗讀裝置 (E-book voice reading device)

The present creation relates to an e-book reading device and method, and more particularly to an e-book reading device capable of reading text content aloud and dynamically marking it.

E-books have been in development for many years, and common e-book formats include PDF, EPUB, mobi, and AZW. With existing technology, the picture content and text content contained in an e-book can be presented visually in full, but the read-aloud function of e-books has developed comparatively slowly, especially automatic reading based on machine learning. The reason is that machine reading is quite difficult: monotonous machine pronunciation must be overcome and the contextual semantics analyzed before text can be read aloud smoothly. For example, the text「3/4開幕典禮」(a "3/4 opening ceremony") and「3/4的影響範圍」(an "impact range of 3/4") both contain "3/4", but in the former it is read as "March 4th" while in the latter it is read as "three-quarters". However, these problems are gradually being overcome as AI technology develops, and the read-aloud function of e-books will become increasingly common.

Existing e-book reading devices can enable a read-aloud function, and some also highlight the text content to guide the reader, so that the reader can settle into reading more easily through the combination of text marking and narration. However, the reading and marking functions of existing e-books proceed only monotonously and one way, in the order of the text content; the target to be read and marked cannot be chosen arbitrarily.

Accordingly, it is necessary to develop a reading device or method that allows the reading target to be changed selectively according to the user's operation, with the text marking synchronized accordingly.

An object of the present creation is to provide an automatic reading apparatus configured to receive and display multimedia content that includes at least text content. The reading apparatus comprises: a display having a display area for displaying a portion of the multimedia content; an input interface that receives an input signal related to a location identification in the display area and/or to a change of the displayed portion of the multimedia content in the display area; and a reading and marking unit configured to generate audio content associated with the text content and one or more dynamic marks, the dynamic mark skipping from a first portion of the text content to a second portion of the text content in response to the input signal, wherein the first portion of the text content is related to a first audio content, and the second portion of the text content is related to the input signal and appears in the display area.

In one embodiment, the dynamic mark comprises a sentence mark. The dynamic mark may comprise a word mark. Alternatively, the dynamic mark comprises both a sentence mark and a word mark, the sentence mark and the word mark overlapping in a visually distinguishable manner. The range of the sentence mark is defined by two punctuation marks in the text content.

In one embodiment, the second portion of the text content is related to a second audio content.

The present creation also provides a non-transitory computer-readable medium containing a plurality of instructions executable by a processing unit to: analyze the text content contained in a multimedia content to identify a plurality of sentences and/or words; receive an input signal related to a location identification in a display area and/or to a change of a portion of the multimedia content in the display area; generate one or more dynamic marks associated with the text content in response to the input signal, the dynamic marks being visible in the display area; and cause the dynamic mark to skip from a first portion of the text content to a second portion of the text content, wherein the second portion of the text content is related to the input signal and appears in the display area.

In one embodiment, the instructions further generate corresponding audio content based on an identification of the sentences and/or words of the text content, the output of the audio content being synchronized with the dynamic mark.

In one embodiment, generating one or more dynamic marks associated with the text content includes cancelling an original dynamic mark.

In one embodiment, the change of a portion of the multimedia content in the display area includes a scrolling operation or a page-turning operation on the display area.

An automatic reading method is performed by a processing unit of a computing device. The method comprises: obtaining a multimedia content having text content and displaying a portion of it in a display area of a display; initiating a machine reading means to output audio content based on the text content; generating one or more dynamic marks in the display area, the dynamic mark indicating a sentence and/or a word of the text content, the text indicated by the dynamic mark being synchronized with the text associated with the audio content; and receiving an input signal related to a location identification in the display area and/or to a change of the displayed portion of the multimedia content in the display area, whereupon the display of the dynamic mark and the output of the audio content skip from a first portion of the text content to a second portion in response to the input signal, wherein the first portion of the text content is related to a first audio content, and the second portion of the text content is related to the input signal and to a second audio content and appears in the display area.

In one embodiment, generating one or more dynamic marks in the display area includes simultaneously generating a first dynamic mark indicating a sentence and a second dynamic mark indicating a word, the first dynamic mark and the second dynamic mark overlapping in a visually distinguishable manner.

In one embodiment, the input signal is associated with an operation on a touch interface, an image recognition result, or a speech recognition result.

In one embodiment, the display of the dynamic mark skipping from the first portion of the text content to the second portion in response to the input signal includes the display of the dynamic mark skipping from a first sentence in a first portion of the text content to a second sentence in a second portion of the text content.

100‧‧‧system
102‧‧‧server
1020‧‧‧central processing unit
1022‧‧‧memory
1024‧‧‧network interface
1026‧‧‧digital storage unit
104‧‧‧user terminal device, user device
106‧‧‧network
200‧‧‧user device
202‧‧‧processing unit
210‧‧‧computer-readable medium
220‧‧‧network interface
230‧‧‧memory
231‧‧‧operating system
232‧‧‧content playback module
233‧‧‧content data
240‧‧‧output/input interface
250‧‧‧reading and marking unit
260‧‧‧output unit
270‧‧‧input unit
300‧‧‧reading and marking unit
301‧‧‧text generation engine
302‧‧‧word processing engine
303‧‧‧semantic analysis engine
304‧‧‧audio matching engine
305‧‧‧text marking engine
306‧‧‧synchronization generation engine
400‧‧‧display screen
401‧‧‧display area
402‧‧‧window
403‧‧‧scrolling operation
404‧‧‧page-turning operation
405‧‧‧first selection operation
406‧‧‧second selection operation
407‧‧‧third selection operation
501‧‧‧dynamic mark
502‧‧‧dynamic mark
503‧‧‧paragraph mark
S600-S640‧‧‧steps
S700-S730‧‧‧steps
S800-S830‧‧‧steps

Figure 1 shows a system provided by the present creation.

Figure 2 shows an embodiment of the user device of Figure 1.

Figure 3 shows an embodiment of the reading and marking unit of the present creation.

Figure 4 illustrates a display screen of a display.

Figures 5A through 5D show schematic views of various embodiments of the dynamic mark of the present creation.

Figure 6 shows the interaction flow between a user and the automatic reading apparatus.

Figure 7 shows the flow of steps for dynamically marking text content in the present creation.

Figure 8 shows the flow of steps of the reading method of the present creation.

In the following detailed description of several exemplary embodiments, reference is made to the accompanying drawings, which form a part of the present creation and which show, by way of illustration, specific embodiments in which the described subject matter may be practiced. Sufficient detail is provided to enable those skilled in the art to practice the described embodiments, and it is to be understood that other embodiments may be used and other changes may be made without departing from their spirit or scope. Furthermore, although it may be so, a reference to "one embodiment" need not refer to the same or a single embodiment. Therefore, the following detailed description is not to be taken in a limiting sense, and the scope of the described embodiments is defined only by the appended claims.

Throughout the specification and claims, the following terms have the meanings explicitly associated with them here, unless the context clearly dictates otherwise. As used herein, unless explicitly stated otherwise, the term "or" is an inclusive "or" and is equivalent to "and/or". Unless the context clearly dictates otherwise, the term "based on" is not exclusive and allows for being based on additional factors not described. In addition, throughout the specification, the meanings of "a", "an", and "the" include plural references, and the meaning of "in" includes "in" and "on".

As used herein, the term "network connection" means a collection of links and/or software components that enables a computing device to communicate with another computing device over a network. One such network connection may be a Transmission Control Protocol (TCP) connection. A TCP connection is a virtual connection between two network nodes and is generally established through a TCP handshake protocol.

A brief summary of the innovative subject matter is provided below to give a basic understanding of certain aspects. This brief description is not intended as a complete overview, nor is it intended to identify key or critical elements or to delineate or limit the scope. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description presented later.

Figure 1 shows a system (100) provided by the present creation, which comprises one or more servers (102) and a plurality of user terminal devices (104), also called user devices, connected to them over a network; in particular, the terminal device (104) is suitable for use as an e-book reader. From the perspective of the user device (104), the server (102) is a remote server. The server (102) may be programmed to create a website, or some other form accessible to the browsing software of the user device, so that users can download via the network (106) the data the server allows access to, such as applications, multimedia content data, and software updates. The server (102) may be further configured to perform particular computations and provide the results to the user device (104) over the network connection. In some embodiments, the server (102) may provide an e-book website and a download link for e-book reading software. Users may access e-books through a given server, and access to the server may be restricted, for example by limiting the number of users or the data traffic.

In general, the server (102) contains one or more central processing units (1020) and memory (1022) for storing operating instructions executable by the processor. A network interface (1024) connects to one or more networks (106) and to the user terminal devices (104) to receive data, requests, and instructions from the network and to send various forms of data to the user devices, such as the multimedia content data stored in the digital storage unit (1026), including picture data, text data, and audio data. The processor (1020) may receive information and instructions from other computer systems or services via the network (106). For example, the processor (1020) may use the network interface (1024) to receive or provide various content items for e-book presentation. The processor (1020) may further use the network interface (1024) to receive or transmit synchronization information about the content items. The processor (1020) may communicate with the memory (1022) to access the various operating instructions stored therein. In some embodiments, the role played by the central processing unit (1020) and the operations it performs may be shared or taken over by a processor native to the user device (104).

The memory (1022) contains a plurality of computer-executable instructions that can be executed by the central processing unit (1020) to carry out the various operations disclosed in the present creation. The memory (1022) may comprise any combination of transitory or non-transitory memory, including RAM, ROM, hard disks, solid-state drives, flash memory, and the like. The memory (1022) may store an operating system that provides computer program instructions used by the processor (1020) for general content management and operation. In other embodiments, the digital data storage unit (1026) may be included in the configuration of the memory (1022) and may store various content items to be presented on the user device, such as picture content, text content, and audio content. The digital data storage unit (1026) may contain other information about the content items, such as synchronization mapping information and metadata of the content items. In other embodiments, the server (102) may be connected over the network to another, external digital data storage unit (not shown) to obtain content items for presentation on the user device or for storage in the memory (1022).

The user terminal device or user device (104) may be a personal computer, a tablet computer, a personal digital assistant, a mobile device, or any other suitable form. In this embodiment, the user device (104) includes a display, an input unit (such as a physical keyboard, touch screen, mouse, microphone, or imaging unit), a processing unit, memory, and other components configured to carry out all embodiments of the present creation.

Figure 2 shows an embodiment (200) of the user device (104) of Figure 1, which includes a processing unit (202), a computer-readable medium (210), a network interface (220), a memory (230), an output/input interface (240), and a reading and marking unit (250). Similarly, the processing unit (202) may receive various information, instructions, and multimedia content from the network. The processing unit (202) may use the network interface (220) to receive various content items for presenting an e-book. The processing unit (202) may further use the network interface (220) to transmit or receive other information or instructions used to carry out the various embodiments of the present creation, such as synchronization mapping information for a plurality of content items. The processing unit (202) may access computer-executable programs contained in the memory (230), such as the operating system (231) of the client device, the content playback module (232), and the content data (233), and may output through the input/output interface (240) to an output unit (260), which can include one or more output devices, such as a display and a speaker, for presenting various content items to the user. The input/output interface (240) may receive input from an input unit (270), which may include one or more input devices such as a touch screen (a combination of output device and input device), a mouse, a microphone, and an imaging device. In the case of a touch screen, a touch event generates a corresponding input signal; the processing unit (202) may determine one or more coordinates of the touch event from the input signal and, by further analyzing those coordinates, identify one or more pixels of the display and the corresponding touch behavior. Accordingly, the processing unit (202) may determine the related pixel output based on the coordinates.
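
As a rough illustration of the coordinate handling described above, the sketch below maps a touch coordinate to a character position in the visible text. The fixed line height, character width, and line length are illustrative assumptions, not values from the patent.

```python
from dataclasses import dataclass

@dataclass
class Layout:
    line_height_px: int = 40     # assumed layout metrics for the display area
    char_width_px: int = 20
    chars_per_line: int = 18

def touch_to_char_index(x: int, y: int, layout: Layout) -> int:
    """Map an (x, y) touch coordinate to a character offset in the visible text."""
    row = y // layout.line_height_px
    col = min(x // layout.char_width_px, layout.chars_per_line - 1)
    return row * layout.chars_per_line + col

if __name__ == "__main__":
    print(touch_to_char_index(130, 95, Layout()))   # row 2, column 6 -> offset 42
```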

The memory (230) may include any combination of transitory and non-transitory memory, such as RAM, ROM, hard disks, solid-state drives, and flash memory. The operating system (231) provides computer programming instructions for the general management and operation of the user device; it varies with the type of client device and is well known in the art, so it is not described further here. The content playback module (232) may be configured to perform the presentation of various content items and to provide a user interaction interface for controlling content playback. The content data (233) contains one or more content items, such as text content, picture content, and audio content, which can be played through the content playback module (232). The content data (233) may further contain other information related to each content item, such as synchronization mapping information between two different content items, and metadata of each content item. The content data (233) may be updated based on other content data received from the network interface (220) or the output/input interface (240). The user device (200) may obtain other, external content items and store them in the content data (233) for real-time streaming or later playback. For e-books, the content playback module (232) can handle text content, picture content, and audio content, and may also provide user operating elements such as page-turning buttons or a scrolling component.

The reading and marking unit (250) is configured to perform the audible reading of the corresponding text content and the marking action. In other embodiments, the reading and marking unit (250) may be split into a reading unit and a marking unit that are independent of each other. In some embodiments, the reading and marking unit (250) may be part of, or an extension of, the content playback module (232). Alternatively, part of the work of the reading and marking unit (250) may be performed on the server (102) side. Figure 3 shows an embodiment (300) of the reading and marking unit of the present creation, comprising a text generation engine (301), a word processing engine (302), a semantic analysis engine (303), an audio matching engine (304), a text marking engine (305), and a synchronization generation engine (306).

The text generation engine (301) is configured to identify text content from one or more content items stored on the server (102) or in the content data (233) and to produce the visible text and punctuation to be displayed, and it may decide visual effects such as text layout, typeface, and font based on other information in the content item. The layout may include the visual arrangement of pictures and text. The content items may be defined by various e-book-specific formats, such as PDF, EPUB, and AZW. The visible text may be contained in a single page of the e-book or spread over multiple pages. In other possible embodiments of the present creation, such as an audiobook application, the text generation engine (301) may be configured to use known speech recognition means to generate corresponding text content from received audio content.

The word processing engine (302) is configured to identify one or more sentences in the text content. For example, according to known rules, a sentence may be the text between two adjacent periods, or the text between any two adjacent punctuation marks (such as a comma and a period). One or more words enclosed in parentheses may also be treated as an identified sentence. In other embodiments, sentence identification may be further refined with machine-learning-based techniques, which can resolve failures caused by punctuation errors. In one embodiment, text content identified as a sentence may be given an identifier or label and stored in memory together with the corresponding text content, i.e. each sentence has its own identifier or label. For example, specific identifiers may be given to the sentences so that each sentence can be identified and the relationships among sentences can be clearly defined, such as the order of the sentences and the paragraph or line in which a sentence appears.
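
A minimal sketch of the punctuation-based sentence identification with identifiers described above follows; the regular expression and the Sentence fields are illustrative assumptions rather than the patented logic.

```python
import re
from dataclasses import dataclass

@dataclass
class Sentence:
    sid: int      # identifier later used to link highlights and audio
    text: str
    start: int    # character offsets of the sentence in the full text
    end: int

def segment_sentences(text: str) -> list[Sentence]:
    """Split text into sentences at common CJK/Latin terminators."""
    sentences, sid = [], 0
    for match in re.finditer(r"[^。！？!?.]+[。！？!?.]?", text):
        if match.group().strip():
            sentences.append(Sentence(sid, match.group(), match.start(), match.end()))
            sid += 1
    return sentences

if __name__ == "__main__":
    for s in segment_sentences("電子書發展至今已經多年。常見格式包含PDF、EPUB。"):
        print(s.sid, s.text)
```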

The semantic analysis engine (303) is configured to determine, from the identified sentences, semantic features associated with those sentences, which may be stored in memory together with the text content and the identifier or label. The semantic features referred to here are statistics or measures related to the grammar, meaning, and/or word composition of a sentence. In one embodiment, the semantic analysis engine (303) may divide the text of each sentence into several segments and determine corresponding semantic features for each segment. The semantic analysis engine (303) may be implemented by known machine learning means, and its construction may be completed on the remote server and then downloaded and installed on the user terminal device. Alternatively, the semantic analysis engine (303) may not run on the user device at all but instead run on the remote server, with the analysis results stored there. Through continuous training feedback, the semantic analysis engine (303) can keep improving the accuracy of the semantic analysis and can even detect errors in sentences.

The audio matching engine (304) is configured to identify, based on the semantic features associated with a sentence, one or more audio contents corresponding to that sentence, thereby completing the matching of text content and audio content. The audio content may consist of one or more files and may be converted into audio signals for output through a speaker. In one embodiment, the audio matching engine (304) may access an audio sample database (not shown) that stores candidate audio content corresponding to various words. In one embodiment, each audio content item in the sample database that corresponds to a character or word may be associated with one or more semantic features, and the matching is based at least on identifying the semantic features of the characters, words, and/or sentences and the semantic features associated with the audio content. The matching associates the text content (characters, words, sentences) with one or more audio contents. The audio matching engine (304) may be implemented by known means, for example an automatic reading application. In other embodiments, such as audiobook applications, pre-recorded human narration may replace the audio content synthesized by the audio matching engine; that is, the narrated audio content can be processed and associated with the corresponding text content for playback.
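
A hedged sketch of the matching idea follows, reusing the "3/4" example from the background: a sentence is reduced to a semantic key, which selects a candidate clip from a sample database. The feature extraction, database contents, and file names are placeholders, not the actual engine.

```python
from dataclasses import dataclass

@dataclass
class AudioClip:
    clip_id: str
    duration_s: float

# Hypothetical sample database: semantic-feature key -> candidate clip.
SAMPLE_DB: dict[str, AudioClip] = {
    "date:3/4": AudioClip("march_fourth.wav", 1.2),
    "fraction:3/4": AudioClip("three_quarters.wav", 1.0),
}

def semantic_key(sentence: str) -> str:
    """Toy stand-in for the semantic analysis engine, disambiguating '3/4' by context."""
    if "3/4" in sentence and "典禮" in sentence:   # 'ceremony' context -> read as a date
        return "date:3/4"
    return "fraction:3/4"

def match_audio(sentence: str) -> AudioClip | None:
    return SAMPLE_DB.get(semantic_key(sentence))

if __name__ == "__main__":
    print(match_audio("3/4開幕典禮").clip_id)      # march_fourth.wav
    print(match_audio("3/4的影響範圍").clip_id)    # three_quarters.wav
```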

The text marking engine (305) is configured to generate, in response to a start signal or an input signal, one or more dynamic marks on the sentences and/or words associated with that start signal or input signal. The dynamic mark may be presented visually to the user through the display and may take any possible form, such as a fluorescent highlight over the text, an underline beneath the text, or a change of color, typeface, or font. "Dynamic" here means that during automatic reading the mark appears on and skips between sentences and/or words as the reading target advances; when automatic reading stops, the mark rests on the text or disappears. The start signal indicates the start of an automatic reading action, and the text marking engine (305) marks the first sentence or first word identified in the text content in response to it. Alternatively, the start signal may indicate the resumption of automatic reading after a pause. The frequency at which the mark skips depends on the length of the sentences or words and on the speed of the automatic reading. The input signal is generated by the input unit (270); here the input signal indicates position information on a display area of the display. In one embodiment, the input signal is generated from the identification of a coordinate or set of coordinates in a display area of the display (e.g. the user taps the touch screen), where the coordinates are associated with one or more pixel positions. In another embodiment, the input signal is generated from the selection of a part of the multimedia content (e.g. the user taps a part of the displayed content in the display area). Alternatively, the input signal may indicate position information of multimedia content that is not currently displayed (e.g. the user taps Chapter 3 in a displayed table of contents). The text marking engine (305) may associate a mark with text content that is not displayed, in response to the input signal. Notably, even though that multimedia content is not displayed, the position of each content item within the multimedia content can be determined from the layout rules already applied (e.g. the seventh sentence of the third paragraph of the text content lies in lines one to five of page nine). The aforementioned content playback module (232) may provide an input field that accepts navigation information for the e-book, such as chapter, page number, and line number.

Taking an e-book as an example, Figure 4 illustrates a display screen (400) of a display, in which a display area (401) shows part of the text content while other parts are not displayed or are covered by a window (402). Optionally, content that is not displayed can be brought into view through a scrolling operation or a page-turning operation. For example, the input signal corresponding to the scrolling operation (403) indicates that a set of coordinates on the display area (401) changes along a vertical direction, whereby undisplayed text content is loaded into the screen from above or below the display area (401). The input signal corresponding to the page-turning operation (404) indicates that a set of coordinates on the display area (401) changes along a horizontal direction, whereby undisplayed content is loaded into the screen from the left or right side of the display area (401). The dynamic mark may ignore the loading of previously undisplayed content, or it may remain in the display area (401) in response to that loading. A selection operation may be associated with the text content in the display area (401). As illustrated, the input signal corresponding to a first selection operation (405) indicates a coordinate or set of coordinates corresponding to the text「我知道」("I know"), from which the text marking engine can mark the sentence or word corresponding to that text. The input signal corresponding to a second selection operation (406) indicates a coordinate or set of coordinates corresponding to a page margin, from which the text marking engine can mark the sentence or word of the text content closest to that margin. The input signal corresponding to a third selection operation (407) indicates a coordinate or set of coordinates corresponding to the junction of two sentences, from which the text marking engine can choose to mark one of the two sentences.
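
The sketch below illustrates one way to resolve a selected character position to the most relevant sentence, covering the three selection cases above (direct hit, margin tap, boundary between two sentences). The nearest-span heuristic is an assumption made for illustration.

```python
def select_sentence(char_index: int, spans: list[tuple[int, int]]) -> int:
    """Return the index of the sentence span that best matches the selection."""
    for i, (start, end) in enumerate(spans):
        if start <= char_index < end:
            return i                       # first selection case: direct hit
    # second/third cases: margin or boundary tap -> nearest sentence wins
    return min(range(len(spans)),
               key=lambda i: min(abs(char_index - spans[i][0]),
                                 abs(char_index - spans[i][1])))

if __name__ == "__main__":
    spans = [(0, 12), (12, 30), (30, 55)]
    print(select_sentence(20, spans))   # 1: tap lands inside the second sentence
    print(select_sentence(70, spans))   # 2: tap in the margin past the last sentence
```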

Various forms of input unit can carry out the aforementioned operations and generate the corresponding input signals. The touch screen is the most common input unit and also offers intuitive operation. Alternatively, the input unit may be an imaging device used to capture images of the reader's eyes or gestures, which, combined with image recognition, produces the input signal corresponding to an operation. Known image recognition techniques can determine the position on which the reader's eyes are focused in the display area (401), or their scanning movement, in order to recognize the aforementioned operations. For example, when the eyes stare at a position in the display area (401) for a period of time, or this is combined with a blink, a selection operation can be recognized. When the reader is away from the display and makes a swiping or pointing gesture, a page-turning or selection operation can be recognized. Alternatively, the input unit may be a microphone for capturing the human voice. Known speech recognition techniques can determine a keyword given by the reader and generate the input signal for the associated selection operation. Further, combined with known search techniques, the text marking engine can mark all occurrences of the selected keyword in the text content. The flexibility of these input units is friendly to people with disabilities and also helps applications in education, rather than being limited to the known use of e-books.

Optionally, one or more dynamic marks may be shown in the display area. Figures 5A through 5D show various embodiments of the dynamic mark of the present creation. Figure 5A shows a dynamic mark (501) for a single sentence. In a scrolling-screen embodiment, the dynamic mark (501) may be kept at a certain height or within a certain range of the display area during automatic reading, so the screen is scrolled automatically as the reading proceeds. In a page-turning embodiment, when the dynamic mark reaches the content at the bottom of the current page, the loading of the next screen places the dynamic mark on the content at the top of the screen. Figure 5B shows a dynamic mark (502) for a single word. However, when a sentence is lengthy or the reading speed is fast, using only the sentence mark (501) or only the word mark (502) has drawbacks, so combining the two compensates for the weaknesses of each. Figure 5C shows both marks at the same time, with the word mark (502) contained within the sentence mark (501); the two can be given an appropriate visual distinction, for example through color or transparency. Figure 5D further shows a paragraph mark (503), which is suitable for zoomed-out content organized by paragraph.
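
One possible representation of the overlapping sentence and word marks of Figure 5C is sketched below, so that a renderer can draw them with distinct colors or opacity; the field names and style values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Highlight:
    start: int    # character offsets within the visible text
    end: int
    kind: str     # "sentence", "word", or "paragraph"
    style: dict   # renderer hints; the values below are illustrative only

def build_highlights(sentence_span: tuple[int, int],
                     word_span: tuple[int, int]) -> list[Highlight]:
    """Overlapping sentence and word marks, distinguished by color/opacity."""
    return [
        Highlight(*sentence_span, kind="sentence",
                  style={"background": "#fff3a0", "opacity": 0.5}),
        Highlight(*word_span, kind="word",
                  style={"background": "#ffd24d", "opacity": 0.9}),
    ]
```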

Referring back to Figure 3, the synchronization generation engine (306) is configured to synchronize the identified audio content with the corresponding one or more dynamic marks. The synchronization generation engine (306) may, according to stored synchronization information or identification information (i.e. identifiers or labels), output the audio content for a sentence or word and the dynamic mark synchronously to the output unit (260), such as the speaker and the display. In one embodiment, the synchronization generation engine (306) may use known identifiers or labels to associate a part of the text content, and its corresponding dynamic mark, with a part of the audio content, for example through a known linking means, where the identifier or label is used to identify a sentence or word in the text content. In some embodiments, the synchronization generation engine (306) may keep performing the linking until all text content and audio content have been synchronized. The execution of the synchronization generation engine (306) may be completed on the remote server, with the synchronization result stored in the cloud and downloaded to the user device together with the multimedia content. Further, the synchronization generation engine (306) may determine the display time of the synchronized dynamic mark based on the playback time of the audio content item and record it in the synchronization result.
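
A minimal sketch of such a synchronization result follows: each sentence identifier is linked to an audio clip and to the time window during which its highlight should be shown. The entry fields are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class SyncEntry:
    sentence_id: int
    clip_id: str
    start_time_s: float   # when the clip (and its highlight) begins on the timeline
    duration_s: float     # the highlight stays visible for the clip's playback time

def build_sync_map(clips: list[tuple[int, str, float]]) -> list[SyncEntry]:
    """clips: (sentence_id, clip_id, duration_s) in reading order."""
    timeline, t = [], 0.0
    for sid, cid, dur in clips:
        timeline.append(SyncEntry(sid, cid, t, dur))
        t += dur
    return timeline

if __name__ == "__main__":
    for entry in build_sync_map([(0, "s0.wav", 2.4), (1, "s1.wav", 3.1)]):
        print(entry)
```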

Figure 6 shows a flowchart of the interaction between a user and the automatic reading apparatus (such as the user device 104 of Figure 1), comprising steps S600 to S640. In step S600, the user opens the e-book reader installed on the user device and opens an e-book file with that reader. The e-book reader may be downloaded from a website or link provided by the remote server, or may be built into the user device. The reader may include a control interface that allows the user to selectively navigate the article content, and may offer other additional functions, such as automatic reading and text marking assistance. The reader may provide a reading window shown in a display area of the display, which presents part of the e-book content, including text content, picture content, and even clickable links. The reader's navigation mode may be a scrolling navigation mode or a page-turning navigation mode, depending on the user settings or the type of e-book file loaded. After the reader is opened, part of the article content appears in the display area, and step S600 ends.

In step S610, the user activates the automatic reading and marking functions through the reader's control interface (such as virtual or physical buttons). In one embodiment, the user may tap a virtual button, such as one in the window (402) of Figure 4, to call up a selection interface. Alternatively, the user may pull out, with a sliding gesture from an edge of the display area, a selection interface listing multiple selectable functions. The reading and marking functions may be independent of each other; preferably, when both are set to the active state, the results of reading and marking are synchronized. When the reading and marking functions are active and the user presses the play button, the narrated audio signal and the dynamic mark are received by the user essentially at the same time. The dynamic mark appears in the display area in units of characters, words, or single sentences, and the text associated with the dynamic mark matches, completely or partially, the text associated with the audio signal. The dynamic mark and the audio signal are produced at an appropriate speed and in the order predetermined by the text content. In the display screen illustrated in Figure 4, the dynamic mark and the narrated audio signal are automatically and synchronously associated, from the first sentence or first word of the first line in the display area down to the last sentence or last word in the display area. In the scrolling navigation mode, when the narration reaches the last sentence or word in the display area, the reader can scroll the screen automatically so that previously undisplayed text content replaces part of the original content, and reading and marking continue from the updated part. In the page-turning navigation mode, when the narration reaches the last sentence or word in the display area, the reader can automatically load undisplayed text content to replace the original text and continue reading and marking from the updated part. Unless the user commands the reader to stop, the narrated audio signal and the dynamic mark keep playing in the order of the article until the article ends, and step S610 ends.
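
A hedged sketch of this playback behavior is shown below: highlights advance with the narration in the predetermined order until the article ends or the user stops. The callback names are hypothetical placeholders, not part of the described apparatus.

```python
from typing import Callable, Iterable

def play_with_highlight(sync_map: Iterable,
                        show_highlight: Callable[[int], None],
                        play_clip: Callable[[str], None],
                        stop_requested: Callable[[], bool]) -> None:
    """Advance the dynamic mark and the narration together in reading order."""
    for entry in sync_map:                 # entries as produced by build_sync_map above
        if stop_requested():               # user commanded the reader to stop
            break
        show_highlight(entry.sentence_id)  # dynamic mark follows the narration
        play_clip(entry.clip_id)           # assumed to block for the clip's duration
```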

In step S620, the reader determines whether the user has indicated a jump over, or skipping of, part of the article content. The jump or skip means that the user designates a new marking and narration target in the article that is not included in the content currently being marked and/or narrated. The jump or skip action can represent the user's wish to switch from the current narration, marking, and/or display target to another target that is not being narrated, marked, and/or displayed, i.e. the user wishes to change the text currently being narrated and/or marked. For example, the user may, through the reader, decide on and change the sentence being narrated and dynamically marked. The reader continuously detects whether there is any input signal indicating that the user wants to jump within the article, and step S620 ends.

If the reader receives no such indication, it continues to narrate and mark the next sentence or word in the predetermined order of the article until the article ends (step S630).

In step S640, the user instructs the reader to switch the target content of automatic reading and dynamic marking. For example, the user may navigate to the target content and select within it a starting position for automatic reading and dynamic marking. In one possible embodiment, during navigation (whether scrolling or page turning), the automatic reading and dynamic marking actions may automatically identify a starting position as the text content in the display area changes. For example, when scrolling causes the text currently being narrated and marked to disappear from the display area, the reader may be configured to automatically identify a position in the current display area from which new narration and marking continue. In other embodiments, the narration and marking actions do not change as the display area changes; that is, even if scrolling causes the text currently being narrated and marked to disappear from the display area, the position and order of the narration remain unchanged. This applies when the user only navigates and does not intend to change the current narration and marking target.

Figure 7 shows the flow of steps for dynamically marking text content in the present creation, comprising steps S700 to S730. These steps may be implemented by a plurality of executable instructions stored in one or more memories (for example, in the server or the user device of Figure 1). The steps may be executed entirely on the user terminal device, or part of them may be executed on the remote server, or part of them may be executed jointly by the terminal device and the remote server.

In step S700, an instruction may be configured to analyze a multimedia content (such as an e-book file) to identify a plurality of sentences and/or words of the text content it contains. The identification may be based on known semantic recognition and machine learning. In one embodiment, the identification of sentences is based on the relationship between punctuation marks. In other embodiments, the identification may involve attempting to cut a continuous run of text into different segments for analysis. In some embodiments, the identification may involve analyzing the semantics of a run of text to determine the extent of a sentence. The identified sentences, words, or paragraphs may be given corresponding identification information, such as identifiers or labels, which may specifically indicate the sentence's context or position within an article. This identification information may be stored and transmitted together with the multimedia content, and step S700 ends.

In step S705, an instruction is configured to generate corresponding audio content based on the identification of the sentences and/or words of the aforementioned text content, which may be achieved by known machine reading means. Known means can output or synthesize corresponding audio content for each character, each word, and each single sentence of the text content. This audio content may be generated before narration and stored in appropriate memory, or it may be generated during narration and output in real time.

In step S710, an instruction is configured to cause a processing unit to process an input signal, in particular an input signal related to a location identification in a display area of a display and/or to a change of a portion of the multimedia content in the display area. The input signal is received by the output/input interface (240) of the client device of Figure 2. Different input units (270) may be communicatively connected to the output/input interface (240), such as a touch panel, an optical lens, or a microphone; the input signals generated by these input units are as described above. As shown in Figure 4, the user may, through these input units, perform navigation operations and selection operations with respect to the display area, where a navigation operation (403, 404) causes a change of the displayed portion of the multimedia content in the display area (such as scrolling, page turning, or switching), while a selection operation (405, 406, 407) causes a position to be selected in the display area. In one operation, the user first performs a navigation operation to find the content they wish to read, then performs a selection operation to decide on a reading item, such as a sentence or word. From this, the processing unit obtains at least one or more pieces of position information in the display area, and step S710 ends.

In step S720, an instruction is configured to cause the processing unit (or the marking unit) to generate one or more dynamic marks associated with the text content, the dynamic marks being visible in the display area. In one embodiment, the processing unit automatically generates the dynamic mark on text in the display area. In another embodiment, based on the position identification information obtained for the display area as described above, the processing unit finds the sentence or word corresponding to that position identification and generates the dynamic mark over the corresponding range in the display area. As shown in Figure 4, regardless of whether the position associated with the user's selection operation (405, 406, 407) points directly to a sentence or word of the article, the sentence most relevant to that position should be identified first and given a mark. In one embodiment, the processing unit may also, based on a navigation operation, generate the dynamic mark on the text content at which the display area finally comes to rest. For example, following a page-turning operation, a new dynamic mark may be generated on the first sentence of the new page. One or more visible dynamic marks may appear in the display area; Figures 5A through 5D illustrate a single sentence mark, a single word mark, and combinations thereof.

In step S730, the processing unit causes the dynamic mark to skip from a first portion of the text content to a second portion of the text content in response to the input signal, where the second portion of the text content appears in the display area. In one possible case, a first sentence in the current display area already carries a dynamic mark, and after the user selects a second sentence in the current display area (i.e. the input signal is generated), the dynamic mark on the original first sentence skips to the second sentence selected by the user. The skipping described here does not refer to a specific jumping motion; it should be understood as a visual effect resembling a jump or a switch. From the visual effect of the skip, one can see that the dynamic mark bypasses the sentences lying between the first sentence and the second sentence. In another possible case, a first sentence carrying a dynamic mark disappears from the current display area because of a navigation operation, and when the user then selects a second sentence in the current display area (i.e. the input signal is generated), the dynamic mark returns to the display area and marks the second sentence. Although the dynamic mark may disappear from the display area because of the operation, it can be regarded as skipping from the undisplayed portion to the displayed content. Step S730 may be performed simultaneously with step S720 or as part of step S720.
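
A sketch of the skip described in step S730 follows: the current mark is cancelled, and both the mark and the narration resume at the selected sentence. The callback names and the reuse of the sync-map entries from the earlier sketch are assumptions for illustration.

```python
def jump_to(selected_sentence_id: int, sync_map,
            clear_highlight, show_highlight, seek_audio):
    """Cancel the current mark and resume marking/narration at the selected sentence."""
    clear_highlight()                              # cancel the original dynamic mark
    for entry in sync_map:                         # entries as in the sync-map sketch
        if entry.sentence_id == selected_sentence_id:
            show_highlight(entry.sentence_id)      # the mark skips to the chosen sentence
            seek_audio(entry.start_time_s)         # narration continues from there
            return entry
    return None
```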

The eighth figure shows the step flow of the reading method of the present invention, comprising steps S800 to S830. These steps may be performed individually or jointly by one or more computing devices (such as the server 102 and the user device 104 of the first figure). For example, when the user device has a communication connection with the server, portions of these steps may be performed jointly by both. Alternatively, in an offline state, the user device may perform these steps on its own.

At step S800, a multimedia content is obtained via a remote server or a user device, and a portion of the multimedia content is displayed in a display region of a display. The multimedia content may be a combination of various content items, such as text content, picture content, audio content, and video content, and it may also be integrated into streaming content transmitted from the remote server to the user device. The display may be included in the user device, or it may be an external device that is independent of, and communicatively connected to, the user device. The display region (e.g., 401 in the fourth figure) shows a portion of the multimedia content. In the case of an e-book, the display region mainly contains text content and may intersperse pictures or advertising banners between passages of text. Step S800 then ends.
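As a rough illustration of step S800, the sketch below assumes a JSON book endpoint and a simple character-based pagination; the URL, the MultimediaContent shape, and the paging rule are all hypothetical.

```typescript
// Minimal sketch, assuming a JSON book format and a fetch endpoint; both the
// URL and the MultimediaContent shape are illustrative assumptions.
interface MultimediaContent {
  text: string;
  images: { position: number; url: string }[];
}

async function loadContent(bookId: string): Promise<MultimediaContent> {
  const resp = await fetch(`https://example.com/books/${bookId}`);
  if (!resp.ok) throw new Error(`failed to load book ${bookId}`);
  return (await resp.json()) as MultimediaContent;
}

// Only the slice that fits the display region is shown at a time.
function pageSlice(content: MultimediaContent, offset: number, pageChars: number): string {
  return content.text.slice(offset, offset + pageChars);
}
```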

At step S810, a machine reading means is started via the user device to output an audio signal based on the text content. The user device may be configured with the ability to read text aloud. For example, the user device may include an e-book reader, a sound database, a semantic recognition engine or module, and a speaker. The sound database stores sound data corresponding to each word, phrase, or sentence; these data can be matched against the results of semantic recognition to output the corresponding audio signal. The audio signal described herein may be in digital or analog form and is not limited to the audible signal at the circuit-transmission stage or at the final output. If the user does not specify otherwise, machine reading may start from any point in the text content, such as the first word of the text content, the first word of the text content currently in the display region, or the position where the previous reading ended. The reading continues until the article ends or the user actively stops it, ending step S810.
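One plausible way to realize such a machine reading means in a browser-based reader is the Web Speech API; the start-position rules below mirror the alternatives named in this step, while the function names themselves are assumptions.

```typescript
// Illustrative sketch of step S810 using the browser's Web Speech API as the
// "machine reading means". The start-position rules and names are assumptions.
function chooseStartIndex(
  userSelected: number | null,
  visibleStart: number,
  lastStopped: number | null
): number {
  if (userSelected !== null) return userSelected;   // explicit selection wins
  if (lastStopped !== null) return lastStopped;     // resume previous reading
  return visibleStart;                              // else: first visible word
}

function startReading(fullText: string, startIndex: number): void {
  const utterance = new SpeechSynthesisUtterance(fullText.slice(startIndex));
  utterance.rate = 1.0;
  window.speechSynthesis.speak(utterance);          // continues until stopped
}
```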

At step S820, one or more dynamic marks are generated in the display region via the user device, the dynamic marks pointing out a sentence and/or a word that the reader is expected to look at. The text content indicated by the dynamic mark is synchronized with the text content associated with the audio signal. Synchronization as described herein means that the range covered by the dynamic mark is the same as the word currently being read aloud, or the same as the sentence to which the currently read word belongs; it is not limited to the related signals occurring at exactly the same time. In one embodiment, the dynamic mark is a sentence mark, which appears at the position where a sentence is displayed so that the sentence is visually distinguishable from the other text (as in figure 5A). In another embodiment, the dynamic mark is a word mark, which appears at the position where a word is displayed so that the word is visually distinguishable from the other text (as in figure 5B). The word mark can in effect keep jumping to the next word at the pace of the reading. In other embodiments, the dynamic mark is a combination of a sentence mark and a word mark, appearing simultaneously at the position of a sentence and at the position of a word within that sentence, with the two visually distinguishable from each other (as in figure 5C). For example, the sentence mark and the word mark may have different colors, or one of them may be an underline. The dynamic mark keeps jumping toward the end of the article until the reading stops, ending step S820. Preferably, step S810 is performed together with step S820.
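A hedged sketch of the combined sentence mark and word mark follows, driven by the Web Speech API's word-boundary events; the CSS class names and the span-lookup callbacks are assumptions, and the actual device may render the marks differently.

```typescript
// Sketch of the combined sentence + word mark in step S820, driven by the
// Web Speech API's word-boundary events. Class names and lookups are assumed.
function highlightCombined(
  utterance: SpeechSynthesisUtterance,
  wordSpanAt: (charIndex: number) => HTMLElement | null,
  sentenceSpanAt: (charIndex: number) => HTMLElement | null
): void {
  let activeWord: HTMLElement | null = null;
  let activeSentence: HTMLElement | null = null;

  utterance.onboundary = (event: SpeechSynthesisEvent) => {
    if (event.name !== "word") return;
    const word = wordSpanAt(event.charIndex);
    const sentence = sentenceSpanAt(event.charIndex);

    // Word mark follows the narration word by word (e.g. a darker color).
    activeWord?.classList.remove("word-mark");
    word?.classList.add("word-mark");
    activeWord = word;

    // Sentence mark only changes when the narration enters a new sentence.
    if (sentence !== activeSentence) {
      activeSentence?.classList.remove("sentence-mark");
      sentence?.classList.add("sentence-mark");
      activeSentence = sentence;
    }
  };
}
```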

At step S830, before the reading and dynamic marking have stopped, or after they have stopped, an input signal is received via the user device, the input signal being related to a position identification in the display region and/or to a change of said portion of the multimedia content within the display region. The user device may include, or be communicatively connected to, an input unit, such as a touch panel, an optical lens, or a microphone, which allows the user to indicate position information on the display region and to navigate all of the multimedia content within the display region. The position information includes a coordinate or a set of coordinates on the display region, which may indicate one or more user operations, such as the aforementioned selection operation or navigation operation. The display of the dynamic mark and the output of the audio signal jump from a first portion of the text content to a second portion in response to the input signal, wherein the second portion of the text content appears in the display region. When a selection operation is recognized by the user device, a sentence or a word (the one most relevant to the position information) is then identified in response to the selection operation. The sentence or word identified on the basis of the input signal becomes the new target of reading and marking, and is immediately read aloud and marked. The target of the dynamic mark jumps from the original sentence (the first sentence) to the identified sentence (the second sentence); the first and second sentences are different sentences, and it is not required that the first sentence precede the second. If an input signal indicating a navigation operation causes the content of the display screen to change, the dynamic mark changes the position at which it appears on the display screen, or disappears, along with that change. In one embodiment, when the dynamic mark disappears in this way, the user device may generate a new dynamic mark in the changed display region to mark the current content, while the reading target is likewise synchronized to the new target content. In some embodiments, such as an e-book table of contents containing chapter links or a shortcut key for returning to the first page, an input signal indicating a selection operation or a navigation operation may cause the user device to direct the reader to the page associated with the selected chapter link, while the reading and marking targets are likewise synchronized to that page. Based on the input signal, the target of the dynamic mark and/or the reading target jumps to the new target content, and step S830 ends.
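The selection-driven jump of step S830 might look like the following sketch, in which a tap resolves to a sentence, the mark moves there, and the narration restarts from the same place; the Sentence model and the lookup callbacks are illustrative assumptions.

```typescript
// Hedged sketch of step S830: a tap selects a new sentence, and both the
// narration and the dynamic mark jump to it.
interface Sentence {
  index: number;
  text: string;
  startChar: number;     // offset of the sentence in the full text
}

function onReaderTap(
  tapX: number,
  tapY: number,
  fullText: string,
  findSentenceAt: (x: number, y: number) => Sentence | null,
  markSentence: (s: Sentence) => void
): void {
  const target = findSentenceAt(tapX, tapY);
  if (!target) return;                      // tap did not resolve to any text

  // The mark jumps directly to the selected sentence...
  markSentence(target);

  // ...and the narration restarts from that sentence, abandoning the old one.
  window.speechSynthesis.cancel();
  const utterance = new SpeechSynthesisUtterance(fullText.slice(target.startChar));
  window.speechSynthesis.speak(utterance);
}
```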

It is to be understood that the step illustrations in such flowchart depictions, and combinations thereof, can be implemented as computer program instructions. These program instructions may be provided to a processor to produce a machine, so that when the instructions are executed on the processor they create means for implementing the actions specified in the flowchart block or blocks. The computer program instructions may be executed by a processor to cause the processor to perform a series of operational steps and thereby form a computer-implemented process, so that the instructions executed on the processor provide steps for implementing the actions specified in the flowchart block or blocks. These program instructions may be stored on a computer-readable medium or a machine-readable medium, such as a computer-readable storage medium.

Accordingly, these descriptions support combinations of means for performing the specified actions, combinations of steps for performing the specified actions, and program instruction means for performing the specified actions. It will also be understood that each block in the flowchart depiction, and combinations of blocks in the flowchart depiction, can be implemented by modules, such as special-purpose hardware-based systems that perform the specified action steps, or combinations of special-purpose hardware and computer instructions.

The foregoing provides a complete description of the manufacture and use of the described combinations of specific embodiments. Since many specific embodiments can be made without departing from the spirit and scope of this description, these specific embodiments reside within the scope of the claims appended below.

Claims (6)

1. An automatic reading device, configured to receive and display a multimedia content, the multimedia content including at least text content, the reading device comprising: a display having a display region to display a portion of the multimedia content; an input interface receiving an input signal, the input signal being related to a position identification in the display region and/or to a change of said portion of the multimedia content in the display region; and a reading and marking unit configured to generate sound content and one or more dynamic marks associated with the text content, the dynamic mark jumping from a first portion of the text content to a second portion of the text content in response to the input signal, wherein the first portion of the text content is related to a first sound content, and the second portion of the text content is related to the input signal and appears in the display region. 2. The automatic reading device of claim 1, wherein the dynamic mark has a sentence mark. 3. The automatic reading device of claim 1, wherein the dynamic mark has a word mark. 4. The automatic reading device of claim 1, wherein the dynamic mark has a sentence mark and a word mark, the sentence mark overlapping the word mark in a visually distinguishable manner. 5. The automatic reading device of claim 2 or 4, wherein the range of the sentence mark is defined by two punctuation marks of the text content. 6. The automatic reading device of claim 1, wherein the second portion of the text content is related to a second sound content.
TW107210876U 2018-08-09 2018-08-09 E-book apparatus with audible narration TWM575595U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW107210876U TWM575595U (en) 2018-08-09 2018-08-09 E-book apparatus with audible narration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107210876U TWM575595U (en) 2018-08-09 2018-08-09 E-book apparatus with audible narration

Publications (1)

Publication Number Publication Date
TWM575595U true TWM575595U (en) 2019-03-11

Family

ID=66591922

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107210876U TWM575595U (en) 2018-08-09 2018-08-09 E-book apparatus with audible narration

Country Status (1)

Country Link
TW (1) TWM575595U (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI717627B (en) * 2018-08-09 2021-02-01 台灣大哥大股份有限公司 E-book apparatus with audible narration and method using the same

Similar Documents

Publication Publication Date Title
US11080469B1 (en) Modular systems and methods for selectively enabling cloud-based assistive technologies
US8719029B2 (en) File format, server, viewer device for digital comic, digital comic generation device
US20190196675A1 (en) Platform for educational and interactive ereaders and ebooks
WO2012086357A1 (en) Electronic comic viewer device, electronic comic reading system, viewer program, recording medium having viewer program recorded thereon, and electronic comic display method
JP5634853B2 (en) Electronic comic viewer device, electronic comic browsing system, viewer program, and electronic comic display method
US20220374585A1 (en) User interfaces and tools for facilitating interactions with video content
CN115082602B (en) Method for generating digital person, training method, training device, training equipment and training medium for model
CN111859856A (en) Information display method and device, electronic equipment and storage medium
US9137483B2 (en) Video playback device, video playback method, non-transitory storage medium having stored thereon video playback program, video playback control device, video playback control method and non-transitory storage medium having stored thereon video playback control program
US9472113B1 (en) Synchronizing playback of digital content with physical content
TWM575595U (en) E-book apparatus with audible narration
TWI717627B (en) E-book apparatus with audible narration and method using the same
US9253436B2 (en) Video playback device, video playback method, non-transitory storage medium having stored thereon video playback program, video playback control device, video playback control method and non-transitory storage medium having stored thereon video playback control program
KR101753986B1 (en) Method for providing multi-language lylics service, terminal and server performing the method
CN117724648A (en) Note generation method, device, electronic equipment and readable storage medium
JP2012118639A (en) Wordbook generation device and wordbook generation program
Bellini et al. ECLAP
KR20140087949A (en) Apparatus and method for learning word by using augmented reality