TWI233090B - System and method of language translation for multimedia data

Info

Publication number: TWI233090B
Application number: TW92120258A
Authority: TW
Inventors: Shih-Hsiung Weng; Hooi-Ling Yeo
Original assignee: Inventec Multimedia & Telecom
Priority date: 2003-07-24
Filing date: 2003-07-24
Publication date: 2005-05-21
Also published as: TW200504682A

Abstract

The present invention provides a system and method of language translation for multimedia data. While the multimedia data is playing, its playback time and file name are memorized. In response to a time recording instruction, the playback time memory data is recorded and a playback time range is calculated. The segment of audio data that falls within that playback time range is then retrieved and converted to obtain the text data to be displayed.

Description

[Technical Field]

The present invention relates to a system and method for assisting language learning, and in particular to a system and method for language translation of multimedia data.

[Prior Art]

Current multimedia playback simply plays the video and audio data contained in the multimedia data in full. Text information is sometimes added to the multimedia data to accommodate regional languages or viewers with disabilities, but the service stops there and does not fully meet viewers' need to understand the content. A viewer who wants to know the usage of a word, or the grammar, in a part of the dialogue that was not understood has to consult an external tool.

When films are used as an aid to language learning, several difficulties arise. The viewer has to prepare a dictionary or similar tool in addition to the player, which is an inconvenience in itself. Looking up a word while the film keeps playing may cause the viewer to miss the content that follows, while pausing playback to look it up interrupts the viewing. When a film provides no text information at all, it is even harder to use a dictionary or other learning aid to find out how a word is used.

[Summary of the Invention]

In view of the above problems of the prior art, the present invention provides a system and method for language translation of multimedia data. While the multimedia data is playing, the system memorizes its playback time and file name. In response to a time recording instruction, it records the playback time memory data and the multimedia data file name, calculates a playback time range, extracts the segment audio data that falls within that range, and converts the segment to obtain the text data to be displayed.

The object of the invention is to provide a system and method for language translation of multimedia data whose operation assists language learning.

To achieve this object, the system of the invention mainly comprises a memory module, a timing unit, a capture module, an analysis conversion module, and a database. The system may further include a language interpretation module.

The method of the invention mainly provides three operating modes: Real Time Mode, On Hold Mode, and Retrieve Mode.

Real Time Mode operates as follows: first, a time recording instruction is received and the playback time memory data is recorded; next, the playback time range is calculated from the playback time memory data; then the segment audio data of the multimedia data is extracted according to the playback time range; finally, the segment audio data is converted into text data and displayed.

On Hold Mode and Retrieve Mode share the following steps: first, a time recording instruction is received and the playback time memory data and the multimedia data file name are recorded; next, a query instruction is received and the multimedia data is retrieved according to the multimedia data file name; then the playback time range is calculated from the playback time memory data; the segment audio data of the multimedia data is extracted according to the playback time range; finally, the segment audio data is converted into text data.

With the system and method outlined above, the problems described in the prior art can be solved, and the following effects are expected:

1. Content query and translation functions are provided, so that a user watching a film can resolve dialogue that was not understood.
2. Three ways of helping the user understand the dialogue are provided, to suit the user's different needs while watching a film.
3. To ensure the accuracy of the displayed text data, a tolerance calculation is used to obtain audio data over a certain range for processing, which improves the correctness of the converted dialogue.
4. The user may adjust the range of captured audio data forward or backward, so that the dialogue can be understood correctly.

The features and implementation of the invention are described in detail below with reference to the drawings.

[Embodiments]

The invention is a system and method for language translation of multimedia data. While the multimedia data is playing, the playback time and the multimedia data file name are memorized. In response to a time recording instruction, the playback time memory data and the multimedia data file name are recorded, a playback time range is calculated, the segment audio data within that range is extracted, and the segment is converted to obtain the text data to be displayed. Through the operation of this system and method, language learning is assisted.

The signal flow among the units and modules of the language assisting system is explained first, with reference to Fig. 1-A, the architecture diagram of the system. As shown in Fig. 1-A, the language assisting system of the invention mainly comprises a memory module 110, a timing unit 120, a capture module 130, an analysis conversion module 140, and a database 150, and may further comprise a language interpretation module 160.

The modules have the following roles:

Memory module 110 stores the playback time memory data and the multimedia data.
Timing unit 120 records the time elapsed during playback of the multimedia data and provides the playback time memory data.
Capture module 130 calculates the playback time range from the playback time memory data and extracts the segment audio data of the multimedia data according to that range.
Analysis conversion module 140 analyzes the segment audio data against the audio fuzzy comparison data 151 and converts the segment audio data into text data.
Database 150 stores a plurality of audio fuzzy comparison data 151 and a plurality of text interpretation information.
Language interpretation module 160 provides text interpretation information according to the text data and a text selection instruction.

The capture module 130 further comprises a setting unit 131 and a calculation unit 132, as shown in Fig. 1-B. The setting unit 131 provides the operating interface for the system settings, through which the time capture parameter and the split mode of the display can be changed. The calculation unit 132 performs the calculation based on the time capture parameter to obtain the playback time range, which the capture module 130 then uses to extract the segment audio data.

The analysis conversion module 140 further comprises a sound loudness analysis module 141, a sound frequency analysis module 142, a fuzzy comparison module 143, and a sound-to-text module 144, as shown in Fig. 1-C. The sound loudness analysis module 141 and the sound frequency analysis module 142 jointly analyze the loudness and frequency of the audio data and, by filtering the sound, obtain clean audio. The fuzzy comparison module 143 performs a fuzzy comparison of the clean audio against the audio fuzzy comparison data provided by the database 150 to obtain voice information. Finally, the sound-to-text module 144 converts this voice information into text information for the system to display.

When the user plays a film with the system of the invention, the system first reads the film's multimedia data and stores it in the memory module 110, and during playback the timing unit 120 records the time elapsed. When the system receives a capture instruction, the capture module 130 takes the elapsed playback time at that instruction as the playback time memory data and records it in the memory module 110. The capture module 130 then, according to the translation playback mode the user selected at the start, calculates the playback time range from the playback time memory data and extracts the segment audio data from the multimedia data according to that range, so that the audio data can be analyzed.
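
As a rough illustration of the extraction performed by the capture module 130, the following minimal sketch slices a segment out of decoded audio using a playback time range. It assumes uncompressed mono PCM samples held in a Python list and a known sample rate; the function name `extract_segment` and its parameters are hypothetical, since the patent does not say how the audio is stored or decoded.

```python
def extract_segment(samples, sample_rate, start_s, end_s):
    """Return the samples that fall inside [start_s, end_s) seconds.

    Assumption: `samples` is decoded mono PCM audio and `sample_rate`
    is in samples per second. The range is clamped to the clip.
    """
    if end_s < start_s:
        raise ValueError("end_s must not be earlier than start_s")
    total = len(samples)
    first = max(0, min(total, int(round(start_s * sample_rate))))
    last = max(0, min(total, int(round(end_s * sample_rate))))
    return samples[first:last]


# A 10-second clip at 8 samples per second (tiny numbers for readability).
clip = list(range(80))
segment = extract_segment(clip, sample_rate=8, start_s=2.0, end_s=4.5)
print(len(segment))  # 20 samples = 2.5 s * 8 samples/s
```

Clamping to the clip length is a design choice of this sketch, so that a range reaching past the end of the film still returns the audio that exists.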

After the capture module 130 obtains the segment audio data, it passes the segment to the analysis conversion module 140 for analysis and conversion. The analysis conversion module 140 retrieves the audio fuzzy comparison data 151 from the database 150, uses it to perform fuzzy comparison and analysis of the captured segment audio data, and converts the segment into text data, which is then displayed. Once the text data has been displayed, the user can request interpretation information for it. In response to that request, the language interpretation module 160 retrieves the text interpretation information for the text data from the database 150, so that the user can understand the dialogue further.

Having described how messages pass among the units and modules of the system, the three operating modes of the translation method are explained next with reference to Fig. 2, the main flowchart of real-time translation of audio data, and Fig. 3, the main flowchart of translating audio data captured according to a capture instruction.

When the user chooses Real Time Mode to watch a film, as shown in Fig. 2, the system stores the film's multimedia data and records the elapsed time during playback. While the film is playing, the user issues a time recording instruction; the system continues to play the film and records the playback time memory data according to the instruction (step 210). After obtaining the playback time memory data, the capture module 130 calculates the playback time range from it (step 220) and then extracts the segment audio data of the multimedia data according to that range (step 230). Finally, the analysis conversion module 140 analyzes and converts the segment audio data into text data for display.

When the user chooses On Hold Mode or Retrieve Mode to watch a film, as shown in Fig. 3, the system likewise stores the film's multimedia data and records the time elapsed during playback. When the user finds part of the dialogue unclear, the user can issue a time recording instruction to the system for the audio at that point in time, and the system records the playback time memory data and the multimedia data file name (step 310). In Retrieve Mode, the system continues playing the multimedia data after recording them.

After obtaining the playback time memory data and the multimedia data file name, the system retrieves the multimedia data according to the file name (step 320) and calculates the playback time range from the playback time memory data (step 330). It then uses this range to extract the segment audio data of the multimedia data (step 340). Finally, it analyzes the segment audio data, converts it into text data (step 350), and displays the result.
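
Steps 310 and 320 can be pictured as storing a small record and later using it to find the media again. The sketch below is only illustrative: `TimeRecord`, `RecordStore`, `fetch_media`, and the dictionary standing in for a media library are assumptions made for this example, not structures named in the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TimeRecord:
    """What a time recording instruction stores (step 310)."""
    file_name: str      # multimedia data file name
    position_s: float   # playback time memory data, in seconds


@dataclass
class RecordStore:
    """Hypothetical stand-in for the memory module 110."""
    records: list = field(default_factory=list)

    def record(self, file_name, position_s):
        rec = TimeRecord(file_name, position_s)
        self.records.append(rec)
        return rec

    def for_file(self, file_name):
        return [r for r in self.records if r.file_name == file_name]


def fetch_media(library, record):
    """Look the media up by the recorded file name (step 320)."""
    try:
        return library[record.file_name]
    except KeyError:
        raise FileNotFoundError(record.file_name) from None


store = RecordStore()
store.record("lesson1.mpg", 125.0)
store.record("lesson2.mpg", 40.5)
store.record("lesson1.mpg", 300.0)

# Placeholder bytes stand in for real media content.
library = {"lesson1.mpg": b"media-1", "lesson2.mpg": b"media-2"}
first = store.for_file("lesson1.mpg")[0]
media = fetch_media(library, first)
print(first.position_s, media)  # 125.0 b'media-1'
```

Keeping the file name next to the position is what lets Retrieve Mode defer the conversion until after the film has finished.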

In Retrieve Mode, after the whole film has finished playing, the system first displays the playback time memory data and multimedia data file names that the user recorded, and asks the user to select an entry for segment audio extraction and conversion. After the user selects a playback time memory data entry, the system retrieves the multimedia data according to the file name (step 320), calculates the playback time range (step 330), extracts the segment audio data of the multimedia data according to that range (step 340), converts the segment audio data into text data (step 350), and displays the result.

All three operating modes include the step of calculating the playback time range, and this step itself consists of a series of operations. It is explained here with Fig. 4, the flowchart for calculating the playback time range, together with the schematic in Fig. 9.

As Fig. 9 shows, suppose that when the film reaches point B (20:30:35) the user has a question about the dialogue and issues a time recording instruction, but by the time the system receives the instruction, playback may already have advanced to point A (20:31:00). The system takes the playback time memory data at point A according to the time recording instruction (step 410). Assume the time capture parameter is set to 30 seconds. The system subtracts the time capture parameter from the playback time memory data to obtain the playback time start point (step 420), which is the 20:30:30 position in Fig. 9. It then adds the time capture parameter to the playback time memory data to obtain the playback time end point (step 430), which is point C (20:31:30) in Fig. 9. Finally, the playback time range is obtained from the start point and end point (step 440); it is the length marked in Fig. 9. The range obtained through the procedure of Fig. 4 is then used to extract the segment audio data in the subsequent operations.
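
The arithmetic of steps 410 to 440 can be checked with a short sketch using the Fig. 9 figures (instruction received at 20:31:00, time capture parameter of 30 seconds). The function name `playback_time_range` and the optional clamping to the film's duration are assumptions added for illustration; the patent describes only the subtraction and addition.

```python
from datetime import timedelta


def playback_time_range(recorded, capture_param, duration=None):
    """Return (start, end) centred on the recorded playback time.

    Mirrors steps 410 to 440: start = recorded - parameter,
    end = recorded + parameter. Clamping to [0, duration] is an
    added assumption for positions near the start or end of a film.
    """
    start = max(timedelta(0), recorded - capture_param)
    end = recorded + capture_param
    if duration is not None:
        end = min(end, duration)
    return start, end


# Worked example from Fig. 9: instruction received at 20:31:00, parameter 30 s.
recorded = timedelta(hours=20, minutes=31, seconds=0)
start, end = playback_time_range(recorded, timedelta(seconds=30))
print(start, end)  # 20:30:30 20:31:30
```

Because the range extends on both sides of the recorded time, the moment the user actually reacted to (20:30:35) still falls inside it even though the instruction arrived 25 seconds later.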
Once the playback time range has been defined, the system must still extract the segment audio data according to that range and convert it, in order to obtain the text data that helps the user understand the dialogue. The conversion procedure is explained with Fig. 5, the flowchart for converting segment audio data into text data.

After obtaining the segment audio data, the system first analyzes its loudness and frequency (step 510). With the loudness and frequency of the whole segment known, it filters the noise out of the segment audio data to obtain clean audio (step 520). The clean audio is then fuzzy-compared against the audio fuzzy comparison data 151 to obtain voice information (step 530). Finally, the voice information is converted into text data (step 540), which completes the conversion.
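
The shape of steps 510 to 540 can be sketched as a small pipeline. This is a toy under stated assumptions, not the patent's algorithm: root-mean-square level and a zero-crossing count stand in for the loudness and frequency analysis, the threshold and frequency band are arbitrary, and the fuzzy comparison and voice-to-text stages are left as caller-supplied functions because the patent does not specify them. Frames are assumed to be non-empty lists of samples.

```python
import math


def loudness(frame):
    """Root-mean-square level of one frame (a stand-in for step 510)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_frequency(frame, sample_rate):
    """Crude frequency estimate from sign changes (also step 510)."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings * sample_rate / (2 * len(frame))


def filter_noise(frames, sample_rate, min_rms=0.05, band=(80.0, 1000.0)):
    """Keep frames that are loud enough and inside the band (step 520).

    The threshold and band are illustrative assumptions only.
    """
    low, high = band
    return [
        f for f in frames
        if loudness(f) >= min_rms
        and low <= zero_crossing_frequency(f, sample_rate) <= high
    ]


def convert_segment(frames, sample_rate, match_voice, voice_to_text):
    """Steps 510 to 540, with steps 530 and 540 supplied by the caller."""
    clean = filter_noise(frames, sample_rate)
    voice = match_voice(clean)     # step 530: fuzzy comparison (hypothetical hook)
    return voice_to_text(voice)    # step 540: voice information to text


RATE = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / RATE) for n in range(800)]
hiss = [0.01 * (-1) ** n for n in range(800)]

kept = filter_noise([tone, hiss], RATE)
print(len(kept))  # 1: only the tone frame survives

# Stub stages that merely count frames, to show how the hooks plug in.
text = convert_segment([tone, hiss], RATE, len, lambda n: f"{n} voiced frame(s)")
print(text)  # 1 voiced frame(s)
```

Passing the last two stages in as functions keeps the sketch honest about what is unknown: a real system would plug a matcher over the stored comparison data and a speech recognizer into those hooks.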
After any of the three browsing modes has run and the user has viewed the captured video and the converted text, a user who doubts the correctness or accuracy of the displayed text can use the Forward or Rewind capture instruction provided by the system to reset the playback time range used to capture the segment audio data. This yields a new segment of video and text data with which the user can confirm the dialogue.

The time capture parameter is set through the operating interface provided by the system, shown in Fig. 10. The parameter is set in units of minutes or seconds with a numeric value, which controls the size of the playback time range and therefore the amount of segment audio data captured.

The system also provides a word query function. The user can look up words in the text converted by the system to see the various interpretations of an unfamiliar word, or example sentences for reference, which helps the learner understand the word better.

Having described the signal flow among the modules of the system and the basic operation of the method, the preferred embodiments of the invention are explained next with reference to Fig. 6, the detailed flowchart of the Real Time Mode browsing embodiment; Fig. 7, the detailed flowchart of the On Hold Mode browsing embodiment; Fig. 8, the detailed flowchart of the Retrieve Mode browsing embodiment; and the example screens in Fig. 11 to Fig. 13.

Before watching a film with the system of the invention, the user can set the value of the time capture parameter through the operating interface. Once it is set, the system uses this parameter as the basis for adjusting the playback time range.
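
The time capture parameter setting and the Forward and Rewind capture instructions might be modelled as below. This is a sketch under assumptions: the patent states only that the parameter is entered in minutes or seconds and that Forward or Rewind resets the range, so shifting the whole window by a fixed step while keeping its length is one plausible reading, and the names `capture_parameter_seconds` and `shift_range` are hypothetical.

```python
UNIT_SECONDS = {"seconds": 1, "minutes": 60}


def capture_parameter_seconds(value, unit):
    """Convert the interface setting (value plus unit) to seconds."""
    if value <= 0:
        raise ValueError("value must be positive")
    return value * UNIT_SECONDS[unit]


def shift_range(start_s, end_s, direction, step_s):
    """Move the whole range forward or backward by step_s seconds.

    Assumption: Forward and Rewind shift the window while keeping its
    length; the patent only says the range is reset. The start is
    clamped at zero.
    """
    if direction not in ("forward", "rewind"):
        raise ValueError("direction must be 'forward' or 'rewind'")
    delta = step_s if direction == "forward" else -step_s
    new_start = max(0, start_s + delta)
    return new_start, new_start + (end_s - start_s)


param = capture_parameter_seconds(1, "minutes")   # 60
window = (100, 160)
print(shift_range(*window, "rewind", param))      # (40, 100)
print(shift_range(*window, "forward", 30))        # (130, 190)
```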

When the user selects Real Time Mode to watch a film (see Fig. 6), the system, after receiving the Real Time Mode browsing instruction (step 610), starts playing the film in a single display, as shown in Fig. 11. During playback, when the system receives a time recording instruction (step 620), it splits the display into a video display and a text display (step 630) and continues to play the film, as shown in Fig. 12 and Fig. 13; the user can choose how the split display is arranged. Because the browsing mode is Real Time Mode, the system keeps playing the film after it obtains the playback time memory data according to the time recording instruction, and calculates the playback time range from that data and the previously set time capture parameter (step 640). It then extracts the segment audio data according to the playback time range and converts it into text information (step 650), and finally shows the text information in the text display (step 660). The text shown is the text converted from the segment audio data calculated for the point in time at which the time recording instruction was issued. When the user has finished reading, the user can send a leave instruction through the operating interface to close the split window. On receiving the leave instruction (step 670), the system restores the single display (step 680) and continues to play the film.

When the user selects On Hold Mode to watch a film (see Fig. 7), the system first plays the film in a single display, as shown in Fig. 11. After the film has played for a while, when the user encounters dialogue that is not understood, the user sends a pause browsing instruction. On receiving this instruction (step 710), the system first pauses playback of the multimedia data (step 720) and then splits the display into a video display and a text display (step 725), as shown in Fig. 12 or Fig. 13. It records the playback time memory data and the multimedia data file name at the point in time at which the pause browsing instruction was received. After receiving the playback time memory data (step 730), it calculates the playback time range (step 735). Based on this range and the multimedia data file name, the system extracts the segment audio data and converts it into text information (step 740), and shows the text information in the text display (step 750). Finally, when the user has finished reading and sends a leave instruction through the operating interface to close the dual-window display, the system, on receiving the leave instruction (step 760), restores the single display (step 770) and resumes playing the multimedia data (step 780).

When the user selects Retrieve Mode to watch a film (see Fig. 8), the system first plays the film in a single display, as shown in Fig. 11. After the film has played for a while, when the user does not understand the dialogue, the user sends a capture instruction. On receiving this instruction (step 810), the system records the playback time memory data according to it (step 820), stores that data together with the multimedia data file name in the memory module 110, and continues to play the film. After the film has finished, the user operates the system to display the stored multimedia data file names and playback time memory data (step 830).

The system then accepts the user's selection of a multimedia data file name and playback time memory data (step 840), splits the display into a video display and a text display (step 850), and calculates the playback time range from the selected playback time memory data (step 860). Based on the multimedia data file name and the playback time range, it extracts the segment audio data and converts it into text information (step 870). It plays the multimedia data for that playback time range in the video display and shows the text information in the text display (step 880). After the user has read the converted text information, the system asks whether the user wants to review the recorded data (step 890). If the user selects "Yes", the system displays the stored multimedia data file names and playback time memory data (step 830) so that the user can continue querying. If the user selects "No", the system stops and the assisted language learning session ends.

The above are merely preferred embodiments of the invention and are not intended to limit the scope of its implementation. All equivalent changes and modifications made in accordance with the claims of this application are covered by the scope of the patent.

Brief description of the drawings

Figure 1-A is an architecture diagram of the language assistance system of the present invention;
Figure 1-B is a detailed architecture diagram of the capture module of the language assistance system of the present invention;
Figure 1-C is a detailed architecture diagram of the analysis conversion module of the language assistance system of the present invention;
Figure 2 is the main flowchart of real-time translation of audio data according to the present invention;
Figure 3 is the main flowchart of capturing and translating audio data according to a capture instruction according to the present invention;
Figure 4 is a flowchart of calculating the playback time range according to the present invention;
Figure 5 is a flowchart of converting segment audio data into text data according to the present invention;
Figure 6 is a detailed operation flowchart of the Real Time Mode browsing embodiment of the present invention;
Figure 7 is a detailed flowchart of the On Hold Mode browsing embodiment of the present invention;
Figure 8 is a detailed flowchart of the Retrieve Mode browsing embodiment of the present invention;
Figure 9 is a schematic diagram of how the playback time range is defined according to the present invention;
Figure 10 is an interface diagram for setting the time capture parameter in an embodiment of the present invention;
Figure 11 is a schematic diagram of the single display screen in an embodiment of the present invention;
Figure 12 is a schematic diagram of the display screen split into upper and lower windows in an embodiment of the present invention; and
Figure 13 is a schematic diagram of the display screen split into left and right windows in an embodiment of the present invention.

Description of reference symbols

110 Memory module
120 Timing unit
130 Capture module
131 Setting unit
132 Calculation unit
140 Analysis conversion module
141 Sound loudness analysis module
142 Sound frequency analysis module
143 Fuzzy comparison module
144 Sound-to-text conversion module
150 Database
151 Audio fuzzy comparison data
160 Language interpretation module

Claims

1. A system of language translation for multimedia data, which obtains text information by converting audio data of multimedia data, the system comprising:
a memory module for storing playback time memory data;
a timing unit for recording the elapsed playback time of the multimedia data;
a capture module for calculating a playback time range from the playback time memory data and a time capture parameter, and for extracting a segment of audio data of the multimedia data according to the playback time range;
a database for storing a plurality of audio fuzzy comparison data; and
an analysis conversion module for analyzing the segment audio data according to the audio fuzzy comparison data and converting the segment audio data into text data.

2. The system of language translation for multimedia data as described in claim 1, further comprising a language interpretation module that receives a text selection instruction and provides text interpretation information based on the text data.

3. The system of language translation for multimedia data as described in claim 2, wherein the text interpretation information is provided by the database.

4. The system of language translation for multimedia data as described in claim 1, wherein the capture module further comprises a setting unit for providing the time capture parameter.

5. The system of language translation for multimedia data as described in claim 1, wherein the time capture parameter is used to change the playback time range.

6. The system of language translation for multimedia data as described in claim 1, wherein the capture module further comprises a calculation unit for calculating the playback time range.

7. The system of language translation for multimedia data as described in claim 1, wherein the capture module provides a forward capture instruction and a backward (Rewind) capture instruction to change the playback time range.

8. The system of language translation for multimedia data as described in claim 1, wherein the analysis conversion module further comprises a sound loudness analysis module, a sound frequency analysis module, a fuzzy comparison module, and a sound-to-text conversion module.

9. The system of language translation for multimedia data as described in claim 8, wherein the sound loudness analysis module and the sound frequency analysis module are used to analyze the segment audio data to obtain clean audio.

10. The system of language translation for multimedia data as described in claim 9, wherein the clean audio is provided to the fuzzy comparison module for comparison with the audio fuzzy comparison data to obtain voice information.

11. The system of language translation for multimedia data as described in claim 10, wherein the voice information is provided to the sound-to-text conversion module [...]

12. [...] extracting a segment of audio data of the multimedia data according to the playback time range; and
converting the segment audio data into text data.

13. The method of language translation for multimedia data as described in claim 12, wherein the step of obtaining the playback time range comprises the following steps:
receiving the playback time memory data;
subtracting a time capture parameter from the playback time memory data to obtain a playback start point;
adding a time capture parameter to the playback time memory data to obtain a playback end point; and
obtaining the playback time range from the playback start point and the playback end point.

14. The method of language translation for multimedia data as described in claim 12, wherein the step of converting the segment audio data into the text data further comprises the following steps:
analyzing a loudness and a frequency of the segment audio data;
filtering noise from the segment audio data to obtain clean audio;
performing a fuzzy comparison of the clean audio against audio fuzzy comparison data to obtain voice information; and
converting the voice information into the text data.

15. The method of language translation for multimedia data as described in claim 12, wherein the method is Real Time Mode browsing, in which the multimedia data continues to play while the time recording instruction is received and the segment audio data is extracted.

16. The method of language translation for multimedia data as described in claim 15, wherein the Real Time Mode browsing provides a forward capture instruction and a backward (Rewind) capture instruction to change the playback time range.

17. A method of language translation for multimedia data, which obtains text information by extracting and converting a segment of audio data in the multimedia data, the method comprising the following steps:
receiving a time recording instruction and recording playback time memory data and a multimedia data file name;
receiving a query instruction and retrieving multimedia data according to the multimedia data file name;
calculating a playback time range from the playback time memory data;
extracting a segment of audio data of the multimedia data according to the playback time range; and
converting the segment audio data into text data.

18. The method of language translation for multimedia data as described in claim 17, wherein the step of obtaining the playback time range comprises the following steps:
receiving the playback time memory data;
subtracting a time capture parameter from the playback time memory data to obtain a playback start point;
adding a time capture parameter to the playback time memory data to obtain a playback end point; and
obtaining the playback time range from the playback start point and the playback end point.

19. The method of language translation for multimedia data as described in claim 17, wherein the step of converting the segment audio data into the text data further comprises the following steps:
analyzing a loudness and a frequency of the segment audio data;
filtering noise from the segment audio data to obtain clean audio;
performing a fuzzy comparison of the clean audio against audio fuzzy comparison data to obtain voice information; and
converting the voice information into the text data.

20. The method of language translation for multimedia data as described in claim 17, wherein the method is On Hold Mode browsing, in which playback of the multimedia data is stopped while the time recording instruction is received and the segment audio data is extracted.

21. The method of language translation for multimedia data as described in claim 20, wherein the On Hold Mode browsing provides a forward capture instruction and a backward (Rewind) capture instruction to change the playback time range.

22. The method of language translation for multimedia data as described in claim 17, wherein the method is Retrieve Mode browsing, in which the multimedia data file name and the playback time memory data are recorded when the time recording instruction is received, and the multimedia data continues to play.

23. The method of language translation for multimedia data as described in claim 22, wherein the Retrieve Mode browsing provides a forward capture instruction and a backward (Rewind) capture instruction to change the playback time range.

24. The method of language translation for multimedia data as described in claim 17, wherein the method is Retrieve Mode browsing, and the query instruction is provided based on the playback of the multimedia data.
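The first two conversion steps recited in claims 14 and 19 (analyzing loudness and frequency, then filtering noise) might look like the sketch below. The claims do not say how loudness or frequency is measured, so the choices here are assumptions made purely for illustration: root-mean-square amplitude for loudness, a zero-crossing count for frequency, a simple threshold-and-band gate for noise, and all function names and threshold values. The fuzzy comparison and text conversion steps are deliberately left out because the source gives no detail to base them on.

```python
import math


def rms_loudness(frame):
    """Loudness of one frame as root-mean-square amplitude (assumed measure)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_frequency(frame, sample_rate):
    """Rough frequency estimate in Hz: half the number of sign changes per second."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings * sample_rate / (2 * len(frame))


def filter_noise(frames, sample_rate, min_loudness=0.05, band=(80.0, 1000.0)):
    """Keep frames that are loud enough and whose estimated frequency is in band.

    The threshold and band are placeholder values, not taken from the source.
    """
    kept = []
    for frame in frames:
        loud = rms_loudness(frame)
        freq = zero_crossing_frequency(frame, sample_rate)
        if loud >= min_loudness and band[0] <= freq <= band[1]:
            kept.append(frame)
    return kept


def sine(freq, amplitude, sample_rate=8000, n=800):
    """Synthetic test frame: a pure tone of n samples."""
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]


voiced = sine(200, 0.5)   # loud and inside the band: kept
hum = sine(200, 0.01)     # inside the band but too quiet: dropped
hiss = sine(3000, 0.5)    # loud but outside the band: dropped
clean = filter_noise([voiced, hum, hiss], sample_rate=8000)
print(len(clean))  # 1
```

A frame-by-frame gate like this is about the simplest thing that fits the claim wording; a practical system would more likely use spectral analysis, but nothing in the text requires it.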

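Claims 7, 16, 21 and 23 recite forward and backward (Rewind) capture instructions that change the playback time range but do not say by how much. One plausible reading, shown here as a sketch under stated assumptions, is that each instruction slides the whole window by a fixed step while keeping its length and staying within the media duration. The function names, the step size, and the clamping behaviour are hypothetical.

```python
def shift_range(start, end, step, duration):
    """Slide the (start, end) window by `step` seconds.

    The window keeps its length and is clamped to [0, duration];
    both behaviours are assumptions, not taken from the source.
    """
    length = end - start
    latest_start = max(0.0, duration - length)
    new_start = min(max(0.0, start + step), latest_start)
    return new_start, new_start + length


def forward(start, end, step, duration):
    """Forward capture instruction: move the window later in the media."""
    return shift_range(start, end, +step, duration)


def rewind(start, end, step, duration):
    """Backward (Rewind) capture instruction: move the window earlier."""
    return shift_range(start, end, -step, duration)


print(forward(78.0, 88.0, 5.0, 120.0))  # (83.0, 93.0)
print(rewind(78.0, 88.0, 5.0, 120.0))   # (73.0, 83.0)
print(rewind(2.0, 12.0, 5.0, 120.0))    # (0.0, 10.0)
```

Sliding the window rather than widening it keeps the amount of audio sent for conversion constant; widening would be an equally defensible reading of "change the playback time range".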