TWI699663B - Segmentation method, segmentation system and non-transitory computer-readable medium - Google Patents

Segmentation method, segmentation system and non-transitory computer-readable medium

Info

Publication number
TWI699663B
TWI699663B
Authority
TW
Taiwan
Prior art keywords
subtitle
paragraph
sentence
segmentation
sentences
Prior art date
Application number
TW108104097A
Other languages
Chinese (zh)
Other versions
TW202011232A (en)
Inventor
藍國誠
詹詩涵
Original Assignee
台達電子工業股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 台達電子工業股份有限公司
Publication of TW202011232A
Application granted
Publication of TWI699663B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/435Filtering based on additional data, e.g. user or group profiles
    • G06F16/437Administration of user profiles, e.g. generation, initialisation, adaptation, distribution

Abstract

The present disclosure relates to a segmentation method, a segmentation system and a non-transitory computer-readable medium. The segmentation method includes the following operations: receiving captioning information, wherein the captioning information includes a plurality of captioning sentences; selecting the captioning sentences according to a default value and dividing the selected captioning sentences into a first paragraph; performing a common segmentation vocabulary judgment on a first captioning sentence, wherein the first captioning sentence is one of the captioning sentences; and generating a second paragraph or merging the first captioning sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment.

Description

Segmentation method, segmentation system and non-transitory computer-readable medium

The present disclosure relates to a segmentation method, a segmentation system and a non-transitory computer-readable medium, and more particularly to a segmentation method, a segmentation system and a non-transitory computer-readable medium for subtitles.

An online learning platform is a network service that stores a large body of learning materials on a server so that users can connect to the server over the Internet and browse the materials at any time. Current online learning platforms offer learning materials such as videos, audio, presentations, documents and forums.

Because an online learning platform stores an enormous amount of learning material, the text of the material needs to be segmented automatically and paragraph keywords need to be generated so that users can navigate it conveniently. How to process the differences between the contents of a learning video so that similar topics in the video are segmented and tagged with keywords is therefore a problem to be solved in this field.

A first aspect of the present disclosure provides a segmentation method. The segmentation method includes the following steps: receiving subtitle information, wherein the subtitle information includes a plurality of subtitle sentences; selecting subtitle sentences according to a set value and dividing the selected subtitle sentences into a first paragraph; performing a common segmentation vocabulary judgment on a first subtitle sentence, wherein the first subtitle sentence is one of the subtitle sentences; and generating a second paragraph or merging the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment.

A second aspect of the present disclosure provides a segmentation system that includes a storage unit and a processor. The storage unit stores the subtitle information, a segmentation result, an annotation corresponding to the first paragraph and an annotation corresponding to the second paragraph. The processor is electrically connected to the storage unit and receives the subtitle information, the subtitle information including a plurality of subtitle sentences. The processor includes a segmentation unit, a common word detection unit and a paragraph generation unit. The segmentation unit selects subtitle sentences in a specific order using a set value and divides the selected subtitle sentences into the first paragraph. The common word detection unit is electrically connected to the segmentation unit and performs a common segmentation vocabulary judgment on a first subtitle sentence, the first subtitle sentence being one of the subtitle sentences. The paragraph generation unit is electrically connected to the common word detection unit and generates a second paragraph or merges the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment.

A third aspect of the present disclosure provides a non-transitory computer-readable medium containing at least one instruction program that is executed by a processor to perform a segmentation method comprising the following steps: receiving subtitle information, wherein the subtitle information includes a plurality of subtitle sentences; selecting subtitle sentences according to a set value and dividing the selected subtitle sentences into a first paragraph; performing a common segmentation vocabulary judgment on a first subtitle sentence, wherein the first subtitle sentence is one of the subtitle sentences; and generating a second paragraph or merging the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment.

The segmentation method, segmentation system and non-transitory computer-readable medium of the present disclosure mainly address the problem that video paragraphs were previously marked manually, which consumes considerable manpower and time. The keywords corresponding to each subtitle sentence are computed first, a common segmentation vocabulary judgment is then performed on each subtitle sentence, and a second paragraph is generated or the first subtitle sentence is merged into the first paragraph according to the judgment result, so as to produce a segmentation result. This achieves the function of segmenting similar topics in a learning video and tagging them with keywords.

Several embodiments of the present disclosure are described below with reference to the drawings. For clarity, many practical details are explained in the following description. It should be understood, however, that these practical details are not intended to limit the disclosure; in some embodiments of the present disclosure they are unnecessary. In addition, to simplify the drawings, some conventional structures and elements are shown in a simple schematic manner.

In this document, when an element is referred to as being "connected" or "coupled", it may mean "electrically connected" or "electrically coupled". "Connected" or "coupled" may also mean that two or more elements operate or interact with each other. Moreover, although terms such as "first" and "second" are used herein to describe different elements, the terms merely distinguish elements or operations described with the same technical term. Unless the context clearly indicates otherwise, the terms neither refer to nor imply an order or sequence, nor are they intended to limit the present invention.

Reference is made to FIG. 1, which is a schematic diagram of a segmentation system 100 according to some embodiments of the present disclosure. As shown in FIG. 1, the segmentation system 100 includes a storage unit 110 and a processor 130. The storage unit 110 is electrically connected to the processor 130 and stores subtitle information, a segmentation result, a common segmentation vocabulary database DB1, a course database DB2, an annotation corresponding to the first paragraph, and an annotation corresponding to the second paragraph.

Following the above, the processor 130 includes a keyword extraction unit 131, a segmentation unit 132, a common word detection unit 133, a paragraph generation unit 134 and an annotation generation unit 135. The segmentation unit 132 is electrically connected to the keyword extraction unit 131 and the common word detection unit 133, the paragraph generation unit 134 is electrically connected to the common word detection unit 133 and the annotation generation unit 135, and the common word detection unit 133 is electrically connected to the annotation generation unit 135.

In various embodiments of the present invention, the storage unit 110 may be implemented as a memory, a hard disk, a flash drive, a memory card, and so on. The processor 130 may be implemented as an integrated circuit such as a microcontroller, a microprocessor, a digital signal processor, an application-specific integrated circuit (ASIC), a logic circuit, other similar elements, or a combination of the above elements.

Reference is made to FIG. 2, which is a flowchart of a segmentation method 200 according to some embodiments of the present disclosure. In one embodiment, the segmentation method 200 shown in FIG. 2 may be applied to the segmentation system 100 of FIG. 1, and the processor 130 segments the subtitle information according to the steps of the segmentation method 200 described below to produce a segmentation result and an annotation corresponding to each paragraph. As shown in FIG. 2, the segmentation method 200 first executes step S210 to receive subtitle information. In one embodiment, the subtitle information includes a plurality of subtitle sentences. For example, the subtitle information is the subtitle file of a video; the subtitle file has already divided the video content into a plurality of subtitle sentences according to the playback time of the video, and the subtitle sentences are ordered by playback time.
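
For illustration only, a minimal sketch of how the subtitle information of step S210 could be represented in code; the `SubtitleSentence` type and the pre-parsed `(start_time, text)` input are assumptions, since the patent only requires that the subtitle sentences be ordered by playback time.

```python
from dataclasses import dataclass

@dataclass
class SubtitleSentence:
    index: int         # position in playback order (1-based)
    start_time: float  # seconds from the start of the video
    text: str          # text of the subtitle sentence

def receive_caption_info(raw_entries):
    """Step S210 (sketch): build the ordered list of subtitle sentences from
    pre-parsed (start_time, text) pairs taken from the video's subtitle file."""
    ordered = sorted(raw_entries, key=lambda entry: entry[0])
    return [SubtitleSentence(i, start, text)
            for i, (start, text) in enumerate(ordered, start=1)]
```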

Next, the segmentation method 200 executes step S220 to select subtitle sentences according to a set value and divide the selected subtitle sentences into the current paragraph. In one embodiment, the set value may be any positive integer; a set value of 3 is used here as an example, so in this step three subtitle sentences are selected according to the playback time of the video to form the current paragraph. For example, if there are N subtitle sentences in total, the 1st to 3rd subtitle sentences may be selected to form the current paragraph.
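
As a minimal sketch (assuming the set value of 3 used in the example above), step S220 can be pictured as taking the first `set_value` sentences, in playback order, as the current paragraph:

```python
def init_current_paragraph(sentences, set_value=3):
    """Step S220 (sketch): the first `set_value` subtitle sentences, in playback
    order, form the current paragraph; the returned index points at the sentence
    that step S230 examines next."""
    current_paragraph = list(sentences[:set_value])
    next_index = min(set_value, len(sentences))
    return current_paragraph, next_index
```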

Next, the segmentation method 200 executes step S230 to perform a common segmentation vocabulary judgment on the current subtitle sentence. In one embodiment, the common segmentation vocabularies are stored in the common segmentation vocabulary database DB1, and the common word detection unit 133 detects whether a common segmentation vocabulary appears. Common segmentation vocabularies can be divided into common opening words and common ending words. For example, common opening words may be "接下來" ("next") or "開始說明" ("let us begin with"), and common ending words may be "以上說明到此" ("this concludes the explanation") or "今天到這裡告一段落" ("we will stop here for today"). In this step, both the presence of a common segmentation vocabulary and its type (common opening word or common ending word) are detected.
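
A minimal sketch of the detection performed by the common word detection unit 133, assuming DB1 is simply two word lists and that a substring match is sufficient (the patent does not specify the matching rule):

```python
# Hypothetical contents of the common segmentation vocabulary database DB1.
OPENING_WORDS = ["接下來", "開始說明"]                # "next", "let us begin with"
ENDING_WORDS = ["以上說明到此", "今天到這裡告一段落"]  # "this concludes the explanation", "we will stop here for today"

def detect_common_segment_word(sentence_text):
    """Step S230 (sketch): report whether a common segmentation vocabulary appears
    in the sentence and, if so, whether it is an opening or an ending word."""
    if any(word in sentence_text for word in OPENING_WORDS):
        return "opening"
    if any(word in sentence_text for word in ENDING_WORDS):
        return "ending"
    return None
```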

Next, the segmentation method 200 executes step S240 to generate the next paragraph or merge the current subtitle sentence into the current paragraph according to the judgment result of the common segmentation vocabulary judgment. In one embodiment, based on the detection result of the common word detection unit 133 described above, it is decided whether to generate a new paragraph or to merge the currently processed subtitle sentence into the current paragraph. For example, if the current paragraph consists of the 1st to 3rd subtitle sentences, the currently processed subtitle sentence may be the 4th subtitle sentence; according to the judgment result, the 4th subtitle sentence is either merged into the current paragraph or used as the beginning of a new paragraph.

Following the above, after step S240 merges the current subtitle sentence into the current paragraph, the common segmentation vocabulary judgment is performed on the next subtitle sentence, so the judgment of step S230 is executed again. For example, after the 4th subtitle sentence is merged into the current paragraph, the common segmentation vocabulary judgment is performed on the 5th subtitle sentence. If step S240 generates the next paragraph, the set value is then used to select subtitle sentences in a specific order and the selected subtitle sentences are divided into the next paragraph, so the operation of step S220 is executed again. For example, if the 4th subtitle sentence is classified into the next paragraph, the 5th, 6th and 7th subtitle sentences are selected and added to the next paragraph. The segmentation operation is therefore repeated until all subtitle sentences have been segmented, and the segmentation result is finally produced.
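
Putting steps S220 to S240 together, one possible sketch of the repeated segmentation loop is shown below; `detect_common_segment_word` is the hypothetical helper sketched above, and `is_similar` stands for the TF-IDF-based similarity check sketched later in this description (both names are assumptions, not part of the patent):

```python
def segment(sentences, set_value=3, threshold=0.5):
    """Sketch of the loop over steps S220-S240: each subtitle sentence is either
    merged into the current paragraph or used to start the next paragraph."""
    paragraphs = []
    current = list(sentences[:set_value])   # step S220
    i = set_value
    while i < len(sentences):
        if len(sentences) - i < set_value:  # fewer remaining sentences than the set value:
            current.extend(sentences[i:])   # merge them directly into the current paragraph
            break
        sentence = sentences[i]
        kind = detect_common_segment_word(sentence.text)    # step S230
        if kind == "opening":                                # step S2412
            paragraphs.append(current)
            current = list(sentences[i:i + set_value])       # sentence starts the next paragraph
            i += set_value
        elif kind == "ending":                               # step S2413
            current.append(sentence)
            paragraphs.append(current)
            current = list(sentences[i + 1:i + 1 + set_value])
            i += 1 + set_value
        elif is_similar(sentence, current, threshold):       # steps S2421-S2422
            current.append(sentence)
            i += 1
        else:                                                # step S2423
            paragraphs.append(current)
            current = list(sentences[i:i + set_value])
            i += set_value
    if current:
        paragraphs.append(current)
    return paragraphs
```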

Step S240 further includes steps S241 and S242. Reference is made to FIG. 3, which is a flowchart of step S240 according to some embodiments of the present disclosure. As shown in FIG. 3, the segmentation method 200 further executes step S241: if the current subtitle sentence is associated with a common segmentation vocabulary, a segmentation process is performed to generate the next paragraph, and the set value is used to select subtitle sentences in a specific order and add the selected subtitle sentences to the next paragraph. Step S241 further includes steps S2411 to S2413. Reference is made to FIG. 4, which is a flowchart of step S241 according to some embodiments of the present disclosure. As shown in FIG. 4, the segmentation method 200 further executes step S2411 to decide, according to the judgment result, whether the current subtitle sentence is associated with one of an opening segmentation vocabulary and an ending segmentation vocabulary. Following the above embodiment, the judgment result of step S230 determines whether the current subtitle sentence is associated with an opening segmentation vocabulary or an ending segmentation vocabulary.

Following the above, the segmentation method 200 further executes step S2412: if the current subtitle sentence is associated with an opening segmentation vocabulary, the current subtitle sentence is used as the starting sentence of the next paragraph. For example, if the aforementioned judgment result detects the word "接下來" ("next") in the 4th subtitle sentence, the 4th subtitle sentence is used as the starting sentence of the next paragraph.

Following the above, the segmentation method 200 further executes step S2413: if the current subtitle sentence is associated with an ending segmentation vocabulary, the current subtitle sentence is used as the ending sentence of the current paragraph. For example, if the aforementioned judgment result detects the phrase "以上說明到此" ("this concludes the explanation") in the 4th subtitle sentence, the 4th subtitle sentence is used as the ending sentence of the current paragraph. After the operation of step S241 is completed, the set value is used to select subtitle sentences in a specific order and the selected subtitle sentences are divided into the next paragraph, so the operation of step S220 is executed again, which is not repeated here.

Next, the segmentation method 200 further executes step S242: if the current subtitle sentence is not associated with a common segmentation vocabulary, a similarity value calculation is performed between the current subtitle sentence and the current paragraph, and if they are similar, the current subtitle sentence is merged into the current paragraph. Step S242 further includes steps S2421 to S2423. Reference is made to FIG. 5, which is a flowchart of step S242 according to some embodiments of the present disclosure. As shown in FIG. 5, the segmentation method 200 further executes step S2421 to compare whether a difference value between at least one feature corresponding to the current subtitle sentence and at least one feature corresponding to the current paragraph is greater than a threshold.

Following the above, in one embodiment a plurality of keywords are extracted from the subtitle sentences, and the extracted keywords are the at least one feature corresponding to the current subtitle sentence. The keywords corresponding to a subtitle sentence are computed with the TF-IDF (Term Frequency-Inverse Document Frequency) statistical method. TF-IDF evaluates how important a word is to a document in a corpus: the importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears in the corpus. In this embodiment, the TF-IDF statistical method computes the keywords of the current subtitle sentence. Then a similarity value between the at least one feature (keyword) of the current subtitle sentence and the at least one feature (keyword) of the current paragraph is computed; the higher the similarity value, the closer the content of the current subtitle sentence is to that of the current paragraph.
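
A minimal sketch of the keyword and similarity computation described above; the tokenization (a plain `split()` stands in for a real tokenizer), the number of keywords kept, and the use of keyword-set overlap as the similarity value are all assumptions, since the patent only specifies that TF-IDF extracts the keywords and that a similarity value is computed from the features:

```python
import math
from collections import Counter

def tfidf_keywords(tokens, corpus, top_k=5):
    """Extract keywords from one subtitle sentence (a list of tokens) using
    TF-IDF against a corpus of token lists (e.g. all subtitle sentences)."""
    tf = Counter(tokens)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for doc in corpus if term in doc)        # document frequency
        idf = math.log(len(corpus) / (1 + df))
        scores[term] = (count / len(tokens)) * idf
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:top_k])

def is_similar(sentence, paragraph, threshold, corpus=()):
    """Hypothetical similarity check for steps S2421-S2422: compare the keyword
    set of the current subtitle sentence with the keyword set of the current
    paragraph, using Jaccard overlap as the (unspecified) similarity measure."""
    corpus = corpus or [s.text.split() for s in paragraph] + [sentence.text.split()]
    sentence_keywords = tfidf_keywords(sentence.text.split(), corpus)
    paragraph_keywords = set()
    for s in paragraph:
        paragraph_keywords |= tfidf_keywords(s.text.split(), corpus)
    if not sentence_keywords or not paragraph_keywords:
        return False
    overlap = len(sentence_keywords & paragraph_keywords) / len(sentence_keywords | paragraph_keywords)
    return overlap >= threshold
```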

Following the above, the segmentation method 200 further executes step S2422: if the difference value is less than the threshold, the current subtitle sentence is merged into the current paragraph. In one embodiment, the threshold is used to screen the similarity value; when the similarity value is not less than the threshold, the content of the current subtitle sentence is relatively similar to that of the current paragraph, so the current subtitle sentence can be merged into the current paragraph. For example, if the similarity value between the 4th subtitle sentence and the current paragraph is not less than the threshold, the content of the 4th subtitle sentence is relatively similar to that of the current paragraph, so the 4th subtitle sentence is added to the current paragraph.

Following the above, the segmentation method 200 further executes step S2423: if the difference value is not less than the threshold, the current subtitle sentence is used as the starting sentence of the next paragraph, and the set value is used to select subtitle sentences in a specific order and divide the selected subtitle sentences into the next paragraph. When the similarity value is less than the threshold, the content of the current subtitle sentence differs from that of the current paragraph, so the current subtitle sentence is determined to be the starting sentence of the second paragraph. For example, if the similarity value between the 4th subtitle sentence and the current paragraph is less than the threshold, the content of the 4th subtitle sentence differs from that of the current paragraph, so the 4th subtitle sentence is used as the starting sentence of the next paragraph. After this operation is completed, the set value is used to select subtitle sentences in a specific order and the selected subtitle sentences are divided into the next paragraph, so the operation of step S220 is executed again, which is not repeated here.

From the segmentation operations described above, it can be seen that each time the segmentation computation of one subtitle sentence is completed, the segmentation computation of the next subtitle sentence is performed, until all subtitle sentences have been processed. If the number of remaining subtitle sentences is less than the set value, the segmentation computation is no longer performed on them; instead, the remaining subtitle sentences are merged directly into the current paragraph. For example, if the number of remaining subtitle sentences is 2, which is less than the aforementioned set value (set to 3 above), the remaining 2 subtitle sentences are merged into the current paragraph.

Next, after the segmentation steps described above are completed, the segmentation method 200 executes step S250 to generate the annotation corresponding to each paragraph. For example, if the subtitle sentences are divided into 3 paragraphs after all of them have been processed, the annotations of the 3 paragraphs are computed separately; an annotation can be generated from the keywords corresponding to the subtitle sentences in the paragraph. Finally, the divided paragraphs and their corresponding annotations are stored in the course database DB2 of the storage unit 110. For example, if the difference value is less than the threshold, the current subtitle sentence is relatively similar to the current paragraph, so the keywords of the subtitle sentence can be used as the at least one feature corresponding to the current paragraph. If the difference value is not less than the threshold, the current subtitle sentence is not similar to the current paragraph, so the keywords of the subtitle sentence can be used as the at least one feature corresponding to the next paragraph.
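
A minimal sketch of step S250, assuming (as the example above suggests) that a paragraph's annotation is simply built from the keywords of the subtitle sentences it contains; the `tfidf_keywords` helper is the hypothetical one sketched earlier, and keeping the three most frequent keywords is an assumption:

```python
from collections import Counter

def annotate_paragraphs(paragraphs, corpus):
    """Step S250 (sketch): produce one annotation per paragraph from the
    TF-IDF keywords of the subtitle sentences in that paragraph."""
    annotations = []
    for paragraph in paragraphs:
        counts = Counter()
        for sentence in paragraph:
            counts.update(tfidf_keywords(sentence.text.split(), corpus))
        annotations.append([keyword for keyword, _ in counts.most_common(3)])
    return annotations
```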

From the embodiments of the present disclosure described above, it can be seen that the disclosure mainly addresses the problem that video paragraphs were previously marked manually, which consumes considerable manpower and time. The keywords corresponding to each subtitle sentence are computed first, a common segmentation vocabulary judgment is then performed on each subtitle sentence, and the next paragraph is generated or the current subtitle sentence is merged into the current paragraph according to the judgment result, so as to produce a segmentation result. This achieves the function of segmenting similar topics in a learning video and tagging them with keywords.

In addition, the above examples include exemplary steps in sequence, but these steps need not be executed in the order shown. Executing the steps in a different order falls within the scope of the present disclosure. Within the spirit and scope of the embodiments of the present disclosure, steps may be added, replaced, reordered and/or omitted as appropriate.

Although the present disclosure has been disclosed above by way of embodiments, they are not intended to limit the present invention. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the scope of protection of the present invention shall be defined by the appended claims.

100: segmentation system
110: storage unit
130: processor
DB1: common segmentation vocabulary database
DB2: course database
131: keyword extraction unit
132: segmentation unit
133: common word detection unit
134: paragraph generation unit
135: annotation generation unit
200: segmentation method
S210~S250, S241~S242, S2411~S2413, S2421~S2423: steps

In order to make the above and other objectives, features, advantages and embodiments of the present invention more comprehensible, the accompanying drawings are described as follows:
FIG. 1 is a schematic diagram of a segmentation system according to some embodiments of the present disclosure;
FIG. 2 is a flowchart of a segmentation method according to some embodiments of the present disclosure;
FIG. 3 is a flowchart of step S240 according to some embodiments of the present disclosure;
FIG. 4 is a flowchart of step S241 according to some embodiments of the present disclosure; and
FIG. 5 is a flowchart of step S242 according to some embodiments of the present disclosure.

200: segmentation method

S210~S250: steps

Claims (15)

1. A segmentation method, comprising: receiving subtitle information, wherein the subtitle information comprises a plurality of subtitle sentences; selecting the subtitle sentences according to a set value and dividing the selected subtitle sentences into a first paragraph; performing a common segmentation vocabulary judgment on a first subtitle sentence, wherein the first subtitle sentence is one of the subtitle sentences; and generating a second paragraph or merging the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment; wherein generating the second paragraph or merging the first subtitle sentence into the first paragraph according to the judgment result further comprises: if the first subtitle sentence is associated with a common segmentation vocabulary, performing a segmentation process to generate the second paragraph, and selecting the subtitle sentences according to a first specific order using the set value and adding the selected subtitle sentences to the second paragraph; and if the first subtitle sentence is not associated with the common segmentation vocabulary, performing a similarity value calculation between the first subtitle sentence and the first paragraph, and, if they are similar, merging the first subtitle sentence into the first paragraph.

2. The segmentation method of claim 1, wherein, after the first subtitle sentence is merged into the first paragraph, the common segmentation vocabulary judgment is performed on a second subtitle sentence, wherein the second subtitle sentence follows the first subtitle sentence in a second specific order.

3. The segmentation method of claim 1, wherein, when the second paragraph is generated, the subtitle sentences are selected according to a second specific order using the set value, and the selected subtitle sentences are added to the second paragraph.

4. The segmentation method of claim 1, wherein the segmentation process comprises: determining, according to the judgment result, whether the first subtitle sentence is associated with one of an opening segmentation vocabulary and an ending segmentation vocabulary; if the first subtitle sentence is associated with the opening segmentation vocabulary, using the first subtitle sentence as the starting sentence of the second paragraph; and if the first subtitle sentence is associated with the ending segmentation vocabulary, using the first subtitle sentence as the ending sentence of the first paragraph.

5. The segmentation method of claim 1, wherein the similarity value calculation comprises: comparing whether a difference value between at least one feature corresponding to the first subtitle sentence and at least one feature corresponding to the first paragraph is greater than a threshold; if the difference value is less than the threshold, merging the first subtitle sentence into the first paragraph; and if the difference value is not less than the threshold, using the first subtitle sentence as the starting sentence of the second paragraph, and selecting the subtitle sentences according to the first specific order using the set value and dividing the selected subtitle sentences into the second paragraph.

6. The segmentation method of claim 5, wherein a plurality of keywords are extracted from the subtitle sentences, and the keywords are the at least one feature corresponding to the first subtitle sentence.

7. The segmentation method of claim 6, wherein the at least one feature corresponding to the first paragraph is generated from the keywords extracted from the subtitle sentences in the first paragraph.

8. A segmentation system, comprising: a storage unit configured to store subtitle information, a segmentation result, a common segmentation vocabulary database, an annotation corresponding to a first paragraph and an annotation corresponding to a second paragraph; and a processor electrically connected to the storage unit and configured to receive the subtitle information, wherein the subtitle information comprises a plurality of subtitle sentences, and the processor comprises: a segmentation unit configured to select the subtitle sentences using a set value and divide the selected subtitle sentences into the first paragraph; a common word detection unit electrically connected to the segmentation unit and configured to perform a common segmentation vocabulary judgment on a first subtitle sentence, wherein the first subtitle sentence is one of the subtitle sentences; and a paragraph generation unit electrically connected to the common word detection unit and configured to generate the second paragraph or merge the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment; wherein the paragraph generation unit is further configured to perform the following steps according to the judgment result: if the first subtitle sentence is associated with a common segmentation vocabulary, performing a segmentation process to generate the second paragraph, and selecting the subtitle sentences according to a first specific order using the set value and adding the selected subtitle sentences to the second paragraph; and if the first subtitle sentence is not associated with the common segmentation vocabulary, performing a similarity value calculation between the first subtitle sentence and the first paragraph, and, if they are similar, merging the first subtitle sentence into the first paragraph.

9. The segmentation system of claim 8, wherein, after the first subtitle sentence is merged into the first paragraph, the common word detection unit is further configured to perform the common segmentation vocabulary judgment on a second subtitle sentence, wherein the second subtitle sentence follows the first subtitle sentence in a second specific order.

10. The segmentation system of claim 8, wherein, when the second paragraph is generated, the segmentation unit is further configured to select the subtitle sentences according to a second specific order using the set value and add the selected subtitle sentences to the second paragraph.

11. The segmentation system of claim 8, wherein the segmentation process comprises: determining, according to the judgment result, whether the first subtitle sentence is associated with one of an opening segmentation vocabulary and an ending segmentation vocabulary; if the first subtitle sentence is associated with the opening segmentation vocabulary, using the first subtitle sentence as the starting sentence of the second paragraph; and if the first subtitle sentence is associated with the ending segmentation vocabulary, using the first subtitle sentence as the ending sentence of the first paragraph.

12. The segmentation system of claim 8, wherein the similarity value calculation comprises: comparing whether a difference value between at least one feature corresponding to the first subtitle sentence and at least one feature corresponding to the first paragraph is greater than a threshold; if the difference value is less than the threshold, merging the first subtitle sentence into the first paragraph; and if the difference value is not less than the threshold, using the first subtitle sentence as the starting sentence of the second paragraph, and selecting the subtitle sentences according to the first specific order using the set value and dividing the selected subtitle sentences into the second paragraph.

13. The segmentation system of claim 12, further comprising: a keyword extraction unit electrically connected to the segmentation unit and configured to extract a plurality of keywords from the subtitle sentences, wherein the keywords are the at least one feature corresponding to the first subtitle sentence.

14. The segmentation system of claim 13, wherein the at least one feature corresponding to the first paragraph is generated from the keywords extracted from the subtitle sentences in the first paragraph.

15. A non-transitory computer-readable medium comprising at least one instruction program executed by a processor to perform a segmentation method, the segmentation method comprising: receiving subtitle information, wherein the subtitle information comprises a plurality of subtitle sentences; selecting the subtitle sentences according to a set value and dividing the selected subtitle sentences into a first paragraph; performing a common segmentation vocabulary judgment on a first subtitle sentence, wherein the first subtitle sentence is one of the subtitle sentences; and generating a second paragraph or merging the first subtitle sentence into the first paragraph according to a judgment result of the common segmentation vocabulary judgment; wherein generating the second paragraph or merging the first subtitle sentence into the first paragraph according to the judgment result further comprises: if the first subtitle sentence is associated with a common segmentation vocabulary, performing a segmentation process to generate the second paragraph, and selecting the subtitle sentences according to a specific order using the set value and adding the selected subtitle sentences to the second paragraph; and if the first subtitle sentence is not associated with the common segmentation vocabulary, performing a similarity value calculation between the first subtitle sentence and the first paragraph, and, if they are similar, merging the first subtitle sentence into the first paragraph.
TW108104097A 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer-readable medium TWI699663B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862728082P 2018-09-07 2018-09-07
US62/728,082 2018-09-07

Publications (2)

Publication Number Publication Date
TW202011232A TW202011232A (en) 2020-03-16
TWI699663B true TWI699663B (en) 2020-07-21

Family

ID=69745778

Family Applications (5)

Application Number Title Priority Date Filing Date
TW108104065A TWI709905B (en) 2018-09-07 2019-02-01 Data analysis method and data analysis system thereof
TW108104097A TWI699663B (en) 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer-readable medium
TW108104105A TWI700597B (en) 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer-readable medium
TW108104107A TWI725375B (en) 2018-09-07 2019-02-01 Data search method and data search system thereof
TW108111842A TWI696386B (en) 2018-09-07 2019-04-03 Multimedia data recommending system and multimedia data recommending method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
TW108104065A TWI709905B (en) 2018-09-07 2019-02-01 Data analysis method and data analysis system thereof

Family Applications After (3)

Application Number Title Priority Date Filing Date
TW108104105A TWI700597B (en) 2018-09-07 2019-02-01 Segmentation method, segmentation system and non-transitory computer-readable medium
TW108104107A TWI725375B (en) 2018-09-07 2019-02-01 Data search method and data search system thereof
TW108111842A TWI696386B (en) 2018-09-07 2019-04-03 Multimedia data recommending system and multimedia data recommending method

Country Status (4)

Country Link
JP (3) JP6829740B2 (en)
CN (5) CN110891202B (en)
SG (5) SG10201905236WA (en)
TW (5) TWI709905B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI756703B (en) * 2020-06-03 2022-03-01 南開科技大學 Digital learning system and method thereof
CN117351794A (en) * 2023-10-13 2024-01-05 浙江上国教育科技有限公司 Online course management system based on cloud platform

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200825900A (en) * 2006-12-13 2008-06-16 Inst Information Industry System and method for generating wiki by sectional time of handout and recording medium thereof
CN101382937A (en) * 2008-07-01 2009-03-11 深圳先进技术研究院 Multimedia resource processing method based on speech recognition and on-line teaching system thereof
WO2014100893A1 (en) * 2012-12-28 2014-07-03 Jérémie Salvatore De Villiers System and method for the automated customization of audio and video media
CN106231399A (en) * 2016-08-01 2016-12-14 乐视控股(北京)有限公司 Methods of video segmentation, equipment and system

Family Cites Families (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07311539A (en) * 1994-05-17 1995-11-28 Hitachi Ltd Teaching material edition supporting system
KR100250540B1 (en) * 1996-08-13 2000-04-01 김광수 Studying method of foreign language dictation with apparatus of playing caption video cd
JP2002041823A (en) * 2000-07-27 2002-02-08 Nippon Telegr & Teleph Corp <Ntt> Information distributing device, information receiving device and information distributing system
JP3685733B2 (en) * 2001-04-11 2005-08-24 株式会社ジェイ・フィット Multimedia data search apparatus, multimedia data search method, and multimedia data search program
JP2002341735A (en) * 2001-05-16 2002-11-29 Alice Factory:Kk Broadband digital learning system
CN1432932A (en) * 2002-01-16 2003-07-30 陈雯瑄 English examination and score estimation method and system
TW200411462A (en) * 2002-12-20 2004-07-01 Hsiao-Lien Wang A method for matching information exchange on network
WO2004090752A1 (en) * 2003-04-14 2004-10-21 Koninklijke Philips Electronics N.V. Method and apparatus for summarizing a music video using content analysis
JP4471737B2 (en) * 2003-10-06 2010-06-02 日本電信電話株式会社 Grouping condition determining device and method, keyword expansion device and method using the same, content search system, content information providing system and method, and program
JP4426894B2 (en) * 2004-04-15 2010-03-03 株式会社日立製作所 Document search method, document search program, and document search apparatus for executing the same
JP2005321662A (en) * 2004-05-10 2005-11-17 Fuji Xerox Co Ltd Learning support system and method
JP2006003670A (en) * 2004-06-18 2006-01-05 Hitachi Ltd Educational content providing system
KR20070116945A (en) * 2005-03-31 2007-12-11 코닌클리케 필립스 일렉트로닉스 엔.브이. Augmenting lectures based on prior exams
US9058406B2 (en) * 2005-09-14 2015-06-16 Millennial Media, Inc. Management of multiple advertising inventories using a monetization platform
JP5167546B2 (en) * 2006-08-21 2013-03-21 国立大学法人京都大学 Sentence search method, sentence search device, computer program, recording medium, and document storage device
JP5010292B2 (en) * 2007-01-18 2012-08-29 株式会社東芝 Video attribute information output device, video summarization device, program, and video attribute information output method
JP5158766B2 (en) * 2007-10-23 2013-03-06 シャープ株式会社 Content selection device, television, content selection program, and storage medium
TW200923860A (en) * 2007-11-19 2009-06-01 Univ Nat Taiwan Science Tech Interactive learning system
US8140544B2 (en) * 2008-09-03 2012-03-20 International Business Machines Corporation Interactive digital video library
CN101453649B (en) * 2008-12-30 2011-01-05 浙江大学 Key frame extracting method for compression domain video stream
JP5366632B2 (en) * 2009-04-21 2013-12-11 エヌ・ティ・ティ・コミュニケーションズ株式会社 Search support keyword presentation device, method and program
JP5493515B2 (en) * 2009-07-03 2014-05-14 富士通株式会社 Portable terminal device, information search method, and information search program
US20110177482A1 (en) * 2010-01-15 2011-07-21 Nitzan Katz Facilitating targeted interaction in a networked learning environment
JP2012038239A (en) * 2010-08-11 2012-02-23 Sony Corp Information processing equipment, information processing method and program
US8839110B2 (en) * 2011-02-16 2014-09-16 Apple Inc. Rate conform operation for a media-editing application
CN102222227B (en) * 2011-04-25 2013-07-31 中国华录集团有限公司 Video identification based system for extracting film images
CN102348049B (en) * 2011-09-16 2013-09-18 央视国际网络有限公司 Method and device for detecting position of cut point of video segment
CN102509007A (en) * 2011-11-01 2012-06-20 北京瑞信在线系统技术有限公司 Method, system and device for multimedia teaching evaluation and multimedia teaching system
JP5216922B1 (en) * 2012-01-06 2013-06-19 Flens株式会社 Learning support server, learning support system, and learning support program
US9846696B2 (en) * 2012-02-29 2017-12-19 Telefonaktiebolaget Lm Ericsson (Publ) Apparatus and methods for indexing multimedia content
US20130263166A1 (en) * 2012-03-27 2013-10-03 Bluefin Labs, Inc. Social Networking System Targeted Message Synchronization
US9058385B2 (en) * 2012-06-26 2015-06-16 Aol Inc. Systems and methods for identifying electronic content using video graphs
TWI513286B (en) * 2012-08-28 2015-12-11 Ind Tech Res Inst Method and system for continuous video replay
CN102937972B (en) * 2012-10-15 2016-06-22 上海外教社信息技术有限公司 A kind of audiovisual subtitle making system and method
JP6205767B2 (en) * 2013-03-13 2017-10-04 カシオ計算機株式会社 Learning support device, learning support method, learning support program, learning support system, and server device
TWI549498B (en) * 2013-06-24 2016-09-11 wu-xiong Chen Variable audio and video playback method
CN104572716A (en) * 2013-10-18 2015-04-29 英业达科技有限公司 System and method for playing video files
KR101537370B1 (en) * 2013-11-06 2015-07-16 주식회사 시스트란인터내셔널 System for grasping speech meaning of recording audio data based on keyword spotting, and indexing method and method thereof using the system
US20150206441A1 (en) * 2014-01-18 2015-07-23 Invent.ly LLC Personalized online learning management system and method
CN104123332B (en) * 2014-01-24 2018-11-09 腾讯科技(深圳)有限公司 The display methods and device of search result
US9892194B2 (en) * 2014-04-04 2018-02-13 Fujitsu Limited Topic identification in lecture videos
US20150293995A1 (en) * 2014-04-14 2015-10-15 David Mo Chen Systems and Methods for Performing Multi-Modal Video Search
JP6334431B2 (en) * 2015-02-18 2018-05-30 株式会社日立製作所 Data analysis apparatus, data analysis method, and data analysis program
US20160239155A1 (en) * 2015-02-18 2016-08-18 Google Inc. Adaptive media
CN105047203B (en) * 2015-05-25 2019-09-10 广州酷狗计算机科技有限公司 A kind of audio-frequency processing method, device and terminal
CN104978961B (en) * 2015-05-25 2019-10-15 广州酷狗计算机科技有限公司 A kind of audio-frequency processing method, device and terminal
TWI571756B (en) * 2015-12-11 2017-02-21 財團法人工業技術研究院 Methods and systems for analyzing reading log and documents corresponding thereof
CN105978800A (en) * 2016-07-04 2016-09-28 广东小天才科技有限公司 Method and system for pushing subjects to mobile terminal and server
CN106202453B (en) * 2016-07-13 2020-08-04 网易(杭州)网络有限公司 Multimedia resource recommendation method and device
CN106331893B (en) * 2016-08-31 2019-09-03 科大讯飞股份有限公司 Real-time caption presentation method and system
CN108122437A (en) * 2016-11-28 2018-06-05 北大方正集团有限公司 Adaptive learning method and device
CN107256262B (en) * 2017-06-13 2020-04-14 西安电子科技大学 Image retrieval method based on object detection
CN107623860A (en) * 2017-08-09 2018-01-23 北京奇艺世纪科技有限公司 Multi-medium data dividing method and device

Also Published As

Publication number Publication date
CN110891202A (en) 2020-03-17
TW202011231A (en) 2020-03-16
TW202011749A (en) 2020-03-16
CN110889034A (en) 2020-03-17
TWI696386B (en) 2020-06-11
SG10201906347QA (en) 2020-04-29
CN110888896B (en) 2023-09-05
SG10201905532QA (en) 2020-04-29
CN110888994A (en) 2020-03-17
JP2020042777A (en) 2020-03-19
TWI700597B (en) 2020-08-01
TWI709905B (en) 2020-11-11
TWI725375B (en) 2021-04-21
SG10201905236WA (en) 2020-04-29
TW202011221A (en) 2020-03-16
TW202011222A (en) 2020-03-16
TW202011232A (en) 2020-03-16
CN110895654A (en) 2020-03-20
SG10201905523TA (en) 2020-04-29
CN110888896A (en) 2020-03-17
CN110891202B (en) 2022-03-25
JP2020042770A (en) 2020-03-19
JP6829740B2 (en) 2021-02-10
JP2020042771A (en) 2020-03-19
SG10201907250TA (en) 2020-04-29

Similar Documents

Publication Publication Date Title
CN108009293B (en) Video tag generation method and device, computer equipment and storage medium
US9438850B2 (en) Determining importance of scenes based upon closed captioning data
CN102483743B (en) Detecting writing systems and languages
EP3401802A1 (en) Webpage training method and device, and search intention identification method and device
US8843815B2 (en) System and method for automatically extracting metadata from unstructured electronic documents
JP6335898B2 (en) Information classification based on product recognition
CN106557545B (en) Video retrieval method and device
CN107463548B (en) Phrase mining method and device
CN109275047B (en) Video information processing method and device, electronic equipment and storage medium
US20110276523A1 (en) Measuring document similarity by inferring evolution of documents through reuse of passage sequences
Moncrieff et al. Affect computing in film through sound energy dynamics
TWI699663B (en) Segmentation method, segmentation system and non-transitory computer-readable medium
CN107924398B (en) System and method for providing a review-centric news reader
CN106610990A (en) Emotional tendency analysis method and apparatus
Park et al. Exploiting script-subtitles alignment to scene boundary dectection in movie
CN112214984A (en) Content plagiarism identification method, device, equipment and storage medium
WO2022143608A1 (en) Language labeling method and apparatus, and computer device and storage medium
US20190205320A1 (en) Sentence scoring apparatus and program
CN116029280A (en) Method, device, computing equipment and storage medium for extracting key information of document
CN109344254B (en) Address information classification method and device
US9934218B2 (en) Systems and methods for extracting attributes from text content
JP5366849B2 (en) Function expression complementing apparatus, method and program
US11423208B1 (en) Text encoding issue detection
US20180307669A1 (en) Information processing apparatus
KR20200063316A (en) Apparatus for searching video based on script and method for the same