TWI790630B - Method and device for automatically generating notes - Google Patents

Method and device for automatically generating notes

Info

Publication number
TWI790630B
Authority
TW
Taiwan
Prior art keywords: text, data, text data, slide, graphics
Application number: TW110119671A
Other languages: Chinese (zh)
Other versions: TW202249000A (en)
Inventor: 毛俊傑, 陳良其, 黃錦軒
Original Assignee: 宏碁股份有限公司
Application filed by 宏碁股份有限公司
Priority to TW110119671A
Publication of TW202249000A
Application granted
Publication of TWI790630B

Abstract

A method and a device for automatically generating notes are provided. The method includes the following steps. While a lecturer gives an explanation through a slideshow, video and audio are recorded. A graphics-to-text process is performed on the video to obtain slide text data. A graphic capture process is performed on the video to obtain slide graphic data. A speech-to-text process is performed on the audio to obtain speech text data. The speech text data is classified into several paragraphs, from which several summary texts are extracted. Several keywords are extracted from the slide text data and the speech text data. The slide text data, the slide graphic data, the speech text data, the summary texts, and the keywords are then integrated.

Description

Method and device for automatically generating notes

The present disclosure relates to an information-processing method and device, and in particular to a method and device for automatically generating notes.

Attending a training session or listening to a presentation yields a considerable amount of information. However, mechanically copying down notes not only interrupts one's train of thought; the resulting notes are also difficult to organize after class.

Some lecturers' slides closely mirror the lecture content, while others put only a few key points on their slides. A plain video or audio recording is hard to organize and does not clearly convey the main points of the lecture.

To make learning more efficient, researchers are developing a technology that automatically generates notes, so that listeners can pay closer attention during class and quickly obtain organized key points afterwards.

The present disclosure relates to a method and device for automatically generating notes, which convert and analyze recorded video data and audio data to obtain summary texts and keywords. In this way, listeners can pay closer attention during class and quickly obtain organized key points afterwards, greatly improving learning efficiency.

According to one aspect of the present disclosure, a method for automatically generating notes is provided. The method includes the following steps. While a lecturer gives an explanation through a slideshow, video and audio are recorded to obtain video data and audio data. A graphics-to-text process is performed on the video data to obtain slide text data. A graphic capture process is performed on the video data to obtain slide graphic data. A speech-to-text process is performed on the audio data to obtain speech text data. The speech text data is classified into several paragraphs. Several summary texts are extracted from the paragraphs. Several keywords are extracted from the slide text data and the speech text data. The slide text data, the slide graphic data, the speech text data, the summary texts, and the keywords are then integrated.

According to another aspect of the present disclosure, a device for automatically generating notes is provided. The device includes a video recording device, an audio recording device, a graphics-to-text unit, a graphic capture unit, a speech-to-text unit, a speech text classification unit, a summary unit, a keyword analysis unit, and an integration unit. The video recording device records video while a lecturer gives an explanation through a slideshow, to obtain video data. The audio recording device records audio while the lecturer gives the explanation, to obtain audio data. The graphics-to-text unit performs a graphics-to-text process on the video data to obtain slide text data. The graphic capture unit performs a graphic capture process on the video data to obtain slide graphic data. The speech-to-text unit performs a speech-to-text process on the audio data to obtain speech text data. The speech text classification unit classifies the speech text data into several paragraphs. The summary unit extracts several summary texts from the paragraphs. The keyword analysis unit extracts several keywords from the slide text data and the speech text data. The integration unit integrates the slide text data, the slide graphic data, the speech text data, the summary texts, and the keywords.

For a better understanding of the above and other aspects of the present disclosure, embodiments are described in detail below with reference to the accompanying drawings:

100: device for automatically generating notes
110: video recording device
120: audio recording device
130: graphics-to-text unit
140: graphic capture unit
150: speech-to-text unit
160: speech text classification unit
170: summary unit
180: keyword analysis unit
190: integration unit
DW: slide graphic data
KW: keyword
PG: page
S101, S102, S103, S104, S105, S106, S107, S108, S109, S110: steps
SG: paragraph
SM: summary text
SP: speech text data
TI: time information
TX: slide text data
TX1: printed text
TX2: handwritten text
VC: audio data
VD: video data

FIG. 1 shows a schematic diagram of a slide page and a lecturer's explanation according to an embodiment.

FIG. 2 shows a block diagram of a device for automatically generating notes according to an embodiment.

FIG. 3 shows a flowchart of a method for automatically generating notes according to an embodiment.

FIGS. 4 to 6 illustrate the steps of FIG. 3.

Please refer to FIG. 1, which shows a schematic diagram of a slide page and a lecturer's explanation according to an embodiment. While lecturing, the lecturer explains a page PG of the slideshow. The page PG carries slide text data TX and slide graphic data DW; the slide text data TX includes printed text TX1 and handwritten text TX2. As the lecturer speaks, speech text data SP is produced. All of these are contents that a set of notes needs to capture. In this embodiment, these data are automatically extracted and integrated, and summary texts SM and keywords KW are further derived from them.

Please refer to FIG. 2, which shows a block diagram of a device 100 for automatically generating notes according to an embodiment. The device 100 includes a video recording device 110, an audio recording device 120, a graphics-to-text unit 130, a graphic capture unit 140, a speech-to-text unit 150, a speech text classification unit 160, a summary unit 170, a keyword analysis unit 180, and an integration unit 190. The video recording device 110 performs the video recording procedure and the audio recording device 120 performs the audio recording procedure. The two are, for example, integrated in the same electronic device and operate synchronously, such as in a smartphone, a camera, a tablet computer, or a notebook computer. The graphics-to-text unit 130, the graphic capture unit 140, the speech-to-text unit 150, the speech text classification unit 160, the summary unit 170, the keyword analysis unit 180, and the integration unit 190 perform various specific processing procedures, and each may be, for example, a chip, a circuit, a circuit board, program code, or a storage device storing program code. In this embodiment, the video data VD obtained by the video recording device 110 and the audio data VC obtained by the audio recording device 120 are converted and analyzed by the graphics-to-text unit 130, the graphic capture unit 140, the speech-to-text unit 150, the speech text classification unit 160, the summary unit 170, and the keyword analysis unit 180 to obtain the summary texts SM and the keywords KW. The operation of each component is described in detail below with reference to a flowchart.

Please refer to FIGS. 3 to 6. FIG. 3 shows a flowchart of a method for automatically generating notes according to an embodiment, and FIGS. 4 to 6 illustrate its steps. In step S101, the video recording device 110 and the audio recording device 120 record video and audio while a lecturer gives an explanation through a slideshow, to obtain the video data VD and the audio data VC. During recording, both the video data VD and the audio data VC are synchronously marked with time information TI. The time information TI is, for example, an absolute time or a relative time. An absolute time is, for example, Greenwich Mean Time. A relative time is, for example, the time elapsed since a common reference point at which the video recording device 110 and the audio recording device 120 were both started.
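The shared-clock idea behind the relative-time variant of TI can be pictured with a short sketch. The Python fragment below is a minimal illustration under assumed names, not the patented implementation: both capture callbacks stamp their data with seconds elapsed since one common reference point, making video frames and audio chunks comparable.

```python
import time

# Shared reference point: both recorders count from here, so every video
# frame and audio chunk carries a comparable relative timestamp TI.
REFERENCE_T0 = time.monotonic()

def relative_timestamp() -> float:
    """Seconds elapsed since the common reference point."""
    return time.monotonic() - REFERENCE_T0

video_frames = []  # (TI, frame) pairs from the video recording device 110
audio_chunks = []  # (TI, chunk) pairs from the audio recording device 120

def on_video_frame(frame):
    video_frames.append((relative_timestamp(), frame))

def on_audio_chunk(chunk):
    audio_chunks.append((relative_timestamp(), chunk))
```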

Next, in step S102, the graphics-to-text unit 130 performs a graphics-to-text process on the video data VD to obtain the slide text data TX. For example, the graphics-to-text unit 130 detects that several consecutive frames show the same page PG, some of which may be partially occluded by the speaker. From these frames it selects, or composites, a complete page PG, and then extracts the slide text data TX from that complete page.

In this step, the graphics-to-text unit 130 also marks the slide text data TX with time information TI, for example by tagging each sentence with the time interval during which its page PG appears.

Based on the time information TI, the graphics-to-text unit 130 can group the slide text data TX by page. For example, as shown in FIG. 4, each piece of slide text data TX is assigned to the page PG on which it appears.

In this step, the graphics-to-text unit 130 further classifies the slide text data TX into printed text TX1 (shown in FIG. 1) and handwritten text TX2 (shown in FIG. 1). It may distinguish them by the regularity of the characters and the variability of the strokes; alternatively, it may first identify the font and classify accordingly, or use a machine learning algorithm to tell printed text TX1 and handwritten text TX2 apart.
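A minimal sketch of how step S102 might be realized is shown below, assuming OpenCV for frame handling and the Tesseract engine (via pytesseract) for OCR; the patent names neither library. Runs of near-identical frames are treated as one page PG, one representative frame per page is kept, and that frame is OCR'd into slide text tagged with a timestamp. A production system would, as the description notes, composite several frames to remove the speaker's occlusion and further separate printed from handwritten text; the diff threshold here is an illustrative assumption.

```python
import cv2          # OpenCV, an assumed library choice
import pytesseract  # Tesseract OCR wrapper, also an assumption

def extract_slide_text(video_path: str, diff_threshold: float = 2.0):
    """Pick one representative frame per page PG and OCR it (step S102)."""
    cap = cv2.VideoCapture(video_path)
    pages, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # A new page starts when the frame differs noticeably from the last.
        if prev is None or cv2.absdiff(gray, prev).mean() > diff_threshold:
            t = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0  # time info TI
            pages.append((t, frame))
        prev = gray
    cap.release()
    # OCR each page frame; "chi_tra" covers Traditional Chinese.
    return [(t, pytesseract.image_to_string(f, lang="chi_tra"))
            for t, f in pages]
```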

Then, in step S103, the graphic capture unit 140 performs a graphic capture process on the video data VD to obtain the slide graphic data DW. In this step, for example, the graphic capture unit 140 detects that several consecutive frames show the same page PG, some of which may be partially occluded by the speaker. From these frames a complete page PG is selected or composited, and the slide graphic data DW is then extracted from that complete page.

The graphic capture unit 140 may perform edge detection on the page PG to identify several objects, and then extract the slide graphic data DW from those objects.

In this step, the graphic capture unit 140 also marks the slide graphic data DW with the time information TI, for example by tagging each object with the time interval during which its page PG appears.

Based on the time information TI, the graphic capture unit 140 can group the slide graphic data DW by page. For example, as shown in FIG. 4, each piece of slide graphic data DW is assigned to the page PG on which it appears.
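The edge-detection idea in step S103 can be sketched as follows, again assuming OpenCV; the Canny thresholds and minimum object area are illustrative assumptions rather than values from the patent.

```python
import cv2  # OpenCV, an assumed library choice

def extract_slide_graphics(page_frame, min_area: int = 5000):
    """Crop sizable objects from a page PG as slide graphic data DW (S103)."""
    gray = cv2.cvtColor(page_frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # edge detection on the page
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    graphics = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h >= min_area:  # keep only sizable objects
            graphics.append(page_frame[y:y + h, x:x + w])
    return graphics
```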

Next, in step S104, the speech-to-text unit 150 performs a speech-to-text process on the audio data VC to obtain the speech text data SP. For example, the speech-to-text unit 150 analyzes the audio data VC character by character according to a preset language type to obtain individual characters, combines these characters into words, and finally segments the continuous words into sentences to obtain the speech text data SP.

In this step, the speech-to-text unit 150 also marks the speech text data SP with the time information TI, for example by tagging each sentence with the time interval during which its page PG appears.

Based on the time information TI, the speech-to-text unit 150 can group the speech text data SP by page. For example, as shown in FIG. 4, each sentence of the speech text data SP is assigned to the page PG during which it was spoken.
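Step S104 can be approximated with any off-the-shelf recognizer; the patent does not name an engine, so the Whisper model in the sketch below is purely illustrative. Conveniently, its segments carry start and end times, which play the role of the time information TI used to group sentences under pages.

```python
import whisper  # openai-whisper; an illustrative engine choice

def transcribe_audio(audio_path: str, language: str = "zh"):
    """Speech-to-text for the audio data VC (step S104)."""
    model = whisper.load_model("base")
    result = model.transcribe(audio_path, language=language)
    # Each segment's start/end times serve as the time information TI that
    # later groups sentences under the page PG shown while they were spoken.
    return [(seg["start"], seg["end"], seg["text"])
            for seg in result["segments"]]
```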

Then, in step S105, as shown in FIG. 4, the speech text classification unit 160 classifies the speech text data SP into a plurality of paragraphs SG. In this step, the speech text classification unit 160 classifies the speech text data SP into the paragraphs SG by means of a k-nearest-neighbors (KNN) algorithm.

Next, in step S106, as shown in FIG. 5, the speech text classification unit 160 adjusts, based on the slide text data TX, the adjacent sentences of the speech text data SP that correspond to two adjacent pages PG. For example, the speech text classification unit 160 may use the KNN algorithm to determine whether two adjacent sentences belonging to two adjacent pages PG in fact belong to the same paragraph SG; if so, both sentences are assigned to the same paragraph SG of the same page PG.
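Steps S105 and S106 can be sketched with a standard KNN classifier. One plausible reading, and it is an assumption rather than the patent's exact procedure, is to use the per-page slide text as labelled neighbors so that the slide content decides which page a boundary sentence belongs to. TF-IDF is used for vectorization here; for Chinese input the sentences would first need word segmentation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

def assign_sentences_to_pages(slide_sentences, slide_page_labels,
                              speech_sentences, k: int = 3):
    # Vectorize with TF-IDF; inputs are assumed to be pre-tokenized,
    # whitespace-separated strings (necessary for Chinese text).
    vectorizer = TfidfVectorizer()
    X_slides = vectorizer.fit_transform(slide_sentences)

    # Each slide sentence is a labelled neighbor: its label is the page PG
    # (or paragraph SG) it belongs to.
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_slides, slide_page_labels)

    # Sentences from the speech text SP, including boundary sentences between
    # adjacent pages, go to whichever page their nearest slide sentences
    # belong to (steps S105/S106).
    X_speech = vectorizer.transform(speech_sentences)
    return knn.predict(X_speech)
```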

Next, in step S107, the summary unit 170 extracts several summary texts SM from the paragraphs SG. In this step, the summary unit 170 may filter out filler words to obtain the summary texts SM.
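Step S107 reduces each paragraph SG to a summary text SM by dropping filler words. A minimal sketch follows; the filler list is an illustrative placeholder, and a real system would rely on proper tokenization and a curated stopword list.

```python
# Illustrative filler words (Mandarin discourse particles); placeholder only.
FILLERS = {"嗯", "呃", "那個", "就是說", "基本上", "然後"}

def summarize_paragraph(sentences):
    """Filter filler words out of one paragraph SG to form its summary SM."""
    kept = []
    for sentence in sentences:
        for filler in FILLERS:
            sentence = sentence.replace(filler, "")
        sentence = sentence.strip()
        if sentence:
            kept.append(sentence)
    return "".join(kept)  # Chinese text joins without spaces
```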

Then, in step S108, as shown in FIG. 6, the keyword analysis unit 180 extracts several keywords KW from the slide text data TX and the speech text data SP. In this step, the keyword analysis unit 180 compares the slide text data TX with the speech text data SP to find the words that appear in both, and takes these words as the keywords KW.

Next, in step S109, as shown in FIG. 6, the keyword analysis unit 180 sorts the keywords KW by frequency of occurrence: the more frequently a keyword KW appears, the earlier it is ranked. The keyword analysis unit 180 may perform this ranking once per page PG, so that users can quickly grasp the key content of each page.
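Steps S108 and S109 amount to intersecting the vocabularies of TX and SP and ranking by frequency. The sketch below assumes the jieba tokenizer for Chinese word segmentation; the patent itself does not prescribe a tokenizer, and the single-character filter is an illustrative heuristic.

```python
from collections import Counter

import jieba  # common Chinese tokenizer; its use here is an assumption

def extract_keywords(slide_text: str, speech_text: str, top_n: int = 10):
    # Keywords KW are words appearing in BOTH the slide text TX and the
    # speech text SP (step S108)...
    slide_words = set(jieba.lcut(slide_text))
    speech_counts = Counter(jieba.lcut(speech_text))
    shared = {w: c for w, c in speech_counts.items()
              if w in slide_words and len(w) > 1}  # skip single characters
    # ...ranked by frequency of occurrence (step S109).
    return sorted(shared, key=shared.get, reverse=True)[:top_n]
```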

Then, in step S110, as shown in FIG. 6, the integration unit 190 integrates the slide text data TX, the slide graphic data DW, the speech text data SP, the summary texts SM, and the keywords KW. FIG. 6 presents one way of integrating them, but the present disclosure is not limited to this example.
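Finally, step S110 gathers everything per page. The dictionary layout below is only one possible arrangement, mirroring the caveat that FIG. 6 is an example rather than the only integration format; all field names are illustrative.

```python
def build_notes(pages):
    """Integrate per-page materials into one note structure (step S110)."""
    notes = []
    for page in pages:
        notes.append({
            "page": page["number"],
            "slide_text": page["slide_text"],        # TX (TX1 + TX2)
            "graphics": page["graphics"],            # DW, cropped images
            "transcript": page["speech_sentences"],  # SP, with time info TI
            "summaries": page["summaries"],          # SM, one per paragraph SG
            "keywords": page["keywords"],            # KW, frequency-ranked
        })
    return notes
```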

According to the above embodiments, in this note-generating technique the video data VD and the audio data VC are converted and analyzed to obtain the summary texts SM and the keywords KW. In this way, listeners can pay closer attention during class and quickly obtain organized key points afterwards, greatly improving learning efficiency.

In summary, although the present disclosure has been described above by way of embodiments, they are not intended to limit it. Those with ordinary skill in the art to which this disclosure belongs may make various modifications and refinements without departing from its spirit and scope. The scope of protection of the present disclosure is therefore defined by the appended claims.


Claims (10)

1. A method for automatically generating notes, executed by a computer, the method comprising: recording video and audio while a lecturer gives an explanation through a slideshow, to obtain video data and audio data; performing a graphics-to-text process on the video data to obtain slide text data; performing a graphic capture process on the video data to obtain slide graphic data; performing a speech-to-text process on the audio data to obtain speech text data; classifying the speech text data into a plurality of paragraphs by means of a k-nearest-neighbors (KNN) algorithm; filtering filler words out of the paragraphs to respectively extract a plurality of summary texts; extracting a plurality of keywords from the slide text data and the speech text data, and sorting the keywords by their frequency of occurrence; and integrating the slide text data, the slide graphic data, the speech text data, the summary texts, and the keywords.

2. The method for automatically generating notes according to claim 1, wherein the slideshow comprises a plurality of pages; in the step of obtaining the slide text data, the slide text data is grouped by page; in the step of obtaining the slide graphic data, the slide graphic data is grouped by page; and in the step of obtaining the speech text data, the speech text data is grouped by page.

3. The method for automatically generating notes according to claim 2, further comprising: adjusting, based on the slide text data, the adjacent sentences of the speech text data that correspond to two adjacent pages.

4. The method for automatically generating notes according to claim 1, wherein in the step of obtaining the slide text data, the slide text data is classified into printed text and handwritten text.

5. The method for automatically generating notes according to claim 1, wherein the step of obtaining the slide text data further marks the slide text data with time information; the step of obtaining the slide graphic data further marks the slide graphic data with the time information; and the step of obtaining the speech text data further marks the speech text data with the time information.
6. A device for automatically generating notes, comprising: a video recording device for recording video while a lecturer gives an explanation through a slideshow, to obtain video data; an audio recording device for recording audio while the lecturer gives the explanation through the slideshow, to obtain audio data; a graphics-to-text unit for performing a graphics-to-text process on the video data to obtain slide text data; a graphic capture unit for performing a graphic capture process on the video data to obtain slide graphic data; a speech-to-text unit for performing a speech-to-text process on the audio data to obtain speech text data; a speech text classification unit for classifying the speech text data into a plurality of paragraphs by means of a k-nearest-neighbors (KNN) algorithm; a summary unit for filtering filler words out of the paragraphs to respectively extract a plurality of summary texts; a keyword analysis unit for extracting a plurality of keywords from the slide text data and the speech text data and sorting the keywords by their frequency of occurrence; and an integration unit for integrating the slide text data, the slide graphic data, the speech text data, the summary texts, and the keywords.

7. The device for automatically generating notes according to claim 6, wherein the slideshow comprises a plurality of pages; the graphics-to-text unit groups the slide text data by page; the graphic capture unit groups the slide graphic data by page; and the speech-to-text unit groups the speech text data by page.

8. The device for automatically generating notes according to claim 7, wherein the speech text classification unit adjusts, based on the slide text data, the adjacent sentences of the speech text data that correspond to two adjacent pages.

9. The device for automatically generating notes according to claim 6, wherein the graphics-to-text unit classifies the slide text data into printed text and handwritten text.

10. The device for automatically generating notes according to claim 6, wherein the graphics-to-text unit continuously marks the slide text data with time information; the graphic capture unit continuously marks the slide graphic data with the time information; and the speech-to-text unit continuously marks the speech text data with the time information.
TW110119671A 2021-05-31 2021-05-31 Method and device for automatically generating notes TWI790630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110119671A TWI790630B (en) 2021-05-31 2021-05-31 Method and device for automatically generating notes


Publications (2)

Publication Number Publication Date
TW202249000A TW202249000A (en) 2022-12-16
TWI790630B true TWI790630B (en) 2023-01-21

Family

ID=85793600


Country Status (1)

Country Link
TW (1) TWI790630B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070033528A1 (en) * 1998-05-07 2007-02-08 Astute Technology, Llc Enhanced capture, management and distribution of live presentations
US20130127980A1 (en) * 2010-02-28 2013-05-23 Osterhout Group, Inc. Video display modification based on sensor input for a see-through near-to-eye display
TW201331787A (en) * 2011-12-07 2013-08-01 Microsoft Corp Displaying virtual data as printed content
CN105578115A (en) * 2015-12-22 2016-05-11 深圳市鹰硕音频科技有限公司 Network teaching method and system with voice assessment function

Also Published As

Publication number Publication date
TW202249000A (en) 2022-12-16
