CN112241462B - Knowledge point mark generation system and method thereof - Google Patents

Publication number
CN112241462B
Authority
CN
China
Prior art keywords
vocabulary
knowledge point
word
continuously
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910646422.9A
Other languages
Chinese (zh)
Other versions
CN112241462A (en)
Inventor
郑旭成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wisdom Garden Hong Kong Ltd
Original Assignee
Wisdom Garden Hong Kong Ltd
Application filed by Wisdom Garden Hong Kong Ltd
Priority to CN201910646422.9A
Publication of CN112241462A
Application granted
Publication of CN112241462B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/432 Query formulation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

A knowledge point mark generation system and method thereof. A tag word is obtained by analyzing, according to their corresponding weights, at least one first keyword emphasized in text form in a class, at least one second keyword emphasized in spoken form in the class, at least one first candidate word that appears repeatedly in text form in the class, and at least one second candidate word that appears repeatedly in spoken form in the class. Knowledge point marks are then set on the time axis of the video file recorded in the class at the time segments in which the tag word appears, forming a video file with knowledge point marks. A learner can therefore find the knowledge points of the class and the corresponding segments without watching the video file of the whole class, which makes focused study or review easier.

Description

Knowledge point mark generation system and method thereof
Technical Field
The invention relates to a mark generation system and a method thereof, in particular to a knowledge point mark generation system and a method thereof.
Background
With advances in technology and the spread of networks, learners can study or review after a class has ended by watching the video file recorded during teaching.
At present, a learner who wants to study can only search by the title of a video file. The title provides limited information, however, so the learner may have to watch the whole video to find out whether it meets the learning need, which is time-consuming.
In addition, a learner who wants to review usually does not know when the key points of the class (i.e. the knowledge points) are played in the video file, and must therefore repeatedly drag the playback progress pointer on the time axis or fast-forward the video to find the segment to watch, which is clearly inconvenient.
In summary, in the prior art the learner has to watch the video file of the whole class to find out whether it meets the learning need, and searching for the segment containing a knowledge point by dragging the playback progress pointer or fast-forwarding is inconvenient. An improved technical means is therefore needed to solve these problems.
Disclosure of Invention
The invention discloses a knowledge point mark generation system and a knowledge point mark generation method.
First, the invention discloses a knowledge point mark generation system, which includes a capturing device, a voice recognition device, a processing device and an integrating device. The capturing device continuously captures and analyzes the computer screen image, the projection image and/or the blackboard-writing image in the class to continuously obtain text, and captures at least one first keyword in the text based on the fonts and/or font colors in the computer screen image, the projection image and/or the blackboard-writing image and on the selected text. The voice recognition device continuously receives the sound signal in the class, continuously converts the sound signal into a text string by speech-to-text conversion, determines the identity of the person who emitted the sound signal by voiceprint recognition or sound source recognition, and captures at least one second keyword in the text string based on the identity of the person who emitted the sound signal and/or a plurality of preset words. After the class ends, the processing device statistically analyzes the text continuously obtained by the capturing device to obtain at least one first candidate word, statistically analyzes the text string continuously obtained by the voice recognition device to obtain at least one second candidate word, and performs an analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate word and the at least one second candidate word according to their corresponding weights to obtain a tag word.
After the class ends, the integrating device obtains the time segment of each sentence containing the tag word in the text string continuously obtained by the voice recognition device, merges adjacent time segments into a merged time interval when the time difference between them is smaller than a specific time length, and then sets a plurality of knowledge point marks, corresponding to the unmerged time segments and the merged time intervals, on the time axis of the video file recorded in the class to form a video file with knowledge point marks.
In addition, the invention discloses a knowledge point mark generation method, which includes the following steps: providing a knowledge point mark generation system that includes a capturing device, a voice recognition device, a processing device and an integrating device; the capturing device continuously captures and analyzes the computer screen image, the projection image and/or the blackboard-writing image in the class to continuously obtain text; the capturing device captures at least one first keyword in the text based on the fonts and/or font colors in the computer screen image, the projection image and/or the blackboard-writing image and on the selected text; the voice recognition device continuously receives the sound signal in the class and continuously converts the sound signal into a text string by speech-to-text conversion; the voice recognition device determines the identity of the person who emitted the sound signal by voiceprint recognition or sound source recognition; the voice recognition device captures at least one second keyword in the text string based on the identity of the person who emitted the sound signal and/or a plurality of preset words; after the class ends, the processing device statistically analyzes the text continuously obtained by the capturing device to obtain at least one first candidate word; after the class ends, the processing device statistically analyzes the text string continuously obtained by the voice recognition device to obtain at least one second candidate word; the processing device performs an analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate word and the at least one second candidate word according to their corresponding weights to obtain a tag word; and after the class ends, the integrating device obtains the time segment of each sentence containing the tag word in the text string continuously obtained by the voice recognition device, merges adjacent time segments into a merged time interval when the time difference between them is smaller than a specific time length, and then sets a plurality of knowledge point marks, corresponding to the unmerged time segments and the merged time intervals, on the time axis of the video file recorded in the class to form a video file with knowledge point marks.
The system and method disclosed by the invention differ from the prior art in that the invention obtains at least one tag word by analyzing, according to their corresponding weights, at least one first keyword emphasized in text form in the class, at least one second keyword emphasized in spoken form in the class, at least one first candidate word that appears repeatedly in text form in the class and at least one second candidate word that appears repeatedly in spoken form in the class, and sets knowledge point marks on the time axis of the video file recorded in the class at the time segments and merged time intervals in which the tag word appears, to form a video file with knowledge point marks.
By these technical means, the invention lets a learner find the knowledge points of the class and the corresponding segments without watching the video file of the whole class, which makes focused study or review easier.
Drawings
FIG. 1 is a system block diagram of a knowledge point tag generation system in accordance with an embodiment of the invention.
FIGS. 2A and 2B are flowcharts of an embodiment of the knowledge point mark generation method performed by the knowledge point mark generation system of FIG. 1.
[ List of reference numerals ]
50. Live broadcast module
60. Marking module
70. Transmission module
100. Knowledge point mark generation system
110. Capturing device
112. Photographing module
114. Analysis module
120. Speech recognition device
122. Microphone module
124. Conversion module
126. Voiceprint recognition module
130. Processing device
140. Integration device
150. User terminal
160. Behavior detection device
162. Photographing module
164. Analysis module
Step 210: provide a knowledge point mark generation system that includes a capturing device, a voice recognition device, a processing device and an integrating device
Step 220: the capturing device continuously captures and analyzes the computer screen image, the projection image and/or the blackboard-writing image in the class to continuously obtain text
Step 230: the capturing device captures at least one first keyword in the text based on the fonts and/or font colors in the computer screen image, the projection image and/or the blackboard-writing image and on the selected text
Step 240: the voice recognition device continuously receives the sound signal in the class and continuously converts the sound signal into a text string by speech-to-text conversion
Step 250: the voice recognition device determines the identity of the person who emitted the sound signal by voiceprint recognition or sound source recognition
Step 260: the voice recognition device captures at least one second keyword in the text string based on the identity of the person who emitted the sound signal and/or a plurality of preset words
Step 270: after the class ends, the processing device statistically analyzes the text continuously obtained by the capturing device to obtain at least one first candidate word
Step 280: after the class ends, the processing device statistically analyzes the text string continuously obtained by the voice recognition device to obtain at least one second candidate word
Step 290: the processing device performs an analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate word and the at least one second candidate word according to their corresponding weights to obtain a tag word
Step 300: after the class ends, the integrating device obtains the time segment of each sentence containing the tag word in the text string continuously obtained by the voice recognition device, merges adjacent time segments into a merged time interval when the time difference between them is smaller than a specific time length, and then sets a plurality of knowledge point marks, corresponding to the unmerged time segments and the merged time intervals, on the time axis of the video file recorded in the class to form a video file with knowledge point marks
Detailed Description
Embodiments of the invention are described in detail below with reference to the accompanying drawings and examples, to show how the technical means are applied to solve the technical problems and achieve the technical effects.
Before the knowledge point mark generation system and method are described, the term "knowledge point" is explained first: a knowledge point is the basic unit of information conveyed in a course, so knowledge points play an important role in learning and navigating a course. The invention analyzes the behaviors and events that occur in a class according to their corresponding weights to obtain the knowledge points of the class, so that a learner who studies or reviews with the video file recorded during teaching can find the knowledge points of the class and the corresponding segments without watching the video file of the whole class. In addition, the capturing device, the voice recognition device and the behavior detection device can start operating synchronously when each class begins and stop operating synchronously when each class ends.
Refer to FIG. 1, FIG. 2A and FIG. 2B. FIG. 1 is a system block diagram of an embodiment of the knowledge point mark generation system of the invention, and FIGS. 2A and 2B are flowcharts of an embodiment of the knowledge point mark generation method performed by the knowledge point mark generation system of FIG. 1. In this embodiment, the knowledge point mark generation system 100 includes the capturing device 110, the voice recognition device 120, the processing device 130 and the integrating device 140 (step 210). The capturing device 110 is connected to the processing device 130, the voice recognition device 120 is connected to the processing device 130, and the processing device 130 is connected to the integrating device 140.
The capturing device 110, the voice recognition device 120, the processing device 130 and the integrating device 140 may be implemented in various ways, including software, hardware, firmware or any combination thereof. The techniques presented in the embodiments may be stored as software or firmware on a machine-readable storage medium, for example read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media or flash memory devices, and may be executed by one or more general-purpose or special-purpose programmable microprocessors. The capturing device 110 and the processing device 130, the voice recognition device 120 and the processing device 130, and the processing device 130 and the integrating device 140 may be connected wirelessly or by wire to transmit signals and data.
The capturing device 110 continuously captures and analyzes the computer screen image, the projection image and/or the blackboard-writing image in the class to continuously obtain text (step 220). In more detail, the capturing device 110 may include a photographing module 112 and a parsing module 114, with the photographing module 112 connected to the parsing module 114. The photographing module 112 may continuously photograph the podium area in each class, which may include the projection screen and/or the classroom blackboard or whiteboard, so as to capture the projection image and/or the blackboard-writing image; this embodiment does not limit the invention and may be adjusted according to actual requirements. For example, in a computer class, the photographing module 112 may continuously photograph the computer screen operated by the teacher to capture the computer screen image. Note that the content continuously photographed by the photographing module 112 includes the text-bearing teaching materials that the teacher provides during teaching, for example handouts, slides, and writing on a blackboard or whiteboard. The parsing module 114 continuously receives and analyzes the computer screen images, projection images and/or blackboard-writing images captured by the photographing module 112, and extracts the characters in each image by optical character recognition (OCR) to generate the corresponding text (i.e. image-to-text conversion).
The capturing device 110 captures at least one first keyword in the text based on the fonts and/or font colors in the computer screen image, the projection image and/or the blackboard-writing image and on the selected text (step 230). In more detail, the text in the teaching materials that the teacher provides during class may use different fonts and/or font colors to emphasize certain knowledge points (i.e. key points), and learners can recognize the knowledge points that the teacher wants to convey from the text with special fonts and/or font colors. The capturing device 110 can therefore capture at least one first keyword (i.e. a possible knowledge point) in the text based on the fonts and/or font colors in the computer screen image, the projection image and/or the blackboard-writing image. Here the font attributes may include, but are not limited to, font size, font weight, typeface, whether the text is italic, whether it is underlined and whether it has a text effect, and each first keyword is a word composed of adjacent characters with a special font and/or font color. In addition, in this embodiment, text that is selected in the computer screen image, the projection image and/or the blackboard-writing image is a knowledge point (i.e. key point) that the teacher wants to emphasize during class, so the capturing device 110 can also capture at least one first keyword (i.e. a possible knowledge point) in the text based on the selected text, where each first keyword is a word composed of the selected characters. Note that the weights assigned to first keywords captured in different ways (e.g. special font, special font color and selected text) in the analysis procedure later performed by the processing device 130 may be the same or different, and may be adjusted according to actual requirements.
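As a minimal sketch of step 230, the following Python code groups adjacent emphasized tokens into first keywords. It assumes that an upstream OCR stage already yields tokens carrying style flags; the Token class, its fields and the function names are hypothetical and are not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class Token:
    """One OCR token with style attributes (hypothetical structure)."""
    text: str
    bold: bool = False
    color: str = "black"
    selected: bool = False

def is_emphasized(tok: Token, base_color: str = "black") -> bool:
    # A token counts as emphasized if it is bold, selected, or not in the base color.
    return tok.bold or tok.selected or tok.color != base_color

def first_keywords(tokens):
    """Join each run of adjacent emphasized tokens into one keyword."""
    keywords, run = [], []
    for tok in tokens:
        if is_emphasized(tok):
            run.append(tok.text)
        elif run:
            keywords.append(" ".join(run))
            run = []
    if run:
        keywords.append(" ".join(run))
    return keywords

tokens = [Token("The"), Token("Fourier", bold=True), Token("transform", bold=True),
          Token("maps"), Token("signals", color="red")]
print(first_keywords(tokens))  # ['Fourier transform', 'signals']
```

Other font attributes named in the text, such as size, italics or underlining, could be added to is_emphasized in the same way.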
The voice recognition device 120 continuously receives the sound signal in the class and continuously converts the sound signal into a text string by speech-to-text conversion (step 240). In more detail, the voice recognition device 120 may include a microphone module 122 and a conversion module 124. The microphone module 122 continuously receives the sounds (i.e. sound signals) made by the teacher and the learners in the class, and the conversion module 124 converts the sound signals continuously received by the microphone module 122 into text strings by speech-to-text conversion. The microphone module 122 may include a plurality of microphone units (not shown) placed at various locations in the classroom so that the sounds (i.e. sound signals) made by the teacher and the learners are fully received; the number and placement of the microphone units may be adjusted according to actual requirements.
The voice recognition device 120 determines the identity of the person who emitted the sound signal by voiceprint recognition or sound source recognition (step 250). In more detail, the voice recognition device 120 further includes a voiceprint recognition module 126, which recognizes whether a sound signal received by the microphone module 122 was emitted by the teacher or by a learner, and thereby determines whether the text string converted by the conversion module 124 was spoken by the teacher or by a learner. In addition, in this embodiment, the teacher is usually near the podium (i.e. at the front of the classroom) while the learners are usually in the middle or at the back of the classroom, so the microphone module 122 can also determine the identity of the person who emitted the sound signal from the position of the sound source. Specifically, because the microphone module 122 may include a plurality of microphone units placed at various locations in the classroom, it can determine the position from which a sound signal was emitted from the differences between the times at which the microphone units receive the same sound signal and from the relative placement of the microphone units, determine from that position whether the sound signal was emitted by the teacher or by a learner, and thereby determine whether the text string converted by the conversion module 124 was spoken by the teacher or by a learner.
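The sound source approach can be sketched with a deliberately simplified one-dimensional model. The sketch assumes only two microphone units, one at the front and one at the back of the room, with the source between them; the room length, podium depth and function names are hypothetical values chosen for illustration, and a real installation would use more microphones and a full localization method.

```python
SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature

def source_position(t_front: float, t_back: float, room_length: float) -> float:
    """Estimate the distance of the source from the front microphone (1-D model).

    Microphones sit at x = 0 (front) and x = room_length (back) and the source
    lies between them, so t_front - t_back = (2x - room_length) / c.
    """
    dt = t_front - t_back
    x = (room_length + SPEED_OF_SOUND * dt) / 2.0
    return min(max(x, 0.0), room_length)  # clamp to the room

def speaker_identity(t_front: float, t_back: float,
                     room_length: float = 10.0, podium_depth: float = 2.0) -> str:
    """Label the source as teacher if it lies within the podium area (assumption)."""
    x = source_position(t_front, t_back, room_length)
    return "teacher" if x <= podium_depth else "learner"

# A source 1 m from the front reaches the front microphone first.
print(speaker_identity(1 / 343.0, 9 / 343.0))  # teacher
print(speaker_identity(7 / 343.0, 3 / 343.0))  # learner
```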
The voice recognition device 120 captures at least one second keyword in the text string based on the identity of the person who emitted the sound signal and/or a plurality of preset words (step 260). In more detail, the text strings corresponding to sound signals emitted by the teacher, and/or the text strings corresponding to sound signals that contain preset words (e.g. "especially", "key point", "memorize", "exam point", etc.), are likely to include knowledge points of the class. The voice recognition device 120 can therefore capture at least one second keyword (i.e. a possible knowledge point) from these text strings. The second keyword may be extracted by semantic analysis, but this embodiment does not limit the invention. In another embodiment, whether a text string corresponds to a sound signal that the teacher emitted at a higher volume during teaching can also be used as one of the parameters for capturing the second keyword.
Note that the weights assigned, in the analysis procedure later performed by the processing device 130, to second keywords captured from text strings corresponding to sound signals emitted by the teacher and to second keywords captured from text strings corresponding to sound signals containing the preset words (e.g. "especially", "key point", "memorize", "exam point", etc.) may be the same or different, and may be adjusted according to actual requirements.
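A rough sketch of step 260 follows. It assumes that each utterance is already labeled with its speaker, and it requires both conditions (spoken by the teacher and containing a cue word), although the text allows either. Simple stop-word filtering stands in for the semantic analysis mentioned above; the cue-word list, stop-word list and function name are hypothetical.

```python
# Hypothetical preset vocabulary and stop words, chosen only for illustration.
CUE_WORDS = {"especially", "key", "memorize", "exam"}
STOP_WORDS = {"the", "is", "a", "an", "this", "will", "be",
              "on", "of", "to", "and", "point"}

def second_keywords(utterances):
    """utterances: list of (speaker, sentence) pairs.

    Returns the content words of teacher sentences that contain a cue word,
    in order of first appearance and without duplicates.
    """
    found = []
    for speaker, sentence in utterances:
        words = [w.strip(".,!?").lower() for w in sentence.split()]
        if speaker != "teacher" or not CUE_WORDS.intersection(words):
            continue
        for w in words:
            if w and w not in CUE_WORDS and w not in STOP_WORDS and w not in found:
                found.append(w)
    return found

utterances = [
    ("teacher", "Convolution is the key point."),
    ("learner", "Is this on the exam?"),
    ("teacher", "Open the window."),
]
print(second_keywords(utterances))  # ['convolution']
```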
After the class ends, the processing device 130 statistically analyzes the text continuously obtained by the capturing device 110 to obtain at least one first candidate word (step 270). In more detail, the processing device 130 first counts the words in the text obtained by the capturing device 110 and then defines the few words with the highest occurrence frequency as first candidate words (i.e. possible knowledge points). Note that a word whose occurrence frequency is too high may be the main theme of the class and is therefore not suitable as a tag word in the following steps. When the processing device 130 statistically analyzes the text after the class ends, it therefore excludes any word whose occurrence frequency exceeds a preset value from the first candidate words; the preset value may be adjusted according to actual requirements.
After the class ends, the processing device 130 statistically analyzes the text strings continuously obtained by the voice recognition device 120 to obtain at least one second candidate word (step 280). In more detail, the processing device 130 first counts the words in the text strings obtained by the voice recognition device 120 and then defines the few words with the highest occurrence frequency as second candidate words (i.e. possible knowledge points). Note that a word whose occurrence frequency is too high may be the main theme of the class and is therefore not suitable as a tag word in the following steps. When the processing device 130 statistically analyzes the text strings after the class ends, it therefore excludes any word whose occurrence frequency exceeds a preset value from the second candidate words; the preset value may be adjusted according to actual requirements.
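The statistical analysis of steps 270 and 280 can be sketched as a frequency count with an upper cutoff. The sketch assumes that the text has already been split into words and that frequency is measured as a raw count; the function name and the parameters top_n and max_count are hypothetical.

```python
from collections import Counter

def candidate_vocabulary(words, top_n=3, max_count=10):
    """Return the top_n most frequent words, skipping words above max_count.

    Words that occur more than max_count times are treated as the main theme
    of the class and excluded, as described in the text.
    """
    counts = Counter(words)
    eligible = [(w, c) for w, c in counts.most_common() if c <= max_count]
    return [w for w, _ in eligible[:top_n]]

words = (["signal"] * 12 + ["fourier"] * 5 + ["laplace"] * 4
         + ["filter"] * 2 + ["noise"])
print(candidate_vocabulary(words, top_n=2, max_count=10))  # ['fourier', 'laplace']
```

The same function applies to the OCR text (first candidate words) and to the speech transcript (second candidate words).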
The processing device 130 performs an analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate word and the at least one second candidate word according to their corresponding weights to obtain a tag word (step 290). In more detail, because first keywords, second keywords, first candidate words and second candidate words differ in how likely they are to be knowledge points, they are given different weights in the analysis procedure that determines the knowledge points of the class, and the weights may be adjusted according to actual requirements. The analysis procedure determines the knowledge points (i.e. tag words) of the class from the first keywords, the second keywords, the first candidate words, the second candidate words and their corresponding weights, and the number of knowledge points (i.e. tag words) may be adjusted according to actual requirements.
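The text does not specify the form of the analysis procedure. One possible reading, shown here purely as an illustrative sketch, is an additive score in which each word collects the weight of every source it appears in; the source names, weight values and function name are hypothetical.

```python
from collections import defaultdict

def tag_words(sources, weights, top_n=1):
    """Score each word by the summed weights of the sources that contain it.

    sources: dict mapping a source name to a list of words.
    weights: dict mapping the same source names to numeric weights.
    Ties are broken alphabetically so the result is deterministic.
    """
    scores = defaultdict(float)
    for name, words in sources.items():
        for w in set(words):  # count each word once per source
            scores[w] += weights[name]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_n]]

sources = {
    "first_keyword": ["fourier transform"],
    "second_keyword": ["fourier transform", "laplace transform"],
    "first_candidate": ["laplace transform", "sampling"],
    "second_candidate": ["fourier transform"],
}
weights = {"first_keyword": 4, "second_keyword": 3,
           "first_candidate": 2, "second_candidate": 1}
print(tag_words(sources, weights, top_n=2))
# ['fourier transform', 'laplace transform']
```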
When there is one knowledge point (i.e. tag word), the integrating device 140, after the class ends, obtains the time segment of each sentence containing the tag word in the text string continuously obtained by the voice recognition device 120, merges adjacent time segments into a merged time interval when the time difference between them is smaller than a specific time length, and then sets a plurality of knowledge point marks, corresponding to the unmerged time segments and the merged time intervals, on the time axis of the video file recorded in the class to form a video file with knowledge point marks (step 300). In more detail, the knowledge point mark generation system 100 may further include a camera device (not shown) that records the video file of the class, which is placed on a platform or website for learners to study or review, and captures the streaming video needed to broadcast the class live (i.e. the streaming video of the class is broadcast and stored at the same time, and the video file of the class is generated after the class ends). The camera device, the capturing device 110 and the voice recognition device 120 can start operating synchronously when each class begins and stop operating synchronously when it ends. Since step 290 yields one knowledge point (i.e. tag word) of the class, the integrating device 140 searches the text strings obtained by the voice recognition device 120 for the time segment in which each sentence containing the knowledge point (i.e. tag word) appears, and merges adjacent time segments into a merged time interval when the time difference (i.e. the gap) between them is smaller than a specific time length; the specific time length may be adjusted according to actual requirements.
Then, after the class ends, the integrating device 140 can set a plurality of knowledge point marks, according to the unmerged time segments and the merged time intervals, on the time axis of the video file recorded by the camera device in the class, to form a video file with knowledge point marks.
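The merging rule of step 300 can be sketched as follows, assuming that each sentence containing the tag word has already been mapped to a (start, end) pair in seconds on the video time axis; the function name and the parameter max_gap (the "specific time length") are hypothetical.

```python
def merge_segments(segments, max_gap):
    """Merge adjacent (start, end) segments whose gap is smaller than max_gap.

    Returns a sorted list containing merged intervals and untouched segments,
    each of which would receive one knowledge point mark.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < max_gap:
            # Gap to the previous segment is small: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

segments = [(10, 15), (18, 25), (120, 130)]
print(merge_segments(segments, max_gap=30))  # [(10, 25), (120, 130)]
```

Here the first two segments are 3 seconds apart and are merged, while the third is 95 seconds later and stays separate.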
When there are multiple tag words, the integrating device 140 can find the unmerged time segments and merged time intervals corresponding to each tag word by the process described above, and then use different colors for the knowledge point marks corresponding to different tag words so that learners can tell them apart. For example, when the tag words are "Fourier transform" and "Laplace transform", the knowledge point marks corresponding to "Fourier transform" on the time axis of the video file may be, but are not limited to, yellow, and the knowledge point marks corresponding to "Laplace transform" may be, but are not limited to, green.
In this embodiment, besides determining the tag words of the class from the first keywords, the second keywords, the first candidate words, the second candidate words and their corresponding weights, the behavior of each learner in the class (for example, looking up at the blackboard or lowering the head to take notes) may also be used as one of the parameters for determining the tag words of the class, as described in detail below. In this embodiment, the knowledge point mark generation system 100 may further include a behavior detection device 160, and the knowledge point mark generation method may further include the following steps: the behavior detection device 160 continuously receives and analyzes the images of the learners in the class to obtain a behavior recognition signal for each learner; when the behavior detection device 160 finds that the behavior recognition signal of any learner indicates looking up or taking notes, the processing device 130 generates a behavior string from the text strings obtained by the voice recognition device 120 within an expected time interval; the processing device 130 analyzes the behavior strings statistically, together with the look-up rate of the whole class and/or the note-taking rate of the whole class, to obtain at least one fourth candidate word; and the processing device 130 further adds the at least one fourth candidate word to the analysis procedure according to its corresponding weight to obtain the tag words.
In more detail, the behavior detection device 160 may include a photographing module 162 and a parsing module 164, wherein the photographing module 162 is connected to the parsing module 164. The photographing module 162 can continuously photograph the location of each learner in the classroom (i.e., capture the learner classroom images), and the parsing module 164 can analyze the images continuously photographed by the photographing module 162 to obtain the behavior recognition signal of each learner (i.e., the dynamic behavior of each learner). Because the content being taught at the moment a learner looks up at the projected image, the blackboard and/or the whiteboard, or lowers the head to write notes, may be a key point (i.e., a knowledge point), when the behavior detection device 160 determines that the behavior recognition signal of any learner indicates looking up at the projected image, the blackboard and/or the whiteboard, or lowering the head to write notes, the processing device 130 can generate a behavior string from the text strings obtained by the voice recognition device 120 within an expected time interval before and after the time point at which that behavior occurred, wherein the length of the expected time interval can be adjusted according to actual requirements. The processing device 130 may first count the vocabulary in the generated behavior strings, and then define the several words with the highest occurrence frequency as the fourth candidate vocabulary (i.e., possible knowledge points).
In addition, the more learners who, at the same moment, look up at the projected image, the blackboard and/or the whiteboard, or lower their heads to write notes, the more likely it is that the text strings obtained by the voice recognition device 120 around that time point contain a knowledge point of the class. The processing device 130 therefore also takes the whole-class head-up rate and/or the whole-class note-taking ratio into account as reference factors when obtaining the at least one fourth candidate vocabulary. Then, the processing device 130 may further add the at least one fourth candidate vocabulary to the analysis program according to its corresponding weight to obtain the tag vocabulary, wherein the weight corresponding to the at least one fourth candidate vocabulary may be adjusted according to actual requirements.
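The behavior-based selection described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the transcript is a list of timestamped utterances, each behavior event carries the fraction of the class that was looking up or writing notes at that moment, every word occurrence near an event is weighted by that fraction, and words are split on whitespace (Chinese text would need a word segmenter instead). The function name, the 15-second window and the weighting rule are hypothetical and are not taken from this embodiment.

```python
from collections import Counter
from typing import List, Tuple

Utterance = Tuple[float, str]        # (time in seconds, recognized text)
BehaviorEvent = Tuple[float, float]  # (time in seconds, fraction of class looking up or writing)


def fourth_candidates(
    transcript: List[Utterance],
    events: List[BehaviorEvent],
    window: float = 15.0,  # hypothetical "expected time interval", applied before and after
    top_n: int = 3,
) -> List[str]:
    """Rank words spoken near behavior events, weighted by the class-wide ratio."""
    scores: Counter = Counter()
    for event_time, class_ratio in events:
        for spoken_at, text in transcript:
            if abs(spoken_at - event_time) <= window:
                for word in text.lower().split():
                    # A larger share of attentive learners gives the word more weight.
                    scores[word] += class_ratio
    return [word for word, _ in scores.most_common(top_n)]
```

In practice, very common function words would dominate such counts, so some filtering of high-frequency vocabulary would likely be applied before ranking.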
In addition, in this embodiment, besides determining the tag vocabulary of the class from the first keyword vocabulary, the second keyword vocabulary, the first candidate vocabulary, the second candidate vocabulary, the fourth candidate vocabulary and their corresponding weights, the behavior of each learner who attends the class through live broadcast (for example, setting at least one piece of mark information during the live streaming of video and audio) may be added as one of the parameters that determine the tag vocabulary of the class, as described in detail below. In this embodiment, the knowledge point mark generation system 100 may further include at least one client 150, wherein each learner can attend the class by live broadcast through his or her own client 150.
Each client 150 includes a live broadcast module 50, a marking module 60, and a transmission module 70, and the knowledge point mark generation method may further include: the live broadcast module 50 of each client 150 continuously live-streams the video and audio of the class; the marking module 60 of each client 150 allows at least one piece of mark information to be set during the live streaming; the transmission module 70 of each client 150 transmits the time point at which the at least one piece of mark information was set to the processing device 130; after the class is finished, the processing device 130 generates a mark word string from the word strings obtained by the voice recognition device 120 within a predetermined time interval before and after the time point at which each client 150 set the at least one piece of mark information; the processing device 130 analyzes the mark word strings statistically to obtain at least one third candidate vocabulary; and the processing device 130 further adds the at least one third candidate vocabulary to the analysis program according to its corresponding weight to obtain the tag vocabulary. To avoid cluttering FIG. 1, only two clients 150 are drawn; the actual number of clients 150 can be adjusted according to actual requirements.
In other words, while each learner attends the class by live broadcast (i.e., during the live streaming of video and audio) through his or her own client 150, the learner can at any time set mark information (similar in concept to lowering the head to write notes) for the portion being taught at that moment. Because the content being taught when a learner sets mark information may be a key point (i.e., a knowledge point), when any learner sets mark information through the client 150, the processing device 130 can generate a mark word string from the word strings obtained by the voice recognition device 120 within a predetermined time interval before and after the time point at which the mark information was set, wherein the length of the predetermined time interval can be adjusted according to actual requirements. The processing device 130 may first count the words in the obtained mark word strings, and then define the several words with the highest occurrence frequency as the third candidate vocabulary (i.e., possible knowledge points). Then, the processing device 130 may further add the at least one third candidate vocabulary to the analysis program according to its corresponding weight to obtain the tag vocabulary, wherein the weight corresponding to the at least one third candidate vocabulary may be adjusted according to actual requirements.
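A minimal Python sketch of this mark-based step is given below, assuming that each client reports the time points of its marks in seconds and that the transcript is a list of timestamped utterances. The function names, the 20-second window and the whitespace tokenization are illustrative assumptions only.

```python
from collections import Counter
from typing import List, Tuple

Utterance = Tuple[float, str]  # (time in seconds, recognized text)


def mark_word_strings(
    transcript: List[Utterance],
    mark_times: List[float],
    window: float = 20.0,  # hypothetical "predetermined time interval"
) -> List[str]:
    """Build one mark word string per mark from text recognized around its time point."""
    strings = []
    for mark_time in mark_times:
        nearby = [text for spoken_at, text in transcript
                  if abs(spoken_at - mark_time) <= window]
        strings.append(" ".join(nearby))
    return strings


def third_candidates(strings: List[str], top_n: int = 3) -> List[str]:
    """Count words over all mark word strings and keep the most frequent ones."""
    counts = Counter(word for s in strings for word in s.lower().split())
    return [word for word, _ in counts.most_common(top_n)]
```

Marks from several clients can simply be pooled into one list of time points before calling `mark_word_strings`, so that passages marked by many learners contribute more to the counts.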
It should be noted in particular that the steps of the knowledge point mark generation method of this embodiment may be executed in any order, except where a causal relationship between steps dictates their order.
In summary, the difference between the present invention and the prior art is that the tag vocabulary is obtained by performing an analysis procedure, according to their corresponding weights, on at least one first keyword word highlighted in text during a class, at least one second keyword word emphasized in speech, at least one first candidate word that appears repeatedly in text and at least one second candidate word that appears repeatedly in speech; and that knowledge point marks are then set on the time axis of the video file recorded in the class, according to the time sections and time intervals in which the tag vocabulary appears, to form a video file with knowledge point marks.
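The description does not spell out the analysis procedure itself, so the following Python sketch shows only one plausible reading: each word receives the sum of the weights of the sources in which it appears, and the highest-scoring words become the tag vocabulary. The function name, the source names and the weight values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def select_tag_words(
    sources: Dict[str, List[str]],
    weights: Dict[str, float],
    top_n: int = 1,
) -> List[str]:
    """Score each word by the summed weights of the sources that contain it."""
    scores: Dict[str, float] = defaultdict(float)
    for source_name, words in sources.items():
        for word in set(words):  # count each source at most once per word
            scores[word] += weights[source_name]
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical inputs: four sources and adjustable weights.
sources = {
    "first_keyword": ["fourier transform"],
    "second_keyword": ["fourier transform", "convolution"],
    "first_candidate": ["convolution", "signal"],
    "second_candidate": ["fourier transform", "signal"],
}
weights = {
    "first_keyword": 0.4,
    "second_keyword": 0.3,
    "first_candidate": 0.15,
    "second_candidate": 0.15,
}
tag_words = select_tag_words(sources, weights, top_n=2)
```

Third and fourth candidate vocabularies could be added as further entries in `sources` with their own weights, which matches the way the embodiments add them to the analysis program.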
Although the invention has been described with reference to the above embodiments, it should be understood that the invention is not limited thereto but may be modified or altered somewhat by persons skilled in the art without departing from the spirit and scope of the invention.

Claims (10)

1. A knowledge point mark generation system, comprising:
The capturing device is used for continuously capturing and analyzing the computer picture image, the projection image and/or the blackboard-writing image in a class so as to continuously obtain text, and capturing at least one first key word in the text based on the font and/or the font color in the computer picture image, the projection image and/or the blackboard-writing image, and on the clicked text;
The voice recognition device is used for continuously receiving a voice signal in the classroom, continuously converting the voice signal into a character string in a voice-to-character mode, judging the identity of the voice signal in a voiceprint recognition or sound source recognition mode, and capturing at least one second key word in the character string based on the identity of the voice signal and/or a plurality of preset words;
The processing device is used for analyzing the text continuously acquired by the acquisition device in a statistical mode after the class is finished so as to acquire at least one first candidate vocabulary; after the class is finished, the character strings continuously obtained by the voice recognition device are analyzed in a statistical mode so as to obtain at least one second candidate vocabulary; the at least one first key word, the at least one second key word, the at least one first candidate word and the at least one second candidate word are subjected to an analysis program according to the corresponding weights so as to obtain a tag word; and
The integrating device is used for generating, after the class is finished, a time section for each sentence containing the tag word in the word string continuously acquired by the voice recognition device, combining adjacent time sections into a time interval when the time difference between the adjacent time sections is smaller than a specific time length, and setting a plurality of knowledge point marks, corresponding to the time sections which are not combined and to the time interval, on a time axis of the video file shot in the class so as to form the video file with the knowledge point marks.
2. The knowledge point mark generation system of claim 1, wherein the knowledge point mark generation system further comprises:
At least one user terminal, each user terminal includes:
The live broadcast module is used for continuously broadcasting streaming video and audio in the class;
The marking module is used for allowing the setting of the at least one marking information in the process of live broadcasting the streaming video and audio; and
The transmission module is used for transmitting the set time point of the at least one piece of marking information to the processing device;
After the class is finished, the processing device generates a marked word string according to the word string acquired by the voice recognition device in a preset time interval before and after the time point of setting the at least one marked information by each user side; analyzing the marked word strings in a statistical mode to obtain at least one third candidate word; and adding the at least one third candidate vocabulary into the analysis program according to the corresponding weight to obtain the tag vocabulary.
3. The knowledge point mark generation system of claim 1 or 2, wherein the knowledge point mark generation system further comprises:
The behavior detection device is used for continuously receiving and analyzing the learner classroom images in the classroom so as to acquire behavior identification signals of each learner;
When the behavior detection device obtains that the behavior identification signal of any learner is looking up or writing notes, the processing device generates a behavior string according to the character string obtained by the voice recognition device in the expected time interval; analyzing the behavior strings through a statistical mode, a whole-class head-up rate and/or a whole-class note-taking ratio to obtain at least one fourth candidate vocabulary; and adding the at least one fourth candidate vocabulary into the analysis program according to the corresponding weight to obtain the tag vocabulary.
4. The knowledge point mark generation system of claim 1, wherein when the processing device performs the analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate vocabulary, and the at least one second candidate vocabulary according to their corresponding weights to obtain a plurality of tag vocabularies, the integrating device distinguishes the knowledge point marks corresponding to different tag vocabularies according to different colors.
5. The knowledge point mark generation system according to claim 1, wherein when the processing device statistically analyzes the text obtained continuously by the capturing device or the text string obtained continuously by the voice recognition device after the end of the class, if it is determined that the frequency of occurrence of any vocabulary exceeds a predetermined value, the vocabulary is excluded from being the first candidate vocabulary or the second candidate vocabulary.
6. A knowledge point mark generation method, comprising the following steps:
providing a knowledge point mark generation system, which comprises a capturing device, a voice recognition device, a processing device and an integration device;
The capturing device continuously captures and analyzes the computer picture image, the projection image and/or the blackboard-writing image in the class so as to continuously obtain the text;
The capturing device captures at least one first keyword in the text based on the font color in the computer picture image, the projection image and/or the blackboard-writing image, and on the clicked text;
the voice recognition device continuously receives a voice signal in the class and continuously converts the voice signal into a character string in a voice-to-character mode;
The voice recognition device judges the identity of the sound signal through voiceprint recognition or sound source recognition;
the voice recognition device captures at least one second key word in the word string based on the identity of the sent sound signal and/or a plurality of preset words;
The processing device analyzes the text continuously obtained by the capturing device in a statistical mode after the class is finished so as to obtain at least one first candidate vocabulary;
the processing device analyzes the character strings continuously obtained by the voice recognition device in a statistical mode after the class is finished so as to obtain at least one second candidate vocabulary;
The processing device carries out an analysis procedure on the at least one first key word, the at least one second key word, the at least one first candidate word and the at least one second candidate word according to the corresponding weights so as to obtain a tag word; and
The integration device generates, after the class is finished, a time section for each sentence containing the tag word in the word string continuously acquired by the voice recognition device, combines adjacent time sections into a time interval when the time difference between the adjacent time sections is smaller than a specific time length, and sets a plurality of knowledge point marks, corresponding to the time sections which are not combined and to the time interval, on a time axis of the video file shot in the class so as to form the video file with the knowledge point marks.
7. The knowledge point mark generation method of claim 6, wherein the knowledge point mark generation system further comprises at least one user side, each of the user sides comprises a live broadcast module, a marking module, and a transmission module, the knowledge point mark generation method further comprising:
the live broadcast module of each user terminal continuously broadcasts streaming video and audio in the class;
The marking module of each user terminal allows setting the at least one marking information in the process of directly broadcasting the streaming video;
the transmission module of each user side transmits the set time point of the at least one piece of marking information to the processing device;
The processing device generates a marked word string according to the word string acquired by the voice recognition device in a preset time interval before and after the user terminal sets the time point of the at least one marked information after the class is finished;
The processing device analyzes the marked word strings in a statistical mode to obtain at least one third candidate vocabulary; and
The processing device also adds the at least one third candidate vocabulary into the analysis program according to the corresponding weight to obtain the tag vocabulary.
8. The knowledge point mark generation method according to claim 6 or 7, wherein the knowledge point mark generation system further comprises a behavior detection device, the knowledge point mark generation method further comprising:
the behavior detection device continuously receives and analyzes learner classroom images in the classroom so as to acquire behavior identification signals of each learner;
When the behavior detection device obtains that the behavior identification signal of any learner is looking up or writing notes, the processing device generates a behavior string according to the character strings obtained by the voice recognition device in the expected time interval;
The processing device analyzes the behavior strings through a statistical mode, a whole-class head-up rate and/or a whole-class note-taking ratio so as to obtain at least one fourth candidate vocabulary; and
The processing device also adds the at least one fourth candidate vocabulary into the analysis program according to the corresponding weight to obtain the tag vocabulary.
9. The knowledge point mark generation method of claim 6, wherein when the processing device performs the analysis procedure on the at least one first keyword, the at least one second keyword, the at least one first candidate vocabulary, and the at least one second candidate vocabulary according to their corresponding weights to obtain a plurality of tag vocabularies, the integrating device distinguishes the knowledge point marks corresponding to different tag vocabularies according to different colors.
10. The knowledge point mark generation method according to claim 6, wherein when the processing device statistically analyzes the text obtained continuously by the capturing device or the text string obtained continuously by the voice recognition device after the end of the class, if it is determined that the frequency of occurrence of any vocabulary exceeds a predetermined value, the vocabulary is excluded from being the first candidate vocabulary or the second candidate vocabulary.
CN201910646422.9A 2019-07-17 2019-07-17 Knowledge point mark generation system and method thereof Active CN112241462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910646422.9A CN112241462B (en) 2019-07-17 2019-07-17 Knowledge point mark generation system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910646422.9A CN112241462B (en) 2019-07-17 2019-07-17 Knowledge point mark generation system and method thereof

Publications (2)

Publication Number Publication Date
CN112241462A CN112241462A (en) 2021-01-19
CN112241462B true CN112241462B (en) 2024-04-23

Family

ID=74167181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910646422.9A Active CN112241462B (en) 2019-07-17 2019-07-17 Knowledge point mark generation system and method thereof

Country Status (1)

Country Link
CN (1) CN112241462B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107895244A (en) * 2017-12-26 2018-04-10 重庆大争科技有限公司 Classroom teaching quality assessment method
CN108806685A (en) * 2018-07-02 2018-11-13 英业达科技有限公司 Speech control system and its method
CN109698920A (en) * 2017-10-20 2019-04-30 深圳市鹰硕技术有限公司 It is a kind of that tutoring system is followed based on internet teaching platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170039873A1 (en) * 2015-08-05 2017-02-09 Fujitsu Limited Providing adaptive electronic reading support

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于语料库的古代汉语教材预期成效评估方法及应用;邱冰;皇甫伟;朱庆之;;中文信息学报;20180615(第06期);全文 *
浅谈多媒体和网络技术对大学英语词汇教学的作用;张颖;夏婕;;科技资讯;20070513(第14期);全文 *

Also Published As

Publication number Publication date
CN112241462A (en) 2021-01-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant