CN112019786A - Intelligent teaching screen recording method and system - Google Patents

Intelligent teaching screen recording method and system

Info

Publication number
CN112019786A
CN112019786A
Authority
CN
China
Prior art keywords
sound signal
actual
screen recording
frequency domain
optimized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010857325.7A
Other languages
Chinese (zh)
Other versions
CN112019786B (en)
Inventor
崔炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Original Assignee
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority to CN202010857325.7A
Publication of CN112019786A
Application granted
Publication of CN112019786B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00 Electrically-operated educational appliances
    • G09B 5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B 5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L 25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Abstract

The invention provides an intelligent teaching screen recording method and system. Unlike the prior art, which only performs noise reduction on the actual sound signal obtained during screen recording, the method collects a corresponding standard sound signal as a reference signal, extracts time domain and frequency domain characteristic parameters from both sound signals, performs noise reduction optimization of the actual sound signal together with a signal statistical error analysis of the sound signals, and then combines and matches the optimized sound signal with the screen recording images according to the result of that error analysis, thereby improving the reliability with which the recorded sound signal is matched to the recorded image signal.

Description

Intelligent teaching screen recording method and system
Technical Field
The invention relates to the technical field of intelligent education, in particular to an intelligent teaching screen recording method and system.
Background
In intelligent teaching, the screen usually needs to be recorded during online teaching so that the teaching mode can be adjusted in real time according to the recording result. Recording an online teaching session involves both sound recording and image recording. The sound signal obtained by sound recording usually contains a certain noise component, so to ensure a reliable recording result the sound signal must be optimized and then recombined with the recorded images. However, the prior art only filters and denoises the sound signal itself, without referring to any other standard sound signal for adaptive optimization. This limits the noise reduction of the recorded sound signal and reduces the reliability with which the recorded sound signal can subsequently be combined and matched with the recorded image signal.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an intelligent teaching screen recording method and system. A standard sound signal and an actual sound signal recorded during the screen recording process are collected, and time domain analysis processing and frequency domain analysis processing are performed on both signals. A similarity evaluation value between the standard sound signal and the actual sound signal is determined from the results of that analysis, the actual sound signal is optimized according to the similarity evaluation value to obtain an optimized sound signal, signal statistical error information corresponding to the optimized sound signal is determined, and the optimized sound signal is combined and matched with the screen recording image obtained during the recording process according to that error information. Unlike the prior art, which only performs noise reduction on the actual sound signal obtained by screen recording, the method and system collect a corresponding standard sound signal as a reference signal, extract time domain and frequency domain characteristic parameters from both sound signals, perform noise reduction optimization of the actual sound signal and a signal statistical error analysis of the sound signals, and then combine and match the optimized sound signal with the screen recording images according to the result of that analysis, thereby improving the reliability with which the recorded sound signal is matched to the recorded image signal.
The invention provides an intelligent teaching screen recording method which is characterized by comprising the following steps:
step S1, collecting a standard sound signal and an actual sound signal recorded in the screen recording process, and performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal;
step S3, determining signal statistical error information corresponding to the optimized sound signal, and executing mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information;
further, in the step S1, the collecting a standard sound signal and an actual sound signal recorded in the screen recording process, and the performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal specifically includes,
step S101, recording a plurality of history teaching processes to obtain corresponding history teaching sound signals, and extracting coexisting sound signals from the history teaching sound signals to serve as the standard sound signals;
step S102, performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal;
step S103, performing time domain analysis processing and frequency domain analysis processing on the actual sound signal so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter and a second cepstrum frequency domain characteristic parameter from the actual sound signal;
further, in the step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time-domain analysis processing and the frequency-domain analysis processing, and then performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal specifically includes,
step S201, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the following formula (1)
[Formula (1): see image BDA0002646888000000031 of the original publication]
In the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, x_i represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, x_j represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, y_h represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, y_k represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, z_p represents the p-th second cepstrum frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstrum frequency domain characteristic parameters, z_q represents the q-th first cepstrum frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstrum frequency domain characteristic parameters;
step S202, comparing the similarity evaluation value with a preset similarity evaluation threshold; if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal; if the similarity evaluation value is less than the preset similarity evaluation threshold, re-recording the actual sound signal, determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal again, and, once that similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal;
further, in the step S3, determining signal statistical error information corresponding to the optimized sound signal, and according to the signal statistical error information, performing mutual combination and matching between the optimized sound signal and the screen recording image obtained in the screen recording process specifically includes,
step S301, determining the actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2)
[Formula (2): see image BDA0002646888000000041 of the original publication]
In the above formula (2), MSE_1 represents the actual mean square error between the optimized sound signal and the standard sound signal, the two symbols shown as images in the original publication denote the power of the t-th frame sound segment in the standard sound signal and the power of the t-th frame sound segment in the optimized sound signal respectively, and c represents the total number of sound segments in the standard sound signal and in the optimized sound signal;
step S302, determining the ratio Q between the actual mean square error MSE_1 and a preset mean square error MSE_2 according to the following formula (3):
Q = MSE_1 / MSE_2    (3)
In the above formula (3), the preset mean square error MSE_2 has a value range of [0.1, 0.6];
Step S303, if the ratio Q is less than or equal to 1, combining and matching the optimized sound signal and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process, if the ratio Q is greater than 1, filtering and denoising the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and denoising again, and if the re-determined ratio Q is less than or equal to 1, combining and matching the optimized sound signal subjected to filtering and denoising again and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process.
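For example (illustrative numbers only, not taken from the disclosure): if the actual mean square error MSE_1 is 0.21 and the preset mean square error MSE_2 is chosen as 0.35 from the stated range, then Q = 0.21 / 0.35 = 0.6, which is less than or equal to 1, so the optimized sound signal is combined and matched with the screen recording image; if MSE_1 were instead 0.49, then Q = 1.4 > 1 and the optimized sound signal would be filtered and denoised again before the ratio is re-determined.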
The invention also provides an intelligent teaching screen recording system, which is characterized by comprising a sound signal acquisition module, a sound signal preprocessing module, an optimized sound signal generation module and a sound signal-screen recording image combination module; wherein:
the sound signal acquisition module is used for collecting standard sound signals and actual sound signals recorded in the screen recording process;
the sound signal preprocessing module is used for performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
the optimized sound signal generating module is used for determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then optimizing the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal;
the sound signal-screen recording image combination module is used for determining signal statistical error information corresponding to the optimized sound signal and executing mutual combination matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information;
further, the sound signal acquisition module collects the standard sound signals, specifically comprises recording a plurality of history teaching processes to obtain corresponding history teaching sound signals, and then extracts the sound signals existing together from the plurality of history teaching sound signals to serve as the standard sound signals;
the sound signal preprocessing module specifically includes performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal,
performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal,
performing time domain analysis processing and frequency domain analysis processing on the actual sound signal so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter and a second cepstrum frequency domain characteristic parameter from the actual sound signal;
further, the optimized sound signal generating module determines a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and performs optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal specifically including,
determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the following formula (1)
[Formula (1): see image BDA0002646888000000051 of the original publication]
In the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, x_i represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, x_j represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, y_h represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, y_k represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, z_p represents the p-th second cepstrum frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstrum frequency domain characteristic parameters, z_q represents the q-th first cepstrum frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstrum frequency domain characteristic parameters,
comparing the similarity evaluation value with a preset similarity evaluation threshold, if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal, if the similarity evaluation value is smaller than the preset similarity evaluation threshold, re-recording the actual sound signal, re-determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal, and when the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal;
further, the sound signal-screen recording image combination module determines signal statistical error information corresponding to the optimized sound signal, and executes mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information,
determining an actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2)
[Formula (2): see image BDA0002646888000000061 of the original publication]
In the above formula (2), MSE_1 represents the actual mean square error between the optimized sound signal and the standard sound signal, the two symbols shown as images in the original publication denote the power of the t-th frame sound segment in the standard sound signal and the power of the t-th frame sound segment in the optimized sound signal respectively, and c represents the total number of sound segments in the standard sound signal and in the optimized sound signal;
and then determining the ratio Q between the actual mean square error MSE_1 and a preset mean square error MSE_2 according to the following formula (3):
Q = MSE_1 / MSE_2    (3)
In the above formula (3), the preset mean square error MSE_2 has a value range of [0.1, 0.6];
And finally, if the ratio Q is smaller than or equal to 1, combining and matching the optimized sound signal and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process, if the ratio Q is larger than 1, filtering and denoising the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and denoising again, and if the re-determined ratio Q is smaller than or equal to 1, combining and matching the optimized sound signal subjected to filtering and denoising again and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process.
Compared with the prior art, the intelligent teaching screen recording method and system collect a standard sound signal and an actual sound signal recorded during the screen recording process, perform time domain analysis processing and frequency domain analysis processing on both signals, determine a similarity evaluation value between the standard sound signal and the actual sound signal from the results of that analysis, optimize the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal, determine signal statistical error information corresponding to the optimized sound signal, and combine and match the optimized sound signal with the screen recording image obtained during the recording process according to that error information. Unlike the prior art, which only performs noise reduction on the actual sound signal obtained by screen recording, the method and system collect a corresponding standard sound signal as a reference signal, extract time domain and frequency domain characteristic parameters from both sound signals, perform noise reduction optimization of the actual sound signal and a signal statistical error analysis of the sound signals, and then combine and match the optimized sound signal with the screen recording images according to the result of that analysis, thereby improving the reliability with which the recorded sound signal is matched to the recorded image signal.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flow diagram of an intelligent teaching screen recording method provided by the invention.
Fig. 2 is a schematic structural diagram of the intelligent teaching screen recording system provided by the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of an intelligent teaching screen recording method according to an embodiment of the present invention. The intelligent teaching screen recording method comprises the following steps:
step S1, collecting a standard sound signal and an actual sound signal recorded in the screen recording process, and performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal;
step S3, determining signal statistical error information corresponding to the optimized sound signal, and according to the signal statistical error information, performing mutual combination and matching between the optimized sound signal and the screen recording image obtained in the screen recording process.
Preferably, the step S1 of collecting the standard sound signal and the actual sound signal recorded during the screen recording process, and the time-domain analysis processing and the frequency-domain analysis processing of the standard sound signal and the actual sound signal specifically include,
step S101, recording a plurality of history teaching processes to obtain corresponding history teaching sound signals, and extracting coexisting sound signals from the history teaching sound signals to serve as the standard sound signals;
step S102, performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal;
step S103, performing time domain analysis processing and frequency domain analysis processing on the actual sound signal, so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter, and a second cepstrum frequency domain characteristic parameter from the actual sound signal.
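The embodiment names three feature families (time domain, frequency domain and cepstrum frequency domain characteristic parameters) but does not specify the concrete parameters. The following Python sketch illustrates one possible reading of the analysis in steps S102-S103 under assumed choices: short-time energy and zero-crossing rate as time domain parameters, spectral centroid and roll-off as frequency domain parameters, and low-order real cepstrum coefficients as cepstrum frequency domain parameters. All function names and parameter values are illustrative assumptions, not part of the patent.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a 1-D signal into overlapping frames, zero-padding the tail."""
    n_frames = max(1, 1 + (len(x) - frame_len + hop - 1) // hop)
    padded = np.pad(x, (0, n_frames * hop + frame_len - len(x)))
    return np.stack([padded[i * hop:i * hop + frame_len] for i in range(n_frames)])

def extract_features(x, sr=16000):
    """Return (time-domain, frequency-domain, cepstral) parameter vectors.

    The concrete features below are common illustrative choices, not the
    parameters claimed by the patent.
    """
    frames = frame_signal(x)

    # Time-domain parameters: short-time energy and zero-crossing rate.
    energy = np.mean(frames ** 2, axis=1)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    time_params = np.array([energy.mean(), zcr.mean()])

    # Frequency-domain parameters: spectral centroid and 85% roll-off frequency.
    spec = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    centroid = (spec * freqs).sum(axis=1) / (spec.sum(axis=1) + 1e-12)
    cumulative = np.cumsum(spec, axis=1)
    rolloff_idx = np.argmax(cumulative >= 0.85 * cumulative[:, -1:], axis=1)
    freq_params = np.array([centroid.mean(), freqs[rolloff_idx].mean()])

    # Cepstrum-frequency-domain parameters: first coefficients of the real cepstrum.
    cepstrum = np.fft.irfft(np.log(spec + 1e-12), axis=1)
    cep_params = cepstrum[:, 1:6].mean(axis=0)

    return time_params, freq_params, cep_params
```

The same extraction would be applied once to the standard sound signal (yielding the "first" parameters) and once to the actual sound signal (yielding the "second" parameters).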
Preferably, in the step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time-domain analysis processing and the frequency-domain analysis processing, and performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal specifically includes,
step S201, according to the following formula (1), determines a similarity evaluation value between the standard sound signal and the actual sound signal
[Formula (1): see image BDA0002646888000000091 of the original publication]
In the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, x_i represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, x_j represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, y_h represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, y_k represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, z_p represents the p-th second cepstrum frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstrum frequency domain characteristic parameters, z_q represents the q-th first cepstrum frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstrum frequency domain characteristic parameters;
step S202, comparing the similarity evaluation value with a preset similarity evaluation threshold; if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal; if the similarity evaluation value is less than the preset similarity evaluation threshold, re-recording the actual sound signal, re-determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal, and, once that similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal.
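Because formula (1) is available only as an image in the publication, the similarity computation in the sketch below is a stand-in (a cosine similarity over the concatenated characteristic parameters); only the step S202 control flow (compare against the threshold, re-record while below it, then filter and denoise) follows the description above. The threshold value and the retry limit are likewise illustrative assumptions.

```python
import numpy as np

def similarity_eval(actual_params, standard_params):
    """Placeholder for formula (1): cosine similarity over the concatenated
    time/frequency/cepstral parameters. The claimed formula may differ."""
    a = np.concatenate(actual_params)
    b = np.concatenate(standard_params)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def optimize_actual_signal(record_audio, denoise, extract, standard_params,
                           threshold=0.8, max_retries=3):
    """Step S202 control flow: re-record until the similarity evaluation value
    reaches the preset threshold, then filter and denoise that recording."""
    for _ in range(max_retries):
        actual = record_audio()                         # capture an actual sound signal
        sim = similarity_eval(extract(actual), standard_params)
        if sim >= threshold:                            # similar enough to the reference
            return denoise(actual)                      # filtering / noise reduction
    raise RuntimeError("actual sound signal never reached the similarity threshold")
```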
Preferably, in the step S3, determining signal statistical error information corresponding to the optimized sound signal, and according to the signal statistical error information, performing mutual combination matching between the optimized sound signal and the screen recording image obtained by the screen recording process specifically includes,
step S301, determining the actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2)
[Formula (2): see image BDA0002646888000000101 of the original publication]
In the above formula (2), MSE_1 represents the actual mean square error between the optimized sound signal and the standard sound signal, the two symbols shown as images in the original publication denote the power of the t-th frame sound segment in the standard sound signal and the power of the t-th frame sound segment in the optimized sound signal respectively, and c represents the total number of sound segments in the standard sound signal and in the optimized sound signal;
step S302, determining the ratio Q between the actual mean square error MSE_1 and a preset mean square error MSE_2 according to the following formula (3):
Q = MSE_1 / MSE_2    (3)
In the above formula (3), the preset mean square error MSE_2 has a value range of [0.1, 0.6];
Step S303, if the ratio Q is less than or equal to 1, combining and matching the optimized sound signal with the screen recording image according to the screen recording operation timing information corresponding to the screen recording process, if the ratio Q is greater than 1, performing filtering and noise reduction processing on the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and noise reduction processing again, and if the re-determined ratio Q is less than or equal to 1, combining and matching the optimized sound signal subjected to filtering and noise reduction processing again with the screen recording image according to the screen recording operation timing information corresponding to the screen recording process.
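Formula (2) is also published only as an image; the sketch below therefore assumes a conventional mean square error over per-frame signal powers, applies formula (3) as the ratio Q = MSE_1 / MSE_2, and follows the step S301-S303 flow (re-filter until Q <= 1, then pair the optimized audio with the screen recording images using the recording operation timing information). The frame length, the chosen MSE_2 value and the pairing rule are illustrative assumptions.

```python
import numpy as np

def frame_power(x, frame_len=1024, hop=512):
    """Mean power of each frame (sound segment) of a 1-D signal."""
    n = max(1, 1 + (len(x) - frame_len) // hop)
    return np.array([np.mean(x[i * hop:i * hop + frame_len] ** 2) for i in range(n)])

def mean_square_error(optimized, standard):
    """Assumed reading of formula (2): mean squared difference between the
    per-frame powers of the optimized and standard sound signals."""
    p_std, p_opt = frame_power(standard), frame_power(optimized)
    c = min(len(p_std), len(p_opt))                     # total number of sound segments
    return float(np.mean((p_std[:c] - p_opt[:c]) ** 2))

def combine_with_screen_images(optimized, standard, screen_images, timing_info,
                               denoise, mse2=0.35, max_passes=3):
    """Steps S301-S303: re-filter until Q = MSE_1 / MSE_2 <= 1, then pair the
    audio with the screen recording images by the operation timing info.
    mse2 is a preset value picked from the patent's stated range [0.1, 0.6]."""
    for _ in range(max_passes):
        q = mean_square_error(optimized, standard) / mse2   # formula (3)
        if q <= 1:
            # Hypothetical pairing rule: zip each timing entry with its image.
            return list(zip(timing_info, screen_images)), optimized
        optimized = denoise(optimized)                  # filter and denoise again
    raise RuntimeError("optimized signal did not satisfy the mean-square-error bound")
```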
Generally speaking, the intelligent teaching screen recording method collects a standard sound signal recorded in a noiseless environment during a historical teaching process and uses it as a reference. It then obtains the actual sound signal recorded by the operator during the screen recording process, compares the standard sound signal with the actual sound signal, and calculates the similarity evaluation value by formula (1). Actual sound signals whose similarity evaluation value falls below the preset similarity evaluation threshold are rejected, which prevents erroneous sound signals from being mixed in and causing a mismatch between the image information and the sound signal during screen recording; actual sound signals whose similarity evaluation value reaches the threshold undergo noise reduction processing, which removes part of the noise in the actual sound signal and makes it clearer. The actual mean square error between the optimized sound signal and the standard sound signal is then calculated by formula (2), and the ratio of the actual mean square error to the preset mean square error is determined by formula (3) to confirm the effect of the noise reduction processing. When this ratio is less than or equal to 1, the processing effect satisfies the matching condition and the optimized sound signal can be matched with the recorded image information and stored; when the ratio is greater than 1, the processing effect does not satisfy the matching condition and the optimized sound signal must be noise-reduced again until its ratio of actual to preset mean square error is less than or equal to 1. This ensures that the optimized sound signal matched with the recorded image information is accurate and reliable, so that image and sound are better matched during teaching, the sound is clearer, and the user experience of the teaching process is improved.
Fig. 2 is a schematic structural diagram of an intelligent teaching screen recording system according to an embodiment of the present invention. The intelligent teaching screen recording system comprises a sound signal acquisition module, a sound signal preprocessing module, an optimized sound signal generation module and a sound signal-screen recording image combination module; wherein:
the sound signal acquisition module is used for collecting standard sound signals and actual sound signals recorded in the screen recording process;
the sound signal preprocessing module is used for carrying out time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
the optimized sound signal generating module is used for determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then optimizing the actual sound signal according to the similarity evaluation value so as to obtain an optimized sound signal;
the sound signal-screen recording image combination module is used for determining signal statistical error information corresponding to the optimized sound signal and executing mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information.
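Purely as an illustration of how the four modules named above could be wired together (the patent does not prescribe any software structure), the following sketch treats each module as an injected callable; every name and signature is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence
import numpy as np

@dataclass
class TeachingScreenRecorder:
    """Illustrative wiring of the four modules; not taken from the patent."""
    acquire_standard: Callable[[], np.ndarray]   # sound signal acquisition module
    acquire_actual: Callable[[], np.ndarray]
    preprocess: Callable[[np.ndarray], tuple]    # sound signal preprocessing module
    optimize: Callable[[np.ndarray, tuple, tuple], np.ndarray]   # optimized signal generation
    combine: Callable[[np.ndarray, np.ndarray, Sequence, Sequence], object]  # signal-image combination

    def run(self, screen_images: Sequence, timing_info: Sequence):
        standard = self.acquire_standard()
        actual = self.acquire_actual()
        std_params = self.preprocess(standard)
        act_params = self.preprocess(actual)
        optimized = self.optimize(actual, std_params, act_params)
        return self.combine(optimized, standard, screen_images, timing_info)
```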
Preferably, the sound signal acquisition module collects the standard sound signals, specifically includes recording a plurality of history teaching processes to obtain corresponding history teaching sound signals, and then extracts the sound signals existing together from the plurality of history teaching sound signals to serve as the standard sound signals;
the sound signal preprocessing module specifically includes performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal,
performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal,
and performing time domain analysis processing and frequency domain analysis processing on the actual sound signal so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter and a second cepstrum frequency domain characteristic parameter from the actual sound signal.
Preferably, the optimized sound signal generating module determines a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and performs optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal specifically including,
the similarity evaluation value between the standard sound signal and the actual sound signal is determined according to the following formula (1)
[Formula (1): see image BDA0002646888000000121 of the original publication]
In the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, x_i represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, x_j represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, y_h represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, y_k represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, z_p represents the p-th second cepstrum frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstrum frequency domain characteristic parameters, z_q represents the q-th first cepstrum frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstrum frequency domain characteristic parameters,
comparing the similarity evaluation value with a preset similarity evaluation threshold, if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal, if the similarity evaluation value is less than the preset similarity evaluation threshold, re-recording the actual sound signal, re-determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal, and when the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal.
Preferably, the sound signal-screen recording image combination module determines signal statistical error information corresponding to the optimized sound signal, and performs mutual combination matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information,
determining an actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2)
[Formula (2): see image BDA0002646888000000131 of the original publication]
In the above formula (2), MSE_1 represents the actual mean square error between the optimized sound signal and the standard sound signal, the two symbols shown as images in the original publication denote the power of the t-th frame sound segment in the standard sound signal and the power of the t-th frame sound segment in the optimized sound signal respectively, and c represents the total number of sound segments in the standard sound signal and in the optimized sound signal;
then the ratio Q between the actual mean square error MSE_1 and a preset mean square error MSE_2 is determined according to the following formula (3):
Q = MSE_1 / MSE_2    (3)
In the above formula (3), the preset mean square error MSE_2 has a value range of [0.1, 0.6];
And finally, if the ratio Q is less than or equal to 1, combining and matching the optimized sound signal with the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process, if the ratio Q is greater than 1, filtering and denoising the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and denoising again, and when the re-determined ratio Q is less than or equal to 1, combining and matching the optimized sound signal subjected to filtering and denoising again with the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process.
Generally speaking, the intelligent teaching screen recording system collects a standard sound signal recorded in a noiseless environment during a historical teaching process and uses it as a reference. It then obtains the actual sound signal recorded by the operator during the screen recording process, compares the standard sound signal with the actual sound signal, and calculates the similarity evaluation value by formula (1). Actual sound signals whose similarity evaluation value falls below the preset similarity evaluation threshold are rejected, which prevents erroneous sound signals from being mixed in and causing a mismatch between the image information and the sound signal during screen recording; actual sound signals whose similarity evaluation value reaches the threshold undergo noise reduction processing, which removes part of the noise in the actual sound signal and makes it clearer. The actual mean square error between the optimized sound signal and the standard sound signal is then calculated by formula (2), and the ratio of the actual mean square error to the preset mean square error is determined by formula (3) to confirm the effect of the noise reduction processing. When this ratio is less than or equal to 1, the processing effect satisfies the matching condition and the optimized sound signal can be matched with the recorded image information and stored; when the ratio is greater than 1, the processing effect does not satisfy the matching condition and the optimized sound signal must be noise-reduced again until its ratio of actual to preset mean square error is less than or equal to 1. This ensures that the optimized sound signal matched with the recorded image information is accurate and reliable, so that image and sound are better matched during teaching, the sound is clearer, and the user experience of the teaching process is improved.
As can be seen from the above embodiments, the intelligent teaching screen recording method and system collect a standard sound signal and an actual sound signal recorded during the screen recording process, perform time domain analysis processing and frequency domain analysis processing on both signals, determine a similarity evaluation value between the standard sound signal and the actual sound signal from the results of that analysis, optimize the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal, determine signal statistical error information corresponding to the optimized sound signal, and combine and match the optimized sound signal with the screen recording image obtained during the recording process according to that error information. Unlike the prior art, which only performs noise reduction on the actual sound signal obtained by screen recording, the method and system collect a corresponding standard sound signal as a reference signal, extract time domain and frequency domain characteristic parameters from both sound signals, perform noise reduction optimization of the actual sound signal and a signal statistical error analysis of the sound signals, and then combine and match the optimized sound signal with the screen recording images according to the result of that analysis, thereby improving the reliability with which the recorded sound signal is matched to the recorded image signal.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. The intelligent teaching screen recording method is characterized by comprising the following steps:
step S1, collecting a standard sound signal and an actual sound signal recorded in the screen recording process, and performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal;
and step S3, determining signal statistical error information corresponding to the optimized sound signal, and executing mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information.
2. The intelligent teaching screen recording method according to claim 1, characterized in that:
in step S1, the collecting of the standard sound signal and the actual sound signal recorded during the screen recording process, and the performing of the time domain analysis processing and the frequency domain analysis processing on the standard sound signal and the actual sound signal specifically include,
step S101, recording a plurality of history teaching processes to obtain corresponding history teaching sound signals, and extracting coexisting sound signals from the history teaching sound signals to serve as the standard sound signals;
step S102, performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal;
step S103, performing time domain analysis processing and frequency domain analysis processing on the actual sound signal, so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter and a second cepstrum frequency domain characteristic parameter from the actual sound signal.
3. The intelligent teaching screen recording method according to claim 2, characterized in that:
in step S2, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then performing optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal specifically including,
step S201, determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the following formula (1);
[Formula (1): see image FDA0002646887990000021 of the original publication]
in the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, x_i represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, x_j represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, y_h represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, y_k represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, z_p represents the p-th second cepstrum frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstrum frequency domain characteristic parameters, z_q represents the q-th first cepstrum frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstrum frequency domain characteristic parameters;
step S202, comparing the similarity evaluation value with a preset similarity evaluation threshold; if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal; if the similarity evaluation value is less than the preset similarity evaluation threshold, re-recording the actual sound signal, determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal again, and, once that similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal.
4. The intelligent teaching screen recording method according to claim 3, characterized in that:
in step S3, determining signal statistical error information corresponding to the optimized sound signal, and according to the signal statistical error information, performing mutual combination and matching between the optimized sound signal and the screen recording image obtained in the screen recording process specifically includes,
step S301, determining the actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2)
[Formula (2): see image FDA0002646887990000031 of the original publication]
In the above formula (2), MSE_1 represents the actual mean square error between the optimized sound signal and the standard sound signal, the two symbols shown as images in the original publication denote the power of the t-th frame sound segment in the standard sound signal and the power of the t-th frame sound segment in the optimized sound signal respectively, and c represents the total number of sound segments in the standard sound signal and in the optimized sound signal;
step S302, determining the ratio Q of the actual mean square error MSE_1 to a preset mean square error MSE_2 according to the following formula (3):
Q = MSE_1 / MSE_2    (3)
in the above formula (3), the value range of the preset mean square error MSE_2 is [0.1, 0.6];
step S303, if the ratio Q is less than or equal to 1, combining and matching the optimized sound signal and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process, if the ratio Q is greater than 1, filtering and denoising the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and denoising again, and if the re-determined ratio Q is less than or equal to 1, combining and matching the optimized sound signal subjected to filtering and denoising again and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process.
5. The intelligent teaching screen recording system is characterized by comprising a sound signal acquisition module, a sound signal preprocessing module, an optimized sound signal generation module and a sound signal-screen recording image combination module; wherein:
the sound signal acquisition module is used for collecting standard sound signals and actual sound signals recorded in the screen recording process;
the sound signal preprocessing module is used for performing time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal;
the optimized sound signal generating module is used for determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and then optimizing the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal;
and the sound signal-screen recording image combination module is used for determining signal statistical error information corresponding to the optimized sound signal and executing mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information.
6. The intelligent teaching screen recording system of claim 5, wherein:
the sound signal acquisition module collects standard sound signals, which specifically comprises recording a plurality of historical teaching processes to obtain corresponding historical teaching sound signals, and extracting the sound signals common to the plurality of historical teaching sound signals to serve as the standard sound signals; the sound signal preprocessing module performs time domain analysis processing and frequency domain analysis processing on the standard sound signal and the actual sound signal, which specifically includes,
performing time domain analysis processing and frequency domain analysis processing on the standard sound signal so as to extract a first time domain characteristic parameter, a first frequency domain characteristic parameter and a first cepstrum frequency domain characteristic parameter from the standard sound signal,
and performing time domain analysis processing and frequency domain analysis processing on the actual sound signal so as to extract a second time domain characteristic parameter, a second frequency domain characteristic parameter and a second cepstrum frequency domain characteristic parameter from the actual sound signal.
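The three feature groups named in claim 6 (time domain, frequency domain and cepstral frequency domain characteristic parameters) can be illustrated with a small NumPy sketch. The patent does not specify which concrete parameters are extracted, so short-time energy, averaged magnitude spectra and a real cepstrum serve purely as stand-ins.

```python
# Illustrative extraction of time-domain, frequency-domain and cepstral
# characteristic parameters (stand-in features; frame_len is an assumption).
import numpy as np

def extract_features(signal: np.ndarray, frame_len: int = 512):
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    time_feats = (frames ** 2).sum(axis=1)                 # short-time energy per frame
    spectra = np.abs(np.fft.rfft(frames, axis=1))          # per-frame magnitude spectra
    freq_feats = spectra.mean(axis=0)                      # averaged frequency-domain features
    cep_feats = np.fft.irfft(np.log(spectra + 1e-12), axis=1).mean(axis=0)  # real cepstrum
    return time_feats, freq_feats, cep_feats
```

The same routine would be applied to both the standard and the actual sound signal to obtain the first and second parameter sets, respectively.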
7. The intelligent teaching screen recording system of claim 6, wherein:
the optimized sound signal generating module determines a similarity evaluation value between the standard sound signal and the actual sound signal according to the results of the time domain analysis processing and the frequency domain analysis processing, and performs optimization processing on the actual sound signal according to the similarity evaluation value to obtain an optimized sound signal, which specifically includes,
determining a similarity evaluation value between the standard sound signal and the actual sound signal according to the following formula (1):

[formula (1), given in the original filing as image FDA0002646887990000051]

in the above formula (1), simA represents the similarity evaluation value between the standard sound signal and the actual sound signal, $x_i$ represents the i-th second time domain characteristic parameter extracted from the actual sound signal, m represents the total number of second time domain characteristic parameters, $x_j$ represents the j-th first time domain characteristic parameter extracted from the standard sound signal, n represents the total number of first time domain characteristic parameters, $y_h$ represents the h-th second frequency domain characteristic parameter extracted from the actual sound signal, e represents the total number of second frequency domain characteristic parameters, $y_k$ represents the k-th first frequency domain characteristic parameter extracted from the standard sound signal, f represents the total number of first frequency domain characteristic parameters, $z_p$ represents the p-th second cepstral frequency domain characteristic parameter extracted from the actual sound signal, r represents the total number of second cepstral frequency domain characteristic parameters, $z_q$ represents the q-th first cepstral frequency domain characteristic parameter extracted from the standard sound signal, and s represents the total number of first cepstral frequency domain characteristic parameters,
comparing the similarity evaluation value with a preset similarity evaluation threshold, if the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the actual sound signal to obtain the optimized sound signal, if the similarity evaluation value is less than the preset similarity evaluation threshold, re-recording the actual sound signal, and determining the similarity evaluation value between the standard sound signal and the re-recorded actual sound signal again, and when the similarity evaluation value is greater than or equal to the preset similarity evaluation threshold, performing filtering and noise reduction processing on the re-recorded actual sound signal to obtain the optimized sound signal.
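Formula (1) is likewise only reproduced as an image, so the sketch below substitutes a simple cosine similarity over concatenated feature vectors; the threshold comparison, re-recording branch and filtering/noise-reduction step follow the claim, while the function names, the threshold value and the median-filter denoiser are assumptions.

```python
# Hedged sketch of the decision flow in claim 7 (cosine similarity and a
# median filter stand in for the patent's formula (1) and its denoiser).
import numpy as np
from scipy.signal import medfilt

def similarity(standard_feats: np.ndarray, actual_feats: np.ndarray) -> float:
    a, b = np.ravel(standard_feats), np.ravel(actual_feats)
    n = min(a.size, b.size)
    a, b = a[:n], b[:n]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def optimize(actual_signal: np.ndarray, standard_feats, actual_feats,
             threshold: float = 0.8):
    """Return the filtered/denoised signal when the similarity evaluation value
    reaches the preset threshold; return None to request a re-recording."""
    if similarity(standard_feats, actual_feats) >= threshold:
        return medfilt(actual_signal, kernel_size=5)   # simple denoising stand-in
    return None
```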
8. The intelligent teaching screen recording system of claim 7, wherein:
the sound signal-screen recording image combination module determines signal statistical error information corresponding to the optimized sound signal, and executes mutual combination and matching of the optimized sound signal and the screen recording image obtained in the screen recording process according to the signal statistical error information, which specifically includes,
determining the actual mean square error between the optimized sound signal and the standard sound signal according to the following formula (2):

$$\mathrm{MSE}_1 = \frac{1}{c}\sum_{t=1}^{c}\left(P_t - \hat{P}_t\right)^2 \qquad (2)$$

in the above formula (2), $\mathrm{MSE}_1$ represents the actual mean square error between the optimized sound signal and the standard sound signal, $P_t$ represents the power of the t-th frame sound segment in the standard sound signal, $\hat{P}_t$ represents the power of the t-th frame sound segment in the optimized sound signal, and $c$ represents the total number of sound segments in each of the standard sound signal and the optimized sound signal;
and determining the ratio Q between the actual mean square error $\mathrm{MSE}_1$ and the preset mean square error $\mathrm{MSE}_2$ according to the following formula (3):

$$Q = \frac{\mathrm{MSE}_1}{\mathrm{MSE}_2} \qquad (3)$$

in the above formula (3), the value range of the preset mean square error $\mathrm{MSE}_2$ is [0.1, 0.6]; and finally, if the ratio Q is smaller than or equal to 1, combining and matching the optimized sound signal and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process, if the ratio Q is greater than 1, filtering and denoising the optimized sound signal again, re-determining the ratio Q according to the optimized sound signal subjected to filtering and denoising again, and if the re-determined ratio Q is smaller than or equal to 1, combining and matching the optimized sound signal subjected to filtering and denoising again and the screen recording image according to the screen recording operation time sequence information corresponding to the screen recording process.
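The final combination step pairs the optimized audio with the screen recording images by their operation timing. The sketch below assumes each captured image carries a timestamp and assigns it the audio span up to the next capture; this data layout is an assumption, since the claim only requires matching by the screen recording operation time sequence information.

```python
# Illustrative time-sequence pairing of optimized audio with screen recording
# images (the (timestamp, image) layout is an assumption).
import numpy as np

def combine_audio_and_images(optimized_signal: np.ndarray, sample_rate: int,
                             image_events: list):
    """image_events: list of (timestamp_in_seconds, image) pairs in capture order."""
    total_seconds = len(optimized_signal) / sample_rate
    combined = []
    for i, (t, image) in enumerate(image_events):
        t_next = image_events[i + 1][0] if i + 1 < len(image_events) else total_seconds
        start, end = int(t * sample_rate), int(t_next * sample_rate)
        combined.append((image, optimized_signal[start:end]))
    return combined
```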
CN202010857325.7A 2020-08-24 2020-08-24 Intelligent teaching screen recording method and system Active CN112019786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857325.7A CN112019786B (en) 2020-08-24 2020-08-24 Intelligent teaching screen recording method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857325.7A CN112019786B (en) 2020-08-24 2020-08-24 Intelligent teaching screen recording method and system

Publications (2)

Publication Number Publication Date
CN112019786A true CN112019786A (en) 2020-12-01
CN112019786B CN112019786B (en) 2021-05-25

Family

ID=73505690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857325.7A Active CN112019786B (en) 2020-08-24 2020-08-24 Intelligent teaching screen recording method and system

Country Status (1)

Country Link
CN (1) CN112019786B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104581346A (en) * 2015-01-14 2015-04-29 华东师范大学 Micro video course making system and method
CN105679120A (en) * 2016-01-29 2016-06-15 右江民族医学院 Method for making standard mandarin speech micro-courseware based on TTS technology
KR101722332B1 (en) * 2015-10-21 2017-04-03 한국해양대학교 산학협력단 Motion detection processing method using acoustic signal
CN107346665A (en) * 2017-06-29 2017-11-14 广州视源电子科技股份有限公司 Method, apparatus, equipment and the storage medium of audio detection
CN107402965A (en) * 2017-06-22 2017-11-28 中国农业大学 A kind of audio search method
CN107527623A (en) * 2017-08-07 2017-12-29 广州视源电子科技股份有限公司 Screen transmission method, device, electronic equipment and computer-readable recording medium
CN107610715A (en) * 2017-10-10 2018-01-19 昆明理工大学 A kind of similarity calculating method based on muli-sounds feature
CN108200526A (en) * 2017-12-29 2018-06-22 广州励丰文化科技股份有限公司 A kind of sound equipment adjustment method and device based on confidence level curve
CN108766461A (en) * 2018-07-17 2018-11-06 厦门美图之家科技有限公司 Audio feature extraction methods and device
CN109065059A (en) * 2018-09-26 2018-12-21 新巴特(安徽)智能科技有限公司 The method for identifying speaker with the voice cluster that audio frequency characteristics principal component is established
CN109635759A (en) * 2018-12-18 2019-04-16 北京嘉楠捷思信息技术有限公司 Signal processing method and device and computer readable storage medium
CN109637211A (en) * 2019-01-22 2019-04-16 合肥市云联鸿达信息技术有限公司 A kind of full-automatic recording and broadcasting system
CN110534121A (en) * 2019-08-21 2019-12-03 中国传媒大学 A kind of monitoring method and system of the audio content consistency based on frequency domain character

Also Published As

Publication number Publication date
CN112019786B (en) 2021-05-25

Similar Documents

Publication Publication Date Title
Hilger et al. Quantile based histogram equalization for noise robust large vocabulary speech recognition
DE60124842T2 (en) Noise-robbed pattern recognition
CN109034046B (en) Method for automatically identifying foreign matters in electric energy meter based on acoustic detection
CN101894551B (en) Device for automatically identifying cough
CN111429887B (en) Speech keyword recognition method, device and equipment based on end-to-end
CN107103903A (en) Acoustic training model method, device and storage medium based on artificial intelligence
CN110675862A (en) Corpus acquisition method, electronic device and storage medium
CN108305618B (en) Voice acquisition and search method, intelligent pen, search terminal and storage medium
CN110570873A (en) voiceprint wake-up method and device, computer equipment and storage medium
CN110807585A (en) Student classroom learning state online evaluation method and system
CN113628627B (en) Electric power industry customer service quality inspection system based on structured voice analysis
CN106971724A (en) A kind of anti-tampering method for recognizing sound-groove and system
CN110890087A (en) Voice recognition method and device based on cosine similarity
CN113539294A (en) Method for collecting and identifying sound of abnormal state of live pig
CN111489763B (en) GMM model-based speaker recognition self-adaption method in complex environment
CN110689885B (en) Machine synthesized voice recognition method, device, storage medium and electronic equipment
CN111477219A (en) Keyword distinguishing method and device, electronic equipment and readable storage medium
CN114974229A (en) Method and system for extracting abnormal behaviors based on audio data of power field operation
EP1199712B1 (en) Noise reduction method
CN115910097A (en) Audible signal identification method and system for latent fault of high-voltage circuit breaker
CN113077812A (en) Speech signal generation model training method, echo cancellation method, device and equipment
CN112019786B (en) Intelligent teaching screen recording method and system
US11238289B1 (en) Automatic lie detection method and apparatus for interactive scenarios, device and medium
CN114758645A (en) Training method, device and equipment of speech synthesis model and storage medium
CN114171057A (en) Transformer event detection method and system based on voiceprint

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right

Effective date of registration: 20221020

Granted publication date: 20210525
