CN116364072B - Education information supervision method based on artificial intelligence - Google Patents

Education information supervision method based on artificial intelligence Download PDF

Info

Publication number
CN116364072B
CN116364072B CN202310634026.0A CN202310634026A CN116364072B CN 116364072 B CN116364072 B CN 116364072B CN 202310634026 A CN202310634026 A CN 202310634026A CN 116364072 B CN116364072 B CN 116364072B
Authority
CN
China
Prior art keywords
supervised
denoising
unit
signal
signals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310634026.0A
Other languages
Chinese (zh)
Other versions
CN116364072A (en
Inventor
尹成鑫
和震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University filed Critical Beijing Normal University
Priority to CN202310634026.0A priority Critical patent/CN116364072B/en
Publication of CN116364072A publication Critical patent/CN116364072A/en
Application granted granted Critical
Publication of CN116364072B publication Critical patent/CN116364072B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/20Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/1822Parsing for meaning understanding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an education information supervision method based on artificial intelligence, which belongs to the technical field of electric digital data processing, wherein noise signals are firstly subjected to denoising treatment, the influence of the noise signals is reduced, then reconstruction treatment is carried out, the denoising signals are split to obtain component signals, the data distribution similarity of the component signals and the denoising signals is calculated, the component signals with high similarity are taken to replace the denoising signals, the influence of noise is further reduced, and meanwhile, component signals capable of replacing the denoising signals are selected, so that the effects of simplifying data and eliminating noise are achieved, analysis signals are input into a voice recognition model, characters to be supervised are obtained, and sensitive semantics are recognized from the characters to be supervised; the invention solves the problem of inadequate supervision caused by low recognition precision in semantic recognition of voice by adopting a neural network.

Description

Education information supervision method based on artificial intelligence
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to an education information supervision method based on artificial intelligence.
Background
In the education process, teacher's comments have important guiding effect on the value form of students, and because of individual differences, each teacher's views are different, so that positive and negative effects exist on the education of students, and in order to minimize the negative effects, the education information needs to be supervised, so that the positive guiding effect of the education is ensured as much as possible.
And supervising the education information, namely realizing supervision on the language, and identifying whether the language contains sensitive vocabulary or sensitive semantics, so as to remind the education person to guide the students from the front.
The existing method for identifying whether the language contains sensitive words or sensitive semantics is to process voice signals by adopting a voice recognition language model so as to obtain corresponding semantics, wherein the voice recognition language model is usually identified by adopting a neural network, but the voice signals in the real environment have more noise, and the voice signals are directly identified by adopting the neural network, so that the problem of inadequate supervision caused by low identification precision exists.
Disclosure of Invention
Aiming at the defects in the prior art, the education information supervision method based on artificial intelligence provided by the invention solves the problem of inadequate supervision caused by low recognition precision in semantic recognition of voice by adopting a neural network.
In order to achieve the aim of the invention, the invention adopts the following technical scheme: an educational information supervision method based on artificial intelligence comprises the following steps:
s1, denoising a voice signal in education to obtain a denoised signal;
s2, carrying out reconstruction processing on the denoising signal to obtain a component signal;
s3, calculating the data distribution similarity of the component signals and the denoising signals;
s4, selecting a component signal with data distribution similarity larger than a similarity threshold value as an analysis signal;
s5, inputting the analysis signal into a voice recognition model to obtain characters to be supervised;
s6, recognizing sensitive semantics of the text to be supervised.
Further, the step S1 includes the following sub-steps:
s11, decomposing the voice signal by adopting a wavelet basis function and a decomposition scale to obtain a wavelet decomposition coefficient;
s12, processing the wavelet decomposition coefficient according to the denoising function to obtain an estimated wavelet coefficient;
s13, performing wavelet inverse transformation on the estimated wavelet coefficient to obtain a denoising signal.
Further, the denoising function in S12 is:
wherein,,for estimating wavelet coefficients +.>For denoising weight, ++>As a sign function +.>Is wavelet decomposition coefficient, +.>For the denoising threshold, ||is the absolute value;
the formula of the denoising threshold value is as follows:
wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the sequence of speech signals,for the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->Is logarithmicFunction (F)>Is a threshold coefficient.
The beneficial effects of the above further scheme are: the invention sets the denoising threshold value which is adaptively adjusted according to the decomposition scale and the current voice signal, so that the denoising threshold value can better filter the wavelet decomposition coefficient of noise, and the wavelet decomposition coefficient of useful signals is reserved.
Further, the formula of the reconstruction processing in S2 is:
wherein,,for denoising signals>Is->Individual component signals,/->For the number of component signals +.>As a result of the residual signal,is time.
Further, the calculation formula of the similarity of the data distribution in S3 is as follows:
wherein,,data distribution similarity>For denoising signals>Is->Individual component signals,/->For denoising signal mean value, < > is>Is->Mean value of individual component signals, < >>For a statistical period of time, +.>Time, || is the absolute value.
The beneficial effects of the above further scheme are: in the invention, noise signals are removed by denoising processing, noise signals are filtered, but in order to further improve the recognition accuracy of a voice recognition model, the denoising signals are subjected to reconstruction processing, a plurality of component signals are obtained, the data distribution similarity of each component signal and the denoising signals is calculated, the component signals with high similarity are screened out, the signals input into the voice recognition model are all useful signals, and the expression of the useful signals in the voice recognition model is enhanced.
Further, the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;
each analysis signal feature extraction unit is used for inputting an analysis signal, and the output end of each analysis signal feature extraction unit is connected with the input end of the feature fusion unit; the output end of the characteristic fusion unit is respectively connected with the input end of the maximum pooling layer and the input end of the average pooling layer; the input end of the attention unit is respectively connected with the output end of the maximum pooling layer and the output end of the average pooling layer, and the output end of the attention unit is connected with the input end of the LSTM unit; the input end of the convolution layer is connected with the output end of the LSTM unit, and the output end of the convolution layer is connected with the input end of the classification unit; the output end of the classifying unit is used as the output end of the voice recognition model.
The beneficial effects of the above further scheme are: according to the invention, a plurality of component signals with high similarity are input into a voice recognition model, characteristic signals are extracted through each analysis signal characteristic extraction unit, characteristics are fused in a characteristic fusion unit, significant characteristics are extracted through a maximum pooling layer, an average pooling layer extracts global characteristics, different attentions are respectively given to the characteristics output by the maximum pooling layer and the characteristics output by the average pooling layer in an attention unit, and the attention of effective characteristics is conveniently improved.
Further, the formula of the analysis signal feature extraction unit is:
wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analysis ofBias of signal feature extraction unit, +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant.
The beneficial effects of the above further scheme are: the invention firstly passes throughCalculating the proportionality coefficient of the analysis signal, and passing through the exponential function +.>Enhancement of the analytical signal, again by->And screening out effective characteristic information.
Further, the formula of the feature fusion unit is:
wherein,,is the feature fusion unit->Characteristics of the time output->The 1 st analysis signal feature extraction unit +.>Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is Hadamard product (Lepidium)>Extracting the number of units for analyzing the signal characteristics;
the formula of the attention unit is as follows:
wherein,,for attention unit->Characteristics of the time output->For the first weight of the attention unit, +.>For the second weight of the attention unit, +.>For the bias of the attention unit +.>Is->The time maximization layer inputs the number of the characteristics of the attention unit, < >>To maximize the pooling layer->Time output->Personal characteristics (I)>Is->The time-averaged pooling layer inputs the number of features of the attention unit,/for>For averaging pooling layer->Time output->Personal characteristics (I)>For the first denominator coefficient, < >>Is the second denominator coefficient.
The beneficial effects of the above further scheme are: the feature fusion unit fuses the features extracted by the analysis signal feature extraction units, so that the features extracted by the analysis signal feature extraction units are fused together, and partial feature loss during feature extraction is avoided; in the invention, different weights are given to the features of the maximum pooling layer and the average pooling layer in the attention unit, so that the attention of the effective features is improved.
Further, the step S6 includes the following sub-steps:
s61, word segmentation processing is carried out on the words to be supervised, so that phrases to be supervised are obtained;
s62, calculating the matching degree of the word group to be supervised corresponding to the word to be supervised and the sensitive sentence in the database;
and S63, when the matching degree is larger than a matching threshold, sensitive semantics exist in the text to be supervised.
Further, the formula for calculating the matching degree in S62 is:
wherein,,is the matching degree; />Is->Matching states of the phrases to be supervised, +.>The text to be supervised is +.>The phrase to be supervised is 1 when appearing in the sensitive statement, and is 0 when not appearing in the sensitive statement; />Is->Matching status of synonyms of the phrases to be supervised, < ->The text to be supervised is +.>Synonyms of the phrases to be supervised are 1 when appearing in the sensitive sentences, and are 0 when not appearing in the sensitive sentences; />Is->Matching status of paraphrasing of individual phrases to be supervised,/->Is the +.>The hyponym of each phrase to be supervised is 1 when appearing in the sensitive sentence, and is 0 when not appearing in the sensitive sentence; />For the first index coefficient, +.>For the second index coefficient, +.>For the third index coefficient, +.>The word groups to be supervised in the words to be supervised are obtained.
The beneficial effects of the above further scheme are: when the matching degree is calculated in the invention, the semantic matching degree between languages is evaluated by adopting the three sides of the phrase to be monitored, the synonym of the phrase to be monitored and the near-meaning word of the phrase to be monitored, and simultaneously, the index coefficient is set to further enhance the matching degree distinguishing degree of each sentence, and when the phrase to be monitored, the synonym or the near-meaning word is more, the matching degree is higher, and the index coefficient of the near-meaning word in the invention is higherAnd when the matching of the phrase to be supervised and the synonym is successful, the degree of improvement of the matching degree is higher, and the existence of sensitive semantics in the text to be supervised is easier to confirm.
In summary, the invention has the following beneficial effects: the invention firstly carries out denoising treatment on the voice signal to reduce the influence of the noise signal, then carries out reconstruction treatment, splits the denoising signal to obtain component signals, calculates the data distribution similarity of the component signals and the denoising signal, takes the component signals with high similarity to replace the denoising signal to further reduce the influence of the noise, simultaneously selects the component signals which can replace the denoising signal, screens out partial dissimilar signals so as to achieve the effects of simplifying data and eliminating the noise, inputs the analysis signal into a voice recognition model to obtain characters to be supervised, and recognizes sensitive semantics from the characters to be supervised. According to the invention, noise is filtered through two processes of denoising and reconstruction, so that the recognition accuracy of the voice recognition model is improved, the simplified data is easier to express in the voice recognition model, the recognition accuracy of the voice recognition model is further improved, and the education information is more conveniently supervised and the supervision accuracy is improved under the condition that accurate words to be supervised are extracted.
Drawings
FIG. 1 is a flow chart of an artificial intelligence based educational information administration method;
fig. 2 is a schematic diagram of a speech recognition model.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding of the present invention by those skilled in the art, but it should be understood that the present invention is not limited to the scope of the embodiments, and all the inventions which make use of the inventive concept are protected by the spirit and scope of the present invention as defined and defined in the appended claims to those skilled in the art.
As shown in fig. 1, an artificial intelligence-based education information supervision method includes the following steps:
s1, denoising a voice signal in education to obtain a denoised signal;
in this embodiment, the voice signal in the present invention may be derived from lecturer voice in classrooms, training rooms and live webcast training rooms.
The step S1 comprises the following sub-steps:
s11, decomposing the voice signal by adopting a wavelet basis function and a decomposition scale to obtain a wavelet decomposition coefficient;
s12, processing the wavelet decomposition coefficient according to the denoising function to obtain an estimated wavelet coefficient;
the denoising function in S12 is:
wherein,,for estimating wavelet coefficients +.>For denoising weight, ++>As a sign function +.>Is wavelet decomposition coefficient, +.>For the denoising threshold, ||is the absolute value;
the formula of the denoising threshold value is as follows:
wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the sequence of speech signals,for the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->As a logarithmic function>Is a threshold coefficient.
S13, performing wavelet inverse transformation on the estimated wavelet coefficient to obtain a denoising signal.
S2, carrying out reconstruction processing on the denoising signal to obtain a component signal;
the formula of the reconstruction processing in the S2 is as follows:
wherein,,for denoising signals>Is->Individual component signals,/->For the number of component signals +.>As a result of the residual signal,is time.
S3, calculating the data distribution similarity of the component signals and the denoising signals;
the calculation formula of the data distribution similarity in the S3 is as follows:
wherein,,data distribution similarity>For denoising signals>Is->Individual component signals,/->For denoising signal mean value, < > is>Is->Mean value of individual component signals, < >>For a statistical period of time, +.>Time, || is the absolute value.
In the invention, noise signals are removed by denoising processing, noise signals are filtered, but in order to further improve the recognition accuracy of a voice recognition model, the denoising signals are subjected to reconstruction processing, a plurality of component signals are obtained, the data distribution similarity of each component signal and the denoising signals is calculated, the component signals with high similarity are screened out, the signals input into the voice recognition model are all useful signals, and the expression of the useful signals in the voice recognition model is enhanced.
S4, selecting a component signal with data distribution similarity larger than a similarity threshold value as an analysis signal;
in this embodiment, the similarity threshold may be obtained through experiments, and set according to the requirements.
S5, inputting the analysis signal into a voice recognition model to obtain characters to be supervised;
the analysis signal is a component signal that is greater than a similarity threshold.
As shown in fig. 2, the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;
each analysis signal feature extraction unit is used for inputting an analysis signal, and the output end of each analysis signal feature extraction unit is connected with the input end of the feature fusion unit; the output end of the characteristic fusion unit is respectively connected with the input end of the maximum pooling layer and the input end of the average pooling layer; the input end of the attention unit is respectively connected with the output end of the maximum pooling layer and the output end of the average pooling layer, and the output end of the attention unit is connected with the input end of the LSTM unit; the input end of the convolution layer is connected with the output end of the LSTM unit, and the output end of the convolution layer is connected with the input end of the classification unit; the output end of the classifying unit is used as the output end of the voice recognition model.
The formula of the analysis signal characteristic extraction unit is as follows:
wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analyzing the bias of the signal feature extraction unit +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant.
The invention firstly passes throughCalculating the proportionality coefficient of the analysis signal, and passing through the exponential function +.>Enhancement of the analytical signal, again by->And screening out effective characteristic information.
The formula of the feature fusion unit is as follows:
wherein,,is the feature fusion unit->Characteristics of the time output->The 1 st analysis signal feature extraction unit +.>Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is the firstAnalysis Signal feature extraction Unit->Features extracted at the moment,/->Is Hadamard product (Lepidium)>Extracting the number of units for analyzing the signal characteristics;
the formula of the attention unit is as follows:
wherein,,for attention unit->Characteristics of the time output->For the first weight of the attention unit, +.>For the second weight of the attention unit, +.>For the bias of the attention unit +.>Is->The time maximization layer inputs the number of the characteristics of the attention unit, < >>To maximize the pooling layer->Time output->Personal characteristics (I)>Is->The time-averaged pooling layer inputs the number of features of the attention unit,/for>For averaging pooling layer->Time output->Personal characteristics (I)>For the first denominator coefficient, < >>Is the second denominator coefficient.
S6, recognizing sensitive semantics of the text to be supervised.
The step S6 comprises the following substeps:
s61, word segmentation processing is carried out on the words to be supervised, so that phrases to be supervised are obtained;
in step S61, the word segmentation process in the present invention is equivalent to extracting each phrase in the text to be supervised, and determining whether the sensitive semantic exists according to whether each phrase appears in the sensitive sentences of the database.
S62, calculating the matching degree of the word group to be supervised corresponding to the word to be supervised and the sensitive sentence in the database;
and S63, when the matching degree is larger than a matching threshold, sensitive semantics exist in the text to be supervised.
In this embodiment, the matching threshold may be obtained through experiments, and set according to the requirements.
The formula for calculating the matching degree in S62 is as follows:
wherein,,is the matching degree; />Is->Matching states of the phrases to be supervised, +.>The text to be supervised is +.>The phrase to be supervised is 1 when appearing in the sensitive statement, and is 0 when not appearing in the sensitive statement; />Is->Matching status of synonyms of the phrases to be supervised, < ->The text to be supervised is +.>Synonyms of the phrases to be supervised are 1 when appearing in the sensitive sentences, and are 0 when not appearing in the sensitive sentences; />Is->Matching status of paraphrasing of individual phrases to be supervised,/->Is the +.>The hyponym of each phrase to be supervised is 1 when appearing in the sensitive sentence, and is 0 when not appearing in the sensitive sentence; />For the first index coefficient, +.>For the second index coefficient, +.>For the third index coefficient, +.>The word groups to be supervised in the words to be supervised are obtained.
In this embodiment, for the same phrase to be monitored, its matching statusWhen 1, the matching state of the synonym or the paraphrasing of the synonym is 1; for the same phrase to be supervised, the matching state is +.>When 0, the matching state of the synonym or the paraphraseology can be 0 or 1.
In this embodiment, the synonyms and the paraphraseology of the phrase may be first formulated by a person according to a dictionary, and the synonyms and the paraphraseology may be stored in a database, so as to facilitate retrieval during searching.
The invention firstly carries out denoising treatment on the voice signal to reduce the influence of the noise signal, then carries out reconstruction treatment, splits the denoising signal to obtain component signals, calculates the data distribution similarity of the component signals and the denoising signal, takes the component signals with high similarity to replace the denoising signal to further reduce the influence of the noise, simultaneously selects the component signals which can replace the denoising signal, screens out partial dissimilar signals so as to achieve the effects of simplifying data and eliminating the noise, inputs the analysis signal into a voice recognition model to obtain characters to be supervised, and recognizes sensitive semantics from the characters to be supervised. According to the invention, noise is filtered through two processes of denoising and reconstruction, so that the recognition accuracy of the voice recognition model is improved, the simplified data is easier to express in the voice recognition model, the recognition accuracy of the voice recognition model is further improved, and the education information is more conveniently supervised and the supervision accuracy is improved under the condition that accurate words to be supervised are extracted.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (4)

1. An educational information supervision method based on artificial intelligence is characterized by comprising the following steps:
s1, denoising a voice signal in education to obtain a denoised signal;
s2, carrying out reconstruction processing on the denoising signal to obtain a component signal;
s3, calculating the data distribution similarity of the component signals and the denoising signals;
s4, selecting a component signal with data distribution similarity larger than a similarity threshold value as an analysis signal;
s5, inputting the analysis signal into a voice recognition model to obtain characters to be supervised;
s6, recognizing sensitive semantics of the text to be supervised;
the calculation formula of the data distribution similarity in the S3 is as follows:
wherein,,data distribution similarity>For denoising signals>Is->Individual component signals,/->For denoising signal mean value, < > is>Is->Mean value of individual component signals, < >>For a statistical period of time, +.>Time, || is the absolute value;
the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;
each analysis signal feature extraction unit is used for inputting an analysis signal, and the output end of each analysis signal feature extraction unit is connected with the input end of the feature fusion unit; the output end of the characteristic fusion unit is respectively connected with the input end of the maximum pooling layer and the input end of the average pooling layer; the input end of the attention unit is respectively connected with the output end of the maximum pooling layer and the output end of the average pooling layer, and the output end of the attention unit is connected with the input end of the LSTM unit; the input end of the convolution layer is connected with the output end of the LSTM unit, and the output end of the convolution layer is connected with the input end of the classification unit; the output end of the classifying unit is used as the output end of the voice recognition model;
the formula of the analysis signal characteristic extraction unit is as follows:
wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analyzing the bias of the signal feature extraction unit +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant;
the formula of the feature fusion unit is as follows:
wherein,,is the feature fusion unit->Characteristics of the time output->The 1 st analysis signal feature extraction unit +.>Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is Hadamard product (Lepidium)>Extracting the number of units for analyzing the signal characteristics;
the formula of the attention unit is as follows:
wherein,,for attention unit->Characteristics of the time output->For the first weight of the attention unit, +.>For the second weight of the attention unit, +.>For the bias of the attention unit +.>Is->The time maximization layer inputs the number of the characteristics of the attention unit, < >>To maximize the pooling layer->Time output->Personal characteristics (I)>Is->The time-averaged pooling layer inputs the number of features of the attention unit,/for>For averaging pooling layer->Time output->Personal characteristics (I)>For the first denominator coefficient, < >>Is a second denominator coefficient;
the step S6 comprises the following substeps:
s61, word segmentation processing is carried out on the words to be supervised, so that phrases to be supervised are obtained;
s62, calculating the matching degree of the word group to be supervised corresponding to the word to be supervised and the sensitive sentence in the database;
s63, when the matching degree is larger than a matching threshold, sensitive semantics exist in the text to be supervised;
the formula for calculating the matching degree in S62 is as follows:
wherein,,is the matching degree; />Is->Matching states of the phrases to be supervised, +.>The text to be supervised is +.>The phrase to be supervised is 1 when appearing in the sensitive statement, and is 0 when not appearing in the sensitive statement; />Is->Matching status of synonyms of the phrases to be supervised, < ->The text to be supervised is +.>To be monitoredSynonyms of the management phrase are 1 when appearing in the sensitive sentence, and are 0 when not appearing in the sensitive sentence; />Is->Matching status of paraphrasing of individual phrases to be supervised,/->Is the +.>The hyponym of each phrase to be supervised is 1 when appearing in the sensitive sentence, and is 0 when not appearing in the sensitive sentence; />For the first index coefficient, +.>For the second index coefficient, +.>For the third index coefficient, +.>The word groups to be supervised in the words to be supervised are obtained.
2. The artificial intelligence based education information supervision method according to claim 1, wherein S1 includes the sub-steps of:
s11, decomposing the voice signal by adopting a wavelet basis function and a decomposition scale to obtain a wavelet decomposition coefficient;
s12, processing the wavelet decomposition coefficient according to the denoising function to obtain an estimated wavelet coefficient;
s13, performing wavelet inverse transformation on the estimated wavelet coefficient to obtain a denoising signal.
3. The artificial intelligence based education information supervision method according to claim 2, wherein the denoising function in S12 is:
wherein,,for estimating wavelet coefficients +.>For denoising weight, ++>As a sign function +.>Is wavelet decomposition coefficient, +.>For the denoising threshold, ||is the absolute value;
the formula of the denoising threshold value is as follows:
wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the speech signal sequence,/-, for example>For the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->As a logarithmic function>Is a threshold coefficient.
4. The artificial intelligence based education information supervision method according to claim 1, wherein the formula of the reconstruction process in S2 is:
wherein,,for denoising signals>Is->Individual component signals,/->For the number of component signals +.>For the residual signal>Is time.
CN202310634026.0A 2023-05-31 2023-05-31 Education information supervision method based on artificial intelligence Active CN116364072B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310634026.0A CN116364072B (en) 2023-05-31 2023-05-31 Education information supervision method based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310634026.0A CN116364072B (en) 2023-05-31 2023-05-31 Education information supervision method based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN116364072A CN116364072A (en) 2023-06-30
CN116364072B true CN116364072B (en) 2023-08-01

Family

ID=86923434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310634026.0A Active CN116364072B (en) 2023-05-31 2023-05-31 Education information supervision method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116364072B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117316187B (en) * 2023-11-30 2024-02-06 山东同其万疆科技创新有限公司 English teaching management system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10319390B2 (en) * 2016-02-19 2019-06-11 New York University Method and system for multi-talker babble noise reduction
CN109637520B (en) * 2018-10-16 2023-08-22 平安科技(深圳)有限公司 Sensitive content identification method, device, terminal and medium based on voice analysis
CN109410977B (en) * 2018-12-19 2022-09-23 东南大学 Voice segment detection method based on MFCC similarity of EMD-Wavelet
CN110136709A (en) * 2019-04-26 2019-08-16 国网浙江省电力有限公司信息通信分公司 Audio recognition method and video conferencing system based on speech recognition
CN111625641B (en) * 2020-07-30 2020-12-01 浙江大学 Dialog intention recognition method and system based on multi-dimensional semantic interaction representation model
CN116052641A (en) * 2022-12-28 2023-05-02 天翼物联科技有限公司 Terminal recording state detection method, system and storage medium based on DPI technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GTDNN-Based Voice Conversion Using DAEs with Binary Distributed Hidden Units;Yi-Yang Ding,Ya-Jun Hu, Zhen-Hua Ling;2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP);240-244 *

Also Published As

Publication number Publication date
CN116364072A (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN105845134B (en) Spoken language evaluation method and system for freely reading question types
CN111125349A (en) Graph model text abstract generation method based on word frequency and semantics
TWI536364B (en) Automatic speech recognition method and system
CN112784696B (en) Lip language identification method, device, equipment and storage medium based on image identification
CN107797987B (en) Bi-LSTM-CNN-based mixed corpus named entity identification method
CN110347787B (en) Interview method and device based on AI auxiliary interview scene and terminal equipment
CN105551485B (en) Voice file retrieval method and system
CN108090099B (en) Text processing method and device
CN109741824B (en) Medical inquiry method based on machine learning
CN116364072B (en) Education information supervision method based on artificial intelligence
CN112434164B (en) Network public opinion analysis method and system taking topic discovery and emotion analysis into consideration
CN111145903A (en) Method and device for acquiring vertigo inquiry text, electronic equipment and inquiry system
CN108536781B (en) Social network emotion focus mining method and system
CN111191463A (en) Emotion analysis method and device, electronic equipment and storage medium
CN109543036A (en) Text Clustering Method based on semantic similarity
Bigot et al. Person name recognition in ASR outputs using continuous context models
CN113312907B (en) Remote supervision relation extraction method and device based on hybrid neural network
CN114219248A (en) Man-sentry matching method based on LDA model, dependency syntax and deep learning
Jui et al. A machine learning-based segmentation approach for measuring similarity between sign languages
CN116842168B (en) Cross-domain problem processing method and device, electronic equipment and storage medium
CN115878847B (en) Video guiding method, system, equipment and storage medium based on natural language
CN116562278A (en) Word similarity detection method and system
CN112612895B (en) Method for calculating attitude index of main topic
CN112071304B (en) Semantic analysis method and device
CN114676699A (en) Entity emotion analysis method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant