CN116364072B

CN116364072B - Education information supervision method based on artificial intelligence

Info

Publication number: CN116364072B
Application number: CN202310634026.0A
Authority: CN
Inventors: 尹成鑫; 和震
Original assignee: Beijing Normal University
Current assignee: Beijing Normal University
Priority date: 2023-05-31
Filing date: 2023-05-31
Publication date: 2023-08-01
Anticipated expiration: 2043-05-31
Also published as: CN116364072A

Abstract

The invention discloses an education information supervision method based on artificial intelligence, which belongs to the technical field of electric digital data processing, wherein noise signals are firstly subjected to denoising treatment, the influence of the noise signals is reduced, then reconstruction treatment is carried out, the denoising signals are split to obtain component signals, the data distribution similarity of the component signals and the denoising signals is calculated, the component signals with high similarity are taken to replace the denoising signals, the influence of noise is further reduced, and meanwhile, component signals capable of replacing the denoising signals are selected, so that the effects of simplifying data and eliminating noise are achieved, analysis signals are input into a voice recognition model, characters to be supervised are obtained, and sensitive semantics are recognized from the characters to be supervised; the invention solves the problem of inadequate supervision caused by low recognition precision in semantic recognition of voice by adopting a neural network.

Description

Education information supervision method based on artificial intelligence

Technical Field

The invention relates to the technical field of electric digital data processing, in particular to an education information supervision method based on artificial intelligence.

Background

In the education process, teacher's comments have important guiding effect on the value form of students, and because of individual differences, each teacher's views are different, so that positive and negative effects exist on the education of students, and in order to minimize the negative effects, the education information needs to be supervised, so that the positive guiding effect of the education is ensured as much as possible.

And supervising the education information, namely realizing supervision on the language, and identifying whether the language contains sensitive vocabulary or sensitive semantics, so as to remind the education person to guide the students from the front.

The existing method for identifying whether the language contains sensitive words or sensitive semantics is to process voice signals by adopting a voice recognition language model so as to obtain corresponding semantics, wherein the voice recognition language model is usually identified by adopting a neural network, but the voice signals in the real environment have more noise, and the voice signals are directly identified by adopting the neural network, so that the problem of inadequate supervision caused by low identification precision exists.

Disclosure of Invention

Aiming at the defects in the prior art, the education information supervision method based on artificial intelligence provided by the invention solves the problem of inadequate supervision caused by low recognition precision in semantic recognition of voice by adopting a neural network.

In order to achieve the aim of the invention, the invention adopts the following technical scheme: an educational information supervision method based on artificial intelligence comprises the following steps:

s1, denoising a voice signal in education to obtain a denoised signal;

s2, carrying out reconstruction processing on the denoising signal to obtain a component signal;

s3, calculating the data distribution similarity of the component signals and the denoising signals;

s4, selecting a component signal with data distribution similarity larger than a similarity threshold value as an analysis signal;

s5, inputting the analysis signal into a voice recognition model to obtain characters to be supervised;

s6, recognizing sensitive semantics of the text to be supervised.

Further, the step S1 includes the following sub-steps:

s11, decomposing the voice signal by adopting a wavelet basis function and a decomposition scale to obtain a wavelet decomposition coefficient;

s12, processing the wavelet decomposition coefficient according to the denoising function to obtain an estimated wavelet coefficient;

s13, performing wavelet inverse transformation on the estimated wavelet coefficient to obtain a denoising signal.

Further, the denoising function in S12 is:

，

wherein,,for estimating wavelet coefficients +.>For denoising weight, ++>As a sign function +.>Is wavelet decomposition coefficient, +.>For the denoising threshold, ||is the absolute value;

the formula of the denoising threshold value is as follows:

，

wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the sequence of speech signals,for the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->Is logarithmicFunction (F)>Is a threshold coefficient.

The beneficial effects of the above further scheme are: the invention sets the denoising threshold value which is adaptively adjusted according to the decomposition scale and the current voice signal, so that the denoising threshold value can better filter the wavelet decomposition coefficient of noise, and the wavelet decomposition coefficient of useful signals is reserved.

Further, the formula of the reconstruction processing in S2 is:

，

wherein,,for denoising signals>Is->Individual component signals,/->For the number of component signals +.>As a result of the residual signal,is time.

Further, the calculation formula of the similarity of the data distribution in S3 is as follows:

，

wherein,,data distribution similarity>For denoising signals>Is->Individual component signals,/->For denoising signal mean value, < > is>Is->Mean value of individual component signals, < >>For a statistical period of time, +.>Time, || is the absolute value.

The beneficial effects of the above further scheme are: in the invention, noise signals are removed by denoising processing, noise signals are filtered, but in order to further improve the recognition accuracy of a voice recognition model, the denoising signals are subjected to reconstruction processing, a plurality of component signals are obtained, the data distribution similarity of each component signal and the denoising signals is calculated, the component signals with high similarity are screened out, the signals input into the voice recognition model are all useful signals, and the expression of the useful signals in the voice recognition model is enhanced.

Further, the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;

each analysis signal feature extraction unit is used for inputting an analysis signal, and the output end of each analysis signal feature extraction unit is connected with the input end of the feature fusion unit; the output end of the characteristic fusion unit is respectively connected with the input end of the maximum pooling layer and the input end of the average pooling layer; the input end of the attention unit is respectively connected with the output end of the maximum pooling layer and the output end of the average pooling layer, and the output end of the attention unit is connected with the input end of the LSTM unit; the input end of the convolution layer is connected with the output end of the LSTM unit, and the output end of the convolution layer is connected with the input end of the classification unit; the output end of the classifying unit is used as the output end of the voice recognition model.

The beneficial effects of the above further scheme are: according to the invention, a plurality of component signals with high similarity are input into a voice recognition model, characteristic signals are extracted through each analysis signal characteristic extraction unit, characteristics are fused in a characteristic fusion unit, significant characteristics are extracted through a maximum pooling layer, an average pooling layer extracts global characteristics, different attentions are respectively given to the characteristics output by the maximum pooling layer and the characteristics output by the average pooling layer in an attention unit, and the attention of effective characteristics is conveniently improved.

Further, the formula of the analysis signal feature extraction unit is:

，

wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analysis ofBias of signal feature extraction unit, +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant.

The beneficial effects of the above further scheme are: the invention firstly passes throughCalculating the proportionality coefficient of the analysis signal, and passing through the exponential function +.>Enhancement of the analytical signal, again by->And screening out effective characteristic information.

Further, the formula of the feature fusion unit is:

，

wherein,,is the feature fusion unit->Characteristics of the time output->The 1 st analysis signal feature extraction unit +.>Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is Hadamard product (Lepidium)>Extracting the number of units for analyzing the signal characteristics;

the formula of the attention unit is as follows:

，

wherein,,for attention unit->Characteristics of the time output->For the first weight of the attention unit, +.>For the second weight of the attention unit, +.>For the bias of the attention unit +.>Is->The time maximization layer inputs the number of the characteristics of the attention unit, < >>To maximize the pooling layer->Time output->Personal characteristics (I)>Is->The time-averaged pooling layer inputs the number of features of the attention unit,/for>For averaging pooling layer->Time output->Personal characteristics (I)>For the first denominator coefficient, < >>Is the second denominator coefficient.

The beneficial effects of the above further scheme are: the feature fusion unit fuses the features extracted by the analysis signal feature extraction units, so that the features extracted by the analysis signal feature extraction units are fused together, and partial feature loss during feature extraction is avoided; in the invention, different weights are given to the features of the maximum pooling layer and the average pooling layer in the attention unit, so that the attention of the effective features is improved.

Further, the step S6 includes the following sub-steps:

s61, word segmentation processing is carried out on the words to be supervised, so that phrases to be supervised are obtained;

s62, calculating the matching degree of the word group to be supervised corresponding to the word to be supervised and the sensitive sentence in the database;

and S63, when the matching degree is larger than a matching threshold, sensitive semantics exist in the text to be supervised.

Further, the formula for calculating the matching degree in S62 is:

，

wherein,,is the matching degree; />Is->Matching states of the phrases to be supervised, +.>The text to be supervised is +.>The phrase to be supervised is 1 when appearing in the sensitive statement, and is 0 when not appearing in the sensitive statement; />Is->Matching status of synonyms of the phrases to be supervised, < ->The text to be supervised is +.>Synonyms of the phrases to be supervised are 1 when appearing in the sensitive sentences, and are 0 when not appearing in the sensitive sentences; />Is->Matching status of paraphrasing of individual phrases to be supervised,/->Is the +.>The hyponym of each phrase to be supervised is 1 when appearing in the sensitive sentence, and is 0 when not appearing in the sensitive sentence; />For the first index coefficient, +.>For the second index coefficient, +.>For the third index coefficient, +.>The word groups to be supervised in the words to be supervised are obtained.

The beneficial effects of the above further scheme are: when the matching degree is calculated in the invention, the semantic matching degree between languages is evaluated by adopting the three sides of the phrase to be monitored, the synonym of the phrase to be monitored and the near-meaning word of the phrase to be monitored, and simultaneously, the index coefficient is set to further enhance the matching degree distinguishing degree of each sentence, and when the phrase to be monitored, the synonym or the near-meaning word is more, the matching degree is higher, and the index coefficient of the near-meaning word in the invention is higherAnd when the matching of the phrase to be supervised and the synonym is successful, the degree of improvement of the matching degree is higher, and the existence of sensitive semantics in the text to be supervised is easier to confirm.

In summary, the invention has the following beneficial effects: the invention firstly carries out denoising treatment on the voice signal to reduce the influence of the noise signal, then carries out reconstruction treatment, splits the denoising signal to obtain component signals, calculates the data distribution similarity of the component signals and the denoising signal, takes the component signals with high similarity to replace the denoising signal to further reduce the influence of the noise, simultaneously selects the component signals which can replace the denoising signal, screens out partial dissimilar signals so as to achieve the effects of simplifying data and eliminating the noise, inputs the analysis signal into a voice recognition model to obtain characters to be supervised, and recognizes sensitive semantics from the characters to be supervised. According to the invention, noise is filtered through two processes of denoising and reconstruction, so that the recognition accuracy of the voice recognition model is improved, the simplified data is easier to express in the voice recognition model, the recognition accuracy of the voice recognition model is further improved, and the education information is more conveniently supervised and the supervision accuracy is improved under the condition that accurate words to be supervised are extracted.

Drawings

FIG. 1 is a flow chart of an artificial intelligence based educational information administration method;

fig. 2 is a schematic diagram of a speech recognition model.

Detailed Description

The following description of the embodiments of the present invention is provided to facilitate understanding of the present invention by those skilled in the art, but it should be understood that the present invention is not limited to the scope of the embodiments, and all the inventions which make use of the inventive concept are protected by the spirit and scope of the present invention as defined and defined in the appended claims to those skilled in the art.

As shown in fig. 1, an artificial intelligence-based education information supervision method includes the following steps:

s1, denoising a voice signal in education to obtain a denoised signal;

in this embodiment, the voice signal in the present invention may be derived from lecturer voice in classrooms, training rooms and live webcast training rooms.

The step S1 comprises the following sub-steps:

the denoising function in S12 is:

，

the formula of the denoising threshold value is as follows:

，

wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the sequence of speech signals,for the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->As a logarithmic function>Is a threshold coefficient.

the formula of the reconstruction processing in the S2 is as follows:

，

the calculation formula of the data distribution similarity in the S3 is as follows:

，

In the invention, noise signals are removed by denoising processing, noise signals are filtered, but in order to further improve the recognition accuracy of a voice recognition model, the denoising signals are subjected to reconstruction processing, a plurality of component signals are obtained, the data distribution similarity of each component signal and the denoising signals is calculated, the component signals with high similarity are screened out, the signals input into the voice recognition model are all useful signals, and the expression of the useful signals in the voice recognition model is enhanced.

in this embodiment, the similarity threshold may be obtained through experiments, and set according to the requirements.

the analysis signal is a component signal that is greater than a similarity threshold.

As shown in fig. 2, the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;

The formula of the analysis signal characteristic extraction unit is as follows:

，

wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analyzing the bias of the signal feature extraction unit +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant.

The invention firstly passes throughCalculating the proportionality coefficient of the analysis signal, and passing through the exponential function +.>Enhancement of the analytical signal, again by->And screening out effective characteristic information.

The formula of the feature fusion unit is as follows:

，

wherein,,is the feature fusion unit->Characteristics of the time output->The 1 st analysis signal feature extraction unit +.>Features extracted at the moment,/->Is->Analysis Signal feature extraction Unit->Features extracted at the moment,/->Is the firstAnalysis Signal feature extraction Unit->Features extracted at the moment,/->Is Hadamard product (Lepidium)>Extracting the number of units for analyzing the signal characteristics;

the formula of the attention unit is as follows:

，

S6, recognizing sensitive semantics of the text to be supervised.

The step S6 comprises the following substeps:

in step S61, the word segmentation process in the present invention is equivalent to extracting each phrase in the text to be supervised, and determining whether the sensitive semantic exists according to whether each phrase appears in the sensitive sentences of the database.

In this embodiment, the matching threshold may be obtained through experiments, and set according to the requirements.

The formula for calculating the matching degree in S62 is as follows:

，

In this embodiment, for the same phrase to be monitored, its matching statusWhen 1, the matching state of the synonym or the paraphrasing of the synonym is 1; for the same phrase to be supervised, the matching state is +.>When 0, the matching state of the synonym or the paraphraseology can be 0 or 1.

In this embodiment, the synonyms and the paraphraseology of the phrase may be first formulated by a person according to a dictionary, and the synonyms and the paraphraseology may be stored in a database, so as to facilitate retrieval during searching.

The invention firstly carries out denoising treatment on the voice signal to reduce the influence of the noise signal, then carries out reconstruction treatment, splits the denoising signal to obtain component signals, calculates the data distribution similarity of the component signals and the denoising signal, takes the component signals with high similarity to replace the denoising signal to further reduce the influence of the noise, simultaneously selects the component signals which can replace the denoising signal, screens out partial dissimilar signals so as to achieve the effects of simplifying data and eliminating the noise, inputs the analysis signal into a voice recognition model to obtain characters to be supervised, and recognizes sensitive semantics from the characters to be supervised. According to the invention, noise is filtered through two processes of denoising and reconstruction, so that the recognition accuracy of the voice recognition model is improved, the simplified data is easier to express in the voice recognition model, the recognition accuracy of the voice recognition model is further improved, and the education information is more conveniently supervised and the supervision accuracy is improved under the condition that accurate words to be supervised are extracted.

The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. An educational information supervision method based on artificial intelligence is characterized by comprising the following steps:

s1, denoising a voice signal in education to obtain a denoised signal;

s6, recognizing sensitive semantics of the text to be supervised;

，

wherein,,data distribution similarity>For denoising signals>Is->Individual component signals,/->For denoising signal mean value, < > is>Is->Mean value of individual component signals, < >>For a statistical period of time, +.>Time, || is the absolute value;

the speech recognition model in S5 includes: the device comprises a plurality of analysis signal feature extraction units, a feature fusion unit, a maximum pooling layer, an average pooling layer, an attention unit, an LSTM unit, a convolution layer and a classification unit;

each analysis signal feature extraction unit is used for inputting an analysis signal, and the output end of each analysis signal feature extraction unit is connected with the input end of the feature fusion unit; the output end of the characteristic fusion unit is respectively connected with the input end of the maximum pooling layer and the input end of the average pooling layer; the input end of the attention unit is respectively connected with the output end of the maximum pooling layer and the output end of the average pooling layer, and the output end of the attention unit is connected with the input end of the LSTM unit; the input end of the convolution layer is connected with the output end of the LSTM unit, and the output end of the convolution layer is connected with the input end of the classification unit; the output end of the classifying unit is used as the output end of the voice recognition model;

，

wherein,,extraction unit for analyzing signal characteristics>Features extracted at the moment,/->To activate the function +.>For analyzing the weights of the signal feature extraction unit, +.>For analyzing the bias of the signal feature extraction unit +.>Is->Time input analysis signal feature extraction unit analysis signal, < >>Is a natural constant;

the formula of the feature fusion unit is as follows:

，

the formula of the attention unit is as follows:

，

wherein,,for attention unit->Characteristics of the time output->For the first weight of the attention unit, +.>For the second weight of the attention unit, +.>For the bias of the attention unit +.>Is->The time maximization layer inputs the number of the characteristics of the attention unit, < >>To maximize the pooling layer->Time output->Personal characteristics (I)>Is->The time-averaged pooling layer inputs the number of features of the attention unit,/for>For averaging pooling layer->Time output->Personal characteristics (I)>For the first denominator coefficient, < >>Is a second denominator coefficient;

the step S6 comprises the following substeps:

s63, when the matching degree is larger than a matching threshold, sensitive semantics exist in the text to be supervised;

the formula for calculating the matching degree in S62 is as follows:

，

wherein,,is the matching degree; />Is->Matching states of the phrases to be supervised, +.>The text to be supervised is +.>The phrase to be supervised is 1 when appearing in the sensitive statement, and is 0 when not appearing in the sensitive statement; />Is->Matching status of synonyms of the phrases to be supervised, < ->The text to be supervised is +.>To be monitoredSynonyms of the management phrase are 1 when appearing in the sensitive sentence, and are 0 when not appearing in the sensitive sentence; />Is->Matching status of paraphrasing of individual phrases to be supervised,/->Is the +.>The hyponym of each phrase to be supervised is 1 when appearing in the sensitive sentence, and is 0 when not appearing in the sensitive sentence; />For the first index coefficient, +.>For the second index coefficient, +.>For the third index coefficient, +.>The word groups to be supervised in the words to be supervised are obtained.

2. The artificial intelligence based education information supervision method according to claim 1, wherein S1 includes the sub-steps of:

3. The artificial intelligence based education information supervision method according to claim 2, wherein the denoising function in S12 is:

，

the formula of the denoising threshold value is as follows:

，

wherein,,for denoising threshold value, ++>Is the +.>Personal value (s)/(s)>Is the mean value of the speech signal sequence,/-, for example>For the length of the speech signal sequence, < > for>Is natural constant (18)>For decomposing scale, ->As a logarithmic function>Is a threshold coefficient.

4. The artificial intelligence based education information supervision method according to claim 1, wherein the formula of the reconstruction process in S2 is:

，

wherein,,for denoising signals>Is->Individual component signals,/->For the number of component signals +.>For the residual signal>Is time.