WO2020216286A1

WO2020216286A1 - Method for training teaching style prediction model, and computer storage medium

Info

Publication number: WO2020216286A1
Application number: PCT/CN2020/086363
Authority: WO
Inventors: 杨嵩; 黄健; 杨非; 刘子韬; 黄琰
Original assignee: 北京新唐思创教育科技有限公司
Priority date: 2019-04-23
Filing date: 2020-04-23
Publication date: 2020-10-29
Also published as: CN111832787A; CN111832787B

Abstract

Provided are a method for training a teaching style prediction model, and a computer storage medium. The method comprises: determining, on the basis of high-dimensional feature data of a teaching content sample, multiple groups of low-dimensional feature data of the teaching content sample; by means of a teaching style prediction model to be trained, acquiring, on the basis of the multiple groups of low-dimensional feature data, teaching style prediction data corresponding to the teaching content sample; and on the basis of teaching style labeling data of the teaching content sample and the teaching style prediction data, training the teaching style prediction model. In the embodiments of the present invention, high-dimensional feature data of a teaching content sample is grouped into multiple groups of low-dimensional feature data, thereby greatly reducing the dimensionality of input features of a teaching style prediction module to be trained, so that the teaching style prediction performance of the trained teaching style prediction module can be effectively improved.

Description

Training method for a teacher style prediction model, and computer storage medium

This application claims priority to Chinese patent application No. 201910330162.4, filed with the Chinese Patent Office on April 23, 2019 and entitled "Training Method for Teacher Style Prediction Model and Computer Storage Medium", the entire content of which is incorporated herein by reference.

Technical field

The embodiments of the present invention relate to the field of artificial intelligence, and in particular to a training method for a teacher style prediction model and a computer storage medium.

Background

Teacher style is a judgment of a teacher's individual value and an important part of education evaluation. Predicting the teaching style of teachers enables a school's teaching management department and the teachers themselves to understand the teaching situation, discover problems, sum up experience and improve their work, so as to achieve the ultimate goal of improving teaching quality. How to predict teacher style fairly and accurately has therefore long been a problem explored in education.

Currently, teacher style is mainly predicted by modeling. The input data of the model can include audio and video data of a teacher's teaching. Because it is difficult to obtain teaching data samples covering different teacher styles, the number of teaching data samples is often small. In addition, the dimensionality of the features extracted from the teaching data samples is often high. The model is therefore prone to overfitting during training, and a model with good performance cannot be obtained. To address the small number of samples and the high dimensionality of the extracted features, most existing methods use principal component analysis to reduce the dimensionality of the high-dimensional features extracted from the teaching data samples, and then train the model on the reduced features. However, this approach inevitably discards part of the original features, cannot make full use of the information they carry, and does not allow the specific meaning of the reduced features to be analyzed. So far, therefore, there is no model training method that can effectively improve the performance of teacher style prediction.

Summary of the invention

In view of this, one of the technical problems solved by the embodiments of the present invention is to provide a training method of a teacher's style prediction model and a non-transitory computer-readable storage medium to solve at least one of the above-mentioned problems.

An embodiment of the present invention provides a training method for a teacher style prediction model. The method includes: determining multiple sets of second feature data of a teaching content sample based on first feature data of the teaching content sample, wherein the first feature data has a higher dimension than the second feature data; obtaining, through the teacher style prediction model to be trained and based on the multiple sets of second feature data, teacher style prediction data corresponding to the teaching content sample; and training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data.

An embodiment of the present invention also provides a non-transitory computer-readable storage medium storing a readable program, and the readable program includes: instructions for determining multiple sets of second feature data of a teaching content sample based on first feature data of the teaching content sample, wherein the first feature data has a higher dimension than the second feature data; instructions for obtaining, through the teacher style prediction model to be trained and based on the multiple sets of second feature data, teacher style prediction data corresponding to the teaching content sample; and instructions for training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data.

According to the training scheme of the teacher style prediction model provided by the embodiments of the present invention, multiple sets of second feature data of the teaching content sample are determined based on the first feature data of the teaching content sample, wherein the first feature data has a higher dimension than the second feature data; teacher style prediction data corresponding to the teaching content sample is obtained through the teacher style prediction model to be trained, based on the multiple sets of second feature data; and the teacher style prediction model is then trained based on the teacher style annotation data of the teaching content sample and the teacher style prediction data. Compared with other existing methods, grouping the high-dimensional feature data of the teaching content sample into multiple sets of second feature data greatly reduces the dimension of the input features of the teacher style prediction model to be trained, so that the prediction performance of the trained teacher style prediction model can be effectively improved.

Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some of the embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from these drawings.

Figure 1 shows a flowchart of the steps of a method for training a teacher's style prediction model according to the first embodiment of the present invention;

Fig. 2 shows a schematic structural diagram of a teacher style prediction model provided according to the first embodiment of the present invention;

Fig. 3 shows a flowchart of steps of a method for predicting teacher style according to the second embodiment of the present invention.

Detailed description

In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, the technical solutions are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention shall fall within the protection scope of the embodiments of the present invention.

The specific implementation of the embodiments of the present invention will be further described below in conjunction with the accompanying drawings of the embodiments of the present invention.

Embodiment one

Referring to FIG. 1, there is shown a flowchart of the steps of a method for training a teacher style prediction model according to the first embodiment of the present invention.

Specifically, the training method of the teacher's style prediction model provided by the embodiment of the present invention includes the following steps:

In step S101, based on the first feature data of the teaching content sample, multiple sets of second feature data of the teaching content sample are determined.

In this embodiment, the teaching content sample may include audio data or video data of teaching content used as a training sample. The first feature data can be understood as a feature vector with a higher dimension, for example, a 1000-dimensional or 2000-dimensional feature vector. When the teaching content sample is audio data, the high-dimensional feature data of the teaching content sample may be high-dimensional speech acoustic feature data extracted from the audio data. The speech acoustic feature data may include prosodic feature data, spectral feature data, and voice quality feature data of the audio, and is specifically a speech acoustic feature vector. In a specific implementation, an existing speech acoustic feature extraction algorithm may be used to extract the high-dimensional speech acoustic feature data from the audio data. When the teaching content sample is video data, the high-dimensional feature data of the teaching content sample may be high-dimensional facial feature data extracted from the video data. The facial feature data may include feature data of the mouth region, the eye region, the cheek region, and so on, and is specifically a facial feature vector. In a specific implementation, an existing facial feature extraction algorithm can be used to extract the high-dimensional facial feature data from the video data.

In this embodiment, when multiple sets of low-dimensional feature data of the teaching content sample are determined based on the high-dimensional feature data of the teaching content sample, correlation analysis is performed on the high-dimensional feature data to determine a grouping of the high-dimensional feature data; based on that grouping, the high-dimensional feature data is divided to obtain multiple groups of low-dimensional feature data of the teaching content sample. In this way, the dimension of the input features of the teacher style prediction model is greatly reduced.

Specifically, when the high-dimensional feature data is high-dimensional speech acoustic feature data, it is known from prior knowledge of speech acoustics that speech acoustic features include prosodic features, spectral features, and voice quality features. Based on this prior knowledge, correlation analysis is performed on the high-dimensional speech acoustic feature data to determine its grouping. The high-dimensional speech acoustic feature data is then divided according to the grouping to obtain the prosodic feature data, spectral feature data, and voice quality feature data of the teaching content sample. Simply put, the types of features included in the speech acoustic features are determined from prior knowledge of speech acoustics, and the high-dimensional speech acoustic feature data is then grouped according to those feature types.

When the high-dimensional feature data is high-dimensional facial feature data, it is known from prior knowledge of the human face that the face includes the mouth region, the eye region, the nose region, and the cheek region. Based on this prior knowledge, correlation analysis is performed on the high-dimensional facial feature data to determine its grouping. The high-dimensional facial feature data is then divided according to the grouping to obtain the mouth region feature data, eye region feature data, nose region feature data, and cheek region feature data of the teaching content sample. Simply put, the different regions of the face are determined from prior knowledge of the face, and the high-dimensional facial feature data is then grouped according to those regions.
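The grouping by prior knowledge described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the name `GROUP_RANGES`, the function `split_by_prior_knowledge`, and the index ranges assigned to each acoustic feature family are hypothetical, since the actual layout depends on the feature extraction algorithm that is used.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: which index range of the high-dimensional vector belongs
# to which acoustic feature family. Real ranges depend on the extractor used.
GROUP_RANGES: Dict[str, Tuple[int, int]] = {
    "prosodic": (0, 4),
    "spectral": (4, 10),
    "voice_quality": (10, 12),
}


def split_by_prior_knowledge(
    features: List[float], ranges: Dict[str, Tuple[int, int]]
) -> Dict[str, List[float]]:
    """Divide one high-dimensional vector into named low-dimensional groups."""
    expected_start = 0
    # The ranges must tile the vector exactly: no gaps and no overlaps.
    for start, stop in sorted(ranges.values()):
        if start != expected_start or stop <= start:
            raise ValueError("group ranges must be contiguous and non-empty")
        expected_start = stop
    if expected_start != len(features):
        raise ValueError("group ranges must cover the whole feature vector")
    return {name: features[start:stop] for name, (start, stop) in ranges.items()}


sample = [float(i) for i in range(12)]  # stand-in for a 12-dimensional vector
groups = split_by_prior_knowledge(sample, GROUP_RANGES)
print({name: len(values) for name, values in groups.items()})
```

Each named group would then be fed to its own low-level model; the same pattern applies to facial regions (mouth, eye, nose, cheek) with a different range table.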

In this embodiment, when multiple sets of low-dimensional feature data of the teaching content sample are determined based on the high-dimensional feature data of the teaching content sample and no prior knowledge is available, the high-dimensional feature data of the teaching content sample is divided into groups of equal dimension to obtain the multiple sets of low-dimensional feature data. For example, when the high-dimensional feature data is 1000-dimensional, it can be divided equally into 10 groups of low-dimensional feature data, each with 100 dimensions. In a specific implementation, the number of groups and the dimension of each group can be set through experiments. In this way, the dimension of the input features of the teacher style prediction model is greatly reduced.

Specifically, the teaching content samples are input to the system. For the n-th sample data (suppose there are N sample data, so n = 1, 2, ..., N), let the high-dimensional feature data be $v_n$ with dimension $D$. Correlation analysis is then performed on $v_n$ using the available prior knowledge, and $v_n$ is divided into K groups (k = 1, 2, ..., K), where the dimension of the k-th group is $D_k$, which satisfies

$$\sum_{k=1}^{K} D_k = D.$$

For the n-th sample data, the original high-dimensional feature data is $v_n$ and the divided low-dimensional feature data of the k-th group is $v_n^k$; then

$$v_n = \mathrm{concat}(v_n^1, v_n^2, \ldots, v_n^K),$$

where concat(*) means that the feature data are spliced together in order. If there is no prior knowledge for analyzing the correlation of the features, $v_n$ can be divided equally into K parts, which also satisfies the above relationship.
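The relationship between the group dimensions and the original dimension can be checked with a short sketch. The function names `split_into_groups` and `concat` are chosen here for illustration; the example uses the 1000-dimensional, 10-group case mentioned in the text.

```python
from typing import List


def split_into_groups(features: List[float], group_dims: List[int]) -> List[List[float]]:
    """Divide a D-dimensional vector into K groups with dimensions D_1..D_K."""
    if any(dim <= 0 for dim in group_dims) or sum(group_dims) != len(features):
        raise ValueError("group dimensions must be positive and sum to D")
    groups, start = [], 0
    for dim in group_dims:
        groups.append(features[start:start + dim])
        start += dim
    return groups


def concat(groups: List[List[float]]) -> List[float]:
    """Splice the groups back together in order."""
    return [value for group in groups for value in group]


high_dim = [float(i) for i in range(1000)]  # stand-in for v_n with D = 1000

# Equal division without prior knowledge: K = 10 groups of 100 dimensions.
low_dim_groups = split_into_groups(high_dim, [100] * 10)

# Unequal division, as might result from prior knowledge (sizes are made up).
uneven_groups = split_into_groups(high_dim, [200, 300, 500])

print(len(low_dim_groups), len(low_dim_groups[0]))
```

In both cases, splicing the groups back together in order recovers the original vector, which is the property the concat relationship expresses.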

In step S102, the teacher style prediction data corresponding to the teaching content sample is obtained based on the multiple sets of second feature data through the teacher style prediction model to be trained.

In this embodiment, the teacher style prediction model can be any suitable neural network model that can realize feature extraction or target object detection, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generative network of a generative adversarial network, a deep neural network, and so on. The specific structure of the neural network, such as the number of convolutional layers, the size of the convolution kernels, and the number of channels, can be set by those skilled in the art according to actual requirements. The teacher style prediction data may be a predicted teacher style category or a predicted teacher style value.

In this embodiment, the teacher style prediction model includes multiple low-level models and a high-level model connected to the outputs of the multiple low-level models, and the low-level models and the high-level model are all deep neural network models. When the teacher style prediction data corresponding to the teaching content sample is obtained through the teacher style prediction model to be trained based on the multiple sets of low-dimensional feature data, multiple preliminary teacher style prediction data corresponding to the teaching content sample are obtained through the multiple low-level models based on the multiple sets of low-dimensional feature data; the final teacher style prediction data corresponding to the teaching content sample is then obtained through the high-level model based on the multiple preliminary teacher style prediction data. In this way, the low-level models make preliminary predictions of the teaching style of the teaching content sample, and the high-level model makes the final prediction based on those preliminary results, which can improve the accuracy with which the teacher style prediction model predicts the teacher style corresponding to the teaching content sample.

In this embodiment, each of the multiple low-level models includes a hidden layer and a prediction layer connected to the output of the hidden layer. The hidden layer is specifically a fully connected layer or a convolutional layer, and the prediction layer is specifically a fully connected layer. When the multiple preliminary teacher style prediction data corresponding to the teaching content sample are obtained through the multiple low-level models based on the multiple sets of low-dimensional feature data, feature extraction operations are performed on the multiple sets of low-dimensional feature data through the hidden layers to obtain the feature representation data corresponding to each set; mapping operations are then performed on that feature representation data through the prediction layers to obtain the multiple preliminary teacher style prediction data corresponding to the teaching content sample. The feature representation data is specifically a feature representation vector. Performing feature extraction through the hidden layers re-encodes the multiple sets of low-dimensional feature data and improves the robustness of their feature representation data, which in turn can improve the accuracy of the low-level models' preliminary predictions of the teacher style corresponding to the teaching content sample.

In this embodiment, when the final teacher style prediction data corresponding to the teaching content sample is obtained through the high-level model based on the multiple preliminary teacher style prediction data, high-level feature representation data corresponding to the high-level model is generated based on the multiple preliminary teacher style prediction data; the final teacher style prediction data corresponding to the teaching content sample is then obtained through the high-level model based on the high-level feature representation data. The high-level feature representation data is specifically a high-level feature representation vector. Generating the high-level feature representation data from the preliminary predictions and passing it through the high-level model can improve the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content sample.

In this embodiment, when the high-level feature representation data corresponding to the high-level model is generated based on the multiple preliminary teacher style prediction data, the high-level feature representation data is generated based on both the multiple preliminary teacher style prediction data and the feature representation data corresponding to each of the multiple sets of low-dimensional feature data. Using both the preliminary predictions and the feature representation data of the low-dimensional feature data can improve the robustness of the high-level feature representation data, and thereby the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content sample.

In this embodiment, when the final teacher style prediction data corresponding to the teaching content sample is obtained through the high-level model based on the high-level feature representation data, a feature extraction operation is performed on the high-level feature representation data through the hidden layer in the high-level model to obtain the feature representation data corresponding to the high-level feature representation data; a mapping operation is then performed on that feature representation data through the prediction layer in the high-level model to obtain the final teacher style prediction data corresponding to the teaching content sample. The hidden layer is specifically a fully connected layer or a convolutional layer, the prediction layer is specifically a fully connected layer, and the feature representation data is specifically a feature representation vector. Performing feature extraction through the hidden layer re-encodes the high-level feature representation data and improves the robustness of its feature representation data, which in turn improves the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content sample.

Specifically, as shown in FIG. 2, the teacher style prediction model includes multiple low-level models and a high-level model connected to the outputs of the multiple low-level models. After the high-dimensional feature data is divided, multiple feature groups are obtained, and each feature group is input into its corresponding low-level model. Each low-level model then makes a preliminary prediction of the teaching style of the teaching content sample based on its feature group, yielding the preliminary teacher style prediction data corresponding to the teaching content sample. Each low-level model includes a plurality of sequentially connected hidden layers and a prediction layer connected to the output of the last of those hidden layers. High-level feature representation data corresponding to the high-level model is generated from the preliminary teacher style prediction data output by the multiple low-level models and the feature representation data of the feature groups output by the last hidden layer of each low-level model. Finally, the high-level model makes the final prediction of the teaching style of the teaching content sample based on the high-level feature representation data, yielding the final teacher style prediction data corresponding to the teaching content sample.

Specifically, the high-dimensional feature data is divided into K groups of low-dimensional feature data, and each group corresponds to one low-level model, so that the k-th low-level model and the k-th group of low-dimensional feature data $v_n^k$ are in one-to-one correspondence. Consider the k-th group of low-dimensional feature data $v_n^k$ of the n-th sample data. Suppose the number of hidden layers of the k-th low-level model is $L^k$ ($l^k = 1, 2, \ldots, L^k$), and the hidden node dimension of the $l^k$-th hidden layer is $D^k_{l^k}$.

When $l^k = 1$,

$$y^k_{1n} = W^k_1 v^k_n + b^k_1; \quad g^k_{1n} = f(y^k_{1n});$$

where $W^k_1$ is the weight matrix of the first hidden layer of the k-th low-level model, with dimension $D^k_1 \times D_k$; $b^k_1$ is the bias vector of the first hidden layer of the k-th low-level model, with dimension $D^k_1$; $y^k_{1n}$ is the result calculated from the weight matrix and bias vector of this layer; f(*) is a nonlinear function, usually a sigmoid function; and $g^k_{1n}$ is the hidden vector representation of the first hidden layer of the k-th low-level model for the n-th sample data, with dimension $D^k_1 \times 1$.

When $1 < l^k < L^k$,

$$y^k_{l^k n} = W^k_{l^k} g^k_{(l^k-1)n} + b^k_{l^k}; \quad g^k_{l^k n} = f(y^k_{l^k n});$$

where $W^k_{l^k}$ is the weight matrix of the $l^k$-th hidden layer of the k-th low-level model, with dimension $D^k_{l^k} \times D^k_{l^k-1}$; $b^k_{l^k}$ is the bias vector of the $l^k$-th hidden layer of the k-th low-level model, with dimension $D^k_{l^k}$; $y^k_{l^k n}$ is the result calculated from the weight matrix and bias vector of this layer; and $g^k_{l^k n}$ is the hidden vector representation of the $l^k$-th hidden layer of the k-th low-level model for the n-th sample data, with dimension $D^k_{l^k} \times 1$.

When $l^k = L^k$,

$$y^k_{L^k n} = W^k_{L^k} g^k_{(L^k-1)n} + b^k_{L^k}; \quad h^k_{L^k n} = f(y^k_{L^k n});$$

where $W^k_{L^k}$ is the weight matrix of the $L^k$-th hidden layer of the k-th low-level model, with dimension $D^k_{L^k} \times D^k_{L^k-1}$; $b^k_{L^k}$ is the bias vector of the $L^k$-th hidden layer of the k-th low-level model, with dimension $D^k_{L^k}$; and $h^k_{L^k n}$ is the hidden vector representation of the $L^k$-th hidden layer of the k-th low-level model for the n-th sample data, with dimension $D^k_{L^k} \times 1$.

The output of the hidden layers of the k-th low-level model, $h^k_{L^k n}$, is used as the input of the prediction layer of the k-th low-level model:

$$s^k_n = W^k h^k_{L^k n} + b^k$$

where $W^k$ is the weight matrix of the prediction layer of the k-th low-level model, with dimension $1 \times D^k_{L^k}$; $b^k$ is the bias vector of the prediction layer of the k-th low-level model, with dimension 1; and $s^k_n$ is the preliminary teacher style prediction data of the k-th low-level model for the n-th sample data, with dimension 1, a real value between 0 and 1.

The hidden vector representation of the last hidden layer of each low-level model and the preliminary teacher style prediction data of each low-level model are combined to obtain the high-level feature representation data:

$$h_n = \mathrm{concat}(h^1_{L^1 n}, s^1_n, \ldots, h^K_{L^K n}, s^K_n)$$

where the dimension of $h_n$ is $\sum_{k=1}^{K} (D^k_{L^k} + 1)$.

Combining the preliminary prediction data of teacher style of each low-level model and adding the hidden vector representation of the last hidden layer can obtain more information, so that the high-level model can predict more accurately.
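The low-level forward pass and the construction of the high-level feature representation can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: the function names, the tiny layer sizes, and the zero-valued hidden weights are made up to keep the numbers easy to follow, the sigmoid is used for f(*) as the text suggests, and the order in which hidden vectors and preliminary predictions are spliced is an assumption.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weight matrix, bias vector)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(weights: Matrix, bias: Vector, inputs: Vector) -> Vector:
    """Compute W x + b for one layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, bias)]


def low_level_forward(hidden: List[Layer], prediction: Layer, group: Vector) -> Tuple[Vector, float]:
    """Return (last hidden representation, preliminary prediction) for one group."""
    activation = group
    for weights, bias in hidden:
        activation = [sigmoid(y) for y in affine(weights, bias, activation)]
    # Prediction layer: a linear mapping to a single value, as in the text.
    score = affine(prediction[0], prediction[1], activation)[0]
    return activation, score


def build_high_level_input(outputs: List[Tuple[Vector, float]]) -> Vector:
    """Splice each group's last hidden vector with its preliminary prediction."""
    spliced: Vector = []
    for hidden_vector, score in outputs:
        spliced.extend(hidden_vector)
        spliced.append(score)
    return spliced


# Two feature groups (K = 2) with made-up values and made-up parameters.
group_inputs = [[0.2, 0.8], [0.1, 0.4, 0.7]]
low_level_models = [
    ([([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])], ([[0.5, 0.5]], [0.0])),
    ([([[0.0, 0.0, 0.0]], [0.0])], ([[1.0]], [0.0])),
]
outputs = [
    low_level_forward(hidden, prediction, group)
    for (hidden, prediction), group in zip(low_level_models, group_inputs)
]
h_n = build_high_level_input(outputs)
print(len(h_n))  # 5 = (2 + 1) + (1 + 1)
```

The length of `h_n` is the sum over groups of the last hidden dimension plus one, which is why carrying the hidden vectors along gives the high-level model more information than the preliminary predictions alone.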

The high-level feature representation data is used as the input of the high-level model for the final prediction. The high-level model includes a plurality of sequentially connected hidden layers and a prediction layer connected to the output of the last of those hidden layers. Suppose the number of hidden layers of the high-level model is $L$ ($l = 1, 2, \ldots, L$), and the hidden node dimension of the $l$-th hidden layer is $D_l$.

When $l = 1$,

$$y_{1n} = W_1 h_n + b_1; \quad g_{1n} = f(y_{1n});$$

where $W_1$ is the weight matrix of the first hidden layer of the high-level model, with dimension $D_1 \times D_h$, $D_h$ denoting the dimension of $h_n$; $b_1$ is the bias vector of the first hidden layer of the high-level model, with dimension $D_1$; $y_{1n}$ is the result calculated from the weight matrix and bias vector of this layer; f(*) is a nonlinear function, usually a sigmoid function; and $g_{1n}$ is the hidden vector representation of the first hidden layer of the high-level model for the n-th sample data, with dimension $D_1 \times 1$.

When $1 < l < L$,

$$y_{ln} = W_l g_{(l-1)n} + b_l; \quad g_{ln} = f(y_{ln});$$

where $W_l$ is the weight matrix of the $l$-th hidden layer of the high-level model, with dimension $D_l \times D_{l-1}$; $b_l$ is the bias vector of the $l$-th hidden layer of the high-level model, with dimension $D_l$; $y_{ln}$ is the result calculated from the weight matrix and bias vector of this layer; and $g_{ln}$ is the hidden vector representation of the $l$-th hidden layer of the high-level model for the n-th sample data, with dimension $D_l \times 1$.

When $l = L$,

$$y_{Ln} = W_L g_{(L-1)n} + b_L; \quad h_{Ln} = f(y_{Ln});$$

where $W_L$ is the weight matrix of the $L$-th hidden layer of the high-level model, with dimension $D_L \times D_{L-1}$; $b_L$ is the bias vector of the $L$-th hidden layer of the high-level model, with dimension $D_L$; $y_{Ln}$ is the result calculated from the weight matrix and bias vector of this layer; and $h_{Ln}$ is the hidden vector representation of the $L$-th hidden layer of the high-level model for the n-th sample data, with dimension $D_L \times 1$.

The output of the hidden layers of the high-level model, $h_{Ln}$, is used as the input of the prediction layer of the high-level model:

$$s_n = W h_{Ln} + b$$

where $W$ is the weight matrix of the prediction layer of the high-level model, with dimension $1 \times D_L$; $b$ is the bias vector of the prediction layer of the high-level model, with dimension 1; and $s_n$ is the final teacher style prediction data of the high-level model for the n-th sample data, with dimension 1, a real value between 0 and 1.
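The high-level forward pass follows the same pattern and can be sketched as below. The function name `high_level_forward`, the layer sizes, and all parameter values are hypothetical; the input `h_n` stands in for a 5-dimensional high-level feature representation vector.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weight matrix, bias vector)


def high_level_forward(hidden: List[Layer], prediction: Layer, h_n: Vector) -> float:
    """Map the high-level feature representation to the final prediction s_n."""
    activation = h_n
    for weights, bias in hidden:
        # y = W g + b, followed by the sigmoid nonlinearity f.
        pre_activation = [
            sum(w * g for w, g in zip(row, activation)) + b
            for row, b in zip(weights, bias)
        ]
        activation = [1.0 / (1.0 + math.exp(-y)) for y in pre_activation]
    pred_weights, pred_bias = prediction
    # Prediction layer: s_n = W h + b, a single real value.
    return sum(w * h for w, h in zip(pred_weights[0], activation)) + pred_bias[0]


h_n = [0.5, 0.5, 0.5, 0.5, 0.5]           # made-up high-level input
hidden_layers = [([[0.0] * 5, [0.0] * 5], [0.0, 0.0])]  # one hidden layer, D_1 = 2
prediction_layer = ([[0.4, 0.6]], [0.1])  # 1 x D_1 weight matrix and bias
s_n = high_level_forward(hidden_layers, prediction_layer, h_n)
print(round(s_n, 6))
```

With the zero hidden weights chosen here, both hidden activations equal 0.5, so the final value is 0.4 × 0.5 + 0.6 × 0.5 + 0.1.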

It can be seen from the above description that, in a specific implementation, the structures of the low-level models and the high-level model are similar. The low-level models and the high-level model are used together because the low-level models make preliminary predictions of the teaching style of the teaching content sample, and the high-level model then makes the final prediction based on those preliminary results, which improves the accuracy with which the teacher style prediction model predicts the teacher style corresponding to the teaching content sample. In addition, because the amount of data in the teaching content samples is small and the dimensionality of their feature data is high, modeling directly with a single model (for example, only one low-level model) leads to the curse of dimensionality: the model fits only the training data and performs poorly on test data, that is, it overfits.

In step S103, the teacher style prediction model is trained based on the teacher style annotation data of the teaching content sample and the teacher style prediction data.

In this embodiment, the teacher style annotation data can be understood as the actual teacher style data of the teaching content sample.

In this embodiment, when the teacher style prediction model is trained based on the teacher style annotation data of the teaching content sample and the teacher style prediction data, the difference value between the teacher style annotation data and the teacher style prediction data is determined through an objective loss function, and the parameters of the teacher style prediction model are adjusted based on the difference value.

In this embodiment, when the difference value between the teacher style annotation data and the teacher style prediction data is determined through the objective loss function, what is determined is the difference value between the teacher style annotation data and the final teacher style prediction data. When the parameters of the teacher style prediction model are adjusted based on the difference value, the parameters of both the multiple low-level models and the high-level model in the teacher style prediction model are adjusted.
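How a difference value drives a parameter adjustment can be illustrated with a deliberately small sketch. The text does not name an optimizer, so plain gradient descent is assumed here, and only a single linear prediction layer with a squared-error difference is updated; in the full model the same difference would be propagated back through every low-level and high-level layer. All names and values are hypothetical.

```python
from typing import List, Tuple


def predict(weights: List[float], bias: float, hidden: List[float]) -> float:
    """Linear prediction layer: s = W h + b."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


def gradient_step(
    weights: List[float], bias: float, hidden: List[float], label: float, learning_rate: float
) -> Tuple[List[float], float]:
    """One gradient-descent update on the squared difference (s - label)^2."""
    difference = predict(weights, bias, hidden) - label
    # d/dW (s - label)^2 = 2 (s - label) h ; d/db (s - label)^2 = 2 (s - label)
    new_weights = [w - learning_rate * 2.0 * difference * h for w, h in zip(weights, hidden)]
    new_bias = bias - learning_rate * 2.0 * difference
    return new_weights, new_bias


weights, bias = [0.0, 0.0], 0.0
hidden, label = [0.5, 0.5], 1.0  # made-up hidden vector and annotated style value
errors = []
for _ in range(20):
    errors.append((predict(weights, bias, hidden) - label) ** 2)
    weights, bias = gradient_step(weights, bias, hidden, label, learning_rate=0.1)
print(errors[0], errors[-1])
```

Each update moves the prediction toward the annotated value, so the squared difference shrinks from one iteration to the next.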

In this embodiment, the objective loss function includes a mean square error term and an L2 regularization term. In this way, the training process of the teacher's style prediction model can be prevented from being affected by overfitting.

Specifically, given the high-dimensional feature data v_n of the n-th sample data, the low-level models and the high-level model perform their calculations in turn, and the final teacher style prediction data s′_n is obtained from the prediction layer of the high-level model. Let the real teacher style data of the n-th sample data be s_n; the teacher style prediction model is trained to make s′_n and s_n as close as possible. In the training process, the following function is selected as the loss function for training the teacher style prediction model:

[Loss function formula: image not reproduced in this text.]

Here, s_n is the real teacher style data of the n-th sample data; s′_n is the final teacher style prediction data of the high-level model for the n-th sample data; W^k is the weight matrix of the prediction layer of the k-th low-level model; W_l is the weight matrix of the hidden layer of the high-level model; W is the weight matrix of the prediction layer of the high-level model; and λ is the weight decay coefficient, whose value is between 0 and 1. The formula also uses a symbol for the preliminary teacher style prediction data of the k-th low-level model for the n-th sample data and a symbol for the weight matrix of the hidden layer of the low-level model; neither symbol is reproduced in this text. The first and second terms of the formula compute the mean square error, and the last four terms are L2 regularization terms that prevent the teacher style prediction model from overfitting.
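For orientation, a loss of the form described here (two mean square error terms followed by four L2 regularization terms) can be written as below. This is a hedged reconstruction from the verbal description only, not the formula of the original filing. It follows the definitions given with the formula, so s_n denotes the real teacher style data and s′_n the final prediction. The symbols \hat{s}_n^k (preliminary prediction of the k-th low-level model), W_h^k (hidden-layer weight matrix of the k-th low-level model), the sample count N and the number of low-level models K are assumed names, and any normalization constants in the original are unknown.

```latex
\mathcal{L} =
  \sum_{n=1}^{N} \lVert s_n - s'_n \rVert_2^2
  + \sum_{n=1}^{N} \sum_{k=1}^{K} \lVert s_n - \hat{s}_n^{k} \rVert_2^2
  + \lambda \Bigl(
      \sum_{k=1}^{K} \lVert W_h^{k} \rVert_F^2
      + \sum_{k=1}^{K} \lVert W^{k} \rVert_F^2
      + \lVert W_l \rVert_F^2
      + \lVert W \rVert_F^2
    \Bigr)
```

Under this reading, the first term penalizes the error of the high-level model, the second the errors of the low-level models, and the bracketed terms penalize the hidden-layer and prediction-layer weights of both levels.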

The teacher style prediction model is trained by combining the low-level models and the high-level model for unified training, so that the model is optimized as a whole through the target loss function. The entire model is trained by minimizing the target loss function, that is, by learning the parameters of the teacher style prediction model: the weight matrices of the hidden layers and prediction layers of the low-level models and of the high-level model (including W^k, W_l and W) and the corresponding bias terms (including b^k, b_l and b).
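As an illustration of how such a target loss could be evaluated, the following plain-Python sketch sums the squared error of the final predictions, the squared error of every preliminary prediction, and an L2 penalty over all weight matrices. The function names, the argument layout and the absence of normalization constants are assumptions made for this example; the filing does not prescribe them.

```python
def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2_penalty(matrix):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(w * w for row in matrix for w in row)


def target_loss(labels, final_preds, prelim_preds, weights, lam):
    """Squared-error terms plus L2 regularization (hypothetical layout).

    labels:       N annotated teacher style vectors
    final_preds:  N final prediction vectors from the high-level model
    prelim_preds: N lists, each holding K preliminary prediction vectors
    weights:      every weight matrix that should be regularized
    lam:          weight decay coefficient, between 0 and 1
    """
    final_term = sum(squared_error(s, p) for s, p in zip(labels, final_preds))
    prelim_term = sum(
        squared_error(s, p_k)
        for s, prelims in zip(labels, prelim_preds)
        for p_k in prelims
    )
    reg_term = lam * sum(l2_penalty(w) for w in weights)
    return final_term + prelim_term + reg_term


# One sample, two low-level models, two small weight matrices.
example_loss = target_loss(
    labels=[[1.0, 0.0]],
    final_preds=[[0.5, 0.0]],
    prelim_preds=[[[1.0, 1.0], [0.0, 0.0]]],
    weights=[[[1.0, 2.0]], [[3.0]]],
    lam=0.1,
)
```

Because the preliminary and final errors are summed into one scalar, minimizing it adjusts the low-level and high-level parameters together, which matches the unified training described above.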

Specifically, the difference value between the teacher style annotation data and the teacher style final prediction data is used to evaluate the currently obtained final prediction data and serves as the basis for subsequent training of the teacher style prediction model. The difference value may be propagated back through the teacher style prediction model so that the model is trained iteratively. Training the teacher style prediction model is an iterative process. The embodiments of this application describe only one training pass, but those skilled in the art should understand that the same training method can be applied in every pass until the training of the teacher style prediction model is completed.
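The iterative adjust-by-difference loop can be sketched as follows. To keep the example short and dependency-free, a single linear prediction layer stands in for the full two-level model, and plain gradient descent is used as the update rule; both are simplifications chosen for illustration, and the learning rate, step count and sample values are hypothetical.

```python
def predict(W, x):
    """Linear prediction layer: one output per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def loss(W, samples, lam):
    """Squared error over all samples plus an L2 penalty on W."""
    err = sum(
        (p - s) ** 2
        for x, target in samples
        for p, s in zip(predict(W, x), target)
    )
    return err + lam * sum(w * w for row in W for w in row)


def train(W, samples, lam=0.01, lr=0.05, steps=200):
    """Repeat: predict, measure the difference from the labels, adjust W."""
    for _ in range(steps):
        # Gradient of the L2 penalty.
        grad = [[2 * lam * w for w in row] for row in W]
        for x, target in samples:
            pred = predict(W, x)
            for i, (p, s) in enumerate(zip(pred, target)):
                diff = p - s  # difference value propagated back to the weights
                for j, xj in enumerate(x):
                    grad[i][j] += 2 * diff * xj
        # Build a new matrix so the caller's initial weights stay untouched.
        W = [[w - lr * g for w, g in zip(row, g_row)]
             for row, g_row in zip(W, grad)]
    return W


samples = [([1.0, 0.0], [0.5]), ([0.0, 1.0], [-0.5]), ([1.0, 1.0], [0.0])]
W0 = [[0.0, 0.0]]
W_trained = train(W0, samples)
```

Each pass through the loop corresponds to one training pass in the sense of the text: the loss after training is lower than the loss of the initial weights.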

With the training method of the teacher style prediction model provided by the embodiments of this application, multiple sets of low-dimensional feature data of the teaching content sample are determined based on the high-dimensional feature data of the teaching content sample; teacher style prediction data corresponding to the teaching content sample is obtained through the teacher style prediction model to be trained based on the multiple sets of low-dimensional feature data; and the teacher style prediction model is then trained based on the teacher style annotation data of the teaching content sample and the teacher style prediction data. Compared with other existing methods, grouping the high-dimensional feature data of the teaching content sample into multiple sets of low-dimensional feature data greatly reduces the dimension of the input features of the teacher style prediction model to be trained, so that the teacher style prediction performance of the trained model can be effectively improved.

Example two

Referring to Fig. 3, a flowchart of the steps of a method for predicting teacher style according to the second embodiment of the present invention is shown.

Specifically, the teacher style prediction method provided by the embodiment of the present invention includes the following steps:

In step S201, based on the first feature data of the teaching content data, multiple sets of second feature data of the teaching content data are determined.

In this embodiment, the teaching content data may include audio data or video data of the teaching content. When the teaching content data is audio data of the teaching content, the first feature data of the teaching content data may be high-dimensional voice acoustic feature data extracted from the audio data. When the teaching content data is video data of the teaching content, the first feature data of the teaching content data may be high-dimensional facial feature data extracted from the video data.

In this embodiment, the specific implementation of step S201 is similar to the specific implementation of step S101 described above, and will not be repeated here.
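A minimal sketch of the grouping in step S201 is given below. The grouping rule of step S101 is not restated in this passage, so the sketch simply assumes that one high-dimensional vector is cut into contiguous groups of nearly equal size; an actual implementation might instead group by feature type. The function and variable names are hypothetical.

```python
def split_into_groups(features, num_groups):
    """Split one high-dimensional vector into contiguous low-dimensional groups."""
    if not 1 <= num_groups <= len(features):
        raise ValueError("num_groups must be between 1 and len(features)")
    base, extra = divmod(len(features), num_groups)
    groups, start = [], 0
    for k in range(num_groups):
        # Spread the remainder over the first `extra` groups.
        size = base + (1 if k < extra else 0)
        groups.append(features[start:start + size])
        start += size
    return groups


first_feature_data = [0.1 * i for i in range(10)]  # stand-in for acoustic or facial features
second_feature_data = split_into_groups(first_feature_data, 3)
```

Each resulting group is then fed to its own low-level model, so no single model receives the full high-dimensional input.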

In step S202, the teacher style prediction data corresponding to the teaching content data is obtained based on the multiple sets of second feature data of the teaching content data through the trained teacher style prediction model.

In this embodiment, when the teacher style prediction data corresponding to the teaching content data is obtained through the teacher style prediction model trained in Embodiment 1 based on the multiple sets of low-dimensional feature data, multiple teacher style preliminary prediction data corresponding to the teaching content data are obtained through the multiple low-level models based on the multiple sets of low-dimensional feature data, and the teacher style final prediction data corresponding to the teaching content data is obtained through the high-level model based on the multiple teacher style preliminary prediction data. In this way, the multiple low-level models of the teacher style prediction model trained in Embodiment 1 make preliminary teacher style predictions for the teaching content data, and the high-level model of that model then makes the final prediction based on those preliminary results, which can improve the accuracy with which the teacher style prediction model predicts the teacher style corresponding to the teaching content data.

In this embodiment, when the multiple teacher style preliminary prediction data corresponding to the teaching content data are obtained through the multiple low-level models based on the multiple sets of low-dimensional feature data, feature extraction operations are performed on the multiple sets of low-dimensional feature data respectively through the hidden layers to obtain the feature characterization data corresponding to the multiple sets of low-dimensional feature data, and a mapping operation is then performed on that feature characterization data through the prediction layers to obtain the multiple teacher style preliminary prediction data corresponding to the teaching content data. The feature characterization data is specifically a feature characterization vector. Performing feature extraction through the hidden layers re-encodes the multiple sets of low-dimensional feature data and improves the robustness of the corresponding feature characterization data, which in turn can improve the accuracy of the low-level models' preliminary predictions of the teacher style corresponding to the teaching content data.

In this embodiment, when the teacher style final prediction data corresponding to the teaching content data is obtained through the high-level model based on the multiple teacher style preliminary prediction data, high-level feature characterization data corresponding to the high-level model is first generated based on the multiple teacher style preliminary prediction data, and the teacher style final prediction data corresponding to the teaching content data is then obtained through the high-level model based on the high-level feature characterization data. The high-level feature characterization data is specifically a high-level feature characterization vector. Generating the input of the high-level model from the preliminary predictions in this way can improve the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content data.

In this embodiment, when the high-level feature characterization data corresponding to the high-level model is generated based on the multiple teacher style preliminary prediction data, the high-level feature characterization data is generated based on both the multiple teacher style preliminary prediction data and the feature characterization data respectively corresponding to the multiple sets of low-dimensional feature data. Combining the preliminary predictions with the feature characterization data of the low-dimensional feature data improves the robustness of the high-level feature characterization data, and thereby the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content data.

In this embodiment, when the teacher style final prediction data corresponding to the teaching content data is obtained through the high-level model based on the high-level feature characterization data, a feature extraction operation is performed on the high-level feature characterization data through the hidden layer in the high-level model to obtain the feature characterization data corresponding to the high-level feature characterization data, and a mapping operation is then performed on that feature characterization data through the prediction layer in the high-level model to obtain the teacher style final prediction data corresponding to the teaching content data. Performing feature extraction through the hidden layer re-encodes the high-level feature characterization data and improves the robustness of the corresponding feature characterization data, thereby improving the accuracy of the high-level model's final prediction of the teacher style corresponding to the teaching content data.
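The two-level prediction pass described in the preceding paragraphs can be sketched in plain Python as below. Several details are assumptions made for the example and are not stated in the text: each hidden layer uses a tanh activation, the high-level feature characterization vector is formed by concatenating each preliminary prediction with the feature characterization vector of its group, weights are initialized randomly, and the style vector has two dimensions. All function names are hypothetical.

```python
import math
import random


def dense(W, b, x, activation=None):
    """Affine map W x + b, optionally followed by an element-wise activation."""
    out = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [activation(v) for v in out] if activation else out


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialization)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def init_two_layer(rng, n_in, n_hidden, n_out):
    """A model made of one hidden layer and one prediction layer."""
    return {"hidden": init_layer(rng, n_hidden, n_in),
            "predict": init_layer(rng, n_out, n_hidden)}


def run_two_layer(model, x):
    """Return (feature characterization vector, prediction) for input x."""
    h = dense(*model["hidden"], x, activation=math.tanh)
    return h, dense(*model["predict"], h)


def predict_style(low_models, high_model, groups):
    """Low-level models predict per group; the high-level model fuses the results."""
    high_input, prelims = [], []
    for model, group in zip(low_models, groups):
        h, s_k = run_two_layer(model, group)
        prelims.append(s_k)
        high_input.extend(s_k)  # preliminary prediction of this low-level model
        high_input.extend(h)    # feature characterization vector of this group
    _, final = run_two_layer(high_model, high_input)
    return prelims, final


rng = random.Random(0)
groups = [[0.2, 0.4, 0.1, 0.3], [0.5, 0.6, 0.7], [0.9, 0.8, 0.1]]
style_dim, hidden_dim = 2, 3
low_models = [init_two_layer(rng, len(g), hidden_dim, style_dim) for g in groups]
high_model = init_two_layer(
    rng, len(groups) * (style_dim + hidden_dim), hidden_dim, style_dim)
prelims, final = predict_style(low_models, high_model, groups)
```

With untrained weights the outputs are meaningless; the sketch only shows how data flows from the groups, through the low-level hidden and prediction layers, into the high-level model.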

In this embodiment, the method further includes: performing a mapping operation on the teacher style prediction data to obtain the teacher style category corresponding to the teaching content data. In this way, the teacher style category corresponding to the teaching content data can be obtained.

Specifically, based on the teacher style prediction data, a mapping operation is performed in the pre-built teacher style semantic space to obtain the teacher style category corresponding to the teaching content data. Wherein, the teacher style semantic space can be understood as a mapping space between teacher style prediction data and teacher style categories.
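One simple way to realize such a mapping is a nearest-prototype lookup, sketched below. The text does not say how the teacher style semantic space is built, so the prototype vectors, the category names and the use of Euclidean distance are all assumptions made for illustration.

```python
# Hypothetical semantic space: each category is anchored by a prototype vector.
STYLE_PROTOTYPES = {
    "calm": [0.0, 0.0],
    "humorous": [1.0, 0.0],
    "passionate": [0.0, 1.0],
}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_to_category(prediction, prototypes=STYLE_PROTOTYPES):
    """Return the category whose prototype lies nearest to the prediction."""
    return min(prototypes,
               key=lambda name: squared_distance(prediction, prototypes[name]))


category = map_to_category([0.9, 0.2])
```

Any other mapping between prediction vectors and categories, such as thresholds per dimension, would fit the description equally well.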

With the teacher style prediction method provided in the embodiments of this application, multiple sets of low-dimensional feature data of the teaching content data are determined based on the high-dimensional feature data of the teaching content data, and teacher style prediction data corresponding to the teaching content data is then obtained through the teacher style prediction model trained in Embodiment 1 based on the multiple sets of low-dimensional feature data of the teaching content data. Compared with other existing methods, grouping the high-dimensional feature data of the teaching content data into multiple sets of low-dimensional feature data greatly reduces the dimension of the input features of the trained teacher style prediction model, thereby effectively improving the teacher style prediction performance of the teacher style prediction model.

Example three

The embodiment of the present invention also provides a non-transitory computer-readable storage medium. The non-transitory computer-readable storage medium stores a readable program, and the readable program includes: instructions for determining multiple sets of low-dimensional feature data of a teaching content sample based on high-dimensional feature data of the teaching content sample; instructions for obtaining, through a teacher style prediction model to be trained, teacher style prediction data corresponding to the teaching content sample based on the multiple sets of low-dimensional feature data; and instructions for training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data.

Optionally, the teacher style prediction model includes a plurality of low-level models and a high-level model connected to the output ends of the plurality of low-level models. Correspondingly, the instructions for obtaining, through the teacher style prediction model to be trained, the teacher style prediction data corresponding to the teaching content sample based on the multiple sets of low-dimensional feature data include: instructions for obtaining, through the plurality of low-level models, multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of low-dimensional feature data; and instructions for obtaining, through the high-level model, teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data.

Optionally, each of the plurality of low-level models includes a hidden layer and a prediction layer connected to the output end of the hidden layer. Correspondingly, the instructions for obtaining, through the plurality of low-level models, the multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of low-dimensional feature data include: instructions for performing feature extraction operations on the multiple sets of low-dimensional feature data respectively through the hidden layers to obtain feature characterization data corresponding to the multiple sets of low-dimensional feature data; and instructions for performing a mapping operation, through the prediction layers, on the feature characterization data corresponding to the multiple sets of low-dimensional feature data to obtain the multiple teacher style preliminary prediction data corresponding to the teaching content sample.

Optionally, the instructions for obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data include: instructions for generating high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data; and instructions for obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data.

Optionally, the instructions for generating the high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data include: instructions for generating the high-level feature characterization data based on the multiple teacher style preliminary prediction data and the feature characterization data respectively corresponding to the multiple sets of low-dimensional feature data.

Optionally, the instructions for obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data include: instructions for performing a feature extraction operation on the high-level feature characterization data through the hidden layer in the high-level model to obtain feature characterization data corresponding to the high-level feature characterization data; and instructions for performing a mapping operation, through the prediction layer in the high-level model, on the feature characterization data corresponding to the high-level feature characterization data to obtain the teacher style final prediction data corresponding to the teaching content sample.

Optionally, the instructions for training the teacher style prediction model based on the teacher style annotation data of the teaching content sample and the teacher style prediction data include: instructions for determining, through a target loss function, a difference value between the teacher style annotation data and the teacher style final prediction data; and instructions for adjusting, based on the difference value, the parameters of the plurality of low-level models and of the high-level model in the teacher style prediction model.

Optionally, the readable program further includes: instructions for determining multiple sets of low-dimensional feature data of teaching content data based on high-dimensional feature data of the teaching content data; and instructions for obtaining, through the trained teacher style prediction model, teacher style prediction data corresponding to the teaching content data based on the multiple sets of low-dimensional feature data of the teaching content data.

Optionally, the readable program further includes: instructions for performing a mapping operation on the teacher style prediction data to obtain the teacher style category corresponding to the teaching content data.

With the non-transitory computer-readable storage medium provided by the embodiments of this application, multiple sets of low-dimensional feature data of the teaching content sample are determined based on the high-dimensional feature data of the teaching content sample; teacher style prediction data corresponding to the teaching content sample is obtained through the teacher style prediction model to be trained based on the multiple sets of low-dimensional feature data; and the teacher style prediction model is then trained based on the teacher style annotation data of the teaching content sample and the teacher style prediction data. Compared with other existing methods, grouping the high-dimensional feature data of the teaching content sample into multiple sets of low-dimensional feature data greatly reduces the dimension of the input features of the teacher style prediction model to be trained, so that the teacher style prediction performance of the trained model can be effectively improved.

It should be pointed out that, according to the needs of implementation, each component/step described in the embodiments of the present invention can be split into more components/steps, and two or more components/steps, or partial operations of components/steps, can be combined into new components/steps to achieve the purpose of the embodiments of the present invention.

The above method according to the embodiments of the present invention can be implemented in hardware or firmware, or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and to be stored in a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It can be understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, flash memory, etc.) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the training method of the teacher style prediction model described here is implemented. In addition, when a general-purpose computer accesses the code for implementing the training method of the teacher style prediction model shown here, the execution of the code converts the general-purpose computer into a special-purpose computer for executing that training method.

A person of ordinary skill in the art will appreciate that the units and method steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered as going beyond the scope of the embodiments of the present invention.

The above implementations are only used to illustrate the embodiments of the present invention and are not intended to limit them. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention. Therefore, all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the patent protection scope of the embodiments of the present invention should be defined by the claims.

Claims

1. A training method of a teacher style prediction model, characterized in that the method comprises:

determining multiple sets of second feature data of a teaching content sample based on first feature data of the teaching content sample, wherein the first feature data has a higher dimension than the second feature data;

obtaining, through the teacher style prediction model to be trained, teacher style prediction data corresponding to the teaching content sample based on the multiple sets of second feature data; and

training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data.
2. The method according to claim 1, wherein the teacher style prediction model comprises a plurality of low-level models and a high-level model connected to the output ends of the plurality of low-level models,

and the obtaining, through the teacher style prediction model to be trained, teacher style prediction data corresponding to the teaching content sample based on the multiple sets of second feature data comprises:

obtaining, through the plurality of low-level models, multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of second feature data; and

obtaining, through the high-level model, teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data.
3. The method according to claim 2, wherein each of the plurality of low-level models comprises a hidden layer and a prediction layer connected to the output end of the hidden layer,

and the obtaining, through the plurality of low-level models, multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of second feature data comprises:

performing feature extraction operations on the multiple sets of second feature data respectively through the hidden layers to obtain feature characterization data corresponding to the multiple sets of second feature data; and

performing a mapping operation, through the prediction layers, on the feature characterization data corresponding to the multiple sets of second feature data to obtain the multiple teacher style preliminary prediction data corresponding to the teaching content sample.
4. The method according to claim 3, wherein the obtaining, through the high-level model, teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data comprises:

generating high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data; and

obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data.
5. The method according to claim 4, wherein the generating high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data comprises:

generating the high-level feature characterization data based on the multiple teacher style preliminary prediction data and the feature characterization data respectively corresponding to the multiple sets of second feature data.
6. The method according to claim 4, wherein the obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data comprises:

performing a feature extraction operation on the high-level feature characterization data through the hidden layer in the high-level model to obtain feature characterization data corresponding to the high-level feature characterization data; and

performing a mapping operation, through the prediction layer in the high-level model, on the feature characterization data corresponding to the high-level feature characterization data to obtain the teacher style final prediction data corresponding to the teaching content sample.
7. The method according to claim 2, wherein the training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data comprises:

determining, through a target loss function, a difference value between the teacher style annotation data and the teacher style final prediction data; and

adjusting, based on the difference value, the parameters of the plurality of low-level models and of the high-level model in the teacher style prediction model.
8. The method according to any one of claims 1-7, wherein the method further comprises:

determining multiple sets of second feature data of teaching content data based on first feature data of the teaching content data; and

obtaining, through the trained teacher style prediction model, teacher style prediction data corresponding to the teaching content data based on the multiple sets of second feature data of the teaching content data.
9. The method according to claim 8, wherein the method further comprises:

performing a mapping operation on the teacher style prediction data to obtain a teacher style category corresponding to the teaching content data.
10. A non-transitory computer-readable storage medium, wherein the non-transitory computer-readable storage medium stores a readable program, and when the readable program is executed by a processor, the processor executes the following steps:

determining multiple sets of second feature data of a teaching content sample based on first feature data of the teaching content sample, wherein the first feature data has a higher dimension than the second feature data;

obtaining, through a teacher style prediction model to be trained, teacher style prediction data corresponding to the teaching content sample based on the multiple sets of second feature data; and

training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data.
11. The non-transitory computer-readable storage medium of claim 10, wherein the teacher style prediction model comprises a plurality of low-level models and a high-level model connected to the output ends of the plurality of low-level models,

and the obtaining, through the teacher style prediction model to be trained, teacher style prediction data corresponding to the teaching content sample based on the multiple sets of second feature data comprises:

obtaining, through the plurality of low-level models, multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of second feature data; and

obtaining, through the high-level model, teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data.
12. The non-transitory computer-readable storage medium of claim 11, wherein each of the plurality of low-level models comprises a hidden layer and a prediction layer connected to the output end of the hidden layer,

and the obtaining, through the plurality of low-level models, multiple teacher style preliminary prediction data corresponding to the teaching content sample based on the multiple sets of second feature data comprises:

performing feature extraction operations on the multiple sets of second feature data respectively through the hidden layers to obtain feature characterization data corresponding to the multiple sets of second feature data; and

performing a mapping operation, through the prediction layers, on the feature characterization data corresponding to the multiple sets of second feature data to obtain the multiple teacher style preliminary prediction data corresponding to the teaching content sample.
13. The non-transitory computer-readable storage medium according to claim 12, wherein the obtaining, through the high-level model, teacher style final prediction data corresponding to the teaching content sample based on the multiple teacher style preliminary prediction data comprises:

generating high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data; and

obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data.
14. The non-transitory computer-readable storage medium of claim 13, wherein the generating high-level feature characterization data corresponding to the high-level model based on the multiple teacher style preliminary prediction data comprises:

generating the high-level feature characterization data based on the multiple teacher style preliminary prediction data and the feature characterization data respectively corresponding to the multiple sets of second feature data.
15. The non-transitory computer-readable storage medium according to claim 13, wherein the obtaining, through the high-level model, the teacher style final prediction data corresponding to the teaching content sample based on the high-level feature characterization data comprises:

performing a feature extraction operation on the high-level feature characterization data through the hidden layer in the high-level model to obtain feature characterization data corresponding to the high-level feature characterization data; and

performing a mapping operation, through the prediction layer in the high-level model, on the feature characterization data corresponding to the high-level feature characterization data to obtain the teacher style final prediction data corresponding to the teaching content sample.
16. The non-transitory computer-readable storage medium of claim 11, wherein the training the teacher style prediction model based on teacher style annotation data of the teaching content sample and the teacher style prediction data comprises:

determining, through a target loss function, a difference value between the teacher style annotation data and the teacher style final prediction data; and

adjusting, based on the difference value, the parameters of the plurality of low-level models and of the high-level model in the teacher style prediction model.
17. The non-transitory computer-readable storage medium according to any one of claims 10-16, wherein when the readable program is executed by the processor, the processor further executes the following steps:

determining multiple sets of second feature data of teaching content data based on first feature data of the teaching content data; and

obtaining, through the trained teacher style prediction model, teacher style prediction data corresponding to the teaching content data based on the multiple sets of second feature data of the teaching content data.
18. The non-transitory computer-readable storage medium of claim 17, wherein when the readable program is executed by the processor, the processor further executes the following step:

performing a mapping operation on the teacher style prediction data to obtain a teacher style category corresponding to the teaching content data.