CN116595987A - Method, device and storage medium for classifying threat text based on a neural network model - Google Patents

Method, device and storage medium for classifying threat text based on a neural network model

Info

Publication number
CN116595987A
Authority
CN
China
Prior art keywords
text
neural network
network model
vector
threat
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310354490.4A
Other languages
Chinese (zh)
Inventor
王昊天 (Wang Haotian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of International Business and Economics
Original Assignee
University of International Business and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of International Business and Economics
Priority to CN202310354490.4A
Publication of CN116595987A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a threat text classification method, device and storage medium based on a neural network model. It relates to the field of text recognition, in particular to deep learning with neural network models, and can be applied to feature analysis and semantic mining of text so as to replace manual detection of dangerous or threatening text. The scheme comprises the following steps: first, an administrative review text is preprocessed to obtain an intermediate result that serves as the feature input; the intermediate result is fed into a Bert model, and word-level and sentence-level codes are fused through an attention mechanism to obtain a semantic representation of the threat text; finally, the recall rate of the model is increased by raising the penalty coefficient for misclassified threat text, so that the approximate range of threat text is located more accurately. Using the neural network model to assist in identifying threat information reduces the workload of administrative review staff, saves labor cost, and indirectly improves the level of early intervention in incidents that the threat-maker might cause.

Description

Method, device and storage medium for classifying threat text based on a neural network model
Technical Field
The invention relates to the field of text recognition, in particular to deep learning with neural network models.
Background
Threat recognition means using an algorithm to let a computer identify threatening statements in texts such as mails generated during the administrative review process; its essence is feature analysis and semantic mining of the text, replacing manual detection of dangerous or threatening content. One key to this recognition problem is research on the automated classification of administrative review texts, and the other is the relatively complex deep-learning-based approach.
However, threat recognition in administrative review texts is essentially a classification problem that combines abnormal-semantics text classification with imbalanced data processing. Owing to the nature of the business, the semantic boundary of threat text is not sharply defined, so it is difficult to find, among existing research results, a classification framework and algorithm that allow the model to identify text semantics accurately and to screen threats with a coverage rate that matches the manual level.
Disclosure of Invention
The invention provides a method, a device and a storage medium for classifying threat text based on a neural network model.
According to one aspect of the embodiments of the present disclosure, a method for classifying threat text based on a neural network model includes:
S1, acquiring an administrative review text and preprocessing it to obtain an intermediate result;
S2, inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result; converting the feature coding result into an input vector, applying a multi-head self-attention mechanism to enhance the semantic vector representation of the input vector, and obtaining the relationship between the intermediate result and the threat classification; passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector; and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
According to another aspect of the embodiments of the present disclosure, the step S2 specifically includes:
carrying out semantic representation on the input intermediate result, and adding feature codes of the administrative review applicant's information to the semantically represented intermediate result to obtain the feature coding result;
converting each character or word of the feature coding result into a vector, and summing the character vector, the text vector and the position vector to obtain a one-dimensional vector representation;
enhancing the semantic vector representation of the input vector through a multi-head self-attention mechanism, and acquiring the relationship between the intermediate result and the threat classification;
passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector reflecting the occurrence probability of each word, comparing the output vector with the ground truth to obtain a comparison result, training and optimizing the model with the loss function according to the comparison result, and adjusting the weights to increase the penalty for missed threats;
and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
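The following is a minimal sketch of how such a pipeline could be assembled, assuming PyTorch and the Hugging Face transformers library; the bert-base-chinese checkpoint, the single applicant-information flag, and the two-class output head are illustrative assumptions rather than the patented implementation.

```python
# Illustrative sketch only: a Bert encoder followed by multi-head self-attention
# and a two-class head, roughly mirroring steps S1-S2. Model name, hidden sizes
# and the applicant-information flag are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel

class ThreatClassifier(nn.Module):
    def __init__(self, bert_name: str = "bert-base-chinese", n_heads: int = 8):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size           # 768 for bert-base
        self.feat_proj = nn.Linear(1, hidden)           # element feature, e.g. "ID number present"
        self.attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        self.classifier = nn.Linear(hidden, 2)          # threat / non-threat

    def forward(self, input_ids, attention_mask, applicant_flag):
        # applicant_flag: float tensor of shape (batch,), 1.0 if ID info is present
        seq = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state
        feat = self.feat_proj(applicant_flag.unsqueeze(-1)).unsqueeze(1)
        seq = torch.cat([feat, seq], dim=1)             # prepend element feature code
        fused, _ = self.attn(seq, seq, seq)              # multi-head self-attention
        logits = self.classifier(fused[:, 0])            # pooled representation
        return logits                                    # fed to softmax / loss layer
```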
According to another aspect of the disclosed embodiments, acquiring an administrative review text and preprocessing it to obtain an intermediate result includes:
removing texts with missing values and texts without semantics from the administrative review text;
eliminating escape characters, format errors and invalid information in the administrative review text with regular expressions;
performing data desensitization on the personal information of people involved in the administrative review text.
According to another aspect of embodiments of the present disclosure, the loss function used in the loss layer in the Bert neural network model is a weighted cross entropy function, wherein the weighted cross entropy function weights the threat text in the input vector.
According to another aspect of the disclosed embodiments, the recall rate, the precision rate and the F1 value are calculated for the obtained threat text classification result, where the recall rate (calculation mode 1), the precision rate (calculation mode 2) and the F1 value (calculation mode 3) are expressed as follows;
calculation mode 1: recall = TP / (TP + FN)
calculation mode 2: precision = TP / (TP + FP)
calculation mode 3: F1 = 2 * precision * recall / (precision + recall)
wherein TP represents the number of texts correctly predicted as threat texts, FP represents the number of texts incorrectly predicted as threat texts, FN represents the number of threat texts incorrectly predicted as non-threat texts, recall represents the recall rate, and precision represents the precision rate.
According to another aspect of the embodiments of the present disclosure, there is provided a method for training a Bert neural network model, including:
acquiring training sample data of administrative review texts, wherein the administrative review texts comprise threat texts and non-threat texts;
inputting the training sample data of the administrative review texts into an element coding layer in a preset Bert neural network model to obtain a feature coding result;
inputting the feature coding result to a vector representation layer in a preset Bert neural network model to obtain an input vector;
inputting the input vector to an attention layer, and acquiring the relationship between the training sample and the threat classification;
inputting the input vector and the relationship between the training sample and the threat classification into the preset Bert neural network model to obtain an output vector;
performing a classification operation according to the output vector to obtain the threat text classification result;
calculating a loss between the output vector of the preset Bert neural network model and the training sample labels, training parameters of the preset Bert neural network model based on the loss function, and taking the resulting model as the trained Bert neural network model.
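A compact training-loop sketch along these lines is shown below. It assumes the ThreatClassifier sketched earlier, a DataLoader yielding tokenized batches with an applicant_flag field, and a weighted cross-entropy criterion; the weight of 30 for the threat class and the optimizer settings are illustrative choices, not the claimed training procedure.

```python
# Illustrative training loop only; batch fields, the threat-class weight and the
# optimizer settings are assumptions for the sketch.
import torch
import torch.nn as nn

def train(model, loader, epochs: int = 3, threat_weight: float = 30.0,
          device: str = "cuda" if torch.cuda.is_available() else "cpu"):
    model.to(device).train()
    # class 0 = non-threat, class 1 = threat; the larger weight penalizes missed threats
    criterion = nn.CrossEntropyLoss(
        weight=torch.tensor([1.0, threat_weight], device=device))
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    for _ in range(epochs):
        for batch in loader:
            logits = model(batch["input_ids"].to(device),
                           batch["attention_mask"].to(device),
                           batch["applicant_flag"].to(device))
            loss = criterion(logits, batch["label"].to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```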
According to another aspect of the embodiments of the present disclosure, there is provided a threat text classification apparatus based on a neural network model, including:
an intermediate result determining module: acquiring an administrative review text, and preprocessing the administrative review text to obtain an intermediate result;
a feature coding result determining module: inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result;
an input vector determination module: converting the feature coding result into an input vector, applying a multi-head self-attention mechanism to enhance the semantic vector representation of the input vector, and obtaining the relationship between the intermediate result and the threat classification;
an output vector determination module: passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector;
the classification result determining module: performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
According to another aspect of the embodiments of the present disclosure, there is provided a training apparatus of a Bert neural network model, including:
training sample determination module: acquiring training sample data of administrative review texts, wherein the administrative review texts comprise threat texts and non-threat texts;
and a feature coding result determining module: inputting the training sample data of the administrative review texts into an element coding layer in a preset Bert neural network model to obtain a feature coding result;
an input vector determination module: inputting the feature coding result to a vector representation layer in a preset Bert neural network model to obtain an input vector;
the relation determining module: inputting the input vector to an attention layer, and acquiring the relationship between the training sample and the threat classification;
an output vector determination module: inputting the input vector and the relationship between the training sample and the threat classification into the preset Bert neural network model to obtain an output vector;
the classification result determining module: performing a classification operation according to the output vector to obtain the threat text classification result;
the neural network model determination module: calculating a loss between the output vector of the preset Bert neural network model and the training sample labels, training parameters of the preset Bert neural network model based on the loss function, and taking the resulting model as the trained Bert neural network model.
According to another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and a processor executes the computer program to implement the threat text classification method based on the neural network model according to the first aspect.
The invention adopts the above technical solution and has at least the following beneficial effects:
The embodiments of the disclosure study threat recognition in imbalanced administrative review text data and provide a threat text classification model based on a neural network model, which classifies administrative review texts into a threat class and a non-threat class. Imbalanced-data processing techniques and the characteristics of threat text improve the recall rate for threats, so the model can replace manual screening, meet practical business requirements, and cover essentially all threat samples. At the same time, labor cost is greatly reduced while the miss rate remains roughly equal to that of manual screening, which is of important practical significance.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the disclosure.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
Fig. 1 is a task workflow diagram of a threat text classification method based on a neural network model in an embodiment of the disclosure;
fig. 2 is a method schematic diagram of a threat text classification method based on a neural network model in an embodiment of the disclosure;
FIG. 3 is a schematic diagram of a Bert neural network model, in an embodiment of the disclosure;
FIG. 4 is a diagram of a manner of desensitizing information data of administrative review text related personnel in an embodiment of the present disclosure;
fig. 5 is a schematic diagram of the experimental result analysis of a threat text classification method based on a neural network model in an embodiment of the disclosure;
fig. 6 is a schematic diagram of the ablation experiment results of a threat text classification method based on a neural network model in an embodiment of the disclosure;
fig. 7 is another schematic diagram of the ablation experiment results of a threat text classification method based on a neural network model in an embodiment of the disclosure;
fig. 8 is a schematic diagram of a threat text classification device based on a neural network model in an embodiment of the disclosure;
fig. 9 is a diagram of a Bert neural network model training device applied to threat text recognition in an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be described in detail below. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort fall within the protection scope of the invention as defined by the claims.
The terms "first," "second," "third," "fourth," and the like in the embodiments of the present disclosure are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a series of steps or elements is not necessarily limited to those explicitly listed but may include other steps or elements not explicitly listed or inherent to such process, method, article, or apparatus.
According to the embodiments of the disclosure, in line with the actual business situation, the threat-class texts, which are the minority in the administrative review texts, are set as positive samples, and the non-threat class as negative samples. Unlike "negative comments" or "negative short texts", a potentially serious public safety incident is often hidden behind a threat-class sample, so the model mainly aims to identify as many threat-class samples as possible and thereby reduce the probability of serious public safety incidents. On the other hand, the administrative review texts used in the embodiments of the present disclosure are long texts, for which general text anomaly-handling models do not perform well, so they are harder to handle than "negative comments" or "negative short texts". At present, actual business operation mostly relies on keyword-based screening: texts containing specific keywords are identified by a computer and then selected manually, so texts that carry a threatening meaning but contain no keywords cannot be identified at the semantic level, and the workload is not reduced much.
Therefore, a new workflow for the threat recognition task is designed in the embodiments of the disclosure, as shown in fig. 1: a classification model replaces the previous keyword-based method, so that the miss rate for threats is reduced at the semantic level. While losing few target threats, secondary screening is still performed manually on the threat sample set at a later stage to find the real threat information, and the workload of this secondary screening is greatly reduced. Because the number of samples in the resulting threat-class sample set is far smaller than the total number of samples, the overall labor cost and labor intensity are greatly reduced even though manual identification is still used.
Therefore, unlike other large-scale recognition tasks, where the accuracy or F1 value of the text classifier matters more, threat text recognition focuses on improving the recall rate over the known, limited samples and reducing missed detections. The task of the invention is thus to have the model recall cover the threat samples as fully as possible, that is, to improve the accuracy or F1 value as much as possible on the premise that the coverage is close to or better than the manual identification level, so that the designed model has practical significance.
Therefore, for the task of threat recognition in imbalanced administrative review text data, the embodiments of the present disclosure propose a text classification model that combines element feature coding, an attention mechanism and a weighted loss function. First, the differences between threat and non-threat texts are added as feature input; then word-level and sentence-level codes are fused through the attention mechanism to obtain a semantic representation of the threat text; finally, the recall rate of the model is increased by raising the penalty coefficient for misclassified threat text, so that the approximate range of threat text is located more accurately. Using the neural network model to assist in identifying threat information reduces the workload of administrative review staff, saves labor cost, and indirectly improves the level of early intervention in incidents that the threat-maker might cause.
According to an aspect of the disclosed embodiments, there is provided a method for classifying threat text based on a neural network model, including the steps shown in fig. 2:
S1: acquiring an administrative review text and preprocessing it to obtain an intermediate result;
S2: inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result; converting the feature coding result into an input vector, applying a multi-head self-attention mechanism to enhance the semantic vector representation of the input vector, and obtaining the relationship between the intermediate result and the threat classification; passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector; and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
In one possible embodiment, the Bert neural network model is as shown in fig. 3, and step S2 further includes:
semantic representation is performed on the input administrative review text, and then feature codes of the administrative review applicant's information are added to obtain the feature coding result;
converting each character or word in the text into a one-dimensional vector to obtain the input vector;
enhancing the semantic vector representation of the input text, and further acquiring the relationship between the text and the classification;
outputting, through an activation function, the vector representation fused with the full-text semantic information as an output vector reflecting the occurrence probability of each word, comparing the output vector with the ground truth, training and optimizing the model with the loss function, and adjusting the weight to increase the penalty for missed threats;
and performing binary classification on the text according to the output vector, outputting the threat class or the non-threat class, and covering more threats by adjusting the decision threshold.
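As an illustration of the threshold adjustment mentioned in the last step, the sketch below lowers the decision threshold on the predicted threat probability so that more borderline texts are routed to the threat class for manual secondary screening; the example threshold of 0.2 is an assumed value, not a figure from the disclosure.

```python
# Illustrative only: lowering the decision threshold routes more borderline texts
# to the threat class for manual secondary screening.
import torch
import torch.nn.functional as F

def predict_threat(model, input_ids, attention_mask, applicant_flag,
                   threshold: float = 0.2):      # assumed value; 0.5 is the usual default
    device = next(model.parameters()).device
    model.eval()
    with torch.no_grad():
        logits = model(input_ids.to(device), attention_mask.to(device),
                       applicant_flag.to(device))
        p_threat = F.softmax(logits, dim=-1)[:, 1]   # probability of the threat class
    return (p_threat >= threshold).long().cpu()      # 1 = threat, 0 = non-threat
```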
In one possible embodiment, obtaining the administrative review text, and preprocessing the administrative review text to obtain the intermediate result includes:
removing texts with missing values and texts without semantics from the administrative review text;
Observation shows that the administrative review texts contain some invalid short texts, including texts with missing values and texts without semantics, which can be regarded as dirty data. The dirty data does not affect the output result, so it is removed to improve efficiency and save labor cost.
eliminating escape characters, format errors and invalid information in the administrative review text with regular expressions;
For example, special symbols such as escape sequences and invalid website information cause the model to learn wrong information and prevent it from attending to the useful parts, which degrades the model effect, so such information is removed.
performing data desensitization on the personal information of people involved in the administrative review text.
ID-card numbers, mobile phone numbers and landline numbers in the administrative review text are replaced with substitute characters, in the manner shown in fig. 4. On the one hand, this desensitizes the personal information of the people involved in the administrative review text; on the other hand, the presence of such information can be used as a feature input to better detect threats. For example, in the obtained administrative review data, 12% of threat-class texts contain the applicant's specific ID-card number, versus 3% of non-threat texts, which can be interpreted as follows: threat texts with ID-card information express the applicant's grievance more strongly and state reasonable or unreasonable demands or threats in a manner closer to a real-name statement. Because the two kinds of samples differ considerably in this respect, exploiting the difference improves the recall of the model.
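A minimal preprocessing sketch in this spirit is shown below; the regular expressions, the placeholder tokens and the rule for the applicant-information flag are assumptions made for illustration, not the patterns actually used in the disclosure.

```python
# Illustrative preprocessing sketch: regex cleanup plus desensitization of ID-card
# and phone numbers. All patterns and placeholder tokens are assumptions.
import re

ID_CARD = re.compile(r"\d{17}[\dXx]")               # mainland ID-card style number
PHONE = re.compile(r"1\d{10}|\d{3,4}-\d{7,8}")       # mobile or landline number
URL = re.compile(r"https?://\S+")

def preprocess(text: str):
    if not text or not text.strip():
        return None                                   # drop missing / empty texts
    has_id = ID_CARD.search(text) is not None         # element feature: ID number present
    text = URL.sub("", text)                          # strip invalid website information
    text = ID_CARD.sub("[ID]", text)                  # desensitize ID-card numbers
    text = PHONE.sub("[TEL]", text)                   # desensitize phone numbers
    text = re.sub(r"[\r\n\t]+", " ", text)            # remove escape characters
    return text.strip(), float(has_id)                # cleaned text plus feature flag
```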
In one possible embodiment, the experimental evaluation indices required in the embodiments of the present disclosure are provided, including:
where TP represents the number of samples correctly predicted as the positive class, FP represents the number of samples incorrectly predicted as the positive class, and FN represents the number of positive samples incorrectly predicted as the negative class; the evaluation criteria recall, precision and F1 value are calculated as in formula (1), formula (2) and formula (3): recall = TP / (TP + FN), precision = TP / (TP + FP), F1 = 2 * precision * recall / (precision + recall).
Because threat text recognition is essential, the embodiments of the disclosure adopt the threat recall rate as the primary evaluation index: the higher the recall rate, the more threat samples the predicted threat set contains, and the better the effect. The final result is interpreted jointly in terms of recall, precision and F1 values in order to avoid the extreme case of predicting all texts as threats to obtain 100% recall. Meanwhile, model evaluation refers to the AUC of the Precision-Recall curve rather than the AUC of the ROC curve, because the Precision-Recall AUC is the more suitable criterion for modeling imbalanced data. In addition, the context relations captured by the attention mechanism can be visualized, and individual cases can be used to discuss whether the attention mechanism recognizes the relevant semantic information in a sentence.
In one possible embodiment, the embodiments of the present disclosure handle imbalanced data in the loss layer, as follows.
The loss function used by the loss layer in the embodiments of the present disclosure is the weighted cross-entropy function WCE Loss, which differs from binary cross entropy in that a weight ω is added to the threat-text term of the input vector, which balances the recall between threat and non-threat text. Without WCE Loss, the model would tend to predict only non-threat text. The larger the weight coefficient, the greater the contribution of that class to the loss function; that is, the class with few samples should receive a larger weight coefficient so as to strengthen its contribution to the loss. Because identifying threat-class samples is of central importance to the invention, the ω weight placed before the threat class is set larger than the normal value.
WCE Loss, equation (4), weights the loss of samples from different classes: WCE = -(1/N) * Σ_i [ ω · y_i · log(logits_i) + (1 - y_i) · log(1 - logits_i) ], where y_i is the label, logits_i is the predicted probability value, and ω is the weight. By giving a larger weight to the class with fewer samples, the loss function increases the recall rate for the small class, so the invention increases the loss weight for identifying threat samples, making the classifier avoid missing threat texts as much as possible when optimizing on hard-to-distinguish texts, thereby improving the model recall rate.
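A small sketch of such a weighted loss is given below, written to mirror the form of equation (4); the manual implementation and the default weight value are assumptions rather than the exact loss layer of the disclosure.

```python
# Illustrative weighted binary cross entropy in the spirit of equation (4);
# the default weight value is an assumption. Class 1 = threat text.
import torch

def wce_loss(probs: torch.Tensor, labels: torch.Tensor, omega: float = 30.0):
    """probs: predicted threat probabilities; labels: 0/1 ground truth."""
    probs = probs.clamp(1e-7, 1 - 1e-7)               # numerical stability
    loss = -(omega * labels * torch.log(probs)
             + (1 - labels) * torch.log(1 - probs))
    return loss.mean()
```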
In one possible embodiment, the experimental results are analyzed; as shown in fig. 5, with recall as the indicator, Bert achieves the best results among all baseline models. It is worth noting that Bert-base performs better than Bert-large among the Bert pre-trained models: Bert-large contains more node information, so under the same hardware conditions Bert-base can process larger batches, while the extra node information in Bert-large brings no obvious benefit to the invention. In addition, Bert-base performs better than Chinese Bert-wwm, possibly because, for the texts of the invention, character vectors reflect the semantic relationships in threats better than word vectors, or because character features in the element feature coding reflect those relationships better than word features.
The results of the Bert-base model were further analyzed to explore the causes of classification errors. In terms of recall, the 1.9% of missed threats fall largely into two cases. First, some texts are labeled as threat class even though no particularly extreme semantic information appears, because they might still endanger major national events; second, the manually annotated data used in the invention may contain some labeling errors. Therefore, to further improve recall, the invention can enlarge the scale of the administrative review data on the basis of well-labeled data, expanding the training set and strengthening the model's ability to recognize mildly extreme semantics.
On the other hand, the precision errors of the model mostly fall into two types. The first is misrecognition when sentences have relatively complex logical relations, especially hypothetical or conditional relations; the second is that individual sensitive words appear in the applicant's wording, and although the model considers context semantics rather than keywords, the abnormal sensitivity of individual words still causes a wrong overall classification. For such errors, the invention expects better text classification models to solve these problems in the future.
The above experimental results show that the Bert-base model achieves the best effect. According to another possible embodiment of the disclosure, in order to evaluate the contribution of each part of the Bert neural network model, ablation experiments are performed in which the attention mechanism, the loss function and the element feature coding layer are removed in turn to observe the change in each result. The specific results are shown in fig. 6: each part of the model has a marked effect, the attention mechanism and the loss function contribute most to improving the recall rate, and adding element feature coding also greatly improves the results; none of the three parts can be omitted, and the model only has good practical significance when all three are present.
Further ablation experiments are performed on the Bert-base model, with results shown in fig. 7. On the one hand, the cases with and without element feature coding are compared, and the results show that with element feature coding the recall rate improves more stably across different weights than without it. In addition, compared with leaving the loss function unchanged, that is, with the weight equal to 1, the results for both settings improve markedly as the weight increases, but a bottleneck is reached when the weight rises to 30 and overfitting appears, so the optimal weight is 30 and the recall rate with element feature coding is 98.1%.
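The weight sweep described above could be reproduced with a loop of the following shape; the candidate weights and the reuse of the train, predict_threat and threat_metrics sketches from earlier are assumptions, and the 98.1% figure comes from the disclosure, not from this code.

```python
# Illustrative sweep over the loss weight omega; the candidate values are assumed.
def sweep_weights(make_model, train_loader, dev_batch, weights=(1, 10, 20, 30, 50)):
    results = {}
    for w in weights:
        model = train(make_model(), train_loader, threat_weight=float(w))
        preds = predict_threat(model, dev_batch["input_ids"],
                               dev_batch["attention_mask"],
                               dev_batch["applicant_flag"])
        recall, precision, f1 = threat_metrics(dev_batch["label"].tolist(),
                                               preds.tolist())
        results[w] = recall      # look for the recall plateau (around w = 30 per fig. 7)
    return results
```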
In a possible embodiment, an embodiment of the present disclosure provides a threat text classification device based on a neural network model, where a schematic diagram of the device is shown in fig. 8, and the device includes:
intermediate result determination module 801: acquiring an administrative review text, and preprocessing the administrative review text to obtain an intermediate result;
classification result determination module 802: inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result; converting the feature coding result into an input vector and enhancing its representation to obtain the relationship between the intermediate result and the classification; feeding the enhanced input vector, according to that relationship, into an activation function to obtain an output vector; and classifying the text according to the output vector to obtain the classification result.
In one possible embodiment, an embodiment of the disclosure provides a training apparatus for a neural network model, as shown in fig. 9, including:
training sample determination module 901: acquiring training sample data of administrative review texts, wherein the administrative review texts comprise threat texts and non-threat texts;
feature encoding result determination module 902: inputting the training sample data of the administrative review texts into an element coding layer in a preset Bert neural network model to obtain a feature coding result;
the input vector determination module 903: inputting the feature coding result to a vector representation layer in a preset Bert neural network model to obtain an input vector;
relationship determination module 904: inputting the input vector to an attention layer, and acquiring the relationship between the training sample and the threat classification;
the output vector determination module 905: inputting the input vector and the relationship between the training sample and the threat classification into the preset Bert neural network model to obtain an output vector;
classification result determination module 906: performing a classification operation according to the output vector to obtain the threat text classification result;
the neural network model determination module 907: calculating a loss between the output vector of the preset Bert neural network model and the training sample labels, training parameters of the preset Bert neural network model based on the loss function, and taking the resulting model as the trained Bert neural network model.
The embodiment of the disclosure provides a computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and a processor executes the computer program to implement the threat text classification method based on a neural network model.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the embodiments of the present disclosure may be performed in parallel, sequentially, or in a different order, so long as the desired result of the technical solution disclosed in the embodiments of the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the embodiments of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the embodiments of the present disclosure are intended to be included within the scope of the embodiments of the present disclosure.

Claims (9)

1. A method for classifying threat text based on a neural network model, the method comprising:
S1, acquiring an administrative review text and preprocessing it to obtain an intermediate result;
S2, inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result; converting the feature coding result into an input vector, applying a multi-head self-attention mechanism to enhance the semantic vector representation of the input vector, and obtaining the relationship between the intermediate result and the threat classification; passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector; and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
2. The method according to claim 1, wherein the step S2 specifically comprises:
carrying out semantic representation on the input intermediate result, and adding feature codes of the administrative review applicant's information to the semantically represented intermediate result to obtain the feature coding result;
converting each character or word of the feature coding result into a vector, and summing the character vector, the text vector and the position vector to obtain a one-dimensional vector representation;
enhancing the semantic vector representation of the input vector through a multi-head self-attention mechanism, and acquiring the relationship between the intermediate result and the threat classification;
passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector reflecting the occurrence probability of each word, comparing the output vector with the ground truth to obtain a comparison result, training and optimizing the model with the loss function according to the comparison result, and adjusting the weights to increase the penalty for missed threats;
and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
3. The method of claim 1, wherein acquiring the administrative review text and preprocessing it to obtain the intermediate result comprises:
removing texts with missing values and texts without semantics from the administrative review text;
eliminating escape characters, format errors and invalid information in the administrative review text with regular expressions;
and performing data desensitization on the personal information of people involved in the administrative review text.
4. The method according to any of claims 1-2, wherein the loss function used in a loss layer in the Bert neural network model is a weighted cross entropy function, wherein the weighted cross entropy function weights the threat text in the input vector.
5. The method according to claim 1, wherein the recall rate, the precision rate and the F1 value are calculated for the obtained threat text classification result, where the recall rate (calculation mode 1), the precision rate (calculation mode 2) and the F1 value (calculation mode 3) are expressed as follows;
calculation mode 1: recall = TP / (TP + FN)
calculation mode 2: precision = TP / (TP + FP)
calculation mode 3: F1 = 2 * precision * recall / (precision + recall)
wherein TP represents the number of texts correctly predicted as threat texts, FP represents the number of texts incorrectly predicted as threat texts, FN represents the number of threat texts incorrectly predicted as non-threat texts, recall represents the recall rate, and precision represents the precision rate.
6. A method for training a Bert neural network model, comprising:
acquiring training sample data of administrative review texts, wherein the administrative review texts comprise threat texts and non-threat texts;
inputting the training sample data of the administrative review texts into an element coding layer in a preset Bert neural network model to obtain a feature coding result;
inputting the feature coding result to a vector representation layer in the preset Bert neural network model to obtain an input vector;
inputting the input vector to an attention layer, and acquiring the relationship between the training sample and the threat classification;
inputting the input vector and the relationship between the training sample and the threat classification into the preset Bert neural network model to obtain an output vector;
performing a classification operation according to the output vector to obtain the threat text classification result;
calculating a loss between the output vector of the preset Bert neural network model and the training sample labels, training parameters of the preset Bert neural network model based on the loss function, and taking the resulting model as the trained Bert neural network model.
7. A threat text classification device based on a neural network model, comprising:
an intermediate result determining module: acquiring an administrative review text, and preprocessing the administrative review text to obtain an intermediate result;
the classification result determining module: inputting the intermediate result into a Bert neural network model for feature coding to obtain a feature coding result; converting the feature coding result into an input vector, applying a multi-head self-attention mechanism to enhance the semantic vector representation of the input vector, and obtaining the relationship between the intermediate result and the threat classification; passing the input vector and the relationship between the intermediate result and the threat classification through an activation function to obtain an output vector; and performing classification on the administrative review text according to the output vector to obtain the threat text classification result.
8. A Bert neural network model training device, comprising:
training sample determination module: acquiring training sample data of administrative review texts, wherein the administrative review texts comprise threat texts and non-threat texts;
and a feature coding result determining module: inputting the training sample data of the administrative review texts into an element coding layer in a preset Bert neural network model to obtain a feature coding result;
an input vector determination module: inputting the feature coding result to a vector representation layer in the preset Bert neural network model to obtain an input vector;
the relation determining module: inputting the input vector to an attention layer, and acquiring the relationship between the training sample and the threat classification;
an output vector determination module: inputting the input vector and the relationship between the training sample and the threat classification into the preset Bert neural network model to obtain an output vector;
the classification result determining module: performing a classification operation according to the output vector to obtain the threat text classification result;
the neural network model determination module: calculating a loss between the output vector of the preset Bert neural network model and the training sample labels, training parameters of the preset Bert neural network model based on the loss function, and taking the resulting model as the trained Bert neural network model.
9. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and a processor executes the computer program to implement the neural network model-based threat text classification method of any one of claims 1-5.
CN202310354490.4A 2023-04-04 2023-04-04 Method, device and storage medium for classifying threat text based on a neural network model Pending CN116595987A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310354490.4A 2023-04-04 2023-04-04 Method, device and storage medium for classifying threat text based on a neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310354490.4A 2023-04-04 2023-04-04 Method, device and storage medium for classifying threat text based on a neural network model

Publications (1)

Publication Number Publication Date
CN116595987A true CN116595987A (en) 2023-08-15

Family

ID=87610543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310354490.4A Pending Method, device and storage medium for classifying threat text based on a neural network model 2023-04-04 2023-04-04

Country Status (1)

Country Link
CN (1) CN116595987A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117573813A (en) * 2024-01-17 2024-02-20 清华大学 Method, system, equipment and medium for positioning and detecting internal knowledge of large language model
CN117573813B (en) * 2024-01-17 2024-03-19 清华大学 Method, system, equipment and medium for positioning and detecting internal knowledge of large language model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination