Disclosure of Invention
The invention aims to provide a bank public-opinion risk control method and system that perform model compression based on knowledge distillation, improving model prediction accuracy while reducing prediction time.
In order to achieve this purpose, the invention adopts the following technical scheme. The bank public-opinion risk control method that performs model compression based on knowledge distillation comprises the following steps:
S1: construct a model compression entity recognition module. The module is a neural network model that performs model compression based on knowledge distillation; it is constructed by first training a teacher model (the original model) on samples, and then processing through a compression model to obtain the final classification probability;
S2: construct a model compression public opinion classification module. The module is a neural network model that performs model compression based on knowledge distillation; it is constructed by first training a teacher model (the original model) on samples, and then performing knowledge distillation training on the same samples through the compression model, completing the whole knowledge distillation process;
s3: real-time public sentiment news provided by the bank is transmitted to the model compression entity recognition module and the model compression public sentiment classification module in a distributed message queue mode for analysis and processing, and the wind control early warning of bank monitoring clients is completed.
Constructing the model compression entity recognition module in step S1 specifically includes the following steps:
S1.1: train the original model. The network structure of the original model is a 12-layer pretrained BERT + Bi-LSTM + CRF. The teacher model is trained on sample set Y by maximum likelihood estimation against the ground truth, and the resulting output is called the hard-target;
S1.2: train the compression model based on knowledge distillation. The network structure of the compression model is a simple Bi-LSTM + CRF: the Bi-LSTM extracts sequence features and outputs label emission probabilities, which are fed into the CRF to generate transition probabilities; the final label classification probability is obtained from the emission probabilities plus the transition probabilities.
Constructing the model compression public opinion classification module in step S2 specifically comprises the following steps:
S2.1: train the original model. The network structure of the original model is a 12-layer pretrained BERT + TextCNN. The teacher model is trained on sample set Y by maximum likelihood estimation against the ground truth to obtain its result;
S2.2: train the compression model based on knowledge distillation. The network structure of the compression model is a simple TextCNN. The student model Net-S is distilled on the same sample set Y with the distillation temperature set to 4: sample Y is fed simultaneously into the teacher model Net-T and the student model Net-S; Net-T outputs a soft-target, while Net-S outputs both a soft-target and a hard-target during training. The cross-entropy between the soft-target of Net-T and the soft-target of Net-S gives the Lsoft term of the overall model loss function, and the cross-entropy between the hard-target of Net-S and the ground truth gives the Lhard term. Net-S is trained by back-propagation until training stops, completing the whole knowledge distillation process.
The bank public opinion system that performs model compression based on knowledge distillation comprises a distributed message queue module, a model compression entity recognition module, and a model compression public opinion classification module;
the output end of the distributed message queue module is connected with the input end of the model compression entity identification module, and the output end of the model compression entity identification module is connected with the input end of the model compression public opinion classification module;
the distributed message queue module is used for transmitting real-time public sentiment news provided by a bank to the model compression entity identification module for processing in a distributed message queue mode;
the model compression entity recognition module is used for automatically recognizing entity information in the input text;
the model compression public opinion classification module is used for classifying and predicting input public opinion information.
As a further description of the above technical solution:
the distributed message queue module is based on RabbitMQ and adopts a multi-producer, multi-consumer service architecture.
As a further description of the above technical solution:
the system also comprises a distributed cache module;
the distributed cache module adopts a Redis-based distributed cache module;
the distributed cache module is connected with the distributed message queue module and caches the requests that the distributed message queue module cannot process in time, so that they can later be processed by the model compression entity recognition module and the model compression public opinion classification module.
As a further description of the above technical solution:
the distributed cache module comprises an MQ timeout mechanism cache module;
the MQ timeout mechanism cache module writes timed-out messages into the distributed Redis cache; when the resources of the model compression entity recognition module and the model compression public opinion classification module are idle, the timed-out messages are fetched from Redis for processing.
As a further description of the above technical solution:
the distributed cache module comprises a FIFO eviction mechanism module;
when the Redis cache is full, the FIFO eviction mechanism module persists the oldest messages to a database with their state flag set to unprocessed, so that the model compression entity recognition module and the model compression public opinion classification module can process them later.
As a further description of the above technical solution:
the entity information comprises the company names and person names of the clients the bank is concerned with.
The invention provides a bank public-opinion risk control system that performs model compression based on knowledge distillation, with the following beneficial effects:
(1): by compressing a leading industry pretrained model through knowledge distillation, the system simplifies the structure of the neural network model; with far fewer model parameters it still preserves the model's performance and effect, improving prediction accuracy while reducing prediction time.
(2): through the deployment of distributed message queues and distributed caches, the system reduces model prediction response time and meets the bank's high real-time requirements.
(3): by implementing a named entity recognition model and a public opinion early warning classification model, the system completes the risk control early warning for the bank's monitored clients.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Referring to figs. 1-2, the bank public-opinion risk control method that performs model compression based on knowledge distillation comprises the following steps:
S1: construct a model compression entity recognition module. The module is a neural network model that performs model compression based on knowledge distillation; it is constructed by first training a teacher model (the original model) on samples, and then processing through a compression model to obtain the final classification probability;
S2: construct a model compression public opinion classification module. The module is a neural network model that performs model compression based on knowledge distillation; it is constructed by first training a teacher model (the original model) on samples, and then performing knowledge distillation training on the same samples through the compression model, completing the whole knowledge distillation process;
s3: real-time public sentiment news provided by the bank is transmitted to the model compression entity recognition module and the model compression public sentiment classification module in a distributed message queue mode for analysis and processing, and the wind control early warning of bank monitoring clients is completed.
Constructing the model compression entity recognition module in step S1 specifically includes the following steps:
S1.1: train the original model. The network structure of the original model is a 12-layer pretrained BERT + Bi-LSTM + CRF. The teacher model is trained on sample set Y by maximum likelihood estimation against the ground truth, and the resulting output is called the hard-target;
S1.2: train the compression model based on knowledge distillation. The network structure of the compression model is a simple Bi-LSTM + CRF: the Bi-LSTM extracts sequence features and outputs label emission probabilities, which are fed into the CRF to generate transition probabilities; the final label classification probability is obtained from the emission probabilities plus the transition probabilities.
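As an illustration of step S1.2, the following Python sketch (an assumption for illustration only, not the patented implementation) shows how per-token emission scores from a Bi-LSTM can be combined with CRF transition scores by Viterbi decoding to obtain the final label sequence:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Combine per-token emission scores (from the Bi-LSTM) with
    tag-transition scores (from the CRF) and return the best tag path.
    emissions: (seq_len, num_tags); transitions: (num_tags, num_tags)."""
    seq_len, num_tags = emissions.shape
    score = emissions[0].copy()          # best score ending in each tag so far
    backpointers = []
    for t in range(1, seq_len):
        # total[i, j]: score of reaching tag j at step t via tag i at step t-1
        total = score[:, None] + transitions + emissions[t][None, :]
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0)
    # trace the best path backwards from the highest-scoring final tag
    best_last = int(score.argmax())
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path))
```

In a trained model the emission scores would come from the Bi-LSTM output layer and the transition matrix would be a learned CRF parameter; here both are plain arrays.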
Furthermore, during original-model training the model compression entity recognition module builds the soft-target that participates in the subsequent knowledge distillation training, where it serves as an input to the loss function of the compression-model training.
Constructing the model compression public opinion classification module in step S2 specifically comprises the following steps:
S2.1: train the original model. The network structure of the original model is a 12-layer pretrained BERT + TextCNN. The teacher model is trained on sample set Y by maximum likelihood estimation against the ground truth to obtain its result;
S2.2: train the compression model based on knowledge distillation. The network structure of the compression model is a simple TextCNN. The student model Net-S is distilled on the same sample set Y with the distillation temperature set to 4: sample Y is fed simultaneously into the teacher model Net-T and the student model Net-S; Net-T outputs a soft-target, while Net-S outputs both a soft-target and a hard-target during training. The cross-entropy between the soft-target of Net-T and the soft-target of Net-S gives the Lsoft term of the overall model loss function, and the cross-entropy between the hard-target of Net-S and the ground truth gives the Lhard term. Net-S is trained by back-propagation until training stops, completing the whole knowledge distillation process.
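The loss construction of step S2.2 can be sketched as follows. The temperature-softened softmax follows the standard knowledge-distillation recipe; the equal weighting of Lsoft and Lhard via `alpha` is an assumption for illustration, since the text only states that the two terms together form the overall loss:

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with distillation temperature T (T > 1 softens the distribution)."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, y_true, T=4.0, alpha=0.5):
    """Overall loss = alpha * Lsoft + (1 - alpha) * Lhard.
    Lsoft: cross-entropy between the Net-T and Net-S soft-targets at temperature T.
    Lhard: cross-entropy between the Net-S hard prediction and the ground truth."""
    soft_teacher = softmax(teacher_logits, T)    # Net-T soft-target
    soft_student = softmax(student_logits, T)    # Net-S soft-target
    l_soft = -np.sum(soft_teacher * np.log(soft_student + 1e-12))
    hard_student = softmax(student_logits)       # Net-S hard prediction (T = 1)
    l_hard = -np.log(hard_student[y_true] + 1e-12)
    return alpha * l_soft + (1.0 - alpha) * l_hard
```

A student whose logits match the teacher's and the ground-truth label incurs a lower loss than one that contradicts them, which is what drives the Net-S back-propagation training described above.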
The bank public opinion system that performs model compression based on knowledge distillation comprises a distributed message queue module, a model compression entity recognition module, and a model compression public opinion classification module;
the output end of the distributed message queue module is connected with the input end of the model compression entity identification module, and the output end of the model compression entity identification module is connected with the input end of the model compression public opinion classification module;
the distributed message queue module is used for transmitting real-time public sentiment news provided by a bank to the model compression entity identification module for processing in a distributed message queue mode;
the model compression entity recognition module automatically recognizes entity information in the input text. Prediction-time comparison: the average prediction time of the original model is 247 ms, while the prediction time after knowledge distillation is 33 ms, about a 7-fold reduction.
The model compression public opinion classification module classifies and predicts the input public opinion information. Prediction-time comparison: the average prediction time of the original model is 150 ms, while the prediction time after knowledge distillation is 17 ms, about an 8-fold reduction.
The distributed message queue module is based on RabbitMQ and adopts a multi-producer, multi-consumer service architecture.
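The multi-producer, multi-consumer architecture can be illustrated with Python's standard-library `queue` and `threading` modules standing in for RabbitMQ. This is a deliberate simplification: a real deployment would publish and consume through a RabbitMQ client, and the `news-` message format and the `upper()` stand-in for entity recognition are purely illustrative:

```python
import queue
import threading

mq = queue.Queue()                 # stands in for a RabbitMQ queue
results = []
lock = threading.Lock()

def producer(pid, items):
    """One of several producers publishing public-opinion messages."""
    for text in items:
        mq.put(f"news-{pid}-{text}")

def consumer():
    """One of several consumers; a None message is a shutdown signal."""
    while True:
        msg = mq.get()
        if msg is None:
            mq.task_done()
            break
        with lock:
            results.append(msg.upper())   # stand-in for the downstream modules
        mq.task_done()

producers = [threading.Thread(target=producer, args=(p, ["a", "b"])) for p in range(2)]
consumers = [threading.Thread(target=consumer) for _ in range(3)]
for t in producers + consumers:
    t.start()
for t in producers:
    t.join()
for _ in consumers:
    mq.put(None)                   # one shutdown signal per consumer
for t in consumers:
    t.join()
```

The point of the architecture is that producers (news ingestion) and consumers (the recognition and classification modules) scale independently, with the queue absorbing bursts between them.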
Furthermore, compressing a leading industry pretrained model through knowledge distillation simplifies the structure of the neural network model and, with far fewer parameters, still preserves the model's performance and effect: prediction accuracy is improved and prediction time is reduced. By implementing a named entity recognition model and a public opinion early warning classification model, the system completes the risk control early warning for the bank's monitored clients.
The system also comprises a distributed cache module;
the distributed cache module adopts a Redis-based distributed cache module;
the distributed cache module is connected with the distributed message queue module and caches the requests that the distributed message queue module cannot process in time, so that they can later be processed by the model compression entity recognition module and the model compression public opinion classification module.
The distributed cache module comprises an MQ timeout mechanism cache module;
the MQ timeout mechanism cache module writes timed-out messages into the distributed Redis cache; when the resources of the model compression entity recognition module and the model compression public opinion classification module are idle, the timed-out messages are fetched from Redis for processing, avoiding reads from the database and reducing processing latency.
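A minimal sketch of this timeout mechanism, with a plain dictionary standing in for the Redis cache (an assumption for illustration; a deployment would use a Redis client, and `on_timeout`/`drain_when_idle` are hypothetical names):

```python
import time

cache = {}   # stands in for the distributed Redis cache

def on_timeout(msg_id, payload):
    """Write a timed-out MQ message into the cache instead of dropping it
    or persisting it straight to the database."""
    cache[msg_id] = {"payload": payload, "queued_at": time.time()}

def drain_when_idle(process):
    """When the recognition/classification modules are idle, pull the
    timed-out messages back out of the cache and process them."""
    done = []
    for msg_id in list(cache):
        entry = cache.pop(msg_id)
        done.append(process(entry["payload"]))
    return done
```

Because the timed-out messages stay in the cache rather than the database, the idle-time drain avoids a database read on the hot path, which is the latency saving the text describes.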
The distributed cache module comprises a FIFO eviction mechanism module;
when the Redis cache is full, the FIFO eviction mechanism module persists the oldest messages to a database with their state flag set to unprocessed, so that the model compression entity recognition module and the model compression public opinion classification module can process them later.
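A minimal sketch of the FIFO eviction mechanism, using an in-memory SQLite database as the persistence target and an `OrderedDict` as the bounded cache (both illustrative assumptions; the deployed system would use Redis and its own database schema):

```python
import sqlite3
from collections import OrderedDict

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, payload TEXT, processed INTEGER)")

class FifoCache:
    """Bounded cache; when full, the oldest entry is evicted first-in-first-out
    and persisted to the database with its state flag set to unprocessed (0)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def put(self, msg_id, payload):
        if len(self.data) >= self.capacity:
            old_id, old_payload = self.data.popitem(last=False)   # evict oldest
            db.execute("INSERT INTO messages VALUES (?, ?, 0)", (old_id, old_payload))
            db.commit()
        self.data[msg_id] = payload
```

Downstream, the recognition and classification modules would query for rows with `processed = 0` and update the flag once each message is handled.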
During peak periods there are more service requests, the execution time of the downstream modules rises accordingly, and messages back up in the RabbitMQ queue; messages that cannot be processed in time increase the processing latency and degrade the user experience. The MQ timeout mechanism cache module therefore writes the timed-out messages into the distributed Redis cache and fetches them from Redis for processing when the resources of the model compression entity recognition module and the model compression public opinion classification module are free, avoiding reads from the database and reducing processing latency.
Through the deployment of the distributed message queue and the distributed cache, the model prediction response time is reduced, and the high real-time requirement of a bank is met.
The entity information comprises the company names and person names of the clients the bank is concerned with.
In the description herein, references to the description of "one embodiment," "an example," "a specific example," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above description covers only preferred embodiments of the present invention, but the scope of the present invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein, based on the technical solutions and inventive concept of the present invention, shall fall within the protection scope of the present invention.