CN115713307B - Intelligent responsibility fixing method and device for operators - Google Patents

Intelligent responsibility fixing method and device for operators

Info

Publication number
CN115713307B
CN115713307B (application CN202211458652.0A)
Authority
CN
China
Prior art keywords
neural network
crf
work order
responsibility
bert
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211458652.0A
Other languages
Chinese (zh)
Other versions
CN115713307A (en)
Inventor
马晓亮
安玲玲
邓从健
杜德泉
朱栩
宋灿辉
张志青
王嘉豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yunqu Information Technology Co ltd
Guangzhou Institute of Technology of Xidian University
Original Assignee
Guangzhou Yunqu Information Technology Co ltd
Guangzhou Institute of Technology of Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yunqu Information Technology Co ltd, Guangzhou Institute of Technology of Xidian University filed Critical Guangzhou Yunqu Information Technology Co ltd
Priority to CN202211458652.0A
Publication of CN115713307A
Application granted
Publication of CN115713307B
Legal status: Active


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Machine Translation (AREA)

Abstract

The application provides an operator intelligent responsibility-fixing method and device based on a BERT+CRF Bayesian neural network. The method comprises the following steps: acquiring a complaint work-order corpus of a telecom operator and cleaning the work orders based on it to obtain cleaned work orders; inputting the cleaned work orders into a BERT model to obtain word vectors of the training data; extracting features from the word vectors with a CRF algorithm to obtain feature-word sequences; inputting the feature-word sequences into a Bayesian neural network model for training and tuning its parameters to form a problem root-cause analysis model; and analyzing the work order awaiting responsibility assignment with the problem root-cause analysis model to determine where responsibility lies. By combining word-level and sentence-level semantics, the method identifies keywords in complaint work orders well and fixes responsibility for each work order through those keywords, so the party responsible for a complaint can be traced accurately and quickly.

Description

Intelligent responsibility fixing method and device for operators
Technical Field
The application relates to the field of deep learning, and in particular to an operator intelligent responsibility-fixing method and device based on a BERT+CRF Bayesian neural network.
Background
An operator's customer-service workflow generates a large volume of customer feedback. Tracing the cause of each complaint and fixing responsibility for it consumes substantial manpower and material resources every day; traditional manual responsibility fixing of customer complaint work orders is costly in labor, resources, and time, and its efficiency is low.
The earliest keyword extraction was based mainly on dictionaries and rules. Such methods rely on rule templates constructed manually by linguists, are error-prone, and cannot be ported across domains; they can therefore handle only simple text data and are powerless against complex unstructured data. Methods based on statistical machine learning followed, including Hidden Markov Models (HMM), Maximum Entropy Models (MEM), Support Vector Machines (SVM), and Conditional Random Fields (CRF). However, these methods still require heavy manual feature engineering and depend severely on a corpus, so their recognition performance is far from ideal.
In recent years, deep learning has therefore been applied to Chinese keyword extraction. Deep-learning-based methods learn features and distributed representations from the data, avoiding complex manual feature extraction, and generalize well. Many neural network models, such as LSTM-CRF, BiLSTM-CRF, CNN-CRF, and CNN-BiLSTM-CRF, have shown good results in keyword extraction.
These methods, however, cannot represent polysemy: they focus mainly on feature extraction between words or characters while ignoring the context and semantics around each word, so they extract only static word vectors without contextual information, which reduces their entity-recognition ability. To solve this problem, the prior art proposed BERT, which further strengthens the generalization ability of word-vector models, fully captures character-level, word-level, sentence-level, and even inter-sentence relationship features, and better represents syntax and semantics in different contexts.
Moreover, although an artificial neural network can make accurate predictions, it suffers from slow convergence and a tendency to fall into local minima, whereas a Bayesian neural network can better avoid overfitting.
In view of this, overcoming the shortcomings of the prior art is a problem to be solved urgently in the art.
Disclosure of Invention
The main technical problem to be solved is to provide an operator intelligent responsibility-fixing method and device based on a BERT+CRF Bayesian neural network. By combining word-level and sentence-level semantics, the method identifies keywords in complaint work orders well and fixes responsibility for each work order through those keywords, accurately and quickly tracing the party responsible for the complaint, so customer problems are solved faster, work-order processing efficiency improves, and service quality and user satisfaction ultimately rise.
To solve the above technical problems, one technical scheme adopted by the application is as follows: an operator intelligent responsibility-fixing method based on a BERT+CRF Bayesian neural network, comprising the following steps:
S1: acquiring a complaint work-order corpus of a telecom operator and cleaning the work orders based on it to obtain cleaned work orders;
S2: inputting the cleaned work orders into a BERT model to obtain word vectors of the training data;
S3: extracting features from the word vectors with a CRF algorithm to obtain feature-word sequences;
S4: inputting the feature-word sequences into a Bayesian neural network model for training and tuning its parameters to form a problem root-cause analysis model;
S5: analyzing the work order awaiting responsibility assignment with the problem root-cause analysis model to determine where responsibility lies.
Further, acquiring the complaint work-order corpus of the telecom operator and cleaning the work orders based on it to obtain cleaned work orders comprises:
acquiring the product type, service type, and service content of the work-order data;
standardizing the product type and service type: replacing forward slashes in them with underscores and joining the product type and service type with a hyphen to generate the work-order type;
preprocessing the service-content text: converting all letters to lower case and processing the text with a tokenizer to obtain processed data;
and encoding the processed data into word embeddings, type embeddings, and position embeddings to obtain the cleaned work orders.
Further, the BERT model adopts a bidirectional Transformer as its encoder to fuse the context on the left and right of each word;
the Transformer adopts a multi-head mode to expand the model's ability to attend to different positions and to enlarge the representation subspaces of the attention units.
Further, a residual network is added to the encoding unit of the BERT model, as shown in the following formula:

$\tilde{h}_i = h_i + u_i$

where $\tilde{h}_i$ is the output after passing through the residual network, $h_i$ is the output of the multi-head attention mechanism without the residual network, and $u_i$ is the input of the multi-head attention mechanism;
the output of the fully connected feed-forward network is calculated according to the following formula:

$\mathrm{FNN}(\tilde{h}_i) = \max(0,\, \tilde{h}_i W_1 + b_1)\, W_2 + b_2$

The fully connected feed-forward network in the Transformer structure has two dense layers, where the first layer's activation function is a ReLU and the second is linear; $\tilde{h}_i$ is the output of the multi-head attention mechanism, $b$ is the bias vector, and FNN is the fully connected feed-forward network.
Further, extracting features from the word vectors with the CRF algorithm to obtain feature-word sequences comprises:
taking the output of the BERT model as the input of a CRF module; the CRF can obtain an optimal predicted sequence from the relations between adjacent labels, where the observation sequence of the given conditional random field is $f = f_1, f_2, \ldots, f_n$ and the state sequence is $y = y_1, y_2, \ldots, y_n$ with $y_i \in \{B, I, O\}$;
in the CRF module, the conditional probability distribution of the state sequence $y$ given the observation sequence $f$ is:

$p(y \mid f) = \dfrac{\exp\bigl(\sum_j W_j\,\omega_j(y, f) + b_j\bigr)}{\sum_{y' \in \gamma(f)} \exp\bigl(\sum_j W_j\,\omega_j(y', f) + b_j\bigr)}$

where $\omega_j$ is a feature function, $\gamma(f)$ denotes all possible state sequences, and $W_j$ and $b_j$ are the weight and bias, respectively.
Further, extracting features from the word vectors with the CRF algorithm to obtain feature-word sequences comprises:
training the CRF with maximum likelihood estimation to obtain the output sequence with the maximum conditional probability;
after the CRF module, all feature-word sequences in each work order are output.
Further, inputting the feature-word sequences into the Bayesian neural network model for training and tuning its parameters to form the problem root-cause analysis model comprises:
placing a prior distribution over the parameters of the Bayesian neural network, with the weight matrix of the $i$-th layer given as $w_i$; after the training data set is input, the input data are converted into a Gaussian distribution, thereby obtaining the parameters with higher probability.
Further, inputting the feature-word sequences into the Bayesian neural network model for training and tuning its parameters to form the problem root-cause analysis model comprises:
predefining a likelihood distribution $p(y \mid x, w)$, where $x$ is an input value, $y$ is an output value, and $w$ are the weights of the neural network;
obtaining the posterior probability function of the parameter likelihood distribution with the following formula and obtaining a point estimate of the parameters by maximizing the posterior probability:

$p(y^* \mid x^*, D) = \int p(y^* \mid x^*, w)\, p(w \mid D)\, dw$

where $x^*$ denotes new input data and $y^*$ denotes the new output predicted for it by integration.
Further, inputting the feature-word sequences into the Bayesian neural network model for training and tuning its parameters to form the problem root-cause analysis model comprises:
in the Bayesian neural network model, identifying the parameters based on maximum likelihood estimation, whose formula is:

$w_{\mathrm{MLE}} = \arg\max_w \log P(D \mid w)$

In maximum likelihood estimation the probabilities of $w$ taking different values are regarded as equal, and no prior estimate is made for $w$; if a prior estimate is introduced for $w$, it becomes the maximum a posteriori (MAP) estimate, as shown below:

$w_{\mathrm{MAP}} = \arg\max_w \log P(w \mid D) = \arg\max_w \bigl[\log P(D \mid w) + \log P(w)\bigr]$

The Bayesian neural network model contains one hidden layer, and the keywords extracted by BERT+CRF are converted into 300-dimensional sentence-like vectors with doc2vec;
the sentence vectors are input into the Bayesian neural network, $w$ and $b$ are updated by the BP neural network, and $\alpha$ and $\sigma$ are obtained by maximum likelihood estimation; the parameters updated by the Bayesian neural network are the mean and the variance $\sigma^2$ of the data;
the update formula is iterated repeatedly until the convergence condition is met, and the parameters are then substituted into the initial weight posterior probability distribution to solve for the optimal weights.
To solve the above technical problems, another technical scheme adopted by the application is as follows: an operator intelligent responsibility-fixing device based on a BERT+CRF Bayesian neural network, comprising:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the operator intelligent responsibility-fixing method of the BERT+CRF-based Bayesian neural network described herein.
The beneficial effects of this application are as follows. The application provides an operator intelligent responsibility-fixing method and device based on a BERT+CRF Bayesian neural network, the method comprising the following steps: acquiring a complaint work-order corpus of a telecom operator and cleaning the work orders based on it to obtain cleaned work orders; inputting the cleaned work orders into a BERT model to obtain word vectors of the training data; extracting features from the word vectors with a CRF algorithm to obtain feature-word sequences; inputting the feature-word sequences into a Bayesian neural network model for training and tuning its parameters to form a problem root-cause analysis model; and analyzing the work order awaiting responsibility assignment with the problem root-cause analysis model to determine where responsibility lies.
According to the invention, the operator intelligent responsibility-fixing method based on the BERT+CRF Bayesian neural network combines word-level and sentence-level semantics to identify keywords in complaint work orders well, and through those keywords accurately and quickly traces the party responsible for the complaint, so customer problems are solved faster, work-order processing efficiency improves, and service quality and user satisfaction ultimately rise.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art could obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of an operator intelligent responsibility fixing method of a bayesian neural network based on bert+crf according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an intelligent responsibility-defining system for operators according to an embodiment of the present invention;
fig. 3 is a technical model diagram of an operator intelligent responsibility fixing method of a bayesian neural network based on bert+crf according to an embodiment of the present invention;
FIG. 4 is a keyword extraction model diagram of BERT+CRF provided by an embodiment of the present invention;
fig. 5 is a responsibility fixing structure of a bayesian neural network according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the description of the present application, it should be understood that the terms "center," "longitudinal," "transverse," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate an orientation or positional relationship based on that shown in the drawings, merely for convenience of description and to simplify the description, and do not indicate or imply that the devices or elements referred to must have a particular orientation, be configured and operated in a particular orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In this application, the term "exemplary" is used to mean "serving as an example, instance, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been shown in detail to avoid obscuring the description of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
It should be noted that, because the method in the embodiments of the present application is executed in an electronic device, the objects processed by each electronic device all exist in the form of data or information; for example, a time is essentially time information. It can be understood that, in the subsequent embodiments, sizes, quantities, positions, and the like are all the corresponding data, so that the electronic device can process them; this is not detailed again here.
Example 1:
In practical application scenarios, growing service demand has greatly increased the variety and number of complaint work orders, all of which must have responsibility allocated. Manual one-by-one checking and allocation consumes enormous manpower and material resources, and traditional work-order assignment methods such as supervised learning, semi-supervised learning, RNNs, and CNNs can no longer meet the growing demand. The embodiment of the invention fuses BERT, CRF, and a Bayesian neural network and provides an operator intelligent responsibility-fixing method based on a BERT+CRF Bayesian neural network to improve the efficiency and accuracy of work-order responsibility fixing.
In order to achieve the aim of the invention, the technical scheme adopted is as follows:
S1: acquiring a complaint work-order corpus of a telecom operator and cleaning the work orders based on it to obtain cleaned work orders;
In this embodiment, the acquired work-order data include the product type, service type, service content, and the like. The product type and service type are standardized: forward slashes (/) in them are replaced with underscores (_), and the product type and service type are joined with a hyphen (-) to generate the work-order type.
The service content is then preprocessed: all letters are converted to lower case, the text is segmented with a tokenizer, and special characters, inter-character spacing, and the like are handled. To reduce noise in the data set, samples whose text length is less than 16 or greater than 800 are filtered out. Because BERT's maximum input length is limited to 512, leaving 510 positions after [CLS] and [SEP] are removed, the retained content length is greater than 16 and less than 510. Common ways to handle over-long text are a) direct truncation, b) extraction of important fragments, and c) segmentation; the embodiment of the invention processes it by direct truncation.
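As an illustration of this cleaning step, a minimal Python sketch follows; the DataFrame column names (product_type, service_type, content) and the bert-base-chinese checkpoint are assumptions made for the example, not details taken from the patent.

```python
# Sketch of the work-order cleaning step; column names are assumed, not from the patent.
import pandas as pd
from transformers import BertTokenizer

MIN_LEN, MAX_LEN_FILTER = 16, 800  # noise-filter bounds stated in the text above

def clean_work_orders(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Standardization: "/" -> "_" inside each type, then join the two types with "-".
    df["work_order_type"] = (
        df["product_type"].str.replace("/", "_")
        + "-"
        + df["service_type"].str.replace("/", "_")
    )
    # Text preprocessing: convert all letters to lower case.
    df["content"] = df["content"].str.lower()
    # Drop samples whose text length is below 16 or above 800.
    keep = df["content"].str.len().between(MIN_LEN, MAX_LEN_FILTER)
    return df[keep]

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint

def encode(text: str):
    # Direct truncation to BERT's 512-position limit ([CLS] + 510 tokens + [SEP]).
    return tokenizer(text, truncation=True, max_length=512)
```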
The processed data are then encoded into word embeddings (token embeddings), type embeddings (segment embeddings), and position embeddings (position embeddings). The word-embedding dimension is 768; if there is only one sentence, the type embedding is an all-zero matrix; and the position embedding is given by:

$PE_{(pos,\,2i)} = \sin\!\left(\dfrac{pos}{10000^{2i/d_{\mathrm{model}}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\dfrac{pos}{10000^{2i/d_{\mathrm{model}}}}\right)$

where $pos$ is the position of the word in the input sequence, $i$ is the dimension number, and the $2i$ (sine) and $2i+1$ (cosine) dimensions alternate.
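A NumPy sketch of this sinusoidal position embedding, with the 768-dimensional model size stated above (sine on the 2i dimensions, cosine on the 2i+1 dimensions):

```python
import numpy as np

def position_embedding(max_len: int, d_model: int = 768) -> np.ndarray:
    """Sinusoidal position embedding: sin on even (2i) dims, cos on odd (2i+1) dims."""
    pe = np.zeros((max_len, d_model))
    pos = np.arange(max_len)[:, None]          # position index in the sequence
    i = np.arange(0, d_model, 2)[None, :]      # dimension index 2i
    angle = pos / np.power(10000.0, i / d_model)
    pe[:, 0::2] = np.sin(angle)                # even dimensions
    pe[:, 1::2] = np.cos(angle)                # odd dimensions
    return pe
```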
In this embodiment, the word embeddings, position embeddings, and type embeddings serve as the inputs of the BERT model. Each data set is then divided into a training set, a validation set, and a test set; the BERT model is trained on the training set, validated on the validation set, and tested on the test set.
S2: inputting the cleaned work orders into a BERT model to obtain word vectors of the training data;
S3: extracting features from the word vectors with a CRF algorithm to obtain feature-word sequences;
S4: inputting the feature-word sequences into a Bayesian neural network model for training and tuning its parameters to form a problem root-cause analysis model;
S5: analyzing the work order awaiting responsibility assignment with the problem root-cause analysis model to determine where responsibility lies.
In this embodiment, the summed word vectors of the work-order content, customer feedback, and the like are trained. The whole algorithm consists of the BERT model, the CRF algorithm, and the Bayesian neural network, with the following specific structure:
the step S2 specifically includes the following steps:
(1) To fuse the context on the left and right of each word, the BERT model uses a bidirectional Transformer as its encoder. The most important module of the encoding unit is the Self-Attention part, as shown in the formula:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\dfrac{QK^T}{\sqrt{d_k}}\right) V$

where $Q$, $K$, and $V$ are the input word-vector matrices and $d_k$ is the input vector dimension.
To expand the BERT model's ability to attend to different positions and enlarge the representation subspaces of the attention units, the Transformer adopts a "multi-head" mode, as shown in the following formula:

$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)\, W^O$
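The two formulas above can be sketched in NumPy as follows; the head count h = 12 and per-head dimension d/h follow BERT-base convention and are assumptions, since the patent does not fix them:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product self-attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)    # (heads, seq, seq)
    w = np.exp(scores - scores.max(-1, keepdims=True))  # numerically stable softmax
    w /= w.sum(-1, keepdims=True)
    return w @ V                                        # (heads, seq, d_head)

def multi_head(x, Wq, Wk, Wv, Wo, h=12):
    """Project into h heads, attend in each subspace, concatenate, project with W_o."""
    seq, d = x.shape
    d_head = d // h
    split = lambda W: (x @ W).reshape(seq, h, d_head).transpose(1, 0, 2)
    heads = attention(split(Wq), split(Wk), split(Wv))
    concat = heads.transpose(1, 0, 2).reshape(seq, d)   # Concat(head_1, ..., head_h)
    return concat @ Wo
```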
(2) To alleviate the degradation problem in deep learning, a residual network is added to the encoding unit, as shown in the following formula:

$\tilde{h}_i = h_i + u_i$

where $\tilde{h}_i$ is the output after passing through the residual network, $h_i$ is the output of the multi-head attention mechanism without the residual network, and $u_i$ is the input of the multi-head attention mechanism;
the fully connected feed-forward network in the Transformer structure has two dense layers, the first with a ReLU activation function and the second with a linear activation function:

$\mathrm{FNN}(\tilde{h}_i) = \max(0,\, \tilde{h}_i W_1 + b_1)\, W_2 + b_2$

where $\tilde{h}_i$ is the output of the multi-head attention mechanism, $b$ is the bias vector, and FNN is the fully connected feed-forward network.
(3) The output of the BERT model is taken as the input of the CRF algorithm; the CRF can obtain an optimal predicted sequence from the relations between adjacent labels. Assume $f = f_1, f_2, \ldots, f_n$ denotes the observation sequence of the conditional random field and $y = y_1, y_2, \ldots, y_n$, $y_i \in \{B, I, O\}$, denotes the actual output sequence. In the linear CRF, given the observation sequence $f$, the conditional probability distribution of the state sequence $y$ is:

$p(y \mid f) = \dfrac{\exp\bigl(\sum_j W_j\,\omega_j(y, f) + b_j\bigr)}{\sum_{y' \in \gamma(f)} \exp\bigl(\sum_j W_j\,\omega_j(y', f) + b_j\bigr)}$

where $\omega_j$ is a feature function, $\gamma(f)$ denotes all possible state sequences, and $W_j$ and $b_j$ are the weight and bias, respectively.
The step S3 specifically includes the following steps:
(4) The CRF algorithm is trained by maximum likelihood estimation, maximizing the log conditional likelihood $\sum \log p(y \mid f)$ over the training data; decoding then obtains the output sequence with the maximum conditional probability:

$y^* = \arg\max_{y \in \gamma(f)} p(y \mid f)$

After the CRF algorithm, all feature-word sequences in each work order are output.
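The text does not spell out how the optimal sequence is decoded; for a linear-chain CRF over the {B, I, O} states the standard procedure is Viterbi decoding, sketched below with assumed emission and transition score arrays:

```python
import numpy as np

TAGS = ["B", "I", "O"]  # state set used in the text

def viterbi(emissions: np.ndarray, transitions: np.ndarray) -> list:
    """Return the tag sequence maximizing the CRF score.

    emissions  : (n, 3) per-token label scores (e.g., from the BERT encoder)
    transitions: (3, 3) scores for adjacent-label pairs, transitions[i, j] = i -> j
    """
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + transitions + emissions[t][None, :]  # (prev, cur)
        back[t] = cand.argmax(0)          # best previous tag for each current tag
        score = cand.max(0)
    path = [int(score.argmax())]          # trace the best path backwards
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [TAGS[i] for i in reversed(path)]
```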
The step S4 specifically includes the following steps:
(5) The Bayesian neural network places a prior distribution over its parameters, with the weight matrix of the $i$-th layer given as $w_i$. In general, after the training data set is input, the input data are converted into a Gaussian distribution so as to obtain the parameters with higher probability. A likelihood distribution $p(y \mid x, w)$ therefore needs to be defined first, where $D$ denotes the data set observed in the Bayesian formula (in the present invention, the keyword data output by the CRF), $x$ is an input value, $y$ is an output value, $w$ are the weights of the neural network, $x^*$ denotes new input data, and $y^*$ denotes the new output predicted for it by integration; specifically:

$p(y^* \mid x^*, D) = \int p(y^* \mid x^*, w)\, p(w \mid D)\, dw$

The posterior probability function of the parameter likelihood distribution is obtained, and a point estimate of the parameters is obtained by maximizing the posterior probability.
In Bayesian neural networks, parameter identification is typically based on maximum likelihood estimation (MLE):

$w_{\mathrm{MLE}} = \arg\max_w \log P(D \mid w)$
(6) In maximum likelihood estimation the probabilities of $w$ taking different values are considered equal, i.e., no prior estimate is made for $w$. Introducing a prior for $w$ yields the maximum a posteriori (MAP) estimate:

$w_{\mathrm{MAP}} = \arg\max_w \log P(w \mid D) = \arg\max_w \bigl[\log P(D \mid w) + \log P(w)\bigr]$
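A minimal numerical illustration of the difference between the two estimates, under the assumption (consistent with the Gaussian-distribution remark elsewhere in the text) of a Gaussian likelihood and a Gaussian prior: maximizing log P(D|w) + log P(w) then amounts to adding an L2 penalty to the MLE objective.

```python
import numpy as np

def neg_log_likelihood(w, X, y):
    """-log P(D|w) for a Gaussian observation model (up to constants)."""
    resid = y - X @ w
    return 0.5 * np.sum(resid ** 2)

def neg_log_posterior(w, X, y, tau=1.0):
    """MAP objective: a Gaussian prior N(0, 1/tau) on w adds an L2 term."""
    return neg_log_likelihood(w, X, y) + 0.5 * tau * np.sum(w ** 2)
```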
(7) Because training takes a long time and overfitting easily occurs when a Bayesian neural network has too many hidden layers, only one hidden layer is used. The specific operation steps are as follows:
the keywords extracted by BERT+CRF are converted into 300-dimensional sentence-like vectors with doc2vec, which takes the interrelations among the keywords into account; its principle is similar to that of word2vec;
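A sketch of this conversion with gensim's Doc2Vec at the stated vector_size of 300; the window, min_count, and epochs values and the sample keyword lists are illustrative assumptions:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Keywords extracted by BERT+CRF for each work order (illustrative examples).
keyword_lists = [["network", "outage", "refund"], ["billing", "overcharge", "package"]]
docs = [TaggedDocument(words=kw, tags=[i]) for i, kw in enumerate(keyword_lists)]

# 300-dimensional sentence-like vectors, as stated in the text.
model = Doc2Vec(docs, vector_size=300, window=5, min_count=1, epochs=40)
sent_vec = model.infer_vector(["network", "outage"])  # vector for a new keyword list
```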
the sentence vectors are input into the Bayesian neural network, $w$ and $b$ are updated by the BP neural network, and $\alpha$ and $\sigma$ are obtained by maximum likelihood estimation; the parameters updated by the Bayesian neural network are the mean and the variance $\sigma^2$ of the data;
the update formula is iterated repeatedly until the convergence condition is met, and the parameters are then substituted into the initial weight posterior probability distribution to solve for the optimal weights.
The Bayesian neural network uses the ReLU function as its activation function and classifies work orders through it, completing intelligent responsibility fixing for the operator.
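A sketch of how such a one-hidden-layer Bayesian network could produce a responsibility class from a 300-dimensional sentence vector: each parameter is drawn from its Gaussian posterior N(mu, sigma^2) and the sampled outputs are averaged, approximating the predictive integral given earlier. The softmax output head, the Monte-Carlo averaging, and all shapes are assumptions; the patent states only the ReLU activation and the classification.

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_predict(x, layers, n_samples=50):
    """Monte-Carlo prediction with one ReLU hidden layer and Gaussian weight posteriors.

    layers = [(mu_W1, sigma_W1, mu_b1, sigma_b1),   # input -> hidden
              (mu_W2, sigma_W2, mu_b2, sigma_b2)]   # hidden -> responsibility classes
    """
    (mW1, sW1, mb1, sb1), (mW2, sW2, mb2, sb2) = layers
    probs = []
    for _ in range(n_samples):
        h = np.maximum(0, x @ rng.normal(mW1, sW1) + rng.normal(mb1, sb1))  # ReLU hidden layer
        logits = h @ rng.normal(mW2, sW2) + rng.normal(mb2, sb2)
        e = np.exp(logits - logits.max())
        probs.append(e / e.sum())        # softmax over responsibility classes (assumed head)
    mean_probs = np.mean(probs, axis=0)  # average over weight samples
    return int(mean_probs.argmax()), mean_probs

# Example: 300-dim sentence vector, 64 hidden units, 5 responsibility classes (all assumed).
x = rng.normal(size=300)
layers = [
    (rng.normal(size=(300, 64)) * 0.05, np.full((300, 64), 0.05), np.zeros(64), np.full(64, 0.05)),
    (rng.normal(size=(64, 5)) * 0.05, np.full((64, 5), 0.05), np.zeros(5), np.full(5, 0.05)),
]
label, probs = bayesian_predict(x, layers)
```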
In the embodiment of the invention, the operator intelligent responsibility-fixing method based on the BERT+CRF Bayesian neural network combines word-level and sentence-level semantics to identify keywords in complaint work orders well, fixes responsibility for each work order through those keywords, accurately and quickly traces the party responsible for the complaint, solves customer problems faster, improves work-order processing efficiency, and ultimately raises service quality and user satisfaction.
Example 2:
To make the above objects, features, and advantages of the embodiments of the invention more comprehensible, the embodiments are described in detail below with reference to figs. 4 and 5:
(1) The FTP file system acquires the original work-order data, comprising the product type, service type, service content, and receiving team;
(2) The data processing system standardizes the product types and service types to generate work-order types, and each work-order type independently generates a corresponding directory for storing service content and models;
(3) The data processing system preprocesses the service content, segmenting the text with the BERT tokenizer and removing stop words and invalid information;
In this embodiment, the acquired work-order data include the product type, service type, service content, and the like. The product type and service type are standardized: forward slashes (/) in them are replaced with underscores (_), and the product type and service type are joined with a hyphen (-) to generate the work-order type.
The service content is then preprocessed: all letters are converted to lower case, the text is segmented with a tokenizer, and special characters, inter-character spacing, and the like are handled. To reduce noise in the data set, samples whose text length is less than 16 or greater than 800 are filtered out. Because BERT's maximum input length is limited to 512, leaving 510 positions after [CLS] and [SEP] are removed, the retained content length is greater than 16 and less than 510. Common ways to handle over-long text are a) direct truncation, b) extraction of important fragments, and c) segmentation; the embodiment of the invention processes it by direct truncation.
The processed data are then encoded into word embeddings (token embeddings), type embeddings (segment embeddings), and position embeddings (position embeddings). The word-embedding dimension is 768; if there is only one sentence, the type embedding is an all-zero matrix; and the position embedding is given by:

$PE_{(pos,\,2i)} = \sin\!\left(\dfrac{pos}{10000^{2i/d_{\mathrm{model}}}}\right), \qquad PE_{(pos,\,2i+1)} = \cos\!\left(\dfrac{pos}{10000^{2i/d_{\mathrm{model}}}}\right)$

where $pos$ is the position of the word in the input sequence, $i$ is the dimension number, and the $2i$ (sine) and $2i+1$ (cosine) dimensions alternate.
In this embodiment, the word embeddings, position embeddings, and type embeddings serve as the inputs of the BERT model. Each data set is then divided into a training set, a validation set, and a test set; the BERT model is trained on the training set, validated on the validation set, and tested on the test set.
(4) The intelligent responsibility-fixing system generates word vectors of the service content using the tokenizer's encodings;
(5) The intelligent responsibility-fixing system trains the word vectors with the BERT+CRF Bayesian neural network and stores the generated intelligent responsibility-fixing model;
(6) The intelligent responsibility-fixing system performs responsibility prediction on the test text and obtains the final classification prediction according to the probability threshold, as the sketch after this list ties together.
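Taken together, steps (1) to (6) could be glued as in the following sketch; every name here (clean_work_order, encode, bert_model, crf_decode, doc2vec_model, bayesian_predict, layers, and the 0.5 threshold) is a hypothetical stand-in for the components sketched in Example 1, not an API defined by the patent.

```python
def assign_responsibility(raw_order: dict, layers, threshold: float = 0.5):
    # Steps (2)-(3): standardize the types and preprocess the service content.
    order = clean_work_order(raw_order)
    # Step (4): tokenizer encoding, then BERT word vectors.
    tokens = encode(order["content"])
    word_vectors = bert_model(tokens)
    # CRF feature-word sequence: the extracted keywords.
    keywords = crf_decode(word_vectors)
    # 300-dimensional sentence-like vector via doc2vec.
    sent_vec = doc2vec_model.infer_vector(keywords)
    # Steps (5)-(6): Bayesian neural network prediction with a probability threshold.
    label, probs = bayesian_predict(sent_vec, layers)
    return label if probs.max() >= threshold else None  # below threshold: no automatic assignment
```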
In this embodiment, the summed word vectors of the work-order content, customer feedback, and the like are trained. The whole algorithm consists of the BERT model, the CRF algorithm, and the Bayesian neural network, with the following specific structure:
(1) To fuse the context on the left and right of each word, the BERT model uses a bidirectional Transformer as its encoder. The most important module of the encoding unit is the Self-Attention part, as shown in the formula:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\dfrac{QK^T}{\sqrt{d_k}}\right) V$

where $Q$, $K$, and $V$ are the input word-vector matrices and $d_k$ is the input vector dimension.
To expand the BERT model's ability to attend to different positions and enlarge the representation subspaces of the attention units, the Transformer adopts a "multi-head" mode, as shown in the following formula:

$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_h)\, W^O$
(2) To alleviate the degradation problem in deep learning, a residual network is added to the encoding unit, as shown in the following formula:

$\tilde{h}_i = h_i + u_i$

where $\tilde{h}_i$ is the output after passing through the residual network, $h_i$ is the output of the multi-head attention mechanism without the residual network, and $u_i$ is the input of the multi-head attention mechanism;
the fully connected feed-forward network in the Transformer structure has two dense layers, the first with a ReLU activation function and the second with a linear activation function:

$\mathrm{FNN}(\tilde{h}_i) = \max(0,\, \tilde{h}_i W_1 + b_1)\, W_2 + b_2$

where $\tilde{h}_i$ is the output of the multi-head attention mechanism, $b$ is the bias vector, and FNN is the fully connected feed-forward network.
(3) The output of the BERT model is taken as the input of the CRF algorithm; the CRF can obtain an optimal predicted sequence from the relations between adjacent labels. Assume $f = f_1, f_2, \ldots, f_n$ denotes the observation sequence of the conditional random field and $y = y_1, y_2, \ldots, y_n$, $y_i \in \{B, I, O\}$, denotes the actual output sequence. In the linear CRF, given the observation sequence $f$, the conditional probability distribution of the state sequence $y$ is:

$p(y \mid f) = \dfrac{\exp\bigl(\sum_j W_j\,\omega_j(y, f) + b_j\bigr)}{\sum_{y' \in \gamma(f)} \exp\bigl(\sum_j W_j\,\omega_j(y', f) + b_j\bigr)}$

where $\omega_j$ is a feature function, $\gamma(f)$ denotes all possible state sequences, and $W_j$ and $b_j$ are the weight and bias, respectively.
(4) The CRF algorithm is trained by maximum likelihood estimation, maximizing the log conditional likelihood $\sum \log p(y \mid f)$ over the training data; decoding then obtains the output sequence with the maximum conditional probability:

$y^* = \arg\max_{y \in \gamma(f)} p(y \mid f)$

After the CRF algorithm, all feature-word sequences in each work order are output.
(5) The Bayesian neural network places a prior distribution over its parameters, with the weight matrix of the $i$-th layer given as $w_i$. In general, after the training data set is input, the input data are converted into a Gaussian distribution so as to obtain the parameters with higher probability. A likelihood distribution $p(y \mid x, w)$ therefore needs to be defined first, where $D$ denotes the data set observed in the Bayesian formula (in the present invention, the keyword data output by the CRF), $x$ is an input value, $y$ is an output value, $w$ are the weights of the neural network, $x^*$ denotes new input data, and $y^*$ denotes the new output predicted for it by integration; specifically:

$p(y^* \mid x^*, D) = \int p(y^* \mid x^*, w)\, p(w \mid D)\, dw$

The posterior probability function of the parameter likelihood distribution is obtained, and a point estimate of the parameters is obtained by maximizing the posterior probability.
In Bayesian neural networks, parameter identification is typically based on maximum likelihood estimation (MLE):

$w_{\mathrm{MLE}} = \arg\max_w \log P(D \mid w)$
(6) In maximum likelihood estimation the probabilities of $w$ taking different values are considered equal, i.e., no prior estimate is made for $w$. Introducing a prior for $w$ yields the maximum a posteriori (MAP) estimate:

$w_{\mathrm{MAP}} = \arg\max_w \log P(w \mid D) = \arg\max_w \bigl[\log P(D \mid w) + \log P(w)\bigr]$
(7) Because training takes a long time and overfitting easily occurs when a Bayesian neural network has too many hidden layers, only one hidden layer is used. The specific operation steps are as follows:
the keywords extracted by BERT+CRF are converted into 300-dimensional sentence-like vectors with doc2vec, which takes the interrelations among the keywords into account; its principle is similar to that of word2vec;
the sentence vectors are input into the Bayesian neural network, $w$ and $b$ are updated by the BP neural network, and $\alpha$ and $\sigma$ are obtained by maximum likelihood estimation; the parameters updated by the Bayesian neural network are the mean and the variance $\sigma^2$ of the data;
the update formula is iterated repeatedly until the convergence condition is met, and the parameters are then substituted into the initial weight posterior probability distribution to solve for the optimal weights.
The Bayesian neural network uses the ReLU function as its activation function and classifies work orders through it, completing intelligent responsibility fixing for the operator.
In the embodiment of the invention, the operator intelligent responsibility-fixing method based on the BERT+CRF Bayesian neural network combines word-level and sentence-level semantics to identify keywords in complaint work orders well, fixes responsibility for each work order through those keywords, accurately and quickly traces the party responsible for the complaint, solves customer problems faster, improves work-order processing efficiency, and ultimately raises service quality and user satisfaction.
The foregoing description is only of embodiments of the present application and is not intended to limit its patent scope; all equivalent structures or equivalent processes made using the description of the present application, whether applied directly or indirectly in other related technical fields, are likewise included within the patent protection scope of the present application.

Claims (4)

1. An operator intelligent responsibility-fixing method based on a BERT+CRF Bayesian neural network, characterized by comprising the following steps:
S1: acquiring a complaint work-order corpus of a telecom operator and cleaning the work orders based on it to obtain cleaned work orders;
S2: inputting the cleaned work orders into a BERT model to obtain word vectors of the training data;
S3: extracting features from the word vectors with a CRF algorithm to obtain feature-word sequences;
S4: inputting the feature-word sequences into a Bayesian neural network model for training and tuning its parameters to form a problem root-cause analysis model;
S5: analyzing the work order awaiting responsibility assignment with the problem root-cause analysis model to determine where responsibility lies;
wherein the BERT model adopts a bidirectional Transformer as its encoder to fuse the context on the left and right of each word;
the Transformer adopts a multi-head mode to expand the model's ability to attend to different positions and to enlarge the representation subspaces of the attention units;
a residual network is added to the encoding unit of the BERT model, as shown in the following formula:

$\tilde{h}_i = h_i + u_i$

where $\tilde{h}_i$ is the output after passing through the residual network, $h_i$ is the output of the multi-head attention mechanism without the residual network, and $u_i$ is the input of the multi-head attention mechanism;
the output of the fully connected feed-forward network is calculated according to the following formula:

$\mathrm{FNN}(\tilde{h}_i) = \max(0,\, \tilde{h}_i W_1 + b_1)\, W_2 + b_2$

the fully connected feed-forward network in the Transformer structure has two dense layers, where the first layer's activation function is a ReLU and the second is linear, $\tilde{h}_i$ is the output of the multi-head attention mechanism, $b$ is the bias vector, and FNN is the fully connected feed-forward network;
extracting features from the word vectors with the CRF algorithm comprises: taking the output of the BERT model as the input of a CRF module; the CRF can obtain an optimal predicted sequence from the relations between adjacent labels, where the observation sequence of the given conditional random field is $f = f_1, f_2, \ldots, f_n$ and the state sequence is $y = y_1, y_2, \ldots, y_n$ with $y_i \in \{B, I, O\}$;
in the CRF module, the conditional probability distribution of the state sequence $y$ given the observation sequence $f$ is:

$p(y \mid f) = \dfrac{\exp\bigl(\sum_j W_j\,\omega_j(y, f) + b_j\bigr)}{\sum_{y' \in \gamma(f)} \exp\bigl(\sum_j W_j\,\omega_j(y', f) + b_j\bigr)}$

where $\omega_j$ is a feature function, $\gamma(f)$ denotes all possible state sequences, and $W_j$ and $b_j$ are the weight and bias, respectively;
extracting features from the word vectors with the CRF algorithm further comprises: training the CRF with maximum likelihood estimation to obtain the output sequence with the maximum conditional probability;
after the CRF module, all feature-word sequences in each work order are output; inputting the feature-word sequences into the Bayesian neural network model for training and tuning its parameters to form the problem root-cause analysis model comprises:
predefining a likelihood distribution $p(y \mid x, w)$, where $x$ is an input value, $y$ is an output value, and $w$ are the weights of the neural network;
obtaining the posterior probability function of the parameter likelihood distribution with the following formula and obtaining a point estimate of the parameters by maximizing the posterior probability:

$p(y^* \mid x^*, D) = \int p(y^* \mid x^*, w)\, p(w \mid D)\, dw$

where $x^*$ denotes new input data and $y^*$ denotes the new output predicted for it by integration;
in the Bayesian neural network model, the parameters are identified based on maximum likelihood estimation, whose formula is:

$w_{\mathrm{MLE}} = \arg\max_w \log P(D \mid w)$

in maximum likelihood estimation the probabilities of $w$ taking different values are regarded as equal, and no prior estimate is made for $w$; if a prior estimate is introduced for $w$, it becomes the maximum a posteriori (MAP) estimate, as shown below:

$w_{\mathrm{MAP}} = \arg\max_w \log P(w \mid D) = \arg\max_w \bigl[\log P(D \mid w) + \log P(w)\bigr]$
the Bayesian neural network model contains one hidden layer, and the keywords extracted by BERT+CRF are converted into 300-dimensional sentence-like vectors with doc2vec;
the sentence vectors are input into the Bayesian neural network, $w$ and $b$ are updated by the BP neural network, and $\alpha$ and $\sigma$ are obtained by maximum likelihood estimation; the parameters updated by the Bayesian neural network are the mean and the variance $\sigma^2$ of the data;
the update formula is iterated repeatedly until the convergence condition is met, and the parameters are then substituted into the initial weight posterior probability distribution to solve for the optimal weights.
2. The method for intelligently fixing responsibility for an operator according to claim 1, wherein acquiring the complaint work-order corpus of the telecom operator and cleaning the work orders based on it to obtain cleaned work orders comprises:
acquiring the product type, service type, and service content of the work-order data; standardizing the product type and service type: replacing forward slashes in them with underscores and joining the product type and service type with a hyphen to generate the work-order type;
preprocessing the service-content text: converting all letters to lower case and processing the text with a tokenizer to obtain processed data;
and encoding the processed data into word embeddings, type embeddings, and position embeddings to obtain the cleaned work orders.
3. The method for intelligently fixing responsibility for an operator according to claim 1, wherein inputting the feature-word sequences into the Bayesian neural network model for training and tuning its parameters to form the problem root-cause analysis model comprises:
placing a prior distribution over the parameters of the Bayesian neural network, with the weight matrix of the $i$-th layer given as $w_i$; after the training data set is input, the input data are converted into a Gaussian distribution, thereby obtaining the parameters with higher probability.
4. An operator intelligent responsibility-fixing device based on a BERT+CRF Bayesian neural network, characterized by comprising:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the operator intelligent responsibility-fixing method based on the BERT+CRF Bayesian neural network of any one of claims 1 to 3.
CN202211458652.0A 2022-11-17 2022-11-17 Intelligent responsibility fixing method and device for operators Active CN115713307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211458652.0A CN115713307B (en) 2022-11-17 2022-11-17 Intelligent responsibility fixing method and device for operators

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211458652.0A CN115713307B (en) 2022-11-17 2022-11-17 Intelligent responsibility fixing method and device for operators

Publications (2)

Publication Number Publication Date
CN115713307A CN115713307A (en) 2023-02-24
CN115713307B true CN115713307B (en) 2024-02-06

Family

ID=85234434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211458652.0A Active CN115713307B (en) 2022-11-17 2022-11-17 Intelligent responsibility fixing method and device for operators

Country Status (1)

Country Link
CN (1) CN115713307B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804651A (en) * 2018-06-07 2018-11-13 南京邮电大学 A kind of Social behaviors detection method based on reinforcing Bayes's classification
CN111753060A (en) * 2020-07-29 2020-10-09 腾讯科技(深圳)有限公司 Information retrieval method, device, equipment and computer readable storage medium
CN112949300A (en) * 2021-03-05 2021-06-11 深圳大学 Typhoon early warning planning model automatic generation method and system based on deep learning
CN113064992A (en) * 2021-03-22 2021-07-02 平安银行股份有限公司 Complaint work order structured processing method, device, equipment and storage medium
CN113312501A (en) * 2021-06-29 2021-08-27 中新国际联合研究院 Construction method and device of safety knowledge self-service query system based on knowledge graph
CN114139548A (en) * 2021-11-30 2022-03-04 北京比特易湃信息技术有限公司 Spoken language understanding method based on template matching and small sample depth model
CN115309857A (en) * 2022-05-16 2022-11-08 中国安全生产科学研究院 Intelligent classification and rapid imaging method and application of emergency

Also Published As

Publication number Publication date
CN115713307A (en) 2023-02-24

Similar Documents

Publication Publication Date Title
CN110021439B (en) Medical data classification method and device based on machine learning and computer equipment
US11030401B2 (en) Unsupervised topic modeling for short texts
Bai et al. Deep enhanced representation for implicit discourse relation recognition
CN110245229B (en) Deep learning theme emotion classification method based on data enhancement
Feng et al. A linear-time bottom-up discourse parser with constraints and post-editing
WO2019153737A1 (en) Comment assessing method, device, equipment and storage medium
CN111738004A (en) Training method of named entity recognition model and named entity recognition method
CN107832663A (en) A kind of multi-modal sentiment analysis method based on quantum theory
CN110969020A (en) CNN and attention mechanism-based Chinese named entity identification method, system and medium
CN114169330A (en) Chinese named entity identification method fusing time sequence convolution and Transformer encoder
CN107818084B (en) Emotion analysis method fused with comment matching diagram
CN111753058B (en) Text viewpoint mining method and system
US20220300735A1 (en) Document distinguishing based on page sequence learning
CN111339260A (en) BERT and QA thought-based fine-grained emotion analysis method
CN112287672A (en) Text intention recognition method and device, electronic equipment and storage medium
US20230298630A1 (en) Apparatuses and methods for selectively inserting text into a video resume
CN114153973A (en) Mongolian multi-mode emotion analysis method based on T-M BERT pre-training model
Haffner Scaling large margin classifiers for spoken language understanding
Fang et al. MANNER: A variational memory-augmented model for cross domain few-shot named entity recognition
CN115713307B (en) Intelligent responsibility fixing method and device for operators
Suleymanov et al. Text classification for Azerbaijani language using machine learning and embedding
US20220180190A1 (en) Systems, apparatuses, and methods for adapted generative adversarial network for classification
CN114386412B (en) Multi-mode named entity recognition method based on uncertainty perception
Zhang et al. Segmenting Chinese Microtext: Joint Informal-Word Detection and Segmentation with Neural Networks.
Liu et al. Suggestion mining from online reviews using random multimodel deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant