CN115935245B - Automatic classification and allocation method for government affair hot line cases - Google Patents


Info

Publication number
CN115935245B
CN115935245B (application CN202310228000.6A)
Authority
CN
China
Prior art keywords
case
model
category
department
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310228000.6A
Other languages
Chinese (zh)
Other versions
CN115935245A (en)
Inventor
杨伊态
李颖
李军霞
王敬佩
柯宝宝
黄亚林
张兆文
李成涛
陈胜鹏
付卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Geospace Information Technology Co ltd
Original Assignee
Geospace Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Geospace Information Technology Co ltd filed Critical Geospace Information Technology Co ltd
Priority to CN202310228000.6A priority Critical patent/CN115935245B/en
Publication of CN115935245A publication Critical patent/CN115935245A/en
Application granted granted Critical
Publication of CN115935245B publication Critical patent/CN115935245B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the field of artificial intelligence and provides an automatic classification and allocation method for government affair hot line cases, comprising the following steps: step S1, establish a sample set and train a pre-training model; step S2, traverse the sample set with the trained pre-training model to generate a case category related linked list and a department related linked list; step S3, use the pre-training model to predict the case category and responsible department with the highest probability for each sample, combine these with the case category related linked list and the department related linked list to obtain a case category enhancement vector and a department enhancement vector, and obtain a case classification model and a department allocation model through training and parameter updating; step S4, acquire the input case content and output the predicted case classification result and department allocation result through the case classification model and department allocation model. The method achieves higher case classification and allocation accuracy, and complaint cases in the government convenience hotline can be automatically classified and allocated to the responsible departments for handling.

Description

Automatic classification and allocation method for government affair hot line cases
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to an automatic classification and allocation method for government affair hot line cases.
Background
In a government service hotline, citizens report complaint information by phone call, WeChat mini-program, APP, portal messages and other channels; operators classify each case according to the complaint content and then dispatch it to the relevant responsible units and handling departments. With the wide application of government service hotlines, on one hand, the volume of hotline cases keeps growing and the cost of manual handling rises; when a hot social issue arises, the agents are overloaded and it becomes difficult to meet citizens' demands. On the other hand, because convenience-hotline cases fall into many categories, the distinctions between categories are not obvious, the departments associated with cases are broad, and the administrative hierarchy is complex, achieving fast and accurate case classification and allocation has become a difficult problem that government service hotline offices urgently need to solve.
At present, automatic classification and allocation methods of government affair hotlines are roughly divided into three types:
The first is the rule/decision-tree based approach. This approach first designs filtering and matching rules and then classifies cases according to those rules, e.g. by keyword matching or knowledge-base lookup. It works well for cases whose categories are clearly distinct, but its classification and allocation accuracy is poor for cases with complex or similar categories.
The second category is machine learning based methods, which classify and dispatch case text with designed machine learning algorithms or models; common examples are methods based on the XGBoost algorithm, cosine similarity, or SVM support vector machines. These methods can learn more category features and some semantic features, and their classification and allocation accuracy on cases with complex or similar categories improves over the first category, but is still not ideal.
The third class is neural network based methods. These extract deep semantic features of the case text with multi-layer neural networks and, compared with machine learning methods, classify similar case categories and allocate to similar departments more accurately. However, existing neural network based methods still classify highly similar cases poorly, and their accuracy in allocating cases of different categories to departments at different administrative levels remains unsatisfactory.
To classify and allocate cases efficiently, current practice replaces manual work with computer methods based on rule decision trees, machine learning algorithms, and deep neural networks, and then dispatches each case to the relevant responsible department for handling. However, existing methods have low classification accuracy for categories that are not clearly distinct, such as the hard-to-separate case categories "site noise (daytime)", "site noise (night)" and "business noise problem", and low allocation accuracy for departments of the same type at different administrative levels, for example the case-handling scopes of the "city management committee" and the "district management committee" are hard to distinguish.
Disclosure of Invention
In view of the above problems, the invention aims to provide an automatic classification and allocation method for government affair hot line cases, which aims to solve the technical problem of low accuracy of the existing method.
The invention adopts the following technical scheme:
the invention provides an automatic classification and allocation method for government affair hot line cases, which comprises the following steps:
step S1, a sample set is established and a pre-training model is trained;
step S2, traversing the sample set according to the trained pre-training model to generate a case type related linked list and a department related linked list;
s3, predicting the case type and the main department with highest probability of the sample by using the pre-training model, combining the case type related linked list and the department related linked list to obtain a case type enhancement vector and a department enhancement vector, and obtaining a case classification model and a department allocation model through training and parameter updating;
and S4, acquiring input case contents, and outputting predicted case classification results and division results through case classification model and division model processing.
The beneficial effects of the invention are as follows. The invention provides an automatic classification and allocation method for government affair hot line cases based on a neural network model: the case classification model improves classification accuracy on similar case categories by focusing training on the differences between similar categories, and the department allocation model improves allocation accuracy by fusing administrative division information and focusing training on the distinctions between departments that handle similar cases. Compared with existing methods, the proposed method classifies and allocates cases more accurately, can automatically classify complaints in the convenience government hotline and allocate them to the responsible departments for handling, improves hotline service efficiency, reduces labor cost, raises the hotline's degree of intelligence and automation, and improves the level of public service.
Drawings
Fig. 1 is a flowchart of a method for automatically classifying and allocating government affair hot line cases according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a process of pre-training a model provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the generation of a case category related linked list;
fig. 4 is a process schematic of a case classification model and a department allocation model.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
As shown in fig. 1, the automatic classification and allocation method for government affair hot line cases provided in this embodiment includes the following steps:
and S1, establishing a sample set and training a pre-training model.
The step is used for building a pre-training model and fine-tuning parameters of the pre-training model, and the specific process of the step is as follows in combination with the illustration of fig. 2:
s11, a sample set is established, wherein the sample format is [ case information, area information, case category, and authorities ], the area information is optional, and the sample set is proportionally divided into a training sample set and a verification sample set.
For example, sample a = ["The daytime noise makes it impossible to sleep; I filed a complaint an hour ago and no one has come, too slow", "pyridazine", "440", "105"], where 440 and 105 are the case category code and responsible department code, respectively.
S12, for each sample in the training sample set, use the BERT model to convert the case information into an embedding vector.
First, if the sample contains area information, it is merged with the case information. For example, the merged case information of sample a is: "In the pyridazine district, the daytime noise makes it impossible to sleep; I filed a complaint an hour ago and no one has come, too slow, too slow".
Then the merged case information is converted into the corresponding token codes with the BERT tokenizer, and special character codes are added at the head and tail to form the token codes of the case information. In this example the BERT model is Chinese-BERT-wwm-ext; BERT stands for Bidirectional Encoder Representations from Transformers.
For example, the merged information of sample a is converted into the following token codes: [101,1515,1515,1277,1921,1692,7509,6375,782,3187,3791,4717,6230,8024,2347,2832,6401,8024,6814,749,671,702,2207,3198,738,3766,782,5052,8024,1922,2714,8024,1922,2714,102]. Here 101 is the code of the special character 'CLS' and 102 is the code of the special character 'SEP'.
Finally, the token codes of the case information are input into the BERT model to obtain the embedding vector E_CLS of the special character CLS and a token vector for each token, [E_1, E_2, E_3, …, E_n], where n is the number of tokens.
The BERT model typically has two main outputs: the embedding vector of the special character "CLS" and the token vector of each token. The CLS embedding vector is obtained by pooling all the token vectors. Simple classification usually uses the CLS embedding vector directly, but the individual token vectors can also be used if the model requires them. In general the CLS embedding vector is not used together with the token vectors; it is essentially derived from the token vectors and represents the semantic features of the whole text. Different samples yield different embedding vectors.
S13, input the embedding vector into two linear layers respectively, feed each linear layer's output to a SOFTMAX layer to obtain a case category number and a department number respectively, and update the model parameters of the pre-training model by gradient descent.
The two linear layers are L1 and L2. During the training phase of the pre-training model, the embedding vector E_CLS is input to linear layer L1, the loss value is computed with a cross-entropy function, and the model parameters are updated by gradient descent. The input dimension of linear layer L1 equals the dimension of E_CLS, and its output dimension is the number of case categories N_class. Similarly, E_CLS is input to linear layer L2, the loss value is computed with a cross-entropy function, and the parameters are updated by gradient descent. The input dimension of linear layer L2 equals the dimension of E_CLS, and its output dimension is the number of departments N_depa. This example uses the pytorch framework with the cross-entropy function CrossEntropyLoss().
During the prediction phase, when the model is subsequently used, the embedding vector E_CLS is input to linear layer L1 and then to a softmax layer to obtain the probability value of each case category; the case category number with the highest probability value is taken as the prediction result. Similarly, E_CLS is input to linear layer L2 and then to a softmax layer to obtain the probability value of each department, and the department number with the highest probability value is taken as the prediction result.
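The two prediction heads can be sketched in plain Python (a sketch only: toy dimensions and illustrative weights stand in for the real pytorch linear layers over a 768-dimensional E_CLS):

```python
import math

def linear(x, W, b):
    # y_k = sum_i x_i * W[k][i] + b_k
    return [sum(xi * wki for xi, wki in zip(x, row)) + bk
            for row, bk in zip(W, b)]

def softmax(z):
    m = max(z)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def predict(e_cls, W, b):
    """Linear layer -> softmax -> number with the highest probability value."""
    probs = softmax(linear(e_cls, W, b))
    return max(range(len(probs)), key=probs.__getitem__), probs

e_cls = [0.5, -1.0, 2.0]                         # toy E_CLS (E = 3)
W1, b1 = [[1, 0, 0], [0, 0, 1]], [0.0, 0.0]      # toy head with 2 outputs
case_id, case_probs = predict(e_cls, W1, b1)
# case_id == 1 because the second logit (2.0) beats the first (0.5)
```

The department head L2 works identically, only with N_depa output rows instead of N_class.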
S14, after iteratively training the pre-training model with the training sample set, verify the model's accuracy with the verification sample set and take the model with the highest verification accuracy as the trained pre-training model; in the prediction stage this model outputs the probability value of each case category and each department to which a sample may belong.
In this way the model is trained on the training set over multiple rounds, its accuracy is verified with the verification sample set, and the most accurate model is selected. The specific implementation is as follows:
First, after the pre-training model has traversed the whole training sample set, its parameters are frozen; this embodiment uses the pytorch framework and freezes parameters with the model.eval() function.
Then each sample in the verification sample set is input into the pre-training model to obtain a predicted case category and a predicted department. For each verification sample, if the predicted case category matches the case category in the sample, the prediction is correct; otherwise it is wrong. Departments are checked the same way. Accuracy is the number of correct predictions divided by the total number of verification samples, and the overall verification accuracy of the pre-training model is (case accuracy + department accuracy) / 2. If this accuracy exceeds the current best, the model parameters of this version are saved and the current best accuracy is updated to this version's accuracy.
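The verification-accuracy rule described above can be sketched as:

```python
# Per-task accuracy is (# correct) / (# samples); the overall accuracy of
# the pre-training model averages the case-category and department scores.
def accuracy(predicted, correct):
    return sum(p == c for p, c in zip(predicted, correct)) / len(correct)

def overall_accuracy(pred_cases, true_cases, pred_depts, true_depts):
    return (accuracy(pred_cases, true_cases)
            + accuracy(pred_depts, true_depts)) / 2

# Toy run: 3 of 4 case categories and 2 of 4 departments predicted correctly
acc = overall_accuracy([449, 326, 1, 2], [449, 326, 1, 440],
                       [105, 100, 13, 13], [105, 100, 1, 1])
# acc == (0.75 + 0.5) / 2 == 0.625
```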
And S2, traversing the sample set according to the trained pre-training model to generate a case type related linked list and a department related linked list.
The process of generating the case category related linked list is as follows:
s211, setting an empty case category related linked list, wherein in the related linked list, the position serial number x stores the related case category of the case category x, and the length of the related linked list is N class
S212, input each sample into the trained pre-training model to obtain the probability value of each case category to which the sample belongs, and output the serial numbers of the top k_1 case categories with the largest probability values, in descending order;
s213, comparing the output case categories with the correct case categories of the sample one by one in sequence, and ending when the comparison is consistent or all the comparison is inconsistent; when inconsistency occurs in the comparison process, recording information in a position corresponding to the correct case category serial number of the sample of the related linked list, and rearranging the information according to the count from large to small; if the current case category is y, the correct case category is z, and the information recording rule is as follows: if the information of the correct case category z exists in the related linked list position y, adding 1 to the information count corresponding to the correct case category z; if there is no information of the correct case category z, the information { case category z: count 1}.
In this step, the trained pre-training model is applied to each sample to obtain the probability value of each case category to which the sample belongs; the k_1 case categories with the largest probability values are selected, and their serial numbers are output in descending order of probability.
The output case categories are compared one by one with the sample's correct case category, starting from the case category with the largest probability value. If the current case category y is inconsistent with the correct case category z recorded for the sample, information is recorded at position y of the case category related linked list. The recording rule is: if an entry for case category z already exists at position y, its count is increased by 1; otherwise the entry {case category z: count 1} is added.
The comparison is repeated until a case category y matches the sample's correct case category z, or until all k_1 case categories have been compared. When the model's top prediction is correct, the first comparison matches and no information is recorded; when all k_1 predicted case categories are wrong, k_1 entries are recorded.
The case category related linked list generation illustrated in Fig. 3 uses 4 input samples and takes the top k_1 = 3 case categories each time.
Input sample 1; prediction result 1 of the pre-training model: the top-3 case categories by predicted probability value are 449, 326, 11; the correct case category is 326.
the predicted case category 449 is inconsistent with the correct case category 326, and {326:1} is recorded at the 449 th position of the case category related linked list;
the predicted case category 326 is consistent with the correct case category 326 and ends.
Input sample 2; prediction result 2 of the pre-training model: the top-3 case categories by predicted probability value are 449, 440, 326; the correct case category is 326.
The predicted case category 449 is inconsistent with the correct case category 326; since an entry for case category 326 is already recorded at position 449 of the related linked list, its count is simply increased by 1, giving {326:2};
the predicted case category 440 is inconsistent with the correct case category 326, and {326:1} is recorded at the position of the case category related linked list 440;
the predicted case category 326 is consistent with the correct case category 326 and ends.
Input sample 3; prediction result 3 of the pre-training model: the top-3 case categories by predicted probability value are 449, 440, 1; the correct case category is 1.
predicted case category 449 is inconsistent with correct case category 1, record {1:1} at the location of case category related linked list 449.
The entries at position 449 of the case category related linked list are sorted by count, giving: {326:2}, {1:1}.
The predicted case category 440 is inconsistent with the correct case category 1, and {1:1} is recorded at the location of the case category related linked list 440.
The entries at position 440 of the case category related linked list are sorted by count, giving: {326:1}, {1:1}.
The predicted case category 1 is consistent with the correct case category 1, and the process is finished.
Input sample 4; prediction result 4 of the pre-training model: the top-3 case categories by predicted probability value are 1, 2, 449; the correct case category is 1.
The predicted case category 1 is consistent with the correct case category 1, and the process is finished.
The generation of the case category related linked list is completed.
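The worked example above can be replayed with a short sketch of the S211-S213 procedure, using dicts of counts to stand in for the linked-list entries:

```python
# Build the case category related linked list from top-k1 predictions.
def build_related_list(predictions, correct_labels, n_class):
    linked = [dict() for _ in range(n_class)]  # position x -> {category: count}
    for top_k, z in zip(predictions, correct_labels):
        for y in top_k:                        # compare in descending probability
            if y == z:
                break                          # match: stop, record nothing more
            # mismatch: record the correct category z at predicted position y
            linked[y][z] = linked[y].get(z, 0) + 1
    # re-sort each position's entries by count, from large to small
    return [sorted(d.items(), key=lambda kv: -kv[1]) for d in linked]

# The four samples of Fig. 3, k1 = 3
preds = [[449, 326, 11], [449, 440, 326], [449, 440, 1], [1, 2, 449]]
truth = [326, 326, 1, 1]
linked = build_related_list(preds, truth, 500)
# linked[449] == [(326, 2), (1, 1)] and linked[440] == [(326, 1), (1, 1)]
```

Running this reproduces exactly the positions 449 and 440 recorded in the figure walkthrough; the department related linked list is built the same way from department predictions.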
The department related linked list is generated in the same way as the case category related linked list: S321, set an empty department related linked list; S322, input each sample into the trained pre-training model to obtain the probability value of each department to which the sample belongs, and output the serial numbers of the top k_1 departments with the largest probability values, in descending order; S323, compare the output departments one by one, in order, with the correct department of the sample, ending when a comparison matches or when all comparisons fail. Whenever a comparison fails, record information at the position of the department related linked list corresponding to the serial number of the current (predicted) department, and re-sort the entries there by count from large to small. If the current department is y' and the correct department is z', the recording rule is: if an entry for the correct department z' already exists at position y' of the related linked list, add 1 to its count; otherwise add the entry {department z': count 1}.
And S3, predicting the case type and the main department with highest probability of the sample by using the pre-training model, combining the case type related linked list and the department related linked list to obtain a case type enhancement vector and a department enhancement vector, and obtaining a case classification model and a department allocation model through training and parameter updating.
As shown in fig. 4, the training process of the case classification model and the department allocation model is the same. For obtaining a case classification model, the specific process is as follows:
s311, predicting a case type c with the highest probability value of the current sample by using the pre-training model.
Firstly, inputting a sample set, wherein the sample set is proportionally divided into a training sample set and a verification sample set, and predicting a case category c with highest probability of the sample through a pre-training model.
After sample a is converted into token codes and input to the BERT model, the embedding vector E_CLS and the token vectors [E_1, E_2, E_3, …, E_n] are obtained. The embedding vector E_CLS is input to linear layer L1, and a softmax layer then predicts the case category c with the highest probability value. Assume the predicted case category is 449 and the department is 105.
S312, take the top k_2 case categories related to case category c from position c of the case category related linked list, denoted c_1, c_2, …, c_k2.
S313, take from the case category vector group the case category vectors with serial numbers c_1, c_2, …, c_k2 and c, denoted Rc_1, Rc_2, …, Rc_k2, Rc.
The case category vector group is a matrix of dimension [N_class, E], where N_class is the number of case categories and E is the token embedding dimension, the same as that of the embedding vector E_CLS, generally 768.
For example: assume the predicted case category is 449 and k_2 = 2. The top-2 most related categories of case category 449 are 326 and 1, so the case category vectors in rows 326, 1 and 449 are taken from the case category vector group.
S314, compute the case category enhancement vector corresponding to each case category vector. For each retrieved vector Rc_i (i ranging over c, c_1, …, c_k2):

V_i = sum_{j=1}^{n} Rc_i ⊙ E_j

where E_j is the j-th token vector of the sample, n is the number of tokens, and ⊙ denotes element-wise (point) multiplication.
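Under this reading (each retrieved category vector is point-multiplied with every token vector and the products are summed, as the worked example below also describes), the computation can be sketched with toy dimensions:

```python
# Enhancement vector V = sum_j (category_vec ⊙ token_vec_j), toy E = 3.
def enhancement_vector(category_vec, token_vecs):
    out = [0.0] * len(category_vec)
    for e_j in token_vecs:                    # one [1, E] vector per token
        for idx, (r, e) in enumerate(zip(category_vec, e_j)):
            out[idx] += r * e                 # element-wise (point) product
    return out

Rc = [1.0, 2.0, 0.5]                          # toy case category vector
tokens = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]   # n = 2 toy token vectors
V = enhancement_vector(Rc, tokens)
# V == [1.0, 2.0, 2.0]  (= Rc ⊙ E_1 + Rc ⊙ E_2)
```

Since the sum distributes over the element-wise product, V_i equals Rc_i ⊙ (E_1 + … + E_n), so the same result can be obtained by sum-pooling the token vectors first.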
S315, the case type enhancement vectors are sequentially input to two linear layers and then input to a SOFTMAX layer.
Each case category enhancement vector is input to the two linear layers L3 and L4. The input and output dimensions of linear layer L3, and the input dimension of linear layer L4, equal the dimension of the embedding vector E_CLS; the output dimension of linear layer L4 is the number of case categories N_class.
For example, for the category vector corresponding to case category 326: its dimension is [1, E] and each token vector has dimension [1, E]; the category vector is point-multiplied with each of the n token vectors to obtain n vectors of dimension [1, E], which are then summed to obtain a case category enhancement vector of dimension [1, E].
The case category enhancement vector of dimension [1, E] is input to linear layer L3, giving a result of dimension [1, E]; that result is input to linear layer L4, giving a result of dimension [1, N_class]; finally the result is input to a SOFTMAX layer for normalization.
S316, compute the loss value with a cross-entropy function and update the model parameters by gradient descent; only the linear layers L3 and L4 and the case category vector group are trained, while the parameters of the BERT model and of linear layers L1 and L2 are frozen, i.e. the pre-training model's parameters are not updated during this iterative training.
S317, after the case classification model is subjected to iterative training by using the training sample set, the accuracy of the case classification model is verified by using the verification sample set, and the first model with the highest verification accuracy is used as the trained case classification model.
After the case classification model traverses the whole training sample set, its parameters are frozen; this embodiment uses the pytorch framework and freezes parameters with the model.eval() function, i.e. the case classification model does not update parameters during verification. Each sample in the verification sample set is input into the case classification model to obtain a predicted case category; for each verification sample, if the predicted case category matches the case category in the sample, the prediction is correct, otherwise it is wrong. Accuracy is the number of correct predictions divided by the total number of verification samples; the verification accuracy of the case classification model is its case category accuracy. If this accuracy exceeds the current best, the case classification model parameters of this version are saved and the current best accuracy is updated to this version's accuracy.
The training process of the department allocation model is the same and can be briefly described as follows. S321, the pre-training model predicts the main department with the highest probability value for the current sample: the embedding vector ECLS is input into the linear layer L2, and a softmax layer then yields the main department d with the highest probability value. S322, the first k2 departments related to department d are taken from position d of the department related linked list and are denoted d1, d2 ... dk2. S323, the department vectors with serial numbers d1, d2 ... dk2 and with serial number d are taken from the department vector group and are denoted Rd1, Rd2 ... Rdk2 and Rd; for example, if the 2 departments most related to department 105 are 100 and 13, the department vectors in rows 100, 13 and 449 are taken from the department category vector group. S324, the department enhancement vector corresponding to these department vectors is calculated according to the formulas of the specification (rendered as images in the original and not reproduced here), where Ej is the j-th token vector of the sample and · denotes point multiplication; the department vector group is a matrix of dimension [Ndepa, E] initialized from unit vectors, where Ndepa is the number of departments. S325, the department enhancement vector is input into two linear layers in sequence and then into a softmax layer: it is input into the linear layers L5 and L6, where the input and output dimensions of L5 and the input dimension of L6 equal the dimension of the embedding vector ECLS, and the output dimension of L6 is the number of department categories Ndepa. S326, the loss value is calculated with a cross-entropy function and the model parameters are updated by gradient descent. S327, the department allocation model is trained iteratively with the training sample set, its accuracy is verified with the verification sample set, and the first model with the highest verification accuracy is used as the trained department allocation model.
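Steps S322 and S323 (take the top k2 related departments from the linked list, then gather their rows plus row d itself from the department vector group) can be sketched in Python. The data layout and function name are illustrative assumptions, not the patent's implementation.

```python
# Sketch of S322-S323: for the predicted main department d, look up the top
# k2 related departments at position d of the department related linked list,
# then gather the corresponding rows (and row d itself) from the department
# vector group.

def gather_department_vectors(d, related_list, dept_vectors, k2):
    related = related_list[d][:k2]          # serial numbers d1, d2, ..., dk2
    rows = related + [d]                    # include department d itself
    return [dept_vectors[i] for i in rows]  # Rd1, Rd2, ..., Rdk2, Rd
```

The returned vectors would then feed the enhancement-vector calculation of step S324.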
S4, the input case content is acquired, and the predicted case classification result and department allocation result are output after processing by the case classification model and the department allocation model.
First, the case content is input; the area information is optional.
Then, the pre-training model predicts the case category number and the department number with the highest probability.
Next, the output results of the linear layers L4 and L6 are obtained from the case classification model and the department allocation model: the output of linear layer L4 is input into a softmax layer to obtain the probability value of each case category, and the case category serial number with the highest probability is taken as the predicted case classification result; the output of linear layer L6 is input into a softmax layer to obtain the probability value of each department, and the department serial number with the highest probability value is taken as the predicted department allocation result.
Finally, the predicted case classification result and department allocation result are output as the final output.
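The final prediction step, softmax over the linear-layer outputs followed by taking the serial number with the highest probability, reduces to the following minimal sketch; a pure-Python softmax stands in for the framework's implementation.

```python
# Sketch of step S4's decision rule: the logits produced by the final linear
# layers (L4 for case categories, L6 for departments) pass through softmax,
# and the index with the highest probability is the prediction.
import math

def softmax(logits):
    m = max(logits)                         # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict_index(logits):
    probs = softmax(logits)
    return probs.index(max(probs))          # serial number with highest probability
```

The same rule is applied twice, once to the L4 output for the case category and once to the L6 output for the department.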
In summary, the invention provides an automatic classification and allocation method for government hotline cases. Based on a neural network model, it improves the model's classification accuracy on similar case categories with a two-step training method, and fuses administrative division information to improve the accuracy of case allocation. Compared with existing methods, the case classification and case allocation accuracy of the proposed method is higher. The method can automatically classify complaint cases on the citizen-service government hotline and allocate them to the responsible authorities for processing, reducing labor cost and improving the degree of intelligence and automation of the hotline.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (2)

1. The automatic classification and allocation method for the government affair hot line cases is characterized by comprising the following steps of:
step S1, a sample set is established and a pre-training model is trained;
step S2, traversing the sample set according to the trained pre-training model to generate a case type related linked list and a department related linked list;
s3, predicting the case type and the main department with highest probability of the sample by using the pre-training model, combining the case type related linked list and the department related linked list to obtain a case type enhancement vector and a department enhancement vector, and obtaining a case classification model and a department allocation model through training and parameter updating;
s4, acquiring input case contents, and outputting predicted case classification results and division results through case classification model and division model processing;
in the step S2, the process of generating the case category related linked list and the department related linked list is the same;
for generating a case category related linked list, the process is as follows:
s211, setting an empty case category related linked list, wherein in the related linked list, the position serial number x stores the related case category of the case category x;
s212, inputting each sample into a trained pre-training model to obtain probability values of each case category to which the sample belongs, and outputting the previous k with the largest probability value in sequence 1 A serial number of the individual case category;
s213, comparing the output case categories with the correct case categories of the sample one by one in sequence, and ending when the comparison is consistent or all the comparison is inconsistent; when inconsistency occurs in the comparison process, recording information in a position corresponding to the correct case category serial number of the sample of the related linked list, and rearranging the information according to the count from large to small; if the current case category is y, the correct case category is z, and the information recording rule is as follows: if the information of the correct case category z exists in the related linked list position y, adding 1 to the information count corresponding to the correct case category z; if there is no information of the correct case category z, the information { case category z: count 1};
in the step S3, the process of obtaining the case classification model and the department allocation model is the same;
for obtaining a case classification model, the specific process is as follows:
s311, predicting a case type c with the highest probability value of the current sample by using a pre-training model;
s312, taking out the previous k from the position of the serial number c of the case type related linked list 2 The case categories related to the case category c are respectively marked as c 1 ,c 2 …c k2
S313, respectively taking out serial numbers c from the case type vector group 1 ,c 2 …c k2 And case category vectors with serial numbers of c, respectively recorded as Rc 1 ,Rc 2 …Rc k2 ,Rc;
s314, calculating the case category enhancement vector corresponding to the case category vectors according to the formulas of the specification (rendered as images in the original and not reproduced here), wherein Ej is the j-th token vector of the sample and · denotes point multiplication;
s315, sequentially inputting the case type enhancement vectors into two linear layers and then inputting the case type enhancement vectors into a SOFTMAX layer;
s316, calculating a loss value by using a cross entropy function, and updating model parameters by using a gradient descent method;
s317, after the case classification model is subjected to iterative training by using the training sample set, the accuracy of the case classification model is verified by using the verification sample set, and the first model with the highest verification accuracy is used as the trained case classification model.
2. The automatic classification and allocation method for government affair hot-line cases according to claim 1, wherein the specific process of step S1 is as follows:
s11, a sample set is established, wherein the sample format is [ case information, area information, case category, and authorities ], the area information is optional, and the sample set is divided into a training sample set and a verification sample set according to a proportion;
s12, for each sample in the training sample set, converting case information in the sample set into an embedded vector by using a BERT model;
s13, respectively inputting the embedded vectors into two linear layers, outputting each linear layer to a SOFTMAX layer, respectively obtaining a case class number and a department number, and updating model parameters of the pre-training model by using a gradient descent method;
s14, after the training sample set is used for carrying out iterative training on the pre-training model, the accuracy of the model is verified by using the verification sample set, and a model with the highest verification accuracy is used as a trained pre-training model, wherein the output of the pre-training model in the prediction stage is the probability value of each case category and department to which the sample belongs.
CN202310228000.6A 2023-03-10 2023-03-10 Automatic classification and allocation method for government affair hot line cases Active CN115935245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310228000.6A CN115935245B (en) 2023-03-10 2023-03-10 Automatic classification and allocation method for government affair hot line cases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310228000.6A CN115935245B (en) 2023-03-10 2023-03-10 Automatic classification and allocation method for government affair hot line cases

Publications (2)

Publication Number Publication Date
CN115935245A CN115935245A (en) 2023-04-07
CN115935245B (en) 2023-05-26

Family

ID=85818574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310228000.6A Active CN115935245B (en) 2023-03-10 2023-03-10 Automatic classification and allocation method for government affair hot line cases

Country Status (1)

Country Link
CN (1) CN115935245B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116611453B (en) * 2023-07-19 2023-10-03 天津奇立软件技术有限公司 Intelligent order-distributing and order-following method and system based on big data and storage medium
CN116861302B (en) * 2023-09-05 2024-01-23 吉奥时空信息技术股份有限公司 Automatic case classifying and distributing method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239529A (en) * 2017-05-27 2017-10-10 中国矿业大学 A kind of public sentiment hot category classification method based on deep learning
WO2019149200A1 (en) * 2018-02-01 2019-08-08 腾讯科技(深圳)有限公司 Text classification method, computer device, and storage medium
CN115659974A (en) * 2022-09-30 2023-01-31 中国科学院软件研究所 Software security public opinion event extraction method and device based on open source software supply chain

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204135A1 (en) * 2017-01-18 2018-07-19 Wipro Limited Systems and methods for improving accuracy of classification-based text data processing
CN108710651B (en) * 2018-05-08 2022-03-25 华南理工大学 Automatic classification method for large-scale customer complaint data
CN111177367B (en) * 2019-11-11 2023-06-23 腾讯科技(深圳)有限公司 Case classification method, classification model training method and related products
WO2022093982A1 (en) * 2020-10-30 2022-05-05 Convey, Llc Machine learning event classification and automated case creation
CN112488551B (en) * 2020-12-11 2023-04-07 浪潮云信息技术股份公司 Hot line intelligent order dispatching method based on XGboost algorithm
CN112581106B (en) * 2021-02-23 2021-05-28 苏州工业园区测绘地理信息有限公司 Government affair event automatic order dispatching method fusing grid semantics of handling organization
CN112800232B (en) * 2021-04-01 2021-08-06 南京视察者智能科技有限公司 Case automatic classification method based on big data
CN114547315A (en) * 2022-04-25 2022-05-27 湖南工商大学 Case classification prediction method and device, computer equipment and storage medium
CN115242487B (en) * 2022-07-19 2024-04-05 浙江工业大学 APT attack sample enhancement and detection method based on meta-behavior
CN115344695A (en) * 2022-07-27 2022-11-15 中国人民解放军空军工程大学 Service text classification method based on field BERT model
CN115455315B (en) * 2022-11-10 2023-04-07 吉奥时空信息技术股份有限公司 Address matching model training method based on comparison learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239529A (en) * 2017-05-27 2017-10-10 中国矿业大学 A kind of public sentiment hot category classification method based on deep learning
WO2019149200A1 (en) * 2018-02-01 2019-08-08 腾讯科技(深圳)有限公司 Text classification method, computer device, and storage medium
CN115659974A (en) * 2022-09-30 2023-01-31 中国科学院软件研究所 Software security public opinion event extraction method and device based on open source software supply chain

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on Civic Hotline Complaint Text Classification Model Based on word2vec; JingYu Luo et al.; 2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery; full text *
Design of a hot-topic mining system for civic hotline texts; Xue Bin; Journal of China Jiliang University; Vol. 28, No. 3; full text *

Also Published As

Publication number Publication date
CN115935245A (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN115935245B (en) Automatic classification and allocation method for government affair hot line cases
US20240046043A1 (en) Multi-turn Dialogue Response Generation with Template Generation
CN113905391B (en) Integrated learning network traffic prediction method, system, equipment, terminal and medium
CN110457442A (en) The knowledge mapping construction method of smart grid-oriented customer service question and answer
CN105955951B (en) A kind of method and device of message screening
CN112395404B (en) Voice key information extraction method applied to power dispatching
Chen et al. A deep learning method for judicial decision support
CN117473431A (en) Airport data classification and classification method and system based on knowledge graph
CN115456421A (en) Work order dispatching method and device, processor and electronic equipment
CN116541755A (en) Financial behavior pattern analysis and prediction method based on time sequence diagram representation learning
CN115905538A (en) Event multi-label classification method, device, equipment and medium based on knowledge graph
CN113627194B (en) Information extraction method and device, and communication message classification method and device
CN109543038B (en) Emotion analysis method applied to text data
CN117172508B (en) Automatic dispatch method and system based on city complaint worksheet recognition
CN113920379A (en) Zero sample image classification method based on knowledge assistance
CN115982646B (en) Management method and system for multisource test data based on cloud platform
CN112488736A (en) Method and system for analyzing government affair hotline work order data in field of residential construction
CN109658148B (en) Marketing activity complaint risk prediction method based on natural language processing technology
CN116541166A (en) Super-computing power scheduling server and resource management method
CN113538011B (en) Method for associating non-booked contact information with booked user in electric power system
CN113742498B (en) Knowledge graph construction and updating method
CN115203365A (en) Social event processing method applied to comprehensive treatment field
CN112989054B (en) Text processing method and device
CN115860964A (en) Reimbursement approval process generation method, system, equipment and storage medium
CN114491004A (en) Title generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant