CN111708885A - Intelligent case shunting method and device - Google Patents

Intelligent case shunting method and device

Info

Publication number
CN111708885A
Authority
CN
China
Prior art keywords
case
attention
vector
vectors
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010516323.1A
Other languages
Chinese (zh)
Inventor
王平辉
王悦
陶敬
许诺
陈龙
韩婷
王杰华
杨鹏
吴用
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202010516323.1A priority Critical patent/CN111708885A/en
Publication of CN111708885A publication Critical patent/CN111708885A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Tourism & Hospitality (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Technology Law (AREA)
  • Probability & Statistics with Applications (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent case shunting method and device, comprising: a data acquisition module for acquiring a training set containing case description texts and a complex and simple label space; a data preprocessing module for vectorizing the case description text to obtain a case coding input matrix; a case coding module for aggregating the word semantic vectors into a case implicit semantic feature vector containing the case semantic information; a multi-attention mechanism module for using an attention mechanism to respectively calculate the weight coefficients between each element of the cause-of-action element space and the common element space and all words of the text sequence, constructing a plurality of attention vectors, and adding the normalized case description text length to obtain an element feature matrix; and a complex and simple diversion module for processing the element feature matrix with a multilayer perceptron to determine an output vector and realize complex and simple case diversion. The invention can predict the complexity of a case from its description, solving the time-consuming, labor-intensive, and inflexible workflow of the prior art.

Description

Intelligent case shunting method and device
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to the application of artificial intelligence to judicial cases, and in particular to an intelligent case shunting method and device.
Background
As economic development accelerates, the contradictions and disputes of the social-transformation period keep increasing. The case-filing registration system has lowered the threshold for cases to enter the courts, and the public's increasingly diverse judicial demands have raised the courts' workload. The rapid nationwide growth in accepted cases has sharpened the contradiction between the courts' adjudication capacity and the supply of judicial resources, producing severe "congestion" of cases, especially civil and commercial cases, after they enter the courts. To actively address the "many cases, few judges" contradiction of the judicial-reform period and to improve trial efficiency, cases must be diverted by complexity so as to optimize the allocation of judicial resources.
The "complex and simple diversion" of cases means that, on the basis of strictly following objective judicial laws and fully protecting the parties' procedural rights, the case-handling mode and trial methods are reformed: by establishing scientific criteria for distinguishing complex cases from simple ones, supplemented by differentiated trial rules for each, the goals of "fast trial of simple cases and careful trial of complex ones, handling each according to its nature, with both efficiency and quality" are achieved.
To realize complex and simple diversion, the conventional approach manually defines rule-based criteria and compares and scores every case against them, thereby dividing cases by difficulty. This has achieved certain results, but a series of problems remain, specifically:
1. The workload is huge, and the process is time- and labor-consuming. The criteria for complex and simple diversion are rather ambiguous and must be continually adjusted according to the actual case-handling situation; formulating accurate criteria is difficult and challenging.
2. Flexibility is poor and business customization is lacking: the criteria and methods for complex and simple diversion cannot be flexibly adjusted to the actual conditions of different courts.
For the problems that the existing rule-based scoring approach to distinguishing case difficulty makes complex and simple diversion time-consuming, labor-intensive, and inflexible, no effective solution has yet been proposed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide an intelligent case shunting method and device, so as to solve the problem that the existing rule-based scoring approach makes complex and simple case diversion time-consuming, labor-intensive, and inflexible.
In order to achieve this purpose, the invention adopts the following technical scheme:
An intelligent case shunting method comprises the following steps:
Step 1: acquire case description texts and the complex and simple label space as the raw data for model training;
Step 2: clean and label the original case description, obtain the semantic vector of each word in the case description text with a word2vec word-vector model to form the case description initial vector, and obtain the case coding input matrix;
Step 3: build a bidirectional long short-term memory network (Bi-LSTM) to extract features from the case description initial vector, obtaining a case implicit semantic feature vector containing the case semantic information;
Step 4: calculate the weight coefficient between each element of the common element space and each word of the case implicit semantic feature vector to obtain the attention weight coefficients of the several elements; weight each element's attention coefficients against the word semantic vectors of the case implicit semantic feature vector to obtain several element attention vectors; sum these with dimension reduction to obtain the common-element attention vector;
Step 5: construct the cause-of-action attention vector, introduce the case description text length, and splice the normalized case description text length with each element attention vector to obtain the element feature matrix O_h;
Step 6: with the input obtained in step 5, determine an output vector using the multilayer perceptron, normalize it with a Softmax model to determine whether the case is complex or simple, calculate the overall loss of the model, optimize with a gradient back-propagation algorithm, and after training use the model for complex and simple case diversion.
The case implicit semantic feature vector is obtained by designing a bidirectional long short-term memory network that extracts the forward and backward implicit vectors of each word in the case description text.
The attention weight coefficients are obtained by calculating the degree of similarity between each element and each word of the case implicit semantic feature vector.
During model training, a cross-entropy loss function is used to calculate the overall loss, i.e., to quantify the difference between the probability distribution output by the device and the observed results, and a stochastic gradient descent algorithm is used for optimization.
The invention also provides an intelligent case diversion device, which comprises:
The data acquisition module 201, used for acquiring case description texts and the complex and simple label space, obtaining the raw data for model training;
The data preprocessing module 202, used for cleaning and labeling the original case description and vectorizing the cleaned and labeled text data with a word2vec word-vector model to obtain the case description initial vector and the case coding input matrix;
The case coding module 203, used for extracting text features from the case coding input matrix with a bidirectional long short-term memory network (Bi-LSTM) to obtain a case implicit semantic feature vector containing the case semantic information;
The multi-attention mechanism module 204, which self-learns with an attention mechanism: it first calculates the weight coefficients between each element of the common element space and all words of the case implicit semantic feature vector, then weights each word's semantic vector by each element's coefficients to construct several element attention vectors, sums these with dimension reduction to obtain the common-element attention vector, constructs the cause-of-action attention vector, and splices the normalized case description text length with each element attention vector to obtain the element feature matrix O_h;
The complex and simple diversion module 205, which processes the element feature matrix O_h with the multilayer perceptron to determine an output vector and feeds it into a normalized Softmax classifier to realize complex and simple case diversion.
The data acquisition module 201 collects historical court judgment documents, extracts the case description information, and labels each case according to the actual handling outcome, obtaining the raw data for model training.
The data preprocessing module 202 includes:
The data cleaning unit 301, used for extracting the information of the original case description and completing cleaning and denoising to obtain meaningful original text data;
The Chinese word segmentation unit 302, used for performing Chinese word segmentation on the original text data, applying the jieba word segmentation tool to segment the text field information;
The vectorization processing unit 303, which vectorizes the segmented text with the word2vec model to obtain the case description initial vector, processing unstructured text data into structured data of uniform fixed length to obtain the case coding input matrix.
In the multi-attention mechanism module 204, dimension reduction is performed by summation with a linear function.
The common element space consists of the 4 types of common elements influencing case complexity: the plaintiff's situation, the defendant's situation, the litigation claims, and the facts and reasons. The cause-of-action element is the cause of action under which the case is filed.
the multi-attention mechanism module 203 comprises:
and the case attention mechanism unit is used for calculating weight coefficients of all words in the case attention latent semantic feature vector and the case elements, and weighting the weight coefficients and all word semantic vectors in the case attention latent semantic feature vector to obtain a case attention element vector.
And the common element attention mechanism unit is used for calculating weight coefficients of all elements in a common element space and all words in the case implicit semantic feature vector, weighting the weight coefficients and all word semantic vectors in the case implicit semantic feature vector to obtain a plurality of element attention vectors, and applying linear function summation to reduce dimension to obtain the common element attention vectors.
The feature fusion unit introduces case description text length, splices the normalized case description text length and each element attention vector to obtain an element feature matrix Oh
The complex and simple diversion module 205 comprises:
The hidden layer unit: the element feature matrix O_h is taken as the input of the multilayer perceptron; after the linear transform, the sigmoid activation function is applied to obtain the hidden-layer output O_1 = sigmoid(W_1·O_h + b_1), where W_1 is the weight applied to O_h and b_1 its bias term; O_1 is the output vector.
The output layer unit: O_1 is passed through a fully connected layer and a normalized softmax classifier to obtain the final output O = softmax(W_2·O_1 + b_2), where W_2 is the weight applied to O_1 and b_2 its bias term; the overall loss of the model is then calculated and the model is trained with a gradient back-propagation optimization algorithm to realize complex and simple case diversion.
Compared with the prior art, the method can predict directly from the case description, overcoming the time-consuming, labor-intensive, and inflexible character of existing complex and simple diversion methods and improving diversion accuracy.
Drawings
FIG. 1 is a structural diagram of the intelligent case shunting device of the present invention.
Fig. 2 is a schematic structural diagram of a data preprocessing module according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of the multi-attention mechanism module according to an embodiment of the invention.
Fig. 4 is a schematic structural diagram of a case complex and simple splitting module according to an embodiment of the present invention.
FIG. 5 is a flowchart of the intelligent case shunting method of the present invention.
Detailed Description
First, terms used in the embodiments of the present invention are explained as follows:
Common element space: the 4 types of common elements influencing case complexity, namely the plaintiff's situation, the defendant's situation, the litigation claims, and the facts and reasons.
Cause-of-action element: the cause of action under which the case is filed.
Overall loss of the model: the quantified difference between the probability distribution output by the complex and simple diversion method designed by the invention and the observed results.
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
As shown in fig. 1, the present invention provides an intelligent case diversion device, which comprises:
The data acquisition module 201, used for acquiring case description texts and the complex and simple label space.
The data acquisition module mainly acquires the data used for model training: it collects historical court judgment documents, extracts the information related to the case description, and labels each case (complex/simple) according to the actual handling outcome, thereby obtaining the raw data for model training. In this embodiment, 40,000 records were obtained for model training by collecting the historical judgment documents of a court and extracting the relevant information.
The data preprocessing module 202, used for cleaning, labeling, and segmenting the original case description, and vectorizing each cleaned, labeled, and segmented text S with a word2vec word-vector model to obtain the case description initial vector and the case coding input matrix.
Specifically: the original case description is cleaned and denoised, the text data is segmented into words, a word-vector model vectorizes the input text, and the unstructured text data is processed into structured data of uniform fixed length, yielding the case description initial vector and the case coding input matrix. For example: each cleaned, labeled, and segmented text S is represented by the word2vec word-vector model as S = [w_1, w_2, w_3, w_4, ..., w_n], where w_1, w_2, w_3, w_4, ..., w_n are the word vectors composing the text S, each of dimension 300, and n is the number of words the text S is truncated to, set to 800.
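A minimal preprocessing sketch of module 202 is given below, assuming jieba for segmentation and a pre-trained gensim word2vec model; the file name word2vec_case.model and the zero-padding of short texts are illustrative assumptions, not specified by the patent.

```python
# Hedged sketch of the preprocessing pipeline (cleaning is assumed done upstream).
import jieba
import numpy as np
from gensim.models import Word2Vec

MAX_LEN, EMB_DIM = 800, 300  # truncation length n and word-vector dimension

w2v_model = Word2Vec.load("word2vec_case.model")  # assumed pre-trained model file

def vectorize(text: str) -> np.ndarray:
    """Segment a case description and map it to a fixed 800 x 300 matrix S."""
    tokens = jieba.lcut(text)[:MAX_LEN]                   # Chinese word segmentation
    mat = np.zeros((MAX_LEN, EMB_DIM), dtype=np.float32)  # zero-pad to fixed length
    for i, tok in enumerate(tokens):
        if tok in w2v_model.wv:                           # skip out-of-vocabulary words
            mat[i] = w2v_model.wv[tok]
    return mat
```

Under these assumptions, every case description maps to a uniform fixed-length matrix, the case coding input matrix described above.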
Further, as shown in fig. 2, the data preprocessing module 202 includes:
The data cleaning unit 301, used for extracting the information of the original case description, removing punctuation marks and special symbols from the text, and completing cleaning and denoising to obtain meaningful original text data.
The Chinese word segmentation unit 302, used for performing Chinese word segmentation on the original text data, applying the jieba word segmentation tool to segment the text field information.
The vectorization processing unit 303, which vectorizes the segmented text with the word2vec model to obtain the case description initial vector, processing unstructured text data into structured data of uniform fixed length to obtain the case coding input matrix. For example: each segmented text S is represented as S = [w_1, w_2, w_3, w_4, ..., w_n], where w is a word vector composing the text S, of dimension 300, and n is the number of words the text S is truncated to, set to 800.
The case coding module 203 constructs a bidirectional long short-term memory network (Bi-LSTM) model to extract text features from the case coding input matrix S, obtaining the case implicit semantic feature vector H = [h_1, h_2, h_3, h_4, ..., h_n], where h_1, h_2, h_3, h_4, ..., h_n are the feature vectors of each word w after passing through the Bi-LSTM, of dimension 100; the case implicit semantic feature vector H can represent the information contained in different aspects of the text.
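A sketch of this encoder follows, written in PyTorch as an assumption (the patent names no framework); a hidden size of 50 per direction is likewise an assumption, chosen so that the concatenated forward/backward state of each word is 100-dimensional, matching the stated dimension of h_i.

```python
# Minimal Bi-LSTM encoder sketch for the case coding module 203.
import torch
import torch.nn as nn

class CaseEncoder(nn.Module):
    def __init__(self, emb_dim: int = 300, hidden: int = 50):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, S: torch.Tensor) -> torch.Tensor:
        # S: (batch, n=800, 300) -> H: (batch, n, 100),
        # forward and backward hidden states concatenated per word
        H, _ = self.bilstm(S)
        return H
```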
The multi-attention mechanism module 204: taking civil cases as an example, for the task of predicting complexity from a civil case description, the description comprises four common elements, namely the plaintiff's situation, the defendant's situation, the litigation claims, and the facts and reasons, plus the cause-of-action element. The module self-learns with an attention mechanism: it calculates the weight coefficients between each common element and the case implicit semantic feature vector H, obtaining the attention weight coefficients {W_z1, W_z2, W_z3, W_z4}; it weights the case implicit semantic feature vector H by each element's coefficients to construct the element attention vectors {g_1, g_2, g_3, g_4}, and sums these with a linear function for dimension reduction to obtain the common-element attention vector g_p. It constructs the cause-of-action attention vector g_a, introduces the case description text length L, and splices the vectorized text length L with the attention vectors g_a and g_p to obtain the element feature matrix O_h.
Further, as shown in fig. 3, the multi-attention mechanism module 204 includes:
The common-element attention unit, which calculates the weight coefficients {W_z1, W_z2, W_z3, W_z4} between each common element and the case implicit semantic feature vector H, weights H by these coefficients to obtain the element attention vectors {g_1, g_2, g_3, g_4}, and sums these with a linear function for dimension reduction to obtain the common-element attention vector g_p.
Further, W_z1 is the weight coefficient between the plaintiff element and H, W_z2 between the defendant element and H, W_z3 between the litigation-claims element and H, and W_z4 between the facts-and-reasons element and H; all four are calculated in the same way. Taking W_z1 as an example, the weight a_i of each word of H with respect to the plaintiff element is calculated (by the similarity formula given as a figure in the original publication, not reproduced here), i ∈ [1, n], giving W_z1 = [a_1, a_2, a_3, ..., a_n]; the element vectors are randomly initialized.
Further, g_1 is the plaintiff attention vector, g_2 the defendant attention vector, g_3 the litigation-claims attention vector, and g_4 the facts-and-reasons attention vector; all four attention vectors are calculated in the same way. Taking g_1 as an example, the weight coefficients W_z1 are used to weight the case implicit semantic feature vector H, giving the plaintiff element attention vector g_1 = W_z1·H.
Further, the four element attention vectors g_1, g_2, g_3, g_4 are combined linearly to obtain the common-element attention vector g_p = λ_1·g_1 + λ_2·g_2 + λ_3·g_3 + λ_4·g_4, where the λ are coefficients.
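The following sketch implements this common-element attention under stated assumptions: the exact word-element similarity appears only as a figure in the original, so dot-product scoring against randomly initialized element vectors z_k is assumed here, and the coefficients λ_k are made learnable rather than fixed.

```python
# Hedged sketch of the common-element attention unit.
import torch
import torch.nn.functional as F

class CommonElementAttention(torch.nn.Module):
    def __init__(self, h_dim: int = 100, n_elements: int = 4):
        super().__init__()
        # one query vector per element: plaintiff, defendant, claims, facts/reasons
        self.z = torch.nn.Parameter(torch.randn(n_elements, h_dim))
        self.lam = torch.nn.Parameter(torch.ones(n_elements))  # coefficients λ_k

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # H: (batch, n, h_dim); scores: (batch, n_elements, n)
        scores = torch.einsum("kd,bnd->bkn", self.z, H)  # assumed dot-product similarity
        A = F.softmax(scores, dim=-1)                    # W_zk = [a_1 ... a_n] per element
        G = torch.einsum("bkn,bnd->bkd", A, H)           # g_k = sum_i a_i * h_i
        g_p = torch.einsum("k,bkd->bd", self.lam, G)     # g_p = Σ λ_k * g_k
        return g_p
```

The cause-of-action attention g_a described next can reuse the same unit with a single element vector.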
The cause-of-action attention unit works analogously: it calculates the weight coefficients between the cause-of-action element and the case implicit semantic feature vector H and weights H by them, obtaining the cause-of-action attention vector g_a.
The feature fusion unit introduces the case description text length L and splices the vectorized text length L with the attention vectors g_a and g_p to obtain the element feature matrix O_h, i.e., O_h = [L, g_a, g_p].
The complex and simple diversion module 205 takes the element feature matrix O_h as the input of a multilayer perceptron (MLP), determines an output vector, and feeds it into a normalized softmax classifier to realize complex and simple case diversion.
Further, as shown in fig. 4, the complex and simple diversion module 205 includes:
hiding the layer unit: the element feature matrix OhAs input of a multilayer perceptron (MLP), after applying activation function de-linearization, a hidden layer output is obtained: o is1=sigmoid(W1Oh+b1) Wherein W is1Is OhCoefficient of (a), b1Is OhThe bias term of (1).
An output layer unit: outputting the hidden layer to O1Performing full connection processing, adding the obtained product into a normalized classifier softmax to obtain a final output O ═ softmax (W)2O1+b2) Wherein W is2Is O1Coefficient of (a), b2Is O1The bias term of (1). And calculating the overall loss of the model, and training by utilizing a gradient back propagation optimization algorithm to realize complex and simple distribution of the case.
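A compact sketch of these two layers follows. The input dimension 201 assumes O_h = [L, g_a, g_p] with one scalar length and two 100-dimensional attention vectors, and the hidden width of 64 is an illustrative choice the patent does not specify.

```python
# Hedged sketch of the complex and simple diversion module 205.
import torch
import torch.nn as nn

class ComplexSimpleSplitter(nn.Module):
    def __init__(self, in_dim: int = 201, hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)      # W_1, b_1
        self.fc2 = nn.Linear(hidden, n_classes)   # W_2, b_2

    def forward(self, O_h: torch.Tensor) -> torch.Tensor:
        O_1 = torch.sigmoid(self.fc1(O_h))            # O_1 = sigmoid(W_1·O_h + b_1)
        return torch.softmax(self.fc2(O_1), dim=-1)   # O = softmax(W_2·O_1 + b_2)
```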
Implementation results show that the complex and simple diversion device of this embodiment of the invention achieves an accuracy above 90% and can complete diversion tasks accurately.
The device only needs the case description as input to obtain the complex/simple result directly and realize diversion quickly. By retraining the model on different training sets, the device can be applied to different courts and regions, giving it strong flexibility.
It can be seen from the above technical scheme that the invention provides an intelligent case diversion device which decomposes the common element space and the cause-of-action element, constructs several element attention vectors from the case description, and predicts case complexity from them, thereby overcoming the time-consuming, labor-intensive, and inflexible character of existing complex and simple diversion methods.
As shown in fig. 5, the present invention provides an intelligent case diversion method, which includes the following steps:
step 101: collecting court historical referee documents, extracting case description related information, marking labels (complex cases/simple cases) according to actual case handling conditions, and obtaining original data of model training.
By collecting historical referee documents of a certain court and extracting related information, forty thousand pieces of data are obtained for model training, wherein the ratio of the data marked as a frequent case to the data marked as a simple case is 1:3, and the fact that many simple cases exist in civil litigation is met.
Step 102: clean and denoise the original case description (removing punctuation marks, escape characters, special characters, and the like), segment the text data with the jieba segmentation tool, and vectorize each segmented text S with the word2vec word-vector model so that every word is represented by a low-dimensional dense vector; the text S is finally represented as S = [w_1, w_2, w_3, w_4, ..., w_n], where w is a word vector composing the text S, of dimension 300, and n is the number of words the text S is truncated to, set to 800.
The case coding input matrix is obtained through these operations.
Step 103: for the case coding input matrix S = [w_1, w_2, w_3, w_4, ..., w_n], build a bidirectional long short-term memory network (Bi-LSTM) for feature extraction (with 100 hidden-layer nodes (neurons)), obtaining the case implicit semantic feature vector H = [h_1, h_2, h_3, h_4, ..., h_n] containing the main semantic information, where h is the feature vector of each word w after passing through the Bi-LSTM, of dimension 100; the case implicit semantic feature vector H can represent the information contained in different aspects of the text.
Step 104: calculate the weight coefficients between each element of the common element space and the case implicit semantic feature vector H = [h_1, h_2, h_3, h_4, ..., h_n], obtaining the attention weight coefficients {W_z1, W_z2, W_z3, W_z4}. Weight each element's attention coefficients against the case implicit semantic feature vector H to obtain the element attention vectors {g_1, g_2, g_3, g_4}, where g_1 denotes the plaintiff attention vector, g_2 the defendant attention vector, g_3 the litigation-claims attention vector, and g_4 the facts-and-reasons attention vector.
Step 105: sum g_1, g_2, g_3, g_4 with dimension reduction to obtain the common-element attention vector g_p; construct the cause-of-action attention vector g_a; at the same time introduce the case description text length L, and splice the vectorized text length L with the attention vectors g_a and g_p to obtain the element feature matrix O_h.
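A one-line fusion sketch for step 105, with shapes following the assumptions used above (a batch of scalar normalized lengths and two 100-dimensional attention vectors):

```python
# Hedged sketch of the feature fusion O_h = [L, g_a, g_p].
import torch

def fuse(L_norm: torch.Tensor, g_a: torch.Tensor, g_p: torch.Tensor) -> torch.Tensor:
    # L_norm: (batch, 1); g_a, g_p: (batch, 100)  ->  O_h: (batch, 201)
    return torch.cat([L_norm, g_a, g_p], dim=-1)
```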
Step 106: take O_h from step 105 as the input of a multilayer perceptron (MLP); after the sigmoid activation function is applied to introduce nonlinearity, the hidden-layer output is obtained as O_1 = sigmoid(W_1·O_h + b_1), where W_1 is the weight applied to O_h and b_1 its bias term.
The hidden-layer output O_1 is passed through a fully connected layer and a normalized softmax classifier to obtain the final output O = softmax(W_2·O_1 + b_2), where W_2 is the weight applied to O_1 and b_2 its bias term.
Calculate the overall loss of the model with a cross-entropy loss function, optimize with a stochastic gradient descent algorithm, and after training use the model for complex and simple case diversion.
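A hedged training-loop sketch for this step follows: cross-entropy loss optimized by stochastic gradient descent, as the text states. The model and loader arguments are placeholders for the full pipeline and the data iterator, and the learning rate is illustrative. Note that nn.CrossEntropyLoss expects raw logits, so the final softmax would be applied only at inference time.

```python
# Sketch of step 106 training, under the assumptions stated above.
import torch
import torch.nn as nn

def train_epoch(model: nn.Module, loader, lr: float = 0.01) -> None:
    """One epoch: cross-entropy loss + stochastic gradient descent."""
    criterion = nn.CrossEntropyLoss()  # quantifies output/observation divergence
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for S, labels in loader:           # labels: 0 = simple case, 1 = complex case
        logits = model(S)              # full pipeline: encode, attend, fuse, MLP
        loss = criterion(logits, labels)
        optimizer.zero_grad()
        loss.backward()                # gradient back-propagation
        optimizer.step()               # SGD parameter update
```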
It can be seen from the above technical scheme that the invention provides an intelligent case diversion method which constructs several element attention vectors from the case description and predicts case complexity from them, thereby overcoming the time-consuming, labor-intensive, and inflexible character of existing complex and simple diversion methods.
The method and device embodiments in this specification are substantially similar, so each is described relatively briefly; for related details, the embodiments may be referred to one another.
The above is only an embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any modification or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention, which shall therefore be subject to the protection scope of the claims.

Claims (10)

1. An intelligent case shunting method is characterized by comprising the following steps:
carrying out data processing on an original case to obtain a case description initial vector;
encoding the case description initial vector by using a neural network model to obtain a case implicit semantic feature vector;
and determining the difficulty level of the case according to the case implicit semantic feature vector, the case description text length, and the related elements influencing case complexity, and intelligently diverting the cases to realize complex and simple case diversion.
2. The method according to claim 1, wherein said data processing of the original case comprises:
cleaning case description contents in an original case, processing the case description contents by adopting a word segmentation tool, and acquiring a semantic vector of each word in a case description text by using a word vector model to obtain an initial case description vector.
3. The method according to claim 1, wherein said determining case difficulty level and performing case intelligent diversion comprises:
constructing a bidirectional long and short term memory network for feature extraction of the case description initial vector to obtain a case implicit semantic feature vector containing case semantic information;
combining various elements influencing the difficulty level of the case, and processing the case implicit semantic feature vectors by using a multi-attention neural network model to obtain attention vectors representing the elements;
and performing feature fusion on the attention vectors representing the elements, and combining the normalized representation of case description text length to obtain an element feature matrix, thereby achieving the purpose of dividing trial cases into different difficulty degrees.
4. The method according to claim 3, wherein the implementation of dividing trial cases into different difficulty levels comprises:
and processing the element characteristic matrix by utilizing a multilayer perceptron model so as to determine an output vector and realize the purpose of dividing trial and judgment cases in different difficulty levels.
5. The method of claim 3, wherein the processing using the multi-attention neural network model comprises:
calculating the attention weight coefficients between each element of a common element space and the case implicit semantic feature vector;
weighting the case implicit semantic feature vector by the attention weight coefficients of each element of the common element space to construct a plurality of common-element attention vectors;
summing the common-element attention vectors with dimension reduction to obtain the common-element-space attention vector;
and constructing the cause-of-action attention vector, adding the case description text length, and splicing the normalized representation of the case description text length with each element attention vector to obtain the element feature matrix.
6. The method as claimed in claim 5, wherein the common element space is the set of 4 types of common elements affecting case complexity, namely the plaintiff's situation, the defendant's situation, the litigation claims, and the facts and reasons, and the cause-of-action element is the cause of action under which the case is filed.
7. The method according to claim 5, wherein the attention weight coefficients are obtained by calculating the degree of similarity of each element with the case implicit semantic feature vector.
8. An intelligent case diversion device, characterized in that the device comprises:
the data preprocessing module is used for preprocessing the original case description to obtain an initial vector of the case description;
the case coding module is used for extracting text features of the case description initial vector by utilizing a bidirectional long-short term memory network to obtain a case implicit semantic feature vector containing case semantic information;
the multi-attention mechanism module, used for respectively calculating, with an attention mechanism, the weight coefficients between each element of the cause-of-action element space and the common element space and all words of the case implicit semantic feature vector, constructing a plurality of attention vectors, and adding the normalized representation of the case description text length to obtain an element feature matrix;
and the intelligent case diversion module, which processes the element feature matrix with the multilayer perceptron, determines an output vector, and applies the normalized classifier for intelligent case diversion, realizing complex and simple case diversion.
9. The apparatus of claim 8, wherein the data preprocessing module comprises:
extracting information of the original case description, cleaning and denoising to obtain meaningful text data;
performing Chinese word segmentation processing on the text data;
and applying a word vector model to carry out vectorization processing on the text with the words segmented to obtain a case description initial vector, and processing unstructured text data into structured data with uniform fixed length.
10. The apparatus of claim 8, wherein the multi-attention mechanism module comprises:
calculating the attention weight coefficients between the cause-of-action element and all words of the case implicit semantic feature vector, and weighting the word semantic vectors by these coefficients to obtain the cause-of-action attention vector;
calculating the attention weight coefficients between each element of the common element space and all words of the case implicit semantic feature vector, weighting the word semantic vectors by these coefficients to obtain a plurality of element attention vectors, and summing these with a linear function for dimension reduction to obtain the common-element attention vector;
and introducing the case description text length, and splicing the normalized representation of the case description text length with each element attention vector to obtain the element feature matrix.
CN202010516323.1A 2020-06-09 2020-06-09 Intelligent case shunting method and device Pending CN111708885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010516323.1A CN111708885A (en) 2020-06-09 2020-06-09 Intelligent case shunting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010516323.1A CN111708885A (en) 2020-06-09 2020-06-09 Intelligent case shunting method and device

Publications (1)

Publication Number Publication Date
CN111708885A true CN111708885A (en) 2020-09-25

Family

ID=72539530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010516323.1A Pending CN111708885A (en) 2020-06-09 2020-06-09 Intelligent case shunting method and device

Country Status (1)

Country Link
CN (1) CN111708885A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3004097A1 (en) * 2017-05-16 2018-11-16 Hamid Hatami-Hanza Methods and systems for investigation of compositions of ontological subjects and intelligent systems therefrom
CN110377730A (en) * 2019-06-14 2019-10-25 平安科技(深圳)有限公司 Case is by classification method, device, computer equipment and storage medium
CN110457479A (en) * 2019-08-12 2019-11-15 贵州大学 A kind of judgement document's analysis method based on criminal offence chain
CN111177367A (en) * 2019-11-11 2020-05-19 腾讯科技(深圳)有限公司 Case classification method, classification model training method and related products
CN111241837A (en) * 2020-01-04 2020-06-05 大连理工大学 Theft case legal document named entity identification method based on anti-migration learning
CN111400445A (en) * 2020-03-10 2020-07-10 中国人民大学 Case complex and simple splitting method based on similar texts

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3004097A1 (en) * 2017-05-16 2018-11-16 Hamid Hatami-Hanza Methods and systems for investigation of compositions of ontological subjects and intelligent systems therefrom
CN110377730A (en) * 2019-06-14 2019-10-25 平安科技(深圳)有限公司 Case is by classification method, device, computer equipment and storage medium
CN110457479A (en) * 2019-08-12 2019-11-15 贵州大学 A kind of judgement document's analysis method based on criminal offence chain
CN111177367A (en) * 2019-11-11 2020-05-19 腾讯科技(深圳)有限公司 Case classification method, classification model training method and related products
CN111241837A (en) * 2020-01-04 2020-06-05 大连理工大学 Theft case legal document named entity identification method based on anti-migration learning
CN111400445A (en) * 2020-03-10 2020-07-10 中国人民大学 Case complex and simple splitting method based on similar texts

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI WEN et al.: "Application of Chinese text understanding technology to the classification of legal case texts", Journal of Nanchang University (Engineering & Technology Edition) *

Similar Documents

Publication Publication Date Title
Dong et al. Automatic age estimation based on deep learning algorithm
CN107526785B (en) Text classification method and device
CN104992223B (en) Intensive population estimation method based on deep learning
CN112308158A (en) Multi-source field self-adaptive model and method based on partial feature alignment
CN110334705A (en) A kind of Language Identification of the scene text image of the global and local information of combination
CN107704877A (en) A kind of image privacy cognitive method based on deep learning
CN109543722A (en) A kind of emotion trend forecasting method based on sentiment analysis model
CN110363252A (en) It is intended to scene text detection end to end and recognition methods and system
Zhang et al. A multi-label waste detection model based on transfer learning
CN105631479A (en) Imbalance-learning-based depth convolution network image marking method and apparatus
CN109740655B (en) Article scoring prediction method based on matrix decomposition and neural collaborative filtering
CN107346327A (en) The zero sample Hash picture retrieval method based on supervision transfer
CN115659966A (en) Rumor detection method and system based on dynamic heteromorphic graph and multi-level attention
CN108920446A (en) A kind of processing method of Engineering document
CN107239532A (en) Data digging method and device
Akhlaghi et al. Farsi handwritten phone number recognition using deep learning
CN114842507A (en) Reinforced pedestrian attribute identification method based on group optimization reward
CN113705242A (en) Intelligent semantic matching method and device for education consultation service
He et al. Classification of metro facilities with deep neural networks
Chang et al. Multi-task learning for emotion descriptors estimation at the fourth abaw challenge
CN104200222B (en) Object identifying method in a kind of picture based on factor graph model
Huynh et al. An efficient model for copy-move image forgery detection
CN116805022A (en) Specific Twitter user mining method based on group propagation
CN111708885A (en) Intelligent case shunting method and device
CN109978013A (en) A kind of depth clustering method for figure action identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200925