CN114781389B - Crime name prediction method and system based on label enhancement representation - Google Patents


Info

Publication number
CN114781389B
CN114781389B (application CN202210209170.5A)
Authority
CN
China
Prior art keywords
crime
representation
case
tag
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210209170.5A
Other languages
Chinese (zh)
Other versions
CN114781389A (en)
Inventor
但静培
胥岚林
廖晓爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202210209170.5A priority Critical patent/CN114781389B/en
Publication of CN114781389A publication Critical patent/CN114781389A/en
Application granted granted Critical
Publication of CN114781389B publication Critical patent/CN114781389B/en

Classifications

    • G — PHYSICS
    • G06F — Electric digital data processing
        • G06F 40/30 — Handling natural language data; Semantic analysis
        • G06F 16/35 — Information retrieval of unstructured textual data; Clustering; Classification
        • G06F 18/2414 — Pattern recognition; Classification techniques; Smoothing the distance, e.g. radial basis function networks [RBFN]
        • G06F 40/12 — Text processing; Use of codes for handling textual entities
    • G06N — Computing arrangements based on specific computational models
        • G06N 3/045 — Neural networks; Combinations of networks
        • G06N 3/047 — Probabilistic or stochastic networks
        • G06N 3/084 — Learning methods; Backpropagation, e.g. using gradient descent
    • G06Q — Information and communication technology specially adapted for administrative, commercial, financial, managerial or supervisory purposes
        • G06Q 50/18 — Legal services; Handling legal documents


Abstract

The invention provides a crime name prediction method and system based on label-enhanced representation. The method comprises the following steps: selecting cases as a sample set and giving the input description of each case in the sample set; giving the label input description of the crime name corresponding to each case; encoding each case description to obtain a context-dependent embedded representation of each word in the description; encoding each crime-name label to obtain an embedded representation of each label; alternately applying a self-attention mechanism and a cross-attention mechanism to the encoded crime-name labels to obtain the crime-name enhanced label representation; concatenating the case text representation with the crime-name enhanced label representation and training a classifier through a convolutional neural network model; and feeding the cases to be predicted into the trained crime-name prediction model to obtain the predicted crime name. According to the method, the semantic information contained in the crime-name enhanced label representation gives the training data better interpretability, and thus higher prediction accuracy is obtained.

Description

Crime name prediction method and system based on label enhancement representation
Technical Field
The invention relates to the technical field of machine learning, and in particular to a crime name prediction method and system based on label-enhanced representation.
Background
Legal judgment prediction completes the prediction of the crime name from the description of the case facts and can effectively assist the adjudication of criminal cases. It has attracted increasing attention in recent years: on one hand, it provides higher-quality judgment references for people without a legal background; on the other hand, it provides legal reference for legal professionals.
In recent years, many studies have addressed automatic judgment. Initially, the automatic judgment problem was treated as a simple text classification problem and handled by conventional means such as keyword matching. With the development of deep learning, more researchers began to use deep learning frameworks to extract information from text to assist automatic judgment. However, most of these methods focus only on the text content of the case descriptions — the model must learn the features of the case description — while ignoring that the crime-name labels themselves carry semantic information, so accuracy in crime-name prediction has remained unsatisfactory.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a crime name prediction method and a crime name prediction system based on label enhancement representation.
In order to achieve the above object of the present invention, the present invention provides a crime name prediction method based on tag enhancement representation, comprising the steps of:
selecting cases as a sample set, and giving the input description of each case in the sample set; giving the label input description of the crime name corresponding to each case;
encoding each case description and obtaining a context-dependent embedded representation of each word in the description, denoted the case text representation X_f;
encoding each crime-name label description and obtaining an embedded representation of each crime-name label; the label set containing the embedded representations of all crime-name labels is denoted E_T;
fusing the encoded crime-name labels with the case text representation, alternately applying a self-attention mechanism and a cross-attention mechanism, to obtain the crime-name enhanced label representation H_T;
concatenating the case text representation X_f with the crime-name enhanced label representation H_T, and training the model's classifier through a convolutional neural network model to obtain the trained crime-name prediction model;
and feeding the case to be predicted into the trained crime-name prediction model to obtain the predicted crime name.
According to this method, the crime-name labels are mapped into a latent semantic space through embedding, the important information of the case fact description is fused into the crime-name enhanced labels, and a classifier is trained on this basis to complete the crime-name prediction task for case descriptions. Even with small samples, the model achieves higher prediction accuracy and shows a certain generalization capability on low-frequency crime names.
In a preferred scheme of the crime name prediction method based on label-enhanced representation, given the input description of each case in the sample set, word-granularity processing is performed on each case input description S_d to obtain the case fact description S_d = (w_1, w_2, ..., w_m), where w_i denotes the i-th word in the case input description text, m is the number of words in the case input description text, i is a positive integer, and 1 ≤ i ≤ m;
word-granularity processing is performed on each crime-name label input description to obtain the crime-name label S_c = (v_1^c, v_2^c, ..., v_p^c), where v_j^c denotes the j-th word of the label input description, c is a positive integer not greater than L, L denotes the number of crime-name labels, and p denotes the number of words in the crime-name label;
the case fact description S_d and the crime-name labels S_c are then encoded.
This method preserves the case facts and the crime-name label descriptions in the form of text features to the greatest extent, improving the accuracy of crime-name prediction.
In a preferred scheme of the crime name prediction method based on label-enhanced representation, the case fact description S_d is encoded, and the last hidden-layer output of the encoder is used as the context-dependent embedded representation of each word in the case fact description, i.e. X_f = [x_1, x_2, ..., x_m] ∈ R^(m×d_s), where d_s denotes the dimension of the encoder's last hidden layer and x_i denotes the embedded representation corresponding to the i-th word in the case fact description.
The crime-name label S_c is encoded, and the last hidden-layer output of the encoder is used as the word-granularity embedded representation of each crime-name label, E^c = [e_1^c, e_2^c, ..., e_p^c], where e_j^c denotes the embedded representation corresponding to the j-th word in the crime-name label; the word embeddings of each label are summed to obtain e_c = Σ_j e_j^c, the embedded representation of the c-th crime-name label, yielding the label set E_T = [e_1, e_2, ..., e_c, ..., e_L] containing the embedded representations of all crime-name labels.
In this way, the case descriptions and the crime-name labels are mapped into the same semantic space, the knowledge learned by the pre-trained model is applied to both, and the semantic information of the crime-name labels is brought into the model's training process, giving the training data better interpretability and thus higher prediction accuracy.
In a preferred scheme of the crime name prediction method based on label-enhanced representation, when the self-attention and cross-attention mechanisms are alternately applied to the encoded crime-name labels, a Q-K-V attention model is adopted following the Transformer model:
let the key matrix be K ∈ R^(M×D_k), the query matrix Q ∈ R^(N×D_k), and the value matrix V ∈ R^(M×D_v), obtained with parameter matrices W_k, W_q, W_v initialized as all-zero matrices; the attention output is computed by the Transformer's scaled dot-product attention, Attention(Q, K, V) = softmax(Q·K^T / √D_k)·V, where N and M denote the lengths of the query and key-value sequences respectively, D is the word embedding dimension, D_k denotes the dimension of the key/query matrices, and D_v denotes the dimension of the value matrix;
a residual connection is applied in the feed-forward sublayer, and the final output is obtained as the crime-name enhanced label representation H_T = [h_1, h_2, ..., h_L], where h_c refers to the enhanced representation of crime-name label c.
Fusing the crime-name labels with the case text realizes the enhanced representation of the labels; the model performs a preliminary fusion of case and crime-name information before the classifier, so the training data are more interpretable to the model and the accuracy of crime-name prediction is improved.
The invention also provides a crime name prediction system comprising a processing module and a storage module communicatively connected to each other; the storage module stores at least one executable instruction that causes the processing module to perform the operations corresponding to the above crime name prediction method based on label-enhanced representation.
The processing module comprises a case description encoder, a tag characteristic enhancer and a classifier;
the case description encoder encodes each case description to obtain a case text representation;
the tag feature enhancer maps the crime name tag to a potential semantic space to obtain an embedded representation of the crime name tag, and fuses the embedded representation with the case text representation to obtain a crime name enhanced tag representation;
and the classifier fuses the case text representation and the crime name enhancement tag representation, trains a classification model for classification prediction, and obtains a prediction result.
The crime name prediction system has all the advantages of the above crime name prediction method.
According to the invention, the semantic information contained in the crime-name enhanced label representation gives the training data better interpretability, and thus higher prediction accuracy is obtained.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
fig. 1 is a functional block diagram of a crime name prediction system.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, unless otherwise specified and defined, the terms "mounted," "connected," and "coupled" are to be construed broadly: for example, a connection may be mechanical or electrical, and two elements may communicate with each other directly or indirectly through intermediaries; those skilled in the art will understand the specific meaning of these terms in context.
The invention provides a crime name prediction method based on label enhancement representation, which mainly comprises the steps of fusing important information of case fact description into label representation of corresponding subtasks, and training a classifier based on the important information to complete the crime name prediction task of case fact description. The method comprises the following specific steps:
the case is selected as the sample set. The sample set contains a large number of cases, and the types of the criminal names corresponding to the cases are as many as possible.
The sample set used in this embodiment is the CAIL2018 dataset. Each sample in the dataset is a legal case, and each case has the same structure, comprising the fact description of the case and the judgment results such as the relevant law articles, crime name, and term of penalty. The CAIL2018 dataset consists of two parts, CAIL-small and CAIL-big; details are shown in Table 1.
Table 1 data set introduction
|            | Training set size | Test set size | Number of crime names |
|------------|-------------------|---------------|-----------------------|
| CAIL-small | 154592            | 32508         | 196                   |
| CAIL-big   | 1710856           | 217016        | 196                   |
CAIL-small additionally provides 17131 samples as a validation set. In the CAIL-big dataset, a small number of fact descriptions correspond to multiple crime names. Because the goal here is only to verify whether label semantics can improve the performance of crime-name prediction, and because training samples with multiple crime-name labels are sparse, the data samples with multiple crime-name labels in CAIL-big are deleted to reduce model complexity, and only the data samples with a single crime-name label are retained.
Given each case input description in the sample set, word-granularity processing is performed on each case input description S_d to obtain the case fact description S_d = (w_1, w_2, ..., w_m), where w_i denotes the i-th word in the case input description text, m is the number of words in the case input description text, i is a positive integer, and 1 ≤ i ≤ m.
Likewise, each case corresponds to a crime name; different cases may correspond to the same crime name or to different ones. Given the label input of the crime name corresponding to each case, word-granularity processing is performed on each crime-name label input description to obtain the crime-name label S_c = (v_1^c, v_2^c, ..., v_p^c), where v_j^c denotes the j-th word of the label-c input description text, c is a positive integer not greater than L, L denotes the number of crime-name labels, and p denotes the number of words in the current crime-name label. The set of all crime-name labels is denoted S = (S_1, S_2, ..., S_c, ..., S_L).
The case fact description S_d and the crime-name labels S_c are then encoded respectively.
In this embodiment, bert is preferably but not limited to selected as the basic encoder to encode the case fact description and criminal name label. The method comprises the following steps:
description of case factsWhen encoding, word sequence describing the fact of case +.>Input to Bert pre-training model, the last hidden layer output of the pre-training model is +.>Wherein d is s The dimension of the last hidden layer of the Bert is represented, and the Bert pre-training model describes a word sequence S of the case fact d Each word of (a) is expanded into a ds-dimensional column vector, ">Representing the ith column vector in the ds dimension column vector, representing the embedded representation corresponding to the ith word in the case fact description. The Bert is subjected toThe last hidden layer output is used as a context dependent embedded representation of each word in the case fact description, i.e./i> Text representation X of a case f
When encoding the crime-name label S_c, the label is likewise encoded by BERT, and the last hidden-layer output of BERT is selected as the word-granularity embedded representation of each crime-name label, E^c = [e_1^c, e_2^c, ..., e_p^c]; the BERT pre-trained model expands each word of S_c into a d_s-dimensional column vector, and e_j^c, the j-th of these column vectors, is the embedded representation corresponding to the j-th word in crime-name label c. The word embeddings of each label are summed to obtain e_c = Σ_j e_j^c, the embedded representation of the c-th crime-name label, and finally the label set E_T = [e_1, e_2, ..., e_c, ..., e_L] containing the embedded representations of all crime-name labels is collected.
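The label-encoding step above can be sketched as follows. This is a toy illustration: random vectors stand in for BERT's last-hidden-layer outputs, and all dimensions (d_s = 8, L = 3, the per-label word counts) are illustrative values, not the patent's actual configuration.

```python
import numpy as np

# Toy sketch of the label-encoding step: random vectors stand in for BERT's
# last-hidden-layer word embeddings. Each label's (p, d_s) word-embedding
# matrix is summed into a single label embedding e_c; the L label embeddings
# are stacked into the label set E_T.
rng = np.random.default_rng(0)
d_s, L = 8, 3                                     # hidden size, number of labels
label_word_embs = [rng.normal(size=(p, d_s))      # p = words in each label
                   for p in (2, 4, 3)]

E_T = np.stack([E_c.sum(axis=0) for E_c in label_word_embs])  # shape (L, d_s)
assert E_T.shape == (L, d_s)
```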
Enhancement processing is then performed on the crime-name labels. In this embodiment, the encoded crime-name labels are fused with the case text representation by alternately applying a self-attention mechanism and a cross-attention mechanism, yielding the crime-name enhanced label representation.
Specifically, this embodiment adapts the multi-head attention of the decoder in the Transformer model and innovatively proposes a method of enhancing the label representation. The crime-name label feature enhancer is implemented by alternately applying a self-attention mechanism and a cross-attention mechanism; following the Transformer model, a Q-K-V attention model is adopted:
let the key matrix beThe query matrix is +.>The value matrix isWherein W is k 、W q 、W v The attention output is obtained by the scaled dot product of the convertors for the attention as an all-zero matrix>Namely, the criminal name tag weight is dispersed to a case description result, wherein N and M respectively represent the length of a query vector and a key value, D is a word embedding dimension, and K T Transposed matrix of K, D k Representing the dimensions of a key or query matrix, D v Representing the dimension of the value matrix.
Finally, a residual connection is applied in the feed-forward sublayer, and the final output is obtained as the crime-name enhanced label representation H_T = [h_1, h_2, ..., h_L], where h_c refers to the enhanced representation of crime-name label c. The case text representation X_f is concatenated with the crime-name enhanced label representation H_T to obtain H = [X_f; H_T], and the model's classifier is trained through the convolutional neural network model CNN to obtain the trained crime-name prediction model.
In this embodiment, a DPCNN model is used as the classifier in the convolutional neural network model CNN: H is input to the DPCNN classifier to obtain the text feature representation Z_T fused with the crime-name label features, which is connected to a ReLU activation function to obtain the crime-name prediction result. Specifically, the text feature representation Z_T fused with the label features is passed, via the ReLU activation function, to a fully connected layer for the final prediction.
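A full DPCNN is considerably deeper than this, but the core convolution, ReLU, and max-over-time pooling step that turns the variable-length fused sequence H into a fixed-size feature vector can be sketched as follows; the filter count, filter width, and input shapes are illustrative assumptions, not the patent's values.

```python
import numpy as np

def conv1d_relu_maxpool(H, Wf, b):
    # One convolution -> ReLU -> max-over-time block: slides each of the
    # num_filters filters (width k) over the sequence H and keeps the
    # maximum response per filter, giving a fixed-size feature vector.
    k = Wf.shape[1]                                  # filter width
    n = H.shape[0] - k + 1                           # sliding positions
    feats = np.stack([
        np.tensordot(H[i:i + k], Wf, axes=([0, 1], [1, 2])) + b
        for i in range(n)
    ])                                               # (n, num_filters)
    return np.maximum(feats, 0.0).max(axis=0)        # (num_filters,)

rng = np.random.default_rng(3)
H = rng.normal(size=(10, 8))       # fused case + label sequence (toy shape)
Wf = rng.normal(size=(5, 3, 8))    # 5 filters of width 3 over dim-8 inputs
Z_T = conv1d_relu_maxpool(H, Wf, np.zeros(5))
assert Z_T.shape == (5,)
```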
Predicted value: ŷ = softmax(W_0·ReLU(Z_T) + b_0), where W_0 and b_0 denote randomly initialized parameter matrices.
A loss function is defined: Loss = −Σ_{c=1}^{L} y_c·log(ŷ_c), where ŷ_c denotes the predicted value and y_c denotes the true value. The objective loss function is optimized and trained by gradient descent, and when the training-completion condition is reached, the trained crime-name prediction model is obtained.
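The prediction head and loss can be sketched as below. The patent shows these formulas only as images, so the softmax and the exact placement of the ReLU are reconstructions from the surrounding description and should be treated as assumptions; the dimensions are toy values.

```python
import numpy as np

def predict_proba(Z_T, W0, b0):
    # Prediction head: y_hat = softmax(W0 · ReLU(Z_T) + b0).
    # Softmax and ReLU placement are reconstructed assumptions, not
    # explicitly shown in the patent text.
    logits = W0 @ np.maximum(Z_T, 0.0) + b0
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cross_entropy(y_true, y_hat):
    # Loss = -sum_c y_c * log(y_hat_c), as defined in the description/claim 2.
    return -float(np.sum(y_true * np.log(y_hat)))

rng = np.random.default_rng(2)
d, L = 8, 4                          # toy feature and label dimensions
Z_T = rng.normal(size=d)             # stand-in for the DPCNN feature output
W0 = rng.normal(size=(L, d))         # randomly initialized parameters W_0, b_0
b0 = np.zeros(L)
y_hat = predict_proba(Z_T, W0, b0)
y_true = np.eye(L)[1]                # one-hot ground-truth crime name
loss = cross_entropy(y_true, y_hat)
```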
In the training process of this embodiment, the word embedding dimension D is preferably, but not limited to, 128; the number of attention heads in the Transformer is preferably, but not limited to, 8; the AdamW optimizer is preferably, but not limited to, used, with a learning rate preferably of 0.001 and a regularization parameter preferably of 10^-4.
When the crime name of the case is predicted, the case to be predicted is predicted in a trained crime name prediction model, and the predicted crime name is obtained.
To demonstrate the superiority of the method, it is compared with the BiLSTM+ATT, TextCNN, DPCNN, and BERT+fine-tuning models, using the same sample set for all comparisons.
BiLSTM+ATT: a classic text classification model and an attention-based neural-network variant; it captures contextual semantics using a bidirectional LSTM with an attention mechanism, automatically selecting important features through attention during training.
TextCNN: CNN models are widely applied in the field of image processing; the TextCNN model applies them to text data, with a notable effect on text classification.
DPCNN: a commonly used text classification model is a variant of the CNN model.
BERT+fine-tuning: combines the pre-trained model with a downstream task model and fine-tunes the parameters of the pre-trained model. Fine-tuning is currently the most common way to apply pre-trained models to specific tasks; combined with various downstream task models, it can complete a variety of NLP tasks.
Accuracy (Acc), macro-precision (MP), macro-recall (MR), macro-F1 (F1), and the top-5 accuracy of the predicted ranking (Acc@5) are used as evaluation metrics.
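For reference, the macro-averaged metrics used in the evaluation can be computed as follows. This is the standard definition (per-class precision/recall/F1 averaged with equal weight over the L crime names); the toy labels are illustrative.

```python
import numpy as np

def macro_scores(y_true, y_pred, L):
    # Macro-precision, macro-recall, macro-F1: compute P/R/F1 per class,
    # then average with equal weight over all L classes.
    P, R, F = [], [], []
    for c in range(L):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        P.append(p); R.append(r)
        F.append(2 * p * r / (p + r) if p + r else 0.0)
    return float(np.mean(P)), float(np.mean(R)), float(np.mean(F))

y_true = np.array([0, 0, 1, 1, 2])   # toy ground-truth crime-name indices
y_pred = np.array([0, 1, 1, 1, 2])   # toy predictions
mp, mr, mf1 = macro_scores(y_true, y_pred, L=3)
```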
The test results are shown below:
TABLE 1 criminal name prediction on CAIL-small dataset
TABLE 2 criminal name prediction on CAIL-big dataset
|                  | Acc   | MP    | MR    | F1    | Acc@5 |
|------------------|-------|-------|-------|-------|-------|
| BiLSTM+ATT       | 0.948 | 0.811 | 0.815 | 0.810 | 0.991 |
| TextCNN          | 0.944 | 0.799 | 0.804 | 0.798 | 0.989 |
| DPCNN            | 0.961 | 0.857 | 0.859 | 0.855 | 0.993 |
| BERT+Fine-Tune   | 0.958 | 0.914 | 0.915 | 0.914 | 0.987 |
| This application | 0.960 | 0.921 | 0.924 | 0.921 | 0.993 |
TABLE 3 Performance on low-frequency crime names
As shown in Table 1, on the CAIL-small dataset, compared with BiLSTM+ATT, TextCNN, and DPCNN, the present application achieves the highest scores on all metrics, with accuracy improved by more than 6%. The difference from BERT+Fine-Tune is small, but because this method freezes the parameters of the pre-trained model during training and does not need to update them in the backward pass, the training time is significantly shorter than fine-tuning and the model converges faster.
As shown in Table 2, on the CAIL-big dataset, because the dataset is very large, both the baseline models and the proposed method approach 100%, with accuracy differences of no more than 3%.
In CAIL-small, some crime names have fewer than 100 training samples. Tests on this subset are shown in Table 3: the performance on low-frequency crime names is greatly improved over BiLSTM+ATT, TextCNN, and DPCNN, with accuracy gaps of 26.0%, 16.1%, and 25.3% respectively; the accuracy gap with BERT+Fine-Tune is slight, but the Acc@5 metric is improved by 14.8%, indicating that the proposed method can capture the semantic information of low-frequency crime names and further exploit it in the crime-name prediction task.
The invention also provides a crime name prediction system comprising a processing module and a storage module communicatively connected to each other; the storage module stores at least one executable instruction that causes the processing module to perform the operations corresponding to the above crime name prediction method based on label-enhanced representation.
The processing module is shown in fig. 1 and comprises a case description encoder, a tag characteristic enhancer and a classifier;
the case description encoder encodes each case description to obtain a case text representation;
the tag feature enhancer maps the crime name tag to a potential semantic space to obtain an embedded representation of the crime name tag, and fuses the embedded representation with the case text representation to obtain a crime name enhanced tag representation;
and the classifier fuses the case text representation and the crime name enhancement tag representation, trains a classification model for classification prediction, and obtains a prediction result.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims (4)

1. A crime name prediction method based on label enhancement representation is characterized by comprising the following steps:
selecting cases as a sample set, giving each case input description in the sample set, and giving the label input description of the criminal name corresponding to each case:
performing word-granularity processing on each case input description S_d to obtain the case fact description S_d = (w_1, w_2, ..., w_m), where w_i denotes the i-th word in the case input description text, m is the number of words in the case input description text, i is a positive integer, and 1 ≤ i ≤ m;
performing word-granularity processing on each crime-name label input description to obtain the crime-name label S_c = (v_1^c, v_2^c, ..., v_p^c), where v_j^c denotes the j-th word of the label input description text, c is a positive integer not greater than L, L denotes the number of crime-name labels, and p denotes the number of words in the crime-name label;
encoding each case fact description and obtaining a contextually relevant embedded representation of each word in each case fact description, denoted as a case text representation X f
Description of case factsEncoding, using the last hidden layer output of the encoder as a context-dependent embedded representation of each word in the case fact description, i.e.Wherein d is s Representing the dimension of the last hidden layer of the encoder, of->Representing an embedded representation corresponding to an i-th word in the case fact description;
encoding each crime name tag description to obtain an embedded representation of each crime name tag, and denoting the set containing the embedded representations of all crime name tags as E_T;
encoding the crime name tag T_c and taking the last hidden-layer output of the encoder as the word-granularity embedded representation of the crime name tag, U_c = [u_1^c, u_2^c, ..., u_p^c], where u_j^c represents the embedded representation corresponding to the j-th word in the crime name tag; summing the word-level embedded representations of each crime name tag to obtain e_c = Σ_{j=1}^{p} u_j^c, where e_c represents the embedded representation of the c-th crime name tag, yielding the tag set E_T = [e_1, e_2, ..., e_c, ..., e_L];
fusing the encoded crime name tags with the case text representation, alternately applying a self-attention mechanism and a cross-attention mechanism, to obtain the crime name enhanced tag representation H; specifically:
following the Transformer model, a Q-K-V attention model is used:
let the key matrix be K ∈ R^(M×D_k), the query matrix be Q ∈ R^(N×D_k), and the value matrix be V ∈ R^(M×D_v), obtained by projecting the inputs with the matrices W_k, W_q and W_v respectively; the attention output is obtained by the scaled dot-product attention of the Transformer, Attn(Q, K, V) = softmax(QK^T / √D_k)V ∈ R^(N×D_v), where N and M represent the lengths of the query and key sequences respectively, D is the word embedding dimension, D_k represents the dimension of the key and query matrices, and D_v represents the dimension of the value matrix;
performing a residual connection with the feed-forward layer to obtain the final output as the crime name enhanced tag representation H = [h_1, h_2, ..., h_L], where h_c denotes the enhanced representation of the crime name tag c;
concatenating the case text representation X_f with the crime name enhanced tag representation H, and training the classifier of the model through a convolutional neural network model to obtain a trained crime name prediction model;
inputting the case to be predicted into the trained crime name prediction model to obtain the predicted crime name.
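The Q-K-V fusion step of claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function and variable names are assumptions, the alternation of self- and cross-attention and the feed-forward sublayer are omitted, and only a single cross-attention pass with a residual connection is shown.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(D_k)) V, as in the Transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (N, M) similarity scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
    return w @ V                                     # (N, D_v) attention output

def fuse_labels_with_text(E_T, X_f, W_q, W_k, W_v):
    """Cross-attention fusion sketch: the crime name tag embeddings E_T (L, D)
    act as queries over the case text representation X_f (m, D); the output
    plus a residual connection gives an enhanced tag representation H (L, D)."""
    Q = E_T @ W_q                                    # queries from the tags
    K = X_f @ W_k                                    # keys from the case text
    V = X_f @ W_v                                    # values from the case text
    return scaled_dot_product_attention(Q, K, V) + Q # residual connection
```

With square projection matrices of shape (D, D), the output H has one D-dimensional row per crime name tag, matching the shape of E_T.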
2. The crime name prediction method based on label enhancement representation according to claim 1, wherein the loss function of the convolutional neural network model is the multi-label binary cross-entropy Loss = −Σ_{c=1}^{L} [y_c log ŷ_c + (1 − y_c) log(1 − ŷ_c)], where ŷ_c represents the predicted value, y_c represents the true value, and L represents the total number of crime name tags.
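The multi-label binary cross-entropy of claim 2 can be written as a short numerical sketch (the function name and the clipping epsilon are illustrative assumptions, not part of the claim):

```python
import numpy as np

def multilabel_bce_loss(y_pred, y_true, eps=1e-12):
    """Binary cross-entropy summed over the L crime name tags:
    -sum_c [ y_c * log(p_c) + (1 - y_c) * log(1 - p_c) ]."""
    p = np.clip(y_pred, eps, 1.0 - eps)  # guard against log(0)
    return float(-np.sum(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))
```

For example, a maximally uncertain prediction of 0.5 on two tags gives a loss of 2·ln 2 ≈ 1.386, while a perfect prediction gives a loss of essentially zero.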
3. A crime name prediction system, comprising a processing module and a storage module communicatively connected to each other, wherein the storage module is configured to store at least one executable instruction, and the executable instruction causes the processing module to perform the operations corresponding to the crime name prediction method based on label enhancement representation according to any one of claims 1-2.
4. The crime name prediction system according to claim 3, wherein the processing module comprises a case description encoder, a tag feature enhancer and a classifier;
the case description encoder encodes each case description to obtain a case text representation;
the tag feature enhancer maps the crime name tags into a latent semantic space to obtain embedded representations of the crime name tags, and fuses them with the case text representation to obtain the crime name enhanced tag representation;
the classifier fuses the case text representation and the crime name enhanced tag representation, and trains a classification model to perform classification prediction and obtain the prediction result.
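The classifier stage of the three-module system above can be sketched end to end. As a stated simplification, a plain sigmoid scorer stands in for the convolutional classifier of claim 1, mean pooling stands in for whatever pooling the patent actually uses, and every name and shape here is an illustrative assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify(X_f, H, w, b=0.0):
    """Classifier sketch: mean-pool the case text X_f (m, D), concatenate the
    pooled vector with each enhanced tag vector in H (L, D), and score every
    crime name tag independently. Returns per-tag probabilities of shape (L,)."""
    x_pooled = X_f.mean(axis=0)                               # (D,) pooled case text
    feats = np.concatenate(
        [np.tile(x_pooled, (H.shape[0], 1)), H], axis=1)      # (L, 2D) fused features
    return sigmoid(feats @ w + b)                             # (L,) tag probabilities
```

Each output entry is a probability in (0, 1) for one crime name tag, which is the form consumed by the multi-label cross-entropy loss of claim 2.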
CN202210209170.5A 2022-03-04 2022-03-04 Crime name prediction method and system based on label enhancement representation Active CN114781389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210209170.5A CN114781389B (en) 2022-03-04 2022-03-04 Crime name prediction method and system based on label enhancement representation

Publications (2)

Publication Number Publication Date
CN114781389A CN114781389A (en) 2022-07-22
CN114781389B true CN114781389B (en) 2024-04-05

Family

ID=82423775

Country Status (1)

Country Link
CN (1) CN114781389B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115620321B (en) * 2022-10-20 2023-06-23 北京百度网讯科技有限公司 Table identification method and device, electronic equipment and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN110119449A (en) * 2019-05-14 2019-08-13 湖南大学 A kind of criminal case charge prediction technique based on sequence enhancing capsule net network
CN110162787A (en) * 2019-05-05 2019-08-23 西安交通大学 A kind of class prediction method and device based on subject information
CN111582576A (en) * 2020-05-06 2020-08-25 西安交通大学 Prediction system and method based on multi-scale feature fusion and gate control unit
CN111768024A (en) * 2020-05-20 2020-10-13 中国地质大学(武汉) Criminal period prediction method and equipment based on attention mechanism and storage equipment
CN113065347A (en) * 2021-04-26 2021-07-02 上海交通大学 Criminal case judgment prediction method, system and medium based on multitask learning
CN113505937A (en) * 2021-07-26 2021-10-15 江西理工大学 Multi-view encoder-based legal decision prediction system and method

Non-Patent Citations (3)

Title
A Joint Label-Enhanced Representation Based on Pre-trained Model for Charge Prediction; Jingpei Dan et al.; Natural Language Processing and Chinese Computing; 2022-09-24; pp. 694-705 *
Multi-label charge prediction based on semantic differences of words; Wang Jiawei et al.; Journal of Chinese Information Processing; 2019-10-15; Vol. 33, No. 10, pp. 127-134 *
Research on sentencing prediction methods for legal documents; Tan Hongye; Zhang Bowen; Zhang Hu; Li Ru; Journal of Chinese Information Processing; 2020-03-15; No. 03, pp. 107-114 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant