CN111259673A - Feedback sequence multi-task learning-based law decision prediction method and system - Google Patents

Info

Publication number
CN111259673A
CN111259673A
Authority
CN
China
Prior art keywords
vector
criminal
case
task
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010031722.9A
Other languages
Chinese (zh)
Other versions
CN111259673B (en)
Inventor
张春云 (Zhang Chunyun)
崔超然 (Cui Chaoran)
尹义龙 (Yin Yilong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Finance and Economics
Original Assignee
Shandong University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Finance and Economics
Priority to CN202010031722.9A
Publication of CN111259673A
Application granted
Publication of CN111259673B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00: Administration; Management
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/10: Services
    • G06Q 50/18: Legal services
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • Technology Law (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a law decision prediction method and system based on feedback sequence multi-task learning. The method comprises the following steps: learning text feature representations of case descriptions by using a representation-learning-based single-task law prediction method; and taking the information of each subtask's preceding task and the feedback information of its subsequent task as input to the current task, so that both the sequential relationship and the reverse verification relationship among the subtasks are considered, thereby realizing law decision prediction based on feedback sequence multi-task learning. By combining the representation-learning-based single-task method with the feedback-based sequential multi-task learning method, the invention effectively exploits the advantages of both in legal judgment prediction, overcomes the drawback that single-task representation-learning methods make no targeted use of complementary information from other tasks, and improves the accuracy and robustness of judgment prediction results compared with conventional multi-task-learning-based methods.

Description

Feedback sequence multi-task learning-based law decision prediction method and system
Technical Field
The invention belongs to the technical field of judicial judgment prediction, and particularly relates to a law judgment prediction method and system based on feedback sequence multitask learning.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Legal decision prediction aims at predicting the judgment result of a legal case based on its case fact description. It is a core technology of legal assistant systems, and in-depth research on it has important application value and practical significance. On the one hand, legal decision prediction can provide low-cost, high-quality legal consulting services to members of the public who are unfamiliar with legal terminology and complex judgment procedures. On the other hand, it can provide convenient reference material for professionals (such as lawyers and judges) to improve their working efficiency. Currently, legal decision prediction mainly involves three subtasks: related law article prediction, charge prediction, and prison term prediction. All three are currently treated as classification tasks, i.e., classifying the related law articles, charges, and prison terms. Representative approaches include law decision prediction methods based on single-task representation learning and multi-task law decision prediction methods built on several related subtasks.
Law decision prediction methods based on representation learning mainly train on a large number of labeled samples and encode case semantics with a deep neural network, thereby mapping from the symbol space to a vector space; prediction of related law articles, charges, and prison terms is finally realized based on the semantic vector representation of the case description. However, such single-task methods address only one task at a time: they perform classification for a single task based on case description features alone and do not consider the influence of the other tasks on that task.
Multi-task-based law decision prediction methods mainly consider the associations among the subtasks of legal judgment: domain-related information learned by the subtasks is shared and mutually supplemented through a shared representation in the shallow layers, different classification models are trained on task-specific features in the last layer of the model, and several tasks are finally classified in parallel. More specifically, there are dependency relationships (i.e., sequential relationships) and verification relationships (feedback verification) between the individual subtasks of legal decision prediction. Generally, a judge first determines the related law articles according to the case description, then decides the charges based on those articles, and finally decides the corresponding prison term based on the related articles and the determined charges; in turn, through feedback, the predicted charges can verify the related law articles, and the predicted prison term can verify both the related law articles and the charges. However, most current multi-task law decision prediction methods simply apply a multi-task learning classification framework to several related tasks and rarely consider the sequential relationships and the feedback verification relationships among the tasks.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a law decision prediction method based on feedback sequence multi-task learning. It overcomes the drawback of single-task representation-learning methods, which consider only single-task feature representations and can hardly exploit information shared by other related tasks, and at the same time adds sequential relationship information and feedback verification information among the tasks under a multi-task framework.
In order to achieve the above object, one or more embodiments of the present invention provide the following technical solutions:
A law decision prediction method based on feedback sequence multi-task learning comprises the following steps:
learning text feature representations of the case description by using a representation-learning-based single-task law prediction method;
taking the information of each subtask's preceding task and the feedback information of its subsequent task as input to the current task, so that the sequential relationship and the reverse verification relationship among the subtasks are considered, thereby realizing law decision prediction based on feedback sequence multi-task learning.
In a further technical scheme, case descriptions and training data sets of related law articles, charges, and prison terms are obtained from a data center server, and the training data sets are stored in a database;
text feature representation learning is performed on the case descriptions to obtain their vector representations;
text feature representation learning is performed on all law articles, charges, and prison terms to obtain their feature vector representations.
In a further technical scheme, a multi-task pre-training model for law article, charge, and prison term prediction based on case descriptions is constructed, and the pre-classification vectors corresponding to the three subtasks are acquired;
the law article vector pointed to by the pre-classification vector of the law article prediction task is fused with the case description representation vector to obtain a case-article representation vector;
the charge vector corresponding to the charge pointed to by the pre-classification vector of the charge prediction task is fused with the case representation vector to obtain a case-charge representation vector;
the prison term vector corresponding to the prison term pointed to by the pre-classification vector of the prison term prediction task is fused with the case representation vector to obtain a case-term representation vector;
the case-article vector, case-charge vector, and case-term vector are input into a bidirectional long short-term memory neural network to obtain high-level semantic representations of the three vectors;
classifiers for law articles, charges, and prison terms are constructed based on the high-level feature representations of the case-article, case-charge, and case-term vectors;
the high-level feature representations are input into the three classifiers to realize the prediction of law articles, charges, and prison terms.
In a further technical scheme, the multi-task pre-training model for law article, charge, and prison term prediction based on case descriptions is constructed, and the pre-classification vectors corresponding to the three subtasks are acquired as follows:
the obtained case description vectors are input into a multi-task classifier, and the multi-task classification model is trained to pre-classify law articles, charges, and prison terms, yielding a law article classification vector, a charge prediction vector, and a prison term prediction vector.
In a further technical scheme, pre-training is carried out on the case fact description training data set through a BERT model to obtain a language model for the legal prediction task, and vector representations of the D case fact descriptions are obtained.
In a further technical scheme, based on the BERT model, law article vectors, charge description vectors, and prison term description vectors are obtained for the law article contents, charge descriptions, and prison term descriptions by dictionary look-up.
In a further technical scheme, using the case fact descriptions in the training data set and their corresponding law article, charge, and prison term labels, a law article classification vector, a charge prediction vector, and a prison term prediction vector are obtained for each case fact description by a hard-parameter-sharing multi-task learning method.
In a further technical scheme, the gates in LSTM blocks are adopted to encode the sequential relationship between tasks in the multi-task learning and the verification relationship between a subsequent task and the current task:
Step 1: for each d_i in the batch of case fact descriptions D, obtain the law article prediction vector lr_i from the pre-classification result of the multi-task pre-training classification model, take out the law article vector l_j corresponding to the largest element of lr_i, concatenate it with the case fact description vector d_i, and input the result into a fully connected layer to obtain the case-article vector representation dl_i;
Step 2: for each d_i, obtain the charge prediction vector cr_i from the pre-classification result of the multi-task pre-training classification model, take out the charge vectors corresponding to its selected elements, concatenate them with the case fact description vector d_i, and input the result into a fully connected layer to obtain the case-charge vector representation dc_i;
Step 3: for each d_i, obtain the prison term prediction vector pr_i from the pre-classification result of the multi-task pre-training classification model, take out the prison term vector p_i corresponding to the largest element of pr_i, concatenate it with the case fact description vector d_i, and input the result into a fully connected layer to obtain the case-term vector representation dp_i;
Step 4: initialize the forward LSTM module's cell state c_0 and output state h_0; with the case-article vector dl_i as the input vector, compute the module's cell state c_1^f and forward output state h_1^f;
Step 5: with the cell state c_1^f and forward output state h_1^f as the previous-step cell state and input state of the current forward LSTM module, and the case-charge vector dc_i as the input vector, compute the module's cell state c_2^f and forward output state h_2^f;
Step 6: with the cell state c_2^f and forward output state h_2^f as the previous-step cell state and input state of the current forward LSTM module, and the case-term vector dp_i as the input vector, compute the module's cell state c_3^f and forward output state h_3^f;
Step 7: initialize the reverse LSTM module's cell state and output state; with the case-term vector dp_i as the input vector, compute the module's cell state c_1^b and reverse output state h_1^b;
Step 8: with the cell state c_1^b and reverse output state h_1^b as the previous-step cell state and input state of the current reverse LSTM module, and the case-charge vector dc_i as the input vector, compute the module's cell state c_2^b and reverse output state h_2^b;
Step 9: with the cell state c_2^b and reverse output state h_2^b as the previous-step cell state and input state of the current reverse LSTM module, and the case-article vector dl_i as the input vector, compute the module's cell state c_3^b and reverse output state h_3^b;
Step 10: concatenate the forward and reverse output states of each task, i.e., [h_1^f; h_3^b], [h_2^f; h_2^b], and [h_3^f; h_1^b], as the inputs of the law article classifier, charge classifier, and prison term classifier for case description d_i; compute the cross-entropy loss function for the batch and update the parameters;
Step 11: if the iteration limit has not been reached, jump to Step 1.
The invention also discloses a feedback sequence multitask learning-based law decision prediction system, which comprises the following steps:
the text feature representation learning module, which learns text feature representations of the case description by using a representation-learning-based single-task law prediction method;
and the law decision prediction module, which takes the information of each subtask's preceding task and the feedback information of its subsequent task as input to the current task, considers the sequential relationship and the reverse verification relationship among the subtasks, and realizes law decision prediction based on feedback sequence multi-task learning.
The above one or more technical solutions have the following beneficial effects:
the invention expands the law decision prediction considering single subtasks to a multi-task learning method considering sequence relation and reverse verification relation among tasks to realize the prediction of the law decision prediction subtasks, and on one hand, the multi-task learning method is adopted to realize information complementation by utilizing shared information among the subtasks; on the other hand, the information of the prior task and the information of the feedback information of the subsequent task of each subtask are used as the input of the current task, the sequence relation and the reverse verification relation among the subtasks are considered, and the prediction precision of legal judgment prediction is improved better.
The method is based on the combination of the single task representation learning method and the feedback-based sequence multitask learning method, effectively utilizes the advantages of the single task representation learning method and the feedback-based sequence multitask learning method in legal decision prediction, overcomes the defect that the complementary information of other tasks is not utilized based on the single task representation learning method in a targeted manner, and can improve the accuracy and robustness of a decision prediction result compared with the traditional multitask learning-based method.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention without limiting it.
FIG. 1 is a flowchart of a legal decision prediction method based on feedback sequence multitask learning according to an embodiment of the present invention;
FIG. 2 is a diagram of a multi-task pre-training classification model according to an embodiment of the present invention.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, and components, and/or combinations thereof.
The embodiments and features of the embodiments of the invention may be combined with each other without conflict.
The general idea provided by the invention is as follows:
the method is characterized in that the text characteristic representation learning of case description is realized by using a single-task law prediction method based on representation learning, the sequence relation and the reverse verification relation among all subtasks are considered by taking the information of the prior task and the information of the feedback information of the subsequent task of all subtasks as the input of the current task, and finally the law decision prediction based on feedback sequence multi-task learning is realized.
The technical steps are as follows: and realizing sequence relation and reverse verification based on the bidirectional LSTM, wherein the forward LSTM realizes sequence modeling between tasks by taking a prior task and a current task as input, and the reverse LSTM realizes reverse verification of the tasks by taking a subsequent task and the current task as input. This implementation corresponds to the eighth step.
Example one
Referring to fig. 1, the present embodiment discloses a law decision prediction method based on feedback sequence multitask learning, which includes the following specific steps:
the first step is as follows: training data sets of case description and related legal rules, criminals and criminal periods are obtained.
The second step is that: and performing text feature representation learning on the case description to obtain a vector representation of the case description.
The third step: all the law acts, criminals and criminal stages carry out text feature representation learning to obtain feature vector representations of the law acts, the criminals and the criminal stages.
The fourth step: and constructing a multi-task pre-training model based on case description law and law forecast, criminal forecast and criminal forecast, and acquiring corresponding pre-classification vectors of the three subtasks.
The fifth step: and performing feature fusion on the normal vector pointed by the pre-classification vector of the normal prediction task and the case description expression vector to obtain a case-normal expression vector.
And a sixth step: and carrying out feature fusion on the criminal vector corresponding to the criminal pointed by the pre-classification vector of the criminal prediction task and the case representation vector to obtain the case-criminal representation vector.
The seventh step: and carrying out feature fusion on the criminal phase vector corresponding to the criminal phase pointed by the pre-classification vector of the criminal phase prediction task and the case expression vector to obtain the case-criminal phase expression vector.
Eighth step: the case-law vector, the case-criminal vector and the case-criminal vector are used as input and input into a Bidirectional Long Short Term Memory neural network (Bi-LSTM) to obtain high-level semantic representation of the three vectors.
The ninth step: constructing classifiers for law, criminal and criminal periods based on case-law vector, case-criminal vector and case-criminal period vector.
The tenth step: and outputting the forecast results of the law, the criminal act and the criminal period.
In the second step, a representation-learning-based method is adopted to obtain the vector representations d_i, {i = 1, 2, 3 … D}, of the case description texts.
In the third step, a representation-learning-based method is adopted to obtain the vector representations of each law article, charge, and prison term, respectively l_i, {i = 1, 2, 3 … L}, c_i, {i = 1, 2, 3 … C}, and p_i, {i = 1, 2, 3 … P}.
In the fourth step, the case description vectors d_i, {i = 1, 2, 3 … D}, obtained in the second step are input into a multi-task classifier, and the multi-task classification model is trained to pre-classify law articles, charges, and prison terms, yielding the law article classification vectors lr_i, the charge prediction vectors cr_i, and the prison term prediction vectors pr_i, {i = 1, 2, 3 … D}.
In the fifth step, the case description vector d_i obtained in the second step and the law article vector l_i pointed to by the corresponding law article prediction vector lr_i obtained in the fourth step are fused to obtain the case-article representation vector dl_i, {i = 1, 2, 3 … D}.
In the sixth step, the case description vector d_i obtained in the second step and the charge vector c_i pointed to by the corresponding charge prediction vector cr_i obtained in the fourth step are fused to obtain the case-charge representation vector dc_i, {i = 1, 2, 3 … D}.
In the seventh step, the case description vector d_i obtained in the second step and the prison term vector p_i pointed to by the corresponding prison term prediction vector pr_i obtained in the fourth step are fused to obtain the case-term representation vector dp_i, {i = 1, 2, 3 … D}.
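The fusion in the fifth to seventh steps can be sketched as follows. This is a hedged illustration: the dimensions, random weights, and tanh non-linearity are placeholder assumptions, not the patent's trained parameters; only the pattern (argmax selection, concatenation, fully connected layer) follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
D_CASE, D_LABEL, D_FUSED = 8, 5, 6

law_vectors = rng.normal(size=(4, D_LABEL))         # one row per law article
W_fc = rng.normal(size=(D_CASE + D_LABEL, D_FUSED)) # fully connected layer

def fuse(d_i, lr_i):
    """Return the case-article vector dl_i for one case description d_i."""
    l_j = law_vectors[int(np.argmax(lr_i))]   # article pointed to by lr_i
    return np.tanh(np.concatenate([d_i, l_j]) @ W_fc)

lr_i = np.array([0.1, 0.7, 0.1, 0.1])         # pre-classification scores
dl_i = fuse(rng.normal(size=D_CASE), lr_i)
```

The case-charge and case-term fusions follow the same pattern with their own label tables and pre-classification vectors.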
In the eighth step, the case-article, case-charge, and case-term representation vectors obtained in the fifth, sixth, and seventh steps are input in sequence into a Bi-LSTM network for training, yielding the high-level feature representations corresponding to the three vectors, i.e., the concatenated forward and reverse output states for the law article, charge, and prison term tasks.
In the ninth step, the high-level feature representations from the eighth step are input into the three classifiers to realize the prediction of law articles, charges, and prison terms.
In this embodiment, the text feature representation learns:
the feature representation learning of the text means that information such as semantics, syntax and the like of the text is represented in a low-dimensional dense vector space through a modeling method, and then calculation and reasoning are carried out. The representation learning for text features is mainly divided into three granularities: word vector representations, sentence vector representations, and document vector representations.
In this embodiment, the existing BERT model published by Google is mainly used. BERT stands for Bidirectional Encoder Representations from Transformers. The main innovation of the model lies in its pre-training method, namely two tasks, the Masked Language Model and Next Sentence Prediction, which respectively capture feature representations at the word, sentence, and discourse levels. Through the BERT model, pre-training can be carried out on the case fact description training data set to obtain a language model for the legal prediction task, thereby obtaining the vector representations d_i, {i = 1, 2, 3 … D}, of the D case fact descriptions. Meanwhile, the law article contents, charge descriptions, and prison term descriptions in the task are also encoded based on the BERT model, and the law article vectors l_i, {i = 1, 2, 3 … L}, charge description vectors c_i, {i = 1, 2, 3 … C}, and prison term description vectors p_i, {i = 1, 2, 3 … P}, are obtained by dictionary look-up.
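The dictionary look-up idea can be illustrated as follows: each label description is encoded once and its vector stored in a table keyed by name, so prediction-time access is a plain look-up. The `embed()` function below is a deterministic hash-seeded stand-in for a real BERT encoder, an assumption purely for illustration.

```python
import hashlib
import numpy as np

def embed(text, dim=8):
    # deterministic placeholder encoder (NOT BERT): hash the text to seed a RNG
    seed = int(hashlib.md5(text.encode("utf-8")).hexdigest()[:8], 16)
    return np.random.default_rng(seed).normal(size=dim)

# hypothetical label descriptions; real ones would be law article texts, etc.
law_texts = {"law_1": "article text on theft", "law_2": "article text on fraud"}
law_table = {name: embed(desc) for name, desc in law_texts.items()}  # built once

l_1 = law_table["law_1"]   # constant-time retrieval during prediction
```

The same table-building step would be repeated for charge descriptions and prison term descriptions.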
In this embodiment, the classification model is multi-tasked pre-trained:
referring to fig. 2, Multi Task Learning (MTL) is one of the migration learning algorithms, and migration learning can be understood as defining a source domain and a target domain, learning in the source domain, and migrating the learned knowledge to the target domain, so as to improve the learning effect of the target domain. Two multitask learning modes in deep learning: hard sharing and soft sharing of hidden layer parameters. The present project takes a hard sharing mechanism of parameters as an example, but is not limited to a hard sharing method, and the hard sharing method of parameters is usually implemented by sharing a hidden layer among all tasks, while preserving an output layer of several specific tasks.
By describing case facts in a training data set and corresponding law, criminal and criminal period labels thereof, a multi-task learning method of hard parameter sharing is adopted to obtain a law classification vector lr of each key fact descriptioni{ i ═ 1, 2, 3 … D }, criminal prediction vector cri{ i ═ 1, 2, 3 … C } and penalty term prediction vector pri,{i=1,2,3…D}。
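The hard-parameter-sharing pre-training classifier can be sketched as follows: one shared hidden layer feeds three task-specific output heads. This is a minimal illustration under invented layer sizes and random weights, not the patent's actual pre-training model.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID = 8, 6                       # input and shared hidden sizes
N_LAW, N_CHARGE, N_TERM = 4, 3, 5        # class counts of the three subtasks

W_shared = rng.normal(size=(D_IN, D_HID))     # hidden layer shared by all tasks
W_law = rng.normal(size=(D_HID, N_LAW))       # task-specific output heads
W_charge = rng.normal(size=(D_HID, N_CHARGE))
W_term = rng.normal(size=(D_HID, N_TERM))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def pre_classify(d_i):
    """Map one case description vector to the three pre-classification vectors."""
    h = np.tanh(d_i @ W_shared)               # shared representation
    return softmax(h @ W_law), softmax(h @ W_charge), softmax(h @ W_term)

lr_i, cr_i, pr_i = pre_classify(rng.normal(size=D_IN))
```

Since the patent's Step 2 treats the charge task as multi-label (elements over a 0.5 threshold are retained), a per-class sigmoid head would replace the softmax for that task in a fuller implementation.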
In this embodiment, a bidirectional long-short memory module network (Bi-LSTM) modeling task sequence relationships and feedback relationships:
The Long Short-Term Memory network (LSTM) is a recurrent neural network designed to address the vanishing-gradient problem of the ordinary recurrent neural network (RNN). An LSTM is composed of LSTM blocks, each of which contains gates that can remember a value for an indefinite length of time; typically, an LSTM block has three gates: a forget gate, an input gate, and an output gate. LSTM is mainly used to encode context information in a time series. The invention mainly adopts the gates in the LSTM block to encode the sequence relationships between tasks in the multi-task learning and the verification relationships between subsequent tasks and the current task. The specific learning process is as follows:
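A minimal numpy sketch of one LSTM block and its three gates (untrained toy weights and dimensions are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W):
    """One LSTM block step with its three gates."""
    z = np.concatenate([x, h_prev])
    f = sigmoid(z @ W["f"])      # forget gate: how much old cell state to keep
    i = sigmoid(z @ W["i"])      # input gate: how much new candidate to write
    o = sigmoid(z @ W["o"])      # output gate: how much cell state to expose
    g = np.tanh(z @ W["g"])      # candidate value
    c = f * c_prev + i * g       # gated cell state: can carry a value across
    h = o * np.tanh(c)           # an indefinite number of steps
    return h, c

rng = np.random.default_rng(0)
in_dim, hid = 4, 3
W = {k: rng.standard_normal((in_dim + hid, hid)) * 0.1
     for k in ("f", "i", "o", "g")}
h, c = np.zeros(hid), np.zeros(hid)
for t in range(5):               # unroll over a short input sequence
    h, c = lstm_cell(rng.standard_normal(in_dim), h, c, W)
print(h.shape, c.shape)
```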
Step 1: take each d_i of a batch of m case-fact descriptions from D, obtain the law article prediction result vector lr_i from the pre-classification result of the multi-task pre-trained classification model, extract the law article vector l_j corresponding to the element with the maximum value in that vector, splice it with the case-fact vector d_i, and input the result to a fully connected layer to obtain the case-law article vector representation dl_i.
Step 2: for each d_i, obtain the charge prediction result vector cr_i from the pre-classification result of the multi-task pre-trained classification model, take out the charge vectors c_j (j = 1, 2, … C) corresponding to the elements whose values exceed 0.5, splice these vectors with the case-fact description vector d_i, and input the result to a fully connected layer to obtain the case-charge vector representation dc_i.
Step 3: for each d_i, obtain the prison-term prediction result vector pr_i from the pre-classification result of the multi-task pre-trained classification model, extract the prison-term vector p_j corresponding to the element with the maximum value in that vector, splice it with the case-fact vector d_i, and input the result to a fully connected layer to obtain the case-prison-term vector representation dp_i.
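Steps 1 through 3 share one fusion pattern: select label vectors according to the pre-classification scores, concatenate them with the case-fact vector, and project through a fully connected layer. A toy numpy sketch (untrained weights; mean-pooling the several selected charge vectors before fusion is an assumption, since the text only says the vectors are spliced):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8
d_i = rng.standard_normal(dim)                 # case-fact vector d_i
l = rng.standard_normal((5, dim))              # law article vectors l_j
c = rng.standard_normal((3, dim))              # charge vectors c_j
p = rng.standard_normal((4, dim))              # prison-term vectors p_j

lr_i = np.array([0.1, 0.6, 0.1, 0.1, 0.1])     # pre-classification scores (law)
cr_i = np.array([0.8, 0.2, 0.7])               # pre-classification scores (charges)
pr_i = np.array([0.2, 0.1, 0.6, 0.1])          # pre-classification scores (term)

def fuse(d, label_vec, seed):
    # splice and project through a fully connected layer (tanh, toy weights)
    w_rng = np.random.default_rng(seed)
    W = w_rng.standard_normal((2 * dim, dim)) * 0.1
    return np.tanh(np.concatenate([d, label_vec]) @ W)

dl_i = fuse(d_i, l[lr_i.argmax()], seed=10)        # Step 1: argmax law article
chosen = c[cr_i > 0.5]                             # Step 2: charges above 0.5
dc_i = fuse(d_i, chosen.mean(axis=0), seed=11)     # (mean-pooled: an assumption)
dp_i = fuse(d_i, p[pr_i.argmax()], seed=12)        # Step 3: argmax prison term
print(dl_i.shape, dc_i.shape, dp_i.shape)
```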
Step 4: randomly initialize the initial cell state c_0 and output state h_0 of the forward LSTM module, take the case-law article vector dl_i as the input vector, and compute the module's cell state c_1^f and forward output state h_1^f.
Step 5: taking the cell state c_1^f and forward output state h_1^f as the previous-moment input cell state and input state of the current forward LSTM module, take the case-charge vector dc_i as the input vector and compute the module's cell state c_2^f and forward output state h_2^f.
Step 6: taking the cell state c_2^f and forward output state h_2^f as the previous-moment input cell state and input state of the current forward LSTM module, take the case-prison-term vector dp_i as the input vector and compute the module's cell state c_3^f and forward output state h_3^f.
Step 7: taking the cell state c_3^f and forward output state h_3^f as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-prison-term vector dp_i as the input vector and compute the module's cell state c_1^b and reverse output state h_1^b.
Step 8: taking the cell state c_1^b and reverse output state h_1^b as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-charge vector dc_i as the input vector and compute the module's cell state c_2^b and reverse output state h_2^b.
Step 9: taking the cell state c_2^b and reverse output state h_2^b as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-law article vector dl_i as the input vector and compute the module's cell state c_3^b and reverse output state h_3^b.
Step 10: splice the forward and reverse output states to obtain [h_1^f; h_3^b], [h_2^f; h_2^b], and [h_3^f; h_1^b], which serve, for case description d_i, as the inputs of the law article classifier, the charge classifier, and the prison-term classifier respectively; compute the corresponding cross-entropy loss functions for the batch input and update the parameters.
Step 11: and if the iteration times are not limited, jumping to Step 1.
Example two
This embodiment provides a computing device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method steps of the feedback-sequence multi-task-learning-based legal decision prediction method of the first embodiment.
Example three
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the method steps of the legal decision prediction method based on feedback-sequence multi-task learning of the first embodiment.
Example four
The invention also discloses a legal decision prediction system based on feedback-sequence multi-task learning, which comprises:
a text feature representation learning module, which realizes text feature representation learning of case descriptions by using a single-task legal prediction method based on representation learning;
and a legal decision prediction module, which takes the prior-task information and the subsequent-task feedback information of each subtask as the input of the current task, considers the sequential relationships and the reverse verification relationships among the subtasks, and realizes legal decision prediction based on feedback-sequence multi-task learning.
The steps involved in the apparatuses of the above second, third and fourth embodiments correspond to the first embodiment of the method, and the detailed description thereof can be found in the relevant description of the first embodiment. The term "computer-readable storage medium" should be taken to include a single medium or multiple media containing one or more sets of instructions; it should also be understood to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by a processor and that cause the processor to perform any of the methods of the present invention.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented using general purpose computing apparatus, or alternatively, they may be implemented using program code executable by computing apparatus, whereby the modules or steps may be stored in a memory device and executed by computing apparatus, or separately fabricated into individual integrated circuit modules, or multiple modules or steps thereof may be fabricated into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (10)

1. A legal decision prediction method based on feedback-sequence multi-task learning, characterized by comprising:
realizing text feature representation learning of case descriptions by using a single-task legal prediction method based on representation learning;
taking the prior-task information and the subsequent-task feedback information of each subtask as the input of the current task, considering the sequential relationships and the reverse verification relationships among the subtasks, and realizing legal decision prediction based on feedback-sequence multi-task learning.
2. The legal decision prediction method based on feedback-sequence multi-task learning as claimed in claim 1, characterized in that a training data set of case descriptions and their related law articles, charges and prison terms is obtained from the data center server, and the training data set is stored in a database;
text feature representation learning is performed on the case descriptions to obtain vector representations of the case descriptions;
text feature representation learning is performed on all law articles, charges and prison terms to obtain feature vector representations of the law articles, charges and prison terms.
3. The legal decision prediction method based on feedback-sequence multi-task learning as claimed in claim 1, characterized in that a multi-task pre-training model for law article prediction, charge prediction and prison-term prediction based on case descriptions is constructed, and the pre-classification vectors corresponding to the three subtasks are obtained;
feature fusion is performed on the law article vector pointed to by the pre-classification vector of the law article prediction task and the case description representation vector to obtain a case-law article representation vector;
feature fusion is performed on the charge vector corresponding to the charge pointed to by the pre-classification vector of the charge prediction task and the case representation vector to obtain a case-charge representation vector;
feature fusion is performed on the prison-term vector corresponding to the prison term pointed to by the pre-classification vector of the prison-term prediction task and the case representation vector to obtain a case-prison-term representation vector;
the case-law article vector, the case-charge vector and the case-prison-term vector are input into a bidirectional long short-term memory neural network to obtain high-level semantic representations of the three vectors;
classifiers for law articles, charges and prison terms are constructed based on the high-level feature representations of the case-law article vector, the case-charge vector and the case-prison-term vector;
and the high-level feature representations are input into the three classifiers to realize the prediction of the law article, the charge and the prison term.
4. The legal decision prediction method based on feedback-sequence multi-task learning as claimed in claim 3, characterized in that the multi-task pre-training model for law article prediction, charge prediction and prison-term prediction based on case descriptions is constructed, and the pre-classification vectors corresponding to the three subtasks are obtained, as follows:
the obtained case description vectors are input into a multi-task classifier, and the law articles, charges and prison terms are pre-classified by training the multi-task classification model to obtain a law article classification vector, a charge prediction vector and a prison-term prediction vector.
5. The method as claimed in claim 3, characterized in that a language model for the legal prediction task is obtained by pre-training on the case-fact training data set through a BERT model, so as to obtain the vector representations of the D case-fact descriptions;
and a law article vector, a charge description vector and a prison-term description vector are obtained by way of dictionary lookup for the law article contents, charge descriptions and prison-term descriptions based on the BERT model.
6. The method as claimed in claim 3, characterized in that the case-fact descriptions in the training data set and their corresponding law article, charge and prison-term labels are used to obtain the law article classification vector, charge prediction vector and prison-term prediction vector of each case-fact description by a multi-task learning method with hard parameter sharing.
7. The method of claim 1, characterized in that gates in LSTM blocks are used to encode the sequence relationships between tasks in the multi-task learning and the verification relationships between subsequent tasks and the current task:
Step 1: for each d_i of the batch of case-fact descriptions D, obtain the law article prediction result vector lr_i from the pre-classification result of the multi-task pre-trained classification model, extract the law article vector l_j corresponding to the element with the maximum value in that vector, splice it with the case-fact vector d_i, and input the result to a fully connected layer to obtain the case-law article vector representation dl_i;
Step 2: for each d_i, obtain the charge prediction result vector cr_i from the pre-classification result of the multi-task pre-trained classification model, take out the charge vectors corresponding to the selected elements in that vector, splice them with the case-fact description vector d_i, and input the result to a fully connected layer to obtain the case-charge vector representation dc_i;
Step 3: for each d_i, obtain the prison-term prediction result vector pr_i from the pre-classification result of the multi-task pre-trained classification model, extract the prison-term vector p_j corresponding to the element with the maximum value in that vector, splice it with the case-fact vector d_i, and input the result to a fully connected layer to obtain the case-prison-term vector representation dp_i;
Step 4: initialize the cell state c_0 and output state h_0 of the forward LSTM module, take the case-law article vector dl_i as the input vector, and compute the module's cell state c_1^f and forward output state h_1^f;
Step 5: taking the cell state c_1^f and forward output state h_1^f as the previous-moment input cell state and input state of the current forward LSTM module, take the case-charge vector dc_i as the input vector and compute the module's cell state c_2^f and forward output state h_2^f;
Step 6: taking the cell state c_2^f and forward output state h_2^f as the previous-moment input cell state and input state of the current forward LSTM module, take the case-prison-term vector dp_i as the input vector and compute the module's cell state c_3^f and forward output state h_3^f;
Step 7: taking the cell state c_3^f and forward output state h_3^f as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-prison-term vector dp_i as the input vector and compute the module's cell state c_1^b and reverse output state h_1^b;
Step 8: taking the cell state c_1^b and reverse output state h_1^b as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-charge vector dc_i as the input vector and compute the module's cell state c_2^b and reverse output state h_2^b;
Step 9: taking the cell state c_2^b and reverse output state h_2^b as the previous-moment input cell state and input state of the current reverse LSTM module, take the case-law article vector dl_i as the input vector and compute the module's cell state c_3^b and reverse output state h_3^b;
Step 10: splice the forward and reverse output states to obtain [h_1^f; h_3^b], [h_2^f; h_2^b] and [h_3^f; h_1^b], which serve, for case description d_i, as the inputs of the law article classifier, the charge classifier and the prison-term classifier respectively; compute the corresponding cross-entropy loss functions for the batch input and update the parameters;
Step 11: if the number of iterations is less than the limit, jump to Step 1.
8. A legal decision prediction system based on feedback-sequence multi-task learning, characterized by comprising:
a text feature representation learning module, which realizes text feature representation learning of case descriptions by using a single-task legal prediction method based on representation learning;
and a legal decision prediction module, which takes the prior-task information and the subsequent-task feedback information of each subtask as the input of the current task, considers the sequential relationships and the reverse verification relationships among the subtasks, and realizes legal decision prediction based on feedback-sequence multi-task learning.
9. A computing device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method steps of any of claims 1-7 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method steps of a method for legal decision prediction based on feedback sequence multitask learning according to any one of claims 1-7.
CN202010031722.9A 2020-01-13 2020-01-13 Legal decision prediction method and system based on feedback sequence multitask learning Active CN111259673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010031722.9A CN111259673B (en) 2020-01-13 2020-01-13 Legal decision prediction method and system based on feedback sequence multitask learning


Publications (2)

Publication Number Publication Date
CN111259673A true CN111259673A (en) 2020-06-09
CN111259673B CN111259673B (en) 2023-05-09

Family

ID=70945221


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112015659A (en) * 2020-09-02 2020-12-01 三维通信股份有限公司 Prediction method and device based on network model
CN112131370A (en) * 2020-11-23 2020-12-25 四川大学 Question-answer model construction method and system, question-answer method and device and trial system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229582A (en) * 2018-02-01 2018-06-29 浙江大学 Entity recognition dual training method is named in a kind of multitask towards medical domain
CN109241528A (en) * 2018-08-24 2019-01-18 讯飞智元信息科技有限公司 A kind of measurement of penalty prediction of result method, apparatus, equipment and storage medium
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109376227A (en) * 2018-10-29 2019-02-22 山东大学 A kind of prison term prediction technique based on multitask artificial neural network
CN109829055A (en) * 2019-02-22 2019-05-31 苏州大学 User's law article prediction technique based on filtering door machine
CN109919175A (en) * 2019-01-16 2019-06-21 浙江大学 A kind of more classification methods of entity of combination attribute information


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WENMIAN YANG et al.: "Legal Judgment Prediction via Multi-Perspective Bi-Feedback Network", arXiv *
LIU Zonglin et al.: "A multi-task learning model for legal judgment prediction incorporating charge keywords", Journal of Tsinghua University (Science and Technology) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant