CN112614024B

CN112614024B - Legal intelligent recommendation method, system, device and medium based on case facts

Info

Publication number: CN112614024B
Application number: CN202011609552.4A
Authority: CN
Inventors: Name withheld on request; 翁洋; 李鑫; 王竹
Original assignee: Chengdu Shuzhilian Technology Co Ltd
Current assignee: Chengdu Shuzhilian Technology Co Ltd
Priority date: 2020-12-30
Filing date: 2020-12-30
Publication date: 2024-03-08
Anticipated expiration: 2040-12-30
Also published as: CN112614024A

Abstract

The invention discloses a legal intelligent recommendation method, system, device and medium based on case facts. The method comprises the following steps: constructing a training data set; training a law article recommendation model A with the training data set to obtain a trained law article recommendation model B; obtaining input data in the format: preset case facts - specific judicial interpretation of a law article corresponding to the preset case facts; and inputting the input data into the law article recommendation model B, which outputs N recommended law articles, wherein the N recommended law articles are the top N of all law articles corresponding to the preset case facts when arranged in descending order of a preset matching degree, and the preset matching degree is the matching degree between a law article corresponding to the preset case facts and the preset case facts. The method makes law article recommendation based on case facts more accurate and assists judges more effectively.

Description

Legal intelligent recommendation method, system, device and medium based on case facts

Technical Field

The invention relates to the field of natural language processing, in particular to a legal intelligent recommendation method, a system, a device and a medium based on case facts.

Background

At present, law article recommendation based on case facts in the judicial field has two main problems. First, the interaction between the case facts and the law articles is insufficient: existing systems encode the case facts and the law articles independently and only let them interact through an attention mechanism applied on top of those encodings. Second, the systems overfit: existing methods train the model with softmax, so the similarity between case facts and law articles easily falls into an overfitting trap. From the standpoint of legal interpretation, case facts and law articles do need to be mapped precisely one to one; from the standpoint of natural language, however, case facts and a law article are not exactly or completely matched but only match to a certain degree. As a result, current recommendations of law articles for case facts are inaccurate.

Disclosure of Invention

The invention aims to enable the result of legal recommendation based on case facts to be more accurate and assist the judges more effectively.

In order to achieve the above purpose, the invention provides a legal intelligent recommendation method based on case facts, which comprises the following steps:

constructing a training data set, wherein the training data set comprises a plurality of training data, and the format of each training data is: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K, where K is an integer greater than 1;

training a law article recommendation model A with the training data set to obtain a trained law article recommendation model B;

obtaining input data, wherein the format of the input data is: preset case facts - specific judicial interpretation of a law article corresponding to the preset case facts;

inputting the input data into the law article recommendation model B, wherein the law article recommendation model B outputs N recommended law articles, the N recommended law articles are the top N of all law articles corresponding to the preset case facts when arranged in descending order of a preset matching degree, the preset matching degree is the matching degree between a law article corresponding to the preset case facts and the preset case facts, and N is an integer greater than or equal to 1.

The principle of the invention is as follows. First, from the perspective of a judge's adjudication, case facts and law articles stand in a matching relation. Second, a BERT-based Siamese network model, the defined loss function and the constructed training data set allow the model to learn the mutual semantic relation (matching degree) between case facts and law articles. The self-attention mechanism of BERT lets the case facts and the law articles interact fully at the semantic level, and the triplet loss effectively reduces overfitting during training, so the recommendations of the final model are more accurate.

Preferably, constructing the training data set in the method specifically includes:

extracting case facts a and a law article b corresponding to the case facts a from a judgment document;

extracting the judicial interpretation d corresponding to the law article b from a law article library;

generating, based on the judicial interpretation d, the top K similar law articles arranged in descending order of similarity to the law article b;

constructing data e in the format: case facts a, specific judicial interpretation of the matching law article, specific judicial interpretation of similar law article 1, specific judicial interpretation of similar law article 2, …, specific judicial interpretation of similar law article K;

converting the data e into training data f in the format: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K; wherein the pair case facts - specific judicial interpretation of the matching law article is the matching data, and the pairs case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K are the non-matching data.
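The conversion from data e to training data f can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `to_training_data`, the tuple layout and the sample strings are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# One pair: (case facts, specific judicial interpretation of a law article).
Pair = Tuple[str, str]


def to_training_data(case_facts: str,
                     matching_interp: str,
                     similar_interps: List[str]) -> Tuple[Pair, List[Pair]]:
    """Convert one record e into training data f (hypothetical helper).

    Returns the single matching pair and the K non-matching pairs,
    all built from the same case facts.
    """
    if len(similar_interps) < 2:
        raise ValueError("K must be an integer greater than 1")
    matching = (case_facts, matching_interp)
    non_matching = [(case_facts, interp) for interp in similar_interps]
    return matching, non_matching


# Placeholder strings, K = 2.
matching, non_matching = to_training_data(
    "case facts a",
    "judicial interpretation d of law article b",
    ["judicial interpretation of similar law article 1",
     "judicial interpretation of similar law article 2"],
)
```

Keeping the matching pair separate from the list of non-matching pairs makes it straightforward to feed one of each into the two branches of the model during training.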

Preferably, the law article recommendation model in the method includes:

an input layer, a BERT layer, an FC layer and an output layer; wherein the input layer includes a first input sublayer and a second input sublayer, the BERT layer includes a first BERT module and a second BERT module, and the FC layer includes a first FC sublayer and a second FC sublayer;

when the law article recommendation model is trained, the input data of the first input sublayer is the matching data in the training data, and the input data of the second input sublayer is the non-matching data in the training data;

the input data of the first input sublayer is fed to the first BERT module, the output of the first BERT module is fed to the first FC sublayer, the output of the first FC sublayer is fed to the output layer, and the first FC sublayer outputs the matching degree between the matching law article and the case facts;

the input data of the second input sublayer is fed to the second BERT module, the output of the second BERT module is fed to the second FC sublayer, the output of the second FC sublayer is fed to the output layer, and the second FC sublayer outputs the matching degree between the non-matching law article and the case facts.

Preferably, in the method, the first BERT module and the second BERT module of the legal recommendation model share weight parameters, and the first FC sublayer and the second FC sublayer share weight parameters.

Preferably, in the method, a triplet loss function is set in the output layer of the law article recommendation model: triplet loss = max(0, margin + p_neg - p_pos), where margin is a hyperparameter to be tuned, p_neg is the matching degree between the non-matching law article and the case facts, and p_pos is the matching degree between the matching law article and the case facts.
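A short worked example may help to read the loss formula. The sketch below evaluates max(0, margin + p_neg - p_pos) for two cases; the margin of 0.3 and the matching degrees are illustrative values chosen here, not values given in the patent.

```python
def triplet_loss(p_pos: float, p_neg: float, margin: float) -> float:
    """Evaluate max(0, margin + p_neg - p_pos) for one matching/non-matching pair."""
    return max(0.0, margin + p_neg - p_pos)


# The matching law article already wins by more than the margin: no penalty.
well_separated = triplet_loss(p_pos=0.9, p_neg=0.4, margin=0.3)

# The gap (0.1) is smaller than the margin (0.3): the loss is positive.
too_close = triplet_loss(p_pos=0.6, p_neg=0.5, margin=0.3)
```

Once p_pos exceeds p_neg by at least the margin, the loss is zero and the pair no longer pushes the scores further apart.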

The invention also provides a legal intelligent recommendation system based on case facts, which comprises:

a construction unit, configured to construct a training data set, wherein the training data set comprises a plurality of training data, and the format of each training data is: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K, where K is an integer greater than 1;

a training unit, configured to train a law article recommendation model A with the training data set to obtain a trained law article recommendation model B;

an input data obtaining unit, configured to obtain input data, wherein the format of the input data is: preset case facts - specific judicial interpretation of a law article corresponding to the preset case facts;

a law article recommendation model B, configured to process the input data input into the law article recommendation model B and to output N recommended law articles, wherein the N recommended law articles are the top N of all law articles corresponding to the preset case facts when arranged in descending order of a preset matching degree, the preset matching degree is the matching degree between a law article corresponding to the preset case facts and the preset case facts, and N is an integer greater than or equal to 1.

Preferably, the construction unit in the system specifically comprises:

a first extraction subunit, configured to extract case facts a and a law article b corresponding to the case facts a from a judgment document;

a second extraction subunit, configured to extract the judicial interpretation d corresponding to the law article b from a law article library;

a generation subunit, configured to generate, based on the judicial interpretation d, the top K similar law articles arranged in descending order of similarity to the law article b;

a construction subunit, configured to construct data e in the format: case facts a, specific judicial interpretation of the matching law article, specific judicial interpretation of similar law article 1, specific judicial interpretation of similar law article 2, …, specific judicial interpretation of similar law article K;

a conversion subunit, configured to convert the data e into training data f in the format: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K; wherein the pair case facts - specific judicial interpretation of the matching law article is the matching data, and the pairs case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K are the non-matching data.

Preferably, the legal recommendation model in the system comprises:

Preferably, the first BERT module and the second BERT module of the legal recommendation model in the system share weight parameters, and the first FC sublayer and the second FC sublayer share weight parameters.

Preferably, a triplet loss function is set in the output layer of the law article recommendation model in the system: triplet loss = max(0, margin + p_neg - p_pos), where margin is a hyperparameter to be tuned, p_neg is the matching degree between the non-matching law article and the case facts, and p_pos is the matching degree between the matching law article and the case facts.

The invention also provides a legal intelligent recommending device based on the case facts, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the legal intelligent recommending method based on the case facts.

The invention also provides a computer readable storage medium storing a computer program which when executed by a processor implements the steps of the case fact-based legal intelligent recommendation method.

The one or more technical schemes provided by the invention have at least the following technical effects or advantages:

The invention realizes intelligent recommendation of law articles based on case facts. It uses a BERT-based Siamese network as the model, mainly so that the self-attention mechanism of BERT lets the case facts and the law articles attend to each other fully, and it uses a triplet loss to resist the overfitting trap. The triplet loss only requires the model to give the positive sample a matching degree that exceeds that of the negative sample by the margin, so a certain gray zone of matching is preserved and overfitting is prevented.

Drawings

The accompanying drawings, which are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention;

FIG. 1 is a flow chart of a legal intelligent recommendation method based on case facts;

FIG. 2 is a schematic diagram of the structure of a training model in the present invention;

FIG. 3 is a schematic diagram of the structure of a predictive model in the present invention;

FIG. 4 is a schematic diagram of the composition of the legal intelligent recommendation system based on case facts in the present invention.

Detailed Description

In order that the above objects, features and advantages of the present invention may be understood more clearly, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The embodiments of the present invention and the features in the embodiments may be combined with each other where they do not conflict.

In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention; however, the present invention may also be practiced in ways other than those described here, and the scope of the invention is therefore not limited to the specific embodiments disclosed below.

It will be understood that the terms "a" and "an" should be interpreted as referring to "at least one" or "one or more," i.e., in one embodiment, the number of elements may be one, while in another embodiment, the number of elements may be plural, and the term "a" should not be interpreted as limiting the number.

Example 1

Fig. 1 is a flow chart of a legal intelligent recommendation method based on case facts, and the first embodiment of the invention provides a legal intelligent recommendation method based on case facts, which comprises the following steps:

constructing a training data set, wherein the training data set comprises a plurality of training data, and the format of each training data is: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K; K is an integer greater than 1, and the number K of similar law articles can be adjusted according to actual needs and is not specifically limited by the invention;

The method uses a BERT-based Siamese network as the model architecture, trains it with a triplet loss as the loss function, and then uses the network to predict and recommend law articles.

The present invention will be described in detail below with respect to the construction, training and prediction of datasets.

Constructing the training data set. For a given case, all case facts and the corresponding law articles are extracted from the published judgment document based on rules (the rules may be preset or adjusted flexibly according to actual needs, and the extraction may be automatic or manual; the invention does not specifically limit either, and the training data are manually annotated by legal professionals). The judicial interpretation corresponding to each law article is extracted from a law article library, which maps the name of a law article to its judicial interpretation, for example: Article 16 of the Property Law of the People's Republic of China [effect of the real estate register and its administering body]: the real estate register is the basis for the ownership and content of property rights. Based on the specific judicial interpretations of the law articles, keyword and semantic similarity techniques are used to generate, for each law article, its top K most similar other law articles (similar law articles are those extracted as non-matching law articles according to the keyword overlap between judicial interpretations, the edit distance between judicial interpretations, sent2vec semantic encoding and similar techniques). Data can thus be constructed in the form (case facts, specific judicial interpretation of the matching law article, specific judicial interpretation of similar law article 1, specific judicial interpretation of similar law article 2, …, specific judicial interpretation of similar law article K). This data is further converted into (case facts, specific judicial interpretation of the matching law article), (case facts, specific judicial interpretation of similar law article 1), …, (case facts, specific judicial interpretation of similar law article K). The first pair is called the matching data and the following K pairs are called the non-matching data; the whole group describes the same case facts.
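The selection of the top K similar law articles could look roughly like the following sketch. It is an assumption-laden illustration, not the patented procedure: keyword overlap is approximated by Jaccard overlap of whitespace tokens, `difflib.SequenceMatcher` stands in for an edit-distance-based similarity, the sent2vec signal is omitted, and the equal weighting, the function names and the toy library entries are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def keyword_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase whitespace tokens (a crude keyword proxy)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def similarity(a: str, b: str) -> float:
    """Combine the two signals; equal weights are an arbitrary choice here."""
    char_similarity = SequenceMatcher(None, a, b).ratio()
    return 0.5 * keyword_overlap(a, b) + 0.5 * char_similarity


def top_k_similar(target: str, library: Dict[str, str], k: int) -> List[str]:
    """Names of the k law articles most similar to `target`, excluding itself."""
    target_interp = library[target]
    scored = [(similarity(target_interp, interp), name)
              for name, interp in library.items() if name != target]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:k]]


# Toy law article library: name -> judicial interpretation (made-up entries).
library = {
    "article_A": "the real estate register is the basis of property rights",
    "article_B": "the real estate register is managed by the registration body",
    "article_C": "a contract is formed when acceptance takes effect",
    "article_D": "the real estate register records property rights",
}
negatives = top_k_similar("article_A", library, k=2)
```

The returned law articles serve as the non-matching (negative) side of the training pairs: they are close to the matching law article in wording, which makes them harder negatives than randomly drawn ones.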

Training the model. The model structure is shown in FIG. 2. The law article recommendation model comprises an input layer, a BERT layer, an FC layer and an output layer; the input layer includes a first input sublayer and a second input sublayer, the BERT layer includes a first BERT module and a second BERT module, and the FC layer includes a first FC sublayer and a second FC sublayer.

The law article recommendation model is a BERT-based Siamese network trained with a triplet loss as the loss function. As described above, the same case facts yield several training data, each in the format ((case facts, specific judicial interpretation of the matching law article), (case facts, specific judicial interpretation of one similar law article)). The main purpose of feeding each branch a (case facts, judicial interpretation) pair is to make full use of the self-attention mechanism of BERT, so that the case facts and the law article text attend to each other fully. The BERT layer shares weight parameters between the two branches; the [CLS] output of BERT is fed to an FC (fully connected) layer, and the FC layer also shares weight parameters. The FC layer outputs a number between 0 and 1 that indicates how well the case facts match the judicial interpretation of the law article. The outputs p_pos and p_neg of the law article recommendation model represent, for the same case facts, the matching degree of the matching law article and of the non-matching law article respectively; they are finally fed to the triplet loss function max(0, margin + p_neg - p_pos), where margin is a hyperparameter to be tuned.
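The weight sharing between the two branches can be illustrated without a deep learning framework. The sketch below is deliberately a toy: `toy_encode` is a hypothetical stand-in for the BERT [CLS] vector, the sigmoid output is an assumed way to obtain a value between 0 and 1, and the weights, margin and sample sentences are made up. The point it shows is only that one and the same parameter set scores both the matching and the non-matching input before the triplet loss is applied.

```python
import math
from typing import List, Sequence


def toy_encode(case_facts: str, interpretation: str) -> List[float]:
    """Hypothetical stand-in for the BERT [CLS] vector of the joint input."""
    a = set(case_facts.lower().split())
    b = set(interpretation.lower().split())
    union = a | b
    overlap = len(a & b) / len(union) if union else 0.0
    return [overlap, 1.0 - overlap]


class SharedBranch:
    """One encoder + FC pair; reusing the same instance for both inputs
    is what makes the two branches share their weight parameters."""

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self.weights = list(weights)
        self.bias = bias

    def __call__(self, case_facts: str, interpretation: str) -> float:
        features = toy_encode(case_facts, interpretation)
        z = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))  # sigmoid keeps the output in (0, 1)


branch = SharedBranch(weights=[4.0, -4.0], bias=0.0)  # made-up parameters
facts = "the buyer relied on the real estate register"
margin = 0.3  # illustrative hyperparameter value

# Both scores come from the same `branch`, i.e. from shared weights.
p_pos = branch(facts, "the real estate register is the basis of property rights")
p_neg = branch(facts, "a contract is formed when acceptance takes effect")
loss = max(0.0, margin + p_neg - p_pos)
```

In a real implementation the toy encoder would be replaced by a BERT encoder whose parameters are updated by backpropagating this loss; that part is outside the scope of the sketch.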

Model prediction. Model prediction, i.e., law article recommendation, is performed as shown in FIG. 3. The parameters of the BERT layer and the FC layer of the prediction model are those obtained from the training model. For each law article in the law article set, a prediction model input (case facts, specific judicial interpretation of the law article) is constructed, and the N law articles with the highest matching degrees output by the prediction model are taken as the recommended law articles.
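The prediction step amounts to scoring every law article in the set against the case facts and keeping the N best. A minimal sketch follows, assuming a scoring callable with the signature `score(case_facts, interpretation) -> float`; the function `recommend`, the lookup-table scorer and the sample data are hypothetical placeholders for the trained model.

```python
from typing import Callable, Dict, List, Tuple


def recommend(case_facts: str,
              law_library: Dict[str, str],
              score: Callable[[str, str], float],
              n: int) -> List[Tuple[str, float]]:
    """Score each (case facts, judicial interpretation) input and return
    the n law articles with the highest matching degree, best first."""
    if n < 1:
        raise ValueError("N must be an integer greater than or equal to 1")
    scored = [(name, score(case_facts, interp))
              for name, interp in law_library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]


# Placeholder scorer: a lookup table instead of the trained model.
fake_scores = {"interp 1": 0.20, "interp 2": 0.85, "interp 3": 0.60}


def fake_score(case_facts: str, interpretation: str) -> float:
    return fake_scores[interpretation]


law_library = {"law 1": "interp 1", "law 2": "interp 2", "law 3": "interp 3"}
top = recommend("some case facts", law_library, fake_score, n=2)
```

Because every law article is scored independently, the cost of one recommendation grows linearly with the size of the law article set.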

Example 2

As shown in fig. 4, a second embodiment of the present invention provides a legal intelligent recommendation system based on case facts, where the system further includes:

In a second embodiment of the present invention, the construction unit in the system specifically includes:

In a second embodiment of the present invention, the legal recommendation model in the system includes:

In the second embodiment of the present invention, a triplet loss function is set in the output layer of the law article recommendation model in the system: triplet loss = max(0, margin + p_neg - p_pos), where margin is a hyperparameter to be tuned, p_neg is the matching degree between the non-matching law article and the case facts, and p_pos is the matching degree between the matching law article and the case facts.

Example 3

The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.

The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the legal intelligent recommendation device based on case facts of the invention by running or executing the computer program and/or modules stored in the memory and using the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required for at least one function (such as a sound playing function or an image playing function). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory such as a hard disk, internal memory, a plug-in hard disk, a smart media card, a secure digital card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

Example 4

If the legal intelligent recommendation device based on case facts is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program stored in a computer-readable storage medium; when the computer program is executed by a processor, it implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory, a random access memory, an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. The legal intelligent recommendation method based on the case facts is characterized by comprising the following steps of:

inputting input data into a law recommendation model B, wherein the law recommendation model B outputs N recommended laws, the N recommended laws are top N laws which are arranged in descending order according to a preset matching degree in all laws corresponding to preset case facts, the preset matching degree is the matching degree of the laws corresponding to the preset case facts and the preset case facts, and N is an integer greater than or equal to 1;

constructing the training data set specifically comprises:

converting the data e into training data f in the format: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K; wherein the pair case facts - specific judicial interpretation of the matching law article is the matching data, and the pairs case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K are the non-matching data;

the legal recommendation model includes:

2. The case fact-based legal intelligent recommendation method according to claim 1, wherein the first BERT module and the second BERT module share weight parameters, and the first FC sublayer and the second FC sublayer share weight parameters.

3. The case fact-based legal intelligent recommendation method according to claim 1, wherein a triplet loss function is set in the output layer: triplet loss = max(0, margin + p_neg - p_pos), where margin is a hyperparameter to be tuned, p_neg is the matching degree between the non-matching law article and the case facts, and p_pos is the matching degree between the matching law article and the case facts.

4. Legal intelligent recommendation system based on case facts, characterized in that the system further comprises:

a law article recommendation model B, configured to process the input data input into the law article recommendation model B and to output N recommended law articles, wherein the N recommended law articles are the top N of all law articles corresponding to the preset case facts when arranged in descending order of a preset matching degree, the preset matching degree is the matching degree between a law article corresponding to the preset case facts and the preset case facts, and N is an integer greater than or equal to 1;

the construction unit specifically comprises:

a conversion subunit, configured to convert the data e into training data f in the format: case facts - specific judicial interpretation of the matching law article, case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K; wherein the pair case facts - specific judicial interpretation of the matching law article is the matching data, and the pairs case facts - specific judicial interpretation of similar law article 1, …, case facts - specific judicial interpretation of similar law article K are the non-matching data;

the legal recommendation model includes:

5. A case fact based legal intelligent recommendation device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the case fact based legal intelligent recommendation method according to any of claims 1-3 when executing the computer program.

6. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the case fact based legal intelligent recommendation method according to any of claims 1-3.