WO2022078308A1

WO2022078308A1 - Method and apparatus for generating judgment document abstract, and electronic device and readable storage medium

Info

Publication number: WO2022078308A1
Application number: PCT/CN2021/123175
Authority: WO
Inventors: 曹辰捷; 徐国强; 陈家豪
Original assignee: 深圳壹账通智能科技有限公司
Priority date: 2020-10-12
Filing date: 2021-10-12
Publication date: 2022-04-21
Also published as: CN112182224A

Abstract

The present application relates to the technical field of artificial intelligence. Disclosed is a method for generating a judgment document abstract. The method comprises: inputting a judgment document into a trained paragraph category identification model, so as to obtain paragraph categories of paragraphs in the judgment document, and taking a set of paragraphs of a first category as a paragraph set; respectively performing similarity matching on each paragraph in the paragraph set and a short sentence template in an abstract template, so as to obtain target short sentence templates corresponding to the paragraphs in the paragraph set; and inputting the paragraphs and the target short sentence templates corresponding thereto into a trained abstract generation model, so as to obtain target abstract short sentences corresponding to the paragraphs in the paragraph set, and combining the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, so as to obtain abstract text corresponding to the judgment document. Further provided are an apparatus for generating a judgment document abstract, and an electronic device and a readable storage medium. By means of the present application, the consistency and accuracy of a judgment document abstract are guaranteed.

Description

Method, device, electronic device and readable storage medium for generating summary of judgment documents

This application claims the priority of the Chinese patent application with the application number CN202011087426.7 and titled "Method, Apparatus, Electronic Device and Readable Storage Medium for Generating Judgment Document Abstracts" filed with the China Patent Office on October 12, 2020. The entire contents of this application are incorporated by reference.

Technical field

The present application relates to the technical field of artificial intelligence, and in particular, to a method, device, electronic device, and readable storage medium for generating a summary of a judgment document.

Background art

With the development of the information age, abstract generation is increasingly used in daily life, for example to generate abstracts of judgment documents. By browsing an abstract, a reader can quickly grasp the outline and key information of a judgment document, saving reading time.

The inventor realized that judgment documents are written in a standardized form, but their content is detailed and lengthy. At present, abstracts are usually generated by extracting the words, phrases and sentences with greater weight from a judgment document and combining them. Abstracts generated in this way have poor semantic coherence and lack effective integration of legal and adjudicative knowledge, so the resulting abstracts are incoherent and inaccurate. Therefore, a method for generating abstracts of judgment documents is urgently needed to ensure their coherence and accuracy.

SUMMARY OF THE INVENTION

The method for generating the abstract of the judgment document provided in this application includes:

Parse a request, sent by a user via a client, for generating a judgment document summary, and obtain the judgment document carried by the request;

Input the judgment document into the trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, where the paragraph categories include a first category and a second category, and take the set of first-category paragraphs in the judgment document as a paragraph set;

Perform similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured summary template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

Input each paragraph in the paragraph set and its corresponding target short sentence template into the trained summary generation model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and splice the target abstract short sentences according to the position order, in the summary template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.

The present application also provides a device for generating a summary of a judgment document, the device comprising:

a parsing module, configured to parse a request, sent by a user via a client, for generating a judgment document summary, and obtain the judgment document carried by the request;

an input module, configured to input the judgment document into the trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, where the paragraph categories include a first category and a second category, and to take the set of first-category paragraphs in the judgment document as a paragraph set;

a matching module, configured to perform similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured summary template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

a splicing module, configured to input each paragraph in the paragraph set and its corresponding target short sentence template into the trained summary generation model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and to splice the target abstract short sentences according to the position order, in the summary template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.

The present application also provides an electronic device, the electronic device comprising:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

The memory stores a judgment document summary generation program executable by the at least one processor, and the judgment document summary generation program is executed by the at least one processor, so that the at least one processor can perform the following steps:

The present application also provides a computer-readable storage medium, where a program for generating a summary of a judgment document is stored thereon, and the program for generating a summary of a judgment document can be executed by one or more processors to implement the following steps:

Description of drawings

FIG. 1 is a schematic flowchart of a method for generating a judgment document abstract according to an embodiment of the present application;

FIG. 2 is a schematic block diagram of an apparatus for generating a summary of a judgment document provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of an electronic device for implementing a method for generating a judgment document summary provided by an embodiment of the present application;

The realization, functional characteristics and advantages of the purpose of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.

Detailed description

In order to make the purpose, technical solutions and advantages of the present application more clearly understood, the present application will be described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, but not to limit the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

It should be noted that descriptions involving "first", "second", etc. in this application are for descriptive purposes only and should not be construed as indicating or implying relative importance or the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may expressly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments can be combined with each other, provided that the combination can be realized by those of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and is not within the scope of protection claimed in this application.

The embodiments of the present application may acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.

The basic technologies of artificial intelligence generally include technologies such as sensors, special artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.

The present application provides a method for generating an abstract of a judgment document. Referring to FIG. 1 , a schematic flowchart of a method for generating a summary of a judgment document provided by an embodiment of the present application is shown. The method may be performed by an electronic device, which may be implemented by software and/or hardware.

In this embodiment, the method for generating a summary of a judgment document includes:

S1. Parse a request, sent by a user via a client, for generating a judgment document summary, and obtain the judgment document carried by the request;

S2. Input the judgment document into the trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document. The paragraph categories include a first category and a second category, and the set of first-category paragraphs in the judgment document is taken as a paragraph set.

At present, judgment documents are mostly between 2000 and 8000 words long, and their abstracts are mostly between 200 and 600 words long. Current Chinese generation models cannot accommodate inputs and outputs of this size. Therefore, this embodiment first identifies the important paragraphs to obtain a paragraph set, which compresses the scale of the information input to the summary generation model.

The paragraph category recognition model is a roberta-large-wwm model, which is used to determine whether each paragraph in the input judgment document belongs to the first category or the second category, where the first category is an important paragraph and the second category is an ordinary paragraph. The roberta-large-wwm model is a derivative of the BERT-large model and contains 24 layers of transformers, 16 attention heads, and 1024 hidden layer units.

The training process of the paragraph category recognition model includes:

A1. Obtain multiple preset indicators corresponding to the paragraph categories of judgment documents, and annotate the paragraph categories of the first judgment document sample in the first database based on the multiple preset indicators;

In this embodiment, the preset indicators include: the relationship between the plaintiff and the defendant, the plaintiff's claim, the defendant's opinion, the focus of the dispute, the statement and opinion of the legal facts, and the trial result. The paragraphs associated with the above-mentioned six preset indicators in the first judgment document sample are marked as the first category (important paragraphs), and other paragraphs are marked as the second category (ordinary paragraphs).

A2. Input the first judgment document sample carrying the annotation information into the paragraph category recognition model, and obtain the predicted paragraph category of each paragraph in the first judgment document sample;

A3. Determine the true paragraph category of each paragraph in the first judgment document sample based on the annotation information, and determine the structural parameters of the paragraph category recognition model by minimizing the loss value between the predicted paragraph category and the true paragraph category, to obtain the trained paragraph category recognition model.

The formula for calculating the loss value is:

where q_i is the predicted paragraph category of the i-th paragraph in the first judgment document sample, p_i is the true paragraph category of the i-th paragraph in the first judgment document sample, c is the total number of paragraphs in the first judgment document sample, and loss(q_i, p_i) is the loss value between the predicted paragraph category and the true paragraph category of the i-th paragraph in the first judgment document sample.
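The loss formula itself does not survive in this text. As a hedged reconstruction only, one standard form that fits the symbol definitions given here is a two-class cross-entropy averaged over the paragraphs. Treating p_i as a 0/1 label, q_i as the predicted probability of the first category, and the averaging over c (with the total written as L) are all assumptions, not something the source confirms:

```latex
\mathrm{loss}(q_i, p_i) = -\bigl[\, p_i \log q_i + (1 - p_i)\log(1 - q_i) \,\bigr],
\qquad
L = \frac{1}{c}\sum_{i=1}^{c} \mathrm{loss}(q_i, p_i)
```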

Input the judgment document to be summarized into the trained paragraph category recognition model to obtain, for each paragraph, the probability that it belongs to the first category. When the probability value of a paragraph is greater than a preset threshold (for example, 0.7), the paragraph is considered to belong to the first category. The set of first-category paragraphs in the judgment document is taken as the paragraph set, and the summary is generated from the information in the paragraph set.

In this step, the paragraph category recognition model extracts the important paragraphs in the judgment document, which compresses the information scale, prevents the input to the summary generation model from being too long and overflowing, and preserves the integrity of the input information, so that the summary produced by the summary generation model is more accurate.
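The thresholding described in step S2 can be sketched as follows. This is a minimal illustration, assuming the classifier has already produced one first-category probability per paragraph; the function name `select_paragraph_set` and the sample data are hypothetical, and only the example threshold of 0.7 comes from the text.

```python
from typing import List, Sequence


def select_paragraph_set(paragraphs: Sequence[str],
                         first_category_probs: Sequence[float],
                         threshold: float = 0.7) -> List[str]:
    """Keep, in document order, the paragraphs whose probability of being
    first-category (important) is strictly greater than the threshold."""
    if len(paragraphs) != len(first_category_probs):
        raise ValueError("one probability is required per paragraph")
    return [text for text, prob in zip(paragraphs, first_category_probs)
            if prob > threshold]


# Hypothetical classifier outputs for three paragraphs.
paragraphs = ["Plaintiff claim", "Clerk note", "Court holding"]
probs = [0.91, 0.12, 0.70]
paragraph_set = select_paragraph_set(paragraphs, probs)
```

Because the comparison is strict, a paragraph scoring exactly 0.7 is left out, which matches the "greater than a preset threshold" wording.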

S3. Perform similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured summary template, to obtain the target short sentence template corresponding to each paragraph in the paragraph set;

The paragraphs in the paragraph set may still contain redundant information (some paragraphs may exceed 500 words), and consecutive paragraphs are not necessarily coherent with one another, so they cannot be directly spliced into an abstract.

In this embodiment, a summary template is preconfigured (the summary template includes the above-mentioned 6 preset indicators), and an example of the summary template is as follows: the plaintiff and the defendant have a relationship of XXXX. The plaintiff filed a petition and ordered the defendant to pay.... The defendant argued that the plaintiff's claim had no factual and legal basis, and upon finding out... This court supports the plaintiff's above request. According to Article X of the "Contract Law of the People's Republic of China" ... judgment, 1. The defendant shall pay the plaintiff XX fees. 2. To reject the plaintiff's other claims. If the obligation to pay money is not fulfilled within the period specified in this judgment, double the interest on the debt for the period of delay in performance.

The similarity matching is performed between each paragraph in the paragraph set and each short sentence template in the pre-configured summary template, and the target short sentence template corresponding to each paragraph in the paragraph set is obtained, including:

B1. Calculate the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the summary template;

B2. When the longest common subsequence similarity value between a specified paragraph and multiple short sentence templates is greater than the similarity threshold, use the short sentence template corresponding to the highest similarity value as the target short sentence template corresponding to the specified paragraph.

The calculation formula of the longest common subsequence similarity value is:

where p_i is the i-th paragraph in the paragraph set, a_j is the j-th short sentence template in the abstract template, LCS(p_i, a_j) is the length of the longest common subsequence of the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template, len(a_j) is the length of the j-th short sentence template in the abstract template, len(p_i) is the length of the i-th paragraph in the paragraph set, LCSR(p_i, a_j) is the upper limit of the longest common subsequence length ratio between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template, LCSP(p_i, a_j) is the lower limit of the longest common subsequence length ratio between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template, and LCSFscore(p_i, a_j) is the longest common subsequence similarity value between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template.
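The similarity formula is likewise missing from this text. A hedged reconstruction in the style of the ROUGE-L measure, which fits the symbol definitions given here, is shown below. The choice of denominators and the unweighted harmonic mean are assumptions; the arguments (p_i, a_j) are dropped on the right-hand side of the last equation for brevity. Since a paragraph is usually longer than a short sentence template, dividing by len(a_j) gives the larger ratio, which is consistent with LCSR being called the upper limit and LCSP the lower limit:

```latex
\mathrm{LCSR}(p_i, a_j) = \frac{\mathrm{LCS}(p_i, a_j)}{\mathrm{len}(a_j)}, \qquad
\mathrm{LCSP}(p_i, a_j) = \frac{\mathrm{LCS}(p_i, a_j)}{\mathrm{len}(p_i)}, \qquad
\mathrm{LCSFscore}(p_i, a_j) = \frac{2 \cdot \mathrm{LCSR} \cdot \mathrm{LCSP}}{\mathrm{LCSR} + \mathrm{LCSP}}
```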

In this embodiment, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the method further includes:

If the similarity value of the longest common subsequence of a specified paragraph and only one short sentence template is greater than the similarity threshold, the short sentence template is used as the target short sentence template corresponding to the specified paragraph.

If the longest common subsequence similarity value between a specified paragraph and each short sentence template in the abstract template is smaller than the similarity threshold, the specified paragraph is deleted from the paragraph set.
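The matching rules of steps B1 and B2, together with the single-match and no-match cases, can be sketched as below. This is an illustration under stated assumptions: the score is computed as an unweighted F-measure over character-level longest common subsequences (the exact formula is not recoverable from this text), and the names `lcs_length`, `lcs_fscore` and `match_template`, as well as any threshold value, are hypothetical.

```python
from typing import List, Optional


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_fscore(paragraph: str, template: str) -> float:
    """Assumed similarity: harmonic mean of LCS/len(template) and LCS/len(paragraph)."""
    lcs = lcs_length(paragraph, template)
    if lcs == 0:
        return 0.0
    ratio_template = lcs / len(template)    # assumed LCSR
    ratio_paragraph = lcs / len(paragraph)  # assumed LCSP
    return 2 * ratio_template * ratio_paragraph / (ratio_template + ratio_paragraph)


def match_template(paragraph: str, templates: List[str],
                   threshold: float) -> Optional[int]:
    """Return the index of the best-scoring template above the threshold,
    or None, meaning the paragraph is dropped from the paragraph set."""
    if not templates:
        return None
    scores = [lcs_fscore(paragraph, t) for t in templates]
    best = max(range(len(templates)), key=scores.__getitem__)
    return best if scores[best] > threshold else None
```

Picking the highest score covers both the "multiple templates above the threshold" case and the "only one template above the threshold" case; returning `None` corresponds to deleting the paragraph.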

In another embodiment of the present application, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the method further includes:

If there are multiple paragraphs in the paragraph set corresponding to the same short sentence template, the multiple paragraphs are merged according to their paragraph order in the judgment document to form a new paragraph in the paragraph set.
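The merging rule of this embodiment can be sketched as follows. It assumes each matched paragraph is recorded as a tuple of its position in the judgment document, the index of its target short sentence template, and its text; the function name `merge_by_template` and the `separator` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def merge_by_template(matches: List[Tuple[int, int, str]],
                      separator: str = "") -> Dict[int, str]:
    """Merge paragraphs that share a target template, in document order.

    Each entry of `matches` is (position in document, template index, text).
    The result maps a template index to the single merged paragraph.
    """
    grouped = defaultdict(list)
    for position, template_idx, text in matches:
        grouped[template_idx].append((position, text))
    return {
        template_idx: separator.join(text for _, text in sorted(items))
        for template_idx, items in grouped.items()
    }
```

For example, `merge_by_template([(5, 1, "B"), (2, 1, "A"), (3, 0, "C")])` groups the first two entries under template 1 and joins them by their position in the document.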

In this step, similarity matching is performed between each paragraph in the paragraph set and each short sentence template in the abstract template, so as to further compress the information.

S4. Input each paragraph in the paragraph set and its corresponding target short sentence template into the trained summary generation model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and splice the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.
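The splicing part of step S4 can be sketched as below, assuming the summary generation model has already produced one target abstract short sentence per matched template. The mapping from template index to sentence, the function name `splice_summary`, and the `separator` parameter are hypothetical.

```python
from typing import Dict


def splice_summary(short_sentences: Dict[int, str], separator: str = "") -> str:
    """Join generated short sentences in the order in which their target
    templates appear in the abstract template (ascending template index)."""
    return separator.join(short_sentences[idx] for idx in sorted(short_sentences))


# Hypothetical model outputs keyed by template position.
generated = {2: "The court supports the claim.",
             0: "The parties have a loan relationship.",
             1: "The plaintiff requests repayment."}
abstract_text = splice_summary(generated, separator=" ")
```

Ordering by template position rather than by paragraph position is what lets the abstract follow the fixed narrative of the template even when the source paragraphs appear in a different order.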

In this embodiment, the summary generation model is also a roberta-large-wwm model, which is used to generate summary text from paragraph information. In this scheme, the paragraph category recognition model and the summary generation model have different input samples and different training targets, so the model parameters obtained by training also differ.

The training process of the abstract generation model includes:

C1. Mask a preset proportion of the text content of the second judgment document sample in the second database to obtain a third judgment document;

C2. Input the third judgment document into the abstract generation model to obtain the predicted content of the masked text;

C3. Determine the structural parameters of the summary generation model by minimizing the loss value between the real content corresponding to the mask and the predicted content, so as to obtain a trained summary generation model.

In this embodiment, the abstract generation model predicts the probability distribution of the next token from all the preceding tokens (words) in each second judgment document sample. In this training task, to suit abstract generation, one part of the text content (25% to 75% of each second judgment document sample) is kept as known text, and the remaining part (75% to 25% of each second judgment document sample) is covered by mask characters.
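The construction of a masked training sample in steps C1 to C3 can be sketched as follows. This is only an illustration: it assumes the known text is a prefix and the masked part is the remaining suffix, which the text does not state explicitly, and the names `mask_suffix` and `MASK` are hypothetical.

```python
import random
from typing import List, Tuple

MASK = "[MASK]"  # placeholder mask symbol, assumed for illustration


def mask_suffix(tokens: List[str], rng: random.Random,
                min_keep: float = 0.25,
                max_keep: float = 0.75) -> Tuple[List[str], List[str]]:
    """Keep a random 25%-75% prefix as known text and mask the rest.

    Returns the masked input sequence and the list of true tokens that the
    model should predict for the masked positions.
    """
    keep_ratio = rng.uniform(min_keep, max_keep)
    n_keep = int(len(tokens) * keep_ratio)
    masked_input = tokens[:n_keep] + [MASK] * (len(tokens) - n_keep)
    targets = tokens[n_keep:]
    return masked_input, targets


rng = random.Random(0)  # fixed seed so the sketch is reproducible
sample_tokens = list("abcdefghij") * 2
masked_input, targets = mask_suffix(sample_tokens, rng)
```

The loss in step C3 would then be computed only between `targets` and the model's predictions at the masked positions.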

It can be seen from the above embodiments that, in the method for generating a judgment document abstract proposed by the present application, the judgment document is first input into the trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, where the paragraph categories include the first category (important paragraphs) and the second category (ordinary paragraphs), and the set of first-category paragraphs in the judgment document is taken as the paragraph set. In this step, the paragraph category recognition model extracts the important paragraphs of the judgment document into the paragraph set, which compresses the information scale and avoids the situation in which the information subsequently input to the summary generation model is too long and overflows, making the generated summary incomplete and inaccurate. Next, similarity matching is performed between each paragraph in the paragraph set and each short sentence template in the preconfigured abstract template to obtain the target short sentence template corresponding to each paragraph in the paragraph set; this step further compresses the information scale. Finally, each paragraph in the paragraph set and its corresponding target short sentence template are input into the trained summary generation model to obtain the target abstract short sentence corresponding to each paragraph, and the target abstract short sentences are spliced according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document; splicing in this order ensures the coherence of the abstract. Therefore, this application ensures the coherence and accuracy of the abstract of the judgment document.

FIG. 2 is a schematic block diagram of an apparatus for generating a summary of a judgment document provided by an embodiment of the present application.

The apparatus 100 for generating a summary of a judgment document described in this application may be installed in an electronic device. According to the functions implemented, the apparatus 100 for generating a summary of a judgment document may include a parsing module 110, an input module 120, a matching module 130 and a splicing module 140. The modules described in this application may also be referred to as units, which refer to a series of computer program segments that can be executed by the processor of an electronic device, can perform fixed functions, and are stored in the memory of the electronic device.
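The division into four modules can be pictured as a simple pipeline. The sketch below is purely illustrative: the class name `SummaryApparatus`, the callable signatures, and the stub implementations are hypothetical, and the real modules would wrap the trained models rather than the lambdas used here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SummaryApparatus:
    """Hypothetical container wiring the four modules together."""
    parse: Callable[[dict], str]                            # parsing module 110
    select_paragraphs: Callable[[str], List[str]]           # input module 120
    match_templates: Callable[[List[str]], Dict[int, str]]  # matching module 130
    generate_and_splice: Callable[[Dict[int, str]], str]    # splicing module 140

    def run(self, request: dict) -> str:
        document = self.parse(request)
        paragraph_set = self.select_paragraphs(document)
        matched = self.match_templates(paragraph_set)
        return self.generate_and_splice(matched)


# Stub modules standing in for the trained models.
apparatus = SummaryApparatus(
    parse=lambda request: request["document"],
    select_paragraphs=lambda document: document.split("\n"),
    match_templates=lambda paragraphs: dict(enumerate(paragraphs)),
    generate_and_splice=lambda matched: " ".join(matched[i] for i in sorted(matched)),
)
summary = apparatus.run({"document": "first paragraph\nsecond paragraph"})
```

Injecting the modules as callables keeps the sketch independent of any particular model library and mirrors the statement that the module division is only a logical one.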

In this embodiment, the functions of each module/unit are as follows:

The parsing module 110 is configured to parse a request for generating a judgment document summary sent by a user based on the client, and obtain the judgment document carried by the request;

The input module 120 is configured to input the judgment document into the trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, where the paragraph categories include the first category and the second category, and to take the set of first-category paragraphs in the judgment document as a paragraph set.

The training process of the paragraph category recognition model and the formula for calculating the loss value are the same as in steps A1 to A3 of the method embodiment described above and are not repeated here.

The matching module 130 is configured to perform similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured abstract template, to obtain the target short sentence template corresponding to each paragraph in the paragraph set;

The calculation of the longest common subsequence similarity value, and the further processing that the matching module 130 performs after calculating it, both in this embodiment and in another embodiment of the present application, are the same as in the method embodiment described above and are not repeated here.

The splicing module 140 is configured to input each paragraph in the paragraph set and its corresponding target short sentence template into the trained summary generation model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and to splice the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, so as to obtain the abstract text corresponding to the judgment document.

The training process of the abstract generation model is the same as in steps C1 to C3 of the method embodiment described above and is not repeated here.

FIG. 3 is a schematic structural diagram of an electronic device for implementing a method for generating a summary of a judgment document provided by an embodiment of the present application.

The electronic device 1 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer consisting of a collection of loosely coupled computers.

In this embodiment, the electronic device 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13 that can be communicatively connected to each other through a system bus. The memory 11 stores a judgment document abstract generating program 10, which is executable by the processor 12. FIG. 3 only shows the electronic device 1 having the components 11-13 and the judgment document abstract generating program 10. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown, a combination of some components, or a different arrangement of components.

The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium can be a non-volatile storage medium such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, or optical disk. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1; in other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, such as a pluggable hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash memory card (Flash Card) equipped on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and various application software installed in the electronic device 1, for example, the code of the judgment document abstract generating program 10 in an embodiment of the present application. In addition, the memory 11 can also be used to temporarily store various types of data that have been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips. The processor 12 is generally used to control the overall operation of the electronic device 1, such as performing control and processing related to data interaction or communication with other devices. In this embodiment, the processor 12 is configured to run the program code or process data stored in the memory 11 , for example, run the judgment document summary generation program 10 and the like.

The network interface 13 may include a wireless network interface or a wired network interface, and the network interface 13 is used to establish a communication connection between the electronic device 1 and a client (not shown in the figure).

Optionally, the electronic device 1 may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and optionally may also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be called a display screen or a display unit, and is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.

The judgment document summary generation program 10 stored in the memory 11 in the electronic device 1 is a combination of multiple instructions, and when running in the processor 12, can realize:

Specifically, for the specific implementation method of the above-mentioned judgment document abstract generating program 10 by the processor 12, reference may be made to the description of the relevant steps in the corresponding embodiment of FIG. 1, and details are not described herein. It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned judgment documents, the above-mentioned judgment documents can also be stored in a node of a blockchain.

Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a read-only memory (ROM).

The computer-readable storage medium stores a judgment document summary generation program 10, and the judgment document summary generation program 10 can be executed by one or more processors to implement the steps of the judgment document summary generation method.

In the several embodiments provided in this application, it should be understood that the disclosed device, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are only illustrative: the division of the modules is only a logical function division, and there may be other division manners in actual implementation.

The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.

In addition, each functional module in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of hardware plus software function modules.

It will be apparent to those skilled in the art that the present application is not limited to the details of the above-described exemplary embodiments, but that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application.

Accordingly, the embodiments are to be regarded in all respects as illustrative and not restrictive. The scope of the application is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be included in this application. Any reference signs in the claims shall not be construed as limiting the claims concerned.

The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
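As a non-limiting illustration of data blocks "associated with cryptographic methods", the following minimal sketch links blocks by storing in each block the hash of its predecessor. The names `block_hash`, `append_block` and `chain_is_valid` are hypothetical, and the sketch covers only hash linking; it does not model consensus, networking, or any particular blockchain platform.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Return the SHA-256 digest of a block's canonical JSON form."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, records: list) -> dict:
    """Append a block holding `records` and the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "records": records, "prev_hash": prev_hash}
    chain.append(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Toy data only: two blocks, each carrying one placeholder record.
chain = []
append_block(chain, ["judgment document A"])
append_block(chain, ["judgment document B"])
```

Changing the records of an earlier block changes its hash, so the link stored in the following block no longer matches and `chain_is_valid` returns `False`.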

Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Several units or apparatuses recited in the system claims may also be implemented by one unit or apparatus through software or hardware. Terms such as "first" and "second" are used to denote names and do not denote any particular order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present application without departing from their spirit and scope.

Claims

A method for generating a summary of a judgment document, wherein the method comprises:

Parsing a request, sent by a user through a client, for generating a judgment document summary, and obtaining the judgment document carried by the request;

Inputting the judgment document into a trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, wherein the paragraph categories include a first category and a second category, and taking the set of paragraphs of the first category in the judgment document as a paragraph set;

Performing similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

Inputting each paragraph in the paragraph set and its corresponding target short sentence template into a trained summary generation model to obtain a target abstract short sentence corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.
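As a non-limiting illustration, the four steps above can be sketched as a small pipeline. The callables `classify`, `match` and `generate` are hypothetical stand-ins for the trained paragraph category recognition model, the similarity matching step, and the trained summary generation model; the category label `"first"` and the plain concatenation used for splicing are likewise assumptions.

```python
from typing import Callable, List, Optional


def summarize(
    paragraphs: List[str],
    templates: List[str],
    classify: Callable[[str], str],
    match: Callable[[str, List[str]], Optional[int]],
    generate: Callable[[str, str], str],
    keep_category: str = "first",
) -> str:
    """Sketch of the claimed flow: classify, match, generate, splice."""
    # Keep only the paragraphs labelled with the first category.
    paragraph_set = [p for p in paragraphs if classify(p) == keep_category]

    # Pair each paragraph with the index of its target short sentence template.
    pairs = []
    for paragraph in paragraph_set:
        idx = match(paragraph, templates)
        if idx is not None:
            pairs.append((idx, paragraph))

    # Order by template position (stable sort keeps document order on ties),
    # generate one abstract short sentence per pair, then splice them.
    pairs.sort(key=lambda pair: pair[0])
    phrases = [generate(paragraph, templates[idx]) for idx, paragraph in pairs]
    return "".join(phrases)
```

Passing the three models in as callables keeps the ordering logic separate from any particular model implementation.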
The method for generating an abstract of a judgment document according to claim 1, wherein performing similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured abstract template, to obtain the target short sentence template corresponding to each paragraph in the paragraph set, comprises:

Calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template;

When the longest common subsequence similarity values between a specified paragraph and multiple short sentence templates are greater than a similarity threshold, taking the short sentence template with the highest similarity value as the target short sentence template corresponding to the specified paragraph.
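As a non-limiting illustration, the selection rule above can be sketched for a single paragraph whose similarity values against all templates are already known. The function name `pick_target_template` is hypothetical, and resolving ties in favour of the earlier template is an assumption that the claim does not address.

```python
from typing import Optional, Sequence


def pick_target_template(scores: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the best template scoring above `threshold`, else None."""
    # Negating the index makes max() prefer the earlier template on equal scores.
    candidates = [(score, -j) for j, score in enumerate(scores) if score > threshold]
    if not candidates:
        return None
    _, neg_index = max(candidates)
    return -neg_index
```

For example, with scores `[0.2, 0.7, 0.5]` and a threshold of `0.4`, two templates exceed the threshold and the second template (index 1) is selected.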
The method for generating an abstract of a judgment document according to claim 2, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the method further comprises:

If the longest common subsequence similarity value between a specified paragraph and each short sentence template in the abstract template is smaller than the similarity threshold, the specified paragraph is deleted from the paragraph set.
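As a non-limiting illustration, the deletion rule can be sketched over the whole paragraph set. The name `prune_paragraph_set` and the layout of `score_rows` (one row of template scores per paragraph, in the same order as the paragraphs) are assumptions.

```python
from typing import List, Sequence


def prune_paragraph_set(
    paragraph_set: Sequence[str],
    score_rows: Sequence[Sequence[float]],
    threshold: float,
) -> List[str]:
    """Drop every paragraph whose scores against all templates are below threshold."""
    kept = []
    for paragraph, scores in zip(paragraph_set, score_rows):
        if all(score < threshold for score in scores):
            continue  # no template is similar enough, so the paragraph is deleted
        kept.append(paragraph)
    return kept
```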
The method for generating an abstract of a judgment document according to claim 2, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the method further comprises:

If there are multiple paragraphs in the paragraph set corresponding to the same short sentence template, the multiple paragraphs are merged according to their paragraph order in the judgment document to form a new paragraph in the paragraph set.
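As a non-limiting illustration, the merge step can be sketched as grouping paragraphs by their target template and concatenating each group in document order. The tuple layout `(position in document, template index, text)` and the `sep` parameter are assumptions introduced for the example.

```python
from typing import Dict, List, Tuple


def merge_by_template(
    matched: List[Tuple[int, int, str]], sep: str = ""
) -> Dict[int, str]:
    """Merge paragraphs that share a template, ordered by document position."""
    groups: Dict[int, List[str]] = {}
    # Sorting by position first guarantees document order inside each group.
    for _, template_idx, text in sorted(matched, key=lambda m: m[0]):
        groups.setdefault(template_idx, []).append(text)
    return {idx: sep.join(parts) for idx, parts in groups.items()}
```

Each merged string then takes the place of its source paragraphs as a single new paragraph in the paragraph set.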
The method for generating a judgment document summary according to claim 2, wherein the calculation formula of the longest common subsequence similarity value is:

where p_i is the i-th paragraph in the paragraph set; a_j is the j-th short sentence template in the abstract template; LCS(p_i, a_j) is the length of the longest common subsequence of the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template; len(a_j) is the length of the j-th short sentence template in the abstract template; len(p_i) is the length of the i-th paragraph in the paragraph set; LCSR(p_i, a_j) is the upper limit of the length ratio of the longest common subsequence between the i-th paragraph and the j-th short sentence template; LCSP(p_i, a_j) is the lower limit of that length ratio; and LCSFscore(p_i, a_j) is the longest common subsequence similarity value between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template.
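The formula itself is not legible in this text. One reading consistent with the symbol definitions, offered only as an assumption, is LCSR = LCS(p_i, a_j) / len(a_j), LCSP = LCS(p_i, a_j) / len(p_i), and LCSFscore as the harmonic mean of the two ratios. The sketch below computes the score under that assumption; the function names are hypothetical and the granted text may combine the ratios differently.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_fscore(paragraph: str, template: str) -> float:
    """Similarity of a paragraph and a template under the assumed formula."""
    lcs = lcs_length(paragraph, template)
    if lcs == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    lcs_r = lcs / len(template)   # assumed LCSR: ratio against template length
    lcs_p = lcs / len(paragraph)  # assumed LCSP: ratio against paragraph length
    # Assumed combination: harmonic mean of the two ratios.
    return 2 * lcs_r * lcs_p / (lcs_r + lcs_p)
```

Because a paragraph is usually longer than a short sentence template, the ratio against the template length is the larger of the two, which matches the "upper limit" and "lower limit" wording of the definitions.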
The method for generating a summary of a judgment document according to claim 1, wherein the training process of the paragraph category recognition model comprises:

Obtaining multiple preset indexes corresponding to the paragraph categories of judgment documents, and annotating the paragraph categories of a first judgment document sample in a first database based on the multiple preset indexes;

Inputting the first judgment document sample carrying the annotation information into the paragraph category recognition model to obtain the predicted paragraph category of each paragraph in the first judgment document sample;

Determining the true paragraph category of each paragraph in the first judgment document sample based on the annotation information, and determining the structural parameters of the paragraph category recognition model by minimizing the loss value between the predicted paragraph category and the true paragraph category, to obtain the trained paragraph category recognition model.
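As a non-limiting illustration of determining parameters by minimizing the loss between predicted and true categories, the sketch below trains a two-class logistic regression by gradient descent on a cross-entropy loss. The logistic regression is a deliberately simple stand-in; the claim does not specify the model architecture, the loss function, or the feature representation, and all names here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Predicted probability that a paragraph belongs to the first category."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cross_entropy(weights, bias, features, labels) -> float:
    """Mean loss between predicted and true (annotated) categories."""
    total = 0.0
    for x, y in zip(features, labels):
        p = min(max(predict(weights, bias, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def train(features, labels, lr: float = 0.5, epochs: int = 200) -> Tuple[List[float], float]:
    """Fit the parameters by gradient descent on the cross-entropy loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y
            for k, xi in enumerate(x):
                grad_w[k] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias
```

Here each paragraph is assumed to have been converted to a numeric feature vector beforehand, with label 1 for the first category and 0 for the second.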
The method for generating a summary of a judgment document according to claim 1, wherein the training process of the summary generation model comprises:

Covering a preset proportion of the text content in the second judgment document sample in the second database with a mask to obtain a third judgment document;

Inputting the third judgment document into the abstract generation model to obtain the predicted content of the masked text;

The structural parameters of the summary generation model are determined by minimizing the loss value between the real content corresponding to the mask and the predicted content, and a trained summary generation model is obtained.
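As a non-limiting illustration of covering a preset proportion of the text with a mask, the sketch below replaces randomly chosen tokens with a mask symbol and records the original tokens as prediction targets. The token-level granularity, the `[MASK]` symbol, uniform random selection, and the name `mask_tokens` are all assumptions; the claim states only that a preset proportion of the text content is masked.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def mask_tokens(
    tokens: Sequence[str],
    ratio: float,
    mask_token: str = "[MASK]",
    seed: Optional[int] = None,
) -> Tuple[List[str], Dict[int, str]]:
    """Mask `ratio` of the tokens; return the masked tokens and the hidden targets."""
    rng = random.Random(seed)
    n_mask = round(len(tokens) * ratio)
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    targets: Dict[int, str] = {}
    for pos in positions:
        targets[pos] = masked[pos]  # real content the model must predict
        masked[pos] = mask_token
    return masked, targets
```

During training, the loss would compare the model's prediction at each masked position with the corresponding entry in `targets`.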
A device for generating a summary of a judgment document, wherein the device includes:

A parsing module, configured to parse a request, sent by a user through a client, for generating a judgment document summary, and to obtain the judgment document carried by the request;

An input module, configured to input the judgment document into a trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, wherein the paragraph categories include a first category and a second category, and to take the set of paragraphs of the first category in the judgment document as a paragraph set;

A matching module, configured to perform similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

A splicing module, configured to input each paragraph in the paragraph set and its corresponding target short sentence template into a trained summary generation model to obtain a target abstract short sentence corresponding to each paragraph in the paragraph set, and to splice the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.
An electronic device, wherein the electronic device comprises:

at least one processor; and,

a memory communicatively coupled to the at least one processor; wherein,

The memory stores a judgment document summary generation program executable by the at least one processor, and the judgment document summary generation program is executed by the at least one processor, so that the at least one processor can perform the following steps:

Parsing a request, sent by a user through a client, for generating a judgment document summary, and obtaining the judgment document carried by the request;

Inputting the judgment document into a trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, wherein the paragraph categories include a first category and a second category, and taking the set of paragraphs of the first category in the judgment document as a paragraph set;

Performing similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

Inputting each paragraph in the paragraph set and its corresponding target short sentence template into a trained summary generation model to obtain a target abstract short sentence corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.
The electronic device according to claim 9, wherein performing similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured abstract template, to obtain the target short sentence template corresponding to each paragraph in the paragraph set, comprises:

Calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template;

When the longest common subsequence similarity values between a specified paragraph and multiple short sentence templates are greater than a similarity threshold, taking the short sentence template with the highest similarity value as the target short sentence template corresponding to the specified paragraph.
The electronic device according to claim 10, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the judgment document summary generation program, when executed by the processor, further implements the following step:

If the longest common subsequence similarity value between a specified paragraph and each short sentence template in the abstract template is smaller than the similarity threshold, the specified paragraph is deleted from the paragraph set.
The electronic device according to claim 10, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the judgment document summary generation program, when executed by the processor, further implements the following step:

If there are multiple paragraphs in the paragraph set corresponding to the same short sentence template, the multiple paragraphs are merged according to their paragraph order in the judgment document to form a new paragraph in the paragraph set.
The electronic device according to claim 10, wherein the calculation formula of the longest common subsequence similarity value is:

where p_i is the i-th paragraph in the paragraph set; a_j is the j-th short sentence template in the abstract template; LCS(p_i, a_j) is the length of the longest common subsequence of the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template; len(a_j) is the length of the j-th short sentence template in the abstract template; len(p_i) is the length of the i-th paragraph in the paragraph set; LCSR(p_i, a_j) is the upper limit of the length ratio of the longest common subsequence between the i-th paragraph and the j-th short sentence template; LCSP(p_i, a_j) is the lower limit of that length ratio; and LCSFscore(p_i, a_j) is the longest common subsequence similarity value between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template.
The electronic device according to claim 9, wherein the training process of the paragraph category recognition model comprises:

Obtaining multiple preset indexes corresponding to the paragraph categories of judgment documents, and annotating the paragraph categories of a first judgment document sample in a first database based on the multiple preset indexes;

Inputting the first judgment document sample carrying the annotation information into the paragraph category recognition model to obtain the predicted paragraph category of each paragraph in the first judgment document sample;

Determining the true paragraph category of each paragraph in the first judgment document sample based on the annotation information, and determining the structural parameters of the paragraph category recognition model by minimizing the loss value between the predicted paragraph category and the true paragraph category, to obtain the trained paragraph category recognition model.
The electronic device according to claim 9, wherein the training process of the abstract generation model comprises:

Covering a preset proportion of the text content in the second judgment document sample in the second database with a mask to obtain a third judgment document;

Inputting the third judgment document into the abstract generation model to obtain the predicted content of the masked text;

The structural parameters of the summary generation model are determined by minimizing the loss value between the real content corresponding to the mask and the predicted content, and a trained summary generation model is obtained.
A computer-readable storage medium, wherein the computer-readable storage medium stores a judgment document summary generation program, and the judgment document summary generation program can be executed by one or more processors to realize the following steps:

Parsing a request, sent by a user through a client, for generating a judgment document summary, and obtaining the judgment document carried by the request;

Inputting the judgment document into a trained paragraph category recognition model to obtain the paragraph category of each paragraph in the judgment document, wherein the paragraph categories include a first category and a second category, and taking the set of paragraphs of the first category in the judgment document as a paragraph set;

Performing similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template, to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

Inputting each paragraph in the paragraph set and its corresponding target short sentence template into a trained summary generation model to obtain a target abstract short sentence corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position order, in the abstract template, of the target short sentence templates corresponding to the paragraphs, to obtain the abstract text corresponding to the judgment document.
The computer-readable storage medium according to claim 16, wherein performing similarity matching between each paragraph in the paragraph set and each short sentence template in the preconfigured abstract template, to obtain the target short sentence template corresponding to each paragraph in the paragraph set, comprises:

Calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template;

When the longest common subsequence similarity values between a specified paragraph and multiple short sentence templates are greater than a similarity threshold, taking the short sentence template with the highest similarity value as the target short sentence template corresponding to the specified paragraph.
The computer-readable storage medium according to claim 17, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the judgment document summary generation program, when executed by the processor, further implements the following step:

If the longest common subsequence similarity value between a specified paragraph and each short sentence template in the abstract template is smaller than the similarity threshold, the specified paragraph is deleted from the paragraph set.
The computer-readable storage medium according to claim 17, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the judgment document summary generation program, when executed by the processor, further implements the following step:

If there are multiple paragraphs in the paragraph set corresponding to the same short sentence template, the multiple paragraphs are merged according to their paragraph order in the judgment document to form a new paragraph in the paragraph set.
The computer-readable storage medium of claim 17, wherein the calculation formula of the longest common subsequence similarity value is:

where p_i is the i-th paragraph in the paragraph set; a_j is the j-th short sentence template in the abstract template; LCS(p_i, a_j) is the length of the longest common subsequence of the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template; len(a_j) is the length of the j-th short sentence template in the abstract template; len(p_i) is the length of the i-th paragraph in the paragraph set; LCSR(p_i, a_j) is the upper limit of the length ratio of the longest common subsequence between the i-th paragraph and the j-th short sentence template; LCSP(p_i, a_j) is the lower limit of that length ratio; and LCSFscore(p_i, a_j) is the longest common subsequence similarity value between the i-th paragraph in the paragraph set and the j-th short sentence template in the abstract template.