CN112182224A

CN112182224A - Referee document abstract generation method and device, electronic equipment and readable storage medium

Info

Publication number: CN112182224A
Application number: CN202011087426.7A
Authority: CN
Inventors: 曹辰捷; 徐国强; 陈家豪
Original assignee: OneConnect Financial Technology Co Ltd Shanghai
Current assignee: OneConnect Smart Technology Co Ltd; OneConnect Financial Technology Co Ltd Shanghai
Priority date: 2020-10-12
Filing date: 2020-10-12
Publication date: 2021-01-05
Also published as: WO2022078308A1

Abstract

The invention relates to intelligent decision-making and discloses a method for generating a referee document abstract, comprising: inputting the referee document into a trained paragraph category identification model to obtain the paragraph category of each paragraph in the referee document, and taking the set of paragraphs of the first category as a paragraph set; performing similarity matching between each paragraph in the paragraph set and the short sentence templates in an abstract template to obtain a target short sentence template corresponding to each paragraph in the paragraph set; and inputting each paragraph and its corresponding target short sentence template into a trained abstract generation model to obtain a target abstract short sentence for each paragraph in the paragraph set, and splicing the target abstract short sentences in the order in which the corresponding target short sentence templates appear in the abstract template to obtain the abstract text of the referee document. The invention also provides a device for generating a referee document abstract, an electronic device and a readable storage medium. The invention ensures the consistency and accuracy of the referee document abstract.

Description

Referee document abstract generation method and device, electronic equipment and readable storage medium

Technical Field

The invention relates to the field of intelligent decision-making, and in particular to a method and a device for generating a referee document abstract, an electronic device and a readable storage medium.

Background

With the development of the information age, abstract generation is increasingly widely used in daily life. For example, once an abstract has been generated for a referee document, a reader can quickly grasp the outline and key information of the document by browsing the abstract, which saves reading time.

A referee document follows a standard writing format, but its content is exhaustive and lengthy. At present, an abstract is usually generated by extracting the words, phrases and sentences with larger weights from the referee document and combining them. The semantic coherence of an abstract generated in this way is poor, and legal and judicial knowledge is not effectively incorporated, so the generated abstract is inconsistent and inaccurate. Therefore, a method for generating a referee document abstract is needed to ensure the consistency and accuracy of the referee document abstract.

Disclosure of Invention

In view of the above, it is necessary to provide a method for generating a referee document abstract, which is intended to ensure the consistency and accuracy of the referee document abstract.

The invention provides a method for generating an abstract of a referee document, which comprises the following steps:

parsing a referee document abstract generation request sent by a user through a client to obtain the referee document carried by the request;

inputting the referee document into a trained paragraph category identification model to obtain paragraph categories of each paragraph in the referee document, wherein the paragraph categories comprise a first category and a second category, and a set of paragraphs of the first category in the referee document is used as a paragraph set;

carrying out similarity matching on each paragraph in the paragraph set and each short sentence template in a preset abstract template to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

inputting each paragraph in the paragraph set and the corresponding target short sentence template into the trained abstract generating model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position sequence of the target short sentence template corresponding to each paragraph in the abstract template to obtain the abstract text corresponding to the referee document.

Optionally, the performing similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template to obtain a target short sentence template corresponding to each paragraph in the paragraph set includes:

calculating the similarity value of the longest common subsequence of each paragraph in the paragraph set and each short sentence template in the abstract template;

and when the longest common subsequence similarity values between a specified paragraph and a plurality of short sentence templates are greater than the similarity threshold, taking the short sentence template with the highest similarity value as the target short sentence template corresponding to the specified paragraph.

Optionally, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the summary template, the method further includes:

if the longest common subsequence similarity value between a specified paragraph and every short sentence template in the abstract template is smaller than the similarity threshold, deleting the specified paragraph from the paragraph set;

and if a plurality of paragraphs in the paragraph set correspond to the same short sentence template, merging those paragraphs, in the order in which they appear in the referee document, into a new paragraph in the paragraph set.

Optionally, the calculation formula of the similarity value of the longest common subsequence is as follows:

wherein $p_i$ is the $i$-th paragraph in the paragraph set; $a_j$ is the $j$-th short sentence template in the abstract template; $\mathrm{LCS}(p_i, a_j)$ is the length of the longest common subsequence of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{len}(a_j)$ is the length of the $j$-th short sentence template in the abstract template; $\mathrm{len}(p_i)$ is the length of the $i$-th paragraph in the paragraph set; $\mathrm{LCSR}(p_i, a_j)$ is the upper limit of the longest common subsequence length ratio of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{LCSP}(p_i, a_j)$ is the lower limit of that length ratio; and $\mathrm{LCSFscore}(p_i, a_j)$ is the longest common subsequence similarity value between the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template.
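The formula itself is not reproduced in this text. One reading consistent with the symbol definitions above, offered as an assumption rather than as the formula of the filing, is the usual recall, precision and F-score form. Because a short sentence template is normally shorter than a paragraph, dividing by the template length gives the larger ratio (the "upper limit") and dividing by the paragraph length gives the smaller one (the "lower limit"). The equal weighting of the two ratios in the last line is also assumed.

```latex
\begin{aligned}
\mathrm{LCSR}(p_i, a_j) &= \frac{\mathrm{LCS}(p_i, a_j)}{\mathrm{len}(a_j)} \\
\mathrm{LCSP}(p_i, a_j) &= \frac{\mathrm{LCS}(p_i, a_j)}{\mathrm{len}(p_i)} \\
\mathrm{LCSFscore}(p_i, a_j) &= \frac{2 \cdot \mathrm{LCSR}(p_i, a_j) \cdot \mathrm{LCSP}(p_i, a_j)}{\mathrm{LCSR}(p_i, a_j) + \mathrm{LCSP}(p_i, a_j)}
\end{aligned}
```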

Optionally, the training process of the paragraph class identification model includes:

acquiring a plurality of preset indexes corresponding to the paragraph classes of the referee document, and performing paragraph class marking on a first referee document sample in a first database based on the preset indexes;

inputting the first referee document sample carrying the labeling information into the paragraph category identification model to obtain a predicted paragraph category of each paragraph in the first referee document sample;

determining the real paragraph category of each paragraph in the first referee document sample based on the labeling information, and determining the structural parameters of the paragraph category identification model by minimizing the loss value between the predicted paragraph category and the real paragraph category to obtain the trained paragraph category identification model.

Optionally, the training process of the abstract generation model includes:

masking a preset proportion of the text content of a second referee document sample in a second database with a mask symbol to obtain a third referee document;

inputting the third referee document into the abstract generation model to obtain the predicted content of the masked text;

and determining the structural parameters of the abstract generation model by minimizing the loss value between the real content and the predicted content corresponding to the mask symbol, to obtain the trained abstract generation model.

In order to solve the above problem, the present invention further provides an apparatus for generating an abstract of a referee document, the apparatus comprising:

a parsing module, configured to parse a referee document abstract generation request sent by a user through a client to obtain the referee document carried by the request;

an input module, configured to input the referee document into a trained paragraph category identification model to obtain a paragraph category of each paragraph in the referee document, where the paragraph category includes a first category and a second category, and a set of paragraphs of the first category in the referee document is used as a paragraph set;

the matching module is used for performing similarity matching on each paragraph in the paragraph set and each short sentence template in a preset abstract template to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

and the splicing module is used for inputting each paragraph in the paragraph set and the target short sentence template corresponding to the paragraph set into the trained abstract generation model to obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position sequence of the target short sentence template corresponding to each paragraph in the abstract template to obtain the abstract text corresponding to the referee document.

In order to solve the above problem, the present invention also provides an electronic device, including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores a referee document summary generation program executable by the at least one processor, the referee document summary generation program being executable by the at least one processor to enable the at least one processor to perform the referee document summary generation method described above.

In order to solve the above problems, the present invention also provides a computer-readable storage medium having stored thereon a referee document summary generation program which is executable by one or more processors to implement the referee document abstract generation method described above.

Compared with the prior art, the invention first inputs the referee document into the trained paragraph category identification model to obtain the paragraph category of each paragraph in the referee document, wherein the paragraph categories comprise a first category (important paragraphs) and a second category (common paragraphs), and takes the set of first-category paragraphs in the referee document as a paragraph set. This step extracts the important paragraphs of the referee document into the paragraph set, which compresses the information scale and prevents the input to the abstract generation model from being so long that it overflows and leaves the generated abstract incomplete and inaccurate. Next, similarity matching is performed between each paragraph in the paragraph set and each short sentence template in a preset abstract template to obtain the target short sentence template corresponding to each paragraph; this matching further compresses the information scale. Finally, each paragraph in the paragraph set and its corresponding target short sentence template are input into the trained abstract generation model to obtain the target abstract short sentence for each paragraph, and the target abstract short sentences are spliced in the order in which the corresponding target short sentence templates appear in the abstract template to obtain the abstract text of the referee document; splicing in template order ensures the continuity of the abstract. Therefore, the invention ensures the consistency and accuracy of the referee document abstract.

Drawings

Fig. 1 is a schematic flow chart of a method for generating an abstract of a referee document according to an embodiment of the present invention;

Fig. 2 is a schematic block diagram of an apparatus for generating a referee document abstract according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an electronic device for implementing a method for generating a referee document abstract according to an embodiment of the present invention;

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.

The invention provides a method for generating an abstract of a referee document. Fig. 1 is a schematic flow chart of a method for generating a referee document abstract according to an embodiment of the present invention. The method may be performed by an electronic device, which may be implemented by software and/or hardware.

In this embodiment, the method for generating the referee document abstract includes:

S1, parsing a referee document abstract generation request sent by a user through a client, and acquiring the referee document carried by the request;

S2, inputting the referee document into the trained paragraph category identification model to obtain the paragraph category of each paragraph in the referee document, wherein the paragraph categories comprise a first category and a second category, and taking the set of first-category paragraphs in the referee document as a paragraph set.

Referee documents are currently mostly 2000 to 8000 words long, and their abstracts mostly 200 to 600 words long. Current Chinese text generation models cannot accommodate input and output of this size. In this embodiment, the paragraph category identification model therefore extracts the important paragraphs of the referee document to form a paragraph set, compressing the amount of information input to the abstract generation model.

The paragraph category identification model is a roberta-large-wwm model used to judge whether each paragraph of the input referee document belongs to the first category or the second category, wherein the first category is an important paragraph and the second category is a common paragraph. The roberta-large-wwm model is a derivative of the BERT-large model and contains 24 Transformer layers, 16 attention heads and 1024 hidden units.

The training process of the paragraph category identification model comprises the following steps:

A1, obtaining a plurality of preset indexes corresponding to the paragraph categories of referee documents, and labeling the paragraph category of each paragraph of a first referee document sample in a first database based on the preset indexes;

In this embodiment, the preset indexes include: the plaintiff-defendant relationship, the plaintiff's and defendant's claims, the defense opinion, the dispute focus, the statement of and opinion on legal facts, and the trial result. The paragraphs in the first referee document sample that relate to these 6 preset indexes are labeled as the first category (important paragraphs), and the other paragraphs are labeled as the second category (common paragraphs).

A2, inputting the first referee document sample carrying the labeling information into the paragraph category identification model to obtain a predicted paragraph category of each paragraph in the first referee document sample;

A3, determining the real paragraph category of each paragraph in the first referee document sample based on the labeling information, and determining the structural parameters of the paragraph category identification model by minimizing the loss value between the predicted paragraph category and the real paragraph category to obtain the trained paragraph category identification model.

The calculation formula of the loss value is as follows:

wherein $q_i$ is the predicted paragraph category of the $i$-th paragraph in the first referee document sample, $p_i$ is the real paragraph category of the $i$-th paragraph in the first referee document sample, $c$ is the total number of paragraphs in the first referee document sample, and $\mathrm{loss}(q_i, p_i)$ is the loss value between the predicted paragraph category and the real paragraph category of the $i$-th paragraph in the first referee document sample.
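The loss formula is not reproduced in this text. As a minimal sketch, assuming a binary cross-entropy averaged over the c paragraphs (a common choice for two-category classification, not confirmed by the source), the loss value could be computed as follows. The function name `paragraph_loss` and the 0/1 encoding of the real category are illustrative assumptions.

```python
import math


def paragraph_loss(predicted, actual, eps=1e-12):
    """Mean binary cross-entropy over the c paragraphs of one sample.

    predicted: q_i, predicted probability that paragraph i is first category.
    actual:    p_i, real category (1 = first category, 0 = second category).
    Cross-entropy is an assumed loss; the source does not state the formula.
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equally long")
    c = len(predicted)
    total = 0.0
    for q, p in zip(predicted, actual):
        q = min(max(q, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(p * math.log(q) + (1 - p) * math.log(1.0 - q))
    return total / c
```

A confident correct prediction yields a smaller value than a confident wrong one, which is the property that minimizing the loss relies on.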

The referee document for which an abstract is to be generated is input into the trained paragraph category identification model to obtain, for each paragraph, the probability that it belongs to the first category. When the probability value of a paragraph is greater than a preset threshold (for example, 0.7), the paragraph category of that paragraph is taken to be the first category. The set of first-category paragraphs in the referee document is used as the paragraph set, and the abstract is then generated from the information in the paragraph set.
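The thresholding described above can be sketched as follows. This is an illustration only: the probabilities are assumed to come from the trained paragraph category identification model, the function name `select_paragraph_set` is hypothetical, and 0.7 is the example threshold from the text.

```python
def select_paragraph_set(paragraphs, first_category_probs, threshold=0.7):
    """Keep the paragraphs whose first-category probability exceeds the threshold.

    paragraphs:           paragraphs of the referee document, in document order.
    first_category_probs: one probability per paragraph (assumed model output).
    Document order is preserved in the returned paragraph set.
    """
    if len(paragraphs) != len(first_category_probs):
        raise ValueError("one probability is required per paragraph")
    return [
        paragraph
        for paragraph, prob in zip(paragraphs, first_category_probs)
        if prob > threshold  # strictly greater, as in the text
    ]
```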

In this step, the paragraph category identification model extracts the important paragraphs of the referee document and thereby compresses the information scale. This prevents the input to the abstract generation model from overflowing because it is too long, ensures that the model receives complete input, and makes the generated abstract more accurate.

S3, performing similarity matching on each paragraph in the paragraph set and each short sentence template in a preset abstract template to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

The paragraphs in the paragraph set may still contain redundant information (some paragraphs exceed 500 words), and they are not necessarily contiguous, so they cannot be spliced directly into an abstract.

In this embodiment, an abstract template is configured in advance (the abstract template covers the above 6 preset indexes). An example of the abstract template is as follows: The plaintiff and the defendant are in an XXXX relationship. The plaintiff requests that the defendant be ordered to pay …. The defendant argues that the plaintiff should bear liability for breach of contract and that the plaintiff's claim has no basis in fact … or in law. This court supports the plaintiff's request. In accordance with Article X … of the Contract Law of the People's Republic of China, it is ruled as follows: 1. The defendant shall pay the plaintiff XX. 2. The plaintiff's other claims are dismissed. If the payment obligation is not fulfilled within the period specified in the judgment, double interest on the debt shall be paid for the period of delayed performance.

The obtaining of the target short sentence template corresponding to each paragraph in the paragraph set by performing similarity matching between each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template includes:

B1, calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template;

B2, when the longest common subsequence similarity values between a specified paragraph and a plurality of short sentence templates are greater than the similarity threshold, taking the short sentence template with the highest similarity value as the target short sentence template corresponding to the specified paragraph.

The calculation formula of the longest common subsequence similarity value is as follows:

wherein $p_i$ is the $i$-th paragraph in the paragraph set; $a_j$ is the $j$-th short sentence template in the abstract template; $\mathrm{LCS}(p_i, a_j)$ is the length of the longest common subsequence of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{len}(a_j)$ is the length of the $j$-th short sentence template in the abstract template; $\mathrm{len}(p_i)$ is the length of the $i$-th paragraph in the paragraph set; $\mathrm{LCSR}(p_i, a_j)$ is the upper limit of the longest common subsequence length ratio of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{LCSP}(p_i, a_j)$ is the lower limit of that length ratio; and $\mathrm{LCSFscore}(p_i, a_j)$ is the longest common subsequence similarity value between the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template.
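Since the formula is not reproduced in this text, the following is only a minimal sketch of how such a similarity value might be computed. It assumes a ROUGE-L-style reading of the symbols: LCSR is the LCS length divided by the template length, LCSP is the LCS length divided by the paragraph length, and LCSFscore is their harmonic mean with equal weighting. It also assumes character-level comparison. The function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two strings.

    Classic dynamic programming, keeping only the previous row.
    """
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_fscore(paragraph, template):
    """Assumed LCSFscore(p_i, a_j): harmonic mean of LCSR and LCSP."""
    lcs = lcs_length(paragraph, template)
    if lcs == 0:
        return 0.0  # also covers empty inputs, avoiding division by zero
    lcsr = lcs / len(template)   # ratio against the (usually shorter) template
    lcsp = lcs / len(paragraph)  # ratio against the (usually longer) paragraph
    return 2 * lcsr * lcsp / (lcsr + lcsp)
```

Under these assumptions an identical paragraph and template score 1.0, and a pair with no character in common scores 0.0.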

In this embodiment, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the summary template, the method further includes:

If the longest common subsequence similarity value between a specified paragraph and only one short sentence template is greater than the similarity threshold, that short sentence template is taken as the target short sentence template corresponding to the specified paragraph.

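In another embodiment of the present invention, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the method further comprises: if the longest common subsequence similarity value between a specified paragraph and every short sentence template in the abstract template is smaller than the similarity threshold, deleting the specified paragraph from the paragraph set; and if a plurality of paragraphs in the paragraph set correspond to the same short sentence template, merging those paragraphs, in the order in which they appear in the referee document, into a new paragraph in the paragraph set.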

In this step, similarity matching between each paragraph in the paragraph set and each short sentence template in the abstract template further compresses the information.
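Steps B1 and B2, together with the optional deletion and merging of paragraphs, can be sketched as below. This is an illustration under stated assumptions, not the implementation of the filing: the function name `match_templates` is hypothetical, the similarity function is passed in as a callable, a score exactly equal to the threshold is treated as not exceeding it, and merged paragraphs are concatenated without a separator.

```python
def match_templates(paragraphs, templates, similarity, threshold):
    """Map each template index to the merged paragraph text matched to it.

    paragraphs: paragraph set, in referee-document order.
    templates:  short sentence templates, in abstract-template order.
    similarity: callable (paragraph, template) -> float, e.g. an LCS score.
    """
    if not templates:
        return {}
    matched = {}
    for paragraph in paragraphs:
        scores = [similarity(paragraph, template) for template in templates]
        best = max(range(len(templates)), key=scores.__getitem__)
        if scores[best] <= threshold:
            continue  # no template exceeds the threshold: drop the paragraph
        matched.setdefault(best, []).append(paragraph)
    # Paragraphs that share a template are merged in document order.
    return {j: "".join(parts) for j, parts in sorted(matched.items())}
```

Keying the result by template index keeps the template position available for the later splicing step.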

S4, inputting each paragraph in the paragraph set and the corresponding target short sentence template thereof into the trained abstract generating model to obtain the target abstract short sentences corresponding to each paragraph in the paragraph set, and splicing the target abstract short sentences according to the position sequence of the target short sentence template corresponding to each paragraph in the abstract template to obtain the abstract text corresponding to the referee document.

In this embodiment, the abstract generation model is also a roberta-large-wwm model and is used to generate abstract text from paragraph information. In this scheme, the paragraph category identification model and the abstract generation model differ in their input samples and training targets, and therefore in their trained model parameters.
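The splicing in step S4 can be sketched as follows. The callable `generate` stands in for the trained abstract generation model and is purely hypothetical, as are the function name `splice_summary` and the choice to join the short sentences without a separator.

```python
def splice_summary(matched, templates, generate):
    """Build the abstract text from matched paragraphs.

    matched:   dict mapping template index -> paragraph text.
    templates: short sentence templates, in abstract-template order.
    generate:  callable (paragraph, template) -> abstract short sentence;
               a placeholder for the trained abstract generation model.
    """
    phrases = []
    for j in sorted(matched):  # follow the template position order
        phrases.append(generate(matched[j], templates[j]))
    return "".join(phrases)
```

Sorting by template index means the output order follows the abstract template even when the matching paragraphs appear in a different order in the referee document.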

The training process of the abstract generation model comprises the following steps:

C1, masking a preset proportion of the text content of a second referee document sample in a second database with a mask symbol to obtain a third referee document;

C2, inputting the third referee document into the abstract generation model to obtain the predicted content of the masked text;

C3, determining the structural parameters of the abstract generation model by minimizing the loss value between the real content and the predicted content corresponding to the mask symbol, to obtain the trained abstract generation model.

In this embodiment, the abstract generation model predicts the probability distribution of the next token from all preceding tokens (words) in each second referee document sample. To match abstract generation in this training task, one part of the text content (25% to 75% of the content of each second referee document sample) is kept as known text, and the remaining part (75% to 25% of the content of each second referee document sample) is covered by the mask symbol.
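The masking of a training sample can be sketched as follows. The sketch assumes that the known text is a leading portion of the sample and the masked part is the remainder, which fits next-token prediction but is not stated explicitly in the source. The mask string `[MASK]`, the function name `mask_sample` and the uniform choice of the kept share are likewise assumptions.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol


def mask_sample(tokens, rng=None, min_keep=0.25, max_keep=0.75):
    """Keep a leading share of the tokens as known text and mask the rest.

    Returns (masked_tokens, targets), where targets are the real tokens
    hidden behind the mask symbols, in order.
    """
    if rng is None:
        rng = random.Random()
    keep_ratio = rng.uniform(min_keep, max_keep)  # 25% to 75% stays visible
    keep = int(len(tokens) * keep_ratio)
    masked = list(tokens[:keep]) + [MASK] * (len(tokens) - keep)
    targets = list(tokens[keep:])
    return masked, targets
```

During training, the loss would then be computed only between `targets` and the content the model predicts for the masked positions.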

It can be seen from the above embodiments that the method for generating a referee document abstract provided by the present invention first inputs the referee document into the trained paragraph category identification model to obtain the paragraph category of each paragraph in the referee document, where the paragraph categories include a first category (important paragraphs) and a second category (common paragraphs), and takes the set of first-category paragraphs in the referee document as a paragraph set. This step extracts the important paragraphs of the referee document into the paragraph set, which compresses the information scale and prevents the input to the abstract generation model from being so long that it overflows and leaves the generated abstract incomplete and inaccurate. Next, similarity matching is performed between each paragraph in the paragraph set and each short sentence template in a preset abstract template to obtain the target short sentence template corresponding to each paragraph; this matching further compresses the information scale. Finally, each paragraph in the paragraph set and its corresponding target short sentence template are input into the trained abstract generation model to obtain the target abstract short sentence for each paragraph, and the target abstract short sentences are spliced in the order in which the corresponding target short sentence templates appear in the abstract template to obtain the abstract text of the referee document; splicing in template order ensures the continuity of the abstract. Therefore, the invention ensures the consistency and accuracy of the referee document abstract.

Fig. 2 is a schematic block diagram of an apparatus for generating a referee document abstract according to an embodiment of the present invention.

The referee document abstract generating apparatus 100 of the present invention can be installed in an electronic device. According to the functions implemented, the apparatus 100 may include a parsing module 110, an input module 120, a matching module 130, and a splicing module 140. A module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.

In the present embodiment, the functions regarding the respective modules/units are as follows:

The parsing module 110 is configured to parse a referee document abstract generation request sent by a user through a client, and acquire the referee document carried by the request;

an input module 120, configured to input the referee document into the trained paragraph category identification model to obtain a paragraph category of each paragraph in the referee document, where the paragraph category includes a first category and a second category, and a set of paragraphs of the first category in the referee document is used as a paragraph set.

The loss value used to train the paragraph category identification model is calculated in the same way as in the method embodiment.

A matching module 130, configured to perform similarity matching on each paragraph in the paragraph set and each short sentence template in a preconfigured abstract template, so as to obtain a target short sentence template corresponding to each paragraph in the paragraph set;

The longest common subsequence similarity value used by the matching module 130 is calculated in the same way as in the method embodiment, wherein $p_i$ is the $i$-th paragraph in the paragraph set; $a_j$ is the $j$-th short sentence template in the abstract template; $\mathrm{LCS}(p_i, a_j)$ is the length of the longest common subsequence of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{len}(a_j)$ is the length of the $j$-th short sentence template in the abstract template; $\mathrm{len}(p_i)$ is the length of the $i$-th paragraph in the paragraph set; $\mathrm{LCSR}(p_i, a_j)$ is the upper limit of the longest common subsequence length ratio of the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template; $\mathrm{LCSP}(p_i, a_j)$ is the lower limit of that length ratio; and $\mathrm{LCSFscore}(p_i, a_j)$ is the longest common subsequence similarity value between the $i$-th paragraph in the paragraph set and the $j$-th short sentence template in the abstract template.

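In this embodiment, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the matching module 130 is further configured to: if the longest common subsequence similarity value between a specified paragraph and only one short sentence template is greater than the similarity threshold, take that short sentence template as the target short sentence template corresponding to the specified paragraph.

In another embodiment of the present invention, after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each short sentence template in the abstract template, the matching module 130 is further configured to: if the longest common subsequence similarity value between a specified paragraph and every short sentence template in the abstract template is smaller than the similarity threshold, delete the specified paragraph from the paragraph set; and if a plurality of paragraphs in the paragraph set correspond to the same short sentence template, merge those paragraphs, in the order in which they appear in the referee document, into a new paragraph in the paragraph set.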

The splicing module 140 is configured to input each paragraph in the paragraph set and its corresponding target short sentence template into the trained abstract generation model, obtain the target abstract short sentence corresponding to each paragraph in the paragraph set, and splice the target abstract short sentences in the order in which the corresponding target short sentence templates appear in the abstract template, so as to obtain the abstract text corresponding to the referee document.

Fig. 3 is a schematic structural diagram of an electronic device for implementing a method for generating a referee document abstract according to an embodiment of the present invention.

The electronic device 1 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are set or stored in advance. The electronic device 1 may be a computer, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing in which a group of loosely coupled computers forms a super virtual computer.

In the present embodiment, the electronic device 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13, which are communicatively connected to each other through a system bus, wherein the memory 11 stores a referee document summary generation program 10, and the referee document summary generation program 10 is executable by the processor 12. Although FIG. 3 shows only the electronic device 1 with the components 11-13 and the referee document summary generation program 10, those skilled in the art will appreciate that the configuration shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than those shown, a combination of some components, or a different arrangement of components.

The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as a hard disk of the electronic device 1; in other embodiments, the non-volatile storage medium may also be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card provided on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used for storing an operating system and various application software installed in the electronic device 1, for example, the code of the referee document summary generation program 10 in an embodiment of the present invention. Further, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 12 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 12 is generally configured to control the overall operation of the electronic device 1, such as performing control and processing related to data interaction or communication with other devices. In this embodiment, the processor 12 is configured to run the program code stored in the memory 11 or to process data, for example, to run the referee document summary generation program 10.

The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is used for establishing a communication connection between the electronic device 1 and a client (not shown).

Optionally, the electronic device 1 may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is used for displaying information processed in the electronic device 1 and for displaying a visualized user interface.

It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.

The referee document summary generation program 10 stored in the memory 11 of the electronic device 1 is a combination of a plurality of instructions, and when running in the processor 12, can realize:

Specifically, for the way in which the processor 12 implements the referee document summary generation program 10, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here. It should be emphasized that, to further ensure the privacy and security of the referee document, the referee document may also be stored in a node of a blockchain.

Further, the integrated modules/units of the electronic device 1, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. The computer readable medium may be volatile or non-volatile. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).

The computer readable storage medium stores a referee document summary generation program 10, and the referee document summary generation program 10 can be executed by one or more processors. The specific implementation of the computer readable storage medium of the present invention is substantially the same as that of the embodiments of the referee document summary generation method, and is not repeated here.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
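The idea of data blocks associated by a cryptographic method can be illustrated with a toy hash chain in Python. This is a sketch under stated assumptions only: it uses SHA-256 from the standard library, has no network, consensus mechanism, or platform layers, and the names `block_hash`, `append_block`, and `is_valid` are hypothetical rather than part of any blockchain platform referred to in the specification.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(block: Dict) -> str:
    """SHA-256 over the block's index, records, and previous hash."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "records", "prev_hash")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict], records: List[str]) -> Dict:
    """Create a block that stores the hash of the preceding block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "records": records, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain: List[Dict]) -> bool:
    """Check every block's own hash and its link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True
```

Because each block records the hash of its predecessor, altering the records of any stored block changes its hash and causes `is_valid` to fail, which is the tamper-evidence property the paragraph above refers to as verifying validity.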

Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, and not any particular order.

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A method for generating an abstract of a referee document, the method comprising:

2. The method for generating an abstract of a referee document according to claim 1, wherein the step of matching each paragraph in the paragraph set with each clause template in a pre-configured abstract template for similarity to obtain a target clause template corresponding to each paragraph in the paragraph set comprises:

3. The method for generating an abstract of a referee document according to claim 2, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each phrase template in the abstract template, the method further comprises:

4. The method for generating an abstract of a referee document according to claim 2, wherein after calculating the longest common subsequence similarity value between each paragraph in the paragraph set and each phrase template in the abstract template, the method further comprises:

5. The method for generating an abstract of a referee document according to claim 2, wherein the longest common subsequence similarity value is calculated by the formula:

6. The method for generating an abstract of a referee document according to claim 1, wherein the training process of the paragraph category identification model comprises:

7. The method for generating an abstract of a referee document according to claim 1, wherein the training process of the abstract generation model comprises:

8. An apparatus for generating an abstract of a referee document, the apparatus comprising:

9. An electronic device, characterized in that the electronic device comprises:

at least one processor; and a memory communicatively connected to the at least one processor; wherein

the memory stores a referee document summary generation program executable by the at least one processor to enable the at least one processor to perform the referee document summary generation method of any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon a referee document digest generation program executable by one or more processors to implement the referee document digest generation method according to any one of claims 1 to 7.