CN111783472A - Judgment book content extraction method and related device - Google Patents

Info

Publication number
CN111783472A
Authority
CN
China
Prior art keywords
type
extraction
output result
content
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010612031.8A
Other languages
Chinese (zh)
Inventor
刘大双
晋耀红
李德彦
张志一
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dingfu Intelligent Technology Co Ltd
Original Assignee
Dingfu Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dingfu Intelligent Technology Co Ltd filed Critical Dingfu Intelligent Technology Co Ltd
Priority to CN202010612031.8A priority Critical patent/CN111783472A/en
Publication of CN111783472A publication Critical patent/CN111783472A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a judgment book content extraction method and a related device, which are used for extracting the trial-finding content of a judgment book and outputting it in a structured manner. The method comprises: obtaining judgment book content; obtaining a target content text in the judgment book content according to a preset expression, the preset expression being generated according to the format of the target content text; inputting the target content text into a pre-trained information extraction model and receiving an output result of the information extraction model, wherein the information extraction model performs information extraction on the target content text by using extraction nodes and the output result comprises a first output result and a second output result; and processing and outputting the first output result and the second output result in a preset manner. The judgment book content is extracted by means of extraction expressions and the extraction results are processed, so that the trial-finding content of the judgment book is extracted and output in a structured manner.

Description

Judgment book content extraction method and related device
Technical Field
The application relates to the field of text processing, and in particular to a judgment book content extraction method and a related device.
Background
A judgment book is a document written by a court to record its judgment. It is a form of practical writing commonly used in the legal field, and is characterized by being normative, innovative, public, lawful, and accurate in form.
Courts and legal institutions need to archive judgment books. Some organizations keep the source file and also extract the more important data in the judgment book for classification and archiving, so that the data in the judgment book, for example the data in the trial-finding part, can be inspected more directly.
At present, organizations flag important data by labeling, either manually or with a labeling model. Manual labeling involves a heavy workload, while a labeling model has complex processing logic and therefore a large amount of computation.
Disclosure of Invention
The embodiment of the invention provides a judgment book content extraction method and a related device, which are used for extracting the trial-finding content of a judgment book and outputting it in a structured manner.
A first aspect of an embodiment of the present invention provides a judgment book content extraction method, including:
acquiring judgment book content;
acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
inputting the target content text into a pre-trained information extraction model and receiving an output result of the information extraction model, wherein the information extraction model is used for performing information extraction on the target content text by using an extraction node, and the output result comprises a first type output result generated according to first type structure data and a second type output result generated according to second type structure data;
and processing and outputting the first type output result and the second type output result according to a preset mode.
Optionally, before the target content text is input into a pre-trained information extraction model and an output result of the information extraction model is received, the method further includes:
constructing an information extraction model, wherein the information extraction model comprises a first type text content extraction framework and a second type text content extraction framework, the first type text content extraction framework is composed of first type text content extraction nodes and corresponding first type extraction expressions, and the second type text content extraction framework is composed of second type text content extraction nodes and corresponding second type extraction expressions.
Optionally, processing the first type output result according to a preset manner includes:
acquiring an extraction node name corresponding to the first type output result;
and deleting the fixed prefix information of the extraction node name to generate a corresponding item, wherein the item is used for storing the output result.
Optionally, processing the second type of output result according to a preset manner includes:
acquiring an extraction node name corresponding to the second type output result;
and generating a corresponding item according to the extraction node name, wherein the item is used for storing the output result.
Optionally, the processing the first type output result according to a preset manner further includes:
and carrying out duplicate removal on the first type output result.
Optionally, processing the second type of output result according to a preset manner further includes:
judging whether the second type output result contains preset suffix information;
if yes, keeping the longest of the second type output results;
and if not, carrying out duplicate removal on the second type output result.
A second aspect of the present application provides a judgment book content extraction system, including:
the acquisition unit is used for acquiring the content of the judgment book;
the processing unit is used for acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
the processing unit is further configured to input the target content text into a pre-trained information extraction model and receive an output result of the information extraction model, where the information extraction model is used to extract information of the target content text by using an extraction node, and the output result includes a first type of output result generated according to first type of structure data and a second type of output result generated according to second type of structure data;
the processing unit is further configured to process and output the first type output result and the second type output result according to a preset mode.
Optionally, the system further comprises:
the information extraction model comprises a first type of text content extraction framework and a second type of text content extraction framework, the first type of text content extraction framework is composed of first type of text content extraction nodes and corresponding first type of extraction expressions, and the second type of text content extraction framework is composed of second type of text content extraction nodes and corresponding second type of extraction expressions.
A third aspect of embodiments of the present application provides a computer apparatus, including:
a processor, a memory, an input-output device, and a bus;
the processor, the memory and the input and output equipment are respectively connected with the bus;
the processor is configured to perform the method according to any of the preceding embodiments.
A fourth aspect of embodiments of the present application provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method according to the preceding embodiments.
As the above steps show, the present application has the following advantages: in this embodiment, judgment book content is obtained, and a target content text in the judgment book content is obtained, where the target content text includes first type structure data (value class data) and second type structure data (information class data). The target content text is input into a pre-trained information extraction model and the output result of the model is received; the model performs information extraction on the target content text by using extraction nodes, and the output result includes a first type output result and a second type output result. The first type output result and the second type output result are then processed and output according to a preset mode. The judgment book content is extracted by means of extraction expressions and the extraction results are processed, so that the trial-finding content of the judgment book is extracted and output in a structured manner.
Drawings
FIG. 1 is a diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 2 is another diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 3 is another diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 4 is another diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 5 is another diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 6 is another diagram illustrating an embodiment of a judgment book content extraction method according to an embodiment of the present invention;
FIG. 7 is a diagram of an embodiment of a judgment book content extraction system according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
The embodiment of the invention provides a judgment book content extraction method and a related device, which are used for extracting the trial-finding content of a judgment book and outputting it in a structured manner.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow in the embodiment of the present application is described below. Referring to FIG. 1, an embodiment of the judgment book content extraction method in the embodiment of the present application includes:
101. Acquiring judgment book content;
In this embodiment, the judgment book content may be obtained by downloading it from a website, receiving it by email, or reading a copy downloaded to the computer in advance.
102. Acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
In this embodiment, the judgment book content is extracted with a directory extraction method, which obtains the full text of the trial-finding part according to the fixed way in which the trial-finding part is expressed in the judgment book.
Specifically, the trial-finding part of a judgment book begins with a fixed phrase along the lines of "upon trial, it is found that:". Based on this characteristic, a corresponding extraction expression is designed to locate the position matching the expression and obtain the content at that position. For judgment books on enterprise loan disputes, the important information in the trial-finding part can generally be divided into a value class and an information class. Value class data is data whose result can only be one of two values (such as yes or no). For example, the result of "whether to return interest" can only be "yes" or "no"; if the result is "yes", it may be accompanied by the corresponding interest value. Information class data is data that corresponds to specific information; for example, the information corresponding to the "amount of interest to be returned" is the specific amount of interest. Corresponding extraction expressions are designed for the value class data and the information class data respectively, and both kinds of data are extracted according to those expressions.
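The positioning step described above can be sketched with an ordinary regular expression. This is only an illustration under stated assumptions: the opening phrase is an English stand-in for the fixed phrase in real judgment books, and the end marker `NEXT_SECTION_START` and the function name `extract_trial_findings` are hypothetical and do not come from the application.

```python
import re
from typing import Optional

# Hypothetical English stand-ins for the fixed phrases; the real phrases
# depend on the wording of the judgment books being processed.
FINDINGS_START = "Upon trial, it is found that:"
NEXT_SECTION_START = "The court holds that"  # assumed end marker, not from the source

# Capture everything after the opening phrase, up to the next section
# (or the end of the text if no further section is found).
FINDINGS_PATTERN = re.compile(
    re.escape(FINDINGS_START)
    + r"(.*?)"
    + r"(?=" + re.escape(NEXT_SECTION_START) + r"|\Z)",
    re.DOTALL,
)


def extract_trial_findings(judgment_text: str) -> Optional[str]:
    """Return the trial-finding text, or None if the opening phrase is absent."""
    match = FINDINGS_PATTERN.search(judgment_text)
    return match.group(1).strip() if match else None


sample = (
    "The plaintiff requests repayment. "
    "Upon trial, it is found that: The defendant borrowed 100,000 yuan "
    "from the plaintiff. The court holds that the loan is valid."
)
print(extract_trial_findings(sample))
```

Escaping the fixed phrases with `re.escape` keeps punctuation in them from being read as regular expression syntax.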
103. Inputting the target content text into a pre-trained information extraction model and receiving an output result of the information extraction model, wherein the information extraction model is used for performing information extraction on the target content text by using an extraction node, and the output result comprises a first type output result generated according to first type structure data and a second type output result generated according to second type structure data;
Specifically, the trial-finding content obtained with the directory extraction method is input into a pre-trained information extraction model. According to pre-designed extraction expressions, the model presents the important information in the trial-finding content as node-information pairs. For an extraction result of value class data, the node name begins with "whether" and the corresponding value is either "yes" or "no"; for an extraction result of information class data, the node name does not begin with "whether" and the corresponding value can be specific data such as a number or a date.
Preferably, the extraction expression may be composed of one or more conceptual expressions as disclosed in ZL 201410155830.1, in combination with one or more semantic operators.
104. Processing and outputting the first type output result and the second type output result according to a preset mode.
Specifically, the information extraction model yields one piece of information per node, and the information for a node may be "none". The extraction nodes are designed to cover every factual element that an enterprise loan dispute judgment book may involve: a large amount of data is studied when designing the model's expressions and nodes, so that no information is missed in practice. However, not every enterprise loan dispute judgment book contains information for all nodes. For example, if a judgment book does not involve any "calculation reimbursement mode" information, the "calculation reimbursement mode" node extracts nothing from it. Nodes that extract no specific information are not output when the final output is generated. That is, only the names of nodes that have a specific extraction result are displayed together with their results; these node names and results are re-summarized, and the result can be stored in a table in key-value form.
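A minimal sketch of this filtering step follows, assuming the raw model output is a mapping from node name to a list of extracted strings. That data shape and the name `summarize_results` are assumptions made for illustration; the application does not specify them.

```python
from typing import Dict, List, Optional


def summarize_results(raw: Dict[str, Optional[List[str]]]) -> Dict[str, List[str]]:
    """Keep only nodes that extracted something; drop empty or "none" results."""
    summary: Dict[str, List[str]] = {}
    for node_name, values in raw.items():
        # Treat a missing list, empty strings and the literal "none" as no result.
        kept = [v for v in (values or []) if v and v.strip().lower() != "none"]
        if kept:
            summary[node_name] = kept
    return summary


raw_output = {
    "whether to return interest": ["yes"],
    "calculation reimbursement mode": [],          # nothing extracted
    "amount of interest to be returned": ["none"],  # placeholder result
}
summary = summarize_results(raw_output)
for key, value in summary.items():
    print(key, "->", value)
```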
In this embodiment, judgment book content is obtained, and a target content text in the judgment book content is obtained, where the target content text includes first type structure data (value class data) and second type structure data (information class data). The target content text is input into a pre-trained information extraction model and the output result of the model is received; the model performs information extraction on the target content text by using extraction nodes, and the output result includes a first type output result and a second type output result. The first type output result and the second type output result are then processed and output according to a preset mode. The judgment book content is extracted by means of extraction expressions and the extraction results are processed, so that the trial-finding content of the judgment book is extracted and output in a structured manner.
Based on the embodiment described in FIG. 1, this embodiment further describes constructing the information extraction model in advance, before the judgment book content is obtained. Referring to FIG. 2, another embodiment of the judgment book content extraction method of the present invention includes:
201. Constructing an information extraction model, wherein the information extraction model comprises a first type text content extraction framework and a second type text content extraction framework, the first type text content extraction framework is composed of first type text content extraction nodes and corresponding first type extraction expressions, and the second type text content extraction framework is composed of second type text content extraction nodes and corresponding second type extraction expressions.
In this embodiment, the content of the trial-finding part of an enterprise loan dispute judgment book is classified from two angles. The first angle is the kind of content extracted: the value class, whose result is yes or no, and the information class, whose result is the data corresponding to specific information. The second angle is the kind of factual element in the judgment book: loan relationship data and guarantee data. Loan relationship data is extracted by the corresponding loan relationship extraction nodes in the loan relationship framework, and guarantee data is extracted by the corresponding guarantee extraction nodes in the guarantee framework. In other words, the trial-finding content is divided into a loan relationship part and a guarantee part, each of which contains value class data and information class data. In the final output, the node names and extraction results of the value class data and information class data in the loan relationship part that have extraction results are summarized and output, and likewise for the guarantee part. The output can take the form of a table filled with the node names and extraction results as key-value pairs.
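The two-framework summary can be sketched as a simple grouping of extraction results. The assignment of node names to the loan relationship and guarantee frameworks below is a guess made for illustration, and `FRAMEWORKS` and `group_by_framework` are hypothetical names.

```python
from typing import Dict

# Assumed assignment of extraction nodes to frameworks (illustrative only).
FRAMEWORKS = {
    "loan relationship": [
        "whether to return interest",         # value class node
        "amount of interest to be returned",  # information class node
    ],
    "guarantee": [
        "mortgage",                           # information class node
    ],
}


def group_by_framework(results: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group node results by framework, keeping only nodes that have a result."""
    grouped: Dict[str, Dict[str, str]] = {}
    for framework, node_names in FRAMEWORKS.items():
        rows = {name: results[name] for name in node_names if name in results}
        if rows:
            grouped[framework] = rows
    return grouped


results = {"whether to return interest": "yes", "mortgage": "vehicle"}
grouped = group_by_framework(results)
for framework, rows in grouped.items():
    for key, value in rows.items():
        print(framework, "|", key, "|", value)
```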
Specifically, to obtain the important information in an enterprise loan dispute judgment book and output it in structured form, an information extraction model can extract that information by means of information extraction expressions. The model contains a plurality of extraction nodes, each corresponding to one factual element in the judgment book content, and each extraction node contains a plurality of extraction expressions. An extraction expression describes a string matching pattern in the manner of a regular expression; it can be used to check whether a string contains a certain substring, to replace a matched substring, or to take out a substring that meets a certain condition. In this application scenario, the factual elements that may appear in an enterprise loan dispute judgment book are described in relatively fixed ways, so corresponding extraction nodes are established for all factual elements by studying the factual elements contained in a large number of such judgment books. A node holds several extraction expressions because, for the content of some nodes, a single expression may miss part of the content. For example, suppose the content that an extraction expression is meant to extract ends in "mode", but the target content has the form "× mode", that is, it contains two occurrences of "mode". With only one extraction expression, only the content up to the first "mode" might be extracted. Such a node therefore holds several extraction expressions, so that content containing one "mode" or several can be extracted.
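The idea of one node holding several expressions can be illustrated with plain Python regular expressions. This is a sketch under assumptions: the application's extraction expressions may instead be built from conceptual expressions and semantic operators, and the class `ExtractionNode`, the sample sentence and both patterns are invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractionNode:
    """One factual element with every expression that may match it."""
    name: str
    expressions: List[str]  # each pattern must define one capture group

    def extract(self, text: str) -> List[str]:
        results = []
        for expression in self.expressions:
            for match in re.finditer(expression, text):
                results.append(match.group(1))
        return results


node = ExtractionNode(
    name="calculation reimbursement mode",
    expressions=[
        r"agreed on (.*?mode)",  # lazy: stops at the first "mode"
        r"agreed on (.*mode)",   # greedy: runs to the last "mode"
    ],
)

text = (
    "The parties agreed on the equal principal mode and, "
    "after renegotiation, the monthly interest mode."
)
for result in node.extract(text):
    print(result)
```

The lazy pattern alone would return only the text up to the first "mode"; the greedy pattern also captures the longer span that contains both occurrences.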
Based on the embodiment shown in FIG. 1, this embodiment further describes how the first type output result is processed. Referring to FIG. 3, another embodiment of the judgment book content extraction method of the present invention includes:
301. Acquiring an extraction node name corresponding to the first type output result, and deleting the fixed prefix information of the extraction node name to generate a corresponding item, wherein the item is used for storing the output result.
In this embodiment, the output result of the information extraction model is processed to generate the final structured list of factual elements of the enterprise loan judgment book.
Specifically, during processing, every extraction node in the information extraction model performs information extraction on the trial-finding part in turn. The nodes are obtained by studying the content of a large number of judgment books, so for a specific judgment book a given node may extract nothing. For example, a node may correspond to "whether to repay the late payment fee", but if the judgment book being processed has no content on whether the late payment fee is to be repaid, that node extracts no result. Generating the final structured list of factual elements means outputting all extraction nodes that have results, together with those results.
Further, for extraction nodes that have a specific extraction result, the structured list of factual elements can be generated from the extraction node names: one item is created for the name of each extraction node that has a result, and all the items are combined. For value class data, the extraction node names uniformly begin with "whether". The item name in the final structured list is therefore the extraction node name with this prefix deleted, and the extraction result is stored in the item as the corresponding value. Taking "whether to agree to repay the late payment fee" as an example: when the result is yes, a new item name "agree to repay the late payment fee" is formed from the node name, and the corresponding result can be the specific late payment fee amount. If the judgment book mentions that repayment of the late payment fee was not agreed, the result is no. If the judgment book mentions nothing related to whether the late payment fee is to be repaid, the node extracts no result.
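The prefix deletion for value class items might look like the following sketch. The English prefix `"whether to "` stands in for the fixed prefix of the original node names, and `VALUE_PREFIX` and `item_name` are hypothetical names.

```python
# Assumed English stand-in for the fixed prefix of value class node names.
VALUE_PREFIX = "whether to "


def item_name(node_name: str) -> str:
    """Delete the fixed prefix from a value class node name.

    Node names without the prefix (information class) are returned unchanged.
    """
    if node_name.startswith(VALUE_PREFIX):
        return node_name[len(VALUE_PREFIX):]
    return node_name


value_item = item_name("whether to agree to repay the late payment fee")
info_item = item_name("mortgage")
print(value_item)
print(info_item)
```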
Based on the embodiment shown in FIG. 1, this embodiment further describes how the second type output result is processed. Referring to FIG. 4, another embodiment of the judgment book content extraction method of the present invention includes:
401. Acquiring an extraction node name corresponding to the second type output result, and generating a corresponding item according to the extraction node name, wherein the item is used for storing the output result.
In this embodiment, the output result of the information extraction model is processed to generate the final structured list of factual elements of the enterprise loan judgment book.
Specifically, during processing, every extraction node in the information extraction model performs information extraction on the trial-finding part in turn. The nodes are obtained by studying the content of a large number of judgment books, so for a specific judgment book a given node may extract nothing. For example, a node may correspond to "mortgage", but if the judgment book being processed has no content related to a mortgage, that node extracts no result. Generating the final structured list of factual elements means outputting all extraction nodes that have results, together with those results.
Further, for extraction nodes that have a specific extraction result, the structured list of factual elements can be generated from the extraction node names: one item is created for the name of each extraction node that has a result, and all the items can be collected into a table.
Based on the embodiment shown in FIG. 3, this embodiment further describes how the first type output result is processed. Referring to FIG. 5, another embodiment of the judgment book content extraction method of the present invention includes:
501. Carrying out duplicate removal on the first type output result.
Specifically, for value class data, each extraction expression in the corresponding extraction node can only yield yes or no, and for the same content the result can only be one of the two. Because one extraction node holds multiple extraction expressions, the output of a value class node may be several identical "yes" results or several identical "no" results; only one of them needs to be stored in the item.
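Duplicate removal for a value class node can be sketched as an order-preserving deduplication of the results returned by its expressions. The function name `deduplicate` is hypothetical.

```python
from typing import List


def deduplicate(results: List[str]) -> List[str]:
    """Remove repeated results while keeping first-seen order."""
    # dict keys are unique and preserve insertion order.
    return list(dict.fromkeys(results))


# Three expressions of one value class node all matched "yes".
stored = deduplicate(["yes", "yes", "yes"])
print(stored)
```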
In this embodiment, based on the embodiment described in fig. 4, a processing manner of the second type of output result is further described, specifically referring to fig. 6, another embodiment of a method for extracting content of a decision book of the present invention includes:
601. judging whether preset suffix information exists in the second type of output result;
specifically, for the information class data, the extracted results of the extraction expressions included in the corresponding extraction nodes may be the same content or different contents, if the extraction results of the extraction nodes are the same content, the adopted method is also the deduplication method, if the extraction results of the extraction nodes are different, the length-taking method is adopted, taking a specific extraction node as an example, for example, a node of "mortgage", the extraction results of the extraction expressions included in the node are all the mortgage names included in the enterprise loan dispute decision books, no matter the number of the mortgages is several, only the built item names are the mortgages, and the corresponding extraction result is a vehicle, if the mortgage is a room and a vehicle in the decision book, the built node name in the project is the mortgage, the corresponding extraction result is a room and a vehicle, different extraction results do not exist, the content which is required to be extracted by a certain extraction expression is ended in a mode, a plurality of extraction expressions may exist in the extraction node, and the content correspondingly comprising a mode or more modes can be extracted.
602. If so, keeping the longest of the second type output results;
Specifically, as described in step 601, a study of a large number of enterprise loan dispute judgment books shows that the last "manner" generally marks the final summary of a piece of content. The present application therefore keeps the longest extraction result of such nodes, that is, it retains the extraction result with the largest number of bytes and stores it in the corresponding item.
603. If not, deduplicating the second type output result.
Specifically, for information class data without the fixed suffix, the extraction results of the corresponding extraction node are consistent with one another, as in the case of value class data. Only deduplication is therefore needed, and one of the results is stored in the corresponding item.
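Steps 601 to 603 can be sketched as a single function. This is a sketch under stated assumptions: the suffix value `"manner"` is taken from the example in the description, while the function name, the use of UTF-8 for counting bytes, and the rule that one suffixed result is enough to trigger the longest-result branch are assumptions.

```python
PRESET_SUFFIX = "manner"  # assumed value of the preset suffix information


def process_second_type(results, suffix=PRESET_SUFFIX):
    """Return the single result to store in the item for one information class node."""
    if not results:
        return None
    # Step 601: does any result end with the preset suffix?
    if any(r.strip().endswith(suffix) for r in results):
        # Step 602: keep the result with the largest number of bytes
        # (UTF-8 is an assumption; the description does not name an encoding).
        return max(results, key=lambda r: len(r.encode("utf-8")))
    # Step 603: results are expected to be identical, so keep the first distinct one.
    return list(dict.fromkeys(results))[0]


manners = ["guarantee manner", "guarantee manner and mortgage manner"]
mortgages = ["house and vehicle", "house and vehicle"]
print(process_second_type(manners))
print(process_second_type(mortgages))
```

The first call keeps the longer summary, and the second call collapses the two identical results into one.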
In this embodiment, the content of the judgment book is obtained, and a target content text in that content is obtained according to a preset expression, where the preset expression is generated according to the format of the target content text and the target content text comprises first type structure data and second type structure data. The target content text is input into a pre-trained information extraction model, and an output result of the information extraction model is received; the information extraction model performs information extraction on the target content text by using extraction nodes, and the output result comprises a first type output result and a second type output result. The first type output result and the second type output result are then processed and output according to a preset manner. Because the content of the judgment book is extracted through extraction expressions and the extraction results are then processed, the content found on examination in the judgment book is extracted and output in structured form.
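The flow summarized above, up to the raw output results, might look like the sketch below. Everything concrete in it is hypothetical: the section headings matched by `PRESET_EXPRESSION`, the node names, and the regular expressions standing in for extraction expressions. The application does not disclose the actual expressions, and treating a value class node as "no" only when none of its expressions match is a simplification.

```python
import re

# Hypothetical preset expression: the target text sits between two headings.
PRESET_EXPRESSION = re.compile(r"Facts found:(.*?)(?=Court opinion:|\Z)", re.S)

# Hypothetical extraction nodes, each with several extraction expressions.
VALUE_NODES = {"whether_guaranteed": [r"\bguarantee", r"\bguarantor"]}
INFO_NODES = {
    "mortgage": [r"mortgaged (?:his|her|its|the) (\w+)", r"(\w+) was mortgaged"]
}


def run_extraction(judgment_text):
    """Return raw first type (value) and second type (information) outputs."""
    match = PRESET_EXPRESSION.search(judgment_text)
    target = match.group(1) if match else ""
    # One "yes" per matching expression; a single "no" if nothing matches.
    first_type = {
        node: ["yes" for p in patterns if re.search(p, target)] or ["no"]
        for node, patterns in VALUE_NODES.items()
    }
    # Every captured group of every expression is one raw result.
    second_type = {
        node: [m.group(1) for p in patterns for m in re.finditer(p, target)]
        for node, patterns in INFO_NODES.items()
    }
    return first_type, second_type


text = (
    "Facts found: The borrower mortgaged the vehicle. "
    "A guarantor signed the guarantee. Court opinion: ..."
)
first, second = run_extraction(text)
print(first, second)
```

The raw lists returned here still contain repeated results, which is why the description follows extraction with a separate processing step.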
The method in the embodiment of the present application is introduced above, and the embodiment of the present application is described below from the perspective of a virtual device.
Referring to fig. 7, an embodiment of the judgment book content extraction system in the embodiments of the present application includes:
an obtaining unit 701, configured to obtain content of a judgment book;
a processing unit 702, configured to obtain a target content text in the content of the judgment book according to a preset expression, where the preset expression is generated according to a format of the target content text, and the target content text includes first type structure data and second type structure data, where the first type structure data is value class data, and the second type structure data is information class data;
the processing unit 702 is further configured to input the target content text into a pre-trained information extraction model, and receive an output result of the information extraction model, where the information extraction model is used to extract information of the target content text by using an extraction node, and the output result includes a first type output result generated according to the first type structure data and a second type output result generated according to the second type structure data;
the processing unit 702 is further configured to process and output the first type output result and the second type output result according to a preset manner.
As a preferred embodiment, the system further comprises:
the constructing unit 703 is configured to construct an information extraction model, where the information extraction model includes a first-class text content extraction framework and a second-class text content extraction framework, the first-class text content extraction framework is composed of first-class text content extraction nodes and corresponding first-class extraction expressions, and the second-class text content extraction framework is composed of second-class text content extraction nodes and corresponding second-class extraction expressions.
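The structure that the constructing unit builds can be pictured with a few small data classes. This is a sketch only: the class names, field names, and the `build_model` helper are assumptions chosen to mirror the wording above, and the sample node names and expressions are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractionNode:
    name: str
    expressions: List[str] = field(default_factory=list)  # extraction expressions


@dataclass
class ExtractionFramework:
    nodes: List[ExtractionNode] = field(default_factory=list)


@dataclass
class InformationExtractionModel:
    first_type: ExtractionFramework   # nodes for value class data
    second_type: ExtractionFramework  # nodes for information class data


def build_model(value_spec: Dict[str, List[str]],
                info_spec: Dict[str, List[str]]) -> InformationExtractionModel:
    """Build both frameworks from {node name: [expressions]} mappings."""
    def framework(spec):
        return ExtractionFramework(
            [ExtractionNode(name, list(exprs)) for name, exprs in spec.items()]
        )
    return InformationExtractionModel(framework(value_spec), framework(info_spec))


model = build_model(
    {"whether_guaranteed": [r"guarantee", r"guarantor"]},  # hypothetical node
    {"mortgage": [r"mortgaged the (\w+)"]},                # hypothetical node
)
print(model.first_type.nodes[0].name, len(model.first_type.nodes[0].expressions))
```

Keeping the two frameworks separate lets later processing apply a different rule to each type of output result.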
In this embodiment, the obtaining unit 701 is configured to obtain the content of the judgment book. The processing unit 702 is configured to obtain a target content text in the content of the judgment book, where the target content text includes first type structure data and second type structure data, the first type structure data being value class data and the second type structure data being information class data. The processing unit 702 is further configured to input the target content text into a pre-trained information extraction model and receive an output result of the information extraction model, where the information extraction model performs information extraction on the target content text by using extraction nodes, and the output result includes a first type output result generated according to the first type structure data and a second type output result generated according to the second type structure data. The processing unit 702 is further configured to process and output the first type output result and the second type output result according to a preset manner. Because the content of the judgment book is extracted through extraction expressions and the extraction results are then processed, the content found on examination in the judgment book is extracted and output in structured form.
Referring to fig. 8, a computer device in an embodiment of the present application is described below from the perspective of a physical device, where an embodiment of the computer device in the embodiment of the present application includes:
The computer device 800 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 801 (e.g., one or more processors) and a memory 805, where one or more applications or data are stored in the memory 805.
The memory 805 may be volatile storage or persistent storage. The program stored in the memory 805 may include one or more modules, each of which may include a series of instruction operations for the computer device. Further, the central processing unit 801 may be configured to communicate with the memory 805 and to execute, on the computer device 800, the series of instruction operations in the memory 805.
The computer device 800 may also include one or more power supplies 802, one or more wired or wireless network interfaces 803, one or more input-output interfaces 804, and/or one or more operating systems, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, and the like.
The processor 801 is specifically configured to perform the following steps:
acquiring judgment book content;
acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
inputting the target content text into a pre-trained information extraction model and receiving an output result of the information extraction model, wherein the information extraction model is used for performing information extraction on the target content text by using an extraction node, and the output result comprises a first type output result generated according to first type structure data and a second type output result generated according to second type structure data;
and processing and outputting the first type output result and the second type output result according to a preset mode.
Optionally, before the target content text is input into a pre-trained information extraction model and an output result of the information extraction model is received, the method further includes:
constructing an information extraction model, wherein the information extraction model comprises a first type text content extraction framework and a second type text content extraction framework, the first type text content extraction framework is composed of first type text content extraction nodes and corresponding first type extraction expressions, and the second type text content extraction framework is composed of second type text content extraction nodes and corresponding second type extraction expressions.
Optionally, processing the first type output result according to a preset manner includes:
acquiring an extraction node name corresponding to the first type output result;
and deleting the fixed prefix information of the extracted node name to generate a corresponding item, wherein the item is used for storing the output result.
Optionally, processing the second type of output result according to a preset manner includes:
acquiring an extraction node name corresponding to the second type output result;
and generating a corresponding item according to the extracted node name, wherein the item is used for storing the output result.
Optionally, the processing the first type output result according to a preset manner further includes:
and carrying out duplicate removal on the first type output result.
Optionally, processing the second type of output result according to a preset manner further includes:
judging whether the second type output result contains preset suffix information;
if so, keeping the longest of the second type output results;
and if not, deduplicating the second type output result.
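The item generation in the optional steps above can be sketched as follows. The prefix value `"whether_"` and all node names are hypothetical; this part of the description only says that a fixed prefix is deleted from the node name for the first type output result, while the second type uses the node name as it is.

```python
FIXED_PREFIX = "whether_"  # hypothetical fixed prefix of value class node names


def item_name(node_name, is_first_type):
    """Derive the item name from an extraction node name."""
    if is_first_type and node_name.startswith(FIXED_PREFIX):
        # First type: delete the fixed prefix information.
        return node_name[len(FIXED_PREFIX):]
    # Second type: the item is named after the node itself.
    return node_name


def build_items(first_type, second_type):
    """Create one item per node; each item stores that node's output result."""
    items = {}
    for node, result in first_type.items():
        items[item_name(node, True)] = result
    for node, result in second_type.items():
        items[item_name(node, False)] = result
    return items


items = build_items({"whether_guaranteed": "yes"}, {"mortgage": "house and vehicle"})
print(items)
```

The printed mapping is the structured output: one named item per extraction node, each holding a single processed result.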
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above steps do not mean the execution sequence, and the execution sequence of the steps should be determined by their functions and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for extracting content of a judgment book, comprising:
acquiring judgment book content;
acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
inputting the target content text into a pre-trained information extraction model and receiving an output result of the information extraction model, wherein the information extraction model is used for performing information extraction on the target content text by using an extraction node, and the output result comprises a first type output result generated according to first type structure data and a second type output result generated according to second type structure data;
and processing and outputting the first type output result and the second type output result according to a preset mode.
2. The method of claim 1, wherein prior to entering the target content text into a pre-trained information extraction model and receiving an output of the information extraction model, the method further comprises:
constructing an information extraction model, wherein the information extraction model comprises a first type text content extraction framework and a second type text content extraction framework, the first type text content extraction framework is composed of first type text content extraction nodes and corresponding first type extraction expressions, and the second type text content extraction framework is composed of second type text content extraction nodes and corresponding second type extraction expressions.
3. The method of claim 1, wherein processing the first type of output results in a predetermined manner comprises:
acquiring an extraction node name corresponding to the first type output result;
and deleting the fixed prefix information of the extracted node name to generate a corresponding item, wherein the item is used for storing the output result.
4. The method of claim 1, wherein processing the output results of the second type in a predetermined manner comprises:
acquiring an extraction node name corresponding to the second type output result;
and generating a corresponding item according to the extracted node name, wherein the item is used for storing the output result.
5. The method of claim 3, wherein processing the first type of output result in a predetermined manner further comprises:
and carrying out duplicate removal on the first type output result.
6. The method of claim 4, wherein processing the output results of the second type in a predetermined manner further comprises:
judging whether the second type output result contains preset suffix information;
if so, keeping the longest of the second type output results;
and if not, deduplicating the second type output result.
7. A system for extracting content of a judgment book, comprising:
the acquisition unit is used for acquiring the content of the judgment book;
the processing unit is used for acquiring a target content text in the judgment book content, wherein the target content text comprises first type structure data and second type structure data, the first type structure data is value class data, and the second type structure data is information class data;
the processing unit is further configured to input the target content text into a pre-trained information extraction model and receive an output result of the information extraction model, where the information extraction model is used to extract information of the target content text by using an extraction node, and the output result includes a first type of output result generated according to first type of structure data and a second type of output result generated according to second type of structure data;
the processing unit is further configured to process and output the first type output result and the second type output result according to a preset mode.
8. The system of claim 7, further comprising:
a constructing unit, configured to construct an information extraction model, wherein the information extraction model comprises a first type text content extraction framework and a second type text content extraction framework, the first type text content extraction framework is composed of first type text content extraction nodes and corresponding first type extraction expressions, and the second type text content extraction framework is composed of second type text content extraction nodes and corresponding second type extraction expressions.
9. A computer device, comprising:
a processor, a memory, an input-output device, and a bus;
the processor, the memory and the input and output equipment are respectively connected with the bus;
the processor is configured to perform the method of any one of claims 1 to 6.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that: the computer program, when being executed by a processor, realizes the steps of the method according to any one of claims 1 to 6.
CN202010612031.8A 2020-06-30 2020-06-30 Judgment book content extraction method and related device Pending CN111783472A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010612031.8A CN111783472A (en) 2020-06-30 2020-06-30 Judgment book content extraction method and related device

Publications (1)

Publication Number Publication Date
CN111783472A (en) 2020-10-16

Family

ID=72760432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010612031.8A Pending CN111783472A (en) 2020-06-30 2020-06-30 Judgment book content extraction method and related device

Country Status (1)

Country Link
CN (1) CN111783472A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257428A (en) * 2020-10-22 2021-01-22 鼎富智能科技有限公司 Punishment decision analysis method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107608948A (en) * 2017-10-16 2018-01-19 北京神州泰岳软件股份有限公司 A kind of construction method and device of Text Information Extraction model
WO2018224028A1 (en) * 2017-06-09 2018-12-13 北京国双科技有限公司 Method and device for acquiring focus of judgement document
CN109241528A (en) * 2018-08-24 2019-01-18 讯飞智元信息科技有限公司 A kind of measurement of penalty prediction of result method, apparatus, equipment and storage medium
CN109344187A (en) * 2018-08-28 2019-02-15 合肥工业大学 A kind of judicial decision writing desk feelings message structure processing system
CN110163257A (en) * 2019-04-23 2019-08-23 百度在线网络技术(北京)有限公司 Method, apparatus, equipment and the computer storage medium of drawing-out structure information
CN110209721A (en) * 2019-06-04 2019-09-06 南方科技大学 Judgement document transfers method, apparatus, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination