WO2020135247A1 - Method and device for analyzing legal documents - Google Patents
Method and device for analyzing legal documents
- Publication number
- WO2020135247A1 (PCT/CN2019/126934)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- crime
- name
- correspondence
- sentencing
- conviction
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services
Definitions
- the present invention relates to the technical field of data processing, and more specifically, to a method and device for analyzing legal documents.
- the present invention is proposed in order to provide a legal document analysis method and device that overcome the above problems or at least partially solve the above problems.
- the present invention provides the following technical solutions:
- a method for analyzing legal documents includes:
- the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing is established.
- the establishing the correspondence between the name of the criminal subject, the name of the crime and the plot of conviction and sentencing according to the first correspondence and the second correspondence includes:
- the preset association dictionary records the correspondence between the preset crime and the pre-set crime sentencing plot as a third correspondence
- the method further includes:
- the process of extracting the crime includes:
- the preset keywords include: judgment or exemption from criminal punishment;
- the extraction process of the conviction and sentencing plot includes:
- the preset regular expression is a regular expression constructed using a pre-defined crime sentencing scenario.
- the extraction process of the conviction and sentencing plot includes:
- the method further includes:
- the target correspondence is the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing.
- a legal document analysis device includes:
- a crime information extraction unit, used to extract the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document;
- a first relationship establishing unit, used to establish the correspondence between the name of the criminal subject and the name of the crime as the first correspondence when the clause where the crime is located in the legal document contains the name of the criminal subject;
- a second relationship establishing unit, configured to establish the correspondence between the name of the criminal subject and the conviction and sentencing circumstances as the second correspondence when the clause where the conviction and sentencing circumstances are located in the legal document contains the name of the criminal subject;
- the target relationship establishing unit is configured to establish a correspondence between the name of the criminal subject, the name of the crime, and the plot of conviction and sentencing according to the first correspondence and the second correspondence.
- a storage medium includes a stored program, wherein, when the program is running, the device where the storage medium is located is controlled to perform the aforementioned legal document analysis method.
- a processor for running a program wherein the method for analyzing a legal document described above is executed when the program is run.
- after extracting the names of the criminal subjects, the names of the crimes, and the conviction and sentencing circumstances from a legal document, the method and device determine the correspondence between the name of the criminal subject and the name of the crime according to their positional relationship in the legal document, determine the correspondence between the name of the criminal subject and the conviction and sentencing circumstances according to their positional relationship in the legal document, and then determine the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances, rather than directly linking every name, crime, and circumstance extracted from the same legal document.
- in this way, the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances can be accurately parsed from the legal documents of various types of cases, so the method is applicable to the effective analysis of legal documents in various cases.
- FIG. 1 is a flowchart of a legal document analysis method provided by an embodiment of this application
- FIG. 2 is another flowchart of a legal document analysis method provided by an embodiment of this application.
- FIG. 3 is another flowchart of the legal document analysis method provided by the embodiment of the present application.
- FIG. 5 is a diagram showing the analysis result of legal documents provided by embodiments of the present application.
- FIG. 6 is a schematic structural diagram of a legal document analysis device provided by an embodiment of the present application.
- A subject of crime refers to a natural person or unit that carries out acts that endanger society and should be held criminally responsible according to law.
- A natural-person subject refers to a natural person who has the capacity for criminal responsibility.
- A unit subject refers to a company, enterprise, institution, organ, or group that commits acts that endanger society and should bear criminal responsibility according to law.
- The name of a crime is the name of each specific crime stipulated in the Specific Provisions of the Criminal Law, and it is a high-level summary of the essential characteristics of that specific crime.
- The name of the crime reflects the essential difference between one crime and another, and is the fundamental boundary that distinguishes one crime from another.
- Conviction circumstances exist in the course of committing the crime; they determine that a certain act constitutes a crime by reflecting the social harmfulness of the criminal act and the personal dangerousness of the perpetrator and its degree.
- Sentencing circumstances refer to the various subjective and objective circumstances, stipulated by law or recognized in judicial practice, that should be considered, on the premise that the conduct already constitutes a crime, when deciding the severity of the sentence or whether to exempt the offender from punishment.
- FIG. 1 is a flowchart of a legal document analysis method provided by an embodiment of the present application.
- the method includes:
- S110 Extract the name of the subject of the crime, the name of the crime and the circumstances of conviction and sentencing in the legal document.
- NLP: Natural Language Processing
- the name of the subject of the crime can be the name of the offender or the name of the criminal unit;
- the conviction and sentencing plot includes the conviction plot and sentencing plot, and is also the collective term of the conviction plot and sentencing plot.
- the name of the criminal subject and the name of the crime can be correspondingly output and stored according to the first correspondence.
- the name of the criminal subject and the conviction and sentencing plot can be correspondingly output and stored according to the second correspondence relationship.
- S140 Establish a correspondence between the name of the criminal subject, the name of the crime, and the plot of conviction and sentencing according to the first correspondence and the second correspondence.
- after extracting the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document, the legal document analysis method determines the correspondence between the name of the criminal subject and the name of the crime according to their positional relationship in the legal document, determines the correspondence between the name of the criminal subject and the conviction and sentencing circumstances according to their positional relationship in the legal document, and then determines the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances, rather than directly linking every name, crime, and circumstance extracted from the same legal document. The method is therefore applicable to the effective analysis of legal documents of various types of cases (e.g., one person with one crime, one person with multiple crimes, multiple persons with one crime, and multiple persons with multiple crimes).
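The clause-based linking described for FIG. 1 can be illustrated with a minimal Python sketch. It assumes the entities have already been extracted (S110) and that clauses are delimited by common punctuation; the function names, the delimiter set, and the sample sentence are hypothetical and are not taken from the application. The final join is deliberately simplified: it groups crimes and circumstances per subject and does not pair each circumstance with a specific crime.

```python
import re
from collections import defaultdict

# Assumed clause delimiters (ASCII and Chinese punctuation); not specified by the source.
CLAUSE_SPLIT = re.compile(r"[,;.!?,;。!?]")


def split_clauses(text):
    """Split a document into non-empty clauses."""
    return [c.strip() for c in CLAUSE_SPLIT.split(text) if c.strip()]


def link_by_clause(clauses, subjects, items):
    """Map each subject to the items (crimes or circumstances) sharing a clause with it."""
    links = defaultdict(list)
    for clause in clauses:
        for item in items:
            if item not in clause:
                continue
            for subject in subjects:
                if subject in clause and item not in links[subject]:
                    links[subject].append(item)
    return dict(links)


def build_target(first, second):
    """S140 (simplified): join the two correspondences on the subject name."""
    subjects = sorted(set(first) | set(second))
    return {s: {"crimes": first.get(s, []), "circumstances": second.get(s, [])}
            for s in subjects}


text = ("Zhang San committed theft, Zhang San voluntarily surrendered; "
        "Li Si committed fraud, Li Si is a recidivist.")
clauses = split_clauses(text)
subjects = ["Zhang San", "Li Si"]
first = link_by_clause(clauses, subjects, ["theft", "fraud"])                         # S120
second = link_by_clause(clauses, subjects, ["voluntarily surrendered", "recidivist"])  # S130
target = build_target(first, second)                                                  # S140
```

Because linking is done per clause, the two defendants in the sample sentence are not cross-linked to each other's crimes, which is the behavior the text contrasts with linking everything extracted from the same document.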
- FIG. 2 is another flowchart of a method for analyzing legal documents provided by an embodiment of the present application.
- the method includes:
- S210 Extract the name of the subject of the crime, the name of the crime and the circumstances of conviction and sentencing in the legal document.
- the process of extracting the crime may include:
- the preset keywords include: judgment or exemption from criminal punishment;
- the legal document may include a judgment document, and the judgment result paragraph may refer to the "judgment as follows" paragraph of the judgment document; that is, the positions of the preset keywords "judgment" or "exemption from criminal punishment" are located in the "judgment as follows" paragraph, and the crimes that match the preset crime dictionary are extracted from the clause preceding each of those positions.
- the judgment documents can specifically include criminal judgment documents.
- the preset crime dictionary includes the preset crimes prescribed by law, which can be used to match the crimes that appear in legal documents.
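As a rough illustration of the keyword-and-dictionary extraction described above, the following Python sketch locates clauses containing a preset keyword and looks up dictionary crimes in the clause immediately before. The dictionary entries, the English keyword strings (stand-ins for the preset keywords), and the function name are assumptions made for the example.

```python
import re

# Hypothetical preset crime dictionary and English stand-ins for the preset keywords.
CRIME_DICTIONARY = {"theft", "fraud", "robbery"}
KEYWORDS = ("sentenced to", "exempted from criminal punishment")


def extract_crimes(result_paragraph):
    """Return dictionary crimes found in the clause preceding each keyword clause."""
    clauses = [c.strip() for c in re.split(r"[,;.]", result_paragraph) if c.strip()]
    found = []
    for i, clause in enumerate(clauses):
        if i == 0 or not any(k in clause for k in KEYWORDS):
            continue
        previous = clauses[i - 1]
        for crime in sorted(CRIME_DICTIONARY):
            if crime in previous and crime not in found:
                found.append(crime)
    return found


paragraph = ("Defendant Zhang San committed theft, and is sentenced to three years in prison; "
             "defendant Li Si committed fraud, and is exempted from criminal punishment.")
crimes = extract_crimes(paragraph)
```

Restricting the lookup to the clause before a keyword keeps crimes that are merely mentioned elsewhere in the paragraph (for example, in a charge the court rejected) out of the result.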
- the extraction process of the conviction and sentencing scenario may include:
- the preset regular expression is a regular expression constructed using a pre-defined crime sentencing scenario.
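One way to picture a regular expression "constructed using" preset circumstances is an alternation over the escaped phrases. The sketch below is an assumption about how such a pattern could be built; the phrase list and function names are hypothetical.

```python
import re

# Hypothetical preset conviction and sentencing circumstances.
PRESET_CIRCUMSTANCES = ["voluntary surrender", "recidivist", "meritorious service", "attempted crime"]


def build_circumstance_regex(phrases):
    """Build one alternation pattern; longer phrases first so they win over shorter overlaps."""
    ordered = sorted(phrases, key=len, reverse=True)
    return re.compile("|".join(re.escape(p) for p in ordered))


def extract_circumstances(text, pattern):
    """Return the distinct matched circumstances in order of first appearance."""
    seen = []
    for match in pattern.finditer(text):
        if match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


pattern = build_circumstance_regex(PRESET_CIRCUMSTANCES)
hits = extract_circumstances(
    "Zhang San is a recidivist, but his voluntary surrender is acknowledged.", pattern)
```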
- the extraction process of the conviction and sentencing scenario may include:
- the content with the same semantics as the pre-determined crime sentencing scenario is taken as the extracted conviction sentencing scenario.
- semantic analysis technology can also effectively identify expressions such as "not recognized", "not accepted", and "disagree with the defense opinions" in the legal document, so as to accurately extract the conviction circumstances and sentencing circumstances recognized in the legal document.
- the preset association dictionary records the correspondence between the preset crime and the preset crime sentencing scenario as the third correspondence.
- the pre-defined crime and pre-defined crime sentencing plots refer to the crime and conviction sentencing plots prescribed by law, and are the standard expressions of the crime and conviction sentencing plots.
- the preset associated dictionary as a standard system, can be applied to the entire analytical process of legal documents.
- the correspondence between the "crime" in the first correspondence and the "conviction and sentencing circumstances" in the second correspondence can be established indirectly through the name of the criminal subject they share; this correspondence is then verified against the correspondence between the "preset crime" and the "preset conviction and sentencing circumstances" in the third correspondence, which improves its accuracy and in turn ensures the accuracy of the correspondence among the "name of the criminal subject", the "name of the crime", and the "conviction and sentencing circumstances".
- the legal document analysis method provided in this embodiment relies not only on the first correspondence between the name of the criminal subject and the name of the crime and the second correspondence between the name of the criminal subject and the conviction and sentencing circumstances, but also on the third correspondence between the preset crimes and the preset conviction and sentencing circumstances in the preset association dictionary, to establish the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances. This improves the accuracy of the finally established correspondence, and thus the accuracy and effectiveness of the analysis of legal documents.
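The role of the third correspondence can be sketched as a filter: a circumstance linked to a subject is attached to one of that subject's crimes only if the association dictionary lists it for that crime. The dictionary contents and the function name below are hypothetical, and the sketch assumes the first and second correspondences are plain mappings from subject to lists.

```python
# Hypothetical preset association dictionary: crime -> circumstances allowed for it.
ASSOCIATION = {
    "theft": {"large amount", "voluntary surrender"},
    "fraud": {"huge amount", "voluntary surrender"},
}


def build_triples(first, second, association):
    """Combine the first and second correspondences, verified by the third."""
    triples = []
    for subject, crimes in first.items():
        for crime in crimes:
            allowed = association.get(crime, set())
            circumstances = [c for c in second.get(subject, []) if c in allowed]
            triples.append((subject, crime, circumstances))
    return triples


first = {"A": ["theft", "fraud"]}
second = {"A": ["large amount", "huge amount", "voluntary surrender"]}
triples = build_triples(first, second, ASSOCIATION)
```

Here "large amount" ends up only under theft and "huge amount" only under fraud, while "voluntary surrender" is kept for both because the dictionary lists it for both crimes.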
- FIG. 3 is another flowchart of a legal document analysis method provided by an embodiment of the present application.
- the method includes:
- S310 Extract the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document.
- S320 Determine whether the clause where the crime is located in the legal document contains the name of the criminal subject; if so, perform step S330; if not, perform step S340.
- the search can continue to earlier clauses to extract the name of the criminal subject.
- S360 Establish a correspondence between the name of the subject of the crime, the name of the crime, and the plot of conviction and sentencing according to the first correspondence and the second correspondence.
- S370 Correspondingly output the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing according to the target correspondence.
- the target correspondence is the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing.
- the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances can be output as one record into a database according to the target correspondence, so as to achieve structured storage of the analysis results.
- when the clause where the crime is located does not contain the name of the criminal subject, it is determined whether the nearest clause before that clause in the legal document contains the name of the criminal subject; if it does, the correspondence between the name of the criminal subject and the name of the crime is established. This provides fault tolerance in the process of analyzing legal documents and improves the success rate of the analysis.
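The fallback to the nearest preceding clause can be sketched as a short backward search. The function name and the `max_lookback` parameter are hypothetical; the default of 1 reflects the "nearest clause before" rule, and larger values correspond to continuing the search to earlier clauses.

```python
def find_subject_for_crime(clauses, crime_index, subjects, max_lookback=1):
    """Return the subject named in the crime's clause, else in the nearest earlier clauses.

    clauses      -- list of clause strings in document order
    crime_index  -- index of the clause that contains the crime
    max_lookback -- how many earlier clauses may be searched (assumed parameter)
    """
    stop = max(crime_index - max_lookback, 0)
    for i in range(crime_index, stop - 1, -1):
        for subject in subjects:
            if subject in clauses[i]:
                return subject
    return None


clauses = ["Defendant Zhang San stole a car", "which constitutes theft"]
owner = find_subject_for_crime(clauses, 1, ["Zhang San", "Li Si"])
```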
- FIGS. 4 to 5 show an example diagram of the legal document analysis process and a display diagram of the analysis result of the legal document provided by the embodiments of the present application.
- the crime-conviction plot dictionary records the correspondence between preset crimes and preset crime plots; the crime-sentence plot dictionary records the correspondence between preset crimes and preset sentencing plots.
- for a case of one person with one crime, the analysis results may include: "criminal subject A-crime a-conviction and sentencing circumstances a1, a2".
- for a case of one person with multiple crimes, the analysis results may include: "criminal subject A-crime a-conviction and sentencing circumstances a1, a2"; "criminal subject A-crime b-conviction and sentencing circumstances b1, b2".
- for a case of multiple persons with one crime, the analysis results may include: "criminal subject A-crime a-conviction and sentencing circumstances a1, a2"; "criminal subject B-crime a-conviction and sentencing circumstances a3, a2, a4".
- for a case of multiple persons with multiple crimes, the analysis results may include: "criminal subject A-crime a-conviction and sentencing circumstances a1, a2"; "criminal subject A-crime b-conviction and sentencing circumstances b1, b2"; "criminal subject B-crime c-conviction and sentencing circumstances c3, c4"; "criminal subject B-crime d-conviction and sentencing circumstances d1, d2".
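A small Python sketch of how such results might be rendered and stored as records follows. It uses a record layout like the examples above and an in-memory SQLite table; the table name, column names, and function names are assumptions, since the application does not specify a schema.

```python
import sqlite3


def format_record(subject, crime, circumstances):
    """Render one result line in a layout like the examples above."""
    joined = ", ".join(circumstances)
    return f"criminal subject {subject}-crime {crime}-conviction and sentencing circumstances {joined}"


def store_records(triples):
    """Store one row per (subject, crime, circumstance) in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result (subject TEXT, crime TEXT, circumstance TEXT)")
    rows = [(s, c, x) for s, c, xs in triples for x in xs]
    conn.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


triples = [("A", "a", ["a1", "a2"]), ("B", "a", ["a3", "a2", "a4"])]
lines = [format_record(*t) for t in triples]
conn = store_records(triples)
row_count = conn.execute("SELECT COUNT(*) FROM result").fetchone()[0]
```

Storing one row per circumstance keeps the table flat, so results for any of the four case types can be queried by subject or by crime without parsing the display string.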
- the legal document analysis method provided by the present invention uses technologies such as dependency grammar analysis, text analysis, named entity recognition, and natural language processing. According to the positional relationship between the name of the criminal subject and the name of the crime in the legal document, and the positional relationship between the name of the criminal subject and the conviction and sentencing circumstances in the legal document, it can accurately determine the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances, and it is applicable to various types of cases (one person with one crime, one person with multiple crimes, multiple persons with one crime, and multiple persons with multiple crimes).
- An embodiment of the present invention also provides a legal document analyzing device, which is used to implement the legal document analyzing method provided by the embodiment of the present invention.
- the content of the legal document analyzing device described below and the content of the legal document analysis method described above may be referred to in correspondence with each other.
- FIG. 6 is a schematic structural diagram of a legal document analysis device provided by an embodiment of the present application.
- the legal document analysis apparatus of this embodiment is used to implement the legal document analysis method of the foregoing embodiment. As shown in FIG. 6, the apparatus includes:
- the crime information extraction unit 100 is used to extract the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document.
- the first relationship establishing unit 200 is used to establish the correspondence between the name of the criminal subject and the name of the crime as the first correspondence when the clause where the crime is located in the legal document contains the name of the criminal subject.
- the second relationship establishing unit 300 is configured to establish the correspondence between the name of the criminal subject and the conviction and sentencing circumstances as the second correspondence when the clause where the conviction and sentencing circumstances are located in the legal document contains the name of the criminal subject.
- the target relationship establishing unit 400 is configured to establish a correspondence between the name of the criminal subject, the name of the crime and the plot of conviction and sentencing according to the first correspondence and the second correspondence.
- after extracting the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document, the legal document analysis device determines the correspondence between the name of the criminal subject and the name of the crime according to their positional relationship in the legal document, determines the correspondence between the name of the criminal subject and the conviction and sentencing circumstances according to their positional relationship in the legal document, and then determines the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances, rather than directly linking every name, crime, and circumstance extracted from the same legal document. The device is therefore applicable to the effective analysis of legal documents of various types of cases (one person with one crime, one person with multiple crimes, multiple persons with one crime, and multiple persons with multiple crimes).
- the target relationship establishing unit 400 may be specifically used to:
- the preset association dictionary records the correspondence between the preset crime and the pre-set crime sentencing plot as a third correspondence
- the first relationship establishing unit 200 may also be used for:
- the crime information extraction unit 100 is specifically used to:
- the preset keywords include: judgment or exemption from criminal punishment;
- the crime information extraction unit 100 is further used to:
- the preset regular expression is a regular expression constructed using a pre-defined crime sentencing scenario.
- the crime information extraction unit 100 is further used to:
- the device may further include: an analysis result output unit.
- the analysis result output unit is configured to output the name of the subject of the crime, the name of the crime, and the plot of conviction and sentencing according to the target correspondence;
- the target correspondence is the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing.
- after extracting the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances in the legal document, the legal document analysis device determines the correspondence between the name of the criminal subject and the name of the crime according to their positional relationship in the legal document, determines the correspondence between the name of the criminal subject and the conviction and sentencing circumstances according to their positional relationship in the legal document, and then, in combination with the correspondence between the preset crimes and the preset conviction and sentencing circumstances in the preset association dictionary, accurately determines the correspondence among the name of the criminal subject, the name of the crime, and the conviction and sentencing circumstances, which improves the accuracy and effectiveness of the analysis of legal documents.
- the legal document analysis device includes a processor and a memory.
- the above units are stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
- the processor contains a core, and the core retrieves the corresponding program unit from the memory.
- One or more cores can be provided; by adjusting the core parameters, the technical problem that existing legal document analysis schemes cannot effectively analyze the legal documents of various cases is solved.
- the memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
- An embodiment of the present invention provides a storage medium on which a program is stored, which implements the legal document analysis method when the program is executed by a processor.
- An embodiment of the present invention provides a processor for running a program, wherein the legal document analysis method is executed when the program is run.
- An embodiment of the present invention provides a device.
- the device includes a processor, a memory, and a program stored on the memory and executable on the processor.
- when the processor executes the program, the following steps are implemented:
- the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing is established.
- the establishing the correspondence between the name of the criminal subject, the name of the crime and the plot of conviction and sentencing according to the first correspondence and the second correspondence includes:
- the preset association dictionary records the correspondence between the preset crime and the pre-set crime sentencing plot as a third correspondence
- the method further includes:
- the process of extracting the crime includes:
- the preset keywords include: judgment or exemption from criminal punishment;
- the extraction process of the conviction and sentencing plot includes:
- the preset regular expression is a regular expression constructed using a pre-defined crime sentencing scenario.
- the extraction process of the conviction and sentencing plot includes:
- the method further includes:
- the target correspondence is the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing.
- the devices in this article can be servers, PCs, PADs, mobile phones, etc.
- the present application also provides a computer program product, which when executed on a data processing device, is suitable for executing a program initialized with the following method steps:
- the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing is established.
- the establishing the correspondence between the name of the criminal subject, the name of the crime and the plot of conviction and sentencing according to the first correspondence and the second correspondence includes:
- the preset association dictionary records the correspondence between the preset crime and the preset crime sentencing plot as a third correspondence
- the method further includes:
- the process of extracting the crime includes:
- the preset keywords include: judgment or exemption from criminal punishment;
- the extraction process of the conviction and sentencing plot includes:
- the preset regular expression is a regular expression constructed using a pre-defined crime sentencing scenario.
- the extraction process of the conviction and sentencing plot includes:
- the method further includes:
- the target correspondence is the correspondence between the name of the subject of the crime, the name of the crime and the plot of conviction and sentencing.
- the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer usable program code.
- These computer program instructions may also be stored in a computer readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
- These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operating steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
- the computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- the memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read only memory (ROM) or flash memory (flash RAM).
- Computer readable media, including permanent and non-permanent, removable and non-removable media, can store information by any method or technology.
- the information may be computer readable instructions, data structures, modules of programs, or other data.
- Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
- computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
- the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer usable program code.
Abstract
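A legal document parsing method and apparatus. The method comprises: extracting a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document (S110); when the clause of the legal document in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as a first correspondence (S120); when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence (S130); and establishing, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance (S140). The correspondence among criminal subject names, charges, and conviction and sentencing circumstances can thereby be parsed accurately from the legal documents of all types of cases, so the method is suitable for effectively parsing such documents.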
Description
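This application claims priority to Chinese patent application No. 201811580587.2, entitled "Legal Document Parsing Method and Apparatus", filed with the Chinese Patent Office on December 24, 2018, the entire contents of which are incorporated herein by reference.
The present invention relates to the technical field of data processing, and more particularly to a legal document parsing method and apparatus.
A legal document records information such as criminal subjects, charges, conviction circumstances, and sentencing circumstances. In practice, this information usually needs to be parsed out of the document effectively so that it can be used.
Existing approaches to parsing legal documents usually associate the criminal subjects, charges, and conviction and sentencing circumstances extracted from the same legal document directly with one another and treat the result as the parsing result. Such an approach can accurately associate this information only for cases with one person and one charge. It cannot effectively identify the multi-dimensional correspondences among criminal subjects, charges, and conviction and sentencing circumstances in cases involving one person and multiple charges, multiple persons and one charge, or multiple persons and multiple charges. If an existing scheme is forced onto the legal documents of such complex cases, the criminal subjects, charges, and conviction and sentencing circumstances end up mismatched, so those documents cannot be parsed effectively.
There is therefore an urgent need for a practical and effective legal document parsing scheme that is suitable for effectively parsing the legal documents of all types of cases.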
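Summary of the Invention
In view of the above problems, the present invention is proposed to provide a legal document parsing method and apparatus that overcome, or at least partially solve, those problems.
To achieve the above object, the present invention provides the following technical solutions:
A legal document parsing method, the method comprising:
extracting a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document;
when the clause of the legal document in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as a first correspondence;
when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence;
establishing, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.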
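Preferably, establishing the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence comprises:
obtaining a preset association dictionary, wherein the preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence;
establishing, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Preferably, after the extracting of the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document, the method further comprises:
when the clause of the legal document in which the charge appears does not contain the criminal subject name, determining whether the nearest clause preceding that clause contains the criminal subject name;
when the nearest clause preceding the clause in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as the first correspondence.
Preferably, the process of extracting the charge comprises:
locating a preset keyword in the judgment result paragraph of the legal document, wherein the preset keyword includes 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment");
extracting, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to a preset charge dictionary.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document conviction and sentencing circumstances that match a preset regular expression, wherein the preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
Preferably, after the establishing of the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence, the method further comprises:
outputting the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence;
wherein the target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.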
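A legal document parsing apparatus, the apparatus comprising:
a criminal information extraction unit, configured to extract a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document;
a first relationship establishing unit, configured to establish, when the clause of the legal document in which the charge appears contains the criminal subject name, a correspondence between the criminal subject name and the charge as a first correspondence;
a second relationship establishing unit, configured to establish, when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence;
a target relationship establishing unit, configured to establish, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
A storage medium comprising a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the legal document parsing method described above.
A processor configured to run a program, wherein the program, when running, executes the legal document parsing method described above.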
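By means of the above technical solution, the legal document parsing method and apparatus provided by the present invention, after extracting the criminal subject names, charges, and conviction and sentencing circumstances from a legal document, accurately determine the correspondence between criminal subject names and charges from their relative positions in the document, accurately determine the correspondence between criminal subject names and conviction and sentencing circumstances from their relative positions in the document, and from these accurately determine the correspondence among criminal subject names, charges, and conviction and sentencing circumstances, rather than directly associating everything extracted from the same legal document. The correspondence among criminal subject names, charges, and conviction and sentencing circumstances can therefore be parsed accurately from the legal documents of all types of cases, making the solution suitable for effectively parsing such documents.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Various other advantages and benefits will become clear to those of ordinary skill in the art on reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings: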
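FIG. 1 is a flowchart of a legal document parsing method provided by an embodiment of the present application;
FIG. 2 is another flowchart of the legal document parsing method provided by an embodiment of the present application;
FIG. 3 is yet another flowchart of the legal document parsing method provided by an embodiment of the present application;
FIG. 4 is an example diagram of a legal document parsing process provided by an embodiment of the present application;
FIG. 5 is a diagram presenting legal document parsing results provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a legal document parsing apparatus provided by an embodiment of the present application.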
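Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Criminal subject: a natural person or entity that commits an act harmful to society and bears criminal responsibility under the law. A natural-person subject is a natural person who has the capacity for criminal responsibility. An entity subject is a company, enterprise, public institution, state organ, or organization that commits an act harmful to society and bears criminal responsibility under the law.
Charge: the name of each specific crime defined in the Specific Provisions of the Criminal Law, and a concise summary of the essential characteristics of that specific criminal act. The charge reflects the essential difference between one crime and another and is the fundamental boundary that distinguishes them.
Conviction circumstance: a circumstance present in the course of committing the crime which, by reflecting the social harm of the act and the personal dangerousness of the actor, and the degree of each, determines whether an act constitutes a crime.
Sentencing circumstance: any subjective or objective circumstance, prescribed by law or recognized in judicial practice, that should be taken into account at sentencing, once the act has been found to constitute a crime, in deciding the severity of the punishment or exemption from punishment.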
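Refer to FIG. 1, which is a flowchart of a legal document parsing method provided by an embodiment of the present application.
As shown in FIG. 1, the method comprises:
S110: Extract the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document.
Using techniques such as NLP (Natural Language Processing), named entity recognition, semantic analysis, and text parsing, information such as the criminal subject name, the charge, and the conviction and sentencing circumstances can each be extracted from the legal document.
The criminal subject name may be the name of the offender or the name of the offending entity. "Conviction and sentencing circumstances" is a collective term covering both conviction circumstances and sentencing circumstances.
S120: When the clause of the legal document in which the charge appears contains the criminal subject name, establish a correspondence between the criminal subject name and the charge as a first correspondence.
Once the first correspondence has been obtained, the criminal subject name and the charge may be output in correspondence with each other and stored according to it.
S130: When the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establish a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence.
Likewise, once the second correspondence has been obtained, the criminal subject name and the conviction and sentencing circumstance may be output in correspondence with each other and stored according to it.
S140: Establish, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Because the first and second correspondences share the criminal subject name, a correspondence between the charge in the first correspondence and the conviction and sentencing circumstance in the second correspondence can be established indirectly, yielding a three-way correspondence among criminal subject name, charge, and conviction and sentencing circumstance.
With the legal document parsing method provided by this embodiment, after the criminal subject names, charges, and conviction and sentencing circumstances have been extracted from a legal document, the correspondence between criminal subject names and charges is determined accurately from their relative positions in the document, the correspondence between criminal subject names and conviction and sentencing circumstances is likewise determined accurately from their relative positions, and from these the correspondence among all three is determined accurately, rather than by directly associating everything extracted from the same document. The correspondence among criminal subject names, charges, and conviction and sentencing circumstances can therefore be parsed accurately from the legal documents of all types of cases, and the method is suitable for effectively parsing the legal documents of all types of cases (for example, one person and one charge, one person and multiple charges, multiple persons and one charge, and multiple persons and multiple charges).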
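The flow of S110 to S140 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: it assumes the subject names, charges, and circumstances have already been extracted (S110), treats a clause as the text between punctuation marks, and uses plain substring matching. The function names and the sample sentence are hypothetical.

```python
import re


def split_clauses(text):
    """Split text into clauses on Chinese and ASCII commas, semicolons, and the Chinese full stop."""
    return [c for c in re.split(r"[,。;,;]", text) if c]


def link_by_clause(clauses, subjects, items):
    """S120/S130: pair each subject with every item that appears in the same clause."""
    return {
        (subject, item)
        for clause in clauses
        for item in items if item in clause
        for subject in subjects if subject in clause
    }


def join_on_subject(first, second):
    """S140: join (subject, charge) and (subject, circumstance) pairs on the shared subject."""
    return {
        (subject, charge, circumstance)
        for subject, charge in first
        for other, circumstance in second
        if subject == other
    }


# Hypothetical sample: two defendants, each with one charge and one circumstance.
text = "被告人张三入户盗窃,被告人李四入户抢劫。被告人张三犯盗窃罪,被告人李四犯抢劫罪。"
clauses = split_clauses(text)
first = link_by_clause(clauses, ["张三", "李四"], ["盗窃罪", "抢劫罪"])
second = link_by_clause(clauses, ["张三", "李四"], ["入户盗窃", "入户抢劫"])
triples = join_on_subject(first, second)
```

Joining on the subject alone pairs every charge of a person with every circumstance of that person, so a defendant with several charges would need a further filter to separate them.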
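Refer to FIG. 2, which is another flowchart of the legal document parsing method provided by an embodiment of the present application.
As shown in FIG. 2, the method comprises:
S210: Extract the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document.
In one example, the process of extracting the charge may comprise:
a1. Locate a preset keyword in the judgment result paragraph of the legal document.
The preset keywords include: 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment");
a2. Extract, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to the preset charge dictionary.
The legal document may be a judgment document, and the judgment result paragraph may be the "判决如下" ("the judgment is as follows") paragraph of the judgment document. In other words, the positions of the preset keywords 判决, 免于刑事处罚, or 免予刑事处罚 are located in that paragraph, and a charge conforming to the preset charge dictionary is extracted from the clause immediately preceding the keyword. The judgment document may specifically be a criminal judgment document.
The preset charge dictionary contains the preset charges defined by law and can be used to match charges appearing in the legal document.
If no charge conforming to the preset charge dictionary is found in the clause immediately preceding the preset keyword, the search may continue through progressively earlier clauses so as to extract the charge appearing in the nearest clause.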
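Steps a1 and a2, together with the fallback to earlier clauses, might look like the following sketch. It assumes a clause is the text between punctuation marks and that the charge dictionary is a simple list of strings; the function name, the dictionary contents, and the sample sentence are hypothetical.

```python
import re

# Preset keywords named in the description.
KEYWORDS = ("判决", "免于刑事处罚", "免予刑事处罚")


def extract_charges(judgment_paragraph, charge_dictionary, keywords=KEYWORDS):
    """Return the charges found in the nearest clause before each keyword clause."""
    clauses = [c for c in re.split(r"[,。;,;]", judgment_paragraph) if c]
    charges = []
    for i, clause in enumerate(clauses):
        if not any(k in clause for k in keywords):
            continue  # a1: only clauses containing a preset keyword are anchors
        # a2 plus fallback: walk backwards from the clause just before the keyword.
        for previous in reversed(clauses[:i]):
            found = [c for c in charge_dictionary if c in previous]
            if found:
                charges.append(found[0])
                break
    return charges


dictionary = ["盗窃罪", "抢劫罪"]
direct = extract_charges("被告人王五犯盗窃罪,免予刑事处罚。", dictionary)
fallback = extract_charges("被告人王五犯盗窃罪,情节轻微,免予刑事处罚。", dictionary)
```

In the second call the clause right before the keyword holds no charge, so the search moves one clause further back, as the fallback paragraph describes.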
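In one example, the process of extracting the conviction and sentencing circumstances may comprise:
Extracting from the legal document conviction and sentencing circumstances that match a preset regular expression.
The preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.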
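One plausible way to construct such a regular expression is to join the preset circumstances into a single alternation. The sketch below assumes exact-phrase matching, which the application does not specify; the function names and the sample sentence are hypothetical.

```python
import re


def build_circumstance_pattern(preset_circumstances):
    """Build one alternation pattern from the preset circumstance phrases."""
    # Longest phrases first, so a longer phrase is preferred when one preset is a prefix of another.
    ordered = sorted(preset_circumstances, key=len, reverse=True)
    return re.compile("|".join(re.escape(c) for c in ordered))


def extract_circumstances(text, pattern):
    """Return every preset circumstance found in the text, in order of appearance."""
    return pattern.findall(text)


pattern = build_circumstance_pattern(["入户盗窃", "多次盗窃", "扒窃"])
found = extract_circumstances("被告人张三多次盗窃,并曾入户盗窃。", pattern)
```

Real judgments rarely repeat the statutory wording exactly, so a production pattern would likely need looser alternatives for each preset phrase.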
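In another example, the process of extracting the conviction and sentencing circumstances may comprise:
Extracting from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
Specifically, content that has the same semantics as a preset conviction and sentencing circumstance is taken as an extracted conviction and sentencing circumstance.
Semantic analysis can also be used to recognize expressions in the legal document such as 不被认定 ("not recognized"), 不予采纳 ("not accepted"), and 对辩护意见不同意 ("the defense opinion is not agreed with"), so that the conviction circumstances and sentencing circumstances actually recognized in the legal document are extracted accurately.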
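The application does not say how the semantic analysis is carried out. As a deliberately crude stand-in, the sketch below drops whole sentences that contain one of the rejection expressions quoted above before any circumstance is extracted. It is keyword filtering, not semantic analysis, and the function name and sample text are hypothetical.

```python
import re

# Rejection expressions quoted in the description.
REJECTION_MARKERS = ("不被认定", "不予采纳", "对辩护意见不同意")


def recognized_sentences(text, markers=REJECTION_MARKERS):
    """Keep only sentences that contain none of the rejection expressions."""
    sentences = [s for s in re.split(r"[。;;]", text) if s]
    return [s for s in sentences if not any(m in s for m in markers)]


sample = "被告人张三入户盗窃。辩护人提出张三系自首的意见,不予采纳。"
kept = recognized_sentences(sample)
```

Filtering at sentence level rather than clause level matters here, because the rejected circumstance and the rejection expression usually sit in neighbouring clauses of the same sentence.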
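S220: When the clause of the legal document in which the charge appears contains the criminal subject name, establish a correspondence between the criminal subject name and the charge as a first correspondence.
S230: When the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establish a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence.
S240: Obtain a preset association dictionary.
The preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence.
Based on legal provisions, guiding opinions on conviction and sentencing, and the like, a preset association dictionary recording the correspondences between preset charges and preset conviction and sentencing circumstances can be built in advance, as shown in Table 1 below:
Table 1. Example contents of the preset association dictionary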
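Charge | Conviction and sentencing circumstance |
---|---|
Theft (盗窃罪) | Theft by entering a dwelling (入户盗窃) |
Theft (盗窃罪) | Theft while carrying a weapon (携带凶器盗窃) |
Theft (盗窃罪) | Pickpocketing (扒窃) |
Theft (盗窃罪) | Theft of property of a relatively large amount (盗窃财物数额较大) |
Theft (盗窃罪) | Theft of property of a huge amount (盗窃财物数额巨大) |
Theft (盗窃罪) | Theft of property of an especially huge amount (盗窃财物数额特别巨大) |
Theft (盗窃罪) | Previously criminally punished for theft (曾因盗窃受过刑事处罚) |
Theft (盗窃罪) | Administratively punished for theft within the past year (一年内曾因盗窃受过行政处罚) |
Theft (盗窃罪) | Repeated theft (多次盗窃) |
Theft (盗窃罪) | Theft by destructive means (破坏性手段盗窃) |
Theft (盗窃罪) | Theft of state-owned cultural relics held in collections (盗窃国有馆藏文物) |
Theft (盗窃罪) | Theft of communication lines or telecommunication code numbers (盗窃通信线路\|电信码号) |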
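The preset charges and preset conviction and sentencing circumstances are the charges and circumstances defined by law; they are the standard expressions of charges and of conviction and sentencing circumstances.
As a standard system, the preset association dictionary can be applied throughout the parsing of legal documents.
S250: Establish, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Because the first and second correspondences share the criminal subject name, a correspondence between the charge in the first correspondence and the conviction and sentencing circumstance in the second correspondence can be established indirectly. The correspondence between preset charges and preset conviction and sentencing circumstances in the third correspondence can then be used to verify that charge-to-circumstance correspondence, which improves its accuracy and in turn ensures the accuracy of the correspondence among criminal subject name, charge, and conviction and sentencing circumstance.
The legal document parsing method provided by this embodiment relies not only on the first correspondence between the criminal subject name and the charge and the second correspondence between the criminal subject name and the conviction and sentencing circumstance, but also on the third correspondence between preset charges and preset conviction and sentencing circumstances in the preset association dictionary, when establishing the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance. This improves the accuracy of the correspondence finally established and the accuracy and effectiveness of legal document parsing.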
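The verification role of the third correspondence can be illustrated with a short sketch. It assumes the association dictionary is held as a mapping from each preset charge to the set of its preset circumstances; the function name is hypothetical, and the robbery entry is an illustrative assumption that does not appear in Table 1.

```python
def build_verified_triples(first, second, association):
    """S250: join on the subject, keeping only charge/circumstance pairs the dictionary allows."""
    triples = set()
    for subject, charge in first:
        allowed = association.get(charge, set())
        for other, circumstance in second:
            if subject == other and circumstance in allowed:
                triples.add((subject, charge, circumstance))
    return triples


# Third correspondence: preset charge -> preset circumstances.
association = {
    "盗窃罪": {"入户盗窃", "扒窃", "多次盗窃"},
    "抢劫罪": {"入户抢劫"},  # illustrative entry, not taken from Table 1
}
# One person with two charges and two circumstances.
first = {("张三", "盗窃罪"), ("张三", "抢劫罪")}
second = {("张三", "入户盗窃"), ("张三", "入户抢劫")}
verified = build_verified_triples(first, second, association)
```

Without the dictionary, the join on the subject would yield four triples for this defendant; the dictionary keeps only the two in which the circumstance belongs to the charge.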
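Refer to FIG. 3, which is yet another flowchart of the legal document parsing method provided by an embodiment of the present application.
As shown in FIG. 3, the method comprises:
S310: Extract the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document.
S320: Determine whether the clause of the legal document in which the charge appears contains the criminal subject name; if so, go to step S330; if not, go to step S340.
When the clause of the legal document in which the charge appears does not contain the criminal subject name, determine whether the nearest clause preceding that clause contains the criminal subject name.
S330: Establish a correspondence between the criminal subject name and the charge as a first correspondence.
S340: When the nearest clause preceding the clause in which the charge appears contains the criminal subject name, establish a correspondence between the criminal subject name and the charge as the first correspondence.
In one example, when the nearest clause preceding the clause in which the charge appears also does not contain the criminal subject name, the search may continue through progressively earlier clauses so as to extract the criminal subject name.
S350: When the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establish a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence.
S360: Establish, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
S370: Output the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence.
The target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
After the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance has been established, the criminal subject name, the charge, and the conviction and sentencing circumstance may be output to a database as a single record according to the target correspondence, so that the parsing result is stored in structured form.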
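Structured storage of the result could be as simple as one database row per triple. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name, column names, and function name are assumptions made for illustration, since the application does not name a database or schema.

```python
import sqlite3


def store_triples(conn, triples):
    """S370: write each (subject, charge, circumstance) triple as one record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parse_result ("
        "subject TEXT, charge TEXT, circumstance TEXT)"
    )
    conn.executemany(
        "INSERT INTO parse_result VALUES (?, ?, ?)", sorted(triples)
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_triples(conn, {("张三", "盗窃罪", "入户盗窃")})
rows = conn.execute(
    "SELECT subject, charge, circumstance FROM parse_result"
).fetchall()
```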
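With the legal document parsing method provided by this embodiment, when the clause in which the charge appears does not contain the criminal subject name, it is determined whether the nearest preceding clause contains the criminal subject name; when it does, the correspondence between the criminal subject name and the charge is established. This provides fault tolerance in the parsing of legal documents and improves the parsing success rate.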
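The fallback of S320 to S340, including the optional search through progressively earlier clauses, reduces to a backward scan. The sketch below assumes the document is already split into clauses and uses substring matching; the function name and sample clauses are hypothetical.

```python
def subjects_for_clause(clauses, index, subjects):
    """Return the subjects named in clauses[index], or else in the nearest earlier clause naming any."""
    for i in range(index, -1, -1):
        found = [s for s in subjects if s in clauses[i]]
        if found:
            return found
    return []  # no subject is named at or before this clause


clauses = ["被告人张三多次入户盗窃", "数额较大", "其行为已构成盗窃罪"]
# The charge sits in clause 2, which names nobody; the scan reaches clause 0.
owners = subjects_for_clause(clauses, 2, ["张三", "李四"])
```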
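Refer to FIGS. 4 and 5, which show an example diagram of the legal document parsing process and a diagram presenting legal document parsing results provided by embodiments of the present application.
As shown in FIG. 4, for a criminal legal document, techniques such as dependency grammar relations, text parsing, named entity recognition, and natural language processing can be used to extract the criminal subject names, the adjudicated charges, conviction circumstances 1 to n, and sentencing circumstances 1 to n, together with the correspondences among them. The final "criminal subject name - charge - conviction and sentencing circumstances" correspondence is then established on the basis of a "charge - conviction circumstance dictionary" and a "charge - sentencing circumstance dictionary", for example "criminal subject A - charge a - conviction and sentencing circumstances 1 to n" and "criminal subject B - charge c - conviction and sentencing circumstances 1 to n" in FIG. 4.
The charge - conviction circumstance dictionary records the correspondences between preset charges and preset conviction circumstances; the charge - sentencing circumstance dictionary records the correspondences between preset charges and preset sentencing circumstances.
For legal documents of different case types (for example, one person and one charge, one person and multiple charges, multiple persons and one charge, and multiple persons and multiple charges), the parsing results may be as shown in FIG. 5:
For a case with one person and one charge, the parsing result may include: "criminal subject A - charge a - conviction and sentencing circumstances a1, a2".
For a case with one person and multiple charges, the parsing result may include: "criminal subject A - charge a - conviction and sentencing circumstances a1, a2"; "criminal subject A - charge b - conviction and sentencing circumstances b1, b2".
For a case with multiple persons and one charge, the parsing result may include: "criminal subject A - charge a - conviction and sentencing circumstances a1, a2"; "criminal subject B - charge a - conviction and sentencing circumstances a3, a2, a4".
For a case with multiple persons and multiple charges, the parsing result may include: "criminal subject A - charge a - conviction and sentencing circumstances a1, a2"; "criminal subject A - charge b - conviction and sentencing circumstances b1, b2"; "criminal subject B - charge c - conviction and sentencing circumstances c3, c4"; "criminal subject B - charge d - conviction and sentencing circumstances d1, d2".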
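Results in the shape listed above can be produced by grouping individual triples by subject and charge. The sketch below is one possible presentation step; the function name and the output format are assumptions, and the placeholder values mirror the A, a, a1 labels used in the text.

```python
from collections import defaultdict


def group_results(triples):
    """Group (subject, charge, circumstance) triples into one line per subject and charge."""
    grouped = defaultdict(list)
    for subject, charge, circumstance in triples:
        grouped[(subject, charge)].append(circumstance)
    return [
        f"{subject}-{charge}-{'、'.join(circumstances)}"
        for (subject, charge), circumstances in grouped.items()
    ]


# One person, multiple charges.
lines = group_results([
    ("A", "a", "a1"), ("A", "a", "a2"),
    ("A", "b", "b1"), ("A", "b", "b2"),
])
```

The input here is an ordered list rather than a set, so the output lines follow the order in which each subject and charge pair first appears.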
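The legal document parsing method provided by the present invention uses techniques such as dependency grammar relations, text parsing, named entity recognition, and natural language processing, and relies on the positional relationship between criminal subject names and charges in the legal document and the positional relationship between criminal subject names and conviction and sentencing circumstances in the legal document. It can accurately determine the correspondence among criminal subject names, charges, and conviction and sentencing circumstances, and is suitable for effectively parsing the legal documents of all types of cases (one person and one charge, one person and multiple charges, multiple persons and one charge, and multiple persons and multiple charges).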
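An embodiment of the present invention further provides a legal document parsing apparatus for implementing the legal document parsing method provided by the embodiments of the invention. The description of the apparatus below and the description of the method above may be read with reference to each other.
Refer to FIG. 6, which is a schematic structural diagram of a legal document parsing apparatus provided by an embodiment of the present application.
The legal document parsing apparatus of this embodiment is used to implement the legal document parsing method of the foregoing embodiments. As shown in FIG. 6, the apparatus comprises:
A criminal information extraction unit 100, configured to extract a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document.
A first relationship establishing unit 200, configured to establish, when the clause of the legal document in which the charge appears contains the criminal subject name, a correspondence between the criminal subject name and the charge as a first correspondence.
A second relationship establishing unit 300, configured to establish, when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence.
A target relationship establishing unit 400, configured to establish, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
With the legal document parsing apparatus provided by this embodiment, after the criminal subject names, charges, and conviction and sentencing circumstances have been extracted from a legal document, the correspondence between criminal subject names and charges is determined accurately from their relative positions in the document, the correspondence between criminal subject names and conviction and sentencing circumstances is likewise determined accurately from their relative positions, and from these the correspondence among all three is determined accurately, rather than by directly associating everything extracted from the same document. The apparatus is therefore suitable for effectively parsing the legal documents of all types of cases (one person and one charge, one person and multiple charges, multiple persons and one charge, and multiple persons and multiple charges).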
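In other embodiments, the target relationship establishing unit 400 may be specifically configured to:
obtain a preset association dictionary, wherein the preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence;
establish, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
The first relationship establishing unit 200 may further be configured to:
determine, when the clause of the legal document in which the charge appears does not contain the criminal subject name, whether the nearest clause preceding that clause contains the criminal subject name;
establish, when the nearest clause preceding the clause in which the charge appears contains the criminal subject name, a correspondence between the criminal subject name and the charge as the first correspondence.
In one example, the criminal information extraction unit 100 is specifically configured to:
locate a preset keyword in the judgment result paragraph of the legal document, wherein the preset keyword includes 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment");
extract, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to a preset charge dictionary.
In one example, the criminal information extraction unit 100 is further specifically configured to:
extract from the legal document conviction and sentencing circumstances that match a preset regular expression, wherein the preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.
In another example, the criminal information extraction unit 100 is further specifically configured to:
extract from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
In yet another example, the apparatus may further comprise a parsing result output unit.
The parsing result output unit is configured to output the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence;
wherein the target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
With the legal document parsing apparatus provided by this embodiment, after the criminal subject names, charges, and conviction and sentencing circumstances have been extracted from a legal document, the correspondence between criminal subject names and charges is determined accurately from their relative positions in the document, the correspondence between criminal subject names and conviction and sentencing circumstances is likewise determined accurately from their relative positions, and, combined with the correspondence between preset charges and preset conviction and sentencing circumstances in the preset association dictionary, the correspondence among criminal subject names, charges, and conviction and sentencing circumstances is determined accurately. This improves the accuracy and effectiveness of legal document parsing.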
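The legal document parsing apparatus provided by the embodiment of the present invention comprises a processor and a memory. The criminal information extraction unit 100, the first relationship establishing unit 200, the second relationship establishing unit 300, the target relationship establishing unit 400, the parsing result output unit, and so on are all stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the technical problem that existing legal document parsing schemes cannot effectively parse the legal documents of all types of cases is addressed by adjusting kernel parameters.
The memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored; the program, when executed by a processor, implements the legal document parsing method.
An embodiment of the present invention provides a processor configured to run a program, wherein the program, when running, executes the legal document parsing method.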
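An embodiment of the present invention provides a device comprising a processor, a memory, and a program stored in the memory and executable on the processor; when executing the program, the processor implements the following steps:
extracting a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document;
when the clause of the legal document in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as a first correspondence;
when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence;
establishing, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Preferably, establishing the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence comprises:
obtaining a preset association dictionary, wherein the preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence;
establishing, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Preferably, after the extracting of the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document, the method further comprises:
when the clause of the legal document in which the charge appears does not contain the criminal subject name, determining whether the nearest clause preceding that clause contains the criminal subject name;
when the nearest clause preceding the clause in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as the first correspondence.
Preferably, the process of extracting the charge comprises:
locating a preset keyword in the judgment result paragraph of the legal document, wherein the preset keyword includes 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment");
extracting, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to a preset charge dictionary.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document conviction and sentencing circumstances that match a preset regular expression, wherein the preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
Preferably, after the establishing of the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence, the method further comprises:
outputting the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence;
wherein the target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
The device herein may be a server, a PC, a PAD, a mobile phone, or the like.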
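The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
extracting a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document;
when the clause of the legal document in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as a first correspondence;
when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence;
establishing, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Preferably, establishing the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence comprises:
obtaining a preset association dictionary, wherein the preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence;
establishing, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
Preferably, after the extracting of the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document, the method further comprises:
when the clause of the legal document in which the charge appears does not contain the criminal subject name, determining whether the nearest clause preceding that clause contains the criminal subject name;
when the nearest clause preceding the clause in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as the first correspondence.
Preferably, the process of extracting the charge comprises:
locating a preset keyword in the judgment result paragraph of the legal document, wherein the preset keyword includes 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment");
extracting, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to a preset charge dictionary.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document conviction and sentencing circumstances that match a preset regular expression, wherein the preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.
Preferably, the process of extracting the conviction and sentencing circumstances comprises:
extracting from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
Preferably, after the establishing of the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence, the method further comprises:
outputting the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence;
wherein the target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.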
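Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operating steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-permanent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit it. Various modifications and variations of the application are possible for those skilled in the art. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the application shall fall within the scope of its claims.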
Claims (10)
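- A legal document parsing method, wherein the method comprises: extracting a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document; when the clause of the legal document in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as a first correspondence; when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence; and establishing, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
- The method according to claim 1, wherein establishing the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence comprises: obtaining a preset association dictionary, wherein the preset association dictionary records correspondences between preset charges and preset conviction and sentencing circumstances as a third correspondence; and establishing, according to the first correspondence, the second correspondence, and the third correspondence, the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
- The method according to claim 1, wherein, after the extracting of the criminal subject name, the charge, and the conviction and sentencing circumstances from the legal document, the method further comprises: when the clause of the legal document in which the charge appears does not contain the criminal subject name, determining whether the nearest clause preceding that clause contains the criminal subject name; and when the nearest clause preceding the clause in which the charge appears contains the criminal subject name, establishing a correspondence between the criminal subject name and the charge as the first correspondence.
- The method according to claim 1, wherein the process of extracting the charge comprises: locating a preset keyword in the judgment result paragraph of the legal document, wherein the preset keyword includes 判决 ("judgment"), 免于刑事处罚, or 免予刑事处罚 (two written forms of "exempted from criminal punishment"); and extracting, from the clause immediately preceding the position of the preset keyword in the legal document, a charge that conforms to a preset charge dictionary.
- The method according to claim 1, wherein the process of extracting the conviction and sentencing circumstances comprises: extracting from the legal document conviction and sentencing circumstances that match a preset regular expression, wherein the preset regular expression is a regular expression constructed from preset conviction and sentencing circumstances.
- The method according to claim 1, wherein the process of extracting the conviction and sentencing circumstances comprises: extracting from the legal document, by means of semantic analysis, conviction and sentencing circumstances whose semantics match those of preset conviction and sentencing circumstances.
- The method according to claim 1, wherein, after the establishing of the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance according to the first correspondence and the second correspondence, the method further comprises: outputting the criminal subject name, the charge, and the conviction and sentencing circumstance in correspondence with one another according to a target correspondence, wherein the target correspondence is the correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
- A legal document parsing apparatus, wherein the apparatus comprises: a criminal information extraction unit, configured to extract a criminal subject name, a charge, and conviction and sentencing circumstances from a legal document; a first relationship establishing unit, configured to establish, when the clause of the legal document in which the charge appears contains the criminal subject name, a correspondence between the criminal subject name and the charge as a first correspondence; a second relationship establishing unit, configured to establish, when the clause of the legal document in which a conviction and sentencing circumstance appears contains the criminal subject name, a correspondence between the criminal subject name and the conviction and sentencing circumstance as a second correspondence; and a target relationship establishing unit, configured to establish, according to the first correspondence and the second correspondence, a correspondence among the criminal subject name, the charge, and the conviction and sentencing circumstance.
- A storage medium, wherein the storage medium comprises a stored program, and when the program runs, the device on which the storage medium resides is controlled to execute the legal document parsing method according to any one of claims 1 to 7.
- A processor, wherein the processor is configured to run a program, and the program, when running, executes the legal document parsing method according to any one of claims 1 to 7.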
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811580587.2A CN111428466B (zh) | 2018-12-24 | 2018-12-24 | Legal document parsing method and apparatus |
CN201811580587.2 | 2018-12-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020135247A1 (zh) | 2020-07-02 |
Family
ID=71128415
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/126934 WO2020135247A1 (zh) | 2018-12-24 | 2019-12-20 | Legal document parsing method and apparatus |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111428466B (zh) |
WO (1) | WO2020135247A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
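CN114037566A (zh) * | 2021-11-15 | 2022-02-11 | 中国人民大学 | Automatic conviction method based on fine-grained features and mutual information |
CN114138939A (zh) * | 2021-12-08 | 2022-03-04 | 河南大学 | Charge prediction method and system based on formal concept analysis |
CN116205350A (zh) * | 2023-01-12 | 2023-06-02 | 深圳市大数据研究院 | System and method for analyzing and predicting the personal dangerousness of recidivism based on legal documents |
CN116304035A (zh) * | 2023-02-28 | 2023-06-23 | 中国司法大数据研究院有限公司 | Method and apparatus for extracting multi-defendant, multi-charge relations in complex cases |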
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
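CN115687632B (zh) * | 2022-08-25 | 2024-04-09 | 中国司法大数据研究院有限公司 | Method and system for decomposing and analyzing criminal sentencing circumstances |
CN115358896B (zh) * | 2022-10-20 | 2023-02-03 | 四川大学华西医院 | Method, apparatus, device, and medium for constructing a charge evolution network from massive documents |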
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130238316A1 (en) * | 2012-03-07 | 2013-09-12 | Infosys Limited | System and Method for Identifying Text in Legal documents for Preparation of Headnotes |
US20160140210A1 (en) * | 2014-11-19 | 2016-05-19 | Lexisnexis, A Division Of Reed Elsevier Inc. | Systems and methods for automatic identification of potential material facts in documents |
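CN106649849A (zh) * | 2016-12-30 | 2017-05-10 | 上海智臻智能网络科技股份有限公司 | Method and apparatus for establishing a text information base, and search method, apparatus, and system |
CN106815207A (zh) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | Information processing method and apparatus for legal judgment documents |
CN108874814A (zh) * | 2017-05-10 | 2018-11-23 | 北京国双科技有限公司 | Method and apparatus for processing legal documents |
CN109033249A (zh) * | 2018-07-05 | 2018-12-18 | 北京神州泰岳软件股份有限公司 | Information extraction method, apparatus, and storage medium for structured documents in the public security, procuratorial, and court domain |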
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
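CN101872439A (zh) * | 2010-01-29 | 2010-10-27 | 秦野 | Method and system for criminal-law sentencing for one hundred commonly used charges |
US20150081742A1 (en) * | 2013-07-01 | 2015-03-19 | Curtis Roys | Human enumeration and tracking |
CN106815208A (zh) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | Method and apparatus for parsing legal judgment documents |
CN107358550B (zh) * | 2017-06-08 | 2022-02-22 | 上海市高级人民法院 | Intelligent evidence verification method and review method for criminal cases, and storage medium and terminal device having the same |
CN107358558B (zh) * | 2017-06-08 | 2020-12-29 | 上海市高级人民法院 | Intelligent case-handling assistance method and system for criminal cases, and storage medium and terminal device having the same |
CN108073988B (zh) * | 2017-06-21 | 2021-09-03 | 北京华宇元典信息服务有限公司 | Legal cognition method, apparatus, and medium based on reinforcement learning |
CN107578355A (zh) * | 2017-09-08 | 2018-01-12 | 北京博雅英杰科技股份有限公司 | Sentencing method and apparatus |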
-
2018
- 2018-12-24 CN CN201811580587.2A patent/CN111428466B/zh active Active
-
2019
- 2019-12-20 WO PCT/CN2019/126934 patent/WO2020135247A1/zh active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130238316A1 (en) * | 2012-03-07 | 2013-09-12 | Infosys Limited | System and Method for Identifying Text in Legal documents for Preparation of Headnotes |
US20160140210A1 (en) * | 2014-11-19 | 2016-05-19 | Lexisnexis, A Division Of Reed Elsevier Inc. | Systems and methods for automatic identification of potential material facts in documents |
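CN106815207A (zh) * | 2015-12-01 | 2017-06-09 | 北京国双科技有限公司 | Information processing method and apparatus for legal judgment documents |
CN106649849A (zh) * | 2016-12-30 | 2017-05-10 | 上海智臻智能网络科技股份有限公司 | Method and apparatus for establishing a text information base, and search method, apparatus, and system |
CN108874814A (zh) * | 2017-05-10 | 2018-11-23 | 北京国双科技有限公司 | Method and apparatus for processing legal documents |
CN109033249A (zh) * | 2018-07-05 | 2018-12-18 | 北京神州泰岳软件股份有限公司 | Information extraction method, apparatus, and storage medium for structured documents in the public security, procuratorial, and court domain |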
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
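CN114037566A (zh) * | 2021-11-15 | 2022-02-11 | 中国人民大学 | Automatic conviction method based on fine-grained features and mutual information |
CN114138939A (zh) * | 2021-12-08 | 2022-03-04 | 河南大学 | Charge prediction method and system based on formal concept analysis |
CN116205350A (zh) * | 2023-01-12 | 2023-06-02 | 深圳市大数据研究院 | System and method for analyzing and predicting the personal dangerousness of recidivism based on legal documents |
CN116304035A (zh) * | 2023-02-28 | 2023-06-23 | 中国司法大数据研究院有限公司 | Method and apparatus for extracting multi-defendant, multi-charge relations in complex cases |
CN116304035B (zh) * | 2023-02-28 | 2023-11-03 | 中国司法大数据研究院有限公司 | Method and apparatus for extracting multi-defendant, multi-charge relations in complex cases |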
Also Published As
Publication number | Publication date |
---|---|
CN111428466B (zh) | 2022-04-01 |
CN111428466A (zh) | 2020-07-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19903112 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19903112 Country of ref document: EP Kind code of ref document: A1 |