CN110827177A - Case-like document searching method and device - Google Patents

Publication number
CN110827177A
Authority
CN
China
Prior art keywords
document
legal
case
documents
existing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810915510.XA
Other languages
Chinese (zh)
Inventor
杨丹
张朔境
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201810915510.XA priority Critical patent/CN110827177A/en
Publication of CN110827177A publication Critical patent/CN110827177A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Technology Law (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a case-like document searching method and device. The method determines the document type of an existing legal document and parses the document content of the existing legal document to obtain its document elements. It then searches, among the legal documents that match the document type of the existing legal document, for case-like documents that match its document elements. The method can search for case-like documents of multiple types of legal documents, and because the search is driven by the document elements of the existing legal document, the case-like documents found resemble the existing legal document in their document elements, which improves search accuracy.

Description

Case-like document searching method and device
Technical Field
The invention relates to the technical field of computers, in particular to a case-like document searching method and device.
Background
Case-like documents are legal documents that belong to the same class of case and are highly similar to one another in aspects such as cause of action, document type, main facts of the case, and reasoning.
Legal documents herein are non-normative legal documents, that is, legal documents in the narrow sense. They have no general binding force. The term covers all non-normative documents with legal effect or legal significance that are lawfully produced by state judicial organs, lawyers and law firms, arbitration bodies, notary bodies, and the parties to a case in handling litigation and non-litigation matters. Non-normative legal documents apply only to specific persons and specific matters.
Conventional case-like document searching covers only one type of legal document, namely referee documents (court judgments and rulings), so the legal documents found and pushed are all of a single type.
Disclosure of Invention
In view of the above problem, the present invention provides a case-like document searching method and device to solve the technical problem that existing case-like document searching methods return only a single type of legal document. The specific technical solution is as follows:
In a first aspect, the present application provides a case-like document searching method, including:
determining the document type of an existing legal document according to its content;
parsing the document content of the existing legal document to obtain document elements, wherein the document elements are facts that embody the characteristics of the case, affect the judgment result, and are subject to legal determination and reasoning by the judge;
searching, among the legal documents that match the document type of the existing legal document, for case-like documents that match the document elements of the existing legal document.
In one possible implementation of the present application, determining the document type according to the content of the existing legal document includes:
identifying the content of specified paragraphs in the existing legal document by using a keyword identification rule or a regular expression rule to obtain content features that characterize the document type;
determining the document type of the existing legal document according to the content features.
In one possible implementation of the present application, parsing the document content of the existing legal document to obtain the document elements includes:
segmenting the existing legal document according to its document type;
parsing the content of the specified paragraphs obtained by the segmentation to obtain the document elements of the existing legal document.
In one possible implementation of the present application, the document elements include at least one document element item;
searching, among the legal documents that match the document type, for case-like documents that match the document elements of the existing legal document includes:
searching a document library for candidate legal documents that match the document type of the existing legal document;
determining a case-like search model corresponding to the document type of the existing legal document, wherein the case-like search model includes document element items and the identification features corresponding to each document element item;
using the identification features in the case-like search model that correspond to each document element item of the existing legal document to identify, among the candidate legal documents, the legal documents that contain those document element items, thereby obtaining the case-like documents.
In one possible implementation of the present application, the method further includes:
breaking down and organizing the laws and regulations corresponding to a cause of action to obtain the document element items corresponding to that cause of action;
determining the document element identification feature corresponding to each document element item; and
determining the matching weight of each document element item in the paragraphs of legal documents of the document type, to obtain the case-like search model corresponding to the document type.
In one possible implementation of the present application, the method further includes:
pushing the case-like documents found that match the document elements of the existing legal document.
In one possible implementation of the present application, the method further includes:
marking the document elements contained in each case-like document that match the existing legal document;
displaying, side by side, the document elements contained in the existing legal document and the matching document elements contained in each case-like document;
pushing the case-like documents in descending order of the number of document elements they share with the existing legal document.
In a second aspect, the present application further provides a case-like document searching device, including:
a determining module, configured to determine the document type of an existing legal document according to its content;
a parsing module, configured to parse the document content of the existing legal document to obtain document elements, wherein the document elements are facts that embody the characteristics of the case, affect the judgment result, and are subject to legal determination and reasoning by the judge;
a searching module, configured to search, among the legal documents that match the document type of the existing legal document, for case-like documents that match the document elements of the existing legal document.
In a third aspect, the present application further provides a storage medium on which a program is stored, wherein the program, when executed by a processor, implements the case-like document searching method of any possible implementation of the first aspect.
In a fourth aspect, the present application further provides a processor configured to run a program, wherein the program, when run, executes the case-like document searching method of any possible implementation of the first aspect.
The case-like document searching method provided by the present application determines the document type of an existing legal document and then parses its document content to obtain the document elements. It then searches, among the legal documents that match the document type of the existing legal document, for case-like documents that match its document elements. The method can search for case-like documents of multiple types of legal documents, and because the search is driven by the document elements of the existing legal document, the case-like documents found resemble the existing legal document in their document elements, which improves search accuracy.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart illustrating a legal document pushing method according to an embodiment of the present application;
FIG. 2 is a flow chart illustrating another legal document pushing method according to an embodiment of the present application;
FIG. 3 is a flowchart of a legal document pushing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a comparative presentation document element according to an embodiment of the present application;
FIG. 5 is a block diagram of a case-like document searching device according to an embodiment of the present application;
FIG. 6 is a block diagram of another case-like document searching device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Referring to FIG. 1, a flowchart of a case-like document searching method according to an embodiment of the present application is shown. The method is applied to a server and, given an input existing legal document, searches for case-like documents that have the same document type as the existing legal document and matching document elements.
As shown in FIG. 1, the method may include the following steps:
S110, acquiring the existing legal document.
The existing legal document is usually the legal document that the user selects as the basis for pushing case-like documents.
S120, determining the document type according to the content of the existing legal document.
For example, the legal documents referred to in this application may include referee documents, prosecution notes, complaints, court notes, and the like.
For each type of legal document, the law prescribes a corresponding document format, so the document type of a legal document can be identified from its format features.
Specifically, a keyword identification rule or a regular expression rule is applied to the title or the first few paragraphs of the legal document to identify content features that characterize the document type, such as specific keywords and litigation-status content, and the document type of the existing legal document is then determined from those content features.
For example, when the title and the specified paragraphs of a legal document are found to contain the keyword "prosecution" and also the keyword "inspection institute" or "inspection house", the document type of the legal document is determined to be "prosecution".
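The keyword check in the prosecution example can be sketched as follows. This is a minimal sketch rather than the patent's implementation: the rule table TYPE_RULES, the function name detect_document_type, and the head parameter are hypothetical, and the English keywords stand in for the keywords quoted above.

```python
import re

# Hypothetical rule table: document type -> patterns that must ALL match
# in the title plus the first few paragraphs. The English keywords are
# stand-ins for the keywords quoted in the text.
TYPE_RULES = {
    "prosecution": [
        re.compile(r"prosecution"),
        re.compile(r"inspection institute|inspection house"),
    ],
}


def detect_document_type(title, paragraphs, head=3):
    """Return the first document type whose patterns all occur in the header text."""
    # Only the title and the first `head` paragraphs are inspected.
    header = " ".join([title] + paragraphs[:head]).lower()
    for doc_type, patterns in TYPE_RULES.items():
        if all(p.search(header) for p in patterns):
            return doc_type
    return None  # no rule matched
```

Restricting the check to the title and leading paragraphs mirrors the text, which locates the type-identifying features there rather than in the body of the document.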
S130, parsing the document content of the existing legal document to obtain the document elements.
The document elements are facts that embody the characteristics of the case, affect the judgment result, and are subject to legal determination and reasoning by the judge.
The existing legal document is segmented according to the format features of its document type. Different document types have their own content features, so the paragraph headings and the content of the resulting segments may differ. The content of the specified paragraphs obtained by the segmentation is then parsed to obtain the document elements of the existing legal document.
For example, if the existing legal document is a prosecution, it is segmented according to the format features of a prosecution into a procuratorate section, a document type section, a prosecution number section, a defendant information section, a cause of action section, a case fact section, a public prosecution evidence section, a prosecution request and basis section, a closing section, an addressed court section, a procurator section, and a prosecution date section. The paragraphs relevant to document elements, such as case status, case fact, and court of trial, are then parsed to obtain the document elements of the prosecution.
If the existing legal document is a referee document, it is segmented according to the format features of a referee document into a complaint section, a defense section, a fact-finding section, a court opinion section, and a judgment section, and the content of these sections is parsed to obtain the document elements of the referee document.
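The type-dependent segmentation described above might look like the following sketch. The heading names, the "Heading:" line format, and the names SECTION_HEADINGS and split_sections are all assumptions made for illustration; real documents would be split on whatever headings the prescribed format for that document type uses.

```python
import re

# Hypothetical section headings per document type (illustrative only).
SECTION_HEADINGS = {
    "referee document": [
        "complaint", "defense", "fact finding", "court opinion", "judgment",
    ],
}


def split_sections(text, doc_type):
    """Split text into {heading: body} using 'Heading:' markers at line starts."""
    headings = SECTION_HEADINGS[doc_type]
    pattern = re.compile(
        r"^(%s):" % "|".join(re.escape(h) for h in headings),
        re.IGNORECASE | re.MULTILINE,
    )
    matches = list(pattern.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        # A section runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1).lower()] = text[m.end():end].strip()
    return sections
```

The resulting dictionary lets a later parsing step look only at the sections that are relevant to document elements for that document type.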
S140, searching, among the legal documents that match the document type of the existing legal document, for case-like documents that match the document elements of the existing legal document.
Legal documents that match the document type of the existing legal document are selected from the document library. The document elements of those legal documents are then identified, and the legal documents whose document elements match those of the existing legal document are determined to be the case-like documents of the existing legal document.
Optionally, after the case-like documents matching the existing legal document are found, they are pushed to the client.
The case-like document searching method provided in this embodiment determines the document type of an existing legal document and then parses its document content to obtain the document elements. It then searches, among the legal documents that match the document type of the existing legal document, for case-like documents that match its document elements. The method can search for case-like documents of multiple types of legal documents, and because the search is driven by the document elements of the existing legal document, the case-like documents found resemble the existing legal document in their document elements, which improves search accuracy.
Referring to FIG. 2, a flowchart of another case-like document searching method according to an embodiment of the present application is shown. The method includes the following steps:
S210, breaking down and organizing the laws and regulations corresponding to a cause of action to obtain the document element items corresponding to that cause of action.
For each cause of action, the relevant laws and regulations are broken down and organized in advance to obtain the document element items corresponding to that cause of action, and the identification feature corresponding to each document element item is then determined.
A cause of action is a summary of the nature and content of the legal relationship involved in a particular litigation case. Causes of action differ between case types:
For example, the cause of action of a criminal case includes at least the charge against the offender, e.g., the crime of intentional injury. Civil cases involve a very large number of causes of action, which need to be organized into first-level, second-level, and third-level causes of action. A first-level cause of action, such as "marriage and family disputes", usually does not appear directly in referee documents, whereas second-level and third-level causes of action usually do. Administrative causes of action are divided into the action type, the omission type, and the administrative compensation type; examples include "public security administrative penalty" and "tax administrative compensation".
For example, for the criminal cause of action "illegal detention", the relevant legal provisions are collected: Article 238 of the Criminal Law, and local implementation rules such as items 1 to 5 of part four (illegal detention) of the Hubei Higher People's Court's implementation rules for the "Sentencing Guidance Opinions on Common Crimes".
The relevant laws and regulations for the crime of illegal detention are then broken down one by one to extract document element items. For example, Article 238 of the Criminal Law provides: whoever illegally detains another person or illegally deprives another person of personal freedom by other means shall be sentenced to fixed-term imprisonment of not more than three years, criminal detention, public surveillance, or deprivation of political rights. Where beating or insulting is involved, a heavier punishment shall be imposed. Whoever commits the crime in the preceding paragraph and causes serious injury shall be sentenced to fixed-term imprisonment of not less than three years and not more than ten years; if death is caused, to fixed-term imprisonment of not less than ten years. Where violence is used and causes disability or death, the offender shall be convicted and punished under Articles 234 and 232 of this Law. Whoever illegally detains or confines another person in order to collect a debt shall be punished under the preceding two paragraphs. Where a functionary of a state organ commits any of the crimes in the preceding three paragraphs by abusing his or her authority, a heavier punishment shall be imposed under the preceding three paragraphs. From this text, items such as beating or insulting, causing serious injury, and causing disability or death by violence can be extracted and arranged into the document element table shown in Table 1.
TABLE 1
Document element items
Illegal detention
Illegal detention while impersonating military or police personnel | judicial personnel
Illegal detention by a state organ functionary abusing authority
Illegal detention with a weapon
Circumstances of beating | insult | abuse | affliction
Multiple illegal detentions
Illegal detention to collect an illegal debt
Illegal detention to collect a lawful debt
Illegal detention lasting over 24 hours
Illegal detention causing minor injury
Illegal detention causing light injury
Illegal detention causing serious injury
Illegal detention causing death
Grade X disability (X varies with the document, e.g., Grade 1 disability)
Mental disorder of the victim
S220, determining the document element identification feature corresponding to each document element item under the cause of action.
For each document type, sample legal documents can be collected for each cause of action, and the document element identification features for that document type under that cause of action can be summarized from them.
For each document element item, at least one of the following identification rules is selected according to the specific sentences that appear in the sample legal documents. The identification rules include, but are not limited to, the following four types:
1) Keyword rule: if a paragraph of the referee document contains the keyword, the referee document is determined to match the case scenario corresponding to the case scenario item.
2) Regular expression rule: if a paragraph of the referee document contains a sentence that matches the regular expression, the referee document is determined to match the case scenario corresponding to the case scenario item.
For example, the weapon in a sample referee document may be a specific implement such as a machete, a pickaxe handle, an electric baton, or restraints; similarly, "illegal detention" in a sample referee document may cover many specific scenarios, such as detaining, confining, or restricting.
A single keyword "weapon" therefore cannot cover all of these implements, and a single keyword "detention" cannot cover all of these scenarios, so a regular expression can be chosen as the scenario identification feature for these two items. For example, the regular expression may be (weapon|machete|pickaxe handle|electric baton|handcuffs|restraints).{0,40}(detain|hold|restrict|deprive|confine). When a paragraph of the referee document contains any word from (weapon|machete|pickaxe handle|electric baton|handcuffs|restraints) followed within 0 to 40 characters by any word from (detain|hold|restrict|deprive|confine), the paragraph is determined to match the regular expression rule; {0,40} in this example means that the number of characters between the two feature words may be 0 to 40.
3) Regular expression content rule: if a paragraph of the referee document contains a sentence that matches the regular expression and the content of that sentence satisfies the rule, the referee document is determined to match the case scenario corresponding to the case scenario item.
A regular expression content rule not only identifies sentences that match the pattern but also checks the specific content they contain, for example numeric information such as "1 person seriously injured" or "theft amount of 5439 yuan".
For example, if a case scenario in the existing document is "seriously injured 2 persons", referee documents containing "seriously injured 2 persons" can be found by using "seriously injured 2 persons" as the regular expression content.
Moreover, the specific content in a case scenario item shown in table 2 can be adjusted according to the specific content of the existing document. For example, if the existing document states a particular disability grade, the "Grade X disability" item can be adjusted to that grade, and referee documents containing that scenario are identified using the scenario identification rule corresponding to the case scenario item and pushed to the user.
4) Applicable law rule: if the laws applied in the referee document include the legal provision specified by the rule, the referee document is determined to match the case scenario corresponding to the case scenario item.
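The four rule types listed above can be illustrated with the following minimal sketch. The function names, the English stand-in vocabulary, and the representation of applied laws as a set of strings are assumptions for illustration only; the patterns are not taken from any real rule base.

```python
import re


def keyword_rule(paragraph, keyword):
    """Rule 1: the paragraph matches if it contains the keyword."""
    return keyword in paragraph


# Rule 2: two feature-word groups separated by 0 to 40 arbitrary characters.
PROXIMITY = re.compile(
    r"(weapon|machete|pickaxe handle|electric baton|handcuffs|restraints)"
    r".{0,40}"
    r"(detain|hold|restrict|deprive|confine)"
)


def regex_rule(paragraph, pattern=PROXIMITY):
    """Rule 2: the paragraph matches if the regular expression is found in it."""
    return pattern.search(paragraph) is not None


# Rule 3: the pattern must match AND the captured content must equal a target.
CONTENT = re.compile(r"seriously injured (\d+) persons?")


def regex_content_rule(paragraph, expected_count):
    """Rule 3: match the sentence, then compare the captured number."""
    m = CONTENT.search(paragraph)
    return m is not None and int(m.group(1)) == expected_count


def applicable_law_rule(applied_articles, required_article):
    """Rule 4: the document matches if it applies the required provision."""
    return required_article in applied_articles
```

The content rule differs from the plain regular expression rule only in the extra comparison on the captured group, which is what allows a search for exactly "seriously injured 2 persons" rather than any injury count.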
After the correspondence among each cause of action, its case scenario items, and their scenario identification features is obtained through S210 to S230, the target scenario identification features that match the case scenarios of the existing document are determined according to that correspondence.
S230, determining the paragraphs used for document element matching in legal documents of each document type, to obtain the case-like search model corresponding to the document type.
For each document type, the paragraphs of legal documents of that type in which document elements are matched are determined, that is, which paragraphs of a candidate legal document are searched for a given document element item. For example, according to the content format rules of a prosecution, document elements are identified in its case fact section and its prosecution request and basis section.
The case-like search model corresponding to the document type is obtained from the document element items, the document element identification features, and the matching paragraphs of the document elements.
S210 to S230 are preprocessing steps and are usually executed before case-like pushing. Preprocessing yields a case-like search model for each document type, and the corresponding model is then used directly to identify the document elements contained in the legal documents in the document library.
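One way to picture the model produced by this preprocessing is as a per-document-type list of element items, each carrying its identification feature and the paragraphs it is matched in. The sketch below assumes that shape; the class names, field names, section names, and patterns are hypothetical, and it deliberately omits any weighting.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ElementItem:
    name: str                 # document element item, e.g. "serious injury"
    pattern: "re.Pattern"     # identification feature (here: a regex)
    paragraphs: tuple         # sections in which the item is matched


@dataclass
class CaseLikeSearchModel:
    doc_type: str
    items: list = field(default_factory=list)

    def extract(self, sections):
        """Return the names of element items found in their matching paragraphs."""
        found = set()
        for item in self.items:
            for name in item.paragraphs:
                if item.pattern.search(sections.get(name, "")):
                    found.add(item.name)
                    break  # one hit per item is enough
        return found


# Hypothetical model for one document type.
model = CaseLikeSearchModel("prosecution", [
    ElementItem("serious injury", re.compile(r"serious(ly)? injur"), ("case facts",)),
    ElementItem("use of a weapon", re.compile(r"weapon|machete"), ("case facts",)),
])
```

Because each item names its own matching paragraphs, a keyword that appears only in an unrelated section (for example a closing section) does not count as a hit for that item.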
S240, acquiring the existing legal document, and determining the document type according to its content.
S250, segmenting the existing legal document according to its document type.
S260, parsing the content of the specified paragraphs obtained by segmenting the existing legal document to obtain the document elements of the existing legal document.
S240 to S260 in this embodiment are the same as steps S110 to S130 in the embodiment shown in FIG. 1, and are not described again here.
S270, searching the document library for candidate legal documents that match the document type of the existing legal document.
After the document type of the existing legal document is determined, legal documents of the same document type are selected from the document library as the candidate legal documents.
S280, using the identification features that correspond to the document element items of the existing legal document, as contained in the case-like search model corresponding to its document type, to identify among the candidate legal documents those that contain the document element items, thereby obtaining the case-like documents.
The case-like search model corresponding to the document type of the existing legal document is selected, and the identification features in that model that correspond to the document element items of the existing legal document are used to identify, among the candidate legal documents, those that contain the document elements of the existing legal document; these are the case-like documents.
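Steps S270 and S280 can be sketched together as a type filter followed by element matching. Everything named here is an assumption for illustration: the FEATURES table, the dictionary layout of a document ("id", "doc_type", "text"), and the decision to keep any candidate that shares at least one element item.

```python
import re

# Hypothetical identification features, keyed by document element item.
FEATURES = {
    "serious injury": re.compile(r"serious(ly)? injur"),
    "use of a weapon": re.compile(r"weapon|machete|electric baton"),
    "death": re.compile(r"death|died"),
}


def elements_of(text):
    """Return the element items whose identification feature matches the text."""
    return {name for name, pat in FEATURES.items() if pat.search(text)}


def find_case_like(existing, library):
    """S270: keep candidates of the same document type.
    S280: keep candidates sharing element items with the existing document."""
    wanted = elements_of(existing["text"])
    results = []
    for doc in library:
        if doc["doc_type"] != existing["doc_type"]:
            continue  # S270: document type must match
        matched = wanted & elements_of(doc["text"])
        if matched:
            results.append((doc["id"], matched))
    return results
```

Only the element items found in the existing document are searched for in the candidates, so a candidate's additional elements do not affect whether it is returned.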
S290, pushing the case-like documents.
In one embodiment of the present application, when the existing legal document contains multiple document elements, the candidate legal documents containing each document element are identified one element at a time. After all document elements of the existing legal document have been traversed, multiple case-like documents are found. The case-like documents are then sorted in descending order of the number of document element items they share with the existing legal document and pushed in that order.
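The ordering described above amounts to counting shared element items and sorting in descending order. A minimal sketch, assuming element items are held as Python sets and that rank_case_like is a hypothetical helper name:

```python
def rank_case_like(existing_items, candidates):
    """Rank candidates by the number of element items shared with the existing document.

    existing_items: set of element items of the existing legal document.
    candidates: {doc_id: set of element items found in that candidate}.
    Returns [(doc_id, shared_count)] in descending order of shared_count,
    omitting candidates that share nothing.
    """
    scored = [
        (doc_id, len(existing_items & items))
        for doc_id, items in candidates.items()
    ]
    scored = [entry for entry in scored if entry[1] > 0]
    # sorted() is stable, so ties keep their original relative order.
    return sorted(scored, key=lambda entry: entry[1], reverse=True)
```

How ties are broken is not specified in the text; the sketch simply preserves input order for equal counts.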
The case-like document searching method provided by this embodiment breaks down and organizes the content of laws and regulations in advance to obtain the document element items corresponding to each cause of action, and then determines the identification feature corresponding to each document element item. It acquires an existing legal document, determines its document type, and parses it to obtain the document elements it contains. Then, among the legal documents that match the document type of the existing legal document, it uses the case-like search model obtained in advance for that document type to search for case-like documents that match the document elements of the existing legal document. The method can push case-like documents of multiple types of legal documents, and because pushing is driven by the document elements of the existing legal document, the pushed case-like documents resemble the existing legal document in their document elements, which improves the accuracy of case-like pushing.
Referring to fig. 3, a flowchart of another case-like document searching method according to an embodiment of the present application is shown. Before the case-like documents are pushed as in any of the method embodiments described above, the method further includes the following steps:
S310, marking the document elements contained in each case-like document that match the existing legal document.
For each case-like document found by the case-like search model, all document element items contained in it that match the existing legal document are marked.
S320, displaying, in comparison, the document elements contained in the existing legal document and the document elements contained in each case-like document that match the existing legal document.
The document elements contained in the existing legal document and the matching document elements contained in each pushed case-like document are shown together in an aggregated visual display.
Fig. 4 is a schematic diagram of the comparative display of document elements. As shown in fig. 4, the existing legal document concerns the case "crime of intentional injury", and the document element items it contains are "serious injury", "use of a weapon", "obtaining the forgiveness or understanding of the victim or the victim's family", "victim at fault", "death", and "report"; the document elements contained in case-like document 1 are "serious injury", "use of a weapon", and "obtaining the forgiveness or understanding of the victim or the victim's family".
The document element items contained in the existing legal document are given priority and listed first, and the document element items contained in the case-like documents are listed after them. "V" marks a document element item contained in both the existing legal document and the case-like document, and "-" marks a document element item contained in the existing legal document but not in the case-like document. Displaying the document elements in this aggregated, visual, side-by-side manner shows at a glance how the document elements of a case-like document compare with those of the existing legal document and how similar the two documents are.
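The marking and comparison of S310 and S320 can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `compare_elements` and the sample element items are hypothetical, and the sketch prints plain text rather than the graphical display of fig. 4.

```python
def compare_elements(existing_items, case_items):
    """Return one (item, marker) row per element item of the existing document.

    "V" means the case-like document also contains the item,
    "-" means only the existing legal document contains it.
    """
    case_set = set(case_items)
    return [(item, "V" if item in case_set else "-") for item in existing_items]


# Hypothetical sample data, loosely modeled on the fig. 4 description.
existing = ["serious injury", "use of weapon", "death"]
case_document_1 = ["serious injury", "use of weapon"]

for item, marker in compare_elements(existing, case_document_1):
    print(f"{marker}  {item}")
```

Keeping the rows in the order of the existing document's items mirrors the rule that those items are listed first.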
The case-like documents may be pushed using the process described in S330.
S330, pushing the case-like documents in descending order of the number of document elements that match the existing legal document.
That is, case-like documents with more document element items matching the existing legal document are pushed before those with fewer.
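A minimal sketch of the ordering in S330, assuming each case-like document is represented simply by the list of element items it contains. The function name `rank_case_documents` and the document identifiers are hypothetical; the text does not say how ties are handled, so this sketch keeps tied documents in their original order.

```python
def rank_case_documents(existing_items, candidates):
    """Order case-like documents by how many element items they share
    with the existing legal document, largest count first.

    candidates maps a document id to the element items it contains.
    """
    existing = set(existing_items)
    scored = [(doc_id, len(existing & set(items)))
              for doc_id, items in candidates.items()]
    # sorted() is stable, so documents with equal counts keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_case_documents(
    ["serious injury", "use of weapon", "death"],
    {
        "doc1": ["serious injury"],
        "doc2": ["serious injury", "use of weapon"],
    },
)
print(ranking)
```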
In the case-like document searching method provided by this embodiment, after the case-like documents matching the document elements of the existing legal document are determined, the document element items contained in each case-like document that match the existing legal document are marked. The document elements of the existing legal document are then displayed in comparison with the matching document elements contained in the case-like documents. This shows at a glance how the document elements of a case-like document compare with those of the existing legal document and how similar the two documents are. In addition, the case-like documents are pushed in descending order of the number of document element items that match the existing legal document, so that the case-like documents closest to the existing legal document are pushed first.
Corresponding to the above embodiments of the case-like document searching method, the present application also provides embodiments of a case-like document searching device.
Referring to fig. 5, a block diagram of a case-like document searching device according to an embodiment of the present application is shown. The device is applied to a server and, for an input existing legal document, searches for case-like documents that have the same document type as the existing legal document and matching document elements. As shown in fig. 5, the device includes a determining module 110, a parsing module 120, and a searching module 130.
The determining module 110 is used for determining the document type according to the content of the existing legal document.
In an embodiment of the present application, the determining module 110 is specifically configured to: identify the content of a specified paragraph in the existing legal document by using a keyword identification rule or a regular expression rule to obtain content features characterizing the document type; and determine the document type of the existing legal document according to the content features.
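As a rough illustration of how a regular expression rule could map a specified paragraph to a document type, consider the sketch below. The type names, the patterns in `TYPE_RULES`, and the function name `determine_document_type` are all assumptions made for the example; the application does not disclose concrete rules.

```python
import re

# Hypothetical rules: each document type is paired with a pattern that is
# expected to appear in the specified paragraph (for example, the title).
TYPE_RULES = [
    ("criminal judgment", re.compile(r"criminal\s+judgment", re.IGNORECASE)),
    ("civil judgment", re.compile(r"civil\s+judgment", re.IGNORECASE)),
    ("indictment", re.compile(r"\bindictment\b", re.IGNORECASE)),
]


def determine_document_type(specified_paragraph):
    """Return the first document type whose rule matches, or None."""
    for doc_type, pattern in TYPE_RULES:
        if pattern.search(specified_paragraph):
            return doc_type
    return None


print(determine_document_type("People's Court Criminal Judgment"))
```

Rules are tried in list order, so more specific patterns should be placed before more general ones.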
The parsing module 120 is used for parsing the document content of the existing legal document to obtain the document elements.
The document elements are contents that embody case characteristics and are facts affecting the judgment result on which a judge makes a legal determination and gives reasoning.
In an embodiment of the present application, the parsing module 120 is specifically configured to: segment the existing legal document into paragraphs according to its document type; and parse the content of the specified paragraphs obtained by the segmentation to obtain the document elements of the existing legal document.
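The two-stage parsing (segment by document type, then parse a specified paragraph) might look like the following sketch. The section headings in `SECTION_HEADINGS`, the keyword lists in `ELEMENT_FEATURES`, and both function names are hypothetical; real legal documents would need rules matched to their actual layout.

```python
import re

# Hypothetical headings that delimit paragraphs for each document type.
SECTION_HEADINGS = {
    "criminal judgment": ["Facts found", "Court opinion", "Judgment"],
}

# Hypothetical keyword features for a few element items.
ELEMENT_FEATURES = {
    "serious injury": ["serious injury", "seriously injured"],
    "use of weapon": ["knife", "weapon"],
}


def segment_document(text, doc_type):
    """Split text into named paragraphs using the headings for doc_type."""
    headings = SECTION_HEADINGS.get(doc_type, [])
    if not headings:
        return {}
    alternatives = "|".join(re.escape(h) for h in headings)
    pattern = re.compile(rf"^({alternatives}):\s*", re.MULTILINE)
    matches = list(pattern.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.end():end].strip()
    return sections


def extract_elements(paragraph):
    """Return the element items whose keywords occur in the paragraph."""
    lowered = paragraph.lower()
    return [item for item, keywords in ELEMENT_FEATURES.items()
            if any(keyword in lowered for keyword in keywords)]


sample = (
    "Facts found: The defendant used a knife and the victim was seriously injured.\n"
    "Court opinion: The act constitutes a crime.\n"
)
sections = segment_document(sample, "criminal judgment")
print(extract_elements(sections["Facts found"]))
```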
The searching module 130 is used for searching, among the legal documents matching the document type of the existing legal document, for case-like documents matching the document elements of the existing legal document.
In an embodiment of the present application, the document elements include at least one document element item, and the searching module 130 is specifically configured to:
search a document library for candidate legal documents matching the document type of the existing legal document; determine the case-like search model corresponding to the document type of the existing legal document, wherein the case-like search model includes document element items and the identification features corresponding to the document element items; and identify, among the candidate legal documents, the legal documents containing the document element items by using the identification features in the case-like search model that correspond to each document element item of the existing legal document, to obtain the case-like documents.
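The three operations of the searching module can be sketched as follows. The `LegalDocument` class, the dictionary form of the model (element item mapped to keyword features), and the choice to keep any candidate that matches at least one element item are all assumptions for illustration; the application does not fix these details.

```python
from dataclasses import dataclass


@dataclass
class LegalDocument:
    doc_id: str
    doc_type: str
    text: str


def find_case_like_documents(existing_type, existing_items, library, model):
    """Return {doc_id: matched element items} for candidates of the same type.

    model maps each element item to its identification features, which are
    simplified here to lowercase keywords.
    """
    # Step 1: candidate legal documents with a matching document type.
    candidates = [doc for doc in library if doc.doc_type == existing_type]
    results = {}
    for doc in candidates:
        lowered = doc.text.lower()
        # Steps 2 and 3: apply the features of each element item of the
        # existing document to the candidate text.
        matched = [item for item in existing_items
                   if any(keyword in lowered for keyword in model.get(item, []))]
        if matched:
            results[doc.doc_id] = matched
    return results


library = [
    LegalDocument("d1", "criminal judgment", "The victim suffered serious injury from a knife."),
    LegalDocument("d2", "civil judgment", "The plaintiff claims serious injury."),
    LegalDocument("d3", "criminal judgment", "No relevant facts."),
]
model = {
    "serious injury": ["serious injury"],
    "use of weapon": ["knife", "weapon"],
    "death": ["died", "death"],
}
print(find_case_like_documents(
    "criminal judgment", ["serious injury", "use of weapon", "death"], library, model))
```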
In one embodiment of the present application, the case-like search model is obtained as follows:
the content of the laws and regulations corresponding to a case is broken down and organized to obtain the document element items corresponding to the case; the identification features corresponding to each document element item are determined, and the matching weight of each document element item in the paragraphs of legal documents of the document type is determined, to obtain the case-like search model corresponding to the document type.
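One way to picture such a model is as a mapping from each element item to its identification features and per-paragraph matching weights. The sketch below is an assumption-laden illustration: the data layout, the function names `build_search_model` and `weighted_score`, and in particular the scoring rule (take the highest paragraph weight among paragraphs where the item is found) are not taken from the application, which does not specify how the weights are applied.

```python
def build_search_model(element_rules):
    """element_rules maps item -> (keywords, {paragraph name: weight})."""
    return {
        item: {"features": [k.lower() for k in keywords], "weights": dict(weights)}
        for item, (keywords, weights) in element_rules.items()
    }


def weighted_score(model, sections):
    """Sum, over element items, the best weight of a paragraph containing the item."""
    total = 0.0
    for rule in model.values():
        hits = [rule["weights"].get(name, 0.0)
                for name, text in sections.items()
                if any(k in text.lower() for k in rule["features"])]
        total += max(hits, default=0.0)
    return total


# Hypothetical model for one document type with two element items.
model = build_search_model({
    "serious injury": (["serious injury"], {"facts": 1.0, "opinion": 0.5}),
    "use of weapon": (["knife"], {"facts": 1.0}),
})
sections = {
    "facts": "He used a knife.",
    "opinion": "The serious injury is established.",
}
print(weighted_score(model, sections))
```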
The case-like document searching device provided by this embodiment determines the document type of the existing legal document and then parses its document content to obtain the document elements. Among the legal documents matching the document type of the existing legal document, it searches for case-like documents matching the document elements of the existing legal document. The device can search for case-like documents for various types of legal documents. Because the search is based on the document elements of the existing legal document, the case-like documents found are similar to the existing legal document in their document elements, which improves the accuracy of the case-like document search.
Referring to fig. 6, a block diagram of another case-like document searching device according to an embodiment of the present application is shown. On the basis of the embodiment shown in fig. 5, the device further includes a marking module 210, a display module 220, and a pushing module 230.
The marking module 210 is used for marking the document elements contained in each case-like document that match the existing legal document.
For each case-like document found by the case-like search model, all document element items contained in it that match the existing legal document are marked.
The display module 220 is used for displaying, in comparison, the document elements contained in the existing legal document and the document elements contained in each case-like document that match the existing legal document.
The document elements contained in the existing legal document and the matching document elements contained in each pushed case-like document can be displayed in the manner shown in fig. 4.
The pushing module 230 is used for pushing the case-like documents in descending order of the number of document elements that match the existing legal document.
That is, case-like documents with more document element items matching the existing legal document are pushed before those with fewer.
In another embodiment of the present application, the case-like documents found by the searching module may be pushed directly; details are not repeated here.
After determining the case-like documents matching the document elements of the existing legal document, the case-like document searching device provided by this embodiment marks the document element items contained in each case-like document that match the existing legal document. The document elements of the existing legal document are then displayed in comparison with the matching document elements contained in the case-like documents. This shows at a glance how the document elements of a case-like document compare with those of the existing legal document and how similar the two documents are. In addition, the case-like documents are pushed in descending order of the number of document element items that match the existing legal document, so that the case-like documents closest to the existing legal document are pushed first.
The case-like document searching device includes a processor and a memory. The determining module, the parsing module, the searching module, and so on are stored in the memory as program units, and the processor executes the program units stored in the memory to realize the corresponding functions.
The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, case-like documents are searched for across multiple document types according to the document elements of the existing legal document, which ensures the accuracy of the case-like documents found.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored; when the program is executed by a processor, the case-like document searching method is implemented.
An embodiment of the present invention provides a processor for running a program, wherein the case-like document searching method is executed when the program runs.
An embodiment of the present invention provides a device, which may be a server, a PC, a PAD, a mobile phone, or the like.
The device includes a processor, a memory, and a program stored in the memory and executable on the processor; when executing the program, the processor implements the following steps:
determining the document type according to the content of the existing legal document;
parsing the document content of the existing legal document to obtain document elements, wherein the document elements are contents that embody case characteristics and are facts affecting the judgment result on which a judge makes a legal determination and gives reasoning;
searching, among the legal documents matching the document type of the existing legal document, for case-like documents matching the document elements of the existing legal document.
In one possible implementation manner of the present application, determining the document type according to the content of the existing legal document includes:
identifying the content of a specified paragraph in the existing legal document by using a keyword identification rule or a regular expression rule to obtain content features characterizing the document type;
determining the document type of the existing legal document according to the content features.
In one possible implementation manner of the present application, parsing the document content of the existing legal document to obtain the document elements includes:
segmenting the existing legal document into paragraphs according to the document type of the existing legal document;
parsing the content of the specified paragraphs obtained by segmenting the existing legal document to obtain the document elements of the existing legal document.
In one possible implementation manner of the present application, the document elements include at least one document element item;
searching, among the legal documents matching the document type, for case-like documents matching the document elements of the existing legal document includes:
searching a document library for candidate legal documents matching the document type of the existing legal document;
determining the case-like search model corresponding to the document type of the existing legal document, wherein the case-like search model includes document element items and the identification features corresponding to the document element items;
identifying, among the candidate legal documents, the legal documents containing the document element items by using the identification features in the case-like search model that correspond to each document element item of the existing legal document, to obtain the case-like documents.
In one possible implementation manner of the present application, the method further includes:
breaking down and organizing the content of the laws and regulations corresponding to a case to obtain the document element items corresponding to the case;
determining the identification features corresponding to each document element item; and
determining the matching weight of each document element item in the paragraphs of legal documents of the document type, to obtain the case-like search model corresponding to the document type.
In one possible implementation manner of the present application, the method further includes:
pushing the found case-like documents matching the document elements of the existing legal document.
In one possible implementation manner of the present application, the method further includes:
marking the document elements contained in each case-like document that match the existing legal document;
displaying, in comparison, the document elements contained in the existing legal document and the document elements contained in each case-like document that match the existing legal document;
pushing the case-like documents in descending order of the number of document elements that match the existing legal document.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
determining the document type according to the content of the existing legal document;
parsing the document content of the existing legal document to obtain document elements, wherein the document elements are contents that embody case characteristics and are facts affecting the judgment result on which a judge makes a legal determination and gives reasoning;
searching, among the legal documents matching the document type of the existing legal document, for case-like documents matching the document elements of the existing legal document.
In one possible implementation manner of the present application, determining the document type according to the content of the existing legal document includes:
identifying the content of a specified paragraph in the existing legal document by using a keyword identification rule or a regular expression rule to obtain content features characterizing the document type;
determining the document type of the existing legal document according to the content features.
In one possible implementation manner of the present application, parsing the document content of the existing legal document to obtain the document elements includes:
segmenting the existing legal document into paragraphs according to the document type of the existing legal document;
parsing the content of the specified paragraphs obtained by segmenting the existing legal document to obtain the document elements of the existing legal document.
In one possible implementation manner of the present application, the document elements include at least one document element item;
searching, among the legal documents matching the document type, for case-like documents matching the document elements of the existing legal document includes:
searching a document library for candidate legal documents matching the document type of the existing legal document;
determining the case-like search model corresponding to the document type of the existing legal document, wherein the case-like search model includes document element items and the identification features corresponding to the document element items;
identifying, among the candidate legal documents, the legal documents containing the document element items by using the identification features in the case-like search model that correspond to each document element item of the existing legal document, to obtain the case-like documents.
In one possible implementation manner of the present application, the method further includes:
breaking down and organizing the content of the laws and regulations corresponding to a case to obtain the document element items corresponding to the case;
determining the identification features corresponding to each document element item; and
determining the matching weight of each document element item in the paragraphs of legal documents of the document type, to obtain the case-like search model corresponding to the document type.
In one possible implementation manner of the present application, the method further includes:
pushing the found case-like documents matching the document elements of the existing legal document.
In one possible implementation manner of the present application, the method further includes:
marking the document elements contained in each case-like document that match the existing legal document;
displaying, in comparison, the document elements contained in the existing legal document and the document elements contained in each case-like document that match the existing legal document;
pushing the case-like documents in descending order of the number of document elements that match the existing legal document.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A case-like document searching method, comprising:
determining the document type according to the content of the existing legal document;
parsing the document content of the existing legal document to obtain document elements, wherein the document elements are contents that embody case characteristics and are facts affecting the judgment result on which a judge makes a legal determination and gives reasoning;
searching, among the legal documents matching the document type of the existing legal document, for case-like documents matching the document elements of the existing legal document.
2. The method of claim 1, wherein determining the document type according to the content of the existing legal document comprises:
identifying the content of a specified paragraph in the existing legal document by using a keyword identification rule or a regular expression rule to obtain content features characterizing the document type;
determining the document type of the existing legal document according to the content features.
3. The method of claim 1, wherein parsing the document content of the existing legal document to obtain document elements comprises:
segmenting the existing legal document into paragraphs according to the document type of the existing legal document;
parsing the content of the specified paragraphs obtained by segmenting the existing legal document to obtain the document elements of the existing legal document.
4. The method of claim 1, wherein the document elements comprise at least one document element item;
searching, among the legal documents matching the document type, for case-like documents matching the document elements of the existing legal document comprises:
searching a document library for candidate legal documents matching the document type of the existing legal document;
determining the case-like search model corresponding to the document type of the existing legal document, wherein the case-like search model comprises document element items and the identification features corresponding to the document element items;
identifying, among the candidate legal documents, the legal documents containing the document element items by using the identification features in the case-like search model that correspond to each document element item of the existing legal document, to obtain the case-like documents.
5. The method of claim 4, further comprising:
breaking down and organizing the content of the laws and regulations corresponding to a case to obtain the document element items corresponding to the case;
determining the identification features corresponding to each document element item; and
determining the matching weight of each document element item in the paragraphs of legal documents of the document type, to obtain the case-like search model corresponding to the document type.
6. The method according to any one of claims 1-5, further comprising:
pushing the found case-like documents matching the document elements of the existing legal document.
7. The method according to any one of claims 1-5, further comprising:
marking the document elements contained in each case-like document that match the existing legal document;
displaying, in comparison, the document elements contained in the existing legal document and the document elements contained in each case-like document that match the existing legal document;
pushing the case-like documents in descending order of the number of document elements that match the existing legal document.
8. A case-like document searching device, comprising:
a determining module, used for determining the document type according to the content of the existing legal document;
a parsing module, used for parsing the document content of the existing legal document to obtain document elements, wherein the document elements are contents that embody case characteristics and are facts affecting the judgment result on which a judge makes a legal determination and gives reasoning;
a searching module, used for searching, among the legal documents matching the document type of the existing legal document, for case-like documents matching the document elements of the existing legal document.
9. A storage medium having a program stored thereon, wherein the program, when executed by a processor, implements the case-like document searching method of any one of claims 1 to 7.
10. A processor for running a program, wherein the program, when run, performs the case-like document searching method of any one of claims 1 to 7.
CN201810915510.XA 2018-08-13 2018-08-13 Case-like document searching method and device Pending CN110827177A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810915510.XA CN110827177A (en) 2018-08-13 2018-08-13 Case-like document searching method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810915510.XA CN110827177A (en) 2018-08-13 2018-08-13 Case-like document searching method and device

Publications (1)

Publication Number Publication Date
CN110827177A CN110827177A (en) 2020-02-21

Family

ID=69546839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810915510.XA Pending CN110827177A (en) 2018-08-13 2018-08-13 Case-like document searching method and device

Country Status (1)

Country Link
CN (1) CN110827177A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507079A (en) * 2020-12-15 2021-03-16 科大讯飞股份有限公司 Document case situation matching method, device, equipment and storage medium
CN112507350A (en) * 2020-11-18 2021-03-16 中国工商银行股份有限公司 Authentication method and device for assisting execution of audit service
CN113486158A (en) * 2021-09-08 2021-10-08 中国司法大数据研究院有限公司 Case situation comparison-based case retrieval method, device, equipment and storage medium
CN114547245A (en) * 2022-02-21 2022-05-27 山东大学 Legal element-based class case retrieval method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003263458A (en) * 2002-03-07 2003-09-19 Ricoh Co Ltd Method and device for analyzing text
CN106991092A (en) * 2016-01-20 2017-07-28 阿里巴巴集团控股有限公司 Method and apparatus for mining similar judgment documents based on big data
CN107330071A (en) * 2017-06-30 2017-11-07 北京神州泰岳软件股份有限公司 Intelligent reply method and platform for legal consultation information
CN107590131A (en) * 2017-10-16 2018-01-16 北京神州泰岳软件股份有限公司 Specification document processing method, apparatus and system
CN108009299A (en) * 2017-12-28 2018-05-08 北京市律典通科技有限公司 Legal trial business processing method and device
CN108038091A (en) * 2017-10-30 2018-05-15 上海思贤信息技术股份有限公司 Graph-based similarity calculation and retrieval method and system for judgment document cases
CN108334500A (en) * 2018-03-05 2018-07-27 上海思贤信息技术股份有限公司 Judgment document annotation method and device based on machine learning algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang De (张德): "Research on the Application of Natural Language Processing Technology in the Judicial Process", 《信息与电脑》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507350A (en) * 2020-11-18 2021-03-16 中国工商银行股份有限公司 Authentication method and device for assisting execution of audit service
CN112507350B (en) * 2020-11-18 2023-11-17 中国工商银行股份有限公司 Authentication method and device for assisting in executing check and control service
CN112507079A (en) * 2020-12-15 2021-03-16 科大讯飞股份有限公司 Document case situation matching method, device, equipment and storage medium
CN112507079B (en) * 2020-12-15 2023-01-17 科大讯飞股份有限公司 Document case situation matching method, device, equipment and storage medium
CN113486158A (en) * 2021-09-08 2021-10-08 中国司法大数据研究院有限公司 Case situation comparison-based case retrieval method, device, equipment and storage medium
CN113486158B (en) * 2021-09-08 2021-12-14 中国司法大数据研究院有限公司 Case situation comparison-based case retrieval method, device, equipment and storage medium
CN114547245A (en) * 2022-02-21 2022-05-27 山东大学 Legal element-based class case retrieval method and system

Similar Documents

Publication Publication Date Title
CN109446513B (en) Extraction method of events in text based on natural language understanding
CN110827177A (en) Case-like document searching method and device
CN111695033A (en) Enterprise public opinion analysis method, device, electronic equipment and medium
CN110929125B (en) Search recall method, device, equipment and storage medium thereof
CN110309251B (en) Text data processing method, device and computer readable storage medium
CN111310446A (en) Information extraction method and device for referee document
CN110765760B (en) Legal case distribution method and device, storage medium and server
CN111104798A (en) Analysis method, system and computer readable storage medium for criminal plot in legal document
GB2449125A (en) Metadata with degree of trust indication
CN111428466B (en) Legal document analysis method and device
CN110032721B (en) Judge document pushing method and device
CN111553151A (en) Question recommendation method and device based on field similarity calculation and server
Beytía et al. Visual gender biases in Wikipedia: A systematic evaluation across the ten most spoken languages
EP3301603A1 (en) Improved search for data loss prevention
CN111078828A (en) Enterprise historical information extraction method and system
CN110020134B (en) Knowledge service information pushing method and system, storage medium and processor
CN109660621A (en) Content delivery method and service equipment
CN115080709A (en) Text recognition method and device, nonvolatile storage medium and computer equipment
Lawton et al. eDiscovery in digital forensic investigations
CN111813947A (en) Automatic generation method and device for court inquiry synopsis
CN113971207A (en) Document association method and device, electronic equipment and storage medium
CN113051903A (en) Method for comparing consistency of sentences, case passes, sentencing plots and judicial documents
Ghawi et al. Analysis of country mentions in the debates of the UN Security Council
JP2006293616A (en) Document aggregating method, and device and program
Kernot Can Three Pronouns Discriminate Identity in Writing?

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination