CN110647504A - Method and device for searching judicial documents - Google Patents

Method and device for searching judicial documents

Publication number
CN110647504A
Authority
CN
China
Prior art keywords
judicial
candidate
documents
search
forensic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810663048.9A
Other languages
Chinese (zh)
Other versions
CN110647504B (en)
Inventor
周鑫
张雅婷
孙常龙
刘晓钟
司罗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201810663048.9A
Publication of CN110647504A
Application granted
Publication of CN110647504B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Technology Law (AREA)
  • Primary Health Care (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for retrieving judicial documents. The method comprises: acquiring a received search keyword; searching a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents and each judicial document comprises at least one tag; and selecting a target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags. The invention solves the technical problem in the related art that, because keyword-based retrieval has low relevance, judicial documents that are related to the keywords but do not contain them in their full text cannot be retrieved.

Description

Method and device for searching judicial documents
Technical Field
The invention relates to the technical field of document retrieval, in particular to a method and a device for retrieving a judicial document.
Background
Retrieval of judicial documents is common and can be carried out directly on many platforms. At its core it is a general search based on a keyword index, supplemented by some general navigation and filtering functions, such as trial level, region, and cause of action, that guide users to the relevant judicial documents.
For example, legal professionals often need to find judgment documents for decided cases that are similar to the case they are currently handling. Ordinary people who encounter a dispute likewise want to find judgment documents that have taken effect in similar situations, as a reference for how to proceed. At present, a target judgment document is found through a general search based on a keyword index, guided by the general navigation and filtering functions.
However, although the related art provides some general filtering functions and block search functions, it is essentially full-text search based on keywords. The drawback of this approach is that keyword-based retrieval has relatively low relevance: a judicial document that is related to the keywords but does not contain them in its full text cannot be retrieved.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the invention provide a method and a device for retrieving judicial documents, so as to at least solve the technical problem in the related art that, because keyword-based retrieval has low relevance, judicial documents that are related to a keyword but do not contain it in their full text cannot be retrieved.
According to one aspect of an embodiment of the present invention, there is provided a method for retrieving judicial documents, including: acquiring a received search keyword; searching a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents and each judicial document comprises at least one tag; and selecting a target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags.
According to another aspect of the embodiments of the present invention, there is also provided a judicial document retrieval device, including: an acquisition unit configured to acquire a received search keyword; a retrieval unit configured to search a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents and each judicial document comprises at least one tag; and a determining unit configured to select a target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags.
According to another aspect of the embodiments of the present invention, there is also provided a storage medium including a stored program, wherein, when the program is executed by a processor, the apparatus on which the storage medium is located is controlled to execute any one of the above methods for retrieving judicial documents.
In the embodiments of the invention, a search keyword can be received through a terminal, and a search is performed in the judicial document library based on the received search keyword to obtain a plurality of candidate judicial documents, each of which comprises at least one tag. A target judicial document is then selected from the candidates using the input search keyword and the tags. Because the tags of each judicial document are added using legal knowledge elements, matching against the tags improves the degree of matching, so that the retrieved judicial documents are highly relevant to the search keyword input by the user. The retrieved judicial documents can also be marked with the target tags, so that the user can see where the input keywords are located in each judicial document and can use the product more effectively. The target judicial documents obtained are therefore more relevant to what the user wants, which solves the technical problem in the related art that, because keyword-based retrieval has low relevance, judicial documents that are related to the keywords but do not contain them in their full text cannot be retrieved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
Fig. 1 is a schematic diagram of the network environment in which a computer terminal for implementing a judicial document retrieval method according to an embodiment of the present application is located;
Fig. 2 is a flowchart of a judicial document retrieval method according to a first embodiment of the invention;
Fig. 3 is a schematic diagram of an alternative method of creating judicial document tags according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an alternative legal knowledge element set acquisition method according to an embodiment of the invention;
Fig. 5 is a schematic diagram of an alternative legal knowledge element set extraction method according to an embodiment of the invention;
Fig. 6 is a schematic diagram of an alternative way of matching a search keyword with the tags of each judicial document in the judicial document library according to an embodiment of the invention;
Fig. 7 is a schematic diagram of an alternative method for adjusting keywords using a search rewrite model according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of an alternative judicial document ranking method according to a first embodiment of the invention;
Fig. 9 is a schematic diagram of an alternative judicial document retrieval device according to an embodiment of the present invention;
Fig. 10 is a block diagram of a computer terminal according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Judgment document (referee document): the carrier that records the trial process and results of the people's court, and the sole certificate by which the people's court determines and allocates the substantive rights and obligations of the parties. A judgment document with a complete structure, complete elements, and rigorous logic is a certificate of the rights and obligations of the parties and is also an important basis on which a higher-level people's court supervises the civil trial activities of a lower-level people's court.
Inverted index: a search engine technique that finds records based on the values of their attributes.
Recall: the basic module of a search engine, used for acquiring candidates from massive engine data.
word2vec: open-source software that learns vector representations of words from large-scale corpora; an algorithm that maps words to word vectors.
LSTM: Long Short-Term Memory network.
RNN: recurrent neural network, used for sequence modeling.
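The "inverted index" and "recall" terms above can be illustrated with a minimal Python sketch. It is not the implementation described in this application: the function names `build_inverted_index` and `recall`, the whitespace tokenization, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def recall(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample data, for illustration only.
docs = {
    "d1": "game copyright infringement dispute",
    "d2": "online shopping contract dispute",
    "d3": "game trademark infringement",
}
index = build_inverted_index(docs)
candidates = recall(index, "game infringement")
```

Here `candidates` holds the ids of the documents that contain both query terms. A plain inverted index of this kind can only recall documents that literally contain the query terms, which is the limitation the tags described in this application are meant to address.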
Example 1
In accordance with an embodiment of the present invention, a method embodiment of judicial document retrieval is provided. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in a different order.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 is a schematic diagram of the network environment in which a computer terminal for implementing a judicial document retrieval method according to an embodiment of the present application is located.
As shown in fig. 1, the network environment may include a terminal 11 and a server 13. In this embodiment, the judicial document retrieval method described above may be applied to the hardware environment formed by the server 13 and the terminal 11. The server is connected to the terminal through a network, and the terminal may be, but is not limited to, a PC, a mobile phone, a tablet computer, and the like. The judicial document retrieval method in the embodiment of the application can be executed by the server 13, by the terminal 11, or by the server and the terminal together. According to one embodiment of the application, the terminal 11 executes the judicial document retrieval method through client software installed on the terminal or through a client web page.
In an alternative embodiment, the terminal 11 may be any mobile computing device or the like. The terminal may establish a connection with the server 13 via a data network, which may be a local area network connection, a wide area network connection, an internet connection, or another type of data network connection. The terminal 11 may connect to a network service executed by a server or a group of servers. A network server provides a network-based user service such as an authentication service, a social network, cloud resources, email, online payment, or another online application.
In an optional embodiment, the user inputs a search keyword on the terminal. The terminal receives the search keyword, generates a retrieval request corresponding to it, and sends the request to the server. The server searches the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents and selects a target judicial document from them according to the search keyword and the tag corresponding to each judicial document. The terminal then displays the target judicial document provided by the server to the user.
In another alternative embodiment, the judicial document retrieval method is applied to the terminal 11 alone; that is, the network environment may be implemented without a server. Specifically, the processor of the terminal receives the generated verification request and sends it to the verification processor of the terminal (which may be a virtual processing unit within the terminal's processor). The user may input the search keyword on the terminal; the terminal receives the search keyword, generates the corresponding retrieval request, obtains a plurality of candidate judicial documents, and selects the target judicial document from them according to the search keyword and the tag corresponding to each judicial document. The processor thus provides the target judicial document according to the search keyword input by the user, and the target judicial document is presented to the user.
It should be noted that, according to an embodiment of the present application, the terminal shown in fig. 1 has a touch display (also referred to as a "touch screen" or "touch display screen"). In some embodiments, the computer device (or mobile device) 11 shown in fig. 1 has a graphical user interface (GUI) with which a user can interact through finger contact and/or gestures on the touch screen surface. The human-computer interaction functions optionally include the following: entering predetermined validation operations, creating web pages, drawing, word processing, making electronic documents, games, video conferencing, instant messaging, emailing, call interfacing, playing digital video, playing digital music, and/or web browsing. The executable instructions for performing these human-computer interaction functions are configured/stored in one or more processor-executable computer program products or readable storage media.
Under the above operating environment, the present application provides a judicial document retrieval method as shown in fig. 2. Fig. 2 is a flowchart of a method for retrieving judicial documents according to the first embodiment of the present invention. As shown in fig. 2, the method may include:
Step S21, acquiring a received search keyword.
Step S23, searching a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents and each judicial document comprises at least one tag.
Step S25, selecting a target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags.
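Steps S21, S23, and S25 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method as implemented by the applicant: the `JudicialDocument` class, the three function names, the term-overlap matching, and the sample library are all hypothetical, and selecting by counting matching tags is only one possible reading of "according to the search keyword and the tags".

```python
from dataclasses import dataclass, field


@dataclass
class JudicialDocument:
    doc_id: str
    text: str
    tags: set = field(default_factory=set)


def acquire_keyword(raw_input):
    # Step S21: normalize the received search keyword.
    return raw_input.strip().lower()


def retrieve_candidates(library, keyword):
    # Step S23: keep documents whose text or tags share a term with the keyword.
    terms = set(keyword.split())
    candidates = []
    for doc in library:
        searchable = set(doc.text.lower().split()) | doc.tags
        if terms & searchable:
            candidates.append(doc)
    return candidates


def select_targets(candidates, keyword):
    # Step S25 (assumed rule): keep candidates with at least one matching tag,
    # ordered by the number of matching tags.
    terms = set(keyword.split())
    scored = [(len(terms & doc.tags), doc) for doc in candidates]
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Invented sample library, for illustration only.
library = [
    JudicialDocument("d1", "the defendant copied the plaintiff's software",
                     {"game", "infringement"}),
    JudicialDocument("d2", "dispute over a game purchase refund", {"refund"}),
    JudicialDocument("d3", "loan repayment dispute", {"loan"}),
]
keyword = acquire_keyword("  Game Infringement ")
candidates = retrieve_candidates(library, keyword)
targets = select_targets(candidates, keyword)
```

In this sample, document `d1` contains neither "game" nor "infringement" in its text but is still retrieved and selected because of its tags, while `d2` matches on text only and is dropped in the selection step.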
The above steps are described in detail below, beginning with step S21, acquiring the received search keyword.
In the embodiment of the present application, a search keyword input by a user may be received through a terminal. The type and model of the terminal are not limited in the present application; it may be, for example, a mobile phone, a PC, or an iPad. A search engine, search software, or search web page for inputting the search keyword may be provided to the user through the terminal. For example, the user is provided with an input entry on a website, such as an input box, in which the user can enter the search keyword.
The input search keywords are not limited in the application. They may be keywords related to the judicial document to be queried, or keywords related to other judicial or non-judicial matters, and the number of words in the input search keywords is not specifically limited.
In the embodiment of the present invention, before the judicial document library is searched based on the search keyword, the legal knowledge elements and the tags corresponding to each judicial document may be determined. Fig. 3 is a schematic diagram of an alternative method for creating judicial document tags according to an embodiment of the present invention. As shown in fig. 3, the method includes the following steps S31 to S34:
Step S31, classifying the plurality of judicial documents according to case type to obtain judicial document sets of a plurality of types.
Classifying the judicial documents may consist of determining the case type corresponding to each judicial document, where the case type may be stated at a specific position in the judicial document. The case types may include, but are not limited to: civil, administrative, criminal, and so on. Each case type contains a large number of judicial documents, which form the corresponding judicial document set.
Step S32, extracting legal knowledge elements from the judicial documents in each type of judicial document set to obtain a legal knowledge element set.
Fig. 4 is a schematic diagram of an alternative legal knowledge element set acquisition method according to an embodiment of the present invention. As shown in fig. 4, the method includes:
Step S321, segmenting each judicial document according to its components to obtain a plurality of components.
Here, each judicial document follows a specific structure; for example, it includes components such as basic information, the defendant, the plaintiff, the court's opinion, and the case judgment. Segmenting the document along these components yields content such as the plaintiff, the defendant, addresses, the grounds of infringement, and the court's judgment for each judicial document, and allows the amount involved in each case, the companies involved, and the like to be extracted. Judicial documents can therefore be segmented by their components, whose content is broadly the same from one document to the next, to obtain the plurality of structural parts of each judicial document in preparation for the subsequent extraction of legal knowledge elements.
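The segmentation in step S321 might look like the following Python sketch. It assumes, purely for illustration, that each component starts on its own line with a known heading followed by a colon; the heading list `SECTION_HEADINGS`, the function name `segment_document`, and the sample text are hypothetical and do not come from this application.

```python
import re

# Hypothetical component headings; real judgment documents use their own markers.
SECTION_HEADINGS = [
    "Basic information",
    "Plaintiff claims",
    "Defendant defense",
    "Court opinion",
    "Judgment",
]


def segment_document(text, headings=SECTION_HEADINGS):
    """Split a document into {heading: content} using line-initial headings."""
    alternatives = "|".join(re.escape(h) for h in headings)
    pattern = re.compile(r"^(" + alternatives + r"):\s*", re.MULTILINE)
    matches = list(pattern.finditer(text))
    parts = {}
    for i, match in enumerate(matches):
        # A component runs until the next heading or the end of the document.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        parts[match.group(1)] = text[match.end():end].strip()
    return parts


# Invented sample document, for illustration only.
sample = (
    "Basic information: Case No. 123, civil.\n"
    "Plaintiff claims: The goods were never received.\n"
    "Court opinion: The claim is supported.\n"
    "Judgment: The defendant refunds 500 yuan.\n"
)
parts = segment_document(sample)
```

Components that do not appear in a given document, such as "Defendant defense" in the sample, are simply absent from the result.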
Step S323, extracting legal knowledge elements based on the plurality of components to obtain a legal knowledge element set.
Optionally, for the technical solution of extracting legal knowledge elements based on a plurality of components in the above step, after each component is determined, the text features in each component may be extracted. For example, the basic information of the judgment document includes text features such as the trial court, the plaintiff company, the defendant company, the case type, the judgment date, and the cause of action; the body of the document includes text features such as the trial process, the first instance, the second instance, the amount involved, and the defense grounds of the company involved. That is, the content of each component is determined first, and the legal knowledge elements are then extracted using each text feature to obtain the corresponding legal knowledge element set.
Here, the method of extracting the legal knowledge elements based on the plurality of components includes at least one of the following: template matching, or a machine learning model.
For template matching, templates may be established using known legal knowledge elements. The number and type of the templates are not limited, provided that the legal knowledge elements correspond to the components; each template may be restricted to one component or to several components, and each component includes a plurality of text features. For the machine learning model, a model such as a conditional random field model or a bidirectional LSTM model can be trained on known examples of how legal knowledge elements are extracted from the components of judicial documents.
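The template-matching option can be sketched in Python with regular expressions standing in for templates. This is an assumption-laden illustration: the `TEMPLATES` table, the component names, the element names `amount_involved` and `case_type`, and the patterns themselves are invented here and are not taken from this application.

```python
import re

# Hypothetical templates: component name -> {legal knowledge element: pattern}.
TEMPLATES = {
    "Basic information": {
        "case_type": re.compile(r"\b(civil|criminal|administrative)\b", re.IGNORECASE),
    },
    "Judgment": {
        "amount_involved": re.compile(r"(\d+(?:\.\d+)?)\s*yuan"),
    },
}


def extract_elements(components, templates=TEMPLATES):
    """Apply each template only to the component it is defined for."""
    elements = {}
    for component, patterns in templates.items():
        text = components.get(component, "")
        for name, pattern in patterns.items():
            match = pattern.search(text)
            if match:
                elements[name] = match.group(1)
    return elements


# Invented sample components, for illustration only.
components = {
    "Basic information": "Case No. 123, civil.",
    "Judgment": "The defendant refunds 500 yuan.",
}
elements = extract_elements(components)
```

Restricting each template to its own component keeps the number "123" in the basic information from being mistaken for an amount.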
In the above step S323, the legal knowledge elements may be obtained through classification and/or filtering by a classifier. Fig. 5 is a schematic diagram of an alternative legal knowledge element set extraction method according to an embodiment of the present invention. As shown in fig. 5, the method includes the following steps S3231 to S3235:
Step S3231, determining the legal knowledge elements to be extracted for each of the plurality of components.
It should be noted here that the legal knowledge elements to be extracted from each component may be determined from the characteristics of that component. Since the content of each component of a judicial document is relatively fixed, information such as the effective date and the amount of compensation can be extracted, for example, from the court's trial summary in the body of the document.
Step S3233, determining a classifier for the legal knowledge elements to be extracted for each component.
Optionally, a classifier may be trained for each legal knowledge element. The type of the classifier is not specifically limited and may include, but is not limited to, a support vector machine; the way in which the classifier is used is likewise not limited in this application. Optionally, each sentence in each component of the judicial documents may be filtered and distinguished by the classifier to determine the category or legal knowledge element set to which the sentence belongs.
Step S3235, classifying the corresponding components using the classifier to obtain a legal knowledge element set.
In the embodiment of the invention, after the legal knowledge element set has been extracted by classification, the legal knowledge elements can be used as an index of the text, in preparation for subsequently matching the search keywords against the legal knowledge elements. Optionally, the obtained text features and legal knowledge elements can be added to a search engine, which can then match the input search keywords accurately to obtain the target judicial documents.
The classifier can be used both to classify the components and to filter them, screening each sentence or each passage of text so as to determine the legal knowledge element set.
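The sentence-level classification of steps S3231 to S3235 can be illustrated with a small Python sketch. Note that it swaps in a tiny multinomial Naive Bayes classifier written from scratch rather than the classifier type mentioned in the text, and that the class names, the label set (`compensation_amount`, `effective_date`, `other`), and the six training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentenceClassifier:
    """Multinomial Naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(sentence.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, sentence):
        tokens = sentence.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def extract_element_sentences(clf, component_text):
    """Classify each sentence of a component and drop those labeled 'other'."""
    found = defaultdict(list)
    for sentence in component_text.split("."):
        sentence = sentence.strip()
        if sentence and clf.predict(sentence) != "other":
            found[clf.predict(sentence)].append(sentence)
    return dict(found)


# Invented training data, for illustration only.
train = [
    ("the defendant shall pay compensation of 500 yuan", "compensation_amount"),
    ("compensation of 2000 yuan is awarded", "compensation_amount"),
    ("this judgment takes effect on the date of service", "effective_date"),
    ("the judgment becomes effective on 1 march", "effective_date"),
    ("the court heard the case in public", "other"),
    ("both parties appeared in court", "other"),
]
clf = NaiveBayesSentenceClassifier().fit([s for s, _ in train], [l for _, l in train])
summary = ("Both parties appeared. The defendant shall pay 300 yuan compensation. "
           "The judgment takes effect immediately.")
element_sentences = extract_element_sentences(clf, summary)
```

The `other` label plays the filtering role described above: sentences that carry no legal knowledge element are discarded, and the remaining sentences are grouped by the element they express.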
Step S33, building a legal knowledge base corresponding to each type according to the legal knowledge element set.
In this step, the legal knowledge base may be built from the plurality of legal knowledge elements corresponding to each type; for example, tables of the legal knowledge elements may be established and then integrated to obtain the data in the legal knowledge base.
Step S34, adding corresponding tags to each judicial document based on the legal knowledge base.
Optionally, in this step, a tag may be determined for each judicial document through the legal knowledge base; the present application emphasizes that the tags of a judicial document are added on the basis of legal knowledge elements. Tags can be extracted from the original text of the judicial document through the legal knowledge base, for example Boolean features such as whether the goods were received, whether the original price was false, and whether the discount was false.
Optionally, during extraction, the document structure corresponding to each of the above features may be determined according to a predefined rule, and the extraction is then performed on the determined document structure. For example, "whether the goods were received" is extracted from the plaintiff's claims and the facts found by the court in the judicial document.
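The rule-driven extraction of a Boolean tag from designated components can be sketched in Python as follows. The rule table `LABEL_RULES`, the component names, and the English patterns are hypothetical stand-ins for the predefined rules mentioned above; letting the negative pattern take precedence is also an assumption of this sketch.

```python
import re

# Hypothetical rule: which components to inspect and which phrases decide the value.
LABEL_RULES = {
    "goods_received": {
        "components": ["Plaintiff claims", "Facts found by the court"],
        "true": re.compile(r"\breceived the goods\b"),
        "false": re.compile(
            r"\b(?:never|not) received the goods\b|\bdid not receive the goods\b"
        ),
    },
}


def extract_boolean_labels(components, rules=LABEL_RULES):
    """Return {label: bool} for every rule that fires on its designated components."""
    labels = {}
    for name, rule in rules.items():
        text = " ".join(components.get(c, "") for c in rule["components"]).lower()
        if rule["false"].search(text):  # negative phrasing is checked first
            labels[name] = False
        elif rule["true"].search(text):
            labels[name] = True
    return labels


# Invented sample components, for illustration only.
not_received = extract_boolean_labels({
    "Plaintiff claims": "The plaintiff never received the goods.",
    "Facts found by the court": "The parcel was lost in transit.",
})
received = extract_boolean_labels({
    "Facts found by the court": "The plaintiff received the goods on 3 May.",
})
```

A document in which neither pattern fires receives no `goods_received` tag at all, which keeps "unknown" distinct from "false".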
In step S23, a search is performed in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, where the judicial document library includes a plurality of judicial documents and each judicial document includes at least one tag.
In the invention, the tags of a judicial document correspond to the legal knowledge elements extracted from it. The legal knowledge element set, consisting of legal knowledge elements, is extracted by the classifier from the segmented components of the judicial document; a tag corresponding to each legal knowledge element set can be obtained, and the important content of the judicial document can be identified through the tags.
It should be noted here that the judicial document library may be a database, established in advance, that integrates all known judicial documents and stores a plurality of them. Optionally, the judicial documents may be of multiple judicial types, which are not specifically limited in this application and may include, but are not limited to: administrative, civil, criminal, and so on. The kinds of document contained in the library may be determined according to the judicial type and may include known judgment documents and the like. The tags in the judicial document library can be understood as tags established in advance for each judicial document. Each judicial document corresponds to a judicial case and contains various kinds of information; for a judgment document, for example, this includes but is not limited to: party information, the trial process, the plaintiff's claims, the defendant's defense, the facts found by the court, the court's opinion, and the judgment result. Keywords or tags of particular interest to users can be extracted from this information and from the other content recorded in the judicial document.
In the present application, legal knowledge elements may be specific to a judicial type, and different legal knowledge elements may be determined for each kind of legal knowledge content. Legal knowledge elements are the elements that affect the judgment result for a given type of case. For example, for online shopping transaction dispute cases, the legal knowledge elements include: commodity category, litigation amount, purchase data, whether goods are received, and whether the commodity label is wrong. The embodiments of the present invention do not limit the specific extraction manner: the legal knowledge elements may be extracted manually from each judicial document, extracted by a single keyword, or extracted by a trained model or algorithm. The extraction method may differ between legal knowledge elements, and the corresponding legal knowledge elements may differ between judicial types.
In the present application, retrieval can be performed not only through the legal knowledge elements described above but also through known retrieval methods, for example by determining search keywords from the trial level (first instance, second instance, and so on), the region (province, city, county, and so on), the cause of action, and the like. When the tags of the judicial documents corresponding to the search keywords are extracted, known text features of the judicial documents, such as transaction amount and game name, can be extracted directly through templates; in addition, the tags corresponding to the search keywords can be extracted through template matching or machine learning (such as a conditional random field model).
The above step S23 of the present invention may include: matching the search keyword with the tags of each judicial document in the judicial document library; and taking the successfully matched judicial documents as candidate judicial documents, to obtain a plurality of candidate judicial documents.
In the above embodiment, matching means tag matching between the search keyword, which indicates the content desired by the user, and the tags corresponding to legal knowledge elements in the judicial documents; a plurality of candidate judicial documents can be obtained through this matching. For example, if the search keyword input by the user is "game infringement", infringement-related judicial documents of various game categories can be matched through tag matching for the user's reference.
Here, the tags in the present application are not obtained by simple text screening as in the related art; rather, they are tags selected by using legal knowledge elements.
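The tag matching of step S23 can be sketched as a lookup over per-document tag sets. The in-memory library, the document identifiers, and the exact-match rule below are assumptions for illustration; the application does not prescribe a data structure.

```python
# Hypothetical in-memory library: document id -> set of tags.
LIBRARY = {
    "doc-1": {"game infringement", "copyright"},
    "doc-2": {"online shopping", "goods not received"},
    "doc-3": {"game infringement", "trademark"},
}


def retrieve_candidates(keyword, library):
    """Return the ids of documents that carry a tag equal to the keyword."""
    return sorted(doc_id for doc_id, tags in library.items() if keyword in tags)


candidates = retrieve_candidates("game infringement", LIBRARY)
```

Because the match is made against tags rather than full text, a document can be returned even when the keyword never appears verbatim in its body.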
In the above embodiment, matching the search keyword with the tags of each judicial document in the judicial document library may include the following steps S61 to S63.
Fig. 6 is a schematic diagram of an optional method for matching the search keyword with the tags of each judicial document in the judicial document library according to an embodiment of the invention. As shown in fig. 6, the method includes:
Step S61, performing semantic parsing on the search keyword.
In this step, the semantics of the search keyword may be parsed according to an existing semantic training set. The parsing may determine: the judicial type corresponding to the search keyword, the legal knowledge element corresponding to the search keyword, the structural region of the judicial document corresponding to the search keyword (that is, the information of the parties, the trial process, the plaintiff's claims, the defendant's defense, the facts found by the court, the court's opinion, the judgment result, and the like), the trial level corresponding to the search keyword, the region corresponding to the search keyword, and the like.
Step S63, matching the semantics of the search keyword with the tags of each judicial document in the judicial document library.
The parsed semantics of the search keyword can be used to determine what each search keyword is concerned with, so that the most relevant tags of the judicial documents can be found more accurately.
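Steps S61 and S63 can be sketched as parsing a query into typed fields and then matching those fields against document tags. The application parses semantics with a semantic training set; the small dictionaries below are a stand-in for that, and the field names, vocabularies, and comma-separated query format are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical vocabularies standing in for a trained semantic parser.
TRIAL_LEVELS = {"first instance", "second instance"}
REGIONS = {"zhejiang", "beijing"}


@dataclass
class ParsedQuery:
    trial_level: Optional[str] = None
    region: Optional[str] = None
    elements: set = field(default_factory=set)


def parse_query(query):
    """Step S61: assign each comma-separated term to a semantic field."""
    parsed = ParsedQuery()
    for term in (t.strip().lower() for t in query.split(",")):
        if term in TRIAL_LEVELS:
            parsed.trial_level = term
        elif term in REGIONS:
            parsed.region = term
        elif term:
            parsed.elements.add(term)
    return parsed


def matches(parsed, tags):
    """Step S63: a document matches when it carries every parsed field as a tag."""
    wanted = set(parsed.elements)
    if parsed.trial_level:
        wanted.add(parsed.trial_level)
    if parsed.region:
        wanted.add(parsed.region)
    return wanted <= tags


parsed = parse_query("Second instance, Zhejiang, game infringement")
```

Requiring every field to match is one possible policy; a looser policy could count partial matches and leave the ordering to the later ranking steps.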
In this embodiment of the present invention, before the search is performed in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, the search keyword may also be adjusted to a target keyword through a search rewrite model. Fig. 7 is a schematic diagram of an optional method for adjusting a keyword by using a search rewrite model according to an embodiment of the present invention. As shown in fig. 7, the method includes:
Step S71, determining whether the search keyword exists in a preset search lexicon, wherein the preset search lexicon includes a plurality of search words.
The preset search lexicon may be a database built from the search words corresponding to existing judicial documents, and one or more search words may be established for each known judicial document.
Step S73, if the search keyword does not exist in the preset search word stock, adjusting the search keyword to be the target keyword by adopting a search rewriting model, wherein the search rewriting model is obtained by using a plurality of judicial documents through machine learning training, and each of the plurality of judicial documents comprises: a search keyword and a target search keyword corresponding to the search keyword.
It should be noted here that the search rewrite model may be obtained by training on a plurality of known judicial documents, where each known search keyword corresponds to one target keyword during training; for example, "player infringement" is rewritten to "game infringement". The search rewrite model can rewrite a search keyword that cannot be matched (that is, one that cannot be matched with the tags of the judicial documents) into a keyword with a similar or identical meaning, and the target search keyword obtained after rewriting can be used to match a plurality of judicial documents more quickly.
Here, the target search keyword produced by the search rewrite model corresponds to a known search keyword and to a search word in the preset search lexicon.
After the target search keyword is obtained through the search rewrite model, searching in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents includes: searching in the judicial document library based on the target search keyword to obtain a plurality of candidate judicial documents. Because the target search keyword is a search keyword that exists in the preset search lexicon, using the rewritten target search keyword to search the judicial document library improves search efficiency.
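The control flow of steps S71 and S73 can be sketched as follows. The lexicon contents are hypothetical, and the lookup table is only a stand-in for the trained search rewrite model, whose architecture the application does not specify.

```python
# Hypothetical preset search lexicon.
SEARCH_LEXICON = {"game infringement", "online shopping dispute"}

# Stand-in for the trained rewrite model: a fixed keyword -> target mapping.
REWRITE_TABLE = {"player infringement": "game infringement"}


def to_target_keyword(keyword, lexicon=SEARCH_LEXICON, rewrite=REWRITE_TABLE.get):
    """Return the keyword to search with.

    Step S71: keep the keyword if it is already in the lexicon.
    Step S73: otherwise ask the rewrite function for a target keyword,
    falling back to the original keyword when no rewrite is available.
    """
    if keyword in lexicon:
        return keyword
    return rewrite(keyword) or keyword


target = to_target_keyword("player infringement")
```

Passing the rewrite step as a function keeps the lexicon check independent of whichever model is eventually trained.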
Optionally, before the target judicial document is selected from the candidate judicial documents according to the search keyword and the tags, the obtained candidate judicial documents may further be sorted. Fig. 8 is a schematic diagram of a judicial document sorting method according to an embodiment of the present invention. As shown in fig. 8, the method includes steps S81 to S85:
Step S81, determining the target tags in the candidate judicial documents that match the semantics of the search keyword, to obtain a target tag set for each candidate judicial document.
Step S82, determining a weight value for each target tag in the tag set of each candidate judicial document.
Here, the weight value may be set manually, or may be set by using a weight model or a weight training set that weights the position of the tag in the judicial document. The specific manner of setting the weight value is not limited in the present application. For example, the weight values of the tags may be 1, 2, 3, and 4, where 1 is the minimum weight and 4 is the maximum weight. The weight value identifies the importance of the tag in the judicial document.
Step S83, calculating a matching value for each candidate judicial document based on the weight value of each target tag.
The matching value of each candidate judicial document is determined from the weight values and indicates how the judicial documents are to be ordered. A higher weight value means that the target tag is more important in the judicial document and that the judicial document has a higher reference value. Sorting the candidate judicial documents accordingly allows the judicial documents that are more relevant to the search keyword input by the user and have higher weight values to be displayed, so that the user can more easily find the desired judicial documents.
Step S84, sorting the candidate judicial documents according to the matching value of each candidate judicial document.
Step S85, displaying the sorted candidate judicial documents in order.
Through the steps S81 to S85, the weight value of the determined target tag may be set, and the matching value of each candidate judicial literature may be determined, so that the candidate judicial literature may be sorted to display the selected judicial literature. According to the embodiment of the invention, the judicial documents are accurately sorted by the matching mode, the matching degree of the sorted judicial documents and the retrieval keywords input by the user is higher, and the reference value is higher, so that the user can quickly utilize the retrieval keywords to obtain the content required by the user, and the experience of the user using the product corresponding to the technical scheme of the application is also improved.
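Steps S81 to S84 can be sketched as below. The application only states that the matching value is calculated from the tag weight values; summing the weights is an assumption made for this example, as are the tag names and the weights themselves.

```python
# Hypothetical tag weights on the 1-4 scale mentioned above.
TAG_WEIGHTS = {"game infringement": 4, "copyright": 2, "trademark": 1}


def matching_value(target_tags, weights):
    """Step S83 (assumed rule): sum the weights of the target tags."""
    return sum(weights.get(tag, 0) for tag in target_tags)


def rank(candidates, query_terms, weights):
    """Rank candidate ids by matching value, highest first."""
    scored = []
    for doc_id, tags in candidates.items():
        target_tags = tags & query_terms               # step S81
        value = matching_value(target_tags, weights)   # steps S82-S83
        scored.append((value, doc_id))
    # Step S84: sort by descending value, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


candidates = {
    "doc-1": {"game infringement", "copyright"},
    "doc-2": {"trademark"},
    "doc-3": {"game infringement"},
}
ranked = rank(candidates, {"game infringement", "copyright", "trademark"}, TAG_WEIGHTS)
```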
In this application, the ranking of the candidate judicial documents may be implemented in various ways. A two-layer ranking structure may be used, in which the first (top) layer uses a linear model: $f(x) = w_1 x_1 + w_2 x_2 + \dots + w_n x_n$, where $w_i$ represents a feature weight value and $x_i$ represents a feature score, such as legal fact relevance, law and regulation relevance, dispute point relevance, or document quality score. The features used at the top layer include a nonlinear feature that comes from the ranking result of the next layer. The second layer uses a wide-and-deep neural network method and jointly considers the deep features of the text semantics of the document and other discretized features. These features include Boolean features, such as "whether goods are received", and numerical features, such as "transaction amount". The law and regulation relevance, dispute point relevance, legal fact relevance, and the like in the top-layer ranking structure are features based on legal knowledge elements.
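The top-layer linear model f(x) = w1·x1 + w2·x2 + ... + wn·xn can be sketched as a weighted sum over named feature scores. The feature names, the weights, and the scores below are hypothetical, and the second-layer score is simply supplied as a number here rather than produced by a neural network.

```python
# Hypothetical feature weights w_i for the top-layer linear model.
WEIGHTS = {
    "legal_fact_relevance": 0.4,
    "law_relevance": 0.2,
    "dispute_point_relevance": 0.2,
    "second_layer_score": 0.2,  # nonlinear feature coming from the second layer
}


def top_layer_score(scores, weights=WEIGHTS):
    """Compute f(x) as the sum of w_i * x_i; missing features count as 0."""
    return sum(w * scores.get(name, 0.0) for name, w in weights.items())


docs = {
    "doc-1": {"legal_fact_relevance": 0.9, "law_relevance": 0.5,
              "dispute_point_relevance": 0.5, "second_layer_score": 0.8},
    "doc-2": {"legal_fact_relevance": 0.3, "law_relevance": 0.9,
              "dispute_point_relevance": 0.4, "second_layer_score": 0.6},
}
ranked = sorted(docs, key=lambda d: top_layer_score(docs[d]), reverse=True)
```

Keeping the top layer linear makes each feature's contribution to the final order easy to inspect, while the nonlinear interactions stay inside the second-layer score.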
The above step S85 may include: displaying the content corresponding to the target tags in each candidate judicial document according to preset identification information, wherein the preset identification information includes at least: highlighting, color marking, and underlining.
The content in the judicial document that corresponds to a target tag can be marked, for example by displaying the corresponding characters in color or with highlighting. The user can then see where the input search keyword appears in the displayed judicial document and, after clicking a judicial document, can quickly find the desired content, which improves the user's experience of the product.
The preset identification information may include not only the above highlighting, color marking, and underlining but also other manners, such as bold type or brackets. The identification information actually used can be determined according to how the product is used.
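Marking the content that corresponds to a target tag can be sketched as wrapping each matched span in a pair of markers. The bracket markers below stand in for whatever highlighting, color, or underline markup the product uses, and the function name is an assumption.

```python
import re


def highlight(text, target_terms, open_mark="[[", close_mark="]]"):
    """Wrap every occurrence of a target term in the given markers."""
    if not target_terms:
        return text
    # Try longer terms first so that overlapping terms mark the longest span.
    ordered = sorted(target_terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return pattern.sub(lambda m: f"{open_mark}{m.group(0)}{close_mark}", text)


marked = highlight("The defendant copied the game map.", ["game map"])
```

Swapping the markers for HTML tags or terminal color codes changes the presentation without touching the matching logic.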
For step S25, a target judicial literature is selected from the plurality of candidate judicial literatures according to the search keyword and the label.
Through step S25, the final target judicial documents can be obtained by matching the search keyword against the tag content of each judicial document.
In the embodiment of the invention, the search keyword can be received through the terminal, and a search is performed in the judicial document library based on the received search keyword to obtain a plurality of candidate judicial documents, each of which corresponds to at least one tag; the target judicial document can then be selected from the plurality of candidate judicial documents by using the search keyword and the tags. In this embodiment, because the tags of each judicial document are added by using legal knowledge elements, matching on the tags improves the matching degree of the judicial documents and makes the retrieved judicial documents highly relevant to the search keyword input by the user. The retrieved judicial documents can also be marked with the target tags, so that the user can better see where the input keyword appears in each judicial document and can use the product more effectively. The obtained target judicial documents are therefore more relevant to the judicial documents the user wants, which solves the technical problem in the related art that keyword retrieval has low relevance and cannot retrieve judicial documents that do not contain the keyword in their full text but are related to it.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the judicial document retrieval method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and can certainly also be implemented by hardware, although the former is the better implementation in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is also provided a device for implementing the above judicial document retrieval. Fig. 9 is a schematic diagram of an optional judicial document retrieval device according to an embodiment of the present invention. As shown in fig. 9, the retrieval device includes: an obtaining unit 91, a retrieving unit 93, and a selecting unit 95, wherein:
The obtaining unit 91 is configured to obtain the received search keyword.
The retrieving unit 93 is configured to retrieve from a judicial literature library based on the retrieval keyword to obtain a plurality of candidate judicial literatures, where the judicial literature library includes a plurality of judicial literatures, and each judicial literature includes at least one tag.
A selecting unit 95 for selecting a target judicial literature from the plurality of candidate judicial literatures according to the search keyword and the tag.
Optionally, the above retrieval device further includes: a judging module, configured to judge, before the search is performed in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, whether the search keyword exists in a preset search lexicon, wherein the preset search lexicon includes a plurality of search words; and an adjusting module, configured to adjust the search keyword to a target keyword by using a search rewrite model when the search keyword does not exist in the preset search lexicon, wherein the search rewrite model is obtained through machine learning training using a plurality of judicial documents, and each of the plurality of judicial documents includes: a search keyword and a target search keyword corresponding to the search keyword. The first obtaining module, which is configured to search in the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents, includes: a second obtaining module, configured to search based on the position of the target keyword in the judicial document library to obtain the plurality of candidate judicial documents.
Optionally, the above search device further includes: the classification module is used for classifying the plurality of judicial documents according to case types before searching in the judicial document library based on the search keywords to obtain a plurality of types of judicial document sets; the extraction module is used for extracting legal knowledge elements from the judicial documents in each type of the judicial document set to obtain a legal knowledge element set; the component module is used for establishing a legal knowledge base corresponding to each type according to the legal knowledge element set; and the adding module is used for adding corresponding labels to each judicial literature based on the legal knowledge base.
Wherein the extraction module comprises: the segmentation submodule is used for segmenting each piece of judicial literature according to the constituent parts of each piece of judicial literature to obtain a plurality of constituent parts; and the extraction submodule is used for extracting the legal knowledge elements based on the plurality of components to obtain a legal knowledge element set.
Preferably, the manner of extracting the legal knowledge elements based on the plurality of constituent parts includes at least one of: template matching, machine learning model.
Here, it should be noted that the extraction submodule includes: the first determining submodule is used for determining legal knowledge elements to be extracted from each of the plurality of components; the second determining submodule is used for determining a classifier of the legal knowledge elements to be extracted of each component; and the first extraction submodule is used for classifying the corresponding components by adopting a classifier to obtain a legal knowledge element set.
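The segmentation submodule and the per-component classifier can be sketched together as follows. The section headings, the single rule-based classifier, and the sample document are assumptions for illustration; in practice each classifier could be a trained model.

```python
# Hypothetical headings that delimit the constituent parts of a document.
SECTION_HEADINGS = ["Parties", "Plaintiff's claims", "Facts found by the court", "Judgment"]


def segment(document):
    """Split a document into constituent parts keyed by heading."""
    parts, current = {}, None
    for line in document.splitlines():
        heading = line.strip().rstrip(":")
        if heading in SECTION_HEADINGS:
            current = heading
            parts[current] = []
        elif current is not None and line.strip():
            parts[current].append(line.strip())
    return {name: " ".join(lines) for name, lines in parts.items()}


# One classifier per component: component -> (element name, classify function).
# The rule below is a stand-in for a trained classifier.
CLASSIFIERS = {
    "Facts found by the court": ("goods_received", lambda text: "did not receive" not in text),
}


def extract(document):
    """Run each component's classifier to build the legal knowledge element set."""
    parts = segment(document)
    elements = {}
    for part, (element, classify) in CLASSIFIERS.items():
        if part in parts:
            elements[element] = classify(parts[part])
    return elements


doc = """Parties:
Plaintiff Zhang; defendant an online store.
Facts found by the court:
The plaintiff paid but did not receive the goods.
"""
elements = extract(doc)
```

Restricting each classifier to one component keeps it from being misled by similar wording in other parts of the document, such as a party's unproven allegation.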
In the embodiment of the present invention, the retrieving unit includes: a matching module, configured to match the search keyword with the tags of each judicial document in the judicial document library; and a first determining module, configured to take the successfully matched judicial documents as candidate judicial documents, to obtain a plurality of candidate judicial documents.
In addition, the matching module includes: the analysis submodule is used for carrying out semantic analysis on the retrieval key words; and the matching sub-module is used for matching the semantic meaning of the search keyword with the label of each judicial document in the judicial document library.
In the embodiment of the present invention, the retrieval device further includes: a second determining module, configured to determine, before the target judicial document is selected from the plurality of candidate judicial documents according to the search keyword and the tags, the target tags in the candidate judicial documents that match the semantics of the search keyword, to obtain a target tag set for each candidate judicial document; a third determining module, configured to determine the weight value of each target tag in the tag set of each candidate judicial document; a calculating module, configured to calculate the matching value of each candidate judicial document based on the weight value of each target tag; a sorting module, configured to sort the candidate judicial documents according to the matching value of each candidate judicial document; and a display module, configured to display the sorted candidate judicial documents in order.
It should be noted here that the display module may include: a display submodule, configured to display the content corresponding to the target tags in each candidate judicial document according to preset identification information, wherein the preset identification information includes at least: highlighting, color marking, and underlining.
In the embodiment of the present invention, the obtaining unit 91 may obtain the received search keyword; the retrieving unit 93 may search in the judicial document library based on the received search keyword to obtain a plurality of candidate judicial documents, each of which includes at least one tag; and finally the selecting unit 95 may select the target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags. In this embodiment, because the tags of each judicial document are added by using legal knowledge elements, matching on the tags improves the matching degree of the judicial documents and makes the retrieved judicial documents highly relevant to the search keyword input by the user. The retrieved judicial documents can also be marked with the target tags, so that the user can better see where the input keyword appears in each judicial document and can use the product more effectively. The obtained target judicial documents are therefore more relevant to the judicial documents the user wants, which solves the technical problem in the related art that keyword retrieval has low relevance and cannot retrieve judicial documents that do not contain the keyword in their full text but are related to it.
It should be noted here that the obtaining unit 91, the retrieving unit 93, and the selecting unit 95 correspond to steps S21 to S25 in embodiment 1. These units share the implementation examples and application scenarios of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules, as a part of the device, may operate in the computer terminal 11 provided in the first embodiment.
Example 3
The embodiment of the invention can provide a computer terminal which can be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute the program code of the following steps in the retrieval method of the judicial document: obtaining a received search keyword; searching in a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library includes a plurality of judicial documents, and each judicial document includes at least one tag; and selecting the target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags.
Optionally, in this embodiment, when executing the retrieval method of the judicial document, the computer terminal may further execute program code of the following steps: before searching in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, judging whether the search keyword exists in a preset search lexicon, wherein the preset search lexicon includes a plurality of search words; if the search keyword does not exist in the preset search lexicon, adjusting the search keyword to a target keyword by using a search rewrite model, wherein the search rewrite model is obtained through machine learning training using a plurality of judicial documents, and each of the plurality of judicial documents includes: a search keyword and a target search keyword corresponding to the search keyword; and wherein searching in the judicial document library based on the search keyword to obtain a plurality of candidate judicial documents includes: searching in the judicial document library based on the target keyword to obtain a plurality of candidate judicial documents.
Optionally, in this embodiment, when executing the retrieval method of the judicial literature, the computer terminal may further execute program codes of the following steps: classifying the plurality of judicial documents according to case types before searching in the judicial document library based on the search keywords to obtain a plurality of types of judicial document sets; extracting legal knowledge elements from the judicial documents in each type of judicial document set to obtain a legal knowledge element set; establishing a legal knowledge base corresponding to each type according to the legal knowledge element set; and adding a corresponding label to each judicial essay based on the legal knowledge base.
Optionally, in this embodiment, when executing the retrieval method of the judicial literature, the computer terminal may further execute program codes of the following steps: segmenting each judicial literature according to the constituent parts of each judicial literature to obtain a plurality of constituent parts; and extracting legal knowledge elements based on the plurality of components to obtain a legal knowledge element set.
Optionally, the manner of extracting the legal knowledge elements based on the plurality of constituent parts includes at least one of: template matching, machine learning model.
Optionally, in this embodiment, when executing the retrieval method of the judicial literature, the computer terminal may further execute program codes of the following steps: determining legal knowledge elements to be extracted from each of a plurality of components; determining a classifier of legal knowledge elements to be extracted from each component; and classifying the corresponding components by adopting a classifier to obtain a legal knowledge element set.
Optionally, in this embodiment, when executing the retrieval method of the judicial literature, the computer terminal may further execute program codes of the following steps: matching the search keywords with the label of each judicial essay in the judicial essay library; and taking the judicial documents successfully matched as candidate judicial documents to obtain a plurality of candidate judicial documents.
Optionally, in this embodiment, when executing the retrieval method of the judicial document, the computer terminal may further execute program code of the following steps: performing semantic parsing on the search keyword; and matching the semantics of the search keyword with the tags of each judicial document in the judicial document library.
Optionally, in this embodiment, when executing the retrieval method of the judicial document, the computer terminal may further execute program code of the following steps: before selecting the target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags, determining the target tags in the candidate judicial documents that match the semantics of the search keyword, to obtain a target tag set for each candidate judicial document; determining the weight value of each target tag in the tag set of each candidate judicial document; calculating the matching value of each candidate judicial document based on the weight value of each target tag; sorting the candidate judicial documents according to the matching value of each candidate judicial document; and displaying the sorted candidate judicial documents in order.
Optionally, in this embodiment, when executing the retrieval method of the judicial document, the computer terminal may further execute program code of the following step: displaying the content corresponding to the target tags in each candidate judicial document according to preset identification information, wherein the preset identification information includes at least: highlighting, color marking, and underlining.
Optionally, fig. 10 is a block diagram of a computer terminal according to an embodiment of the present invention. As shown in fig. 10, the computer terminal 10 may include: one or more processors (shown as 102a, 102b, ..., 102n in fig. 10), a memory 104 (including data storage for receiving program instructions and storing data), a network interface, and an input/output (I/O) interface. The computer terminal 10 may be connected to a cursor control device, a keyboard, or a display, and may also be connected to a wired and/or wireless network.
The memory 104 may be configured to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for retrieving a judicial literature in the embodiment of the present invention, and the processor executes various functional applications and data processing by operating the software programs and modules stored in the memory, that is, implements the above-mentioned method for retrieving a judicial literature. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memories may further include a memory located remotely from the processor, which may be connected to the terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application programs stored in the memory through the transmission device to execute the following steps: obtaining a received search keyword; searching in a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library includes a plurality of judicial documents, and each judicial document includes at least one tag; and selecting the target judicial document from the plurality of candidate judicial documents according to the search keyword and the tags.
The embodiment of the invention provides a retrieval scheme of a judicial literature. By matching the search keywords, the labels corresponding to legal knowledge elements in the judicial documents can be determined, so that the target judicial documents corresponding to the search keywords and the labels can be selected, and the technical problem that the judicial documents which do not include the keywords but are related to the keywords cannot be searched in the full text due to low search correlation of the keywords in the related technology is solved.
It can be understood by those skilled in the art that the structure shown in fig. 10 is only illustrative, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 does not limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 10, or have a different configuration than shown in fig. 10.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 4
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may include a stored program, and when the program is executed by the processor, the program controls a device in which the storage medium is located to execute the program code executed by the judicial literature retrieval method provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a received search keyword; searching a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents, and each judicial document comprises at least one label; and selecting a target judicial document from the plurality of candidate judicial documents according to the search keyword and the labels.
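The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `JudicialDocument` type, the substring matching rule, and the sample library are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JudicialDocument:
    doc_id: str
    text: str
    labels: frozenset  # each document carries at least one label


def retrieve_candidates(keyword, library):
    """Step 2 (assumed rule): keep documents whose text or labels mention the keyword."""
    return [
        doc for doc in library
        if keyword in doc.text or any(keyword in label for label in doc.labels)
    ]


def select_targets(keyword, candidates):
    """Step 3 (assumed rule): keep candidates that have a label matching the keyword."""
    return [
        doc for doc in candidates
        if any(keyword in label for label in doc.labels)
    ]


def search(keyword, library):
    """Step 1 is receiving `keyword`; steps 2 and 3 follow."""
    return select_targets(keyword, retrieve_candidates(keyword, library))


# Hypothetical library: document A never contains the phrase "drunk driving"
# in its text, but it is labelled with it.
library = [
    JudicialDocument("A", "The defendant drove after drinking.",
                     frozenset({"drunk driving"})),
    JudicialDocument("B", "The term drunk driving appears in a quoted statute.",
                     frozenset({"contract dispute"})),
    JudicialDocument("C", "The parties disputed a lease.",
                     frozenset({"lease dispute"})),
]
result_ids = [doc.doc_id for doc in search("drunk driving", library)]
```

In this toy library, document A is selected even though its full text does not contain the keyword, which is the situation the label-based scheme is meant to handle.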
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: before searching the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents, judging whether the search keyword exists in a preset search word bank, wherein the preset search word bank comprises a plurality of search words; if the search keyword does not exist in the preset search word bank, adjusting the search keyword into a target keyword by using a search rewriting model, wherein the search rewriting model is obtained through machine learning training using a plurality of judicial documents, each of which comprises: a search keyword and a target search keyword corresponding to the search keyword. In this case, searching the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents comprises: searching the judicial document library based on the target keyword to obtain the plurality of candidate judicial documents.
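The check-then-rewrite flow can be illustrated as follows. The source specifies a machine-learned rewriting model; this sketch deliberately swaps in a much simpler stand-in (a memorised lookup table plus a string-similarity fallback from the standard library `difflib` module). The preset terms, training pairs, and function names are hypothetical.

```python
import difflib

# Hypothetical preset search word bank.
PRESET_TERMS = {"drunk driving", "theft", "contract dispute"}

# Hypothetical training pairs: (search keyword, target search keyword).
TRAINING_PAIRS = [
    ("driving while drunk", "drunk driving"),
    ("stealing", "theft"),
]


def train_rewrite_table(pairs):
    """Stand-in for the machine-learned rewriting model: memorise the pairs."""
    return dict(pairs)


def rewrite_keyword(keyword, preset_terms, rewrite_table):
    """Return the keyword unchanged if it is a preset term, else rewrite it."""
    if keyword in preset_terms:
        return keyword
    if keyword in rewrite_table:
        return rewrite_table[keyword]
    # Fallback: closest preset term by string similarity, if any is close enough.
    close = difflib.get_close_matches(keyword, sorted(preset_terms), n=1, cutoff=0.6)
    return close[0] if close else keyword


table = train_rewrite_table(TRAINING_PAIRS)
target = rewrite_keyword("stealing", PRESET_TERMS, table)  # rewritten via the table
```

The search then proceeds with `target` instead of the original keyword. A real system would replace `train_rewrite_table` with an actual trained model.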
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: before searching the judicial document library based on the search keyword, classifying the plurality of judicial documents according to case type to obtain judicial document sets of a plurality of types; extracting legal knowledge elements from the judicial documents in each type of judicial document set to obtain a legal knowledge element set; establishing a legal knowledge base corresponding to each type according to the legal knowledge element set; and adding a corresponding label to each judicial document based on the legal knowledge base.
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: segmenting each judicial document according to its constituent parts to obtain a plurality of constituent parts; and extracting legal knowledge elements based on the plurality of constituent parts to obtain a legal knowledge element set.
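The offline labelling pipeline described above (classify by case type, extract elements, build a per-type knowledge base, label each document) might look roughly like this. The cue-word tables and the counting-based classifier are assumptions chosen to keep the example self-contained; the source does not say how classification or extraction is implemented at this step.

```python
from collections import defaultdict

# Hypothetical cue words for case types and for legal knowledge elements.
CASE_TYPE_CUES = {
    "criminal": ["defendant", "sentenced"],
    "civil": ["plaintiff", "compensation"],
}
ELEMENT_CUES = {
    "criminal": {"drunk driving": ["drove after drinking"], "theft": ["stole"]},
    "civil": {"lease dispute": ["lease"], "loan dispute": ["loan"]},
}


def classify_case_type(text):
    """Pick the case type whose cue words occur most often in the text."""
    lowered = text.lower()
    return max(CASE_TYPE_CUES,
               key=lambda t: sum(lowered.count(c) for c in CASE_TYPE_CUES[t]))


def extract_elements(text, case_type):
    """Return the legal knowledge elements whose cues appear in the text."""
    lowered = text.lower()
    return {element for element, cues in ELEMENT_CUES[case_type].items()
            if any(cue in lowered for cue in cues)}


def build_labels(documents):
    """documents: dict of doc_id -> text. Returns (knowledge_base, labels)."""
    by_type = defaultdict(dict)
    for doc_id, text in documents.items():
        by_type[classify_case_type(text)][doc_id] = text

    knowledge_base, labels = {}, {}
    for case_type, docs in by_type.items():
        extracted = {doc_id: extract_elements(text, case_type)
                     for doc_id, text in docs.items()}
        # The knowledge base for a type is the union of its extracted elements.
        knowledge_base[case_type] = set().union(*extracted.values())
        for doc_id, elements in extracted.items():
            labels[doc_id] = elements & knowledge_base[case_type]
    return knowledge_base, labels


documents = {
    "d1": "The defendant drove after drinking and was sentenced.",
    "d2": "The plaintiff claimed compensation under the lease.",
}
knowledge_base, labels = build_labels(documents)
```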
Optionally, the manner of extracting the legal knowledge elements based on the plurality of constituent parts includes at least one of: template matching, machine learning model.
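As a rough illustration of segmentation followed by template matching, the sketch below splits a document on section markers and applies regular-expression templates to each part. The section names, the templates, and the sample document are hypothetical; real judicial documents have their own headings and phrasing.

```python
import re

# Hypothetical section markers; real judicial documents use their own headings.
SECTION_PATTERN = re.compile(r"^(FACTS|REASONING|JUDGMENT):\s*", re.MULTILINE)

# Hypothetical templates: component name -> [(element name, pattern with one group)].
TEMPLATES = {
    "FACTS": [("blood_alcohol",
               re.compile(r"blood alcohol (?:content|level) of (\d+) ?mg/100ml"))],
    "JUDGMENT": [("sentence_months", re.compile(r"sentenced to (\d+) months"))],
}


def segment(text):
    """Split a document into its constituent parts, keyed by section marker."""
    parts = SECTION_PATTERN.split(text)
    # With one capture group, re.split yields [prefix, name1, body1, name2, body2, ...].
    return {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}


def extract_by_template(components):
    """Apply each component's templates and collect the captured values."""
    elements = {}
    for name, body in components.items():
        for element, pattern in TEMPLATES.get(name, []):
            match = pattern.search(body)
            if match:
                elements[element] = match.group(1)
    return elements


document = (
    "FACTS: The defendant had a blood alcohol content of 150 mg/100ml.\n"
    "REASONING: The conduct constitutes dangerous driving.\n"
    "JUDGMENT: The defendant is sentenced to 2 months of detention.\n"
)
components = segment(document)
elements = extract_by_template(components)
```

Restricting each template to one constituent part keeps a pattern written for the judgment section from firing on similar wording in the facts.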
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: determining legal knowledge elements to be extracted from each of a plurality of components; determining a classifier of legal knowledge elements to be extracted from each component; and classifying the corresponding components by adopting a classifier to obtain a legal knowledge element set.
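The classifier-based variant pairs each constituent part with the element to be extracted from it and a classifier for that element. The sketch below uses a tiny bag-of-words voting classifier written from scratch as a stand-in for whatever trained model a real system would use; the `PLAN` table, training sentences, and element names are all invented for illustration.

```python
from collections import Counter, defaultdict


class TokenVoteClassifier:
    """Tiny bag-of-words classifier, a stand-in for any trained model."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)

    def fit(self, samples):
        for text, label in samples:
            self.token_counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        tokens = text.lower().split()
        scores = {label: sum(counts[token] for token in tokens)
                  for label, counts in self.token_counts.items()}
        return max(scores, key=scores.get)


# Hypothetical plan: component -> (element to extract, training samples).
PLAN = {
    "facts": ("offence_type", [
        ("drove a car after drinking alcohol", "drunk driving"),
        ("took a wallet from the victim", "theft"),
    ]),
    "judgment": ("penalty_type", [
        ("sentenced to detention for two months", "detention"),
        ("fined two thousand yuan", "fine"),
    ]),
}

# One classifier per component, trained on that component's samples.
CLASSIFIERS = {component: (element, TokenVoteClassifier().fit(samples))
               for component, (element, samples) in PLAN.items()}


def extract_elements(components):
    """Classify each known component to obtain its legal knowledge element."""
    return {CLASSIFIERS[name][0]: CLASSIFIERS[name][1].predict(text)
            for name, text in components.items() if name in CLASSIFIERS}


elements = extract_elements({
    "facts": "the defendant drove home after drinking",
    "judgment": "the defendant is fined one thousand yuan",
})
```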
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: matching the search keyword against the label of each judicial document in the judicial document library; and taking the successfully matched judicial documents as candidates to obtain the plurality of candidate judicial documents.
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: performing semantic analysis on the search keyword; and matching the semantics of the search keyword against the label of each judicial document in the judicial document library.
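One simple way to picture semantic matching of a keyword against labels is to normalise both to a shared concept before comparing. The sketch below does this with a hand-written thesaurus; the source does not specify the semantic analysis method, so the `CONCEPTS` table and the exact-concept comparison are assumptions.

```python
# Hypothetical thesaurus mapping surface phrases to canonical concepts.
CONCEPTS = {
    "drunk driving": "driving_under_influence",
    "driving after drinking": "driving_under_influence",
    "dui": "driving_under_influence",
    "theft": "theft",
    "stealing": "theft",
}


def to_concept(phrase):
    """Semantic analysis stand-in: normalise a phrase to a concept id, or None."""
    return CONCEPTS.get(phrase.strip().lower())


def match_candidates(keyword, labelled_documents):
    """Return ids of documents with at least one label sharing the keyword's concept."""
    concept = to_concept(keyword)
    if concept is None:
        return []
    return [doc_id for doc_id, labels in labelled_documents.items()
            if any(to_concept(label) == concept for label in labels)]


# Hypothetical labelled library: doc_id -> labels.
labelled_documents = {
    "A": ["drunk driving", "detention"],
    "B": ["stealing"],
    "C": ["lease dispute"],
}
candidates = match_candidates("DUI", labelled_documents)
```

Here the keyword "DUI" matches document A although neither the keyword nor the label contains the other as a substring.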
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: before selecting the target judicial document from the plurality of candidate judicial documents according to the search keyword and the labels, determining, in each candidate judicial document, the target labels that match the semantics of the search keyword, to obtain a target label set for each candidate judicial document; determining a weight value of each target label in the label set of each candidate judicial document; calculating a matching value of each candidate judicial document based on the weight value of each target label; sorting the candidate judicial documents according to their matching values; and displaying the candidate judicial documents in the sorted order.
Optionally, in this embodiment, the storage medium is further configured to store program code for performing the following steps: displaying the content corresponding to the target label in each candidate judicial document according to preset identification information, wherein the preset identification information comprises at least the following: highlighting, color marking, underlining.
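The weighting and sorting steps can be sketched as below. The source does not state how weights are assigned or how they are combined into a matching value, so the fixed weight table and the plain sum used here are assumptions.

```python
# Hypothetical per-label weights; the source does not say how weights are set.
LABEL_WEIGHTS = {"drunk driving": 3.0, "traffic accident": 2.0, "detention": 1.0}


def matching_value(target_labels, weights):
    """Assumed scoring rule: sum the weights of a candidate's target labels."""
    return sum(weights.get(label, 0.0) for label in target_labels)


def rank_candidates(target_label_sets, weights):
    """target_label_sets: dict of doc_id -> set of target labels."""
    scored = [(matching_value(labels, weights), doc_id)
              for doc_id, labels in target_label_sets.items()]
    # Highest matching value first; ties broken by doc_id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical target label sets for three candidates.
target_label_sets = {
    "A": {"detention"},
    "B": {"drunk driving", "traffic accident"},
    "C": {"drunk driving"},
}
ranking = rank_candidates(target_label_sets, LABEL_WEIGHTS)
```

The candidates would then be displayed in the order given by `ranking`.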
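Marking the content that corresponds to a target label could be done as in this sketch, which wraps matching phrases in HTML markup. Rendering as HTML, the `STYLES` table, and the idea that the content is available as literal phrases are assumptions for illustration only.

```python
import re

# Hypothetical mapping from presentation style to opening/closing markup.
STYLES = {
    "highlight": ("<mark>", "</mark>"),
    "color": ('<span style="color:red">', "</span>"),
    "underline": ("<u>", "</u>"),
}


def mark_label_content(text, phrases, style="highlight"):
    """Wrap each occurrence of the given phrases in the chosen markup."""
    if not phrases:
        return text
    opening, closing = STYLES[style]
    # Longest phrases first so that overlapping phrases prefer the longer match.
    alternatives = "|".join(re.escape(p)
                            for p in sorted(phrases, key=len, reverse=True))
    pattern = re.compile(alternatives, re.IGNORECASE)
    return pattern.sub(lambda m: f"{opening}{m.group(0)}{closing}", text)


marked = mark_label_content(
    "The defendant drove after drinking and caused a traffic accident.",
    ["drove after drinking", "traffic accident"],
    style="underline",
)
```

A production renderer would also need to escape any HTML already present in the document text before inserting markup.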
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (12)

1. A method for retrieving judicial documents, comprising:
acquiring a received search keyword;
searching a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents, and each judicial document comprises at least one label;
and selecting a target judicial document from the plurality of candidate judicial documents according to the search keyword and the label.
2. The method of claim 1, wherein,
before searching the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents, the method further comprises:
judging whether the search keyword exists in a preset search word bank, wherein the preset search word bank comprises a plurality of search words;
if the search keyword does not exist in the preset search word bank, adjusting the search keyword into a target keyword by using a search rewriting model, wherein the search rewriting model is obtained through machine learning training using a plurality of judicial documents, each of which comprises: a search keyword and a target search keyword corresponding to the search keyword;
and searching the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents comprises:
searching the judicial document library based on the target keyword to obtain the plurality of candidate judicial documents.
3. The method of claim 1, wherein before searching the judicial document library based on the search keyword, the method further comprises:
classifying the plurality of judicial documents according to case types to obtain a plurality of types of judicial document sets;
extracting legal knowledge elements from the judicial documents in each type of judicial document set to obtain a legal knowledge element set;
establishing a legal knowledge base corresponding to each type according to the legal knowledge element set;
and adding a corresponding label to each judicial document based on the legal knowledge base.
4. The method of claim 3, wherein extracting legal knowledge elements from the judicial documents in each type of judicial document set to obtain the legal knowledge element set comprises:
segmenting each judicial document according to its constituent parts to obtain a plurality of constituent parts;
and extracting legal knowledge elements based on the plurality of components to obtain a legal knowledge element set.
5. The method of claim 4, wherein the manner of extracting legal knowledge elements based on the plurality of components comprises at least one of: template matching, machine learning model.
6. The method of claim 4, wherein extracting legal knowledge elements based on the plurality of components to obtain a set of legal knowledge elements comprises:
determining legal knowledge elements to be extracted from each of the plurality of components;
determining a classifier of legal knowledge elements to be extracted from each component;
and classifying the corresponding components by adopting the classifier to obtain the legal knowledge element set.
7. The method of claim 1, wherein searching the judicial document library based on the search keyword to obtain the plurality of candidate judicial documents comprises:
matching the search keyword against the label of each judicial document in the judicial document library;
and taking the successfully matched judicial documents as candidate judicial documents to obtain the plurality of candidate judicial documents.
8. The method of claim 7, wherein matching the search keyword against the label of each judicial document in the judicial document library comprises:
performing semantic analysis on the search keyword;
and matching the semantics of the search keyword against the label of each judicial document in the judicial document library.
9. The method of claim 8, wherein before selecting the target judicial document from the plurality of candidate judicial documents according to the search keyword and the label, the method further comprises:
determining, in each candidate judicial document, the target labels that match the semantics of the search keyword, to obtain a target label set for each candidate judicial document;
determining a weight value of each target label in the label set of each candidate judicial document;
calculating a matching value of each candidate judicial document based on the weight value of each target label;
sorting the candidate judicial documents according to the matching value of each candidate judicial document;
and displaying the candidate judicial documents in the sorted order.
10. The method of claim 9, wherein displaying the candidate judicial documents in the sorted order comprises: displaying the content corresponding to the target label in each candidate judicial document according to preset identification information, wherein the preset identification information comprises at least the following: highlighting, color marking, underlining.
11. A judicial document retrieval device, comprising:
an acquisition unit configured to acquire a received search keyword;
a retrieval unit configured to search a judicial document library based on the search keyword to obtain a plurality of candidate judicial documents, wherein the judicial document library comprises a plurality of judicial documents, and each judicial document comprises at least one label;
and a determining unit configured to select a target judicial document from the plurality of candidate judicial documents according to the search keyword and the label.
12. A storage medium comprising a stored program, wherein the program, when executed by a processor, controls a device in which the storage medium is located to perform the method for retrieving judicial documents according to any one of claims 1 to 10.
CN201810663048.9A 2018-06-25 2018-06-25 Method and device for searching judicial documents Active CN110647504B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810663048.9A CN110647504B (en) 2018-06-25 2018-06-25 Method and device for searching judicial documents

Publications (2)

Publication Number Publication Date
CN110647504A true CN110647504A (en) 2020-01-03
CN110647504B CN110647504B (en) 2023-03-21

Family

ID=68988410

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810663048.9A Active CN110647504B (en) 2018-06-25 2018-06-25 Method and device for searching judicial documents

Country Status (1)

Country Link
CN (1) CN110647504B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017092622A1 (en) * 2015-12-01 2017-06-08 北京国双科技有限公司 Legal provision search method and device
CN106815265A (en) * 2015-12-01 2017-06-09 北京国双科技有限公司 The searching method and device of judgement document
CN106502996A (en) * 2016-12-13 2017-03-15 深圳爱拼信息科技有限公司 A kind of judgement document's search method and server based on semantic matches
CN107247743A (en) * 2017-05-17 2017-10-13 安徽富驰信息技术有限公司 A kind of judicial class case search method and system
CN108197163A (en) * 2017-12-14 2018-06-22 上海银江智慧智能化技术有限公司 A kind of structuring processing method based on judgement document

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111259058A (en) * 2020-01-16 2020-06-09 北京百度网讯科技有限公司 Data mining method, data mining device and electronic equipment
CN111259058B (en) * 2020-01-16 2023-09-15 北京百度网讯科技有限公司 Data mining method, data mining device and electronic equipment
CN113569538A (en) * 2020-04-29 2021-10-29 北京国双科技有限公司 Document generation method and device, storage medium and electronic equipment
CN111858938A (en) * 2020-07-23 2020-10-30 鼎富智能科技有限公司 Extraction method and device of referee document label
CN111858938B (en) * 2020-07-23 2024-05-24 鼎富智能科技有限公司 Method and device for extracting referee document tag

Also Published As

Publication number Publication date
CN110647504B (en) 2023-03-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant