A graph-based method and system for computing the similarity of judgment document cases and retrieving similar cases
Technical field
The present invention relates to the technical field of data processing, and more particularly to a method and system for computing the similarity of judgment document cases and retrieving similar cases.
Background technology
Similar case retrieval uses big data technology to mine and analyze published judgment documents and to push directly relevant, guiding or informative similar cases to the judges handling a case, the parties and their lawyers. Accurate and efficient similar case retrieval can greatly improve the quality and efficiency of case handling and let the public experience fairness and justice.
In the prior art, similar judgment documents are usually retrieved by matching keywords or keyword combinations, with the results filtered by preset conditions such as cause of action, court name, trial procedure and judgment date. With exact matching, keyword retrieval returns very few cases; with keywords expanded to include synonyms and near-synonyms, it may return too many cases and retrieval precision drops. Keyword retrieval also ignores legal domain knowledge and does not reflect judicial logic.
Chinese patent CN106502996A proposes a semantic-matching judgment document retrieval method based on term frequency-inverse document frequency (TF-IDF), which improves the relevance of retrieval results by adjusting the weight of each word in a document relative to the whole document collection. Its advantage is that it is simple and efficient. Its shortcomings are equally clear: it handles long texts such as legal documents poorly and does not reflect the important judicial logic, so the precision of its retrieval results is low.
Chinese patent CN106933787A proposes a method for computing judgment document similarity: the judgment keywords of each document are first extracted, a judgment keyword vector is then constructed, and the similarity of the judgment documents is finally obtained by computing vector similarity. This method considers only the judgment result part of a judgment document and ignores key judicial-logic parts such as the evidence, facts, issues in dispute, causality and applicable legal provisions, so its case similarity computation and retrieval can hardly meet the requirements of judicial practice.
Chinese patent CN105930473A proposes a similar document retrieval method based on random forests: a case feature tree is constructed and trained with the random forest technique to obtain a feature weight tree, and a pairwise case similarity matrix is generated according to the query conditions. This method depends heavily on accurate case feature extraction, which the patent does not address, so it can be regarded merely as a refinement applied to cases that have already been retrieved. Moreover, in actual trials case features are often intertwined and are difficult to express in the form of a tree.
Therefore, those skilled in the art are committed to developing a graph-based method and system for computing judgment document case similarity and retrieving similar cases, so as to solve the problem of low retrieval precision caused by existing methods ignoring legal domain knowledge and judicial logic.
Summary of the invention
In order to solve the above technical problems, the present invention provides a graph-based method for computing judgment document case similarity and retrieving similar cases, the main steps of which include:
Step 1: the server end collects judgment documents;
Step 2: the reasoning section of each judgment document is identified;
Further, the reasoning section mainly addresses the parties' claims: based on the case facts as found and on the legal provisions, it states whether each claim is supported, partially supported or not supported. The reasoning section is characterized by beginning with "this court holds" or a similar phrase and ending with "this court supports / does not support" or a similar phrase;
Step 3: the case elements of the reasoning section are analyzed;
Further, the text identified in the previous step is segmented at different granularities. Segmentation is based on natural breaks such as commas, semicolons and full stops, or on the semantic center of a sentence or paragraph, and the resulting text fragments form text clusters. Each text cluster is assigned to an element type according to key legal terms, for example "evidence determination", "case facts", "issue in dispute", "causality" or "applicable law". At the same time, each text cluster is mapped by its semantics to case element labels relevant to the cause of action; in a labor dispute, for example, possible labels include "serious violation of rules and regulations" and "lawful termination of the labor contract". The above segmentation, classification and mapping operations can be completed by manual annotation or by machine learning;
Step 4: the case reasoning graph is generated automatically;
Further, a case reasoning graph is generated from the case element types and labels extracted from the above text clusters, according to their logical relationships. The parties, the claims and the case element labels are nodes; the element types express logical relationships and are represented by edges. Among the nodes, a party's "claim" is marked in its text box with "X" if the court did "not support" it and with "-" if the court "partially supported" it, and carries no symbol if the court "supported" it;
Step 5, client receive retrieval information input by user;
Further, retrieval information is probably pleadings text, court's trial notes text or keyword etc., which submits
Server end;
Step 6, server end is extracted according to retrieval text input by user or mapping case key element;
According to case key element, the case key element reason collection of illustrative plates with gathering document carries out matching primitives for step 7, server end;
Step 8, returned to the case similar to retrieval content by matching degree height.
Further, client user can also again increase, change retrieval information, after server end recalculates, return
Whole retrieval result is adjusted back to be optimal matching.
The present invention also proposes a graph-based system for computing judgment document case similarity and retrieving similar cases, comprising a judgment document case similarity computing device and a similar case document retrieval device. The judgment document case similarity computing device includes:
Reasoning section identification module: for identifying the reasoning section in a judgment document or in the retrieval information input by the user;
Case element analysis module: for semantically segmenting the reasoning section identified by the reasoning section identification module 201 into text fragments, forming text clusters from the fragments, analyzing their elements and forming a mapping between text and elements;
Reasoning graph module: for forming a reasoning graph from the case elements parsed by the case element analysis module 202;
Case element mapping module: for extracting or mapping the case elements of the case information retrieved by the user;
Case element matching module: for computing the matching degree between the case elements of the user's retrieval and the case elements of the judgment documents collected on the server.
The similar case document retrieval device includes:
Input module: for allowing the user to input the information to be retrieved;
Similar case computing module: for automatically extracting or mapping case elements from the retrieval information input by the user, and computing the similarity between these case elements and the case element reasoning graphs of other judgment documents;
Output module: for outputting the retrieval results in descending order of similarity, where the similarity between each output result and the information input by the user must reach a set threshold.
In preferred embodiments of the present invention, the proposed method takes full account of the professional knowledge embodied in judgment documents. The generated case reasoning graph presents the most critical elements of a case and their internal logical relationships in a compact yet intuitive way, which both lets interested parties grasp the main points of a case at a glance and allows relevant cases to be retrieved accurately from the document library.
The concept, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the purpose, features and effects of the present invention can be fully understood.
Brief description of the drawings
Fig. 1 is a flow diagram of a graph-based judgment document case similarity computing and retrieval method according to a preferred embodiment of the present invention;
Fig. 2 is a structural diagram of a graph-based judgment document case similarity computing device according to a preferred embodiment of the present invention;
Fig. 3 is a structural diagram of a graph-based retrieval device according to a preferred embodiment of the present invention;
Fig. 4 is a schematic diagram of a reasoning graph for the graph-based judgment document case similarity computing and retrieval method according to a preferred embodiment of the present invention.
Detailed description of the embodiments
Several preferred embodiments of the present invention are described below with reference to the accompanying drawings, to make the technical content clearer and easier to understand. The present invention can be embodied in many different forms, and its scope of protection is not limited to the embodiments mentioned herein. Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment 1
Fig. 1 shows the flow of the graph-based judgment document case similarity computing and retrieval method proposed by the present invention. The method comprises the following steps:
Step S101: the server end collects judgment documents.
Step S102: the reasoning section of each judgment document is identified and located; this section contains the elements of the case and the court's trial result.
Step S103: the reasoning section of the judgment document is cut into several text fragments according to its content, semantics and context; the fragments form text clusters, and case element analysis is performed on each text cluster.
Step S104: a reasoning graph is formed from the case elements extracted in step S103.
Step S105: the user inputs retrieval information, which may be pleading text, trial transcript text, keywords or the like, and the retrieval information is submitted to the server end.
Step S106: the server end extracts or maps case elements from the retrieval text input by the user.
Step S107: the server end matches the case elements mapped from the user's input against the case element reasoning graphs of the collected documents.
Step S108: the cases similar to the retrieval content are returned.
Embodiment 2
As shown in Fig. 4, the first-instance civil judgment in the labor contract dispute between Shanghai En Tien XXX Co., Ltd. and Yang XX is taken as an example to illustrate how the graph-based judgment document case similarity computing and retrieval method is implemented.
In a specific implementation, the more judgment documents the server end collects, the more reasoning graphs are available for graph similarity computation during similar judgment document retrieval, and the better the retrieval results. A judgment document records the process and outcome of a people's court hearing and is the carrier of the result of litigation. Judgment documents include several types, such as judgments, rulings and mediation statements.
In a specific implementation, step S102 includes: finding "this court holds" or a similar phrase as the start of the reasoning section, and finding a phrase expressing the court's trial result and attitude, such as "this court supports" or "this court does not support", as the end of the reasoning section.
Taking the first-instance civil judgment in the labor contract dispute between Shanghai En Tien XXX Co., Ltd. and Yang XX as an example, identification runs from the phrase "this court holds", which marks the start of the reasoning section, to the phrase "this court supports", which expresses the court's trial result. The recognition result is as follows:
"This court holds: under Item 2 of Article 39 of the Labor Contract Law of the People's Republic of China, where a worker seriously violates the employer's rules and regulations, the employer may terminate the labor contract. Whether the employer's termination is lawful depends mainly on whether the worker's conduct seriously violated the employer's relevant rules and regulations. According to the facts ascertained in this case, the defendant, while working for the plaintiff, abused the position held and accepted bribes from suppliers. The defendant's conduct seriously violated the plaintiff's rules and regulations, and the plaintiff's termination of the labor contract with the defendant on that basis complies with the law and is not improper. Therefore, as to the plaintiff's claim that it need not pay the defendant 76,523.40 yuan in compensation for terminating the labor contract, this court supports it."
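The phrase-based identification of step S102 can be illustrated with a minimal Python sketch. The marker lists and the function name `extract_reasoning` are hypothetical and use English renderings of the phrases; a real deployment would supply the phrases actually found in the collected documents. Taking the last end phrase after the start phrase is an assumption made here, not something the description specifies.

```python
import re

# Hypothetical English renderings of the opening and closing phrases.
START_MARKERS = ["This court holds"]
END_MARKERS = ["this court supports", "this court does not support"]


def extract_reasoning(text, start_markers=START_MARKERS, end_markers=END_MARKERS):
    """Return the span from the first start phrase through the last end phrase.

    Returns None when either phrase is missing.
    """
    start_re = re.compile("|".join(re.escape(m) for m in start_markers))
    end_re = re.compile("|".join(re.escape(m) for m in end_markers))

    start = start_re.search(text)
    if start is None:
        return None

    # Only look for end phrases that occur after the start phrase.
    ends = list(end_re.finditer(text, start.end()))
    if not ends:
        return None

    return text[start.start():ends[-1].end()]
```

With a document such as "Facts found. This court holds: the worker took bribes, so this court supports the claim.", the function returns the text from "This court holds" through "this court supports".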
In a specific implementation, the semantic segmentation in step S103 is generally performed at different granularities; segmentation is generally based on natural breaks such as commas, semicolons and full stops, and the resulting text fragments form text clusters. On the basis of the text clusters, case element analysis builds a complete label system around several main categories, such as "evidence determination", "case facts", "issue in dispute", "causality" and "applicable law", and labels the text clusters, forming a mapping between text and labels (case elements). The above segmentation, classification and mapping operations can be completed by manual annotation or by machine learning.
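As a rough illustration of the segmentation and labelling in step S103, the sketch below splits text at punctuation breaks and assigns element types and labels with keyword rules. The rule table `RULES` and the function names are hypothetical; the description leaves the mapping to manual annotation or machine learning, and keyword rules are used here only as a simple stand-in.

```python
import re

# Hypothetical rules: (keyword, element type, case element label).
RULES = [
    ("bribe", "evidence determination", "taking bribes"),
    ("seriously violated", "case facts", "serious violation of rules and regulations"),
    ("lawful", "issue in dispute", "lawful termination of the labor contract"),
]


def segment(text):
    """Split text at whitespace that follows a comma, semicolon or full stop.

    Punctuation inside numbers such as 76,523.40 is not followed by
    whitespace, so such numbers stay in one fragment.
    """
    parts = re.split(r"(?<=[,;.])\s+", text)
    fragments = [p.strip(" ,;.") for p in parts]
    return [f for f in fragments if f]


def label_fragment(fragment):
    """Return every (element type, label) pair whose keyword occurs in the fragment."""
    lowered = fragment.lower()
    return [(etype, label) for keyword, etype, label in RULES if keyword in lowered]
```

Calling `segment` on a sentence and then `label_fragment` on each fragment yields the text-to-label mapping on which the later graph construction relies.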
Continuing with the judgment cited in step S102, and for brevity, only the case element labels are enumerated here; when the case element types are enumerated, any text cluster that has a case element label is replaced by that label.
The elements of the case are as follows:
The parties are the "employer" and the "worker"; the claim is "non-payment of compensation"; the judgment result is "supported";
The case element labels are summarized as follows:
"taking bribes", "serious violation of rules and regulations", "lawful termination of the labor contract", "non-payment of compensation";
The case element types are summarized as follows:
Between the employer and the worker there is the "issue in dispute" of "lawful termination of the labor contract";
"Under Item 2 of Article 39 of the Labor Contract Law of the People's Republic of China ... may terminate the labor contract" belongs to the "applicable law" element type;
"taking bribes" belongs to the "evidence determination" element type;
"serious violation of the employer's rules and regulations" belongs to the "case facts" element type;
Between "taking bribes" and "serious violation of the employer's rules and regulations" there is the "causality" logical element type;
Between "serious violation of the employer's rules and regulations" and "lawful termination of the labor contract" there is the "causality" logical element type.
In a specific implementation, step S104 automatically generates the reasoning graph from the element mapping of step S103. The parties, the claims and the case element labels are nodes; the element types express logical relationships and are represented by edges. Among the nodes, a party's "claim" is marked in its text box with "X" if the court did "not support" it and with "-" if it was "partially supported", and carries no symbol if it was "supported".
Continuing with the judgment in step S102, the claim in this case is that the employer need not pay compensation for terminating the labor contract. The judicial logic of the case elements is: because the worker took bribes, the worker's conduct seriously violated the employer's rules and regulations, and under the legal provisions the employer's termination of the labor contract was lawful; the employer's claim that it need not pay compensation for terminating the labor contract is therefore supported. Fig. 4 illustrates this judicial logic: the parties "employer" and "worker", the claim "non-payment of compensation" and each case element label are nodes, and the element types, as logical relationships, are edges. Because the court was found to have "supported" the employer's claim, the claim text box carries no symbol.
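The graph construction of step S104 might be sketched in Python as follows, using the example case. The class `ReasoningGraph` and the helper names are hypothetical, and the layout of Fig. 4 is not reproduced: in particular, drawing the "issue in dispute" as an edge from each party to the label is an assumption of this sketch.

```python
from dataclasses import dataclass, field

# Marker shown in a claim's text box for each court outcome.
CLAIM_MARKERS = {"not supported": "X", "partially supported": "-", "supported": ""}


@dataclass
class ReasoningGraph:
    nodes: dict = field(default_factory=dict)  # node name -> node kind
    edges: list = field(default_factory=list)  # (source, target, element type)

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def add_edge(self, source, target, element_type):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("unknown node in edge")
        self.edges.append((source, target, element_type))


def claim_box_text(claim, outcome):
    """Prefix the claim with the marker for the court's outcome, if any."""
    return f"{CLAIM_MARKERS[outcome]} {claim}".strip()


def build_example_graph():
    bribes = "taking bribes"
    violation = "serious violation of rules and regulations"
    lawful = "lawful termination of the labor contract"

    g = ReasoningGraph()
    for party in ("employer", "worker"):
        g.add_node(party, "party")
    g.add_node(claim_box_text("non-payment of compensation", "supported"), "claim")
    for label in (bribes, violation, lawful):
        g.add_node(label, "label")

    g.add_edge(bribes, violation, "causality")
    g.add_edge(violation, lawful, "causality")
    # Assumption: the issue in dispute links each party to the disputed label.
    g.add_edge("employer", lawful, "issue in dispute")
    g.add_edge("worker", lawful, "issue in dispute")
    return g
```

Because the example claim was supported, its node text carries no marker; an unsupported claim would appear as "X non-payment of compensation".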
In a specific implementation, in step S105 the user sends a retrieval request from the client; the retrieval information may be pleading text, trial transcript text, keywords or the like. After receiving the request, the server proceeds to step S106.
In a specific implementation, step S106 extracts or maps case elements from the retrieval information input by the user, and step S107 computes the matching degree between these elements and the case reasoning graphs of the judgment documents collected on the server. This process first computes the similarity between the claims in the retrieval information and the claims of the documents on the server; if they match, it goes on to compute the similarity between the case elements in the retrieval information and the case reasoning graphs on the server, thereby obtaining the final matching degree.
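The two-stage matching of step S107 can be sketched as below. The description does not name a similarity measure, so Jaccard similarity over label sets is used purely as an assumption, as are the dictionary layout, the `claim_threshold` parameter and the choice to take the element similarity as the final matching degree.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_degree(query, document, claim_threshold=0.5):
    """Two-stage match: claims first, then case elements.

    `query` and `document` are dicts with "claims" and "elements" keys
    (a hypothetical layout). If the claims do not match, the matching
    degree is 0.0 and the elements are not compared.
    """
    if jaccard(query["claims"], document["claims"]) < claim_threshold:
        return 0.0
    return jaccard(query["elements"], document["elements"])
```

Checking the claims first means that documents with unrelated claims are discarded before the more detailed element comparison.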
In a specific implementation, step S108 uses a set threshold to find all cases whose reasoning graph similarity to the case retrieved by the user reaches the threshold, and returns the retrieval results in descending order of matching degree. The client user can also add to or modify the retrieval information; after the server end recomputes, the adjusted retrieval results are returned so as to reach the best match.
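The threshold filtering and ordering of step S108 amount to a filter followed by a sort. In the sketch below, the function name `rank_results` and the mapping from case identifiers to matching degrees are hypothetical.

```python
def rank_results(scores, threshold):
    """Keep cases whose matching degree reaches the threshold, best first.

    `scores` maps a case identifier to its matching degree.
    """
    hits = [(case_id, score) for case_id, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

When the user adds to or modifies the retrieval information, the matching degrees are recomputed and the same function is applied to the new scores.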
Embodiment 3
Fig. 2 is a structural diagram of the graph-based judgment document case similarity computing device used by the graph-based judgment document case similarity computing and retrieval system according to a preferred embodiment of the present invention. The device includes:
Reasoning section identification module 201: for identifying the reasoning section in a judgment document or in the retrieval information input by the user.
Case element analysis module 202: for semantically segmenting the reasoning section identified by the reasoning section identification module 201 into text fragments, forming text clusters from the fragments, analyzing their elements and forming a mapping between text and elements.
Reasoning graph module 203: for forming a reasoning graph from the case elements parsed by the case element analysis module 202.
Case element mapping module 204: for extracting or mapping the case elements of the case information retrieved by the user.
Case element matching module 205: for computing the matching degree between the case elements of the user's retrieval and the case elements of the judgment documents collected on the server.
Embodiment 4
Fig. 3 is a structural diagram of the graph-based similar case document retrieval device used by the graph-based judgment document case similarity computing and retrieval system according to a preferred embodiment of the present invention. The device includes:
Input module 301: for allowing the user to input the information to be retrieved.
Similar case computing module 302: for automatically extracting or mapping case elements from the retrieval information input by the user, and computing the similarity between these case elements and the case element reasoning graphs of other judgment documents.
Output module 303: for outputting the retrieval results in descending order of similarity, where the similarity between each output result and the information input by the user must reach a set threshold.
The preferred embodiments of the present invention have been described in detail above. It should be understood that those of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning or limited experimentation on the basis of the prior art under the concept of the present invention shall fall within the scope of protection defined by the claims.