CN111144068A - Similar arbitration case recommendation method and device - Google Patents

Similar arbitration case recommendation method and device

Publication number
CN111144068A
Authority
CN
China
Prior art keywords
arbitration
case
information
text information
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911170945.7A
Other languages
Chinese (zh)
Inventor
张森
罗诚
杨威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Purvar Software Wuhan Co ltd
Original Assignee
Purvar Software Wuhan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Purvar Software Wuhan Co ltd
Priority to CN201911170945.7A
Publication of CN111144068A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services; Handling legal documents
    • G06Q50/182 Alternative dispute resolution

Abstract

The embodiment of the invention provides a similar arbitration case recommendation method and device. The method comprises the following steps: acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to obtain vector information corresponding to the target arbitration case; performing similarity matching between this vector information and the vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information between the target arbitration case and each arbitration case in the knowledge base; and extracting the cosine similarity information that meets preset conditions, and recommending the corresponding arbitration cases in the knowledge base as similar cases. The method calculates the similarity between the case document and each case document in the knowledge base using the vector form of the documents, and uses the decisions of the cases in the knowledge base as recommendations for the case to be decided, thereby reflecting the objective legal basis of the decision.

Description

Similar arbitration case recommendation method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for recommending similar arbitration cases.
Background
Arbitration is a dispute resolution system and method in which the parties to a dispute agree, before or after the dispute arises, to voluntarily submit it to an independent arbitration institution for a ruling, and the institution's award is binding on all parties. As legal awareness grows and the economy develops, the number of cases heard by arbitration institutions in various regions is increasing rapidly. At the same time, ordinary people have more and more opportunities to pursue their legal interests, so more and more cases go to arbitration, which places a heavy burden on the staff of arbitration institutions. To outsiders, law is a highly specialized field. At present, arbitration institutions handle arbitration cases manually, and cases are decided by professionally trained staff. When deciding an arbitration case, the staff typically compare it with historical arbitration cases in a database.
In the prior art, arbitration staff usually search a specific database for relevant historical arbitration cases by keyword. A good legal worker needs a deep background in legal knowledge and years of practical experience, so the number of case handlers at arbitration institutions is often insufficient to cope with the growing number of arbitration cases. Moreover, the result sometimes depends heavily on the keywords that the staff extract from the case description, which is subjective and can affect the decision in some arbitration cases.
Disclosure of Invention
To solve the above problems in the prior art, embodiments of the present invention provide a similar arbitration case recommendation method and apparatus.
In a first aspect, an embodiment of the present invention provides a similar arbitration case recommendation method, including:
acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case;
carrying out similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base;
and extracting cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as similar cases.
The step of acquiring the text information of the target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to obtain the vector information corresponding to the target arbitration case specifically includes: segmenting the text information into words and filtering out meaningless words according to a preset stop word list to obtain the segmented text information; and representing each word in the segmented text information by a vector in a vector space to obtain the vector information corresponding to the text information.
Wherein the method further comprises: acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to obtain vector information corresponding to the plurality of arbitration cases; and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
The step of extracting the cosine similarity information meeting preset conditions and recommending the arbitration cases in the knowledge base corresponding to that cosine similarity information as similar cases specifically includes:
sorting the cosine similarity information in descending order, and selecting the arbitration cases whose similarity meets a preset threshold as similar cases for recommendation.
In a second aspect, an embodiment of the present invention provides a similar arbitration case recommendation apparatus, including:
the preprocessing module is used for acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case;
the similarity matching module is used for performing similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base;
and the recommending module is used for extracting the cosine similarity information meeting the preset conditions and recommending the arbitration case in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as a similar case.
Wherein the preprocessing module is specifically configured to: segment the text information into words, filter out meaningless words according to a preset stop word list to obtain the segmented text information, and represent each word in the segmented text information by a vector in a vector space to obtain the vector information corresponding to the text information.
Wherein the apparatus further comprises: a knowledge base construction module, used for acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to obtain vector information corresponding to the arbitration cases; and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
Wherein the recommendation module is specifically configured to: sort the cosine similarity information in descending order, and select the arbitration cases whose similarity meets a preset threshold as similar cases for recommendation.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the similar arbitration case recommendation method as provided in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the similar arbitration case recommendation method provided in the first aspect above.
According to the similar arbitration case recommendation method and device provided by the embodiment of the invention, the similarity between the case document and each case document in the knowledge base is calculated using the vector form of the documents, and the decisions of the cases in the knowledge base are used as recommendations for the case to be decided, so that interference from human factors and the like can be effectively avoided and the objective legal basis of the decision can be reflected. The method has the advantages of high accuracy, strong robustness and simple application; it improves the efficiency of manual case decisions at arbitration institutions, reduces the labor intensity and time of manual processing, and reduces the cost of manual case decisions.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart illustrating a similar arbitration case recommendation method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a similar arbitration case recommendation device according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a similar arbitration case recommendation method according to an embodiment of the present invention, where the method includes:
S1, acquiring the text information of the target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to obtain the vector information corresponding to the target arbitration case.
S2, performing similarity matching between the vector information and the vector information corresponding to each arbitration case in a preset knowledge base to obtain the soft cosine similarity information between the target arbitration case and each arbitration case in the knowledge base.
S3, extracting the cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to that cosine similarity information as similar cases.
Specifically, since the original case material is in text form, and the text is unstructured data, it cannot be passed directly as input to any mathematical model, so all text must first be represented in computer-processable form, i.e., in numerical form.
The ultimate goal of text processing is to convert the text description of a case into a matrix in which each word of the description is represented by a vector in a vector space. The text must first be segmented into words, because the word is the smallest meaningful unit of language that can be used independently. During segmentation, jieba or another word segmentation tool first separates the text into single characters, punctuation marks and phrases, and then modal particles and other meaningless words are filtered out against the stop word list.
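The segmentation and stop-word filtering described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `preprocess`, the tiny `STOP_WORDS` set and the English sample sentence are hypothetical, and whitespace splitting stands in for a real segmenter. For Chinese case text, a segmenter such as `jieba.lcut` (assuming jieba is installed) could be passed as the `segment` argument instead.

```python
from typing import Callable, Iterable, List, Set


def preprocess(text: str,
               stop_words: Set[str],
               segment: Callable[[str], Iterable[str]] = str.split) -> List[str]:
    """Segment text, then drop stop words and punctuation-only tokens."""
    tokens = []
    for token in segment(text):
        token = token.strip()
        if not token or token in stop_words:
            continue
        # Skip tokens that contain no letter or digit (pure punctuation).
        if not any(ch.isalnum() for ch in token):
            continue
        tokens.append(token)
    return tokens


# Hypothetical stop word list and sample text, for illustration only.
STOP_WORDS = {"the", "a", "of", "oh"}
sample = "the buyer refused payment of the goods , oh"
tokens = preprocess(sample, STOP_WORDS)
print(tokens)  # ['buyer', 'refused', 'payment', 'goods']
```

Keeping the segmenter as a parameter lets the same filtering logic be reused with whichever word segmentation tool is chosen.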
As described above, each new case is described in text form, so the target arbitration case must be converted into vector form before it can be used in the model calculation. After vectorization, each case is a vector in which each dimension represents the frequency with which the corresponding word of the lexicon V appears in the case text. The new case a is compared in turn with every case in the knowledge base; the case traversed in each iteration is denoted b, and the calculation formula is as follows:

$$\mathrm{soft\_cosine}(a,b)=\frac{\sum_{i,j}^{N} s_{ij}\,a_i\,b_j}{\sqrt{\sum_{i,j}^{N} s_{ij}\,a_i\,a_j}\;\sqrt{\sum_{i,j}^{N} s_{ij}\,b_i\,b_j}}$$

where $s_{ij}=\mathrm{sim}(f_i,f_j)$, $f_i$ is the word vector of the i-th word, $f_j$ is the word vector of the j-th word, and N is the number of words in the lexicon V. $a_i$ is the frequency with which the i-th word appears in document a, and $b_j$ is the frequency with which the j-th word appears in document b. The soft cosine value calculated by this formula represents the similarity of the two documents, and its value lies between 0 and 1.
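A minimal sketch of the soft cosine calculation, in pure Python, is given below. It assumes that sim(f_i, f_j) is the ordinary cosine between word vectors, which the text does not specify. The three-word vocabulary and its word vectors are made up for illustration. The score is the sum of s_ij·a_i·b_j over all word pairs, divided by the square roots of the corresponding sums for (a, a) and (b, b).

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Ordinary cosine between two word vectors, used here as sim(f_i, f_j)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


def soft_cosine(a: Sequence[float], b: Sequence[float],
                word_vectors: Sequence[Sequence[float]]) -> float:
    """Soft cosine of frequency vectors a and b over a shared vocabulary."""
    n = len(word_vectors)
    # s[i][j] = sim(f_i, f_j): similarity between the i-th and j-th words.
    s: List[List[float]] = [[cosine(word_vectors[i], word_vectors[j])
                             for j in range(n)] for i in range(n)]

    def form(x: Sequence[float], y: Sequence[float]) -> float:
        return sum(s[i][j] * x[i] * y[j] for i in range(n) for j in range(n))

    denom = math.sqrt(max(form(a, a), 0.0)) * math.sqrt(max(form(b, b), 0.0))
    return form(a, b) / denom if denom > 0 else 0.0


# Hypothetical vocabulary: ["contract", "agreement", "injury"].
vectors = [(1.0, 0.0, 0.0), (0.8, 0.6, 0.0), (0.0, 0.0, 1.0)]
doc_a = [1, 0, 0]   # mentions only "contract"
doc_b = [0, 1, 0]   # mentions only "agreement"
doc_c = [0, 0, 1]   # mentions only "injury"
score_ab = soft_cosine(doc_a, doc_b, vectors)
score_ac = soft_cosine(doc_a, doc_c, vectors)
```

Documents a and b share no word, so an ordinary cosine would give 0. The soft cosine instead gives roughly 0.8, because "contract" and "agreement" have similar word vectors. The unrelated document c still scores 0.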
After the soft cosine similarity between the target arbitration case and each arbitration case in the knowledge base has been calculated, the decision of the most similar case can be selected as the recommended arbitration result for the target arbitration case; alternatively, the decisions of several of the most similar arbitration cases can be provided as recommended results.
With this method, the similarity between the case document and each case document in the knowledge base is calculated using the vector form of the documents, and the decisions of the cases in the knowledge base are used as recommendations for the case to be decided, so that interference from human factors and the like can be effectively avoided and the objective legal basis of the decision can be reflected. The method has the advantages of high accuracy, strong robustness and simple application; it improves the efficiency of manual case decisions at arbitration institutions, reduces the labor intensity and time of manual processing, and reduces the cost of manual case decisions.
On the basis of the above embodiment, the step of obtaining the text information of the target arbitration case, preprocessing the text information, vectorizing the preprocessed text information, and obtaining the vector information corresponding to the target arbitration case specifically includes: segmenting words of the text information, filtering out meaningless words in the text information according to a preset stop word list, and obtaining the text information after word segmentation; and representing each word in the text information after word segmentation by using a vector in a vector space to obtain vector information corresponding to the text information.
Specifically, the text of the target arbitration case must first be segmented into words, because the word is the smallest meaningful unit of language that can be used independently. During segmentation, jieba or another word segmentation tool first separates the text into single characters, punctuation marks and phrases, and then modal particles and other meaningless words are filtered out against the stop word list.
After the words and their frequencies have been obtained from the segmentation step, low-frequency words and/or high-frequency words are deleted, so that only descriptive expressions closely related to the case are retained. A dictionary is then built from the retained words. The dictionary is a hash table in which each word or character corresponds to a unique index.
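One way the frequency pruning and dictionary construction could look is sketched below. The thresholds `min_count` and `max_doc_ratio`, the function name `build_dictionary` and the toy documents are assumptions for illustration; the text does not state how low-frequency and high-frequency words are defined. A Python `dict` is itself a hash table, so it maps each retained word to a unique index directly.

```python
from collections import Counter
from typing import Dict, List


def build_dictionary(docs: List[List[str]],
                     min_count: int = 2,
                     max_doc_ratio: float = 0.9) -> Dict[str, int]:
    """Map each retained word to a unique index.

    Words occurring fewer than min_count times overall (low frequency) or in
    more than max_doc_ratio of the documents (high frequency) are dropped.
    """
    total = Counter(tok for doc in docs for tok in doc)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    n_docs = len(docs)
    kept = sorted(w for w in total
                  if total[w] >= min_count and doc_freq[w] / n_docs <= max_doc_ratio)
    return {word: index for index, word in enumerate(kept)}


# Toy segmented cases (hypothetical).
docs = [
    ["arbitration", "contract", "breach"],
    ["arbitration", "contract", "payment"],
    ["arbitration", "lease", "payment"],
]
dictionary = build_dictionary(docs)
print(dictionary)  # {'contract': 0, 'payment': 1}
```

In this toy corpus "arbitration" appears in every document and is dropped as too frequent. "breach" and "lease" appear only once and are dropped as too rare.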
Before a case is vectorized, a word vector is computed for each word in the dictionary using a word vector model such as word2vec or BERT; the case is then vectorized.
The case text is represented mainly by a vector space model. The basic idea is to regard the case text as an n-dimensional vector (w1, w2, w3, …, wn) in a vector space, where wi is the weight of the i-th feature; in the present invention the weight is the TF-IDF coefficient. Every word used in the TF-IDF calculation should come from the dictionary constructed above.
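The TF-IDF weighting can be illustrated with a short sketch. The text does not say which TF-IDF variant is used, so the example assumes one common form: term frequency is the count divided by the document length, and IDF is log(N / df). The function name `tfidf_vectors` and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]],
                  dictionary: Dict[str, int]) -> List[List[float]]:
    """Return one TF-IDF vector (w1, ..., wn) per document.

    Only words present in the dictionary contribute; the variant
    tf = count / length and idf = log(N / df) is an assumption.
    """
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc) if tok in dictionary)
    vectors = []
    for doc in docs:
        counts = Counter(tok for tok in doc if tok in dictionary)
        length = sum(counts.values())
        vec = [0.0] * len(dictionary)
        for word, count in counts.items():
            tf = count / length
            idf = math.log(n_docs / doc_freq[word])
            vec[dictionary[word]] = tf * idf
        vectors.append(vec)
    return vectors


# Hypothetical dictionary and segmented cases.
dictionary = {"contract": 0, "payment": 1, "lease": 2}
docs = [["contract", "payment"], ["contract", "contract"], ["lease", "payment"]]
vectors = tfidf_vectors(docs, dictionary)
```

With this variant, a word that occurs in every document gets weight 0, which is one reason very frequent words add little to the comparison.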
On the basis of the above embodiment, the method further includes: acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to obtain vector information corresponding to the plurality of arbitration cases; and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
Specifically, a large number of arbitration related cases are collected, each arbitration related case is subjected to word segmentation, each arbitration case is converted into vector representation according to the method, and then the vector representation is stored in a server by using a database storage unit, so that a knowledge base is constructed.
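As a rough illustration of storing case vectors in a database, the sketch below uses Python's built-in `sqlite3` module with vectors serialized as JSON. The table layout, the function names and the sample cases are all assumptions; the text only says that the vector representations are stored on a server using a database storage unit.

```python
import json
import sqlite3
from typing import List, Tuple


def create_knowledge_base(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a knowledge base with one row per arbitration case."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "case_id TEXT PRIMARY KEY, decision TEXT, vector TEXT NOT NULL)"
    )
    return conn


def add_case(conn: sqlite3.Connection, case_id: str,
             decision: str, vector: List[float]) -> None:
    """Store a case's decision text together with its vector (as JSON)."""
    conn.execute("INSERT OR REPLACE INTO cases VALUES (?, ?, ?)",
                 (case_id, decision, json.dumps(vector)))
    conn.commit()


def load_cases(conn: sqlite3.Connection) -> List[Tuple[str, str, List[float]]]:
    """Return all stored cases, ordered by case id."""
    rows = conn.execute(
        "SELECT case_id, decision, vector FROM cases ORDER BY case_id")
    return [(cid, decision, json.loads(vec)) for cid, decision, vec in rows]


# Hypothetical cases; a real knowledge base would hold many more.
kb = create_knowledge_base()
add_case(kb, "case-001", "claim upheld", [0.2, 0.0, 0.5])
add_case(kb, "case-002", "claim dismissed", [0.0, 0.7, 0.1])
stored = load_cases(kb)
```

Storing the decision next to the vector means a matched case can return its outcome without a second lookup.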
On the basis of the above embodiment, the step of extracting the cosine similarity information meeting preset conditions and recommending the arbitration cases in the knowledge base corresponding to that cosine similarity information as similar cases specifically includes: sorting the cosine similarity information in descending order, and selecting the arbitration cases whose similarity meets a preset threshold as similar cases for recommendation.
Specifically, after the soft cosine similarity between the new case and each case in the knowledge base has been calculated, the cases can be sorted in descending order of similarity, so that the arbitration case with the highest similarity can be recommended as the similar arbitration case. Alternatively, a threshold can be set for the cosine similarity, and all arbitration cases that meet the preset threshold are recommended as similar cases for the reference of the relevant personnel.
In conclusion, the method provided by the embodiment of the invention has the advantages of high accuracy, strong robustness and simple application. Its practical effect is also significant: it improves the efficiency of manual case decisions at arbitration institutions, reduces the labor intensity and time of manual processing, and reduces the cost of manual case decisions.
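The two selection strategies, taking the top-ranked case or taking every case above a threshold, can be sketched as follows. The function name `recommend`, the score values and the threshold of 0.7 are illustrative assumptions, not values given in the text.

```python
from typing import Dict, List, Optional, Tuple


def recommend(scores: Dict[str, float],
              threshold: Optional[float] = None,
              top_k: int = 1) -> List[Tuple[str, float]]:
    """Rank cases by similarity, highest first.

    If a threshold is given, return every case whose similarity meets it;
    otherwise return the top_k most similar cases.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        return [(case_id, s) for case_id, s in ranked if s >= threshold]
    return ranked[:top_k]


# Hypothetical soft cosine scores against three knowledge-base cases.
scores = {"case-001": 0.91, "case-002": 0.42, "case-003": 0.78}
best = recommend(scores)
above = recommend(scores, threshold=0.7)
print(best)   # [('case-001', 0.91)]
print(above)  # [('case-001', 0.91), ('case-003', 0.78)]
```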
Referring to fig. 2, fig. 2 is a schematic structural diagram of a similar arbitration case recommendation device according to an embodiment of the present invention, the device includes: a preprocessing module 21, a similarity matching module 22 and a recommendation module 23.
The preprocessing module 21 is configured to obtain text information of a target arbitration case, preprocess the text information, and vectorize the preprocessed text information to obtain vector information corresponding to the target arbitration case;
the similarity matching module 22 is configured to perform similarity matching between the vector information and vector information corresponding to each arbitration case in a preset knowledge base, so as to obtain soft cosine similarity information between the target arbitration case and each arbitration case in the knowledge base;
the recommending module 23 is configured to extract cosine similarity information meeting a preset condition, and recommend an arbitration case in the knowledge base corresponding to the cosine similarity information meeting the preset condition as a similar case.
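The division into three modules could be mirrored in code roughly as below. This is a structural sketch only. The class names follow the module names in the text, but the whitespace tokenizer, the count-based vectors and the plain cosine used as the similarity function are simplifying stand-ins for the segmentation tool and the soft cosine measure that the text describes.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def plain_cosine(u: Vector, v: Vector) -> float:
    """Placeholder similarity; a soft cosine function would be plugged in here."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0 else 0.0


@dataclass
class PreprocessingModule:          # corresponds to module 21
    vocabulary: Sequence[str]

    def vectorize(self, text: str) -> Vector:
        tokens = text.split()       # stand-in for real word segmentation
        return [float(tokens.count(word)) for word in self.vocabulary]


@dataclass
class SimilarityMatchingModule:     # corresponds to module 22
    knowledge_base: Dict[str, Vector]
    similarity: Callable[[Vector, Vector], float]

    def match(self, target: Vector) -> Dict[str, float]:
        return {cid: self.similarity(target, vec)
                for cid, vec in self.knowledge_base.items()}


@dataclass
class RecommendationModule:         # corresponds to module 23
    threshold: float

    def recommend(self, scores: Dict[str, float]) -> List[Tuple[str, float]]:
        hits = [(cid, s) for cid, s in scores.items() if s >= self.threshold]
        return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical vocabulary and knowledge base.
pre = PreprocessingModule(["contract", "payment", "lease"])
matcher = SimilarityMatchingModule(
    {"case-001": [1.0, 1.0, 0.0], "case-002": [0.0, 0.0, 1.0]}, plain_cosine)
rec = RecommendationModule(threshold=0.5)
result = rec.recommend(matcher.match(pre.vectorize("contract payment dispute")))
```

Passing the similarity function in as a parameter keeps the matching module independent of the particular measure used.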
Specifically, the text must first be segmented into words, because the word is the smallest meaningful unit of language that can be used independently. During segmentation, jieba or another word segmentation tool first separates the text into single characters, punctuation marks and phrases, and then modal particles and other meaningless words are filtered out against the stop word list.
As described above, each new case is described in text form, so the target arbitration case must be converted into vector form before it can be used in the model calculation. After vectorization, each case is a vector in which each dimension represents the frequency with which the corresponding word of the lexicon V appears in the case text. The new case a is compared in turn with every case in the knowledge base; the case traversed in each iteration is denoted b, and the calculation formula is as follows:

$$\mathrm{soft\_cosine}(a,b)=\frac{\sum_{i,j}^{N} s_{ij}\,a_i\,b_j}{\sqrt{\sum_{i,j}^{N} s_{ij}\,a_i\,a_j}\;\sqrt{\sum_{i,j}^{N} s_{ij}\,b_i\,b_j}}$$

where $s_{ij}=\mathrm{sim}(f_i,f_j)$, $f_i$ is the word vector of the i-th word, $f_j$ is the word vector of the j-th word, and N is the number of words in the lexicon V. $a_i$ is the frequency with which the i-th word appears in document a, and $b_j$ is the frequency with which the j-th word appears in document b. The soft cosine value calculated by this formula represents the similarity of the two documents, and its value lies between 0 and 1.
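A short check helps relate the soft cosine to the ordinary cosine. It assumes the standard soft cosine form, in which the numerator sums $s_{ij}\,a_i\,b_j$ over all word pairs. If every word is similar only to itself, that is $s_{ij}=1$ for $i=j$ and $s_{ij}=0$ otherwise, all cross terms vanish:

```latex
\[
\frac{\sum_{i,j} s_{ij}\,a_i\,b_j}
     {\sqrt{\sum_{i,j} s_{ij}\,a_i\,a_j}\;\sqrt{\sum_{i,j} s_{ij}\,b_i\,b_j}}
\;\xrightarrow{\;s_{ij}=\delta_{ij}\;}\;
\frac{\sum_{i} a_i\,b_i}
     {\sqrt{\sum_{i} a_i^{2}}\;\sqrt{\sum_{i} b_i^{2}}}
\]
```

The right-hand side is the ordinary cosine similarity. The soft cosine therefore differs from it only through the off-diagonal terms $s_{ij}$ with $i \neq j$, which let two documents score above zero even when they use different but related words.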
After the soft cosine similarity between the target arbitration case and each arbitration case in the knowledge base has been calculated, the decision of the most similar case can be selected as the recommended arbitration result for the target arbitration case; alternatively, the decisions of several of the most similar arbitration cases can be provided as recommended results.
With this device, the similarity between the case document and each case document in the knowledge base is calculated using the vector form of the documents, and the decisions of the cases in the knowledge base are used as recommendations for the case to be decided, so that interference from human factors and the like can be effectively avoided and the objective legal basis of the decision can be reflected. The device has the advantages of high accuracy, strong robustness and simple application; it improves the efficiency of manual case decisions at arbitration institutions, reduces the labor intensity and time of manual processing, and reduces the cost of manual case decisions.
On the basis of the above embodiment, the preprocessing module is specifically configured to: segmenting words of the text information, filtering out meaningless words in the text information according to a preset stop word list, and obtaining the text information after word segmentation; and representing each word in the text information after word segmentation by using a vector in a vector space to obtain vector information corresponding to the text information.
Specifically, the text of the target arbitration case must first be segmented into words, because the word is the smallest meaningful unit of language that can be used independently. During segmentation, jieba or another word segmentation tool first separates the text into single characters, punctuation marks and phrases, and then modal particles and other meaningless words are filtered out against the stop word list.
After the words and their frequencies have been obtained from the segmentation step, low-frequency words and/or high-frequency words are deleted, so that only descriptive expressions closely related to the case are retained. A dictionary is then built from the retained words. The dictionary is a hash table in which each word or character corresponds to a unique index.
Before a case is vectorized, a word vector is computed for each word in the dictionary using a word vector model such as word2vec or BERT; the case is then vectorized.
The case text is represented mainly by a vector space model. The basic idea is to regard the case text as an n-dimensional vector (w1, w2, w3, …, wn) in a vector space, where wi is the weight of the i-th feature; in the present invention the weight is the TF-IDF coefficient. Every word used in the TF-IDF calculation should come from the dictionary constructed above.
On the basis of the above embodiment, the apparatus further includes: a knowledge base construction module, used for acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to obtain vector information corresponding to the arbitration cases; and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
Specifically, a large number of arbitration related cases are collected, each arbitration related case is subjected to word segmentation, each arbitration case is converted into vector representation according to the method, and then the vector representation is stored in a server by using a database storage unit, so that a knowledge base is constructed.
On the basis of the above embodiment, the recommendation module is specifically configured to: sort the cosine similarity information in descending order, and select the arbitration cases whose similarity meets a preset threshold as similar cases for recommendation.
Specifically, after the soft cosine similarity between the new case and each case in the knowledge base has been calculated, the cases can be sorted in descending order of similarity, so that the arbitration case with the highest similarity can be recommended as the similar arbitration case. Alternatively, a threshold can be set for the cosine similarity, and all arbitration cases that meet the preset threshold are recommended as similar cases for the reference of the relevant personnel.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and as shown in fig. 3, the electronic device includes: a processor (processor)310, a communication Interface (communication Interface)320, a memory (memory)330 and a bus 340, wherein the processor 310, the communication Interface 320 and the memory 330 complete communication with each other through the bus 340. The processor 310 may call logic instructions in the memory 330 to perform methods including, for example: acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case; carrying out similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base; and extracting cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as similar cases.
An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer can execute the method provided by the above method embodiments, for example, the method includes: acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case; carrying out similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base; and extracting cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as similar cases.
The present embodiments provide a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example, including: acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case; carrying out similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base; and extracting cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as similar cases.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and which includes instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for recommending similar arbitration cases, comprising:
acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case;
carrying out similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base;
and extracting cosine similarity information meeting preset conditions, and recommending the arbitration cases in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as similar cases.
2. The method according to claim 1, wherein the step of obtaining the text information of the target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to obtain the vector information corresponding to the target arbitration case specifically comprises:
segmenting words of the text information, filtering out meaningless words in the text information according to a preset stop word list, and obtaining the text information after word segmentation;
and representing each word in the text information after word segmentation by using a vector in a vector space to obtain vector information corresponding to the text information.
3. The method of claim 1, further comprising: acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to acquire vector information corresponding to the plurality of arbitration cases;
and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
4. The method according to claim 1, wherein the step of extracting the cosine similarity information meeting the preset condition and recommending the arbitration case in the knowledge base corresponding to the cosine similarity information meeting the preset condition as the similar case specifically comprises:
and sorting the cosine similarity information in descending order, and selecting the arbitration cases whose cosine similarity meets a preset threshold as similar cases for recommendation.
5. A similar arbitration case recommendation device, comprising:
the preprocessing module is used for acquiring text information of a target arbitration case, preprocessing the text information, and vectorizing the preprocessed text information to acquire vector information corresponding to the target arbitration case;
the similarity matching module is used for performing similarity matching on the vector information and vector information corresponding to each arbitration case in a preset knowledge base to obtain soft cosine similarity information of the target arbitration case and each arbitration case in the knowledge base;
and the recommending module is used for extracting the cosine similarity information meeting the preset conditions and recommending the arbitration case in the knowledge base corresponding to the cosine similarity information meeting the preset conditions as a similar case.
6. The apparatus of claim 5, wherein the preprocessing module is specifically configured to:
segmenting words of the text information, filtering out meaningless words in the text information according to a preset stop word list, and obtaining the text information after word segmentation;
and representing each word in the text information after word segmentation by using a vector in a vector space to obtain vector information corresponding to the text information.
7. The apparatus of claim 5, further comprising:
the knowledge base construction module is used for acquiring text information of a plurality of arbitration cases, preprocessing the text information of the arbitration cases, and vectorizing the preprocessed text information to acquire vector information corresponding to the plurality of arbitration cases;
and constructing the knowledge base according to the vector information corresponding to the arbitration cases.
8. The apparatus of claim 5, wherein the recommendation module is specifically configured to:
and sorting the cosine similarity information in descending order, and selecting the arbitration cases whose cosine similarity meets a preset threshold as similar cases for recommendation.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program performs the steps of a similar arbitration case recommendation method as described in any one of claims 1 to 4.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of a similar arbitration case recommendation method according to any one of claims 1 to 4.
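The knowledge base construction of claim 3 and the ranking step of claim 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the `vectorize` callback, the default threshold of 0.8, and the reading of "meets a preset threshold" as "greater than or equal to" are all hypothetical. For brevity the default similarity is a plain cosine over fixed-length vectors; a soft cosine function could be passed in through the `similarity` parameter instead.

```python
import math


def cosine(u, v):
    """Plain cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def build_knowledge_base(cases, vectorize):
    """Map each case id to the vector produced from its text.

    `cases` is a dict of case id -> text; `vectorize` is any callable that
    preprocesses a text and returns its vector (both are assumptions).
    """
    return {case_id: vectorize(text) for case_id, text in cases.items()}


def recommend(target_vector, knowledge_base, similarity=cosine, threshold=0.8):
    """Return (case id, score) pairs at or above the threshold, best first."""
    scored = [
        (case_id, similarity(target_vector, vector))
        for case_id, vector in knowledge_base.items()
    ]
    # Sort by similarity in descending order, as in claim 4.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(case_id, score) for case_id, score in scored if score >= threshold]
```

Passing the similarity function as a parameter keeps the ranking logic independent of how the vectors are built, so the same `recommend` function works whether the knowledge base stores dense vectors or word-count representations.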
CN201911170945.7A 2019-11-26 2019-11-26 Similar arbitration case recommendation method and device Pending CN111144068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911170945.7A CN111144068A (en) 2019-11-26 2019-11-26 Similar arbitration case recommendation method and device

Publications (1)

Publication Number Publication Date
CN111144068A true CN111144068A (en) 2020-05-12

Family

ID=70516677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911170945.7A Pending CN111144068A (en) 2019-11-26 2019-11-26 Similar arbitration case recommendation method and device

Country Status (1)

Country Link
CN (1) CN111144068A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340630A (en) * 2020-05-15 2020-06-26 支付宝(杭州)信息技术有限公司 Resource transfer event processing method, device, equipment and medium
CN111680986A (en) * 2020-08-12 2020-09-18 北京擎盾信息科技有限公司 Method and device for identifying serial case
CN111708875A (en) * 2020-06-02 2020-09-25 北京北大软件工程股份有限公司 Administrative law enforcement class recommendation method based on punishment characteristics
CN113032544A (en) * 2021-05-19 2021-06-25 南京视察者智能科技有限公司 Case automatic processing method and device based on big data and terminal equipment
US20220230174A1 (en) * 2021-01-21 2022-07-21 Bank Of America Corporation System for analyzing and resolving disputed data records

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070244882A1 (en) * 2006-04-13 2007-10-18 Lg Electronics Inc. Document management system and method
CN103838789A (en) * 2012-11-27 2014-06-04 大连灵动科技发展有限公司 Text similarity computing method
CN106095737A (en) * 2016-06-07 2016-11-09 杭州凡闻科技有限公司 Documents Similarity computational methods and similar document the whole network retrieval tracking
CN108536677A (en) * 2018-04-09 2018-09-14 北京信息科技大学 A kind of patent text similarity calculating method
CN108628825A (en) * 2018-04-10 2018-10-09 平安科技(深圳)有限公司 Text message Similarity Match Method, device, computer equipment and storage medium
CN109684628A (en) * 2018-11-23 2019-04-26 武汉烽火众智数字技术有限责任公司 Case intelligently pushing method and system based on merit semantic analysis
CN109840532A (en) * 2017-11-24 2019-06-04 南京大学 A kind of law court's class case recommended method based on k-means
CN110390083A (en) * 2019-06-17 2019-10-29 平安科技(深圳)有限公司 Method for pushing, device, computer equipment and the storage medium of approximate case

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
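吕宾; 侯伟亮: "Typical case recommendation for court texts based on topic models" *
夏冰; 李宝安; 吕学强: "Patent text similarity calculation combining word position and semantic information" *
谷重阳; 徐浩煜; 周晗; 张俊杰: "Text similarity calculation based on lexical semantic information" *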

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340630A (en) * 2020-05-15 2020-06-26 支付宝(杭州)信息技术有限公司 Resource transfer event processing method, device, equipment and medium
CN111708875A (en) * 2020-06-02 2020-09-25 北京北大软件工程股份有限公司 Administrative law enforcement class recommendation method based on punishment characteristics
CN111708875B (en) * 2020-06-02 2023-11-03 北京北大软件工程股份有限公司 Administrative law enforcement case recommendation method based on punishment features
CN111680986A (en) * 2020-08-12 2020-09-18 北京擎盾信息科技有限公司 Method and device for identifying serial case
CN111680986B (en) * 2020-08-12 2020-12-08 北京擎盾信息科技有限公司 Method and device for identifying serial case
US20220230174A1 (en) * 2021-01-21 2022-07-21 Bank Of America Corporation System for analyzing and resolving disputed data records
CN113032544A (en) * 2021-05-19 2021-06-25 南京视察者智能科技有限公司 Case automatic processing method and device based on big data and terminal equipment
CN113032544B (en) * 2021-05-19 2021-08-20 南京视察者智能科技有限公司 Case automatic processing method and device based on big data and terminal equipment

Similar Documents

Publication Publication Date Title
CN110993081B (en) Doctor online recommendation method and system
CN111144068A (en) Similar arbitration case recommendation method and device
CN109783639B (en) Mediated case intelligent dispatching method and system based on feature extraction
CN109766428B (en) Data query method and equipment and data processing method
CN105426354B (en) The fusion method and device of a kind of vector
CN107506389B (en) Method and device for extracting job skill requirements
CN112667794A (en) Intelligent question-answer matching method and system based on twin network BERT model
CN110633577B (en) Text desensitization method and device
CN110503143B (en) Threshold selection method, device, storage medium and device based on intention recognition
CN112559684A (en) Keyword extraction and information retrieval method
CN105279147B (en) A kind of interpreter's contribution fast matching method
CN110046264A (en) A kind of automatic classification method towards mobile phone document
CN111046177A (en) Automatic arbitration case prejudging method and device
CN110928986B (en) Legal evidence ordering and recommending method, legal evidence ordering and recommending device, legal evidence ordering and recommending equipment and storage medium
CN116775879A (en) Fine tuning training method of large language model, contract risk review method and system
CN115248890B (en) User interest portrait generation method and device, electronic equipment and storage medium
CN113192028B (en) Quality evaluation method and device for face image, electronic equipment and storage medium
CN108073567B (en) Feature word extraction processing method, system and server
CN109871540B (en) Text similarity calculation method and related equipment
CN111625858A (en) Intelligent multi-mode data desensitization method and device in vertical field
CN116910599A (en) Data clustering method, system, electronic equipment and storage medium
CN115563515A (en) Text similarity detection method, device and equipment and storage medium
CN112328812B (en) Domain knowledge extraction method and system based on self-adjusting parameters and electronic equipment
CN114333461B (en) Automatic subjective question scoring method and system
CN113158074A (en) Resume post matching method, system and equipment based on multiple interactive dimensions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination