CN111008262A - Lawyer evaluation method and recommendation method based on knowledge graph - Google Patents

Publication number
CN111008262A
Authority
CN
China
Prior art keywords
case
lawyer
document
word
recommendation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911160895.4A
Other languages
Chinese (zh)
Other versions
CN111008262B (en)
Inventor
刘飞
陈文平
陈亿熙
黄伟民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201911160895.4A
Publication of CN111008262A
Application granted
Publication of CN111008262B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/3332: Query translation
    • G06F16/3335: Syntactic pre-processing, e.g. stopword elimination, stemming
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063: Operations research, analysis or management
    • G06Q10/0639: Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395: Quality analysis or management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10: Services
    • G06Q50/18: Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Educational Administration (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a lawyer evaluation method and a lawyer recommendation method based on a knowledge graph. The lawyer evaluation method comprises the following steps: S1, collecting publicly available judgment documents to form a judgment document library; S2, preprocessing the judgment documents to form an effective database; S3, constructing a knowledge graph of the judgment documents and identifying the knowledge graph elements of each judgment document; S4, obtaining a professional quality evaluation score for each lawyer through a machine learning model according to the knowledge graph of step S3; and S5, performing classified statistics by case type to obtain the case types in which each lawyer excels, and writing them into a database. The lawyer recommendation method, which builds on the lawyer evaluation method, comprises the following steps: A. acquiring the case type, case region and case description input by a user; B. finding the corresponding case library according to the case type and case region input by the user; C. finding similar cases and their lawyers, and returning them to the user side.

Description

Lawyer evaluation method and recommendation method based on knowledge graph
Technical Field
The invention belongs to the field of internet and data retrieval recommendation, and particularly relates to a lawyer evaluation method and a lawyer recommendation method based on a knowledge graph.
Background
In the internet era, many industries are developing rapidly with the help of "Internet Plus" and artificial intelligence algorithms. The legal field, however, is highly specialized, and ordinary users know relatively little about it, which makes them inefficient at dealing with legal problems. This is particularly true of vulnerable groups that need legal aid. Compared with other groups, vulnerable groups suffer from limited access to information and fewer available resources. When they encounter a legal problem, they cannot deal with it by drawing on personal connections or hiring expensive lawyers as others can. They can only apply for aid through a legal aid center, which consumes a great deal of manpower and effort and cannot guarantee that they will find a suitable and satisfactory lawyer.
At present, various internet lawyer recommendation platforms are available on the market. A user only needs to submit basic information about a case, such as the case type, to retrieve lawyers who meet the user's legal needs, which simplifies the process of seeking legal services to a certain extent. However, the functions of existing lawyer recommendation platforms are relatively simple. On the one hand, the lawyer recommendation algorithms of existing platforms are mostly based on a lawyer's historical win rate and lack a professional, objective lawyer evaluation system. For example, the lawyer scoring method described in Chinese patent CN109409645A relies mainly on the lawyer's win rate and some basic information, such as law firm information and the lawyer's personal information; it lacks a comprehensive professional evaluation of how the lawyer handled the case and therefore does not reflect the lawyer's professional level well. On the other hand, the prior art does not model the semantics of the whole case when computing case similarity. For example, Chinese patent CN107563912A only computes the cosine similarity between the user's case and each case in the case library and then obtains a similarity score with the KNN method; it does not model the case as a whole and cannot extract the semantic information the case expresses, which degrades lawyer recommendation.
Disclosure of Invention
In order to solve the problems of existing methods, the invention aims to provide a lawyer evaluation method and a recommendation method based on a knowledge graph, which can objectively and accurately evaluate a lawyer's professional level and areas of expertise, accurately recommend suitable lawyers to a user, and effectively improve the user experience.
The invention is realized by at least one of the following technical solutions.
A lawyer evaluation method based on a knowledge graph comprises the following steps:
S1, collecting publicly available judgment documents to form a judgment document library;
S2, preprocessing the judgment documents in the judgment document library of step S1 and eliminating invalid data to form an effective database;
S3, constructing a knowledge graph for the judgment documents in the effective database of step S2, performing knowledge graph element recognition on each judgment document, and writing the extracted elements into a judgment document element database;
S4, evaluating the professional quality of the lawyers in each case according to the knowledge graph of step S3, obtaining a professional quality evaluation score for each lawyer through a machine learning model, and writing the scores into a lawyer evaluation database;
S5, classifying and aggregating each lawyer's professional quality evaluation scores from step S4 by case type to obtain the case types in which each lawyer excels, and writing them into a database;
S6, applying the database obtained in step S5 to the legal aid lawyer evaluation systems of legal aid centers at all levels, quantifying legal aid lawyer evaluation data, and improving the professionalism and objectivity of legal aid lawyer evaluation.
Preferably, the preprocessing comprises the following specific steps:
S201, checking the content integrity of the judgment documents in the judgment document library of step S1: checking whether each document was downloaded completely and whether its content includes lawyer information, the plaintiff's claims, the plaintiff's evidence, the facts found by the court and the court's judgment, and removing incompletely downloaded or incomplete documents from the library to form a preliminary preprocessed case library;
S202, dividing each case in the preliminary preprocessed case library of step S201 into five text sections: lawyer information, the plaintiff's claims, the plaintiff's evidence, the facts found by the court and the court's judgment;
S203, extracting the corresponding element information from the five text sections of step S202, the element information comprising lawyer information, pre-trial representation, courtroom performance and case outcome, to form an effective database.
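The completeness check of step S201 can be sketched as follows. This is a minimal illustration, assuming each downloaded document has already been parsed into a dictionary keyed by section; the section keys and the function names `is_complete` and `build_preliminary_library` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical keys for the five sections named in S201/S202.
# The patent lists the sections but not how they are marked in the raw text.
REQUIRED_SECTIONS = [
    "lawyer_info",
    "plaintiff_claims",
    "plaintiff_evidence",
    "court_facts",
    "court_judgment",
]


def is_complete(doc: Dict[str, str]) -> bool:
    """Return True only if all five sections are present and non-empty."""
    return all(doc.get(name, "").strip() for name in REQUIRED_SECTIONS)


def build_preliminary_library(docs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop documents that are incompletely downloaded or lack a section."""
    return [doc for doc in docs if is_complete(doc)]
```

Documents that pass this filter would then go on to the element extraction of step S203.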
Preferably, step S3 comprises the following steps:
S301, constructing a knowledge graph for the judgment documents in the effective database of step S2, wherein the knowledge graph comprises a case entity, a lawyer entity, a pre-trial representation entity, a courtroom performance entity and a case outcome entity; the case entity has three attributes: the location of the case, the name of the court and the case number; the lawyer entity has three attributes: the lawyer's name, law firm information and the lawyer's location; the pre-trial representation entity has three attributes: the number of items of evidence submitted by the plaintiff, the cross-examination of evidence and the number of litigation claims; the courtroom performance entity has three attributes: the lawyer's court attendance, factual assertions and liability assertions; the case outcome entity has five attributes: fact finding, admission of evidence, determination of compensation items, determination of compensation liability and the degree of the court's support for the plaintiff's claims;
S302, sampling the judgment documents according to the knowledge graph produced in step S301, establishing a regular-expression rule base for the knowledge graph elements, assigning priorities to the syntactic rules of the same element, matching the rules against the judgment documents in priority order, and writing the matching results into a database;
S303, sampling the judgment documents according to the knowledge graph produced in step S301 and constructing a training data set, then training a bidirectional long short-term memory (Bi-LSTM) classification model on the two attributes of liability assertion and determination of compensation liability, so as to obtain more accurate element information that further verifies the element matching results of step S302, and writing the more accurate element results back into the judgment document element database.
Preferably, step S4 comprises the following steps:
S401, sampling 50% of the documents from the case library of step S201, manually scoring the lawyers in each case according to the content of the judgment document, removing the highest and lowest scores, and taking the mean of the remaining scores as the prior professional quality evaluation score of the lawyer in that case;
S402, converting the more accurate element results of step S303 into a low-dimensional tensor through a word embedding model;
S403, taking the gradient boosting decision tree (GBDT) as the base algorithm and training a supervised model with the LightGBM (Light Gradient Boosting Machine) framework to automate lawyer professional quality evaluation, wherein the low-dimensional tensor obtained in step S402 is the model input, the lawyer professional quality evaluation score obtained in step S401 serves as prior knowledge, the mean squared error (MSE) is the model evaluation metric, and the optimal model parameters are selected through iteration and the model is saved;
S404, evaluating lawyer professional quality for the judgment documents with the model obtained in step S403, and writing the scores into a database.
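The prior score of step S401 is a trimmed mean: the highest and lowest manual scores are dropped before averaging. A minimal sketch follows; the function name `prior_score` and the requirement of at least three raters are assumptions made for illustration.

```python
from typing import Sequence


def prior_score(scores: Sequence[float]) -> float:
    """Mean of the manual scores after dropping one highest and one lowest."""
    if len(scores) < 3:
        # Assumption: with fewer than three raters nothing would remain.
        raise ValueError("need at least three scores to trim both extremes")
    trimmed = sorted(scores)[1:-1]
    return sum(trimmed) / len(trimmed)
```

For example, `prior_score([60, 70, 80, 90, 100])` averages 70, 80 and 90.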
Preferably, step S5 comprises the following steps:
S501, classifying each lawyer's case professional quality evaluation scores by case type;
S502, calculating the mean professional quality evaluation score of each lawyer for each case type;
S503, sorting the means obtained in step S502 in descending order, taking the top three case types as the case types in which the lawyer excels, and writing them into a database.
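Steps S501 to S503 amount to a group-by, a mean and a top-three selection. The sketch below assumes the scores are available as `(lawyer, case_type, score)` tuples; that record layout and the function name `top_case_types` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top_case_types(
    records: Iterable[Tuple[str, str, float]], k: int = 3
) -> Dict[str, List[str]]:
    """For each lawyer, return the k case types with the highest mean score."""
    grouped: Dict[str, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
    for lawyer, case_type, score in records:  # S501: classify by case type
        grouped[lawyer][case_type].append(score)

    result: Dict[str, List[str]] = {}
    for lawyer, by_type in grouped.items():
        means = {t: sum(s) / len(s) for t, s in by_type.items()}  # S502
        ranked = sorted(means, key=lambda t: means[t], reverse=True)  # S503
        result[lawyer] = ranked[:k]
    return result
```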
A lawyer recommendation method based on the above knowledge-graph-based lawyer evaluation method comprises the following steps:
A. acquiring the case type, case region and case description input by a user;
B. finding the corresponding case library according to the case type and case region input by the user;
C. extracting information from the case description input by the user, finding similar cases and their lawyers in the case library of step B, and forming a lawyer recommendation candidate set;
D. returning the lawyer recommendation candidate set obtained in step C to the legal aid lawyer recommendation system, so as to improve the accuracy and effectiveness of legal aid lawyer recommendation.
Preferably, in step B, the corresponding case library is found as follows:
B01. screening the cases that meet the requirements according to the case type input by the user to obtain a preliminary case library;
B02. comparing the region information of each case in the preliminary case library obtained in step B01 with the region input by the user, and screening the cases that meet the requirements to form the corresponding case library.
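Steps B01 and B02 are two successive filters. The sketch below assumes exact string matching on case type and region; the `Case` record and its field names are hypothetical, and a real system might match regions hierarchically (province, city, county).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Case:
    case_id: str
    case_type: str
    region: str
    lawyer: str


def find_case_library(cases: List[Case], case_type: str, region: str) -> List[Case]:
    """Filter by case type (B01), then by region (B02)."""
    preliminary = [c for c in cases if c.case_type == case_type]  # B01
    return [c for c in preliminary if c.region == region]  # B02
```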
Preferably, in step C, similar cases and their lawyers are found as follows:
C01. performing word segmentation on the text of the case description input by the user and filtering out stop words to obtain the effective keywords related to the case;
C02. one-hot encoding the effective keywords obtained in step C01 and inputting them into a word embedding model, which converts the high-dimensional sparse information into a low-dimensional tensor;
C03. computing, with a similarity algorithm, the similarity between the low-dimensional tensor obtained in step C02 and each case in the case library generated in step B02, one by one, and retaining the cases whose similarity exceeds a threshold together with their similarity values;
C04. taking the case similarity obtained in step C03 as a weight coefficient and the lawyer's professional quality score for the corresponding case as the variable to obtain a recommendation score; if the same lawyer has several similar cases, taking the arithmetic mean as the recommendation score;
C05. sorting lawyers in descending order of the recommendation scores obtained in step C04 to form the lawyer recommendation candidate set, wherein the information for each lawyer in the candidate set includes the lawyer's name, the lawyer's location, the recommendation score and the similar cases.
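The keyword filtering of step C01 and the one-hot encoding that precedes the embedding model in step C02 can be sketched as follows. The sketch assumes the description has already been segmented into tokens and that a fixed vocabulary is available; the function names and the choice to skip out-of-vocabulary words are assumptions, not details from the patent.

```python
from typing import Iterable, List, Set


def filter_keywords(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """C01: keep only tokens that are not stop words."""
    return [t for t in tokens if t not in stop_words]


def one_hot(keywords: Iterable[str], vocabulary: List[str]) -> List[List[int]]:
    """C02 (first half): one sparse vector per keyword found in the vocabulary."""
    index = {word: i for i, word in enumerate(vocabulary)}
    vectors = []
    for word in keywords:
        if word in index:  # assumption: out-of-vocabulary words are skipped
            vec = [0] * len(vocabulary)
            vec[index[word]] = 1
            vectors.append(vec)
    return vectors
```

The embedding model would then map each of these sparse vectors to a dense low-dimensional vector.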
Preferably, the word embedding model in step C02 is a word2vec model trained with the skip-gram model and negative sampling, and the objective function is as follows:
Figure BDA0002286119690000041
where θ is the target parameter to be optimized, D_p is the positive sample set, d is a positive sample, D_n is the negative sample set, d′ is a randomly sampled negative sample, D_{m,n} is the set of negative samples of the same case type as the positive sample, and d_{m,n} is a negative sample of the same case type as the positive sample; v_d is the word vector of the positive sample, v_{d′} is the word vector of the randomly sampled negative sample, and v_{d_{m,n}} is the word vector of the negative sample of the same case type as the positive sample.
Preferably, the similarity algorithm of step C03 is specifically as follows:
The one-hot encoded case tensor is normalized to obtain the tensor x, referred to as document one; the normalized tensor of any document in the case library is denoted x′, referred to as document two. A sparse transfer matrix T ∈ R^{n×n} is defined, where R denotes the real numbers and n is the number of words in the two documents; c(i, j) denotes the cost of moving the i-th word of document one to the j-th word of document two. According to the Word Mover's Distance (WMD), the similarity sim between cases is expressed as:
Figure BDA0002286119690000051
where i denotes the i-th word in document one and x_i is the word vector of the i-th word in document one; j denotes the j-th word in document two and x′_j is the word vector of the j-th word in document two; T_{ij} denotes the distance by which the i-th word in document one must be moved to the j-th word in document two,
Figure BDA0002286119690000052
Figure BDA0002286119690000053
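The formula images above are not reproduced in this text. For orientation only, the standard Word Mover's Distance optimization problem from the general literature is written out below; it is not a transcription of the patent's figures, and the patent's sim may differ in form or notation. The weights d_i and d′_j (the normalized frequency of each word in document one and document two) are symbols introduced here for illustration.

```latex
\begin{aligned}
\mathrm{WMD}(d, d') &= \min_{T \ge 0} \; \sum_{i=1}^{n} \sum_{j=1}^{n} T_{ij}\, c(i, j),
\qquad c(i, j) = \lVert x_i - x'_j \rVert_2 \\
\text{subject to} \quad & \sum_{j=1}^{n} T_{ij} = d_i \quad \forall i \in \{1, \dots, n\}, \\
& \sum_{i=1}^{n} T_{ij} = d'_j \quad \forall j \in \{1, \dots, n\}.
\end{aligned}
```

In this standard form the problem is a linear program, so a smaller optimal value means the two documents are closer.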
Preferably, the arithmetic mean of step C04 is calculated as follows:
Figure BDA0002286119690000054
In the above formula, m denotes the number of similar cases handled by the lawyer, S_{pi} is the lawyer's professional quality score for the i-th such case in the case library, and S_r is the lawyer's recommendation score for the current query.
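Steps C03 to C05 can be sketched as follows. The formula image is not reproduced here, so the sketch assumes the weighting is multiplicative, that is, each similar case contributes similarity × professional score and a lawyer's recommendation score is the arithmetic mean of those products. The input layout `(lawyer, similarity, professional_score)` and the function name `recommend` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def recommend(
    matches: Iterable[Tuple[str, float, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Rank lawyers by the mean of similarity-weighted professional scores."""
    weighted: Dict[str, List[float]] = defaultdict(list)
    for lawyer, similarity, professional_score in matches:
        if similarity > threshold:  # C03: keep only sufficiently similar cases
            # C04 (assumed form): similarity acts as a multiplicative weight.
            weighted[lawyer].append(similarity * professional_score)

    ranked = [(lawyer, sum(v) / len(v)) for lawyer, v in weighted.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)  # C05: descending order
    return ranked
```

A lawyer whose cases all fall at or below the threshold does not appear in the candidate set.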
The beneficial effects of the invention are as follows:
1) By processing the data in judgment documents, the invention effectively evaluates each lawyer's professionalism and areas of expertise and identifies lawyers with strong professional ability in different fields. Entities and their attributes are quantified according to the knowledge graph, and lawyers' professional evaluation scores are obtained through a machine learning model, so that each lawyer's professional level can be accurately evaluated from massive high-dimensional data;
2) In lawyer recommendation, the invention applies a word embedding model to the case description so that word vectors carry semantic information. During model training, negative samples come not only from random sampling but also from cases of the same type. Compared with purely random negative samples, the resulting word vectors have richer features and distinguish both between cases of the same type and between cases of different types, which facilitates similarity calculation;
3) The similarity calculation uses the WMD algorithm. Compared with existing methods such as cosine similarity and edit distance, it makes better use of the transferability of word2vec; moreover, the problem is converted into a linear program, which has a globally optimal solution. The method therefore computes the similarity between texts more accurately and further improves recommendation;
4) The method and system recommend suitable lawyers according to the user's case description, so that lawyers handle legal cases within their own areas of expertise; this improves the user experience, gives high recommendation accuracy, and makes the method suitable for widespread use;
5) The system can be applied to the legal aid lawyer evaluation and recommendation systems of the judicial department and of legal aid centers at all levels. It can strongly promote objective and fair evaluation of lawyers' professional ability, greatly improve the precision of legal aid lawyer recommendation, and improve the efficiency and accuracy of legal aid centers. As China pays increasing attention to building its legal aid system, the invention has significant application value and market potential, can effectively reduce social conflict, and can promote judicial fairness and justice.
Drawings
Fig. 1 is a flowchart of the lawyer evaluation method based on a knowledge graph according to the present embodiment;
Fig. 2 is a flowchart of the lawyer recommendation method based on a knowledge graph according to the present embodiment.
Detailed Description
The invention is further explained below with reference to the drawings and the specific embodiments:
As shown in Fig. 1, a lawyer evaluation method based on a knowledge graph comprises the following steps:
S1, collecting publicly available judgment documents to form a judgment document library;
In this embodiment, the data in step S1 comes from a public judgment document website, which effectively guarantees the objectivity and authenticity of the data.
S2, preprocessing the documents in the judgment document library of step S1 and removing invalid data to form an effective database;
In this embodiment, the preprocessing of step S2 comprises the following specific steps:
S201, checking the completeness of the cases in the case library of step S1: checking whether each document was downloaded completely and whether its content includes lawyer information, the plaintiff's claims, the plaintiff's evidence, the facts found by the court and the court's judgment, and eliminating incompletely downloaded or incomplete documents to form a preliminary preprocessed case library;
S202, dividing each case in the case library of step S201 into five text sections: lawyer information, the plaintiff's claims, the plaintiff's evidence, the facts found by the court and the court's judgment;
S203, extracting the corresponding element information from the five text sections of step S202 to form four parts, namely lawyer information, pre-trial representation, courtroom performance and case outcome, which together form a direct and meaningful effective database.
S3, designing a knowledge graph of judgment documents for the effective judgment document database of step S2, performing knowledge graph element recognition on each judgment document, and writing the extracted knowledge graph elements into the database;
In this embodiment, step S3 comprises the following specific steps:
S301, according to the effective database obtained in step S2, the judgment document knowledge graph comprises five entities in total: a case entity, a lawyer entity, a pre-trial representation entity, a courtroom performance entity and a case outcome entity. The case entity has three attributes: the location of the case, the name of the court and the case number; the lawyer entity has three attributes: the lawyer's name, law firm information and the lawyer's location; the pre-trial representation entity has three attributes: the number of items of evidence submitted by the plaintiff, the cross-examination of evidence and the number of litigation claims; the courtroom performance entity has three attributes: the lawyer's court attendance, factual assertions and liability assertions; the case outcome entity has five attributes: fact finding, admission of evidence, determination of compensation items, determination of compensation liability and the degree of the court's support for the plaintiff's claims;
S302, according to the knowledge graph produced in step S301, the judgment documents are sampled, a regular-expression rule base for the knowledge graph elements is established, and the syntactic rules of the same element are prioritized. For example, to extract the lawyer name element, the sentence pattern "entrusted litigation agent: XXX" is used, and the characters following "entrusted litigation agent:" are extracted as the lawyer's name. The rules are matched against the judgment documents in priority order, and the matching results for each document are written into a database;
S303, sampling the judgment documents according to the knowledge graph produced in step S301 and constructing a training data set, then segmenting the data set with the jieba word segmentation tool and removing stop words. The processed data set is encoded with the word2vec word vector tool open-sourced by Google in 2013, and a Bi-LSTM (bidirectional long short-term memory) classification model is trained on the two attributes of liability assertion and determination of compensation liability. The more accurate element information identified in this way further verifies the element results of step S302, and the more accurate element results are written back into the database.
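The prioritized rule matching of step S302 can be sketched with Python's `re` module. The patterns below are illustrative stand-ins written against the English rendering of the example ("entrusted litigation agent: XXX"); the actual rule base, its patterns and its priorities are not given in the text, and the names `RULES` and `extract` are hypothetical.

```python
import re
from typing import Dict, List, Optional, Pattern, Tuple

# Hypothetical rule base: element name -> list of (priority, pattern).
# A lower number means a higher priority.
RULES: Dict[str, List[Tuple[int, Pattern[str]]]] = {
    "lawyer_name": [
        (1, re.compile(r"entrusted litigation agent:\s*([^,;.]+)")),
        (2, re.compile(r"attorney:\s*([^,;.]+)")),
    ],
}


def extract(element: str, text: str) -> Optional[str]:
    """Try each rule for the element in priority order; return the first match."""
    for _, pattern in sorted(RULES[element], key=lambda rule: rule[0]):
        match = pattern.search(text)
        if match:
            return match.group(1).strip()
    return None
```

Elements for which no rule fires return `None`, which is where a learned classifier such as the Bi-LSTM of step S303 could take over.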
S4, constructing a feature list for lawyer professional quality evaluation according to the knowledge graph of step S3, obtaining a professional quality evaluation score for each lawyer through a machine learning model, and writing the case lawyer evaluation information into a database; the specific steps are as follows:
S401, sampling 50% of the preprocessed data from the case library of step S201; several professionals score the professional quality of the lawyers in each case (out of 100) based on the content of the judgment document, and after the highest and lowest scores are removed, the mean score is taken as the prior professional quality evaluation score of the lawyer in that case;
S402, segmenting the knowledge graph elements obtained in step S302 with the jieba word segmentation tool and removing stop words; the processed data is encoded with a word2vec model, converting the high-dimensional sparse vector into a low-dimensional dense tensor;
S403, taking GBDT (gradient boosting decision tree) as the base algorithm and training a supervised model with the LightGBM (Light Gradient Boosting Machine) framework to automate lawyer professional quality evaluation. A regression task is constructed with the low-dimensional tensor obtained in step S402 as the model input and the lawyer professional quality evaluation score obtained in step S401 as prior knowledge, and the model is trained and optimized. A 10-fold cross-validation experiment is run, and the model parameters with the best validation performance are selected and saved;
S404, evaluating lawyers on the processed data with the model obtained in step S403, and writing the evaluation results into a database.
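The k-fold cross-validation with MSE described in step S403 can be sketched in plain Python. To stay self-contained, the sketch uses a trivial mean predictor as a stand-in for the LightGBM regressor; the `fit` callback interface, the interleaved fold assignment and all function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], float]


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def cross_validate(features, targets, fit, k: int = 10) -> float:
    """Average held-out MSE over k folds; `fit` returns a predictor."""
    n = len(targets)
    scores = []
    for held_out in k_fold_indices(n, k):
        held = set(held_out)
        train_x = [features[i] for i in range(n) if i not in held]
        train_y = [targets[i] for i in range(n) if i not in held]
        predict = fit(train_x, train_y)
        preds = [predict(features[i]) for i in held_out]
        scores.append(mse([targets[i] for i in held_out], preds))
    return sum(scores) / len(scores)


def fit_mean(train_x, train_y) -> Predictor:
    """Stand-in model (not LightGBM): always predicts the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean
```

In a parameter search, `cross_validate` would be called once per candidate parameter set and the set with the lowest average MSE kept.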
S5, classifying and aggregating each lawyer's professional quality evaluation scores from step S4 by case type to obtain the case types in which each lawyer excels, and writing them into a database; the specific steps are as follows:
S501, classifying each lawyer's case professional quality evaluation scores by case type, where case types are divided according to third-level causes of action;
S502, calculating the mean professional quality evaluation score of each lawyer for each case type;
S503, sorting the means obtained in step S502 in descending order, taking the top three case types as the case types in which the lawyer excels, and writing them into a database.
S6, applying the database obtained in step S5 to the legal aid lawyer evaluation systems of legal aid centers at all levels, quantifying legal aid lawyer evaluation data, and improving the professionalism and objectivity of legal aid lawyer evaluation.
The lawyer recommendation method for the knowledge-graph-based lawyer assessment method as shown in fig. 2 can know the professional level and the skilled field of the lawyer known in the effective database based on the published judge document, and the lawyer recommendation method in the invention is the continuation and development of information retrieval. Lawyer recommendation firstly needs to judge the type of a case of a user and the area where the user is located so as to divide a matched case subset; then, case description participles input by a user are converted into word vectors after stop words are removed, case description of each case in the case subset is also converted into word vectors, the similarity between the case description vectors of the user and the word vectors of each case in the case subset is calculated, then the similarity is used as the weight of case evaluation scores of lawyers to obtain the most total ranking score, and the optimal lawyers can be recommended after descending ranking, and the method specifically comprises the following steps:
A. Acquiring, through the interface, the case type, case region and case description entered by the user. Case types are divided according to the three-level causes of action; the case region is divided into three administrative levels, namely province, city and county (district).
B. Finding the corresponding case library according to the case type and case region entered by the user, with the following specific implementation steps:
B01. Screening the cases that match the case type entered by the user to obtain a preliminary case library;
B02. Within the preliminary case library obtained in step B01, comparing each case's region information with the region entered by the user, and screening the matching cases to form the corresponding case library;
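Steps B01 and B02 can be sketched as two successive filters. This is only an illustration: the `Case` record, its field names and the prefix rule for regions (a user who gives only a province matches every city and county in it) are assumptions, since the text does not say how partial regions are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    # Hypothetical record layout; the patent does not define one.
    case_id: str
    case_type: str   # cause of action the case is filed under
    region: tuple    # (province, city, county)
    lawyer: str


def filter_case_library(cases, case_type, region):
    """Return the cases matching the user's case type and region."""
    # B01: keep cases of the requested type (preliminary case library).
    preliminary = [c for c in cases if c.case_type == case_type]
    # B02: keep cases whose region starts with the levels the user supplied.
    wanted = tuple(region)
    return [c for c in preliminary if c.region[:len(wanted)] == wanted]
```

Filtering by type first keeps the region comparison confined to the smaller preliminary library, which mirrors the order of the steps.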
C. Extracting information from the case description entered by the user, finding similar cases and their lawyers in the case library of step B, and returning them to the user side, with the following specific steps:
C01. Performing word segmentation on the text of the case description entered by the user and filtering out stop words to obtain the effective keywords related to the case, where jieba is used as the word segmentation tool;
C02. One-hot encoding the effective keywords obtained in step C01 and feeding them into a word embedding model, which converts the high-dimensional sparse information into a low-dimensional tensor.
In this embodiment, the word embedding model of step C02 is as follows:
The word embedding model is an improved version of the word2vec model open-sourced by Google in 2013. On top of the original version, negative-sample information from the same case type is added to the objective function, so that the word vectors carry information that distinguishes cases within a type. Training uses skip-gram with negative sampling, and the objective function is
Figure BDA0002286119690000091
where θ is the target parameter to be optimized; D_p is the positive sample set and d is a positive sample; D_n is the negative sample set and d' is a randomly sampled negative sample; D_{m,n} is the set of negative samples of the same case type as the positive sample and d_{m,n} is one such negative sample; v_d is the word vector of the positive sample, v_{d'} is the word vector of the randomly sampled negative sample, and v_{d_{m,n}} (Figure BDA0002286119690000095) is the word vector of the negative sample of the same case type as the positive sample.
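The objective function survives here only as an image reference, so its exact form cannot be read from this text. As a rough illustration of the idea described above, the sketch below evaluates the standard skip-gram negative-sampling objective for one training pair and adds a third term for negatives drawn from the same case type. The function names, the `center` vector and the equal weighting of the two negative terms are assumptions, not the patent's formula.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sgns_objective(center, v_pos, v_rand_negs, v_same_type_negs):
    """Objective (to be maximised) for one (center, positive) pair.

    center           : vector of the center word (hypothetical name)
    v_pos            : v_d, vector of the positive sample
    v_rand_negs      : v_{d'}, vectors of randomly sampled negatives
    v_same_type_negs : v_{d_{m,n}}, vectors of negatives of the same case type
    """
    total = log_sigmoid(dot(center, v_pos))
    for v in v_rand_negs:
        total += log_sigmoid(-dot(center, v))
    # Extra term assumed here: push away negatives from the same case type,
    # so that vectors separate cases within a type.
    for v in v_same_type_negs:
        total += log_sigmoid(-dot(center, v))
    return total
```

Every term is a log-probability, so the value is never positive; training would adjust the vectors to move it toward zero.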
C03. Calculating, one by one, the similarity between the low-dimensional tensor obtained in step C02 and each case in the case library generated in step B02, and retaining the cases whose similarity exceeds the threshold together with their similarity values.
In this embodiment, the similarity calculation method of step C03 is as follows:
The case tensor obtained after one-hot encoding is normalized to give a tensor x (document one), and the normalized tensor of any document in the case library is denoted x' (document two). A sparse transfer matrix T ∈ R^{n×n} is defined (R denotes the real numbers and n the number of words in the two documents), and c(i, j) denotes the cost of moving the i-th word of document one to the j-th word of document two. According to WMD (Word Mover's Distance), the similarity sim between cases is expressed as
Figure BDA0002286119690000092
where i denotes the i-th word in document one and x_i is the word vector of the i-th word in document one; j denotes the j-th word in document two and x'_j is the word vector of the j-th word in document two; T_ij denotes the distance by which the i-th word in document one must be moved to the j-th word in document two,
Figure BDA0002286119690000093
Figure BDA0002286119690000094
Considering recommendation performance in practice, in this embodiment the WMD (Word Mover's Distance) is computed with the optimized RWMD (Relaxed Word Mover's Distance) algorithm, namely the open-source FastWMD (fast Word Mover's Distance) implementation.
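A minimal sketch of the relaxed bound follows, assuming Euclidean distance between word vectors as the cost c(i, j) and word weights that sum to one in each document. It follows the generally published RWMD relaxation, not the FastWMD code the embodiment refers to, and the mapping from distance to similarity at the end is a hypothetical choice, since the patent's own formula is not reproduced in this text.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_relaxation(weights_a, vecs_a, vecs_b):
    # Drop one of the two flow constraints: every word in A sends all of
    # its weight to its nearest word in B.
    return sum(
        w * min(euclidean(va, vb) for vb in vecs_b)
        for w, va in zip(weights_a, vecs_a)
    )


def rwmd(weights_a, vecs_a, weights_b, vecs_b):
    """Relaxed Word Mover's Distance: the tighter of the two one-sided bounds."""
    return max(
        one_sided_relaxation(weights_a, vecs_a, vecs_b),
        one_sided_relaxation(weights_b, vecs_b, vecs_a),
    )


def similarity(distance):
    # Assumed mapping: distance 0 gives similarity 1, larger distances
    # approach 0.
    return 1.0 / (1.0 + distance)
```

Each one-sided relaxation is a lower bound on the full WMD, so taking the maximum of the two gives the tighter bound while avoiding the transport optimisation over T.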
C04. Calculating a recommendation score by using the case similarity obtained in step C03 as the weight coefficient and the lawyer's professional quality score for the corresponding case as the variable; if the same lawyer has several similar cases, the arithmetic mean is taken as the recommendation score;
In this embodiment, the arithmetic mean of step C04 is calculated as follows:
Let the professional quality score of the lawyer for a case in the case library be S_pi; then the lawyer's score for this recommendation is
Figure BDA0002286119690000101
In the above formula, m denotes the number of similar cases handled by the lawyer and S_r is the final recommendation score. The lawyers to be recommended are sorted in descending order of S_r, and the lawyer information and recommendation scores are sent to the client for the user's reference and selection.
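Read together with step C04, the score appears to be the mean over a lawyer's m similar cases of similarity times professional quality score. The sketch below implements that reading; because the formula itself is only an image reference in this text, the exact form is an assumption, and the function name `recommend` and the tuple layout are illustrative.

```python
from collections import defaultdict


def recommend(similar_cases):
    """Rank lawyers by recommendation score, best first.

    similar_cases: iterable of (lawyer, similarity, professional_score),
    one entry per retained similar case.
    Returns a list of (lawyer, S_r) pairs sorted by S_r in descending order.
    """
    weighted = defaultdict(list)
    for lawyer, sim, s_p in similar_cases:
        # Similarity acts as the weight of the per-case score S_pi.
        weighted[lawyer].append(sim * s_p)

    # Arithmetic mean over the lawyer's m similar cases.
    scores = {lawyer: sum(v) / len(v) for lawyer, v in weighted.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Averaging rather than summing keeps a lawyer with many mediocre matches from outranking one with a single strong match.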
C05. Sorting in descending order of the recommendation scores obtained in step C04 to give the recommendation order of lawyers, and returning the lawyer information to the user side, the lawyer information comprising the lawyer's name, law firm, recommendation score and the similar cases.
D. Returning the lawyer recommendation candidate set obtained in step C to the legal aid lawyer recommendation system, so as to improve the accuracy and effectiveness of legal aid lawyer recommendation.
The above embodiment represents only one implementation of the present invention. Although it is described specifically and in detail, it is not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A lawyer evaluation method based on a knowledge graph, characterized by comprising the following steps:
S1, collecting publicly available referee documents to form a referee document library;
S2, preprocessing the referee documents in the referee document library of step S1 and removing invalid data to form an effective database;
S3, for the referee documents in the effective database of step S2, constructing a knowledge graph of referee documents, performing knowledge graph element recognition on each referee document, and writing the extracted elements into a referee document element database;
S4, evaluating the professional quality of the lawyers in each case according to the knowledge graph of step S3, obtaining each lawyer's professional quality evaluation score through a machine learning model, and writing the lawyer scores into a lawyer evaluation database;
S5, classifying and aggregating each lawyer's professional quality evaluation scores from step S4 by case type to obtain the case types each lawyer is adept at, and writing these data into a database;
S6, applying the database obtained in step S5 to the legal aid evaluation systems of legal aid centers at all levels, quantifying the legal aid evaluation data, and improving the professionalism and objectivity of legal aid evaluation.
2. The lawyer evaluation method based on a knowledge graph according to claim 1, characterized in that the preprocessing of step S2 comprises the following specific steps:
S201, checking the content integrity of the referee documents in the referee document library of step S1, namely whether each document was downloaded completely and whether its content includes the lawyer information, the plaintiff's claims, the plaintiff's evidence, the facts found by the court and the court's judgment result, and removing incompletely downloaded documents and documents with missing content from the referee document library to form a primary preprocessed case library;
S202, dividing each case in the primary preprocessed case library of step S201 into five text sections: lawyer information, plaintiff's claims, plaintiff's evidence, facts found by the court, and court judgment result;
S203, extracting the corresponding element information from the five text sections of step S202, the element information comprising lawyer information, pre-court agent, court trial performance and case handling result, to form the effective database.
3. The lawyer evaluation method based on a knowledge graph according to claim 1, characterized in that step S3 comprises the following specific steps:
S301, for the referee documents in the effective database of step S2, constructing a knowledge graph of referee documents, the knowledge graph comprising a case entity, a lawyer entity, a pre-court agent entity, a court trial performance entity and a case handling result entity, wherein the case entity comprises three attributes: the location of the case, the name of the court and the case number; the lawyer entity comprises three attributes: lawyer name, law firm information and lawyer location information; the pre-court agent entity comprises three attributes: the number of items of plaintiff's evidence, the cross-examination of evidence, and the number of litigation claims; the court trial performance entity comprises three attributes: the lawyer's court appearance, fact claim and responsibility claim; the case handling result entity comprises five attributes: fact determination, evidence admission, determination of compensation items, determination of compensation liability, and the degree to which the court supported the plaintiff's claims;
S302, sampling the referee documents according to the knowledge graph produced in step S301, establishing a regular-expression rule base for the elements of the knowledge graph, setting priorities for the syntactic rules of the same element, performing rule matching on the referee documents in order of priority, and writing the matching results into a database;
S303, sampling the referee documents according to the knowledge graph produced in step S301 and constructing a training data set, training a bidirectional long short-term memory (Bi-LSTM) classification model on the two attributes of responsibility claim and determination of compensation liability, obtaining more accurate element information that serves as further verification of the element matching results of step S302, and writing the more accurate element results back into the referee document element database.
4. The lawyer evaluation method based on a knowledge graph according to claim 1, characterized in that step S4 comprises the following specific steps:
S401, sampling 50% of the documents from the case library of step S201, having the lawyers of each case scored manually according to the content of the referee documents, removing the highest and lowest scores, and taking the mean of the remaining scores as the prior professional quality evaluation score of the lawyer in that case;
S402, converting the more accurate element results of step S303 into a low-dimensional tensor through a word embedding model;
S403, using the gradient boosting decision tree (GBDT) as the base algorithm and training a supervised model with the LightGBM (Light Gradient Boosting Machine) framework to automate lawyer professional quality evaluation, wherein the low-dimensional tensor obtained in step S402 is the model input, the lawyer professional quality evaluation scores obtained in step S401 serve as prior knowledge, the mean squared error (MSE) is used as the model evaluation metric, the optimal model parameters are selected through iteration, and the model is saved;
S404, evaluating the lawyers' professional quality for the referee documents with the model obtained in step S403, and writing the scores into a database.
5. The lawyer evaluation method based on a knowledge graph according to claim 1, characterized in that step S5 comprises the following specific steps:
S501, classifying each lawyer's per-case professional quality evaluation scores by case type;
S502, calculating, for each lawyer, the mean professional quality evaluation score in each case type;
S503, sorting the means obtained in step S502 in descending order, taking the top three case types as the lawyer's adept case types, and writing them into a database.
6. A lawyer recommendation method based on the knowledge-graph-based lawyer evaluation method according to claim 1, characterized by comprising the following steps:
A. acquiring the case type, case region and case description entered by the user;
B. finding the corresponding case library according to the case type and case region entered by the user;
C. extracting information from the case description entered by the user, finding similar cases and their lawyers in the case library of step B, and forming a lawyer recommendation candidate set;
D. returning the lawyer recommendation candidate set obtained in step C to the legal aid lawyer recommendation system, so as to improve the accuracy and effectiveness of legal aid lawyer recommendation.
7. The lawyer recommendation method according to claim 6, characterized in that finding the corresponding case library in step B comprises the following specific steps:
B01. screening the cases that match the case type entered by the user to obtain a preliminary case library;
B02. within the preliminary case library obtained in step B01, comparing each case's region information with the region entered by the user, and screening the matching cases to form the corresponding case library.
8. The lawyer recommendation method according to claim 6, characterized in that finding similar cases and their lawyers in step C comprises the following specific steps:
C01. performing word segmentation on the text of the case description entered by the user and filtering out stop words to obtain the effective keywords related to the case;
C02. one-hot encoding the effective keywords obtained in step C01 and feeding them into a word embedding model, which converts the high-dimensional sparse information into a low-dimensional tensor;
C03. calculating, one by one with a similarity calculation method, the similarity between the low-dimensional tensor obtained in step C02 and each case in the case library generated in step B02, and retaining the cases whose similarity exceeds the threshold together with their similarity values;
C04. obtaining a recommendation score by using the case similarity obtained in step C03 as the weight coefficient and the lawyer's professional quality score for the corresponding case as the variable; if the same lawyer has several similar cases, the arithmetic mean is taken as the recommendation score;
C05. sorting in descending order of the recommendation scores obtained in step C04 to give the recommendation order of lawyers and form the lawyer recommendation candidate set, the information on each lawyer in the candidate set comprising the lawyer's name, law firm, recommendation score and the similar cases.
9. The lawyer recommendation method according to claim 8, characterized in that the word embedding model of step C02 is a word2vec model trained with the skip-gram model and negative sampling, and the objective function is:
Figure FDA0002286119680000031
where θ is the target parameter to be optimized; D_p is the positive sample set and d is a positive sample; D_n is the negative sample set and d' is a randomly sampled negative sample; D_{m,n} is the set of negative samples of the same case type as the positive sample and d_{m,n} is one such negative sample; v_d is the word vector of the positive sample, v_{d'} is the word vector of the randomly sampled negative sample, and v_{d_{m,n}} (Figure FDA0002286119680000041) is the word vector of the negative sample of the same case type as the positive sample.
10. The lawyer recommendation method according to claim 8, characterized in that the similarity calculation method of step C03 is as follows:
normalizing the case tensor obtained after one-hot encoding to obtain a tensor x, which is document one, and denoting the normalized tensor of any document in the case library as x', which is document two; defining a sparse transfer matrix T ∈ R^{n×n}, where R denotes the real numbers and n the number of words in the two documents, and c(i, j) denotes the cost of moving the i-th word of document one to the j-th word of document two; according to WMD (Word Mover's Distance), the similarity sim between cases is expressed as:
Figure FDA0002286119680000042
where i denotes the i-th word in document one and x_i is the word vector of the i-th word in document one; j denotes the j-th word in document two and x'_j is the word vector of the j-th word in document two; T_ij denotes the distance by which the i-th word in document one must be moved to the j-th word in document two,
Figure FDA0002286119680000043
Figure FDA0002286119680000044
the arithmetic mean of step C04 is calculated as follows:
Figure FDA0002286119680000045
where m denotes the number of similar cases handled by the lawyer, S_pi is the professional quality score of the lawyer for the corresponding case in the case library, and S_r is the lawyer's score for this recommendation.
CN201911160895.4A 2019-11-24 2019-11-24 Lawyer evaluation method and recommendation method based on knowledge graph Active CN111008262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911160895.4A CN111008262B (en) 2019-11-24 2019-11-24 Lawyer evaluation method and recommendation method based on knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911160895.4A CN111008262B (en) 2019-11-24 2019-11-24 Lawyer evaluation method and recommendation method based on knowledge graph

Publications (2)

Publication Number Publication Date
CN111008262A true CN111008262A (en) 2020-04-14
CN111008262B CN111008262B (en) 2023-04-28

Family

ID=70113672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911160895.4A Active CN111008262B (en) 2019-11-24 2019-11-24 Lawyer evaluation method and recommendation method based on knowledge graph

Country Status (1)

Country Link
CN (1) CN111008262B (en)

Also Published As

Publication number Publication date
CN111008262B (en) 2023-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant