CN107562907B

CN107562907B - Intelligent lawyer expert case response device

Info

Publication number: CN107562907B
Application number: CN201710809899.5A
Authority: CN
Inventors: 王雨; 商锦; 何亨
Original assignee: Wuhan University of Science and Engineering WUSE
Current assignee: Wuhan University of Science and Engineering WUSE
Priority date: 2017-09-11
Filing date: 2017-09-11
Publication date: 2020-10-02
Anticipated expiration: 2037-09-11
Also published as: CN107562907A

Abstract

The invention belongs to the technical field of expert systems and discloses an intelligent lawyer expert case response device comprising: an information acquisition module for acquiring case information; a corpus module comprising a corpus established from lawyers' historical case-processing information; a word segmentation module, connected to the information acquisition module, for segmenting the case information; a keyword extraction module, connected to the word segmentation module, which extracts keywords using the TF-IDF method; and a primary matching module, connected to the keyword extraction module and to the corpus module, which matches the case question against the cases in the corpus by cosine similarity based on the keywords and outputs the answers to the 3 questions with the highest cosine similarity. The invention provides an efficient intelligent lawyer expert case response device.

Description

Intelligent lawyer expert case response device

Technical Field

The invention relates to the technical field of expert systems, in particular to an intelligent lawyer expert case response device.

Background

With the development of the times, people's awareness of protecting their rights has greatly improved. People encounter all kinds of problems in life, and many of them can only be solved by seeking legal help. When seeking legal help, the first thought is to consult a lawyer. In reality, however, there are many lawyers, each specializing in a different field; people have few opportunities to come into contact with lawyers, cannot judge a lawyer's quality, and cannot know whether a given lawyer is suitable for their own case, all of which makes solving legal problems very inconvenient. A number of scholars have proposed lawyer recommendation systems and intelligent question-answer matching techniques (see, for example, "Lawyer information base creating method and device, lawyer recommending method, device and system", patent application number CN201610783519.0). Although such lawyer recommendation methods solve some problems, in real life many people know little about how to use this kind of system and find it hard to use; people also know little about lawyers, and hiring a lawyer is costly. Even when lawyer information is recommended, people may well still not turn to a lawyer to solve the problem.

Disclosure of Invention

The invention provides an intelligent lawyer expert case response device that can answer legal questions efficiently.

In order to solve the technical problem, the invention provides an intelligent lawyer expert case response device, which comprises:

the information acquisition module is used for acquiring case information;

the corpus module comprises a corpus established from lawyers' historical case-processing information;

the word segmentation module is connected to the information acquisition module and segments the case information;

the keyword extraction module is connected to the word segmentation module and extracts keywords using the TF-IDF method;

the primary matching module is connected to the keyword extraction module and to the corpus module, matches the case question against the cases in the corpus by cosine similarity based on the keywords, and outputs the answers to the 3 questions with the highest cosine similarity.

Further, the device comprises:

the candidate keyword screening module, which is connected to the keyword extraction module and matches keywords in the corpus by cosine similarity to obtain candidate keywords of the target answer;

the sentence pattern screening module, which is connected to the information acquisition module and performs syntactic analysis by a probabilistic context-free grammar method to obtain candidate sentence patterns of the target answer;

and the output module, which is connected to the candidate keyword screening module, the sentence pattern screening module and the corpus module, fills the candidate keywords into the candidate sentence patterns according to part of speech, and outputs the final answer.

Further, the word segmentation module uses a hidden Markov chain model to segment the case information into words.

Further, using the keywords, the keyword extraction module applies a corpus-based word similarity calculation method within the corpus: it calculates whether the context of a keyword is similar to the context of a word in the corpus, and thereby determines the semantic similarity of the two words;

words semantically similar to the keywords of the input case information are then screened out to obtain the candidate keywords of the target answer.

Further, the information acquisition module includes a voice recognition module, which acquires case information using voice recognition technology and converts the voice information into text information as the input case information.

Further, the voice recognition module comprises a fuzzy information matching module, which is connected to the voice recognition module and to the word segmentation module and clusters the voice information;

if part of the information that cannot be recognized is classified into a given cluster, information similar to that cluster is matched in the pinyin set;

and if, after clustering, all of the information that cannot be correctly recognized falls into a single class, it is processed using manual rules.

One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:

The intelligent lawyer expert case response device provided in the embodiments of this application makes full use of the historical records of cases handled by lawyers to generate a way of handling the user's case online, which greatly facilitates people's lives. Users need not consider whether a lawyer they have found is suitable for their case; the generation of system answers is not limited by the field of the question, and questions and answers in various fields are covered. Voice recognition technology is added, making the system more convenient to use and more intelligent. The fuzzy information matching method based on rules and clustering effectively addresses speech that cannot be recognized under special conditions and greatly improves the fault tolerance of the system. Syntactic analysis by the probabilistic context-free grammar method eliminates ambiguity, and the question sentence is converted to obtain the basic structure of the answer sentence. Different strategies are adopted according to the complexity of the user's case question, which helps ensure the accuracy and stability of the system. The hidden Markov chain model is used to segment the case question, and the information in the corpus is used to calculate the transition probability of the part of speech of the next word, which greatly improves the word segmentation effect.

Drawings

FIG. 1 is a general flow chart of the operation of the intelligent lawyer expert case response device provided by an embodiment of the present invention;

FIG. 2 is a flow chart of the fuzzy information matching method based on rules and clustering according to an embodiment of the present invention.

Detailed Description

By providing the intelligent lawyer expert case response device, the embodiments of this application can answer legal questions efficiently.

In order to better understand the technical solutions, the technical solutions will be described in detail below with reference to the drawings and the specific embodiments of the specification, and it should be understood that the embodiments and specific features of the embodiments of the present invention are detailed descriptions of the technical solutions of the present application, and are not limitations of the technical solutions of the present application, and the technical features of the embodiments and examples of the present application may be combined with each other without conflict.

Referring to FIG. 1 and FIG. 2, an intelligent lawyer expert case response device includes:

the information acquisition module, which is used for acquiring case information.

The device further comprises the following.

The word segmentation module uses a hidden Markov chain model to segment the case information into words.

The keyword extraction module uses the keywords and a corpus-based word similarity calculation method to calculate, within the corpus, whether the context of a keyword is similar to the context of a word in the corpus, and thereby determines the semantic similarity of the two words.

The information acquisition module includes a voice recognition module, which acquires case information using voice recognition technology and converts the voice information into text information as the input case information.

The voice recognition module further comprises a fuzzy information matching module, which is connected to the voice recognition module and to the word segmentation module and clusters the voice information.

The specific operation of the above-described apparatus is described below.

An intelligent lawyer expert response method comprises:

acquiring input case information;

segmenting the case information;

extracting keywords using the TF-IDF method;

matching the case question against the cases in the corpus by cosine similarity based on the keywords;

taking the answers to the 3 questions with the highest cosine similarity and outputting them;

the corpus is established based on lawyer historical case processing information.

Specifically, the input case information is first segmented using a hidden Markov chain model.

The hidden Markov chain model has two important sets. The set of state values is {B, M, E, S}, where B: begin, M: middle, E: end, S: single. Each state represents the position of a character within a word: B marks the first character of a word, M a middle character, E the last character, and S a character that forms a word on its own. The set of observations is simply the input text. The hidden Markov chain model computes a state sequence from the input, for example:

User input: Ming graduated with a master's degree from the computing institute of the Chinese Academy of Sciences

The state sequence output after calculation, with one label per character of the Chinese input, is

BE/BE/BME/BE/BME/BE/S

From this state sequence the word boundaries follow directly, since every word ends at an E or S label:

BE/BE/BME/BE/BME/BE/S

The word segmentation result is therefore as follows:

Ming/Master/graduation/Chinese/academy of sciences/computing/institute
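The step from state sequence to words can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `tags_to_words` is hypothetical, and the Chinese sentence is an assumed reconstruction of the input (the translated text does not preserve it); it was chosen because it has 15 characters, matching the 15 labels above.

```python
def tags_to_words(chars, tags):
    """Split `chars` into words by cutting after every E (end) or S (single) tag."""
    if len(chars) != len(tags):
        raise ValueError("need exactly one tag per character")
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word closes on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that stops mid-word
        words.append(current)
    return words


# Assumed reconstruction of the Chinese input (not preserved in the translation).
sentence = "小明硕士毕业于中国科学院计算所"
states = "BE/BE/BME/BE/BME/BE/S".replace("/", "")
print("/".join(tags_to_words(sentence, states)))
# prints: 小明/硕士/毕业于/中国/科学院/计算/所
```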

This method only needs to compute the state sequence and does not consider semantic information, which shortens processing time and greatly improves word segmentation efficiency.
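Computing the most likely state sequence for a hidden Markov model is commonly done with the Viterbi algorithm; the text does not name the decoding algorithm, so its use here is an assumption. The sketch below works in log space. All probabilities (`START`, `TRANS`, `EMIT`, `UNSEEN`) are hypothetical toy values; a real system would estimate them from a tagged corpus.

```python
import math

STATES = "BMES"
NEG_INF = float("-inf")


def log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


# Hypothetical toy parameters (assumptions, not taken from the patent).
START = {"B": log(0.6), "M": NEG_INF, "E": NEG_INF, "S": log(0.4)}
TRANS = {  # only valid transitions are listed; all others count as impossible
    "B": {"M": log(0.3), "E": log(0.7)},
    "M": {"M": log(0.4), "E": log(0.6)},
    "E": {"B": log(0.6), "S": log(0.4)},
    "S": {"B": log(0.6), "S": log(0.4)},
}
EMIT = {"B": {"中": log(0.05)}, "E": {"国": log(0.05)}, "S": {"所": log(0.05)}}
UNSEEN = log(1e-4)  # floor for characters missing from the toy table


def emit(state, ch):
    return EMIT.get(state, {}).get(ch, UNSEEN)


def viterbi(chars):
    """Return the most probable B/M/E/S tag for each character."""
    if not chars:
        return []
    score = [{s: START[s] + emit(s, chars[0]) for s in STATES}]
    back = [{}]
    for t in range(1, len(chars)):
        score.append({})
        back.append({})
        for s in STATES:
            best_prev, best = None, NEG_INF
            for p in STATES:
                cand = score[t - 1][p] + TRANS[p].get(s, NEG_INF)
                if cand > best:
                    best_prev, best = p, cand
            score[t][s] = best + emit(s, chars[t])
            back[t][s] = best_prev
    # A sentence must end on a completed word, so the last tag is E or S.
    last = max("ES", key=lambda s: score[-1][s])
    path = [last]
    for t in range(len(chars) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


print("".join(viterbi("中国所")))
```

Listing only the valid transitions in `TRANS` is what keeps the decoder from emitting impossible sequences such as B followed by S.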

After word segmentation, keywords are extracted from the user's input using the TF-IDF method, that is, the words that best reflect the topic of the sentence are selected. For example, in the sentence "bee breeding in China", the keyword is "bee". The keywords help in obtaining the subsequent answer.
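A minimal TF-IDF sketch follows. The text does not say which TF-IDF variant is used; the smoothed inverse document frequency below is one common choice and is an assumption, as are the function name `tfidf_keywords` and the toy corpus.

```python
import math
from collections import Counter


def tfidf_keywords(doc_tokens, corpus_tokens, top_k=3):
    """Rank the words of one segmented document by TF-IDF against a corpus."""
    n_docs = len(corpus_tokens)
    df = Counter()
    for tokens in corpus_tokens:
        df.update(set(tokens))  # document frequency: count each word once per document
    tf = Counter(doc_tokens)
    scores = {}
    for word, count in tf.items():
        # Smoothed IDF (an assumed variant): never negative, no division by zero.
        idf = math.log((1 + n_docs) / (1 + df[word])) + 1
        scores[word] = (count / len(doc_tokens)) * idf
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Toy corpus (hypothetical): "china" is common, "bee" is rare.
corpus = [
    ["china", "economy"],
    ["china", "bee", "honey"],
    ["china", "breeding", "cattle"],
    ["china", "breeding", "fish"],
]
print(tfidf_keywords(["china", "bee", "breeding"], corpus, top_k=1))
# prints: ['bee']
```

The word "china" occurs in every document, so its inverse document frequency is the lowest and the rarer word "bee" ranks first.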

Using the word segmentation result, the complexity of the user's case is judged from the number of words and from the degree of match between the question and the questions in the corpus, where the degree of match is computed by cosine similarity. For example, a brief question about owed money contains few words after segmentation, so the case can be regarded as a simple case; if the degree of match between the question and a question in the corpus reaches a set threshold, such as 0.8, the case can likewise be regarded as a simple case. For simple case information input by the user, the cosine similarity matching method is applied directly: similarity is calculated between the segmented case question and the cases in the corpus, and the answers corresponding to the 3 questions with the highest cosine similarity are returned to the user as the result.
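The simple-case path can be sketched as follows, under stated assumptions: questions are represented as bag-of-words count vectors, the corpus is a list of `(question_tokens, answer)` pairs, and the word-count limit `max_words=5` is a hypothetical value (the text gives only the 0.8 similarity threshold).

```python
import math
from collections import Counter


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bag-of-words count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_simple_case(query_tokens, corpus, max_words=5, threshold=0.8):
    """Simple if the question is short or closely matches a corpus question."""
    if len(query_tokens) <= max_words:  # max_words is an assumed cut-off
        return True
    best = max((cosine(query_tokens, q) for q, _ in corpus), default=0.0)
    return best >= threshold


def top_answers(query_tokens, corpus, top_n=3):
    """Answers of the top_n corpus questions most similar to the query."""
    ranked = sorted(corpus, key=lambda qa: cosine(query_tokens, qa[0]), reverse=True)
    return [answer for _, answer in ranked[:top_n]]


# Toy corpus (hypothetical questions and answer labels).
corpus = [
    (["owe", "money", "repay"], "A1"),
    (["rent", "deposit", "return"], "A2"),
    (["owe", "wages", "employer"], "A3"),
    (["divorce", "custody"], "A4"),
]
query = ["owe", "money"]
if is_simple_case(query, corpus):
    print(top_answers(query, corpus))
```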

For a complex case, the method further comprises the following steps:

performing keyword matching in a corpus by using the keywords by adopting a cosine similarity matching method to obtain candidate keywords of the target answers;

performing syntactic analysis by a probabilistic context-free grammar method to obtain a candidate sentence pattern of the target answer;

and filling the candidate keywords into the candidate sentence patterns according to the part of speech, and outputting the final answer.

Specifically, the step of performing keyword matching in the corpus with the keywords by the cosine similarity matching method to obtain candidate keywords of the target answer comprises:

using the keywords, applying a corpus-based word similarity calculation method within the corpus to calculate whether the context of a keyword is similar to the context of a word in the corpus, and thereby determining the semantic similarity of the two words.

That is, for a question recognized as complex, keyword matching is performed: the keywords extracted from the input case are compared with words in the corpus by a corpus-based word similarity calculation method, which calculates whether the contexts of two words are similar and thereby determines their semantic similarity. Words semantically similar to the keywords of the user's case question are then retrieved to obtain the candidate keywords of the target answer.
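One way to compare the contexts of two words is to count, for each word, the words that occur within a small window around it, and then take the cosine similarity of those count vectors. The text does not fix the context definition, so the window size, the function names and the toy sentences below are assumptions.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def word_similarity(w1, w2, vectors):
    """Cosine similarity of the context vectors of two words."""
    a, b = vectors.get(w1, Counter()), vectors.get(w2, Counter())
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_keywords(keyword, vectors, top_k=3):
    """Corpus words whose contexts are most similar to those of `keyword`."""
    others = [w for w in vectors if w != keyword]
    others.sort(key=lambda w: (-word_similarity(keyword, w, vectors), w))
    return others[:top_k]


# Toy segmented corpus (hypothetical): "loan" and "debt" share their contexts.
sentences = [
    ["repay", "the", "loan", "now"],
    ["repay", "the", "debt", "now"],
    ["sign", "a", "contract", "today"],
]
vectors = context_vectors(sentences)
print(candidate_keywords("loan", vectors, top_k=1))
# prints: ['debt']
```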

The case information input by the user is parsed by the probabilistic context-free grammar method to obtain a syntactic parse tree of the question, that is, the subject, predicate, object and other components of the sentence. The sentence components obtained are used to compose sentence pattern templates, which serve as candidate sentence patterns of the target answer.
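Parsing with a probabilistic context-free grammar can be illustrated with a probabilistic CKY parser, which keeps the most probable tree for every span of the sentence. The text does not name a parsing algorithm, so CKY is an assumption, and the tiny English grammar (`BINARY`, `LEXICAL`, in Chomsky normal form) is a hypothetical stand-in for a grammar estimated from a corpus.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form; probabilities per left-hand side sum to 1.
BINARY = {  # (left child, right child) -> [(parent, probability)]
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("Det", "N"): [("NP", 0.6)],
}
LEXICAL = {  # word -> [(symbol, probability)]
    "the": [("Det", 1.0)],
    "landlord": [("N", 0.5), ("NP", 0.2)],
    "deposit": [("N", 0.5), ("NP", 0.2)],
    "keeps": [("V", 1.0)],
}


def cky_parse(tokens):
    """Return (probability, tree) of the best S parse, or None if there is none."""
    n = len(tokens)
    best = defaultdict(dict)  # (i, j) -> {symbol: (probability, tree)}
    for i, tok in enumerate(tokens):
        for sym, p in LEXICAL.get(tok, []):
            best[i, i + 1][sym] = (p, (sym, tok))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between left and right child
                for lsym, (lp, ltree) in best[i, k].items():
                    for rsym, (rp, rtree) in best[k, j].items():
                        for parent, p in BINARY.get((lsym, rsym), []):
                            prob = p * lp * rp
                            if prob > best[i, j].get(parent, (0.0, None))[0]:
                                best[i, j][parent] = (prob, (parent, ltree, rtree))
    return best[0, n].get("S")


result = cky_parse("the landlord keeps the deposit".split())
if result is not None:
    prob, tree = result
    subject, predicate = tree[1], tree[2]  # NP subtree and VP subtree
    print(prob, subject, predicate)
```

In the resulting tree, the NP under S is the subject and the VP holds the verb and its object, which are the sentence components a template would be built from.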

The candidate keywords obtained are filled into the candidate sentence patterns according to part of speech, yielding several answers for the user to choose from.
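The filling step can be sketched as slot substitution. The template representation below (slots written as a part-of-speech tag in braces, such as `{n}`), the tag names and the function name `fill_templates` are all assumptions for illustration; every combination of candidates produces one answer.

```python
from itertools import product


def is_slot(token):
    return token.startswith("{") and token.endswith("}")


def fill_templates(template, candidates_by_pos):
    """Fill each {pos} slot with every candidate keyword of that part of speech."""
    slots = [tok[1:-1] for tok in template if is_slot(tok)]
    options = [candidates_by_pos.get(pos, []) for pos in slots]
    answers = []
    for choice in product(*options):  # one candidate per slot
        words = iter(choice)
        answers.append(" ".join(next(words) if is_slot(tok) else tok for tok in template))
    return answers


# Hypothetical candidate sentence pattern and candidate keywords.
template = ["the", "{n}", "must", "{v}", "the", "deposit"]
candidates = {"n": ["landlord", "lessor"], "v": ["return"]}
for answer in fill_templates(template, candidates):
    print(answer)
# prints:
# the landlord must return the deposit
# the lessor must return the deposit
```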

Further, the acquiring of the input case information includes:

acquiring case information by utilizing a voice recognition technology;

and converting the voice information into text information as the input of case information.

To optimize voice recognition, the method further comprises fuzzy information matching, in which the voice information is clustered.

The user's input is looked up in a pinyin set. For input that cannot be recognized, for example because of unclear pronunciation, a fuzzy information matching method based on rules and clustering is adopted. The fuzzy information is first processed by a clustering model: given a data set of N tuples or records, a partitioning method constructs K groups, each group representing a cluster. The fuzzy information is placed into the pinyin set; if part of the fuzzy information is classified into a cluster, information similar to that cluster is matched in the cluster's pinyin set; if, after clustering, all of the fuzzy information falls into a single class, it is processed using manual rules.
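The text specifies only that a partitioning method builds K groups from N records. As one possible instance, the sketch below partitions pinyin strings with a simple k-medoids loop over edit distance; the choice of k-medoids, the distance measure, the deterministic initialisation and the sample syllables are all assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def k_medoids(items, k, max_iter=10):
    """Partition `items` into at most k clusters, returned as {medoid: members}."""
    medoids = list(dict.fromkeys(items))[:k]  # first k distinct items as initial medoids
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for item in items:
            nearest = min(medoids, key=lambda m: edit_distance(item, m))
            clusters[nearest].append(item)
        # New medoid: the member with the smallest total distance to its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(edit_distance(c, o) for o in members))
            for members in clusters.values()
            if members
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return clusters


# Hypothetical toneless pinyin syllables.
syllables = ["zhong", "zheng", "zhang", "li", "lin", "liu"]
for medoid, members in k_medoids(syllables, k=2).items():
    print(medoid, members)
```

Whether the fuzzy items end up spread over several clusters or all in one is the condition the text uses to choose between matching within a cluster and falling back to manual rules.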

The manual rules adopted in this embodiment are as follows:

if the unrecognized part belongs to a common phrase, for example the phrase 'sailing' in which 'sail' is recognized correctly but the rest is not, an association matching method is adopted: all phrases related to 'sail' are matched against the fuzzy information, and the phrase with the highest matching degree is taken as the final phrase;

for a part that the search does not find, easily confused pinyin is substituted, for example 'ong' and 'eng' are interchanged, and the part is searched in the pinyin set again;

for a part that still cannot be recognized, the tone is removed and the part is searched in the pinyin set again.
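The last two manual rules can be sketched as a chain of fallbacks. The representation of pinyin with a trailing tone digit (such as `zhong1`), the function names and the sample pinyin set are assumptions; only the 'ong'/'eng' pair comes from the text.

```python
import re

# Confusable finals named in the text; any further pairs would be assumptions.
CONFUSABLE = [("ong", "eng"), ("eng", "ong")]


def strip_tone(syllable):
    """Remove a trailing tone digit, e.g. 'zhong1' -> 'zhong'."""
    return re.sub(r"[1-5]$", "", syllable)


def lookup(syllable, pinyin_set):
    """Find `syllable` in `pinyin_set`, applying the fallback rules in order."""
    if syllable in pinyin_set:
        return syllable
    base = strip_tone(syllable)
    tone = syllable[len(base):]
    # Rule: swap easily confused finals and search again.
    for wrong, right in CONFUSABLE:
        if base.endswith(wrong):
            candidate = base[: -len(wrong)] + right + tone
            if candidate in pinyin_set:
                return candidate
    # Rule: ignore the tone and search again.
    for entry in sorted(pinyin_set):
        if strip_tone(entry) == base:
            return entry
    return None


pinyin_set = {"zhong1", "feng1", "ma3"}  # hypothetical pinyin set
print(lookup("zheng1", pinyin_set))  # prints: zhong1
print(lookup("ma1", pinyin_set))     # prints: ma3
```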

Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to examples, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims

1. An intelligent attorney specialist case response apparatus, comprising:

the information acquisition module is used for acquiring case information;

the primary matching module is respectively connected with the keyword extraction module and the corpus module, and matches the case problem with cases in the corpus by adopting a cosine similarity matching method based on the keywords; taking answers of 3 questions with high cosine similarity and outputting the answers;

the device further comprises:

the output module is respectively connected with the candidate keyword screening module, the sentence pattern screening module and the corpus module, fills the candidate keywords into the candidate sentence patterns according to the part of speech, and outputs the final answer;

the word segmentation module adopts a hidden Markov chain model to segment the word of the case information;

screening out words with similar semantics with the keywords of the input case information to obtain candidate keywords of the target answer;

the information acquisition module includes: the voice recognition module is used for acquiring case information by utilizing a voice recognition technology and converting the voice information into text information as input of the case information;