CN110532363B - Task-oriented automatic dialogue method based on decision tree - Google Patents

Task-oriented automatic dialogue method based on decision tree

Info

Publication number
CN110532363B
CN110532363B · Application CN201910795839.1A
Authority
CN
China
Prior art keywords
attribute
decision tree
classification
attribute value
conclusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910795839.1A
Other languages
Chinese (zh)
Other versions
CN110532363A (en)
Inventor
王成
胡艳霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaqiao University
Original Assignee
Huaqiao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaqiao University filed Critical Huaqiao University
Priority to CN201910795839.1A
Publication of CN110532363A
Application granted
Publication of CN110532363B
Legal status: Active

Classifications


    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a task-oriented automatic dialogue method based on a decision tree, which comprises the following steps: discretizing the conclusions of legal consultations into classification categories, and discretizing the party information related to those conclusions into basic attributes; receiving a party's consultation question, extracting the attribute values corresponding to the basic attributes through a classification algorithm, and extracting the category value corresponding to the conclusion; taking collected actual cases as training samples, and establishing a decision-tree-based legal-consultation classification prediction model; and receiving a new party's consultation, carrying out the consultation dialogue according to the established decision tree, and returning the consultation conclusion to the party. The method receives a user's online legal consultation, conducts multiple rounds of dialogue with the user, and can return an accurate legal-consultation conclusion, with good interaction, high real-time performance, and a greatly reduced manual workload.

Description

Task-oriented automatic dialogue method based on decision tree
Technical Field
The invention relates to the technical fields of natural language processing, data mining, big data analysis, conversation systems and the like, in particular to a task-oriented automatic conversation method based on a decision tree.
Background
The objective of a task-oriented dialogue system is to access a structured database in natural language and query data from it; such a system can be seen as a new human-machine interface following the graphical user interface. Unlike chatbots aimed at small talk and emotional companionship, the task-oriented dialogue system is the main development direction of intelligent customer-service applications and one of the hottest, most application-valuable sub-fields of intelligent question answering.
Task-driven dialogue systems grew out of the many expert systems of early artificial-intelligence research. Researchers initially built many Natural Language Interfaces to Databases (NLIDB) by manually extracting features, designing ontologies and organizing rules, letting people query data in a restricted natural language; but these supported only single-turn, context-free queries and had poor generality and extensibility. Later, following the success of the ELIZA system, multi-turn dialogue systems began to emerge, and some entered commercial use in narrow domains such as airline ticket ordering, restaurant reservation and navigation assistance. Such systems are called rule-based or template-based (Symbolic Rule/Template Based) dialogue systems and constitute the first-generation technology. Their rules are transparent and easy to understand, bugs are easy to repair, and the system is easy to update; but cross-domain generality and extensibility are poor, and rules are designed by hand rather than learned from data, so development and maintenance depend on domain experts.
With the development of big-data technology, statistical machine translation (SMT) achieved great success in automatic translation, and the trend spread to the field of dialogue systems, producing the Statistical Dialogue System, the second-generation technology. Such systems use shallow models (e.g., CRF, HMM) for the language-understanding and generation components of the dialogue system, and reinforcement-learning methods also came into use, represented by the POMDP-based dialogue systems developed by Professor Steve Young's team at the University of Cambridge. This removed the dependence on domain experts and made question answering more robust, but the models became hard to interpret and therefore hard to maintain, their representational capacity was not strong enough for end-to-end learning, and cross-domain extension remained difficult.
In 2014, breakthroughs of deep learning in computer vision and speech recognition, together with the success of deep reinforcement learning in Atari games, led researchers to apply a variety of neural network models to dialogue systems, yielding the Neural Dialogue System, the third-generation technology. These models have stronger representational capacity, making fully end-to-end systems feasible; in 2016 a Microsoft research team developed the fully end-to-end bot KB-InfoBot, which accesses a structured database through probability-based soft queries, showing further possibilities for neural network models in task-oriented dialogue systems. However, such dialogue systems still have many limitations: they require training on large-scale labelled data, the models remain hard to interpret, and interaction between neural network learning and symbolic natural language is lacking.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a task-oriented automatic dialogue method based on a decision tree that receives a user's online legal consultation, conducts multiple rounds of dialogue with the user, and can return an accurate legal-consultation conclusion, with good interaction, high real-time performance, and a greatly reduced manual workload.
The invention adopts the following technical scheme:
a task-oriented automatic dialogue method based on decision trees comprises the following steps:
1.1) discretizing the conclusion of legal consultation into classification categories, and discretizing the information of parties related to the conclusion into basic attributes;
1.2) receiving the party's consultation question, extracting the attribute values corresponding to the basic attributes through a classification algorithm, and extracting the category value corresponding to the conclusion;
1.3) taking the collected actual cases as training samples, and establishing a decision tree-based legal consultation classification prediction model;
1.4) receiving the consultation of the new party, realizing the process of consultation conversation according to the established decision tree, and returning the conclusion of consultation of the party.
Preferably, the 1.1) specifically comprises:
1.1.1) analyzing the number of the categories giving conclusions, and determining the category meaning corresponding to each category;
1.1.2) analyzing and providing attribute sets which can influence conclusions and attribute value categories corresponding to the attributes.
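As a rough illustration of step 1.1), the discretization can be sketched as a pair of lookup tables. All attribute names, value sets and category meanings below are hypothetical assumptions for a marital-consultation domain, not the patent's actual schema:

```python
# Hypothetical discretization of a marital-consultation domain.
CONCLUSION_CATEGORIES = {
    0: "divorce likely granted",
    1: "divorce unlikely granted",
    2: "mediation recommended",
    3: "insufficient grounds",
}

# Each basic attribute maps to its set of discrete attribute values.
BASIC_ATTRIBUTES = {
    "married_over_2_years": ["yes", "no"],
    "has_children": ["yes", "no", "pregnant"],
    "separated": ["yes", "no"],
    "property_dispute": ["none", "house", "savings", "both"],
}

def n_combinations(attrs):
    """Number of distinct party profiles the attribute grid can express."""
    n = 1
    for values in attrs.values():
        n *= len(values)
    return n
```

The product of the attribute-value counts bounds how many distinct structured consultation records the grid can represent.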
Preferably, the 1.2) specifically comprises:
1.2.1) labeling, through annotation work, the attribute value corresponding to each attribute of the sentences in part of the consultation cases;
1.2.2) labeling, through annotation work, the classification categories corresponding to the conclusions in part of the consultation cases;
1.2.3) training a model on the labelled data set through a classification algorithm;
1.2.4) extracting the attribute values of the consultation cases through the model trained by the classification algorithm, and extracting the category value corresponding to the conclusion.
Preferably, the 1.2.3) specifically comprises:
1.2.3.1) let each item of the labelled data set be a sentence whose attribute-value category label is y;
1.2.3.2) text feature extraction: first, segment the sentence with a word-segmentation tool and remove stop words and low-frequency words from the result; second, weight each word in the data set, with the calculation formulas:
$$\mathrm{TF}(w,s)=\frac{n_{w,s}}{\sum_{k} n_{k,s}}$$

$$\mathrm{IDF}(w)=\log\frac{|S|}{|\{s\in S : w\in s\}|}$$

$$\text{TF-IDF}=\mathrm{TF}\times\mathrm{IDF}$$

where $n_{w,s}$ is the number of occurrences of word $w$ in sentence $s$ and $|S|$ is the total number of sentences.
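Under this TF-IDF definition (plain TF × IDF with IDF = log(|S|/df) and no smoothing — smoothed variants also exist), the weighting can be sketched in a few lines:

```python
import math
from collections import Counter

def tfidf(corpus):
    """corpus: list of tokenized sentences (lists of words).
    Returns one {word: tf-idf weight} dict per sentence."""
    n = len(corpus)
    # document frequency: number of sentences containing each word
    df = Counter(w for sent in corpus for w in set(sent))
    weights = []
    for sent in corpus:
        tf = Counter(sent)
        total = len(sent)
        weights.append({
            w: (c / total) * math.log(n / df[w])  # TF * IDF
            for w, c in tf.items()
        })
    return weights
```

A word that appears in every sentence gets IDF = log(1) = 0 and is effectively dropped, which is why stop-word removal and this weighting complement each other.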
1.2.3.3) using the obtained weight data as the input of an LDA text topic model, and training and extracting sentence topic characteristics x;
1.2.3.4) taking the extracted feature x as the input of a classification algorithm, and carrying out model training through an SVM algorithm, wherein the specific calculation is as follows:
Firstly, a classification interface in the high-dimensional space is assumed to be $y = w \cdot x + b$, where $w$ is the weight of the interface and $b$ its bias; the initial values of $w$ and $b$ are obtained by random initialization. The distance of a feature point $x_i$ to this plane is expressed as:

$$\gamma_i = \frac{y_i (w \cdot x_i + b)}{\lVert w \rVert}$$

finding the interface whose closest points are farthest away, i.e. maximizing the minimum $\gamma_i$, gives the objective function:

$$\max_{w,b}\ \gamma \quad \text{s.t.}\ \frac{y_i (w \cdot x_i + b)}{\lVert w \rVert} \ge \gamma,\ i = 1, 2, 3, \ldots, s$$

wherein $s$ represents the total number of sentences. Analysing the objective function, the original classification problem can be converted into:

$$\min_{w,b}\ \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.}\ y_i (w \cdot x_i + b) \ge 1,\ i = 1, 2, 3, \ldots, s$$

introducing Lagrange multipliers $\alpha_i$, which connect the constraint functions with the original function so that a system with as many equations as variables can be formed and the extremum of the original function solved, yields the Lagrangian:

$$L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^2 - \sum_{i=1}^{s} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right], \quad \text{s.t.}\ \alpha_i \ge 0$$

converting the problem into a minimization problem, i.e. the dual problem, and transforming further gives the equivalent form:

$$\min_{\alpha}\ \frac{1}{2} \sum_{i=1}^{s} \sum_{j=1}^{s} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) - \sum_{i=1}^{s} \alpha_i \quad \text{s.t.}\ \sum_{i=1}^{s} \alpha_i y_i = 0,\ \alpha_i \ge 0$$

assuming that there is at least one $\alpha_j > 0$, one obtains

$$w^* = \sum_{i=1}^{s} \alpha_i^* y_i x_i, \qquad b^* = y_j - \sum_{i=1}^{s} \alpha_i^* y_i (x_i \cdot x_j)$$

where $w^*$ and $b^*$ are the optimal solution of the objective function and $\alpha^*$ the optimal solution of the dual problem; this yields the hyperplane that classifies and separates the data of the different classes.
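The maximum-margin problem above is normally solved with an off-the-shelf SVM library; purely as an illustration, a minimal sub-gradient (Pegasos-style) solver for the primal objective can be sketched as follows, on toy data rather than the patent's topic features:

```python
def train_linear_svm(X, y, lam=0.1, epochs=200):
    """Pegasos-style sub-gradient descent on the primal SVM objective
    (lam/2)*||w||^2 + hinge loss; labels y must be in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            lr = 1.0 / (lam * t)  # decaying learning rate
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:        # point violates the margin constraint
                w = [(1 - lr * lam) * wj + lr * yi * xj
                     for wj, xj in zip(w, xi)]
                b += lr * yi
            else:                 # only shrink w (regularization step)
                w = [(1 - lr * lam) * wj for wj in w]
    return w, b

def svm_predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

In practice the dual form with kernels, as derived above, is solved by SMO-type optimizers; this sketch covers only the linear case.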
Preferably, the 1.2.4) specifically comprises:
and extracting the theme characteristics of all the consulting cases, inputting the theme characteristics into a trained classification algorithm model, and predicting the attribute value categories corresponding to the attributes of the consulting cases so as to obtain the structured data consulted by all the consulting cases.
Preferably, the 1.3) specifically comprises:
1.3.1) taking the obtained structured data of all the consultation cases as the input of the decision tree;
1.3.2) selection of the classification attribute, i.e. selecting the optimal classification attribute: using the information-gain-ratio method, from the attribute set $A = \{a_1, a_2, \ldots, a_n\}$ choose the optimal attribute $a_j$.
Preferably, the 1.3.2), specifically comprises:
1.3.2.1) firstly, calculate the information entropy of the conclusion D:

$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$$

wherein $m$ represents the number of classes of $D$ and $p_i$ is the proportion of conclusion class $i$ in the total number of consultation cases;
1.3.2.2) secondly, calculate the information entropy of each partition of every attribute: for attribute $a_j$, let $D_k$ be the subset of cases whose value of $a_j$ falls in attribute-value category $k$, with $k \in [1, v]$ and $v$ the total number of attribute-value categories of $a_j$; then

$$\mathrm{Info}(D_k) = -\sum_{i=1}^{m} q_i \log_2 q_i$$

wherein $q_i$ is the proportion of conclusion class $i$ within $D_k$;
1.3.2.3) calculate the information entropy under the condition that attribute $a_j$ is selected, also called the conditional entropy:

$$\mathrm{Info}(D \mid a_j) = \sum_{k=1}^{v} \frac{|D_k|}{|D|}\, \mathrm{Info}(D_k)$$

wherein $|D|$ represents the total number of consultation cases and $|D_k|$ the number of cases in attribute-value category $k$;
1.3.2.4) the information gain is defined as the difference between the original information requirement and the new requirement:

$$\mathrm{Gain}(a_j) = \mathrm{Info}(D) - \mathrm{Info}(D \mid a_j)$$

1.3.2.5) calculate the split information of attribute $a_j$:

$$\mathrm{SplitInfo}(a_j) = -\sum_{k=1}^{v} \frac{|D_k|}{|D|} \log_2 \frac{|D_k|}{|D|}$$

1.3.2.6) the information gain ratio normalizes the information gain using the split information:

$$\mathrm{GainRatio}(a_j) = \frac{\mathrm{Gain}(a_j)}{\mathrm{SplitInfo}(a_j)}$$

1.3.2.7) calculate the information gain ratio of every attribute in $A$, then select the attribute with the maximum gain ratio as the split node, i.e. $a = \arg\max\{\mathrm{GainRatio}(a_1), \mathrm{GainRatio}(a_2), \ldots, \mathrm{GainRatio}(a_n)\}$. Each attribute value of the node $a$ corresponds to a branch, whose data are the remaining attributes and conclusions of the cases with that attribute value $k$. If a branch still has selectable remaining attributes and the conclusion classes in its data are not unique, repeat steps 1.3.2.1–1.3.2.7; otherwise stop splitting.
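The attribute-selection steps follow directly from the gain-ratio formulas; a minimal illustration (not the patent's implementation) over rows of attribute-value tuples:

```python
import math
from collections import Counter

def entropy(labels):
    """Information entropy Info(D) of a list of conclusion classes."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr_index):
    """rows: list of attribute-value tuples; labels: conclusion classes."""
    n = len(rows)
    partitions = {}  # attribute value k -> conclusion labels of D_k
    for row, lab in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(lab)
    cond = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - cond                      # Gain(a_j)
    split = -sum(len(p) / n * math.log2(len(p) / n)    # SplitInfo(a_j)
                 for p in partitions.values())
    return gain / split if split > 0 else 0.0

def best_attribute(rows, labels, candidate_indices):
    """The split node: attribute with the maximum information gain ratio."""
    return max(candidate_indices, key=lambda j: gain_ratio(rows, labels, j))
```

An attribute that perfectly separates the conclusions has conditional entropy 0 and thus the largest gain ratio, so it is chosen as the split node.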
Preferably, the 1.4) specifically comprises:
1.4.1) when new consultation information from a party is received, pose to the user, starting from the root node of the decision tree, the question corresponding to the current attribute node, and extract the party's information;
1.4.2) input the obtained topic features into the trained classification-algorithm model and predict the attribute-value category corresponding to the attribute;
1.4.3) compare the predicted attribute-value category with the attribute values of the decision tree's current node, and select the matching branch as the sub-decision tree for the next step of the dialogue;
1.4.4) when the sub-decision tree is a leaf node, stop the dialogue and return the answer corresponding to the final conclusion category to the party; otherwise, repeat steps 1.4.1–1.4.3.
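The dialogue loop of steps 1.4.1)–1.4.4) amounts to a tree walk; the node structure and the reply classifier below are illustrative assumptions:

```python
class Node:
    """Internal nodes hold an attribute and its question; branches map
    attribute values to subtrees; leaves hold the conclusion answer."""
    def __init__(self, attribute=None, question=None, branches=None, answer=None):
        self.attribute, self.question = attribute, question
        self.branches = branches or {}   # attribute value -> child Node
        self.answer = answer             # set only on leaf nodes

    def is_leaf(self):
        return self.answer is not None

def run_dialogue(root, classify_reply, ask):
    """Walk the decision tree: ask the question at each node, classify the
    party's free-text reply into an attribute value, follow that branch."""
    node = root
    while not node.is_leaf():
        reply = ask(node.question)                     # 1.4.1: pose question
        value = classify_reply(node.attribute, reply)  # 1.4.2: predict value
        node = node.branches[value]                    # 1.4.3: take branch
    return node.answer                                 # 1.4.4: leaf reached
```

In the full system, `classify_reply` would be the trained LDA+SVM pipeline and `ask` the chat front end; here both are stubbed for illustration.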
The invention has the following beneficial effects:
1) the disclosed task-oriented automatic dialogue method based on a decision tree receives a user's online legal consultation, conducts multiple rounds of dialogue with the user, and can return an accurate legal-consultation conclusion with few questions asked; the interaction effect is good, real-time performance is high, and the manual workload is greatly reduced;
2) the method has strong interpretability, and the most important decision factors are placed near the root of the tree, so a conclusion is reached quickly;
3) the method can assist lawyers by extracting the attribute values of the attributes involved in a consultation, reducing tedious questioning work and preventing omissions during consultation.
The above description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the description of the technical means more comprehensible.
The above and other objects, advantages and features of the present invention will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of the system of the present invention;
FIG. 3 is a block flow diagram of the method of the present invention;
FIG. 4 is a cross-validation flow chart of the method of the present invention;
FIG. 5 is a flow chart of a decision tree of the method of the present invention;
FIG. 6 is a decision tree based legal automatic dialog flow of the method of the present invention;
FIG. 7 is a flow chart of attribute value extraction for user responses of the method of the present invention;
FIG. 8 is a sample data set of attribute value extraction for user answers for the method of the present invention;
FIG. 9 is a graph of the classification accuracy of the training set of the method of the present invention at different ratios of all data; wherein (a) represents the C4.5 algorithm; (b) represents the ID3 algorithm;
FIG. 10 is a graph of the average number of questions asked for a training set of the method of the present invention at different ratios of all data; wherein (a) represents the C4.5 algorithm; (b) represents the ID3 algorithm;
FIG. 11 is C4.5-decision tree using 20% of the data as a training set for the method of the present invention;
FIG. 12 is ID 3-decision tree using 20% of the data as a training set for the method of the present invention;
FIG. 13 is a diagram of a law advisory interaction interface and advisory process in accordance with an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Referring to fig. 1, the task-oriented automatic dialogue method based on the decision tree of the present invention includes:
1.1) discretizing the conclusion of legal consultation into classification categories, and discretizing the information of parties related to the conclusion into basic attributes;
1.2) receiving the inquiry problem of the party, extracting the attribute value corresponding to the basic attribute through a classification algorithm, and concluding the corresponding category value;
1.3) taking the collected actual cases as training samples, and establishing a decision tree-based legal consultation classification prediction model;
1.4) receiving the consultation of the new party, realizing the process of consultation conversation according to the established decision tree, and returning the conclusion of consultation of the party.
Referring to fig. 2 and 3, according to another aspect of the present invention, the present invention further includes a task-oriented automatic dialog system based on a decision tree, where the task-oriented automatic dialog system based on a decision tree includes a data collection module, a data preprocessing module, a data learning module, an attribute value extraction module, and a model application module, and for a specific description, refer to fig. 2 and 3.
The processing effect of the task-oriented automatic dialogue method based on the decision tree according to the present invention will be verified as follows.
The experiments of the invention use five-fold cross-validation to verify the effect of the task-oriented automatic dialogue system based on a decision tree. Five-fold cross-validation divides the data set into five parts; in each round one part serves as the test set and the remainder as the training set; the training set is used to fit the model and the test set to measure the algorithm's effect, as shown in fig. 4.
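The five-fold protocol can be sketched as follows (a round-robin split for illustration; the actual experiments may shuffle or stratify differently):

```python
def five_fold_splits(data, k=5):
    """Yield (train, test) partitions: each fold serves as the test set once."""
    folds = [data[i::k] for i in range(k)]   # round-robin split into k folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

Every record appears in exactly one test fold over the five rounds, so the reported metrics cover the whole data set.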
2. Evaluation index
The classification result is measured by three indexes: precision (Pr), recall (Re), and their harmonic mean F1. The formulas are as follows:
$$\mathrm{Pr} = \frac{TP}{TP + FP}$$

$$\mathrm{Re} = \frac{TP}{TP + FN}$$

$$F1 = \frac{2 \cdot \mathrm{Pr} \cdot \mathrm{Re}}{\mathrm{Pr} + \mathrm{Re}}$$
the meaning of each parameter is shown in table 1, wherein the accuracy rate is examined about the correctness of the classification result, the recall rate is examined about the completeness of the classification result, the F1 score considers that the recall rate and the accuracy rate are equally important, and the comprehensive performance of the model is examined.
TABLE 1 meaning table of index parameter for classification evaluation
TP: positive-class cases correctly classified as positive
FP: negative-class cases incorrectly classified as positive
FN: positive-class cases incorrectly classified as negative
TN: negative-class cases correctly classified as negative
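The three indexes follow directly from the counts in table 1; a one-vs-rest sketch:

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Pr, Re, F1 for one class treated as positive, per the formulas above."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    pr = tp / (tp + fp) if tp + fp else 0.0
    re = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pr * re / (pr + re) if pr + re else 0.0
    return pr, re, f1
```

For the multi-class conclusions here, each of the 4 categories would be scored in turn as the positive class and the results averaged.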
The average number of questions is the mean number of questions the system must pose to the user before returning the final classification result.
$$H_{avg} = \frac{1}{n} \sum_{i=1}^{n} h_i$$

wherein $i$ is a leaf node, $n$ is the total number of leaf nodes, $h_i$ is the number of questions on the path to leaf $i$, and $H_{avg}$ represents the average number of questions.
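Given a decision tree represented as nested dicts, $H_{avg}$ can be computed by collecting the leaf depths; the representation below is an illustrative assumption:

```python
def leaf_depths(tree, depth=0):
    """tree: nested dict {attribute: {value: subtree}}; a non-dict is a leaf.
    Returns the number of questions asked on the path to each leaf."""
    if not isinstance(tree, dict):
        return [depth]
    depths = []
    for branches in tree.values():          # one attribute per node
        for subtree in branches.values():   # one branch per attribute value
            depths += leaf_depths(subtree, depth + 1)
    return depths

def average_questions(tree):
    """H_avg: mean number of questions over the tree's leaf nodes."""
    d = leaf_depths(tree)
    return sum(d) / len(d)
```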
The specific steps of the experimental implementation will be described in detail below, taking marital legal consultations as an example.
1. Data set
The data come from the Lvpin legal network (https://ai.lvpin100.com), a legal consultation website with a large corpus of data that has been organized by professionals into questions and options; the questions presented vary in depth with the circumstances of the user's consultation. The divorce consultation data (whether a divorce can be granted) were collected by a web crawler from https://ai.lvpin100.com/g/divorce_rate. The data format is shown in table 2:
TABLE 2 original format of data
(table reproduced only as an image in the original publication)
The format of the marital consultation data after preprocessing is shown in table 3. The data have 4 classification categories and 8 basic attributes, each attribute with its own set of attribute values; the product of the attribute value counts yields 2304 data records:
TABLE 3 data Pre-processing data Format
(table reproduced only as an image in the original publication)
2. Building decision trees
According to table 3, there are 4 classification categories, 8 basic attributes (with differing numbers of attribute values per attribute) and 2304 data records. A decision tree is built from the training samples with the C4.5 algorithm, splitting on the attribute with the highest information gain ratio of the data set, as shown in fig. 5. With 20% of the data used as the training set, the resulting decision tree is shown in figs. 11 and 12.
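The tree-building step can be approximated as follows. Note that scikit-learn implements CART with an entropy criterion rather than C4.5's gain ratio, and the records and feature names here are invented stand-ins for table 3:

```python
# Approximate sketch of training a decision tree on structured consultation
# records (CART + entropy, not true C4.5; data are hypothetical).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rows = pd.DataFrame({
    "has_marriage_certificate": [1, 1, 0, 1, 0, 1, 0, 0],
    "both_parties_agree":       [1, 0, 1, 1, 0, 0, 1, 0],
    "category":                 [0, 1, 3, 0, 2, 1, 3, 2],  # conclusion class
})
X, y = rows.drop(columns="category"), rows["category"]

# 20% of the data as the training set, mirroring the experiment above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.2, random_state=0
)
tree = DecisionTreeClassifier(criterion="entropy").fit(X_train, y_train)
print(tree.get_depth())
```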
3. Decision tree application
When a user consults on a marital question, the dialogue starts from the decision attribute at the root node of the established decision tree: the question corresponding to that attribute is put to the party (the questions corresponding to the basic attributes are listed in table 4), the attribute value is extracted from the user's answer, the matching branch of the decision tree is selected, and the question for the next decision attribute on that branch is asked. This continues until a leaf node is reached, whose classification category is returned to the party as the conclusion, as shown in fig. 6.
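The question-asking traversal described above can be sketched with a toy tree (the questions, attribute names and categories are illustrative, not the learned tree of figs. 11 and 12):

```python
# Dialogue as decision-tree traversal: ask the question at each decision
# node, branch on the extracted attribute value, return the leaf's category.
tree = {
    "question": "Do you have a marriage certificate?",
    "branches": {
        "yes": {
            "question": "Do both parties agree to the divorce?",
            "branches": {
                "yes": {"conclusion": "divorce by agreement"},
                "no": {"conclusion": "divorce by litigation"},
            },
        },
        "no": {"conclusion": "dissolution of cohabitation"},
    },
}

def consult(node, answer_fn):
    """Traverse the tree, using answer_fn(question) -> extracted attribute value."""
    while "conclusion" not in node:
        value = answer_fn(node["question"])   # ask the party, extract the value
        node = node["branches"][value]        # follow the matching branch
    return node["conclusion"]

# Scripted answers stand in for a live dialogue:
answers = {"Do you have a marriage certificate?": "yes",
           "Do both parties agree to the divorce?": "no"}
print(consult(tree, answers.get))  # divorce by litigation
```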
TABLE 4 questions asked of parties corresponding to basic attributes
(table reproduced only as an image in the original publication)
4. Attribute value extraction
Referring to fig. 7, a data set for extracting attribute values from user answers is constructed by collecting attorneys' conversations with parties and tagging them manually. Taking the basic attribute "has a marriage certificate" as an example, the tag data are as shown in fig. 8, where 1 denotes the positive class (the user's answer mentions the marriage certificate) and 0 denotes the negative class (it does not). For the positive class, the final attribute value is extracted by recognizing whether the sentence is affirmative or negative. For the negative class, the dialogue system asks the party again until the user's answer yields a positive attribute value extraction result.
For the 8 basic attributes, a separate attribute value extraction discriminant model is required for each. Taking the marriage certificate attribute as an example, the data set constructed by manually tagging collected attorney-party dialogues contains 1100 records; with a training-to-test ratio of 8:2, the test set contains 220 records. The recognition performance of the trained model is shown in table 5.
TABLE 5 Attribute value extraction test results
(table reproduced only as an image in the original publication)
In the attribute value extraction experiment on user responses, the results in table 5 (for the marriage certificate attribute) show precision and recall both above 83%, demonstrating the usability of the model: it can be applied to extracting and judging the attribute values of the decision attributes in the decision tree.
5. Design of experiments
Since marital consultation data require extensive professional processing, collecting a complete data set is not easy. To determine how large a proportion of the data is needed to reach sufficient classification accuracy, experiments measured the classification accuracy for different proportions of the data used as the training set; the results are shown in fig. 9.
Because marital consultation involves strong logical dependencies between questions, the consultation website asks the user every question from beginning to end; the number of questions is large and many are useless for the final inference. Experiments therefore measured the influence of the training-set proportion on the depth of the decision tree (and hence on the number of questions asked); the results are shown in fig. 10.
Experiments were performed using 20% of the data as a training set, and decision trees were created as shown in fig. 11 and 12.
The evaluation indexes obtained by performing the experiment using 20% of the data as the training set are shown in table 6:
table 6 experimental results using 20% of the data as training set
(table reproduced only as an image in the original publication)
6. Analysis of Experimental results
1) In the accuracy experiments with various training/test set proportions, the larger the training set proportion, the higher the accuracy, with a clear upward trend. According to fig. 9, using only 20% of the data as the training set already reaches 95% accuracy, which greatly reduces the manual workload; the experimental effect is good;
2) In the tree-depth experiments with various training/test set proportions, the larger the training set proportion, the deeper the decision tree, with a clear upward trend. As fig. 10 shows, with only 20% of the data as the training set the average tree depth is 4.3 levels, i.e. an answer can be returned after asking the user only 4-5 questions, saving many useless questions; the experimental effect is good;
3) In the decision tree built with 20% of the data as the training set, there are 8 basic attributes in total; the maximum depth is 7 levels and the minimum is 1, i.e. in the best case a single question suffices to return the answer. The decision tree only needs to be constructed once and can be reused, and the number of computations per prediction never exceeds the depth of the tree;
4) In the evaluation with only 20% of the data as the training set, the C4.5 classifier achieves high accuracy on category 1 (divorce by agreement) and category 4 (dissolution of the cohabitation relationship); accuracy on category 2 (divorce by litigation) exceeds 93%, and on category 3 (divorce not granted) exceeds 74%. This shows that the model is feasible and outperforms the ID3 algorithm.
7. The legal consultation interactive interface and consultation process are shown in fig. 13.
The above description is only an embodiment of the present invention, but the design concept of the present invention is not limited thereto; any insubstantial modification made using this design concept shall fall within the protection scope of the present invention.

Claims (2)

1. A task-oriented automatic dialogue method based on decision trees is characterized by comprising the following steps:
1.1) discretizing the conclusion of legal consultation into classification categories, and discretizing the information of parties related to the conclusion into basic attributes;
1.2) receiving the inquiry of the party, extracting the attribute values corresponding to the basic attributes through a classification algorithm, and extracting the category value corresponding to the conclusion;
1.3) taking the collected actual cases as training samples, and establishing a decision tree-based legal consultation classification prediction model;
1.4) receiving the consultation of a new party, realizing the process of consultation conversation according to the established decision tree, and returning the result of consultation of the party;
the 1.2), specifically comprising:
1.2.1) labeling, through label engineering, the attribute value corresponding to each attribute for the sentences in part of the consultation cases;
1.2.2) labeling classification categories corresponding to the conclusions in part of the consulting cases through label engineering;
1.2.3) training a model for the label data set through a classification algorithm;
1.2.4) extracting the attribute values of the consultation cases through the model trained by the classification algorithm, and extracting the category values corresponding to the conclusions;
the 1.2.3), specifically comprising:
1.2.3.1) taking the label data set as sentences, wherein the attribute value category label corresponding to each sentence is y;
1.2.3.2) text feature extraction: first, the sentence is segmented by a word segmentation tool, and stop words and low-frequency words are removed from the result; second, each word l in the data set is given a corresponding weight, with the following calculation formulas:

TF_l = n_l / N

IDF_l = log( S / (S_l + 1) )

TF-IDF = TF_l · IDF_l

wherein n_l is the number of occurrences of word l in the sentence, N is the total number of words in the sentence, S is the total number of sentences, and S_l is the number of sentences containing word l;
1.2.3.3) using the obtained weight data as the input of an LDA text topic model, and training and extracting sentence topic characteristics x;
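Steps 1.2.3.2-1.2.3.3 can be sketched as follows. The sentences are invented English stand-ins (a real pipeline would segment Chinese text first, e.g. with a tool such as jieba), and note that LDA is conventionally fit on raw counts, so feeding it TF-IDF weights as the claim describes is an approximation:

```python
# TF-IDF weighting followed by LDA topic features, approximating steps
# 1.2.3.2-1.2.3.3 (toy sentences; TF-IDF input to LDA is unconventional).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation

sentences = [
    "we have a marriage certificate",
    "no certificate was ever issued",
    "both parties agree to the divorce",
    "one party refuses to divorce",
]

tfidf = TfidfVectorizer()
weights = tfidf.fit_transform(sentences)        # TF-IDF weight matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
x = lda.fit_transform(weights)                  # topic feature vector per sentence

print(x.shape)  # (4, 2): one 2-topic feature vector per sentence
```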
1.2.3.4) taking the extracted feature x as the input of the classification algorithm and training the model with the SVM algorithm, specifically calculated as follows:

First, assume a classification interface in the high-dimensional space, y = w·x + b, where w represents the weight of the interface and b its offset; the initial values of w and b are obtained by random initialization. The distance from a feature point x_i to the plane is expressed as:

γ_i = y_i (w·x_i + b) / ||w||
The interface whose closest point is farthest away is sought, i.e. γ is maximized; the objective function is:

max_{w,b} γ

s.t. y_i (w·x_i + b) / ||w|| ≥ γ, i = 1, 2, ..., s
wherein s represents the total number of sentences. Analyzing the objective function, the original classification problem can be converted into:

min_{w,b} (1/2) ||w||^2

s.t. y_i (w·x_i + b) ≥ 1, i = 1, 2, ..., s
Introducing Lagrange multipliers α_i yields the Lagrangian function; the multipliers α_i connect the constraint functions with the original objective, so that a system with as many equations as variables can be formed and solved for the values at which the original function attains its extremum:

L(w, b, α) = (1/2) ||w||^2 − Σ_{i=1}^{s} α_i [ y_i (w·x_i + b) − 1 ]

s.t. α_i ≥ 0
The problem is converted into a minimization problem, i.e.:

min_{w,b} max_{α ≥ 0} L(w, b, α)
The equivalent dual form obtained by further transformation and analysis is:

max_α Σ_{i=1}^{s} α_i − (1/2) Σ_{i=1}^{s} Σ_{j=1}^{s} α_i α_j y_i y_j (x_i · x_j)

s.t. Σ_{i=1}^{s} α_i y_i = 0; α_i ≥ 0, i = 1, 2, ..., s
Assuming that there is at least one α_j > 0, one obtains:

w* = Σ_{i=1}^{s} α_i* y_i x_i

b* = y_j − Σ_{i=1}^{s} α_i* y_i (x_i · x_j)

w* and b* form the optimal solution of the objective function and α* is the optimal solution of the dual problem; this yields the classification hyperplane of the objective function, which separates the data of the different classes;
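A sketch of the resulting max-margin classifier using scikit-learn's SVC, which solves the same dual problem; the 2-D points are toy data, not consultation features:

```python
# Hard-margin linear SVM on separable toy data; a large C approximates the
# hard-margin formulation derived above.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.5, 0.2], [2.0, 2.0], [2.5, 1.8]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
# Every training point satisfies the margin constraint y_i (w·x_i + b) >= 1
margins = y * (X @ w + b)
print(margins.min() >= 0.99)  # True
```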
the 1.2.4), specifically comprising:
extracting the topic features of all consultation cases, inputting them into the trained classification algorithm model, and predicting the attribute value category corresponding to each attribute of the consultation cases, thereby obtaining the structured data of all the consultation cases;
The 1.3), specifically comprising:
1.3.1) taking the obtained structural data of all the consulting cases as the input of a decision tree;
1.3.2) selection of classification attributes, i.e. selecting the optimal classification attribute: the information gain ratio method is adopted to choose the best attribute a_j from the attribute set A = {a_1, a_2, ..., a_n};
The 1.3.2), specifically comprising:
1.3.2.1) first, calculate the information entropy of the conclusion D, with the calculation formula:

Info(D) = − Σ_{i=1}^{m} P_i log2(P_i)

wherein m represents the number of categories of D and P_i is the proportion of all consultation cases whose conclusion D belongs to category i;
1.3.2.2) second, calculate the information entropy for each attribute value of each attribute, with the calculation formula:

Info(D_k^{a_j}) = − Σ_{i=1}^{m} Q_i log2(Q_i)

wherein Q_i is the proportion, among the cases whose attribute a_j takes attribute value category k, whose conclusion D belongs to category i; D_k^{a_j} represents the data with attribute value category k of attribute a_j; k ∈ [1, v], and v represents the total number of attribute value categories of a_j;
1.3.2.3) calculate the information entropy under the condition that attribute a_j is chosen, also called the conditional entropy, specifically:

Info(D | a_j) = Σ_{k=1}^{v} ( |D_k^{a_j}| / |D| ) · Info(D_k^{a_j})

wherein |D| represents the total number of consultation cases and |D_k^{a_j}| represents the number of cases with attribute value category k;
1.3.2.4) information gain is defined as the difference between the original information requirement and the new requirement as follows:
Gain(a_j) = Info(D) − Info(D | a_j)
1.3.2.5) calculate the split information of attribute a_j, specifically:

SplitInfo(a_j) = − Σ_{k=1}^{v} ( |D_k^{a_j}| / |D| ) log2( |D_k^{a_j}| / |D| )
1.3.2.6) information gain ratio: the information gain is normalized by the split information value, calculated as:

GainRatio(a_j) = Gain(a_j) / SplitInfo(a_j)
1.3.2.7) calculate the information gain ratio of every attribute in A, and select the attribute with the maximum information gain ratio as the split node, i.e. a = argmax(GainRatio(a_1), GainRatio(a_2), ..., GainRatio(a_n)). Each attribute value of the attribute node a corresponds to a branch, whose data are the remaining attributes and conclusions of the cases taking attribute value k. When a branch still has selectable remaining attributes and the categories in its conclusion data are not unique, steps 1.3.2.1-1.3.2.7 are repeated; otherwise splitting stops;
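Steps 1.3.2.1-1.3.2.7 can be sketched for a single attribute as follows (the records are hypothetical, and the attribute is assumed to take at least two values so that SplitInfo > 0):

```python
# From-scratch gain ratio for one attribute: entropy of the conclusion,
# conditional entropy, information gain, split information, gain ratio.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    n = len(labels)
    info_d = entropy(labels)                       # Info(D)
    cond, split = 0.0, 0.0
    for v in set(values):                          # one branch per value k
        subset = [l for x, l in zip(values, labels) if x == v]
        p = len(subset) / n
        cond += p * entropy(subset)                # Info(D | a_j)
        split -= p * log2(p)                       # SplitInfo(a_j)
    return (info_d - cond) / split                 # GainRatio(a_j)

# Hypothetical attribute "has certificate" vs. conclusion category:
values = ["yes", "yes", "yes", "no", "no", "no"]
labels = ["agree", "litigate", "agree", "cohab", "cohab", "cohab"]
print(round(gain_ratio(values, labels), 3))  # 1.0
```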
the 1.4), specifically comprising:
1.4.1) when new consulting information of the parties is received, questions corresponding to the attribute nodes are provided for a user from the root nodes of the decision tree, and the information of the parties is extracted;
1.4.2) inputting the obtained theme features into a trained classification algorithm model, and predicting attribute value categories corresponding to the attributes of the theme features;
1.4.3) comparing the predicted attribute value category with the attribute value of the current node attribute of the decision tree, and selecting equal branches as the sub-decision tree of the next step of conversation;
1.4.4) stopping the conversation when the sub-decision tree is a leaf node, returning the answer corresponding to the final conclusion category to the party, and otherwise, repeating the steps of 1.4.1-1.4.3.
2. The decision-tree-based task-oriented automatic dialogue method according to claim 1, wherein step 1.1) specifically comprises:
1.1.1) analyzing and giving the number of conclusion categories, and determining the meaning of each category;
1.1.2) analyzing and giving the attribute set that can influence the conclusion and the attribute value categories corresponding to each attribute.
Publications (2)

Publication Number Publication Date
CN110532363A CN110532363A (en) 2019-12-03
CN110532363B true CN110532363B (en) 2022-07-29

