CN109635281B - Method and device for updating nodes in traffic guide graph - Google Patents

Info

Publication number
CN109635281B
Authority
CN
China
Prior art keywords
node
instruction
question
keyword
candidate word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811400469.9A
Other languages
Chinese (zh)
Other versions
CN109635281A (en
Inventor
胡翔
石志伟
张望舒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd filed Critical Advanced New Technologies Co Ltd
Priority to CN201811400469.9A priority Critical patent/CN109635281B/en
Publication of CN109635281A publication Critical patent/CN109635281A/en
Application granted granted Critical
Publication of CN109635281B publication Critical patent/CN109635281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/60: Software deployment
    • G06F8/65: Updates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/01: Customer relationship services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Computer Security & Cryptography (AREA)
  • Marketing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the present specification provides a method and an apparatus for updating a node in a service guide graph. The method includes: acquiring a first question set formed by original questions; determining one of the plurality of nodes as a node to be updated according to a first instruction; determining a screening node set according to a second instruction; screening out, from the first question set, the questions that contain, for each node in the screening node set, the keyword corresponding to that node or an associated expression of the keyword, to obtain a second question set; performing word segmentation on each question in the second question set and removing the keyword corresponding to each node in the screening node set or the associated expression of the keyword, to obtain an alternative word set; clustering the questions in the second question set according to the word segments included in the alternative word set, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category; and updating the node according to the candidate word set, thereby improving efficiency.

Description

Method and device for updating nodes in traffic guide graph
Technical Field
One or more embodiments of the present specification relate to the field of computers, and more particularly, to a method and apparatus for updating a node in a service guide graph.
Background
The service guide graph is an algorithmic framework proposed to further improve the recognition accuracy of customer service robots. It comprises a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions; each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node carries a standard question associated with its keyword.
When the service guide graph is constructed, its nodes need to be updated based on user questions, that is, node mining is performed: for example, a new child node is added under an existing node, or an associated expression of a keyword is added under an existing node. In the prior art, a clustering method based on sentence similarity is used to cluster the questions of multiple users into a plurality of clusters, and the clusters are then reviewed manually to decide whether a new node exists. Sentence similarity, however, is strongly affected both by the text content of a sentence and by its length, so different expressions of the same standard question may be assigned to different clusters, producing many duplicate clusters. Because every cluster must be reviewed manually, these scattered duplicate clusters increase labor cost and lower efficiency.
Accordingly, an improved scheme is desired that improves efficiency when nodes are updated in the service guide graph.
Disclosure of Invention
One or more embodiments of the present specification describe a method and apparatus for updating a node in a service guide graph, which can improve efficiency when a node is updated in the service guide graph.
In a first aspect, a method for updating nodes in a service guide graph is provided, where the service guide graph includes a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node, and the method includes:
acquiring a first question set formed by original questions;
receiving a first instruction, and determining one node in the plurality of nodes as a node to be updated according to the first instruction;
receiving a second instruction, and determining a screening node set according to the second instruction;
screening out, from the first question set, the questions that contain, for each node in the screening node set, the keyword corresponding to that node or an associated expression of the keyword, to obtain a second question set;
performing word segmentation on each question in the second question set, and removing the keyword corresponding to each node in the screening node set or the associated expression of the keyword, to obtain an alternative word set;
clustering, according to the word segments included in the alternative word set, the questions in the second question set that include those word segments, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category;
receiving a third instruction, adding a new child node to the node to be updated according to the third instruction, and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword corresponding to the new child node or as an associated expression of that keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
In a possible embodiment, the determining a set of screening nodes according to the second instruction includes:
determining at least one node in the path of the node to be updated according to the second instruction;
and taking a set formed by the node to be updated and the at least one node as the screening node set.
In a possible implementation, clustering, according to the word segments included in the alternative word set, the questions in the second question set that include those word segments, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category, includes:
performing word frequency statistics on the word segments included in the alternative word set to obtain high-frequency words whose word frequency is greater than a preset threshold;
grouping the questions that include the same high-frequency word into one category, and adding that high-frequency word to the candidate word set as the candidate word corresponding to the category.
Further, the method also includes:
displaying the candidate words in the candidate word set in descending order of word frequency.
In another possible implementation, clustering, according to the word segments included in the alternative word set, the questions in the second question set that include those word segments, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category, includes:
performing density-based clustering on the word segments included in the alternative word set to obtain a plurality of clusters;
grouping the questions that include any word segment of the same cluster into one category, and adding the central word of that cluster to the candidate word set as the candidate word corresponding to the category.
Further, the method also includes:
displaying the candidate words in the candidate word set in descending order of the density of their corresponding categories.
In a possible implementation, after adding a new child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword corresponding to the new child node or as an associated expression of that keyword, the method further includes:
receiving a fifth instruction, determining the new child node as a leaf node according to the fifth instruction, and mounting on the leaf node a standard question associated with the keyword of the leaf node.
In another possible implementation, after adding a new child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword corresponding to the new child node or as an associated expression of that keyword, the method further includes:
receiving a sixth instruction, and determining according to the sixth instruction that the new child node is not a leaf node;
executing again the step of receiving a first instruction and determining one of the plurality of nodes as a node to be updated according to the first instruction, where the node to be updated is the new child node.
In a second aspect, an apparatus for updating nodes in a service guide graph is provided, where the service guide graph includes a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node, and the apparatus includes:
an acquisition unit, configured to acquire a first question set formed by original questions;
a determining unit, configured to receive a first instruction and determine one of the plurality of nodes as a node to be updated according to the first instruction, and to receive a second instruction and determine a screening node set according to the second instruction;
a screening unit, configured to screen out, from the first question set acquired by the acquisition unit, the questions that contain, for each node in the screening node set determined by the determining unit, the keyword corresponding to that node or an associated expression of the keyword, to obtain a second question set;
a word segmentation unit, configured to perform word segmentation on each question in the second question set obtained by the screening unit, and to remove the keyword or the associated expression of the keyword corresponding to each node in the screening node set determined by the determining unit, to obtain an alternative word set;
a clustering unit, configured to cluster, according to the word segments in the alternative word set obtained by the word segmentation unit, the questions in the second question set that include those word segments, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category;
a node updating unit, configured to receive a third instruction, add a new child node to the node to be updated determined by the determining unit according to the third instruction, and determine at least one candidate word, indicated by the third instruction, in the candidate word set obtained by the clustering unit as the keyword corresponding to the new child node or as an associated expression of that keyword; or to receive a fourth instruction, determine an existing child node of the node to be updated according to the fourth instruction, and determine at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, a computing device is provided, comprising a memory having stored therein executable code, and a processor that when executing the executable code, implements the method of the first aspect.
According to the method and the apparatus provided by the embodiments of this specification, a first question set formed by original questions is acquired first. A first instruction is then received, and one of the plurality of nodes is determined as a node to be updated according to the first instruction; a second instruction is received, and a screening node set is determined according to the second instruction. Next, the questions that contain, for each node in the screening node set, the keyword corresponding to that node or an associated expression of the keyword are screened out of the first question set to obtain a second question set. Word segmentation is then performed on each question in the second question set, and the keyword corresponding to each node in the screening node set or the associated expression of the keyword is removed, to obtain an alternative word set. The questions in the second question set are clustered according to the word segments included in the alternative word set, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category. Finally, a third instruction is received, a new child node is added to the node to be updated according to the third instruction, and at least one candidate word in the candidate word set indicated by the third instruction is determined as the keyword corresponding to the new child node or as an associated expression of that keyword; or a fourth instruction is received, an existing child node of the node to be updated is determined according to the fourth instruction, and at least one candidate word in the candidate word set indicated by the fourth instruction is determined as an associated expression of the keyword of the existing child node.
The method thus uses the existing structure of the service guide graph to filter and screen the initially acquired first question set, so that the differences between user questions stand out and the clustering result is more accurate. This reduces the workload of operators and improves efficiency when nodes are updated in the service guide graph.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed herein;
FIG. 2 is a schematic diagram of another implementation scenario of an embodiment disclosed herein;
FIG. 3 illustrates a flow diagram of a method for updating nodes in a service guide graph, according to one embodiment;
FIG. 4 is a schematic diagram of hierarchical filtering provided by embodiments of the present disclosure;
FIG. 5 is a schematic diagram of interactive mining provided by an embodiment of the present description;
FIG. 6 is a flow chart of new-question discovery for a guide graph provided by an embodiment of the present disclosure;
FIG. 7 shows a schematic block diagram of an apparatus for updating a node in a service guide graph according to an embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. The implementation scenario involves updating nodes in a service guide graph. Referring to Fig. 1, a service guide graph 100 includes a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions; each node corresponds to a keyword and an associated expression of the keyword, and the leaf nodes mount standard questions (abbreviated as label questions) associated with their keywords. The root node of the service guide graph 100 represents a service type; for example, the root node 101 in Fig. 1 is "balance treasure". The third-level nodes 102 and the nodes below the third level are typically semantic nodes; for example, the third-level nodes 102 in Fig. 1 include "refunds", "rollouts", "switches", and the like. Each leaf node 103 (also referred to as an end node) may mount a standard question 104; for example, the leaf node 103 in Fig. 1 contains "how", and the standard question 104 it mounts is "how to query a balance treasure". Two terms are used below. Standard question: a standard question summarized from users' high-frequency questions, hereinafter also called a label question. Associated expression: a synonymous expression, an implication expression, or a hypernym or hyponym; associated expressions can be configured for every semantic node.
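The tree described above can be pictured with a small data structure. The following is only a minimal sketch, not the representation used in the specification: the class name `GuideNode`, its field names, and the intermediate "query" node with its associated expressions are all hypothetical, chosen to mirror the "balance treasure" example of Fig. 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GuideNode:
    """One node of a service guide graph (hypothetical representation)."""
    keyword: str
    associated_expressions: Set[str] = field(default_factory=set)
    children: List["GuideNode"] = field(default_factory=list)
    parent: Optional["GuideNode"] = field(default=None, repr=False, compare=False)
    standard_question: Optional[str] = None  # only mounted on leaf nodes

    def add_child(self, child: "GuideNode") -> "GuideNode":
        """Attach `child` below this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self) -> bool:
        return not self.children

    def terms(self) -> Set[str]:
        """The keyword together with all of its associated expressions."""
        return {self.keyword} | self.associated_expressions


# Toy tree loosely following Fig. 1; the "query" level is invented for illustration.
root = GuideNode("balance treasure")
query = root.add_child(GuideNode("query", {"check", "look up"}))
how = query.add_child(
    GuideNode("how", standard_question="how to query a balance treasure")
)
```

Keeping a `parent` reference on each node makes it cheap to walk the path from a node back to the root, which is what the screening steps described later rely on.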
In the embodiments of this specification, updating a node may include, but is not limited to, either of the following situations: adding a new child node to the node to be updated in the service guide graph and determining the corresponding keyword and associated expressions for the new child node; or determining an existing child node of the node to be updated in the service guide graph and determining an associated expression of the corresponding keyword for that existing child node.
It can be understood that, after a new child node is added, a standard question can be mounted on it according to the operator's instruction, or the new child node can itself be taken as the node to be updated so that node updating continues.
Fig. 2 is a schematic diagram of another implementation scenario of an embodiment disclosed in this specification. The implementation scenario involves node mining in a service guide graph. Referring to Fig. 2, an original question set 21 is obtained first, the user questions in the original question set 21 are then clustered 22 to obtain a plurality of clusters 23, and an operator manually reviews the clusters 23 to carry out standard-question production 24. The process of standard-question production 24 includes the process of updating nodes. The more duplicate clusters there are, that is, the more clusters correspond to the same standard question, the greater the manual review cost; the embodiments of this specification therefore focus on reducing duplicate clusters in order to lower the manual review cost and improve efficiency.
Fig. 3 shows a flow diagram of a method for updating nodes in a service guide graph according to an embodiment, where the service guide graph includes a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node. As shown in Fig. 3, the method for mining nodes in the service guide graph in this embodiment includes the following steps:
step 31, acquiring a first question set formed by original questions;
step 32, receiving a first instruction, and determining one of the plurality of nodes as a node to be updated according to the first instruction;
step 33, receiving a second instruction, and determining a screening node set according to the second instruction;
step 34, screening out, from the first question set, the questions that contain, for each node in the screening node set, the keyword corresponding to that node or an associated expression of the keyword, to obtain a second question set;
step 35, performing word segmentation on each question in the second question set, and removing the keyword or the associated expression of the keyword corresponding to each node in the screening node set, to obtain an alternative word set;
step 36, clustering, according to the word segments included in the alternative word set, the questions in the second question set that include those word segments, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category;
step 37, receiving a third instruction, adding a new child node to the node to be updated according to the third instruction, and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword corresponding to the new child node or as an associated expression of that keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
Specific ways of executing the above steps are described below.
First, in step 31, a first set of question sentences, made up of original question sentences, is obtained. It is understood that a plurality of question sentences, for example, 100 question sentences or 1000 question sentences, are included in the first set of question sentences.
In the embodiment of the present specification, a source of the first question set is not limited, and for example, the first question set may be obtained by collecting, on line, original questions input by a user within a preset time period.
Next, in step 32, a first instruction is received, and one of the plurality of nodes is determined as a node to be updated according to the first instruction.
It is understood that, provided the service guide graph has already been constructed with a plurality of nodes, an operator may select one of those nodes as the node to be updated. The subsequent steps then determine whether a new child node can be mined under the node to be updated, with a keyword and associated expressions of the keyword added for the new child node, or whether an associated expression of a keyword can be added to an existing child node of the node to be updated.
Then, in step 33, a second instruction is received, and a set of screening nodes is determined according to the second instruction.
The screening node set may include only the node to be updated; optionally, it may further include one or more nodes on the path where the node to be updated is located. These nodes are used later to filter and screen the questions.
In one example, according to a second instruction, at least one node in the path of the node to be updated is determined; and taking a set formed by the node to be updated and the at least one node as the screening node set.
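As an illustration of this step, the sketch below builds a screening node set from the node to be updated plus operator-selected nodes on its path. It assumes that "the path where the node to be updated is located" means the chain from that node up to the root; the `PARENT` mapping and both function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Hypothetical tree stored as child -> parent; None marks the root.
PARENT: Dict[str, Optional[str]] = {"A": None, "B": "A", "C": "A", "D": "B"}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the root, both inclusive."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = parent[current]
    return path


def screening_node_set(
    node_to_update: str,
    selected: Iterable[str],
    parent: Dict[str, Optional[str]],
) -> Set[str]:
    """Combine the node to be updated with the selected nodes on its path."""
    chosen = set(selected)
    off_path = chosen - set(path_to_root(node_to_update, parent))
    if off_path:
        raise ValueError(f"not on the path of {node_to_update!r}: {sorted(off_path)}")
    return {node_to_update} | chosen
```

For example, `screening_node_set("B", ["A"], PARENT)` yields the set containing nodes A and B, while selecting node C for node B is rejected because C is a sibling, not an ancestor.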
Then, in step 34, a question including a keyword corresponding to each node in the screening node set or an associated expression of the keyword is screened from the first question set, so as to obtain a second question set.
It will be appreciated that the second question set is a subset of the first question set, so step 34 is in effect a process of narrowing the range of questions. When the screening node set includes a plurality of nodes, this process may be referred to as hierarchical filtering.
Fig. 4 is a schematic diagram of hierarchical filtering provided in an embodiment of this specification. Hierarchical filtering is the process of filtering and screening questions through the hierarchical structure of the guide graph. As shown in Fig. 4, in the service guide graph node A has a child node B and a child node C, and the screening node set determined in step 33 includes node A and node B. The keyword and associated expression corresponding to node A are "a, α", and those corresponding to node B are "b, β". The leftmost set is the original question set S0. The first layer screens out of S0 all questions that contain an associated expression of node A, giving S1; the second layer then screens S1, using node B as the screening node in the illustrated example, giving SB2. This layer-by-layer screening of questions by the associated expressions of nodes is called hierarchical filtering. S0 corresponds to the first question set and SB2 to the second question set.
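The layer-by-layer narrowing from S0 to S1 to SB2 can be sketched as follows. This is a minimal illustration that assumes a question "contains" a term when the term occurs as a substring; the function names and the English sample questions are hypothetical.

```python
from typing import Iterable, List, Sequence, Set


def contains_any(question: str, terms: Iterable[str]) -> bool:
    """True if the question contains at least one of the terms (substring match)."""
    return any(term in question for term in terms)


def hierarchical_filter(
    questions: Iterable[str], layers: Sequence[Set[str]]
) -> List[List[str]]:
    """Return the question set after each layer, starting with the unfiltered set.

    Each element of `layers` holds the keyword and associated expressions
    of one screening node, ordered from the upper level to the lower level.
    """
    stages = [list(questions)]
    for terms in layers:
        stages.append([q for q in stages[-1] if contains_any(q, terms)])
    return stages


s0 = [
    "how do I get a refund to my card",
    "my refund is taking too long",
    "where is my bank card",
    "send the money back to my bank card",
]
node_a_terms = {"refund", "money back"}  # plays the role of "a, α"
node_b_terms = {"card", "bank card"}     # plays the role of "b, β"
s0_copy, s1, sb2 = hierarchical_filter(s0, [node_a_terms, node_b_terms])
```

Here `s1` keeps the three questions that mention node A's terms, and `sb2` keeps the two of those that also mention node B's terms.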
Then, in step 35, performing word segmentation processing on each question in the second question set, and removing a keyword or an associated expression of the keyword corresponding to each node in the screening node set to obtain an alternative word set.
For example, if a question is segmented into the words a, b, c, and d, and the keywords or associated expressions corresponding to the nodes in the screening node set include b and c, then b and c are removed and the word segments a and d are added to the alternative word set.
Performing this processing on every question in the second question set yields the final alternative word set.
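The a, b, c, d example can be reproduced with a few lines. The sketch below assumes whitespace splitting as a stand-in for a real word segmenter and keeps repeated words, since the later frequency statistics need the counts; `alternative_words` and its parameters are hypothetical names.

```python
from typing import Callable, Iterable, List, Set


def alternative_words(
    questions: Iterable[str],
    screening_terms: Set[str],
    segment: Callable[[str], List[str]] = str.split,
) -> List[str]:
    """Segment each question and drop the screening nodes' terms.

    Assumption: `segment` is any callable that maps a question to its word
    segments; whitespace splitting is only a placeholder. Terms are compared
    with whole segments, so multi-word expressions are not handled here.
    """
    words: List[str] = []
    for question in questions:
        words.extend(w for w in segment(question) if w not in screening_terms)
    return words


result = alternative_words(["a b c d", "b d e"], {"b", "c"})
```

The first question contributes a and d, exactly as in the example above; the second contributes d and e.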
Then, in step 36, the questions in the second question set are clustered according to the word segments included in the alternative word set, to obtain questions of a plurality of categories and a candidate word set consisting of the central word corresponding to each category.
In one example, word frequency statistics are performed on the word segments included in the alternative word set to obtain high-frequency words whose word frequency is greater than a preset threshold; the questions that include the same high-frequency word are grouped into one category, and that high-frequency word is added to the candidate word set as the candidate word corresponding to the category.
The preset threshold may be set according to a policy, for example, the preset threshold may be set to 0, 1, 2, 3, and the like.
Furthermore, the candidate words in the candidate word set can be displayed in descending order of word frequency for the operator to review, and the operator selects one or more of the candidate words as a keyword and its associated expressions.
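The word-frequency variant of step 36 can be sketched as below. It assumes that a word's frequency is the number of questions in which it appears, and that the input maps each question to its remaining word segments; the function name and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def cluster_by_high_frequency(
    segmented: Dict[str, List[str]], threshold: int
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Group questions by shared high-frequency word.

    Returns the candidate words ranked from high to low frequency and a
    mapping from each candidate word to the questions of its category.
    """
    # Count each word once per question (assumed definition of word frequency).
    counts = Counter(w for words in segmented.values() for w in set(words))
    high = {w: n for w, n in counts.items() if n > threshold}
    categories = {
        word: [q for q, words in segmented.items() if word in words]
        for word in high
    }
    ranked = sorted(high, key=lambda w: (-high[w], w))
    return ranked, categories


# Keys stand in for question texts; values are their alternative word segments.
segmented = {
    "q1": ["fee", "late"],
    "q2": ["fee"],
    "q3": ["limit"],
    "q4": ["fee", "limit"],
}
ranked, categories = cluster_by_high_frequency(segmented, threshold=1)
```

With a threshold of 1, "fee" (three questions) and "limit" (two questions) become candidate words in that order, while "late" (one question) is dropped. A question such as q4 can fall into more than one category.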
In another example, density-based clustering is performed on the word segments included in the alternative word set to obtain a plurality of clusters; the questions that include any word segment of the same cluster are grouped into one category, and the central word of that cluster is added to the candidate word set as the candidate word corresponding to the category.
Furthermore, the candidate words in the candidate word set can be displayed in descending order of the density of their corresponding categories for the operator to review, and the operator selects one or more of the candidate words as a keyword and its associated expressions.
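The specification does not say which density-based algorithm, word representation, or definition of "central word" is used. The sketch below is therefore only one possible reading: it assumes each word segment has a numeric vector (the toy 2-D vectors are invented), applies a DBSCAN-style procedure with Euclidean distance, takes the member with the most neighbours as the central word, and ranks clusters by that neighbour count as a stand-in for density.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def density_clusters(
    vectors: Dict[str, Vector], eps: float, min_pts: int
) -> List[Tuple[str, List[str]]]:
    """DBSCAN-style clustering; returns (central_word, members), densest first."""
    words = list(vectors)
    neighbours = {
        w: [v for v in words if dist(vectors[w], vectors[v]) <= eps] for w in words
    }
    core = {w for w in words if len(neighbours[w]) >= min_pts}
    assigned = set()
    clusters: List[Tuple[str, List[str]]] = []
    for seed in words:
        if seed not in core or seed in assigned:
            continue
        members: List[str] = []
        stack = [seed]
        while stack:
            word = stack.pop()
            if word in assigned:
                continue
            assigned.add(word)
            members.append(word)
            if word in core:  # only core points expand the cluster
                stack.extend(v for v in neighbours[word] if v not in assigned)
        # Assumption: the central word is the member with the most neighbours.
        centre = min(members, key=lambda w: (-len(neighbours[w]), w))
        clusters.append((centre, sorted(members)))
    clusters.sort(key=lambda c: (-len(neighbours[c[0]]), c[0]))
    return clusters


def categorize(
    segmented: Dict[str, List[str]], clusters: List[Tuple[str, List[str]]]
) -> Dict[str, List[str]]:
    """Map each central word to the questions containing any word of its cluster."""
    return {
        centre: [q for q, words in segmented.items() if set(words) & set(members)]
        for centre, members in clusters
    }


toy_vectors = {
    "fee": (0.0, 0.0), "charge": (0.15, 0.0), "cost": (-0.15, 0.0),
    "limit": (5.0, 5.0), "quota": (5.1, 5.0), "hello": (10.0, 0.0),
}
clusters = density_clusters(toy_vectors, eps=0.2, min_pts=2)
categories = categorize(
    {"q1": ["charge"], "q2": ["cost", "limit"], "q3": ["hello"]}, clusters
)
```

In this toy run "fee", "charge", and "cost" form one cluster centred on "fee", "limit" and "quota" form a second, and "hello" is treated as noise, so q3 falls into no category.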
Finally, in step 37, a third instruction is received, a new-added child node is added to the node to be updated according to the third instruction, and at least one candidate word in the candidate word set indicated by the third instruction is determined as a keyword corresponding to the new-added child node or an associated expression of the keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of a keyword of the existing child node.
In one example, after step 37, a fifth instruction is received, the new child node is determined to be a leaf node according to the fifth instruction, and the standard question associated with the keyword of the leaf node is mounted for the leaf node.
In another example, after step 37, a sixth instruction is received, and it is determined according to the sixth instruction that the new child node is not a leaf node; the step of receiving a first instruction and determining one node of the plurality of nodes as the node to be updated according to the first instruction is then executed again, where the node to be updated is the new child node. That is, mining continues on the newly added child node.
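The node operations triggered by the third to sixth instructions can be sketched on a simple tree structure. This is a minimal sketch under stated assumptions: the `Node` class, its field names, the helper functions, and the sample keywords are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    keyword: str
    expressions: List[str] = field(default_factory=list)  # associated expressions
    children: List["Node"] = field(default_factory=list)
    standard_question: Optional[str] = None  # mounted on leaf nodes only

def add_child(node_to_update, keyword, expressions=()):
    # Third instruction: add a new child whose keyword and associated
    # expressions are candidate words chosen by the operator.
    child = Node(keyword, list(expressions))
    node_to_update.children.append(child)
    return child

def merge_expressions(existing_child, words):
    # Fourth instruction: chosen candidate words become associated
    # expressions of an existing child node.
    for w in words:
        if w != existing_child.keyword and w not in existing_child.expressions:
            existing_child.expressions.append(w)

def mount_standard_question(node, question):
    # Fifth instruction: the node is a leaf, so a standard question is mounted.
    if node.children:
        raise ValueError("only leaf nodes carry a standard question")
    node.standard_question = question

root = Node("repayment", ["repay"])
early = add_child(root, "early", ["in advance"])
merge_expressions(early, ["ahead of schedule", "early"])
mount_standard_question(early, "How do I repay early?")

fee = add_child(root, "fee")
# Sixth instruction: "fee" is not a leaf, so it becomes the next node to be
# updated and mining continues beneath it.
node_to_update = fee
```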
With the method provided by the embodiment of the specification, the initially acquired first question set is filtered and screened by using the existing structure of the service guide graph, so that the differences among user questions become apparent, the clustering result is relatively accurate, the workload of operators is reduced, and efficiency is improved when node mining is performed in the service guide graph.
Fig. 5 is a schematic diagram of interactive mining provided in an embodiment of the present specification. Referring to fig. 5, in interactive mining an operator, while constructing a guide graph, selects the node whose child nodes are to be mined and selects the required elements on the path where that node is located to use for filtering (shown in the figure as "flower" and "repayment"). The algorithm then filters the original questions, keeping every question that contains an associated expression of the selected nodes, to give the filtering result S'. Finally, word frequencies are counted over S', excluding the required elements used for filtering. The right-hand column displays the keywords under the node in descending order of word frequency. The operator can then select several keywords to create a new child node, and the selected keywords automatically become the associated expressions of the new node. The whole process is called interactive mining.
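The filtering and counting stages of interactive mining can be sketched as follows. This is a hedged illustration only: the function names and sample questions are assumptions, whitespace splitting stands in for a real word segmenter, and a question is kept only when it matches every selected node, which is one reading of the screening step.

```python
from collections import Counter

def screen(questions, screening_nodes):
    # screening_nodes: one list per selected node holding its keyword and
    # associated expressions. A question is kept only if, for every node,
    # it contains at least one of that node's expressions.
    return [q for q in questions
            if all(any(e in q for e in exprs) for exprs in screening_nodes)]

def count_remaining_words(filtered, screening_nodes):
    # Segment each kept question and drop the expressions used for filtering,
    # then rank the remaining words by frequency.
    used = {e for exprs in screening_nodes for e in exprs}
    counts = Counter(w for q in filtered for w in q.split() if w not in used)
    return counts.most_common()

questions = [
    "loan repayment early fee",
    "loan repay early",
    "loan limit increase",
    "card repayment date",
]
screening_nodes = [["loan"], ["repayment", "repay"]]

s_prime = screen(questions, screening_nodes)             # the filtering result S'
ranked = count_remaining_words(s_prime, screening_nodes)  # shown to the operator
```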
Fig. 6 is a flowchart for discovering new questions for the guide graph provided in an embodiment of the present specification. Referring to fig. 6, the original questions are provided first; then the node to be updated (i.e., the node to be mined) is selected, and the required elements are selected. The purpose of mining is to find new nodes and new standard questions. In the embodiment of the specification, manual judgment is needed at two points in the flow: whether a child node already exists for a high-frequency word, and whether a node needs to be subdivided further. If a child node already exists for the high-frequency word, the word is merged into the associated expressions of that existing node; otherwise, a corresponding child node is created. To judge whether a node needs to be subdivided, the operator observes whether the questions remaining at the end node after hierarchical filtering in the guide graph mostly describe the same problem: if they do, no further subdivision is needed; if they describe several problems, the node can be subdivided further.
According to another embodiment, an apparatus for updating nodes in a service guide graph is further provided, where the service guide graph includes a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node. Fig. 7 shows a schematic block diagram of an apparatus for updating nodes in a service guide graph according to an embodiment. As shown in fig. 7, the apparatus 700 includes:
an obtaining unit 71, configured to obtain a first question set formed by original questions;
a determining unit 72, configured to receive a first instruction, and determine one node of the plurality of nodes as a node to be updated according to the first instruction; receiving a second instruction, and determining a screening node set according to the second instruction;
a screening unit 73, configured to screen out, from the first question set acquired by the acquiring unit 71, a question that includes a keyword or an associated expression of the keyword, where the keyword corresponds to each node in the screening node set determined by the determining unit 72, to obtain a second question set;
a word segmentation unit 74, configured to perform word segmentation processing on each question in the second question set obtained by the screening unit 73, and remove the keyword, or the associated expression of the keyword, corresponding to each node in the screening node set determined by the determining unit 72, so as to obtain an alternative word set;
a clustering unit 75, configured to cluster, according to the segmented words included in the alternative word set obtained by the word segmentation unit 74, the questions in the second question set obtained by the screening unit 73 that include those segmented words, so as to obtain questions of multiple categories and a candidate word set formed by the center words corresponding to the categories;
a node updating unit 76, configured to receive a third instruction, add a new-added child node to the node to be updated determined by the determining unit 72 according to the third instruction, and determine at least one candidate word in the candidate word set obtained by the clustering unit 75 indicated by the third instruction as a keyword corresponding to the new-added child node or an associated expression of the keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of a keyword of the existing child node.
Optionally, as an embodiment, the determining unit 72 is specifically configured to:
determining at least one node in the path of the node to be updated according to the second instruction;
and taking a set formed by the node to be updated and the at least one node as the screening node set.
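How the screening node set is assembled from the path of the node to be updated can be sketched as follows. The `GuideNode` class, the parent-pointer layout, and the sample keywords are assumptions made for this illustration; the path is taken here to run from the root down to the node to be updated, inclusive.

```python
class GuideNode:
    def __init__(self, keyword, parent=None):
        self.keyword = keyword
        self.parent = parent  # None for the root of the guide graph

def path_from_root(node):
    # Walk the parent pointers up to the root, then reverse.
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path[::-1]

def screening_node_set(node_to_update, selected_keywords):
    # The second instruction names at least one node on the path; the
    # screening set is those nodes together with the node to be updated.
    chosen = [n for n in path_from_root(node_to_update)
              if n.keyword in selected_keywords]
    if not chosen:
        raise ValueError("the second instruction must select a node on the path")
    if node_to_update not in chosen:
        chosen.append(node_to_update)
    return chosen

loan = GuideNode("loan")
repayment = GuideNode("repayment", parent=loan)
early = GuideNode("early", parent=repayment)

nodes = screening_node_set(early, {"loan"})
```

Selecting fewer nodes on the path loosens the filter and keeps more questions; selecting every node on the path gives the strictest filter.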
Optionally, as an embodiment, the clustering unit 75 is specifically configured to:
performing word frequency statistics on the segmented words included in the alternative word set to obtain high-frequency words whose word frequency is greater than a preset threshold;
grouping the questions that include the same high-frequency word into one category, and adding the high-frequency word to the candidate word set as the candidate word corresponding to the category.
Further, the apparatus further comprises:
a display unit, configured to display the candidate words in the candidate word set in descending order of word frequency.
Optionally, as an embodiment, the clustering unit 75 is specifically configured to:
performing density-based clustering on the segmented words included in the alternative word set to obtain a plurality of clusters;
grouping the questions that include any segmented word of the same cluster into one category, and adding the center word of the cluster to the candidate word set as the candidate word corresponding to the category.
Further, the apparatus further comprises:
a display unit, configured to display the candidate words in the candidate word set in descending order of the density of the corresponding category.
Optionally, as an embodiment, the node updating unit 76 is further configured to receive a fifth instruction, determine that the new child node is a leaf node according to the fifth instruction, and mount, for the leaf node, a standard question associated with the keyword of the leaf node.
Optionally, as an embodiment, the node updating unit 76 is further configured to receive a sixth instruction, and determine that the new child node is not a leaf node according to the sixth instruction;
the determining unit 72 is further configured to receive a first instruction, and determine one node of the plurality of nodes as a node to be updated according to the first instruction, where the node to be updated is the new child node.
With the apparatus provided by the embodiment of the specification, the initially acquired first question set is filtered and screened by using the existing structure of the service guide graph, so that the differences among user questions become apparent, the clustering result is relatively accurate, the workload of operators is reduced, and efficiency is improved when node mining is performed in the service guide graph.
The embodiments of this specification not only improve efficiency but also bring the following effects.
First, to address the problem that text similarity is strongly influenced by the text content, hierarchical filtering gradually brings out the difference between two questions. For example, question 1 is "eating one apple every day has great benefit to maintain human health" and question 2 is "eating one banana every day has great benefit to maintain human health". Hierarchical filtering yields a path such as human -> health -> benefit -> …; the closer to the end node, the more of the question's content has been filtered out by the preceding nodes, the shorter the remaining text, and the larger the difference between the remainders, so the two questions are gradually distinguished.
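The effect on the two example questions can be checked with a small calculation. The sketch below is only illustrative: Jaccard similarity over whitespace-separated words is an assumed similarity measure that the specification does not prescribe, and only the three path keywords named in the example are removed.

```python
def jaccard(a, b):
    # Share of distinct words that the two word lists have in common.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

q1 = "eating one apple every day has great benefit to maintain human health".split()
q2 = "eating one banana every day has great benefit to maintain human health".split()
path = ["human", "health", "benefit"]  # keywords of the nodes along the path

similarities = [jaccard(q1, q2)]  # similarity before any filtering
removed = set()
for keyword in path:
    # Each level removes one more shared keyword from both questions.
    removed.add(keyword)
    r1 = [w for w in q1 if w not in removed]
    r2 = [w for w in q2 if w not in removed]
    similarities.append(jaccard(r1, r2))
```

Under these assumptions the similarity falls from 11/13 before filtering to 8/10 after three levels, because each level removes a shared word while "apple" and "banana" remain.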
Second, to address the problem that methods based on clustering text content cannot control granularity, the method of hierarchical text filtering plus word frequency statistics does not require the granularity to be set explicitly; the differences between texts emerge naturally as the hierarchy deepens.
Third, to address the problem that the matching model and the clustering model are independent of each other, in the embodiments of this specification the structure of the guide graph participates directly in matching; because operators optimize the guide graph in order to improve the matching rate, the hierarchical filtering based on the guide graph becomes more accurate as well.
Fourth, different expressions of the same label may be clustered into different clusters; operators can keep expanding the associated expressions of the nodes of the guide graph, so that identical problems that were split across different clusters are filtered to the same node through the expanded associated expressions.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with any of fig. 3 to 6.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory and a processor, the memory having stored therein executable code, the processor, when executing the executable code, implementing the method described in conjunction with any of fig. 3 to 6.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further explain the objects, technical solutions, and advantages of the present invention in detail. It should be understood that the above are only exemplary embodiments of the present invention and are not intended to limit its scope; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of the present invention.

Claims (18)

1. A method for updating nodes in a service guide graph, wherein the service guide graph comprises a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node, the method comprising:
acquiring a first question set formed by original questions;
receiving a first instruction, and determining one node in the plurality of nodes as a node to be updated according to the first instruction;
receiving a second instruction, and determining a screening node set according to the second instruction;
screening out, from the first question set, questions containing the keyword corresponding to each node in the screening node set or an associated expression of the keyword, to obtain a second question set;
performing word segmentation processing on each question in the second question set, and removing the keyword corresponding to each node in the screening node set or the associated expression of the keyword, to obtain an alternative word set;
clustering, according to the segmented words included in the alternative word set, the questions in the second question set that include the segmented words, to obtain questions of a plurality of categories and a candidate word set formed by the center words corresponding to the categories;
receiving a third instruction, adding a new-added child node to the node to be updated according to the third instruction, and determining at least one candidate word in the candidate word set indicated by the third instruction as a keyword corresponding to the new-added child node or an associated expression of the keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as the associated expression of the keyword of the existing child node.
2. The method of claim 1, wherein the determining a set of screening nodes according to the second instructions comprises:
determining at least one node in the path of the node to be updated according to the second instruction;
and taking a set formed by the node to be updated and the at least one node as the screening node set.
3. The method according to claim 1, wherein the clustering, according to the segmented words included in the alternative word set, the questions in the second question set that include the segmented words, to obtain questions of a plurality of categories and a candidate word set formed by the center words corresponding to the categories, comprises:
performing word frequency statistics on the segmented words included in the alternative word set to obtain high-frequency words whose word frequency is greater than a preset threshold;
grouping the questions that include the same high-frequency word into one category, and adding the high-frequency word to the candidate word set as the candidate word corresponding to the category.
4. The method of claim 3, wherein the method further comprises:
and displaying all candidate words in the candidate word set according to the sequence of word frequency from high to low.
5. The method according to claim 1, wherein the clustering, according to the segmented words included in the alternative word set, the questions in the second question set that include the segmented words, to obtain questions of a plurality of categories and a candidate word set formed by the center words corresponding to the categories, comprises:
performing density-based clustering on the segmented words included in the alternative word set to obtain a plurality of clusters;
grouping the questions that include any segmented word of the same cluster into one category, and adding the center word of the cluster to the candidate word set as the candidate word corresponding to the category.
6. The method of claim 5, wherein the method further comprises:
and displaying all candidate words in the candidate word set from high to low according to the density of the corresponding category.
7. The method of claim 1, wherein after adding a new-added child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as a keyword or an associated expression of the keyword corresponding to the new-added child node, the method further comprises:
receiving a fifth instruction, determining the new-added child node as a leaf node according to the fifth instruction, and mounting a standard question associated with the keyword of the leaf node for the leaf node.
8. The method according to claim 1, wherein after adding a new child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as a keyword or an associated expression of the keyword corresponding to the new child node, the method further comprises:
receiving a sixth instruction, and determining that the new-added child node is not a leaf node according to the sixth instruction;
executing again the step of receiving a first instruction and determining one node in the plurality of nodes as a node to be updated according to the first instruction, wherein the node to be updated is the new-added child node.
9. An apparatus for updating nodes in a service guide graph, wherein the service guide graph comprises a plurality of nodes organized into a tree-like hierarchical structure according to service dimensions, each node corresponds to a keyword and an associated expression of the keyword, and a leaf node of the service guide graph carries a standard question associated with the keyword of the leaf node, the apparatus comprising:
an obtaining unit, configured to obtain a first question set formed by original questions;
the determining unit is used for receiving a first instruction and determining one node in the plurality of nodes as a node to be updated according to the first instruction; receiving a second instruction, and determining a screening node set according to the second instruction;
the screening unit is used for screening out question sentences containing key words or associated expressions of the key words corresponding to each node in the screening node set determined by the determining unit from the first question sentence set acquired by the acquiring unit to obtain a second question sentence set;
a word segmentation unit, configured to perform word segmentation processing on each question in the second question set obtained by the screening unit, and remove the keyword, or the associated expression of the keyword, corresponding to each node in the screening node set determined by the determining unit, to obtain an alternative word set;
a clustering unit, configured to cluster, according to the segmented words included in the alternative word set obtained by the word segmentation unit, the questions in the second question set obtained by the screening unit that include the segmented words, to obtain questions of a plurality of categories and a candidate word set formed by the center words corresponding to the categories;
the node updating unit is used for receiving a third instruction, adding a new-added child node to the node to be updated determined by the determining unit according to the third instruction, and determining at least one candidate word in the candidate word set obtained by the clustering unit indicated by the third instruction as a keyword corresponding to the new-added child node or an associated expression of the keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as the associated expression of the keyword of the existing child node.
10. The apparatus of claim 9, wherein the determining unit is specifically configured to:
determining at least one node in the path of the node to be updated according to the second instruction;
and taking a set formed by the node to be updated and the at least one node as the screening node set.
11. The apparatus according to claim 9, wherein the clustering unit is specifically configured to:
performing word frequency statistics on the segmented words included in the alternative word set to obtain high-frequency words whose word frequency is greater than a preset threshold;
grouping the questions that include the same high-frequency word into one category, and adding the high-frequency word to the candidate word set as the candidate word corresponding to the category.
12. The apparatus of claim 11, wherein the apparatus further comprises:
and the display unit is used for displaying all candidate words in the candidate word set from high to low according to the word frequency.
13. The apparatus according to claim 9, wherein the clustering unit is specifically configured to:
performing density-based clustering on the segmented words included in the alternative word set to obtain a plurality of clusters;
grouping the questions that include any segmented word of the same cluster into one category, and adding the center word of the cluster to the candidate word set as the candidate word corresponding to the category.
14. The apparatus of claim 13, wherein the apparatus further comprises:
and the display unit is used for displaying each candidate word in the candidate word set from high to low according to the density of the corresponding category.
15. The apparatus as claimed in claim 9, wherein the node updating unit is further configured to receive a fifth instruction, determine that the new child node is a leaf node according to the fifth instruction, and mount a standard question associated with the keyword of the leaf node for the leaf node.
16. The apparatus as claimed in claim 9, wherein the node update unit is further configured to receive a sixth instruction, and determine that the new child node is not a leaf node according to the sixth instruction;
the determining unit is further configured to receive a first instruction, and determine one node of the plurality of nodes as a node to be updated according to the first instruction, where the node to be updated is the new child node.
17. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-8.
18. A computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of any of claims 1-8.
CN201811400469.9A 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph Active CN109635281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811400469.9A CN109635281B (en) 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph

Publications (2)

Publication Number Publication Date
CN109635281A CN109635281A (en) 2019-04-16
CN109635281B true CN109635281B (en) 2023-01-31

Family

ID=66069222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811400469.9A Active CN109635281B (en) 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph

Country Status (1)

Country Link
CN (1) CN109635281B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539609B (en) * 2020-04-17 2023-04-07 北京亚信数据有限公司 Flow creation method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102868695A (en) * 2012-09-18 2013-01-09 天格科技(杭州)有限公司 Conversation tree-based intelligent online customer service method and system
CN107015983A (en) * 2016-01-27 2017-08-04 阿里巴巴集团控股有限公司 A kind of method and apparatus for being used in intelligent answer provide knowledge information
CN107526792A (en) * 2017-08-15 2017-12-29 南通大学附属医院 A kind of Chinese question sentence keyword rapid extracting method
CN107562789A (en) * 2017-07-28 2018-01-09 深圳前海微众银行股份有限公司 Knowledge base problem update method, customer service robot and readable storage medium storing program for executing
CN107608999A (en) * 2017-07-17 2018-01-19 南京邮电大学 A kind of Question Classification method suitable for automatically request-answering system
CN108052547A (en) * 2017-11-27 2018-05-18 华中科技大学 Natural language question-answering method and system based on question sentence and knowledge graph structural analysis
JP2018092585A (en) * 2016-12-06 2018-06-14 パナソニックIpマネジメント株式会社 Information processing method, information processing device, and program
CN108763523A (en) * 2018-05-31 2018-11-06 康键信息技术(深圳)有限公司 Automatic session implementation method, server and storage medium

Also Published As

Publication number Publication date
CN109635281A (en) 2019-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right
GR01 Patent grant