CN109635281A - Method and apparatus for updating nodes in a business guide map - Google Patents


Publication number
CN109635281A
Authority
CN
China
Prior art keywords
node
instruction
question sentence
keyword
candidate word
Prior art date
Legal status
Granted
Application number
CN201811400469.9A
Other languages
Chinese (zh)
Other versions
CN109635281B (en)
Inventor
胡翔
石志伟
张望舒
Current Assignee
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201811400469.9A
Publication of CN109635281A
Application granted
Publication of CN109635281B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/60: Software deployment
    • G06F8/65: Updates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/01: Customer relationship services

Abstract

Embodiments of this specification provide a method and apparatus for updating nodes in a business guide map. The method includes: obtaining a first question set made up of original questions; determining, according to a first instruction, one of multiple nodes as the node to be updated; determining a screening node set according to a second instruction; filtering, from the first question set, the questions that contain the keyword of each node in the screening node set or an associated expression of that keyword, to obtain a second question set; performing word segmentation on each question in the second question set and excluding the keywords of the nodes in the screening node set and the associated expressions of those keywords, to obtain an alternative word set; clustering, according to the segmented words in the alternative word set, the questions in the second question set that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category; and updating nodes according to the candidate word set, thereby improving efficiency.

Description

Method and apparatus for updating nodes in a business guide map
Technical Field
One or more embodiments of this specification relate to the computer field, and in particular to a method and apparatus for updating nodes in a business guide map.
Background
A business guide map is an algorithmic framework proposed to further improve the recognition accuracy of customer service robots. A business guide map includes multiple nodes arranged by business dimension into a tree-shaped hierarchy. Each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node of the business guide map carries a standard question associated with the keyword of that leaf node.
While a business guide map is being built, nodes need to be updated on the basis of user questions; that is, node mining is performed. For example, a new child node is added below an existing node of the business guide map, or an associated expression of a keyword is added below an existing node. In the prior art, multiple user questions are first clustered with a clustering method based on sentence similarity to obtain multiple clusters, and the clusters are then reviewed manually to decide whether a new node exists. In such a clustering method, sentence similarity depends on the text content of a sentence and is also strongly affected by sentence length. As a result, different phrasings of the same standard question can be assigned to different clusters, which produces many duplicate clusters. Because every cluster needs manual review, an excess of duplicate clusters raises labor cost and lowers efficiency.
Accordingly, an improved solution is desired that improves efficiency when nodes are updated in a business guide map.
Summary
One or more embodiments of this specification describe a method and apparatus for updating nodes in a business guide map, which can improve efficiency when nodes are updated in the business guide map.
In a first aspect, a method for updating nodes in a business guide map is provided. The business guide map includes multiple nodes arranged by business dimension into a tree-shaped hierarchy, each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node of the business guide map carries a standard question associated with the keyword of that leaf node. The method includes:
obtaining a first question set made up of original questions;
receiving a first instruction, and determining one of the multiple nodes as the node to be updated according to the first instruction;
receiving a second instruction, and determining a screening node set according to the second instruction;
filtering, from the first question set, the questions that contain the keyword of each node in the screening node set or an associated expression of that keyword, to obtain a second question set;
performing word segmentation on each question in the second question set, and excluding the keywords of the nodes in the screening node set and the associated expressions of those keywords, to obtain an alternative word set;
clustering, according to the segmented words in the alternative word set, the questions in the second question set that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category;
receiving a third instruction, adding a new child node to the node to be updated according to the third instruction, and determining at least one candidate word in the candidate word set, as indicated by the third instruction, as the keyword of the new child node or as an associated expression of that keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set, as indicated by the fourth instruction, as an associated expression of the keyword of the existing child node.
In one possible implementation, determining the screening node set according to the second instruction includes:
determining, according to the second instruction, at least one node on the path of the node to be updated;
taking the set made up of the node to be updated and the at least one node as the screening node set.
In one possible implementation, clustering, according to the segmented words in the alternative word set, the questions in the second question set that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category, includes:
counting the word frequency of the segmented words in the alternative word set, to obtain high-frequency words whose word frequency is greater than a preset threshold;
grouping the questions that contain the same high-frequency word into one category, and adding that high-frequency word to the candidate word set as the candidate word of the category.
Further, the method also includes:
displaying the candidate words in the candidate word set in descending order of word frequency.
In one possible implementation, clustering, according to the segmented words in the alternative word set, the questions in the second question set that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category, includes:
performing density-based clustering on the segmented words in the alternative word set, to obtain multiple clusters;
grouping the questions that contain any segmented word of the same cluster into one category, and adding the center word of that cluster to the candidate word set as the candidate word of the category.
Further, the method also includes:
displaying the candidate words in the candidate word set in descending order of the density of their categories.
In one possible implementation, after the new child node is added to the node to be updated according to the third instruction and at least one candidate word in the candidate word set, as indicated by the third instruction, is determined as the keyword of the new child node or as an associated expression of that keyword, the method also includes:
receiving a fifth instruction, determining according to the fifth instruction that the new child node is a leaf node, and attaching to that leaf node a standard question associated with the keyword of the leaf node.
In one possible implementation, after the new child node is added to the node to be updated according to the third instruction and at least one candidate word in the candidate word set, as indicated by the third instruction, is determined as the keyword of the new child node or as an associated expression of that keyword, the method also includes:
receiving a sixth instruction, and determining according to the sixth instruction that the new child node is not a leaf node;
performing again the step of receiving the first instruction and determining one of the multiple nodes as the node to be updated according to the first instruction, where the node to be updated is the new child node.
In a second aspect, an apparatus for updating nodes in a business guide map is provided. The business guide map includes multiple nodes arranged by business dimension into a tree-shaped hierarchy, each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node of the business guide map carries a standard question associated with the keyword of that leaf node. The apparatus includes:
an obtaining unit, configured to obtain a first question set made up of original questions;
a determining unit, configured to receive a first instruction and determine one of the multiple nodes as the node to be updated according to the first instruction, and to receive a second instruction and determine a screening node set according to the second instruction;
a screening unit, configured to filter, from the first question set obtained by the obtaining unit, the questions that contain the keyword of each node in the screening node set determined by the determining unit or an associated expression of that keyword, to obtain a second question set;
a word segmentation unit, configured to perform word segmentation on each question in the second question set obtained by the screening unit, and to exclude the keywords of the nodes in the screening node set determined by the determining unit and the associated expressions of those keywords, to obtain an alternative word set;
a clustering unit, configured to cluster, according to the segmented words in the alternative word set obtained by the word segmentation unit, the questions in the second question set obtained by the screening unit that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category;
a node updating unit, configured to receive a third instruction, add a new child node to the node to be updated determined by the determining unit according to the third instruction, and determine at least one candidate word in the candidate word set obtained by the clustering unit, as indicated by the third instruction, as the keyword of the new child node or as an associated expression of that keyword; or to receive a fourth instruction, determine an existing child node of the node to be updated according to the fourth instruction, and determine at least one candidate word in the candidate word set, as indicated by the fourth instruction, as an associated expression of the keyword of the existing child node.
In a third aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the computer program is executed in a computer, the computer is caused to perform the method of the first aspect.
In a fourth aspect, a computing device is provided, including a memory and a processor. Executable code is stored in the memory, and when the processor executes the executable code, the method of the first aspect is implemented.
With the method and apparatus provided by the embodiments of this specification, a first question set made up of original questions is obtained first. A first instruction is then received, and one of the multiple nodes is determined as the node to be updated according to the first instruction. A second instruction is received, and a screening node set is determined according to the second instruction. The questions that contain the keyword of each node in the screening node set or an associated expression of that keyword are then filtered from the first question set, to obtain a second question set. Next, word segmentation is performed on each question in the second question set, and the keywords of the nodes in the screening node set and the associated expressions of those keywords are excluded, to obtain an alternative word set. According to the segmented words in the alternative word set, the questions in the second question set that contain those words are clustered, to obtain questions of multiple categories and a candidate word set made up of the center word of each category. Finally, a third instruction is received, a new child node is added to the node to be updated according to the third instruction, and at least one candidate word in the candidate word set, as indicated by the third instruction, is determined as the keyword of the new child node or as an associated expression of that keyword; or a fourth instruction is received, an existing child node of the node to be updated is determined according to the fourth instruction, and at least one candidate word in the candidate word set, as indicated by the fourth instruction, is determined as an associated expression of the keyword of the existing child node. As can be seen from the above, the method uses the existing structure of the business guide map to filter the originally obtained first question set, so that the differences between user questions become visible. The clustering result is correspondingly more accurate, which helps reduce the workload of operations staff and improves efficiency when nodes are updated in the business guide map.
Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification;
Fig. 2 is a schematic diagram of another implementation scenario of an embodiment disclosed in this specification;
Fig. 3 is a flowchart of a method for updating nodes in a business guide map according to an embodiment;
Fig. 4 is a schematic diagram of hierarchical filtering provided by an embodiment of this specification;
Fig. 5 is a schematic diagram of interactive mining provided by an embodiment of this specification;
Fig. 6 is a flowchart of new-question discovery for a guide map provided by an embodiment of this specification;
Fig. 7 is a schematic block diagram of an apparatus for updating nodes in a business guide map according to an embodiment.
Detailed Description
The solution provided in this specification is described below with reference to the drawings.
Fig. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. The scenario concerns updating nodes in a business guide map. Referring to Fig. 1, the business guide map 100 includes multiple nodes arranged by business dimension into a tree-shaped hierarchy. Each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node of the business guide map carries a standard question associated with the keyword of that leaf node. The root node of the business guide map 100 represents the business type; for example, the root node 101 in Fig. 1 is Yuebao. Nodes at the third level 102 and below are usually semantic nodes; for example, the third-level nodes 102 in Fig. 1 include refund, transfer out, and switch. Each leaf node 103 (also called an end node) of the business guide map can carry a standard question 104; for example, the leaf node 103 in Fig. 1 is "how", and the standard question 104 carried by that leaf node is "how to query Yuebao". Standard question: a typical question summarized from the high-frequency questions asked by users. Associated expression: this covers synonymous expressions, containing expressions, and hypernyms and hyponyms; every semantic node can be configured with its own associated expressions.
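The tree described above can be sketched as a small data structure. This is a minimal illustration only: the class name `GuideNode`, its fields, and the intermediate node "query" are assumptions made for the example and do not come from the specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class GuideNode:
    """One node of a business guide map (illustrative names, not from the patent)."""
    keyword: str
    expressions: set[str] = field(default_factory=set)        # associated expressions
    standard_question: Optional[str] = None                    # only meaningful on leaf nodes
    children: list[GuideNode] = field(default_factory=list, repr=False)
    parent: Optional[GuideNode] = field(default=None, repr=False)

    def add_child(self, child: GuideNode) -> GuideNode:
        """Attach a child node and return it, so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self) -> bool:
        return not self.children

    def terms(self) -> set[str]:
        """The keyword together with all of its associated expressions."""
        return {self.keyword} | self.expressions

    def path(self) -> list[GuideNode]:
        """Nodes from the root down to this node."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain[::-1]


# Root and leaf follow the Fig. 1 description; the middle node is hypothetical.
root = GuideNode("Yuebao")
query = root.add_child(GuideNode("query"))
how = query.add_child(GuideNode("how", standard_question="how to query Yuebao"))
print([node.keyword for node in how.path()])
```

Keeping a `parent` reference on each node makes it cheap to recover the path of a node, which the later screening step relies on.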
In the embodiments of this specification, updating a node can include, but is not limited to, either of the following cases: adding a new child node to the node to be updated in the business guide map, and determining the keyword and associated expressions of that new child node; or determining an existing child node of the node to be updated in the business guide map, and determining associated expressions for the keyword of that existing child node.
It can be understood that, after a new child node is added, a standard question can also be attached to the new child node according to an instruction from operations staff, or the new child node can be taken as the node to be updated so that nodes are updated further.
Fig. 2 is a schematic diagram of another implementation scenario of an embodiment disclosed in this specification. The scenario concerns node mining in a business guide map. Referring to Fig. 2, an original question set 21 is obtained first. The user questions in the original question set 21 are then clustered 22 to obtain multiple clusters 23. Operations staff manually review the multiple clusters 23 and carry out standard question production 24, which includes the process of updating nodes. It can be understood that the more duplicate clusters there are, that is, the more clusters correspond to the same standard question, the higher the cost of manual review. The embodiments of this specification therefore focus on reducing duplicate clusters, so as to lower the cost of manual review and improve efficiency.
Fig. 3 is a flowchart of a method for updating nodes in a business guide map according to an embodiment. The business guide map includes multiple nodes arranged by business dimension into a tree-shaped hierarchy, each node corresponds to a keyword and the associated expressions of that keyword, and each leaf node of the business guide map carries a standard question associated with the keyword of that leaf node. As shown in Fig. 3, the method in this embodiment includes the following steps. Step 31: obtain a first question set made up of original questions. Step 32: receive a first instruction, and determine one of the multiple nodes as the node to be updated according to the first instruction. Step 33: receive a second instruction, and determine a screening node set according to the second instruction. Step 34: filter, from the first question set, the questions that contain the keyword of each node in the screening node set or an associated expression of that keyword, to obtain a second question set. Step 35: perform word segmentation on each question in the second question set, and exclude the keywords of the nodes in the screening node set and the associated expressions of those keywords, to obtain an alternative word set. Step 36: cluster, according to the segmented words in the alternative word set, the questions in the second question set that contain those words, to obtain questions of multiple categories and a candidate word set made up of the center word of each category. Step 37: receive a third instruction, add a new child node to the node to be updated according to the third instruction, and determine at least one candidate word in the candidate word set, as indicated by the third instruction, as the keyword of the new child node or as an associated expression of that keyword; or receive a fourth instruction, determine an existing child node of the node to be updated according to the fourth instruction, and determine at least one candidate word in the candidate word set, as indicated by the fourth instruction, as an associated expression of the keyword of the existing child node. The specific way each step is performed is described below.
First, in step 31, a first question set made up of original questions is obtained. It can be understood that the first question set includes multiple questions, for example 100 or 1000 questions.
The embodiments of this specification place no limit on the source of the first question set. For example, the first question set can be obtained by collecting, online, the original questions that users enter within a preset time period.
Then, in step 32, a first instruction is received, and one of the multiple nodes is determined as the node to be updated according to the first instruction.
It can be understood that, provided multiple nodes have already been built in the business guide map, operations staff can select one of them as the node to be updated. The subsequent steps then determine whether a new child node can be mined under the node to be updated, with a keyword and associated expressions added for that new child node, or whether associated expressions can be added to the keyword of an existing child node of the node to be updated.
Then, in step 33, a second instruction is received, and a screening node set is determined according to the second instruction.
The screening node set can contain only the node to be updated. Optionally, it can also contain one or more nodes on the path where the node to be updated is located. These nodes are used later to filter the questions.
In one example, at least one node on the path of the node to be updated is determined according to the second instruction, and the set made up of the node to be updated and the at least one node is taken as the screening node set.
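A minimal sketch of building the screening node set from a path, assuming the tree is given as a child-to-parent mapping. The function name `screening_node_set` and the mapping `parent_of` are hypothetical, and the validation step is an added safeguard rather than something the specification requires.

```python
def screening_node_set(parent_of, node_to_update, selected_ancestors=()):
    """Return the screening node set, ordered from the root side down.

    parent_of maps each node name to its parent name (None for the root).
    selected_ancestors are the path nodes picked by the second instruction.
    """
    # Walk upward to collect the path of the node to be updated.
    path = []
    node = node_to_update
    while node is not None:
        path.append(node)
        node = parent_of.get(node)

    # Only nodes that lie on that path may be selected (assumed safeguard).
    off_path = set(selected_ancestors) - set(path)
    if off_path:
        raise ValueError(f"not on the path of {node_to_update!r}: {sorted(off_path)}")

    chosen = set(selected_ancestors) | {node_to_update}
    return [n for n in reversed(path) if n in chosen]


# Node A has child nodes B and C; B is the node to be updated.
parent_of = {"A": None, "B": "A", "C": "A"}
print(screening_node_set(parent_of, "B", ["A"]))
```

Calling the function without any selected ancestors yields a set that contains only the node to be updated, which matches the simplest case described above.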
Then, in step 34, the questions that contain the keyword of each node in the screening node set or an associated expression of that keyword are filtered from the first question set, to obtain a second question set.
It can be understood that the first question set contains the second question set, so step 34 is in effect a process of narrowing the range of questions. When the screening node set contains multiple nodes, this process can be called hierarchical filtering.
Fig. 4 is a schematic diagram of hierarchical filtering provided by an embodiment of this specification. Referring to Fig. 4, hierarchical filtering refers to the process of filtering questions according to the hierarchy of the guide map. As shown in the figure, node A in the business guide map has child node B and child node C, and the screening node set determined in step 33 contains node A and node B. The keyword of node A and its associated expressions are "a, A, α", and the keyword of node B and its associated expressions are "b, B, β". The original question set S0 is on the far left. The first layer selects all questions in S0 that contain an associated expression of node A, giving S1. The second layer filters S1 further; in the figure, node B is used as the screening node, giving SB2. Filtering questions layer by layer with the associated expressions of nodes in this way is called hierarchical filtering. Here S0 can be understood as the first question set and SB2 as the second question set.
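The Fig. 4 walk-through can be sketched as follows. This is an illustration under stated assumptions: matching is done by plain substring search, the function names are hypothetical, and the sample questions are invented for the example.

```python
def contains_any(question, terms):
    """True if the question contains the keyword or any associated expression."""
    return any(term in question for term in terms)


def hierarchical_filter(questions, term_sets):
    """Filter layer by layer; stages[0] is S0 and stages[-1] is the final set."""
    stages = [list(questions)]
    for terms in term_sets:
        stages.append([q for q in stages[-1] if contains_any(q, terms)])
    return stages


terms_a = {"a", "A", "α"}   # keyword and associated expressions of node A
terms_b = {"b", "B", "β"}   # keyword and associated expressions of node B

s0 = ["a b x", "A y", "β z", "α B w", "c v"]   # invented sample questions
s0_out, s1, sb2 = hierarchical_filter(s0, [terms_a, terms_b])
print(s1)
print(sb2)
```

The question "β z" mentions node B but not node A, so it is already dropped at the first layer; only questions that pass every layer reach the second question set.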
Then, in step 35, word segmentation is performed on each question in the second question set, and the keywords of the nodes in the screening node set and the associated expressions of those keywords are excluded, to obtain an alternative word set.
For example, a question is segmented into the words A, B, C, and D, and the keywords of the nodes in the screening node set and their associated expressions include B and C. B and C are then excluded from A, B, C, and D, and the words A and D are added to the alternative word set.
This word segmentation is performed on every question in the second question set, to obtain the final alternative word set.
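A minimal sketch of this step, assuming whitespace splitting as a stand-in for a real word segmenter (Chinese text would need a dedicated segmentation tool). The function name `alternative_words` and the `tokenize` parameter are hypothetical.

```python
def alternative_words(questions, screening_terms, tokenize=str.split):
    """Segment each question and drop the words already covered by screening nodes.

    Duplicates are kept on purpose so that word frequencies can be counted later.
    """
    pool = []
    for question in questions:
        pool.extend(word for word in tokenize(question) if word not in screening_terms)
    return pool


# The example from the text: words A, B, C, D with B and C excluded.
print(alternative_words(["A B C D"], {"B", "C"}))
```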
Next, in step 36, according to the segmented words in the alternative word set, the questions in the second question set that contain those words are clustered, to obtain questions of multiple categories and a candidate word set made up of the center word of each category.
In one example, the word frequency of the segmented words in the alternative word set is counted, to obtain high-frequency words whose word frequency is greater than a preset threshold. The questions that contain the same high-frequency word are grouped into one category, and that high-frequency word is added to the candidate word set as the candidate word of the category.
The preset threshold can be set according to policy; for example, it can be set to 0, 1, 2, 3, and so on.
Further, the candidate words in the candidate word set can be displayed in descending order of word frequency, so that operations staff can inspect them and select one or more candidate words as a keyword and its associated expressions.
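The word-frequency variant can be sketched as below. The function name, the whitespace segmentation, and the sample questions are assumptions made for illustration; frequency is counted here as the number of occurrences across all segmented questions.

```python
from collections import Counter


def cluster_by_word_frequency(questions, screening_terms, threshold=1):
    """Group questions by shared high-frequency words.

    Returns (candidates, clusters): candidates is a list of (word, frequency)
    in descending frequency, clusters maps each candidate word to its questions.
    """
    # Segment (whitespace split as a stand-in) and exclude screening terms.
    segmented = [
        (q, [w for w in q.split() if w not in screening_terms]) for q in questions
    ]
    frequency = Counter(w for _, words in segmented for w in words)

    # Keep only words whose frequency is strictly greater than the threshold.
    candidates = [(w, n) for w, n in frequency.most_common() if n > threshold]
    clusters = {
        w: [q for q, words in segmented if w in words] for w, _ in candidates
    }
    return candidates, clusters


questions = [
    "card pay fail",
    "card pay limit",
    "card pay fail again",
    "card pay limit raise",
    "card pay fee",
    "card pay fail code",
]
candidates, clusters = cluster_by_word_frequency(questions, {"card", "pay"}, threshold=1)
print(candidates)
print(clusters["limit"])
```

Because the screening terms "card" and "pay" occur in every question, excluding them first keeps them from dominating the frequency ranking.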
In another example, density-based clustering is performed on the segmented words in the alternative word set, to obtain multiple clusters. The questions that contain any segmented word of the same cluster are grouped into one category, and the center word of that cluster is added to the candidate word set as the candidate word of the category.
Further, the candidate words in the candidate word set can be displayed in descending order of the density of their categories, so that operations staff can inspect them and select one or more candidate words as a keyword and its associated expressions.
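The density-based variant can be sketched with a small pure-Python DBSCAN. Several points here are assumptions rather than details from the specification: DBSCAN is swapped in as one concrete density-based method, the two-dimensional word vectors are made up, the center word is taken as the word nearest the cluster centroid, and cluster size stands in for "density" when ordering.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one cluster id per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                      # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster             # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:       # core point: keep expanding
                queue.extend(reachable)
    return labels


def center_word(words, vectors):
    """Word closest to the centroid of its cluster (assumed definition)."""
    dims = range(len(vectors[words[0]]))
    centroid = [sum(vectors[w][d] for w in words) / len(words) for d in dims]
    return min(words, key=lambda w: math.dist(vectors[w], centroid))


def cluster_by_density(questions, vectors, eps=0.5, min_pts=2):
    words = list(vectors)
    labels = dbscan([vectors[w] for w in words], eps, min_pts)
    groups = {}
    for word, label in zip(words, labels):
        if label != -1:
            groups.setdefault(label, []).append(word)
    result = []
    for members in groups.values():
        matched = [q for q in questions if set(q.split()) & set(members)]
        result.append((center_word(members, vectors), members, matched))
    # Larger clusters first; cluster size is used as a proxy for density.
    result.sort(key=lambda item: len(item[1]), reverse=True)
    return result


# Hypothetical 2-D vectors for the words of an alternative word set.
vectors = {
    "fail": (0.0, 0.0), "failed": (0.2, 0.0), "error": (-0.2, 0.0), "broken": (0.0, 0.2),
    "limit": (5.0, 5.0), "quota": (5.2, 5.0), "cap": (4.8, 5.0),
    "hello": (10.0, 0.0),
}
questions = ["payment failed", "payment error today", "raise my quota",
             "what is the cap", "hello there"]
categories = cluster_by_density(questions, vectors)
for center, members, matched in categories:
    print(center, members, matched)
```

In practice the vectors would come from a trained word embedding, and `eps` and `min_pts` would need tuning for that embedding space.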
Finally, in step 37, a third instruction is received; according to the third instruction, a new child node is added to the node to be updated, and at least one candidate word in the candidate word set indicated by the third instruction is determined as the keyword of the new child node or as an associated expression of that keyword. Alternatively, a fourth instruction is received; according to the fourth instruction, an existing child node of the node to be updated is determined, and at least one candidate word in the candidate word set indicated by the fourth instruction is determined as an associated expression of the keyword of the existing child node.
In one example, after step 37, a fifth instruction is received; according to the fifth instruction, the new child node is determined to be a leaf node, and a standard question associated with the keyword of the leaf node is mounted on the leaf node.
In another example, after step 37, a sixth instruction is received; according to the sixth instruction, the new child node is determined not to be a leaf node. The step of receiving the first instruction and determining one of the multiple nodes as the node to be updated is then executed again, with the new child node as the node to be updated. That is, mining continues on the new child node.
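The node updates driven by the third, fourth, and fifth instructions can be sketched as operations on a small tree structure. The class and function names below are hypothetical, and the rule that the first chosen candidate word becomes the keyword while the rest become associated expressions is an assumption of this sketch; the text only says the chosen words become the keyword or its associated expressions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    keyword: str
    expressions: set = field(default_factory=set)    # associated expressions
    children: list = field(default_factory=list)
    standard_question: Optional[str] = None          # only set on leaf nodes

def add_child(parent, chosen_words):
    # Third instruction: create a new child node from the chosen candidate words.
    keyword, *rest = chosen_words
    child = Node(keyword=keyword, expressions=set(rest))
    parent.children.append(child)
    return child

def extend_child(parent, child_keyword, chosen_words):
    # Fourth instruction: merge candidate words into an existing child node.
    child = next(c for c in parent.children if c.keyword == child_keyword)
    child.expressions.update(chosen_words)
    return child

def mount_standard_question(node, question):
    # Fifth instruction: the node is a leaf, so attach its standard question.
    node.standard_question = question

refund = Node("refund")
new_child = add_child(refund, ["wilted", "withered"])
extend_child(refund, "wilted", ["dried up"])
mount_standard_question(new_child, "How do I get a refund for wilted flowers?")
```

If the sixth instruction applies instead, nothing is mounted and `new_child` simply becomes the next node to be updated.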
The method provided by the embodiments of this specification first uses the existing structure of the business guide graph to filter the originally obtained question set, so that the differences between user questions become visible and the clustering result is correspondingly more precise. This helps reduce the operators' workload and improves efficiency when mining nodes in the business guide graph.
Fig. 5 is a schematic diagram of interactive mining provided by an embodiment of this specification. Referring to Fig. 5, interactive mining means that, while building the guide graph, the operator selects the node whose child nodes are currently to be mined and chooses the required filter elements on that node's path (in the figure: flower, refund). The algorithm then screens the original questions for all questions that contain the associated expressions of the chosen nodes, giving the screening result S', and finally counts word frequencies over S' (excluding the required filter elements). The right-hand sidebar displays the keywords under this node in descending order of word frequency. Finally, the operator can choose multiple keywords and create a new child node; the chosen keywords automatically become the associated expressions of the new node. The whole process is referred to as interactive mining.
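The screening and counting part of interactive mining can be sketched as follows. This is a simplified illustration: the function name `mine`, the sample questions, and the expression sets are hypothetical, whitespace splitting stands in for real word segmentation (the original questions are Chinese and would need a proper segmenter), and no stop-word removal is applied.

```python
from collections import Counter

def mine(questions, screening_nodes):
    """screening_nodes is a list of sets; each set holds one chosen node's
    keyword and associated expressions (hypothetical input shape)."""
    def matches(question):
        # A question is kept only if it mentions every chosen node.
        return all(any(expr in question for expr in node) for node in screening_nodes)

    selected = [q for q in questions if matches(q)]          # screening result S'
    excluded = set().union(*screening_nodes)                 # required filter elements
    counts = Counter(
        token for q in selected for token in q.split() if token not in excluded
    )
    return selected, counts

questions = [
    "flowers wilted want refund",
    "refund for wilted bouquet",
    "flowers arrived late refund please",
    "how to change delivery address",
    "refund for broken vase",
]
flower_node = {"flowers", "bouquet"}
refund_node = {"refund"}
selected, counts = mine(questions, [flower_node, refund_node])
print(counts.most_common())   # what the sidebar would list, most frequent first
```

Here only the first three questions survive the screening, and the excluded filter elements never show up among the counted words, which is what lets the remaining differences between questions stand out.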
Fig. 6 is a flow chart of discovering new questions for the guide graph, provided by an embodiment of this specification. Referring to Fig. 6, the original questions are provided first, then the node to be updated (that is, the node to be mined) is chosen, and the required filter elements are chosen. The purpose of mining is to find new nodes and new standard questions. In this embodiment, manual judgment is needed, and the whole flow contains two decision points: whether a high-frequency child node exists, and whether an existing node needs to be subdivided. To judge whether a high-frequency child node exists, the operator inspects the original questions corresponding to the high-frequency words and judges whether these questions belong to any existing branch of the guide graph. If such a branch exists, the high-frequency words are merged into the associated expressions of the existing node; if not, a corresponding child node is created. To judge whether an existing node needs to be subdivided, the operator observes whether the questions that remain at the end node after hierarchical filtering mostly describe the same problem. If they describe multiple problems, the node can be subdivided further.
According to an embodiment of another aspect, a device for updating nodes in a business guide graph is also provided. The business guide graph includes multiple nodes arranged in a tree-shaped hierarchy according to business dimensions; each node corresponds to a keyword and associated expressions of the keyword; and a leaf node of the business guide graph mounts a standard question associated with the keyword of that leaf node. Fig. 7 shows a schematic block diagram of a device for updating nodes in a business guide graph according to one embodiment. As shown in Fig. 7, the device 700 includes:
an acquiring unit 71, configured to obtain a first question set composed of original questions;
a determination unit 72, configured to receive a first instruction and, according to the first instruction, determine one of the multiple nodes as the node to be updated; and to receive a second instruction and, according to the second instruction, determine a screening node set;
a screening unit 73, configured to screen, from the first question set obtained by the acquiring unit 71, the questions that contain the keyword, or an associated expression of the keyword, corresponding to each node in the screening node set determined by the determination unit 72, to obtain a second question set;
a word segmentation unit 74, configured to perform word segmentation on each question in the second question set obtained by the screening unit 73, and to exclude the keyword, or the associated expressions of the keyword, corresponding to each node in the screening node set determined by the determination unit 72, to obtain an alternative word set;
a clustering unit 75, configured to cluster, according to the segmented words in the alternative word set obtained by the word segmentation unit 74, the questions in the second question set obtained by the screening unit 73 that include those segmented words, to obtain questions of multiple categories and a candidate word set composed of the center words corresponding to the categories;
a node updating unit 76, configured to receive a third instruction, add a new child node to the node to be updated determined by the determination unit 72 according to the third instruction, and determine at least one candidate word, indicated by the third instruction, in the candidate word set obtained by the clustering unit 75 as the keyword of the new child node or as an associated expression of that keyword; or to receive a fourth instruction, determine an existing child node of the node to be updated according to the fourth instruction, and determine at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
Optionally, as one embodiment, the determination unit 72 is specifically configured to:
determine, according to the second instruction, at least one node on the path of the node to be updated;
take the set composed of the node to be updated and the at least one node as the screening node set.
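Building the screening node set from the path of the node to be updated can be sketched as below. The `parent_of` mapping, the node names, and the decision to reject nodes that are not on the path are assumptions made for this illustration only.

```python
def path_to(node, parent_of):
    # Walk up the parent links and return the path from the root down to the node.
    path = [node]
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path[::-1]

def screening_node_set(node_to_update, chosen, parent_of):
    # The second instruction names nodes on the path of the node to be updated.
    path = path_to(node_to_update, parent_of)
    off_path = [n for n in chosen if n not in path]
    if off_path:
        raise ValueError(f"not on the path of {node_to_update!r}: {off_path}")
    # The screening node set is the node to be updated plus the chosen path nodes.
    return {node_to_update, *chosen}

# Hypothetical three-level guide graph: order -> flower -> refund.
parent_of = {"refund": "flower", "flower": "order"}
nodes = screening_node_set("refund", ["flower"], parent_of)
```

In this toy graph the operator skips the root node and filters only on "flower" and "refund", which mirrors the pair of filter elements shown for Fig. 5.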
Optionally, as one embodiment, the clustering unit 75 is specifically configured to:
compute word-frequency statistics over the segmented words in the alternative word set to obtain the high-frequency words whose frequency is greater than a preset threshold;
group the questions that include the same high-frequency word into one category, and add that high-frequency word to the candidate word set as the candidate word for the category.
Further, the device also includes:
a display unit, configured to display the candidate words in the candidate word set in descending order of word frequency.
Optionally, as one embodiment, the clustering unit 75 is specifically configured to:
apply density-based clustering to the segmented words in the alternative word set to obtain multiple clusters;
group the questions that include any segmented word of the same cluster into one category, and add the center word of that cluster to the candidate word set as the candidate word for the category.
Further, the device also includes:
a display unit, configured to display the candidate words in the candidate word set in descending order of the density of their categories.
Optionally, as one embodiment, the node updating unit 76 is further configured to receive a fifth instruction, determine according to the fifth instruction that the new child node is a leaf node, and mount on the leaf node a standard question associated with the keyword of the leaf node.
Optionally, as one embodiment, the node updating unit 76 is further configured to receive a sixth instruction and determine according to the sixth instruction that the new child node is not a leaf node;
the determination unit 72 is further configured to receive the first instruction and, according to the first instruction, determine one of the multiple nodes as the node to be updated, wherein the node to be updated is the new child node.
The device provided by the embodiments of this specification first uses the existing structure of the business guide graph to filter the originally obtained question set, so that the differences between user questions become visible and the clustering result is correspondingly more precise. This helps reduce the operators' workload and improves efficiency when mining nodes in the business guide graph.
Besides improving efficiency, the embodiments of this specification can also bring the following effects.
First, regarding the problem that text similarity is affected by text content, hierarchical filtering gradually brings out the difference between two questions. For example, question 1 is "eating an apple every day is very beneficial to maintaining human health" and question 2 is "eating a banana every day is very beneficial to maintaining human health". Hierarchical filtering yields a hierarchy such as: human body -> health -> benefit -> ... -> (apple/banana). The closer a question gets to the end node, the more of its content has been filtered out by the preceding nodes; the shorter the remaining text, the larger the relative difference, so the two questions gradually diverge and can be told apart.
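The effect can be illustrated numerically. The sketch below is an assumption-laden toy: it uses Jaccard similarity over whitespace tokens as a stand-in for whatever text similarity measure is in use, simplified English versions of the two questions, and a hypothetical list of level keywords filtered out one at a time.

```python
def jaccard(a, b):
    # Share of distinct tokens that the two token lists have in common.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

q1 = "eating one apple every day greatly benefits human health".split()
q2 = "eating one banana every day greatly benefits human health".split()

# Hypothetical keywords of successive levels on the path to the end node.
levels = ["human", "health", "benefits", "eating", "one", "every", "day", "greatly"]

similarities = [jaccard(q1, q2)]
for keyword in levels:
    # Each level removes the content it has already accounted for.
    q1 = [t for t in q1 if t != keyword]
    q2 = [t for t in q2 if t != keyword]
    similarities.append(jaccard(q1, q2))

print(similarities)
```

The similarity starts high because the two questions share almost all of their words, and it drops with every level until only "apple" and "banana" remain and the questions no longer overlap at all.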
Second, regarding the problem that the granularity of clustering methods based on text content cannot be controlled, the hierarchical filtering and word-frequency statistics described here do not require the granularity to be set explicitly. As the levels deepen, the differences between texts emerge naturally.
Third, regarding the problem that the matching model and the clustering model are unrelated to each other, in the embodiments of this specification the structure of the guide graph takes part in matching directly. Operational optimization of the guide graph raises the matching rate and, at the same time, naturally makes the hierarchical filtering based on the guide graph more accurate.
Fourth, different expressions of the same standard question may be clustered into different clusters. Operators can keep expanding the associated expressions of guide graph nodes, so that the same problem found in different clusters is filtered to the same node through the expanded associated expressions.
According to an embodiment of another aspect, a computer-readable storage medium is also provided, on which a computer program is stored; when the computer program is executed in a computer, it causes the computer to execute any of the methods described with reference to Fig. 3 to Fig. 6.
According to an embodiment of yet another aspect, a computing device is also provided, including a memory and a processor; executable code is stored in the memory, and when the processor executes the executable code, any of the methods described with reference to Fig. 3 to Fig. 6 is implemented.
Those skilled in the art will appreciate that, in one or more of the above examples, the functions described in the present invention can be implemented in hardware, software, firmware, or any combination of them. When implemented in software, these functions can be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (18)

1. A method for updating nodes in a business guide graph, wherein the business guide graph includes multiple nodes arranged in a tree-shaped hierarchy according to business dimensions, each node corresponds to a keyword and associated expressions of the keyword, and a leaf node of the business guide graph mounts a standard question associated with the keyword of the leaf node, the method comprising:
obtaining a first question set composed of original questions;
receiving a first instruction, and determining one of the multiple nodes as the node to be updated according to the first instruction;
receiving a second instruction, and determining a screening node set according to the second instruction;
screening, from the first question set, the questions that contain the keyword, or an associated expression of the keyword, corresponding to each node in the screening node set, to obtain a second question set;
performing word segmentation on each question in the second question set, and excluding the keyword, or the associated expressions of the keyword, corresponding to each node in the screening node set, to obtain an alternative word set;
clustering, according to the segmented words in the alternative word set, the questions in the second question set that include those segmented words, to obtain questions of multiple categories and a candidate word set composed of the center words corresponding to the categories;
receiving a third instruction, adding a new child node to the node to be updated according to the third instruction, and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword of the new child node or as an associated expression of that keyword; or receiving a fourth instruction, determining an existing child node of the node to be updated according to the fourth instruction, and determining at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
2. The method of claim 1, wherein determining the screening node set according to the second instruction comprises:
determining, according to the second instruction, at least one node on the path of the node to be updated;
taking the set composed of the node to be updated and the at least one node as the screening node set.
3. The method of claim 1, wherein clustering, according to the segmented words in the alternative word set, the questions in the second question set that include those segmented words, to obtain questions of multiple categories and a candidate word set composed of the center words corresponding to the categories, comprises:
computing word-frequency statistics over the segmented words in the alternative word set to obtain the high-frequency words whose frequency is greater than a preset threshold;
grouping the questions that include the same high-frequency word into one category, and adding that high-frequency word to the candidate word set as the candidate word for the category.
4. The method of claim 3, wherein the method further comprises:
displaying the candidate words in the candidate word set in descending order of word frequency.
5. The method of claim 1, wherein clustering, according to the segmented words in the alternative word set, the questions in the second question set that include those segmented words, to obtain questions of multiple categories and a candidate word set composed of the center words corresponding to the categories, comprises:
applying density-based clustering to the segmented words in the alternative word set to obtain multiple clusters;
grouping the questions that include any segmented word of the same cluster into one category, and adding the center word of that cluster to the candidate word set as the candidate word for the category.
6. The method of claim 5, wherein the method further comprises:
displaying the candidate words in the candidate word set in descending order of the density of their categories.
7. The method of claim 1, wherein, after adding the new child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword of the new child node or as an associated expression of that keyword, the method further comprises:
receiving a fifth instruction, determining according to the fifth instruction that the new child node is a leaf node, and mounting on the leaf node a standard question associated with the keyword of the leaf node.
8. The method of claim 1, wherein, after adding the new child node to the node to be updated according to the third instruction and determining at least one candidate word in the candidate word set indicated by the third instruction as the keyword of the new child node or as an associated expression of that keyword, the method further comprises:
receiving a sixth instruction, and determining according to the sixth instruction that the new child node is not a leaf node;
executing the step of receiving the first instruction and determining one of the multiple nodes as the node to be updated according to the first instruction, wherein the node to be updated is the new child node.
9. A device for updating nodes in a business guide graph, wherein the business guide graph includes multiple nodes arranged in a tree-shaped hierarchy according to business dimensions, each node corresponds to a keyword and associated expressions of the keyword, and a leaf node of the business guide graph mounts a standard question associated with the keyword of the leaf node, the device comprising:
an acquiring unit, configured to obtain a first question set composed of original questions;
a determination unit, configured to receive a first instruction and determine one of the multiple nodes as the node to be updated according to the first instruction; and to receive a second instruction and determine a screening node set according to the second instruction;
a screening unit, configured to screen, from the first question set obtained by the acquiring unit, the questions that contain the keyword, or an associated expression of the keyword, corresponding to each node in the screening node set determined by the determination unit, to obtain a second question set;
a word segmentation unit, configured to perform word segmentation on each question in the second question set obtained by the screening unit, and to exclude the keyword, or the associated expressions of the keyword, corresponding to each node in the screening node set determined by the determination unit, to obtain an alternative word set;
a clustering unit, configured to cluster, according to the segmented words in the alternative word set obtained by the word segmentation unit, the questions in the second question set obtained by the screening unit that include those segmented words, to obtain questions of multiple categories and a candidate word set composed of the center words corresponding to the categories;
a node updating unit, configured to receive a third instruction, add a new child node to the node to be updated determined by the determination unit according to the third instruction, and determine at least one candidate word, indicated by the third instruction, in the candidate word set obtained by the clustering unit as the keyword of the new child node or as an associated expression of that keyword; or to receive a fourth instruction, determine an existing child node of the node to be updated according to the fourth instruction, and determine at least one candidate word in the candidate word set indicated by the fourth instruction as an associated expression of the keyword of the existing child node.
10. The device of claim 9, wherein the determination unit is specifically configured to:
determine, according to the second instruction, at least one node on the path of the node to be updated;
take the set composed of the node to be updated and the at least one node as the screening node set.
11. The device of claim 9, wherein the clustering unit is specifically configured to:
compute word-frequency statistics over the segmented words in the alternative word set to obtain the high-frequency words whose frequency is greater than a preset threshold;
group the questions that include the same high-frequency word into one category, and add that high-frequency word to the candidate word set as the candidate word for the category.
12. The device of claim 11, wherein the device further comprises:
a display unit, configured to display the candidate words in the candidate word set in descending order of word frequency.
13. The device of claim 9, wherein the clustering unit is specifically configured to:
apply density-based clustering to the segmented words in the alternative word set to obtain multiple clusters;
group the questions that include any segmented word of the same cluster into one category, and add the center word of that cluster to the candidate word set as the candidate word for the category.
14. The device of claim 13, wherein the device further comprises:
a display unit, configured to display the candidate words in the candidate word set in descending order of the density of their categories.
15. The device of claim 9, wherein the node updating unit is further configured to receive a fifth instruction, determine according to the fifth instruction that the new child node is a leaf node, and mount on the leaf node a standard question associated with the keyword of the leaf node.
16. The device of claim 9, wherein the node updating unit is further configured to receive a sixth instruction and determine according to the sixth instruction that the new child node is not a leaf node;
the determination unit is further configured to receive the first instruction and, according to the first instruction, determine one of the multiple nodes as the node to be updated, wherein the node to be updated is the new child node.
17. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed in a computer, it causes the computer to execute the method of any one of claims 1-8.
18. A computing device, including a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, the method of any one of claims 1-8 is implemented.
CN201811400469.9A 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph Active CN109635281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811400469.9A CN109635281B (en) 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph

Publications (2)

Publication Number Publication Date
CN109635281A true CN109635281A (en) 2019-04-16
CN109635281B CN109635281B (en) 2023-01-31

Family

ID=66069222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811400469.9A Active CN109635281B (en) 2018-11-22 2018-11-22 Method and device for updating nodes in traffic guide graph

Country Status (1)

Country Link
CN (1) CN109635281B (en)

Also Published As

Publication number Publication date
CN109635281B (en) 2023-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

TA01 Transfer of patent application right
GR01 Patent grant