CN106372055A - Semantic similarity processing method and system in natural language man-machine interaction - Google Patents
Semantic similarity processing method and system in natural language man-machine interaction
- Publication number
- CN106372055A (application CN201610709517.7A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- user input
- data base
- search data
- preliminary search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Abstract
The invention relates to a semantic similarity processing method and system in natural language man-machine interaction, belongs to the field of natural language man-machine interaction, and aims to solve the problem that man-machine interaction cannot be realized normally because existing man-machine interaction technology understands semantics with low accuracy. The method comprises the steps of: S1, establishing a preliminary query database and receiving a statement input by a user; S2, screening the statements in the preliminary query database according to the format of the statement input by the user; and S3, performing a semantic comparison between the statements screened out of the preliminary query database and the statement input by the user, and outputting a final result. The statements in the database are first preliminarily screened by the format of the statement input by the user, and the similarity between the statement input by the user and the question statements in the database is then determined by semantic similarity comparison, so that the accuracy of the robot's semantic understanding is increased by 10% to 25% and the man-machine conversation becomes more natural and fluent.
Description
Technical field
The present invention relates to the field of natural language man-machine interaction.
Background art
At present, in the field of human-computer interaction, when the similarity of two sentences is compared, the sentence pattern is not processed, and no attention is paid to the relations between the words of a sentence, or even to function words. For example, if "Between you and Xiao Ming, who is more formidable?" and "Xiao Ming is more formidable than you" are input to a robot, the robot cannot tell the two sentences apart. The robot also cannot distinguish some function words, for example the difference between "your What for" and "your What for" (the two original phrases differ only in a function word).
In the customer service field, however, and in robot question answering in particular, a robot that cannot accurately distinguish the meaning of two sentences cannot accurately understand the user's intent and cannot give the user a satisfactory answer. With current technology, the average accuracy of semantic understanding is only 64%, which is far from sufficient for normal man-machine interaction.
Summary of the invention
The technical problem to be solved by the present invention is to provide a semantic similarity processing method and system in man-machine natural language interaction, with the aim of solving the problem that existing human-computer interaction technology understands semantics with low accuracy, so that man-machine interaction cannot be realized normally.
The technical solution of the present invention is a semantic similarity processing method in man-machine natural language interaction, implemented as follows:
S1: establishing a preliminary search database and receiving a user input sentence;
S2: screening the sentences in the preliminary search database according to the form of the user input sentence;
S3: performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
Further, S2 specifically includes:
S21: extracting the subject, predicate and object of the user input sentence;
S22: comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
S23: screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
Further, S3 specifically includes:
S31: splitting the user input sentence into phrases;
S32: comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
S33: obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
The semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
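The definition above can be written as a formula. The symbols are introduced here only for illustration and do not appear in the original text: u is the user input sentence, d is a sentence screened out of the preliminary search database, and P(s) is the list of phrases obtained by splitting a sentence s.

```latex
\mathrm{sim}(u, d) = \frac{\left|\{\, p \in P(u) : p \in P(d) \,\}\right|}{\left|P(u)\right|}
```

Read this way, the value lies between 0 and 1 and is normalized by the length of the user input only, so it is not symmetric in u and d.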
The beneficial effects of the present invention are as follows: the invention first performs a preliminary screening of the sentences in the database according to the form of the user input sentence, then uses a semantic similarity comparison to compare the user input sentence with the question sentences in the database, and outputs the best result to the user. This improves the accuracy of the robot's semantic understanding by 10% to 25% and makes the man-machine dialogue more natural and fluent.
A semantic similarity processing system in man-machine natural language interaction, the system including:
a database module, for establishing a preliminary search database and receiving a user input sentence;
a sentence screening module, for screening the sentences in the preliminary search database according to the form of the user input sentence;
a semantic comparison module, for performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
Further, the sentence screening module includes:
a sentence extraction module, for extracting the subject, predicate and object of the user input sentence;
a form comparison module, for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
a screening result acquisition module, for screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
Further, the semantic comparison module includes:
a phrase splitting module, for splitting the user input sentence into phrases;
a phrase comparison module, for comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
a result output module, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
Further, the semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
Brief description of the drawings
Fig. 1 is a flow chart of the semantic similarity processing method in man-machine natural language interaction according to an embodiment of the present invention;
Fig. 2 is a flow chart of screening the sentences in the preliminary search database according to the form of the user input sentence, according to an embodiment of the present invention;
Fig. 3 is a flow chart of the semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the semantic similarity processing system in man-machine natural language interaction according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the sentence screening module 2 according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of the semantic comparison module 3 according to an embodiment of the present invention.
In the drawings, the components denoted by the reference numerals are as follows:
1, database module; 2, sentence screening module; 3, semantic comparison module; 4, sentence extraction module; 5, form comparison module; 6, screening result acquisition module; 7, phrase splitting module; 8, phrase comparison module; 9, result output module.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
Embodiment 1
As shown in Fig. 1, this embodiment proposes a semantic similarity processing method in man-machine natural language interaction, implemented as follows:
S1: establishing a preliminary search database and receiving a user input sentence;
S2: screening the sentences in the preliminary search database according to the form of the user input sentence;
S3: performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
In this embodiment, processing of the user input sentence begins by extracting the form of the input sentence: the trunk of the question is extracted and compared, which provides a first round of screening. The detailed process is shown in Fig. 2:
S21: extracting the subject, predicate and object of the user input sentence;
S22: comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
S23: screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
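Steps S22 and S23 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the text does not say how the subject, predicate and object are extracted in S21, so the sketch assumes they are already available, and the names `SVO`, `screen_by_svo` and the sample sentences are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class SVO(NamedTuple):
    """Sentence trunk: subject, predicate and object (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


def screen_by_svo(user_svo: SVO, database: List[Tuple[str, SVO]]) -> List[str]:
    """S22/S23: keep the database sentences whose trunk equals the user's trunk."""
    return [sentence for sentence, svo in database if svo == user_svo]


# Hypothetical database entries; the trunks are assumed to come from a parser (S21).
database = [
    ("I want to return this phone", SVO("I", "return", "phone")),
    ("I really need to return the phone today", SVO("I", "return", "phone")),
    ("I want to buy a phone", SVO("I", "buy", "phone")),
]

screened = screen_by_svo(SVO("I", "return", "phone"), database)
print(screened)
```

Only the two "return" sentences survive the screening, so the later phrase comparison runs on a much smaller candidate set.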
After the preliminary screening, the database may still contain many questions whose subject, predicate and object are the same as those of the user input sentence. Compared with the whole preliminary search database, however, the sentences that correspond to the user input sentence are very few, so comparing the phrases of each screened sentence with the phrases of the user input sentence takes very little work and is fast. The detailed process is shown in Fig. 3:
S31: splitting the user input sentence into phrases;
S32: comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
S33: obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
The semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
For example, suppose the user input sentence contains ten words, a1+a2+a3+a4+a5+a6+a7+a8+a9+a0, and a sentence in the preliminary search database with the same subject, predicate and object contains five words that are exactly the same as words in the user input sentence: a2+a3+a4+a5+a6. Because the two sentences have the same sentence pattern, their similarity is taken to be 50%. If four words are identical, the similarity is 40%, and so on. The user input sentence is compared semantically with the sentences in the preliminary search database, the sentence with the highest semantic similarity value is selected, and that sentence is the final output sentence.
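The worked example above can be reproduced with a short Python sketch. It assumes the phrase splitting of S31 has already been done and treats the screened sentences as a mapping from a label to its phrase list; the function names and the labels are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def semantic_similarity(user_phrases: List[str], candidate_phrases: List[str]) -> float:
    """Identical phrases divided by the number of phrases in the user input."""
    if not user_phrases:
        return 0.0
    candidate_set = set(candidate_phrases)
    identical = sum(1 for phrase in user_phrases if phrase in candidate_set)
    return identical / len(user_phrases)


def best_match(user_phrases: List[str],
               screened: Dict[str, List[str]]) -> Optional[Tuple[str, float]]:
    """Return the screened sentence with the highest similarity value, if any."""
    if not screened:
        return None
    scores = {s: semantic_similarity(user_phrases, p) for s, p in screened.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


user = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9", "a0"]
screened = {
    "five shared": ["a2", "a3", "a4", "a5", "a6"],
    "four shared": ["a2", "a3", "a4", "a5"],
}
print(semantic_similarity(user, screened["five shared"]))  # 0.5
print(best_match(user, screened))  # ('five shared', 0.5)
```

How ties between equally similar sentences should be broken is not stated in the text; this sketch simply keeps the first one encountered.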
Embodiment 2
As shown in Fig. 4, this embodiment proposes a semantic similarity processing system in man-machine natural language interaction, the system including:
a database module 1, for establishing a preliminary search database and receiving a user input sentence;
a sentence screening module 2, for screening the sentences in the preliminary search database according to the form of the user input sentence;
a semantic comparison module 3, for performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
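One way to picture how the three modules fit together is the Python sketch below. It is an illustration under stated assumptions rather than the system of Fig. 4: the class names, the `Sentence` record and the `respond` helper are hypothetical, and each sentence is assumed to arrive with its subject, predicate and object already extracted and its phrases already split.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Sentence:
    text: str
    svo: Tuple[str, str, str]  # (subject, predicate, object), assumed pre-extracted
    phrases: List[str]         # assumed pre-split


@dataclass
class DatabaseModule:
    """Module 1: holds the preliminary search database."""
    sentences: List[Sentence] = field(default_factory=list)


class SentenceScreeningModule:
    """Module 2: keeps sentences with the same subject, predicate and object."""
    def screen(self, user: Sentence, db: DatabaseModule) -> List[Sentence]:
        return [s for s in db.sentences if s.svo == user.svo]


class SemanticComparisonModule:
    """Module 3: picks the screened sentence with the highest similarity value."""
    def compare(self, user: Sentence, screened: List[Sentence]) -> Optional[Sentence]:
        if not screened or not user.phrases:
            return None

        def score(candidate: Sentence) -> float:
            known = set(candidate.phrases)
            return sum(p in known for p in user.phrases) / len(user.phrases)

        return max(screened, key=score)


def respond(user: Sentence, db: DatabaseModule) -> Optional[Sentence]:
    """Chain module 2 and module 3 over the database held by module 1."""
    screened = SentenceScreeningModule().screen(user, db)
    return SemanticComparisonModule().compare(user, screened)


db = DatabaseModule([
    Sentence("how do I reset my password", ("I", "reset", "password"),
             ["how", "do", "I", "reset", "my", "password"]),
    Sentence("can I reset the password by email", ("I", "reset", "password"),
             ["can", "I", "reset", "the", "password", "by", "email"]),
    Sentence("how do I change my address", ("I", "change", "address"),
             ["how", "do", "I", "change", "my", "address"]),
])
user = Sentence("how can I reset my password", ("I", "reset", "password"),
                ["how", "can", "I", "reset", "my", "password"])
print(respond(user, db).text)
```

Keeping screening and comparison in separate classes mirrors the module split in the text, so either stage could be replaced without touching the other.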
Preferably, as shown in Fig. 5, the sentence screening module 2 includes:
a sentence extraction module 4, for extracting the subject, predicate and object of the user input sentence;
a form comparison module 5, for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
a screening result acquisition module 6, for screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
Preferably, as shown in Fig. 6, the semantic comparison module 3 includes:
a phrase splitting module 7, for splitting the user input sentence into phrases;
a phrase comparison module 8, for comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
a result output module 9, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
Preferably, the semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
This embodiment first performs a preliminary screening of the sentences in the database according to the form of the user input sentence, then uses a semantic similarity comparison to compare the user input sentence with the question sentences in the database, and outputs the best result to the user. This improves the accuracy of the robot's semantic understanding by 10% to 25% and makes the man-machine dialogue more natural and fluent.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A semantic similarity processing method in man-machine natural language interaction, characterized in that it is implemented as follows:
S1: establishing a preliminary search database and receiving a user input sentence;
S2: screening the sentences in the preliminary search database according to the form of the user input sentence;
S3: performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
2. The semantic similarity processing method in man-machine natural language interaction according to claim 1, characterized in that S2 specifically includes:
S21: extracting the subject, predicate and object of the user input sentence;
S22: comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
S23: screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
3. The semantic similarity processing method in man-machine natural language interaction according to claim 2, characterized in that S3 specifically includes:
S31: splitting the user input sentence into phrases;
S32: comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
S33: obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
4. The semantic similarity processing method in man-machine natural language interaction according to claim 3, characterized in that the semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
5. A semantic similarity processing system in man-machine natural language interaction, characterized in that it includes:
a database module (1), for establishing a preliminary search database and receiving a user input sentence;
a sentence screening module (2), for screening the sentences in the preliminary search database according to the form of the user input sentence;
a semantic comparison module (3), for performing a semantic comparison between the sentences screened out of the preliminary search database and the user input sentence, and outputting the final result.
6. The semantic similarity processing system in man-machine natural language interaction according to claim 5, characterized in that the sentence screening module (2) includes:
a sentence extraction module (4), for extracting the subject, predicate and object of the user input sentence;
a form comparison module (5), for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;
a screening result acquisition module (6), for screening out the sentences in the preliminary search database that have the same subject, predicate and object as the user input sentence.
7. The semantic similarity processing system in man-machine natural language interaction according to claim 6, characterized in that the semantic comparison module (3) includes:
a phrase splitting module (7), for splitting the user input sentence into phrases;
a phrase comparison module (8), for comparing every phrase in the user input sentence with the phrases contained in each sentence screened out of the preliminary search database;
a result output module (9), for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.
8. The semantic similarity processing system in man-machine natural language interaction according to claim 7, characterized in that the semantic similarity value is obtained as follows: after the user input sentence is compared with a sentence in the preliminary search database, the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610709517.7A CN106372055B (en) | 2016-08-23 | 2016-08-23 | Semantic similarity processing method and system in natural language man-machine interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610709517.7A CN106372055B (en) | 2016-08-23 | 2016-08-23 | Semantic similarity processing method and system in natural language man-machine interaction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106372055A (en) | 2017-02-01 |
CN106372055B (en) | 2019-10-29 |
Family
ID=57879031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610709517.7A Active CN106372055B (en) | Semantic similarity processing method and system in natural language man-machine interaction | 2016-08-23 | 2016-08-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106372055B (en) |
- 2016-08-23: application CN201610709517.7A filed in CN; patent CN106372055B, status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1928864A (en) * | 2006-09-22 | 2007-03-14 | 浙江大学 | FAQ based Chinese natural language ask and answer method |
JP2008253551A (en) * | 2007-04-05 | 2008-10-23 | Toshiba Corp | Image reading report search apparatus |
CN101286161A (en) * | 2008-05-28 | 2008-10-15 | 华中科技大学 | Intelligent Chinese request-answering system based on concept |
Non-Patent Citations (1)
Title |
---|
李静静: "导游对话系统的相关技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109815484A (en) * | 2018-12-21 | 2019-05-28 | 平安科技(深圳)有限公司 | Based on the semantic similarity matching process and its coalignment for intersecting attention mechanism |
CN109815484B (en) * | 2018-12-21 | 2022-03-15 | 平安科技(深圳)有限公司 | Semantic similarity matching method and matching device based on cross attention mechanism |
CN110019688A (en) * | 2019-01-23 | 2019-07-16 | 艾肯特公司 | The method that robot is trained |
Also Published As
Publication number | Publication date |
---|---|
CN106372055B (en) | 2019-10-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106649742A (en) | Database maintenance method and device | |
CN106776713A (en) | It is a kind of based on this clustering method of the Massive short documents of term vector semantic analysis | |
CN106202476A (en) | A kind of interactive method and device of knowledge based collection of illustrative plates | |
CN104657346A (en) | Question matching system and question matching system in intelligent interaction system | |
CN105677783A (en) | Information processing method and device for intelligent question-answering system | |
CN106557508A (en) | A kind of text key word extracting method and device | |
CN103605691B (en) | Device and method used for processing issued contents in social network | |
CN102509001A (en) | Method for automatically removing time sequence data outlier point | |
CN109558482B (en) | Parallelization method of text clustering model PW-LDA based on Spark framework | |
CN106847279A (en) | Man-machine interaction method based on robot operating system ROS | |
CN105868311A (en) | Data analyzing method and device | |
CN110287313A (en) | A kind of the determination method and server of risk subject | |
KR20210106372A (en) | New category tag mining method and device, electronic device and computer-readable medium | |
CN114528312A (en) | Method and device for generating structured query language statement | |
CN109033282A (en) | A kind of Web page text extracting method and device based on extraction template | |
CN110175585A (en) | It is a kind of letter answer correct system and method automatically | |
CN108009715A (en) | It is a kind of automatically analyze index fluctuation root because method | |
CN105653620A (en) | Log analysis method and device of intelligent question answering system | |
CN107341142B (en) | Enterprise relation calculation method and system based on keyword extraction and analysis | |
CN104007836A (en) | Handwriting input processing method and terminal device | |
CN105095436A (en) | Automatic modeling method for data of data sources | |
CN106372055A (en) | Semantic similarity processing method and system in natural language man-machine interaction | |
CN103020117A (en) | Service contrast method and service contrast system | |
CN110275938B (en) | Knowledge extraction method and system based on unstructured document | |
Pandey et al. | Sentiment analysis using lexicon based approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||