CN106372055A - Semantic similarity processing method and system in natural language man-machine interaction - Google Patents

Semantic similarity processing method and system in natural language man-machine interaction

Info

Publication number
CN106372055A
Authority
CN
China
Prior art keywords
sentence
user input
data base
search data
preliminary search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610709517.7A
Other languages
Chinese (zh)
Other versions
CN106372055B (en)
Inventor
彭军辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Listening Robot Technology Co Ltd
Original Assignee
Beijing Listening Robot Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Listening Robot Technology Co Ltd
Priority to CN201610709517.7A
Publication of CN106372055A
Application granted
Publication of CN106372055B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Abstract

The invention relates to a semantic similarity processing method and system in natural language man-machine interaction, in the field of natural language man-machine interaction, and aims to solve the problem that man-machine interaction cannot be carried out normally because existing man-machine interaction technology has low accuracy in semantic understanding. The method comprises the steps of: S1, establishing a preliminary query database and receiving a sentence input by a user; S2, screening the sentences in the preliminary query database according to the format of the sentence input by the user; and S3, performing semantic comparison between the sentences screened out of the preliminary query database and the sentence input by the user, and outputting a final result. The sentences in the database are first preliminarily screened by the format of the user's input sentence, and the similarity between the user's input sentence and the question sentences in the database is then compared by semantic similarity comparison, so that the accuracy of the robot's semantic understanding is improved by 10% to 25% and the man-machine conversation becomes more natural and fluent.

Description

Semantic similarity processing method and system in natural language man-machine interaction
Technical field
The present invention relates to the field of natural language human-machine interaction.
Background
At present, in the field of human-computer interaction, when the similarity of two sentences is compared, the sentence pattern is not processed, the relations between the words in a sentence are not considered, and function words are even ignored. For example, given the inputs "Who is more capable, you or Xiao Ming?" and "Who is more capable, Xiao Ming or you?", a robot cannot tell the difference between the two sentences. Nor can a robot distinguish sentences that differ only in certain function words, for example the difference between "your What for" and "your What for".
In the customer service field, however, and in robot question answering in particular, whenever the robot cannot accurately distinguish the meanings of two sentences, it cannot accurately understand the user's intent and cannot give an answer that satisfies the customer. With current technology, the average accuracy of semantic understanding is only 64%, which is far from sufficient for normal human-machine interaction.
Summary of the invention
The technical problem to be solved by the present invention is to provide a semantic similarity processing method and system in natural language man-machine interaction, in order to solve the problem that existing human-computer interaction technology has low accuracy in semantic understanding, so that man-machine interaction cannot be carried out normally.
The technical solution of the present invention is a semantic similarity processing method in natural language man-machine interaction, implemented as follows:
S1, establishing a preliminary query database and receiving the sentence input by the user;
S2, screening the sentences in the preliminary query database according to the format of the user's input sentence;
S3, performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
Further, the specific implementation of S2 includes:
S21, extracting the subject, predicate and object of the user's input sentence;
S22, comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
S23, screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
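Steps S22 and S23 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `SVO` container, the `screen_by_svo` function and the toy database are hypothetical names introduced here, and the subject, predicate and object of each sentence are assumed to have been extracted already (S21), since the text does not say which parser is used.

```python
from typing import Dict, List, NamedTuple


class SVO(NamedTuple):
    """Subject, predicate and object of one sentence (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


def screen_by_svo(user_svo: SVO, database: Dict[str, SVO]) -> List[str]:
    """S22/S23: keep the stored sentences whose triple equals the user's triple."""
    return [sentence for sentence, svo in database.items() if svo == user_svo]


# Toy database. The triples are written by hand, standing in for step S21.
database = {
    "the robot answers the question quickly": SVO("robot", "answers", "question"),
    "the robot answers the question": SVO("robot", "answers", "question"),
    "the user asks the question": SVO("user", "asks", "question"),
}

candidates = screen_by_svo(SVO("robot", "answers", "question"), database)
print(candidates)
```

Only the two sentences about the robot survive the screening, so the later phrase comparison has to look at two candidates instead of the whole database.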
Further, the specific implementation of S3 includes:
S31, splitting the user's input sentence into phrases;
S32, comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
S33, obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
The semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
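The definition of the similarity value can be written compactly. The notation is introduced here for illustration and does not appear in the original: $u$ is the user's input sentence, $d$ is a screened database sentence, and $P(\cdot)$ is the set of phrases of a sentence. Treating the phrases as a set is an assumption, because the text does not say how repeated phrases are counted.

```latex
\[
  \operatorname{sim}(u, d) \;=\; \frac{\lvert P(u) \cap P(d) \rvert}{\lvert P(u) \rvert}
\]
```

Under this reading the value lies between 0 and 1 and is not symmetric, because the denominator counts only the phrases of the user's input sentence.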
The beneficial effects of the invention are as follows: the sentences in the database are first preliminarily screened by the format of the user's input sentence, the similarity between the user's input sentence and the question sentences in the database is then compared by semantic similarity comparison, and the best result is output to the user. This improves the accuracy of the robot's semantic understanding by 10% to 25% and makes the human-machine conversation more natural and fluent.
A semantic similarity processing system in natural language man-machine interaction, the system including:
a database module, for establishing a preliminary query database and receiving the sentence input by the user;
a sentence screening module, for screening the sentences in the preliminary query database according to the format of the user's input sentence;
a semantic comparison module, for performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
Further, the sentence screening module includes:
a sentence extraction module, for extracting the subject, predicate and object of the user's input sentence;
a format comparison module, for comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
a screening result acquisition module, for screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
Further, the semantic comparison module includes:
a phrase splitting module, for splitting the user's input sentence into phrases;
a phrase comparison module, for comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
a result output module, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
Further, the semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
Brief description of the drawings
Fig. 1 is a flow chart of the semantic similarity processing method in natural language man-machine interaction according to an embodiment of the present invention;
Fig. 2 is a flow chart of screening the sentences in the preliminary query database according to the format of the user's input sentence, according to an embodiment of the present invention;
Fig. 3 is a flow chart of the semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the semantic similarity processing system in natural language man-machine interaction according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the sentence screening module 2 according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of the semantic comparison module 3 according to an embodiment of the present invention.
In the drawings, the parts represented by the reference numerals are as follows:
1, database module; 2, sentence screening module; 3, semantic comparison module; 4, sentence extraction module; 5, format comparison module; 6, screening result acquisition module; 7, phrase splitting module; 8, phrase comparison module; 9, result output module.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the drawings. The examples serve only to explain the present invention and are not intended to limit its scope.
Embodiment 1
As shown in Fig. 1, this embodiment proposes a semantic similarity processing method in natural language man-machine interaction, implemented as follows:
S1, establishing a preliminary query database and receiving the sentence input by the user;
S2, screening the sentences in the preliminary query database according to the format of the user's input sentence;
S3, performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
In this embodiment, at the start of processing the user's input sentence, the format of the input sentence is extracted first, and a first round of screening is performed by comparing the extracted main constituents of the question. The specific process is shown in Fig. 2:
S21, extracting the subject, predicate and object of the user's input sentence;
S22, comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
S23, screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
After the preliminary screening, the database may still contain many questions whose subject, predicate and object are the same as those of the user's input sentence. Comparatively speaking, however, few sentences in the preliminary query database correspond to the user's input sentence, so comparing the phrases of each screened-out sentence with the phrases of the user's input sentence involves very little work and is fast. The specific process is shown in Fig. 3:
S31, splitting the user's input sentence into phrases;
S32, comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
S33, obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
Here, the semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
For example, suppose the user's input sentence contains ten words, a1+a2+a3+a4+a5+a6+a7+a8+a9+a0, and a sentence in the preliminary query database with the same subject, predicate and object contains five words that are identical to words in the user's input sentence: a2+a3+a4+a5+a6. Because the two sentences have the same sentence pattern, their similarity is taken to be 50%; if four words are identical, the similarity is 40%, and so on. The user's input sentence is semantically compared with the sentences in the preliminary query database, the sentence with the highest semantic similarity value is selected, and that sentence is the final output sentence.
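The arithmetic of this example can be checked with a short sketch. The function name `similarity` and the labels `five_shared` and `four_shared` are hypothetical, and counting each user phrase at most once is an assumption; the text only gives the ratio of identical phrases to the phrases of the user's input sentence.

```python
from typing import Dict, List


def similarity(user_phrases: List[str], candidate_phrases: List[str]) -> float:
    """Identical phrases divided by the number of phrases in the user's sentence."""
    if not user_phrases:
        return 0.0  # assumption: an empty input matches nothing
    candidate_set = set(candidate_phrases)
    shared = sum(1 for phrase in user_phrases if phrase in candidate_set)
    return shared / len(user_phrases)


# Ten-word user sentence from the example: a1 ... a9, a0.
user = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9", "a0"]

# Two screened candidates: one shares five words, the other shares four.
candidates: Dict[str, List[str]] = {
    "five_shared": ["a2", "a3", "a4", "a5", "a6"],
    "four_shared": ["a2", "a3", "a4", "a5"],
}

scores = {name: similarity(user, phrases) for name, phrases in candidates.items()}
best = max(scores, key=scores.get)  # sentence with the highest similarity value

print(scores)  # {'five_shared': 0.5, 'four_shared': 0.4}
print(best)    # five_shared
```

The candidate sharing five of the ten words scores 0.5 and is selected, which matches the 50% and 40% figures in the example.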
Embodiment 2
As shown in Fig. 4, this embodiment proposes a semantic similarity processing system in natural language man-machine interaction, the system including:
a database module 1, for establishing a preliminary query database and receiving the sentence input by the user;
a sentence screening module 2, for screening the sentences in the preliminary query database according to the format of the user's input sentence;
a semantic comparison module 3, for performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
Preferably, as shown in Fig. 5, the sentence screening module 2 includes:
a sentence extraction module 4, for extracting the subject, predicate and object of the user's input sentence;
a format comparison module 5, for comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
a screening result acquisition module 6, for screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
Preferably, as shown in Fig. 6, the semantic comparison module 3 includes:
a phrase splitting module 7, for splitting the user's input sentence into phrases;
a phrase comparison module 8, for comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
a result output module 9, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
Preferably, the semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
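One way the modules could be wired together is sketched below. The class and function names are hypothetical, and the two helpers `toy_svo` and `toy_split` are deliberately naive stand-ins (first three words, whitespace split) for the extraction module 4 and the phrase splitting module 7, whose real behaviour the text does not specify.

```python
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, ...]


class DatabaseModule:  # module 1
    def __init__(self, sentences: List[str]) -> None:
        self.sentences = list(sentences)


class SentenceScreeningModule:  # module 2 (wraps modules 4, 5 and 6)
    def __init__(self, extract_svo: Callable[[str], Triple]) -> None:
        self.extract_svo = extract_svo  # stands in for extraction module 4

    def screen(self, user_sentence: str, stored: List[str]) -> List[str]:
        target = self.extract_svo(user_sentence)
        # modules 5 and 6: compare triples and keep exact matches
        return [s for s in stored if self.extract_svo(s) == target]


class SemanticComparisonModule:  # module 3 (wraps modules 7, 8 and 9)
    def __init__(self, split_phrases: Callable[[str], List[str]]) -> None:
        self.split_phrases = split_phrases  # stands in for splitting module 7

    def best_match(self, user_sentence: str, screened: List[str]) -> Optional[str]:
        user_phrases = self.split_phrases(user_sentence)
        if not user_phrases or not screened:
            return None

        def score(candidate: str) -> float:
            phrases = set(self.split_phrases(candidate))
            return sum(p in phrases for p in user_phrases) / len(user_phrases)

        return max(screened, key=score)  # module 9: highest similarity wins


def toy_svo(sentence: str) -> Triple:
    """Naive stand-in: treat the first three words as subject, predicate, object."""
    words = sentence.split()
    return tuple((words + ["", "", ""])[:3])


def toy_split(sentence: str) -> List[str]:
    """Naive stand-in: one phrase per whitespace-separated word."""
    return sentence.split()


def respond(user_sentence: str, db: DatabaseModule,
            screening: SentenceScreeningModule,
            comparison: SemanticComparisonModule) -> Optional[str]:
    screened = screening.screen(user_sentence, db.sentences)
    return comparison.best_match(user_sentence, screened)


db = DatabaseModule([
    "robots answer questions politely today",
    "robots answer questions",
    "users ask questions",
])
screening = SentenceScreeningModule(toy_svo)
comparison = SemanticComparisonModule(toy_split)
answer = respond("robots answer questions politely", db, screening, comparison)
print(answer)
```

Passing the extraction and splitting steps in as callables keeps the sketch independent of any particular parser or word segmenter.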
In this embodiment, the sentences in the database are first preliminarily screened by the format of the user's input sentence, the similarity between the user's input sentence and the question sentences in the database is then compared by semantic similarity comparison, and the best result is output to the user. This improves the accuracy of the robot's semantic understanding by 10% to 25% and makes the human-machine interaction more natural and fluent.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (8)

1. A semantic similarity processing method in natural language man-machine interaction, characterized in that it is implemented as follows:
S1, establishing a preliminary query database and receiving the sentence input by the user;
S2, screening the sentences in the preliminary query database according to the format of the user's input sentence;
S3, performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
2. The semantic similarity processing method in natural language man-machine interaction according to claim 1, characterized in that the specific implementation of S2 includes:
S21, extracting the subject, predicate and object of the user's input sentence;
S22, comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
S23, screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
3. The semantic similarity processing method in natural language man-machine interaction according to claim 2, characterized in that the specific implementation of S3 includes:
S31, splitting the user's input sentence into phrases;
S32, comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
S33, obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
4. The semantic similarity processing method in natural language man-machine interaction according to claim 3, characterized in that the semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
5. A semantic similarity processing system in natural language man-machine interaction, characterized in that it includes:
a database module (1), for establishing a preliminary query database and receiving the sentence input by the user;
a sentence screening module (2), for screening the sentences in the preliminary query database according to the format of the user's input sentence;
a semantic comparison module (3), for performing semantic comparison between the sentences screened out of the preliminary query database and the user's input sentence, and outputting the final result.
6. The semantic similarity processing system in natural language man-machine interaction according to claim 5, characterized in that the sentence screening module (2) includes:
a sentence extraction module (4), for extracting the subject, predicate and object of the user's input sentence;
a format comparison module (5), for comparing the subject, predicate and object of the user's input sentence with the subject, predicate and object of every sentence in the preliminary query database;
a screening result acquisition module (6), for screening out the sentences in the preliminary query database that have the same subject, predicate and object as the user's input sentence.
7. The semantic similarity processing system in natural language man-machine interaction according to claim 6, characterized in that the semantic comparison module (3) includes:
a phrase splitting module (7), for splitting the user's input sentence into phrases;
a phrase comparison module (8), for comparing every phrase in the user's input sentence with the phrases contained in each sentence screened out of the preliminary query database;
a result output module (9), for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user's input sentence and the preliminary query database, and outputting the final result according to the semantic similarity values.
8. The semantic similarity processing system in natural language man-machine interaction according to claim 7, characterized in that the semantic similarity value is obtained as follows: after the user's input sentence is compared with a sentence in the preliminary query database, the number of identical phrases divided by the total number of phrases in the user's input sentence is the semantic similarity value.
CN201610709517.7A 2016-08-23 2016-08-23 Semantic similarity processing method and system in natural language man-machine interaction Active CN106372055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610709517.7A CN106372055B (en) 2016-08-23 2016-08-23 Semantic similarity processing method and system in natural language man-machine interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610709517.7A CN106372055B (en) 2016-08-23 2016-08-23 Semantic similarity processing method and system in natural language man-machine interaction

Publications (2)

Publication Number Publication Date
CN106372055A (en) 2017-02-01
CN106372055B (en) 2019-10-29

Family

ID=57879031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610709517.7A Active CN106372055B (en) 2016-08-23 2016-08-23 Semantic similarity processing method and system in natural language man-machine interaction

Country Status (1)

Country Link
CN (1) CN106372055B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1928864A (en) * 2006-09-22 2007-03-14 浙江大学 FAQ based Chinese natural language ask and answer method
JP2008253551A (en) * 2007-04-05 2008-10-23 Toshiba Corp Image reading report search apparatus
CN101286161A (en) * 2008-05-28 2008-10-15 华中科技大学 Intelligent Chinese request-answering system based on concept

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李静静: "导游对话系统的相关技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815484A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 Semantic similarity matching method and matching device based on cross attention mechanism
CN109815484B (en) * 2018-12-21 2022-03-15 平安科技(深圳)有限公司 Semantic similarity matching method and matching device based on cross attention mechanism
CN110019688A (en) * 2019-01-23 2019-07-16 艾肯特公司 Method for training a robot

Also Published As

Publication number Publication date
CN106372055B (en) 2019-10-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant