CN106372055B

CN106372055B - Semantic similarity processing method and system for human-machine natural language interaction

Info

Publication number: CN106372055B
Application number: CN201610709517.7A
Authority: CN
Inventors: 彭军辉
Original assignee: Beijing Listening Robot Technology Co Ltd
Current assignee: Beijing Listening Robot Technology Co Ltd
Priority date: 2016-08-23
Filing date: 2016-08-23
Publication date: 2019-10-29
Anticipated expiration: 2036-08-23
Also published as: CN106372055A

Abstract

The present invention relates to a semantic similarity processing method and system for human-machine natural language interaction, in the field of natural language human-machine interaction. Its purpose is to solve the problem that existing human-machine interaction technology has low accuracy in semantic understanding, so that human-machine interaction cannot proceed normally. The method is implemented as follows: S1, establish a preliminary search database and receive the user input sentence; S2, screen the sentences in the preliminary search database according to the format of the user input sentence; S3, semantically compare the sentences screened out of the preliminary search database with the user input sentence, and output the final result. The invention first uses the format of the user input sentence to perform a preliminary screening of the sentences in the database, and then uses a semantic similarity comparison to measure the similarity between the user input sentence and the question sentences in the database. This improves the robot's accuracy in semantic understanding by 10% to 25% and makes the human-machine dialogue more natural and fluent.

Description

Semantic similarity processing method and system for human-machine natural language interaction

Technical field

The present invention relates to the field of natural language human-machine interaction.

Background

At present, in the field of human-machine interaction, when the similarity of two sentences is compared, the sentence pattern is not processed, the relationships between the words in the sentence are ignored, and even function words are ignored. For example, if "you and Xiao Ming are more severe than whom" and "Xiao Ming with you than more severe" are input to a robot, the robot cannot tell the two sentences apart. The robot also cannot distinguish some function words, such as the difference between "your What for" and "your What for".

In customer service and robot question answering, however, if the robot cannot accurately distinguish the meanings of two sentences, it cannot accurately understand the user's intent and cannot give an answer that satisfies the customer. With current technology, the average accuracy of semantic understanding is only 64%, which falls far short of what normal human-machine interaction requires.

Summary of the invention

The technical problem to be solved by the invention is to provide a semantic similarity processing method and system for human-machine natural language interaction, with the aim of solving the problem that existing human-machine interaction technology has low accuracy in semantic understanding, so that human-machine interaction cannot proceed normally.

The technical solution of the invention to the above technical problem is a semantic similarity processing method for human-machine natural language interaction, implemented as follows:

S1, establish a preliminary search database and receive the user input sentence;

S2, screen the sentences in the preliminary search database according to the format of the user input sentence;

S3, semantically compare the sentences screened out of the preliminary search database with the user input sentence, and output the final result.
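Steps S1 to S3 can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: the `Entry` layout, the function names, and the assumption that every sentence arrives already annotated with its (subject, predicate, object) triple and its phrase split are all hypothetical, since the text does not say how parsing or phrase splitting is performed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A sentence with pre-annotated structure (hypothetical layout)."""
    text: str
    spo: tuple[str, str, str]   # (subject, predicate, object)
    phrases: tuple[str, ...]    # phrase split of the sentence


def screen(database: list[Entry], user: Entry) -> list[Entry]:
    """S2: keep entries whose subject, predicate and object match the input."""
    return [e for e in database if e.spo == user.spo]


def similarity(user: Entry, candidate: Entry) -> float:
    """S3: shared phrases divided by the number of phrases in the user input."""
    if not user.phrases:
        return 0.0
    shared = sum(1 for p in user.phrases if p in candidate.phrases)
    return shared / len(user.phrases)


def best_match(database: list[Entry], user: Entry) -> Entry | None:
    """S1-S3: screen by format, then return the most similar remaining entry."""
    candidates = screen(database, user)
    if not candidates:
        return None
    return max(candidates, key=lambda e: similarity(user, e))


# S1: a toy preliminary search database and a user input sentence.
database = [
    Entry("a", ("you", "like", "tea"), ("you", "really", "like", "green", "tea")),
    Entry("b", ("you", "like", "tea"), ("you", "like", "tea")),
    Entry("c", ("he", "like", "tea"), ("he", "really", "like", "green", "tea")),
]
user = Entry("q", ("you", "like", "tea"), ("you", "really", "like", "tea"))
result = best_match(database, user)
```

Screening runs first so that the phrase comparison only touches sentences that already share the input's grammatical core.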

Further, the specific implementation of S2 includes:

S21, extract the subject, predicate and object of the user input sentence;

S22, compare the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;

S23, screen out of the preliminary search database the sentences whose subject, predicate and object are identical to those of the user input sentence.
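A minimal sketch of S21 to S23, under the assumption that each database sentence is stored together with an already extracted (subject, predicate, object) triple. The text does not specify how the triple is extracted, so extraction is left outside the sketch, and the names `build_index` and `screen_by_format` are hypothetical. Grouping the sentences by triple is a design choice made here so that screening becomes a single dictionary lookup instead of a scan over every sentence.

```python
from collections import defaultdict


def build_index(sentences):
    """Group database sentences by their (subject, predicate, object) triple.

    `sentences` is an iterable of (text, triple) pairs.
    """
    index = defaultdict(list)
    for text, triple in sentences:
        index[tuple(triple)].append(text)
    return index


def screen_by_format(index, user_triple):
    """S22 and S23: return the sentences whose triple equals the user's triple."""
    return list(index.get(tuple(user_triple), []))


sentences = [
    ("I like tea", ("I", "like", "tea")),
    ("I like green tea", ("I", "like", "tea")),
    ("you like tea", ("you", "like", "tea")),
]
index = build_index(sentences)
matches = screen_by_format(index, ("I", "like", "tea"))
```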

Further, the specific implementation of S3 includes:

S31, split the user input sentence into phrases;

S32, compare each phrase of the user input sentence with the phrases contained in the sentences screened out of the preliminary search database;

S33, obtain the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and output the final result according to the semantic similarity values.

The semantic similarity value is obtained as follows: the user input sentence is compared with each sentence in the preliminary search database, and the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
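The definition above can be written as a formula. The symbols are introduced here only for illustration and do not appear in the original: u is the user input sentence, s is a screened database sentence, and P(x) is the set of phrases obtained by splitting sentence x. Treating the phrases as a set is an assumption; the text does not say how repeated phrases are counted.

```latex
\[
\operatorname{sim}(u, s) = \frac{\lvert P(u) \cap P(s) \rvert}{\lvert P(u) \rvert}
\]
```

Because the denominator is the phrase count of the user input sentence alone, the value is not symmetric in u and s.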

The beneficial effects of the invention are as follows: the invention first uses the format of the user input sentence to perform a preliminary screening of the sentences in the database, then uses a semantic similarity comparison to measure the similarity between the user input sentence and the question sentences in the database, and outputs the best result to the user. This improves the robot's accuracy in semantic understanding by 10% to 25% and makes the human-machine dialogue more natural and fluent.

A semantic similarity processing system for human-machine natural language interaction, the system including:

a database module, for establishing the preliminary search database and receiving the user input sentence;

a sentence screening module, for screening the sentences in the preliminary search database according to the format of the user input sentence;

a semantic comparison module, for semantically comparing the sentences screened out of the preliminary search database with the user input sentence, and outputting the final result.

Further, the sentence screening module includes:

a sentence extraction module, for extracting the subject, predicate and object of the user input sentence;

a format comparison module, for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;

a screening result acquisition module, for screening out of the preliminary search database the sentences whose subject, predicate and object are identical to those of the user input sentence.

Further, the semantic comparison module includes:

a phrase splitting module, for splitting the user input sentence into phrases;

a phrase comparison module, for comparing each phrase of the user input sentence with the phrases contained in the sentences screened out of the preliminary search database;

a result output module, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.

Further, the semantic similarity value is obtained as follows: the user input sentence is compared with each sentence in the preliminary search database, and the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.

Brief description of the drawings

Fig. 1 is a flowchart of the semantic similarity processing method for human-machine natural language interaction according to an embodiment of the invention;

Fig. 2 is a flowchart of screening the sentences in the preliminary search database according to the format of the user input sentence, according to an embodiment of the invention;

Fig. 3 is a flowchart of semantically comparing the sentences screened out of the preliminary search database with the user input sentence, according to an embodiment of the invention;

Fig. 4 is a schematic diagram of the semantic similarity processing system for human-machine natural language interaction according to an embodiment of the invention;

Fig. 5 is a schematic diagram of the sentence screening module 2 according to an embodiment of the invention;

Fig. 6 is a schematic diagram of the semantic comparison module 3 according to an embodiment of the invention.

In the drawings, the reference numerals denote the following parts:

1, database module; 2, sentence screening module; 3, semantic comparison module; 4, sentence extraction module; 5, format comparison module; 6, screening result acquisition module; 7, phrase splitting module; 8, phrase comparison module; 9, result output module.

Specific embodiments

The principles and features of the invention are described below with reference to the drawings. The examples given serve only to explain the invention and are not intended to limit its scope.

Embodiment 1

As shown in Fig. 1, this embodiment proposes a semantic similarity processing method for human-machine natural language interaction, implemented as follows:

In this embodiment, when processing of the user input sentence begins, the format of the input sentence is extracted first, and the main stem of the question is extracted and compared so as to carry out the first screening step; the detailed process is shown in Fig. 2:

S21, extract the subject, predicate and object of the user input sentence;

After the preliminary screening, the database may still contain many questions whose subject, predicate and object are identical to those of the user input sentence. By comparison with the whole database, however, the sentences in the preliminary search database that correspond to the user input sentence are few, so the workload of comparing the phrases of each screened sentence with the phrases of the user input sentence is very small and the comparison is fast. The detailed process is shown in Fig. 3:

S31, split the user input sentence into phrases;

The semantic similarity value is obtained as follows: the user input sentence is compared with each sentence in the preliminary search database, and the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.

For example, suppose the user input sentence has ten words, A1+A2+A3+A4+A5+A6+A7+A8+A9+A0, and a sentence in the preliminary search database with the same subject, predicate and object has five words that are exactly the same as words in the user input sentence: A2+A3+A4+A5+A6. Since the two sentences have the same sentence pattern, their similarity is taken to be 50%; if four words are identical, the similarity is 40%, and so on. The user input sentence is semantically compared with the sentences in the preliminary search database, the sentence with the highest semantic similarity value is selected, and that sentence is the final output sentence.
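The arithmetic of this example can be checked with a short sketch. The function name `semantic_similarity` and the use of plain string tokens are assumptions for illustration; a phrase counts as identical here when the same token appears anywhere in the candidate sentence, which is one reading of "identical phrase number".

```python
def semantic_similarity(user_phrases, candidate_phrases):
    """Shared phrases divided by the number of phrases in the user input."""
    if not user_phrases:
        return 0.0
    candidate = set(candidate_phrases)
    shared = sum(1 for p in user_phrases if p in candidate)
    return shared / len(user_phrases)


user = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9", "A0"]
candidates = {
    "five shared": ["A2", "A3", "A4", "A5", "A6"],
    "four shared": ["A2", "A3", "A4", "A5"],
}

scores = {name: semantic_similarity(user, phrases)
          for name, phrases in candidates.items()}
print(scores)  # {'five shared': 0.5, 'four shared': 0.4}

# The final output is the candidate with the highest semantic similarity value.
best = max(scores, key=scores.get)
print(best)  # five shared
```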

Embodiment 2

As shown in Fig. 4, this embodiment proposes a semantic similarity processing system for human-machine natural language interaction, the system including:

a database module 1, for establishing the preliminary search database and receiving the user input sentence;

a sentence screening module 2, for screening the sentences in the preliminary search database according to the format of the user input sentence;

a semantic comparison module 3, for semantically comparing the sentences screened out of the preliminary search database with the user input sentence, and outputting the final result.

Preferably, as shown in Fig. 5, the sentence screening module 2 includes:

a sentence extraction module 4, for extracting the subject, predicate and object of the user input sentence;

a format comparison module 5, for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;

a screening result acquisition module 6, for screening out of the preliminary search database the sentences whose subject, predicate and object are identical to those of the user input sentence.

Preferably, as shown in Fig. 6, the semantic comparison module 3 includes:

a phrase splitting module 7, for splitting the user input sentence into phrases;

a phrase comparison module 8, for comparing each phrase of the user input sentence with the phrases contained in the sentences screened out of the preliminary search database;

a result output module 9, for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values.

Preferably, the semantic similarity value is obtained as follows: the user input sentence is compared with each sentence in the preliminary search database, and the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.
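One possible way to wire modules 1 to 9 together is sketched below. The class names follow the module names, but the `Parser` and `Splitter` callables, the method names, and the toy parser used in the example are hypothetical; the text does not describe how subject, predicate and object are extracted or how phrases are split, so both are passed in as pluggable functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]
Parser = Callable[[str], Triple]        # hypothetical: sentence -> (subject, predicate, object)
Splitter = Callable[[str], List[str]]   # hypothetical: sentence -> phrases


@dataclass
class DatabaseModule:                   # module 1
    sentences: List[str] = field(default_factory=list)

    def add(self, sentence: str) -> None:
        self.sentences.append(sentence)


@dataclass
class SentenceScreeningModule:          # module 2 (contains 4, 5, 6)
    parse: Parser

    def screen(self, database: DatabaseModule, user_input: str) -> List[str]:
        target = self.parse(user_input)                 # module 4
        return [s for s in database.sentences           # modules 5 and 6
                if self.parse(s) == target]


@dataclass
class SemanticComparisonModule:         # module 3 (contains 7, 8, 9)
    split: Splitter

    def best(self, candidates: List[str], user_input: str) -> Optional[str]:
        user_phrases = self.split(user_input)           # module 7
        if not candidates or not user_phrases:
            return None

        def score(candidate: str) -> float:             # module 8
            phrases = set(self.split(candidate))
            return sum(p in phrases for p in user_phrases) / len(user_phrases)

        return max(candidates, key=score)               # module 9


# Toy stand-ins: first word, second word and last word act as the triple.
def toy_parse(sentence: str) -> Triple:
    words = sentence.split()
    return (words[0], words[1], words[-1])


database = DatabaseModule()
for s in ["I like green tea", "I like tea", "you like tea"]:
    database.add(s)

screening = SentenceScreeningModule(parse=toy_parse)
comparison = SemanticComparisonModule(split=str.split)

query = "I like hot green tea"
screened = screening.screen(database, query)
answer = comparison.best(screened, query)
```

Passing the parser and splitter in as functions keeps the module structure independent of any particular language-analysis tool.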

This embodiment first uses the format of the user input sentence to perform a preliminary screening of the sentences in the database, then uses a semantic similarity comparison to measure the similarity between the user input sentence and the question sentences in the database, and outputs the best result to the user. This improves the robot's accuracy in semantic understanding by 10% to 25% and makes the human-machine dialogue more natural and fluent.

The foregoing describes only preferred embodiments of the invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A semantic similarity processing method for human-machine natural language interaction, characterized in that it is implemented as follows:

S3, semantically compare the sentences screened out of the preliminary search database with the user input sentence, and output the final result;

wherein the specific implementation of S2 includes:

S21, extract the subject, predicate and object of the user input sentence;

S22, compare the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;

S23, screen out of the preliminary search database the sentences whose subject, predicate and object are identical to those of the user input sentence;

wherein the specific implementation of S3 includes:

S31, split the user input sentence into phrases;

S32, compare each phrase of the user input sentence with the phrases contained in the sentences screened out of the preliminary search database;

S33, obtain the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and output the final result according to the semantic similarity values;

the semantic similarity value is obtained as follows: the user input sentence is compared with each sentence in the preliminary search database, and the number of identical phrases divided by the total number of phrases in the user input sentence is the semantic similarity value.

2. A semantic similarity processing system for human-machine natural language interaction, characterized in that it includes:

a database module (1), for establishing the preliminary search database and receiving the user input sentence;

a sentence screening module (2), for screening the sentences in the preliminary search database according to the format of the user input sentence;

a semantic comparison module (3), for semantically comparing the sentences screened out of the preliminary search database with the user input sentence, and outputting the final result;

wherein the sentence screening module (2) includes:

a sentence extraction module (4), for extracting the subject, predicate and object of the user input sentence;

a format comparison module (5), for comparing the subject, predicate and object of the user input sentence with the subject, predicate and object of every sentence in the preliminary search database;

a screening result acquisition module (6), for screening out of the preliminary search database the sentences whose subject, predicate and object are identical to those of the user input sentence;

wherein the semantic comparison module (3) includes:

a phrase splitting module (7), for splitting the user input sentence into phrases;

a phrase comparison module (8), for comparing each phrase of the user input sentence with the phrases contained in the sentences screened out of the preliminary search database;

a result output module (9), for obtaining the semantic similarity value between each pair of sentences from the phrase comparison between the user input sentence and the preliminary search database, and outputting the final result according to the semantic similarity values;