CN108763356A - Intelligent robot chat system and method based on similar-sentence search - Google Patents

Intelligent robot chat system and method based on similar-sentence search

Info

Publication number
CN108763356A
Authority
CN
China
Prior art keywords
sentence
question
vocabulary
answer
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810468020.XA
Other languages
Chinese (zh)
Inventor
庄永军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sanbao Innovation And Intelligence Co Ltd
Original Assignee
Shenzhen Sanbao Innovation And Intelligence Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sanbao Innovation And Intelligence Co Ltd
Priority to CN201810468020.XA
Publication of CN108763356A
Current legal status: Pending

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B25J11/0005 Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an intelligent robot chat system and method based on similar-sentence search. The system comprises a receiving unit for receiving the sentence input by the user, a similar-sentence computing unit for finding the sentence in the question-and-answer knowledge base that is closest to the user's input, an output unit for outputting the corresponding answer from the question-and-answer knowledge base, and a question-and-answer knowledge base unit for storing dialogues for various situations. On the one hand, the invention uses an information retrieval method: an inverted index is built over the question-and-answer knowledge base, so that the words being looked up can be located quickly in the index while index space is saved. On the other hand, weights are calculated according to the part of speech of each word and the word-order distances of sentences are compared, so that sentences that use similar words but differ in meaning can be distinguished. This yields more accurate similar sentences and search results and improves the user's interactive experience.

Description

Intelligent robot chat system and method based on similar-sentence search
Technical field
The present invention relates to the field of artificial intelligence, and specifically to an intelligent robot chat system and method based on similar-sentence search.
Background art
With the development of information technology, intelligent robots have increasingly become important auxiliary tools in people's lives, and people wish to have more convenient ways of acquiring information and a more humane human-computer interaction experience. However, current modes of human-computer interaction have not changed much, and in particular the ways in which users obtain information are not effective enough. Research into chat robot technologies is therefore of great significance for advancing human-computer interaction.
At present, chat robots handle user utterances simply by considering direct matches of a single word or a few words, which is a relatively superficial way of processing.
Summary of the invention
The purpose of the present invention is to provide an intelligent robot chat system and method based on similar-sentence search, so as to solve the problems raised in the background art above.
To achieve the above purpose, the present invention provides the following technical solutions:
An intelligent robot chat system based on similar-sentence search comprises a receiving unit for receiving the sentence input by the user, a similar-sentence computing unit for finding the sentence in the question-and-answer knowledge base that is closest to the user's input, an output unit for outputting the corresponding answer from the question-and-answer knowledge base, and a question-and-answer knowledge base unit for storing dialogues for various situations. The receiving unit is connected to the similar-sentence computing unit, and the similar-sentence computing unit is also connected to the output unit and to the question-and-answer knowledge base unit.
As a further technical solution of the present invention, finding the sentence in the question-and-answer knowledge base that is closest to the sentence input by the user specifically comprises the following steps:
Step 1, inverted index: using an information retrieval method, an inverted index is built over the question-and-answer knowledge base. The inverted index records the words contained in all questions in the knowledge base. Each word serves as a key, and each key points to a list of question numbers, each of which identifies a question containing that word. The storage form of the inverted index is therefore: 'word' -- numbers of the questions containing that word. All questions in the question-and-answer knowledge base are scanned one by one in this way to obtain every distinct word and its corresponding list of question numbers. If the same word occurs twice in one question, the number of that question is still recorded only once under that word.
Step 2, similarity weight calculation: each word is assigned a different weight according to its part of speech.
Step 3, sentence word-order distance calculation: from the set of questions containing the most important words of the input sentence, the questions most similar to the input sentence are taken, the word-order distance is further calculated for each of them, the question with the smallest Distance from the input sentence is found, and the answer corresponding to that question in the question-and-answer knowledge base is output.
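The inverted index of step 1 can be illustrated with a minimal Python sketch. It assumes the questions have already been segmented into words (the text does not say how), and the function name build_inverted_index and the toy English questions are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def build_inverted_index(questions):
    """Map each distinct word to the sorted list of question numbers containing it.

    `questions` maps a question number to its already-segmented word list
    (segmentation itself is outside this sketch).
    """
    index = defaultdict(set)
    for qid, words in questions.items():
        for word in words:
            # A set keeps a question number once per word, even if the
            # word occurs several times in the same question.
            index[word].add(qid)
    return {word: sorted(qids) for word, qids in index.items()}


# Hypothetical toy knowledge base; question 1 repeats "reset" and "the".
questions = {
    0: ["how", "do", "i", "reset", "the", "robot"],
    1: ["reset", "the", "robot", "and", "reset", "the", "clock"],
    2: ["what", "time", "is", "it"],
}
index = build_inverted_index(questions)
print(index["reset"])  # question numbers containing "reset"
```

Using a set per word is one simple way to get the "recorded only once" behaviour described above; a real index could equally use sorted arrays or compressed posting lists.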
As a further technical solution of the present invention, step 2 specifically comprises: (1) using the inverted index, find the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, add the corresponding weight score to those question numbers; for example, if the word is a verb, the weight of each question number containing that verb is increased by 1; (2) obtain the synonyms of the word from step (1), and add half of the part-of-speech weight to the score of each question number containing those synonyms; for example, 'masses' is a noun with weight 1, so the score of each question number containing a synonym of 'masses', such as 'public', is increased by 0.5; (3) traverse all words of the input sentence and perform steps (1) and (2) for each word.
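The scoring in step 2 can be sketched as follows. This is an illustrative sketch, not the applicant's implementation: the POS_WEIGHT table, the DEFAULT_WEIGHT value and the synonym dictionary are assumptions (the text only gives 1 as the example weight for a verb and a noun), and part-of-speech tagging is assumed to have been done beforehand.

```python
from collections import defaultdict

# Assumed weight table: the text only gives 1 as an example for verbs and nouns.
POS_WEIGHT = {"verb": 1.0, "noun": 1.0}
DEFAULT_WEIGHT = 0.5  # assumption for parts of speech the text does not mention


def score_questions(tagged_words, index, synonyms):
    """Accumulate a score per question number for (word, part_of_speech) pairs."""
    scores = defaultdict(float)
    for word, pos in tagged_words:                 # sub-step (3): every input word
        weight = POS_WEIGHT.get(pos, DEFAULT_WEIGHT)
        for qid in index.get(word, []):            # sub-step (1): exact word match
            scores[qid] += weight
        for synonym in synonyms.get(word, []):     # sub-step (2): synonym match
            for qid in index.get(synonym, []):
                scores[qid] += weight / 2          # half of the part-of-speech weight
    return dict(scores)


def top_candidates(scores):
    """Return the question numbers that share the highest score."""
    if not scores:
        return []
    best = max(scores.values())
    return sorted(qid for qid, s in scores.items() if s == best)


# Hypothetical data: question 0 contains "masses", question 1 its synonym "public".
index = {"serve": [0, 1], "masses": [0], "public": [1]}
synonyms = {"masses": ["public"]}
scores = score_questions([("serve", "verb"), ("masses", "noun")], index, synonyms)
print(scores, top_candidates(scores))
```

With this data, question 0 receives 1 for the verb plus 1 for the noun, while question 1 receives 1 for the verb plus 0.5 for the synonym, so only question 0 goes forward to the word-order comparison.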
As a further technical solution of the present invention, step 3 specifically comprises: I. traverse the input sentence, denoted sentence A; for the Chinese character at position i in A, if the question most similar to the input sentence, denoted sentence B, also contains that character and it appears at position j, then the distance between sentence A and sentence B is updated as Distance = Distance + (i - j)^2; if a character occurs repeatedly in a sentence, the search continues onward from the position of its previous occurrence, so that the position of the first occurrence is not returned every time; II. perform the same operation as step I on sentence B, that is, traverse sentence B and, for the Chinese character at position i in B, check whether sentence A contains it and return its position.
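A minimal sketch of the word-order distance of step 3 is given below. Two points are assumptions, because the text does not settle them: characters that have no match in the other sentence are skipped without penalty, and the pass over A and the pass over B are added into a single total. The function names are hypothetical.

```python
def one_way_distance(a, b):
    """Pass I: for each character of `a`, add the squared position gap to its match in `b`."""
    total = 0
    resume_at = {}  # per character: where the next search in `b` starts
    for i, ch in enumerate(a):
        j = b.find(ch, resume_at.get(ch, 0))
        if j == -1:
            continue  # assumption: an unmatched character contributes nothing
        total += (i - j) ** 2
        # Continue after this match next time, so a repeated character
        # is not matched to its first occurrence again.
        resume_at[ch] = j + 1
    return total


def word_order_distance(a, b):
    """Passes I and II together (assumed to accumulate into one Distance)."""
    return one_way_distance(a, b) + one_way_distance(b, a)


print(word_order_distance("我爱你", "我爱你"))  # identical order
print(word_order_distance("我爱你", "你爱我"))  # same characters, different order
```

The two example sentences use exactly the same characters, so a purely word-based score cannot separate them; the distance is 0 for the identical sentence and 16 for the reordered one, which is the distinction the text attributes to the word-order comparison.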
An intelligent robot chat method based on similar-sentence search comprises the following steps:
A, the user inputs a question;
B, using the inverted index, find the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, add 1 to the weight of those question numbers;
C, obtain the synonyms of the word from step B, and add 0.5, that is, half of the part-of-speech weight, to the score of each question number containing those synonyms;
D, traverse all words of the input sentence, repeating step B and step C for each word;
E, output the set of questions that share the highest weight;
F, from the set of questions containing the most important words of the input sentence, take the questions most similar to the input sentence, further calculate the word-order distance for each of them, find the question with the smallest Distance from the input sentence, and output the answer corresponding to that question in the question-and-answer knowledge base.
Compared with the prior art, the beneficial effects of the present invention are as follows. On the one hand, the invention uses an information retrieval method: an inverted index is built over the question-and-answer knowledge base, so that the words being looked up can be located quickly in the index while index space is saved. On the other hand, weights are calculated according to the part of speech of each word and the word-order distances of sentences are compared, so that sentences that use similar words but differ in meaning can be distinguished. This yields more accurate similar sentences and search results and improves the user's interactive experience.
Description of the drawings
Fig. 1 is a structural block diagram of the present invention.
Fig. 2 is a flow chart of the method of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Figs. 1-2, in an embodiment of the present invention, an intelligent robot chat system based on similar-sentence search comprises a receiving unit for receiving the sentence input by the user, a similar-sentence computing unit for finding the sentence in the question-and-answer knowledge base that is closest to the user's input, an output unit for outputting the corresponding answer from the question-and-answer knowledge base, and a question-and-answer knowledge base unit for storing dialogues for various situations. The receiving unit is connected to the similar-sentence computing unit, and the similar-sentence computing unit is also connected to the output unit and to the question-and-answer knowledge base unit.
Finding the sentence in the question-and-answer knowledge base that is closest to the sentence input by the user specifically comprises the following steps:
Step 1, inverted index: using an information retrieval method, an inverted index is built over the question-and-answer knowledge base. The inverted index records the words contained in all questions in the knowledge base. Each word serves as a key, and each key points to a list of question numbers, each of which identifies a question containing that word. The storage form of the inverted index is therefore: 'word' -- numbers of the questions containing that word. All questions in the question-and-answer knowledge base are scanned one by one in this way to obtain every distinct word and its corresponding list of question numbers. If the same word occurs twice in one question, the number of that question is still recorded only once under that word.
Step 2, similarity weight calculation: each word is assigned a different weight according to its part of speech.
Step 3, sentence word-order distance calculation: from the set of questions containing the most important words of the input sentence, the questions most similar to the input sentence are taken, the word-order distance is further calculated for each of them, the question with the smallest Distance from the input sentence is found, and the answer corresponding to that question in the question-and-answer knowledge base is output.
Step 2 specifically comprises: (1) using the inverted index, find the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, add the corresponding weight score to those question numbers; for example, if the word is a verb, the weight of each question number containing that verb is increased by 1; (2) obtain the synonyms of the word from step (1), and add half of the part-of-speech weight to the score of each question number containing those synonyms; for example, 'masses' is a noun with weight 1, so the score of each question number containing a synonym of 'masses', such as 'public', is increased by 0.5; (3) traverse all words of the input sentence and perform steps (1) and (2) for each word.
Step 3 specifically comprises: I. traverse the input sentence, denoted sentence A; for the Chinese character at position i in A, if the question most similar to the input sentence, denoted sentence B, also contains that character and it appears at position j, then the distance between sentence A and sentence B is updated as Distance = Distance + (i - j)^2; if a character occurs repeatedly in a sentence, the search continues onward from the position of its previous occurrence, so that the position of the first occurrence is not returned every time; II. perform the same operation as step I on sentence B, that is, traverse sentence B and, for the Chinese character at position i in B, check whether sentence A contains it and return its position.
An intelligent robot chat method based on similar-sentence search comprises the following steps:
A, the user inputs a question;
B, using the inverted index, find the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, add 1 to the weight of those question numbers;
C, obtain the synonyms of the word from step B, and add 0.5, that is, half of the part-of-speech weight, to the score of each question number containing those synonyms;
D, traverse all words of the input sentence, repeating step B and step C for each word;
E, output the set of questions that share the highest weight;
F, from the set of questions containing the most important words of the input sentence, take the questions most similar to the input sentence, further calculate the word-order distance for each of them, find the question with the smallest Distance from the input sentence, and output the answer corresponding to that question in the question-and-answer knowledge base.
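Steps A to F can be put together in one short Python sketch. It is only an illustration under stated assumptions: the toy knowledge base, its answers, the weight table, the default weight and the names build_index, one_way_distance and reply are all hypothetical; words are assumed to arrive already segmented and tagged; and unmatched characters are assumed to add nothing to the distance.

```python
from collections import defaultdict

# Hypothetical toy knowledge base: question number -> (segmented question, answer).
KNOWLEDGE_BASE = {
    0: (["我", "爱", "你"], "answer to question 0"),
    1: (["你", "爱", "我"], "answer to question 1"),
    2: (["今天", "天气", "好"], "answer to question 2"),
}
# Assumed weights; the text only gives 1 as the example weight for verbs and nouns.
POS_WEIGHT = {"verb": 1.0, "noun": 1.0}
DEFAULT_WEIGHT = 1.0  # assumption for parts of speech the text does not mention


def build_index(kb):
    """Inverted index: word -> set of question numbers containing it."""
    index = defaultdict(set)
    for qid, (words, _answer) in kb.items():
        for word in words:
            index[word].add(qid)
    return index


def one_way_distance(a, b):
    """Sum of squared position gaps, resuming after the previous match of a character."""
    total, resume_at = 0, {}
    for i, ch in enumerate(a):
        j = b.find(ch, resume_at.get(ch, 0))
        if j != -1:  # assumption: unmatched characters are skipped
            total += (i - j) ** 2
            resume_at[ch] = j + 1
    return total


def reply(tagged_words, kb, index, synonyms):
    """Steps B to F: score questions, keep the top set, pick the closest word order."""
    scores = defaultdict(float)
    for word, pos in tagged_words:                    # step D: every input word
        weight = POS_WEIGHT.get(pos, DEFAULT_WEIGHT)
        for qid in index.get(word, ()):               # step B: exact match
            scores[qid] += weight
        for synonym in synonyms.get(word, ()):        # step C: synonym match
            for qid in index.get(synonym, ()):
                scores[qid] += weight / 2
    if not scores:
        return None  # no question shares a word with the input
    best = max(scores.values())
    candidates = [qid for qid, s in scores.items() if s == best]  # step E
    text = "".join(word for word, _pos in tagged_words)

    def distance(qid):
        question = "".join(kb[qid][0])
        return one_way_distance(text, question) + one_way_distance(question, text)

    return kb[min(candidates, key=distance)][1]       # step F


index = build_index(KNOWLEDGE_BASE)
user_input = [("你", "pronoun"), ("爱", "verb"), ("我", "pronoun")]  # step A
print(reply(user_input, KNOWLEDGE_BASE, index, synonyms={}))
```

In this example questions 0 and 1 tie on weight because they contain the same three words, and the word-order distance (16 against 0) selects question 1. The method does not say what happens when no question matches at all; returning None here is a placeholder choice.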
The working principle of the present invention is as follows. On the one hand, the invention uses an information retrieval method: an inverted index is built over the question-and-answer knowledge base, so that the words being looked up can be located quickly in the index while index space is saved. On the other hand, weights are calculated according to the part of speech of each word and the word-order distances of sentences are compared, so that sentences that use similar words but differ in meaning can be distinguished. This yields more accurate similar sentences and search results and improves the user's interactive experience.
Taking the above preferred embodiments of the present invention as guidance, and in light of the above description, persons skilled in the relevant art can make various changes and modifications without departing from the technical idea of the invention. The technical scope of the invention is not limited to the content of the description and must be determined according to the scope of the claims.

Claims (5)

  1. An intelligent robot chat system based on similar-sentence search, comprising a receiving unit for receiving the sentence input by the user, a similar-sentence computing unit for finding the sentence in the question-and-answer knowledge base that is closest to the user's input, an output unit for outputting the corresponding answer from the question-and-answer knowledge base, and a question-and-answer knowledge base unit for storing dialogues for various situations, characterized in that the receiving unit is connected to the similar-sentence computing unit, and the similar-sentence computing unit is also connected to the output unit and to the question-and-answer knowledge base unit.
  2. The intelligent robot chat system based on similar-sentence search according to claim 1, characterized in that finding the sentence in the question-and-answer knowledge base that is closest to the sentence input by the user specifically comprises the following steps: step 1, inverted index: using an information retrieval method, an inverted index is built over the question-and-answer knowledge base; the inverted index records the words contained in all questions in the knowledge base; each word serves as a key, and each key points to a list of question numbers, each of which identifies a question containing that word, so the storage form of the inverted index is: 'word' -- numbers of the questions containing that word; all questions in the question-and-answer knowledge base are scanned one by one in this way to obtain every distinct word and its corresponding list of question numbers; if the same word occurs twice in one question, the number of that question is still recorded only once under that word; step 2, similarity weight calculation: each word is assigned a different weight according to its part of speech; step 3, sentence word-order distance calculation: from the set of questions containing the most important words of the input sentence, the questions most similar to the input sentence are taken, the word-order distance is further calculated for each of them, the question with the smallest Distance from the input sentence is found, and the answer corresponding to that question in the question-and-answer knowledge base is output.
  3. The intelligent robot chat system based on similar-sentence search according to claim 2, characterized in that step 2 specifically comprises: (1) using the inverted index, finding the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, adding the corresponding weight score to those question numbers, for example, if the word is a verb, increasing the weight of each question number containing that verb by 1; (2) obtaining the synonyms of the word from step (1), and adding half of the part-of-speech weight to the score of each question number containing those synonyms, for example, 'masses' is a noun with weight 1, so the score of each question number containing a synonym of 'masses', such as 'public', is increased by 0.5; (3) traversing all words of the input sentence and performing steps (1) and (2) for each word.
  4. The intelligent robot chat system based on similar-sentence search according to claim 2, characterized in that step 3 specifically comprises: I. traversing the input sentence, denoted sentence A; for the Chinese character at position i in A, if the question most similar to the input sentence, denoted sentence B, also contains that character and it appears at position j, then the distance between sentence A and sentence B is updated as Distance = Distance + (i - j)^2; if a character occurs repeatedly in a sentence, the search continues onward from the position of its previous occurrence, so that the position of the first occurrence is not returned every time; II. performing the same operation as step I on sentence B, that is, traversing sentence B and, for the Chinese character at position i in B, checking whether sentence A contains it and returning its position.
  5. An intelligent robot chat method based on similar-sentence search, characterized by comprising the following steps:
    A, the user inputs a question;
    B, using the inverted index, find the numbers of all questions in the question-and-answer knowledge base that contain a given word of the input sentence and, according to the part of speech of the word, add 1 to the weight of those question numbers;
    C, obtain the synonyms of the word from step B, and add 0.5, that is, half of the part-of-speech weight, to the score of each question number containing those synonyms;
    D, traverse all words of the input sentence, repeating step B and step C for each word;
    E, output the set of questions that share the highest weight;
    F, from the set of questions containing the most important words of the input sentence, take the questions most similar to the input sentence, further calculate the word-order distance for each of them, find the question with the smallest Distance from the input sentence, and output the answer corresponding to that question in the question-and-answer knowledge base.
CN201810468020.XA 2018-05-16 2018-05-16 A kind of intelligent robot chat system and method based on the search of similar sentence Pending CN108763356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810468020.XA CN108763356A (en) 2018-05-16 2018-05-16 A kind of intelligent robot chat system and method based on the search of similar sentence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810468020.XA CN108763356A (en) 2018-05-16 2018-05-16 A kind of intelligent robot chat system and method based on the search of similar sentence

Publications (1)

Publication Number Publication Date
CN108763356A true CN108763356A (en) 2018-11-06

Family

ID=64008188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810468020.XA Pending CN108763356A (en) 2018-05-16 2018-05-16 A kind of intelligent robot chat system and method based on the search of similar sentence

Country Status (1)

Country Link
CN (1) CN108763356A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930880A (en) * 2020-08-14 2020-11-13 易联众信息技术股份有限公司 Text code retrieval method, device and medium
CN112527965A (en) * 2020-12-18 2021-03-19 国家电网有限公司客户服务中心 Automatic question answering implementation method and device based on combination of professional library and chatting library

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080126335A1 (en) * 2006-11-29 2008-05-29 Oracle International Corporation Efficient computation of document similarity
CN101286161A (en) * 2008-05-28 2008-10-15 华中科技大学 Intelligent Chinese request-answering system based on concept
CN102200975A (en) * 2010-03-25 2011-09-28 北京师范大学 Vertical search engine system and method using semantic analysis
CN103020295A (en) * 2012-12-28 2013-04-03 新浪网技术(中国)有限公司 Problem label marking method and device
CN105824933A (en) * 2016-03-18 2016-08-03 苏州大学 Automatic question-answering system based on theme-rheme positions and realization method of automatic question answering system
CN107305550A (en) * 2016-04-19 2017-10-31 中兴通讯股份有限公司 A kind of intelligent answer method and device
CN107273350A (en) * 2017-05-16 2017-10-20 广东电网有限责任公司江门供电局 A kind of information processing method and its device for realizing intelligent answer

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
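Yu Linlin (俞霖霖): "Research on candidate answer sentence extraction for Baidu Baike" (面向百度百科的候选答案句抽取研究), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》) *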

Similar Documents

Publication Publication Date Title
CN110096567B (en) QA knowledge base reasoning-based multi-round dialogue reply selection method and system
Gorin et al. On adaptive acquisition of language
CN108681574B (en) Text abstract-based non-fact question-answer selection method and system
CN112100354B (en) Man-machine conversation method, device, equipment and storage medium
CN111090727B (en) Language conversion processing method and device and dialect voice interaction system
CN110717018A (en) Industrial equipment fault maintenance question-answering system based on knowledge graph
CN110795913B (en) Text encoding method, device, storage medium and terminal
CN108280218A (en) A kind of flow system based on retrieval and production mixing question and answer
CN109460459A (en) A kind of conversational system automatic optimization method based on log study
CN111191450A (en) Corpus cleaning method, corpus entry device and computer-readable storage medium
CN110245253B (en) Semantic interaction method and system based on environmental information
CN111460132A (en) Generation type conference abstract method based on graph convolution neural network
CN116737908A (en) Knowledge question-answering method, device, equipment and storage medium
CN113505209A (en) Intelligent question-answering system for automobile field
CN112632239A (en) Brain-like question-answering system based on artificial intelligence technology
CN108763356A (en) A kind of intelligent robot chat system and method based on the search of similar sentence
CN108763355A (en) A kind of intelligent robot interaction data processing system and method based on user
CN116821290A (en) Multitasking dialogue-oriented large language model training method and interaction method
Kubo et al. Team OS's system for dialogue robot competition 2022
CN111090999A (en) Information extraction method and system for power grid dispatching plan
CN113743539B (en) Form retrieval method based on deep learning
CN114238595A (en) Metallurgical knowledge question-answering method and system based on knowledge graph
Matrouf et al. Adapting probability-transitions in DP matching processing for an oral task-oriented dialogue
Patchava et al. Intelligent response retrieval for semantically similar querying using a Chatbot
CN114997173A (en) Topic guide method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181106