CN108763495B

CN108763495B - Human-machine dialogue method, system, electronic device and storage medium

Info

Publication number: CN108763495B
Application number: CN201810536651.0A
Authority: CN
Inventors: 陈露; 初敏; 杨超; 葛付江; 郭涛涛
Original assignee: AI Speech Ltd
Current assignee: Sipic Technology Co Ltd
Priority date: 2018-05-30
Filing date: 2018-05-30
Publication date: 2019-09-20
Anticipated expiration: 2038-05-30
Also published as: CN108763495A

Abstract

The present invention discloses a human-machine dialogue method applied to a human-machine dialogue system. The method comprises: using the current state of the dialogue system as the input of a first neural network to determine the proactive dialogue mode the system will use toward the user; determining a topic to be recommended according to the proactive dialogue mode; using the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended; and selecting, according to the recommendation probability, a knowledge point from the topic to be recommended and presenting it to the user. Because the proactive dialogue mode is determined from the current state of the dialogue system, the dialogue the system initiates is better targeted and fits the current progress of the conversation, which improves the user experience.

Description

Human-machine dialogue method, system, electronic device and storage medium

Technical field

The present invention relates to the field of artificial intelligence, and in particular to a human-machine dialogue method, system, electronic device and storage medium.

Background art

Dialogue robots of many kinds have appeared on the market. Some take the form of personal assistants (Siri, Cortana, Magic horn, etc.), some are chatbots (XiaoIce, Duer), and others are built into terminals such as smart speakers, in-vehicle devices and smart televisions. The dialogue technology behind these robots falls roughly into four types:

Task-oriented dialogue: a vertical-domain dialogue custom-built for tasks that users commonly need, for example ordering food, booking tickets, or finding music, films or particular products. After the user speaks, the system first determines which task is requested and extracts the user's demand parameters (e.g. departure place, restaurant type). If the predefined required parameters have not all been collected, the machine obtains the missing information by asking questions. Task-oriented dialogue is therefore usually multi-turn, and the user may keep modifying or refining the request during the dialogue.

Dialogue based on question-answer pairs: knowledge is organized as question-answer pairs. The user's question is compared with the questions in the stored pairs, the closest one is found, and its answer is returned. This kind of dialogue is common in customer-service robots and chit-chat robots. Most such services are single-turn; some have limited multi-turn ability, mainly simple context handling and coreference resolution.

Knowledge-graph-based dialogue: the user queries, in natural language or speech, factual knowledge stored as triples, for example "How tall is Yao Ming's daughter?". This kind of dialogue requires some inference ability: the question above is actually answered in two steps: (1) Yao Ming's daughter -> Yao Qinlei; (2) Yao Qinlei's height -> 160 cm. Most such dialogue is single-turn; some systems have limited multi-turn ability, mainly coreference resolution ("How old is she?", she -> Yao Qinlei).

Generative chat: a trained neural network model automatically generates an answer to a user utterance. This kind of chat has no explicit communicative goal and no domain restriction: the user says a sentence and the system generates a reply that is related to it. It is also called open-domain chat. In existing human-machine dialogue systems, open-domain chat mainly serves to build rapport and trust, provide emotional companionship, smooth the dialogue (for example, when a task-oriented dialogue cannot satisfy the user's need) and increase user stickiness.

The intelligent robots of the prior art use one of these forms or a combination of several. In essentially all of them the human is the active party and the machine is the passive party: the machine waits for the person to ask and then gives an answer. In task-oriented dialogue the machine can also question the person, but only under tightly defined conditions. For example, a ticket-purchase task predefines three required parameters: departure place, destination and departure time. Only when the user has supplied all three can the robot query the ticketing system; otherwise the robot asks for the missing information. On the whole, however, the machine remains in a passive, waiting state.

Moreover, in the prior art the questions put to the user are hard-coded in advance as question-answer pairs. A suitable question cannot be selected according to the actual interaction state of the dialogue system and the feedback from earlier turns, so the questioning is mechanical and unnatural and often gives the user a poor dialogue experience.

Summary of the invention

Embodiments of the present invention provide a human-machine dialogue method and system to solve at least one of the above technical problems.

In a first aspect, an embodiment of the present invention provides a human-machine dialogue method applied to a human-machine dialogue system, the method comprising:

using the current state of the human-machine dialogue system as the input of a first neural network to determine the proactive dialogue mode of the system toward the user;

determining a topic to be recommended according to the proactive dialogue mode;

using the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user.

In a second aspect, an embodiment of the present invention provides a human-machine dialogue system, comprising:

a proactive dialogue mode determination program module, configured to use the current state of the human-machine dialogue system as the input of a first neural network to determine the proactive dialogue mode of the system toward the user;

a topic determination program module, configured to determine a topic to be recommended according to the proactive dialogue mode;

a recommendation probability determination program module, configured to use the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

a knowledge point selection program module, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user.

In a third aspect, an embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs that include executable instructions. The instructions can be read and executed by an electronic device (including but not limited to a computer, server or network device) to perform any of the above human-machine dialogue methods of the present invention.

In a fourth aspect, an electronic device is provided, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform any of the above human-machine dialogue methods of the present invention.

In a fifth aspect, an embodiment of the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium. The computer program includes program instructions which, when executed by a computer, cause the computer to perform any of the above human-machine dialogue methods.

This embodiment determines the proactive dialogue mode toward the user according to the current state of the human-machine dialogue system, which makes the dialogue that the system initiates better targeted. After the proactive dialogue mode is determined, the topic to be recommended for the proactive dialogue is determined accordingly. A pre-trained second neural network then determines the recommendation probability of the topic to be recommended from the current state of the dialogue system and the feature vector of that topic, so that the knowledge point used to initiate the proactive dialogue can be obtained from the corresponding topic according to the recommendation probability.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of an embodiment of the human-machine dialogue method of the invention;

Fig. 2 is a flow chart of another embodiment of the human-machine dialogue method of the invention;

Fig. 3 is a flow chart of a further embodiment of the human-machine dialogue method of the invention;

Fig. 4 is a structural diagram of the Q neural network that represents the main policy;

Fig. 5 is a structural diagram of the Q neural network used to determine the Q value of selecting topic t in the current state;

Fig. 6 is a flow chart of yet another embodiment of the human-machine dialogue method of the invention;

Fig. 7 is a functional block diagram of an embodiment of the human-machine dialogue system of the invention;

Fig. 8 is a structural diagram of an embodiment of the electronic device of the invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

It should be noted that, where no conflict arises, the embodiments of the present application and the features in them may be combined with one another.

The present invention may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

In the present invention, terms such as "module", "device" and "system" refer to computer-related entities, such as hardware, a combination of hardware and software, software, or software in execution. Specifically, an element may be, but is not limited to, a process running on a processor, a processor, an object, an executable element, a thread of execution, a program and/or a computer. An application or script running on a server, or the server itself, can also be an element. One or more elements may reside within a process and/or thread of execution, and an element may be localized on one computer and/or distributed between two or more computers, and may be run from various computer-readable media. Elements may also communicate by means of signals carrying one or more data packets, for example a signal from data that interacts with another element in a local system or a distributed system, and/or interacts with other systems across a network such as the Internet.

Finally, relational terms such as "first" and "second" are used here only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise" and "include" cover not only the listed elements but also other elements not explicitly listed, or elements inherent to the process, method, article or device. Without further limitation, an element introduced by "comprising ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes it.

In the embodiments of the present invention, the dialogue between a person and the human-machine dialogue system is treated as a Markov decision process (MDP). At each time step the dialogue system is in state s, takes action a according to policy π(a | s), observes the user's reply o', transitions to state s', and receives reward r'.

An embodiment of the present invention provides a human-machine dialogue method applied to a human-machine dialogue system. As shown in Fig. 1, the method includes:

S11, using the current state of the human-machine dialogue system as the input of a first neural network to determine the proactive dialogue mode of the system toward the user, where the first neural network is a first Q neural network.

The current state is generated at least from current-topic information and from statistics on the knowledge points accessed by the user and the dialogue system from the start of the dialogue up to the current time. The current-topic information is topic vector information that characterizes the current topic. The statistics on accessed knowledge points include at least:

first vector information: the number of knowledge points in the current topic that the user has accessed;

second vector information: the number of knowledge points in the current topic that the user has rejected;

third vector information: the number of knowledge points the user has rejected from the start of the dialogue up to the current time;

fourth vector information: the maximum number of knowledge points the user has rejected consecutively from the start of the dialogue up to the current time.
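The four statistics can be illustrated with a short Python sketch. The `Event` record, the topic names and the use of raw counts as feature values are assumptions made for illustration; the text does not say how the counts are encoded as vectors.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One offered knowledge point and the user's reaction (hypothetical record)."""
    topic: str       # topic the knowledge point belongs to
    accepted: bool   # True if the user accessed it, False if the user rejected it


def access_statistics(history, current_topic):
    """Return the four counts described above, in order, as floats."""
    accessed_in_topic = sum(1 for e in history if e.topic == current_topic and e.accepted)
    rejected_in_topic = sum(1 for e in history if e.topic == current_topic and not e.accepted)
    rejected_total = sum(1 for e in history if not e.accepted)
    longest = run = 0
    for e in history:
        # A rejection extends the current run; an acceptance resets it.
        run = 0 if e.accepted else run + 1
        longest = max(longest, run)
    return [float(accessed_in_topic), float(rejected_in_topic),
            float(rejected_total), float(longest)]


history = [Event("asr", True), Event("asr", False), Event("tts", False), Event("asr", True)]
print(access_statistics(history, "asr"))  # [2.0, 1.0, 2.0, 2.0]
```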

The statistics on accessed knowledge points are determined by the dialogue system from the user's replies to the recommended knowledge points. Specifically:

The dialogue system parses the user's reply with a semantic understanding module. There are four types of parsing result:

The user selects one of the knowledge points recommended by the system. The semantic representation is inform(series=id), e.g. "the first one", "the second one", "the last one";

The user answers the system's counter-question affirmatively or negatively. The semantic representation is affirm() for an affirmative answer and deny() for a negative one, e.g. "OK, tell me about it", "not interested", "I don't want to know";

The system made no recommendation or counter-question in the previous turn, or the user ignores the system's recommendation or counter-question, and the user asks about another knowledge point k_t. The semantic representation is inform(kp=k_t);

The user ends the dialogue. The semantic representation is bye().
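As an illustration only, the four reply types can be modelled as dialogue acts parsed from the semantic representations quoted above. The `DialogueAct` class, the `parse_act` helper and the reply-type labels are hypothetical names; the patent does not describe how its semantic understanding module is implemented.

```python
import re
from dataclasses import dataclass, field


@dataclass
class DialogueAct:
    name: str                                   # inform, affirm, deny or bye
    slots: dict = field(default_factory=dict)   # e.g. {"series": "2"} or {"kp": "k_t"}


_ACT = re.compile(r"^(\w+)\((.*)\)$")


def parse_act(text):
    """Parse a string such as 'inform(series=2)' into a DialogueAct."""
    match = _ACT.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a dialogue act: {text!r}")
    name, body = match.groups()
    slots = dict(pair.split("=", 1) for pair in body.split(",") if pair)
    return DialogueAct(name, slots)


def reply_type(act):
    """Map a dialogue act to one of the four reply types."""
    if act.name == "inform" and "series" in act.slots:
        return "select_recommended"
    if act.name in ("affirm", "deny"):
        return "answer_counter_question"
    if act.name == "inform" and "kp" in act.slots:
        return "new_question"
    if act.name == "bye":
        return "end"
    raise ValueError(f"unsupported act: {act}")


print(reply_type(parse_act("inform (series=2)")))  # select_recommended
```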

In this embodiment, the current state of the dialogue system is determined from the current topic together with the statistics on knowledge points accessed during the dialogue the user has already had with the system. This state is input to the first neural network to determine the mode in which the system next initiates dialogue. Each time the system initiates dialogue, the method therefore takes into account both the feedback from the dialogue so far and the current topic. The selected proactive dialogue mode is better targeted and closer to what the user currently wants, so the user accepts it more readily, the dialogue proceeds smoothly and pleasantly, and the user's dialogue experience improves.

S12, determining a topic to be recommended according to the proactive dialogue mode;

The proactive dialogue mode includes at least a counter-question mode, a recommendation mode, and a mode that neither asks a counter-question nor recommends. Specifically:

- confirm(kp=k_t): a counter-question asking whether the user is interested in knowledge point k_t under topic t, e.g. "Are you interested in our far-field speech recognition technology?".

- Recommendation: recommend n knowledge points to the user, for example:

"You can also ask me: 1. Introduce AISpeech's typical products for the smart home. 2. Which of your technologies does Tmall Genie use? 3. What hardware products does AISpeech have?". For simplicity, it can be assumed that the n knowledge points come from n topics, i.e. one knowledge point is randomly sampled from each topic (alternatively, the n knowledge points may be selected from m topics, where m is less than n).

- null: neither recommend nor ask a counter-question.

S13, using the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended, where the second neural network is a second Q neural network;

When the proactive dialogue mode is the counter-question mode, the topic to be recommended is a single candidate topic, and determining the recommendation probability of the topic to be recommended includes:

using the current state and the feature vector of the candidate topic as the input of the second neural network to determine the recommendation probability of the candidate topic.

When the proactive dialogue mode is the recommendation mode, the topic to be recommended includes multiple candidate topics, and determining the recommendation probability of the topic to be recommended includes:

using the current state and the feature vectors of the multiple candidate topics as the input of the second neural network to determine multiple recommendation probabilities corresponding to the multiple candidate topics.

S14, selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user.

This embodiment determines the proactive dialogue mode toward the user according to the current state of the human-machine dialogue system, so the dialogue that the system initiates is better targeted and fits the current progress of the conversation, which improves the user experience. After the proactive dialogue mode is determined, the topic to be recommended for the proactive dialogue is determined accordingly. A pre-trained second neural network then determines the recommendation probability of the topic to be recommended from the current state of the dialogue system and the feature vector of that topic, so that the knowledge point used to initiate the proactive dialogue can be obtained from the corresponding topic according to the recommendation probability.

As shown in Fig. 2, in some embodiments of the invention, generating the current state at least from current-topic information and from statistics on the knowledge points accessed by the user and the dialogue system from the start of the dialogue up to the current time includes:

S21, generating a current system feature vector from the topic vector information and the first to fourth vector information;

S22, compressing the current system feature vector with a recurrent neural network to obtain the current state.

The topic vector information and the first to fourth vector information are concatenated into a single feature vector, denoted x. Historical information is also important, so a recurrent neural network (e.g. an RNN or LSTM) can compress all the x vectors from the start of the dialogue up to now into a state representation vector s: the input of the network at each time step is the x of that time step, and the state at the last time step is s. By using a recurrent neural network, this embodiment takes historical information into account when selecting topics, which makes each topic switch more personalized and accurate.
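The compression of the per-turn feature vectors x into a state s can be sketched with a plain (Elman-style) recurrent cell written in standard-library Python. This is a minimal illustration under assumptions: the weights are random rather than trained, the dimensions are arbitrary, and the patent leaves the actual architecture (RNN, LSTM or other) open.

```python
import math
import random


def make_rnn(input_dim, state_dim, seed=0):
    """Create random input and recurrent weight matrices (untrained, for illustration)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)] for _ in range(state_dim)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(state_dim)] for _ in range(state_dim)]
    return w_in, w_rec


def rnn_step(params, s_prev, x):
    """One recurrent update: s = tanh(W_in x + W_rec s_prev)."""
    w_in, w_rec = params
    return [
        math.tanh(sum(w * xi for w, xi in zip(row_in, x))
                  + sum(w * si for w, si in zip(row_rec, s_prev)))
        for row_in, row_rec in zip(w_in, w_rec)
    ]


def compress(params, xs):
    """Feed every x since the start of the dialogue; the final hidden state is s."""
    s = [0.0] * len(params[0])
    for x in xs:
        s = rnn_step(params, s, x)
    return s


params = make_rnn(input_dim=3, state_dim=4)
state = compress(params, [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])
print(len(state))  # 4
```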

As shown in Fig. 3, in some embodiments the human-machine dialogue method further includes:

S31, recording and storing the dialogue experience data of each dialogue turn to form an experience data pool, where the dialogue experience data for each turn includes at least the current state of the dialogue system, the action taken, the state of the dialogue system at the next time step, and the reward signal received by the dialogue system;

S32, training the first neural network and/or the second neural network at a predetermined interval using the dialogue experience data in the experience data pool.
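A minimal sketch of such an experience data pool follows. The class name, the fixed capacity and the drop-oldest behaviour are assumptions; the text only requires that each record hold the state, the action, the next state and the reward, and that training can draw data from the pool.

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", "state action reward next_state")


class ExperiencePool:
    """Bounded store of dialogue experience; the oldest records are dropped when full."""

    def __init__(self, capacity=10000, seed=None):
        self._data = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self._data.append(Experience(state, action, reward, next_state))

    def sample(self, batch_size):
        """Draw a batch uniformly at random, without replacement."""
        k = min(batch_size, len(self._data))
        return self._rng.sample(list(self._data), k)

    def __len__(self):
        return len(self._data)


pool = ExperiencePool(capacity=3, seed=0)
for turn in range(5):
    pool.add(state=turn, action="recommend", reward=1.0, next_state=turn + 1)
print(len(pool))  # 3
```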

In the above embodiment the reward signal can come from the following sources:

The user's reply to the system's recommendation or counter-question

If the user selects one of the knowledge points recommended in the previous turn, the system can be given a positive reward; otherwise the system is given a negative reward;

If the user answers the system's counter-question about a knowledge point affirmatively, the system is given a positive reward; otherwise the system is given a negative reward.

The specific reward values in these two cases can differ. Generally, dialogue in the counter-question mode is more natural, so the absolute value of the (positive or negative) reward in the counter-question mode can be larger than in the recommendation mode.

The system's own goal: if the system has a particular purpose to steer toward, it receives a positive reward when that purpose is achieved. The goal depends on the type of dialogue system, for example:

Publicity: convey enough information to the user. Each knowledge point introduced to the user earns a positive reward, and knowledge points of different topics can be given different rewards;

Shopping guide: a purchase-related action by the user earns a larger positive reward;

Business: facilitating a business cooperation earns a larger positive reward;

Recruitment: obtaining a resume earns a larger positive reward.

The reward signal is not limited to these two sources; the system designer can design it for the specific task.
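The two reward sources can be combined in a small function such as the one below. All numeric values, the mode and system-type labels, and the zero reward for the null mode are illustrative assumptions; the text fixes only the signs and the suggestion that counter-question rewards may have a larger magnitude than recommendation rewards.

```python
# Reward for the user's reaction, keyed by (mode, accepted). Values are assumptions.
REPLY_REWARD = {
    ("recommend", True): 1.0,
    ("recommend", False): -1.0,
    ("confirm", True): 2.0,    # counter-question: larger magnitude than recommendation
    ("confirm", False): -2.0,
}

# Extra reward when the system reaches its own goal, keyed by system type (assumed values).
GOAL_BONUS = {"publicity": 0.5, "shopping": 5.0, "business": 5.0, "recruitment": 5.0}


def turn_reward(mode, accepted, system_type=None, goal_reached=False):
    """Sum the reply-based reward and, if applicable, the goal-based reward."""
    reward = 0.0
    if mode in ("recommend", "confirm"):
        reward += REPLY_REWARD[(mode, accepted)]
    if goal_reached and system_type is not None:
        reward += GOAL_BONUS[system_type]
    return reward


print(turn_reward("recommend", True, system_type="shopping", goal_reached=True))  # 6.0
```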

With the above reward signal, a reinforcement learning algorithm can be used to optimize the policy, so that in each state the proactive dialogue mode and topic (knowledge point) selected by the policy maximize the cumulative reward obtained. In the embodiments of the present invention the dialogue policy is represented by Q networks and optimized with DQN (Deep Q-Networks). Unlike ordinary DQN, the embodiments have three different policies: a main policy, a recommendation policy and a counter-question policy. Each policy has its own Q network, and they can share one experience data pool D. At each parameter update, a batch of data is sampled from the experience data pool and the parameters are optimized according to the DQN loss function (TD error). In the early stage of system optimization, if the choice of topic is unrestricted, the topics selected by the policy are highly random and the results may be poor. To address this, the selectable topics can be restricted to a certain range in the early stage, such as sibling topics or subtopics of the current topic.

The dialogue policy decides how to recommend to and ask counter-questions of the user. It has three main steps:

Step 1: determine the proactive dialogue mode. The main policy π^m(a^m | s) determines which mode to use: recommendation, counter-question, or neither.

Fig. 4 shows the structure of the Q neural network that represents the main policy in the embodiments of the present invention. The input of the Q neural network is the current state s, and the output layer has 3 dimensions, corresponding to the Q values obtained by selecting each of the 3 modes. The probability of each proactive dialogue mode a^m is then:

$$\pi^m(a^m \mid s) = \frac{\exp\left(Q(s, a^m)/\tau\right)}{\sum_{a'} \exp\left(Q(s, a')/\tau\right)}$$

In the above formula, τ is a hyperparameter that controls how much the policy explores: the larger τ is, the more uniform the probabilities and the more the system explores. Generally, τ starts large and is then gradually reduced. At each decision, a proactive dialogue mode is sampled according to these probabilities.
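The effect of τ can be seen in a short sketch of temperature-controlled sampling. It assumes the probabilities are a softmax of Q/τ (Boltzmann exploration), which matches the behaviour described for τ; the Q values and mode labels are made up for illustration.

```python
import math
import random

MODES = ["recommend", "confirm", "null"]


def boltzmann(q_values, tau):
    """Softmax of Q/tau; subtracting the maximum keeps exp() numerically stable."""
    top = max(q_values)
    exps = [math.exp((q - top) / tau) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mode(q_values, tau, rng=random):
    """Sample one proactive dialogue mode according to the Boltzmann probabilities."""
    return rng.choices(MODES, weights=boltzmann(q_values, tau), k=1)[0]


q = [1.0, 2.0, 0.5]            # hypothetical outputs of the main-policy Q network
cold = boltzmann(q, tau=0.1)   # small tau: nearly greedy
hot = boltzmann(q, tau=100.0)  # large tau: nearly uniform
print(max(cold) > max(hot))    # True
```

Annealing τ from a large to a small value, as the text suggests, moves the policy from near-uniform exploration toward near-greedy selection.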

Step 2: topic inference. Infer the topics the user is likely to be interested in next, i.e. select one or more candidate topics. If the proactive dialogue mode sampled in step 1 is counter-question, one topic must be selected; if it is recommendation, n topics must be selected.

Fig. 5 shows the structure of the Q neural network used to determine the Q value of selecting topic t in the current state. The network has two branches. The input of one branch is the current state s, which passes through several layers to give a state representation φ(s). The input of the other branch is the vector representation e_t of topic t, which passes through several layers to give a vector ψ(e_t) of the same dimension as φ(s). The Q value Q(s, t) of selecting topic t in the current state is the inner product of these two vectors:

$$Q(s, t) = \phi(s)^{\top}\, \psi(e_t)$$

For each candidate topic, the Q value is first computed with the above formula, and the probability of selecting each topic is then computed as:

$$\pi(t \mid s) = \frac{\exp\left(Q(s, t)/\tau\right)}{\sum_{t'} \exp\left(Q(s, t')/\tau\right)}$$

If the proactive dialogue mode is counter-question, one topic is sampled according to these probabilities; if it is recommendation, n topics are sampled.

Step 3: randomly select one knowledge point from each sampled topic as the counter-question or recommended knowledge point.
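Steps 2 and 3 can be sketched as follows. The sketch assumes the two branch networks have already produced same-dimension vectors for the state and for each topic, that topic probabilities are a softmax of Q/τ, and that the n topics are drawn without replacement; the topic names and the knowledge base are hypothetical.

```python
import math
import random


def topic_q(state_vec, topic_vec):
    """Q(s, t) as the inner product of the state and topic representations."""
    return sum(a * b for a, b in zip(state_vec, topic_vec))


def sample_topics(state_vec, topic_vecs, n, tau, rng):
    """Sample up to n distinct topics with probability proportional to exp(Q/tau)."""
    names = list(topic_vecs)
    qs = [topic_q(state_vec, topic_vecs[t]) for t in names]
    top = max(qs)
    weights = [math.exp((q - top) / tau) for q in qs]
    chosen = []
    for _ in range(min(n, len(names))):
        i = rng.choices(range(len(names)), weights=weights, k=1)[0]
        chosen.append(names.pop(i))   # remove the topic so it cannot be drawn twice
        weights.pop(i)
    return chosen


def pick_knowledge_points(topics, knowledge_base, rng):
    """Step 3: one random knowledge point from each sampled topic."""
    return [rng.choice(knowledge_base[t]) for t in topics]


rng = random.Random(0)
topic_vecs = {"asr": [2.0, 0.0], "tts": [0.5, 1.0], "nlu": [-1.0, 3.0]}
knowledge_base = {"asr": ["far-field", "wake-word"], "tts": ["voices"], "nlu": ["intents"]}
topics = sample_topics([1.0, 0.0], topic_vecs, n=2, tau=1.0, rng=rng)
points = pick_knowledge_points(topics, knowledge_base, rng)
print(len(topics), len(points))  # 2 2
```

For the counter-question mode the same functions apply with n=1.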

Fig. 6 is a flow chart of an embodiment of the human-machine dialogue method of the invention, which includes the following steps:

Step 1: receive the user's question or reply;

Step 2: perform semantic understanding on the user's utterance. There are four types of parsing result: the user selects one of the knowledge points recommended by the system; the user answers the system's counter-question affirmatively or negatively; the user asks about another knowledge point k_t; the user wants to exit the dialogue;

Step 3: if the user does not want to exit, update the current topic according to the semantic parsing result;

Step 4: determine the main content of the reply to the user. If the user selected a knowledge point from the previous turn's recommendations, or answered the system's counter-question about a knowledge point affirmatively, provide the content of that knowledge point directly. If the user asked a new question, retrieve the content of the corresponding knowledge point from the knowledge base;

Step 5: update the dialogue state. Extract the current topic and dialogue features as the current-time-step input of the recurrent neural network, and take the hidden representation at the current time step as the vector representation of the current state;

Step 6: the main policy decides the proactive dialogue mode: recommendation, counter-question, or neither;

Step 7: determine the specific proactive dialogue content according to the mode selected in step 6. In recommendation mode, select n topics according to the recommendation policy, then sample one knowledge point from each topic to form a recommendation list. In counter-question mode, select one topic according to the counter-question policy, then select one knowledge point from that topic as the counter-question content. If the mode is neither, the proactive dialogue content for the current turn is empty.

Step 8: present the reply content from step 4 and the proactive dialogue content from step 7 to the user, then return to step 1.
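One pass through steps 2 to 8 can be sketched as a single function. Everything here is a simplified assumption: user acts are plain dictionaries, the two callables stand in for the main policy and the topic policies, and the recurrent state update of step 5 is omitted.

```python
import random


def dialogue_turn(user_act, ctx, kb, choose_mode, choose_topics, rng, n=2):
    """Return (reply, mode, offered) for one turn, or None when the user exits.

    ctx holds the current topic and the (topic, knowledge point) pairs offered last turn;
    kb maps topic -> {knowledge point: content}.
    """
    if user_act["type"] == "bye":                       # steps 2-3: exit check
        return None
    reply = ""
    if user_act["type"] in ("select", "affirm"):        # step 4: deliver an offered point
        topic, kp = ctx["offered"][user_act.get("index", 0)]
        ctx["topic"], reply = topic, kb[topic][kp]
    elif user_act["type"] == "ask":                     # step 4: look up a new question
        topic, kp = user_act["topic"], user_act["kp"]
        ctx["topic"], reply = topic, kb[topic][kp]
    mode = choose_mode(ctx)                             # step 6: main policy
    count = {"recommend": n, "confirm": 1, "null": 0}[mode]
    topics = choose_topics(ctx, count) if count else []
    # Step 7: one random knowledge point per selected topic.
    ctx["offered"] = [(t, rng.choice(sorted(kb[t]))) for t in topics]
    return reply, mode, ctx["offered"]                  # step 8: present both parts


kb = {"asr": {"far-field": "Far-field ASR overview."}, "tts": {"voices": "Available voices."}}
ctx = {"topic": None, "offered": []}
first = dialogue_turn({"type": "ask", "topic": "asr", "kp": "far-field"}, ctx, kb,
                      lambda c: "confirm", lambda c, k: ["tts"][:k], random.Random(0))
print(first)  # ('Far-field ASR overview.', 'confirm', [('tts', 'voices')])
```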

The above steps are the online service flow of the whole system and do not cover the training of the policies. During online service, the system stores experience data in an experience data pool. Each experience record includes the current state of the dialogue system, the action taken (the selected proactive dialogue mode, for example recommendation or counter-question), the state of the dialogue system at the next time step, and the reward received. The training process can be decoupled from the online service flow and carried out offline. At regular intervals, the training service samples a batch of data from the experience data pool and optimizes the policy parameters according to the DQN loss function (TD error). After the parameters are updated, the newest parameters are pushed to the online service.
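The TD error used by the training service can be illustrated with the standard one-step DQN target, r + γ·max Q(s', a'). This is a sketch under assumptions: the discount factor, the tabular stand-in for the Q network and the batch layout are illustrative, and the text does not specify how the three policies' targets are combined.

```python
def td_loss(batch, q, gamma=0.9):
    """Mean squared TD error over a batch of (s, a, r, s_next, next_actions) tuples.

    q(state, action) returns the current Q estimate. An empty next_actions list
    marks a terminal transition, whose target is the reward alone.
    """
    total = 0.0
    for s, a, r, s_next, next_actions in batch:
        best_next = max((q(s_next, b) for b in next_actions), default=0.0)
        target = r + gamma * best_next
        total += (target - q(s, a)) ** 2
    return total / len(batch)


# A tabular stand-in for the Q network, for illustration only.
table = {("s0", "recommend"): 1.0, ("s1", "recommend"): 2.0, ("s1", "confirm"): 4.0}


def q_table(state, action):
    return table.get((state, action), 0.0)


batch = [("s0", "recommend", 1.0, "s1", ["recommend", "confirm"])]
print(td_loss(batch, q_table, gamma=0.5))  # 4.0
```

In a real system the loss would be minimized by gradient descent on the network parameters; the tabular function above only shows how the target and the error are formed.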

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.

In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.

As shown in Fig. 7, an embodiment of the present invention also provides an interactive system 700, comprising:

An active interlocution mode determining program module 710, configured to use the current state of the interactive system as the input of a first neural network to determine the active interlocution mode of the interactive system toward the user;

A to-be-recommended topic determining program module 720, configured to determine the topic to be recommended according to the active interlocution mode;

A recommendation probability determining program module 730, configured to use the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

A recommended knowledge point selecting program module 740, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user.
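How modules 710 and 730 might consume the current state can be sketched with two deliberately tiny stand-ins for the neural networks. Everything here is an assumption for illustration: a single linear layer with an argmax stands in for the first neural network, a logistic unit over the concatenated state and topic feature vector stands in for the second, and the threshold used to keep topics is hypothetical. The sketch stops at topic selection and leaves the choice of a knowledge point within each topic open.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical labels for the three active interlocution modes.
MODES = ("recommend", "ask", "neither")


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def determine_mode(state: Sequence[float],
                   mode_weights: Sequence[Sequence[float]]) -> str:
    """Stand-in for module 710: score each mode from the state, pick the best."""
    scores = [_dot(w, state) for w in mode_weights]
    return MODES[scores.index(max(scores))]


def recommendation_probability(state: Sequence[float],
                               topic_vec: Sequence[float],
                               weights: Sequence[float]) -> float:
    """Stand-in for module 730: sigmoid over the concatenated state and topic vector."""
    z = _dot(weights, list(state) + list(topic_vec))
    return 1.0 / (1.0 + math.exp(-z))


def select_topics(state: Sequence[float],
                  topic_vecs: Dict[str, Sequence[float]],
                  weights: Sequence[float],
                  threshold: float = 0.5) -> List[str]:
    """Keep topics whose probability reaches the (assumed) threshold, best first."""
    probs = {t: recommendation_probability(state, v, weights)
             for t, v in topic_vecs.items()}
    ranked = sorted(probs, key=probs.get, reverse=True)
    return [t for t in ranked if probs[t] >= threshold]
```

With `mode_weights` holding one weight row per mode, `determine_mode` returns the mode whose row scores highest for the given state, and `select_topics` returns the candidate topics from which knowledge points would then be drawn.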

In some embodiments, an embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs that include executable instructions. The executable instructions can be read and executed by an electronic device (including but not limited to a computer, a server, or a network device) to perform any of the above interactive methods of the present invention.

In some embodiments, an embodiment of the present invention also provides a computer program product. The computer program product includes a computer program stored on a non-volatile computer-readable storage medium, and the computer program includes program instructions which, when executed by a computer, cause the computer to perform any of the above interactive methods.

In some embodiments, an embodiment of the present invention also provides an electronic device, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the interactive method.

In some embodiments, an embodiment of the present invention also provides a storage medium on which a computer program is stored, characterized in that the steps of the interactive method are implemented when the program is executed by a processor.

The interactive system of the above embodiment of the present invention can be used to perform the interactive method of the embodiments of the present invention and correspondingly achieves the technical effects achieved by that method, which are not repeated here. In the embodiments of the present invention, the relevant functional modules may be implemented by a hardware processor.

Fig. 8 is a schematic diagram of the hardware structure of an electronic device for performing the interactive method according to another embodiment of the present application. As shown in Fig. 8, the device includes:

One or more processors 810 and a memory 820; one processor 810 is taken as an example in Fig. 8.

The device for performing the interactive method may further include an input device 830 and an output device 840.

The processor 810, the memory 820, the input device 830 and the output device 840 may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 8.

As a non-volatile computer-readable storage medium, the memory 820 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the interactive method in the embodiments of the present application. By running the non-volatile software programs, instructions and modules stored in the memory 820, the processor 810 executes the various functional applications and data processing of the server, that is, implements the interactive method of the above method embodiments.

The memory 820 may include a program storage area and a data storage area, wherein the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store data created through use of the human-machine dialogue device. In addition, the memory 820 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some embodiments, the memory 820 optionally includes memory located remotely from the processor 810, and such remote memory may be connected to the human-machine dialogue device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 830 can receive input numeric or character information and generate signals related to the user settings and function control of the human-machine dialogue device. The output device 840 may include a display device such as a display screen.

The one or more modules are stored in the memory 820 and, when executed by the one or more processors 810, perform the interactive method of any of the above method embodiments.

The above product can perform the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.

The electronic device of the embodiments of the present application exists in various forms, including but not limited to:

(1) Mobile communication devices: these devices are characterized by mobile communication functions, with voice and data communication as their main goal. Such terminals include smart phones (e.g. iPhone), multimedia phones, feature phones, and low-end phones.

(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access. Such terminals include PDA, MID and UMPC devices, e.g. iPad.

(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (e.g. iPod), handheld devices, e-book readers, smart toys, and portable in-vehicle navigation devices.

(4) Servers: devices that provide computing services. A server comprises a processor, a hard disk, memory, a system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capacity, stability, reliability, security, scalability, and manageability.

(5) Other electronic devices with data interaction functions.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution in essence, or the part of it that contributes to the related art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in each embodiment or in certain parts of an embodiment.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. An interactive method, applied to an interactive system, the method comprising:

Using the current state of the interactive system as the input of a first neural network to determine the active interlocution mode of the interactive system toward the user;

Determining the recommendation probability of the topic to be recommended by using the current state and the feature vector of the topic to be recommended as the input of a second neural network;

Selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user;

Generating the current state based at least on statistical information on the knowledge points accessed by the user in the interactive system from the start of the dialogue up to the current time, and on current topic information;

Wherein the current topic information is topic vector information characterizing the current topic, and the statistical information on the accessed knowledge points includes at least:

First vector information on the number of knowledge points accessed by the user in the current topic;

Second vector information on the number of knowledge points negated by the user in the current topic;

2. The method according to claim 1, wherein the generating the current state based at least on statistical information on the knowledge points accessed by the user in the interactive system from the start of the dialogue up to the current time, and on current topic information, comprises:

Generating a current system feature vector based on the current topic vector information and the first to fourth vector information;

Compressing the current system feature vector using a recurrent neural network to obtain the current state.

3. The method according to claim 1, wherein the active interlocution mode is a rhetorical question dialogue mode, in which case the topic to be recommended is one candidate topic,

The determining the recommendation probability of the topic to be recommended by using the current state and the feature vector of the topic to be recommended as the input of the second neural network comprises:

Determining the recommendation probability of the candidate topic by using the current state and the feature vector of the candidate topic as the input of the second neural network.

4. The method according to claim 1, wherein the active interlocution mode is a recommendation dialogue mode, in which case the topic to be recommended includes multiple candidate topics,

Determining multiple recommendation probabilities corresponding to the multiple candidate topics by using the current state and the feature vectors of the multiple candidate topics as the input of the second neural network.

5. The method according to claim 1, further comprising:

Recording and storing the dialogue experience data of each dialogue round to form an experience data pool, the dialogue experience data including at least, for each dialogue round, the current state of the dialogue system, the action taken, the state of the dialogue system at the next moment, and the reward received by the dialogue system;

Training the first neural network and/or the second neural network at a predetermined period based on the dialogue experience data in the experience data pool.

6. An interactive system, comprising:

An active interlocution mode determining program module, configured to use the current state of the interactive system as the input of a first neural network to determine the active interlocution mode of the interactive system toward the user;

A recommendation probability determining program module, configured to use the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

A recommended knowledge point selecting program module, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user;

7. An electronic device, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the method according to any one of claims 1-5.

8. A storage medium on which a computer program is stored, characterized in that the steps of the method according to any one of claims 1-5 are implemented when the program is executed by a processor.