CN108763495A

CN108763495A - Human-machine dialogue method, system, electronic device and storage medium

Info

Publication number: CN108763495A
Application number: CN201810536651.0A
Authority: CN
Inventors: 陈露; 初敏; 杨超; 葛付江; 郭涛涛
Original assignee: AI Speech Ltd
Current assignee: Sipic Technology Co Ltd
Priority date: 2018-05-30
Filing date: 2018-05-30
Publication date: 2018-11-06
Anticipated expiration: 2038-05-30
Also published as: CN108763495B

Abstract

The invention discloses a human-machine dialogue method applied to a human-machine dialogue system. The method comprises: taking the current state of the human-machine dialogue system as the input of a first neural network to determine the active dialogue mode of the system toward the user; determining topics to be recommended according to the active dialogue mode; taking the current state and the feature vectors of the topics to be recommended as the input of a second neural network to determine the recommendation probability of each topic to be recommended; and selecting, according to the recommendation probabilities, knowledge points to be recommended from the topics to be recommended and presenting them to the user. Because the active dialogue mode is determined from the current state of the dialogue system, the dialogue that the system actively initiates is better targeted and better matched to the progress of the current conversation, which improves the user experience.

Description

Human-machine dialogue method, system, electronic device and storage medium

Technical field

The present invention relates to the field of artificial intelligence, and more particularly to a human-machine dialogue method, system, electronic device and storage medium.

Background technology

Various dialogue robots have emerged on the market. Some take the form of personal assistants (Siri, Cortana, Magic horn, etc.), some are chatbots (Xiaoice, Duer), and others are built into terminals such as smart speakers, in-vehicle devices and smart TVs. The dialogue technology behind these robots falls broadly into four types:

Task-oriented dialogue: vertical-domain dialogue customized for tasks that users commonly need, such as ordering food, booking tickets, or finding music, films or particular goods. After the user says a sentence, the system first determines which task is being requested and extracts the user's demand parameters (for example, departure place or restaurant type). If the predefined required parameters have not all been collected, the machine obtains the missing information by asking questions. Task-oriented dialogue is therefore usually multi-turn, and the user may also keep modifying and refining the request during the dialogue.

Dialogue based on question-answer pairs: knowledge is organized as question-answer pairs. The user's question is compared with the questions in the pairs, the closest one is found, and its answer is returned. This kind of dialogue is common in customer-service robots and chat robots. Most such question-answering services are single-turn; some have limited multi-turn ability, mainly simple context handling and coreference resolution.

Dialogue based on a knowledge graph: the user queries, in natural language or speech, factual knowledge stored as triples, for example "How tall is Yao Ming's daughter?". This kind of dialogue requires some reasoning ability from the robot; the question above is actually answered in two steps: (1) daughter-of(Yao Ming) -> Yao Qinlei; (2) height-of(Yao Qinlei) -> 160cm. Most such dialogues are single-turn; some have limited multi-turn ability, mainly coreference resolution ("How old is she?", she -> Yao Qinlei).

Generative chat: a trained neural network model automatically generates an answer to a user's utterance. This kind of chat has no explicit communication goal or domain restriction: whenever the user says something, the system generates a reply that is related to the input to some degree but serves no explicit goal. Some people also call this open-domain chat. In existing human-machine dialogue systems, open-domain chat mainly serves to build rapport and trust, provide emotional companionship, smooth the dialogue (for example, when a task-oriented dialogue cannot satisfy the user's need), and increase user stickiness.

Intelligent robots in the prior art use one of these forms or a combination of several. In essentially all of these dialogues the human is the active party and the machine is the passive party: the machine waits for the person to ask and then provides an answer. In task-oriented dialogue the machine can also ask the person questions, but only within a very explicitly defined scope. For example, a ticket-buying task predefines three required parameters: departure place, destination and departure time. Only when the user has supplied all of them can the robot query the ticketing system; if any is missing, the robot obtains it by asking. On the whole, however, the machine remains in a passive, waiting state.

In addition, in the prior art the way the machine questions the user is hard-coded in advance as question-answer pairs. A suitable way of asking cannot be selected according to the actual interaction state and the historical feedback from the dialogue, so existing questioning is mechanical and impersonal and often gives the user a poor human-machine dialogue experience.

Summary of the invention

Embodiments of the present invention provide a human-machine dialogue method and system to solve at least one of the above technical problems.

In a first aspect, an embodiment of the present invention provides a human-machine dialogue method applied to a human-machine dialogue system, the method comprising:

taking the current state of the human-machine dialogue system as the input of a first neural network to determine the active dialogue mode of the human-machine dialogue system toward the user;

determining a topic to be recommended according to the active dialogue mode;

taking the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user.

In a second aspect, an embodiment of the present invention provides a human-machine dialogue system, comprising:

an active dialogue mode determination program module, configured to take the current state of the human-machine dialogue system as the input of a first neural network to determine the active dialogue mode of the human-machine dialogue system toward the user;

a to-be-recommended topic determination program module, configured to determine a topic to be recommended according to the active dialogue mode;

a recommendation probability determination program module, configured to take the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended;

a recommended knowledge point selection program module, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user.

In a third aspect, an embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs that include executable instructions. The executable instructions can be read and executed by an electronic device (including but not limited to a computer, a server or a network device) to perform any of the above human-machine dialogue methods of the present invention.

In a fourth aspect, an electronic device is provided, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform any of the above human-machine dialogue methods of the present invention.

In a fifth aspect, an embodiment of the present invention further provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium. The computer program includes program instructions which, when executed by a computer, cause the computer to perform any of the above human-machine dialogue methods.

This embodiment determines the active dialogue mode toward the user according to the current state of the human-machine dialogue system, which ensures that dialogue actively initiated by the system is better targeted. After the active dialogue mode is determined, the topics to be recommended for the active dialogue are determined accordingly. A pre-trained second neural network then determines the recommendation probability of each topic to be recommended from the current state of the system and the feature vector of the topic, so that the knowledge points used to initiate the active dialogue can be obtained from the corresponding topics according to the recommendation probabilities.

Description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of an embodiment of the human-machine dialogue method of the present invention;

Fig. 2 is a flow chart of another embodiment of the human-machine dialogue method of the present invention;

Fig. 3 is a flow chart of a further embodiment of the human-machine dialogue method of the present invention;

Fig. 4 is a structural schematic diagram of the Q neural network representing the main policy;

Fig. 5 is a structural schematic diagram of the Q neural network used to determine the Q value of selecting topic t in the current state;

Fig. 6 is a flow chart of yet another embodiment of the human-machine dialogue method of the present invention;

Fig. 7 is a functional block diagram of an embodiment of the human-machine dialogue system of the present invention;

Fig. 8 is a structural schematic diagram of an embodiment of the electronic device of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

It should be noted that, where no conflict arises, the embodiments of the present application and the features in them can be combined with one another.

The present invention can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The invention can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.

In the present invention, terms such as "module", "device" and "system" refer to computer-related entities, such as hardware, a combination of hardware and software, software, or software in execution. In particular, an element may be, but is not limited to, a process running on a processor, a processor, an object, an executable element, a thread of execution, a program and/or a computer. An application or script running on a server, or the server itself, may also be an element. One or more elements may reside within a process and/or thread of execution; an element may be localized on one computer and/or distributed between two or more computers, and may operate from various computer-readable media. Elements may also communicate according to a signal having one or more data packets, for example a signal from data that interacts with another element in a local system or a distributed system, and/or interacts with other systems across a network such as the Internet.

Terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include" and "comprise" cover not only the listed elements but also other elements not explicitly listed, or elements inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of other identical elements in the process, method, article or device that includes it.

In the embodiments of the present invention, the dialogue between a person and the human-machine dialogue system is regarded as a Markov decision process (MDP). At each time step the system is in state s and takes action a according to policy π(a|s); it then observes the user's reply o', the state changes to s', and it receives reward r'.

An embodiment of the present invention provides a human-machine dialogue method applied to a human-machine dialogue system. As shown in Fig. 1, the method includes:

S11, taking the current state of the human-machine dialogue system as the input of a first neural network to determine the active dialogue mode of the human-machine dialogue system toward the user, where the first neural network is a first Q neural network.

The current state is generated at least from statistics on the knowledge points accessed by the user in the dialogue with the system, from its start up to the current time, together with current-topic information. The current-topic information is a topic vector that characterizes the current topic, and the statistics on accessed knowledge points include at least:

first vector information: the number of knowledge points accessed by the user in the current topic;

second vector information: the number of knowledge points rejected by the user in the current topic;

third vector information: the number of knowledge points rejected by the user from the start of the dialogue up to the current time;

fourth vector information: the maximum number of knowledge points rejected consecutively by the user from the start of the dialogue up to the current time.

The statistics on accessed knowledge points are determined by the human-machine dialogue system from the user's replies to the recommended knowledge points, specifically:

The human-machine dialogue system parses the user's reply with a semantic understanding module. There are four types of parse result:

The user selects one of the knowledge points recommended by the system; the semantic representation is inform(series=id), for example "the first one", "the second one", "the last one";

The user answers a counter-question affirmatively or negatively; the semantic representation is deny() for a negative answer and affirm() for an affirmative one, for example "OK, tell me about it", "not interested", "I don't want to know";

The system made no recommendation or counter-question in the previous turn, or the user ignores the system's recommendation or counter-question and asks about another knowledge point k_t; the semantic representation is inform(kp=k_t);

The user ends the dialogue; the semantic representation is bye().
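The four counters and the four reply types above can be tied together in a small bookkeeping class. The following is a minimal sketch, not the patent's implementation: the class name `AccessStats`, its field names, the choice to count both `inform` and `affirm` as an access, and the per-topic reset in `switch_topic` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccessStats:
    """Counters behind the first to fourth vector information (names are illustrative)."""
    accessed_in_topic: int = 0       # knowledge points accessed in the current topic
    denied_in_topic: int = 0         # knowledge points rejected in the current topic
    denied_total: int = 0            # rejections since the dialogue started
    max_consecutive_denied: int = 0  # longest run of consecutive rejections
    _run: int = field(default=0, repr=False)  # current run of rejections

    def update(self, act_type: str) -> None:
        """Update the counters from one parsed user reply."""
        if act_type in ("inform", "affirm"):
            # Assumption: a selection, a direct question, or a "yes" counts as an access.
            self.accessed_in_topic += 1
            self._run = 0
        elif act_type == "deny":
            self.denied_in_topic += 1
            self.denied_total += 1
            self._run += 1
            self.max_consecutive_denied = max(self.max_consecutive_denied, self._run)
        # "bye" ends the dialogue and changes no counter.

    def switch_topic(self) -> None:
        """Reset the per-topic counters when the current topic changes (assumed behaviour)."""
        self.accessed_in_topic = 0
        self.denied_in_topic = 0

    def as_vector(self) -> list:
        """Return the four counters as floats, ready to concatenate with a topic vector."""
        return [float(self.accessed_in_topic), float(self.denied_in_topic),
                float(self.denied_total), float(self.max_consecutive_denied)]
```

Calling `update` once per parsed reply and `as_vector` once per turn yields the statistical part of the state input.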

In this embodiment, the current state of the dialogue system is determined from the statistics on the knowledge points accessed during the dialogue the current user has already had with the system, together with the current topic. This state is fed to the first neural network to determine the active dialogue mode in which the system next initiates dialogue. Each time the method actively initiates dialogue, it therefore takes into account the feedback from the dialogue so far and the current topic. The active dialogue mode selected each time is better targeted and closer to the user's current intent, so the user accepts it more readily, the dialogue proceeds smoothly and pleasantly, and the user's human-machine dialogue experience improves.

S12, determining a topic to be recommended according to the active dialogue mode;

The active dialogue mode includes at least a counter-question mode, a recommendation mode, and neither counter-question nor recommendation, specifically:

- confirm(kp=kt): ask the user whether they are interested in knowledge point kt under dialogue topic t, for example "Are you interested in our far-field speech recognition technology?".

- Recommend n knowledge points to the user, for example:

"You can also ask me: 1. Introduce AI Speech's typical products for the smart home. 2. Which of your technologies does Tmall Genie use? 3. What hardware products does AI Speech have?" For simplicity, it can be assumed that the n knowledge points come from n topics, that is, one knowledge point is randomly sampled from each topic (of course, the n knowledge points may also be selected from m topics, where m is less than n).

- null: neither recommend nor ask a counter-question.

S13, taking the current state and the feature vector of the topic to be recommended as the input of a second neural network to determine the recommendation probability of the topic to be recommended, where the second neural network is a second Q neural network;

When the active dialogue mode is the counter-question mode, the topic to be recommended is one candidate topic, and determining the recommendation probability of the topic to be recommended by taking the current state and the feature vector of the topic to be recommended as the input of the second neural network includes:

taking the current state and the feature vector of the candidate topic as the input of the second neural network to determine the recommendation probability of the candidate topic.

When the active dialogue mode is the recommendation mode, the topic to be recommended includes multiple candidate topics, and determining the recommendation probability of the topic to be recommended by taking the current state and the feature vector of the topic to be recommended as the input of the second neural network includes:

taking the current state and the feature vectors of the multiple candidate topics as the input of the second neural network to determine multiple recommendation probabilities corresponding to the multiple candidate topics.

S14, selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user.

This embodiment determines the active dialogue mode toward the user according to the current state of the human-machine dialogue system, which ensures that dialogue actively initiated by the system is better targeted and matches the progress of the current dialogue, improving the user experience. After the active dialogue mode is determined, the topics to be recommended for the active dialogue are determined accordingly. A pre-trained second neural network then determines the recommendation probability of each topic to be recommended from the current state of the system and the feature vector of the topic, so that the knowledge points used to initiate the active dialogue can be obtained from the corresponding topics according to the recommendation probabilities.

As shown in Fig. 2, in some embodiments of the present invention, generating the current state at least from the statistics on the knowledge points accessed by the user in the dialogue with the system, from its start up to the current time, and from the current-topic information includes:

S21, generating a current system feature vector from the topic vector information and the first to fourth vector information;

S22, compressing the current system feature vector with a recurrent neural network to obtain the current state.

The topic vector information and the first to fourth vector information can be concatenated into a feature vector, denoted x. Besides this information, history also matters: a recurrent neural network (for example, an RNN or LSTM) can compress all x from the start of the dialogue up to now into a single state representation vector s. That is, the network's input at each time step is the x of that time step, and the state at the last time step is s. By using a recurrent neural network, this embodiment takes history into account when selecting topics, so that each topic jump is more personalized and accurate.
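To make the compression step concrete, here is a minimal sketch of a plain (Elman) recurrent network that folds a sequence of feature vectors x into one state vector s. It is written in pure Python for readability; the function names, the tanh nonlinearity and the zero initial state are illustrative assumptions, and the text equally allows an LSTM.

```python
import math


def rnn_step(h, x, w_xh, w_hh, b):
    """One recurrent step: h' = tanh(W_xh x + W_hh h + b), with weights given as row lists."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_xh[j], x))
            + sum(w * hi for w, hi in zip(w_hh[j], h))
            + b[j]
        )
        for j in range(len(b))
    ]


def encode_dialogue(xs, w_xh, w_hh, b):
    """Feed every per-turn feature vector x in order; the final hidden state is s."""
    h = [0.0] * len(b)  # assumed zero initial state
    for x in xs:
        h = rnn_step(h, x, w_xh, w_hh, b)
    return h
```

In an online system only `rnn_step` needs to run per turn, since the previous hidden state already summarizes the earlier turns.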

As shown in Fig. 3, in some embodiments, the human-machine dialogue method further includes:

S31, recording and storing dialogue experience data in each turn of dialogue to form an experience data pool, the dialogue experience data including at least, for each turn, the current state of the dialogue system, the action taken, the state of the dialogue system at the next time step, and the reward signal received by the dialogue system;

S32, training the first neural network and/or the second neural network at a predetermined period based on the dialogue experience data in the experience data pool.
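The experience data pool of S31 behaves like the replay buffer commonly used with DQN. The sketch below is one possible shape for it, under stated assumptions: the names `Experience` and `ExperiencePool`, the fixed capacity with oldest-first eviction, and uniform sampling are illustrative choices that the text does not specify.

```python
import random
from collections import deque, namedtuple

# One record per dialogue turn, with the four fields listed in S31.
Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


class ExperiencePool:
    """Fixed-capacity pool; the oldest experiences are dropped first (assumed policy)."""

    def __init__(self, capacity=10000):
        self._buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self._buffer.append(Experience(state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a uniform random batch without replacement for one training update."""
        size = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), size)

    def __len__(self):
        return len(self._buffer)
```

A periodic training job in the sense of S32 would call `sample` at each update, while the online service only calls `add`.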

In the above embodiment, the reward signal can come from the following two sources:

The user's reply to the system's recommendation or counter-question:

If the user selects one of the knowledge points recommended in the previous turn, the system can be given a positive reward; otherwise it is given a negative reward;

If the user answers affirmatively to the knowledge point in the system's counter-question, the system is given a positive reward; otherwise it is given a negative reward.

The specific reward values in these two cases can differ. In general, dialogue in the counter-question mode is more natural, so the absolute value of the (positive or negative) reward in the counter-question mode can be larger than in the recommendation mode.

The system's own goal: if the system has a definite goal to steer toward, it obtains a positive reward when it achieves that goal. The goal depends on the type of dialogue system, for example:

Publicity: enough information is presented to the user. Each knowledge point introduced to the user earns a positive reward, and different rewards can be set for knowledge points of different topics;

Shopping guide: when the user places an order or performs a purchase-related action, a larger positive reward is obtained;

Business: when a business cooperation is facilitated, a larger positive reward is obtained;

Recruitment: when a resume is obtained, a larger positive reward is obtained.

The reward signal need not be limited to the two sources above; the system designer can design it according to the specific task.
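The two reward sources can be combined in a few lines. This is a sketch under stated assumptions: the numeric values, the mode labels `"recommend"` and `"confirm"`, the act labels, and the additive `goal_bonus` are all hypothetical; the text fixes only the signs and the suggestion that counter-question rewards be larger in absolute value.

```python
def user_feedback_reward(mode, user_act, r_recommend=1.0, r_confirm=2.0):
    """Reward from the user's reply to the previous active dialogue turn.

    mode:     "recommend", "confirm" (counter-question) or "null"
    user_act: "inform_series" (picked a recommended item), "affirm", "deny", ...
    The default magnitudes are illustrative; only r_confirm > r_recommend follows the text.
    """
    if mode == "recommend":
        return r_recommend if user_act == "inform_series" else -r_recommend
    if mode == "confirm":
        return r_confirm if user_act == "affirm" else -r_confirm
    return 0.0  # nothing was proposed, so there is no feedback to score


def total_reward(mode, user_act, goal_bonus=0.0):
    """Add a task-specific bonus (e.g. an order placed) on top of the user feedback."""
    return user_feedback_reward(mode, user_act) + goal_bonus
```

A publicity system might pass a small `goal_bonus` for every knowledge point shown, whereas a shopping-guide system would pass a large one only when an order is placed.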

With the above reward signal, a reinforcement learning algorithm can be used to optimize the policy, so that in each state the active dialogue mode and topic (knowledge point) selected by the policy maximize the cumulative reward obtained. In the embodiments of the present invention the dialogue policy is represented by Q networks and optimized with DQN (Deep Q-Network). Unlike ordinary DQN, the embodiments use three different policies: a main policy, a recommendation policy and a counter-question policy. Each policy has its own Q network, and they can share one experience data pool D. At each parameter update, a batch of data is sampled from the experience data pool and the parameters are optimized according to the DQN loss function (TD error). In the early stage of system optimization, if the selected topics are not restricted, the topics selected by the policy are highly random and the results may be poor. To address this, the selected topics can be restricted to a certain range in the early stage, such as sibling topics or sub-topics of the current topic.
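For reference, the TD error mentioned above can be written out for a single batch. This is a minimal sketch of the standard one-step DQN target, not a statement of the patent's exact loss: the discount factor `gamma`, the `terminal` flag and the mean-squared form are assumptions, and the Q values are passed in as plain numbers rather than produced by a network.

```python
def td_target(reward, next_q_values, gamma=0.9, terminal=False):
    """One-step target: r + gamma * max_a' Q(s', a'); just r when the dialogue has ended."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


def td_loss(batch, gamma=0.9):
    """Mean squared TD error over a batch.

    Each batch item is (q_sa, reward, next_q_values, terminal), where q_sa is the
    current estimate Q(s, a) for the action actually taken.
    """
    errors = [td_target(r, nq, gamma, done) - q for q, r, nq, done in batch]
    return sum(e * e for e in errors) / len(errors)
```

The same function would apply to each of the three Q networks, with the actions being dialogue modes for the main policy and topics for the other two.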

The dialogue policy decides how to recommend to and question the user, and is divided into three steps:

Step 1: determine the active dialogue mode. The main policy π^m(a^m|s) decides which mode to use: recommend, counter-question, or neither.

Fig. 4 shows the structure of the Q neural network representing the main policy in an embodiment of the present invention. The input of the Q neural network is the current state s, and the output layer has 3 dimensions, corresponding to the Q values obtained by choosing each of the 3 modes. The probability of each active dialogue mode a^m is then:
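The formula itself is not reproduced in this text. A form consistent with the surrounding description (Q values turned into probabilities, with τ controlling how uniform they are) is the Boltzmann, or softmax, distribution; the notation Q^m for the output of the main-policy network is an assumption:

$$
\pi^m(a^m \mid s) = \frac{\exp\left(Q^m(s, a^m)/\tau\right)}{\sum_{a'} \exp\left(Q^m(s, a')/\tau\right)}
$$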

In the above formula, τ is a hyperparameter that controls how much the policy explores: the larger τ is, the more uniform the above probabilities are and the more the system explores. Typically τ starts large and is then gradually reduced. At each decision, an active dialogue mode is sampled according to the above probabilities.
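The effect of τ is easy to see numerically. The sketch below assumes the probabilities take the softmax-with-temperature form that the description of τ suggests; the mode labels in `MODES` and the function names are illustrative.

```python
import math
import random

MODES = ("recommend", "confirm", "null")  # illustrative labels for the three modes


def boltzmann_probs(q_values, tau):
    """Softmax with temperature tau; subtracting the max keeps exp() from overflowing."""
    top = max(q_values)
    exps = [math.exp((q - top) / tau) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mode(q_values, tau, rng=random):
    """Sample one active dialogue mode in proportion to its probability."""
    probs = boltzmann_probs(q_values, tau)
    return rng.choices(MODES, weights=probs, k=1)[0]
```

With a large τ the three probabilities approach one third each, so the system explores; as τ shrinks, the mode with the highest Q value is chosen almost always.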

Step 2: topic inference. Infer the topics the user is likely to be interested in next, that is, select one or more possible topics. If the active dialogue mode sampled in Step 1 is counter-question, one topic needs to be selected; if it is recommend, n topics need to be selected.

Fig. 5 shows the structure of the Q neural network used to determine the Q value of selecting topic t in the current state. The network has two branches. The input of one branch is the current state s, which passes through several layers to give a state representation φ(s). The input of the other branch is the vectorized representation e_t of topic t, which passes through several layers to give another vector of the same dimension as φ(s). The Q value Q(s, t) of selecting topic t in the current state is the inner product of these two vectors, i.e.:
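The formula is not reproduced in this text. Writing the output of the topic branch as ψ(e_t) (a symbol assumed here, since the original symbol is missing), the inner product described above would read:

$$
Q(s, t) = \phi(s)^{\top} \psi(e_t)
$$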

For each possible candidate topic, the corresponding Q value is first computed with the above formula, and the probability of selecting each topic is then computed according to the following formula:
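This formula is also missing from the text. One form consistent with the main-policy description is a softmax over the candidate set; the reuse of the temperature τ, the symbol π^t and the set symbol 𝒯 for the candidate topics are all assumptions:

$$
\pi^t(t \mid s) = \frac{\exp\left(Q(s, t)/\tau\right)}{\sum_{t' \in \mathcal{T}} \exp\left(Q(s, t')/\tau\right)}
$$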

If the active dialogue mode is counter-question, one topic is sampled according to the above probabilities; if it is recommend, n topics are sampled according to the above probabilities.

Step 3: randomly select one knowledge point from each sampled topic as the counter-question or recommended knowledge point.
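Steps 2 and 3 can be sketched end to end as follows. The code takes the two branch outputs as given vectors rather than computing them with a network, and it makes two assumptions that the text leaves open: topic probabilities are a softmax of the Q values, and the n topics are drawn without replacement so that a recommendation list has no duplicates. All function names and the knowledge-base layout are hypothetical.

```python
import math
import random


def topic_q(phi_s, psi_t):
    """Q(s, t) as the inner product of the state-branch and topic-branch outputs."""
    return sum(a * b for a, b in zip(phi_s, psi_t))


def sample_topics(q_by_topic, n, tau=1.0, rng=random):
    """Sample up to n distinct topics, each draw proportional to exp(Q / tau)."""
    remaining = dict(q_by_topic)
    chosen = []
    while remaining and len(chosen) < n:
        names = list(remaining)
        top = max(remaining.values())
        weights = [math.exp((remaining[t] - top) / tau) for t in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # without replacement (assumption)
    return chosen


def pick_knowledge_points(topics, knowledge_base, rng=random):
    """Step 3: one uniformly random knowledge point from each sampled topic."""
    return [rng.choice(knowledge_base[t]) for t in topics]
```

For the counter-question mode the call would be `sample_topics(q_by_topic, 1)`; for the recommendation mode, `sample_topics(q_by_topic, n)`.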

Fig. 6 is a flow chart of an embodiment of the human-machine dialogue method of the present invention, which includes the following steps:

Step 1: receive the user's question or reply;

Step 2: perform semantic understanding on the user's input. There are four types of parse result: the user selects one of the knowledge points recommended by the system; the user answers the system's counter-question affirmatively or negatively; the user asks about another knowledge point k_t; the user wants to exit the dialogue;

Step 3: if the user does not want to exit, update the current topic according to the semantic parse result;

Step 4: determine the main content of the reply to the user. If the user selected a knowledge point from the previous turn's recommendation, or answered affirmatively to the knowledge point in the system's counter-question, the content of that knowledge point is provided directly; if the user asked a new question, the content of the corresponding knowledge point is looked up in the knowledge base;

Step 5: update the dialogue state. Extract the current topic and the dialogue features at the current time as the input of the recurrent neural network, and take the hidden representation at the current time as the vectorized representation of the current state;

Step 6: the main policy decides the active dialogue mode: recommend, counter-question, or neither;

Step 7: determine the specific active dialogue content according to the active dialogue mode selected in Step 6. If it is the recommendation mode, select n topics according to the recommendation policy and sample one knowledge point from each topic to form the recommendation list; if it is the counter-question mode, select one topic according to the counter-question policy and select one knowledge point from that topic as the counter-question content; if it is neither recommendation nor counter-question, the active dialogue content of the current turn is empty.

Step 8: present the reply content from Step 4 and the active dialogue content from Step 7 to the user, and return to Step 1.
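Step 4 above amounts to a dispatch on the parse result. The following is a minimal sketch of that dispatch only; the act dictionary layout, the type labels (`inform_series`, `inform_kp`), and the dictionary-based knowledge base are hypothetical stand-ins for whatever the semantic understanding module and knowledge base actually provide.

```python
def main_content(act, knowledge_base, last_recommended, last_confirmed):
    """Return the main reply content for one parsed user act, or None if there is none.

    act:              e.g. {"type": "inform_series", "series": 0}
    last_recommended: knowledge points recommended in the previous turn (may be empty)
    last_confirmed:   knowledge point of the previous counter-question (may be None)
    """
    kind = act["type"]
    if kind == "inform_series":   # user picked item i from the last recommendation list
        kp = last_recommended[act["series"]]
    elif kind == "affirm":        # user said yes to the last counter-question
        kp = last_confirmed
    elif kind == "inform_kp":     # user asked about a knowledge point directly
        kp = act["kp"]
    else:                         # deny or bye: no knowledge point to present
        return None
    return knowledge_base.get(kp)
```

The value returned here is what Step 8 would present together with the active dialogue content chosen in Step 7.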

Above-mentioned steps are the online service processes of whole system, without reference to the training process of strategy.It services on line In the process, empirical data can be stored in an empirical data pond by system, and each experience includes the shape of current interactive system State, the action taken (selection active interlocution mode, for example, recommending conversational mode or rhetorical question conversational mode), next moment The state of interactive system, the reward received.Training process can be decoupled with online service process, online lower progress.Every At certain moment, training service samples a collection of (batch) data from empirical data pond, then according to the loss function (TD of root DQN Error) optimisation strategy parameter.After parameter update, the service that is then pushed to newest parameter on line.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions; however, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention certain steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.

In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

As shown in Fig. 7, an embodiment of the present invention also provides a human-machine dialogue system 700, comprising:

An active dialogue mode determination program module 710, configured to take the current state of the human-machine dialogue system as the input of a first neural network, so as to determine the active dialogue mode of the human-machine dialogue system toward the user;

A to-be-recommended topic determination program module 720, configured to determine the topic to be recommended according to the active dialogue mode;

A recommendation probability determination program module 730, configured to take the current state and the feature vector of the topic to be recommended as the input of a second neural network, so as to determine the recommendation probability of the topic to be recommended;

A recommended knowledge point selection program module 740, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user.
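The role of the second neural network in module 730 can be illustrated with a toy stand-in. The sketch below assumes, purely for illustration, a single linear layer over the concatenation of the state vector and a topic feature vector, followed by a sigmoid that yields one probability per topic; the real network architecture and output normalization are not specified in this passage. The names `score_topic`, `recommendation_probabilities`, and `top_topics` are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function mapping a raw score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def score_topic(state, topic_vec, weights, bias=0.0):
    """Toy 'second network': one linear layer over [state ; topic_vec]."""
    x = list(state) + list(topic_vec)
    return sum(w * v for w, v in zip(weights, x)) + bias

def recommendation_probabilities(state, topics, weights):
    """Map each candidate topic name to an (assumed) sigmoid probability."""
    return {
        name: sigmoid(score_topic(state, vec, weights))
        for name, vec in topics.items()
    }

def top_topics(probs, n):
    """Return the n topic names with the highest recommendation probability."""
    return sorted(probs, key=probs.get, reverse=True)[:n]
```

Scoring each topic independently lets the same function cover both cases described in the text: a single candidate topic in counter-question mode and several candidate topics in recommendation mode.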

In some embodiments, an embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs comprising executable instructions, the executable instructions being readable and executable by an electronic device (including but not limited to a computer, a server, or a network device) to perform any of the above human-machine dialogue methods of the present invention.

In some embodiments, an embodiment of the present invention also provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform any of the above human-machine dialogue methods.

In some embodiments, an embodiment of the present invention also provides an electronic device comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the human-machine dialogue method.

In some embodiments, an embodiment of the present invention also provides a storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the steps of the human-machine dialogue method.

The human-machine dialogue system of the above embodiments can be used to perform the human-machine dialogue method of the embodiments of the present invention, and accordingly achieves the technical effects achieved by that method, which are not repeated here. In the embodiments of the present invention, the relevant functional modules may be implemented by a hardware processor.

Fig. 8 is a schematic diagram of the hardware structure of an electronic device for performing the human-machine dialogue method according to another embodiment of the present application. As shown in Fig. 8, the device includes:

One or more processors 810 and a memory 820; one processor 810 is taken as an example in Fig. 8.

The device for performing the human-machine dialogue method may further include an input device 830 and an output device 840.

The processor 810, the memory 820, the input device 830, and the output device 840 may be connected by a bus or by other means; connection by a bus is taken as an example in Fig. 8.

As a non-volatile computer-readable storage medium, the memory 820 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the human-machine dialogue method in the embodiments of the present application. By running the non-volatile software programs, instructions, and modules stored in the memory 820, the processor 810 executes the various functional applications and data processing of the server, that is, implements the human-machine dialogue method of the above method embodiments.

The memory 820 may include a program storage area and a data storage area, wherein the program storage area can store an operating system and an application program required by at least one function, and the data storage area can store data created through use of the human-machine dialogue device, and the like. In addition, the memory 820 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 820 optionally includes memory located remotely from the processor 810, and such remote memory can be connected to the human-machine dialogue device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 830 can receive input numeric or character information and generate signals related to user settings and function control of the human-machine dialogue device. The output device 840 may include a display device such as a display screen.

The one or more modules are stored in the memory 820 and, when executed by the one or more processors 810, perform the human-machine dialogue method of any of the above method embodiments.

The above product can perform the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.

The electronic device of the embodiments of the present application exists in various forms, including but not limited to:

(1) Mobile communication devices: these devices are characterized by mobile communication functions, with voice and data communication as their main purpose. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones, and low-end phones.

(2) Ultra-mobile personal computer devices: these devices belong to the category of personal computers, have computing and processing functions, and generally also have mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.

(3) Portable entertainment devices: these devices can display and play multimedia content. They include audio and video players (such as the iPod), handheld devices, e-book readers, smart toys, and portable in-vehicle navigation devices.

(4) Servers: devices that provide computing services. A server comprises a processor, hard disk, memory, system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements for processing capability, stability, reliability, security, scalability, and manageability.

(5) Other electronic devices with data interaction functions.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a general-purpose hardware platform, and of course also by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes to the related art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in each embodiment or in certain parts of an embodiment.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A human-machine dialogue method, applied to a human-machine dialogue system, the method comprising:

taking the current state of the human-machine dialogue system as the input of a first neural network, to determine the active dialogue mode of the human-machine dialogue system toward the user;

taking the current state and the feature vector of the topic to be recommended as the input of a second neural network, to determine the recommendation probability of the topic to be recommended;

selecting, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and presenting it to the user.

2. The method according to claim 1, further comprising: generating the current state based at least on statistical information of the knowledge points accessed by the user and the human-machine dialogue system from the start of the dialogue until the current moment, and on current topic information.

3. The method according to claim 2, wherein the current topic information is topic vector information characterizing the current topic, and the statistical information of the accessed knowledge points includes at least:

first vector information of the number of knowledge points accessed by the user in the current topic;

second vector information of the number of knowledge points rejected by the user in the current topic；

4. The method according to claim 3, wherein generating the current state based at least on the statistical information of the knowledge points accessed by the user and the human-machine dialogue system from the start of the dialogue until the current moment, and on the current topic information, comprises:

generating a current system feature vector based on the current topic vector information and the first to fourth vector information;

compressing the current system feature vector with a recurrent neural network to obtain the current state.

5. The method according to claim 1, wherein the active dialogue mode is the counter-question dialogue mode, in which case the topic to be recommended is one candidate topic,

and taking the current state and the feature vector of the topic to be recommended as the input of the second neural network to determine the recommendation probability of the topic to be recommended comprises:

taking the current state and the feature vector of the candidate topic as the input of the second neural network, to determine the recommendation probability of the candidate topic.

6. The method according to claim 1, wherein the active dialogue mode is the recommendation dialogue mode, in which case the topic to be recommended includes multiple candidate topics,

and the current state and the feature vectors of the multiple candidate topics are taken as the input of the second neural network, to determine multiple recommendation probabilities corresponding to the multiple candidate topics.

7. The method according to claim 1, further comprising:

recording and storing the dialogue experience data of each round of dialogue to form an experience pool, the dialogue experience data including at least, for each round of dialogue, the current state of the dialogue system, the action taken, the state of the dialogue system at the next moment, and the reward received by the dialogue system;

training the first neural network and/or the second neural network at a predetermined interval based on the dialogue experience data in the experience pool.

8. A human-machine dialogue system, comprising:

an active dialogue mode determination program module, configured to take the current state of the human-machine dialogue system as the input of a first neural network, to determine the active dialogue mode of the human-machine dialogue system toward the user;

a recommendation probability determination program module, configured to take the current state and the feature vector of the topic to be recommended as the input of a second neural network, to determine the recommendation probability of the topic to be recommended;

a recommended knowledge point selection program module, configured to select, according to the recommendation probability, a knowledge point to be recommended from the topic to be recommended and present it to the user.

9. An electronic device, comprising: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the method according to any one of claims 1-7.

10. A storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.