CN109857843A

CN109857843A - Document-based interaction method and system

Info

Publication number: CN109857843A
Application number: CN201811596444.0A
Authority: CN
Inventors: 王雪初; 魏仲清; 王兴宝; 雷琴辉
Original assignee: iFlytek Co Ltd
Current assignee: iFlytek Co Ltd
Priority date: 2018-12-25
Filing date: 2018-12-25
Publication date: 2019-06-07
Anticipated expiration: 2038-12-25
Also published as: CN109857843B

Abstract

The invention discloses a document-based interaction method and system. The method includes: obtaining the user's current interaction information; performing semantic analysis on the current interaction information; deciding, according to the semantic analysis result, whether to perform information reconstruction; determining a target area in a document using the current interaction information or the reconstructed interaction information; and generating feedback information according to the content of the target area. Rather than performing keyword retrieval on the document, the invention addresses more complex and varied interaction scenarios, such as multi-round questioning: where necessary it reconstructs interaction information that is easy for a machine to understand, then uses the corresponding interaction information to lock onto candidate target areas, and finally generates the required feedback information from the candidates. The invention can accurately and reliably locate relevant information in a document and generate user-friendly feedback from it.

Description

Document-based interaction method and system

Technical field

The present invention relates to the field of human-computer interaction, and in particular to a document-based interaction method and system.

Background art

The advent of electronic text media has greatly changed the way people read: it has freed reading from constraints of place and time and made it easier and more convenient. As society develops, however, electronic text material containing large amounts of textual information inevitably overwhelms readers, who find it hard to lock onto what they are looking for quickly. Examples include determining whether the content of an e-book meets the user's expectations, or quickly locating the content the user needs in the electronic manual of a complex product.

Most existing interactive products on the market are open-domain robots, such as the voice assistants of smart terminals. They can only hold question-and-answer dialogues with the user in the open domain by means of knowledge-base search and rule matching. Interactive products of this kind, which resemble "games" or "chat", are not suited to question-and-answer interaction based on a particular document, such as a book or professional material, and therefore do little to help the user consult the relevant text quickly and accurately.

In addition, although existing electronic text readers on the market support keyword matching and search, their search strategy mostly targets single keywords, so it is difficult to locate accurately, within a particular document, the feedback the user expects. Either the feedback provided still requires cumbersome operations from the user, such as a secondary search or an elimination search, or the feedback provided fits the user's interaction poorly. In particular, when the user adopts a more complex way of asking, for example multiple rounds of related questions, existing electronic text readers are helpless.

Summary of the invention

Addressing the above pain points that users face when consulting electronic documents, the present invention provides a document-based interaction method and system that can quickly and accurately locate the content the user expects in a particular document, even in more complex interaction scenarios, freeing the user from lengthy text during consultation.

The technical solution adopted by the invention is as follows:

A document-based interaction method, comprising:

Obtaining the user's current interaction information;

Performing semantic analysis on the current interaction information;

Deciding, according to the semantic analysis result, whether to perform information reconstruction;

Determining a target area in a document using the current interaction information or reconstructed interaction information;

Generating feedback information according to the content of the target area.

Optionally, performing semantic analysis on the current interaction information includes:

Determining the literal semantic type of the current interaction information;

Determining the intention type of the current interaction information.

Optionally, deciding whether to perform information reconstruction according to the semantic analysis result includes:

According to the intention type, retaining the current interaction information, or obtaining reconstructed interaction information using interaction history data and the current interaction information.

Optionally, obtaining reconstructed interaction information using interaction history data and the current interaction information according to the intention type includes:

When the intention type is a non-normal question type, integrating, according to the non-normal question type, the corresponding semantic analysis result in the interaction history data with the current interaction information, and updating the intention type of the current interaction information, to obtain the reconstructed interaction information.

Optionally, determining a target area in a document using the current interaction information or the reconstructed interaction information includes:

Dividing the document into multiple content areas;

Extracting the text features of each content area and of the current interaction information or the reconstructed interaction information;

Calculating, from the text features, the correlation of the current interaction information or the reconstructed interaction information with each content area;

Selecting, according to the correlation, at least one content area as the target area.

Optionally, generating feedback information according to the content of the target area includes:

Calculating the attention degree between the content of the target area and the current interaction information or the reconstructed interaction information;

Generating feedback information according to the attention degree, the content of the target area, and the current interaction information or the reconstructed interaction information.

Optionally, calculating the attention degree between the content of the target area and the current interaction information or the reconstructed interaction information includes:

Calculating a first attention degree of the content of the target area to each word in the current interaction information or the reconstructed interaction information;

Calculating a second attention degree of the current interaction information or the reconstructed interaction information to each word in the content of the target area.

Optionally, generating feedback information according to the attention degree, the content of the target area, and the current interaction information or the reconstructed interaction information includes:

Determining the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interaction information or the reconstructed interaction information;

Determining target words according to the first attention degree, the second attention degree and the weights;

Generating, from the target words, the feedback information corresponding to the current interaction information or the reconstructed interaction information.

A document-based interaction system, comprising:

A current interaction information acquisition module, configured to obtain the user's current interaction information;

A semantic analysis module, configured to perform semantic analysis on the current interaction information;

A reconstruction decision module, configured to decide, according to the semantic analysis result, whether to perform information reconstruction;

A target area determination module, configured to determine a target area in a document using the current interaction information or reconstructed interaction information;

A feedback information generation module, configured to generate feedback information according to the content of the target area.

Optionally, the reconstruction decision module specifically includes:

A retention unit, configured to retain the current interaction information after determining that its intention type is the normal question type;

A reconstruction unit, configured to, after determining that the intention type of the current interaction information is a non-normal question type, integrate, according to the non-normal question type, the corresponding semantic analysis result in the interaction history data with the current interaction information and update the intention type of the current interaction information, to obtain reconstructed interaction information.

Optionally, the target area determination module specifically includes:

A partitioning unit, configured to divide the document into multiple content areas;

A text feature extraction unit, configured to extract the text features of each content area and of the current interaction information or the reconstructed interaction information;

A correlation acquisition unit, configured to calculate, from the text features, the correlation of the current interaction information or the reconstructed interaction information with each content area;

A target area selection unit, configured to select, according to the correlation, at least one content area as the target area.

Optionally, the feedback information generation module specifically includes:

An attention degree calculation unit, configured to calculate the attention degree between the content of the target area and the current interaction information or the reconstructed interaction information;

A feedback information generation unit, configured to generate feedback information according to the attention degree, the content of the target area, and the current interaction information or the reconstructed interaction information.

The document-based interaction method provided by the invention performs semantic analysis on the user's current interaction information once it has been obtained, and then decides, according to the semantic analysis result, whether to perform information reconstruction. It next determines a target area in the document, using either the unreconstructed current interaction information or the reconstructed interaction information, and finally generates feedback information according to the content of the target area. Rather than performing keyword retrieval on the document, the invention addresses more complex and varied interaction scenarios, such as multi-round questioning: where necessary it reconstructs interaction information that is easy for a machine to understand, then uses the corresponding interaction information to lock onto candidate target areas, and finally generates the required feedback information from the candidates. The invention can accurately and reliably locate relevant information in a document and generate user-friendly feedback from it.

Brief description of the drawings

To make the objects, technical solutions and advantages of the present invention clearer, the invention is further described below with reference to the accompanying drawings, in which:

Fig. 1 is a flowchart of an embodiment of the document-based interaction method provided by the invention;

Fig. 2 is a schematic structural diagram of an embodiment of the semantic analysis model provided by the invention;

Fig. 3 is a flowchart of an embodiment of the reconstruction decision method provided by the invention;

Fig. 4 is a flowchart of the content area search method provided by the invention;

Fig. 5 is a schematic structural diagram of the area search model provided by the invention;

Fig. 6 is a flowchart of a preferred embodiment of the method for generating feedback information provided by the invention;

Fig. 7 is a schematic structural diagram of the feedback information generation model provided by the invention;

Fig. 8 is a block diagram of an embodiment of the document-based interaction system provided by the invention.

Description of reference signs:

1 current interaction information acquisition module; 2 semantic analysis module; 3 reconstruction decision module;

4 target area determination module; 5 feedback information generation module

Detailed description of the embodiments

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference signs throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting the claims.

The present invention provides an embodiment of a document-based interaction method. As shown in Fig. 1, the method may include the following steps:

Step S1, obtain the user's current interaction information;

The interaction modes covered by the invention include, but are not limited to, voice and text input. For voice interaction, the user's interaction audio can be captured by a microphone and then converted into text using existing audio processing and transcription techniques. Text interaction lets the user enter the interaction text directly, for example by typing the request on a keyboard or writing it on a handwriting device. The prior art can be consulted for these ways of obtaining the user's interaction information, and the invention places no limitation on them. It should be noted, however, that the "current interaction information" referred to in this embodiment has a temporal meaning: the final feedback information, and the whole process of obtaining it, correspond to the current interaction information. In a multi-round question-and-answer scenario, however, "historical information" raised before the "current interaction information" may also take part in generating the final feedback information, as explained below.

Step S2, perform semantic analysis on the current interaction information;

In practical applications, one or more kinds of analysis result can be selected as the content of the semantic analysis, as needed. In a preferred embodiment of the invention, the semantic analysis of the current interaction information may include, but is not limited to: determining the literal semantic type of the current interaction information and determining the intention type of the current interaction information.

In a concrete implementation, a multi-task model can be chosen to perform multi-task semantic understanding. A multi-task model uses a neural network to judge both the literal semantics and the interaction intention of the current interaction information within a single model (this is an example with two tasks; the number of tasks can be adjusted as needed), and the tasks can supply each other with different information, which improves the overall effect of the model. Applied to a query scenario, the structure of the semantic analysis model can be as shown in Fig. 2. The embedding layer converts each character in the text of the current interaction information (the query) into a corresponding character vector; the character vectors are randomly initialized and updated automatically during model training. Two BLSTM (bidirectional long short-term memory) layers follow, whose purpose is to encode the original embedding vectors into vectors that carry contextual information. The multi-task operation is performed next. Taking the two tasks above as an example, the first task applies a fully connected layer to the BLSTM output at each character to obtain a multi-class result, which yields one label per character and thereby determines the semantic type of each character in the current interaction information.

For ease of explanation, take querying the electronic manual of an automotive product as an example. Suppose the current interaction information raised by the user while querying the document is "How do I use the wiper?". The sequence labelling result produced by the above model, with one tag per character of the original Chinese question, is B_how, I_how, I_how, E_how for the four characters meaning "how to use", followed by B_part, I_part, E_part for the three characters meaning "wiper". B, I and E mark the beginning, middle and end of a semantic type respectively, and the specific semantic types are illustrated in the following table.
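Grouping the per-character B/I/E tags into typed spans can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `decode_bie_tags` is hypothetical, and the Chinese characters are an assumed reconstruction of the example question from its translation.

```python
from typing import List, Tuple


def decode_bie_tags(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-character B_/I_/E_ tags into (semantic_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type, buffer = None, []
    for ch, tag in zip(chars, tags):
        position, _, sem_type = tag.partition("_")
        if position == "B":
            # A new span starts; any unfinished span is discarded.
            current_type, buffer = sem_type, [ch]
        elif position in ("I", "E") and sem_type == current_type:
            buffer.append(ch)
            if position == "E":
                spans.append((current_type, "".join(buffer)))
                current_type, buffer = None, []
        else:
            # Untagged or inconsistent position: drop the partial span.
            current_type, buffer = None, []
    return spans


# Assumed original of the example question "How do I use the wiper?".
chars = list("怎么使用雨刷器")
tags = ["B_how", "I_how", "I_how", "E_how", "B_part", "I_part", "E_part"]
print(decode_bie_tags(chars, tags))
```

The resulting spans correspond to the slot values (such as how and part) that the later reconstruction examples operate on.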

Semantic type  Field value (examples)                              Explanation
how            how to open / use / adjust, etc.                    asks how to use a function
what           what is it / what is it called, etc.                asks for an explanation of a component or function
where          which one / where, etc.                             asks for the position of a component or tool
why            what is the reason / what is the problem, etc.      asks for the cause of an alarm or fault
part           air conditioner / dipped headlight / sunroof, etc.  automobile component or accessory
attr           temperature / horsepower, etc.                      attribute of an automobile component or accessory
phen           shaking / not lighting up / abnormal noise, etc.    phenomenon exhibited by a fault
cond           night / rainy day / muddy ground, etc.              condition under which a fault occurs

This completes the analysis of the literal semantic type of the current interaction information.

The second task takes the last forward output of the BLSTM, i.e. the sentence representation vector from the forward LSTM, and the last backward output of the BLSTM, i.e. the sentence representation vector from the backward LSTM, and concatenates them into a representation of the whole sentence. This sentence representation is then passed through a fully connected layer to obtain another multi-class result, which represents the category of the user's interaction intention (intent). For a question-and-answer scenario such as a query, the intention categories are, for example: normal question (normal), reference (refer), supplementary information (replenish), corrected information (correct), deleted information (delete), manipulation command (command), and so on. This completes the analysis of the intention type of the current interaction information.
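The second task's head, concatenation followed by a fully connected layer and a multi-class decision, can be sketched in plain Python. The vectors, weights and bias below are toy values chosen for illustration, and `classify_intent` is a hypothetical name; in the described model these values would come from the trained BLSTM and fully connected layer.

```python
import math
from typing import List, Sequence, Tuple

# Intention labels listed in the text for the question-and-answer scenario.
INTENTS = ["normal", "refer", "replenish", "correct", "delete", "command"]


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_intent(
    forward_last: Sequence[float],
    backward_last: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> Tuple[str, List[float]]:
    # Concatenate the last forward and last backward states into one sentence vector.
    sentence = list(forward_last) + list(backward_last)
    # Fully connected layer: one logit per intention class.
    logits = [
        sum(w * x for w, x in zip(row, sentence)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    return INTENTS[probs.index(max(probs))], probs


# Toy inputs (assumptions): 2-dimensional states and weights that favour "refer".
forward_last = [0.5, 0.2]
backward_last = [0.1, 0.3]
weights = [[0.0] * 4 for _ in INTENTS]
weights[1] = [1.0, 1.0, 1.0, 1.0]
bias = [0.0] * len(INTENTS)

intent, probs = classify_intent(forward_last, backward_last, weights, bias)
print(intent)
```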

In summary, the first task yields type labels at the character level, and the second task yields an intention type label at the level of the whole sentence.

Step S3, decide, according to the semantic analysis result, whether to perform information reconstruction;

The original design intent of the invention stems from the observation that, when faced with a document containing a large amount of complex textual information, a user who adopts a relatively complex mode of interaction cannot obtain the relevant information in the document directly and accurately with the prior art. In a question-and-answer scenario, a complex mode of interaction may mean that the user, following personal habit, style of expression or a particular need, raises multi-round and complex questions in a natural conversational manner, or corrects or amends a question after a slip of the tongue, and so on. The invention therefore proposes that, when complex interaction information that is hard for a computer to understand appears, the interaction information can be reconstructed to obtain reconstructed interaction information with clear semantics and complete expression. Of course, not every interaction in practice needs reconstruction; whether to reconstruct must be decided from the semantic analysis result of the preceding step. The invention preferably starts from the intention type of the current interaction information when weighing whether to perform information reconstruction, and may further use the earlier analysis of the literal semantic type when reconstructing the question. A concrete implementation example is given below for reference, as shown in Fig. 3:

Step S301, judge from the intention type whether information reconstruction is needed; if not, execute step S302; if so, execute step S303.

As for the preset criterion, take the intention types of the question scenario above as an example. A normal question (normal) is easy for a machine to understand, so it can be regarded as not needing information reconstruction. Reference (refer), supplementary information (replenish), corrected information (correct) and deleted information (delete) may be incomplete or may need context to be understood, so information reconstruction is needed. Note that the manipulation command (command) mentioned above is not, in essence, a question about the document of the kind a question-and-answer scenario is concerned with. Instructions such as "turn the air-conditioner temperature up by 5 degrees" or "cancel cruise control" point directly to an action of an executing device, so this kind of interaction intention is not discussed for this scenario.
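The decision in step S301 can be sketched as a simple dispatch on the intention label. The function name `decide_reconstruction` and the returned strings are hypothetical; only the intention labels come from the text.

```python
# Intention types that, per the text, need reconstruction in the Q&A scenario.
RECONSTRUCT_INTENTS = {"refer", "replenish", "correct", "delete"}


def decide_reconstruction(intent: str) -> str:
    """Map an intention type to the next step of the flow in Fig. 3."""
    if intent == "normal":
        return "retain"        # step S302: keep the current interaction information
    if intent in RECONSTRUCT_INTENTS:
        return "reconstruct"   # step S303: combine with the interaction history
    if intent == "command":
        return "out_of_scope"  # device commands are not document questions
    raise ValueError(f"unknown intention type: {intent!r}")


for label in ("normal", "refer", "command"):
    print(label, "->", decide_reconstruction(label))
```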

Step S302, retain the current interaction information;

For example, once the question is judged to be of the normal question (normal) type, the current question can be written into the interaction history as a "historical question"; more preferably, the semantic analysis result of the current question is stored in the interaction history data. In a multi-round questioning scenario this makes it easier to generate reconstructed interaction information (reconstruction) later, because the current question itself remains useful to the subsequent process of determining an answer, as explained in detail below. Those skilled in the art will also understand that, if the next round of questioning is still a normal question (normal), the semantic analysis result of the "new" current question can overwrite that of the earlier "historical question", that is, the interaction history data is rewritten. In other embodiments a fixed history storage mechanism can be used instead: several independent, complete questions are kept in the interaction history data, and in later interaction a disambiguation operation is first carried out among the multiple "historical questions" to determine the "historical question" that corresponds to the current question.
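The two storage strategies, overwriting the history versus keeping several complete questions, can be sketched as below. The class `InteractionHistory` and its `keep_all` flag are hypothetical names. The text does not specify how disambiguation among several historical questions works, so this sketch simply returns the most recent entry as a placeholder.

```python
from typing import Dict, List, Optional


class InteractionHistory:
    """Stores semantic analysis results of past normal questions."""

    def __init__(self, keep_all: bool = False) -> None:
        # keep_all=False: overwrite mode; keep_all=True: fixed storage mode.
        self.keep_all = keep_all
        self.entries: List[Dict[str, str]] = []

    def store(self, analysis: Dict[str, str]) -> None:
        if self.keep_all:
            self.entries.append(dict(analysis))
        else:
            self.entries = [dict(analysis)]  # rewrite the history data

    def corresponding(self) -> Optional[Dict[str, str]]:
        # Placeholder for disambiguation: pick the most recent entry.
        return self.entries[-1] if self.entries else None


first = {"intent": "normal", "how": "how to turn on", "part": "hazard lights", "cond": "fog"}
second = {"intent": "normal", "how": "how to turn off", "part": "seat belt", "phen": "warning prompt"}

overwrite = InteractionHistory()
overwrite.store(first)
overwrite.store(second)

fixed = InteractionHistory(keep_all=True)
fixed.store(first)
fixed.store(second)

print(len(overwrite.entries), len(fixed.entries))
```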

Step S303, generate reconstructed interaction information using the interaction history data and the current interaction information.

According to the preset criteria of different scenarios, when the intention type is determined to be a non-normal question type, for example but not limited to the reference (refer), supplementary information (replenish), corrected information (correct) and deleted information (delete) intention types of the question-and-answer scenario, the "historical question" in the interaction history data can be combined with the current question itself to generate reconstructed information that is easy for a machine to understand. Concretely, the reconstruction can follow the semantic analysis result of the current interaction information: for example, according to the intention type of the current interaction information, the corresponding literal semantics in the interaction history data are integrated with the current interaction information, and the intention type of the current interaction information is updated, giving the reconstructed interaction information. As for what "corresponding" means here, take the "historical question" above as an example. When the history data is rewritten, "corresponding" means that the "historical question" is the previous round's normal question (the only question kept in the history). When the fixed history storage mechanism is used, "corresponding" may mean one question determined among the multiple historical questions. Note also that the interaction information reconstructed by this process can serve as the basis for subsequent interaction and be stored in the history data; likewise, the stored reconstructed interaction information can either replace the original historical interaction data or coexist with it.

The reconstruction process can be understood as filling in, on the basis of the current interaction intention, current interaction information that may have missing components, so as to restore its true semantics and at the same time update its intention type, giving reconstructed interaction information with complete, clear literal semantics and intention. Taking the automobile manual question-and-answer scenario above as an example: suppose the current interaction information is "How do I turn on the hazard lights when driving in fog?". The semantic analysis result obtained from the multi-task model above is: intent=normal, how=how to turn on, part=hazard lights, cond=fog. This is a normal question intention, so the semantic analysis result is stored directly in the interaction history data.

When multi-round interaction occurs, it can be divided into, but is not limited to, the following five situations:

1) If the next current question ("current" here is relative to the present round of questioning) is "How do I turn off the seat belt warning?", the semantic analysis result is: intent=normal, how=how to turn off, part=seat belt, phen=warning prompt. Since the intention type is still a normal question, this is a new interaction, and the previous semantic analysis result in the interaction history data can be completely overwritten.

2) If the next current question is "What about the high beam?", the semantic analysis result is: intent=refer, part=high beam. Since the interaction intention is reference, part of the semantic analysis result of the previous round's question in the interaction history data needs to be replaced. After integration and updating, the semantic result becomes: intent=normal, how=how to turn on, part=high beam, cond=fog, and the current question after replacing the referred information is "How do I turn on the high beam in fog?". Inheritance is a special kind of reference: for example, the next current question may be just "high beam", which contains no explicit demonstrative pronoun (this, that, it, etc.) but in fact still refers to the previous question, so its how and cond need to be inherited into the current interaction information.

3) If the next current interaction information is "Not fog, rain", the semantic analysis result is: intent=correct, cond_error=fog, cond=rain. Since the interaction intention is correction, some information in the previous round's question needs to be replaced. After integration and updating, the semantic result becomes: intent=normal, how=how to turn on, part=hazard lights, cond=rain. The current question after correction is "How do I turn on the hazard lights in rain?".

4) If the next current interaction information is "Not in fog", the semantic result obtained by the semantic understanding module is: intent=delete, cond_error=fog. The intent is delete, which means some information in the previous round of the history needs to be deleted. After integration and updating, the semantic result becomes: intent=normal, how=how to turn on, part=hazard lights. The current question after deletion is "How do I turn on the hazard lights?".

5) If the next current interaction information is "How do I do the second step?", the semantic analysis result is: intent=replenish, detail=second step. Since the current interaction intention is supplementary information, the previous round's question needs to be supplemented further: this semantic analysis result is combined with the semantic result in the history to obtain the complete supplemented semantic result, which is restored to text in the historical interaction information. After integration and updating, the semantic result becomes: how=how to turn on, part=high beam, detail=second step, and the current question after supplementation is "How to turn on the high beam, second step".

It should be reiterated that reference, correction, deletion, supplementation and the like all refer to other intentions, based on the previous round's interaction content, that may arise from the user's manner of expression and habitual way of thinking during multi-round interaction. The reconstruction process completes the current interaction information by integrating the interaction history, yielding updated semantics and intention.
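The slot-level integration described for the reference, correction, deletion and supplementation cases can be sketched as a dictionary merge. This is an illustrative assumption about the data layout: the function name `reconstruct` and the English slot values are hypothetical, while the slot keys (how, part, cond, detail, and the `_error` suffix) follow the examples in the text.

```python
from typing import Dict

NON_NORMAL = {"refer", "replenish", "correct", "delete"}


def reconstruct(history: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Merge the historical analysis result with the current one."""
    intent = current["intent"]
    if intent == "normal":
        return dict(current)  # a new interaction; nothing to integrate
    if intent not in NON_NORMAL:
        raise ValueError(f"unsupported intention type: {intent!r}")

    merged = dict(history)
    # Slots flagged as wrong (e.g. cond_error=fog) are removed from the history.
    for key, value in current.items():
        if key.endswith("_error"):
            slot = key[: -len("_error")]
            if merged.get(slot) == value:
                del merged[slot]
    # Remaining slots replace or supplement the historical ones.
    for key, value in current.items():
        if key != "intent" and not key.endswith("_error"):
            merged[key] = value
    merged["intent"] = "normal"  # update the intention type
    return merged


history = {"intent": "normal", "how": "how to turn on", "part": "hazard lights", "cond": "fog"}

referred = reconstruct(history, {"intent": "refer", "part": "high beam"})
corrected = reconstruct(history, {"intent": "correct", "cond_error": "fog", "cond": "rain"})
deleted = reconstruct(history, {"intent": "delete", "cond_error": "fog"})
supplemented = reconstruct(history, {"intent": "replenish", "detail": "second step"})
print(referred)
```

Each result carries intent=normal, so it can be written back to the history and used as the basis of the next round.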

Continuing from the above, step S4, determine a target area in the document using the current interaction information or the reconstructed interaction information;

The overall idea of this step is to search for the content areas of the document in which the relevant information may be located, according to either the unreconstructed current interaction information (such as a normal question) or the reconstructed interaction information (such as the new question integrated with the historical question, as above). One concrete implementation is given here for reference; as shown in Fig. 4, step S4 may specifically include:

Step S401, divide the document into multiple content areas;

Specifically, according to the length and content of the document, the document can be divided into chapters, or paragraphs within chapters, which serve as the content areas.
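One way to picture step S401 is the sketch below, which splits a document into content areas that keep their title and higher-level title. The heading markers ("# " for a chapter, "## " for a section), the function name `split_into_areas` and the sample manual text are all assumptions made for illustration; the text does not prescribe a document format.

```python
from typing import Dict, List


def split_into_areas(lines: List[str]) -> List[Dict[str, str]]:
    """Split a document into content areas keyed by their headings."""
    areas: List[Dict[str, str]] = []
    higher_title, title, content = "", "", []

    def flush() -> None:
        # Emit the text collected under the current headings, if any.
        if content:
            areas.append({
                "higher_title": higher_title,
                "title": title,
                "content": " ".join(content),
            })

    for raw in lines:
        line = raw.strip()
        if line.startswith("## "):    # section heading (assumed marker)
            flush()
            title, content = line[3:], []
        elif line.startswith("# "):   # chapter heading (assumed marker)
            flush()
            higher_title, title, content = line[2:], "", []
        elif line:
            content.append(line)
    flush()
    return areas


# Invented sample text, for illustration only.
manual = [
    "# Seat belts",
    "## Seat belt reminder",
    "A warning sounds if the seat belt is not fastened.",
    "",
    "## Adjusting the seat belt",
    "Pull the belt across the body and insert the latch.",
]
areas = split_into_areas(manual)
print([area["title"] for area in areas])
```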

Step S402, each content area and current interactive information are extracted or reconstructs the text feature of interactive information；

Current interactive information described herein or reconstruct interactive information are due to as it was noted above, for carrying out target area The interactive information of retrieval, can also be with the reconstruct interactive information after reconstructed either not reconstructed current interactive information.

Step S403, current interactive information is calculated using text feature or reconstruct interactive information to the phase of each content area Guan Xing；

Step S404, according to correlation, at least one content area is chosen as target area.

The mode of above-mentioned determining target area can use range searching model as shown in Figure 5 by taking question and answer scene as an example Structure realizes that the model use existing CNN and BLSTM, the explanation particularly with regard to CNN and BLSTM can use for reference known information, this Invention does not repeat this.It connects above, to corresponding problem, title (title of current chapters and sections, such as " safety belt reminding dress Set "), higher level's title (as " safety belt ") and content (specific under title being discussed in detail, such as safety belt reminding device " it is specific It is discussed in detail) it is modeled.Using the interactive computing between feature, obtaining required feedback information (problem answers) may be deposited Region.Carrying out practically process is as follows:

First, the original Query (question), Title (title), Higher Title (higher-level title), and Content (content) are segmented into words, and each word is mapped to its id in the dictionary and input to the model. The embedding layer converts the word ids into vectors; the word vectors can be pre-trained with other tools such as word2vec. The embeddings of Query, Title, and Higher Title are then passed through a CNN and max pooling (a maximum pooling layer) to obtain sentence features after convolution and pooling, while the embedding of Content is modeled with a BLSTM. Because Content carries more text information than the other three and has contextual sequential information, a BLSTM is preferred for it. Next, the vector representation of Query undergoes an interaction calculation with each of the vector representations of Title, Higher Title, and Content, and the results are concatenated. This yields what amounts to the similarity calculation results of Query against Title, Higher Title, and Content. These three results are then integrated through a fully connected layer to obtain a score, which is the similarity score of the current Query with respect to this content area. The current Query is scored against every content area in the same way, and finally the content area with the highest score can be selected as the required target area.
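The scoring stage of this procedure (interaction, concatenation, fully connected layer, selection of the highest score) can be sketched as follows. This is a simplified illustration under stated assumptions, not the model of Fig. 5: the CNN and BLSTM encoders are replaced by ready-made toy vectors, the interaction is assumed to be a dot product, and the fully connected weights are fixed by hand rather than trained. All function names are hypothetical.

```python
def max_pool(word_vectors):
    """Element-wise maximum over word vectors, giving one sentence vector."""
    return [max(column) for column in zip(*word_vectors)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def region_score(query_vec, region, fc_weights, fc_bias):
    # Interaction of Query with Title, Higher Title, and Content (assumed: dot product),
    # with the three results concatenated into one list.
    interactions = [
        dot(query_vec, region["title"]),
        dot(query_vec, region["higher_title"]),
        dot(query_vec, region["content"]),
    ]
    # A single fully connected unit integrates the three results into one score.
    return dot(fc_weights, interactions) + fc_bias


def pick_region(query_vec, regions, fc_weights, fc_bias=0.0):
    scores = [region_score(query_vec, r, fc_weights, fc_bias) for r in regions]
    best = max(range(len(regions)), key=scores.__getitem__)
    return best, scores


# Toy vectors standing in for encoder outputs (illustrative values only).
query_vec = max_pool([[0.9, 0.1], [0.2, 0.8]])
regions = [
    {"title": [1.0, 0.0], "higher_title": [1.0, 0.0], "content": [0.5, 0.1]},
    {"title": [0.0, 1.0], "higher_title": [0.1, 0.9], "content": [0.9, 0.9]},
]
best, scores = pick_region(query_vec, regions, fc_weights=[1.0, 1.0, 1.0])
print(best, scores)
```

In a trained model the encoders and the fully connected weights would be learned jointly; the sketch only shows how one score per content area leads to the choice of the target area.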

Step S5: generate feedback information according to the content of the target area.

The content of the target area referred to here may be information in any form that can appear in a document, such as text, tables, and pictures. There can be at least two ways of generating feedback information based on the content of the target area. One implementation is to extract or combine key information from the content of the target area to generate the feedback information. However, the feedback information generated this way may fit the context of the scenario very poorly. For example, a user asks "how do I open the trunk", but the manual may only contain an introduction to the tailgate, such as "Stand behind the vehicle, press the tailgate open/close button, and lift the tailgate. The button is located slightly below and to the right on the tailgate, aligned with the right edge of the rear logo." With this method, the generated answer may simply be excerpted from that passage; if the user does not know the relationship between the trunk and the tailgate, the generated answer may cause confusion.

As mentioned above, another case of complex interaction scenarios is that the interactive information or the document may contain specific terminology, non-professional idiomatic expressions, or obscure and uncommon words. Therefore, the present invention proposes a preferred method of generating feedback information that uses the mutual attention between the content of the target area and the current interactive information or the reconstructed interactive information, together with the content of the target area and the current interactive information or the reconstructed interactive information themselves. Specifically, as shown in Fig. 6, the method includes:

Step S501: calculate a first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information, and calculate a second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area;

Step S502: determine the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;

Step S503: determine a target word according to the first attention degree, the second attention degree, and the weight;

Step S504: use the target word to generate the feedback information corresponding to the current interactive information or the reconstructed interactive information.

Taking the aforementioned question-answering scenario as an example, the above process is explained in detail as follows:

The idea of this embodiment is to propose a co-attention mechanism, in which the encoding of the question asked and the encoding of the text content in the target area undergo a co-attention calculation. The principle is that when the content is "viewed" from the perspective of the query, the query's degree of attention to each word in the content is obtained; conversely, the content is then used to "view" each word in the query. In actual operation, two probability matrices, denoted P_q and P_c, are obtained: P_q is the probability of each word in the query, and P_c is the probability of each word in the content. Regarding P_q in particular, if the probability of a word in P_q is very large, that word from P_q will be used in the finally generated answer. In this way the generated feedback information is easier to understand, where "easier to understand" is from the user's point of view, that is, closer to the context the user is using. The need of users to query documents can therefore be met in a more user-friendly way.

With reference to the feedback information generation model shown in Fig. 7, the operation of the above scheme in the model is further described: the model encodes the text with a BLSTM, then obtains interaction information through an interaction calculation between Query and Content, and then re-encodes by means of CoAttention (mutual attention).

Assume that the maximum sentence length of the input Query is m, the maximum length of the input Content is n, the embedding size is e, and the BLSTM hidden layer size is h. The input Query and Content are processed word by word; after the embedding layer the words are converted into word vectors with dimensions (m, e) and (n, e) respectively. A first encoding is performed by the BLSTM, and the results of the forward LSTM and the backward LSTM are concatenated to obtain the BLSTM-encoded feature vector of each word, with dimensions (m, 2h) and (n, 2h) respectively. Then, in the CoAttention layer, an interaction calculation is performed between the feature vectors obtained from Query through the BLSTM and the feature vectors obtained from Content, giving a feature matrix of size (m, n). Softmax is computed on the feature matrix by row and by column respectively and the results are then averaged, which yields two attention weights of dimensions (1, m) and (1, n). These two vectors represent, respectively, the first attention degree of Content to each word in Query and the second attention degree of Query to each word in Content, that is, the aforementioned P_q and P_c. P_q and P_c are then multiplied back onto their respective BLSTM results to complete the CoAttention calculation, giving two new feature vectors for Query and Content. The two vectors are integrated, for example by concatenating them into a vector of (m+n, 2h), which is fed into a BLSTM for encoding to obtain a coding vector (1, 2h) for the entire content of Query and Content. This coding vector is then passed to a decoder to generate the answer: the coding vector is fed to the decoder and decoded by an LSTM. Specifically, during decoding a start symbol <GO> is input first, a feature vector is then calculated by the LSTM, and a fully connected layer followed by softmax gives the probability P_v (that is, the aforementioned weight) with which each word in a specific dictionary is used to generate the subsequent feedback information. The dictionary referred to here may be a pre-built dictionary covering all general and specific words that may appear in the field or scenario; it may of course also refer to a larger-scale dictionary collection.
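The CoAttention calculation described above, from the (m, n) feature matrix to P_q and P_c and the re-weighted features, can be sketched as follows. The sketch makes several assumptions that the description does not fix: the interaction is taken to be a dot product, the softmax over the query axis (averaged over content words) is taken to give P_q, and the softmax over the content axis (averaged over query words) is taken to give P_c. The BLSTM outputs are replaced by toy vectors, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of numbers."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def co_attention(query_feats, content_feats):
    """query_feats: m vectors of size 2h; content_feats: n vectors of size 2h.

    Returns (P_q, P_c) with lengths m and n.
    """
    m, n = len(query_feats), len(content_feats)
    # Feature matrix of size (m, n); the dot-product interaction is an assumption.
    matrix = [
        [sum(a * b for a, b in zip(q, c)) for c in content_feats]
        for q in query_feats
    ]
    # Softmax down each column (over query words), averaged over the n columns: P_q.
    col_soft = [softmax([matrix[i][j] for i in range(m)]) for j in range(n)]
    p_q = [sum(col_soft[j][i] for j in range(n)) / n for i in range(m)]
    # Softmax along each row (over content words), averaged over the m rows: P_c.
    row_soft = [softmax(matrix[i]) for i in range(m)]
    p_c = [sum(row_soft[i][j] for i in range(m)) / m for j in range(n)]
    return p_q, p_c


def reweight(attention, feats):
    """Multiply each attention weight back onto its word's feature vector."""
    return [[p * x for x in vec] for p, vec in zip(attention, feats)]


# Toy stand-ins for BLSTM outputs with m = 2, n = 3, 2h = 2 (illustrative values only).
query_feats = [[1.0, 0.0], [0.0, 1.0]]
content_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
p_q, p_c = co_attention(query_feats, content_feats)
new_query_feats = reweight(p_q, query_feats)
new_content_feats = reweight(p_c, content_feats)
print(p_q, p_c)
```

Because each softmax sums to 1 and the results are averaged, P_q and P_c each sum to 1 and can be read as probability distributions over the words of Query and Content.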

The final probability P_r(x) of each word can be the weighted sum of the partial probabilities: P_r(x) = w₁P_q(x) + w₂P_c(x) + w₃P_v(x). In this way the word with the largest probability value can be calculated and taken as the word generated at the current position (that is, the aforementioned target word), and the word vector of that word is used as the next input to the decoder, and so on, until the terminator <STOP> is output, at which point answer generation stops. Finally, the entire generated answer sequence is the required feedback information, which is presented to the user.
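The selection of the target word at one decoding position can be sketched as follows. The weight values, the toy probabilities, and the function name `pick_target_word` are assumptions for illustration; the description does not give values for w₁, w₂, and w₃. A word missing from one of the three distributions is treated as having probability 0 there.

```python
def pick_target_word(p_q, p_c, p_v, weights=(0.4, 0.3, 0.3)):
    """Combine P_q, P_c, and P_v into P_r and return the most probable word.

    p_q, p_c, p_v map words to probabilities; weights are (w1, w2, w3).
    """
    w1, w2, w3 = weights
    candidates = set(p_q) | set(p_c) | set(p_v)
    p_r = {
        x: w1 * p_q.get(x, 0.0) + w2 * p_c.get(x, 0.0) + w3 * p_v.get(x, 0.0)
        for x in candidates
    }
    target = max(p_r, key=p_r.get)
    return target, p_r


# Toy probabilities for one decoding position (illustrative values only).
p_q = {"trunk": 0.7, "open": 0.3}         # words of the query
p_c = {"tailgate": 0.6, "button": 0.4}    # words of the target area content
p_v = {"trunk": 0.2, "tailgate": 0.3, "press": 0.5}  # words of the preset dictionary
target, p_r = pick_target_word(p_q, p_c, p_v)
print(target)
```

With these toy values the query word "trunk" receives the largest combined probability, which illustrates how a strongly attended word from the user's own question can enter the generated answer.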

Corresponding to the foregoing embodiments and preferred schemes, the present invention also provides a document-based interactive system. As shown in Fig. 8, the system may include at least one memory for storing the relevant instructions and at least one processor connected to the memory and configured to execute the following modules (in other embodiments, one or more processors may also directly perform the actions of the corresponding steps without going through the following modules; for example, a processor may directly perform operations such as semantic analysis, information reconstruction, and information feedback):

A current interactive information acquisition module 1, configured to acquire the current interactive information of the user;

A semantic analysis module 2, configured to perform semantic analysis on the current interactive information;

A reconstruction decision module 3, configured to decide, according to the semantic analysis result, whether to perform information reconstruction;

A target area determination module 4, configured to determine the target area in the document using the current interactive information or the reconstructed interactive information;

A feedback information generation module 5, configured to generate feedback information according to the content of the target area.

Further, the semantic analysis module specifically includes:

A literal semantic type determination unit, configured to determine the literal semantic type of the current interactive information;

An intention type determination unit, configured to determine the intention type of the current interactive information.

Further, the reconstruction decision module is specifically configured to: according to the intention type, retain the current interactive information, or use interactive history data and the current interactive information to obtain reconstructed interactive information.

Further, the reconstruction decision module specifically includes:

A retention unit, configured to retain the current interactive information after determining that the intention type of the current interactive information is a normal question type;

Further, the target area determination module specifically includes:

A partitioning unit, configured to divide the document into multiple content areas;

Further, the feedback information generation module specifically includes:

Further, the attention degree calculation unit specifically includes:

A first attention degree calculation subunit, configured to calculate the first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information;

A second attention degree calculation subunit, configured to calculate the second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area.

Further, the feedback information generation unit specifically includes:

A word weight determination subunit, configured to determine the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;

A target word determination subunit, configured to determine a target word according to the first attention degree, the second attention degree, and the weight;

A feedback information generation subunit, configured to use the target word to generate the feedback information corresponding to the current interactive information or the reconstructed interactive information.

Although the working modes and technical principles of the above system embodiments and preferred schemes are all recorded above, it should still be noted that the various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. The modules, units, or components in the embodiments may be combined into one module, unit, or component, and they may also be divided into multiple sub-modules, sub-units, or sub-components for implementation.

The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. The system embodiment in particular is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, refer to the description of the method embodiment. The system embodiment described above is merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

The structure, features, and effects of the present invention have been described in detail above based on the embodiments shown in the drawings. The above are only preferred embodiments of the present invention. It should be noted that those skilled in the art may reasonably combine the technical features of the above embodiments and their preferred schemes into a variety of equivalent solutions without departing from or changing the design idea and technical effects of the present invention. Therefore, the present invention does not limit its scope of implementation to what is shown in the drawings; any change made according to the conception of the present invention, or any equivalent embodiment modified into an equivalent variation, that does not go beyond the spirit covered by the description and drawings shall fall within the protection scope of the present invention.

Claims

1. A document-based interaction method, characterized by comprising:

Acquiring the current interactive information of a user;

Performing semantic analysis on the current interactive information;

Generating feedback information according to the content of the target area.

2. The document-based interaction method according to claim 1, characterized in that performing semantic analysis on the current interactive information comprises:

Determining the literal semantic type of the current interactive information;

Determining the intention type of the current interactive information.

3. The document-based interaction method according to claim 2, characterized in that deciding, according to the semantic analysis result, whether to perform information reconstruction comprises:

According to the intention type, retaining the current interactive information, or using interactive history data and the current interactive information to obtain reconstructed interactive information.

4. The document-based interaction method according to claim 3, characterized in that using, according to the intention type, the interactive history data and the current interactive information to obtain reconstructed interactive information comprises:

When the intention type is an abnormal question type, integrating, according to the abnormal question type, the corresponding semantic analysis result in the interactive history data with the current interactive information and updating the intention type of the current interactive information, to obtain the reconstructed interactive information.

5. The document-based interaction method according to claim 1, characterized in that determining the target area in the document using the current interactive information or the reconstructed interactive information comprises:

Dividing the document into multiple content areas;

Using text features to calculate the correlation between the current interactive information or the reconstructed interactive information and each content area;

6. The document-based interaction method according to any one of claims 1 to 5, characterized in that generating feedback information according to the content of the target area comprises:

Generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.

7. The document-based interaction method according to claim 6, characterized in that calculating the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information comprises:

Calculating a first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information;

Calculating a second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area.

8. The document-based interaction method according to claim 7, characterized in that generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information comprises:

Determining the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;

Using the target word to generate the feedback information corresponding to the current interactive information or the reconstructed interactive information.

9. A document-based interactive system, characterized by comprising:

A target area determination module, configured to determine the target area in the document using the current interactive information or the reconstructed interactive information;

10. The document-based interactive system according to claim 9, characterized in that the reconstruction decision module specifically includes:

A retention unit, configured to retain the current interactive information after determining that the intention type of the current interactive information is a normal question type;

A reconstruction unit, configured to, after determining that the intention type of the current interactive information is an abnormal question type, integrate, according to the abnormal question type, the corresponding semantic analysis result in the interactive history data with the current interactive information and update the intention type of the current interactive information, to obtain reconstructed interactive information.

11. The document-based interactive system according to claim 9, characterized in that the target area determination module specifically includes:

A partitioning unit, configured to divide the document into multiple content areas;

A text feature extraction unit, configured to extract the text features of each content area and of the current interactive information or the reconstructed interactive information;

A correlation acquisition unit, configured to use the text features to calculate the correlation between the current interactive information or the reconstructed interactive information and each content area;

A target area selection unit, configured to select at least one content area as the target area according to the correlation.

12. The document-based interactive system according to any one of claims 9 to 11, characterized in that the feedback information generation module specifically includes:

An attention degree calculation unit, configured to calculate the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;

A feedback information generation unit, configured to generate feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.