CN109857843A - Document-based interaction method and system - Google Patents
- Publication number
- CN109857843A (application CN201811596444.0A)
- Authority
- CN
- China
- Prior art keywords
- interactive information
- reconstruct
- information
- current
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
The invention discloses a document-based interaction method and system. The method comprises: obtaining the current interactive information of a user; performing semantic analysis on the current interactive information; deciding, according to the semantic analysis result, whether to reconstruct the interactive information; determining a target area in a document using the current interactive information or the reconstructed interactive information; and generating feedback information according to the content of the target area. Rather than performing keyword retrieval on the document, the invention addresses more complex and varied interaction scenarios, such as multi-round question interaction. When necessary, it rebuilds interactive information that is easy for a machine to understand, then uses the corresponding interactive information to lock onto candidate target areas, and finally generates the required feedback information from the candidates. The invention can accurately and reliably locate relevant information in a document and generate user-friendly feedback from it.
Description
Technical field
The present invention relates to the field of human-computer interaction, and more particularly to a document-based interaction method and system.
Background technique
The advent of electronic text media has greatly changed how people read, freeing reading from constraints of place and time and making it easier and more convenient. However, as society develops, electronic text data containing large amounts of text inevitably overwhelm readers and make it hard to lock onto the desired target quickly: for example, how to determine whether the content of an e-book meets the user's expectations, or how to navigate quickly to the content the user needs in the electronic manual of a complex product.
Most existing interactive products on the market are open-domain robots, such as the voice assistants of smart terminals. All they can do is carry out question-and-answer dialogue with the user in the open domain by means of knowledge base search and rule matching. Such "game" or "chat" style interactive products are not suited to question-and-answer interaction based on a particular document, such as a book or professional material, so they cannot play an effective role in consulting the relevant text content quickly and accurately.
In addition, although existing electronic text readers on the market support keyword matching and search, the search strategy they use is mostly based on a single keyword, so it is difficult to locate accurately the feedback the user expects in a particular document. Either the feedback provided still requires the user to perform cumbersome operations such as secondary retrieval or exclusion retrieval, or the feedback provided hardly matches the user's interaction content. In particular, when the user adopts a more complex way of asking, such as multiple rounds of related questions, existing electronic text readers are helpless.
Summary of the invention
The present invention addresses the above pain points that users face when consulting electronic documents. It provides a document-based interaction method and system that can quickly and accurately locate the content the user expects in a particular document, even in more complex interaction scenarios, freeing the user from lengthy text while consulting.
The technical solution adopted by the invention is as follows:
A document-based interaction method, comprising:
obtaining the current interactive information of a user;
performing semantic analysis on the current interactive information;
deciding, according to the semantic analysis result, whether to reconstruct the interactive information;
determining a target area in a document using the current interactive information or the reconstructed interactive information;
generating feedback information according to the content of the target area.
Optionally, performing semantic analysis on the current interactive information includes:
determining the literal semantic type of the current interactive information;
determining the intention type of the current interactive information.
Optionally, deciding, according to the semantic analysis result, whether to reconstruct the interactive information includes:
according to the intention type, either retaining the current interactive information, or using interaction history data and the current interactive information to obtain reconstructed interactive information.
Optionally, using the interaction history data and the current interactive information to obtain reconstructed interactive information according to the intention type includes:
when the intention type is a non-normal question type, integrating, according to that non-normal question type, the corresponding semantic analysis result in the interaction history data with the current interactive information, and updating the intention type of the current interactive information, to obtain the reconstructed interactive information.
Optionally, determining the target area in the document using the current interactive information or the reconstructed interactive information includes:
dividing the document into multiple content areas;
extracting text features of each content area and of the current interactive information or the reconstructed interactive information;
using the text features to calculate the correlation between the current interactive information or the reconstructed interactive information and each content area;
selecting, according to the correlation, at least one of the content areas as the target area.
Optionally, generating feedback information according to the content of the target area includes:
calculating the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;
generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.
Optionally, calculating the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information includes:
calculating a first attention degree of the content of the target area with respect to each word in the current interactive information or the reconstructed interactive information;
calculating a second attention degree of the current interactive information or the reconstructed interactive information with respect to each word in the content of the target area.
Optionally, generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information includes:
determining the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;
determining target words according to the first attention degree, the second attention degree, and the weights;
generating, from the target words, the feedback information corresponding to the current interactive information or the reconstructed interactive information.
A document-based interaction system, comprising:
a current interactive information obtaining module, configured to obtain the current interactive information of a user;
a semantic analysis module, configured to perform semantic analysis on the current interactive information;
a reconstruction decision module, configured to decide, according to the semantic analysis result, whether to reconstruct the interactive information;
a target area determining module, configured to determine a target area in a document using the current interactive information or the reconstructed interactive information;
a feedback information generation module, configured to generate feedback information according to the content of the target area.
Optionally, the reconstruction decision module specifically includes:
a retaining unit, configured to retain the current interactive information after the intention type of the current interactive information is determined to be the normal question type;
a reconstruction unit, configured to, after the intention type of the current interactive information is determined to be a non-normal question type, integrate, according to that non-normal question type, the corresponding semantic analysis result in the interaction history data with the current interactive information, and update the intention type of the current interactive information, to obtain the reconstructed interactive information.
Optionally, the target area determining module specifically includes:
a partitioning unit, configured to divide the document into multiple content areas;
a text feature extraction unit, configured to extract text features of each content area and of the current interactive information or the reconstructed interactive information;
a correlation obtaining unit, configured to use the text features to calculate the correlation between the current interactive information or the reconstructed interactive information and each content area;
a target area selection unit, configured to select, according to the correlation, at least one of the content areas as the target area.
Optionally, the feedback information generation module specifically includes:
an attention degree calculation unit, configured to calculate the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;
a feedback information generation unit, configured to generate feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.
The document-based interaction method provided by the invention performs semantic analysis on the obtained current interactive information of the user and then decides, according to the semantic analysis result, whether to reconstruct the interactive information. It then determines a target area in the document using either the current interactive information that was not reconstructed or the reconstructed interactive information, and finally generates feedback information according to the content of the target area. Rather than performing keyword retrieval on the document, the invention addresses more complex and varied interaction scenarios, such as multi-round question interaction. When necessary, it rebuilds interactive information that is easy for a machine to understand, then uses the corresponding interactive information to lock onto candidate target areas, and finally generates the required feedback information from the candidates. The invention can accurately and reliably locate relevant information in a document and generate user-friendly feedback from it.
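The overall flow summarized above can be sketched as a small orchestration function. This is only an illustrative skeleton, not the patented implementation: the function `interact`, the four stage callables, and the dictionary layout of an analysis result are all hypothetical names chosen here, and the toy lambdas stand in for the trained models. The sketch also assumes that every non-normal intent is routed to reconstruction, although the description leaves control commands out of scope.

```python
from typing import Callable, Dict, List

Analysis = Dict[str, str]  # hypothetical layout, e.g. {"intent": "normal", ...}


def interact(
    query: str,
    history: List[Analysis],
    analyze: Callable[[str], Analysis],
    reconstruct: Callable[[Analysis, List[Analysis]], Analysis],
    locate: Callable[[Analysis], str],
    generate: Callable[[Analysis, str], str],
) -> str:
    """Run one round of document-based interaction."""
    analysis = analyze(query)                      # semantic analysis
    if analysis.get("intent") != "normal":         # reconstruction decision
        analysis = reconstruct(analysis, history)  # rebuild from the history
    history.append(analysis)                       # keep for later rounds
    target_area = locate(analysis)                 # find target area in document
    return generate(analysis, target_area)         # produce feedback


# Toy stages (assumptions for illustration only).
history: List[Analysis] = []
stages = dict(
    analyze=lambda q: {"intent": "normal", "text": q},
    reconstruct=lambda a, h: {**h[-1], **a, "intent": "normal"},
    locate=lambda a: "section matching: " + a["text"],
    generate=lambda a, area: "answer drawn from [" + area + "]",
)
reply = interact("how to turn on the hazard lights", history, **stages)
print(reply)
```

Passing the stages in as callables keeps the skeleton independent of how each stage is actually modeled.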
Brief description of the drawings
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of an embodiment of the document-based interaction method provided by the invention;
Fig. 2 is a structural diagram of an embodiment of the semantic analysis model provided by the invention;
Fig. 3 is a flowchart of an embodiment of the reconstruction decision method provided by the invention;
Fig. 4 is a flowchart of the content area search method provided by the invention;
Fig. 5 is a structural diagram of the area search model provided by the invention;
Fig. 6 is a flowchart of a preferred embodiment of the feedback information generation method provided by the invention;
Fig. 7 is a structural diagram of the feedback information generation model provided by the invention;
Fig. 8 is a block diagram of an embodiment of the document-based interaction system provided by the invention.
Reference numerals:
1: current interactive information obtaining module; 2: semantic analysis module; 3: reconstruction decision module;
4: target area determining module; 5: feedback information generation module
Specific embodiment
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the invention, and are not to be construed as limiting the claims.
The present invention provides an embodiment of a document-based interaction method. As shown in Fig. 1, the method may include the following steps:
Step S1: obtain the current interactive information of the user;
The interaction modes involved in the present invention may include, but are not limited to, voice and text input. For voice interaction, the user's interactive audio data can be captured by a sound pickup, and the audio data can then be converted into text using existing audio processing and speech transcription techniques. For text interaction, the user can enter the interactive text manually, for example by entering an interaction request with a keyboard or a handwriting device. For these ways of obtaining the user's interactive information, reference may be made to the prior art, and the invention places no limitation on them. It should be noted, however, that the "current interactive information" referred to in this embodiment has a temporal meaning: the final feedback information, and the entire process of obtaining it, correspond to the current interactive information, but in a multi-round question-and-answer scenario the "historical information" raised before the "current interactive information" may also take part in generating the final feedback information, as explained below.
Step S2: perform semantic analysis on the current interactive information;
In practical applications, one or more kinds of analysis results may be chosen as the content of the semantic analysis, as needed. In a preferred embodiment of the present invention, the aspects of semantic analysis performed on the current interactive information may include, but are not limited to: determining the literal semantic type of the current interactive information and determining the intention type of the current interactive information.
For this purpose, a multi-task model can be used in a specific implementation to perform multi-task semantic understanding. A multi-task model uses a neural network to judge, within the same model and at the same time, the literal semantics and the interaction intent of the current interactive information (these are two example tasks; the number of tasks can be adjusted as required). The tasks can supply each other with different information, which improves the overall effect of the model. Applied to a query scenario, the structure of the semantic analysis model can be as shown in Fig. 2. The embedding layer converts each word in the current interactive information text, the Query (question, inquiry), into a corresponding word vector; the word vectors are randomly initialized and updated automatically as the model is trained. The vectors then pass through two BLSTM layers (bidirectional long short-term memory network layers), whose purpose is to encode the original embedding vectors into vectors that carry contextual information. Next, the multi-task operations are performed, illustrated here with the two tasks above. The first task passes the BLSTM output for each word through a fully connected layer to obtain a multi-class result, which gives a label for each word and thereby determines the semantic type of each word in the current interactive information.
For ease of explanation, take querying the electronic manual of an automotive product as an example. Suppose the current interactive information the user raises while querying the electronic document is "How do I use the windshield wiper?" (in the original Chinese, 怎么使用雨刷器, labelled one character at a time). The sequence labelling result obtained from the above model is then: 怎 (B_how) 么 (I_how) 使 (I_how) 用 (E_how) 雨 (B_part) 刷 (I_part) 器 (E_part), where B, I, and E mark the beginning, middle, and end of a semantic type respectively. The specific semantic types are illustrated in the following table.
Semantic type | Field value | Explanation |
how | how to turn on / use / adjust, etc. | Asks how to use a function |
what | what is it / what is it called, etc. | Asks for an explanation of a component or function |
where | where is it / in which place, etc. | Asks for the location of a component or tool |
why | what is the reason / what is the problem, etc. | Asks for the cause of a warning or fault |
part | air conditioner / low beam / sunroof, etc. | Automobile component or accessory |
attr | temperature / horsepower, etc. | Attribute of an automobile component or accessory |
phen | shaking / not lighting up / abnormal noise, etc. | Phenomenon observed in a fault |
cond | night / rainy day / muddy ground, etc. | Condition under which the fault occurs |
This completes the analysis of the literal semantic type of the current interactive information.
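To make the B/I/E labelling concrete, the following minimal sketch merges per-token tags into semantic slots. The function name `decode_slots` and the slot dictionary are hypothetical, and the sketch assumes that the example query is the Chinese sentence 怎么使用雨刷器 tagged one character at a time; it only post-processes tags and does not model how the tags are predicted.

```python
from typing import Dict, List, Optional, Tuple


def decode_slots(tagged: List[Tuple[str, str]]) -> Dict[str, str]:
    """Merge per-token B_/I_/E_ tags into {semantic_type: text} slots."""
    slots: Dict[str, str] = {}
    buffer: List[str] = []
    current: Optional[str] = None
    for token, tag in tagged:
        position, _, sem_type = tag.partition("_")
        if position == "B":                      # a new span starts
            buffer, current = [token], sem_type
        elif position in ("I", "E") and sem_type == current:
            buffer.append(token)
            if position == "E":                  # the span is complete
                slots[sem_type] = "".join(buffer)
                buffer, current = [], None
        else:                                    # malformed sequence: drop span
            buffer, current = [], None
    return slots


# Assumed character-level tagging of the example query.
tagged_query = [
    ("怎", "B_how"), ("么", "I_how"), ("使", "I_how"), ("用", "E_how"),
    ("雨", "B_part"), ("刷", "I_part"), ("器", "E_part"),
]
print(decode_slots(tagged_query))
```

The sketch covers only the B, I, and E positions named in the text; how a single-token span would be tagged is not specified there, so it is not handled.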
The second task takes the last forward result of the BLSTM (the sentence representation vector produced by the forward LSTM) and the last backward result of the BLSTM (the sentence representation vector produced by the backward LSTM) and concatenates them to obtain an overall sentence representation. This sentence representation is then passed through a fully connected layer to obtain another multi-class result, which represents the category of the user's interaction intent. In a question-and-answer scenario such as a query, the intent categories are, for example: normal question (normal), reference (refer), supplementary information (replenish), corrected information (correct), deleted information (delete), and control command (command). This completes the analysis of the intention type of the current interactive information.
In summary, the first task yields type labels at the word level, and the second task yields an intention type label at the level of the whole sentence.
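The sentence-level head described above (concatenate the final forward and backward states, apply one fully connected layer, take the most probable class) can be sketched in plain Python. Everything numeric here is an assumption for illustration: the two-dimensional states, the hand-set weight matrix, and the zero bias are not trained values, and `classify_intent` is a hypothetical name.

```python
import math
from typing import Sequence

INTENTS = ["normal", "refer", "replenish", "correct", "delete", "command"]


def classify_intent(
    forward_last: Sequence[float],
    backward_last: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> str:
    """Concatenate both final LSTM states, apply one dense layer, pick argmax."""
    sentence = list(forward_last) + list(backward_last)  # concatenation
    logits = [
        sum(w * x for w, x in zip(row, sentence)) + b
        for row, b in zip(weights, bias)
    ]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]          # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return INTENTS[probs.index(max(probs))]


# Hand-set toy parameters: only the "refer" row responds to this input.
weights = [[0.0] * 4 for _ in INTENTS]
weights[1] = [1.0, 0.0, 0.0, 1.0]
bias = [0.0] * len(INTENTS)
predicted = classify_intent([1.0, 0.0], [0.0, 1.0], weights, bias)
print(predicted)
```

In a real model the two state vectors would come from the shared BLSTM encoder, so the tagging task and the intent task train the same underlying representation.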
Step S3: decide, according to the semantic analysis result, whether to reconstruct the interactive information;
The original design intent of the invention stems from the observation that, when faced with a document containing a large amount of complex text, the prior art has difficulty obtaining the relevant information in the document directly and accurately if the user adopts a relatively complex interaction mode. Taking the question-and-answer scenario as an example, a complex interaction mode here may mean that the user, following personal habits, expression style, or particular needs, raises multi-round and complex questions in a natural conversational way, or corrects and modifies a question after a slip of the tongue, and so on. The invention therefore proposes that, when interactive information that is hard for a computer to understand occurs, the interactive information can be reconstructed to obtain reconstructed interactive information with clear semantics and complete expression. Of course, not every interaction needs reconstruction in practice; whether to reconstruct must be decided on the basis of the semantic analysis result from the preceding step. The invention preferably starts from the intention type of the current interactive information to assess whether reconstruction is needed, and may further use the earlier analysis result on the literal semantic type when reconstructing the question. A specific implementation example is given below for reference, as shown in Fig. 3:
Step S301: according to the intention type, judge whether the interactive information needs to be reconstructed; if not, execute step S302; if so, execute step S303.
The preset criterion for this judgment can be illustrated with the intention types of the question scenario above. A normal question (normal) is easy for a machine to understand, so it can be regarded as not requiring reconstruction. Reference (refer), supplementary information (replenish), corrected information (correct), and deleted information (delete), by contrast, may be incomplete or need to be understood in context, so they require reconstruction. It is worth noting that the control command (command) mentioned above is essentially not a question about the document of the kind the question-and-answer scenario is concerned with, for example "turn the air conditioner temperature up by 5 degrees" or "cancel cruise control". Such a control instruction points directly to an action output of an executing device, so this kind of interaction intent is not discussed in this scenario.
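The decision in step S301 reduces to a lookup on the intention type. The sketch below is one possible encoding, with the hypothetical function `reconstruction_decision` returning `None` for intents, such as control commands, that the scenario leaves out of scope.

```python
from typing import Optional

# Intent labels taken from the question-and-answer scenario described above.
RECONSTRUCT_INTENTS = {"refer", "replenish", "correct", "delete"}


def reconstruction_decision(intent: str) -> Optional[bool]:
    """True: reconstruct (S303); False: retain (S302); None: out of scope."""
    if intent == "normal":
        return False
    if intent in RECONSTRUCT_INTENTS:
        return True
    return None  # e.g. "command", which is not a question about the document


for label in ("normal", "correct", "command"):
    print(label, reconstruction_decision(label))
```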
Step S302: retain the current interactive information;
For example, once the question is judged to be of the normal question (normal) type, the current question can be written into the interaction history and becomes a "historical question"; more preferably, the semantic analysis result of the current question is stored in the interaction history data. In a multi-round question scenario this makes it easier to generate reconstructed interactive information (a reconstruction) later, because the current question itself remains useful for the later process of determining an answer, as explained in detail below. In addition, those skilled in the art will understand that if the next round is still a normal question (normal), the semantic analysis result of the "new" current question can overwrite the semantic analysis result of the earlier "historical question", that is, the interaction history data is rewritten. In other embodiments, a fixed history storage mechanism may be used instead: multiple independent, complete questions are stored in the interaction history data, and in later interaction a disambiguation operation is first performed among the multiple "historical questions" to determine the "historical question" that corresponds to the current question.
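The two storage policies described above (rewriting the history with the newest normal question, or keeping several complete questions) can be sketched as a small container. The class `InteractionHistory` and its `overwrite` flag are hypothetical names; the disambiguation among multiple stored questions is not specified in the text, so `latest` below is only a placeholder that returns the most recent entry.

```python
from typing import Dict, List

Analysis = Dict[str, str]


class InteractionHistory:
    """Stores semantic analysis results of earlier complete questions."""

    def __init__(self, overwrite: bool = True) -> None:
        self.overwrite = overwrite               # True: keep only the newest
        self.entries: List[Analysis] = []

    def store(self, analysis: Analysis) -> None:
        if self.overwrite:
            self.entries = [dict(analysis)]      # rewrite the history
        else:
            self.entries.append(dict(analysis))  # keep every complete question

    def latest(self) -> Analysis:
        # Placeholder for the disambiguation step, which is not detailed here.
        return self.entries[-1]


first = {"intent": "normal", "how": "how to turn on", "part": "hazard lights"}
second = {"intent": "normal", "how": "how to turn off", "part": "seat belt"}

rewriting = InteractionHistory(overwrite=True)
keeping = InteractionHistory(overwrite=False)
for item in (first, second):
    rewriting.store(item)
    keeping.store(item)
print(len(rewriting.entries), len(keeping.entries))
```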
Step S303: generate reconstructed interactive information using the interaction history data and the current interactive information.
According to the preset criteria of different scenarios, when the intention type is determined to be a non-normal question type, for example but not limited to reference (refer), supplementary information (replenish), corrected information (correct), and deleted information (delete) in the question-and-answer scenario, the "historical question" in the interaction history data can be combined with the current question itself to generate a reconstruction that is easy for a machine to understand. Specifically, the reconstruction can follow the semantic analysis result of the current interactive information: for example, according to the intention type of the current interactive information, the corresponding literal semantics in the interaction history data are integrated with the current interactive information, and the intention type of the current interactive information is updated, yielding the reconstructed interactive information. As for "corresponding": taking the "historical question" above as an example, when the history data is rewritten, "corresponding" means the "historical question" is the previous round's normal question (the only question retained in the history); when the fixed history storage mechanism is used, "corresponding" means one question determined among the multiple historical questions. It should also be pointed out that the interactive information reconstructed by the above process can serve as the basis for subsequent interaction and be stored in the history data; likewise, the stored reconstructed interactive information may replace the original historical interaction data, or coexist with it.
The reconstruction process can be understood as follows: based on the current interaction intent, the current interactive information, which may be missing components, is filled in so as to restore its true semantics, and its intention type is updated at the same time, yielding reconstructed interactive information with complete and clear literal semantics and intent. Taking the automobile manual question-and-answer scenario above as an example, suppose the current interactive information is "How do I turn on the hazard lights when driving in fog?". The semantic analysis result obtained from the multi-task model is: intent=normal, how=how to turn on, part=hazard lights, cond=foggy day. Since this is a normal question intent, the semantic analysis result is stored directly in the interaction history data.
When multi-round interaction occurs, it can be divided into, but is not limited to, the following five situations:
1) If the next current interaction question ("current" here is relative to this round of questioning) is "How do I turn off the seat belt warning?", the semantic analysis result is: intent=normal, how=how to turn off, part=seat belt, phen=warning prompt. Since the intention type is still a normal question, this is a new interaction, and the previous semantic analysis result in the interaction history data can be completely overwritten.
2) If the next current interaction question is "What about the high beams?", the semantic analysis result is: intent=refer, part=high beam. Since the interaction intent is reference, the semantic analysis result of the previous round's question in the interaction history data needs to be substituted. After integration and update, the semantic result becomes: intent=normal, how=how to turn on, part=high beam, cond=foggy day, and the current question after replacing the referred information is "how to turn on the high beams in fog". Inheritance is a special kind of reference: for example, the next current interaction question might simply be "high beams". Although it contains no explicit demonstrative pronoun (this, that, it, etc.), it actually refers to the previous question, whose how and cond need to be inherited into the current interactive information.
3) If the next current interactive information is "Not foggy, rainy", the semantic analysis result is: intent=correct, cond_error=foggy day, cond=rainy day. Since the interaction intent is correction, certain information in the previous round's question needs to be replaced. After integration and update, the semantic result becomes: intent=normal, how=how to turn on, part=hazard lights, cond=rainy day. The current question after correcting the information is "how to turn on the hazard lights on a rainy day".
4) If the next current interactive information is "Not a foggy day", the semantic result obtained by the semantic understanding module is: intent=delete, cond_error=foggy day. Since the intent is delete, certain information in the previous round's history needs to be deleted. After integration and update, the semantic result becomes: intent=normal, how=how to turn on, part=hazard lights. The current question after deleting the information is "how to turn on the hazard lights".
5) If the next current interactive information is "How do I do the second step?", the semantic analysis result is: intent=replenish, detail=second step. Since the current interaction intent is supplementary information, the previous round's question needs to be further supplemented: the current semantic analysis result is combined with the semantic result in the history to obtain the complete semantic result after supplementation, which is restored into the text of the historical interactive information. After integration and update, the semantic result becomes: how=how to turn on, part=high beam, detail=second step, and the current question after supplementing the information is "how to turn on the high beams, second step".
It should be noted again that the reference, correction, deletion, supplement, and so on described above all refer to other intents, based on the previous round's interaction content, that may arise from the user's habitual way of expression and thinking during multi-round interaction. In the reconstruction process, the current interactive information is completed by integrating the interaction history, yielding the updated semantics and intent.
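The slot merging behind the reference, correction, and deletion cases can be sketched as a dictionary update. This is a simplified illustration under stated assumptions: the function `reconstruct`, the English slot values, and the convention that a key ending in `_error` names a slot value to remove are choices made here, mirroring the `cond_error` key in the examples rather than a specification of the actual model.

```python
from typing import Dict

Analysis = Dict[str, str]
NON_NORMAL = {"refer", "replenish", "correct", "delete"}


def reconstruct(current: Analysis, previous: Analysis) -> Analysis:
    """Merge the current analysis into the previous round's normal question."""
    if current.get("intent") not in NON_NORMAL:
        return dict(current)                  # nothing to rebuild
    merged = dict(previous)
    for key, value in current.items():
        if key == "intent":
            continue
        if key.endswith("_error"):            # slot value the user rejected
            slot = key[: -len("_error")]
            if merged.get(slot) == value:
                del merged[slot]
        else:                                 # overwrite or add a slot
            merged[key] = value
    merged["intent"] = "normal"               # the rebuilt question is normal
    return merged


previous = {"intent": "normal", "how": "how to turn on",
            "part": "hazard lights", "cond": "foggy day"}
referred = reconstruct({"intent": "refer", "part": "high beam"}, previous)
corrected = reconstruct(
    {"intent": "correct", "cond_error": "foggy day", "cond": "rainy day"},
    previous,
)
deleted = reconstruct({"intent": "delete", "cond_error": "foggy day"}, previous)
print(referred, corrected, deleted, sep="\n")
```

A supplement (replenish) follows the same path as a reference in this sketch, since it simply adds a new slot such as `detail` to the previous question.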
Continuing from the above, step S4: determine the target area in the document using the current interactive information or the reconstructed interactive information;
The overall idea of this step is to search for the content areas of the document in which the relevant information may be located, according to either the current interactive information that was not reconstructed (such as a normal question) or the reconstructed interactive information (such as the new question integrated with the historical question above). A specific implementation is provided here for reference. As shown in Fig. 4, step S4 may specifically include:
Step S401: divide the document into multiple content areas;
Specifically, depending on the length and content of the document, the document may be divided into chapters, or into paragraphs within chapters, which serve as the content areas.
Step S402: extract text features of each content area and of the current interactive information or the reconstructed interactive information;
As noted above, the "current interactive information or reconstructed interactive information" referred to here is the interactive information used to retrieve the target area; it may be either the unreconstructed current interactive information or the reconstructed interactive information obtained after reconstruction.
Step S403: using the text features, compute the correlation of the current interactive information or the reconstructed interactive information with each content area;
Step S404: according to the correlation, select at least one content area as the target area.
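Steps S401 to S404 can be illustrated with a minimal sketch. Note the assumptions: the description below uses a CNN/BLSTM model for the features and the correlation, whereas this sketch swaps in plain word counts and cosine similarity purely for illustration; the function names (`split_into_areas`, `text_features`, `correlation`, `select_target_areas`) and the blank-line paragraph split are hypothetical and do not come from the patent.

```python
import math
import re
from collections import Counter
from typing import List


def split_into_areas(document: str) -> List[str]:
    """S401: treat each blank-line-separated paragraph as one content area."""
    return [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]


def text_features(text: str) -> Counter:
    """S402: stand-in feature extractor (lower-cased word counts)."""
    return Counter(re.findall(r"\w+", text.lower()))


def correlation(query_vec: Counter, area_vec: Counter) -> float:
    """S403: cosine similarity between two word-count vectors."""
    dot = sum(count * area_vec[word] for word, count in query_vec.items())
    norm_q = math.sqrt(sum(c * c for c in query_vec.values()))
    norm_a = math.sqrt(sum(c * c for c in area_vec.values()))
    return dot / (norm_q * norm_a) if norm_q and norm_a else 0.0


def select_target_areas(document: str, query: str, top_k: int = 1) -> List[str]:
    """S404: return the top_k content areas ranked by correlation with the query."""
    areas = split_into_areas(document)
    query_vec = text_features(query)
    ranked = sorted(
        areas,
        key=lambda area: correlation(query_vec, text_features(area)),
        reverse=True,
    )
    return ranked[:top_k]
```

Here `query` stands for either the current interactive information or the reconstructed interactive information; the selection logic is the same in both cases, and `top_k` reflects that "at least one" content area may be chosen.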
Taking a question-answering scenario as an example, the target area may be determined with the area search model structure shown in Fig. 5. The model uses the existing CNN and BLSTM; for details of CNN and BLSTM, reference may be made to publicly known material, which the present invention does not repeat. Continuing from the above, the question, the title (the title of the current section, e.g. "seat belt reminder device"), the higher-level title (e.g. "seat belt"), and the content (the detailed description under the title, e.g. the detailed description of the "seat belt reminder device") are modeled. Through interactive computation between the features, the area in which the required feedback information (the answer to the question) may be located is obtained. The specific operating process is as follows:
First, the original Query (question), Title, Higher Title (higher-level title), and Content are segmented into words, and each word is mapped to its id in the dictionary and fed to the model. The embedding layer converts the word ids into vectors; the word vectors may be pre-trained with other tools such as word2vec. The embeddings of the Query, Title, and Higher Title then pass through a CNN and a max pooling (Maxpooling) layer, and sentence features are obtained after convolution and pooling. The embedding of the Content is modeled with a BLSTM, because the Content contains more text than the former three and carries contextual sequence information, so a BLSTM is preferred. Next, the vector representation of the Query undergoes an interactive computation with the vector representations of the Title, Higher Title, and Content respectively, and the results are concatenated. This amounts to the results of similarity computations between the Query and the Title, Higher Title, and Content. These three results are then integrated through a fully connected (Fully Connected) layer to obtain a score, which is the similarity score of the current Query with respect to this content area. The current Query is scored against every content area in this way, and finally the content area with the highest score is selected as the required target area.
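The scoring stage of the model above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not specify the form of the "interactive computation", so a dot product is assumed; the sentence vectors are taken as already produced by the CNN/BLSTM encoders; and `fc_weights`, `area_score`, and `best_area` are hypothetical names, with the fully connected layer reduced to a single linear unit.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def interact(a: Vector, b: Vector) -> float:
    """Assumed interaction: a plain dot product between two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def area_score(query: Vector, area: Dict[str, Vector],
               fc_weights: Vector, fc_bias: float = 0.0) -> float:
    """Concatenate the three Query interactions and apply one linear layer."""
    interactions = [
        interact(query, area["title"]),
        interact(query, area["higher_title"]),
        interact(query, area["content"]),
    ]
    return interact(fc_weights, interactions) + fc_bias


def best_area(query: Vector, areas: List[Dict[str, Vector]],
              fc_weights: Vector) -> int:
    """Return the index of the highest-scoring content area."""
    scores = [area_score(query, area, fc_weights) for area in areas]
    return max(range(len(areas)), key=lambda i: scores[i])
```

In a trained model the weights would be learned jointly with the encoders; the fixed weights used when calling this sketch are placeholders.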
Step S5: generate feedback information according to the content of the target area.
The content of the target area referred to here may be information in any form that may appear in a document, such as text, tables, and pictures. There may be at least two ways of generating feedback information from the content of the target area. One implementation extracts or combines key information from the content of the target area to generate the feedback information. However, feedback information generated in this way may well fail to fit the context of the scenario. For example, a user asks "how do I open the trunk", but the manual may contain only a description of the tailgate, such as "Stand behind the vehicle, press the tailgate open/close button, and lift the tailgate. The button is located slightly below and to the right on the tailgate, aligned with the right edge of the rear logo." With this method, the generated answer may be taken directly from that passage, but if the user does not know the relationship between the trunk and the tailgate, the generated answer may cause confusion.
As mentioned above, another kind of complex interaction scenario is that the interactive information or the document may contain specific terminology, non-professional idiomatic expressions, or obscure and uncommon words. The present invention therefore proposes a preferred method that generates feedback information using the mutual attention between the content of the target area and the current interactive information or the reconstructed interactive information, together with the content of the target area and the current interactive information or the reconstructed interactive information. As shown in Fig. 6, the method may specifically include:
Step S501: compute a first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information, and compute a second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area;
Step S502: using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information, determine the weight of each word in a preset dictionary;
Step S503: determine a target word according to the first attention degree, the second attention degree, and the weight;
Step S504: using the target word, generate the feedback information corresponding to the current interactive information or the reconstructed interactive information.
Taking the aforementioned question-answering scenario as an example, the above process is explained as follows:
The idea of this embodiment is a co-attention mechanism: co-attention is computed between the encoding of the question asked and the encoding of the text content in the target area. The principle is that "looking at" the content from the perspective of the query yields the query's degree of attention to each word in the content, and conversely the content is used to "look at" each word in the query. In practice, two probability matrices, denoted Pq and Pc, are obtained: Pq is the probability of each word in the query, and Pc is the probability of each word in the content. Pq matters in particular: if the probability of a word in Pq is very high, that word from the query will be used in the finally generated answer. The generated feedback information is thereby easier to understand, where "easier to understand" is from the user's point of view, that is, closer to the context the user uses. The user's need to query the document can therefore be met in a more user-friendly way.
The operation of the above scheme within a model is further explained with reference to the feedback information generation model shown in Fig. 7: the model encodes the text with a BLSTM, obtains interaction information through interactive computation between the Query and the Content, and then re-encodes it by means of CoAttention (mutual attention).
Assume that the maximum length of the input Query sentence is m, the maximum length of the input Content is n, the embedding size is e, and the BLSTM hidden layer size is h. The input Query and Content are processed word by word; after the embedding layer the words are converted into word vectors, with dimensions (m, e) and (n, e) respectively. A first encoding is performed by the BLSTM, and the results of the forward LSTM and the backward LSTM are concatenated to obtain the BLSTM-encoded feature vector of each word, with dimensions (m, 2h) and (n, 2h) respectively. Then, in the CoAttention layer, an interactive computation is performed between the feature vectors that the Query obtained from the BLSTM and the feature vectors that the Content obtained, yielding a feature matrix of size (m, n). Softmax is computed over this feature matrix by row and by column respectively, followed by averaging, which gives two attention weights with dimensions (1, m) and (1, n). These two vectors represent the first attention degree of the Content to each word in the Query and the second attention degree of the Query to each word in the Content, that is, the aforementioned Pq and Pc. Pq and Pc are then multiplied back onto their respective BLSTM results, completing the CoAttention computation and yielding two new feature vectors for the Query and the Content. The two vectors are integrated, for example by concatenation into an (m+n, 2h) vector that is fed into a BLSTM for encoding, producing a coding vector of dimension (1, 2h) for the entire content of the Query and the Content. The answer is then generated from this coding vector by a decoder: the coding vector is fed into the decoder and decoded by an LSTM. Specifically, decoding begins by inputting a start symbol <GO>; the LSTM then computes a feature vector, and a Fully Connected layer followed by softmax yields, for each word in a specific dictionary, the probability Pv that the word is used to generate the subsequent feedback information (that is, the aforementioned weight). The dictionary referred to here may be a pre-built dictionary covering all general and specialized words that may appear in the field and the scenario, and may of course also be a larger dictionary collection.
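The step that turns the (m, n) feature matrix into the two attention vectors can be sketched as follows. This is one possible reading, labeled as an assumption: the text only says softmax is taken "by row and by column" and then averaged, so the sketch assumes a dot-product affinity, a softmax over the Query axis averaged across Content words for Pq, and a softmax over the Content axis averaged across Query words for Pc. The names `affinity` and `co_attention` are hypothetical, and the BLSTM features are taken as given.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def affinity(query_feats: Matrix, content_feats: Matrix) -> Matrix:
    """(m, n) matrix of dot products between Query and Content word features."""
    return [[sum(q * c for q, c in zip(qv, cv)) for cv in content_feats]
            for qv in query_feats]


def co_attention(query_feats: Matrix,
                 content_feats: Matrix) -> Tuple[List[float], List[float]]:
    """Return (Pq, Pc): attention over Query words and over Content words."""
    scores = affinity(query_feats, content_feats)
    m, n = len(scores), len(scores[0])
    # Pq: for each Content word, softmax over the m Query words, then average.
    col_soft = [softmax([scores[i][j] for i in range(m)]) for j in range(n)]
    p_q = [sum(col_soft[j][i] for j in range(n)) / n for i in range(m)]
    # Pc: for each Query word, softmax over the n Content words, then average.
    row_soft = [softmax(row) for row in scores]
    p_c = [sum(row_soft[i][j] for i in range(m)) / m for j in range(n)]
    return p_q, p_c
```

Under this reading each of Pq and Pc sums to 1, which is consistent with the description of them as probabilities over the words of the Query and the Content.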
The final probability Pr(x) of each word may be a weighted sum of the component probabilities: Pr(x) = w1·Pq(x) + w2·Pc(x) + w3·Pv(x). The word with the largest probability value is thus computed as the word generated at the current position (that is, the aforementioned target word), and the word vector of this word is used as the next input of the decoder, and so on, until the terminator <STOP> is output, at which point answer generation stops. Finally, the entire generated answer sequence is the required feedback information, which is presented to the user.
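The weighted sum and the choice of the target word at one decoding position can be sketched as follows. The assumptions are that Pq, Pc, and Pv have already been aggregated into per-word probabilities (a word absent from a distribution counts as 0), and that the weights w1, w2, w3 are given; the function names and the example values are hypothetical.

```python
from typing import Dict, Tuple


def combine_probabilities(
    p_q: Dict[str, float],
    p_c: Dict[str, float],
    p_v: Dict[str, float],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> Dict[str, float]:
    """Pr(x) = w1*Pq(x) + w2*Pc(x) + w3*Pv(x) for every candidate word x."""
    w1, w2, w3 = weights
    words = set(p_q) | set(p_c) | set(p_v)
    return {
        word: w1 * p_q.get(word, 0.0)
        + w2 * p_c.get(word, 0.0)
        + w3 * p_v.get(word, 0.0)
        for word in words
    }


def pick_target_word(combined: Dict[str, float]) -> str:
    """Greedy choice: the word with the largest combined probability."""
    return max(combined, key=combined.get)
```

For instance, with a Query distribution that strongly favors "trunk", the combined distribution can select "trunk" even when the Content only mentions the tailgate, which is the effect described above of keeping the answer close to the user's own wording.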
The document-based interaction method provided by the present invention performs semantic analysis on the acquired current interactive information of the user and then decides, according to the semantic analysis result, whether to perform information reconstruction. The target area is then determined in the document using either the unreconstructed current interactive information or the reconstructed interactive information, and feedback information is finally generated according to the content of the target area. Unlike keyword retrieval over a document, the present invention addresses more complex and varied interaction scenarios, such as multi-round question interaction: when necessary, it can reconstruct interactive information that is easy for a machine to understand, then lock onto candidate target areas using the corresponding interactive information, and generate the final required feedback information from the candidates. The present invention can accurately and reliably locate relevant information in a document and generate user-friendly feedback from it.
Corresponding to the foregoing embodiments and preferred schemes, the present invention also provides a document-based interaction system. As shown in Fig. 8, the system may include at least one memory for storing relevant instructions and at least one processor connected to the memory and configured to execute the following modules (in other embodiments, one or more processors may also directly perform the actions of the corresponding steps without going through the following modules; for example, a processor may directly perform operations such as semantic analysis, information reconstruction, and information feedback):
Current interactive information acquisition module 1, configured to acquire the current interactive information of the user;
Semantic analysis module 2, configured to perform semantic analysis on the current interactive information;
Reconstruction decision module 3, configured to decide, according to the semantic analysis result, whether to perform information reconstruction;
Target area determination module 4, configured to determine a target area in the document using the current interactive information or the reconstructed interactive information;
Feedback information generation module 5, configured to generate feedback information according to the content of the target area.
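How the modules listed above chain together can be shown with a small skeleton. It is a sketch under assumptions, not the patent's implementation: each module is reduced to a plain callable, the intention label "normal" is a hypothetical placeholder for the normal question type, and the class and field names are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DocumentInteractionSystem:
    """Wires the analysis, reconstruction, area search and feedback stages."""

    analyze: Callable[[str], str]                 # module 2: returns an intention label
    reconstruct: Callable[[str, List[str]], str]  # used by module 3 when needed
    locate: Callable[[str], str]                  # module 4: returns target area content
    generate: Callable[[str, str], str]           # module 5: builds the feedback

    def respond(self, current: str, history: List[str]) -> str:
        """Module 1 hands in `current`; the remaining modules run in order."""
        intention = self.analyze(current)
        # Module 3: keep the input for a normal question, otherwise rebuild it
        # from the interaction history.
        query = current if intention == "normal" else self.reconstruct(current, history)
        area = self.locate(query)
        return self.generate(area, query)
```

Keeping the stages as injectable callables mirrors the statement that modules may be combined, split, or executed directly by a processor.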
Further, the semantic analysis module specifically includes:
Literal semantic type determination unit, configured to determine the literal semantic type of the current interactive information;
Intention type determination unit, configured to determine the intention type of the current interactive information.
Further, the reconstruction decision module is specifically configured to: according to the intention type, retain the current interactive information, or obtain reconstructed interactive information using interaction history data and the current interactive information.
Further, the reconstruction decision module specifically includes:
Retention unit, configured to retain the current interactive information after determining that the intention type of the current interactive information is a normal question type;
Reconstruction unit, configured to, after determining that the intention type of the current interactive information is an abnormal question type, integrate, according to the abnormal question type, the corresponding semantic analysis result in the interaction history data with the current interactive information and update the intention type of the current interactive information, to obtain reconstructed interactive information.
Further, the target area determination module specifically includes:
Partition unit, configured to divide the document into multiple content areas;
Text feature extraction unit, configured to extract text features of each content area and of the current interactive information or the reconstructed interactive information;
Correlation acquisition unit, configured to compute, using the text features, the correlation of the current interactive information or the reconstructed interactive information with each content area;
Target area selection unit, configured to select at least one content area as the target area according to the correlation.
Further, the feedback information generation module specifically includes:
Attention degree computation unit, configured to compute the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;
Feedback information generation unit, configured to generate feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.
Further, the attention degree computation unit specifically includes:
First attention degree computation subunit, configured to compute a first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information;
Second attention degree computation subunit, configured to compute a second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area.
Further, the feedback information generation unit specifically includes:
Word weight determination subunit, configured to determine the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;
Target word determination subunit, configured to determine a target word according to the first attention degree, the second attention degree, and the weight;
Feedback information generation subunit, configured to generate, using the target word, the feedback information corresponding to the current interactive information or the reconstructed interactive information.
Although the working modes and technical principles of the above system embodiment and preferred schemes are all described above, it should still be pointed out that the various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. The modules, units, or components in the embodiments may be combined into one module, unit, or component, and may also be divided into multiple sub-modules, sub-units, or sub-components.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The system embodiment in particular is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, refer to the description of the method embodiment. The system embodiment described above is merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of the embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The structure, features, and effects of the present invention have been described in detail above based on the embodiments shown in the drawings, but the above are only preferred embodiments of the present invention. It should be noted that those skilled in the art may reasonably combine the technical features involved in the above embodiments and their preferred schemes into a variety of equivalent schemes without departing from or changing the design idea and technical effects of the present invention. Therefore, the scope of implementation of the present invention is not limited to what is shown in the drawings; any change made according to the concept of the present invention, or any equivalent embodiment modified into an equivalent variation, that does not go beyond the spirit covered by the description and the drawings shall fall within the scope of protection of the present invention.
Claims (12)
1. A document-based interaction method, characterized by comprising:
acquiring current interactive information of a user;
performing semantic analysis on the current interactive information;
deciding, according to the semantic analysis result, whether to perform information reconstruction;
determining a target area in a document using the current interactive information or reconstructed interactive information;
generating feedback information according to the content of the target area.
2. The document-based interaction method according to claim 1, characterized in that performing semantic analysis on the current interactive information comprises:
determining the literal semantic type of the current interactive information;
determining the intention type of the current interactive information.
3. The document-based interaction method according to claim 2, characterized in that deciding, according to the semantic analysis result, whether to perform information reconstruction comprises:
according to the intention type, retaining the current interactive information, or obtaining reconstructed interactive information using interaction history data and the current interactive information.
4. The document-based interaction method according to claim 3, characterized in that obtaining reconstructed interactive information using interaction history data and the current interactive information according to the intention type comprises:
when the intention type is an abnormal question type, integrating, according to the abnormal question type, the corresponding semantic analysis result in the interaction history data with the current interactive information and updating the intention type of the current interactive information, to obtain the reconstructed interactive information.
5. The document-based interaction method according to claim 1, characterized in that determining a target area in a document using the current interactive information or reconstructed interactive information comprises:
dividing the document into multiple content areas;
extracting text features of each content area and of the current interactive information or the reconstructed interactive information;
computing, using the text features, the correlation of the current interactive information or the reconstructed interactive information with each content area;
selecting, according to the correlation, at least one content area as the target area.
6. The document-based interaction method according to any one of claims 1 to 5, characterized in that generating feedback information according to the content of the target area comprises:
computing the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;
generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.
7. The document-based interaction method according to claim 6, characterized in that computing the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information comprises:
computing a first attention degree of the content of the target area to each word in the current interactive information or the reconstructed interactive information;
computing a second attention degree of the current interactive information or the reconstructed interactive information to each word in the content of the target area.
8. The document-based interaction method according to claim 7, characterized in that generating feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information comprises:
determining the weight of each word in a preset dictionary using the first attention degree, the second attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information;
determining a target word according to the first attention degree, the second attention degree, and the weight;
generating, using the target word, the feedback information corresponding to the current interactive information or the reconstructed interactive information.
9. A document-based interaction system, characterized by comprising:
a current interactive information acquisition module, configured to acquire current interactive information of a user;
a semantic analysis module, configured to perform semantic analysis on the current interactive information;
a reconstruction decision module, configured to decide, according to the semantic analysis result, whether to perform information reconstruction;
a target area determination module, configured to determine a target area in a document using the current interactive information or reconstructed interactive information;
a feedback information generation module, configured to generate feedback information according to the content of the target area.
10. The document-based interaction system according to claim 9, characterized in that the reconstruction decision module specifically comprises:
a retention unit, configured to retain the current interactive information after determining that the intention type of the current interactive information is a normal question type;
a reconstruction unit, configured to, after determining that the intention type of the current interactive information is an abnormal question type, integrate, according to the abnormal question type, the corresponding semantic analysis result in the interaction history data with the current interactive information and update the intention type of the current interactive information, to obtain reconstructed interactive information.
11. The document-based interaction system according to claim 9, characterized in that the target area determination module specifically comprises:
a partition unit, configured to divide the document into multiple content areas;
a text feature extraction unit, configured to extract text features of each content area and of the current interactive information or the reconstructed interactive information;
a correlation acquisition unit, configured to compute, using the text features, the correlation of the current interactive information or the reconstructed interactive information with each content area;
a target area selection unit, configured to select at least one content area as the target area according to the correlation.
12. The document-based interaction system according to any one of claims 9 to 11, characterized in that the feedback information generation module specifically comprises:
an attention degree computation unit, configured to compute the attention degree between the content of the target area and the current interactive information or the reconstructed interactive information;
a feedback information generation unit, configured to generate feedback information according to the attention degree, the content of the target area, and the current interactive information or the reconstructed interactive information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811596444.0A CN109857843B (en) | 2018-12-25 | 2018-12-25 | Interaction method and system based on document |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811596444.0A CN109857843B (en) | 2018-12-25 | 2018-12-25 | Interaction method and system based on document |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109857843A true CN109857843A (en) | 2019-06-07 |
CN109857843B CN109857843B (en) | 2023-01-17 |
Family
ID=66892280
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811596444.0A Active CN109857843B (en) | 2018-12-25 | 2018-12-25 | Interaction method and system based on document |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109857843B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100169309A1 (en) * | 2008-12-30 | 2010-07-01 | Barrett Leslie A | System, Method, and Apparatus for Information Extraction of Textual Documents |
US20110270888A1 (en) * | 2010-04-30 | 2011-11-03 | Orbis Technologies, Inc. | Systems and methods for semantic search, content correlation and visualization |
CN105068661A (en) * | 2015-09-07 | 2015-11-18 | 百度在线网络技术(北京)有限公司 | Man-machine interaction method and system based on artificial intelligence |
CN105653738A (en) * | 2016-03-01 | 2016-06-08 | 北京百度网讯科技有限公司 | Search result broadcasting method and device based on artificial intelligence |
CN106601237A (en) * | 2016-12-29 | 2017-04-26 | 上海智臻智能网络科技股份有限公司 | Interactive voice response system and voice recognition method thereof |
CN106776936A (en) * | 2016-12-01 | 2017-05-31 | 上海智臻智能网络科技股份有限公司 | intelligent interactive method and system |
CN108664472A (en) * | 2018-05-08 | 2018-10-16 | 腾讯科技(深圳)有限公司 | Natural language processing method, apparatus and its equipment |
CN108920666A (en) * | 2018-07-05 | 2018-11-30 | 苏州思必驰信息科技有限公司 | Searching method, system, electronic equipment and storage medium based on semantic understanding |
CN108959627A (en) * | 2018-07-23 | 2018-12-07 | 北京光年无限科技有限公司 | Question and answer exchange method and system based on intelligent robot |
Non-Patent Citations (3)
Title |
---|
FAEZEH ENSAN 等: "Document Retrieval Model Through Semantic Linking", 《ACM》 * |
付鸿鹄等: "基于段落检索和段落内容分析的知识化检索系统设计", 《情报理论与实践》 * |
王红等: "基于注意力机制的LSTM的语义关系抽取", 《计算机应用研究》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111091011A (en) * | 2019-12-20 | 2020-05-01 | 科大讯飞股份有限公司 | Domain prediction method, domain prediction device and electronic equipment |
CN111160443A (en) * | 2019-12-25 | 2020-05-15 | 浙江大学 | Activity and user identification method based on deep multitask learning |
CN111160443B (en) * | 2019-12-25 | 2023-05-23 | 浙江大学 | Activity and user identification method based on deep multitasking learning |
CN111310848A (en) * | 2020-02-28 | 2020-06-19 | 支付宝(杭州)信息技术有限公司 | Training method and device of multi-task model |
CN111310848B (en) * | 2020-02-28 | 2022-06-28 | 支付宝(杭州)信息技术有限公司 | Training method and device for multi-task model |
Also Published As
Publication number | Publication date |
---|---|
CN109857843B (en) | 2023-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Keneshloo et al. | Deep reinforcement learning for sequence-to-sequence models | |
CN111611361B (en) | Intelligent reading, understanding, question answering system of extraction type machine | |
CN110717017B (en) | Method for processing corpus | |
CN109885672B (en) | Question-answering type intelligent retrieval system and method for online education | |
CN107870964B (en) | Statement ordering method and system applied to answer fusion system | |
CN111026842A (en) | Natural language processing method, natural language processing device and intelligent question-answering system | |
CN108932342A (en) | A kind of method of semantic matches, the learning method of model and server | |
CN113591902A (en) | Cross-modal understanding and generating method and device based on multi-modal pre-training model | |
CN110096567A (en) | Selection method, system are replied in more wheels dialogue based on QA Analysis of Knowledge Bases Reasoning | |
CN109857846B (en) | Method and device for matching user question and knowledge point | |
CN109902750A (en) | Method is described based on two-way single attention mechanism image | |
CN110990555B (en) | End-to-end retrieval type dialogue method and system and computer equipment | |
CN109857843A (en) | Exchange method and system based on document | |
CN113239169A (en) | Artificial intelligence-based answer generation method, device, equipment and storage medium | |
CN111274822A (en) | Semantic matching method, device, equipment and storage medium | |
Dai et al. | A survey on dialog management: Recent advances and challenges | |
CN112115252A (en) | Intelligent auxiliary writing processing method and device, electronic equipment and storage medium | |
CN114997181A (en) | Intelligent question-answering method and system based on user feedback correction | |
CN114387537A (en) | Video question-answering method based on description text | |
Madureira et al. | An overview of natural language state representation for reinforcement learning | |
CN113988071A (en) | Intelligent dialogue method and device based on financial knowledge graph and electronic equipment | |
CN117033609B (en) | Text visual question-answering method, device, computer equipment and storage medium | |
Hafeth et al. | Semantic representations with attention networks for boosting image captioning | |
CN116644168A (en) | Interactive data construction method, device, equipment and storage medium | |
CN116975347A (en) | Image generation model training method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||