CN106709072A - Method of obtaining intelligent conversation reply content based on shared corpora - Google Patents
- Publication number
- CN106709072A (application number CN201710076115.2A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- session
- type
- reply
- content
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a method of obtaining intelligent conversation reply content based on shared corpora. The method comprises the following steps: establishing personal corpora corresponding to the communication parties; combining the personal corpora of multiple communication parties to obtain a shared corpus; and matching, in the shared corpus, reply content that matches the current conversation content. The method solves the technical problem that conversation reply content obtained by matching against existing shared corpora is inaccurate. Its beneficial effects are that the workload of manually building shared corpora is reduced, and that the resulting shared corpus is rich in content and varied in form, with relatively high practicality and intelligence, so that relatively accurate conversation reply content is obtained by matching against it.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method of obtaining intelligent session reply content based on a shared corpus.
Background
In intelligent sessions, session reply content can often be shared. For example, in a scenario where enterprise employees hold business sessions with clients, the sentence with which sales manager Zhang San replies to a prospective client's request for a quotation can be shared with sales manager Li Si and other colleagues. A shared corpus can therefore be created from the personal session corpora of one or more communication parties, and intelligent session reply content can then be obtained by matching against the created shared corpus.
Because existing shared corpora are created manually, their quality is generally low, so the session reply content obtained by matching against them is inaccurate. To address this problem, the present embodiment proposes a method of obtaining intelligent session reply content based on a shared corpus.
Summary of the invention
The invention provides a method of obtaining intelligent session reply content based on a shared corpus, to solve the technical problem that session reply content obtained by matching against existing shared corpora is inaccurate.
The method of obtaining intelligent session reply content based on a shared corpus provided by the invention includes:
Establishing personal corpora corresponding to communication parties, wherein the number of communication parties is more than one;
Merging the personal corpora of the multiple communication parties to obtain a shared corpus;
Matching, in the shared corpus, reply content that matches the current session content, and using that reply content as the session reply content corresponding to the current session content.
Further, establishing a personal corpus corresponding to a communication party includes:
Collecting the session content of the communication party;
Obtaining the session pairs in the session content;
Collecting, according to preset scene tags, the scene tag value of each session pair for each scene tag;
Matching and combining the session pairs, the scene tags, and the scene tag values corresponding to the scene tags, thereby generating the personal corpus corresponding to the communication party.
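The combination of session pair, scene tag, and scene tag value can be pictured with a minimal Python sketch. The names CorpusEntry, build_personal_corpus, and demo_collector are hypothetical and do not come from the patent; the collector is a stand-in for whatever mechanism actually gathers tag values.

```python
from dataclasses import dataclass, field


@dataclass
class CorpusEntry:
    initiation: str  # initiation sentence of the session pair
    reply: str       # reply sentence of the session pair
    scene: dict = field(default_factory=dict)  # scene tag -> scene tag value


def build_personal_corpus(session_pairs, scene_tags, collect_tag_value):
    """Attach a value for every preset scene tag to each session pair."""
    corpus = []
    for initiation, reply in session_pairs:
        scene = {tag: collect_tag_value(tag, initiation, reply) for tag in scene_tags}
        corpus.append(CorpusEntry(initiation, reply, scene))
    return corpus


def demo_collector(tag, initiation, reply):
    # Hypothetical collector with fixed answers, for illustration only.
    return {"topic": "quotation", "relationship": "sales-client"}.get(tag, "unknown")


pairs = [("How much is model X?", "Model X is 500 yuan per unit.")]
corpus = build_personal_corpus(pairs, ["topic", "relationship"], demo_collector)
```

Each entry keeps the pair and its scene information together, which is what later allows matching on scene tag values as well as on sentence text.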
Further, obtaining the session pairs in the session content includes:
Determining the initiation sentences and reply sentences in the session content according to the semantics of the session sentences in the session content;
Determining the types of the initiation sentences and reply sentences according to preset type judgment rules;
Extracting a basic session pair from an initiation sentence and the reply sentences located between that initiation sentence and the next initiation sentence;
Extracting at least one session pair according to the basic session pair and the types of the initiation sentence and reply sentence in the basic session pair.
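As a rough illustration of the third step, the following Python sketch groups each initiation sentence with the reply sentences that appear before the next initiation sentence. It assumes the sentences are already labeled as "initiation" or "reply" and are in chronological order; the function name and the choice to skip initiation sentences that receive no reply are assumptions, not details from the patent.

```python
def extract_basic_pairs(labeled):
    """labeled: chronological list of (role, text), role is 'initiation' or 'reply'."""
    pairs = []
    current = None  # (initiation text, list of reply texts)
    for role, text in labeled:
        if role == "initiation":
            # Close the previous pair, but only if it actually received replies.
            if current is not None and current[1]:
                pairs.append(current)
            current = (text, [])
        elif current is not None:
            current[1].append(text)
    if current is not None and current[1]:
        pairs.append(current)
    return pairs


labeled = [
    ("initiation", "How have you been lately?"),
    ("reply", "Pretty good."),
    ("reply", "Busy, though."),
    ("initiation", "Can you help me pay my phone bill?"),
    ("reply", "Sure."),
]
basic_pairs = extract_basic_pairs(labeled)
```

A basic pair here may carry several reply sentences, which is what lets the later steps move beyond a strict one-question, one-answer form.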
Further, determining the initiation sentences and reply sentences in the session content according to the semantics of the session sentences in the session content includes:
Judging whether, within a preset time interval, a session sentence in the session content has preceding text sent by the other communication party; if not, the session sentence is determined to be an initiation sentence;
If so, judging whether the session sentence is semantically unrelated to the preceding text sent by the other communication party; if so, the session sentence is determined to be an initiation sentence; otherwise the session sentence is determined to be a reply sentence.
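The two judgments above can be sketched in Python as follows. The semantic check is passed in as a function because the text does not say how semantic association is computed; the hand-listed RELATED set, the function names, and the timestamps are placeholders chosen only for illustration.

```python
from datetime import datetime, timedelta


def classify_sentence(text, sent_at, other_prev, window, unrelated):
    """other_prev: (text, time) of the other party's latest message, or None."""
    if other_prev is None:
        return "initiation"
    prev_text, prev_at = other_prev
    if sent_at - prev_at > window:
        # No preceding text from the other party inside the preset interval.
        return "initiation"
    if unrelated(text, prev_text):
        return "initiation"
    return "reply"


# Placeholder semantic check: a hand-listed set of related (previous, current) pairs.
RELATED = {("How have you been lately?", "Pretty good.")}


def unrelated(text, prev_text):
    return (prev_text, text) not in RELATED


window = timedelta(days=1)  # preset time interval, assumed to be one day
prev = ("How have you been lately?", datetime(2017, 12, 1, 9, 0))
late = classify_sentence("Pretty good.", datetime(2017, 12, 3, 9, 0), prev, window, unrelated)
prompt = classify_sentence("Pretty good.", datetime(2017, 12, 1, 9, 5), prev, window, unrelated)
```

In this sketch the same text is classified as an initiation sentence when it arrives two days later and as a reply sentence when it arrives five minutes later.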
Further, determining the type of an initiation sentence according to the preset type judgment rules includes:
Judging whether the initiation sentence is a sentence with complete, independent semantics. If so, judging whether it consists of multiple simple sentences that each have complete, independent semantics; if so, its type is determined to be the complex initiation sentence type, otherwise the simple initiation sentence type. If not, judging whether it contains a simple sentence with complete, independent semantics; if it does, its type is determined to be the non-standard complex initiation sentence type, and if it does not, the non-standard simple initiation sentence type;
Searching whether an initiation sentence of the non-standard simple initiation sentence type has its own consecutive preceding and following session sentences. If not, no derivative extension is performed. If so, further judging whether the initiation sentence can be merged with its own consecutive preceding and following session sentences into a sentence with complete, independent semantics; if it can, its type is derivatively extended to the non-standard sentence group initiation sentence type; if it cannot, no derivative extension is performed;
Searching whether an initiation sentence of the non-standard complex initiation sentence type has its own consecutive preceding and following session sentences. If not, no derivative extension is performed. If so, further judging whether the initiation sentence can be merged with its own consecutive preceding and following session sentences into a sentence with complete, independent semantics; if it can, its type is derivatively extended to the non-standard sentence group initiation sentence type; if it cannot, no derivative extension is performed;
Judging whether an initiation sentence of the simple, complex, non-standard simple, non-standard complex, or non-standard sentence group type has its own consecutive preceding and following session sentences. If so, further judging whether the initiation sentence can be merged with those session sentences into a semantically associated sentence group; if so, its type is derivatively extended to the sentence group initiation sentence type; otherwise no derivative extension is performed.
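The type rules above form a small decision tree, sketched below in Python. The semantic judgments (whether a sentence is complete, how many complete simple sentences it contains, whether it can be merged with neighboring sentences) are taken as inputs, since the text does not specify how they are computed; the function names and the shortened type labels are illustrative.

```python
def base_type(is_complete, complete_clauses):
    """is_complete: the sentence has complete, independent semantics.
    complete_clauses: number of simple sentences with complete semantics it contains."""
    if is_complete:
        return "complex" if complete_clauses > 1 else "simple"
    return "non-standard complex" if complete_clauses >= 1 else "non-standard simple"


def derived_types(base, has_adjacent_own, merges_to_complete, merges_to_group):
    """Return the base type plus any types reached by derivative extension."""
    types = [base]
    if not has_adjacent_own:
        return types  # no neighboring sentences of its own: no extension
    if base.startswith("non-standard") and merges_to_complete:
        types.append("non-standard sentence group")
    if merges_to_group:
        types.append("sentence group")
    return types


fragment = base_type(is_complete=False, complete_clauses=0)
extended = derived_types(fragment, True, True, True)
```

Because the reply sentence rules mirror the initiation sentence rules, the same two functions could be reused for reply sentences with the labels read as reply sentence types.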
Further, determining the type of a reply sentence according to the preset type judgment rules includes:
Judging whether the reply sentence is a sentence with complete, independent semantics. If so, judging whether it consists of multiple simple sentences that each have complete, independent semantics; if so, its type is determined to be the complex reply sentence type, otherwise the simple reply sentence type. If not, judging whether it contains a simple sentence with complete, independent semantics; if it does, its type is determined to be the non-standard complex reply sentence type, and if it does not, the non-standard simple reply sentence type;
Searching whether a reply sentence of the non-standard simple reply sentence type has its own consecutive preceding and following session sentences. If not, no derivative extension is performed. If so, further judging whether the reply sentence can be merged with its own consecutive preceding and following session sentences into a sentence with complete, independent semantics; if it can, its type is derivatively extended to the non-standard sentence group reply sentence type; if it cannot, no derivative extension is performed;
Searching whether a reply sentence of the non-standard complex reply sentence type has its own consecutive preceding and following session sentences. If not, no derivative extension is performed. If so, further judging whether the reply sentence can be merged with its own consecutive preceding and following session sentences into a sentence with complete, independent semantics; if it can, its type is derivatively extended to the non-standard sentence group reply sentence type; if it cannot, no derivative extension is performed;
Judging whether a reply sentence of the simple, complex, non-standard simple, non-standard complex, or non-standard sentence group type has its own consecutive preceding and following session sentences. If so, further judging whether the reply sentence can be merged with those session sentences into a semantically associated sentence group; if so, its type is derivatively extended to the sentence group reply sentence type; otherwise no derivative extension is performed.
Further, extracting at least one session pair according to the basic session pair, the type of the initiation sentence in the basic session pair, and the type of the reply sentence in the basic session pair includes:
Performing derivative extension on the type of the initiation sentence in the basic session pair to obtain initiation sentences of multiple types;
Performing derivative extension on the type of the reply sentence in the basic session pair to obtain reply sentences of multiple types;
Combining and extracting at least one semantically associated session pair from the initiation sentences of multiple types and the reply sentences of multiple types.
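One way to read the final combining step is as a filtered Cartesian product of initiation sentence variants and reply sentence variants. The Python sketch below assumes each variant is a (type, text) tuple and that semantic association is decided by a caller-supplied predicate; both are assumptions, and the always-true predicate is only a placeholder.

```python
from itertools import product


def combine_session_pairs(initiations, replies, associated):
    """Keep every (initiation, reply) combination the predicate accepts."""
    return [(i, r) for i, r in product(initiations, replies) if associated(i, r)]


initiations = [
    ("simple", "How much is model X?"),
    ("sentence group", "Hello. How much is model X?"),
]
replies = [
    ("simple", "500 yuan per unit."),
    ("sentence group", "500 yuan per unit. Shipping is free."),
]
# Placeholder predicate: treat every combination as semantically associated.
session_pairs = combine_session_pairs(initiations, replies, lambda i, r: True)
```

With two initiation variants and two reply variants, this yields four session pairs from a single basic session pair.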
Further, merging the personal corpora of the multiple communication parties to obtain the shared corpus includes:
Combining the personal corpora of the multiple communication parties to obtain a combined corpus;
Merging, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, to obtain the shared corpus.
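The two merging steps can be sketched in Python as a concatenation followed by grouping on the initiation sentence. Representing each personal corpus as a list of (initiation, reply) tuples, and dropping duplicate replies, are simplifying assumptions made for this example.

```python
def merge_personal_corpora(personal_corpora):
    # Step 1: combine all personal corpora into one combined corpus.
    combined = [pair for corpus in personal_corpora for pair in corpus]
    # Step 2: like-term merge of pairs that share the same initiation sentence.
    shared = {}
    for initiation, reply in combined:
        replies = shared.setdefault(initiation, [])
        if reply not in replies:  # assumption: identical replies are stored once
            replies.append(reply)
    return shared


zhang_san = [("How much is model X?", "500 yuan per unit.")]
li_si = [
    ("How much is model X?", "500 yuan, with free shipping."),
    ("When can you deliver?", "Within three days."),
]
shared = merge_personal_corpora([zhang_san, li_si])
```

After merging, one initiation sentence can map to several reply sentences contributed by different communication parties.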
Further, after the shared corpus is obtained, the method also includes:
Judging whether a session pair in the shared corpus contains multiple reply sentences; if so, intelligently ranking the multiple reply sentences according to preset rules.
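The text does not state what the preset ranking rules are. As one possible rule, the Python sketch below ranks reply sentences by how often each was used, assuming a usage log is available; both the rule and the function name are assumptions.

```python
from collections import Counter


def rank_replies(reply_log):
    """Rank distinct replies by usage count, most frequent first.
    Replies with equal counts keep their order of first appearance."""
    return [reply for reply, _ in Counter(reply_log).most_common()]


reply_log = [
    "500 yuan per unit.",
    "500 yuan, with free shipping.",
    "500 yuan, with free shipping.",
]
ranked = rank_replies(reply_log)
```

Other rules, such as ranking by recency or by how well the scene tag values match, would fit the same interface.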
Further, matching, in the shared corpus, reply content that matches the current session content includes:
Collecting the session context tag values that correspond to the current session content and to the preset session context tags;
Matching, in the shared corpus, the reply sentence that corresponds to the current session content, the session context tags, and the session context tag values, and using it as the reply content.
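A minimal Python sketch of this matching step follows. It assumes shared corpus entries are dictionaries with "initiation", "reply", and "scene" keys, compares the current sentence to stored initiation sentences by exact equality, and scores candidates by the number of matching context tag values. All three choices are simplifications for illustration; a real system would likely use a softer text match.

```python
def match_reply(shared_corpus, current_sentence, context):
    """Return the reply whose scene tags best match the current session context."""
    candidates = [e for e in shared_corpus if e["initiation"] == current_sentence]
    if not candidates:
        return None

    def score(entry):
        # Count context tags whose value equals the value stored with the entry.
        return sum(1 for tag, value in context.items() if entry["scene"].get(tag) == value)

    return max(candidates, key=score)["reply"]


shared_corpus = [
    {"initiation": "How much is model X?", "reply": "500 yuan per unit.",
     "scene": {"relationship": "new client"}},
    {"initiation": "How much is model X?", "reply": "450 yuan for you, as usual.",
     "scene": {"relationship": "regular client"}},
]
reply = match_reply(shared_corpus, "How much is model X?", {"relationship": "regular client"})
```

The same question thus receives different replies depending on the collected context tag values.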
The invention has the following advantages:
The method of obtaining intelligent session reply content based on a shared corpus provided by the invention establishes personal corpora corresponding to communication parties, merges the personal corpora of multiple communication parties to obtain a shared corpus, and matches, in the shared corpus, reply content that matches the current session content. This solves the technical problem that session reply content obtained by matching against existing shared corpora is inaccurate. It not only reduces the workload of manually creating a shared corpus; the created shared corpus is also rich in content and varied in form, with higher practicality and intelligence, so that more accurate session reply content is obtained by matching against it.
In addition to the objects, features, and advantages described above, the invention has other objects, features, and advantages. The invention is described in further detail below with reference to the drawings.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method of obtaining intelligent session reply content based on a shared corpus according to a preferred embodiment of the invention;
Fig. 2 is a flowchart of the method of obtaining intelligent session reply content based on a shared corpus according to a simplified embodiment of the preferred embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below with reference to the drawings, but the invention can be implemented in the many different ways defined and covered by the claims.
Referring to Fig. 1, a preferred embodiment of the invention provides a method of obtaining intelligent session reply content based on a shared corpus, including:
Step S101: establishing personal corpora corresponding to communication parties, wherein the number of communication parties is more than one;
Step S102: merging the personal corpora of the multiple communication parties to obtain a shared corpus;
Step S103: matching, in the shared corpus, reply content that matches the current session content, and using that reply content as the session reply content corresponding to the current session content.
The method of obtaining intelligent session reply content based on a shared corpus provided by the embodiment of the invention establishes personal corpora corresponding to communication parties, merges the personal corpora of multiple communication parties to obtain a shared corpus, and matches, in the shared corpus, reply content that matches the current session content. This solves the technical problem that session reply content obtained by matching against existing shared corpora is inaccurate. It not only reduces the workload of manually creating a shared corpus; the created shared corpus is also rich in content and varied in form, with higher practicality and intelligence, so that more accurate session reply content is obtained by matching against it.
It should be noted that, because the present embodiment obtains the shared corpus by merging the personal corpora of multiple communication parties, the number of communication parties must be more than one when personal corpora are established; that is, personal corpora must be created for at least two communication parties.
Optionally, establishing a personal corpus corresponding to a communication party includes:
Collecting the session content of the communication party;
Obtaining the session pairs in the session content;
Collecting, according to preset scene tags, the scene tag value of each session pair for each scene tag;
Matching and combining the session pairs, the scene tags, and the scene tag values corresponding to the scene tags, thereby generating the personal corpus corresponding to the communication party.
By collecting the session content of the communication party, obtaining the session pairs in the session content, collecting the scene tag value of each session pair for each preset scene tag, and matching and combining the session pairs, scene tags, and corresponding scene tag values to generate the personal corpus corresponding to the communication party, the embodiment of the invention greatly reduces the workload of manually building a personal corpus. Moreover, because the personal corpus is generated from session pairs, scene tags, and their corresponding scene tag values, it can better simulate the real session context. The created shared corpus can in turn better simulate the real session context, so that more accurate session reply content is obtained from it.
It should be noted that the embodiment of the invention generates the personal corpus by matching and combining session pairs, scene tags, and the scene tag values corresponding to the scene tags; that is, the personal corpus is generated in the combined form "session pair + scene tag + scene tag value". Furthermore, different session content has different scene characteristics, such as the session topic, session intention, session time, session place, and the relationship between the two parties. Therefore, after obtaining the session pairs in the session content, the present embodiment collects, according to the preset scene tags, the scene tag value of each session pair for each scene tag, and matches and combines the session pairs, scene tags, and corresponding scene tag values to generate the personal corpus. The scene tags in the present embodiment are user-defined or obtained automatically. They can be, for example, one or more of the following: the session topic; the time, place, date, session intention, weather, season, sex, occupation, position, mood, hobbies, somatosensory data, health status, real-time behavior state, zodiac sign, and blood type of the two communicating parties; the relationship, age difference, and generational difference between the two parties; the interval, frequency, and duration of their sessions; the sentence pattern, sentence category, and sentence structure type of the session content; and total-quantity tags.
When collecting the scene tag value of a session pair for a scene tag, the present embodiment can use different methods, specifically: direct collection, for example the place scene tag value can be collected automatically through the GPS of a mobile terminal; inference, for example the scene tag value for the relationship between the two parties can be inferred from other scene tag values already obtained; computing word vectors associated with the session, for example the session intention scene tag value can be obtained by computing word vectors associated with the session; and neural network learning, for example the mood scene tag value can be obtained by feeding the session content or other acquired scene tag values into a trained classifier. The present embodiment can also obtain scene tag values automatically by combining one or more of the above methods.
Optionally, obtaining the session pairs in the session content includes:
Determining the initiation sentences and reply sentences in the session content according to the semantics of the session sentences in the session content;
Determining the types of the initiation sentences and reply sentences according to preset type judgment rules;
Extracting a basic session pair from an initiation sentence and the reply sentences located between that initiation sentence and the next initiation sentence;
Extracting at least one session pair according to the basic session pair and the types of the initiation sentence and reply sentence in the basic session pair.
Existing session pairs or question-and-answer pairs extracted from session content usually take a one-question, one-answer form. In actual sessions, however, the two communication parties do not strictly follow a one-question, one-answer pattern: the communication party may reply with several session sentences to a single session sentence sent by the other party, or reply with only one session sentence to several session sentences sent by the other party.
Therefore, if session pairs are extracted only in one-question, one-answer form, the following problems may arise:
(1) For session content that is not in one-question, one-answer form, extracting session pairs is difficult and imprecise. For example, for session content consisting of multiple initiation sentences followed by multiple reply sentences, extracting session pairs requires analyzing which reply sentences match each initiation sentence; the process is complex and difficult, and its precision is low.
(2) The question-and-answer pairs or session pairs that existing methods extract from session content are generally relatively standard session sentences, or session sentences with relatively simple structure, so for session sentences with complex or non-standard structure they cannot precisely extract session pairs with good integrity and high practicality.
(3) In addition, the integrity of session pairs extracted in one-question, one-answer form is easily damaged, so the extracted session pairs cannot accurately simulate real sessions. In view of the above problems, the invention proposes a method of extracting session pairs from session content according to the types of initiation sentences and reply sentences.
To address these problems, the present embodiment determines the initiation sentences and reply sentences in the session content according to the semantics of the session sentences; determines the types of the initiation sentences and reply sentences according to preset type judgment rules; extracts a basic session pair from an initiation sentence and the reply sentences between that initiation sentence and the next initiation sentence; and extracts at least one session pair according to the basic session pair and the types of its initiation sentence and reply sentence. This solves the technical problem that extracting session pairs in the prior art is difficult and imprecise, and breaks the limitation of the traditional one-question, one-answer session pair form. Working from the types of initiation sentences and reply sentences, session pairs can be extracted quickly and effectively, and the precision and accuracy of the extracted session pairs are also greatly improved. In addition, for session sentences with complex or non-standard structure, the embodiment of the invention can precisely extract session pairs with good integrity and high practicality, so that the extracted session pairs accurately simulate real sessions, with a higher degree of intelligence. Furthermore, the session pairs extracted by the embodiment are varied in form, which helps to match intelligent reply content precisely on the basis of the session pairs and to obtain intelligent reply content in varied forms, giving higher practicality.
It should be noted that, before determining the types of initiation sentences and reply sentences, the present embodiment first presets the types of initiation sentences and reply sentences and the type judgment rule corresponding to each type, so that the types can be determined quickly according to the preset type judgment rules.
The present embodiment can obtain session content by collecting the session content of the communication party's instant messaging account, email account, microblog account, or mobile phone number. The session content is in text, picture, voice, video, or animation form; when it is in voice, picture, video, or animation form, the method also includes converting it into session content in text form.
Optionally, determining the initiation sentences and reply sentences in the session content according to the semantics of the session sentences in the session content includes:
Judging whether, within a preset time interval, a session sentence in the session content has preceding text sent by the other communication party; if not, the session sentence is determined to be an initiation sentence;
If so, judging whether the session sentence is semantically unrelated to the preceding text sent by the other communication party; if so, the session sentence is determined to be an initiation sentence; otherwise the session sentence is determined to be a reply sentence.
To extract the session pairs in the session content precisely, the present embodiment first determines the initiation sentences and reply sentences in the session content according to the semantics of the session sentences, and then determines their types, so that session pairs are extracted precisely according to those types. The specific process of determining the initiation sentences and reply sentences according to the semantics of the session sentences is as follows: judge whether, within the preset time interval, a session sentence in the session content has preceding text sent by the other communication party; if not, the session sentence is determined to be an initiation sentence; if so, judge whether the session sentence is semantically unrelated to the preceding text sent by the other party; if so, the session sentence is determined to be an initiation sentence; otherwise it is determined to be a reply sentence.
In an actual session, if the current session sentence has no preceding text sent by the other communication party within the preset time interval, it is generally regarded as the initial sentence that starts a session, that is, an initiation sentence. For example, suppose the current session sentence was sent on December 3, the previous session sentence was sent by the other communication party on December 1, and the preset time interval is 1 day. The judgment then finds that the current session sentence has no preceding text sent by the other party within the preset time interval, so the current session sentence is regarded as the initial sentence that starts a session; that is, it is judged to be an initiation sentence. The preset time interval of the present embodiment is user-defined and can be, for example, 1 hour, half a day, one day, or one month; that is, when the current session sentence is found to have no preceding text sent by the other party within 1 hour, half a day, one day, or one month, it is judged to be an initiation sentence.
In addition, when a session sentence does have preceding text sent by the other communication party, actual session content shows that the session sentence may be a reply sentence that answers that preceding text; it may not answer the preceding text at all but instead be an initiation sentence that starts a new session; or it may at the same time be a reply sentence that answers the preceding text and an initiation sentence that starts a new session. For these cases, the present embodiment determines the type of the session sentence by judging whether it is semantically unrelated to the preceding text sent by the other party. It should be noted that, in the present embodiment, whether a session sentence is semantically unrelated to the preceding text sent by the other party specifically means whether the session sentence contains a sentence that is semantically unrelated to that preceding text.
For example, suppose a session sentence has preceding text sent by the other communication party, and the preceding text sent by the other party A is "How have you been lately?". In the first case (communication party B: "Pretty good"), the session sentence contains no sentence that is semantically unrelated to the preceding text, so it is determined to be a reply sentence. In the second case (communication party B: "Can you help me pay my phone bill?"), the session sentence contains a sentence that is semantically unrelated to the preceding text, so it is determined to be an initiation sentence. In the third case (communication party B: "Pretty good, can you help me pay my phone bill?"), the session sentence likewise contains a sentence that is semantically unrelated to the preceding text ("Can you help me pay my phone bill?"), so it is determined to be an initiation sentence.
By judging whether a session sentence in the session content is preceded by a message from the other party within the preset time interval and, when it is, whether the session sentence has no semantic association with that preceding message, the present embodiment can accurately determine the initiation sentences and reply sentences in the session content. This lays the foundation for subsequently extracting session pairs accurately from the determined initiation and reply sentences and for building the personal corpus from the extracted session pairs.
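The judgment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Message` record, the function name `classify_sentences`, and the `has_unrelated_sentence` predicate are hypothetical, and the text does not specify how semantic association is actually computed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    sender: str       # e.g. "A" or "B"
    text: str
    timestamp: float  # seconds; the unit is an assumption


def classify_sentences(
    messages: List[Message],
    interval: float,
    has_unrelated_sentence: Callable[[str, str], bool],
) -> List[str]:
    """Label every message as "initiation" or "reply".

    has_unrelated_sentence(text, preceding_text) is a hypothetical predicate:
    it returns True when `text` contains a sentence with no semantic
    association with the other party's preceding message.
    """
    labels = []
    for i, msg in enumerate(messages):
        # Earlier messages from the other party inside the preset interval.
        preceding = [
            m for m in messages[:i]
            if m.sender != msg.sender
            and msg.timestamp - m.timestamp <= interval
        ]
        if not preceding:
            labels.append("initiation")
        elif has_unrelated_sentence(msg.text, preceding[-1].text):
            labels.append("initiation")
        else:
            labels.append("reply")
    return labels
```

With a toy predicate that flags any text mentioning the phone bill, "Pretty good" is labelled a reply to "How are you getting along recently?", while "Pretty good, help me pay the phone bill" is labelled an initiation sentence.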
Optionally, determining the type of the initiation sentence according to the preset type judgment rule includes:
Judging whether the initiation sentence is a sentence with complete and independent semantics. If so, judging whether the initiation sentence is composed of multiple simple sentences that each have complete and independent semantics; if so, determining the type of the initiation sentence to be the complex-sentence initiation sentence type, and otherwise the simple-sentence initiation sentence type. If not, judging whether the initiation sentence contains a simple sentence with complete and independent semantics; if it does, determining the type of the initiation sentence to be the non-standard complex-sentence initiation sentence type, and if it does not, the non-standard simple-sentence initiation sentence type;
Searching whether an initiation sentence of the non-standard simple-sentence initiation sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judging whether that initiation sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extending the type of that initiation sentence to the non-standard sentence-group initiation sentence type, and if it cannot, performing no derivative extension;
Searching whether an initiation sentence of the non-standard complex-sentence initiation sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judging whether that initiation sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extending the type of that initiation sentence to the non-standard sentence-group initiation sentence type, and if it cannot, performing no derivative extension;
Judging whether an initiation sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type has session sentences of its own that immediately precede and follow it. If so, further judging whether the initiation sentence can be merged with those contiguous session sentences into a semantically associated sentence group; if so, derivatively extending the type of the initiation sentence to the sentence-group initiation sentence type, and otherwise performing no derivative extension.
In actual implementation, an initiation sentence may appear in many forms, for example as a simple sentence, a complex sentence, or a non-standard sentence, and different types of initiation sentence may lead to different extracted session pairs. To address this, the present embodiment determines the type of the initiation sentence according to the preset type judgment rule. Specifically, when the initiation sentence has complete and independent semantics, the embodiment first judges whether it consists of one or of multiple simple sentences with complete and independent semantics, and thereby determines whether it is of the simple-sentence or the complex-sentence initiation sentence type. When the initiation sentence does not have complete and independent semantics, the embodiment judges whether it contains a simple sentence with complete and independent semantics, and thereby determines whether it is of the non-standard complex-sentence or the non-standard simple-sentence initiation sentence type. Next, by searching whether an initiation sentence of the non-standard simple-sentence or non-standard complex-sentence initiation sentence type has session sentences of its own that immediately precede and follow it, and whether it can be merged with them into a sentence with complete and independent semantics, the embodiment determines whether to derivatively extend its type to the non-standard sentence-group initiation sentence type. Finally, by judging whether an initiation sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type has contiguous preceding and following session sentences of its own, the embodiment determines whether its type can be derivatively extended to the sentence-group initiation sentence type.
Specifically, determining the type of an initiation sentence in the present embodiment is essentially divided into three discrimination processes. The first discrimination process classifies each initiation sentence, one by one, into one of four initiation sentence types (simple sentence, complex sentence, non-standard simple sentence, and non-standard complex sentence). The second discrimination process, performed after the first, determines whether an initiation sentence of the non-standard simple-sentence or non-standard complex-sentence initiation sentence type can be further derivatively extended to the non-standard sentence-group initiation sentence type. The third discrimination process, performed after the second, determines whether an initiation sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type can be further derivatively extended to the sentence-group initiation sentence type.
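The first discrimination process can be sketched as a small decision function. This is only an illustration: the two inputs (whether the sentence has complete and independent semantics, and how many complete simple sentences it contains) are assumed to come from an upstream language-analysis step that the text does not specify, and the names `SentenceType` and `first_discrimination` are hypothetical.

```python
from enum import Enum


class SentenceType(Enum):
    SIMPLE = "simple sentence"
    COMPLEX = "complex sentence"
    NON_STANDARD_SIMPLE = "non-standard simple sentence"
    NON_STANDARD_COMPLEX = "non-standard complex sentence"


def first_discrimination(is_complete: bool, complete_simple_count: int) -> SentenceType:
    """Assign one of the four base types to an initiation sentence.

    is_complete: the whole sentence has complete and independent semantics.
    complete_simple_count: number of simple sentences inside it that have
    complete and independent semantics (both values are assumed inputs).
    """
    if is_complete:
        # Several complete simple sentences make a complex sentence.
        if complete_simple_count > 1:
            return SentenceType.COMPLEX
        return SentenceType.SIMPLE
    # Incomplete overall, but it may still contain a complete simple sentence.
    if complete_simple_count >= 1:
        return SentenceType.NON_STANDARD_COMPLEX
    return SentenceType.NON_STANDARD_SIMPLE
```

The second and third discrimination processes would then run on the result, because they only widen the set of types and never change the base type assigned here.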
By determining the type of the initiation sentence, the present embodiment on the one hand facilitates in-depth analysis of the sentence structure and composition of the initiation sentence. On the other hand, performing type judgment and structural analysis on the initiation sentence helps to extract more accurately session pairs that are highly practical and varied in form, which improves the build quality of the shared corpus established from the extracted session pairs and allows more accurate session reply content to be obtained by matching against the created shared corpus. It should be noted that, in the present embodiment, whether an initiation sentence has contiguous preceding and following session sentences of its own specifically means whether there are contiguous preceding and following session sentences sent by the sender of that initiation sentence.
Optionally, determining the type of the reply sentence according to the preset type judgment rule includes:
Judging whether the reply sentence is a sentence with complete and independent semantics. If so, judging whether the reply sentence is composed of multiple simple sentences that each have complete and independent semantics; if so, determining the type of the reply sentence to be the complex-sentence reply sentence type, and otherwise the simple-sentence reply sentence type. If not, judging whether the reply sentence contains a simple sentence with complete and independent semantics; if it does, determining the type of the reply sentence to be the non-standard complex-sentence reply sentence type, and if it does not, the non-standard simple-sentence reply sentence type;
Searching whether a reply sentence of the non-standard simple-sentence reply sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judging whether that reply sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extending the type of that reply sentence to the non-standard sentence-group reply sentence type, and if it cannot, performing no derivative extension;
Searching whether a reply sentence of the non-standard complex-sentence reply sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judging whether that reply sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extending the type of that reply sentence to the non-standard sentence-group reply sentence type, and if it cannot, performing no derivative extension;
Judging whether a reply sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type has session sentences of its own that immediately precede and follow it. If so, further judging whether the reply sentence can be merged with those contiguous session sentences into a semantically associated sentence group; if so, derivatively extending the type of the reply sentence to the sentence-group reply sentence type, and otherwise performing no derivative extension.
The principle and process by which the present embodiment judges the type of a reply sentence are essentially the same as those for judging the type of an initiation sentence, and are therefore not described again in detail. By determining the type of the reply sentence, the present embodiment on the one hand facilitates in-depth analysis of the sentence structure and composition of the reply sentence. On the other hand, performing type judgment and structural analysis on the reply sentence helps to extract more accurately session pairs that are highly practical and varied in form, which improves the build quality of the shared corpus established from the extracted session pairs and allows more accurate session reply content to be obtained by matching against the created shared corpus. It should be noted that, in the present embodiment, whether a reply sentence has contiguous preceding and following session sentences of its own specifically means whether there are contiguous preceding and following session sentences sent by the sender of that reply sentence.
Optionally, extracting at least one session pair according to the basic session pair, the type of the initiation sentence in the basic session pair, and the type of the reply sentence in the basic session pair includes:
Derivatively extending the type of the initiation sentence in the basic session pair to obtain initiation sentences of multiple types;
Derivatively extending the type of the reply sentence in the basic session pair to obtain reply sentences of multiple types;
Combining and extracting at least one semantically associated session pair according to the initiation sentences of multiple types and the reply sentences of multiple types.
In the present embodiment, initiation sentences and reply sentences each come in several types: the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, non-standard sentence-group, and sentence-group initiation sentence types, and the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, non-standard sentence-group, and sentence-group reply sentence types. Therefore, after a basic session pair has been extracted, in order to extract highly practical and varied session pairs more precisely, the present embodiment first derivatively extends the type of the initiation sentence in the basic session pair to obtain initiation sentences of multiple types, then derivatively extends the type of the reply sentence in the basic session pair to obtain reply sentences of multiple types, and finally combines and extracts at least one semantically associated session pair from the initiation sentences of multiple types and the reply sentences of multiple types, so that multiple session pairs can be obtained by combination.
For example, assume the initiation sentence is of the complex-sentence initiation sentence type and the reply sentence is of the complex-sentence reply sentence type. After derivative extension of the types, session pairs of various forms can be extracted, such as simple-sentence initiation sentence + simple-sentence reply sentence, complex-sentence initiation sentence + simple-sentence reply sentence, simple-sentence initiation sentence + complex-sentence reply sentence, and complex-sentence initiation sentence + complex-sentence reply sentence.
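The combination step can be pictured as a filtered Cartesian product of the derived initiation sentences and the derived reply sentences. The sketch below is an assumption-laden illustration: `combine_session_pairs` and the `is_associated` predicate are hypothetical names, and the semantic-association check is passed in because the text does not say how it is computed.

```python
from itertools import product
from typing import Callable, List, Tuple


def combine_session_pairs(
    initiation_variants: List[str],
    reply_variants: List[str],
    is_associated: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Pair every derived initiation sentence with every derived reply
    sentence and keep only the semantically associated combinations."""
    return [
        (initiation, reply)
        for initiation, reply in product(initiation_variants, reply_variants)
        if is_associated(initiation, reply)
    ]


# Two derived forms on each side, with every combination accepted.
pairs = combine_session_pairs(
    ["simple-sentence initiation", "complex-sentence initiation"],
    ["simple-sentence reply", "complex-sentence reply"],
    lambda initiation, reply: True,
)
print(len(pairs))  # 4
```

Two forms on each side yield the four combinations listed in the example above; a stricter predicate would discard some of them.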
Optionally, merging the personal corpora of multiple communication parties to obtain the shared corpus includes:
Combining the personal corpora of the multiple communication parties to obtain a combined corpus;
Merging, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, to obtain the shared corpus.
The personal corpus of each communication party created in the present embodiment consists entirely of session pairs, that is, of session initiation sentences and their corresponding session reply sentences. Therefore, when merging the personal corpora of multiple communication parties to obtain the shared corpus, the present embodiment first combines the personal corpora to obtain a combined corpus, and then merges, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, to obtain the shared corpus.
It should be noted that merging, as like terms, the session pairs in the combined corpus that contain the same initiation sentence means merging the reply sentences of those session pairs. For example, assume the personal corpus of communication party A includes the session pair {initiation sentence: "How are you getting along recently?" / reply sentence: "Pretty good"}, and the personal corpus of communication party B includes the session pair {initiation sentence: "How are you getting along recently?" / reply sentence: "Same as usual"}. After the two personal corpora are combined, the session pairs in the combined corpus that contain the same initiation sentence ("How are you getting along recently?") are merged as like terms into {initiation sentence: "How are you getting along recently?" / reply sentence 1: "Pretty good"; reply sentence 2: "Same as usual"}.
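The like-term merge can be sketched with a dictionary keyed by initiation sentence. This is a minimal sketch that assumes a personal corpus is a list of (initiation sentence, reply sentence) tuples and that "the same initiation sentence" means an exact string match; both are simplifications, and `merge_corpora` is a hypothetical name.

```python
from typing import Dict, List, Tuple

SessionPair = Tuple[str, str]  # (initiation sentence, reply sentence)


def merge_corpora(*personal_corpora: List[SessionPair]) -> Dict[str, List[str]]:
    """Combine personal corpora and merge pairs with the same initiation
    sentence, keeping each distinct reply sentence once, in first-seen order."""
    shared: Dict[str, List[str]] = {}
    for corpus in personal_corpora:
        for initiation, reply in corpus:
            replies = shared.setdefault(initiation, [])
            if reply not in replies:
                replies.append(reply)
    return shared


corpus_a = [("How are you getting along recently?", "Pretty good")]
corpus_b = [("How are you getting along recently?", "Same as usual")]
shared = merge_corpora(corpus_a, corpus_b)
print(shared)
# {'How are you getting along recently?': ['Pretty good', 'Same as usual']}
```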
By merging, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, the present embodiment obtains a streamlined shared corpus, which helps subsequent fast matching of intelligent session reply content against the shared corpus. In addition, the present embodiment can also merge, as like terms, the session pairs in the combined corpus that contain the same reply sentence, which likewise yields a streamlined shared corpus and helps subsequent fast matching of intelligent session reply content. For example, the three initiation sentences "Where is your company?", "How do I get to your company?", and "May I ask the interview address?" all have the same reply sentence: "Changsha Yuelu District Tongzi slope collection openings for the able and the worthy Changsha foreign student undertaking area opposite."
Optionally, after obtaining the shared corpus, the method further includes:
Judging whether a session pair in the shared corpus contains multiple reply sentences and, if so, intelligently ranking the multiple reply sentences according to a preset rule.
After the present embodiment merges, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, a session pair may include multiple reply sentences for the same initiation sentence. To address this, after obtaining the shared corpus the present embodiment further judges whether a session pair in the shared corpus contains multiple reply sentences and, if so, intelligently ranks the multiple reply sentences according to a preset rule, so that a better-matching reply sentence can subsequently be obtained quickly from the shared corpus.
It should be noted that the present embodiment can intelligently rank the multiple reply sentences according to a preset rule, for example a rule based on the usage frequency, usage habits, usage preferences, or chronological order of use of the reply sentences.
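As one possible preset rule, the sketch below ranks reply sentences by usage frequency and breaks ties by most recent use. The usage log, the tie-breaking choice, and the name `rank_replies` are assumptions made for illustration; the text names these criteria but does not fix how they are combined.

```python
from collections import Counter
from typing import List


def rank_replies(usage_log: List[str]) -> List[str]:
    """Rank reply sentences by how often they were used, most frequent first.

    usage_log lists the reply sentences in the order they were used (oldest
    first). Ties are broken in favour of the more recently used reply.
    """
    counts = Counter(usage_log)
    # Index of the last occurrence of each reply in the log.
    last_used = {reply: index for index, reply in enumerate(usage_log)}
    return sorted(counts, key=lambda reply: (-counts[reply], -last_used[reply]))


print(rank_replies(["Pretty good", "Same as usual", "Pretty good"]))
# ['Pretty good', 'Same as usual']
```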
Optionally, matching, in the shared corpus, the reply content that matches the current session content includes:
Collecting the session scene tag values that correspond to the current session content and to the preset session scene tags;
Matching, in the shared corpus, the reply sentence that corresponds to the current session content, the session scene tags, and the session scene tag values, and using it as the reply content.
Because the shared corpus automatically established by the present embodiment consists of "session pair + scene tag + scene tag value", when matching the reply content for the current session content against the shared corpus, the present embodiment first collects the session scene tag values that correspond to the current session content and to the preset session scene tags, and then matches, in the shared corpus, the reply sentence that corresponds to the current session content, the session scene tags, and the session scene tag values, using it as the reply content.
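A tag-aware lookup along these lines could look like the sketch below. It rests on assumptions the text does not state: corpus entries are dictionaries with `initiation`, `reply`, and `tags` keys, the current session content is matched to an initiation sentence by exact string comparison, and candidates are scored by the number of scene tag values they share with the current scene. The name `match_reply` is hypothetical.

```python
from typing import Dict, List, Optional


def match_reply(
    shared_corpus: List[dict],
    current_text: str,
    scene_tags: Dict[str, str],
) -> Optional[str]:
    """Return the reply whose entry matches the current text and agrees with
    the most scene tag values; None when no entry matches the text."""
    best_reply, best_score = None, -1
    for entry in shared_corpus:
        if entry["initiation"] != current_text:
            continue
        # Count scene tags whose collected value equals the stored value.
        score = sum(
            1 for tag, value in scene_tags.items()
            if entry["tags"].get(tag) == value
        )
        if score > best_score:
            best_reply, best_score = entry["reply"], score
    return best_reply
```

For two entries that share the initiation sentence "Where is your company?" but carry different values for a relationship tag, the entry whose tag value equals the collected value wins.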
Optionally, the preset scene tags include first scene tags and second scene tags, wherein:
The first scene tags include one or a combination of the following scene tags: the time, place, date, weather, season, and somatosensory data of the two parties to the session, and the interval, frequency, and duration of session communication between the two parties;
The second scene tags include one or a combination of the following scene tags: the session content theme; the session intention, sex, occupation, position, mood, hobbies, health status, and real-time behavior state of the two parties to the session; the sentence pattern, sentence class, and sentence structure type of the session content; and the total amount.
The method of the present invention for obtaining intelligent session reply content based on a shared corpus is further described below with a simplified embodiment.
Referring to FIG. 2, the method for obtaining intelligent session reply content based on a shared corpus provided by the simplified embodiment of the present invention includes:
Step S201: establish personal corpora corresponding to the communication parties, where there is more than one communication party.
Specifically, assume the communication parties in the present embodiment include communication party A1 and communication party A2. Because the method and process for establishing a personal corpus are the same for different communication parties, the present embodiment describes in detail only the establishment of the personal corpus for one of them, communication party A1. Specifically, the method by which the present embodiment establishes a personal corpus for communication party A1 includes:
Step S2001: collect the session content of the communication party.
Specifically, assume the session content collected in the present embodiment is the content of sessions held with the other party B through the instant messaging account, email account, microblog account, or mobile phone number of communication party A1. The session content may be in text, picture, voice, video, or animation form; when it is in voice, picture, video, or animation form, the method further includes converting it into session content in text format. To describe in detail how the present embodiment extracts session pairs from session content, a simple session between communication party A1 and the other party B is used as an illustration, as follows:
A1: Have you eaten?
B: Yes, I have.
B: And you?
A1: Help me pay the phone
A1: bill
B: I paid 100 yuan in total.
B: There were so many people in the queue.
Step S2002: judge whether a session sentence in the session content is preceded, within the preset time interval, by a message sent by the other party; if not, determine the session sentence to be an initiation sentence;
If so, judge whether the session sentence has no semantic association with the preceding message sent by the other party; if so, determine the session sentence to be an initiation sentence, and otherwise determine it to be a reply sentence.
Specifically, the initiation sentences and reply sentences in the session content can be determined according to the above judgment rule. Assume the initiation sentences and reply sentences obtained by this judgment in the present embodiment are as shown in Table 1.
Table 1
Initiation sentence | Reply sentence |
Have you eaten? | Yes, I have. |
And you? | I paid 100 yuan in total. |
Help me pay the phone | There were so many people in the queue. |
bill | |
Step S2003: judge whether the initiation sentence is a sentence with complete and independent semantics. If so, judge whether the initiation sentence is composed of multiple simple sentences that each have complete and independent semantics; if so, determine the type of the initiation sentence to be the complex-sentence initiation sentence type, and otherwise the simple-sentence initiation sentence type. If not, judge whether the initiation sentence contains a simple sentence with complete and independent semantics; if it does, determine the type of the initiation sentence to be the non-standard complex-sentence initiation sentence type, and if it does not, the non-standard simple-sentence initiation sentence type;
Search whether an initiation sentence of the non-standard simple-sentence initiation sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judge whether that initiation sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extend the type of that initiation sentence to the non-standard sentence-group initiation sentence type, and if it cannot, perform no derivative extension;
Search whether an initiation sentence of the non-standard complex-sentence initiation sentence type has session sentences of its own that immediately precede and follow it. If not, no derivative extension is performed. If so, further judge whether that initiation sentence can be merged with those contiguous session sentences into a sentence with complete and independent semantics; if it can, derivatively extend the type of that initiation sentence to the non-standard sentence-group initiation sentence type, and if it cannot, perform no derivative extension;
Judge whether an initiation sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type has session sentences of its own that immediately precede and follow it. If so, further judge whether the initiation sentence can be merged with those contiguous session sentences into a semantically associated sentence group; if so, derivatively extend the already determined type of the initiation sentence to the sentence-group initiation sentence type, and otherwise perform no derivative extension.
Specifically, assume that the first discrimination process in step S2003 judges the types of the initiation sentences to be as shown in Table 2.
Table 2
No. | Initiation sentence | Type |
First initiation sentence | Have you eaten? | Simple sentence |
Second initiation sentence | And you? | Simple sentence |
Third initiation sentence | Help me pay the phone | Non-standard simple sentence |
Fourth initiation sentence | bill | Non-standard simple sentence |
Then the second discrimination process in step S2003 is performed: by judging whether an initiation sentence of the non-standard simple-sentence or non-standard complex-sentence initiation sentence type has contiguous preceding and following session sentences of its own, and whether it can be merged with them into a sentence with complete and independent semantics, it is determined whether to derivatively extend the type of the non-standard simple-sentence or non-standard complex-sentence initiation sentence to the non-standard sentence-group initiation sentence type. By this judgment, the third and fourth initiation sentences of the present embodiment can be merged into a sentence with complete and independent semantics, so the types of the third and fourth initiation sentences can be derivatively extended to the non-standard sentence-group initiation sentence type, as shown in Table 3.
Table 3
Finally, the third discrimination process in step S2003 is performed: judging whether an initiation sentence of the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, or non-standard sentence-group type can be further derivatively extended to the sentence-group initiation sentence type.
Specifically, as can be seen from Table 3, the present embodiment cannot further merge the initiation sentences into a semantically associated sentence group; that is, in the last process no further derivative extension is performed on the initiation sentences. The final types of the initiation sentences are therefore as shown in Table 3.
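The second discrimination process in this example, merging contiguous non-standard fragments from one sender into a non-standard sentence group, can be sketched as follows. The sketch simplifies the rule: it only tries to merge a fragment with the sentence that follows it, it replaces the fragments with the merged group rather than keeping both forms, and the completeness check `is_complete` is a hypothetical stand-in for the semantic judgment that the text leaves unspecified.

```python
from typing import Callable, List, Tuple

NON_STANDARD = {"non-standard simple", "non-standard complex"}


def second_discrimination(
    run: List[Tuple[str, str]],
    is_complete: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Merge adjacent non-standard sentences into sentence groups.

    run holds (text, type) tuples for contiguous sentences sent by the same
    party. A non-standard sentence is merged with its successor when the
    joined text is judged to have complete and independent semantics.
    """
    result = []
    i = 0
    while i < len(run):
        text, kind = run[i]
        if kind in NON_STANDARD and i + 1 < len(run):
            merged = text + " " + run[i + 1][0]
            if is_complete(merged):
                result.append((merged, "non-standard sentence group"))
                i += 2  # both fragments are consumed by the group
                continue
        result.append((text, kind))
        i += 1
    return result
```

Given two illustrative fragments such as "Help me pay the phone" and "bill", and a check that accepts their concatenation, the function returns a single entry of the non-standard sentence-group type.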
Step S2004: determine the type of the reply sentence according to the preset type judgment rule.
The principle and process by which the present embodiment determines the type of a reply sentence are essentially the same as those for determining the type of an initiation sentence, and are therefore not described again in detail. Assume the types of the reply sentences judged by the present embodiment are as shown in Table 4.
Table 4
Step S2005: extract a basic session pair from an initiation sentence and the reply sentence between that initiation sentence and the next initiation sentence.
Specifically, when the present embodiment extracts a session pair for the first initiation sentence, it first judges whether there is a reply sentence between the first initiation sentence and the next initiation sentence; if so, a basic session pair is extracted from that initiation sentence and that reply sentence. Because there is a reply sentence between the first and second initiation sentences, a basic session pair is extracted from the first initiation sentence and the reply sentence. It should be noted that, after determining that there is a reply sentence between an initiation sentence and the next initiation sentence, the present embodiment also needs to calculate whether the initiation sentence and the reply sentence are semantically associated; the basic session pair is extracted only if they are, and is not extracted otherwise. Assuming the first initiation sentence and the first reply sentence are semantically associated, a basic session pair can be extracted; it is referred to as basic session pair 1, and its content is shown in Table 5.
Similarly, when the present embodiment extracts a basic session pair for the second initiation sentence, it first judges whether there is a reply sentence between the second and third initiation sentences. This judgment shows that there is no reply sentence between the second and third initiation sentences, so the second initiation sentence is abandoned as an initiation sentence. Similarly, assume that a semantically associated basic session pair 2 can be extracted from the third and fourth initiation sentences; its content is shown in Table 5.
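The extraction rule of step S2005 can be sketched as a single pass over the labelled sentences. The sketch assumes that fragments have already been merged into sentence groups, that all reply sentences before the next initiation sentence are collected together, and that semantic association is decided by a hypothetical `is_associated` predicate; the function name `extract_basic_pairs` is likewise an assumption.

```python
from typing import Callable, List, Tuple


def extract_basic_pairs(
    labelled: List[Tuple[str, str]],
    is_associated: Callable[[str, List[str]], bool],
) -> List[Tuple[str, List[str]]]:
    """Extract basic session pairs from (label, text) tuples in session order.

    An initiation sentence forms a pair with the reply sentences that appear
    before the next initiation sentence; it is dropped when there are none or
    when they are not semantically associated with it.
    """
    pairs = []
    i = 0
    while i < len(labelled):
        label, text = labelled[i]
        if label != "initiation":
            i += 1
            continue
        # Collect the replies that follow, up to the next initiation sentence.
        j = i + 1
        replies = []
        while j < len(labelled) and labelled[j][0] == "reply":
            replies.append(labelled[j][1])
            j += 1
        if replies and is_associated(text, replies):
            pairs.append((text, replies))
        i = j
    return pairs
```

Applied to the example session, the first initiation sentence is paired with its reply, the second is dropped because another initiation sentence follows it directly, and the merged third and fourth initiation sentences are paired with the two replies that follow.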
Table 5
Step S2006: derivatively extend the type of the initiation sentence in the basic session pair to obtain initiation sentences of multiple types.
Specifically, there are six types of initiation sentence in the present embodiment, namely the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, non-standard sentence-group, and sentence-group initiation sentence types. The present embodiment therefore first performs derivative extension according to the type of the initiation sentence in the basic session pair. Because the initiation sentence in basic session pair 1 is of the simple-sentence initiation sentence type, it cannot be further derivatively extended to the other five initiation sentence types, so only one type of initiation sentence is included, namely an initiation sentence of the simple-sentence initiation sentence type, as shown in Table 6. By contrast, the type of the initiation sentence in basic session pair 2 can be further derivatively extended to other types of initiation sentence, such as the simple-sentence initiation sentence type, as shown in Table 6.
Table 6
Step S2007: derivatively extend the type of the reply sentence in the basic session pair to obtain reply sentences of multiple types.
Specifically, there are six types of reply sentence in the present embodiment, namely the simple-sentence, complex-sentence, non-standard simple-sentence, non-standard complex-sentence, non-standard sentence-group, and sentence-group reply sentence types. The present embodiment therefore first performs derivative extension according to the type of the reply sentence in the basic session pair. Because the reply sentence in basic session pair 1 is of the simple-sentence reply sentence type, it cannot be further derivatively extended to the other five reply sentence types, so only one type of reply sentence is included, namely a reply sentence of the simple-sentence reply sentence type, as shown in Table 7. By contrast, the type of the reply sentence in basic session pair 2 can be further derivatively extended to other types of reply sentence, such as the complex-sentence reply sentence type, as shown in Table 7.
Table 7
Step S2008: combine and extract at least one semantically associated session pair according to the initiation sentences of multiple types and the reply sentences of multiple types.
Specifically, there was only one kind due to 1, initiating sentence for basic session and replying the type of sentence, so when can only carry
A session pair is taken, and is directed to basic session to 2, be various due to initiating the type of sentence and the type of complex sentence, therefore can be combined and obtain
Multiple sessions pair are obtained, 8 are specifically shown in Table, table 8 is to 26 sessions pair extracted according to basic session.
Table 8
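As a rough illustration of the combination in step S2008, the sketch below pairs every initiation-sentence variant with every reply-sentence variant. It assumes that all variants of one basic session pair remain semantically associated with each other, which the text requires but does not say how to check; the function name and the data layout are hypothetical, and the example data are not the contents of Table 8.

```python
from itertools import product


def combine_session_pairs(initiation_variants, reply_variants):
    """Pair every initiation variant with every reply variant.

    Each variant is a (type_name, text) tuple. The semantic-association
    check is assumed to have been done already and is not modeled.
    """
    return [
        {"initiation": initiation, "reply": reply}
        for initiation, reply in product(initiation_variants, reply_variants)
    ]
```

With two initiation variants and three reply variants, for instance, this yields six session pairs; with a single variant on each side, as for basic session pair 1, it yields exactly one.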
Step S2009: according to the preset scene tags, collect the scene tag values that correspond to each session pair and to the scene tags.
Specifically, when collecting the scene tag values that correspond to a session pair and to the preset scene tags, the present embodiment first presets the scene tags and then collects, for each session pair, the scene tag values corresponding to those preset tags. Assuming the preset scene tags comprise some combination of session content topic, session intent, place, weather, the relationship between the communicating parties, and the age and occupation of the communication counterpart, the scene tag values corresponding to each session pair can be collected; see Table 9. Note that, because session pairs 1 to 6 are derived and extended from basic session pair 2, their scene tag values are identical to those of basic session pair 2. In addition, the present embodiment may set different scene tags for different session pairs, and the number of scene tags set may also differ.
Table 9
Step S2010: match and combine the session pairs, the scene tags, and the scene tag values corresponding to the scene tags, thereby generating a personal exclusive corpus.
Specifically, the present embodiment matches and combines the session pairs, the scene tags, and the corresponding scene tag values to generate the personal exclusive corpus; that is, the personal exclusive corpus of communication party A1 is generated according to the content combination rule "session pair + scene tag + scene tag value".
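One way to picture the "session pair + scene tag + scene tag value" rule is a list of records, one per session pair. The sketch below is an assumption about the data layout, not the patent's storage format; `build_personal_corpus` and the `collect_value` callback are hypothetical names, and the callback stands in for however tag values are actually gathered.

```python
def build_personal_corpus(session_pairs, scene_tags, collect_value):
    """Attach a value for every preset scene tag to every session pair.

    session_pairs: iterable of (initiation_sentence, reply_sentence) tuples.
    scene_tags: preset tag names, e.g. "place" or "weather".
    collect_value(pair, tag): hypothetical callback returning the tag value.
    """
    corpus = []
    for initiation, reply in session_pairs:
        scene = {tag: collect_value((initiation, reply), tag) for tag in scene_tags}
        corpus.append({"initiation": initiation, "reply": reply, "scene": scene})
    return corpus
```

Keeping the scene tags in a per-record dictionary allows different session pairs to carry different tags, which the embodiment explicitly permits.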
Step S202: merge the personal corpora of the multiple communication parties to obtain a shared corpus.
Specifically, the present embodiment builds the personal corpus of communication party A2 by the same method and process used for communication party A1. The personal corpora of communication parties A1 and A2 are then merged as follows: first, the two personal corpora are combined to obtain a combined corpus; then, session pairs in the combined corpus that contain the same initiation sentence are merged as like terms to obtain the shared corpus.
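The two-stage merge (combine, then merge like terms) might look roughly like the following. This is a sketch under stated assumptions: each corpus entry is a dictionary with `initiation`, `reply`, and `scene` keys (a hypothetical layout), and "the same initiation sentence" is taken to mean exact string equality, which the text does not confirm.

```python
from collections import defaultdict


def merge_personal_corpora(*personal_corpora):
    """Combine personal corpora, then group entries by initiation sentence."""
    # Stage 1: concatenate all personal corpora into one combined corpus.
    combined = [entry for corpus in personal_corpora for entry in corpus]

    # Stage 2: merge like terms, keyed by the initiation sentence.
    shared = defaultdict(list)
    for entry in combined:
        shared[entry["initiation"]].append(
            {"reply": entry["reply"], "scene": entry["scene"]}
        )
    return dict(shared)
```

Duplicate replies are deliberately kept rather than collapsed, so that a later ranking step can still count how often each reply was used.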
Step S203: judge whether a session pair in the shared corpus contains multiple reply sentences; if so, rank the multiple reply sentences intelligently according to a preset rule.
After session pairs containing the same initiation sentence in the combined corpus are merged as like terms, a session pair may contain multiple reply sentences for the same initiation sentence. Therefore, after obtaining the shared corpus, the present embodiment further judges whether a session pair in the shared corpus contains multiple reply sentences and, if so, ranks them intelligently according to a preset rule. Specifically, the present embodiment may rank the multiple reply sentences by rules such as frequency of use, usage habit, usage preference, and order of time of use.
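Of the ranking rules listed, frequency of use is the simplest to illustrate. The sketch below ranks replies by how often they occur and assumes that repeated occurrences of a reply in the merged data reflect its frequency of use; the other rules (habit, preference, time order) are not modeled, and `rank_replies` is a hypothetical name.

```python
from collections import Counter


def rank_replies(replies):
    """Return the distinct replies, most frequently used first.

    Ties keep the order in which the replies were first seen.
    """
    counts = Counter(replies)
    return [reply for reply, _ in counts.most_common()]
```

A production system would presumably combine several signals into one score, but the text leaves the concrete rule open.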
Step S204: match, in the shared corpus, reply content that matches the current session content, and use the reply content as the session reply content corresponding to the current session content.
Specifically, the present embodiment first collects the session scene tag values that correspond to the current session content and to the preset session scene tags, and then matches, in the session corpus, the reply sentence corresponding to the current session content, the session scene tags, and the session scene tag values, and uses it as the reply content. For example, if the current session content is "Help me pay the fees", the matching session reply content can be obtained quickly from the established session corpus. Because the session reply content obtained from the session corpus may include multiple options, in actual implementation the user can select the most appropriate session reply content as needed, or the system can automatically choose the session reply content most relevant to the current session content and reply automatically.
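The lookup in step S204 can be sketched as a two-part filter: find the entries for the current session content, then keep the replies whose recorded scene tag values agree with the current scene. The sketch assumes exact string matching on the initiation sentence and exact equality on tag values, neither of which the text specifies; the corpus layout, the function name `match_replies`, and the example replies are hypothetical.

```python
def match_replies(shared_corpus, current_content, current_scene):
    """Return candidate replies for the current content and scene.

    shared_corpus: dict mapping an initiation sentence to a list of
        {"reply": str, "scene": {tag: value}} records.
    current_scene: {tag: value} collected for the current session.
    A record matches when every tag it stores has the same value in
    current_scene.
    """
    candidates = shared_corpus.get(current_content, [])
    return [
        record["reply"]
        for record in candidates
        if all(current_scene.get(tag) == value
               for tag, value in record["scene"].items())
    ]


if __name__ == "__main__":
    corpus = {
        "Help me pay the fees": [
            {"reply": "Sure, which bill?", "scene": {"place": "home"}},
            {"reply": "I am in a meeting.", "scene": {"place": "office"}},
        ]
    }
    print(match_replies(corpus, "Help me pay the fees", {"place": "home"}))
```

When several replies survive the filter, either the user picks one or the system chooses automatically, as the paragraph above describes.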
The method for obtaining intelligent session reply content based on a shared corpus provided by the embodiment of the present invention establishes a personal corpus corresponding to each communication party, merges the personal corpora of multiple communication parties to obtain a shared corpus, and matches in the shared corpus the reply content that matches the current session content. It thereby solves the technical problem that session reply content obtained by matching against existing shared corpora is not accurate enough. The method not only reduces the workload of creating a shared corpus manually; the shared corpus it creates is also rich in content and diverse in form, and is more practical and intelligent, so that matching against the created shared corpus yields more accurate session reply content. It is also easy to see that, compared with creating a shared corpus directly from the session content of multiple communication parties, obtaining the shared corpus by merging the personal corpora of multiple communication parties, as in the present embodiment, is simpler and faster.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A method for obtaining intelligent session reply content based on a shared corpus, characterized by comprising:
establishing a personal corpus corresponding to each communication party, wherein the number of communication parties is more than one;
merging the personal corpora of the multiple communication parties to obtain a shared corpus;
matching, in the shared corpus, reply content that matches the current session content, and using the reply content as the session reply content corresponding to the current session content.
2. The method for obtaining intelligent session reply content based on a shared corpus according to claim 1, characterized in that establishing a personal corpus corresponding to a communication party comprises:
collecting the session content of the communication party;
obtaining the session pairs in the session content;
collecting, according to preset scene tags, the scene tag values that correspond to the session pairs and to the scene tags;
matching and combining the session pairs, the scene tags, and the scene tag values corresponding to the scene tags, thereby generating the personal corpus corresponding to the communication party.
3. The method for obtaining intelligent session reply content based on a shared corpus according to claim 2, characterized in that obtaining the session pairs in the session content comprises:
determining the initiation sentences and the reply sentences in the session content according to the semantics of the session sentences in the session content;
determining the types of the initiation sentences and the reply sentences according to a preset type judgment rule;
extracting a basic session pair from an initiation sentence and the reply sentences located between that initiation sentence and the next initiation sentence;
extracting at least one session pair according to the basic session pair, the type of the initiation sentence in the basic session pair, and the type of the reply sentence in the basic session pair.
4. The method for obtaining intelligent session reply content based on a shared corpus according to claim 3, characterized in that determining the initiation sentences and the reply sentences in the session content according to the semantics of the session sentences in the session content comprises:
judging whether a session sentence in the session content is preceded, within a preset time interval, by text sent by the communication counterpart; if not, determining the session sentence to be an initiation sentence;
if so, judging whether the session sentence has no semantic association with the preceding text sent by the communication counterpart; if it has none, determining the session sentence to be an initiation sentence; otherwise, determining the session sentence to be a reply sentence.
5. The method for obtaining intelligent session reply content based on a shared corpus according to claim 4, characterized in that determining the type of the initiation sentence according to the preset type judgment rule comprises:
judging whether the initiation sentence is a sentence with complete, independent semantics; if so, judging whether the initiation sentence is composed of multiple simple sentences each having complete, independent semantics, and if so, determining the type of the initiation sentence to be the complex-sentence initiation type, otherwise the simple-sentence initiation type; if not, judging whether the initiation sentence contains a simple sentence having complete, independent semantics, and if it does, determining the type of the initiation sentence to be the non-standard complex-sentence initiation type, otherwise the non-standard simple-sentence initiation type;
searching whether an initiation sentence of the non-standard simple-sentence initiation type has session sentences of the same party that are contiguous with it above or below; if not, performing no derivation and extension; if so, further judging whether the initiation sentence can be merged with those contiguous session sentences into a sentence with complete, independent semantics; if it can, deriving and extending the type of the initiation sentence to the non-standard sentence group initiation type; if it cannot, performing no derivation and extension;
searching whether an initiation sentence of the non-standard complex-sentence initiation type has session sentences of the same party that are contiguous with it above or below; if not, performing no derivation and extension; if so, further judging whether the initiation sentence can be merged with those contiguous session sentences into a sentence with complete, independent semantics; if it can, deriving and extending the type of the initiation sentence to the non-standard sentence group initiation type; if it cannot, performing no derivation and extension;
judging whether an initiation sentence of the simple sentence, complex sentence, non-standard simple sentence, non-standard complex sentence, or non-standard sentence group type has session sentences of the same party that are contiguous with it above or below; if so, further judging whether the initiation sentence can be merged with those contiguous session sentences into a semantically associated sentence group; if so, deriving and extending the type of the initiation sentence to the sentence group initiation type; otherwise, performing no derivation and extension.
6. The method for obtaining intelligent session reply content based on a shared corpus according to claim 4, characterized in that determining the type of the reply sentence according to the preset type judgment rule comprises:
judging whether the reply sentence is a sentence with complete, independent semantics; if so, judging whether the reply sentence is composed of multiple simple sentences each having complete, independent semantics, and if so, determining the type of the reply sentence to be the complex-sentence reply type, otherwise the simple-sentence reply type; if not, judging whether the reply sentence contains a simple sentence having complete, independent semantics, and if it does, determining the type of the reply sentence to be the non-standard complex-sentence reply type, otherwise the non-standard simple-sentence reply type;
searching whether a reply sentence of the non-standard simple-sentence reply type has session sentences of the same party that are contiguous with it above or below; if not, performing no derivation and extension; if so, further judging whether the reply sentence can be merged with those contiguous session sentences into a sentence with complete, independent semantics; if it can, deriving and extending the type of the reply sentence to the non-standard sentence group reply type; if it cannot, performing no derivation and extension;
searching whether a reply sentence of the non-standard complex-sentence reply type has session sentences of the same party that are contiguous with it above or below; if not, performing no derivation and extension; if so, further judging whether the reply sentence can be merged with those contiguous session sentences into a sentence with complete, independent semantics; if it can, deriving and extending the type of the reply sentence to the non-standard sentence group reply type; if it cannot, performing no derivation and extension;
judging whether a reply sentence of the simple sentence, complex sentence, non-standard simple sentence, non-standard complex sentence, or non-standard sentence group type has session sentences of the same party that are contiguous with it above or below; if so, further judging whether the reply sentence can be merged with those contiguous session sentences into a semantically associated sentence group; if so, deriving and extending the type of the reply sentence to the sentence group reply type; otherwise, performing no derivation and extension.
7. The method for obtaining intelligent session reply content based on a shared corpus according to claim 6, characterized in that extracting at least one session pair according to the basic session pair, the type of the initiation sentence in the basic session pair, and the type of the reply sentence in the basic session pair comprises:
deriving and extending the type of the initiation sentence in the basic session pair to obtain multiple types of initiation sentence;
deriving and extending the type of the reply sentence in the basic session pair to obtain multiple types of reply sentence;
combining the multiple types of initiation sentence with the multiple types of reply sentence, and extracting at least one semantically associated session pair.
8. The method for obtaining intelligent session reply content based on a shared corpus according to claim 7, characterized in that merging the personal corpora of the multiple communication parties to obtain a shared corpus comprises:
combining the personal corpora of the multiple communication parties to obtain a combined corpus;
merging, as like terms, the session pairs in the combined corpus that contain the same initiation sentence, to obtain the shared corpus.
9. The method for obtaining intelligent session reply content based on a shared corpus according to claim 8, characterized in that, after the shared corpus is obtained, the method further comprises:
judging whether a session pair in the shared corpus contains multiple reply sentences; if so, ranking the multiple reply sentences intelligently according to a preset rule.
10. The method for obtaining intelligent session reply content based on a shared corpus according to claim 9, characterized in that matching, in the shared corpus, reply content that matches the current session content comprises:
collecting the session scene tag values that correspond to the current session content and to the preset session scene tags;
matching, in the shared corpus, the reply sentence corresponding to the current session content, the session scene tags, and the session scene tag values, and using it as the reply content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710076115.2A CN106709072A (en) | 2017-02-13 | 2017-02-13 | Method of obtaining intelligent conversation reply content based on shared corpora |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106709072A true CN106709072A (en) | 2017-05-24 |
Family
ID=58911307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710076115.2A Pending CN106709072A (en) | 2017-02-13 | 2017-02-13 | Method of obtaining intelligent conversation reply content based on shared corpora |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106709072A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101068177A (en) * | 2007-03-27 | 2007-11-07 | 腾讯科技(深圳)有限公司 | Interdynamic question-answering system and realizing method thereof |
CN105389296A (en) * | 2015-12-11 | 2016-03-09 | 小米科技有限责任公司 | Information partitioning method and apparatus |
CN106294774A (en) * | 2016-08-11 | 2017-01-04 | 北京光年无限科技有限公司 | User individual data processing method based on dialogue service and device |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018145436A1 (en) * | 2017-02-13 | 2018-08-16 | 长沙军鸽软件有限公司 | Method for extracting conversation pair from conversation content |
CN110309408A (en) * | 2018-03-09 | 2019-10-08 | 陈包容 | A method of automation search |
CN110309408B (en) * | 2018-03-09 | 2023-07-14 | 陈包容 | Automatic searching method |
CN109388717A (en) * | 2018-07-20 | 2019-02-26 | 北京智能点科技有限公司 | A kind of method and system of Mass production corpus |
CN109151044A (en) * | 2018-09-06 | 2019-01-04 | 广州酷狗计算机科技有限公司 | Information-pushing method, device, electronic equipment and storage medium |
CN109151044B (en) * | 2018-09-06 | 2021-08-27 | 广州酷狗计算机科技有限公司 | Information pushing method and device, electronic equipment and storage medium |
CN110706704A (en) * | 2019-10-17 | 2020-01-17 | 四川长虹电器股份有限公司 | Method, device and computer equipment for generating voice interaction prototype |
CN113672698A (en) * | 2021-08-01 | 2021-11-19 | 北京网聘咨询有限公司 | Intelligent interviewing method, system, equipment and storage medium based on expression analysis |
CN113672698B (en) * | 2021-08-01 | 2024-05-24 | 北京网聘信息技术有限公司 | Intelligent interview method, system, equipment and storage medium based on expression analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170524 |