CN109299242A - A session generation method, device, terminal device and storage medium
- Publication number
- CN109299242A (application number CN201811219414.8A)
- Authority
- CN
- China
- Prior art keywords
- session
- corpus
- text
- matching rule
- vectorization
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a session generation method, device, terminal device and storage medium, belonging to the field of artificial intelligence. The method comprises: obtaining corpora from a live-room barrage (bullet-comment) corpus and a Chinese corpus to form a corpus set; training a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set; after obtaining text input by a user, producing a vector representation of the text with the doc2vec model; setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set a session associated with a keyword of the text; and outputting the session. The invention improves the relevance of chatbot sessions, enhances the chatbot's adaptability to different scenes, and helps keep sessions relevant and entertaining.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a session generation method, device, terminal device and storage medium.
Background technique
As an important research field of artificial intelligence, NLP (natural language processing) is one of the main ways of realising human-computer interaction. In practice, the chatbots we commonly encounter are implemented with NLP technology: given our speech or text, the computer generates a corresponding session.
Commonly used chatbots mostly work in one of two ways: retrieving sessions from an existing corpus, or generating sessions on the fly. For retrieval-based sessions, existing approaches retrieve associated sessions from the corpus directly according to what the user entered. Retrieval that relies only on the input text tends to produce sessions with low relevance, or sessions that are irrelevant altogether.
Summary of the invention
In view of this, embodiments of the present invention provide a session generation method, device, terminal device and storage medium, to solve the problem that confused query conditions in a combined index lead to low query efficiency.
According to a first aspect of the embodiments of the present invention, a session generation method is provided, comprising:
obtaining corpora from a live-room barrage corpus and a Chinese corpus to form a corpus set;
training a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set;
after obtaining text input by a user, producing a vector representation of the text with the doc2vec model;
setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set a session associated with a keyword in the vectorized text;
outputting the session.
According to a second aspect of the embodiments of the present invention, a session generation device is provided, comprising:
an acquisition module, configured to obtain corpora from a live-room barrage corpus and a Chinese corpus to form a corpus set;
a training module, configured to train a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set;
an input module, configured to, after obtaining text input by a user, produce a vector representation of the text with the doc2vec model;
a retrieval module, configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set a session associated with a keyword in the vectorized text;
an output module, configured to output the session.
According to a third aspect of the embodiments of the present invention, a terminal device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method of the first aspect of the embodiments of the present invention.
According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the method provided by the first aspect of the embodiments of the present invention.
According to a fifth aspect of the embodiments of the present invention, a computer program product is provided. The computer program product comprises a computer program which, when executed by one or more processors, implements the steps of the method provided by the first aspect of the embodiments of the present invention.
In the embodiments of the present invention, the corpus set is trained and given a vectorized representation, and scene-specific keyword matching rules are defined; the keywords of the input text are then looked up to obtain the sessions in the matching keyword rule. Retrieving sessions through matching rules not only improves the relevance of chatbot sessions; the keyword matching rules can also be optimized and extended periodically, which enhances the chatbot's adaptability to different scenes and helps keep sessions relevant and entertaining.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of the session generation method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of one embodiment of step S104 provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the session generation device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the terminal device provided by an embodiment of the present invention.
Specific embodiment
The embodiments of the present invention provide a session generation method, device, terminal device and storage medium, used to generate sessions accurately during machine chat.
To make the purpose, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Embodiment one:
Referring to Fig. 1, a schematic flowchart of the session generation method provided by an embodiment of the present invention, the method comprises the following steps:
S101: obtain corpora from a live-room barrage corpus and a Chinese corpus to form a corpus set.
Live-room barrage refers to the comment subtitles (bullet comments) that pop up over a live stream while it is being watched. Barrage content relates not only to the live content but may also touch on real-time trending topics, popular words, newly coined expressions and the like. In the embodiments of the present invention, collecting barrage corpora helps the machine understand emerging expressions and makes its sessions more humorous and closer to the currently popular way of talking, improving the user experience.
The Chinese corpus is a collection of Chinese natural-language text; it can generally be obtained from open Chinese corpora, such as the Sogou text classification corpus.
Optionally, corpora in the live-room barrage that do not meet a preset standard are cleaned up. Such corpora generally include spam barrage, sensitive words, nicknames, and offensive or insulting vocabulary. Cleaning up substandard corpora improves model training efficiency and ensures that the generated sessions are expressed properly.
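The cleaning and merging in S101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the block list, the length threshold and the repeated-character heuristic are all assumptions, since the text only names the categories of corpora to be removed.

```python
import re

# Hypothetical block list; the text only names the categories
# (spam barrage, sensitive words, nicknames, abusive vocabulary).
BLOCKED_TERMS = {"spamlink", "idiot"}
MIN_LENGTH = 2  # assumed threshold for discarding near-empty barrage


def is_clean(line):
    """Return True if a barrage line passes the assumed cleaning standard."""
    text = line.strip()
    if len(text) < MIN_LENGTH:
        return False
    # Assumed spam heuristic: the whole line is one character repeated.
    if re.fullmatch(r"(.)\1+", text):
        return False
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def build_corpus_set(barrage_lines, chinese_corpus_lines):
    """Merge cleaned barrage lines with a general corpus, dropping duplicates."""
    cleaned = [line.strip() for line in barrage_lines if is_clean(line)]
    general = [line.strip() for line in chinese_corpus_lines if line.strip()]
    # dict.fromkeys removes duplicates while preserving first-seen order.
    return list(dict.fromkeys(cleaned + general))
```

In practice the block list would be far larger and maintained separately; the point here is only that cleaning happens before the corpus set is formed.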
S102: train a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set.
doc2vec (also known as paragraph2vec or sentence embeddings) is an unsupervised algorithm that yields vector representations of sentences. Similarity between sentences can be found by computing the distance between their vectors, and the vectors can also be used for text clustering.
The doc2vec model has to be trained on existing data: according to the existing corpus, each sentence or paragraph is mapped into a vector space, so the doc2vec model yields the vectorized representation of the corpus.
With the corpus set in vectorized form, text input by a user can be matched against it to obtain the sessions in the corpus set with higher relevance.
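To make the shape of the "vectorized corpus set" concrete, the sketch below pairs every sentence with a fixed-length vector. It does not implement doc2vec: the embed function is a hashed bag-of-words stand-in chosen only so the example runs with the standard library, and the vector size and whitespace tokenisation are assumptions (Chinese text would first need word segmentation). In a real system a trained doc2vec model would take the place of embed.

```python
import hashlib
import math

DIM = 64  # assumed vector size for this sketch


def embed(text, dim=DIM):
    """Stand-in for doc2vec inference: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # An empty text has no tokens, so its all-zero vector is returned as is.
    return [x / norm for x in vec] if norm else vec


def vectorize_corpus(corpus_set):
    """Pair each sentence with its vector: the vectorized corpus set."""
    return [(sentence, embed(sentence)) for sentence in corpus_set]
```

The same embed function must be used for the corpus and, later, for the user's input text, so that both live in the same vector space.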
S103: after obtaining text input by a user, produce a vector representation of the text with the doc2vec model.
The text input by the user may be text converted from the user's speech by a speech recognition device, or text entered through an input device such as a keyboard; no limitation is imposed here.
Once the input text is represented as a vector, its keywords or sentences can be matched against the vector set to obtain associated sentences.
S104: set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set a session associated with a keyword in the vectorized text.
A keyword matching rule sets the response content corresponding to a keyword of the user's input text. A keyword is a word that reflects the scene or context, such as a person, action, mood, nickname or place. Under a matching rule, whenever the user's input contains a keyword that can be matched, the corresponding matching rule is triggered and the sessions in that rule are obtained. In general, each rule contains several sessions whose content is related to the keyword. Preferably, each matching rule is also numbered.
For example, define a matching rule: A093=girl|daughter=This or that girl does not seem to be.|Say that this guy is pretty good, give you my girl|Come on, take my girl back; taking her away is something I have to consider. In this matching rule, A093 is the number, girl and daughter are the keywords, and the remainder is the sessions the chatbot can generate; one of the three sentences can be selected to reply to the user according to the other parts of the input text.
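A rule written in the form of the example above could be loaded into a small data structure as sketched below. The layout number=keywords=sessions, with "|" separating alternatives, is read off the example and is an assumption; MatchRule and parse_rule are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class MatchRule:
    number: str      # rule number, e.g. "A093"
    keywords: list   # keywords that trigger the rule
    sessions: list   # candidate replies held by the rule


def parse_rule(line):
    """Parse '<number>=<kw1>|<kw2>=<session1>|<session2>|...' (assumed layout)."""
    # Split on the first two '=' only, so replies may themselves contain '='.
    number, keywords, sessions = line.split("=", 2)
    return MatchRule(
        number=number.strip(),
        keywords=[k.strip() for k in keywords.split("|") if k.strip()],
        sessions=[s.strip() for s in sessions.split("|") if s.strip()],
    )
```

For instance, parse_rule("A093=girl|daughter=reply one|reply two|reply three") yields a rule numbered A093 with two keywords and three sessions.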
Optionally, in each keyword matching rule, the session content corresponding to each keyword is set; a regular expression is used to look up the keyword matching rule corresponding to the keyword of the text, and the sessions in that keyword matching rule are obtained. A regular expression is a rule string composed of predefined specific characters, or combinations of such characters, that expresses filtering logic over character strings.
S105: output the session.
The session is the one obtained according to the keyword matching rule. When the keyword matching rule contains several sessions, one may be chosen at random, a scoring rule may be set and the highest-scoring session chosen, or the session most closely associated with the input text may be chosen. The specific choice should take into account the input text and the needs of the session, and is not limited here.
The output text can be presented to the user as text, or converted to speech and then output.
Retrieving and selecting the corresponding session on the basis of keyword matching rules, as described above, improves session relevance and enhances adaptability to different session scenes.
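The selection options named in S105 (random choice, or the highest score under some scoring rule) can be sketched as a single helper. The function name, the strategy labels and the score callback are assumptions introduced for illustration; the text leaves the scoring rule open.

```python
import random


def choose_session(sessions, strategy="random", score=None, rng=random):
    """Pick one session from a rule's candidates.

    strategy="random": uniform random choice.
    strategy="best":   the candidate with the highest score(session).
    """
    if not sessions:
        return None
    if strategy == "random":
        return rng.choice(sessions)
    if strategy == "best":
        if score is None:
            raise ValueError("strategy 'best' needs a score function")
        return max(sessions, key=score)
    raise ValueError("unknown strategy: " + strategy)
```

Choosing the session most closely associated with the input text is the "best" strategy with a score function that compares each candidate to the input, for example by vector similarity.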
On the basis of Fig. 1, the retrieval process of step S104 based on matching rules is described in detail with reference to Fig. 2, as follows:
In step S104, setting keyword matching rules makes it possible to quickly identify the topic the user wants to chat about, and the corresponding session can then be found by searching the corpus set.
In S1041, keywords are set according to different scenes, persons, actions, moods and so on; keywords with the same meaning share the same matching rule. In each keyword matching rule, the sessions corresponding to the rule's keywords are defined.
In S1042, when the text input by the user contains a keyword, the matching rule corresponding to that keyword is triggered. For example, when the input text contains daughter or girl, the rule numbered A093 is triggered. Specifically, the lookup is expressed with a regular expression, which may take the form: result = re.findall(r".*(girl|daughter)+", msg, re.M).
In S1043, once a keyword of the user's input text has been matched, the sessions corresponding to that keyword are obtained. A keyword matching rule generally contains several sessions, and one of them is chosen to reply to the user.
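Steps S1041 to S1043 can be sketched as below. The rule table, the second rule B001 and the reply texts are placeholders; the sketch uses re.search with an alternation built from each rule's keywords, which serves the same purpose as the re.findall pattern in the text but is not identical to it.

```python
import random
import re

# Placeholder rule table (S1041); B001 and all reply texts are hypothetical.
RULES = {
    "A093": {"keywords": ["girl", "daughter"],
             "sessions": ["reply one", "reply two", "reply three"]},
    "B001": {"keywords": ["hungry", "dinner"],
             "sessions": ["time to eat"]},
}


def compile_rules(rules):
    """Build one compiled alternation pattern per rule."""
    compiled = {}
    for number, rule in rules.items():
        pattern = "|".join(re.escape(k) for k in rule["keywords"])
        compiled[number] = re.compile(pattern)
    return compiled


def find_rule(msg, compiled):
    """S1042: return the number of the first rule whose keyword occurs in msg."""
    for number, pattern in compiled.items():
        if pattern.search(msg):
            return number
    return None


def reply(msg, rules, compiled, rng=random):
    """S1043: pick one session from the triggered rule, or None if none fired."""
    number = find_rule(msg, compiled)
    if number is None:
        return None
    return rng.choice(rules[number]["sessions"])
```

Matching here is by substring, so "girl" also fires on "girlfriend"; for Chinese input, where there are no spaces between words, substring matching is the natural behaviour. A None result is the case the similarity fallback is meant to handle.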
Optionally, in the embodiments of the present invention, if in S1042 no corresponding rule is matched for the user's input text, text similarity is computed in S1044. Specifically, according to the cosine similarity formula, i.e. formula (1), the similarity between the user's input text and the sessions in the corpus set is computed, a predetermined number of sessions are obtained, and one of them is then selected as the reply.
Computing the similarity of the input text ensures that the user still receives a reply from the chatbot when no keyword matching rule applies.
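The fallback in S1044 can be sketched as a cosine-similarity ranking over the vectorized corpus set. The function names and the default of three candidates are assumptions; the vectors are taken as given, however they were produced.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_sessions(query_vec, vectorized_corpus, k=3):
    """Return the k sessions whose vectors are most similar to query_vec.

    vectorized_corpus is a list of (session_text, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), session)
              for session, vec in vectorized_corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [session for _, session in scored[:k]]
```

One of the returned sessions is then chosen as the reply. A linear scan like this is fine for a sketch; a large corpus set would normally call for an approximate nearest-neighbour index instead.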
Preferably, the keyword matching rules are adjusted periodically or automatically.
Specifically, the number of times each keyword matching rule is used is recorded, and keyword matching rules whose usage count reaches a preset standard are optimized by adjusting synonymous keywords or by extending the session content and session wording in the rule. For each keyword matching rule, the choice of keywords can be optimized or expanded according to how often the rule is used, for example by reducing synonyms; the sessions in the rule can also be added to or removed. For instance, for a heavily used rule, the number of sessions in it can be increased, for example by adding entertaining replies.
In addition, according to how often each session in a keyword matching rule is used, the corresponding sessions in the rule are deleted or added. Each matching rule contains several sessions; rarely used sessions can be removed, and frequently used ones can be further optimized.
Through S1045 above, the keyword matching rules can be adjusted according to feedback data and statistics from actual chats, so that the chatbot is continuously optimized, session relevance is improved and context adaptability is enhanced, which in turn safeguards the user experience.
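The bookkeeping behind S1045 can be sketched with two counters, one per rule and one per session. The class name, method names and thresholds are assumptions; the text only says that usage counts are recorded and compared against a preset standard.

```python
from collections import Counter


class RuleUsageTracker:
    """Count how often each rule and each session is used (S1045 sketch)."""

    def __init__(self):
        self.rule_hits = Counter()
        self.session_hits = Counter()

    def record(self, rule_number, session):
        """Call once each time a session from a rule is sent to a user."""
        self.rule_hits[rule_number] += 1
        self.session_hits[(rule_number, session)] += 1

    def rules_due_for_review(self, threshold):
        """Rules used at least `threshold` times: candidates for extension."""
        return sorted(n for n, hits in self.rule_hits.items()
                      if hits >= threshold)

    def rarely_used_sessions(self, rule_number, sessions, max_hits):
        """Sessions of a rule used at most `max_hits` times: candidates to drop."""
        return [s for s in sessions
                if self.session_hits[(rule_number, s)] <= max_hits]
```

The tracker only reports candidates; whether a rule is actually extended or a session removed remains an editorial or automated decision made outside it.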
Embodiment two:
A session generation method has mainly been described above; a session generation device is described in detail below.
Fig. 3 shows a schematic structural diagram of the session generation device provided by an embodiment of the present invention, comprising:
Acquisition module 410: configured to obtain corpora from a live-room barrage corpus and a Chinese corpus to form a corpus set;
Optionally, the acquisition module 410 comprises:
a cleaning unit, configured to clean up corpora in the live-room barrage that do not meet a preset standard.
Training module 420: configured to train a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set;
Input module 430: configured to, after obtaining text input by a user, produce a vector representation of the text with the doc2vec model;
Retrieval module 440: configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set a session associated with a keyword in the vectorized text;
Output module 450: configured to output the session.
Optionally, the retrieval module 440 comprises:
a setting unit, configured to set, in each keyword matching rule, the session content corresponding to each keyword;
a lookup unit, configured to look up, by regular expression, the keyword matching rule corresponding to the keyword of the text, and to obtain the sessions in that keyword matching rule.
Optionally, looking up, by regular expression, the keyword matching rule corresponding to the keyword of the text and obtaining the sessions in that keyword matching rule further comprises:
recording the number of times each keyword matching rule is used, and optimizing the keyword matching rules whose usage count reaches a preset standard, specifically by adjusting synonymous keywords or extending the session content and session wording in the keyword matching rule.
Optionally, it further comprises deleting or adding the corresponding sessions in a keyword matching rule according to how often each session in the rule is used.
Optionally, the retrieval module 440 further comprises:
a calculation module, configured to, when no session associated with the keyword of the text is retrieved from the vectorized corpus set, calculate the relevance between the sessions in the vectorized corpus set and the text, and obtain the sessions whose relevance reaches a preset standard.
Optionally, the calculation module comprises:
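a calculation unit, configured to calculate, by the cosine similarity formula (1), the relevance between the text and the sessions in the vectorized corpus set:
$$\cos(\vec{v}_T, \vec{v}_{C_i}) = \frac{\vec{v}_T \cdot \vec{v}_{C_i}}{\lVert \vec{v}_T \rVert \, \lVert \vec{v}_{C_i} \rVert} \tag{1}$$
where $i \in (1, n)$, $\vec{v}_T$ denotes the vectorized representation of the user's input text $T$, $\vec{v}_{C_i}$ denotes the vectorized representation of the $i$-th corpus entry in the corpus set $C$, and $n$ denotes the size of the corpus set $C$.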
The device described above generates a corresponding session from the text input by the user, and improves the relevance of the session by setting matching rules.
Embodiment three:
Fig. 4 is a schematic structural diagram of the terminal device for session generation provided by an embodiment of the present invention. The terminal device is a mobile computing device with a touch screen, including but not limited to a smartphone, smartwatch, notebook, tablet computer, POS machine or even an in-vehicle computer. As shown in Fig. 4, the terminal device 4 of this embodiment comprises a memory 410, a processor 420 and a system bus 430; the memory 410 holds a runnable program 4101 stored thereon. Those skilled in the art will understand that the terminal device structure shown in Fig. 4 does not limit the terminal device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
The components of the terminal device are described in detail below with reference to Fig. 4:
The memory 410 can be used to store software programs and modules; the processor 420 executes the terminal's various functional applications and data processing by running the software programs and modules stored in the memory 410. The memory 410 may mainly comprise a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store business data used by the terminal (such as live-streaming data or page display data). In addition, the memory 410 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
The memory 410 contains the runnable program 4101 of the database query method. The runnable program 4101 can be divided into one or more modules/units, which are stored in the memory 410 and executed by the processor 420 to generate a session from the user's input. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments describing the execution of the computer program 4101 in the terminal device 4. For example, the computer program 4101 can be divided into an acquisition module, a training module, an input module, a retrieval module and an output module.
The processor 420 is the control centre of the terminal device. It connects all parts of the terminal device through various interfaces and lines, and executes the terminal's functions and processes data by running or executing the software programs and/or modules stored in the memory 410 and calling the data stored in the memory 410, thereby monitoring the terminal as a whole. Optionally, the processor 420 may include one or more processing units; preferably, the processor 420 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It will be understood that the modem processor need not be integrated into the processor 420.
The system bus 430 connects the functional components inside the computer and can carry data information, address information and control information; it may be, for example, a PCI bus, ISA bus or VESA bus. Instructions from the processor 420 are transferred to the memory 410 over the bus, and the memory 410 feeds data back to the processor 420; the system bus 430 is responsible for the exchange of data and instructions between the processor 420 and the memory 410. The system bus 430 can of course also connect other devices, such as a network interface or a display device.
The terminal device may also include at least one sensor, such as an optical sensor, a motion sensor or other sensors; an input device, such as a touch screen, keyboard or the like; and an output device, such as a loudspeaker, display or the like. Optionally, in the embodiments of the present invention, the input device can be used to enter instructions, such as lookup requests, and the output device can be used to show query results or display query data to the user. The other components are not described here.
In the embodiments of the present invention, the runnable program executed by the processor 420 of the terminal is specifically:
A session generation method, comprising:
obtaining corpora from a live-room barrage corpus and a Chinese corpus to form a corpus set;
training a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set;
after obtaining text input by a user, producing a vector representation of the text with the doc2vec model;
setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set a session associated with a keyword in the vectorized text;
outputting the session.
Further, obtaining corpora from the live-room barrage corpus and the Chinese corpus to form a corpus set further comprises:
cleaning up corpora in the live-room barrage that do not meet a preset standard.
Further, setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set a session associated with the keyword of the text is specifically:
in each keyword matching rule, setting the session content corresponding to each keyword;
looking up, by regular expression, the keyword matching rule corresponding to the keyword of the text, and obtaining the sessions in that keyword matching rule.
Further, looking up, by regular expression, the keyword matching rule corresponding to the keyword of the text and obtaining the sessions in that keyword matching rule further comprises:
recording the number of times each keyword matching rule is used, and optimizing the keyword matching rules whose usage count reaches a preset standard.
Further, looking up, by regular expression, the keyword matching rule corresponding to the keyword of the text and obtaining the sessions in that keyword matching rule further comprises:
deleting or adding the corresponding sessions in a keyword matching rule according to how often each session in the rule is used.
Further, setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set a session associated with the keyword of the text further comprises:
when no session associated with the keyword of the text is retrieved from the vectorized corpus set, calculating the relevance between the sessions in the vectorized corpus set and the text, and obtaining the sessions whose relevance reaches a preset standard.
Further, calculating the relevance between the sessions in the vectorized corpus set and the text and obtaining the sessions whose relevance reaches a preset standard is specifically:
calculating, by the cosine similarity formula (1), the relevance between the text and the sessions in the vectorized corpus set:
$$\cos(\vec{v}_T, \vec{v}_{C_i}) = \frac{\vec{v}_T \cdot \vec{v}_{C_i}}{\lVert \vec{v}_T \rVert \, \lVert \vec{v}_{C_i} \rVert} \tag{1}$$
where $i \in (1, n)$, $\vec{v}_T$ denotes the vectorized representation of the user's input text $T$, $\vec{v}_{C_i}$ denotes the vectorized representation of the $i$-th corpus entry in the corpus set $C$, and $n$ denotes the size of the corpus set $C$.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed or recorded in one embodiment, refer to the relevant descriptions of the other embodiments.
A person of ordinary skill in the art will appreciate that the modules, units and/or method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. a kind of session generation method characterized by comprising
Corpus in direct broadcasting room barrage corpus and Chinese corpus is obtained, corpus set is formed;
By doc2vec model to corpus set training, the corpus set of vectorization expression is obtained;
After the text for getting user's input, vector expression is carried out to the text by the doc2vec model;
Set Keywords matching rule, according to the matching rule, retrieve in the corpus set that the vectorization indicates with vector
Change the associated session of keyword in the text indicated;
Export the session.
2. the method according to claim 1, wherein in the acquisition direct broadcasting room barrage corpus and Chinese corpus
Corpus forms corpus set further include:
Clear up the corpus that preassigned is not met in the direct broadcasting room barrage.
3. the method according to claim 1, wherein the setting Keywords matching is regular, according to the matching
Rule, retrieve in the corpus set that the vectorization indicates with the associated session of keyword in the text specifically:
In each Keywords matching rule, the corresponding session content of each keyword is set;
By regular expression, the corresponding Keywords matching rule of keyword in the text is searched, and obtain the keyword
Session in matching rule.
4. according to the method described in claim 3, searching and being closed in the text it is characterized in that, described by regular expression
The corresponding Keywords matching rule of keyword, and the session obtained in the Keywords matching rule further includes
Every Keywords matching rule is recorded by access times, to the Keywords matching rule for being reached preset standard by access times
Carry out session content and discourse representation mode in synonymous keyword adjustment or the extension Keywords matching rule.
5. The method according to claim 3, characterized in that looking up, using a regular expression, the keyword matching rule corresponding to the keyword in the text and obtaining the session in that keyword matching rule further comprises:
deleting or adding corresponding sessions in a keyword matching rule according to the frequency of use of each session in that keyword matching rule.
6. The method according to claim 1, characterized in that setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the session associated with the keyword in the text further comprises:
when no session associated with the keyword of the text is retrieved from the vectorized corpus set, calculating the degree of correlation between the sessions in the vectorized corpus set and the text, and obtaining a session whose degree of correlation reaches a preset standard.
7. The method according to claim 6, characterized in that calculating the degree of correlation between the sessions in the vectorized corpus set and the text, and obtaining a session whose degree of correlation reaches the preset standard, specifically comprises:
calculating the degree of correlation between the text and each session in the vectorized corpus set by the cosine similarity formula (1):
$$\cos(\vec{T}, \vec{C_i}) = \frac{\vec{T} \cdot \vec{C_i}}{\lVert \vec{T} \rVert \, \lVert \vec{C_i} \rVert} \tag{1}$$
where $i \in (1, n)$, $\vec{T}$ denotes the vectorized representation of the user input text $T$, $\vec{C_i}$ denotes the vectorized representation of the $i$-th corpus entry in the corpus set $C$, and $n$ denotes the size of the corpus set $C$.
8. A session generation apparatus, characterized by comprising:
an obtaining module, configured to obtain corpus data from a live-stream room bullet-screen corpus and a Chinese corpus to form a corpus set;
a training module, configured to train a doc2vec model on the corpus set to obtain a vectorized representation of the corpus set;
an input module, configured to, after obtaining text input by a user, generate a vector representation of the text with the doc2vec model;
a retrieval module, configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set a session associated with a keyword in the vectorized text;
an output module, configured to output the session.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the session generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the session generation method according to any one of claims 1 to 7.
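The retrieval flow recited in claims 3, 4, 6 and 7 can be illustrated with a minimal sketch. It is not the applicant's implementation: the names `RULES`, `CORPUS`, `rule_hits`, `reply` and the threshold value 0.5 are hypothetical, the example sentences are invented, and the small hand-written vectors stand in for the output of a trained doc2vec model. The sketch first tries the keyword matching rules with a regular expression, counts each rule access, and falls back to cosine similarity over the vectorized corpus set when no rule matches.

```python
import math
import re
from collections import Counter

# Hypothetical keyword matching rules: regex pattern -> preset session content.
RULES = {
    r"\b(hello|hi)\b": ["Hello, welcome to the stream!"],
    r"\b(song|music)\b": ["This track is on the streamer's playlist."],
}

# Toy stand-in for the vectorized corpus set: session text -> vector.
# Assumption: in the claimed method these vectors come from a doc2vec model.
CORPUS = {
    "Nice play!": [0.9, 0.1, 0.0],
    "What game is this?": [0.1, 0.8, 0.3],
    "Good night everyone": [0.0, 0.2, 0.9],
}

# Access count per keyword matching rule (the bookkeeping of claim 4).
rule_hits = Counter()


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def reply(text, text_vector, threshold=0.5):
    """Return a session for the user text, or None if nothing is relevant."""
    # Step 1: look up a keyword matching rule with a regular expression.
    for pattern, sessions in RULES.items():
        if re.search(pattern, text, flags=re.IGNORECASE):
            rule_hits[pattern] += 1
            return sessions[0]
    # Step 2: no rule matched, so rank corpus sessions by cosine similarity.
    best, score = max(
        ((session, cosine(text_vector, vec)) for session, vec in CORPUS.items()),
        key=lambda pair: pair[1],
    )
    # Only return a session whose correlation reaches the preset standard.
    return best if score >= threshold else None
```

Calling `reply("Hello there", [0.0, 0.0, 0.0])` takes the rule path and increments that rule's counter, while a text with no keyword is answered from the corpus only if its best cosine score reaches the threshold. The counters in `rule_hits` are the kind of signal that claims 4 and 5 would use to decide which rules to extend or prune.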
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811219414.8A CN109299242A (en) | 2018-10-19 | 2018-10-19 | A kind of session generation method, device, terminal device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109299242A true CN109299242A (en) | 2019-02-01 |
Family ID: 65157191
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811219414.8A Pending CN109299242A (en) | 2018-10-19 | 2018-10-19 | A kind of session generation method, device, terminal device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299242A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130110864A1 (en) * | 2011-10-27 | 2013-05-02 | Cbs Interactive, Inc. | Generating an electronic message during a browsing session |
CN105701088A (en) * | 2016-02-26 | 2016-06-22 | 北京京东尚科信息技术有限公司 | Method and device for switching machine conversation to artificial conversation |
CN107515944A (en) * | 2017-08-31 | 2017-12-26 | 广东美的制冷设备有限公司 | Exchange method, user terminal and storage medium based on artificial intelligence |
CN108491433A (en) * | 2018-02-09 | 2018-09-04 | 平安科技(深圳)有限公司 | Chat answer method, electronic device and storage medium |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114124860A (en) * | 2021-11-26 | 2022-03-01 | 中国联合网络通信集团有限公司 | Session management method, device, equipment and storage medium |
CN118312601A (en) * | 2024-06-05 | 2024-07-09 | 广东君略科技咨询有限公司 | Intelligent AI conversation method and device based on AI natural language understanding |
CN118312601B (en) * | 2024-06-05 | 2024-08-09 | 广东君略科技咨询有限公司 | Intelligent AI conversation method and device based on AI natural language understanding |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190201 |