CN109299242A - Session generation method, device, terminal device and storage medium

Publication number
CN109299242A
Authority
CN
China
Prior art keywords
session
corpus
text
matching rule
vectorization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811219414.8A
Other languages
Chinese (zh)
Inventor
徐乐乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Douyu Network Technology Co Ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd
Priority to CN201811219414.8A
Publication of CN109299242A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a session generation method, device, terminal device and storage medium, belonging to the field of artificial intelligence. The method includes: obtaining corpora from a live-streaming room barrage (bullet comment) corpus and a Chinese corpus to form a corpus set; training a doc2vec model on the corpus set to obtain a vectorized corpus set; after obtaining text input by a user, representing the text as a vector by means of the doc2vec model; setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords of the text; and outputting the session. The invention can improve the relevance of chatbot sessions and also enhances the chatbot's adaptability to different scenes, helping to keep sessions relevant and entertaining.

Description

Session generation method, device, terminal device and storage medium
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a session generation method, device, terminal device and storage medium.
Background
As an important research field of artificial intelligence, NLP (natural language processing) is one of the main ways of realizing human-computer interaction. In practice, the chatbots we commonly encounter are built on NLP technology: given our speech or text, the computer generates a corresponding session.
Commonly used chatbots mostly either retrieve sessions from an existing corpus or generate sessions on the fly. Existing retrieval-based approaches retrieve associated sessions from the corpus directly from the content of the user's input. Because retrieval relies on the input text alone, the resulting sessions often have low relevance or are irrelevant to what was asked.
Summary of the invention
In view of this, embodiments of the present invention provide a session generation method, device, terminal device and storage medium, to solve the problem that confused query conditions in a combined index lead to low retrieval efficiency.
According to a first aspect of the embodiments of the present invention, a session generation method is provided, comprising:
obtaining corpora from a live-streaming room barrage corpus and a Chinese corpus to form a corpus set;
training a doc2vec model on the corpus set to obtain a vectorized corpus set;
after obtaining text input by a user, representing the text as a vector by means of the doc2vec model;
setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords in the vectorized text;
outputting the session.
According to a second aspect of the embodiments of the present invention, a session generation device is provided, comprising:
an obtaining module, configured to obtain corpora from a live-streaming room barrage corpus and a Chinese corpus to form a corpus set;
a training module, configured to train a doc2vec model on the corpus set to obtain a vectorized corpus set;
an input module, configured to, after obtaining text input by a user, represent the text as a vector by means of the doc2vec model;
a retrieval module, configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set the sessions associated with the keywords in the vectorized text;
an output module, configured to output the session.
According to a third aspect of the embodiments of the present invention, a terminal device is provided, comprising a memory, a processor and a computer program that is stored in the memory and can run on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to the first aspect of the embodiments of the present invention.
According to a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method provided by the first aspect of the embodiments of the present invention.
According to a fifth aspect of the embodiments of the present invention, a computer program product is provided, the computer program product comprising a computer program which, when executed by one or more processors, implements the steps of the method provided by the first aspect of the embodiments of the present invention.
In the embodiments of the present invention, the corpus set is trained and represented as vectors, scene keyword matching rules are defined, and the keywords of the input text are looked up to obtain the sessions held in the matching keyword rule. Retrieving sessions through matching rules not only improves the relevance of chatbot sessions; the keyword matching rules can also be optimized and extended periodically, which enhances the chatbot's adaptability to different scenes and helps keep sessions relevant and entertaining.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of one embodiment of the session generation method provided by an embodiment of the present invention;
Fig. 2 is a flow chart of one embodiment of step S104 provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the session generation device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the terminal device provided by an embodiment of the present invention.
Detailed description of the embodiments
The embodiments of the present invention provide a session generation method, device, terminal device and storage medium, used to generate sessions accurately during machine chat.
To make the objects, features and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment one:
Referring to Fig. 1, a flow diagram of the session generation method provided by an embodiment of the present invention, the method comprises the following steps:
S101, obtaining corpora from a live-streaming room barrage corpus and a Chinese corpus to form a corpus set;
The live-streaming room barrage refers to the comment subtitles (bullet comments) that pop up while a live stream is being watched. Barrage content relates not only to the live content but may also involve real-time hot topics, buzzwords, newly coined expressions and similar content. In the embodiments of the present invention, collecting the barrage corpus helps the machine session understand emerging expressions and makes sessions more humorous and closer to the way people currently talk, improving the user experience.
The Chinese corpus is a collection of Chinese natural language. Corpora can generally be obtained from open Chinese corpora, such as the Sogou text classification corpus.
Optionally, corpora in the live-streaming room barrage that do not meet a preset standard are cleaned up. Such corpora generally include spam barrage, sensitive words, nicknames, and attacking or insulting vocabulary. Cleaning up substandard corpora improves model training efficiency and keeps the wording of generated sessions well-formed.
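The optional cleanup step can be illustrated with a minimal Python sketch. The block list, the repeated-character heuristic for spam and the function names are all hypothetical; the patent does not specify how the preset standard is implemented.

```python
import re

# Hypothetical block list; a real deployment would load sensitive and
# abusive vocabulary from a maintained lexicon.
BLOCKED_WORDS = {"spamlink", "idiot"}

# Hypothetical spam heuristic: one character repeated six or more times.
REPEATED_CHAR = re.compile(r"(.)\1{5,}")


def is_clean(message: str) -> bool:
    """Return True if a barrage message passes the illustrative filters."""
    text = message.strip()
    if len(text) < 2:                 # too short to be useful for training
        return False
    if REPEATED_CHAR.search(text):    # e.g. "666666666"
        return False
    lowered = text.lower()
    return not any(word in lowered for word in BLOCKED_WORDS)


def clean_barrage(messages):
    """Keep only the messages that pass is_clean, stripped of whitespace."""
    return [m.strip() for m in messages if is_clean(m)]
```

For example, `clean_barrage(["nice play", "666666666", "you idiot"])` keeps only the first message.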
S102, training a doc2vec model on the corpus set to obtain a vectorized corpus set;
doc2vec (also known as paragraph2vec or sentence embeddings) is an unsupervised algorithm that produces vector representations of sentences. The similarity between sentences can be found by computing the distance between their vectors, and the vectors can also be used for text clustering.
The doc2vec model must be trained on existing data: each sentence or paragraph of the existing corpus is mapped into a vector space, so that the doc2vec model yields a vectorized representation of the corpus.
With the corpus set represented as vectors, the text input by a user can be matched against it to obtain the sessions in the corpus set with higher relevance.
S103, after obtaining the text input by a user, representing the text as a vector by means of the doc2vec model;
The text input by the user may be text converted from the user's speech by a speech recognition device, or text entered through an input device such as a keyboard; no limitation is imposed here.
With the input text represented as a vector, matching can be performed in the vector set according to the keywords or sentences in the text, so as to obtain associated sentences.
S104, setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords in the vectorized text;
A keyword matching rule sets the response content that corresponds to the keywords of the text input by the user. The keywords are words that reflect the scene or context, such as a person, action, mood, nickname or place. Whenever the user's input contains a keyword that can be matched, the corresponding matching rule is triggered and the sessions in that rule are obtained. In general, each rule contains multiple sessions whose content is related to the keywords. Preferably, each matching rule is also numbered.
For example, a matching rule may be defined as: girl A093=| daughter=this or that girl appears not to be.| Say that this guy is pretty good, give you my girl | it refuels and my girl is taken back row takes away me and to consider. In this matching rule, A093 is the rule number, "girl" and "daughter" are the keywords, and the remaining part holds the sessions that the chatbot can generate. One of the three sentences is selected to reply to the user according to the other parts of the input text.
Optionally, in each keyword matching rule, the session content corresponding to each keyword is set; the keyword matching rule corresponding to the keywords of the text is looked up with a regular expression, and the sessions in that keyword matching rule are obtained. A regular expression is a rule string composed of predefined special characters and combinations of them, used to express filtering logic over character strings.
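A rule of this kind, together with the regular-expression lookup, might be represented as in the following Python sketch. The `MatchRule` class, the `find_rule` helper and the three English replies are hypothetical placeholders, not the patent's own data structures or sessions; only the rule number A093 and the keywords "girl" and "daughter" come from the example above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class MatchRule:
    """One keyword matching rule: a number, its keywords and its sessions."""
    rule_id: str
    keywords: list
    sessions: list
    pattern: re.Pattern = field(init=False)

    def __post_init__(self):
        # Keywords with the same meaning share one rule, so they are
        # compiled into a single alternation such as "girl|daughter".
        self.pattern = re.compile("|".join(re.escape(k) for k in self.keywords))


# Placeholder replies for illustration only.
RULES = [
    MatchRule("A093", ["girl", "daughter"], [
        "Placeholder reply one.",
        "Placeholder reply two.",
        "Placeholder reply three.",
    ]),
]


def find_rule(text, rules=RULES):
    """Return the first rule whose keyword pattern occurs in text, else None."""
    for rule in rules:
        if rule.pattern.search(text):
            return rule
    return None
```

Note that this sketch matches substrings, so "girlfriend" would also trigger the rule; whether that is desirable depends on the language and on how keywords are segmented.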
S105, outputting the session.
The session is the one obtained according to the keyword matching rule. When the rule contains multiple sessions, one may be chosen at random, a scoring rule may be set and the highest-scoring session chosen, or the session most closely associated with the input text may be chosen. The specific choice should take into account the input text and the needs of the session; no limitation is imposed here.
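Two of the selection strategies mentioned above can be sketched in Python as follows. The word-overlap score is a hypothetical stand-in for "association with the input text" and assumes whitespace-separated words; Chinese text would first need word segmentation.

```python
import random


def overlap_score(text, session):
    """Hypothetical association score: number of shared lowercase words."""
    return len(set(text.lower().split()) & set(session.lower().split()))


def choose_session(text, sessions, strategy="random", rng=random):
    """Pick one session from a matched rule.

    strategy="random" picks uniformly; strategy="best" picks the session
    with the highest overlap_score against the input text.
    """
    if not sessions:
        raise ValueError("the matched rule contains no sessions")
    if strategy == "random":
        return rng.choice(sessions)
    if strategy == "best":
        return max(sessions, key=lambda s: overlap_score(text, s))
    raise ValueError(f"unknown strategy: {strategy}")
```

Passing a seeded `random.Random` instance as `rng` makes the random strategy reproducible in tests.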
The output text can be presented to the user as text, or converted to speech before being output.
Retrieving and selecting the corresponding session based on keyword matching rules, as described above, improves session relevance and enhances adaptability to different session scenes.
On the basis of Fig. 1, the retrieval process of step S104 based on matching rules is described in detail with reference to Fig. 2, as follows:
In step S104, setting keyword matching rules makes it possible to quickly identify the topic the user is chatting about, and retrieving the corpus set then finds the corresponding session.
In S1041, keywords are set according to different scenes, persons, actions, moods and so on; keywords with the same meaning share the same matching rule. In each keyword matching rule, the sessions corresponding to the rule's keywords are defined.
In S1042, when the text input by the user contains a keyword, the matching rule corresponding to that keyword is triggered. For example, if the input text contains "daughter" or "girl", the rule numbered A093 is triggered. Specifically, the lookup is expressed with a regular expression, which may take the form: result = re.findall(r".*(girl|daughter)+", msg, re.M).
In S1043, once a keyword in the user's input text has been matched, the sessions corresponding to that keyword are obtained. A keyword matching rule generally contains multiple sessions, one of which is chosen to reply to the user.
Optionally, in the embodiments of the present invention, if no corresponding rule is matched for the user's input text in S1042, text similarity is calculated in S1044. Specifically, the similarity between the user's input text and the sessions in the corpus set is calculated according to the cosine similarity formula, i.e. formula (1); a predetermined number of sessions are obtained, and one of them is then selected as the reply.
Calculating the similarity of the input text ensures that the user still receives a reply from the chatbot when no keyword matching rule applies.
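The similarity fallback can be sketched in Python with plain lists standing in for doc2vec vectors. The function names and the `(session, vector)` pair layout are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_sessions(text_vector, corpus, k=3):
    """Return the k sessions most similar to the input vector.

    corpus is a list of (session_text, vector) pairs, i.e. the
    vectorized corpus set.
    """
    scored = [(cosine_similarity(text_vector, vec), session)
              for session, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [session for _, session in scored[:k]]
```

One of the returned sessions can then be chosen as the reply, for instance at random.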
Preferably, the keyword matching rules are adjusted periodically or automatically.
Specifically, the number of times each keyword matching rule is used is recorded, and the rules whose usage count reaches a preset standard are optimized by adjusting synonymous keywords or extending the session content and session expressions in the rule. For each rule, keyword selection can be optimized or extended according to frequency of use, for example by reducing synonyms. The sessions in a rule can also be added or removed; for example, a frequently used rule can be given more sessions, such as additional entertaining replies.
In addition, according to the frequency of use of each session in a keyword matching rule, the corresponding sessions in the rule are deleted or added. Each matching rule holds multiple sessions: rarely used sessions can be removed, and frequently used ones can be further optimized.
Through S1045 above, the keyword matching rules can be adjusted according to feedback data and statistics from actual chats, so that the chatbot is continually optimized, session relevance is improved and adaptability to context is enhanced, thereby safeguarding the user experience.
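The usage bookkeeping behind this adjustment might look like the following Python sketch. The `RuleStats` class and both thresholds are hypothetical; the patent only states that usage counts are recorded and compared against a preset standard.

```python
from collections import Counter


class RuleStats:
    """Counts how often each rule and each session within a rule is used."""

    def __init__(self):
        self.rule_hits = Counter()
        self.session_hits = Counter()

    def record(self, rule_id, session):
        """Call once every time a session from a rule is sent as a reply."""
        self.rule_hits[rule_id] += 1
        self.session_hits[(rule_id, session)] += 1

    def rules_to_optimize(self, threshold):
        """Rule numbers whose usage count has reached the preset threshold."""
        return sorted(r for r, n in self.rule_hits.items() if n >= threshold)

    def prune_sessions(self, rule_id, sessions, min_uses):
        """Keep only the sessions of a rule used at least min_uses times."""
        return [s for s in sessions
                if self.session_hits[(rule_id, s)] >= min_uses]
```

The rules returned by `rules_to_optimize` would then be reviewed, for example to extend their session lists, while `prune_sessions` drops replies that are rarely chosen.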
Embodiment two:
The session generation method has been described above; a session generation device is described in detail below.
Fig. 3 shows a schematic structural diagram of the session generation device provided by an embodiment of the present invention, comprising:
an obtaining module 410, configured to obtain corpora from a live-streaming room barrage corpus and a Chinese corpus to form a corpus set;
Optionally, the obtaining module 410 includes:
a cleanup unit, configured to clean up corpora in the live-streaming room barrage that do not meet a preset standard.
a training module 420, configured to train a doc2vec model on the corpus set to obtain a vectorized corpus set;
an input module 430, configured to, after obtaining text input by a user, represent the text as a vector by means of the doc2vec model;
a retrieval module 440, configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set the sessions associated with the keywords in the vectorized text;
an output module 450, configured to output the session.
Optionally, the retrieval module 440 includes:
a setting unit, configured to set, in each keyword matching rule, the session content corresponding to each keyword;
a searching unit, configured to look up, by a regular expression, the keyword matching rule corresponding to the keywords of the text, and obtain the sessions in that keyword matching rule.
Optionally, looking up the keyword matching rule corresponding to the keywords of the text by a regular expression and obtaining the sessions in the keyword matching rule further includes:
recording the number of times each keyword matching rule is used, and optimizing the keyword matching rules whose usage count reaches a preset standard, specifically by adjusting synonymous keywords or extending the session content and session expressions in the keyword matching rule.
Optionally, it further includes deleting or adding the corresponding sessions in a keyword matching rule according to the frequency of use of each session in that rule.
Optionally, the retrieval module 440 further includes:
a computing module, configured to, when no session associated with the keywords of the text is retrieved from the vectorized corpus set, calculate the relevance between the text and the sessions in the vectorized corpus set and obtain the sessions whose relevance reaches a preset standard.
Optionally, the computing module includes:
a computing unit, configured to calculate, by the cosine similarity formula (1), the relevance between the text and the sessions in the vectorized corpus set:
\cos(\vec{T}, \vec{C_i}) = \frac{\vec{T} \cdot \vec{C_i}}{\lVert \vec{T} \rVert \, \lVert \vec{C_i} \rVert} \qquad (1)
where i ∈ (1, n), \vec{T} denotes the vectorized representation of the user input text T, \vec{C_i} denotes the vectorized representation of the i-th corpus entry in the corpus set C, and n denotes the size of the corpus set C.
The above device can generate a corresponding session from the text input by the user, and improves the relevance of the session by setting matching rules.
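How the input, retrieval and output modules cooperate can be shown with a toy Python composition: keyword rules are tried first, and cosine similarity over the vectorized corpus is the fallback. The `SessionGenerator` class is a hypothetical sketch, and `toy_vectorize` is a deliberately trivial stand-in for the trained doc2vec model, which is not reproduced here.

```python
import math
import random
import re


def toy_vectorize(text):
    """Hypothetical stand-in for doc2vec: counts of the letters a and b."""
    return [float(text.count("a")), float(text.count("b"))]


class SessionGenerator:
    """Toy composition of the input, retrieval and output modules."""

    def __init__(self, vectorize, rules, corpus):
        # rules maps a keyword regex to the sessions of that rule;
        # corpus is a non-empty list of session strings, vectorized once.
        self.vectorize = vectorize
        self.rules = [(re.compile(p), s) for p, s in rules.items()]
        self.corpus = [(s, vectorize(s)) for s in corpus]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def reply(self, text, rng=random):
        # Retrieval by keyword matching rule.
        for pattern, sessions in self.rules:
            if pattern.search(text):
                return rng.choice(sessions)
        # Fallback: most similar session in the vectorized corpus.
        query = self.vectorize(text)
        best_session, _ = max(
            self.corpus, key=lambda item: self._cosine(query, item[1]))
        return best_session
```

In this sketch the fallback returns the single most similar session; the text above instead allows a predetermined number of candidates from which one is chosen.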
Embodiment three:
Fig. 4 is a schematic structural diagram of the terminal device for session generation provided by an embodiment of the present invention. The terminal device is a mobile computing device with a touch screen, including but not limited to a smartphone, smartwatch, notebook, tablet computer, POS machine or even a vehicle-mounted computer. As shown in Fig. 4, the terminal device 4 of this embodiment includes a memory 410, a processor 420 and a system bus 430, and the memory 410 includes an executable program 4101 stored thereon. Those skilled in the art will understand that the terminal device structure shown in Fig. 4 does not limit the terminal device, which may include more or fewer components than illustrated, combine certain components, or use a different component layout.
The components of the terminal device are described in detail below with reference to Fig. 4:
The memory 410 can be used to store software programs and modules. The processor 420 performs the various functional applications and data processing of the terminal by running the software programs and modules stored in the memory 410. The memory 410 may mainly include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area can store service data created through use of the terminal (such as live-streaming data and page display data). In addition, the memory 410 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device or another volatile solid-state storage device.
The memory 410 holds the executable program 4101 comprising the database query method. The executable program 4101 may be divided into one or more modules/units, which are stored in the memory 410 and executed by the processor 420 to generate a session from the user's input. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution of the computer program 4101 in the terminal device 4. For example, the computer program 4101 may be divided into an obtaining module, a training module, an input module, a retrieval module and an output module.
The processor 420 is the control center of the terminal device. It connects the parts of the entire terminal device through various interfaces and lines, and performs the various functions of the terminal and processes data by running or executing the software programs and/or modules stored in the memory 410 and calling the data stored in the memory 410, thereby monitoring the terminal as a whole. Optionally, the processor 420 may include one or more processing units. Preferably, the processor 420 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 420.
The system bus 430 connects the functional components inside the computer and can carry data information, address information and control information; it may be, for example, a PCI bus, an ISA bus or a VESA bus. Instructions from the processor 420 are transferred to the memory 410 over the bus, and the memory 410 returns data to the processor 420; the system bus 430 is responsible for the exchange of data and instructions between the processor 420 and the memory 410. The system bus 430 can of course also connect other devices, such as a network interface or a display device.
The terminal device may also include at least one sensor, such as an optical sensor, a motion sensor or other sensors; an input device, such as a touch screen, a keyboard or others; and an output device, such as a loudspeaker, a display or others. Optionally, in the embodiments of the present invention, the input device can be used to enter instructions, such as a lookup request, and the output device can be used to display the query result or present the queried data to the user. Other components are not described here.
In the embodiments of the present invention, the executable program run by the processor 420 of the terminal is specifically:
A session generation method, comprising:
obtaining corpora from a live-streaming room barrage corpus and a Chinese corpus to form a corpus set;
training a doc2vec model on the corpus set to obtain a vectorized corpus set;
after obtaining text input by a user, representing the text as a vector by means of the doc2vec model;
setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords in the vectorized text;
outputting the session.
Further, obtaining corpora from the live-streaming room barrage corpus and the Chinese corpus to form a corpus set also includes:
cleaning up corpora in the live-streaming room barrage that do not meet a preset standard.
Further, setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords of the text is specifically:
in each keyword matching rule, setting the session content corresponding to each keyword;
looking up, by a regular expression, the keyword matching rule corresponding to the keywords of the text, and obtaining the sessions in that keyword matching rule.
Further, looking up the keyword matching rule corresponding to the keywords of the text by a regular expression and obtaining the sessions in the keyword matching rule further includes:
recording the number of times each keyword matching rule is used, and optimizing the keyword matching rules whose usage count reaches a preset standard.
Further, looking up the keyword matching rule corresponding to the keywords of the text by a regular expression and obtaining the sessions in the keyword matching rule further includes:
deleting or adding the corresponding sessions in a keyword matching rule according to the frequency of use of each session in that rule.
Further, setting keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the sessions associated with the keywords of the text further includes:
when no session associated with the keywords of the text is retrieved from the vectorized corpus set, calculating the relevance between the text and the sessions in the vectorized corpus set, and obtaining the sessions whose relevance reaches a preset standard.
Further, calculating the relevance between the text and the sessions in the vectorized corpus set and obtaining the sessions whose relevance reaches a preset standard is specifically:
calculating, by the cosine similarity formula (1), the relevance between the text and the sessions in the vectorized corpus set:
\cos(\vec{T}, \vec{C_i}) = \frac{\vec{T} \cdot \vec{C_i}}{\lVert \vec{T} \rVert \, \lVert \vec{C_i} \rVert} \qquad (1)
where i ∈ (1, n), \vec{T} denotes the vectorized representation of the user input text T, \vec{C_i} denotes the vectorized representation of the i-th corpus entry in the corpus set C, and n denotes the size of the corpus set C.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the modules, units and/or method steps of the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.
Units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. a kind of session generation method characterized by comprising
Corpus in direct broadcasting room barrage corpus and Chinese corpus is obtained, corpus set is formed;
By doc2vec model to corpus set training, the corpus set of vectorization expression is obtained;
After the text for getting user's input, vector expression is carried out to the text by the doc2vec model;
Set Keywords matching rule, according to the matching rule, retrieve in the corpus set that the vectorization indicates with vector Change the associated session of keyword in the text indicated;
Export the session.
2. the method according to claim 1, wherein in the acquisition direct broadcasting room barrage corpus and Chinese corpus Corpus forms corpus set further include:
Clear up the corpus that preassigned is not met in the direct broadcasting room barrage.
3. The method according to claim 1, characterized in that setting the keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the session associated with the keyword in the text specifically comprises:
setting, in each keyword matching rule, the session content corresponding to each keyword;
looking up, by regular expression, the keyword matching rule corresponding to the keyword in the text, and obtaining the session in that keyword matching rule.
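A minimal sketch of the regular-expression lookup in claim 3 is given below. The rule table `KEYWORD_RULES`, its two sample patterns and replies, and the helper names `match_rule` and `sessions_for` are hypothetical; the claim specifies only that each rule associates keywords with session content and that the rule is found by regular expression.

```python
import re

# Hypothetical rules: each maps a keyword pattern to its session content.
KEYWORD_RULES = [
    {"pattern": re.compile(r"(你好|hello|hi)", re.IGNORECASE),
     "sessions": ["你好呀，欢迎来到直播间！"]},
    {"pattern": re.compile(r"(几点|什么时候).*(开播|直播)"),
     "sessions": ["主播一般晚上八点开播。"]},
]


def match_rule(text: str):
    """Return the first rule whose pattern is found in the text, else None."""
    for rule in KEYWORD_RULES:
        if rule["pattern"].search(text):
            return rule
    return None


def sessions_for(text: str):
    """Return the sessions stored in the matching rule, or an empty list."""
    rule = match_rule(text)
    return list(rule["sessions"]) if rule else []
```

Returning an empty list when no rule matches leaves room for a separate fallback step, such as the relevance calculation of claim 6.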
4. The method according to claim 3, characterized in that looking up, by regular expression, the keyword matching rule corresponding to the keyword in the text and obtaining the session in that keyword matching rule further comprises:
recording the number of times each keyword matching rule is accessed and, for a keyword matching rule whose access count reaches a preset standard, adjusting its synonymous keywords or extending the session content and expression forms in that keyword matching rule.
5. The method according to claim 3, characterized in that looking up, by regular expression, the keyword matching rule corresponding to the keyword in the text and obtaining the session in that keyword matching rule further comprises:
deleting or adding corresponding sessions in the keyword matching rule according to the usage frequency of each session in the keyword matching rule.
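The frequency-based maintenance of claim 5 could be sketched as below. The class `RuleSessions`, its methods, and the `min_uses` cut-off are assumptions for illustration; the claim does not state how usage is counted or which frequency triggers a deletion or an addition.

```python
from collections import Counter


class RuleSessions:
    """Sessions of one keyword matching rule, with per-session usage counts."""

    def __init__(self, sessions):
        self.sessions = list(sessions)
        self.usage = Counter()

    def use(self, session: str) -> str:
        """Record that a session was returned to a user."""
        self.usage[session] += 1
        return session

    def prune(self, min_uses: int):
        """Delete sessions used fewer than min_uses times; return the removed ones."""
        removed = [s for s in self.sessions if self.usage[s] < min_uses]
        self.sessions = [s for s in self.sessions if self.usage[s] >= min_uses]
        return removed

    def add(self, session: str):
        """Add a new session to the rule if it is not already present."""
        if session not in self.sessions:
            self.sessions.append(session)
```

A similar counter kept per rule, rather than per session, would be one way to record the access counts mentioned in claim 4.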
6. The method according to claim 1, characterized in that setting the keyword matching rules and, according to the matching rules, retrieving from the vectorized corpus set the session associated with the keyword in the text further comprises:
when no session associated with the keyword of the text is retrieved from the vectorized corpus set, calculating the relevance between the sessions in the vectorized corpus set and the text, and obtaining a session whose relevance reaches a preset standard.
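The control flow of claim 6, keyword matching first and relevance retrieval only when nothing is matched, might look like the following sketch. The function `generate_session` and its two callable parameters are hypothetical names; how each lookup is implemented is left open here.

```python
from typing import Callable, List


def generate_session(
    text: str,
    keyword_lookup: Callable[[str], List[str]],
    relevance_lookup: Callable[[str], List[str]],
) -> List[str]:
    """Try keyword matching first; fall back to relevance retrieval if it finds nothing."""
    sessions = keyword_lookup(text)
    if sessions:
        return sessions
    return relevance_lookup(text)
```

Passing the two lookups in as functions keeps the fallback order separate from the matching and scoring details, so either part can be replaced independently.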
7. The method according to claim 6, characterized in that calculating the relevance between the sessions in the vectorized corpus set and the text and obtaining the session whose relevance reaches the preset standard specifically comprises:
calculating the relevance between the text and the sessions in the vectorized corpus set by the cosine similarity formula (1):
$$\mathrm{sim}(\vec{T}, \vec{C_i}) = \frac{\vec{T} \cdot \vec{C_i}}{\lVert \vec{T} \rVert \, \lVert \vec{C_i} \rVert} \tag{1}$$
where $i \in (1, n)$, $\vec{T}$ denotes the vectorized representation of the text $T$ input by the user, $\vec{C_i}$ denotes the vectorized representation of the $i$-th corpus entry in the corpus set $C$, and $n$ denotes the size of the corpus set $C$.
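For illustration, the cosine-similarity relevance of claim 7 can be computed as in the sketch below. Plain Python lists stand in for the doc2vec vectors, and the threshold of 0.8 is an assumed value for the preset standard, which the claims do not specify; the function names are likewise hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_by_relevance(text_vec, corpus, threshold=0.8):
    """Return (session, score) pairs scoring at least threshold, best first.

    corpus is a list of (session, vector) pairs; threshold is an assumed
    stand-in for the preset standard named in the claim.
    """
    scored = [(session, cosine_similarity(text_vec, vec)) for session, vec in corpus]
    hits = [(s, score) for s, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Because cosine similarity depends only on the angle between vectors, two texts of very different length can still score close to 1 when their vectors point in the same direction.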
8. A session generation device, characterized by comprising:
an obtaining module, configured to obtain corpora from a live-room barrage (bullet-comment) corpus and a Chinese corpus to form a corpus set;
a training module, configured to train on the corpus set with a doc2vec model to obtain a vectorized corpus set;
an input module, configured to, after obtaining text input by a user, produce a vector representation of the text with the doc2vec model;
a retrieval module, configured to set keyword matching rules and, according to the matching rules, retrieve from the vectorized corpus set a session associated with a keyword in the vectorized text;
an output module, configured to output the session.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the session generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the session generation method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811219414.8A CN109299242A (en) 2018-10-19 2018-10-19 A kind of session generation method, device, terminal device and storage medium

Publications (1)

Publication Number Publication Date
CN109299242A true CN109299242A (en) 2019-02-01

Family

ID=65157191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811219414.8A Pending CN109299242A (en) 2018-10-19 2018-10-19 A kind of session generation method, device, terminal device and storage medium

Country Status (1)

Country Link
CN (1) CN109299242A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130110864A1 (en) * 2011-10-27 2013-05-02 Cbs Interactive, Inc. Generating an electronic message during a browsing session
CN105701088A (en) * 2016-02-26 2016-06-22 北京京东尚科信息技术有限公司 Method and device for switching machine conversation to artificial conversation
CN107515944A (en) * 2017-08-31 2017-12-26 广东美的制冷设备有限公司 Exchange method, user terminal and storage medium based on artificial intelligence
CN108491433A (en) * 2018-02-09 2018-09-04 平安科技(深圳)有限公司 Chat answer method, electronic device and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114124860A (en) * 2021-11-26 2022-03-01 中国联合网络通信集团有限公司 Session management method, device, equipment and storage medium
CN118312601A (en) * 2024-06-05 2024-07-09 广东君略科技咨询有限公司 Intelligent AI conversation method and device based on AI natural language understanding
CN118312601B (en) * 2024-06-05 2024-08-09 广东君略科技咨询有限公司 Intelligent AI conversation method and device based on AI natural language understanding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190201
