CN110263134B - Intelligent emotion question-answering method and device and computer readable storage medium - Google Patents

Intelligent emotion question-answering method and device and computer readable storage medium Download PDF

Info

Publication number
CN110263134B
CN110263134B (application CN201910386282.6A)
Authority
CN
China
Prior art keywords
question
answer
word
data set
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910386282.6A
Other languages
Chinese (zh)
Other versions
CN110263134A (en)
Inventor
侯丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910386282.6A priority Critical patent/CN110263134B/en
Priority to PCT/CN2019/102194 priority patent/WO2020224099A1/en
Publication of CN110263134A publication Critical patent/CN110263134A/en
Application granted granted Critical
Publication of CN110263134B publication Critical patent/CN110263134B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to artificial intelligence technology and discloses an intelligent emotion question-answering method comprising the following steps: receiving a question-answer data set; labeling the question-answer data set with emotion attributes to obtain an emotion attribute labeling set; preprocessing and word-vectorizing the question-answer data set to obtain a question-answer word vector set; inputting the emotion attribute labeling set and the question-answer word vector set into a convolutional neural network model for training, the convolutional neural network then exiting training and passing the question-answer word vector set to a recurrent neural network, which trains until it meets a preset threshold requirement; and receiving a user question, inputting the user question into the convolutional neural network to judge its emotion attribute, and outputting an answer to the user question based on the convolutional neural network. The invention also provides an intelligent emotion question-answering device and a computer-readable storage medium. The invention can realize an accurate and intelligent emotion question-and-answer function.

Description

Intelligent emotion question-answering method and device and computer readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence, and in particular to an intelligent emotion question-answering method, apparatus and computer-readable storage medium that intelligently give an answer after receiving a user question.
Background
At present, most research on question-answering systems focuses on whether the grammar and semantics of the generated sentences are reasonable, and answers are mostly generated from context or topic, so the emotion of the interlocutor is rarely considered. For example, for the user input "I did not pass yesterday's exam", most question-answering systems give a flat, matter-of-fact reply; for the user input "My pet dog has just died", a typical reply is "Pet dogs are particularly prone to death." In real life, however, if the other party expresses a happy emotion through language, the reply should generally also carry a positive emotion, and if the other party expresses a sad emotion, the reply should offer comfort. Replies with emotion therefore tend to be more popular with users.
Disclosure of Invention
The invention provides an intelligent emotion question-answering method, an intelligent emotion question-answering device and a computer-readable storage medium, whose main purpose is to present answers with an emotional tendency to the user when the user inputs a question.
In order to achieve the above purpose, the invention provides an intelligent emotion question-answering method, which comprises the following steps:
acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set;
preprocessing the question-answer data set, namely performing Word segmentation and keyword extraction, and performing Word vectorization on the question-answer data set completed by the preprocessing operation according to a Word2Vec algorithm to obtain a question-answer Word vector set, wherein the question-answer Word vector set comprises a question Word vector set and an answer Word vector set;
inputting the emotion attribute labeling set into a loss function, inputting the question word vector set into a convolutional neural network model, training the convolutional neural network model to obtain a training value, inputting the training value into the loss function, calculating a loss value from the emotion attribute labeling set and the training value by means of the loss function, and comparing the loss value with a preset threshold, the convolutional neural network exiting training when the loss value is smaller than the preset threshold;
after the convolutional neural network exits training, inputting the question word vector set into a recurrent neural network, and prompting the recurrent neural network to accept the answer word vector set for training until the recurrent neural network meets the preset threshold requirement;
and receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question according to the emotion attribute category by the convolutional neural network.
Optionally, obtaining the question data set and the multiple answer data sets corresponding to the question data set from the internet through a web crawler technology includes:
crawling questions asked in a text form from a URL page according to the web crawler technology, and forming the questions asked in the text form into a question data set;
and traversing the questions in the question data set, crawling a plurality of answers corresponding to the questions from a URL page by using the web crawler technology until the traversing of the question data set is finished, and obtaining a plurality of answer data sets corresponding to the question data set.
Optionally, the word segmentation establishes a word segmentation probability model P(S) from the question-answer data set and maximizes it to complete the word segmentation operation, wherein the word segmentation probability model P(S) is:

P(S) = P(W_1, W_2, …, W_m) = ∏_(i=1..m) P(W_i | W_1, …, W_(i−1))

wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of entries in the question-answer data set;

the keyword extraction comprises constructing a correlation between words and extracting keywords based on the correlation, wherein the correlation is:

f(W_i, W_j) = tfidf(W_i) · tfidf(W_j) / d

wherein f(W_i, W_j) is the correlation of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency (TF-IDF) value of word W_i, tfidf(W_j) is the TF-IDF value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j.
Optionally, the Word2Vec algorithm is a CBOW model;
the CBOW model comprises an input layer, a projection layer and an output layer;
the projection layer ζ(ω, j) is:

ζ(ω, j) = [σ(X_ω^T · θ_(j−1))]^(1−d_j^ω) · [1 − σ(X_ω^T · θ_(j−1))]^(d_j^ω)

wherein d_j^ω denotes the Huffman code corresponding to the j-th node on the path ω, θ is an iteration factor of the CBOW model, σ denotes the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
Optionally, the recurrent neural network is a long short-term memory (LSTM) network;
the long short-term memory network comprises a forget gate, an input gate and an output gate;
the forget gate is:

f_t = σ(w_t · [h_(t−1), x_t] + b_t)

wherein f_t is the output data of the forget gate, x_t is the input data of the forget gate at the current time t of the question-answer word vector set, t−1 is the time immediately preceding t, h_(t−1) is the output data of the output gate at time t−1, w_t is the weight at the current time, b_t is the bias at the current time, [h_(t−1), x_t] denotes the concatenation of the previous output and the current input (which is multiplied by the weight matrix w_t), and σ denotes the sigmoid function.
In addition, in order to achieve the above object, the present invention also provides an intelligent emotion question-answering device, which includes a memory and a processor, wherein the memory stores an intelligent emotion question-answering program that can run on the processor, and the intelligent emotion question-answering program when executed by the processor implements the following steps:
acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set;
preprocessing the question-answer data set, namely performing Word segmentation and keyword extraction, and performing Word vectorization on the question-answer data set completed by the preprocessing operation according to a Word2Vec algorithm to obtain a question-answer Word vector set, wherein the question-answer Word vector set comprises a question Word vector set and an answer Word vector set;
inputting the emotion attribute labeling set into a loss function, inputting the question word vector set into a convolutional neural network model, training the convolutional neural network model to obtain a training value, inputting the training value into the loss function, calculating a loss value from the emotion attribute labeling set and the training value by means of the loss function, and comparing the loss value with a preset threshold, the convolutional neural network exiting training when the loss value is smaller than the preset threshold;
after the convolutional neural network exits training, inputting the question word vector set into a recurrent neural network, and prompting the recurrent neural network to accept the answer word vector set for training until the recurrent neural network meets the preset threshold requirement;
and receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question according to the emotion attribute category by the convolutional neural network.
Optionally, obtaining the question data set and the multiple answer data sets corresponding to the question data set from the internet through a web crawler technology includes:
crawling questions asked in a text form from a URL page according to the web crawler technology, and forming the questions asked in the text form into a question data set;
and traversing the questions in the question data set, crawling a plurality of answers corresponding to the questions from a URL page by using the web crawler technology until the traversing of the question data set is finished, and obtaining a plurality of answer data sets corresponding to the question data set.
Optionally, the word segmentation establishes a word segmentation probability model P(S) from the question-answer data set and maximizes it to complete the word segmentation operation, wherein the word segmentation probability model P(S) is:

P(S) = P(W_1, W_2, …, W_m) = ∏_(i=1..m) P(W_i | W_1, …, W_(i−1))

wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of entries in the question-answer data set;

the keyword extraction comprises constructing a correlation between words and extracting keywords based on the correlation, wherein the correlation is:

f(W_i, W_j) = tfidf(W_i) · tfidf(W_j) / d

wherein f(W_i, W_j) is the correlation of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency (TF-IDF) value of word W_i, tfidf(W_j) is the TF-IDF value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j.
Optionally, the Word2Vec algorithm is a CBOW model;
the CBOW model comprises an input layer, a projection layer and an output layer;
the projection layer ζ(ω, j) is:

ζ(ω, j) = [σ(X_ω^T · θ_(j−1))]^(1−d_j^ω) · [1 − σ(X_ω^T · θ_(j−1))]^(d_j^ω)

wherein d_j^ω denotes the Huffman code corresponding to the j-th node on the path ω, θ is an iteration factor of the CBOW model, σ denotes the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
Optionally, the recurrent neural network is a long short-term memory (LSTM) network;
the long short-term memory network comprises a forget gate, an input gate and an output gate;
the forget gate is:

f_t = σ(w_t · [h_(t−1), x_t] + b_t)

wherein f_t is the output data of the forget gate, x_t is the input data of the forget gate at the current time t of the question-answer word vector set, t−1 is the time immediately preceding t, h_(t−1) is the output data of the output gate at time t−1, w_t is the weight at the current time, b_t is the bias at the current time, [h_(t−1), x_t] denotes the concatenation of the previous output and the current input (which is multiplied by the weight matrix w_t), and σ denotes the sigmoid function.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon an intelligent emotion question-and-answer program executable by one or more processors to implement the steps of the intelligent emotion question-and-answer method as described above.
The multi-layer structure of the convolutional neural network can automatically extract deep features of the data and learn features at different levels, which greatly improves the accuracy of text processing, while the recurrent neural network can efficiently model the temporal order of the data. The intelligent emotion question-answering method, the intelligent emotion question-answering device and the computer-readable storage medium of the invention can therefore realize an accurate and intelligent emotion question-answering function.
Drawings
FIG. 1 is a schematic flow chart of an intelligent emotion question and answer method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating an internal structure of an intelligent emotion question-answering device according to an embodiment of the present invention;
fig. 3 is a schematic block diagram of an intelligent emotion question-answering program in the intelligent emotion question-answering device according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The invention provides an intelligent emotion question-answering method. Referring to fig. 1, a schematic flow chart of an intelligent emotion question-answering method according to an embodiment of the present invention is shown. The method may be performed by an apparatus, which may be implemented in software and/or hardware.
In this embodiment, the intelligent emotion question-answering method includes:
s1, acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set.
In a preferred embodiment of the invention, questions asked in text form are crawled from URL pages using Web crawler technology, and these questions are formed into a question data set. The questions in the question data set are then traversed and, for each question, a plurality of different answers are crawled from URL pages using the web crawler technology until the traversal of the question data set is finished, thereby obtaining a plurality of answer data sets corresponding to the question data set;
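The crawling and traversal described above can be sketched as follows. The HTML structure, the CSS class names ("question", "answer"), and the static page snippet are purely illustrative assumptions; the patent does not specify a particular site, page format, or parser, and a real crawler would fetch the URL pages over the network.

```python
from html.parser import HTMLParser

# Minimal sketch of the crawling step: pull questions and their multiple
# answers out of a page. The div/class layout is an illustrative assumption.
class QAPageParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.questions, self.answers = [], []
        self._target = None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "div" and cls == "question":
            self._target = self.questions
        elif tag == "div" and cls == "answer":
            self._target = self.answers

    def handle_data(self, data):
        if self._target is not None and data.strip():
            self._target.append(data.strip())
            self._target = None

def build_qa_dataset(pages):
    """Traverse crawled pages and pair each question with its answers."""
    qa = []
    for html in pages:
        p = QAPageParser()
        p.feed(html)
        for q in p.questions:
            qa.append((q, list(p.answers)))  # one question, multiple answers
    return qa

page = """<div class="question">I did not pass the exam</div>
<div class="answer">Next time will be better</div>
<div class="answer">Keep working at it</div>"""
dataset = build_qa_dataset([page])
```

A production crawler would add URL fetching, deduplication and politeness delays; the sketch only shows how each question ends up paired with its plurality of answers.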
in a preferred embodiment of the present invention, according to the question data set, the answers in the multiple answer data sets are labeled with emotion attributes, so as to obtain the emotion attribute labeling set, wherein the emotion attributes include humor, sadness, advice and the like.
S2, preprocessing operation comprising Word segmentation and keyword extraction is carried out on the question-answer data set, word vectorization operation is carried out on the question-answer data set completed by the preprocessing operation according to a Word2Vec algorithm, and a question-answer Word vector set is obtained, wherein the question-answer Word vector set comprises a question Word vector set and an answer Word vector set.
In a preferred embodiment of the present invention, the word segmentation establishes a word segmentation probability model P(S) from the question-answer data set and maximizes it to complete the word segmentation operation, where the word segmentation probability model P(S) is:

P(S) = P(W_1, W_2, …, W_m) = ∏_(i=1..m) P(W_i | W_1, …, W_(i−1))

wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of entries in the question-answer data set;

the keyword extraction comprises constructing a correlation between words and extracting keywords based on the correlation, wherein the correlation is:

f(W_i, W_j) = tfidf(W_i) · tfidf(W_j) / d

wherein f(W_i, W_j) is the correlation of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency (TF-IDF) value of word W_i, tfidf(W_j) is the TF-IDF value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j;
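A minimal sketch of this keyword-scoring step, assuming the correlation combines the two TF-IDF values with the Euclidean distance d of the word vectors as tfidf(W_i)·tfidf(W_j)/d (the exact combination is not legible in the patent text) and using toy documents and word vectors:

```python
import math
from collections import Counter

# Sketch of the preprocessing step: per-word TF-IDF and a correlation
# f(Wi, Wj) = tfidf(Wi) * tfidf(Wj) / d. The combination rule and the smoothed
# idf are assumptions; the patent only names the ingredients.
def tfidf(word, doc, docs):
    tf = Counter(doc)[word] / len(doc)
    df = sum(1 for d in docs if word in d)          # document frequency
    idf = math.log(len(docs) / (1 + df)) + 1.0      # smoothed inverse frequency
    return tf * idf

def correlation(wi, wj, doc, docs, vectors):
    d = math.dist(vectors[wi], vectors[wj])          # Euclidean word-vector distance
    return tfidf(wi, doc, docs) * tfidf(wj, doc, docs) / max(d, 1e-9)

docs = [["exam", "failed", "sad"], ["dog", "died", "sad"]]
vectors = {"exam": (1.0, 0.0), "failed": (0.9, 0.1), "sad": (0.0, 1.0)}
score = correlation("exam", "failed", docs[0], docs, vectors)
```

Words whose pairwise correlation scores are highest would then be kept as keywords.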
in the preferred embodiment of the present invention, the Word2Vec algorithm is a CBOW model, where the CBOW model includes an input layer, a projection layer, and an output layer, and the projection layer ζ (ω, j) is:
ζ(ω, j) = [σ(X_ω^T · θ_(j−1))]^(1−d_j^ω) · [1 − σ(X_ω^T · θ_(j−1))]^(d_j^ω)

wherein d_j^ω denotes the Huffman code corresponding to the j-th node on the path ω, θ is an iteration factor of the CBOW model, σ denotes the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
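Assuming the projection-layer term follows the standard word2vec hierarchical-softmax formulation that the notation (Huffman codes d_j^ω, sigmoid σ, context vector X_ω) suggests, it can be sketched as:

```python
import math

# Sketch of the CBOW hierarchical-softmax term zeta(omega, j): each node j on
# the Huffman path omega contributes sigma(x . theta)^(1 - d_j) * (1 - sigma)^(d_j),
# where d_j is that node's Huffman code (0 or 1). The values below are toy data.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def zeta(x_omega, theta_j, d_j):
    s = sigmoid(sum(a * b for a, b in zip(x_omega, theta_j)))
    return s ** (1 - d_j) * (1.0 - s) ** d_j

def path_probability(x_omega, path):
    """Probability of the target word: product of zeta over its Huffman path."""
    p = 1.0
    for theta_j, d_j in path:
        p *= zeta(x_omega, theta_j, d_j)
    return p

x = [0.2, -0.1, 0.4]                                # averaged context vector X_omega
path = [([0.1, 0.3, -0.2], 0), ([0.0, 0.5, 0.1], 1)]  # (node parameters, Huffman code)
prob = path_probability(x, path)
```

Training would update the node parameters θ by gradient ascent on the log of this probability; the sketch only evaluates it.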
S3, inputting the emotion attribute labeling set into a loss function, inputting the question word vector set into a convolutional neural network model, training the convolutional neural network model to obtain a training value, inputting the training value into the loss function, calculating a loss value from the emotion attribute labeling set and the training value by means of the loss function, and comparing the loss value with a preset threshold, the convolutional neural network exiting training when the loss value is smaller than the preset threshold.
In a preferred embodiment of the present invention, the convolutional neural network comprises a convolutional layer, a pooling layer, a Flatten layer, a Dropout layer and a fully connected layer. Since the question word vector set takes the form of one-dimensional vectors along the time dimension, the filters of the convolutional layer and the pooling layer are also one-dimensional; to prevent overfitting, Dropout layers are added to the convolutional and pooling layers. After multiple convolution and pooling operations, the data are flattened by the Flatten layer, and the training value is finally output through the fully connected (Dense) layer.
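The one-dimensional convolution and pooling path described above can be sketched numerically as follows. The filter weights and sizes are illustrative, Dropout is omitted because it acts only during training, and a real implementation would use a deep-learning framework:

```python
# Sketch of the 1-D convolution -> max-pooling -> flatten path over a
# one-dimensional "time axis" of word scores. Kernel values are illustrative.
def conv1d(seq, kernel):
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool1d(seq, size):
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]

word_scores = [0.1, 0.8, 0.3, 0.9, 0.2, 0.7]    # 1-D input along the time dimension
features = conv1d(word_scores, [0.5, 0.5])       # 1-D filter of width 2
pooled = max_pool1d(features, 2)                 # downsample by max-pooling
flat = list(pooled)                              # Flatten is trivial for 1-D data
```

Stacking several such convolution/pooling stages and ending with a dense layer reproduces the structure the paragraph describes.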
In the preferred embodiment of the present invention, the loss value E is:

E = (1/2m) · Σ_(j=1..m) (x_j − μ_j)²

wherein x_j is the training value, μ_j is the emotion attribute labeling set, m is the number of question-answer data sets, and the preset threshold is typically set to 0.01.
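A sketch of the training-exit test, assuming a squared-error form for E consistent with the quantities named in the text (training values x, labels μ_j, count m); the exact formula is not legible in the source:

```python
# Sketch of the exit criterion: compute a loss between the training values and
# the emotion labels and stop training once it drops below the preset threshold.
# The squared-error form of the loss is an assumption.
def loss(training_values, labels):
    m = len(training_values)
    return sum((x - mu) ** 2 for x, mu in zip(training_values, labels)) / (2 * m)

def should_exit_training(training_values, labels, threshold=0.01):
    return loss(training_values, labels) < threshold

early = should_exit_training([0.9, 0.1], [0.0, 1.0])    # large error: keep training
late = should_exit_training([0.99, 0.01], [1.0, 0.0])   # small error: exit training
```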
S4, after the convolutional neural network exits training, the convolutional neural network inputs the question word vector set into the recurrent neural network and prompts the recurrent neural network to accept the answer word vector set for training; the recurrent neural network exits training once it meets the preset threshold requirement.
In a preferred embodiment of the present invention, the recurrent neural network is a long short-term memory (LSTM) network, and the long short-term memory network comprises a forget gate, an input gate and an output gate, wherein the forget gate is:

f_t = σ(w_t · [h_(t−1), x_t] + b_t)

wherein f_t is the output data of the forget gate, x_t is the input data of the forget gate at the current time t of the question-answer word vector set, t−1 is the time immediately preceding t, h_(t−1) is the output data of the output gate at time t−1, w_t is the weight at the current time, b_t is the bias at the current time, [h_(t−1), x_t] denotes the concatenation of the previous output and the current input (which is multiplied by the weight matrix w_t), and σ denotes the sigmoid function.
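The forget-gate computation can be checked numerically with a small sketch; the weight and bias values are illustrative:

```python
import math

# Numeric sketch of the forget gate f_t = sigma(w_t . [h_{t-1}, x_t] + b_t):
# concatenate the previous output with the current input, take the weighted
# sum plus bias, and squash through a sigmoid to get a value in (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(h_prev, x_t, w_t, b_t):
    concat = h_prev + x_t                        # [h_{t-1}, x_t]
    return sigmoid(sum(w * v for w, v in zip(w_t, concat)) + b_t)

f_t = forget_gate(h_prev=[0.5], x_t=[1.0, -1.0],
                  w_t=[0.2, 0.4, 0.1], b_t=0.0)
```

The output f_t gates how much of the previous cell state the LSTM retains; values near 1 keep it, values near 0 forget it.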
S5, receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question according to the emotion attribute category by the convolutional neural network.
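The flow of step S5 can be sketched end to end with placeholder models standing in for the trained networks; the tokenizer, emotion categories, keyword lists and canned answers below are illustrative assumptions, not the patent's trained models:

```python
# End-to-end sketch of S5: preprocess the user question, classify its emotion
# attribute, then select an answer for that attribute. Both "models" are
# placeholder lookups standing in for the trained CNN and RNN.
def preprocess(question):
    return question.lower().split()              # stand-in for word segmentation

def classify_emotion(tokens):
    sad_words = {"failed", "died", "sad"}        # stand-in for the trained CNN
    return "comfort" if sad_words & set(tokens) else "humor"

ANSWERS = {                                      # stand-in for the trained RNN
    "comfort": "I am sorry to hear that. Things will get better.",
    "humor": "Glad to hear it! Keep it up.",
}

def answer(question):
    return ANSWERS[classify_emotion(preprocess(question))]

reply = answer("I failed yesterday's exam")
```

The point of the sketch is the control flow: the emotion attribute predicted from the question steers which kind of answer is produced.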
The invention also provides an intelligent emotion question-answering device. Referring to fig. 2, a schematic diagram of an internal structure of an intelligent emotion question-answering device according to an embodiment of the present invention is shown.
In this embodiment, the intelligent emotion question and answer device 1 may be a PC (personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer, or may be a server. The intelligent emotion question and answer device 1 at least comprises a memory 11, a processor 12, a communication bus 13 and a network interface 14.
The memory 11 includes at least one type of readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. Memory 11 may be an internal storage unit of intelligent emotion questioning and answering device 1 in some embodiments, such as a hard disk of intelligent emotion questioning and answering device 1. The memory 11 may also be an external storage device of the intelligent emotion question and answer apparatus 1 in other embodiments, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like provided on the intelligent emotion question and answer apparatus 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the intelligent emotion question and answer apparatus 1. The memory 11 may be used not only for storing application software installed in the intelligent emotion question and answer device 1 and various data such as a code of the intelligent emotion question and answer program 01, but also for temporarily storing data that has been output or is to be output.
Processor 12 may in some embodiments be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor or other data processing chip for executing program code or processing data stored in memory 11, such as for executing intelligent emotion question and answer program 01, etc.
The communication bus 13 is used to enable connection communication between these components.
The network interface 14 may optionally comprise a standard wired interface, a wireless interface (e.g. WI-FI interface), typically used to establish a communication connection between the apparatus 1 and other electronic devices.
Optionally, the device 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a Keyboard (Keyboard), and a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or a display unit, as appropriate, for displaying information processed in the intelligent emotion question and answer device 1 and for displaying a visual user interface.
Fig. 2 shows only an intelligent emotion question and answer device 1 having components 11-14 and an intelligent emotion question and answer program 01; those skilled in the art will appreciate that the structure shown in fig. 2 does not constitute a limitation of the intelligent emotion question and answer device 1, which may include fewer or more components than shown, combine some components, or arrange the components differently.
In the embodiment of the device 1 shown in fig. 2, the memory 11 stores an intelligent emotion question and answer program 01; processor 12 implements the following steps when executing the intelligent emotion question and answer program 01 stored in memory 11:
step one, acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set.
In a preferred embodiment of the invention, questions asked in text form are crawled from URL pages using Web crawler technology, and these questions are formed into a question data set. The questions in the question data set are then traversed and, for each question, a plurality of different answers are crawled from URL pages using the web crawler technology until the traversal of the question data set is finished, thereby obtaining a plurality of answer data sets corresponding to the question data set;
in a preferred embodiment of the present invention, according to the question data set, the answers in the multiple answer data sets are labeled with emotion attributes, so as to obtain the emotion attribute labeling set, wherein the emotion attributes include humor, sadness, advice and the like.
Step two, carrying out a preprocessing operation comprising Word segmentation and keyword extraction on the question-answer data set, and carrying out a Word vectorization operation on the preprocessed question-answer data set according to the Word2Vec algorithm to obtain a question-answer word vector set, wherein the question-answer word vector set comprises a question word vector set and an answer word vector set.
In a preferred embodiment of the present invention, the word segmentation operation establishes a word segmentation probability model P(S) according to the question-answer data set, and maximizes the word segmentation probability model to complete the word segmentation operation, where the word segmentation probability model P(S) is:
P(S) = P(W_1, W_2, …, W_m) = ∏_{i=1}^{m} P(W_i)
wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of data entries in the question-answer data set;
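The patent does not state how the segmentation probability is maximized; a Viterbi-style dynamic program over a unigram lexicon is one standard choice and is sketched below under that assumption (the lexicon `probs` and the function name are illustrative). Maximizing the sum of log-probabilities maximizes the product P(S):

```python
import math

def segment(text, probs):
    # probs: unigram probabilities P(W) for words in the lexicon.
    # best[i] holds (log P(S), segmentation) for the best split of text[:i].
    maxlen = max(len(w) for w in probs)
    best = [(0.0, [])] + [(-math.inf, None)] * len(text)
    for i in range(1, len(text) + 1):
        for j in range(max(0, i - maxlen), i):
            word = text[j:i]
            if word in probs and best[j][0] > -math.inf:
                score = best[j][0] + math.log(probs[word])
                if score > best[i][0]:
                    best[i] = (score, best[j][1] + [word])
    return best[len(text)][1]
```

For example, with `probs = {"emotion": 0.2, "question": 0.3, "emo": 0.1, "tion": 0.1}`, the segmentation of `"emotionquestion"` that maximizes P(S) keeps the two long words rather than splitting them.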
the keyword extraction comprises the steps of constructing the relevancy of the words and extracting keywords based on the relevancy, wherein the relevancy is as follows:
rel(W_i, W_j) = f(W_i, W_j) · tfidf(W_i) · tfidf(W_j) / d
wherein f(W_i, W_j) is the co-occurrence frequency of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency value of word W_i, tfidf(W_j) is the term frequency–inverse document frequency value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j;
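The original relevancy formula survives only as an image reference, so the sketch below implements one plausible reading consistent with the variable definitions: the co-occurrence frequency times the two tf-idf values, divided by the word-vector distance. The function names, the adjacency-based co-occurrence count, and the toy word vectors are all assumptions:

```python
import math

def tfidf(word, doc, docs):
    # Term frequency in this document times inverse document frequency.
    tf = doc.count(word) / len(doc)
    idf = math.log(len(docs) / sum(1 for d in docs if word in d))
    return tf * idf

def relevancy(wi, wj, doc, docs, vectors):
    # Co-occurrence frequency f: adjacent appearances of wi and wj in doc.
    f = sum(1 for k in range(len(doc) - 1) if {doc[k], doc[k + 1]} == {wi, wj})
    d = math.dist(vectors[wi], vectors[wj])  # Euclidean word-vector distance
    return f * tfidf(wi, doc, docs) * tfidf(wj, doc, docs) / d
```

Words whose tf-idf scores are high and whose vectors lie close together receive a high relevancy, which is then used to rank keyword candidates.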
in the preferred embodiment of the present invention, the Word2Vec algorithm is a CBOW model, where the CBOW model includes an input layer, a projection layer, and an output layer, and the projection layer ζ (ω, j) is:
ζ(ω, j) = σ(X_ω^T θ)^{1−d_j^ω} · (1 − σ(X_ω^T θ))^{d_j^ω}
wherein d_j^ω represents the Huffman code corresponding to the j-th node in the path ω, θ is an iteration factor of the CBOW model, σ represents the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
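Assuming ζ(ω, j) takes the standard hierarchical-softmax form used by word2vec, each Huffman node contributes one sigmoid factor, and the product over the path gives the word's probability. The sketch below shows that computation in plain Python; the vector values and function names are illustrative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def node_term(x_w, theta, d_j):
    # One factor of the projection-layer product for Huffman code bit d_j:
    # sigma(x·theta)^(1-d_j) * (1 - sigma(x·theta))^(d_j)
    s = sigmoid(sum(a * b for a, b in zip(x_w, theta)))
    return s ** (1 - d_j) * (1 - s) ** d_j

def path_probability(x_w, thetas, code):
    # Product over all nodes on the word's Huffman path; x_w is the
    # projection-layer sum of the context word vectors.
    p = 1.0
    for theta, d_j in zip(thetas, code):
        p *= node_term(x_w, theta, d_j)
    return p
```

Note that for any single node the two code values are complementary, `node_term(x, θ, 0) + node_term(x, θ, 1) = 1`, which is what makes the path product a valid probability.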
Step three, inputting the question word vector set into a convolutional neural network model for training to obtain a training value, inputting the training value and the emotion attribute labeling set into a loss function, calculating a loss value from the emotion attribute labeling set and the training value by the loss function, and comparing the loss value with a preset threshold; when the loss value is smaller than the preset threshold, the convolutional neural network exits training.
In a preferred embodiment of the present invention, the convolutional neural network includes a convolutional layer, a pooling layer, a Flatten layer, a Dropout layer, and a fully connected layer. Since the question word vector set takes the form of one-dimensional vectors in the time dimension, the filters of the convolutional layer and the pooling layer are also one-dimensional. To prevent overfitting, the Dropout layer is added between the convolutional layer and the pooling layer. The Flatten layer flattens the data after multiple convolution and pooling operations, and the training value is finally output through the fully connected (Dense) layer.
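The one-dimensional convolution, pooling, Flatten, and Dense stages above can be sketched without any deep-learning library; the kernel and weight values below are arbitrary illustrations, and Dropout is omitted from the pipeline because it acts only at training time (randomly zeroing activations):

```python
def conv1d(seq, kernel):
    # Valid 1-D convolution (cross-correlation, as in most DL libraries).
    n = len(kernel)
    return [sum(seq[i + k] * kernel[k] for k in range(n))
            for i in range(len(seq) - n + 1)]

def maxpool1d(seq, size=2):
    # Non-overlapping max pooling over the time dimension.
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]

def dense(x, weights, bias):
    # Fully connected (Dense) layer with a single output unit.
    return sum(v * w for v, w in zip(x, weights)) + bias

# Flatten is a no-op for an already one-dimensional feature map, so the
# inference pipeline reduces to convolution -> pooling -> dense:
features = maxpool1d(conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 0.0]))
training_value = dense(features, [0.5], 0.1)
```

A library implementation would stack several such conv/pool pairs before the Dense layer, but the data flow is the same.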
In the preferred embodiment of the present invention, the loss value E is:
E = (1/m) ∑_{j=1}^{m} (x − μ_j)²
wherein x is the training value, μ_j is the j-th label in the emotion attribute labeling set, m is the number of entries in the question-answer data set, and the preset threshold is generally set to 0.01.
Step four, after the convolutional neural network exits training, the question word vector set is input to a recurrent neural network, and the recurrent neural network is trained on the answer word vector set until it meets the preset threshold requirement and exits training.
In a preferred embodiment of the present invention, the recurrent neural network is a long short-term memory network, and the long short-term memory network includes a forgetting gate, an input gate, and an output gate, where the forgetting gate is:
f_t = σ(w_t · [h_{t−1}, x_t] + b_t)
wherein f_t is the output data of the forgetting gate, x_t is the input data of the forgetting gate, t is the current time of the question-answer word vector set, t−1 is the time before the current time, h_{t−1} is the output data of the output gate at the time before the current time, w_t is the weight at the current time, b_t is the bias at the current time, [h_{t−1}, x_t] denotes the concatenation of h_{t−1} and x_t, and σ represents the sigmoid function.
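For a single LSTM unit the forget-gate equation reduces to a dot product over the concatenated previous output and current input, as in this sketch (the numeric values in the usage are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forget_gate(w_t, h_prev, x_t, b_t):
    # f_t = sigma(w_t · [h_{t-1}, x_t] + b_t) for a single unit, where
    # [h_{t-1}, x_t] is the concatenation of the previous output and the
    # current input, so the matrix product collapses to a dot product.
    concat = list(h_prev) + list(x_t)
    return sigmoid(sum(w * v for w, v in zip(w_t, concat)) + b_t)
```

With zero weights and bias the gate outputs σ(0) = 0.5, i.e. it passes half of the previous cell state — a useful sanity check when wiring up the full cell.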
Step five, receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question through the recurrent neural network according to the emotion attribute category.
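End to end, step five is a small pipeline: preprocess, vectorize, classify the emotion attribute, then generate an answer conditioned on that attribute. The sketch below uses stand-ins for each trained component (token lengths for word vectors, callables for the two networks); every name here is illustrative:

```python
def answer_user_question(question, classify, generate):
    # classify: trained CNN stand-in -> emotion attribute category.
    # generate: trained RNN stand-in -> answer conditioned on the attribute.
    tokens = question.lower().rstrip("?").split()  # stand-in for segmentation
    vectors = [float(len(t)) for t in tokens]      # stand-in for Word2Vec
    attribute = classify(vectors)
    return attribute, generate(tokens, attribute)
```

In the device, `classify` and `generate` would be the trained convolutional and recurrent networks from steps three and four.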
Alternatively, in other embodiments, the intelligent emotion question and answer program may be further divided into one or more modules, where the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the present invention. The modules referred to herein are a series of computer program instruction segments capable of performing a specific function, and are used to describe the execution of the intelligent emotion question and answer program in the intelligent emotion question and answer device.
For example, referring to fig. 3, a schematic program module of an intelligent emotion question and answer program in an embodiment of an intelligent emotion question and answer device of the present invention is shown, where the intelligent emotion question and answer program may be divided into a data receiving module 10, a data processing module 20, a model training module 30, and a question and answer result output module 40 by way of example:
the data receiving module 10 is configured to: acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet, forming a question-answer data set by the question data set and the plurality of answer data sets, and labeling emotion attributes of the question-answer data set to obtain an emotion attribute labeling set corresponding to the question-answer data set.
The data processing module 20 is configured to: carry out a preprocessing operation comprising Word segmentation and keyword extraction on the question-answer data set, and carry out a Word vectorization operation on the preprocessed question-answer data set according to the Word2Vec algorithm to obtain a question-answer word vector set, wherein the question-answer word vector set comprises a question word vector set and an answer word vector set.
The model training module 30 is configured to: input the question word vector set into a convolutional neural network model for training to obtain a training value, input the training value and the emotion attribute labeling set into a loss function, calculate a loss value from the emotion attribute labeling set and the training value by the loss function, and compare the loss value with a preset threshold until the loss value is smaller than the preset threshold, whereupon the convolutional neural network exits training; after the convolutional neural network exits training, input the question word vector set to a recurrent neural network, and train the recurrent neural network on the answer word vector set until the recurrent neural network meets the preset threshold requirement and exits training.
The question and answer result output module 40 is configured to: receive a user question, perform the preprocessing operation and the word vectorization operation on the user question, input the result into the convolutional neural network to judge the emotion attribute category, and output an answer to the user question through the recurrent neural network according to the emotion attribute category.
The functions or operation steps implemented when the program modules of the data receiving module 10, the data processing module 20, the model training module 30, the question and answer result output module 40 and the like are executed are substantially the same as those of the foregoing embodiments, and are not repeated herein.
In addition, an embodiment of the present invention further provides a computer readable storage medium, where an intelligent emotion question-answering program is stored, where the intelligent emotion question-answering program can be executed by one or more processors to implement the following operations:
acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet, forming a question-answer data set by the question data set and the plurality of answer data sets, and labeling emotion attributes of the question-answer data set to obtain an emotion attribute labeling set corresponding to the question-answer data set.
Carrying out a preprocessing operation comprising Word segmentation and keyword extraction on the question-answer data set, and carrying out a Word vectorization operation on the preprocessed question-answer data set according to the Word2Vec algorithm to obtain a question-answer word vector set, wherein the question-answer word vector set comprises a question word vector set and an answer word vector set.
Inputting the question word vector set into a convolutional neural network model for training to obtain a training value, inputting the training value and the emotion attribute labeling set into a loss function, calculating a loss value from the emotion attribute labeling set and the training value by the loss function, and comparing the loss value with a preset threshold until the loss value is smaller than the preset threshold, whereupon the convolutional neural network exits training; after the convolutional neural network exits training, inputting the question word vector set to a recurrent neural network, and training the recurrent neural network on the answer word vector set until the recurrent neural network meets the preset threshold requirement and exits training.
Receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question through the recurrent neural network according to the emotion attribute category.
It should be noted that the foregoing reference numerals of the embodiments of the present invention are merely for description and do not represent the relative merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, apparatus, article, or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described method embodiments may be implemented by means of software plus a necessary general hardware platform, or alternatively by hardware alone, although in many cases the former is preferred. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the method according to the embodiments of the present invention.
The foregoing description is only of preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structure or equivalent process transformation derived from the contents of this specification, applied directly or indirectly in other related technical fields, is likewise included within the scope of the invention.

Claims (10)

1. An intelligent emotion question and answer method, which is characterized by comprising the following steps:
acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set;
preprocessing the question-answer data set, namely performing Word segmentation and keyword extraction, and performing Word vectorization on the question-answer data set completed by the preprocessing operation according to a Word2Vec algorithm to obtain a question-answer Word vector set, wherein the question-answer Word vector set comprises a question Word vector set and an answer Word vector set;
inputting the question word vector set into a convolutional neural network model for training to obtain a training value, inputting the training value and the emotion attribute labeling set into a loss function, calculating a loss value from the emotion attribute labeling set and the training value by the loss function, and comparing the loss value with a preset threshold; when the loss value is smaller than the preset threshold, the convolutional neural network exits training;
after the convolutional neural network exits training, the question word vector set is input to a recurrent neural network, and the recurrent neural network is trained on the answer word vector set until the recurrent neural network meets the requirement of a preset threshold;
and receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question through the recurrent neural network according to the emotion attribute category.
2. The intelligent emotion question and answer method of claim 1, wherein obtaining a question data set and a plurality of answer data sets corresponding to the question data set from the internet by web crawler technology, comprises:
crawling questions asked in a text form from a URL page according to the web crawler technology, and forming the questions asked in the text form into a question data set;
and traversing the questions in the question data set, crawling a plurality of answers corresponding to the questions from the URL page by using the web crawler technology until the traversing of the question data set is finished, and obtaining a plurality of answer data sets corresponding to the question data set.
3. The intelligent emotion question and answer method of claim 2, wherein the word segmentation operation includes:
establishing a word segmentation probability model P (S) according to the question-answer data set, maximizing the word segmentation probability model, and completing word segmentation operation, wherein the word segmentation probability model P (S) is as follows:
P(S) = P(W_1, W_2, …, W_m) = ∏_{i=1}^{m} P(W_i)
wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of data entries in the question-answer data set;
the keyword extraction operation includes:
constructing the relevancy of the words, and extracting keywords based on the relevancy, wherein the relevancy is:
rel(W_i, W_j) = f(W_i, W_j) · tfidf(W_i) · tfidf(W_j) / d
wherein f(W_i, W_j) is the co-occurrence frequency of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency value of word W_i, tfidf(W_j) is the term frequency–inverse document frequency value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j.
4. The intelligent emotion question and answer method of claim 3, wherein the Word2Vec algorithm is a CBOW model;
the CBOW model comprises an input layer, a projection layer and an output layer;
the projection layer ζ (ω, j) is:
ζ(ω, j) = σ(X_ω^T θ)^{1−d_j^ω} · (1 − σ(X_ω^T θ))^{d_j^ω}
wherein d_j^ω represents the Huffman code corresponding to the j-th node in the path ω, θ is an iteration factor of the CBOW model, σ represents the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
5. The intelligent emotion question-answering method according to claim 4, wherein the recurrent neural network is a long-short-term memory network including a forgetting gate, an input gate, and an output gate;
the forgetting door is as follows:
f_t = σ(w_t · [h_{t−1}, x_t] + b_t)
wherein f_t is the output data of the forgetting gate, x_t is the input data of the forgetting gate, t is the current time of the question-answer word vector set, t−1 is the time before the current time, h_{t−1} is the output data of the output gate at the time before the current time, w_t is the weight at the current time, b_t is the bias at the current time, [h_{t−1}, x_t] denotes the concatenation of h_{t−1} and x_t, and σ represents the sigmoid function.
6. An intelligent emotion question and answer device, characterized in that the device comprises a memory and a processor, wherein an intelligent emotion question and answer program capable of running on the processor is stored in the memory, and the intelligent emotion question and answer program realizes the following steps when being executed by the processor:
acquiring a question data set and a plurality of answer data sets corresponding to the question data set from the Internet through a web crawler technology, forming a question-answer data set by the question data set and the plurality of answer data sets, and marking emotion attributes of the question-answer data set to obtain an emotion attribute marking set corresponding to the question-answer data set;
preprocessing the question-answer data set, namely performing Word segmentation and keyword extraction, and performing Word vectorization on the question-answer data set completed by the preprocessing operation according to a Word2Vec algorithm to obtain a question-answer Word vector set, wherein the question-answer Word vector set comprises a question Word vector set and an answer Word vector set;
inputting the question word vector set into a convolutional neural network model for training to obtain a training value, inputting the training value and the emotion attribute labeling set into a loss function, calculating a loss value from the emotion attribute labeling set and the training value by the loss function, and comparing the loss value with a preset threshold; when the loss value is smaller than the preset threshold, the convolutional neural network exits training;
after the convolutional neural network exits training, the question word vector set is input to a recurrent neural network, and the recurrent neural network is trained on the answer word vector set until the recurrent neural network meets the requirement of a preset threshold;
and receiving a user question, performing the preprocessing operation and the word vectorization operation on the user question, inputting the result into the convolutional neural network to judge the emotion attribute category, and outputting an answer to the user question through the recurrent neural network according to the emotion attribute category.
7. The intelligent emotion question and answer device of claim 6, wherein obtaining a question data set and a plurality of answer data sets corresponding to the question data set from the internet by web crawler technology, comprises:
crawling questions asked in a text form from a URL page according to the web crawler technology, and forming the questions asked in the text form into a question data set;
and traversing the questions in the question data set, crawling a plurality of answers corresponding to the questions from the URL page by using the web crawler technology until the traversing of the question data set is finished, and obtaining a plurality of answer data sets corresponding to the question data set.
8. The intelligent emotion question and answer device of claim 7, wherein said word segmentation operation comprises:
establishing a word segmentation probability model P (S) according to the question-answer data set, maximizing the word segmentation probability model, and completing word segmentation operation, wherein the word segmentation probability model P (S) is as follows:
P(S) = P(W_1, W_2, …, W_m) = ∏_{i=1}^{m} P(W_i)
wherein W_1, W_2, …, W_m are the words of the data in the question-answer data set, and m is the number of data entries in the question-answer data set;
the keyword extraction operation includes:
constructing the relevancy of the words, and extracting keywords based on the relevancy, wherein the relevancy is:
rel(W_i, W_j) = f(W_i, W_j) · tfidf(W_i) · tfidf(W_j) / d
wherein f(W_i, W_j) is the co-occurrence frequency of word W_i and word W_j, tfidf(W_i) is the term frequency–inverse document frequency value of word W_i, tfidf(W_j) is the term frequency–inverse document frequency value of word W_j, and d is the Euclidean distance between the word vectors of W_i and W_j.
9. The intelligent emotion question and answer device of claim 8, wherein the Word2Vec algorithm is a CBOW model;
the CBOW model comprises an input layer, a projection layer and an output layer;
the projection layer ζ (ω, j) is:
ζ(ω, j) = σ(X_ω^T θ)^{1−d_j^ω} · (1 − σ(X_ω^T θ))^{d_j^ω}
wherein d_j^ω represents the Huffman code corresponding to the j-th node in the path ω, θ is an iteration factor of the CBOW model, σ represents the sigmoid function, and X_ω is the question-answer data set completed by the preprocessing operation.
10. A computer readable storage medium having stored thereon an intelligent emotion question-answering program executable by one or more processors to implement the steps of the intelligent emotion question-answering method of any one of claims 1 to 5.
CN201910386282.6A 2019-05-09 2019-05-09 Intelligent emotion question-answering method and device and computer readable storage medium Active CN110263134B (en)

Publications (2)

CN110263134A (en) — published 2019-09-20
CN110263134B (en) — granted 2023-06-27
