CN109871532B - Text theme extraction method and device and storage medium - Google Patents


Info

Publication number
CN109871532B
CN109871532B (application CN201910008265.9A)
Authority
CN
China
Prior art keywords
text
layer
matrix
attention mechanism
extraction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910008265.9A
Other languages
Chinese (zh)
Other versions
CN109871532A (en)
Inventor
金戈 (Jin Ge)
徐亮 (Xu Liang)
肖京 (Xiao Jing)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910008265.9A priority Critical patent/CN109871532B/en
Publication of CN109871532A publication Critical patent/CN109871532A/en
Priority to PCT/CN2019/118287 priority patent/WO2020140633A1/en
Application granted granted Critical
Publication of CN109871532B publication Critical patent/CN109871532B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34: Browsing; Visualisation therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology

Abstract

The invention belongs to the technical field of artificial intelligence and discloses a text topic extraction method comprising the following steps: constructing a text topic extraction model; training the text topic extraction model; acquiring the text word vector corresponding to a text sample; inputting the text word vector into the trained text topic extraction model; and outputting the text topic. The text topic extraction model comprises a convolutional neural network and an attention mechanism, the attention mechanism comprising a position attention mechanism and a channel attention mechanism that are established in parallel and connected to the activation layer of the convolutional neural network. The two mechanisms apply a position attention weight and a channel attention weight respectively, and their output results are input into the fully connected layer of the convolutional neural network. The invention also discloses an electronic device and a storage medium. The invention improves the operation efficiency of the text topic extraction model and the accuracy of text topic extraction.

Description

Text theme extraction method and device and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a text topic extraction method, a text topic extraction device, and a storage medium.
Background
With the rapid development of the internet, more and more users share information over the network and record network information resources as electronic text. To find the required information quickly among large volumes of electronic texts, a text topic must be extracted from each text to represent it; the extracted topic helps a user decide whether a text needs to be read in full. Massive collections of electronic texts cover many topics and contain rich topic information. Text topic extraction is mostly based on artificial intelligence technology: a computer automatically extracts content from the text through a text topic extraction model and generates the text topic. Existing text topic extraction models are mainly based on recurrent neural networks. Because recurrent neural network models have low operation efficiency, such text topic extraction models run slowly, increase the computational burden, and extract text topics inefficiently.
Disclosure of Invention
The invention provides a text topic extraction method, device, and storage medium based on a convolutional neural network and an attention mechanism, which improve the efficiency of text topic extraction and reduce the computational burden of the text topic extraction model.
In order to achieve the above object, one aspect of the present invention provides a text topic extraction method, comprising: constructing a text topic extraction model; training the text topic extraction model; acquiring the text word vector corresponding to a text sample; inputting the text word vector into the trained text topic extraction model; and outputting the text topic corresponding to the text sample. The constructed text topic extraction model comprises a convolutional neural network and an attention mechanism. The convolutional neural network comprises an input layer, a convolutional layer, an activation layer, and a fully connected layer; the input layer is used for inputting text word vectors, the convolutional layer is used for performing convolution operations on the text word vectors and extracting text features to obtain text feature vectors, and the activation layer is used for activating the text feature vectors. The attention mechanism comprises a position attention mechanism and a channel attention mechanism, which are established in parallel and are both connected to the activation layer. A position attention weight is applied to the output matrix of the activation layer through the position attention mechanism to obtain a position attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism; a channel attention weight is applied to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism. The output results of the position attention mechanism and the channel attention mechanism are input into the fully connected layer, and the text topic is output through the fully connected layer.
Preferably, the text topic extraction model further comprises a plurality of fully connected hidden layers arranged in parallel, each connected to the activation layer; the output matrix of the activation layer is converted into a position attention weight matrix and a channel attention weight matrix through the fully connected hidden layers.
Preferably, the step of obtaining the position attention feature matrix comprises: outputting a first conversion matrix from the output matrix of the activation layer through one fully connected hidden layer, and matrix-multiplying the first conversion matrix by the position attention weight matrix to obtain the position attention feature matrix. The step of obtaining the channel attention feature matrix comprises: outputting a second conversion matrix from the output matrix of the activation layer through another fully connected hidden layer, and matrix-multiplying the second conversion matrix by the channel attention weight matrix to obtain the channel attention feature matrix.
Preferably, the text topic extraction model further comprises an embedding layer located at the first layer of the model; text samples are converted into text word vectors through the embedding layer.
Preferably, the step of acquiring the text word vector corresponding to the text sample comprises: constructing a word vector model and training it on a corpus; performing word segmentation on the text sample; inputting the segmented text sample into the trained word vector model; and outputting the text word vector corresponding to the text sample.
Preferably, the step of training the text topic extraction model comprises:
initializing the parameters of the text topic extraction model, the parameters including: the connection weight between the input layer and the convolutional layer, the connection weight between the convolutional layer and the activation layer, and the connection weight between the activation layer and the fully connected layer;
constructing a training sample set, wherein each training sample comprises a text word vector and a text topic;
inputting one training sample from the training sample set into the text topic extraction model, and outputting the text topic corresponding to the training sample;
updating the parameters based on the loss function of the text topic extraction model;
training on the next training sample with the updated parameters, and calculating the loss function value of the text topic extraction model;
judging whether the training of the text topic extraction model has reached the convergence condition; if so, ending the training to obtain the trained text topic extraction model, and if not, updating the parameters of the text topic extraction model and continuing the training, wherein the convergence condition is that the change in the loss function value is smaller than a preset threshold.
In order to achieve the above object, another aspect of the present invention provides an electronic device, comprising: a processor; and a memory including a text topic extraction program which, when executed by the processor, implements the following steps of the text topic extraction method:
constructing a text topic extraction model; training the text topic extraction model; acquiring the text word vector corresponding to a text sample; inputting the text word vector into the trained text topic extraction model; and outputting the text topic corresponding to the text sample. The constructed text topic extraction model comprises a convolutional neural network and an attention mechanism. The convolutional neural network comprises an input layer, a convolutional layer, an activation layer, and a fully connected layer; the input layer is used for inputting text word vectors, the convolutional layer is used for performing convolution operations on the text word vectors and extracting text features to obtain text feature vectors, and the activation layer is used for activating the text feature vectors. The attention mechanism comprises a position attention mechanism and a channel attention mechanism, which are established in parallel and are both connected to the activation layer. A position attention weight is applied to the output matrix of the activation layer through the position attention mechanism to obtain a position attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism; a channel attention weight is applied to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism. The output results of the position attention mechanism and the channel attention mechanism are input into the fully connected layer, and the text topic is output through the fully connected layer.
Preferably, the text topic extraction model further comprises a plurality of fully connected hidden layers arranged in parallel, each connected to the activation layer; the output matrix of the activation layer is converted into a position attention weight matrix and a channel attention weight matrix through the fully connected hidden layers.
Preferably, the text topic extraction model further comprises an embedding layer located at the first layer of the model; the text sample is converted into the text word vector through the embedding layer.
In order to achieve the above object, a further aspect of the present invention provides a computer-readable storage medium including a text topic extraction program which, when executed by a processor, implements the steps of the text topic extraction method described above.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the text topic extraction model is constructed by combining a convolutional neural network with an attention mechanism and is used to extract the text topic corresponding to a text. This improves the precision of the text topic extraction model and, at the same time, its operation efficiency, so that topics are extracted from text samples more efficiently.
Drawings
FIG. 1 is a schematic flow chart of a text topic extraction method according to the present invention;
FIG. 2 is a block diagram of a text topic extraction program according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
The embodiments of the present invention will be described below with reference to the accompanying drawings. Those of ordinary skill in the art will recognize that the described embodiments can be modified in various different ways, or combinations thereof, without departing from the spirit and scope of the present invention. Accordingly, the drawings and description are illustrative in nature and are only intended to illustrate the invention and not to limit the scope of the claims. Furthermore, in the present description, the drawings are not to scale and like reference numerals refer to like parts.
Fig. 1 is a schematic flow diagram of a text topic extraction method according to the present invention, and as shown in fig. 1, the text topic extraction method according to the present invention includes the following steps:
step S1, constructing a text topic extraction model;
step S2, training the text topic extraction model;
step S3, acquiring the text word vector corresponding to the text sample;
step S4, inputting the text word vector into the trained text topic extraction model;
step S5, outputting the text topic corresponding to the text sample,
the text topic extraction model constructed in step S1 includes a convolutional neural network and an attention mechanism, where the convolutional neural network includes an input layer, a convolutional layer, an activation layer, and a full-link layer, the input layer is used to input text word vectors, the convolutional layer is used to perform convolutional operation on the text word vectors, extract text features, and obtain text feature vectors, and the activation layer is used to perform an activation action on the text feature vectors; the attention mechanism comprises a position attention mechanism and a channel attention mechanism, the position attention mechanism and the channel attention mechanism are established in parallel and are connected with the activation layer, a position attention weight is applied to an output matrix of the activation layer through the position attention mechanism to obtain a position attention characteristic matrix, and the position attention characteristic matrix is added with the output matrix of the activation layer to obtain an output result of the position attention mechanism; applying a channel attention weight to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention characteristic matrix, summing the channel attention characteristic matrix and the output matrix of the activation layer to obtain an output result of the channel attention mechanism, inputting the output result of the position attention mechanism and the output result of the channel attention mechanism into the full connection layer, and outputting a text theme through the full connection layer.
The text topic extraction model is constructed by combining a convolutional neural network with an attention mechanism and extracts the text topic corresponding to a text, so the method can be used, for example, to extract keywords from comments. The model performs convolution operations through the convolutional neural network and assigns different weights to different semantics by establishing parallel position and channel attention mechanisms. This improves the accuracy of the text topic extraction model and, at the same time, its operation efficiency, so that topics are extracted from text samples more efficiently.
In the invention, the convolutional layer of the convolutional neural network comprises a plurality of one-dimensional convolution kernels with 128 channels each. The kernel sizes are 1, 3, and 5 respectively, kernels of each size accounting for one third of the total, and the input and output dimensions of the convolutional layer are kept consistent by setting the padding (inner distance) accordingly. The more convolution kernels there are, the more text features the convolutional layer extracts, and the more accurate the text topic obtained by subsequently processing the text feature vectors. However, a larger number of convolution kernels also slows down the text topic extraction model, so the number of kernels should not be excessive.
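As a sketch of the padding arithmetic only (not the patent's code): a one-dimensional convolution with zero padding of (k - 1)/2 on each side preserves the input length for the odd kernel sizes 1, 3, and 5 used above.

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D convolution with 'same' zero padding, so the output length
    equals the input length for odd kernel sizes."""
    k = len(kernel)
    pad = (k - 1) // 2                    # padding (inner distance) per side
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

x = np.arange(6, dtype=float)             # a toy 6-step input sequence
for k in (1, 3, 5):                       # the three kernel sizes from the text
    out = conv1d_same(x, np.ones(k))
    assert out.shape == x.shape           # same padding preserves the length
```

Because all three kernel sizes produce outputs of identical length, their feature maps can be stacked into a single 384-channel output (3 × 128 channels), matching the dimensions used later in the description.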
In one embodiment of the present invention, the activation function of the activation layer in the convolutional neural network is the ReLU function, but the invention is not limited thereto; other activation functions, for example the Sigmoid or Tanh function, may also be used. In the invention, the output of the activation layer serves as the output of the convolutional neural network.
Preferably, the step of training the text topic extraction model comprises:
initializing the parameters of the text topic extraction model, the parameters including: the connection weight between the input layer and the convolutional layer, the connection weight between the convolutional layer and the activation layer, and the connection weight between the activation layer and the fully connected layer;
constructing a training sample set, wherein each training sample comprises a text word vector and a text topic;
inputting one training sample from the training sample set into the text topic extraction model, and outputting the text topic corresponding to the training sample;
updating the parameters based on the loss function of the text topic extraction model, wherein the loss function is a cross-entropy function;
training on the next training sample with the updated parameters, and calculating the loss function value of the text topic extraction model;
judging whether the training of the text topic extraction model has reached the convergence condition; if so, ending the training to obtain the trained text topic extraction model, and if not, updating the parameters of the text topic extraction model and continuing the training, wherein the convergence condition is that the change in the loss function value is smaller than a preset threshold. In general, the preset threshold may be 0.02 and the training learning rate 0.001.
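The convergence test above can be sketched as a generic training driver. The `model_step` callable and the shrinking fake losses below are illustrative assumptions standing in for one parameter update returning the cross-entropy loss; they are not the patent's implementation:

```python
import numpy as np

def train(model_step, samples, threshold=0.02, max_epochs=100):
    """Run training epochs until the change in the loss value is smaller
    than `threshold`, the convergence condition described above.
    `model_step` is assumed to take one training sample, update the model
    parameters, and return the current loss value."""
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_epochs):
        loss = float(np.mean([model_step(s) for s in samples]))
        if abs(prev_loss - loss) < threshold:   # loss change below threshold
            break
        prev_loss = loss
    return loss

# Demo with a fake model whose loss shrinks each epoch.
fake_losses = iter([1.0, 0.6, 0.59, 0.585])
final = train(lambda s: next(fake_losses), [None])
```

Here training stops at the third epoch, because the loss change from 0.6 to 0.59 falls below the 0.02 threshold.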
The text sample can be converted into its corresponding text word vector in several ways. Preferably, the text topic extraction model further comprises an embedding layer located at the first layer of the model; the text sample is converted into a text word vector through the embedding layer, and the resulting text word vector is input into the input layer of the convolutional neural network. However, the invention is not limited thereto: the text sample may also be converted into its text word vector by various word vector models, such as the Word2Vec or CBOW model. Preferably, in step S3, the step of acquiring the text word vector corresponding to the text sample comprises:
constructing a word vector model, and training the word vector model according to a corpus, wherein the corpus used for training can be a Chinese Wikipedia corpus;
segmenting the text sample;
inputting the text sample after word segmentation into the trained word vector model;
and outputting a text word vector corresponding to the text sample.
The text length is determined from the text samples, and the text samples are segmented into words accordingly. In one embodiment of the invention, the text length determined from the text samples is 100, the text samples are segmented with a word segmentation library (for example jieba, Jcseg, or HanLP), and the dimension of each text word vector is 300.
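As a sketch of this preprocessing step, a segmented text can be mapped to a 100 × 300 matrix of word vectors. The toy lookup table, the token names, and the zero handling for unknown words are assumptions for illustration; a trained word vector model would supply the real vectors:

```python
import numpy as np

def text_to_matrix(tokens, vectors, max_len=100, dim=300):
    """Map a segmented text (e.g. the output of a word segmentation
    library such as jieba) to a max_len × dim word-vector matrix,
    truncating long texts and zero-padding short ones. `vectors` is a
    stand-in lookup table for a trained word vector model."""
    mat = np.zeros((max_len, dim))
    for i, tok in enumerate(tokens[:max_len]):
        mat[i] = vectors.get(tok, np.zeros(dim))   # unknown words map to zeros
    return mat

# Toy lookup table with two known tokens; the third token is unknown.
vecs = {"文本": np.ones(300), "主题": np.full(300, 2.0)}
m = text_to_matrix(["文本", "主题", "抽取"], vecs)
```

The fixed 100 × 300 shape matches the text length and word-vector dimension stated above, so every sample presents the same input dimensions to the convolutional layer.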
In the invention, the attention mechanism comprises a position attention mechanism and a channel attention mechanism. The position attention mechanism applies attention to the output matrix of the activation layer according to text position features and performs weight assignment; the channel attention mechanism applies attention to the output of the activation layer according to the convolution kernel channels and performs weight assignment.
The position attention mechanism and the channel attention mechanism are established in parallel and are both connected to the activation layer; the inputs of both mechanisms are taken from the output of the activation layer. Preferably, the text topic extraction model further comprises a plurality of fully connected hidden layers arranged in parallel, each connected to the activation layer. The output matrix of the activation layer is converted into a position attention weight matrix and a channel attention weight matrix through different fully connected hidden layers, and the output of the activation layer is weighted according to these matrices.
Further, the output matrix of the activation layer is passed through one fully connected hidden layer to output a first conversion matrix; the first conversion matrix and the position attention weight matrix are matrix-multiplied to obtain the position attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism, and this result is input into the fully connected layer. The output matrix of the activation layer is passed through another fully connected hidden layer to output a second conversion matrix; the second conversion matrix and the channel attention weight matrix are matrix-multiplied to obtain the channel attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism, and this result is input into the fully connected layer.
In one embodiment of the invention, the convolution kernels of the convolutional neural network are one-dimensional, the total number of convolution kernel channels is k, and the output matrix of the activation layer is k × m × 1, which is reshaped to k × m during processing. Features are extracted from the output matrix of the activation layer through different fully connected hidden layers connected to it, converting it into two matrices with dimensions m × k and k × m respectively; multiplying these two matrices yields a position attention weight matrix of dimension m × m, and the output of the activation layer is weighted according to this matrix, so that semantics at different positions obtain different weights and the text topic is extracted more accurately. Similarly, features are extracted from the output of the activation layer through different fully connected hidden layers, converting the output matrix into two matrices with dimensions k × m and m × k respectively; multiplying these two matrices yields a channel attention weight matrix of dimension k × k, and the output of the activation layer is weighted according to this matrix.
For example, the convolution kernels of the convolutional neural network are one-dimensional, the total number of convolution kernel channels is 384, and the output matrix of the activation layer of the convolutional neural network is a three-dimensional 384 × 100 × 1 matrix. For the position attention mechanism, the output matrix of the activation layer is first converted into a two-dimensional 384 × 100 matrix; two matrices with dimensions 100 × 384 and 384 × 100 are output through two parallel fully connected hidden layers, and matrix multiplication followed by softmax mapping is applied to them to obtain a position attention weight matrix of dimension 100 × 100. On this basis, a first conversion matrix with dimensions 384 × 100 is output through another parallel fully connected hidden layer; the first conversion matrix and the position attention weight matrix are matrix-multiplied to obtain a position attention feature matrix of dimension 384 × 100, which is converted into a three-dimensional 384 × 100 × 1 matrix and added to the 384 × 100 × 1 matrix output by the activation layer as the output result of the position attention mechanism. For the channel attention mechanism, the output matrix of the activation layer is first converted into a two-dimensional 384 × 100 matrix; two matrices with dimensions 384 × 100 and 100 × 384 are output through two parallel fully connected hidden layers, and matrix multiplication followed by softmax mapping is applied to them to obtain a channel attention weight matrix of dimension 384 × 384.
On this basis, a second conversion matrix with dimensions 100 × 384 is output through another parallel fully connected hidden layer; the second conversion matrix and the channel attention weight matrix are matrix-multiplied to obtain a channel attention feature matrix of dimension 100 × 384, which is converted into a three-dimensional 384 × 100 × 1 matrix and added to the 384 × 100 × 1 matrix output by the activation layer as the output result of the channel attention mechanism. The output results of the position attention mechanism and the channel attention mechanism are input into the fully connected layer to complete the output of the whole text topic extraction model. The output dimension of the model is 100, the same as the text length determined from the text sample, and each output position corresponds to the keyword label of a word in the input text sample.
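The dimension bookkeeping of this worked example can be checked in NumPy. The random weights and the plain linear maps standing in for the fully connected hidden layers are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
C, L = 384, 100                         # channels and text length from the example

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

A = rng.standard_normal((C, L))         # activation output, 384×100×1 squeezed to 384×100
W1, W2, W3 = (rng.standard_normal((C, C)) for _ in range(3))  # assumed hidden layers

# Position branch: (100×384)·(384×100) → 100×100 attention weights.
pos_w = softmax((W1 @ A).T @ (W2 @ A))
pos_out = (W3 @ A) @ pos_w + A          # 384×100 feature matrix plus residual

# Channel branch: (384×100)·(100×384) → 384×384 attention weights.
chan_w = softmax((W1 @ A) @ (W2 @ A).T)
chan_out = ((W3 @ A).T @ chan_w).T + A  # 100×384 times 384×384, transposed back
```

Both branch outputs come back to 384 × 100, so each can be added element-wise to the activation output before being passed to the fully connected layer, exactly as the example requires.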
In the invention, different attention weights are applied to the output matrix of the activation layer through the attention mechanism; a position attention feature matrix and a channel attention feature matrix are obtained, and each is added to the output matrix of the convolutional neural network. For example, a text word vector produces a 10-dimensional output matrix through the activation layer of the convolutional neural network. The position attention mechanism applies position attention to this output matrix to obtain a 10-dimensional position attention feature matrix, and the channel attention mechanism applies channel attention to it to obtain a 10-dimensional channel attention feature matrix. The position attention feature matrix is added element-wise to the 10-dimensional output matrix of the convolutional neural network, and the resulting 10-dimensional matrix is input into the fully connected layer; similarly, the channel attention feature matrix is added to the 10-dimensional output matrix, the resulting 10-dimensional matrix is input into the fully connected layer, and the text topic is output through the fully connected layer.
The text topic extraction method is applied to an electronic device, which can be a television, a smartphone, a tablet computer, a computer, or other terminal equipment.
The electronic device comprises a processor and a memory storing a text topic extraction program; the processor executes the program to implement the following steps of the text topic extraction method: constructing a text topic extraction model; training the text topic extraction model; acquiring the text word vector corresponding to a text sample; inputting the text word vector into the trained text topic extraction model; and outputting the text topic corresponding to the text sample.
The constructed text topic extraction model comprises a convolutional neural network and an attention mechanism. The convolutional neural network comprises an input layer, a convolutional layer, an activation layer and a fully connected layer: the input layer is used for inputting text word vectors, the convolutional layer performs a convolution operation on the text word vectors and extracts text features to obtain text feature vectors, and the activation layer activates the text feature vectors. The attention mechanism comprises a position attention mechanism and a channel attention mechanism, which are established in parallel and are both connected with the activation layer. A position attention weight is applied to the output matrix of the activation layer through the position attention mechanism to obtain a position attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism. Likewise, a channel attention weight is applied to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism. The output results of the position attention mechanism and the channel attention mechanism are input into the fully connected layer, and the text topic is output through the fully connected layer.
The electronic device further comprises a network interface, a communication bus and the like. The network interface may include a standard wired interface and a standard wireless interface, and the communication bus is used for realizing connection and communication among the components.
The memory includes at least one type of readable storage medium, which may be a non-volatile storage medium such as a flash memory, a hard disk or an optical disk, or a plug-in hard disk, without limitation; it may be any device that stores instructions or software and any associated data files in a non-transitory manner and provides them to the processor so that the processor can execute them. In the invention, the software stored in the memory includes a text theme extraction program, which can be provided to the processor so that the processor executes it to implement the steps of the text theme extraction method.
The processor may be a central processing unit, a microprocessor or another data processing chip, and runs programs stored in the memory, for example the text theme extraction program of the present invention.
The electronic device may further comprise a display, which may also be referred to as a display screen or display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch panel, or the like. The display is used for displaying information processed in the electronic device and for displaying a visual work interface.
The electronic device may further comprise a user interface, which may comprise an input unit such as a keyboard and an audio output device such as a speaker or headset.
In the invention, the convolutional layer of the convolutional neural network in the text topic extraction model comprises a plurality of one-dimensional convolution kernels with 128 channels. The step lengths of the convolution kernels are 1, 3 and 5 respectively, the kernels of each step length accounting for 1/3 of the total number of kernels, and the padding is set so that the input dimension and the output dimension of the convolutional layer are consistent.
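A minimal sketch of this multi-width convolutional layer, assuming "step length" here denotes the kernel width (1, 3 and 5), the usual multi-width text-CNN reading, with padding preserving the sequence length:

```python
import numpy as np

def conv1d_same(x, kernels):
    # x: (in_channels, seq_len); kernels: (out_channels, in_channels, width).
    out_ch, _, w = kernels.shape
    pad = w // 2                             # padding keeps output length == input length
    xp = np.pad(x, ((0, 0), (pad, pad)))
    seq = x.shape[1]
    out = np.empty((out_ch, seq))
    for t in range(seq):
        out[:, t] = np.tensordot(kernels, xp[:, t:t + w], axes=([1, 2], [0, 1]))
    return out

rng = np.random.default_rng(0)
embed_dim, seq_len, total_kernels = 300, 100, 128
x = rng.normal(size=(embed_dim, seq_len))    # one text: 100 word vectors of dimension 300

# One third of the kernels per width, as in the text (128 // 3 = 42 per width).
branches = [rng.normal(scale=0.1, size=(total_kernels // 3, embed_dim, w))
            for w in (1, 3, 5)]
y = np.concatenate([conv1d_same(x, k) for k in branches], axis=0)
y = np.maximum(y, 0)                         # activation layer (ReLU assumed)
print(y.shape)                               # (126, 100)
```

Input and output sequence lengths match, as the text requires; the ReLU activation is an assumption, since the text does not name the activation function.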
Preferably, the step of training the text topic extraction model comprises:
initializing parameters of the text topic extraction model, the parameters including: the connection weight between the input layer and the convolutional layer, the connection weight between the convolutional layer and the activation layer, and the connection weight between the activation layer and the full connection layer;
constructing a training sample set, wherein the training sample comprises a text word vector and a text theme;
inputting one training sample in the training sample set into the text theme extraction model, and outputting a text theme corresponding to the training sample;
updating the parameters based on a loss function of the text theme extraction model, wherein the loss function is a cross entropy function;
training the next training sample according to the updated parameters, and calculating a loss function value of the text theme extraction model;
judging whether the training of the text theme extraction model reaches a convergence condition, if so, ending the training to obtain the trained text theme extraction model, and if not, updating the parameters of the text theme extraction model and continuing the training, wherein the convergence condition is that the change of the loss function value is smaller than a preset threshold value. In general, the preset threshold may be 0.02, and the training learning rate is 0.001.
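The training steps above can be sketched as a minimal loop; toy data and a bare softmax classifier stand in for the full extraction model, while the cross-entropy loss, the 0.001 learning rate and the 0.02 convergence threshold follow the text:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))               # 20 toy "text word vectors" of dimension 8
y = rng.integers(0, 3, size=20)            # their toy text-topic labels (3 topics)
W = rng.normal(scale=0.1, size=(8, 3))     # initialized connection weights

def forward(W):
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    loss = -np.log(p[np.arange(len(y)), y]).mean()   # cross-entropy loss function
    return loss, p

lr, threshold = 0.001, 0.02                # learning rate and preset convergence threshold
prev_loss = float("inf")
for step in range(10_000):
    loss, p = forward(W)
    if abs(prev_loss - loss) < threshold:  # convergence: loss change below threshold
        break
    prev_loss = loss
    grad = X.T @ ((p - np.eye(3)[y]) / len(y))
    W -= lr * grad                         # update parameters, then train the next batch
print(f"stopped at step {step}")
```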
The text sample can be converted into its corresponding text word vector in a number of ways. Preferably, the topic extraction model further comprises an embedding layer located at the first layer of the text topic extraction model; the text sample is converted into a text word vector through the embedding layer, and the obtained text word vector is input into the input layer of the convolutional neural network. However, the present invention is not limited thereto; the text sample may also be converted into its text word vector by various word vector models, such as the Word2Vec model or the CBOW model. Preferably, the step of obtaining the text word vector corresponding to the text sample includes:
constructing a word vector model, and training the word vector model according to a corpus, wherein the corpus used for training can be a Chinese Wikipedia corpus;
performing word segmentation on the text sample;
inputting the text sample after word segmentation into the trained word vector model;
and outputting a text word vector corresponding to the text sample.
The method comprises determining a text length from the text sample and segmenting the text sample according to the determined length. In one embodiment of the invention, the text length determined from the text sample is 100, the text sample is segmented through a word segmentation library (for example, jieba, Jcseg or HanLP), and the dimension of the text word vector is 300.
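The pipeline can be sketched as follows (a toy stand-in: `segment` replaces a real segmentation library such as jieba, and the random lookup table replaces a Word2Vec model trained on a Wikipedia corpus):

```python
import numpy as np

EMBED_DIM = 300        # word-vector dimension from the text
MAX_LEN = 100          # text length determined from the text sample

def segment(text):
    # Toy stand-in for a word segmentation library (jieba, Jcseg, HanLP, ...).
    return text.split()

rng = np.random.default_rng(0)
table = {}             # word -> vector; stands in for a trained Word2Vec model

def word_vector(word):
    if word not in table:
        table[word] = rng.normal(size=EMBED_DIM)
    return table[word]

def text_word_vectors(text):
    words = segment(text)[:MAX_LEN]          # truncate to the fixed text length
    mat = np.zeros((MAX_LEN, EMBED_DIM))     # shorter texts are zero-padded
    for i, w in enumerate(words):
        mat[i] = word_vector(w)
    return mat

m = text_word_vectors("extract the topic of this text sample")
print(m.shape)         # (100, 300)
```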
Preferably, the text topic extraction model further includes a plurality of fully-connected hidden layers arranged in parallel, each connected with the activation layer. The output matrix of the activation layer is converted into a position attention weight matrix and a channel attention weight matrix through different fully-connected hidden layers, and the outputs of the activation layer are weighted according to these matrices; the inputs of the position attention mechanism and the channel attention mechanism are thus derived from the outputs of the activation layer.
Further, the output matrix of the activation layer is passed through a fully-connected hidden layer to output a first conversion matrix; the first conversion matrix is matrix-multiplied with the position attention weight matrix to obtain the position attention feature matrix, and the position attention feature matrix is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism, which is input into the fully connected layer. Likewise, the output matrix of the activation layer is passed through another fully-connected hidden layer to output a second conversion matrix; the second conversion matrix is matrix-multiplied with the channel attention weight matrix to obtain the channel attention feature matrix, and the channel attention feature matrix is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism, which is input into the fully connected layer.
In one embodiment of the invention, the convolution kernels of the convolutional neural network are one-dimensional and the total number of channels of the convolution kernels is k, so the output matrix of the activation layer is k×m×1, which is reshaped to k×m during processing. Feature extraction is performed on the output matrix of the activation layer through different fully-connected hidden layers connected with the activation layer, converting it into two matrices with dimensions m×k and k×m respectively; multiplying these two matrices yields a position attention weight matrix with dimension m×m, and the outputs of the activation layer are weighted according to the position attention weight matrix, so that the semantics at different positions obtain different weights and the text topic is extracted more accurately. Similarly, feature extraction is performed on the output of the activation layer through different fully-connected hidden layers connected with the activation layer, converting the output matrix of the activation layer into two matrices with dimensions k×m and m×k respectively; multiplying these two matrices yields a channel attention weight matrix with dimension k×k, and the outputs of the activation layer are weighted according to the channel attention weight matrix.
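The shape bookkeeping described above can be sketched with random stand-ins (a minimal illustration: the fully-connected hidden layers are replaced by random projection matrices, and the softmax normalization of the weight matrices is a common choice that the text does not specify):

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 6, 4                              # channel count k and positions m (small for illustration)
A = rng.normal(size=(k, m))              # activation-layer output matrix, reshaped from k x m x 1

def fc():                                # stand-in for a trained fully-connected hidden layer
    return rng.normal(scale=0.1, size=(k, k))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Position attention: project A into (m x k) and (k x m) forms; their product is m x m.
pos_w = softmax((fc() @ A).T @ (fc() @ A))      # position attention weight matrix (m x m)
first_conv = fc() @ A                           # first conversion matrix (k x m)
pos_out = first_conv @ pos_w + A                # position feature matrix + residual add

# Channel attention: project A into (k x m) and (m x k) forms; their product is k x k.
chan_w = softmax((fc() @ A) @ (fc() @ A).T)     # channel attention weight matrix (k x k)
second_conv = fc() @ A                          # second conversion matrix (k x m)
chan_out = chan_w @ second_conv + A             # channel feature matrix + residual add

print(pos_out.shape, chan_out.shape)            # (6, 4) (6, 4)
```

Both branch outputs keep the k×m shape of the activation-layer output, so they can be fed to the fully connected layer alongside it.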
In other embodiments, the text topic extraction program may also be divided into one or more modules, which are stored in the memory and executed by the processor to accomplish the present invention. A module here refers to a series of computer program instruction segments capable of performing a specified function. Fig. 2 is a schematic block diagram of the text topic extraction program in the present invention; as shown in Fig. 2, the text topic extraction program may be divided into: a model building module 1, a model training module 2, an obtaining module 3, an input module 4 and an output module 5. The functions or operation steps implemented by these modules are similar to those described above and are not detailed here, where:
the model construction module 1 is used for constructing a text theme extraction model, wherein the constructed text theme extraction model comprises a convolutional neural network and an attention mechanism, and the specific composition is as described above and is not repeated herein;
the model training module 2 trains the text theme extraction model;
the acquisition module 3 is used for acquiring text word vectors corresponding to the text samples;
the input module 4 is used for inputting the text word vectors into the trained text topic extraction model;
and the output module 5 outputs the text theme corresponding to the text sample.
In one embodiment of the invention, a computer-readable storage medium may be any tangible medium that can contain or store a program or instructions; the program can be executed, via hardware associated with the stored program instructions, to implement the corresponding functions. For example, the computer-readable storage medium may be a computer diskette, hard disk, random access memory or read-only memory. The invention is not so limited: it can be any means that stores the instructions or software and any associated data files or data structures in a non-transitory manner and that can provide them to a processor so that the processor executes the programs or instructions therein. The computer-readable storage medium includes a text topic extraction program which, when executed by a processor, implements the following text topic extraction method:
constructing a text theme extraction model; training the text theme extraction model; acquiring a text word vector corresponding to a text sample; inputting the text word vector into a trained text theme extraction model; and outputting a text theme corresponding to the text sample.
The constructed text theme extraction model comprises a convolutional neural network and an attention mechanism. The convolutional neural network comprises an input layer, a convolutional layer, an activation layer and a fully connected layer: the input layer is used for inputting text word vectors, the convolutional layer performs a convolution operation on the text word vectors and extracts text features to obtain text feature vectors, and the activation layer activates the text feature vectors. The attention mechanism comprises a position attention mechanism and a channel attention mechanism, which are established in parallel and are both connected with the activation layer. A position attention weight is applied to the output matrix of the activation layer through the position attention mechanism to obtain a position attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the position attention mechanism. Likewise, a channel attention weight is applied to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention feature matrix, which is added to the output matrix of the activation layer to obtain the output result of the channel attention mechanism. The output results of the position attention mechanism and the channel attention mechanism are input into the fully connected layer, and the text theme is output through the fully connected layer.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as the specific implementation of the text theme extraction method and the electronic device, and is not described herein again.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article or method. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other like elements in a process, apparatus, article or method that includes the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A text theme extraction method is applied to an electronic device and is characterized by comprising the following steps:
constructing a text theme extraction model;
training the text theme extraction model;
acquiring a text word vector corresponding to a text sample;
inputting the text word vector into a trained text theme extraction model;
outputting a text topic corresponding to the text sample,
wherein the constructed text topic extraction model comprises a convolutional neural network and an attention mechanism,
the convolutional neural network comprises an input layer, a convolutional layer, an activation layer and a full-connection layer, wherein the input layer is used for inputting text word vectors, the convolutional layer is used for carrying out convolutional operation on the text word vectors and extracting text features to obtain text feature vectors, and the activation layer is used for activating the text feature vectors; the convolutional layer of the convolutional neural network comprises a plurality of one-dimensional convolutional kernels, the step lengths of the convolutional kernels are respectively 1, 3 and 5, the convolutional kernels of each step length respectively account for 1/3 of the total amount of the convolutional kernels, and the input dimension and the output dimension of the convolutional layer are consistent by setting the padding;
the attention mechanism comprises a position attention mechanism and a channel attention mechanism, the position attention mechanism and the channel attention mechanism are established in parallel and are connected with the activation layer, a position attention weight is applied to an output matrix of the activation layer through the position attention mechanism to obtain a position attention characteristic matrix, and the position attention characteristic matrix is added with the output matrix of the activation layer to obtain an output result of the position attention mechanism; wherein the step of obtaining the location attention feature matrix comprises: the output matrix of the active layer outputs a first conversion matrix through a full-connection hidden layer, and the first conversion matrix and the position attention weight matrix are subjected to matrix multiplication to obtain a position attention feature matrix; the step of obtaining the channel attention feature matrix comprises: the output matrix of the active layer outputs a second conversion matrix through another fully-connected hidden layer, and the second conversion matrix and the channel attention weight matrix are subjected to matrix multiplication to obtain a channel attention feature matrix;
applying a channel attention weight to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention characteristic matrix, and adding the channel attention characteristic matrix and the output matrix of the activation layer to obtain an output result of the channel attention mechanism;
and inputting the output result of the position attention mechanism and the output result of the channel attention mechanism into the full connection layer, and outputting the text theme through the full connection layer.
2. The text topic extraction method of claim 1, wherein the text topic extraction model further comprises a plurality of fully-connected hidden layers, the plurality of fully-connected hidden layers are arranged in parallel, each fully-connected hidden layer is connected with the active layer, and an output matrix of the active layer is converted into a position attention weight matrix and a channel attention weight matrix through the fully-connected hidden layers.
3. The text topic extraction method of claim 1, wherein the text topic extraction model further comprises an embedding layer, the embedding layer is located at a first layer of the text topic extraction model, and text samples are converted into text word vectors through the embedding layer.
4. The method for extracting text topics according to claim 1, wherein the step of obtaining text word vectors corresponding to the text samples comprises:
constructing a word vector model, and training the word vector model according to a corpus;
performing word segmentation on the text sample;
inputting the text sample after word segmentation into the trained word vector model;
and outputting a text word vector corresponding to the text sample.
5. The text topic extraction method of any one of claims 1 to 4 wherein the step of training the text topic extraction model comprises:
initializing parameters of the text topic extraction model, the parameters including: the connection weight between the input layer and the convolutional layer, the connection weight between the convolutional layer and the activation layer, and the connection weight between the activation layer and the full connection layer;
constructing a training sample set, wherein the training sample comprises a text word vector and a text theme;
inputting one training sample in the training sample set into the text theme extraction model, and outputting a text theme corresponding to the training sample;
updating the parameters based on a loss function of the text topic extraction model;
training the next training sample according to the updated parameters, and calculating a loss function value of the text theme extraction model;
judging whether the training of the text theme extraction model reaches a convergence condition, if so, ending the training to obtain the trained text theme extraction model, and if not, updating the parameters of the text theme extraction model and continuing the training, wherein the convergence condition is that the change of the loss function value is smaller than a preset threshold value.
6. An electronic device, comprising:
a processor;
a memory including a text topic extraction program, the text topic extraction program when executed by the processor implementing the steps of the text topic extraction method as follows:
constructing a text theme extraction model;
training the text theme extraction model;
acquiring a text word vector corresponding to a text sample;
inputting the text word vector into a trained text theme extraction model;
outputting a text topic corresponding to the text sample,
wherein the constructed text topic extraction model comprises a convolutional neural network and an attention mechanism,
the convolutional neural network comprises an input layer, a convolutional layer, an activation layer and a full-connection layer, wherein the input layer is used for inputting text word vectors, the convolutional layer is used for carrying out convolutional operation on the text word vectors and extracting text features to obtain text feature vectors, and the activation layer is used for activating the text feature vectors; the convolutional layer of the convolutional neural network comprises a plurality of one-dimensional convolutional kernels, the step lengths of the convolutional kernels are respectively 1, 3 and 5, the convolutional kernels of each step length respectively account for 1/3 of the total amount of the convolutional kernels, and the input dimension and the output dimension of the convolutional layer are consistent by setting the padding;
the attention mechanism comprises a position attention mechanism and a channel attention mechanism, the position attention mechanism and the channel attention mechanism are established in parallel and are connected with the activation layer, a position attention weight is applied to an output matrix of the activation layer through the position attention mechanism to obtain a position attention characteristic matrix, and the position attention characteristic matrix is added with the output matrix of the activation layer to obtain an output result of the position attention mechanism; wherein the step of obtaining the location attention feature matrix comprises: the output matrix of the active layer outputs a first conversion matrix through a full-connection hidden layer, and the first conversion matrix and the position attention weight matrix are subjected to matrix multiplication to obtain a position attention feature matrix; the step of obtaining the channel attention feature matrix comprises: the output matrix of the active layer outputs a second conversion matrix through another fully-connected hidden layer, and the second conversion matrix and the channel attention weight matrix are subjected to matrix multiplication to obtain a channel attention feature matrix;
applying a channel attention weight to the output matrix of the activation layer through the channel attention mechanism to obtain a channel attention characteristic matrix, and adding the channel attention characteristic matrix and the output matrix of the activation layer to obtain an output result of the channel attention mechanism;
and inputting the output result of the position attention mechanism and the output result of the channel attention mechanism into the full connection layer, and outputting the text theme through the full connection layer.
7. The electronic device of claim 6, wherein the text topic extraction model further comprises a plurality of fully-connected hidden layers, the plurality of fully-connected hidden layers are arranged in parallel, each fully-connected hidden layer is connected with the active layer, and an output matrix of the active layer is converted into a position attention weight matrix and a channel attention weight matrix through the fully-connected hidden layers.
8. The electronic device of claim 7, wherein the text topic extraction model further comprises an embedding layer located at a first layer of the text topic extraction model, and wherein text samples are converted into text word vectors by the embedding layer.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a text topic extraction program, and when the text topic extraction program is executed by a processor, the steps of the text topic extraction method according to any one of claims 1 to 5 are implemented.
CN201910008265.9A 2019-01-04 2019-01-04 Text theme extraction method and device and storage medium Active CN109871532B (en)

Publications (2)

Publication Number Publication Date
CN109871532A (en) 2019-06-11
CN109871532B (en) 2022-07-08


Also Published As

Publication number Publication date
WO2020140633A1 (en) 2020-07-09
CN109871532A (en) 2019-06-11

Similar Documents

Publication Publication Date Title
CN109871532B (en) Text theme extraction method and device and storage medium
CN109960726B (en) Text classification model construction method, device, terminal and storage medium
CN110348535B (en) Visual question-answering model training method and device
CN108763535B (en) Information acquisition method and device
CN110765785B (en) Chinese-English translation method based on neural network and related equipment thereof
CN113722438B (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN111985243B (en) Emotion model training method, emotion analysis device and storage medium
US11238050B2 (en) Method and apparatus for determining response for user input data, and medium
CN110795913A (en) Text encoding method and device, storage medium and terminal
CN113239176B (en) Semantic matching model training method, device, equipment and storage medium
CN115062134B (en) Knowledge question-answering model training and knowledge question-answering method, device and computer equipment
CN111858898A (en) Text processing method and device based on artificial intelligence and electronic equipment
CN111476138A (en) Construction method and identification method of building drawing component identification model and related equipment
CN110866098A (en) Machine reading method and device based on Transformer and LSTM and readable storage medium
CN113157900A (en) Intention recognition method and device, computer equipment and storage medium
CN110222144B (en) Text content extraction method and device, electronic equipment and storage medium
CN113723077B (en) Sentence vector generation method and device based on bidirectional representation model and computer equipment
CN113535912B (en) Text association method and related equipment based on graph convolutional network and attention mechanism
CN110598210A (en) Entity recognition model training method, entity recognition device, entity recognition equipment and medium
CN115906861B (en) Sentence emotion analysis method and device based on interaction aspect information fusion
CN114970666B (en) Spoken language processing method and device, electronic equipment and storage medium
CN116957006A (en) Training method, device, equipment, medium and program product of prediction model
CN113626468B (en) SQL statement generation method, device and equipment based on artificial intelligence and storage medium
CN111401069A (en) Intention recognition method and intention recognition device for conversation text and terminal
CN114782720A (en) Method, device, electronic device, medium, and program product for determining document matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant