WO2021174922A1 - Statement sentiment classification method and related device - Google Patents

Statement sentiment classification method and related device

Info

Publication number
WO2021174922A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
sequence
word
coding
weight matrix
Prior art date
Application number
PCT/CN2020/131951
Other languages
French (fr)
Chinese (zh)
Inventor
于凤英
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202010137265.1A external-priority patent/CN111460812B/en
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021174922A1 publication Critical patent/WO2021174922A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • This application relates to the field of natural language processing, and specifically relates to a sentence emotion classification method, device, computer equipment, and computer storage medium.
  • In the prior art, sentiment classification models such as convolutional neural networks are trained, and sentences in a designated domain are then classified with the trained sentiment classification model. However, the inventor realizes that existing text sentiment classification methods are only suitable for sentence sentiment classification tasks in a fixed domain, and a larger training set is required to improve the accuracy of sentiment classification.
  • The first aspect of the present application provides a sentence sentiment classification method, which includes:
  • obtaining a first sentence sample set, where each first sentence sample in the first sentence sample set contains a missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and to convert the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table;
  • using the feature extraction model to encode the first word vector sequence into a first coding sequence and the second word vector sequence into a second coding sequence; using the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
  • training the feature extraction model according to the missing word vector and the label vector of the first sentence sample to obtain a first feature extraction model, creating a new second feature extraction model whose neural network structure is consistent with that of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
  • training an attribute classification model, composed of the first feature extraction model and a fully connected layer, with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and concatenating each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized with attribute words attached;
  • training a sentiment classification model, composed of the second feature extraction model and a deep learning model, with the plurality of attribute-word-attached sentences to be recognized carrying sentiment labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, classifying, with the sentiment classification model, the sentence to be processed with its attribute word attached, and outputting the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  • The second aspect of the present application provides a sentence sentiment classification device, which includes:
  • an obtaining module, configured to obtain a first sentence sample set, where each first sentence sample in the first sentence sample set contains a missing word;
  • a conversion module, configured to, for each first sentence sample, use a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word into a second word vector sequence in reverse word order, and convert the missing word into a label vector of the first sentence sample according to a preset vocabulary coding table;
  • an encoding module, configured to use the feature extraction model to encode the first word vector sequence into a first coding sequence and the second word vector sequence into a second coding sequence;
  • a calculation module, configured to use the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
  • a first training module, configured to train the feature extraction model according to the missing word vector and the label vector of the first sentence sample to obtain a first feature extraction model, create a new second feature extraction model whose neural network structure is consistent with that of the first feature extraction model, and update the weights of the second feature extraction model with the weights of the first feature extraction model;
  • a second training module, configured to train an attribute classification model, composed of the first feature extraction model and a fully connected layer, with second sentence samples carrying attribute labels;
  • a connection module, configured to use the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and to concatenate each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized with attribute words attached;
  • a third training module, configured to train a sentiment classification model, composed of the second feature extraction model and a deep learning model, with the plurality of attribute-word-attached sentences to be recognized carrying sentiment labels; and
  • a classification module, configured to use the attribute classification model to identify the attribute word of a sentence to be processed, classify, with the sentiment classification model, the sentence to be processed with its attribute word attached, and output the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  • The third aspect of the present application provides a computer device; the computer device includes a processor configured to implement the following steps when executing computer-readable instructions stored in a memory:
  • obtaining a first sentence sample set, where each first sentence sample contains a missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word into a first word vector sequence in word order, to convert the words after the missing word into a second word vector sequence in reverse word order, and to convert the missing word into a label vector of the first sentence sample according to a preset vocabulary coding table;
  • using the feature extraction model to encode the first word vector sequence into a first coding sequence and the second word vector sequence into a second coding sequence, and to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
  • training the feature extraction model according to the missing word vector and the label vector of the first sentence sample to obtain a first feature extraction model, creating a new second feature extraction model whose neural network structure is consistent with that of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
  • training an attribute classification model, composed of the first feature extraction model and a fully connected layer, with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and concatenating each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized with attribute words attached;
  • training a sentiment classification model, composed of the second feature extraction model and a deep learning model, with the plurality of attribute-word-attached sentences to be recognized carrying sentiment labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, classifying, with the sentiment classification model, the sentence to be processed with its attribute word attached, and outputting the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  • The fourth aspect of the present application provides a computer storage medium having computer-readable instructions stored thereon; when executed by a processor, the computer-readable instructions implement the following steps:
  • obtaining a first sentence sample set, where each first sentence sample contains a missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word into a first word vector sequence in word order, to convert the words after the missing word into a second word vector sequence in reverse word order, and to convert the missing word into a label vector of the first sentence sample according to a preset vocabulary coding table;
  • using the feature extraction model to encode the first word vector sequence into a first coding sequence and the second word vector sequence into a second coding sequence, and to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
  • training the feature extraction model according to the missing word vector and the label vector of the first sentence sample to obtain a first feature extraction model, creating a new second feature extraction model whose neural network structure is consistent with that of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
  • training an attribute classification model, composed of the first feature extraction model and a fully connected layer, with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and concatenating each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized with attribute words attached;
  • training a sentiment classification model, composed of the second feature extraction model and a deep learning model, with the plurality of attribute-word-attached sentences to be recognized carrying sentiment labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, classifying, with the sentiment classification model, the sentence to be processed with its attribute word attached, and outputting the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  • This application performs sentiment classification on sentences and enhances the accuracy and scene adaptability of sentiment classification.
  • Fig. 1 is a flowchart of a sentence sentiment classification method provided by an embodiment of the present application.
  • Fig. 2 is a structural diagram of a sentence sentiment classification device provided by an embodiment of the present application.
  • Fig. 3 is a schematic diagram of a computer device provided by an embodiment of the present application.
  • the sentence emotion classification method of the present application is applied to one or more computer devices.
  • the computer device is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), embedded devices, and the like.
  • the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
  • Fig. 1 is a flowchart of a sentence sentiment classification method provided in Embodiment 1 of the present application.
  • the sentence emotion classification method is applied to computer equipment.
  • the sentence sentiment classification method of the present application can perform sentiment classification on the sentence.
  • the sentence emotion classification method includes:
  • Multiple texts in multiple fields are obtained; each text includes multiple sentences. Each text is masked multiple times, with part of the words in the text masked each time, and a sentence containing a missing word is extracted from each masked copy of the text as a first sentence sample.
  • Each field includes multiple texts, and each text in each field can include multiple sentences.
  • This embodiment does not limit the scope of a field; for example, the field of electronic products may include the field of notebook computers.
  • Each text in each field can be masked multiple times, and a preset proportion of the words in the text can be randomly masked each time, to obtain first sentence samples with missing words from the texts in each field, as sketched below.
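A minimal sketch of this sampling step, not the patent's reference implementation: the mask token name, the 15% masking ratio, and the number of passes are assumptions for illustration only.

```python
import random

MASK = "<mask>"  # assumed mask token

def make_first_sentence_samples(text_sentences, mask_ratio=0.15, passes=3):
    """Mask a preset proportion of words per pass and keep the sentences that contain a missing word."""
    samples = []
    for _ in range(passes):                       # each text is masked multiple times
        for sentence in text_sentences:           # each text contains multiple sentences
            words = sentence.split()
            masked = [MASK if random.random() < mask_ratio else w for w in words]
            if MASK in masked:                    # keep only sentences that lost at least one word
                samples.append(" ".join(masked))
    return samples

# usage on a toy "field" of two sentences
texts = ["this computer responds quickly", "the screen resolution is sharp"]
print(make_first_sentence_samples(texts))
```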
  • For each first sentence sample, a feature extraction model is used to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and to convert the missing word in the first sentence sample into the label vector of the first sentence sample according to a preset vocabulary coding table.
  • the feature extraction model includes the input layer, the forward hidden layer, the backward hidden layer, and the output layer.
  • Using the feature extraction model to convert the words before the missing word in the first sentence sample into the first word vector sequence in word order, and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order, includes the following example.
  • For example, a first sentence sample is "<S>自<mask>语言处理<E>" (the sentence "自然语言处理", i.e. "natural language processing", with the word "然" masked), where "<S>" denotes the head word of the first sentence sample and "<E>" denotes its tail word. The words "<S>自" before the missing word "然" are converted in word order into the first word vector sequence {(0,0,0,0,1,0,0,0), (0,0,0,0,0,0,1)}, and the words "语言处理<E>" after the missing word "然" are converted in reverse word order into the second word vector sequence {(0,0,0,0,0,1,0), (0,0,0,0,0,1,0,0), (1,0,0,0,0,0), (0,0,1,0,0,0,0), (0,1,0,0,0,0,0), (0,1,0,0,0,0)}.
  • The preset vocabulary coding table may adopt encoding methods such as one-hot and word2vec.
  • The missing word <mask> in the first sentence sample is converted into the label vector (0,0,0,1,0,0,0,0) of the first sentence sample, that is, the one-hot encoding of "然". A sketch of this conversion follows below.
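A hedged sketch of how the word-order / reverse-word-order conversion and the one-hot label vector could be produced from a preset vocabulary coding table; the vocabulary, helper names, and example sentence are illustrative and not taken from the patent.

```python
def one_hot(index, size):
    vec = [0] * size
    vec[index] = 1
    return vec

def split_and_encode(tokens, mask_index, masked_word, vocab):
    """Words before <mask> in word order, words after <mask> in reverse word order, plus the label vector."""
    size = len(vocab)
    first_seq = [one_hot(vocab[t], size) for t in tokens[:mask_index]]                # word order
    second_seq = [one_hot(vocab[t], size) for t in reversed(tokens[mask_index + 1:])] # reverse word order
    label = one_hot(vocab[masked_word], size)  # label vector of the first sentence sample
    return first_seq, second_seq, label

# usage with an assumed toy vocabulary
vocab = {"<S>": 0, "<E>": 1, "this": 2, "computer": 3, "responds": 4, "quickly": 5}
tokens = ["<S>", "this", "computer", "<mask>", "quickly", "<E>"]
first, second, label = split_and_encode(tokens, mask_index=3, masked_word="responds", vocab=vocab)
```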
  • The forward hidden layer of the feature extraction model encodes the first word vector sequence into the first coding sequence, and the backward hidden layer of the feature extraction model encodes the second word vector sequence into the second coding sequence.
  • The forward hidden layer and the backward hidden layer respectively include N forward hidden sublayers and N backward hidden sublayers; each forward hidden sublayer includes U encoding modules, and each backward hidden sublayer includes W encoding modules. The u-th encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,u-1} output by the (u-1)-th encoding module of the (n-1)-th forward hidden sublayer and the vector Z_{n-1,u} output by the u-th encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,u} to the u-th encoding module and the (u+1)-th encoding module of the (n+1)-th forward hidden sublayer, where 2 ≤ u ≤ U.
  • The u-th encoding module of the first forward hidden sublayer receives the (u-1)-th word vector and the u-th word vector of the first word vector sequence, and the output of the N-th forward hidden sublayer is the first coding sequence.
  • The first encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,1} output by the first encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,1} to the first encoding module of the (n+1)-th forward hidden sublayer.
  • The w-th encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,w-1} output by the (w-1)-th encoding module of the (n-1)-th backward hidden sublayer and the vector R_{n-1,w} output by the w-th encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,w} to the w-th encoding module and the (w+1)-th encoding module of the (n+1)-th backward hidden sublayer, where 2 ≤ w ≤ W.
  • The w-th encoding module of the first backward hidden sublayer receives the (w-1)-th word vector and the w-th word vector of the second word vector sequence, and the output of the N-th backward hidden sublayer is the second coding sequence.
  • The first encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,1} output by the first encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,1} to the first encoding module of the (n+1)-th backward hidden sublayer. This layer-to-layer connectivity is sketched in code after this list.
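A simplified sketch of the stacked connectivity described above, under the assumption that each encoding module can be abstracted as a callable taking the previous layer's vectors at positions u-1 and u; the function and parameter names are illustrative only. The backward hidden layer would run the same loop over the second word vector sequence, producing the R vectors.

```python
def run_forward_hidden_layer(word_vectors, num_layers, encode):
    """Stack N forward hidden sublayers; module u of layer n consumes vectors u-1 and u of layer n-1.

    `encode(prev, cur, layer)` is a placeholder for one encoding module (attention + feedforward);
    its internals are sketched separately further below.
    """
    Z = list(word_vectors)                      # layer 0: the first word vector sequence
    for n in range(1, num_layers + 1):
        Z = [encode(Z[u - 1] if u > 0 else None, Z[u], n) for u in range(len(Z))]
    return Z                                    # output of the N-th sublayer: the first coding sequence

# usage with a trivial stand-in for an encoding module
seq = run_forward_hidden_layer(
    [1.0, 2.0, 3.0], num_layers=2,
    encode=lambda prev, cur, n: cur if prev is None else 0.5 * (prev + cur))
```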
  • the encoding of the first word vector sequence into the first encoding sequence by the feature extraction model includes:
  • The first encoding module of the first forward hidden sublayer encodes, according to the first weight matrix subset in the initialized weight matrix set, the first word vector of the first word vector sequence into the first vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence. The initialized weight matrix set includes N weight matrix subsets, and the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence.
  • The n-th forward hidden sublayer and the n-th backward hidden sublayer share the n-th weight matrix subset. Each weight matrix subset includes multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices includes a V weight matrix, a Q weight matrix, and a K weight matrix.
  • The V weight matrices, Q weight matrices, and K weight matrices in the multiple groups are used to calculate the first coding sequence and the second coding sequence based on multi-head attention; that is, the first coding sequence represents the preceding-context semantic information of the missing word in the first sentence sample, and the second coding sequence represents the following-context semantic information of the missing word in the first sentence sample.
  • Encoding, by the first encoding module of the first forward hidden sublayer and according to the first weight matrix subset, the first word vector of the first word vector sequence into the first vector Z_{1,1} includes: the first encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the V weight matrices in the multiple groups of weight matrices in the first weight matrix subset.
  • The connection between the first encoding modules of two adjacent layers is similar to an ordinary neuron connection, and no attention mechanism is used.
  • The u-th encoding module of the first forward hidden sublayer encodes, one by one according to the first weight matrix subset, the (u-1)-th word vector and the u-th word vector of the first word vector sequence into the u-th vector of the first intermediate vector sequence; the u-th vector of the first intermediate vector sequence corresponds one-to-one, in order, to the u-th word vector of the first word vector sequence.
  • (1) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
  • (2) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the Q weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the Q weight vector of the second vector Z_{1,2}.
  • (3) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K weight vector of the second vector Z_{1,2}.
  • (4) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V' weight vector of the second vector Z_{1,2}.
  • (5) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K' weight vector of the second vector Z_{1,2}.
  • (6) The second encoding module of the first forward hidden sublayer determines, based on the Q weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence, the K weight vector of Z_{1,2}, and the K' weight vector of Z_{1,2}, the attention value of the V weight vector of Z_{1,2} and the attention value of the V' weight vector of Z_{1,2}.
  • (7) The second encoding module of the first forward hidden sublayer determines the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence based on the V weight vector of Z_{1,2}, the V' weight vector of Z_{1,2}, the attention value of the V weight vector, and the attention value of the V' weight vector.
  • (8) Steps (1) to (7) describe how the second encoding module of the first forward hidden sublayer obtains, according to the first group of weight matrices in the first weight matrix subset, the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence. Multiple scores of the second vector Z_{1,2} can be obtained from the multiple groups of weight matrices in the first weight matrix subset at the same time.
  • (9) The second encoding module of the first forward hidden sublayer concatenates the multiple scores of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence to obtain the combined vector of the second vector Z_{1,2}.
  • (10) The second encoding module of the first forward hidden sublayer multiplies the combined vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence by the fourth weight matrix to obtain an intermediate vector of the second vector Z_{1,2}.
  • (11) The feedforward network in the second encoding module of the first forward hidden sublayer performs residual connection and normalization on the intermediate vector of the second vector Z_{1,2}, encodes the result, and performs normalization again to obtain the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
  • (12) Steps (1) to (11) describe how the second encoding module of the first forward hidden sublayer encodes, according to the first weight matrix subset, the second word vector and the first word vector of the first word vector sequence into the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence. A simplified sketch of these steps is given below.
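A minimal NumPy sketch of steps (1)-(11) for a single encoding module with two input positions, assuming vector/matrix shapes that are not specified in the patent; the scaling by the square root of the dimension, the simple normalization, and the omission of the feedforward sub-block are simplifications of the described procedure.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def encode_second_vector(x1, x2, heads, W4):
    """Sketch of Z_{1,2}: per head (one group of Q/K/V matrices), build Q/K/V for position 2 and
    K'/V' for position 1, attend, concatenate the head scores, project with the fourth weight
    matrix W4, then apply a residual connection and normalization (feedforward sub-block omitted)."""
    scores = []
    for (Wq, Wk, Wv) in heads:
        q2 = x2 @ Wq                               # Q weight vector of Z_{1,2}
        k2, v2 = x2 @ Wk, x2 @ Wv                  # K and V weight vectors of Z_{1,2}
        k1, v1 = x1 @ Wk, x1 @ Wv                  # K' and V' weight vectors (from the first word vector)
        att = softmax(np.array([q2 @ k2, q2 @ k1]) / np.sqrt(len(q2)))  # attention values
        scores.append(att[0] * v2 + att[1] * v1)   # one score of Z_{1,2} per group of weight matrices
    combined = np.concatenate(scores)              # combined vector of Z_{1,2}
    inter = combined @ W4                          # intermediate vector of Z_{1,2}
    out = x2 + inter                               # residual connection (shapes assumed compatible)
    return (out - out.mean()) / (out.std() + 1e-6) # normalization
```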
  • The n-th forward hidden sublayer encodes, according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
  • In this way, the feature extraction model encodes the first word vector sequence into the first coding sequence Z_N, and encodes the second word vector sequence into the second coding sequence R_N.
  • Similarly, the u-th encoding module of the first forward hidden sublayer encodes the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence. Each encoding module in the forward hidden sublayer and the backward hidden sublayer of the same layer can run concurrently.
  • the output layer of the feature extraction model is used to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
  • The vectors in the first coding sequence and the second coding sequence are summed dimension-wise, and the sum vector obtained is multiplied by the output weight matrix and normalized to obtain the missing word vector of the first sentence sample.
  • The loss value between the missing word vector and the label vector of the first sentence sample may be calculated with the cross-entropy loss function, and the weight matrices of the feature extraction model may be optimized according to the loss value, as sketched below.
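A hedged sketch of the output-layer computation and the cross-entropy loss described in the two items above; shapes, the softmax normalization, and variable names are assumptions, since the patent does not fix them.

```python
import numpy as np

def missing_word_distribution(first_coding_seq, second_coding_seq, W_out):
    """Sum the two coding sequences dimension-wise, multiply by the output weight matrix,
    and normalize (softmax assumed) to obtain the missing word vector."""
    s = np.sum(first_coding_seq, axis=0) + np.sum(second_coding_seq, axis=0)  # dimension-wise sum
    logits = s @ W_out
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def cross_entropy(missing_word_vec, label_vec, eps=1e-12):
    """Loss value between the predicted missing word vector and the one-hot label vector."""
    return -float(np.sum(np.asarray(label_vec) * np.log(missing_word_vec + eps)))
```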
  • a new intermediate feature extraction model can be created according to the neural network structure of the first feature extraction model.
  • the neural network structure can include the number of neurons, the number of neuron layers, and the way of connection between neurons.
  • The weights of the first feature extraction model can be copied: after the first feature extraction model has been trained, its weights give it a strong feature extraction capability, so the weights of the intermediate feature extraction model are initialized with the weights of the first feature extraction model to obtain a second feature extraction model that is the same as the first feature extraction model.
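One simple, framework-agnostic way to realize this copy is sketched below; a deep copy is an assumption for illustration, and framework-specific weight copying (for example, copying a state dict in PyTorch) would serve the same purpose.

```python
import copy

def make_second_feature_extractor(first_model):
    """Create a second feature extraction model with the same neural network structure
    and initialize its weights from the trained first feature extraction model."""
    second_model = copy.deepcopy(first_model)   # identical structure, copied weights
    return second_model
```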
  • the attribute label may include resolution, processor, sound effects, etc.
  • For example, one second sentence sample is "This computer responds quickly", and its attribute label is "processor", indicating that this second sentence sample involves the semantics of the processor.
  • the second sentence sample may be a sentence in a given field with attribute tags.
  • A small number of second sentence samples can be used to train the attribute classification model. Because the feature extraction model has already been trained and can extract semantic information well, only fine-tuning of the weight matrices of the feature extraction model is required, while the weight matrix of the fully connected layer is optimized. The output of the first feature extraction model is the input of the fully connected layer.
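A hedged PyTorch-style sketch of an attribute classification model built from the first feature extraction model plus a fully connected layer, with a smaller learning rate on the pretrained extractor to reflect fine-tuning; the class name, dimensions, and learning rates are assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class AttributeClassifier(nn.Module):
    """First feature extraction model followed by a fully connected layer (shapes assumed)."""
    def __init__(self, feature_extractor, hidden_dim, num_attributes):
        super().__init__()
        self.feature_extractor = feature_extractor
        self.fc = nn.Linear(hidden_dim, num_attributes)   # extractor output feeds the FC layer

    def forward(self, x):
        features = self.feature_extractor(x)
        return self.fc(features)

def build_optimizer(model):
    # fine-tune the pretrained extractor gently, train the new FC layer normally (assumed rates)
    return torch.optim.Adam([
        {"params": model.feature_extractor.parameters(), "lr": 1e-5},
        {"params": model.fc.parameters(), "lr": 1e-3},
    ])
```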
  • Each sentence in the second sentence samples is divided into two equal parts according to the number of words. The words in the front part of each sentence of the second sentence samples play the same role as the words before the missing word of each sentence in the first sentence samples, and the words in the latter part play the same role as the words after the missing word of each sentence in the first sentence samples. When a single middle word is left over while dividing a second sentence sample, the middle word is assigned to the front part.
  • the process of using the second sentence sample to train the first feature extraction model is similar to the process of using the first sentence sample to train the feature extraction model, and will not be repeated here.
  • the sentiment label may include "positive”, “neutral”, “negative”, etc.
  • The deep learning model may be a CNN, an RNN, or an LSTM. Training the sentiment classification model follows existing methods and is not repeated here. The output of the second feature extraction model is the input of the deep learning model; a sketch of this composition follows below.
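A hedged sketch of the sentiment classification model as the second feature extraction model followed by a deep learning head; an LSTM is chosen arbitrarily from the CNN/RNN/LSTM options, and the dimensions, class name, and three-way output are assumptions.

```python
import torch.nn as nn

class SentimentClassifier(nn.Module):
    """Second feature extraction model followed by a deep learning model and a classification layer."""
    def __init__(self, feature_extractor, feat_dim, hidden_dim, num_sentiments=3):
        super().__init__()
        self.feature_extractor = feature_extractor
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_sentiments)   # e.g. positive / neutral / negative

    def forward(self, x):
        features = self.feature_extractor(x)        # assumed shape (batch, seq_len, feat_dim)
        _, (h_n, _) = self.lstm(features)
        return self.out(h_n[-1])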
  • The multiple sentences to be recognized with attribute words attached may be output and manually labeled, so as to obtain and then receive the multiple attribute-word-attached sentences to be recognized with sentiment labels.
  • The process of training the second feature extraction model with the plurality of attribute-word-attached sentences to be recognized carrying sentiment labels is similar to the process of training the first feature extraction model with the second sentence samples, and is not repeated here.
  • For example, the attribute classification model identifies the attribute word of the sentence to be processed "this computer responds very fast" as "processor", and the sentiment classification model classifies the attribute-word-attached sentence "<S>this computer responds very fast<SOE>processor<E>", outputting the attribute word "processor" of the sentence to be processed and the sentiment type "positive" of the sentence to be processed. An end-to-end sketch of this inference flow is given below.
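A minimal end-to-end inference sketch matching the example above; the two model objects and their predict() methods are placeholders for illustration, not APIs defined by the patent.

```python
def classify_sentence(sentence, attribute_model, sentiment_model):
    """Identify the attribute word, attach it to the sentence with the <S>/<SOE>/<E> markers
    used in the example above, then classify the sentiment of the attribute-word-attached sentence."""
    attribute = attribute_model.predict(sentence)            # e.g. "processor"
    tagged = f"<S>{sentence}<SOE>{attribute}<E>"              # sentence with attribute word attached
    sentiment = sentiment_model.predict(tagged)               # e.g. "positive"
    return attribute, sentiment
```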
  • the attribute classification model and the emotion classification model may also be stored in a node of a blockchain.
  • The first embodiment realizes sentiment classification of sentences and enhances the accuracy and scene adaptability of sentiment classification.
  • The U-th encoding module of the n-th forward hidden sublayer encodes the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence into Z_{n,U}; the W-th encoding module of the n-th backward hidden sublayer encodes the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence into R_{n,W}.
  • the feature extraction model can be migrated to sentiment classification models in different fields.
  • Fig. 2 is a structural diagram of a sentence emotion classification device provided in the second embodiment of the present application.
  • the sentence emotion classification device 20 is applied to computer equipment.
  • the sentence emotion classification device 20 can perform emotion classification on the sentence.
  • The sentence sentiment classification device 20 may include an acquisition module 201, a conversion module 202, an encoding module 203, a calculation module 204, a first training module 205, a second training module 206, a connection module 207, a third training module 208, and a classification module 209.
  • the obtaining module 201 is configured to obtain a first sentence sample set, and each first sentence sample in the first sentence sample set contains a missing word.
  • Multiple texts in multiple fields are obtained; each text includes multiple sentences. Each text is masked multiple times, with part of the words in the text masked each time, and a sentence containing a missing word is extracted from each masked copy of the text as a first sentence sample.
  • Each field includes multiple texts, and each text in each field can include multiple sentences.
  • This embodiment does not limit the scope of a field; for example, the field of electronic products may include the field of notebook computers.
  • Each text in each field can be masked multiple times, and a preset proportion of the words in the text can be randomly masked each time, to obtain first sentence samples with missing words from the texts in each field.
  • The conversion module 202 is configured to, for each first sentence sample, use a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and convert the missing word in the first sentence sample into the label vector of the first sentence sample according to a preset vocabulary coding table.
  • the feature extraction model includes the input layer, the forward hidden layer, the backward hidden layer, and the output layer.
  • Using the feature extraction model to convert the words before the missing word in the first sentence sample into the first word vector sequence in word order, and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order, includes the following example.
  • For example, a first sentence sample is "<S>自<mask>语言处理<E>" (the sentence "自然语言处理", i.e. "natural language processing", with the word "然" masked), where "<S>" denotes the head word of the first sentence sample and "<E>" denotes its tail word. The words "<S>自" before the missing word "然" are converted in word order into the first word vector sequence {(0,0,0,0,1,0,0,0), (0,0,0,0,0,0,1)}, and the words "语言处理<E>" after the missing word "然" are converted in reverse word order into the second word vector sequence {(0,0,0,0,0,1,0), (0,0,0,0,0,1,0,0), (1,0,0,0,0,0), (0,0,1,0,0,0,0), (0,1,0,0,0,0,0), (0,1,0,0,0,0)}.
  • The preset vocabulary coding table may adopt encoding methods such as one-hot and word2vec.
  • The missing word <mask> in the first sentence sample is converted into the label vector (0,0,0,1,0,0,0,0) of the first sentence sample, that is, the one-hot encoding of "然".
  • the encoding module 203 is configured to use the feature extraction model to encode the first word vector sequence into a first encoding sequence, and to encode the second word vector sequence into a second encoding sequence.
  • The forward hidden layer of the feature extraction model encodes the first word vector sequence into the first coding sequence, and the backward hidden layer of the feature extraction model encodes the second word vector sequence into the second coding sequence.
  • The forward hidden layer and the backward hidden layer respectively include N forward hidden sublayers and N backward hidden sublayers; each forward hidden sublayer includes U encoding modules, and each backward hidden sublayer includes W encoding modules. The u-th encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,u-1} output by the (u-1)-th encoding module of the (n-1)-th forward hidden sublayer and the vector Z_{n-1,u} output by the u-th encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,u} to the u-th encoding module and the (u+1)-th encoding module of the (n+1)-th forward hidden sublayer, where 2 ≤ u ≤ U.
  • The u-th encoding module of the first forward hidden sublayer receives the (u-1)-th word vector and the u-th word vector of the first word vector sequence, and the output of the N-th forward hidden sublayer is the first coding sequence.
  • The first encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,1} output by the first encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,1} to the first encoding module of the (n+1)-th forward hidden sublayer.
  • The w-th encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,w-1} output by the (w-1)-th encoding module of the (n-1)-th backward hidden sublayer and the vector R_{n-1,w} output by the w-th encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,w} to the w-th encoding module and the (w+1)-th encoding module of the (n+1)-th backward hidden sublayer, where 2 ≤ w ≤ W.
  • The w-th encoding module of the first backward hidden sublayer receives the (w-1)-th word vector and the w-th word vector of the second word vector sequence, and the output of the N-th backward hidden sublayer is the second coding sequence.
  • The first encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,1} output by the first encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,1} to the first encoding module of the (n+1)-th backward hidden sublayer.
  • the encoding of the first word vector sequence into the first encoding sequence by the feature extraction model includes:
  • The first encoding module of the first forward hidden sublayer encodes, according to the first weight matrix subset in the initialized weight matrix set, the first word vector of the first word vector sequence into the first vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence. The initialized weight matrix set includes N weight matrix subsets, and the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence.
  • The n-th forward hidden sublayer and the n-th backward hidden sublayer share the n-th weight matrix subset. Each weight matrix subset includes multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices includes a V weight matrix, a Q weight matrix, and a K weight matrix.
  • The V weight matrices, Q weight matrices, and K weight matrices in the multiple groups are used to calculate the first coding sequence and the second coding sequence based on multi-head attention; that is, the first coding sequence represents the preceding-context semantic information of the missing word in the first sentence sample, and the second coding sequence represents the following-context semantic information of the missing word in the first sentence sample.
  • Encoding, by the first encoding module of the first forward hidden sublayer and according to the first weight matrix subset, the first word vector of the first word vector sequence into the first vector Z_{1,1} includes: the first encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the V weight matrices in the multiple groups of weight matrices in the first weight matrix subset.
  • The connection between the first encoding modules of two adjacent layers is similar to an ordinary neuron connection, and no attention mechanism is used.
  • The u-th encoding module of the first forward hidden sublayer encodes, one by one according to the first weight matrix subset, the (u-1)-th word vector and the u-th word vector of the first word vector sequence into the u-th vector of the first intermediate vector sequence; the u-th vector of the first intermediate vector sequence corresponds one-to-one, in order, to the u-th word vector of the first word vector sequence.
  • (1) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
  • (2) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the Q weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the Q weight vector of the second vector Z_{1,2}.
  • (3) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K weight vector of the second vector Z_{1,2}.
  • (4) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V' weight vector of the second vector Z_{1,2}.
  • (5) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K' weight vector of the second vector Z_{1,2}.
  • (6) The second encoding module of the first forward hidden sublayer determines, based on the Q weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence, the K weight vector of Z_{1,2}, and the K' weight vector of Z_{1,2}, the attention value of the V weight vector of Z_{1,2} and the attention value of the V' weight vector of Z_{1,2}.
  • (7) The second encoding module of the first forward hidden sublayer determines the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence based on the V weight vector of Z_{1,2}, the V' weight vector of Z_{1,2}, the attention value of the V weight vector, and the attention value of the V' weight vector.
  • (8) Steps (1) to (7) describe how the second encoding module of the first forward hidden sublayer obtains, according to the first group of weight matrices in the first weight matrix subset, the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence. Multiple scores of the second vector Z_{1,2} can be obtained from the multiple groups of weight matrices in the first weight matrix subset at the same time.
  • (9) The second encoding module of the first forward hidden sublayer concatenates the multiple scores of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence to obtain the combined vector of the second vector Z_{1,2}.
  • (10) The second encoding module of the first forward hidden sublayer multiplies the combined vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence by the fourth weight matrix to obtain an intermediate vector of the second vector Z_{1,2}.
  • (11) The feedforward network in the second encoding module of the first forward hidden sublayer performs residual connection and normalization on the intermediate vector of the second vector Z_{1,2}, encodes the result, and performs normalization again to obtain the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
  • (12) Steps (1) to (11) describe how the second encoding module of the first forward hidden sublayer encodes, according to the first weight matrix subset, the second word vector and the first word vector of the first word vector sequence into the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
  • The n-th forward hidden sublayer encodes, according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
  • In this way, the feature extraction model encodes the first word vector sequence into the first coding sequence Z_N, and encodes the second word vector sequence into the second coding sequence R_N.
  • Similarly, the u-th encoding module of the first forward hidden sublayer encodes the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence. Each encoding module in the forward hidden sublayer and the backward hidden sublayer of the same layer can run concurrently.
  • the calculation module 204 is configured to use the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
  • the output layer of the feature extraction model is used to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
  • The vectors in the first coding sequence and the second coding sequence are summed dimension-wise, and the sum vector obtained is multiplied by the output weight matrix and normalized to obtain the missing word vector of the first sentence sample.
  • the first training module 205 is configured to train the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, and create a second feature extraction model,
  • the neural network structure of the second feature extraction model is made consistent with the neural network structure of the first feature extraction model, and the weight of the second feature extraction model is updated with the weight of the first feature extraction model.
  • The loss value between the missing word vector and the label vector of the first sentence sample may be calculated with the cross-entropy loss function, and the weight matrices of the feature extraction model may be optimized according to the loss value.
  • a new intermediate feature extraction model can be created according to the neural network structure of the first feature extraction model.
  • the neural network structure can include the number of neurons, the number of neuron layers, and the way of connection between neurons.
  • The weights of the first feature extraction model can be copied: after the first feature extraction model has been trained, its weights give it a strong feature extraction capability, so the weights of the intermediate feature extraction model are initialized with the weights of the first feature extraction model to obtain a second feature extraction model that is the same as the first feature extraction model.
  • the second training module 206 is used to train the attribute classification model composed of the first feature extraction model and the fully connected layer using second sentence samples with attribute tags.
  • the attribute label may include resolution, processor, sound effects, etc.
  • For example, one second sentence sample is "This computer responds quickly", and its attribute label is "processor", indicating that this second sentence sample involves the semantics of the processor.
  • the second sentence sample may be a sentence in a given field with attribute tags.
  • A small number of second sentence samples can be used to train the attribute classification model. Because the feature extraction model has already been trained and can extract semantic information well, only fine-tuning of the weight matrices of the feature extraction model is required, while the weight matrix of the fully connected layer is optimized. The output of the first feature extraction model is the input of the fully connected layer.
  • Each sentence in the second sentence samples is divided into two equal parts according to the number of words. The words in the front part of each sentence of the second sentence samples play the same role as the words before the missing word of each sentence in the first sentence samples, and the words in the latter part play the same role as the words after the missing word of each sentence in the first sentence samples. When a single middle word is left over while dividing a second sentence sample, the middle word is assigned to the front part.
  • the process of using the second sentence sample to train the first feature extraction model is similar to the process of using the first sentence sample to train the feature extraction model, and will not be repeated here.
  • The connection module 207 is configured to use the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and to concatenate each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized with attribute words attached.
  • the third training module 208 is configured to train an emotion classification model composed of the second feature extraction model and a deep learning model by using the plurality of sentences to be recognized that are connected to attribute words with emotion labels.
  • the sentiment label may include "positive”, “neutral”, “negative”, etc.
  • The deep learning model may be a CNN, an RNN, or an LSTM. Training the sentiment classification model follows existing methods and is not repeated here. The output of the second feature extraction model is the input of the deep learning model.
  • the multiple to-be-recognized sentences of the connected attribute words may be output, and the multiple to-be-recognized sentences of the output connected attribute words may be manually labeled to obtain the multiple to-be-recognized sentences of the connected attribute words with emotion labels , Receiving the plurality of sentences to be recognized of the connection attribute words with emotion tags.
  • The process of training the second feature extraction model with the plurality of emotion-labeled sentences to be recognized connected with attribute words is similar to the process of training the first feature extraction model with the second sentence samples, and will not be repeated here.
  • The classification module 209 is configured to use the attribute classification model to identify the attribute word of a sentence to be processed; the sentiment classification model then classifies the sentence to be processed connected with its attribute word and outputs the attribute word of the sentence to be processed and the emotion type of the sentence to be processed.
  • For example, the attribute classification model identifies the attribute word of the sentence to be processed "this computer responds very quickly" as "processor", and the sentiment classification model classifies the attribute-connected sentence "<S>this computer responds very quickly<SOE>processor<E>" and outputs the attribute word "processor" of the sentence to be processed and the emotion type "positive" of the sentence to be processed.
  • the attribute classification model and the emotion classification model may also be stored in a node of a blockchain.
  • the second embodiment realizes emotion classification of sentences, and enhances the accuracy and scene adaptability of emotion classification.
  • The U-th encoding module of the n-th forward hidden sublayer encodes the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence as Z_{n,U}; the W-th encoding module of the n-th backward hidden sublayer encodes the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence as R_{n,W}.
  • the feature extraction model can be migrated to sentiment classification models in different fields.
  • This embodiment provides a computer storage medium with a computer program stored on the computer storage medium.
  • the computer storage medium may be non-volatile or volatile.
  • When the computer program is executed by a processor, the functions of each module in the above-mentioned device embodiment are realized, for example, the modules 201-209 in FIG. 2.
  • FIG. 3 is a schematic diagram of the computer equipment provided in the fourth embodiment of the application.
  • the computer device 30 includes a memory 301, a processor 302, and a computer program 303 stored in the memory 301 and running on the processor 302, such as a sentence emotion classification program.
  • When the processor 302 executes the computer program 303, the steps in the embodiment of the sentence emotion classification method described above are implemented, for example, steps 101-109 shown in FIG. 1.
  • the computer program 303 may be divided into one or more modules, and the one or more modules are stored in the memory 301 and executed by the processor 302 to complete the method.
  • the one or more modules may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 303 in the computer device 30.
  • The computer program 303 can be divided into the acquisition module 201, the conversion module 202, the encoding module 203, the calculation module 204, the first training module 205, the second training module 206, the connection module 207, the third training module 208, and the classification module 209 in FIG. 2; see the second embodiment for the specific functions of each module.
  • the computer device 30 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • The schematic diagram in FIG. 3 is only an example of the computer device 30 and does not constitute a limitation on the computer device 30, which may include more or fewer components than those shown in the figure, combine certain components, or use different components.
  • the computer device 30 may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 302 may be a central processing unit (Central Processing Unit, CPU), other general processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • The general-purpose processor may be a microprocessor, or the processor 302 may be any conventional processor.
  • The processor 302 is the control center of the computer device 30, which uses various interfaces and lines to connect the various parts of the entire computer device 30.
  • The memory 301 may be used to store the computer program 303, and the processor 302 implements various functions of the computer device 30 by running or executing the computer program 303 or the modules stored in the memory 301 and calling data stored in the memory 301.
  • The memory 301 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and the data storage area may store data (such as audio data, a phone book, etc.) created according to the use of the computer device 30.
  • The memory 301 may include non-volatile and volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another storage device.
  • If the integrated module of the computer device 30 is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • In this application, all or part of the processes in the above-mentioned method embodiments can also be completed by instructing relevant hardware through a computer program.
  • the computer program can be stored in a computer storage medium. When executed by the processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), and so on.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database, a series of data blocks associated with each other by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.


Abstract

The present application provides a statement sentiment classification method and a related device. Said method comprises: using a feature extraction model to convert words before a missing word in a first statement sample into a first coding sequence according to a word order, converting words after the missing word in the first statement sample into a second coding sequence according to a reverse word order, and converting the missing word in the first statement sample into a tag vector of the first statement sample; using the feature extraction model to calculate a missing word vector of the first statement sample according to the first coding sequence and the second coding sequence; training the feature extraction model according to the missing word vector of the first statement sample and the tag vector of the first statement sample; and using an attribute classification model formed by the feature extraction model to identify attribute words of statements to be processed, and using a sentiment classification model formed by the feature extraction model to classify said statements which connect the attribute words. The present application enhances the accuracy and scene adaptability of sentiment classification. The present application also relates to a blockchain.

Description

Sentence emotion classification method and related equipment
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on March 2, 2020, with application number 202010137265.1 and entitled "Sentence Emotion Classification Method and Related Equipment", the entire content of which is incorporated into this application by reference.
Technical field
This application relates to the field of natural language processing, and specifically relates to a sentence emotion classification method, device, computer equipment, and computer storage medium.
Background
Generally, in the field of natural language processing in artificial intelligence, a sentiment classification model (such as a convolutional neural network) is trained with sentences carrying sentiment labels in a designated domain, and sentences in that domain are then classified with the trained sentiment classification model. The inventor realizes that existing text sentiment classification methods are only suitable for sentence sentiment classification tasks in a fixed domain, and that a larger training set is required to improve the accuracy of sentiment classification.
How to improve the scene adaptability of text sentiment classification and the accuracy of sentiment classification has become a problem to be solved.
Summary of the invention
In view of the above, it is necessary to provide a sentence emotion classification method, device, computer equipment, and computer storage medium that can perform emotion classification on sentences and enhance the accuracy and scene adaptability of emotion classification.
The first aspect of the present application provides a sentence sentiment classification method, and the sentence sentiment classification method includes:
obtaining a first sentence sample set, where each first sentence sample in the first sentence sample set contains one missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and converting the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table; using the feature extraction model to encode the first word vector sequence into a first coding sequence and to encode the second word vector sequence into a second coding sequence; using the feature extraction model to calculate a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence; training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is consistent with the neural network structure of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model; training an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and connecting each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized connected with attribute words; training a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized connected with attribute words that carry emotion labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, where the sentiment classification model classifies the sentence to be processed connected with the attribute word and outputs the attribute word of the sentence to be processed and the emotion type of the sentence to be processed.
The second aspect of the present application provides a sentence emotion classification device, and the device includes:
an acquisition module, used to obtain a first sentence sample set, where each first sentence sample in the first sentence sample set contains one missing word; a conversion module, used, for each first sentence sample, to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order by using a feature extraction model, to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and to convert the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table; an encoding module, used to encode the first word vector sequence into a first coding sequence and to encode the second word vector sequence into a second coding sequence by using the feature extraction model; a calculation module, used to calculate a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence by using the feature extraction model; a first training module, used to train the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, to create a second feature extraction model whose neural network structure is consistent with the neural network structure of the first feature extraction model, and to update the weights of the second feature extraction model with the weights of the first feature extraction model; a second training module, used to train an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels; a connection module, used to identify the attribute words of a plurality of sentences to be recognized by using the attribute classification model, and to connect each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized connected with attribute words;
a third training module, used to train a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized connected with attribute words that carry emotion labels; and a classification module, used to identify the attribute word of a sentence to be processed by using the attribute classification model, where the sentiment classification model classifies the sentence to be processed connected with the attribute word and outputs the attribute word of the sentence to be processed and the emotion type of the sentence to be processed.
The third aspect of the present application provides a computer device, the computer device including a processor, where the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
obtaining a first sentence sample set, where each first sentence sample in the first sentence sample set contains one missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and converting the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table; using the feature extraction model to encode the first word vector sequence into a first coding sequence and to encode the second word vector sequence into a second coding sequence; using the feature extraction model to calculate a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence; training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is consistent with the neural network structure of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model; training an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and connecting each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized connected with attribute words; training a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized connected with attribute words that carry emotion labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, where the sentiment classification model classifies the sentence to be processed connected with the attribute word and outputs the attribute word of the sentence to be processed and the emotion type of the sentence to be processed.
The fourth aspect of the present application provides a computer storage medium on which computer-readable instructions are stored, where the computer-readable instructions implement the following steps when executed by a processor:
obtaining a first sentence sample set, where each first sentence sample in the first sentence sample set contains one missing word; for each first sentence sample, using a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and converting the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table; using the feature extraction model to encode the first word vector sequence into a first coding sequence and to encode the second word vector sequence into a second coding sequence; using the feature extraction model to calculate a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence; training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is consistent with the neural network structure of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model; training an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels; using the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and connecting each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized connected with attribute words; training a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized connected with attribute words that carry emotion labels; and using the attribute classification model to identify the attribute word of a sentence to be processed, where the sentiment classification model classifies the sentence to be processed connected with the attribute word and outputs the attribute word of the sentence to be processed and the emotion type of the sentence to be processed.
The present application performs emotion classification on sentences, enhancing the accuracy and scene adaptability of emotion classification.
Description of the drawings
FIG. 1 is a flowchart of the sentence sentiment classification method provided by an embodiment of the present application.
FIG. 2 is a structural diagram of the sentence sentiment classification device provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of the computer device provided by an embodiment of the present application.
Detailed description
In order to understand the above objectives, features and advantages of the present application more clearly, the application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the application and the features in the embodiments can be combined with each other.
Preferably, the sentence emotion classification method of the present application is applied in one or more computer devices. A computer device is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), embedded devices, and so on.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
Embodiment 1
FIG. 1 is a flowchart of the sentence sentiment classification method provided in Embodiment 1 of the present application. The sentence emotion classification method is applied to a computer device.
The sentence sentiment classification method of the present application can perform sentiment classification on sentences.
As shown in FIG. 1, the sentence emotion classification method includes:
101. Obtain a first sentence sample set, where each first sentence sample in the first sentence sample set contains one missing word.
Multiple texts in different fields can be obtained, each text including multiple sentences. Each text is masked multiple times, with some of the words in the text masked each time, and a sentence containing one missing word is extracted from each partially masked text as a first sentence sample.
Multiple texts in various fields such as tourism, electronic products, and patent services can be obtained; each field includes multiple texts, and each text in each field can include multiple sentences. This embodiment does not limit the granularity of a field: for example, regarding the field of electronic products and the field of notebook computers, the field of electronic products may include the field of notebook computers.
Each of the multiple texts in each field can be masked multiple times, with a preset proportion of the words in each text randomly masked each time, so as to obtain first sentence samples containing missing words from the multiple texts in each field.
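A minimal sketch of how first sentence samples could be produced by repeated random masking, assuming sentences are already tokenized into word lists; the mask token name, the 15% proportion, and the number of masking rounds are illustrative assumptions, not values fixed by this application.

```python
import random

MASK = "<mask>"

def make_first_sentence_samples(sentences, mask_ratio=0.15, rounds=3):
    """For each masking round, randomly mask a preset proportion of the words,
    then keep sentences that contain exactly one missing word as first sentence samples."""
    samples = []
    for _ in range(rounds):
        for words in sentences:                      # each sentence is a list of words
            if not words:
                continue
            n_mask = max(1, int(len(words) * mask_ratio))
            positions = random.sample(range(len(words)), n_mask)
            masked = [MASK if i in positions else w for i, w in enumerate(words)]
            if masked.count(MASK) == 1:              # sample must contain one missing word
                samples.append(["<S>"] + masked + ["<E>"])
    return samples

# Example usage (illustrative):
# samples = make_first_sentence_samples([["自", "然", "语", "言", "处", "理"]])
```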
102. For each first sentence sample, use the feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and convert the missing word in the first sentence sample into the label vector of the first sentence sample according to a preset vocabulary coding table.
The feature extraction model includes an input layer, a forward hidden layer, a backward hidden layer, and an output layer.
In a specific embodiment, using the feature extraction model to convert the words before the missing word in the first sentence sample into the first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order includes:
converting the words before the missing word in the first sentence sample into a first coding vector sequence in word order, and converting the words after the missing word in the first sentence sample into a second coding vector sequence in word order; converting the position numbers of the words before the missing word in the first sentence sample into a first position vector sequence, and converting the position numbers of the words after the missing word in the first sentence sample into a second position vector sequence; and converting the first coding vector sequence and the first position vector sequence into the first word vector sequence, and converting the second coding vector sequence and the second position vector sequence into the second word vector sequence.
For example, a first sentence sample is "<S>自<mask>语言处理<E>" (the sentence "自然语言处理", i.e. "natural language processing", with the word "然" masked), where "<S>" denotes the head token of the first sentence sample and "<E>" denotes its tail token. According to the preset vocabulary coding table, the words "<S>自" before the missing word "然" are converted in word order into the first coding vector sequence {(0,0,0,0,1,0,0,0), (0,0,0,0,0,0,0,1)}, and the words "语言处理<E>" after the missing word "然" are converted in word order into the second coding vector sequence {(0,0,0,0,0,0,1,0), (0,0,0,0,0,1,0,0), (1,0,0,0,0,0,0,0), (0,0,1,0,0,0,0,0), (0,1,0,0,0,0,0,0)}; the preset vocabulary coding table may use one-hot, word2vec, or other encoding methods. The position numbers of the words before the missing word "然" in the first sentence sample are converted into the first position vector sequence {(1,0,0,0,0,0,0,0), (0,1,0,0,0,0,0,0)}, and the position numbers of the words after the missing word are converted into the second position vector sequence {(0,0,0,1,0,0,0,0), (0,0,0,0,1,0,0,0), (0,0,0,0,0,1,0,0), (0,0,0,0,0,0,1,0), (0,0,0,0,0,0,0,1)}. For each word before the missing word "然", the corresponding first coding vector and first position vector are added to obtain the first word vector sequence {(1,0,0,0,1,0,0,0), (0,1,0,0,0,0,0,1)}. For each word after the missing word "然", the corresponding second coding vector and second position vector are added to obtain the second word vector sequence {(0,0,0,1,0,0,1,0), (0,0,0,0,1,1,0,0), (1,0,0,0,0,1,0,0), (0,0,1,0,0,0,1,0), (0,1,0,0,0,0,0,1)}.
According to the preset vocabulary coding table, the missing word <mask> in the first sentence sample is converted into the label vector (0,0,0,1,0,0,0,0) of the first sentence sample, i.e. the one-hot code of "然".
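The conversion in the example above can be reproduced with a few lines of NumPy; the vocabulary order below is an assumption chosen so that the resulting one-hot vectors match the example (e.g. "然" maps to index 3). It is a sketch, not the actual preset vocabulary coding table.

```python
import numpy as np

# Assumed 8-word vocabulary; index i gives the position of the 1 in the one-hot vector.
vocab = {"处": 0, "<E>": 1, "理": 2, "然": 3, "<S>": 4, "言": 5, "语": 6, "自": 7}

def one_hot(index, size=8):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def word_vectors(words, positions):
    """Word vector = one-hot coding vector of the word + one-hot vector of its position number."""
    return [one_hot(vocab[w]) + one_hot(p - 1) for w, p in zip(words, positions)]

# Words before the missing word "然" (positions 1 and 2) -> first word vector sequence
first_seq = word_vectors(["<S>", "自"], [1, 2])
# Words after "然" (positions 4..8) -> second word vector sequence
second_seq = word_vectors(["语", "言", "处", "理", "<E>"], [4, 5, 6, 7, 8])
# Label vector of the sample is the one-hot code of the missing word "然"
label_vector = one_hot(vocab["然"])
```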
103. Use the feature extraction model to encode the first word vector sequence into a first coding sequence and to encode the second word vector sequence into a second coding sequence.
In this embodiment, the forward hidden layer of the feature extraction model encodes the first word vector sequence into the first coding sequence, and the backward hidden layer of the feature extraction model encodes the second word vector sequence into the second coding sequence. The forward hidden layer and the backward hidden layer respectively include N forward hidden sublayers and N backward hidden sublayers; each forward hidden sublayer includes U encoding modules, and each backward hidden sublayer includes W encoding modules. The u-th encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,u-1} output by the (u-1)-th encoding module of the (n-1)-th forward hidden sublayer and the vector Z_{n-1,u} output by the u-th encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,u} to the u-th encoding module and the (u+1)-th encoding module of the (n+1)-th forward hidden sublayer, where 2 ≤ n ≤ N and 2 ≤ u ≤ U. The u-th encoding module of the first forward hidden sublayer receives the (u-1)-th word vector and the u-th word vector of the first word vector sequence, and the output of the N-th forward hidden sublayer is the first coding sequence. The first encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,1} output by the first encoding module of the (n-1)-th forward hidden sublayer and outputs the vector Z_{n,1} to the first encoding module of the (n+1)-th forward hidden sublayer. The w-th encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,w-1} output by the (w-1)-th encoding module of the (n-1)-th backward hidden sublayer and the vector R_{n-1,w} output by the w-th encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,w} to the w-th encoding module and the (w+1)-th encoding module of the (n+1)-th backward hidden sublayer, where 2 ≤ w ≤ W. The w-th encoding module of the first backward hidden sublayer receives the (w-1)-th word vector and the w-th word vector of the second word vector sequence, and the output of the N-th backward hidden sublayer is the second coding sequence.
The first encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,1} output by the first encoding module of the (n-1)-th backward hidden sublayer and outputs the vector R_{n,1} to the first encoding module of the (n+1)-th backward hidden sublayer.
In a specific embodiment, the feature extraction model encoding the first word vector sequence into the first coding sequence includes:
(a) The first encoding module of the first forward hidden sublayer encodes the first word vector of the first word vector sequence into the first vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset in an initialized weight matrix set. The initialized weight matrix set includes N weight matrix subsets; the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence; the n-th forward hidden sublayer and the n-th backward hidden sublayer share the n-th weight matrix subset; each weight matrix subset includes multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices includes a V weight matrix, a Q weight matrix, and a K weight matrix.
The V weight matrices, Q weight matrices, and K weight matrices in the multiple groups of weight matrices are used to calculate the first coding sequence and the second coding sequence based on multi-head attention. That is, the first coding sequence represents the semantic information of the context preceding the missing word in the first sentence sample, and the second coding sequence represents the semantic information of the context following the missing word in the first sentence sample.
The first encoding module of the first forward hidden sublayer encoding the first word vector of the first word vector sequence into the first vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset in the initialized weight matrix set includes:
multiplying, by the first encoding module of the first forward hidden sublayer, the first word vector of the first word vector sequence by each of the V weight matrices in the multiple groups of weight matrices in the first weight matrix subset, to obtain multiple V weight vectors of the first word vector of the first word vector sequence; concatenating the multiple V weight vectors of the first word vector of the first word vector sequence to obtain a combined vector of the first word vector of the first word vector sequence; and multiplying the combined vector of the first word vector of the first word vector sequence by the fourth weight matrix to obtain the first vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence.
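A minimal NumPy sketch of the computation just described for Z_{1,1}, with an assumed model dimension and an assumed number of weight matrix groups (heads); the matrix shapes and random initialization are illustrative only.

```python
import numpy as np

d_model, d_head, num_groups = 8, 4, 2                 # assumed sizes
rng = np.random.default_rng(0)
V_matrices = [rng.normal(size=(d_model, d_head)) for _ in range(num_groups)]  # one V weight matrix per group
W4 = rng.normal(size=(num_groups * d_head, d_model))                          # fourth weight matrix

def encode_first_module(x1):
    """First encoding module of the first forward hidden sublayer:
    multiply the first word vector by each group's V weight matrix,
    concatenate the V weight vectors, then multiply by the fourth weight matrix."""
    v_vectors = [x1 @ V for V in V_matrices]          # multiple V weight vectors
    combined = np.concatenate(v_vectors)              # combined vector
    return combined @ W4                              # Z_{1,1}

x1 = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype=float)  # first word vector from the example
Z_1_1 = encode_first_module(x1)
```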
The connection between the first encoding modules of two adjacent layers is similar to an ordinary neuron connection, and no attention mechanism is used.
(b) Starting from the second encoding module of the first forward hidden sublayer, the u-th encoding module of the first forward hidden sublayer encodes, one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, obtaining the first intermediate vector sequence of the first coding sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}}, where the u-th vector of the first intermediate vector sequence of the first coding sequence corresponds one-to-one to the u-th word vector of the first word vector sequence.
The u-th encoding module of the first forward hidden sublayer encoding, one by one according to the first weight matrix subset, the (u-1)-th word vector and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence to obtain Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} includes:
(1) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(2) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the Q weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the Q weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(3) The second encoding module of the first forward hidden sublayer multiplies the second word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(4) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the V weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the V' weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(5) The second encoding module of the first forward hidden sublayer multiplies the first word vector of the first word vector sequence by the K weight matrix in the first group of weight matrices in the first weight matrix subset to obtain the K' weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(6) The second encoding module of the first forward hidden sublayer determines, according to the Q weight vector, the K weight vector, and the K' weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence, the attention value of the V weight vector and the attention value of the V' weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(7) The second encoding module of the first forward hidden sublayer determines the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence according to the V weight vector, the V' weight vector, the attention value of the V weight vector, and the attention value of the V' weight vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(8) Steps (1)-(7) show how the second encoding module of the first forward hidden sublayer obtains the first score of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence according to the first group of weight matrices in the first weight matrix subset; in the same way, multiple scores of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence can be obtained from the multiple groups of weight matrices in the first weight matrix subset.
The multiple scores of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence can be obtained from the multiple groups of weight matrices in the first weight matrix subset simultaneously.
(9) The second encoding module of the first forward hidden sublayer concatenates the multiple scores of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence to obtain a combined vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(10) The second encoding module of the first forward hidden sublayer multiplies the combined vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence by the fourth weight matrix to obtain an intermediate vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(11) The feed-forward network in the second encoding module of the first forward hidden sublayer encodes the intermediate vector of the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence after residual connection and normalization, and performs normalization again, to obtain the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.
(12) Steps (1)-(11) show how the second encoding module of the first forward hidden sublayer encodes the second word vector and the first word vector of the first word vector sequence into the second vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset; in the same way, the u-th encoding module of the first forward hidden sublayer can encode to obtain the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, yielding Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}}.
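Steps (1)-(11) amount to a two-position multi-head attention followed by a feed-forward network with residual connection and normalization. The sketch below is one interpretation under stated assumptions (scaled dot-product attention values, layer normalization, a stand-in feed-forward function); it is not asserted to be the exact formula of this application.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def layer_norm(x, eps=1e-6):
    return (x - x.mean()) / (x.std() + eps)

def encode_second_module(x1, x2, groups, W4, ffn):
    """Encode word vectors x1 (previous word) and x2 (current word) into Z_{1,2}.
    Each group supplies (V, Q, K) weight matrices; one score per group, then concatenate."""
    scores = []
    for V, Q, K in groups:
        v, q, k = x2 @ V, x2 @ Q, x2 @ K              # V, Q, K weight vectors of x2
        v_prev, k_prev = x1 @ V, x1 @ K               # V' and K' weight vectors from x1
        attn = softmax(np.array([q @ k, q @ k_prev]) / np.sqrt(len(k)))  # attention values
        scores.append(attn[0] * v + attn[1] * v_prev)                    # score for this group
    combined = np.concatenate(scores)                 # combined vector of Z_{1,2}
    intermediate = combined @ W4                      # multiply by the fourth weight matrix
    hidden = layer_norm(x2 + intermediate)            # residual + normalization (assumed form)
    return layer_norm(hidden + ffn(hidden))           # feed-forward, then normalize again

# Example usage with random parameters (illustrative only):
rng = np.random.default_rng(0)
d_model, d_head, n_groups = 8, 4, 2
groups = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3)) for _ in range(n_groups)]
W4 = rng.normal(size=(n_groups * d_head, d_model))
ffn = lambda h: np.tanh(h)                            # stand-in for the feed-forward network
x1 = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype=float)  # first word vector from the example
x2 = np.array([0, 1, 0, 0, 0, 0, 0, 1], dtype=float)  # second word vector from the example
Z_1_2 = encode_second_module(x1, x2, groups, W4, ffn)
```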
(c) Starting from the second forward hidden sublayer, the n-th forward hidden sublayer encodes, layer by layer according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
In the same way that the feature extraction model encodes the first word vector sequence into the first coding sequence, the feature extraction model encodes the second word vector sequence into the second coding sequence R_n.
The u-th encoding module of the first forward hidden sublayer can encode to obtain the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence.
The encoding modules in the forward hidden sublayer and the backward hidden sublayer of the same layer can run concurrently.
104. Use the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
In this embodiment, the output layer of the feature extraction model is used to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
The vectors in the first coding sequence and the second coding sequence are summed dimension by dimension, and the resulting sum vector is multiplied by the output weight matrix and normalized to obtain the missing word vector of the first sentence sample.
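A sketch of the output layer computation described above, assuming the two coding sequences are lists of NumPy vectors, the normalization is a softmax over the preset vocabulary, and the output weight matrix maps the model dimension to the vocabulary size; these are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def missing_word_vector(first_coding_seq, second_coding_seq, output_weights):
    """Sum all vectors of both coding sequences dimension by dimension, multiply the sum
    vector by the output weight matrix, and normalize to obtain the missing word vector."""
    total = np.sum(np.vstack(first_coding_seq + second_coding_seq), axis=0)  # dimension-wise sum
    return softmax(total @ output_weights)            # probability over the preset vocabulary
```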
105. Train the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, create a second feature extraction model whose neural network structure is consistent with the neural network structure of the first feature extraction model, and update the weights of the second feature extraction model with the weights of the first feature extraction model.
The loss value between the missing word vector of the first sentence sample and the label vector can be calculated according to a cross-entropy loss function, and the weight matrices of the feature extraction model are optimized according to the loss value.
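The training objective can be sketched as a standard cross-entropy between the predicted missing word distribution and the one-hot label vector; the PyTorch snippet below assumes the feature extraction model is a module returning pre-softmax logits over the preset vocabulary, and the function and argument names are illustrative.

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, word_seq_1, word_seq_2, label_vector):
    """One optimization step: cross-entropy between the predicted missing word
    distribution and the label vector, followed by a weight update."""
    logits = model(word_seq_1, word_seq_2)            # (batch, vocab_size), pre-softmax
    target = label_vector.argmax(dim=-1)              # index of the missing word
    loss = F.cross_entropy(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```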
An intermediate feature extraction model can be newly created according to the neural network structure of the first feature extraction model. The neural network structure may include the number of neurons, the number of neuron layers, the connection mode between neurons, and so on. The weights of the first feature extraction model can be copied; after the first feature extraction model has been trained, its weights give it a strong feature extraction capability. The weights of the intermediate feature extraction model are initialized with the weights of the first feature extraction model to obtain a second feature extraction model identical to the first feature extraction model.
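A minimal sketch of creating the second feature extraction model, assuming the models are implemented as PyTorch modules; `copy.deepcopy` reproduces both the neural network structure and the trained weights, which matches the "same structure, initialized from the trained weights" description above.

```python
import copy
import torch.nn as nn

def build_second_feature_extraction_model(first_model: nn.Module) -> nn.Module:
    """Create an intermediate model with the same neural network structure as the trained
    first feature extraction model and initialize its weights with the trained weights."""
    second_model = copy.deepcopy(first_model)   # identical structure and copied weights
    return second_model
```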
106,用带有属性标签的第二语句样本训练由所述第一特征提取模型和全连接层构成的属性分类模型。106. Use a second sentence sample with an attribute label to train an attribute classification model composed of the first feature extraction model and a fully connected layer.
例如,对于笔记本电脑的第二语句样本,属性标签可以包括分辨率、处理器、音效等,第二语句样本的一个语句为“这个电脑的反应速度很快”,属性标签为“处理器”,表示这个第二样本语句包括了处理器的语义。For example, for the second sentence sample of a laptop computer, the attribute label may include resolution, processor, sound effects, etc. One sentence of the second sentence sample is "This computer responds quickly", and the attribute label is "Processor". Indicates that this second sample sentence includes the semantics of the processor.
The second sentence samples may be sentences of a given field with attribute labels. A small number of second sentence samples is sufficient to train the attribute classification model: because the feature extraction model has already been trained and extracts semantic information well, only the weight matrices of the feature extraction model need to be fine-tuned, and the weight matrix in the fully connected layer is optimized. The output of the first feature extraction model is the input of the fully connected layer.
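An illustrative sketch of the attribute classification model, assuming the first feature extraction model is available as a callable `feature_extractor` returning a feature vector and that `fc_weight` and `fc_bias` are the fully connected layer parameters (these names are not from the original text):

    import numpy as np

    def attribute_distribution(feature_extractor, fc_weight, fc_bias, sentence):
        # The output of the first feature extraction model feeds the fully connected layer;
        # a softmax over the logits gives the attribute-label distribution.
        h = feature_extractor(sentence)
        logits = h @ fc_weight + fc_bias
        exp = np.exp(logits - logits.max())
        return exp / exp.sum()

During fine-tuning one would typically update the extractor weights with a small learning rate while the fully connected layer is trained on the labeled second sentence samples.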
In a specific embodiment, each sentence in the second sentence samples is divided by word count into equal front and back parts: the words in the front part of a sentence play the role of the words before the missing word of a first sentence sample, and the words in the back part play the role of the words after the missing word. For example, if one sentence of the second sentence samples is "<S>这个电脑的反应速度很快处理器<E>", then "<S>这个电脑的反应" is taken as the front part, analogous to the words before the missing word of a first sentence sample, and "速度很快处理器<E>" is taken as the back part, analogous to the words after the missing word of a first sentence sample.
If the number of words in a sentence of the second sentence samples is odd, the middle word is set aside when the sentence is halved and is then assigned to the front part.
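A sketch of this splitting rule on a tokenized sentence (the odd-length handling follows the reading above that the middle word joins the front part; the tokenization itself is assumed):

    def split_sentence(tokens):
        mid = (len(tokens) + 1) // 2  # with an odd word count the middle word goes to the front part
        return tokens[:mid], tokens[mid:]

    front, back = split_sentence(["<S>", "这个", "电脑", "的", "反应", "速度", "很", "快", "处理器", "<E>"])
    # front -> "<S>这个电脑的反应", back -> "速度很快处理器<E>"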
用所述第二语句样本训练所述第一特征提取模型的过程与用所述第一语句样本训练所述特征提取模型的过程类似,此处不再赘述。The process of using the second sentence sample to train the first feature extraction model is similar to the process of using the first sentence sample to train the feature extraction model, and will not be repeated here.
107,用所述属性分类模型识别多个待识别语句的属性词,将每个待识别语句与识别出的每个待识别语句的属性词连接,得到连接属性词的所述多个待识别语句。107. Use the attribute classification model to identify the attribute words of a plurality of sentences to be recognized, and connect each sentence to be recognized with the attribute words of each sentence to be recognized to obtain the plurality of sentences to be recognized that connect the attribute words .
For example, the attribute label of the sentence to be recognized "<S>这个电脑的反应速度很快处理器<E>" is "处理器" (processor); the sentence to be recognized is connected with its attribute word as "<S>这个电脑的反应速度很快<SOE>处理器<E>", where "<SOE>" denotes the connector token.
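A sketch of this connection step following the format of the example above; taking the raw sentence without boundary tokens as input is an assumption:

    def connect_attribute(sentence, attribute_word, connector="<SOE>"):
        # Join the sentence to be recognized and its attribute word with the <SOE>
        # connector and the <S>/<E> boundary tokens.
        return f"<S>{sentence}{connector}{attribute_word}<E>"

    connect_attribute("这个电脑的反应速度很快", "处理器")
    # -> '<S>这个电脑的反应速度很快<SOE>处理器<E>'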
108,用带有情感标签的连接属性词的所述多个待识别语句训练由所述第二特征提取模型和深度学习模型构成的情感分类模型。108. Train an emotion classification model composed of the second feature extraction model and a deep learning model by using the plurality of sentences to be recognized that are connected to attribute words with emotion labels.
所述情感标签可以包括“积极”、“中性”和“负面”等,所述深度学习模型可以是CNN、RNN或LSTM等。训练所述情感分类模型为现有方法,此处不再赘述。其中,所述第二特征提取模型的输出为所述深度学习模型的输入。The sentiment label may include "positive", "neutral", "negative", etc., and the deep learning model may be CNN, RNN, or LSTM. Training the sentiment classification model is an existing method, and will not be repeated here. Wherein, the output of the second feature extraction model is the input of the deep learning model.
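An illustrative sketch of the sentiment classification model with an LSTM as the deep learning model; the feature dimension, hidden size, and the assumption that the second feature extraction model returns a (batch, sequence, feature) tensor are not taken from the original text:

    import torch
    import torch.nn as nn

    class SentimentClassifier(nn.Module):
        def __init__(self, feature_extractor, feature_dim=768, hidden_dim=128, num_labels=3):
            super().__init__()
            self.feature_extractor = feature_extractor  # trained second feature extraction model
            self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, num_labels)  # positive / neutral / negative

        def forward(self, token_ids):
            features = self.feature_extractor(token_ids)  # (batch, seq_len, feature_dim)
            _, (h_n, _) = self.lstm(features)
            return self.out(h_n[-1])                      # emotion-type logits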
The multiple sentences to be recognized with connected attribute words may be output for manual labeling, yielding the multiple sentences to be recognized with connected attribute words and emotion labels, which are then received.
The process of training the second feature extraction model with the multiple sentences to be recognized with connected attribute words and emotion labels is similar to the process of training the first feature extraction model with the second sentence samples, and is not repeated here.
109,用所述属性分类模型识别待处理语句的属性词,情感分类模型对连接属性词的所述待处理语句进行分类,输出所述待处理语句的属性词和所述待处理语句的情感类型。109. Use the attribute classification model to identify the attribute words of the sentence to be processed, and the sentiment classification model classifies the sentence to be processed connecting the attribute words, and outputs the attribute words of the sentence to be processed and the emotion type of the sentence to be processed .
For example, the attribute classification model recognizes the attribute word of the sentence to be processed "这个电脑的反应速度很快" ("this computer responds very quickly") as "处理器" (processor); the sentiment classification model classifies the connected sentence "<S>这个电脑的反应速度很快<SOE>处理器<E>" and outputs the attribute word "处理器" and the emotion type "positive" of the sentence to be processed.
需要强调的是,为进一步保证所述属性分类模型和所述情感分类模型的私密和安全性,所述属性分类模型和所述情感分类模型还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the attribute classification model and the emotion classification model, the attribute classification model and the emotion classification model may also be stored in a node of a blockchain.
实施例一实现了对语句进行情感分类,增强情感分类的准确性和场景适应性。The first embodiment realizes emotion classification of sentences, and enhances the accuracy and scene adaptability of emotion classification.
In another embodiment, the U-th encoding module of the n-th forward hidden sublayer encodes the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence into Z_{n,U}; the W-th encoding module of the n-th backward hidden sublayer encodes the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence into R_{n,W}.
在另一实施例中,可以将所述特征提取模型迁移至不同领域的情感分类模型中。In another embodiment, the feature extraction model can be migrated to sentiment classification models in different fields.
Embodiment 2
图2是本申请实施例二提供的语句情感分类装置的结构图。所述语句情感分类装置20应用于计算机设备。所述语句情感分类装置20可以对语句进行情感分类。如图2所示,所述语句情感分类装置20可以包括获取模块201、转化模块202、编码模块203、计算模块204、第一训练模块205、第二训练模块206、连接模块207、第三训练模块208、分类模块209。Fig. 2 is a structural diagram of a sentence emotion classification device provided in the second embodiment of the present application. The sentence emotion classification device 20 is applied to computer equipment. The sentence emotion classification device 20 can perform emotion classification on the sentence. As shown in FIG. 2, the sentence emotion classification device 20 may include an acquisition module 201, a conversion module 202, an encoding module 203, a calculation module 204, a first training module 205, a second training module 206, a connection module 207, and a third training module. Module 208, classification module 209.
获取模块201,用于获取第一语句样本集,所述第一语句样本集中的每个第一语句样本包含一个缺失词。The obtaining module 201 is configured to obtain a first sentence sample set, and each first sentence sample in the first sentence sample set contains a missing word.
Multiple texts from different fields may be acquired, each text including multiple sentences. Each text is occluded multiple times, with part of the words in the text occluded each time, and a sentence containing one missing word is extracted from each partially occluded text as a first sentence sample.
可以获取旅游、电子产品、专利服务等各个领域的多个文本,每个领域的包括多个文本,每个领域的每个文本可以包括多个语句。本实施例对领域的大小不做限定,如电子产品领域和笔记本电脑领域,电子产品领域可以包括笔记本电脑领域。Multiple texts in various fields such as tourism, electronic products, and patent services can be obtained. Each field includes multiple texts, and each text in each field can include multiple sentences. This embodiment does not limit the size of the field, such as the field of electronic products and the field of notebook computers, and the field of electronic products may include the field of notebook computers.
Each of the multiple texts in each field may be occluded multiple times, randomly occluding a preset proportion of the words in the text each time, to obtain first sentence samples with missing words from the multiple texts of each field.
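A sketch of this occlusion step; the 15% ratio and the guarantee of at least one occluded word per pass are assumptions, not values from the text:

    import random

    def mask_once(tokens, mask_ratio=0.15, mask_token="<mask>"):
        # Randomly occlude a preset proportion of words; each occluded position
        # yields one first sentence sample whose label is the original word.
        samples = []
        k = max(1, int(len(tokens) * mask_ratio))
        for i in random.sample(range(len(tokens)), k):
            masked = list(tokens)
            label = masked[i]
            masked[i] = mask_token
            samples.append((masked, label))
        return samples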
The conversion module 202 is configured to, for each first sentence sample, use a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and convert the missing word in the first sentence sample into the label vector of the first sentence sample according to a preset vocabulary coding table.
所述特征提取模型包括所述输入层、前向隐藏层、后向隐藏层和输出层。The feature extraction model includes the input layer, the forward hidden layer, the backward hidden layer, and the output layer.
In a specific embodiment, using the feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order includes:
converting the words before the missing word in the first sentence sample into a first coding vector sequence in word order, and converting the words after the missing word in the first sentence sample into a second coding vector sequence in word order; converting the position numbers of the words before the missing word in the first sentence sample into a first position vector sequence, and converting the position numbers of the words after the missing word in the first sentence sample into a second position vector sequence; and converting the first coding vector sequence and the first position vector sequence into the first word vector sequence, and the second coding vector sequence and the second position vector sequence into the second word vector sequence.
For example, one first sentence sample is "<S>自<mask>语言处理<E>", where "<S>" is the head token of the first sentence sample and "<E>" is the tail token. According to the preset vocabulary coding table, the words "<S>自" before the missing word "然" are converted in word order into the first coding vector sequence {(0,0,0,0,1,0,0,0), (0,0,0,0,0,0,0,1)}, and the words "语言处理<E>" after the missing word "然" are converted in word order into the second coding vector sequence {(0,0,0,0,0,0,1,0), (0,0,0,0,0,1,0,0), (1,0,0,0,0,0,0,0), (0,0,1,0,0,0,0,0), (0,1,0,0,0,0,0,0)}; the preset vocabulary coding table may use an encoding such as one-hot or word2vec. The position numbers of the words before the missing word "然" in the first sentence sample are converted into the first position vector sequence {(1,0,0,0,0,0,0,0), (0,1,0,0,0,0,0,0)}, and the position numbers of the words after the missing word are converted into the second position vector sequence {(0,0,0,1,0,0,0,0), (0,0,0,0,1,0,0,0), (0,0,0,0,0,1,0,0), (0,0,0,0,0,0,1,0), (0,0,0,0,0,0,0,1)}. For each word before the missing word "然", the corresponding first coding vector and first position vector are added, giving the first word vector sequence {(1,0,0,0,1,0,0,0), (0,1,0,0,0,0,0,1)}. For each word after the missing word "然", the corresponding second coding vector and second position vector are added, giving the second word vector sequence {(0,0,0,1,0,0,1,0), (0,0,0,0,1,1,0,0), (1,0,0,0,0,1,0,0), (0,0,1,0,0,0,1,0), (0,1,0,0,0,0,0,1)}.
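The worked example above can be reproduced with a small numpy sketch; the 8-dimensional one-hot vocabulary indices below are read off the example vectors and are otherwise assumptions:

    import numpy as np

    vocab_index = {"处": 0, "<E>": 1, "理": 2, "然": 3, "<S>": 4, "言": 5, "语": 6, "自": 7}

    def word_vectors(words, start_position, dim=8):
        vecs = []
        for offset, w in enumerate(words):
            token_onehot = np.eye(dim)[vocab_index[w]]               # from the vocabulary coding table
            position_onehot = np.eye(dim)[start_position + offset]   # position number one-hot
            vecs.append(token_onehot + position_onehot)              # word vector = token + position
        return vecs

    first_seq = word_vectors(["<S>", "自"], start_position=0)
    second_seq = word_vectors(["语", "言", "处", "理", "<E>"], start_position=3)
    # first_seq  -> (1,0,0,0,1,0,0,0), (0,1,0,0,0,0,0,1)
    # second_seq -> (0,0,0,1,0,0,1,0), (0,0,0,0,1,1,0,0), ...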
According to the preset vocabulary coding table, the missing word <mask> in the first sentence sample is converted into the label vector (0,0,0,1,0,0,0,0) of the first sentence sample, that is, the one-hot encoding of "然".
编码模块203,用于利用所述特征提取模型将所述第一词向量序列编码为第一编码序列,将所述第二词向量序列编码为第二编码序列。The encoding module 203 is configured to use the feature extraction model to encode the first word vector sequence into a first encoding sequence, and to encode the second word vector sequence into a second encoding sequence.
In this embodiment, the forward hidden layer of the feature extraction model encodes the first word vector sequence into the first coding sequence, and the backward hidden layer of the feature extraction model encodes the second word vector sequence into the second coding sequence. The forward hidden layer and the backward hidden layer include N forward hidden sublayers and N backward hidden sublayers respectively; each forward hidden sublayer includes U encoding modules and each backward hidden sublayer includes W encoding modules.

The u-th encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,u-1} output by the (u-1)-th encoding module of the (n-1)-th forward hidden sublayer and the vector Z_{n-1,u} output by the u-th encoding module of the (n-1)-th forward hidden sublayer, and outputs the vector Z_{n,u} to the u-th and (u+1)-th encoding modules of the (n+1)-th forward hidden sublayer, with 2≤n≤N and 2≤u≤U. The u-th encoding module of the 1st forward hidden sublayer receives the (u-1)-th and u-th word vectors of the first word vector sequence, and the output of the N-th forward hidden sublayer is the first coding sequence. The 1st encoding module of the n-th forward hidden sublayer receives the vector Z_{n-1,1} output by the 1st encoding module of the (n-1)-th forward hidden sublayer and outputs the vector Z_{n,1} to the 1st encoding module of the (n+1)-th forward hidden sublayer.

The w-th encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,w-1} output by the (w-1)-th encoding module of the (n-1)-th backward hidden sublayer and the vector R_{n-1,w} output by the w-th encoding module of the (n-1)-th backward hidden sublayer, and outputs the vector R_{n,w} to the w-th and (w+1)-th encoding modules of the (n+1)-th backward hidden sublayer, with 2≤w≤W. The w-th encoding module of the 1st backward hidden sublayer receives the (w-1)-th and w-th word vectors of the second word vector sequence, and the output of the N-th backward hidden sublayer is the second coding sequence. The 1st encoding module of the n-th backward hidden sublayer receives the vector R_{n-1,1} output by the 1st encoding module of the (n-1)-th backward hidden sublayer and outputs the vector R_{n,1} to the 1st encoding module of the (n+1)-th backward hidden sublayer.
在一具体实施例中,所述特征提取模型将所述第一词向量序列编码为第一编码序列包括:In a specific embodiment, the encoding of the first word vector sequence into the first encoding sequence by the feature extraction model includes:
(a) The 1st encoding module of the 1st forward hidden sublayer encodes the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset of an initialized weight matrix set. The initialized weight matrix set includes N weight matrix subsets; the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence; the n-th forward hidden sublayer and the n-th backward hidden sublayer share the n-th weight matrix subset; each weight matrix subset includes multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices includes a V weight matrix, a Q weight matrix, and a K weight matrix.
The V, Q, and K weight matrices of the multiple groups of weight matrices are used to compute the first coding sequence and the second coding sequence based on multi-head attention. That is, the first coding sequence represents the semantic information of the context preceding the missing word in the first sentence sample, and the second coding sequence represents the semantic information of the context following the missing word in the first sentence sample.
Encoding, by the 1st encoding module of the 1st forward hidden sublayer, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset of the initialized weight matrix set includes:
the 1st encoding module of the 1st forward hidden sublayer multiplies the 1st word vector of the first word vector sequence by the V weight matrix of each group of weight matrices in the first weight matrix subset to obtain multiple V weight vectors of the 1st word vector; concatenates the multiple V weight vectors to obtain the combination vector of the 1st word vector; and multiplies the combination vector by the fourth weight matrix to obtain the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence.
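A single-module numpy sketch of this computation (the matrix names are assumptions): the word vector is projected by every V weight matrix of the subset, the resulting V weight vectors are concatenated, and the combination vector is mapped by the fourth weight matrix:

    import numpy as np

    def first_module_output(x1, V_matrices, W4):
        heads = [x1 @ V for V in V_matrices]  # one V weight vector per group of weight matrices
        combined = np.concatenate(heads)      # combination vector
        return combined @ W4                  # 1st vector Z_{1,1} of the first intermediate vector sequence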
相邻两层的第1个编码模块之间的连接类似普通的神经元连接,没有使用注意力机制。The connection between the first coding module of two adjacent layers is similar to an ordinary neuron connection, and no attention mechanism is used.
(b) Starting from the 2nd encoding module of the 1st forward hidden sublayer, the u-th encoding module of the 1st forward hidden sublayer encodes, one by one according to the first weight matrix subset, the (u-1)-th word vector and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, giving the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, where the u-th vector of the first intermediate vector sequence corresponds one-to-one to the u-th word vector of the first word vector sequence.
Encoding, by the u-th encoding module of the 1st forward hidden sublayer and one by one according to the first weight matrix subset, the (u-1)-th word vector and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence, giving Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}}, includes:
(1) The 2nd encoding module of the 1st forward hidden sublayer multiplies the 2nd word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset to obtain the V weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.

(2) The 2nd encoding module multiplies the 2nd word vector of the first word vector sequence by the Q weight matrix of the 1st group of weight matrices to obtain the Q weight vector of Z_{1,2}.

(3) The 2nd encoding module multiplies the 2nd word vector of the first word vector sequence by the K weight matrix of the 1st group of weight matrices to obtain the K weight vector of Z_{1,2}.

(4) The 2nd encoding module multiplies the 1st word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices to obtain the V' weight vector of Z_{1,2}.

(5) The 2nd encoding module multiplies the 1st word vector of the first word vector sequence by the K weight matrix of the 1st group of weight matrices to obtain the K' weight vector of Z_{1,2}.
(6) The 2nd encoding module determines, from the Q weight vector, the K weight vector, and the K' weight vector of Z_{1,2}, the attention value of the V weight vector of Z_{1,2} and the attention value of the V' weight vector of Z_{1,2}.

(7) The 2nd encoding module determines the first score of Z_{1,2} from the V weight vector, the V' weight vector, the attention value of the V weight vector, and the attention value of the V' weight vector of Z_{1,2}.
(8) Steps (1)-(7) describe how the 2nd encoding module of the 1st forward hidden sublayer obtains the first score of Z_{1,2} from the first group of weight matrices in the first weight matrix subset; in the same way, multiple scores of Z_{1,2} are obtained from the multiple groups of weight matrices in the first weight matrix subset.

The multiple scores of Z_{1,2} can be computed from the multiple groups of weight matrices simultaneously.

(9) The 2nd encoding module concatenates the multiple scores of Z_{1,2} to obtain the combination vector of Z_{1,2}.

(10) The 2nd encoding module multiplies the combination vector of Z_{1,2} by the fourth weight matrix to obtain the intermediate vector of Z_{1,2}.

(11) The feed-forward network in the 2nd encoding module encodes the intermediate vector of Z_{1,2} after residual connection and normalization, and normalizes the result again to obtain the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence.

(12) Steps (1)-(11) describe how the 2nd encoding module of the 1st forward hidden sublayer encodes the 2nd word vector and the 1st word vector of the first word vector sequence into Z_{1,2} according to the first weight matrix subset; in the same way, the u-th encoding module of the 1st forward hidden sublayer obtains the u-th vector Z_{1,u} of the first intermediate vector sequence, giving Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} (a single-head sketch of this computation is given after step (c) below).
(c) Starting from the 2nd forward hidden sublayer, the n-th forward hidden sublayer encodes, layer by layer, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence according to the n-th weight matrix subset.
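A single-head sketch of the per-module computation in steps (1)-(7) above; the scaled dot-product form of the attention values is an assumption, and the residual connection, normalization, and feed-forward network of steps (10)-(11) are omitted:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def attend(x_prev, x_cur, Wq, Wk, Wv):
        q = x_cur @ Wq                           # Q weight vector of the current word
        k_cur, k_prev = x_cur @ Wk, x_prev @ Wk  # K and K' weight vectors
        v_cur, v_prev = x_cur @ Wv, x_prev @ Wv  # V and V' weight vectors
        attn = softmax(np.array([q @ k_cur, q @ k_prev]) / np.sqrt(q.shape[0]))
        # One score; the scores of all heads are concatenated and mapped by the fourth weight matrix.
        return attn[0] * v_cur + attn[1] * v_prev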
与所述特征提取模型将所述第一词向量序列编码为第一编码序列同理,所述特征提 取模型将所述第二词向量序列编码为第二编码序列R nIn the same way that the feature extraction model encodes the first word vector sequence as a first encoding sequence, the feature extraction model encodes the second word vector sequence as a second encoding sequence R n .
所述第1层前向隐藏子层的第u个编码模块可编码得到所述第一编码序列的第1个中间向量序列的第u个向量Z 1,uThe u-th encoding module of the forward hidden sublayer of the first layer can encode the u-th vector Z 1,u of the first intermediate vector sequence of the first encoding sequence.
同一层的前向隐藏子层和后向隐藏子层中的每个编码可以同时并发运行。Each code in the forward hidden sublayer and the backward hidden sublayer of the same layer can run concurrently.
计算模块204,用于利用所述特征提取模型根据所述第一编码序列、所述第二编码序列计算所述第一语句样本的缺失词向量。The calculation module 204 is configured to use the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
在本实施例中,利用所述特征提取模型的输出层根据所述第一编码序列、所述第二编码序列计算所述第一语句样本的缺失词向量。In this embodiment, the output layer of the feature extraction model is used to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence.
The vectors in the first coding sequence and the second coding sequence are summed dimension by dimension, and the resulting sum vector is multiplied by the output weight matrix and normalized to obtain the missing word vector of the first sentence sample.
The first training module 205 is configured to train the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, create a second feature extraction model whose neural network structure is identical to that of the first feature extraction model, and update the weights of the second feature extraction model with the weights of the first feature extraction model.
The loss between the missing word vector and the label vector of the first sentence sample may be calculated with a cross-entropy loss function, and the weight matrices of the feature extraction model may be optimized according to the loss value.
An intermediate feature extraction model may be created with the same neural network structure as the first feature extraction model; the neural network structure may include the number of neurons, the number of neuron layers, the connections between neurons, and so on. The weights of the first feature extraction model may then be copied: after training, these weights give the first feature extraction model strong feature extraction capability, so initializing the weights of the intermediate feature extraction model with them yields a second feature extraction model identical to the first feature extraction model.
第二训练模块206,用于用带有属性标签的第二语句样本训练由所述第一特征提取模型和全连接层构成的属性分类模型。The second training module 206 is used to train the attribute classification model composed of the first feature extraction model and the fully connected layer using second sentence samples with attribute tags.
例如,对于笔记本电脑的第二语句样本,属性标签可以包括分辨率、处理器、音效等,第二语句样本的一个语句为“这个电脑的反应速度很快”,属性标签为“处理器”,表示这个第二样本语句包括了处理器的语义。For example, for the second sentence sample of a laptop computer, the attribute label may include resolution, processor, sound effects, etc. One sentence of the second sentence sample is "This computer responds quickly", and the attribute label is "Processor". Indicates that this second sample sentence includes the semantics of the processor.
The second sentence samples may be sentences of a given field with attribute labels. A small number of second sentence samples is sufficient to train the attribute classification model: because the feature extraction model has already been trained and extracts semantic information well, only the weight matrices of the feature extraction model need to be fine-tuned, and the weight matrix in the fully connected layer is optimized. The output of the first feature extraction model is the input of the fully connected layer.
In a specific embodiment, each sentence in the second sentence samples is divided by word count into equal front and back parts: the words in the front part of a sentence play the role of the words before the missing word of a first sentence sample, and the words in the back part play the role of the words after the missing word. For example, if one sentence of the second sentence samples is "<S>这个电脑的反应速度很快处理器<E>", then "<S>这个电脑的反应" is taken as the front part, analogous to the words before the missing word of a first sentence sample, and "速度很快处理器<E>" is taken as the back part, analogous to the words after the missing word of a first sentence sample.
If the number of words in a sentence of the second sentence samples is odd, the middle word is set aside when the sentence is halved and is then assigned to the front part.
用所述第二语句样本训练所述第一特征提取模型的过程与用所述第一语句样本训练所述特征提取模型的过程类似,此处不再赘述。The process of using the second sentence sample to train the first feature extraction model is similar to the process of using the first sentence sample to train the feature extraction model, and will not be repeated here.
连接模块207,用于用所述属性分类模型识别多个待识别语句的属性词,将每个待识别语句与识别出的每个待识别语句的属性词连接,得到连接属性词的所述多个待识别语句。The connection module 207 is used to identify the attribute words of a plurality of sentences to be recognized by using the attribute classification model, and connect each sentence to be recognized with the attribute words of each sentence to be recognized to obtain the plurality of connected attribute words. Sentences to be recognized.
For example, the attribute label of the sentence to be recognized "<S>这个电脑的反应速度很快处理器<E>" is "处理器" (processor); the sentence to be recognized is connected with its attribute word as "<S>这个电脑的反应速度很快<SOE>处理器<E>", where "<SOE>" denotes the connector token.
第三训练模块208,用于用带有情感标签的连接属性词的所述多个待识别语句训练由所述第二特征提取模型和深度学习模型构成的情感分类模型。The third training module 208 is configured to train an emotion classification model composed of the second feature extraction model and a deep learning model by using the plurality of sentences to be recognized that are connected to attribute words with emotion labels.
所述情感标签可以包括“积极”、“中性”和“负面”等,所述深度学习模型可以是CNN、RNN或LSTM等。训练所述情感分类模型为现有方法,此处不再赘述。其中,所述第二特征提取模型的输出为所述深度学习模型的输入。The sentiment label may include "positive", "neutral", "negative", etc., and the deep learning model may be CNN, RNN, or LSTM. Training the sentiment classification model is an existing method, and will not be repeated here. Wherein, the output of the second feature extraction model is the input of the deep learning model.
The multiple sentences to be recognized with connected attribute words may be output for manual labeling, yielding the multiple sentences to be recognized with connected attribute words and emotion labels, which are then received.
The process of training the second feature extraction model with the multiple sentences to be recognized with connected attribute words and emotion labels is similar to the process of training the first feature extraction model with the second sentence samples, and is not repeated here.
分类模块209,用于用所述属性分类模型识别待处理语句的属性词,情感分类模型对连接属性词的所述待处理语句进行分类,输出所述待处理语句的属性词和所述待处理语句的情感类型。The classification module 209 is configured to use the attribute classification model to identify the attribute words of the sentence to be processed. The sentiment classification model classifies the sentence to be processed connecting the attribute words, and outputs the attribute words of the sentence to be processed and the sentence to be processed The emotional type of the sentence.
For example, the attribute classification model recognizes the attribute word of the sentence to be processed "这个电脑的反应速度很快" ("this computer responds very quickly") as "处理器" (processor); the sentiment classification model classifies the connected sentence "<S>这个电脑的反应速度很快<SOE>处理器<E>" and outputs the attribute word "处理器" and the emotion type "positive" of the sentence to be processed.
需要强调的是,为进一步保证所述属性分类模型和所述情感分类模型的私密和安全性,所述属性分类模型和所述情感分类模型还可以存储于一区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the attribute classification model and the emotion classification model, the attribute classification model and the emotion classification model may also be stored in a node of a blockchain.
实施例二实现了对语句进行情感分类,增强情感分类的准确性和场景适应性。The second embodiment realizes emotion classification of sentences, and enhances the accuracy and scene adaptability of emotion classification.
In another embodiment, the U-th encoding module of the n-th forward hidden sublayer encodes the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence into Z_{n,U}; the W-th encoding module of the n-th backward hidden sublayer encodes the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence into R_{n,W}.
在另一实施例中,可以将所述特征提取模型迁移至不同领域的情感分类模型中。In another embodiment, the feature extraction model can be migrated to sentiment classification models in different fields.
Embodiment 3
本实施例提供一种计算机存储介质,该计算机存储介质上存储有计算机程序,该计算机存储介质可以是非易失性,也可以是易失性。该计算机程序被处理器执行时实现上述语句情感分类方法实施例中的步骤,例如图1所示的步骤101-109。This embodiment provides a computer storage medium with a computer program stored on the computer storage medium. The computer storage medium may be non-volatile or volatile. When the computer program is executed by the processor, the steps in the above-mentioned sentence emotion classification method embodiment, such as steps 101-109 shown in FIG. 1, are implemented.
或者,该计算机程序被处理器执行时实现上述装置实施例中各模块的功能,例如图2中的模块201-209。Or, when the computer program is executed by the processor, the function of each module in the above-mentioned device embodiment is realized, for example, the modules 201-209 in FIG. 2.
Embodiment 4
图3为本申请实施例四提供的计算机设备的示意图。所述计算机设备30包括存储器301、处理器302以及存储在所述存储器301中并可在所述处理器302上运行的计算机程序303,例如语句情感分类程序。所述处理器302执行所述计算机程序303时实现上述语句情感分类方法实施例中的步骤,例如图1所示的步骤101-109。FIG. 3 is a schematic diagram of the computer equipment provided in the fourth embodiment of the application. The computer device 30 includes a memory 301, a processor 302, and a computer program 303 stored in the memory 301 and running on the processor 302, such as a sentence emotion classification program. When the processor 302 executes the computer program 303, the steps in the embodiment of the sentence emotion classification method described above are implemented, for example, steps 101-109 shown in FIG. 1.
或者,该计算机程序303被处理器执行时实现上述装置实施例中各模块的功能,例如图2中的模块201-209。Alternatively, when the computer program 303 is executed by the processor, the functions of the modules in the above-mentioned device embodiments, such as the modules 201-209 in FIG. 2, are realized.
示例性的,所述计算机程序303可以被分割成一个或多个模块,所述一个或者多个模块被存储在所述存储器301中,并由所述处理器302执行,以完成本方法。所述一个或多个模块可以是能够完成特定功能的一系列计算机可读指令段,该指令段用于描述所 述计算机程序303在所述计算机设备30中的执行过程。例如,所述计算机程序303可以被分割成图2中的获取模块201、转化模块202、编码模块203、计算模块204、第一训练模块205、第二训练模块206、连接模块207、第三训练模块208、分类模块209,各模块具体功能参见实施例二。Exemplarily, the computer program 303 may be divided into one or more modules, and the one or more modules are stored in the memory 301 and executed by the processor 302 to complete the method. The one or more modules may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 303 in the computer device 30. For example, the computer program 303 can be divided into the acquisition module 201, the conversion module 202, the encoding module 203, the calculation module 204, the first training module 205, the second training module 206, the connection module 207, and the third training in FIG. Module 208, classification module 209, see the second embodiment for the specific functions of each module.
所述计算机设备30可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。本领域技术人员可以理解,所述示意图3仅仅是计算机设备30的示例,并不构成对计算机设备30的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如所述计算机设备30还可以包括输入输出设备、网络接入设备、总线等。The computer device 30 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. Those skilled in the art can understand that the schematic diagram 3 is only an example of the computer device 30, and does not constitute a limitation on the computer device 30. It may include more or less components than those shown in the figure, or combine certain components, or different components. For example, the computer device 30 may also include input and output devices, network access devices, buses, and so on.
所称处理器302可以是中央处理单元(Central Processing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器302也可以是任何常规的处理器等,所述处理器302是所述计算机设备30的控制中心,利用各种接口和线路连接整个计算机设备30的各个部分。The so-called processor 302 may be a central processing unit (Central Processing Unit, CPU), other general processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor can be a microprocessor or the processor 302 can also be any conventional processor, etc. The processor 302 is the control center of the computer device 30, which uses various interfaces and lines to connect the entire computer device 30. Various parts.
所述存储器301可用于存储所述计算机程序303,所述处理器302通过运行或执行存储在所述存储器301内的计算机程序303或模块,以及调用存储在存储器301内的数据,实现所述计算机设备30的各种功能。所述存储器301可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据计算机设备30的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器301可以包括非易失性和易失性存储器,例如硬盘、内存、插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)、至少一个磁盘存储器件、闪存器件、或其他存储器件。The memory 301 may be used to store the computer program 303, and the processor 302 can implement the computer by running or executing the computer program 303 or module stored in the memory 301 and calling data stored in the memory 301. Various functions of the device 30. The memory 301 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.); the storage data area may The data (such as audio data, phone book, etc.) created according to the use of the computer device 30 are stored. In addition, the memory 301 may include non-volatile and volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, and a flash memory card ( Flash Card), at least one magnetic disk storage device, flash memory device, or other storage device.
所述计算机设备30集成的模块如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实现上述实施例方法中的全部或部分流程,也可以通过计算机程序来指令相关的硬件来完成,所述的计算机程序可存储于一计算机存储介质中,该计算机程序在被处理器执行时,可实现上述各个方法实施例的步骤。其中,所述计算机程序包括计算机程序代码,所述计算机程序代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质可以包括:能够携带所述计算机程序代码的任何实体或装置、记录介质、U盘、移动硬盘、磁碟、光盘、计算机存储器、只读存储器(ROM)、随机存取存储器(RAM)等。If the integrated module of the computer device 30 is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer readable storage medium. Based on this understanding, this application implements all or part of the processes in the above-mentioned embodiments and methods, and can also be completed by instructing relevant hardware through a computer program. The computer program can be stored in a computer storage medium. When executed by the processor, the steps of the foregoing method embodiments can be implemented. Wherein, the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, mobile hard disk, magnetic disk, optical disk, computer memory, read only memory (ROM), random access memory (RAM) etc.
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链(Blockchain),本质上是一个去中心化的数据库,是一串使用密码学方法相关联产生的数据块,每一个数据块中包含了一批次网络交易的信息,用于验证其信息的有效性(防伪)和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information for verification. The validity of the information (anti-counterfeiting) and the generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
最后应说明的是,以上实施例仅用以说明本申请的技术方案而非限制,尽管参照较佳实施例对本申请进行了详细说明,本领域的普通技术人员应当理解,可以对本申请的技术方案进行修改或等同替换,而不脱离本申请技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the application and not to limit them. Although the application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the application can be Make modifications or equivalent replacements without departing from the spirit and scope of the technical solution of the present application.

Claims (20)

  1. 一种语句情感分类方法,其中,所述语句情感分类方法包括:A sentence sentiment classification method, wherein the sentence sentiment classification method includes:
    获取第一语句样本集,所述第一语句样本集中的每个第一语句样本包含一个缺失词;Acquiring a first sentence sample set, where each first sentence sample in the first sentence sample set contains a missing word;
    For each first sentence sample, using a feature extraction model to convert the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and convert the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table;
    利用所述特征提取模型将所述第一词向量序列编码为第一编码序列,将所述第二词向量序列编码为第二编码序列;Coding the first word vector sequence into a first coding sequence and coding the second word vector sequence into a second coding sequence by using the feature extraction model;
    利用所述特征提取模型根据所述第一编码序列、所述第二编码序列计算所述第一语句样本的缺失词向量;Using the feature extraction model to calculate the missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
    Training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is identical to the neural network structure of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
    用带有属性标签的第二语句样本训练由所述第一特征提取模型和全连接层构成的属性分类模型;Training an attribute classification model composed of the first feature extraction model and a fully connected layer by using a second sentence sample with attribute labels;
    用所述属性分类模型识别多个待识别语句的属性词,将每个待识别语句与识别出的每个待识别语句的属性词连接,得到连接属性词的所述多个待识别语句;Identify the attribute words of a plurality of sentences to be recognized by using the attribute classification model, and connect each sentence to be recognized with the attribute words of each sentence to be recognized that are recognized to obtain the plurality of sentences to be recognized that connect the attribute words;
    用带有情感标签的连接属性词的所述多个待识别语句训练由所述第二特征提取模型和深度学习模型构成的情感分类模型;Training an emotion classification model composed of the second feature extraction model and a deep learning model by using the plurality of sentences to be recognized that are connected attribute words with emotion labels;
    用所述属性分类模型识别待处理语句的属性词,情感分类模型对连接属性词的所述待处理语句进行分类,输出所述待处理语句的属性词和所述待处理语句的情感类型。The attribute classification model is used to identify the attribute words of the sentence to be processed, and the emotion classification model classifies the sentence to be processed connecting the attribute words, and outputs the attribute words of the sentence to be processed and the emotion type of the sentence to be processed.
  2. 如权利要求1所述的语句情感分类方法,其中,所述特征提取模型包括输入层、前向隐藏层、后向隐藏层和输出层。The sentence emotion classification method according to claim 1, wherein the feature extraction model includes an input layer, a forward hidden layer, a backward hidden layer, and an output layer.
  3. The sentence sentiment classification method according to claim 1, wherein using the feature extraction model to convert the words before the missing word in the first sentence sample into the first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order comprises:
    将所述第一语句样本中的所述缺失词前的词语依词序转化为第一编码向量序列,将所述第一语句样本中的所述缺失词后的词语依词序转化为第二编码向量序列;Convert the words before the missing word in the first sentence sample into a first coding vector sequence in word order, and convert the words after the missing word in the first sentence sample into a second coding vector in word order sequence;
    将所述第一语句样本中的所述缺失词前的词语的位置编号转化为第一位置向量序列,将所述第一语句样本中的所述缺失词后的词语的位置编号转化为第二位置向量序列;Convert the position number of the word before the missing word in the first sentence sample into a first position vector sequence, and convert the position number of the word after the missing word in the first sentence sample into a second Position vector sequence;
    将所述第一编码向量序列和所述第一位置向量序列转化为第一词向量序列,将所述第二编码向量序列和所述第二位置向量序列转化为第二词向量序列。The first coding vector sequence and the first position vector sequence are converted into a first word vector sequence, and the second coding vector sequence and the second position vector sequence are converted into a second word vector sequence.
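A minimal sketch of claim 3, assuming the coding vectors come from a learned embedding table, the position vectors are sinusoidal, and the two are combined by element-wise addition; the claim fixes none of these choices, so they are illustrative assumptions.

```python
import numpy as np

d = 8
vocab = {"the": 0, "food": 1, "was": 2, "really": 3, "good": 4}
embed = np.random.default_rng(1).normal(size=(len(vocab), d))  # assumed lookup table

def position_vector(pos, dim=d):
    """Sinusoidal encoding of a position number (an assumption)."""
    i = np.arange(dim)
    angles = pos / np.power(10000, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def to_word_vectors(words, reverse=False):
    """Coding vector sequence + position vector sequence -> word vector sequence."""
    if reverse:
        words = words[::-1]
    coding = [embed[vocab[w]] for w in words]
    positions = [position_vector(p) for p, _ in enumerate(words)]
    return np.stack([c + p for c, p in zip(coding, positions)])  # assumed: sum

before_missing = ["the", "food", "was"]   # words before the missing word
after_missing = ["good"]                  # words after it
first_word_vecs = to_word_vectors(before_missing)
second_word_vecs = to_word_vectors(after_missing, reverse=True)
print(first_word_vecs.shape, second_word_vecs.shape)  # (3, 8) (1, 8)
```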
  4. The sentence sentiment classification method according to claim 1, wherein encoding, by the feature extraction model, the first word vector sequence into the first coding sequence comprises:
    encoding, by the 1st coding module of the layer-1 forward hidden sublayer of the feature extraction model, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset of an initialized weight matrix set, wherein the initialized weight matrix set comprises N weight matrix subsets, the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence, the layer-n forward hidden sublayer and the layer-n backward hidden sublayer of the feature extraction model share the n-th weight matrix subset, each weight matrix subset comprises multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices comprises a V weight matrix, a Q weight matrix, and a K weight matrix;
    starting from the 2nd coding module of the layer-1 forward hidden sublayer, encoding, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, wherein the u-th vector of the first intermediate vector sequence of the first coding sequence corresponds one-to-one to the u-th word vector of the first word vector sequence; and
    starting from the layer-2 forward hidden sublayer of the feature extraction model, encoding, layer by layer with the layer-n forward hidden sublayer according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
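A minimal sketch of the layer-wise encoding in claim 4: layer 1 turns the word vectors into the first intermediate vector sequence, and each layer n re-encodes the (n-1)-th intermediate sequence with the n-th weight matrix subset. The encode_layer function below is only a placeholder single-head self-attention stand-in for the per-module steps detailed in claims 5 and 6, and the sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d, num_layers = 8, 3

# One weight matrix subset per layer, shared by the forward and backward sublayers.
weight_subsets = [
    {"V": rng.normal(size=(d, d)), "Q": rng.normal(size=(d, d)),
     "K": rng.normal(size=(d, d)), "fourth": rng.normal(size=(d, d))}
    for _ in range(num_layers)
]

def encode_layer(seq, subset):
    """Placeholder per-layer encoder: a single-head attention-like mixing step."""
    Q, K, V = seq @ subset["Q"], seq @ subset["K"], seq @ subset["V"]
    scores = Q @ K.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return (attn @ V) @ subset["fourth"]

word_vectors = rng.normal(size=(5, d))      # first word vector sequence
Z = word_vectors
for n in range(num_layers):                 # Z_1, then Z_2 from Z_1, and so on
    Z = encode_layer(Z, weight_subsets[n])
print("n-th intermediate vector sequence:", Z.shape)
```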
  5. The sentence sentiment classification method according to claim 4, wherein encoding, by the 1st coding module of the layer-1 forward hidden sublayer according to the first weight matrix subset of the initialized weight matrix set, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence comprises:
    multiplying, by the 1st coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrices of the multiple groups of weight matrices in the first weight matrix subset respectively, to obtain multiple V weight vectors of the 1st word vector of the first word vector sequence;
    concatenating the multiple V weight vectors of the 1st word vector of the first word vector sequence to obtain a combined vector of the 1st word vector of the first word vector sequence; and
    multiplying the combined vector of the 1st word vector of the first word vector sequence by the fourth weight matrix to obtain the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence.
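A minimal sketch of claim 5: the 1st coding module has no earlier word to attend to, so the 1st word vector is projected by every V weight matrix in the first weight matrix subset, the resulting V weight vectors are concatenated, and the fourth weight matrix maps the combination back to the model width. The head count and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
d, heads, d_head = 8, 2, 4

V_matrices = [rng.normal(size=(d, d_head)) for _ in range(heads)]  # one per group
fourth_matrix = rng.normal(size=(heads * d_head, d))

x1 = rng.normal(size=d)                           # 1st word vector
v_vectors = [x1 @ V for V in V_matrices]          # multiple V weight vectors
combined = np.concatenate(v_vectors)              # combined vector
Z_1_1 = combined @ fourth_matrix                  # 1st vector of the sequence
print(Z_1_1.shape)                                # (8,)
```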
  6. The sentence sentiment classification method according to claim 4, wherein encoding, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence comprises:
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the Q weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the Q weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the K weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the K weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the K weight matrix (the third matrix) of the 1st group of weight matrices in the first weight matrix subset, to obtain the K' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the attention value of the V weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the Q weight vector, the K weight vector, and the K' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the first score of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the V weight vector, the V' weight vector, the attention value of the V weight vector, and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    concatenating, by the 2nd coding module of the layer-1 forward hidden sublayer, the multiple scores of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, to obtain the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence by the fourth weight matrix, to obtain the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence; and
    encoding, by the feedforward network in the 2nd coding module of the layer-1 forward hidden sublayer, the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence after it has undergone residual connection and normalization, and performing normalization again, to obtain the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence.
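A minimal sketch of claim 6 for the 2nd coding module: the 2nd word vector supplies the Q, K and V weight vectors, the 1st word vector supplies the K' and V' weight vectors, a scaled dot-product softmax over (Q·K, Q·K') gives the two attention values, the attention-weighted sum of V and V' is one score per group, the per-group scores are concatenated and projected by the fourth weight matrix, and residual + normalization + feedforward + normalization produce Z_{1,2}. The head count, scaling factor, layer-norm form and feedforward shape are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d, heads, d_head = 8, 2, 4

def layer_norm(x, eps=1e-6):
    return (x - x.mean()) / (x.std() + eps)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

x1, x2 = rng.normal(size=d), rng.normal(size=d)          # 1st and 2nd word vectors
W_Q = [rng.normal(size=(d, d_head)) for _ in range(heads)]
W_K = [rng.normal(size=(d, d_head)) for _ in range(heads)]
W_V = [rng.normal(size=(d, d_head)) for _ in range(heads)]
fourth_matrix = rng.normal(size=(heads * d_head, d))
W_ff1, W_ff2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))

scores = []
for h in range(heads):
    q = x2 @ W_Q[h]                       # Q weight vector of Z_{1,2}
    k = x2 @ W_K[h]                       # K weight vector of Z_{1,2}
    v = x2 @ W_V[h]                       # V weight vector of Z_{1,2}
    k_prev = x1 @ W_K[h]                  # K' weight vector (from the 1st word)
    v_prev = x1 @ W_V[h]                  # V' weight vector (from the 1st word)
    attn = softmax(np.array([q @ k, q @ k_prev]) / np.sqrt(d_head))
    scores.append(attn[0] * v + attn[1] * v_prev)        # one score per group

combined = np.concatenate(scores)                        # combined vector
intermediate = combined @ fourth_matrix                  # intermediate vector
h1 = layer_norm(x2 + intermediate)                       # residual + normalization
ff = np.maximum(0.0, h1 @ W_ff1) @ W_ff2                 # feedforward network
Z_1_2 = layer_norm(h1 + ff)                              # normalized again
print(Z_1_2.shape)
```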
  7. The sentence sentiment classification method according to claim 1, wherein the sentence sentiment classification method further comprises:
    encoding, by the U-th coding module of the layer-n forward hidden sublayer of the feature extraction model, the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence into Z_{n,U}; and encoding, by the W-th coding module of the layer-n backward hidden sublayer, the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence into R_{n,W}.
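A minimal sketch of claim 7: at layer n, the U-th forward coding module fuses the (U-1)-th and U-th forward intermediate vectors with the W-th backward intermediate vector into Z_{n,U}, and the W-th backward module does the mirror-image fusion into R_{n,W}. Averaging followed by a shared projection is only an illustrative stand-in for the attention step of claim 6.

```python
import numpy as np

rng = np.random.default_rng(5)
d = 8
shared_subset = rng.normal(size=(d, d))   # n-th weight matrix subset (shared)

Z_prev = rng.normal(size=(4, d))          # (n-1)-th forward intermediate sequence
R_prev = rng.normal(size=(3, d))          # (n-1)-th backward intermediate sequence
U, W = 3, 2                               # 1-based indices of the coding modules

def fuse(*vectors):
    """Assumed fusion: average the inputs, then apply the shared weights."""
    return np.mean(vectors, axis=0) @ shared_subset

Z_n_U = fuse(Z_prev[U - 2], Z_prev[U - 1], R_prev[W - 1])
R_n_W = fuse(R_prev[W - 2], R_prev[W - 1], Z_prev[U - 1])
print(Z_n_U.shape, R_n_W.shape)
```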
  8. A sentence sentiment classification device, wherein the device comprises:
    an obtaining module, configured to obtain a first sentence sample set, wherein each first sentence sample in the first sentence sample set contains a missing word;
    a conversion module, configured to, for each first sentence sample, convert, by using a feature extraction model, the words before the missing word in the first sentence sample into a first word vector sequence in word order, convert the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and convert the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table;
    an encoding module, configured to encode, by using the feature extraction model, the first word vector sequence into a first coding sequence and the second word vector sequence into a second coding sequence;
    a calculation module, configured to calculate, by using the feature extraction model, a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
    a first training module, configured to train the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, create a second feature extraction model whose neural network structure is identical to that of the first feature extraction model, and update the weights of the second feature extraction model with the weights of the first feature extraction model;
    a second training module, configured to train an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels;
    a connection module, configured to identify attribute words of a plurality of sentences to be recognized with the attribute classification model, and concatenate each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized concatenated with attribute words;
    a third training module, configured to train a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized that are concatenated with attribute words and carry sentiment labels; and
    a classification module, configured to identify the attribute word of a sentence to be processed with the attribute classification model, classify, by the sentiment classification model, the sentence to be processed concatenated with the attribute word, and output the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  9. A computer device, wherein the computer device comprises a processor, and the processor is configured to execute computer-readable instructions stored in a memory to implement the following steps:
    obtaining a first sentence sample set, wherein each first sentence sample in the first sentence sample set contains a missing word;
    for each first sentence sample, converting, by using a feature extraction model, the words before the missing word in the first sentence sample into a first word vector sequence in word order, converting the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and converting the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table;
    encoding, by using the feature extraction model, the first word vector sequence into a first coding sequence, and encoding the second word vector sequence into a second coding sequence;
    calculating, by using the feature extraction model, a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
    training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is identical to that of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
    training an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels;
    identifying attribute words of a plurality of sentences to be recognized with the attribute classification model, and concatenating each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized concatenated with attribute words;
    training a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized that are concatenated with attribute words and carry sentiment labels; and
    identifying the attribute word of a sentence to be processed with the attribute classification model, classifying, by the sentiment classification model, the sentence to be processed concatenated with the attribute word, and outputting the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  10. The computer device according to claim 9, wherein the feature extraction model comprises an input layer, a forward hidden layer, a backward hidden layer, and an output layer.
  11. The computer device according to claim 9, wherein, when the processor executes the computer-readable instructions stored in the memory to convert, by using the feature extraction model, the words before the missing word in the first sentence sample into the first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order, the steps comprise:
    converting the words before the missing word in the first sentence sample into a first coding vector sequence in word order, and converting the words after the missing word in the first sentence sample into a second coding vector sequence in word order;
    converting the position numbers of the words before the missing word in the first sentence sample into a first position vector sequence, and converting the position numbers of the words after the missing word in the first sentence sample into a second position vector sequence; and
    converting the first coding vector sequence and the first position vector sequence into the first word vector sequence, and converting the second coding vector sequence and the second position vector sequence into the second word vector sequence.
  12. The computer device according to claim 9, wherein, when the processor executes the computer-readable instructions stored in the memory to encode, by the feature extraction model, the first word vector sequence into the first coding sequence, the steps comprise:
    encoding, by the 1st coding module of the layer-1 forward hidden sublayer of the feature extraction model, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset of an initialized weight matrix set, wherein the initialized weight matrix set comprises N weight matrix subsets, the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence, the layer-n forward hidden sublayer and the layer-n backward hidden sublayer of the feature extraction model share the n-th weight matrix subset, each weight matrix subset comprises multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices comprises a V weight matrix, a Q weight matrix, and a K weight matrix;
    starting from the 2nd coding module of the layer-1 forward hidden sublayer, encoding, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, wherein the u-th vector of the first intermediate vector sequence of the first coding sequence corresponds one-to-one to the u-th word vector of the first word vector sequence; and
    starting from the layer-2 forward hidden sublayer of the feature extraction model, encoding, layer by layer with the layer-n forward hidden sublayer according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
  13. The computer device according to claim 12, wherein, when the processor executes the computer-readable instructions stored in the memory to encode, by the 1st coding module of the layer-1 forward hidden sublayer according to the first weight matrix subset of the initialized weight matrix set, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence, the steps comprise:
    multiplying, by the 1st coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrices of the multiple groups of weight matrices in the first weight matrix subset respectively, to obtain multiple V weight vectors of the 1st word vector of the first word vector sequence;
    concatenating the multiple V weight vectors of the 1st word vector of the first word vector sequence to obtain a combined vector of the 1st word vector of the first word vector sequence; and
    multiplying the combined vector of the 1st word vector of the first word vector sequence by the fourth weight matrix to obtain the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence.
  14. The computer device according to claim 12, wherein, when the processor executes the computer-readable instructions stored in the memory to encode, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, the steps comprise:
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the Q weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the Q weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the K weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the K weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the K weight matrix (the third matrix) of the 1st group of weight matrices in the first weight matrix subset, to obtain the K' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the attention value of the V weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the Q weight vector, the K weight vector, and the K' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the first score of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the V weight vector, the V' weight vector, the attention value of the V weight vector, and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    concatenating, by the 2nd coding module of the layer-1 forward hidden sublayer, the multiple scores of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, to obtain the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence by the fourth weight matrix, to obtain the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence; and
    encoding, by the feedforward network in the 2nd coding module of the layer-1 forward hidden sublayer, the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence after it has undergone residual connection and normalization, and performing normalization again, to obtain the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence.
  15. The computer device according to claim 9, wherein the processor executes the computer-readable instructions stored in the memory to further implement the following step:
    encoding, by the U-th coding module of the layer-n forward hidden sublayer of the feature extraction model, the (U-1)-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence, and the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence into Z_{n,U}; and encoding, by the W-th coding module of the layer-n backward hidden sublayer, the (W-1)-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, the W-th vector of the (n-1)-th intermediate vector sequence of the second coding sequence, and the U-th vector of the (n-1)-th intermediate vector sequence of the first coding sequence into R_{n,W}.
  16. A computer storage medium storing computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented:
    obtaining a first sentence sample set, wherein each first sentence sample in the first sentence sample set contains a missing word;
    for each first sentence sample, converting, by using a feature extraction model, the words before the missing word in the first sentence sample into a first word vector sequence in word order, converting the words after the missing word in the first sentence sample into a second word vector sequence in reverse word order, and converting the missing word in the first sentence sample into a label vector of the first sentence sample according to a preset vocabulary coding table;
    encoding, by using the feature extraction model, the first word vector sequence into a first coding sequence, and encoding the second word vector sequence into a second coding sequence;
    calculating, by using the feature extraction model, a missing word vector of the first sentence sample according to the first coding sequence and the second coding sequence;
    training the feature extraction model according to the missing word vector of the first sentence sample and the label vector of the first sentence sample to obtain a first feature extraction model, creating a second feature extraction model whose neural network structure is identical to that of the first feature extraction model, and updating the weights of the second feature extraction model with the weights of the first feature extraction model;
    training an attribute classification model composed of the first feature extraction model and a fully connected layer with second sentence samples carrying attribute labels;
    identifying attribute words of a plurality of sentences to be recognized with the attribute classification model, and concatenating each sentence to be recognized with its identified attribute word to obtain the plurality of sentences to be recognized concatenated with attribute words;
    training a sentiment classification model composed of the second feature extraction model and a deep learning model with the plurality of sentences to be recognized that are concatenated with attribute words and carry sentiment labels; and
    identifying the attribute word of a sentence to be processed with the attribute classification model, classifying, by the sentiment classification model, the sentence to be processed concatenated with the attribute word, and outputting the attribute word of the sentence to be processed and the sentiment type of the sentence to be processed.
  17. The computer storage medium according to claim 16, wherein, when the computer-readable instructions are executed by the processor to convert, by using the feature extraction model, the words before the missing word in the first sentence sample into the first word vector sequence in word order and to convert the words after the missing word in the first sentence sample into the second word vector sequence in reverse word order, the steps comprise:
    converting the words before the missing word in the first sentence sample into a first coding vector sequence in word order, and converting the words after the missing word in the first sentence sample into a second coding vector sequence in word order;
    converting the position numbers of the words before the missing word in the first sentence sample into a first position vector sequence, and converting the position numbers of the words after the missing word in the first sentence sample into a second position vector sequence; and
    converting the first coding vector sequence and the first position vector sequence into the first word vector sequence, and converting the second coding vector sequence and the second position vector sequence into the second word vector sequence.
  18. The computer storage medium according to claim 16, wherein, when the computer-readable instructions are executed by the processor to encode, by the feature extraction model, the first word vector sequence into the first coding sequence, the steps comprise:
    encoding, by the 1st coding module of the layer-1 forward hidden sublayer of the feature extraction model, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence according to the first weight matrix subset of an initialized weight matrix set, wherein the initialized weight matrix set comprises N weight matrix subsets, the intermediate vector sequences of the first coding sequence correspond one-to-one, in order, to the intermediate vector sequences of the second coding sequence, the layer-n forward hidden sublayer and the layer-n backward hidden sublayer of the feature extraction model share the n-th weight matrix subset, each weight matrix subset comprises multiple groups of weight matrices and a fourth weight matrix, and each group of weight matrices comprises a V weight matrix, a Q weight matrix, and a K weight matrix;
    starting from the 2nd coding module of the layer-1 forward hidden sublayer, encoding, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, wherein the u-th vector of the first intermediate vector sequence of the first coding sequence corresponds one-to-one to the u-th word vector of the first word vector sequence; and
    starting from the layer-2 forward hidden sublayer of the feature extraction model, encoding, layer by layer with the layer-n forward hidden sublayer according to the n-th weight matrix subset, the (n-1)-th intermediate vector sequence Z_{n-1} of the first coding sequence into the n-th intermediate vector sequence Z_n of the first coding sequence.
  19. The computer storage medium according to claim 18, wherein, when the computer-readable instructions are executed by the processor to encode, by the 1st coding module of the layer-1 forward hidden sublayer according to the first weight matrix subset of the initialized weight matrix set, the 1st word vector of the first word vector sequence into the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence, the steps comprise:
    multiplying, by the 1st coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrices of the multiple groups of weight matrices in the first weight matrix subset respectively, to obtain multiple V weight vectors of the 1st word vector of the first word vector sequence;
    concatenating the multiple V weight vectors of the 1st word vector of the first word vector sequence to obtain a combined vector of the 1st word vector of the first word vector sequence; and
    multiplying the combined vector of the 1st word vector of the first word vector sequence by the fourth weight matrix to obtain the 1st vector Z_{1,1} of the first intermediate vector sequence of the first coding sequence.
  20. The computer storage medium according to claim 18, wherein, when the computer-readable instructions are executed by the processor to encode, by the u-th coding module of the layer-1 forward hidden sublayer one by one according to the first weight matrix subset, the (u-1)-th word vector of the first word vector sequence and the u-th word vector of the first word vector sequence into the u-th vector Z_{1,u} of the first intermediate vector sequence of the first coding sequence, to obtain the first intermediate vector sequence Z_1 = {Z_{1,1}, ..., Z_{1,u}, ..., Z_{1,U}} of the first coding sequence, the steps comprise:
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the Q weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the Q weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 2nd word vector of the first word vector sequence by the K weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the K weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the V weight matrix of the 1st group of weight matrices in the first weight matrix subset, to obtain the V' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the 1st word vector of the first word vector sequence by the K weight matrix (the third matrix) of the 1st group of weight matrices in the first weight matrix subset, to obtain the K' weight vector of the 2nd vector Z_{1,2} of the first intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the attention value of the V weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the Q weight vector, the K weight vector, and the K' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    determining, by the 2nd coding module of the layer-1 forward hidden sublayer, the first score of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, according to the V weight vector, the V' weight vector, the attention value of the V weight vector, and the attention value of the V' weight vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    concatenating, by the 2nd coding module of the layer-1 forward hidden sublayer, the multiple scores of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence, to obtain the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence;
    multiplying, by the 2nd coding module of the layer-1 forward hidden sublayer, the combined vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence by the fourth weight matrix, to obtain the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence; and
    encoding, by the feedforward network in the 2nd coding module of the layer-1 forward hidden sublayer, the intermediate vector of the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence after it has undergone residual connection and normalization, and performing normalization again, to obtain the 2nd vector Z_{1,2} of the 1st intermediate vector sequence of the first coding sequence.
PCT/CN2020/131951 2020-03-02 2020-11-26 Statement sentiment classification method and related device WO2021174922A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010137265.1A CN111460812B (en) 2020-03-02 Sentence emotion classification method and related equipment
CN202010137265.1 2020-03-02

Publications (1)

Publication Number Publication Date
WO2021174922A1 (en)

Family

ID=71684213

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/131951 WO2021174922A1 (en) 2020-03-02 2020-11-26 Statement sentiment classification method and related device

Country Status (1)

Country Link
WO (1) WO2021174922A1 (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200017568A (en) * 2018-07-23 2020-02-19 전남대학교산학협력단 Sentence sentiment classification system and method based on sentiment dictionary construction by the price fluctuation and convolutional neural network
CN109299457A (en) * 2018-09-06 2019-02-01 北京奇艺世纪科技有限公司 A kind of opining mining method, device and equipment
CN110222178A (en) * 2019-05-24 2019-09-10 新华三大数据技术有限公司 Text sentiment classification method, device, electronic equipment and readable storage medium storing program for executing
CN110287320A (en) * 2019-06-25 2019-09-27 北京工业大学 A kind of deep learning of combination attention mechanism is classified sentiment analysis model more
CN110825849A (en) * 2019-11-05 2020-02-21 泰康保险集团股份有限公司 Text information emotion analysis method, device, medium and electronic equipment
CN111460812A (en) * 2020-03-02 2020-07-28 平安科技(深圳)有限公司 Statement emotion classification method and related equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
P. Waila, Marisha, V. K. Singh, M. K. Singh: "Evaluating Machine Learning and Unsupervised Semantic Orientation approaches for sentiment analysis of textual reviews", 2012 IEEE International Conference on Computational Intelligence & Computing Research (ICCIC), 18 December 2012, pages 1-6, XP032390029, ISBN: 978-1-4673-1342-1, DOI: 10.1109/ICCIC.2012.6510235 *
Wang Yuhan, Zhang Chunyun, Zhao Baolin, Xiao Ming, Geng Leilei, Cui Chaoran: "Sentiment Analysis of Twitter Data Based on CNN", Journal of Data Acquisition and Processing, vol. 33, no. 5, 1 September 2018, pages 921-927, XP055842112, DOI: 10.16337/j.1004-9037.2018.05.017 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115169530A (en) * 2022-06-29 2022-10-11 北京百度网讯科技有限公司 Data processing method and device, electronic equipment and readable storage medium
CN115169530B (en) * 2022-06-29 2023-09-26 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and readable storage medium
CN115905533A (en) * 2022-11-24 2023-04-04 重庆邮电大学 Intelligent multi-label text classification method
CN115905533B (en) * 2022-11-24 2023-09-19 湖南光线空间信息科技有限公司 Multi-label text intelligent classification method

Also Published As

Publication number Publication date
CN111460812A (en) 2020-07-28


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20923018; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20923018; Country of ref document: EP; Kind code of ref document: A1)