CN116189064B - Barrage emotion analysis method and system based on joint model - Google Patents
Barrage emotion analysis method and system based on joint model Download PDFInfo
- Publication number
- CN116189064B CN116189064B CN202310458854.3A CN202310458854A CN116189064B CN 116189064 B CN116189064 B CN 116189064B CN 202310458854 A CN202310458854 A CN 202310458854A CN 116189064 B CN116189064 B CN 116189064B
- Authority
- CN
- China
- Prior art keywords
- barrage
- surrounding
- video
- emotion
- comment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 230000008451 emotion Effects 0.000 title claims abstract description 105
- 238000004458 analytical method Methods 0.000 title claims abstract description 39
- 230000004927 fusion Effects 0.000 claims abstract description 25
- 238000012216 screening Methods 0.000 claims abstract description 9
- 230000006870 function Effects 0.000 claims description 35
- 238000004364 calculation method Methods 0.000 claims description 23
- 230000003014 reinforcing effect Effects 0.000 claims description 19
- 239000011159 matrix material Substances 0.000 claims description 17
- 238000000034 method Methods 0.000 claims description 17
- 238000012549 training Methods 0.000 claims description 17
- 238000012545 processing Methods 0.000 claims description 14
- 238000010276 construction Methods 0.000 claims description 13
- 230000008569 process Effects 0.000 claims description 8
- 238000001914 filtration Methods 0.000 claims description 7
- 239000013598 vector Substances 0.000 claims description 6
- 238000012512 characterization method Methods 0.000 claims description 5
- 238000010606 normalization Methods 0.000 claims description 5
- 230000002093 peripheral effect Effects 0.000 claims description 4
- 230000002708 enhancing effect Effects 0.000 claims description 3
- 230000007787 long-term memory Effects 0.000 claims description 3
- 230000006403 short-term memory Effects 0.000 claims description 3
- 230000007246 mechanism Effects 0.000 abstract description 5
- 230000000694 effects Effects 0.000 description 5
- 230000003631 expected effect Effects 0.000 description 3
- 230000001788 irregular Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 201000007201 aphasia Diseases 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000015654 memory Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Medical Informatics (AREA)
- Pure & Applied Mathematics (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Algebra (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a barrage emotion analysis method and system based on a joint model, wherein barrage comments are input into the trained joint model to output emotion tendencies corresponding to the barrage comments, the joint model comprises a coding module and a decoding module, the coding module comprises a video coding module, a text coding module, a gating fusion module and a multi-mode fusion module, the decoding module comprises a barrage reconstruction module and an emotion analysis module, and the decoding module takes the output of the coding module as input to output emotion analysis tendencies corresponding to the barrage comments; the barrage emotion analysis method and system utilize a gating fusion screening mechanism to take surrounding barrage comments as context information of target barrage comments, and utilize a multi-mode fusion mode to take video information into consideration, and fully utilize useful information to strengthen characteristic representation of video barrages so as to accurately identify emotion tendencies of the target barrage comments.
Description
Technical Field
The invention relates to the technical field of barrage emotion analysis, in particular to a barrage emotion analysis method and system based on a joint model.
Background
The emotion analysis of the video barrage refers to the emotion polarity of real-time comments of the video.
The existing video barrage emotion analysis method is prone to extracting sentence-level features for emotion analysis and classification, and is based on grammar and semantics of rules, but the barrage is characterized in that: the method is short, has serious aphasia, uses special characters to represent specific meanings, has extremely irregular grammar and the like, so the traditional emotion analysis method cannot accurately segment the bullet screen properly, analyze grammar and the like, and further cannot accurately analyze emotion.
In addition, the existing barrage comment is short, insufficient context information is not available, grammar is very irregular, the existing barrage comment is related to a video theme at the time, interactivity is strong, instantaneity is strong, and the like, so that the existing method cannot effectively and accurately analyze emotion of the existing barrage comment in a short time.
Disclosure of Invention
Based on the technical problems in the background technology, the invention provides a barrage emotion analysis method and system based on a joint model, which can accurately identify emotion tendencies of target barrage comments.
According to the barrage emotion analysis method based on the joint model, barrage comments are input into the trained joint model to output emotion tendencies corresponding to the barrage comments;
the training process of the joint model is as follows:
s1: constructing a training sample set, the training sample set comprising momentsBullet comment>Time->To the point ofInward bullet comment->Surrounding video->And comment on bullet screen->Video surrounding barrage comment +.>;
S2: for the videoCoding and concatenating to obtain coded video feature ∈ ->Comment on the barrage->And the video surrounding barrage comment +.>Coding to obtain the coded target barrage characteristic +.>And surrounding barrage feature->;
S3: based on the target barrage featureFor the surrounding barrage feature->After screening and filtering, connecting in series to obtain all surrounding barrage comments +.>;
S4: video characterization through self-attention layers and cross-attention layersTarget barrage feature->Comment on surrounding barrage->Enhancement processing to obtain enhanced video features>Enhanced target barrage feature->And reinforcing the surrounding barrage->;
S5, performing S5; enhancement of video features based on multi-layered multi-headed pairs of attention layersEnhanced target barrage feature->Reinforcing surrounding barrage->Reconstructing to obtain reconstructed barrage comments, and constructing a barrage reconstructed loss function by using the reconstructed barrage comments and the real barrage comments by using cross entropy>;
S6: for enhanced video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Sequentially carrying out regularization and normalization operations, and outputting the barrage comment +.>Corresponding to predicted barrage emotion +.>;
S7: predicted barrage emotion using cross entropyAnd true barrage emotion->Construction of a loss function for emotion prediction>Loss function based on barrage reconstruction>And loss function of emotion prediction->Calculating the overall loss function->Updating parameters of the joint model based on the total loss function and the back propagation algorithm until the performance of the joint model reaches a set expected value;
the surrounding barrage commentsThe calculation formula is as follows:
wherein ,for post-selection->Comment on the surrounding bullet screen->Is->Video surrounding barrage comment->Is (are) peripheral features of->,/>Is a learnable gate matrix +.>Is a learnable gate offset vector, +.>As a function of the ReLU,representing series connection,/->Representing the product.
Further, the video featuresThe calculation formula of (2) is as follows:
the target barrage featureThe calculation formula of (2) is as follows:
the surrounding barrage featureThe calculation formula of (2) is as follows:
wherein ,,/>,/>representing series connection,/->Representing a video encoder>Representing a long and short term memory network.
Further, in step S4: video characterization through self-attention layers and cross-attention layersTarget barrage feature->Comment on surrounding barrage->Enhancement processing to obtain enhanced video features>Enhanced target barrage feature->And reinforcing the surrounding barrage->Specifically, the method comprises the following steps:
characterizing videoTarget barrage feature->Comment on surrounding barrage->Inputting as a first layer of the self-attention layer and the cross-attention layer and performing an L-layer iteration, wherein the L-layer is the total layer number of the self-attention layer and the cross-attention layer;
in the first placeLayer input video feature->Obtaining the input video feature of the next layer +.>The following are provided:
in the first placeLayer input target barrage feature->Obtaining the input target barrage feature of the next layer>:
In the first placeLayer input surrounding barrage comment->Obtaining the comment +.>:
Where SA represents the self-attention layer and CA represents the cross-attention layer.
Further, in step S5, the barrage reconstructed loss functionThe construction formula is as follows:
wherein ,indicating batch processing, +.>Representing cross entropy loss, < >>Representing a reconstruction module->Comment of bullet generated by the reconstruction module is represented, < ->Indicating time->Is a true bullet comment;
specifically, the bullet comments generated by the reconstruction module are specifically expressed in the following form:
wherein Representing a multi-layer perceptron, LN representing regularization operation, MHA representing cross-multi-headed attention.
Further, in step S6, predicted barrage emotionThe calculation formula is as follows:
wherein ,is a Softmax function, LN represents a layer regularization operation, +.>Representing a multi-layer sensor->For a learnable emotion prediction matrix, +.>Is a learnable video emotion matrix, +.>As a learnable ambient barrage emotion matrix,representing a learnable target barrage emotion matrix, < ->Representing a tandem operation, representing a product.
Further, in step S7, the loss function of emotion predictionThe construction formula is as follows:
the overall loss functionThe calculation process of (2) is as follows:
wherein ,for predicted barrage emotion, +.>Representing cross entropy loss, < >>Is true barrage emotion +.>Representing loss balance parameters, +.>Indicating batch processing.
A barrage emotion analysis system based on a joint model inputs barrage comments into the trained joint model to output emotion tendencies corresponding to the barrage comments;
the analysis system comprises a construction module, a video coding module, a text coding module, a door control fusion module, a multi-mode fusion module, a barrage reconstruction module, a barrage emotion prediction module and a loss calculation module;
the construction module is used for constructing a training sample set, and the training sample set comprises momentsBullet comment>Time->To->Inward bullet comment->Surrounding video->And comment on bullet screen->Video surrounding barrage comment +.>;
The video coding module is used for coding the videoCoding and concatenating to obtain coded video features;
The text coding module is used for commenting on the barrageAnd the video surrounding barrage comment +.>Coding to obtain the coded target barrage characteristic +.>And surrounding barrage feature->;
The gating fusion module is based on the target barrage characteristicsFor the surrounding barrage feature->After screening and filtering, connecting in series to obtain all surrounding barrage comments +.>;
The multi-mode fusion module is used for video features through the self-attention layer and the cross-attention layerTarget barrage feature->Comment on surrounding barrage->Processing to obtain enhanced video features->Enhanced target barrage feature->And reinforcing the surrounding barrage->;
The bullet screen reconstruction module is used for multi-layer multi-head attentionForce layer pair enhanced video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Reconstructing to obtain reconstructed barrage comments, and constructing a barrage reconstructed loss function by using the reconstructed barrage comments and the real barrage comments by using cross entropy>;
The barrage emotion prediction module is used for enhancing video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Sequentially carrying out regularization and normalization operations, and outputting the barrage comment +.>Corresponding to predicted barrage emotion +.>;
The loss calculation module is used for predicting bullet screen emotion by using cross entropyAnd true barrage emotion->Construction of a loss function for emotion prediction>Loss function based on barrage reconstruction>And loss function for emotion predictionCalculating the overall loss function->Updating parameters of the joint model based on the total loss function and the back propagation algorithm until the performance of the joint model reaches a set expectation;
the surrounding barrage commentsThe calculation formula is as follows:
wherein ,for post-selection->Comment on the surrounding bullet screen->Is->Video surrounding barrage comment->Is of the circumference of (1)Surrounding features (I)>,/>Is a learnable gate matrix +.>Is a learnable gate offset vector, +.>As a function of the ReLU,representing series connection,/->Representing the product.
The barrage emotion analysis method and system based on the joint model provided by the invention have the advantages that: according to the barrage emotion analysis method and system based on the joint model, the video information is included through the multi-mode fusion module, the relation between the video theme and the barrage is fully considered, the enhanced characteristic representation is obtained, and the emotion analysis performance of the joint model on the target barrage comment is improved; the video information is included through the multi-mode fusion module, the relation between the video theme and the barrage is fully considered, the enhanced characteristic representation is obtained, and the performance of the combined model for carrying out emotion analysis on the target barrage comment is improved; and the bullet screen reconstruction module is utilized to promote the overall learning effect of each module and improve the performance of the emotion analysis module.
Drawings
FIG. 1 is a schematic diagram of the structure of the present invention;
fig. 2 is a schematic view of a module frame according to the present invention.
Detailed Description
In the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The invention may be embodied in many other forms than described herein and similarly modified by those skilled in the art without departing from the spirit or scope of the invention, which is therefore not limited to the specific embodiments disclosed below.
As shown in fig. 1 and 2, according to the barrage emotion analysis method based on the combined model, barrage comments are input into the trained combined model so as to output emotion tendencies corresponding to the barrage comments; the combined model uses a coding-decoding architecture, and comprises a coding module and a decoding module, wherein the coding module comprises a video coding module, a text coding module, a door control fusion module and a multi-mode fusion module, the decoding module comprises a barrage reconstruction module and an emotion analysis module, the emotion analysis module comprises a barrage emotion prediction module and a loss calculation module, and the decoding module takes the output of the coding module as input so as to output emotion analysis trends corresponding to barrage evaluation.
The method mainly comprises the steps of taking surrounding comments as context information of a target barrage by using a gating screening mechanism in a joint model, taking video information into consideration by using a multi-mode fusion mode, fully utilizing useful information to strengthen characteristic representation of the video barrage, constructing the joint model based on a residual convolution neural network, a long-short-period memory network, a gating fusion self-attention layer, a cross-attention layer and the like, training and learning parameters in the joint model, and optimizing the learning parameters to realize the effect of accurately identifying emotion tendencies of the target barrage comments by the joint model, and is specifically as follows.
The training process of the joint model is as follows:
s1: constructing a training sample set, the training sample set comprising momentsBullet comment>Time->To the point ofInward bullet comment->Surrounding video->And comment on bullet screen->Video surrounding barrage comment +.>;
Video frequencyThere is->Frame video->Video surrounding barrage comment->There is->Video bullet comment->Video surrounding barrage comment->Is at the comment->Surrounding comments.
For example, the bullet comment y "for itself, in the example shown in FIG. 2, insists on-! "as input, video surrounding bullet comments"beautiful and" good stature "as the context of y, and video corresponding to the comment y of the bullet screen is given +.>Together as an input.
S2: for the videoCoding and concatenating to obtain coded video feature ∈ ->Comment on the barrage->And the video surrounding barrage comment +.>Coding to obtain the coded target barrage characteristic +.>And surrounding barrage feature->;
Encoding within a video encoding module using a residual convolutional neural networkFrame video->And concatenating the obtained encoded vectors to obtain the encoded frame-level video feature +.>:
wherein ,representing a video encoder>Representing a tandem operation;
in the text coding module, a long-term and short-term memory network is used) Comment on the barrage respectively->And its surroundingsVideo bullet comment->Coding to obtain the coded target barrage characteristic +.>And surrounding barrage feature->:
I.e.
wherein ,;
=/>;
it should be understood that the firstVideo surrounding barrage comment->Is characterized by->。
S3: based on the target barrage featureFor the surrounding barrage feature->After screening and filtering, connecting in series to obtain all surrounding barrage comments +.>;
Based on the characteristics of the video barrage, some surrounding useful surrounding barrage comments with the same emotion can be used as the context information of the target barrage comments to provide assistance, so that the video barrage comments can be utilized by the gating fusion moduleTo pair(s)Screening and filtering operation is carried out to obtain the +.>Comment on the surrounding bullet screen->:
wherein ,for post-selection->Comment on the surrounding bullet screen->Is->Video surrounding barrage comment->Is (are) peripheral features of->,/>Is a learnable gate matrix +.>For a learnable gate offset vector, the function +.>For ReLU function>Representing series connection,/->Representing the product> and />Are all learnable parameters, and parameter optimization is carried out in the combined model training process so as to achieve the expected effect by using the input model;
1 st to 1 stComment on the surrounding bullet screen->All surrounding barrage comments are obtained by connecting in series>:
wherein ,representing a series operation.
S4: video characterization through self-attention layers and cross-attention layersTarget barrage feature->Comment on surrounding barrage->Enhancement processing to obtain enhanced video features>Enhanced target barrage feature->And reinforcing the surrounding barrage->;
The multi-mode fusion module consists of an L-layer self-attention layer and a cross-attention layer, and is used for characterizing videoTarget barrage features/>Comment on surrounding barrage->As input of the first layer of the multi-mode fusion module, after multi-layer iteration (i.e. after processing of the L layers of self-attention layers and cross-attention layers), corresponding enhanced video features fused with other modes are obtained in the last layer>Enhanced target barrage feature->And reinforcing the surrounding barrage->。
In the first placeLayer input video feature->Obtaining the input video feature of the next layer +.>The following are provided:
in the first placeLayer input target barrage feature->Obtaining the input target barrage feature of the next layer>:
In the first placeLayer input surrounding barrage comment->Obtaining the comment +.>:
Where SA represents the self-attention layer and CA represents the cross-attention layer.
S5, performing S5; enhancement of video features based on multi-layered multi-headed pairs of attention layersEnhanced target barrage feature->Reinforcing surrounding barrage->Reconstructing to obtain reconstructed barrage comments, and constructing a barrage reconstructed loss function by using the reconstructed barrage comments and the real barrage comments by using cross entropy>;
The decoding module consists of a barrage reconstruction module and an emotion analysis module, and the decoding module encodes the enhanced video features obtained in the moduleEnhanced target barrage feature->Reinforcing surrounding barrage->As input;
and in the barrage reconstruction module, the reconstruction loss is analyzed and calculated by the barrage reconstruction module and added into closed-loop training to promote the learning effect of the multi-mode fusion module and promote the effect of the emotion analysis module.
The barrage reconstruction module consists of a plurality of multi-head attention layers, and a loss function of barrage reconstructionThe method comprises the following steps:
wherein ,representing batch processing, CE representing cross entropy loss, < ->Representing a reconstruction module->Comment of bullet generated by the reconstruction module is represented, < ->Indicating time->Is a true bullet comment;
specifically, the bullet comments generated by the reconstruction module are specifically expressed in the following form:
wherein Representing a multi-layer perceptron, LN representing regularization operation, MHA representing cross-multi-headed attention.
S6: for enhanced video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Sequentially carrying out regularization and normalization operations, and outputting the barrage comment +.>Corresponding to predicted barrage emotion +.>;
The emotion analysis module comprises a barrage emotion prediction module and a loss calculation module; wherein the bullet screen emotion predicted in the bullet screen emotion prediction moduleThe calculation formula is as follows:
wherein ,is a Softmax function, LN represents a layer regularization operation, +.>Representing a multi-layer sensor->For a learnable emotion prediction matrix, +.>Is a learnable video emotion matrix, +.>As a learnable ambient barrage emotion matrix,representing a learnable target barrage emotion matrix, < ->Represents a series operation, & represents a product, & lt + & gt>、/>、/>All are learnable parameters, and parameter optimization is performed in the combined model training process so as to achieve the expected effect by using the input model.
S7: predicted barrage emotion using cross entropyAnd true barrage emotion->Construction of a loss function for emotion prediction>Loss function based on barrage reconstruction>And loss of emotion predictionFunction->Calculating the overall loss function->Updating parameters of the joint model based on the total loss function and the back propagation algorithm until the performance of the joint model reaches a set expected value;
loss function for emotion prediction in loss calculation moduleThe construction formula is as follows:
wherein ,indicating batch processing, +.>For the predicted bullet screen emotion, the predicted bullet screen emotion is bullet screen comment +.>Predicted barrage emotion output through the joint model, < ->The true barrage emotion is barrage comment +.>Corresponding actual emotion;
overall loss functionThe calculation process of (2) is as follows:
wherein ,and representing the loss balance parameters, and updating the learnable parameters of the joint model based on the loss and the back propagation algorithm until the model performance achieves the expected effect.
First: step S3 provides a gating fusion mechanism, and utilizes the target barrage comments to carry out screening and filtering operation on surrounding barrage comments, so that some surrounding useful barrage comments with the same emotion can be used as context information of the target barrage comments to provide assistance, the problems that the barrage comments are short, insufficient context information exists and the like are solved, and the quality of the target barrage is improved.
Second,: step S4 provides a multi-mode fusion enhancement mechanism, video information is included through the multi-mode fusion module, the relation between a video theme and a barrage is fully considered, enhanced feature representation is obtained, and the performance of the joint model for carrying out emotion analysis on target barrage comments is improved.
Third, steps S5 to S7 provide a barrage reconstruction and emotion analysis mechanism, and the barrage reconstruction module is utilized to promote the overall learning effect of each module and the performance of the emotion analysis module.
The embodiment is mainly applied to emotion analysis of video real-time comments, for example, a comment is sent by a user at a certain moment, and emotion tendency of the comment is judged.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art, who is within the scope of the present invention, should make equivalent substitutions or modifications according to the technical scheme of the present invention and the inventive concept thereof, and should be covered by the scope of the present invention.
Claims (7)
1. The barrage emotion analysis method based on the joint model is characterized in that barrage comments are input into the trained joint model to output emotion tendencies corresponding to the barrage comments;
the training process of the joint model is as follows:
s1: constructing a training sample set, the training sample set comprising momentsBullet comment>Time->To->Inward bullet comment->Surrounding video->And comment on bullet screen->Video surrounding barrage comment +.>;
S2: for the videoCoding and concatenating to obtain coded video feature ∈ ->Comment on the barrage->And the video surrounding barrage comment +.>EncodingObtaining the coded target barrage characteristic +.>And surrounding barrage features;
S3: based on the target barrage featureFor the surrounding barrage feature->After screening and filtering, connecting in series to obtain all surrounding barrage comments +.>;
S4: video characterization through self-attention layers and cross-attention layersTarget barrage feature->Comment on surrounding barrage->Enhancement processing to obtain enhanced video features>Enhanced target barrage feature->And reinforcing the surrounding barrage->;
S5, performing S5; enhancement of video features based on multi-layered multi-headed pairs of attention layersEnhanced target barrage feature->Reinforcing surrounding barrage->Reconstructing to obtain reconstructed barrage comments, and constructing a barrage reconstructed loss function by using the reconstructed barrage comments and the real barrage comments by using cross entropy>;
S6: for enhanced video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Sequentially carrying out regularization and normalization operations, and outputting the barrage comment +.>Corresponding to predicted barrage emotion +.>;
S7: predicted barrage emotion using cross entropyAnd true barrage emotion->Constructing a loss function for emotion predictionLoss function based on barrage reconstruction>And loss function of emotion prediction->Calculated overall loss functionBased on the overall loss function->And updating parameters of the joint model by a back propagation algorithm until the performance of the joint model reaches a set expected value;
the surrounding barrage commentsThe calculation formula is as follows:
wherein ,for post-selection->Comment on the surrounding bullet screen->Is->Video surrounding barrage comment->Is (are) peripheral features of->,/>Is a learnable gate matrix +.>Is a learnable gate offset vector, +.>For ReLU function>Representing series connection,/->Representing the product.
2. The method of collaborative model-based barrage emotion analysis of claim 1, wherein the video featuresThe calculation formula of (2) is as follows:
the target barrage featureThe calculation formula of (2) is as follows:
the surrounding barrage featureThe calculation formula of (2) is as follows:
wherein ,,/>,/>representing series connection,/->Representing a video encoder>Representing a long and short term memory network.
3. The barrage emotion analysis method based on joint model as set forth in claim 1, characterized in that in step S4: video characterization through self-attention layers and cross-attention layersTarget barrage feature->Comment on surrounding bullet screenEnhancement processing to obtain enhanced video features>Enhanced target barrage feature->And reinforcing the surrounding barrageSpecifically, the method comprises the following steps:
characterizing videoTarget barrage feature->Comment on surrounding barrage->Inputting as a first layer of the self-attention layer and the cross-attention layer and performing an L-layer iteration, wherein the L-layer is the total layer number of the self-attention layer and the cross-attention layer;
in the first placeLayer input video feature->Obtaining the input video feature of the next layer +.>The following are provided:
in the first placeLayer input target barrage feature->Obtaining the input target barrage feature of the next layer>:
In the first placeLayer input surrounding barrage comment->Obtaining the comment +.>:
Where SA represents the self-attention layer and CA represents the cross-attention layer.
4. A combined model-based barrage emotion analysis method as claimed in claim 3, characterized in that in step S5, the barrage reconstructed loss functionThe construction formula is as follows:
wherein ,indicating batch processing, +.>Representing cross entropy loss, < >>Representing a reconstruction module->Comment of bullet generated by the reconstruction module is represented, < ->Indicating time->Is a true bullet comment;
specifically, the bullet comments generated by the reconstruction module are specifically expressed in the following form:
wherein ,representing a multi-layer perceptron, LN representing regularization operation, MHA representing cross-multi-headed attention.
5. The method of claim 4, wherein in step S6, predicted barrage emotion is predictedThe calculation formula is as follows:
wherein ,is a Softmax function, LN represents a layer regularization operation, +.>Representing a multi-layer sensor->For a learnable emotion prediction matrix, +.>Is a learnable video emotion matrix, +.>Is a surrounding barrage emotion matrix which can be learned, < + >>Representing a learnable target barrage emotion matrix, < ->Representing a tandem operation, representing a product.
6. The method of collaborative model-based barrage emotion analysis according to claim 5, wherein in step S7, the emotion predicted penalty functionThe construction formula is as follows:
the overall loss functionThe calculation process of (2) is as follows:
wherein ,for predicted barrage emotion, +.>Is true barrage emotion +.>Representing cross entropy loss, < >>Representing loss balance parameters, +.>Indicating batch processing.
7. The barrage emotion analysis system based on the joint model is characterized in that barrage comments are input into the trained joint model to output emotion tendencies corresponding to the barrage comments;
the analysis system comprises a construction module, a video coding module, a text coding module, a door control fusion module, a multi-mode fusion module, a barrage reconstruction module, a barrage emotion prediction module and a loss calculation module;
the construction module is used for constructing a training sample set, and the training sample set comprises momentsBullet comment>Time of dayTo->Inward bullet comment->Surrounding video->And comment on bullet screen->Video surrounding barrage comment +.>;
The video coding module is used for coding the videoCoding and concatenating to obtain coded video feature ∈ ->;
The text coding module is used for commenting on the barrageAnd the video surrounding barrage comment +.>Coding to obtain the coded target barrage characteristic +.>And surrounding barrage feature->;
The gating fusion module is based on the target barrage characteristicsFor the surrounding barrage feature->After screening and filtering, connecting in series to obtain all surrounding barrage comments +.>;
The multi-mode fusion module is used for video features through the self-attention layer and the cross-attention layerTarget barrage feature->Comment on surrounding barrage->Processing to obtain enhanced video features->Enhanced target barrage feature->And reinforcing the surrounding barrage->;
The bullet screen reconstruction module is used for enhancing vision based on multi-layer multi-head attention layer pairsFrequency characteristicsEnhanced target barrage feature->Reinforcing surrounding barrage->Reconstructing to obtain reconstructed barrage comments, and constructing a barrage reconstructed loss function by using the reconstructed barrage comments and the real barrage comments by using cross entropy>;
The barrage emotion prediction module is used for enhancing video featuresEnhanced target barrage feature->Reinforcing surrounding barrage->Sequentially carrying out regularization and normalization operations, and outputting the barrage comment +.>Corresponding to predicted barrage emotion +.>;
The loss calculation module is used for predicting bullet screen emotion by using cross entropyAnd true barrage emotion->Construction of a loss function for emotion prediction>Loss function based on barrage reconstruction>And loss function of emotion prediction->Calculating the overall loss function->Updating parameters of the joint model based on the overall loss function and the back propagation algorithm until the performance of the joint model reaches a set expectation;
the surrounding barrage commentsThe calculation formula is as follows:
wherein ,for post-selection->Comment on the surrounding bullet screen->Is->Video surrounding barrage comment->Is (are) peripheral features of->,/>Is a learnable gate matrix +.>Is a learnable gate offset vector, +.>For ReLU function>Representing series connection,/->Representing the product.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310458854.3A CN116189064B (en) | 2023-04-26 | 2023-04-26 | Barrage emotion analysis method and system based on joint model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310458854.3A CN116189064B (en) | 2023-04-26 | 2023-04-26 | Barrage emotion analysis method and system based on joint model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116189064A CN116189064A (en) | 2023-05-30 |
CN116189064B true CN116189064B (en) | 2023-08-29 |
Family
ID=86446571
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310458854.3A Active CN116189064B (en) | 2023-04-26 | 2023-04-26 | Barrage emotion analysis method and system based on joint model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116189064B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101782898A (en) * | 2010-03-25 | 2010-07-21 | 中国科学院计算技术研究所 | Method for analyzing tendentiousness of affective words |
CN111683294A (en) * | 2020-05-08 | 2020-09-18 | 华东师范大学 | Bullet screen comment recommendation method for information extraction |
CN112036187A (en) * | 2020-07-09 | 2020-12-04 | 上海极链网络科技有限公司 | Context-based video barrage text auditing method and system |
CN114119136A (en) * | 2021-10-29 | 2022-03-01 | 中国工商银行股份有限公司 | Product recommendation method and device, electronic equipment and medium |
US11302360B1 (en) * | 2020-10-08 | 2022-04-12 | Adobe Inc. | Enhancing review videos |
CN114817536A (en) * | 2022-04-20 | 2022-07-29 | 上海电力大学 | Network short text sentiment analysis method based on fusion characteristics |
CN114896386A (en) * | 2021-09-24 | 2022-08-12 | 武汉工程大学 | Film comment semantic emotion analysis method and system based on BilSTM |
WO2022183138A2 (en) * | 2021-01-29 | 2022-09-01 | Elaboration, Inc. | Automated classification of emotio-cogniton |
CN115361595A (en) * | 2022-07-28 | 2022-11-18 | 华中科技大学 | Video bullet screen generation method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108363790B (en) * | 2018-02-12 | 2021-10-22 | 百度在线网络技术(北京)有限公司 | Method, device, equipment and storage medium for evaluating comments |
US10762154B2 (en) * | 2018-08-10 | 2020-09-01 | International Business Machines Corporation | Relative weighting for social collaboration comments |
US11010561B2 (en) * | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
-
2023
- 2023-04-26 CN CN202310458854.3A patent/CN116189064B/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101782898A (en) * | 2010-03-25 | 2010-07-21 | 中国科学院计算技术研究所 | Method for analyzing tendentiousness of affective words |
CN111683294A (en) * | 2020-05-08 | 2020-09-18 | 华东师范大学 | Bullet screen comment recommendation method for information extraction |
CN112036187A (en) * | 2020-07-09 | 2020-12-04 | 上海极链网络科技有限公司 | Context-based video barrage text auditing method and system |
US11302360B1 (en) * | 2020-10-08 | 2022-04-12 | Adobe Inc. | Enhancing review videos |
WO2022183138A2 (en) * | 2021-01-29 | 2022-09-01 | Elaboration, Inc. | Automated classification of emotio-cogniton |
CN114896386A (en) * | 2021-09-24 | 2022-08-12 | 武汉工程大学 | Film comment semantic emotion analysis method and system based on BilSTM |
CN114119136A (en) * | 2021-10-29 | 2022-03-01 | 中国工商银行股份有限公司 | Product recommendation method and device, electronic equipment and medium |
CN114817536A (en) * | 2022-04-20 | 2022-07-29 | 上海电力大学 | Network short text sentiment analysis method based on fusion characteristics |
CN115361595A (en) * | 2022-07-28 | 2022-11-18 | 华中科技大学 | Video bullet screen generation method |
Non-Patent Citations (1)
Title |
---|
基于卷积神经网络的语种识别系统;金马,宋彦,戴礼荣;<数据采集与处理>;第34卷(第2期);全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN116189064A (en) | 2023-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108681610B (en) | generating type multi-turn chatting dialogue method, system and computer readable storage medium | |
ALIAS PARTH GOYAL et al. | Z-forcing: Training stochastic recurrent networks | |
CN107608943B (en) | Image subtitle generating method and system fusing visual attention and semantic attention | |
CN111860785A (en) | Time sequence prediction method and system based on attention mechanism cyclic neural network | |
CN107484017A (en) | Supervision video abstraction generating method is had based on attention model | |
CN111274398A (en) | Method and system for analyzing comment emotion of aspect-level user product | |
CN111626764A (en) | Commodity sales volume prediction method and device based on Transformer + LSTM neural network model | |
CN110083702B (en) | Aspect level text emotion conversion method based on multi-task learning | |
CN117475038B (en) | Image generation method, device, equipment and computer readable storage medium | |
CN113128527B (en) | Image scene classification method based on converter model and convolutional neural network | |
CN113989933B (en) | Online behavior recognition model training and detecting method and system | |
CN112527966A (en) | Network text emotion analysis method based on Bi-GRU neural network and self-attention mechanism | |
CN115841119B (en) | Emotion cause extraction method based on graph structure | |
CN112529071B (en) | Text classification method, system, computer equipment and storage medium | |
Skatchkovsky et al. | Learning to time-decode in spiking neural networks through the information bottleneck | |
CN114444561A (en) | PM2.5 prediction method based on CNNs-GRU fusion deep learning model | |
US20240104352A1 (en) | Contrastive Learning and Masked Modeling for End-To-End Self-Supervised Pre-Training | |
Zhu et al. | Parallel interaction spatiotemporal constrained variational autoencoder for soft sensor modeling | |
Koryakovskiy et al. | One-shot model for mixed-precision quantization | |
CN110175338B (en) | Data processing method and device | |
CN115269836A (en) | Intention identification method and device | |
CN116189064B (en) | Barrage emotion analysis method and system based on joint model | |
CN114501031B (en) | Compression coding and decompression method and device | |
CN116863920B (en) | Voice recognition method, device, equipment and medium based on double-flow self-supervision network | |
CN116384340B (en) | Real-time barrage emotion analysis method based on variation cross-modal characterization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |