
Panorama emotion recognition method and system based on multi-angle sub-region adaptation (CN113673567A)

Info

Publication number
CN113673567A
Authority
CN
China
Prior art keywords
emotion
feature
sub
panorama
region
Prior art date
Legal status
Granted
Application number
CN202110816786.4A
Other languages
Chinese (zh)
Other versions
CN113673567B (en)
Inventor
青春美
黄容
徐向民
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202110816786.4A
Publication of CN113673567A
Application granted
Publication of CN113673567B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/06 Topological mapping of higher dimensional structures onto lower dimensional surfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G06T 3/047 Fisheye or wide-angle transformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/12 Edge-based segmentation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computational Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Analysis (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Algebra (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a panorama emotion recognition method and system based on multi-angle sub-region adaptation. A spherical multi-angle rotation algorithm generates a series of equirectangular (equidistant cylindrical) projection panoramas, which are fed into a convolutional neural network to obtain feature maps at different levels. Global features guide local features to adaptively establish the relevance of context features at the current scale and to capture the global and local context dependencies of the feature maps at each level. The feature maps of the different levels are then upsampled and concatenated along the channel dimension to realize feature fusion, from which the user's emotion classification label is obtained. The invention can correctly predict the user's emotional preference and its distribution across a variety of scenes, improving the user experience in VR.

Description

Panorama emotion recognition method and system based on multi-angle sub-region adaptation
Technical Field
The invention relates to the field of emotion recognition, in particular to a panorama emotion recognition method and system based on multi-angle sub-region adaptation.
Background
Emotion is a psychological and physiological state accompanied by cognitive and conscious processes, and research on human emotion and cognition represents an advanced stage of artificial intelligence. With the vigorous development of artificial intelligence and deep learning, it has become possible to build emotion models capable of perceiving, recognizing and understanding human emotion. Giving machines the ability to provide intelligent, sensitive and friendly feedback on a user's emotions would ultimately create a natural environment in which people coexist harmoniously with one another and with machines, a vision that points to new directions for future computer applications.
Traditional emotion induction relies on stimuli such as pictures, text, speech and video, and the actual prediction performance on the corresponding emotion recognition datasets is unsatisfactory. Virtual reality technology induces emotion through immersive, vivid, three-dimensional experiences and is therefore a better emotion-inducing medium. In recent years deep learning has proved revolutionary in practice, but for emotional interaction, emotion label data collected under virtual-reality-evoked states are scarce, and effective emotion research methods and models are lacking. A panorama stores omnidirectional, real spatial information on a two-dimensional plane and can serve as effective material for analyzing emotion in an immersive VR environment.
Disclosure of Invention
In order to overcome the defects and shortcomings of the prior art, the invention provides a panorama emotion recognition method and system based on multi-angle sub-region adaptation.
Based on the display characteristics of panoramic content in a head-mounted display and the equirectangular (equidistant cylindrical) projection format, the invention designs a spherical multi-angle rotation algorithm to obtain panoramas at different angles and combines them with a context-adaptive convolutional neural network, effectively improving the accuracy of emotion classification labels.
The invention adopts the following technical scheme:
a panorama emotion recognition method based on multi-angle subregion adaptation comprises the following steps:
multi-angle rotation step: the conversion from a three-dimensional omnibearing stereoscopic view to a two-dimensional plane panorama is realized by adopting spherical multi-angle rotation and equidistant columnar projection;
a characteristic extraction step: extracting the characteristics of the two-dimensional planar panoramic image by using a pre-trained convolutional neural network model to obtain characteristic images of different levels;
sub-region adaptation step: inputting feature maps of different levels, searching for global and local relevance, adaptively establishing context features of the current scale, and capturing global and local context dependencies of the feature maps of different levels;
multi-scale fusion step: unifying the sizes of the feature maps of different levels through an up-sampling step, and splicing the feature maps on the channel dimension to realize multi-scale feature fusion;
and (3) emotion classification step: and determining the target emotion according to the advantages of the different level features, and outputting a corresponding emotion label.
Further, the spherical multi-angle rotation specifically comprises:
establishing a three-dimensional spherical coordinate system with the user's head as the sphere center, and projecting the 360-degree panorama presented to the user in a head-mounted display onto the surface of the sphere;
rotating the projection according to the content distribution of the panorama;
the rotation comprises horizontal rotation and vertical rotation: horizontal rotation brings the clipped edge content on both sides into the central main viewing area, and vertical rotation brings the severely distorted polar content to near the equator.
Further, the equirectangular projection maps meridians to vertical lines of constant spacing and parallels to horizontal lines of constant spacing, projecting the three-dimensional stereoscopic view onto the two-dimensional panorama.
Further, the three-dimensional spherical coordinate system is right-handed with a 90-degree field of view per face; taking the user's straight-ahead binocular viewing direction as the horizontal axis, the front viewport center is [0, 0, 0], the right viewport center is [90, 0, 0], the back viewport center is [180, 0, 0], the left viewport center is [-90, 0, 0], the upper viewport center is [0, 90, 0], and the lower viewport center is [0, -90, 0], corresponding to the six faces of a cube tangent to the sphere.
Further, the feature extraction step specifically comprises:
inputting the two-dimensional panorama into a pre-trained convolutional neural network and extracting the hierarchy of feature spaces general to the visual world, forming a feature vector set [X_1, X_2, ..., X_l], where each element of the set represents the feature map of the corresponding level.
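As an illustration, the following is a minimal PyTorch sketch of this step. The torchvision ResNet-50 backbone and the use of its four residual stages as the level-wise feature maps are assumptions made for illustration; the text does not fix a particular architecture.

```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

# Pre-trained backbone; its four residual stages stand in for the
# hierarchy of feature maps [X_1, X_2, ..., X_l] (here l = 4).
backbone = resnet50(weights="IMAGENET1K_V2")
extractor = create_feature_extractor(
    backbone, return_nodes={f"layer{i}": f"X{i}" for i in range(1, 5)})

panorama = torch.randn(1, 3, 512, 1024)   # one 2:1 equirectangular panorama
features = extractor(panorama)            # dict of feature maps, one per level
for name, fmap in features.items():
    print(name, tuple(fmap.shape))        # channels and spatial size per level
```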
Further, the sub-region adaptation step comprises a sub-region content characterization branch and an emotion contribution characterization branch;
the sub-region content characterization branch applies an adaptive average pooling operation to an input feature map of size h×w×c to obtain the sub-region content characterization y_s, where h, w, c and s denote the height, width, number of channels and preset sub-region size of the feature map, respectively;
the emotion contribution characterization branch specifically comprises:
performing global average pooling on each element of the feature vector set [X_1, X_2, ..., X_l] to obtain a global information characterization g(X_l) of size 1×1×c;
adding the global information characterization g(X_l) to the input feature map element by element via a broadcast mechanism to realize a residual connection, and converting the number of channels to s² by a 1×1 convolution, thereby constructing an adaptive emotion contribution matrix a_s of size hw×s²;
multiplying the adaptive emotion contribution matrix a_s with the sub-region content characterization y_s to obtain the context feature characterization vector Z_l, which represents the degree of association between each pixel i and each sub-region y_{s×s}.
Further, the adaptive average pooling divides the input feature map into s×s sub-regions, yielding a set of sub-region characterizations Y_{s×s} = [y_1, y_2, ..., y_{s×s}]; the feature map of size s×s×c is then reshaped into the sub-region content characterization y_s of size s²×c.
Further, constructing the emotion contribution matrix a_s specifically comprises: let a_i denote the contribution of sub-region y_{s×s} to the emotion classification label at point i of the feature map; then each point i of the feature map corresponds to s×s emotion contribution values a_i, forming the set A_i = [a_1, a_2, ..., a_{s×s}], which is reshaped to obtain the emotion contribution matrix a_s of size hw×s².
Further, the multi-scale fusion step specifically comprises: unifying the sizes of the multi-level feature maps with an upsampling operation such as deconvolution or interpolation, and completing feature fusion by concatenation along the channel dimension, finally obtaining a representation of size H×W×(C_1+C_2+...+C_l) in which low-level detail information is combined with high-level semantic information.
A system for realizing the panorama emotion recognition method based on multi-angle sub-region adaptation comprises:
a multi-angle rotation module: for converting the three-dimensional panoramic view into the two-dimensional panorama by multi-angle rotation and equirectangular projection;
a feature extraction module: for extracting features of the two-dimensional panorama to obtain feature maps at different levels;
a sub-region adaptation module: for correlating regions with consistent emotion classification labels; global features guide local features to adaptively establish the relevance of context features at the current scale and capture long-distance dependencies;
a multi-scale fusion module: for unifying the sizes of the feature maps at different levels and concatenating them along the channel dimension to realize multi-scale feature fusion;
an emotion classification module: for determining the target emotion from the complementary strengths of the different-level features and outputting the corresponding emotion label.
The invention has the following beneficial effects:
1. Aiming at the scarcity of emotion label data under virtual-reality-induced states, a spherical multi-angle rotation algorithm is proposed to realize data augmentation. A three-dimensional spherical coordinate system is established for the 360-degree view in the user's virtual environment; the sphere is rotated at multiple angles about different coordinate axes and then projected equirectangularly to obtain expanded data samples, effectively improving the generalization ability of the model.
2. Equirectangular projection maps meridians and parallels onto a rectangular plane at equal intervals, which severely distorts panoramic content at the upper and lower poles. The data samples expanded by the spherical multi-angle rotation algorithm preserve rotation invariance; while distortion is alleviated, edge information from both sides is rotated into the central main viewing area, so the emotion model can capture and extract content features well, improving the recognition accuracy of the model.
3. The pre-trained convolutional neural network extracts features of the panorama at different levels, exploiting the complementary advantages of low-level detail information and high-level semantic information. Global features guide local features, adaptively establishing relevance between different regions or objects of the feature map and capturing long-distance dependencies. This effectively improves the model's prediction of the emotion-inducing regions of the panorama.
4. The method fills a gap in the field of panoramic image emotion recognition, helps read the user's emotion and collect feedback in an immersive virtual environment, and is important for developing VR applications such as user behavior prediction and VR scene modeling.
Drawings
FIG. 1 is a flow chart of the overall method of practicing the present invention.
FIG. 2 is a schematic diagram of a user head-mounted display in a virtual environment.
Fig. 3(a) and 3(b) are schematic diagrams of three-dimensional spherical coordinates and a projected two-dimensional plane, respectively.
FIG. 4 is a schematic diagram illustrating the effect of a multi-angle rotation algorithm rotating 180 degrees along the x-axis.
FIG. 5 is a diagram of a sub-region adaptation module according to the present invention.
FIG. 6 is a schematic diagram of a model framework of an overall implementation of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited to these examples.
Examples
As shown in fig. 1, a panorama emotion recognition method based on multi-angle sub-region adaptation is used for recognizing and predicting user emotion in an immersive virtual environment, and includes the following steps:
and the multi-angle rotation module is used for presenting an interactive 360-degree view of the immersive virtual environment to the user, and as shown in fig. 2, a series of data expansion samples are obtained by adopting a spherical multi-angle rotation algorithm. And the longitude lines are mapped into vertical lines with constant intervals by utilizing equidistant columnar projection, and the latitude lines are mapped into horizontal lines with constant intervals, so that the conversion from a three-dimensional omnibearing stereoscopic view to a two-dimensional plane panorama is completed.
The HMD in fig. 2 represents a head mounted display.
The spherical multi-angle rotation algorithm is as follows: a three-dimensional Cartesian coordinate system is established with the user's head as the center of the sphere. The sphere is rotated by a fixed angle about the horizontal axis in sequence, so that the severely distorted objects at the two poles rotate toward the equator from multiple angles and the distortion is improved. Meanwhile, the sphere is rotated by a fixed angle about the vertical axis in sequence, so that the clipped edge content on both sides rotates into the central main viewing area.
With the multi-angle rotation algorithm, the emotion-inducing region of the panorama is rotated toward the main view near the equator according to the content distribution of the panorama, reducing the adverse effect of distorted projection and making it easier for the model to capture the relevant features.
The rotation comprises horizontal rotation and vertical rotation: horizontal rotation brings the clipped edge content on both sides into the central main viewing area, and vertical rotation brings the severely distorted polar content to near the equator.
Further, the spherical multi-angle rotation algorithm specifically comprises the following steps:
a three-dimensional spherical coordinate system with the user's head as the origin o is constructed following the right-hand rule, as shown in fig. 3(a). Using the spherical multi-angle rotation algorithm, the sphere is rotated 90 degrees horizontally, repeated 2 times, so that the clipped edge content on both sides rotates into the central main viewing area, as shown in fig. 4. The sphere is then rotated 45 degrees vertically, repeated 4 times, so that the severely distorted objects at the two poles rotate toward the equator and distortion is improved. Each panorama thus yields 2 × 4 = 8 data-enhanced results.
Let the height of the panorama be H and the width be W. For any point (u, v) on the plane, let (x, y, z) be the corresponding point on the three-dimensional sphere, and let (λ, φ) be its longitude and latitude:

λ = 2π · (u/W - 1/2),  φ = π · (1/2 - v/H)

The relationship between the longitude/latitude and the spherical coordinates (on the unit sphere) is:

x = cos φ · cos λ,  y = sin φ,  z = cos φ · sin λ

The conversion of the same point between three-dimensional space and the two-dimensional plane is therefore:

u = W · (λ/(2π) + 1/2),  v = H · (1/2 - φ/π),  with λ = arctan2(z, x) and φ = arcsin(y).
Meridians are thus mapped to vertical lines of constant spacing and parallels to horizontal lines of constant spacing, as shown in fig. 3(b).
In the emotion recognition setting, because of the content distortion inherent in the panorama's ERP storage format, the multi-angle algorithm must rotate the emotion-inducing object or region toward the main view near the equator so that the equirectangular projection places it at the center of the two-dimensional plane, making it easier for the model to capture the relevant features. However, different panoramas need different rotation angles, and manually customizing each panorama is impractical. In general, this is achieved by rotating the sphere horizontally by 90 degrees 2 times and then by 45 degrees along the x-axis 4 times, each panorama yielding 2 × 4 = 8 results.
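For concreteness, the following numpy sketch implements the spherical multi-angle rotation by inverse mapping, using the plane-sphere conversion formulas above. The axis conventions, the nearest-neighbour resampling and the exact angle set are illustrative assumptions.

```python
import numpy as np

def rotate_erp(img: np.ndarray, yaw_deg: float = 0.0, pitch_deg: float = 0.0) -> np.ndarray:
    """Rotate an equirectangular panorama on the sphere and re-project it."""
    H, W = img.shape[:2]
    v, u = np.mgrid[0:H, 0:W]
    # Output pixel -> longitude/latitude (formulas above).
    lon = 2.0 * np.pi * (u / W - 0.5)
    lat = np.pi * (0.5 - v / H)
    # Longitude/latitude -> unit-sphere point (right-handed, y up).
    pts = np.stack([np.cos(lat) * np.cos(lon),
                    np.sin(lat),
                    np.cos(lat) * np.sin(lon)], axis=-1)
    # Horizontal (yaw, about the vertical axis) and vertical (pitch, about
    # the x-axis) rotations; row-vector multiplication applies the inverse.
    yaw, pitch = np.deg2rad(yaw_deg), np.deg2rad(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(yaw), 0.0, np.cos(yaw)]])
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(pitch), -np.sin(pitch)],
                   [0.0, np.sin(pitch), np.cos(pitch)]])
    pts = pts @ (Rx @ Ry)
    # Sphere point -> source longitude/latitude -> source pixel.
    lon_s = np.arctan2(pts[..., 2], pts[..., 0])
    lat_s = np.arcsin(np.clip(pts[..., 1], -1.0, 1.0))
    u_s = ((lon_s / (2.0 * np.pi) + 0.5) * W).astype(int) % W
    v_s = np.clip(((0.5 - lat_s / np.pi) * H).astype(int), 0, H - 1)
    return img[v_s, u_s]

# Eight augmented views per panorama (2 horizontal x 4 vertical angles), e.g.:
# views = [rotate_erp(img, yaw, pitch)
#          for yaw in (0, 90) for pitch in (0, 45, 90, 135)]
```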
Feature extraction module: feature extraction is realized with a convolutional neural network pre-trained on a large-scale image classification task. For an input image I, the formula X_l = f(Σ k_l · X_{l-1} + b_l) extracts the hierarchy of feature spaces general to the visual world, forming the feature vector set [X_1, X_2, ..., X_l], where k_l is the convolution kernel of layer l, X_{l-1} is the feature map output by layer l-1, b_l is the bias term, and f is the activation function. Each element of the set represents the feature map of the corresponding layer and serves as input to the sub-region adaptation module, exploiting the complementary advantages of information at different layers.
Sub-region adaptation module: as shown in fig. 5, the module adaptively establishes the context features of the current scale by finding global and local relevance, capturing the global and local context dependencies of the feature maps at different levels. It consists of a sub-region content characterization branch and an emotion contribution characterization branch, as follows:
The sub-region content characterization branch applies adaptive average pooling to each element of the feature vector set [X_1, X_2, ..., X_l]. The adaptive average pooling function is defined as:
kernel_size = (input_size + 2 × padding) - (output_size - 1) × stride
that is, the input size, output size, boundary padding and stride determine the current kernel size. A feature map X_l of size h×w×c is converted to size s×s×c, where h, w, c and s denote the height, width, number of channels and preset size, respectively. Adaptive average pooling thus divides the input feature map into s×s sub-regions, yielding a set of sub-region characterizations Y_{s×s} = [y_1, y_2, ..., y_{s×s}]. The feature map of size s×s×c is reshaped into the sub-region content characterization y_s of size s²×c.
The emotion contribution characterization branch applies global average pooling to each element of the feature vector set [X_1, X_2, ..., X_l] to obtain a global information characterization g(X_l) of size 1×1×c. Using a broadcast mechanism, the 1×1×c global information characterization is added element-wise to the input feature map to realize a residual connection, giving a feature map of size h×w×c.
Let a_i denote the contribution of sub-region y_{s×s} to the emotion classification label at point i of the feature map. A 1×1 convolution converts the number of channels to s², so that each point i of the feature map corresponds to s×s emotion contribution values a_i, forming the set A_i = [a_1, a_2, ..., a_{s×s}], which is reshaped to obtain the adaptive emotion contribution matrix a_s of size hw×s².
The emotion contribution matrix a_s output by the emotion contribution characterization branch is multiplied with the sub-region content characterization y_s output by the sub-region content characterization branch:

Z_l = a_s · y_s

The resulting context feature characterization vector Z_l represents the degree of association between each pixel i and each sub-region y_{s×s}; the emotion contribution vectors A_i implicit within it characterize the global and local connection weights and are automatically optimized as the network iterates.
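The two branches can be sketched as a PyTorch module as follows. This is an illustrative reading of the description above; details such as the softmax normalisation of the contribution matrix and reshaping Z_l back to spatial layout are assumptions not spelled out in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubRegionAdaptation(nn.Module):
    """Sub-region content branch + emotion contribution branch (sketch)."""

    def __init__(self, channels: int, s: int):
        super().__init__()
        self.content_pool = nn.AdaptiveAvgPool2d(s)        # s x s sub-regions
        self.contrib_conv = nn.Conv2d(channels, s * s, 1)  # 1x1 conv -> s^2 channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Content branch: y_s of shape (b, s^2, c).
        y_s = self.content_pool(x).flatten(2).transpose(1, 2)
        # Contribution branch: global pooling, broadcast residual add,
        # 1x1 convolution, reshape to (b, hw, s^2).
        g = F.adaptive_avg_pool2d(x, 1)          # g(X_l): (b, c, 1, 1)
        a_s = self.contrib_conv(x + g)           # (b, s^2, h, w)
        a_s = a_s.flatten(2).transpose(1, 2)     # (b, hw, s^2)
        a_s = a_s.softmax(dim=-1)                # normalisation (assumption)
        # Z_l = a_s . y_s: (b, hw, c), the pixel-to-sub-region associations.
        z = torch.bmm(a_s, y_s)
        return z.transpose(1, 2).reshape(b, c, h, w)

# Example: out = SubRegionAdaptation(channels=256, s=4)(torch.randn(2, 256, 32, 64))
```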
Further, dependency here refers to an association between two or more emotional subjects. Using the global and local features of the panorama, the feature extraction module can identify different regions or objects, such as an emotional subject (a person) and a cat, but this alone is not a sufficient criterion for emotion prediction. The relevance between the person and the cat, e.g. the person petting or caring for the cat, must also be established adaptively by the sub-region adaptation module in order to assign the correct positive emotion label.
Multi-scale fusion module: realizes feature fusion of the feature maps at different levels. An upsampling operation unifies the sizes of the feature maps, which are then concatenated along the channel dimension, finally obtaining a representation of size H×W×(C_1+C_2+...+C_l) in which the low-level geometric information is combined with the high-level semantic information.
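A minimal sketch of this step, using bilinear interpolation as the upsampling operation (deconvolution would serve equally, per the description above):

```python
import torch
import torch.nn.functional as F

def fuse_multiscale(feature_maps, size):
    """Upsample each (B, C_i, h_i, w_i) map to `size` = (H, W), then
    concatenate along channels: result is (B, C_1 + ... + C_l, H, W)."""
    upsampled = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                 for f in feature_maps]
    return torch.cat(upsampled, dim=1)
```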
Emotion classification module: achieves a strong emotion classification effect on panoramas both with and without a salient subject. Because the fully connected layer is parameter-redundant, global average pooling replaces it as the "classifier". Panoramas containing a salient subject are recognized using deep features, which attend more to abstract semantic information; panoramas without a salient subject are recognized using shallow features, which provide detail-aware information about edges, textures, colors and the like. This yields emotion classification labels of higher accuracy; the overall framework of the model is shown in fig. 6.
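A sketch of such a classification head, with global average pooling standing in for the fully connected "classifier"; the single linear output layer and the binary polarity output are assumptions consistent with the embodiment described below.

```python
import torch
import torch.nn as nn

class EmotionHead(nn.Module):
    """Global-average-pooling classification head (sketch)."""

    def __init__(self, fused_channels: int, num_classes: int = 2):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)               # replaces the FC "classifier"
        self.fc = nn.Linear(fused_channels, num_classes)

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        pooled = self.gap(fused).flatten(1)              # (B, C_total)
        return self.fc(pooled)                           # emotion polarity logits
```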
The features extracted by the different convolution levels of the feature extraction module differ: low-level convolutions such as conv layers 1-2 extract visual-level features such as color, texture and contour, while high-level convolutions such as conv layers 4-5 extract object-level and concept-level features, i.e. abstract semantic information. Predicting the emotional regions of different (or the same) panoramas requires combining the feature advantages of different levels: if the panorama content is a single plain natural scene, low-level color and texture information is the key to correct classification; if it is a complex multi-object interactive scene, high-level semantic information matters more. By establishing the relevance between different regions and objects of the feature map, the sub-region adaptation module helps capture the emotion-inducing region better and thus assign the correct emotion label.
In this embodiment, the feature extraction module extracts the 4-level feature maps of conv layers 2, 3, 4 and 5; the feature map of each level is fed to the sub-region adaptation module, which establishes the relevance of different regions at scales s = 1, 2, 4, ..., n (s is not limited in principle; in general the combination of 1, 2 and 4 works best). Because the feature maps of different levels differ in size, the multi-scale fusion module first unifies their scale and then concatenates all feature maps along the channel dimension; the concatenated overall feature serves as the basis for emotion classification, finally giving the emotion polarity of the input panorama, i.e. positive or negative.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. A panorama emotion recognition method based on multi-angle sub-region adaptation, characterized by comprising the following steps:
a multi-angle rotation step: converting the three-dimensional omnidirectional stereoscopic view into a two-dimensional planar panorama by spherical multi-angle rotation and equirectangular projection;
a feature extraction step: extracting features of the two-dimensional panorama with a pre-trained model to obtain feature maps at different levels;
a sub-region adaptation step: taking the feature maps of different levels as input, finding global and local relevance, adaptively establishing context features at the current scale, and capturing the global and local context dependencies of the feature maps at each level;
a multi-scale fusion step: concatenating the feature maps of different levels along the channel dimension to realize multi-scale feature fusion;
an emotion classification step: determining the target emotion from the complementary strengths of the different-level features, and outputting the corresponding emotion label.
2. The panorama emotion recognition method of claim 1, wherein the spherical multi-angle rotation specifically comprises:
establishing a three-dimensional spherical coordinate system with the user's head as the sphere center, and projecting the 360-degree panorama presented to the user in a head-mounted display onto the surface of the sphere;
rotating the projection according to the content distribution of the panorama;
the rotation comprising horizontal rotation and vertical rotation, wherein horizontal rotation brings the clipped edge content on both sides into the central main viewing area, and vertical rotation brings the severely distorted polar content to near the equator.
3. The panorama emotion recognition method of claim 1, wherein the equirectangular projection maps meridians to vertical lines of constant spacing and parallels to horizontal lines of constant spacing, projecting the three-dimensional stereoscopic view onto the two-dimensional panorama.
4. The panorama emotion recognition method of claim 2, wherein the three-dimensional spherical coordinate system is right-handed with a 90-degree field of view per face; taking the user's straight-ahead binocular viewing direction as the horizontal axis, the front viewport center is [0, 0, 0], the right viewport center is [90, 0, 0], the back viewport center is [180, 0, 0], the left viewport center is [-90, 0, 0], the upper viewport center is [0, 90, 0], and the lower viewport center is [0, -90, 0], corresponding to the six faces of a cube tangent to the sphere.
5. The panorama emotion recognition method of claim 1, wherein the feature extraction step specifically comprises:
inputting the two-dimensional panorama into a pre-trained convolutional neural network and extracting the hierarchy of feature spaces general to the visual world, forming a feature vector set [X_1, X_2, ..., X_l], where each element of the set represents the feature map of the corresponding level.
6. The panorama emotion recognition method of claim 1, wherein the sub-region adaptation step comprises two branches, a sub-region content characterization branch and an emotion contribution characterization branch;
the sub-region content characterization branch applies an adaptive average pooling operation to an input feature map of size h×w×c to obtain the sub-region content characterization y_s, where h, w, c and s denote the height, width, number of channels and preset size of the feature map, respectively;
the emotion contribution characterization branch specifically comprises:
performing global average pooling on each element of the feature vector set [X_1, X_2, ..., X_l] to obtain a global information characterization g(X_l) of size 1×1×c;
adding the global information characterization g(X_l) to the input feature map element by element via a broadcast mechanism to realize a residual connection, and converting the number of channels to s² by a 1×1 convolution, thereby constructing an adaptive emotion contribution matrix a_s of size hw×s²;
multiplying the adaptive emotion contribution matrix a_s with the sub-region content characterization y_s to obtain the context feature characterization vector Z_l, which represents the degree of association between each pixel i and each sub-region y_{s×s}.
7. The panorama emotion recognition method of claim 6, wherein constructing the adaptive emotion contribution matrix a_s of size hw×s² specifically comprises: let a_i denote the contribution of sub-region y_{s×s} to the emotion classification label at point i of the feature map; a 1×1 convolution converts the number of channels to s², so that each point i of the feature map corresponds to s×s emotion contribution values a_i, forming the set A_i = [a_1, a_2, ..., a_{s×s}], which is reshaped to obtain the adaptive emotion contribution matrix a_s of size hw×s².
8. The panorama emotion recognition method of claim 6, wherein the adaptive average pooling divides the input feature map into s×s sub-regions, yielding a set of sub-region characterizations Y_{s×s} = [y_1, y_2, ..., y_{s×s}]; the feature map of size s×s×c is reshaped into the sub-region content characterization y_s of size s²×c.
9. A system for realizing the panorama emotion recognition method of any one of claims 1-8, characterized in that the system comprises:
a multi-angle rotation module: for converting the three-dimensional panoramic view into the two-dimensional panorama by multi-angle rotation and equirectangular projection;
a feature extraction module: for extracting features of the two-dimensional panorama to obtain feature maps at different levels and capturing the global and local context dependencies of the feature maps;
a sub-region adaptation module: for correlating regions with consistent emotion classification labels, adaptively establishing context features at the current scale by finding global and local relevance;
a multi-scale fusion module: for concatenating the feature maps along the channel dimension to perform multi-scale feature fusion;
an emotion classification module: for determining the target emotion from the complementary strengths of the different-level features and outputting the corresponding emotion label.
CN202110816786.4A 2021-07-20 2021-07-20 Panorama emotion recognition method and system based on multi-angle sub-region self-adaption Active CN113673567B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110816786.4A CN113673567B (en) 2021-07-20 2021-07-20 Panorama emotion recognition method and system based on multi-angle sub-region self-adaption


Publications (2)

Publication Number Publication Date
CN113673567A 2021-11-19
CN113673567B (en) 2023-07-21

Family

ID=78539860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110816786.4A Active CN113673567B (en) 2021-07-20 2021-07-20 Panorama emotion recognition method and system based on multi-angle sub-region self-adaption

Country Status (1)

Country Link
CN (1) CN113673567B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114201970A (en) * 2021-11-23 2022-03-18 国家电网有限公司华东分部 Method and device for capturing power grid scheduling event detection based on semantic features

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506722A (en) * 2017-08-18 2017-12-22 中国地质大学(武汉) One kind is based on depth sparse convolution neutral net face emotion identification method
CN111832620A (en) * 2020-06-11 2020-10-27 桂林电子科技大学 Image emotion classification method based on double-attention multilayer feature fusion
CN112800875A (en) * 2021-01-14 2021-05-14 北京理工大学 Multi-mode emotion recognition method based on mixed feature fusion and decision fusion
CN112784764A (en) * 2021-01-27 2021-05-11 南京邮电大学 Expression recognition method and system based on local and global attention mechanism
CN113011504A (en) * 2021-03-23 2021-06-22 华南理工大学 Virtual reality scene emotion recognition method based on visual angle weight and feature fusion

Also Published As

Publication number Publication date
CN113673567B (en) 2023-07-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant