CN110827193A - Panoramic video saliency detection method based on multi-channel features
- Publication number
- CN110827193A (application CN201911000029.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- different
- panoramic
- image block
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06T3/08—Projecting images onto non-planar surfaces, e.g. geodetic screens (G06T—Image data processing or generation, in general)
- G06T7/215—Motion-based segmentation (G06T7/20—Analysis of motion)
- G06T7/90—Determination of colour characteristics (G06T7/00—Image analysis)
- G06T2207/10016—Video; Image sequence (G06T2207/10—Image acquisition modality)
- G06T2207/20221—Image fusion; Image merging (G06T2207/20—Special algorithmic details)
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a panoramic video saliency detection method based on multi-channel features, which comprises: performing an inverse ERP transformation on the panoramic image, mapping the planar panoramic image onto a sphere to generate a spherical panoramic image; simulating visual window images with planes tangent to the spherical panoramic image to obtain different image blocks; in different feature spaces, extracting the salient regions of the image blocks with different saliency operators to form different saliency feature sub-maps, while incorporating motion information between image-block sequences to turn image saliency detection into video saliency detection; and fusing the different saliency feature sub-maps to generate an overall saliency map. The method simulates the human visual attention mechanism with good accuracy.
Description
Technical Field
The invention relates to the technical field of image saliency detection, and in particular to a panoramic video saliency detection method based on multi-channel features, namely direction, color, spatial frequency and motion features.
Background
Saliency detection for traditional images is a well-studied topic, and researchers have proposed many models over the past three decades, most of them based on two ideas: bottom-up and top-down. Bottom-up models are data-driven: they combine primary features of the image such as color, contrast and orientation, and consider how a pixel differs from its surrounding neighborhood in these features, independently of a person's subjective state; the visual saliency computation model proposed by Itti L. et al. is one example. Top-down models are task-driven: prior knowledge about the scene is added to the model as an important basis for guiding the saliency distribution, thus incorporating the cognition of human psychological activity; for example, faces, vehicles and central positions are more easily noticed by an observer.
When collecting saliency data for images, the observer is allowed to view the still image repeatedly and search for its salient regions, which differs greatly from video. When viewing panoramic video, the picture content is dynamic, and the observer often misses some objects while fixating on one position or moving the head, so the salient regions of the still image cannot fully correspond to the salient regions of the panoramic video.
For saliency prediction of panoramic video, De Abreu Ana et al. presented at the Ninth International Conference on Quality of Multimedia Experience (2017) a method that converts the 360° image into a conventional two-dimensional planar image through sphere-to-rectangular-plane mapping (ERP transformation) and predicts the salient regions with a conventional planar-image saliency detection algorithm. However, this method does not deal with the distortion introduced when the panoramic image is mapped to a planar image, so the result still differs from the panoramic content viewed by human eyes in a virtual reality environment. Battisti Federica et al. published "A feature-based approach for saliency estimation of omnidirectional images" in Signal Processing: Image Communication (2018), which extracts visual window images from the 360° image, performs saliency measurement on chroma, saturation and graph-theory-based GBVS features, and combines the results with skin and face detection to integrate the final saliency map. However, this method only considers salient-region prediction for panoramic images; since it omits inter-frame information, it is not suitable for salient-region prediction of panoramic video. Researchers have also proposed panoramic video saliency detection algorithms based on deep learning, but their limitations are large, mainly because eye-movement datasets of dynamic scenes are few and generally small in scale.
At present, no description or report of technology similar to the present invention has been found, nor has similar data been collected at home or abroad.
Summary of the Invention
In view of the above deficiencies in the prior art, the present invention aims to provide a panoramic video saliency detection method based on multi-channel features (direction, color, spatial frequency and motion features). The method adopts a distortion-free mapping from the 360° image to planar images, combines bottom-up feature extraction and combination with the top-down modeling idea, and at the same time considers the influence of the video's inter-frame information on saliency prediction, thereby simulating the human visual attention mechanism with good accuracy.
The invention is realized by the following technical scheme.
The invention provides a panoramic video saliency detection method based on multi-channel features, which comprises the following steps:
S1: performing an inverse ERP transformation on the panoramic image, mapping the planar panoramic image onto a sphere to generate a spherical panoramic image;
S2: simulating visual window images with planes tangent to the spherical panoramic image to obtain different image blocks;
S3: in different feature spaces, extracting the salient regions of the image blocks with different saliency operators to form different saliency feature sub-maps, while incorporating motion information between image-block sequences to turn image saliency detection into video saliency detection;
S4: fusing the different saliency feature sub-maps to generate an overall saliency map.
Preferably, in step 1, the expression of the spherical panoramic image is the inverse equirectangular relation

λ = x/(R·cos φ₁) + λ₀,  φ = φ₁ + y/R

wherein R is the radius of the sphere, λ is the longitude of the point (x, y) in the rectangular coordinate system of the planar panoramic image after projection onto the sphere, φ is the latitude of the planar panoramic image after projection onto the sphere, φ₁ is the latitude corresponding to the horizontal central axis of the planar panoramic image, which is 0, and λ₀ is the longitude corresponding to the central meridian of the planar panoramic image.
Preferably, the step 2 comprises the following sub-steps:
S2.1: setting a plane tangent to the sphere of the spherical panoramic image, and then projecting the limited-angle curved surface of the sphere inside the visual window onto the plane as an image block of the current picture;
S2.2: rotating the visual window by a fixed angle, and moving the plane tangent to the sphere accordingly to a new longitude and latitude tangent to the window center to obtain the next projected image block;
S2.3: repeating step S2.2 to obtain a series of image blocks simulating multi-view viewing of the planar panoramic image through the visual window.
Preferably, the plane is a rectangular plane of fixed length and width tangent to the sphere of the spherical panoramic image, set as the projection plane at the center of the visual window, and the limited-visual-angle curved surfaces in the visual window are all mapped onto this rectangular plane.
Preferably, the step 3 comprises the following sub-steps:
S3.1: extracting the statistical feature sub-map f₁(s) based on different levels and orientations of the sideband pyramid domain of pixel s:
constructing a steerable pyramid model on the grayscale image of the image block of the planar panoramic image; computing histograms of the pictures at different spatial frequencies and orientations to estimate probability density distributions, and performing a weighted linear addition of the results over the different levels and orientations to obtain the statistical feature sub-map f₁(s), wherein α_k denotes the weights for all orientations and levels, the vertical and horizontal directions are given the same weight, the weights between different frequency components are assigned by a function, P_k denotes the probability of the corresponding luminance in sideband k of the pyramid W, and I_s denotes the luminance of pixel s;
S3.2: extracting the color feature sub-map f₂(s) based on pixel s:
computing the distribution of the image in the image block over the three RGB channels and integrating them to obtain the color feature value O(s) of pixel s, wherein λ_c is a weight for the color channel learned through the luminance-value conversion of RGB to a given color format (YUV), and P_c denotes the corresponding probability of the luminance of the different color channels;
then multiplying the color distance in CIELAB space by a weight based on the spatial distance between pixels and normalizing to obtain f₂(s), wherein k_s is the normalizing denominator term, C computes the color distance in CIELAB space, the function g_d is used to set a weight according to the distance between pixel positions, s′ denotes another pixel in space and I_s′ its corresponding luminance, Ω denotes the set of pixels of the image block, and ΔL*, Δa*, Δb* respectively denote the distances between two pixels on the three components of CIELAB space;
S3.3: extracting the local symmetry feature sub-map f₃ of the image block:
detecting the local symmetry axes of the image in the image block, and taking the result as the local symmetry feature sub-map f₃ of the image block;
S3.4: extracting the semantic feature sub-map f₄ of the image block:
extracting the high-order features of the image in the image block (including persons, cars and faces) with a target detection algorithm to obtain the semantic feature sub-map f₄ of the image block;
S3.5: extracting the motion information feature sub-map f₅ of the image block:
detecting the image-block sequence in the visual window, adding motion information into the detection, and taking the result as the motion information feature sub-map f₅ of a group of image blocks.
Preferably, in S3.1, the steerable pyramid model uses spatial filters of different orientations and bandwidths to construct each layer, and these spatial filters are applied to extract information in different directions of the grayscale map.
Preferably, in S3.1, weights between different frequency components are assigned by a CSF function.
Preferably, in S3.4, the high-order features of the image in the image block are extracted with a target detection algorithm based on mixtures of multi-scale deformable part models, and an image pyramid is used to extract features of the image at different levels.
Preferably, in S3.5, the LK optical flow method is adopted to detect the image block sequence in the visual window.
Preferably, in S4, feature fusion is performed on the different feature sub-maps obtained in S3 by a linear weighting method.
Preferably, in the feature fusion process, the saliency feature sub-maps corresponding to visual windows at high latitude are assigned lower weights to suppress the likelihood of salient regions at the two poles.
Compared with the prior art, the invention has the following beneficial effects:
1. Through the mapping change of the coordinate system, the saliency detection algorithms of traditional images can be applied to panoramic images without being affected by distortion;
2. The saliency detection framework based on multi-visual-channel feature fusion is highly extensible, flexible and easy to modify;
3. Feature estimation of motion information is introduced on top of the image saliency detection algorithm, yielding a new panoramic video saliency detection algorithm that reduces the neglect of other salient content while attending to moving objects.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic diagram of a multi-window mapping process of a panoramic picture;
FIG. 2 is a flow chart of saliency detection based on multi-channel features;
FIG. 3 is a diagram comparing the effects of normal rendering and gaze-point rendering.
Detailed Description
The following examples illustrate the invention in detail. The embodiments are implemented on the premise of the technical scheme of the invention, and detailed implementation modes and specific operation processes are given. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these all fall within the scope of protection of the present invention.
The embodiment of the invention provides a panoramic video saliency detection method based on multi-channel features, wherein the multi-channel features include direction, color, spatial frequency and motion features.
The method comprises the following steps:
Step 1: first, performing an inverse ERP (Equirectangular Projection) transformation on the panoramic image, mapping the planar panoramic image onto a sphere to generate a spherical panoramic image;
Step 2: simulating visual window images with planes tangent to the spherical panoramic image to obtain different image blocks;
Step 3: in different feature spaces, extracting the salient regions of the image blocks with different saliency operators, and forming saliency feature maps that also take motion information between image-block sequences into account;
Step 4: synthesizing an overall saliency map through a saliency-map fusion process.
Further, the method also comprises:
Step 5: repeating steps 1 to 4 until an overall saliency map has been obtained for every panoramic frame of the panoramic video, completing the saliency detection of the panoramic video.
Further, in step 1, the mathematical expression of the spherical panoramic image is the inverse equirectangular relation

λ = x/(R·cos φ₁) + λ₀,  φ = φ₁ + y/R

wherein R is the radius of the sphere, λ is the longitude of the point (x, y) of the planar panoramic image in its rectangular coordinate system after projection onto the sphere, φ is the latitude of the planar panoramic image after projection onto the sphere, φ₁ is the latitude corresponding to the horizontal central axis of the planar panoramic image, which is 0, and λ₀ is the longitude corresponding to the central meridian of the planar panoramic image.
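For illustration, the mapping can be sketched as follows in NumPy (this code is not part of the patent; it assumes a unit sphere R = 1, plane coordinates normalized to x ∈ [-π, π] and y ∈ [-π/2, π/2], and φ₁ = 0, and all function and variable names are illustrative):

```python
import numpy as np

def inverse_erp(width, height, lambda0=0.0, phi1=0.0):
    """Map each pixel (x, y) of a planar ERP panorama to longitude/latitude
    and to 3D points on the unit sphere (the spherical panoramic image)."""
    # Pixel grid -> plane coordinates x in [-pi, pi], y in [-pi/2, pi/2]
    xs = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    ys = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    x, y = np.meshgrid(xs, ys)
    # Inverse equirectangular relation on the unit sphere (R = 1)
    lam = x / np.cos(phi1) + lambda0  # longitude
    phi = phi1 + y                    # latitude
    # Spherical -> Cartesian coordinates
    pts = np.stack([np.cos(phi) * np.cos(lam),
                    np.cos(phi) * np.sin(lam),
                    np.sin(phi)], axis=-1)
    return lam, phi, pts
```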
Further, the step 2 includes the following sub-steps:
Step 2.1: after the planar panoramic image is mapped onto the sphere, this embodiment sets several planes tangent to the sphere to simulate viewing the planar panoramic picture in a head-mounted display (HMD) (as shown in FIG. 1), and then projects the limited-angle curved surfaces of the sphere onto these planes as image blocks of the current picture;
Step 2.2: the visual window then rotates by a fixed angle, and the rectangular plane tangent to the sphere moves accordingly to a new longitude and latitude tangent to the window center, yielding the next projected image block;
Step 2.3: by repeating step 2.2, this embodiment obtains a series of image blocks that simulate human eyes viewing the planar panoramic image in the HMD from multiple views; the planar panoramic image is mapped into these small image blocks, which then undergo saliency detection (as shown in FIG. 3).
In the above steps, after the planar panoramic image is mapped onto the sphere, this embodiment sets a rectangular plane of fixed length and width tangent to the sphere as the initial projection plane at the center of the spherical panoramic image, and the limited-visual-angle image in the visual window is mapped onto this plane.
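One plausible realization of this tangent-plane projection is the gnomonic (rectilinear) projection; the sketch below samples one image block directly from the ERP image, with the viewport center (lam_c, phi_c), field of view and block size as illustrative assumptions not fixed by the patent:

```python
import numpy as np

def viewport_block(erp_img, lam_c, phi_c, fov=np.pi / 3.0, size=256):
    """Project the limited-angle spherical surface inside the visual window
    onto a fixed-size rectangular plane tangent at (lam_c, phi_c)."""
    h, w = erp_img.shape[:2]
    # Tangent-plane grid spanning the assumed field of view
    t = np.tan(fov / 2.0)
    u, v = np.meshgrid(np.linspace(-t, t, size), np.linspace(t, -t, size))
    rho = np.sqrt(u ** 2 + v ** 2)
    c = np.arctan(rho)
    # Inverse gnomonic projection: plane (u, v) -> sphere (lam, phi)
    phi = np.arcsin(np.cos(c) * np.sin(phi_c)
                    + v * np.sin(c) * np.cos(phi_c) / np.maximum(rho, 1e-12))
    lam = lam_c + np.arctan2(u * np.sin(c),
                             rho * np.cos(phi_c) * np.cos(c)
                             - v * np.sin(phi_c) * np.sin(c))
    # Sphere -> ERP pixel coordinates, nearest-neighbour sampling
    px = ((lam + np.pi) % (2.0 * np.pi)) / (2.0 * np.pi) * (w - 1)
    py = (np.pi / 2.0 - phi) / np.pi * (h - 1)
    return erp_img[py.round().astype(int), px.round().astype(int)]
```

Rotating the window by a fixed angle then amounts to stepping lam_c and phi_c over a grid of viewport centers and calling this function once per window.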
Further, the step 3 comprises the following sub-steps:
Step 3.1: extracting the statistical feature sub-map f₁(s) based on different levels and orientations of the sideband pyramid domain of pixel s;
Step 3.2: extracting the color feature sub-map f₂(s) based on pixel s;
Step 3.3: extracting the local symmetry feature sub-map f₃ of the image block;
Step 3.4: extracting the semantic feature sub-map f₄ of the image block;
Step 3.5: extracting the motion information feature sub-map f₅ of the image block.
Further:
Step 3.1: statistics over different levels and orientations of the sideband pyramid domain: considering multiple visual channels and contrast sensitivity, a steerable pyramid model is constructed on the grayscale image of the image block of the planar panoramic image, with spatial filters of different orientations and bandwidths used to construct each layer. This embodiment then computes histograms of the pictures at different spatial frequencies and orientations to estimate probability density distributions, computes the feature value of a pixel s, and performs a weighted linear addition of the results over the different levels and orientations to obtain the feature sub-map f₁(s), wherein α_k contains the weight considerations for all orientations and levels: Gabor filters are applied to extract information in different directions of the image, the vertical and horizontal directions are given the same weight, and the weights between different frequency components are assigned by the CSF function; P_k denotes the probability of the corresponding luminance in sideband k of the pyramid W. The saliency feature sub-map corresponding to pixel s is obtained by linear combination over all layers of the pyramid.
Here, the CSF is the contrast sensitivity function presented in "Effects of spatial bandwidth and temporal presentation" by Eli Peli et al., published in Spatial Vision in 1993, in which spatial frequency serves as the input variable and the detection threshold changes with the input, so that contents of different spatial frequencies in the picture can be assigned different weights.
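As a rough illustration of this channel (not the patent's exact construction: a small Gabor filter bank stands in for the steerable pyramid, the weights α_k are taken uniform instead of CSF-derived, and reading rarer band responses as more salient is an added assumption):

```python
import cv2
import numpy as np

def statistical_feature_map(gray, n_orient=4, scales=(5, 11, 21)):
    """Simplified f1: histogram each band's responses to get a per-pixel
    probability P_k, then combine the bands by weighted linear addition."""
    gray = gray.astype(np.float32) / 255.0
    bands = []
    for ksize in scales:                      # stand-in for pyramid levels
        for i in range(n_orient):             # orientations (incl. horiz./vert.)
            theta = i * np.pi / n_orient
            kern = cv2.getGaborKernel((ksize, ksize), sigma=ksize / 4.0,
                                      theta=theta, lambd=ksize / 2.0,
                                      gamma=0.5, psi=0.0)
            bands.append(np.abs(cv2.filter2D(gray, cv2.CV_32F, kern)))
    f1 = np.zeros_like(gray)
    alpha = 1.0 / len(bands)                  # uniform alpha_k (CSF would vary these)
    for band in bands:
        hist, edges = np.histogram(band, bins=64)
        p = hist / hist.sum()                 # P_k over response values
        idx = np.digitize(band, edges[1:-1])  # bin index of every pixel
        f1 += alpha * (1.0 - p[idx])          # assumption: rarity -> saliency
    return cv2.normalize(f1, None, 0.0, 1.0, cv2.NORM_MINMAX)
```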
Step 3.2: the method for calculating the characteristic value of the color of a certain pixel s is obtained by calculating and integrating the distribution of the image in the image block in three channels of RGB:
wherein λ iscIs a weight, P, for a color channel learned by luminance value conversion of RGB to a given color format (YUV)cRepresenting the probability that the luminance of different color channels corresponds. In addition, according to the study on the contrast, the present embodiment attempts to emphasize a feature map obtained in the case of high contrast among color features. As shown in the following equation, this embodiment multiplies the color distance in CIELAB space by the distance between pixelsThe weights of the spatial distances are weighted and normalized.
Wherein k issFor the normalized denominator term, C calculates the color distance in CIELAB space, function gdFor setting weights based on the distance between pixel spaces, the present embodiment uses a gaussian function whose width is controlled by the standard deviation σ, so that the feature of the pixel s is enhanced by the local color contrast to obtain a feature sub-graph f2(s)。
Step 3.3: in this embodiment, an extraction algorithm of a basic feature in "Learning-Based Symmetry Detection in natural images" proposed by Stavros Tsogkas et al in "European consensus Computer Vision" of 2012 is added to detect a local Symmetry axis of an image in an image block, and a result obtained by the Detection is used as a third-class salient feature sub-graph f3。
Step 3.4: humans tend to focus on some particular objects, such as people, cars, faces, etc., in a high-dimensional semantic understanding of the image. In this embodiment, a relevant target detection algorithm proposed by Pedr F Felzenzwalb et al in "IEEEtransactions on Pattern Analysis and Machine Analysis" published in 2010 "Objectdetection with characterization related parts-Based Models" is used to extract such high-order features, and a significant feature sub-graph F is obtained4. The target detection algorithm is based on multi-scale variability grouping model mixing, objects are detected and identified through mixing of a main coarse precision filter bank and a series of high-resolution filter banks, and the image pyramid is used for extracting features of different layers.
Step 3.5: this example introduces Bruce D.Lucas in the feature detection, which is proposed in the 1985 article "Generalized Image Matching by the Method of Differences" to detect the Image sequence in the visual window, and the result is used as a set of feature sub-graphs f5Thereby adding motion information to the model to take into account the appearance of the image in the image blockThe saliency detection algorithm is improved into a saliency detection algorithm of the video.
Further, in step 4, the five kinds of feature sub-maps obtained in step 3 are fused by a linear weighting method; considering that viewers' attention to panoramic content is biased toward the central axis, the saliency feature sub-maps corresponding to visual windows at high latitude are assigned lower weights during fusion to suppress the likelihood of salient regions at the two poles.
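The fusion step can be sketched as follows (the equal channel weights and the Gaussian latitude prior are illustrative assumptions; the patent specifies only linear weighting with lower weights at high latitude):

```python
import numpy as np

def fuse_feature_maps(sub_maps, window_lat,
                      weights=(0.2, 0.2, 0.2, 0.2, 0.2),
                      lat_sigma=np.pi / 4.0):
    """Linear-weighted fusion of the five feature sub-maps of one visual
    window, down-weighted by window latitude to suppress the two poles."""
    fused = sum(wi * m for wi, m in zip(weights, sub_maps))
    lat_weight = np.exp(-window_lat ** 2 / (2.0 * lat_sigma ** 2))
    return lat_weight * fused
```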
In the panoramic video saliency detection method based on multi-channel features provided by the embodiment of the invention, an inverse ERP transformation is performed on the panoramic image, mapping the planar panoramic image onto a sphere to generate a spherical panoramic image; visual window images are simulated with planes tangent to the spherical panoramic image to obtain different image blocks; in different feature spaces, the salient regions of the image blocks are extracted with different saliency operators to form different saliency feature sub-maps, while motion information between image-block sequences turns image saliency detection into video saliency detection; and the different saliency feature sub-maps are fused to generate an overall saliency map. The method simulates the human visual attention mechanism with good accuracy.
The foregoing has described specific embodiments of the present invention. It should be understood that the invention is not limited to the specific embodiments described above, and a person skilled in the art may make various changes and modifications within the scope of the appended claims without departing from the spirit of the invention.
Claims (10)
1. A panoramic video saliency detection method based on multi-channel features, characterized by comprising the following steps:
S1: performing an inverse ERP transformation on the panoramic image, mapping the planar panoramic image onto a sphere to generate a spherical panoramic image;
S2: simulating visual window images with planes tangent to the spherical panoramic image to obtain different image blocks;
S3: in different feature spaces, extracting the salient regions of the image blocks with different saliency operators to form different saliency feature sub-maps, while incorporating motion information between image-block sequences to turn image saliency detection into video saliency detection;
S4: fusing the different saliency feature sub-maps to generate an overall saliency map.
2. The panoramic video saliency detection method based on multi-channel features according to claim 1, characterized in that, in the step 1, the expression of the spherical panoramic image is the inverse equirectangular relation

λ = x/(R·cos φ₁) + λ₀,  φ = φ₁ + y/R

wherein R is the radius of the sphere, λ is the longitude of the point (x, y) in the rectangular coordinate system of the planar panoramic image after projection onto the sphere, φ is the latitude of the planar panoramic image after projection onto the sphere, φ₁ is the latitude corresponding to the horizontal central axis of the planar panoramic image, which is 0, and λ₀ is the longitude corresponding to the central meridian of the planar panoramic image.
3. The panoramic video saliency detection method based on multi-channel features according to claim 1, characterized in that the step 2 comprises the following sub-steps:
S2.1: setting a plane tangent to the sphere of the spherical panoramic image, and then projecting the limited-angle curved surface of the sphere inside the visual window onto the plane as an image block of the current picture;
S2.2: rotating the visual window by a fixed angle, and moving the plane tangent to the sphere accordingly to a new longitude and latitude tangent to the window center to obtain the next projected image block;
S2.3: repeating step S2.2 to obtain a series of image blocks simulating multi-view viewing of the planar panoramic image through the visual window.
4. The panoramic video saliency detection method based on multi-channel features according to claim 3, characterized in that the plane is a rectangular plane of fixed length and width tangent to the sphere of the spherical panoramic image, and the limited-visual-angle curved surfaces in the visual window are all mapped onto this rectangular plane.
5. The panoramic video saliency detection method based on multi-channel features according to claim 1, characterized in that the step 3 comprises the following sub-steps:
S3.1: extracting the statistical feature sub-map f₁(s) based on different levels and orientations of the sideband pyramid domain of pixel s:
constructing a steerable pyramid model on the grayscale image of the image block of the planar panoramic image; computing histograms of the pictures at different spatial frequencies and orientations to estimate probability density distributions, and performing a weighted linear addition of the results over the different levels and orientations to obtain the statistical feature sub-map f₁(s), wherein α_k denotes the weights for all orientations and levels, the vertical and horizontal directions are given the same weight, the weights between different frequency components are assigned by a function, P_k denotes the probability of the corresponding luminance in sideband k of the pyramid W, and I_s denotes the luminance of pixel s;
S3.2: extracting the color feature sub-map f₂(s) based on pixel s:
computing the distribution of the image in the image block over the three RGB channels and integrating them to obtain the color feature value O(s) of pixel s, wherein λ_c is a weight for the color channel learned through the luminance-value conversion of RGB to a given color format (YUV), and P_c denotes the corresponding probability of the luminance of the different color channels;
then multiplying the color distance in CIELAB space by a weight based on the spatial distance between pixels and normalizing to obtain f₂(s), wherein k_s is the normalizing denominator term, C computes the color distance in CIELAB space, the function g_d is used to set a weight according to the distance between pixel positions, s′ denotes another pixel in space and I_s′ its corresponding luminance, Ω denotes the set of pixels of the image block, and ΔL*, Δa*, Δb* respectively denote the distances between two pixels on the three components of CIELAB space;
S3.3: extracting the local symmetry feature sub-map f₃ of the image block:
detecting the local symmetry axes of the image in the image block, and taking the result as the local symmetry feature sub-map f₃ of the image block;
S3.4: extracting the semantic feature sub-map f₄ of the image block:
extracting the high-order features of the image in the image block (including persons, cars and faces) with a target detection algorithm to obtain the semantic feature sub-map f₄ of the image block;
S3.5: extracting the motion information feature sub-map f₅ of the image block:
detecting the image-block sequence in the visual window, adding motion information into the detection, and taking the result as the motion information feature sub-map f₅ of a group of image blocks.
6. The panoramic video saliency detection method based on multi-channel features according to claim 5, characterized in that, in S3.1, the steerable pyramid model uses spatial filters of different orientations and bandwidths to construct each layer, and these spatial filters are applied to extract information in different directions of the grayscale map; and/or
in S3.1, the weights between different frequency components are assigned by a CSF function.
7. The panoramic video saliency detection method based on multi-channel features according to claim 5, characterized in that, in S3.4, the high-order features of the image in the image block are extracted with a target detection algorithm based on mixtures of multi-scale deformable part models, and an image pyramid is used to extract features of the image at different levels.
8. The panoramic video saliency detection method based on multi-channel features according to claim 5, characterized in that, in S3.5, the LK optical flow method is used to detect the image-block sequence in the visual window.
9. The panoramic video saliency detection method based on multi-channel features according to claim 1, characterized in that, in S4, feature fusion is performed on the different feature sub-maps obtained in S3 by a linear weighting method.
10. The panoramic video saliency detection method based on multi-channel features according to claim 9, characterized in that, in the feature fusion process, the saliency feature sub-maps corresponding to visual windows at high latitude are assigned lower weights to suppress the likelihood of salient regions at the two poles.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911000029.9A CN110827193B (en) | 2019-10-21 | 2019-10-21 | Panoramic video significance detection method based on multichannel characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911000029.9A CN110827193B (en) | 2019-10-21 | 2019-10-21 | Panoramic video significance detection method based on multichannel characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110827193A (en) | 2020-02-21
CN110827193B CN110827193B (en) | 2023-05-09 |
Family
ID=69549745
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911000029.9A Active CN110827193B (en) | 2019-10-21 | 2019-10-21 | Panoramic video significance detection method based on multichannel characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110827193B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488888A (en) * | 2020-04-10 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Image feature extraction method and human face feature generation device |
CN111488886A (en) * | 2020-03-12 | 2020-08-04 | 上海交通大学 | Panorama image significance prediction method and system with attention feature arrangement and terminal |
CN111832414A (en) * | 2020-06-09 | 2020-10-27 | 天津大学 | Animal counting method based on graph regular optical flow attention network |
CN113569636A (en) * | 2021-06-22 | 2021-10-29 | 中国科学院信息工程研究所 | Fisheye image feature processing method and system based on spherical features and electronic equipment |
CN114529589A (en) * | 2020-11-05 | 2022-05-24 | 北京航空航天大学 | Panoramic video browsing interaction method |
CN114639171A (en) * | 2022-05-18 | 2022-06-17 | 松立控股集团股份有限公司 | Panoramic safety monitoring method for parking lot |
WO2022126921A1 (en) * | 2020-12-18 | 2022-06-23 | 平安科技(深圳)有限公司 | Panoramic picture detection method and device, terminal, and storage medium |
CN114898120A (en) * | 2022-05-27 | 2022-08-12 | 杭州电子科技大学 | 360-degree image salient target detection method based on convolutional neural network |
CN115131589A (en) * | 2022-08-31 | 2022-09-30 | 天津艺点意创科技有限公司 | Image generation method for intelligent design of Internet literary works |
CN117036154A (en) * | 2023-08-17 | 2023-11-10 | 中国石油大学(华东) | Panoramic video fixation point prediction method without head display and distortion |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150310303A1 (en) * | 2014-04-29 | 2015-10-29 | International Business Machines Corporation | Extracting salient features from video using a neurosynaptic system |
CN105488812A (en) * | 2015-11-24 | 2016-04-13 | 江南大学 | Motion-feature-fused space-time significance detection method |
CN106780297A (en) * | 2016-11-30 | 2017-05-31 | 天津大学 | Image high registration accuracy method under scene and Varying Illumination |
CN106951829A (en) * | 2017-02-23 | 2017-07-14 | 南京邮电大学 | A kind of notable method for checking object of video based on minimum spanning tree |
CN106899840A (en) * | 2017-03-01 | 2017-06-27 | 北京大学深圳研究生院 | Panoramic picture mapping method |
CN108462868A (en) * | 2018-02-12 | 2018-08-28 | 叠境数字科技(上海)有限公司 | The prediction technique of user's fixation point in 360 degree of panorama VR videos |
CN109064444A (en) * | 2018-06-28 | 2018-12-21 | 东南大学 | Track plates Defect inspection method based on significance analysis |
CN109166178A (en) * | 2018-07-23 | 2019-01-08 | 中国科学院信息工程研究所 | A kind of significant drawing generating method of panoramic picture that visual characteristic is merged with behavioral trait and system |
CN110070073A (en) * | 2019-05-07 | 2019-07-30 | 国家广播电视总局广播电视科学研究院 | Pedestrian's recognition methods again of global characteristics and local feature based on attention mechanism |
Non-Patent Citations (2)
Title |
---|
Zhang Qian; Deng Xiangdong; Ning Jinhui; Wang Huiming; Sun Yan; Ou Zhenyan; Wei Anming: "Research on image quality assessment and improvement for cable digital television subscribers' homes" *
Su Qun: "Saliency detection of panoramic video and its application in coding and transmission" *
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488886B (en) * | 2020-03-12 | 2023-04-28 | 上海交通大学 | Panoramic image significance prediction method, system and terminal for arranging attention features |
CN111488886A (en) * | 2020-03-12 | 2020-08-04 | 上海交通大学 | Panorama image significance prediction method and system with attention feature arrangement and terminal |
CN111488888A (en) * | 2020-04-10 | 2020-08-04 | 腾讯科技(深圳)有限公司 | Image feature extraction method and human face feature generation device |
CN111832414A (en) * | 2020-06-09 | 2020-10-27 | 天津大学 | Animal counting method based on graph regular optical flow attention network |
CN114529589A (en) * | 2020-11-05 | 2022-05-24 | 北京航空航天大学 | Panoramic video browsing interaction method |
CN114529589B (en) * | 2020-11-05 | 2024-05-24 | 北京航空航天大学 | Panoramic video browsing interaction method |
WO2022126921A1 (en) * | 2020-12-18 | 2022-06-23 | 平安科技(深圳)有限公司 | Panoramic picture detection method and device, terminal, and storage medium |
CN113569636A (en) * | 2021-06-22 | 2021-10-29 | 中国科学院信息工程研究所 | Fisheye image feature processing method and system based on spherical features and electronic equipment |
CN113569636B (en) * | 2021-06-22 | 2023-12-05 | 中国科学院信息工程研究所 | Fisheye image feature processing method and system based on spherical features and electronic equipment |
CN114639171A (en) * | 2022-05-18 | 2022-06-17 | 松立控股集团股份有限公司 | Panoramic safety monitoring method for parking lot |
CN114898120A (en) * | 2022-05-27 | 2022-08-12 | 杭州电子科技大学 | 360-degree image salient target detection method based on convolutional neural network |
CN115131589A (en) * | 2022-08-31 | 2022-09-30 | 天津艺点意创科技有限公司 | Image generation method for intelligent design of Internet literary works |
CN115131589B (en) * | 2022-08-31 | 2022-11-22 | 天津艺点意创科技有限公司 | Image generation method for intelligent design of Internet literary works |
CN117036154A (en) * | 2023-08-17 | 2023-11-10 | 中国石油大学(华东) | Panoramic video fixation point prediction method without head display and distortion |
CN117036154B (en) * | 2023-08-17 | 2024-02-02 | 中国石油大学(华东) | Panoramic video fixation point prediction method without head display and distortion |
Also Published As
Publication number | Publication date |
---|---|
CN110827193B (en) | 2023-05-09 |
Similar Documents
Publication | Title
---|---
CN110827193B (en) | Panoramic video significance detection method based on multichannel characteristics
Lebreton et al. | GBVS360, BMS360, ProSal: Extending existing saliency prediction models from 2D to omnidirectional images
Xu et al. | Arid: A new dataset for recognizing action in the dark
US20210279971A1 (en) | Method, storage medium and apparatus for converting 2d picture set to 3d model
CN109684925B (en) | Depth image-based human face living body detection method and device
US5802220A (en) | Apparatus and method for tracking facial motion through a sequence of images
DE112018007721T5 (en) | Acquire and modify 3D faces using neural imaging and time tracking networks
CN110650368A (en) | Video processing method and device and electronic equipment
CN108134937B (en) | Compressed domain significance detection method based on HEVC
US20180144212A1 (en) | Method and device for generating an image representative of a cluster of images
US20180357819A1 (en) | Method for generating a set of annotated images
Xu et al. | Saliency prediction on omnidirectional image with generative adversarial imitation learning
CN107749066A (en) | A kind of multiple dimensioned space-time vision significance detection method based on region
CN106993188B (en) | A kind of HEVC compaction coding method based on plurality of human faces saliency
CN106156714A (en) | The Human bodys' response method merged based on skeletal joint feature and surface character
Han et al. | A mixed-reality system for broadcasting sports video to mobile devices
CN107481067B (en) | Intelligent advertisement system and interaction method thereof
CN108141568A (en) | Osd information generation video camera, osd information synthesis terminal device 20 and the osd information shared system being made of it
CN112633217A (en) | Human face recognition living body detection method for calculating sight direction based on three-dimensional eyeball model
CN109523590B (en) | 3D image depth information visual comfort evaluation method based on sample
CN104298961B (en) | Video method of combination based on Mouth-Shape Recognition
CN113673567A (en) | Panorama emotion recognition method and system based on multi-angle subregion self-adaption
CN112954313A (en) | Method for calculating perception quality of panoramic image
CN112488165A (en) | Infrared pedestrian identification method and system based on deep learning model
CN113805824A (en) | Electronic device and method for displaying image on display equipment
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |