CN110827193A - Panoramic video saliency detection method based on multi-channel features - Google Patents

Panoramic video saliency detection method based on multi-channel features Download PDF

Info

Publication number
CN110827193A
Authority
CN
China
Prior art keywords
image
different
panoramic
image block
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911000029.9A
Other languages
Chinese (zh)
Other versions
CN110827193B (en)
Inventor
邓向冬
宁金辉
王惠明
张乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Planning Institute Of Radio And Television Of State Administration Of Radio And Television
Original Assignee
Planning Institute Of Radio And Television Of State Administration Of Radio And Television
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Planning Institute Of Radio And Television Of State Administration Of Radio And Television
Priority to CN201911000029.9A priority Critical patent/CN110827193B/en
Publication of CN110827193A publication Critical patent/CN110827193A/en
Application granted granted Critical
Publication of CN110827193B publication Critical patent/CN110827193B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/08: Projecting images onto non-planar surfaces, e.g. geodetic screens
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/215: Motion-based segmentation
    • G06T7/90: Determination of colour characteristics
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence
    • G06T2207/20: Special algorithmic details
    • G06T2207/20212: Image combination
    • G06T2207/20221: Image fusion; Image merging
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a panoramic video saliency detection method based on multi-channel features, which comprises: performing inverse ERP transformation on the panoramic image, mapping the planar panoramic image onto a spherical surface to generate a spherical panoramic image; simulating visual window images with planes tangent to the spherical panoramic image to obtain different image blocks; in different feature spaces, extracting the salient regions of the image blocks with different saliency operators to form different salient feature subgraphs, while taking the motion information between image block sequences into account so as to turn image saliency detection into video saliency detection; and fusing the different salient feature subgraphs to generate an overall saliency map. The method simulates the human visual attention mechanism with good accuracy.

Description

Panoramic video saliency detection method based on multi-channel features
Technical Field
The invention relates to the technical field of image saliency detection, and in particular to a panoramic video saliency detection method based on multi-channel features, specifically on direction, color, spatial frequency and motion features.
Background
The saliency detection of traditional images has been studied in considerable depth, and researchers have proposed many models over the past three decades, most of them based on two ideas: bottom-up and top-down. The bottom-up model is data-driven: it combines primary features of an image such as color, contrast and orientation, and considers the difference in features between a pixel and its surrounding region, independent of a person's subjective emotion; an example is the visual saliency computation model proposed by Itti L et al. The top-down model is task-driven: prior knowledge about the scene is added to the model as an important basis for guiding the distribution of saliency, thereby incorporating knowledge of human psychological activity; for example, faces, vehicles and central positions are more easily noticed by an observer.
In the collection of saliency data for images, the observer is allowed to view a still image repeatedly, "looking" for its salient regions, which differs greatly from video. When viewing a panoramic video, the picture content is dynamic, and the observer often misses some objects while watching one position or moving the head, so the salient regions of an image cannot fully correspond to the salient regions of a panoramic video.
As for saliency prediction algorithms for panoramic video, De Abreu Ana et al. published a method at the Ninth International Conference on Quality of Multimedia Experience in 2017 that converts a 360° image into a conventional two-dimensional planar image through spherical-to-rectangular-plane mapping (ERP transformation) and predicts the salient regions with a conventional planar-image saliency detection algorithm. However, this method does not deal with the distortion introduced when the panoramic image is mapped to a planar image, and the result still differs considerably from the panoramic content viewed by human eyes in a virtual-reality environment. Battisti Federica et al. published "A feature-based approach for saliency estimation of omni-directional images" in Signal Processing: Image Communication in 2018; it extracts visual window images from the 360° image, performs saliency measurement on chroma, saturation and graph-based visual saliency (GBVS) features, and combines the results of skin and face detection to integrate the final saliency map. However, this method only considers the prediction of salient regions of a panoramic image and, because inter-frame information is omitted, is not suitable for predicting the salient regions of a panoramic video. Researchers have also proposed panoramic video saliency detection algorithms based on deep learning, but their limitations are large, mainly because eye-movement data sets of dynamic scenes are few and generally small in scale.
At present, no description or report of technology similar to the present invention has been found, nor have similar data been collected at home or abroad.
Disclosure of Invention
In view of the above deficiencies in the prior art, the present invention aims to provide a method for detecting the saliency of a panoramic video based on multi-channel features such as direction, color, spatial frequency and motion. The method adopts a distortion-free mapping from the 360° image to planar images, combines bottom-up feature extraction and combination with the top-down modeling idea, and at the same time considers the influence of the video's inter-frame information on saliency prediction, thereby simulating the human visual attention mechanism with good accuracy.
The invention is realized by the following technical scheme.
The invention provides a panoramic video saliency detection method based on multi-channel features, which comprises the following steps:
s1: carrying out reverse ERP transformation on the panoramic image, and mapping the planar panoramic image to a spherical surface to generate a spherical panoramic image;
s2: simulating a visual window image by adopting a plane tangent to the spherical panoramic image to obtain different image blocks;
s3: in different feature spaces, extracting the salient regions of the image blocks in the feature spaces with different saliency operators to form different salient feature subgraphs, while taking the motion information between image block sequences into account so as to turn image saliency detection into video saliency detection;
s4: fusing the different salient feature subgraphs to generate an overall saliency map.
Preferably, in step S1, the expression of the spherical panoramic image is:
λ = x / cos φ1 + λ0,  φ = y + φ1
wherein λ is the longitude of the point (x, y) in the rectangular coordinate system of the planar panoramic image after projection onto the spherical surface, φ is the latitude after projection onto the spherical surface, φ1 is the latitude corresponding to the horizontal central axis of the planar panoramic image (taken as 0), and λ0 is the longitude corresponding to the central meridian of the planar panoramic image.
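The inverse ERP mapping of this step can be sketched in code. The following is an illustrative numpy sketch, not part of the claimed method: it assumes a unit sphere, λ0 = 0 and φ1 = 0, and the common convention that pixel columns span longitude [-π, π) and rows span latitude [π/2, -π/2).

```python
import numpy as np

def inverse_erp(width, height):
    """Map each pixel (u, v) of a width x height equirectangular (ERP)
    panorama to spherical longitude/latitude, assuming lambda0 = 0 and
    phi1 = 0. Returns (lon, lat) arrays in radians."""
    u = np.arange(width)
    v = np.arange(height)
    lon = (u + 0.5) / width * 2.0 * np.pi - np.pi    # column -> longitude
    lat = np.pi / 2.0 - (v + 0.5) / height * np.pi   # row -> latitude
    return np.meshgrid(lon, lat)

lon, lat = inverse_erp(1024, 512)
```

Sampling the planar panorama at (lon, lat) then yields the spherical panoramic image of step S1.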
Preferably, the step 2 includes the following sub-steps:
s2.1: setting a plane tangent to the spherical surface of the spherical panoramic image, and then projecting a curved surface with a limited angle on the spherical surface in the visual window onto the plane as an image block of a current picture;
s2.2: rotating the visual window by a fixed angle, and moving the plane tangent to the spherical surface to a new longitude and latitude tangent to the center of the window to obtain the next projected image block;
s2.3: and repeating the step S2.2 to obtain a series of image blocks simulating multi-view viewing of the planar panoramic image through the visual window.
Preferably, the plane is a rectangular plane of fixed length and width tangent to the spherical surface of the spherical panoramic image at the center of the visual window, and the curved surfaces of limited visual angle in the visual window are each mapped onto this rectangular plane.
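Projecting the part of the sphere inside the visual window onto the tangent rectangular plane is, in map-projection terms, a gnomonic projection. The sketch below is an illustration under stated assumptions (square window, chosen field of view, unit sphere), not the patent's implementation: for a window centered at (lon0, lat0) it computes the sphere coordinates from which each tangent-plane sample should be read.

```python
import numpy as np

def gnomonic_window(lon0, lat0, fov_deg=90.0, size=64):
    """Inverse gnomonic projection: for every sample of a square plane
    tangent to the unit sphere at (lon0, lat0), return the sphere
    longitude/latitude to sample from the ERP panorama."""
    half = np.tan(np.radians(fov_deg) / 2.0)
    x, y = np.meshgrid(np.linspace(-half, half, size),
                       np.linspace(-half, half, size))
    rho = np.hypot(x, y)
    c = np.arctan(rho)                     # angular distance from window center
    cos_c, sin_c = np.cos(c), np.sin(c)
    lat = np.arcsin(cos_c * np.sin(lat0)
                    + np.where(rho > 0, y * sin_c * np.cos(lat0) / rho, 0.0))
    lon = lon0 + np.arctan2(x * sin_c,
                            rho * np.cos(lat0) * cos_c - y * np.sin(lat0) * sin_c)
    return lon, lat
```

Rotating the window (step S2.2) simply means calling the function again with a new (lon0, lat0).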
Preferably, the step 3 includes the following sub-steps:
s3.1: extracting the statistical feature subgraph f1(s) of the image block, based on different layers and orientations of the sideband pyramid domain, for each pixel s:
constructing a controllable pyramid model on the grayscale image of the image block of the planar panoramic image; calculating histograms of the pictures at different spatial frequencies and orientations to estimate the probability density distributions, and performing a weighted linear addition of the results over different levels and orientations to obtain the statistical feature subgraph f1(s), as follows:
f1(s) = Σ_{k∈W} α_k · (−log P_k(I_s))
wherein α_k represents the weight for each orientation and level, the vertical and horizontal directions being given the same weight and the weights between different frequency components being assigned by a function; P_k represents the probability of the corresponding luminance in subband k of the pyramid W; and I_s represents the luminance of the pixel s;
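The statistical feature above can be read as a rarity (self-information) measure over pyramid subbands. The sketch below assumes that reading; the subband decomposition itself (a steerable pyramid) is assumed to be computed elsewhere, and the histogram bin count is an illustrative choice.

```python
import numpy as np

def self_information_map(bands, weights, nbins=32):
    """Rarity saliency over pyramid subbands:
    f1(s) = sum_k alpha_k * (-log P_k(I_s)).
    `bands` is a list of 2-D subband images, `weights` the per-band
    weights alpha_k (e.g. CSF-derived)."""
    f1 = np.zeros_like(bands[0], dtype=float)
    for band, alpha in zip(bands, weights):
        hist, edges = np.histogram(band, bins=nbins)
        p = hist / band.size                          # probability per bin
        idx = np.clip(np.digitize(band, edges[1:-1]), 0, nbins - 1)
        f1 += alpha * -np.log(p[idx] + 1e-12)         # rare values -> salient
    return f1
```

Statistically rare luminances (low P_k) receive large values, which matches the histogram-based probability estimation described in the text.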
s3.2: extracting the color feature subgraph f2(s) of the image block for each pixel s:
calculating the distribution of the image in the image block over the three RGB channels and integrating them to obtain the color feature value O(s) of pixel s, as shown in the following formula:
O(s) = Σ_c λ_c · (−log P_c(I_s))
wherein λ_c is the weight for color channel c, learned by converting the RGB luminance values to a given color format (YUV), and P_c represents the probability of the corresponding luminance in the different color channels;
and multiplying the color distance in CIELAB space by a weight based on the spatial distance between pixels and performing normalization to obtain:
f2(s) = (O(s) / k_s) · Σ_{s'∈Ω} g_d(s, s') · C(I_s, I_{s'})
wherein k_s is the normalizing denominator term, C computes the color distance in CIELAB space, the function g_d is used to set a weight based on the spatial distance between the pixels, s' denotes another pixel in space and I_{s'} its corresponding luminance; Ω denotes the set of pixels of the image block; and ΔL*, Δa*, Δb* respectively denote the distances between two pixels on the three components of CIELAB space;
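The spatially weighted CIELAB contrast term can be sketched directly. The code below is an O(n²) reference illustration for small image blocks only; the Gaussian width σ and the plain Euclidean CIELAB distance are assumptions for illustration, not values fixed by the patent.

```python
import numpy as np

def color_contrast(lab, sigma=3.0):
    """Spatially weighted CIELAB color contrast per pixel:
    f(s) = (1/k_s) * sum_{s'} g_d(s, s') * ||Lab(s) - Lab(s')||,
    with Gaussian spatial weight g_d and k_s the sum of weights.
    `lab` has shape (H, W, 3) = (L*, a*, b*)."""
    h, w, _ = lab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    col = lab.reshape(-1, 3).astype(float)
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)  # spatial dist^2
    g = np.exp(-d2 / (2.0 * sigma ** 2))                     # g_d weights
    cdist = np.linalg.norm(col[:, None, :] - col[None, :, :], axis=-1)
    return ((g * cdist).sum(1) / g.sum(1)).reshape(h, w)     # normalize by k_s
```

Pixels whose color differs from their spatial neighborhood (high local contrast) get large values, as the normalization step describes.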
s3.3: extracting the local symmetry feature subgraph f3 of the image block:
detecting the local symmetry axes of the image in the image block, and taking the obtained result as the local symmetry feature subgraph f3 of the image block;
s3.4: extracting the semantic feature subgraph f4 of the image block:
extracting high-order features of the image in the image block (including people, automobiles and faces) with a target detection algorithm to obtain the semantic feature subgraph f4 of the image block;
s3.5: extracting the motion information feature subgraph f5 of the image block:
detecting the image block sequence in the visual window, adding motion information into the detection, and taking the obtained result as the motion information feature subgraph f5 of the group of image blocks.
Preferably, in S3.1, the controllable pyramid model uses spatial filters with different orientations and bandwidths for the construction of each layer, and the spatial filters are applied to extract information of different directions of the grayscale map.
Preferably, in S3.1, weights between different frequency components are assigned by a CSF function.
Preferably, in S3.4, the high-order features of the image in the image block are extracted with a target detection algorithm based on mixtures of multi-scale deformable part models, and an image pyramid is used to extract features at different levels of the image.
Preferably, in S3.5, the LK optical flow method is adopted to detect the image block sequence in the visual window.
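A minimal, single-level version of the LK (Lucas-Kanade) optical flow named here can be sketched as follows. This is an illustrative reference implementation only (no image pyramid, no iterative refinement), not the embodiment's actual detector.

```python
import numpy as np

def lk_flow(prev, curr, win=5):
    """Dense single-level Lucas-Kanade flow: solves the 2x2 normal
    equations of Ix*u + Iy*v = -It over a win x win neighborhood of
    every pixel of two grayscale float frames."""
    Ix = np.gradient(prev, axis=1)
    Iy = np.gradient(prev, axis=0)
    It = curr.astype(float) - prev.astype(float)

    def box(a):  # sum of `a` over the win x win window around each pixel
        k = win // 2
        p = np.pad(a, k)
        s = np.zeros_like(a, dtype=float)
        for dy in range(-k, k + 1):
            for dx in range(-k, k + 1):
                s += p[k + dy:k + dy + a.shape[0], k + dx:k + dx + a.shape[1]]
        return s

    Axx, Axy, Ayy = box(Ix * Ix), box(Ix * Iy), box(Iy * Iy)
    bx, by = -box(Ix * It), -box(Iy * It)
    det = Axx * Ayy - Axy ** 2
    det = np.where(np.abs(det) < 1e-6, np.inf, det)  # skip singular systems
    u = (Ayy * bx - Axy * by) / det                  # Cramer's rule
    v = (Axx * by - Axy * bx) / det
    return u, v
```

The magnitude of (u, v) over the image block sequence would then serve as the motion information feature subgraph f5.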
Preferably, in S4, the feature fusion is performed on the different feature subgraphs obtained in S3 by using a linear weighting method.
Preferably, during feature fusion, the salient feature subgraphs corresponding to visual windows at high latitudes are assigned lower weights, so as to suppress the likelihood of salient regions at the two poles.
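The latitude-dependent down-weighting during fusion can be sketched as a Gaussian prior on the window-center latitude. The weight shape and σ below are illustrative assumptions; the patent only specifies that high-latitude windows receive lower weights.

```python
import numpy as np

def fuse_maps(submaps, weights, window_lats, sigma_lat=np.pi / 4):
    """Linear weighted fusion of per-window salient feature subgraphs.
    `submaps[i][j]` is feature subgraph j of window i; `window_lats[i]`
    is the window-center latitude in radians. Windows far from the
    equator are down-weighted (equator-bias prior, assumed Gaussian)."""
    fused = []
    for maps, lat in zip(submaps, window_lats):
        lat_w = np.exp(-lat ** 2 / (2 * sigma_lat ** 2))  # ~1 at equator, small at poles
        fused.append(lat_w * sum(w * m for w, m in zip(weights, maps)))
    return fused
```

Two identical windows, one at the equator and one near a pole, thus yield fused maps of different overall strength.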
Compared with the prior art, the invention has the following beneficial effects:
1. through the mapping transformation of the coordinate system, the saliency detection algorithms of traditional images can be applied to the panoramic image without being affected by distortion;
2. the saliency detection framework based on multi-visual-channel feature fusion has strong extensibility and is flexible and easy to modify;
3. feature estimation of motion information is introduced on top of the image saliency detection algorithm, providing a new panoramic video saliency detection algorithm that can attend to moving objects while reducing the neglect of other salient content.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic diagram of a multi-window mapping process of a panoramic picture;
FIG. 2 is a flow chart of significance detection based on multi-channel features;
fig. 3 is a diagram showing the effect of comparing the normal rendering and the point-of-regard rendering.
Detailed Description
The following examples illustrate the invention in detail: the embodiment is implemented on the premise of the technical scheme of the invention, and a detailed implementation mode and a specific operation process are given. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.
The embodiment of the invention provides a panoramic video saliency detection method based on multi-channel characteristics, wherein the multi-channel characteristics comprise: direction, color, spatial frequency, and motion characteristics.
The method comprises the following steps:
step 1: firstly, perform inverse ERP (Equirectangular Projection) transformation on the panoramic image, mapping the planar panoramic image onto a spherical surface to generate a spherical panoramic image;
step 2: simulating a visual window image through a plane tangent to the spherical panoramic image to obtain different image blocks;
step 3: in different feature spaces, extract the salient regions of the image blocks in those feature spaces with different saliency operators, and form salient feature maps while considering the motion information between image block sequences;
step 4: synthesize the overall saliency map through the saliency-map fusion process.
Further, the method also includes:
step 5: repeatedly executing steps 1 to 4 until the overall saliency map of every frame of panoramic image in the panoramic video is obtained, completing the saliency detection of the panoramic video.
Further, in step 1, the mathematical expression of the spherical panoramic image is:
λ = x / cos φ1 + λ0,  φ = y + φ1
wherein λ is the longitude of the point (x, y) of the planar panoramic image in the rectangular coordinate system after projection onto the spherical surface, φ is the latitude after projection onto the spherical surface, φ1 is the latitude corresponding to the horizontal central axis of the planar panoramic image (taken as 0), and λ0 is the longitude corresponding to the central meridian of the planar panoramic image.
Further, the step 2 includes the following sub-steps:
step 2.1: after the planar panoramic image is mapped onto the spherical surface, the present embodiment sets some planes tangent to the spherical surface to simulate viewing a planar panoramic picture (as shown in fig. 1) in a head-mounted display device (HMD), and then projects the curved surfaces with limited angles on the spherical surface onto these planes as image blocks of the current picture;
step 2.2: then the visual window rotates by a fixed angle, and the rectangular plane tangent to the spherical surface moves to a new longitude and latitude tangent to the center of the window along with the fixed angle to obtain a next projected image block;
step 2.3: repeating the step 2.2, so that the present embodiment obtains a series of image blocks simulating that the human eyes view the planar panoramic image in the HMD in a multi-view manner, and the planar panoramic image is mapped into these small image blocks, and then is subjected to saliency detection (as shown in fig. 3).
In the above steps, after the planar panoramic image is mapped to the spherical surface, the present embodiment sets a rectangular plane of fixed length and width tangent to the spherical surface as the initial projection plane at the center of the spherical panoramic image, and the image of limited visual angle in the visual window will be mapped to this plane.
Further, the step 3 includes the following sub-steps:
step 3.1: extracting the statistical feature subgraph f1(s) of the image block, based on different layers and orientations of the sideband pyramid domain, for each pixel s;
step 3.2: extracting the color feature subgraph f2(s) of the image block for each pixel s;
step 3.3: extracting the local symmetry feature subgraph f3 of the image block;
step 3.4: extracting the semantic feature subgraph f4 of the image block;
step 3.5: extracting the motion information feature subgraph f5 of the image block.
Further:
step 3.1: statistics over different levels and orientations of the sideband pyramid domain: in consideration of multiple visual channels and contrast sensitivity, the present embodiment constructs the controllable pyramid model on the grayscale image of the image block of the planar panoramic image, using spatial filters with different orientations and bandwidths to construct each layer. After that, the present embodiment estimates the probability density distributions from histograms of the pictures at different spatial frequencies and orientations, calculates the feature value of a pixel s by the following formula, and performs a weighted linear addition of the results over different levels and orientations to obtain the feature subgraph f1(s) based on the pixel s:
f1(s) = Σ_{k∈W} α_k · (−log P_k(I_s))
wherein α_k includes the weight considerations for all orientations and levels; Gabor filters are applied to extract information in different directions of the image, the vertical and horizontal directions are given the same weight, and the weights between different frequency components are assigned by the CSF function; P_k represents the probability of the corresponding luminance in subband k of the pyramid W, and the salient feature subgraph corresponding to pixel s is obtained through a linear combination over all layers of the pyramid.
Wherein the CSF is the contrast sensitivity function proposed by Peli E et al. in "Effects of spatial bandwidth and temporal presentation", published in Spatial Vision in 1993, in which spatial frequency is the input variable and the detection threshold varies with the input, so that different weights can be assigned to contents of different spatial frequencies in the picture.
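To illustrate how frequency-dependent weights α_k could be assigned, the sketch below uses the classic Mannos-Sakrison CSF model. This is a stand-in for illustration only: the patent cites the Peli (1993) CSF, whose exact form is not reproduced in the text.

```python
import numpy as np

def csf_weight(f_cpd):
    """Contrast sensitivity vs. spatial frequency (cycles per degree),
    Mannos-Sakrison model: A(f) = 2.6*(0.0192 + 0.114 f)*exp(-(0.114 f)^1.1).
    Used here only to show how per-band weights alpha_k might be derived."""
    f = np.asarray(f_cpd, dtype=float)
    return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

# one alpha_k per pyramid level; the nominal cycles/degree per band are assumed
levels = np.array([1.0, 4.0, 8.0, 16.0, 32.0])
alphas = csf_weight(levels)
alphas /= alphas.sum()   # normalized weights for the linear combination
```

The resulting weights peak at mid spatial frequencies, where human contrast sensitivity is highest, and fall off toward very low and very high frequencies.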
Step 3.2: the method for calculating the characteristic value of the color of a certain pixel s is obtained by calculating and integrating the distribution of the image in the image block in three channels of RGB:
Figure BDA0002241009970000071
wherein λ iscIs a weight, P, for a color channel learned by luminance value conversion of RGB to a given color format (YUV)cRepresenting the probability that the luminance of different color channels corresponds. In addition, according to the study on the contrast, the present embodiment attempts to emphasize a feature map obtained in the case of high contrast among color features. As shown in the following equation, this embodiment multiplies the color distance in CIELAB space by the distance between pixelsThe weights of the spatial distances are weighted and normalized.
Figure BDA0002241009970000072
Wherein k issFor the normalized denominator term, C calculates the color distance in CIELAB space, function gdFor setting weights based on the distance between pixel spaces, the present embodiment uses a gaussian function whose width is controlled by the standard deviation σ, so that the feature of the pixel s is enhanced by the local color contrast to obtain a feature sub-graph f2(s)。
Step 3.3: in this embodiment, an extraction algorithm of a basic feature in "Learning-Based Symmetry Detection in natural images" proposed by Stavros Tsogkas et al in "European consensus Computer Vision" of 2012 is added to detect a local Symmetry axis of an image in an image block, and a result obtained by the Detection is used as a third-class salient feature sub-graph f3
Step 3.4: humans tend to focus on some particular objects, such as people, cars, faces, etc., in a high-dimensional semantic understanding of the image. In this embodiment, a relevant target detection algorithm proposed by Pedr F Felzenzwalb et al in "IEEEtransactions on Pattern Analysis and Machine Analysis" published in 2010 "Objectdetection with characterization related parts-Based Models" is used to extract such high-order features, and a significant feature sub-graph F is obtained4. The target detection algorithm is based on multi-scale variability grouping model mixing, objects are detected and identified through mixing of a main coarse precision filter bank and a series of high-resolution filter banks, and the image pyramid is used for extracting features of different layers.
Step 3.5: this example introduces Bruce D.Lucas in the feature detection, which is proposed in the 1985 article "Generalized Image Matching by the Method of Differences" to detect the Image sequence in the visual window, and the result is used as a set of feature sub-graphs f5Thereby adding motion information to the model to take into account the appearance of the image in the image blockThe saliency detection algorithm is improved into a saliency detection algorithm of the video.
Further, in step 4, the five kinds of feature subgraphs obtained in step 3 are fused by a linear weighting method; because the central-latitude bias of audiences viewing panoramic content is taken into account, a lower weight is assigned during feature fusion to the salient feature subgraphs corresponding to visual windows at high latitudes, to suppress the likelihood of salient regions at the two poles.
The method for detecting the saliency of a panoramic video based on multi-channel features provided by the embodiment of the invention performs inverse ERP transformation on the panoramic image, mapping the planar panoramic image onto a spherical surface to generate a spherical panoramic image; simulates visual window images with planes tangent to the spherical panoramic image to obtain different image blocks; in different feature spaces, extracts the salient regions of the image blocks with different saliency operators to form different salient feature subgraphs, while taking the motion information between image block sequences into account so as to turn image saliency detection into video saliency detection; and fuses the different salient feature subgraphs to generate an overall saliency map. The method simulates the human visual attention mechanism with good accuracy.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention.

Claims (10)

1. A method for detecting the saliency of a panoramic video based on multi-channel features is characterized by comprising the following steps:
s1: carrying out reverse ERP transformation on the panoramic image, and mapping the planar panoramic image to a spherical surface to generate a spherical panoramic image;
s2: simulating a visual window image by adopting a plane tangent to the spherical panoramic image to obtain different image blocks;
s3: in different feature spaces, extracting the salient regions of the image blocks in the feature spaces with different saliency operators to form different salient feature subgraphs, while taking the motion information between image block sequences into account so as to turn image saliency detection into video saliency detection;
s4: fusing the different salient feature subgraphs to generate an overall saliency map.
2. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 1, wherein in step 1, the expression of the spherical panoramic image is:
λ = x / cos φ1 + λ0,  φ = y + φ1
wherein λ is the longitude of the point (x, y) in the rectangular coordinate system of the planar panoramic image after projection onto the spherical surface, φ is the latitude after projection onto the spherical surface, φ1 is the latitude corresponding to the horizontal central axis of the planar panoramic image (taken as 0), and λ0 is the longitude corresponding to the central meridian of the planar panoramic image.
3. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 1, wherein said step 2 comprises the following sub-steps:
s2.1: setting a plane tangent to the spherical surface of the spherical panoramic image, and then projecting a curved surface with a limited angle on the spherical surface in the visual window onto the plane as an image block of a current picture;
s2.2: rotating the visual window by a fixed angle, and moving the plane tangent to the spherical surface to a new longitude and latitude tangent to the center of the window to obtain the next projected image block;
s2.3: and repeating the step S2.2 to obtain a series of image blocks simulating multi-view viewing of the planar panoramic image through the visual window.
4. The method for detecting the saliency of the panoramic video based on the multi-channel features as claimed in claim 3, wherein the plane is a rectangular plane with a fixed length and width and tangent to a spherical surface of the spherical panoramic image, and the curved surfaces with limited visual angles in the visual window are all mapped onto the rectangular plane.
5. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 1, wherein said step 3 comprises the following sub-steps:
s3.1: extracting the statistical feature subgraph f1(s) of the image block, based on different layers and orientations of the sideband pyramid domain, for each pixel s:
constructing a controllable pyramid model on the grayscale image of the image block of the planar panoramic image; calculating histograms of the pictures at different spatial frequencies and orientations to estimate the probability density distributions, and performing a weighted linear addition of the results over different levels and orientations to obtain the statistical feature subgraph f1(s), as follows:
f1(s) = Σ_{k∈W} α_k · (−log P_k(I_s))
wherein α_k represents the weight for each orientation and level, the vertical and horizontal directions being given the same weight and the weights between different frequency components being assigned by a function; P_k represents the probability of the corresponding luminance in subband k of the pyramid W; and I_s represents the luminance of the pixel s;
s3.2: extraction of color feature subgraph f of image block based on pixel s2(s):
Calculating the distribution of the image in the image block in three channels of RGB and integrating to obtain a color characteristic value O(s) of a pixel s as shown in the following formula:
$$O(s) = \sum_{c} \lambda_c \bigl(-\log P_c(I_s)\bigr)$$
wherein $\lambda_c$ is a weight for each color channel, learned through the conversion of RGB luminance values to a given color format (YUV); $P_c$ represents the corresponding probability of the luminance in the different color channels;
Multiplying the CIELAB-space color distance by a weight based on the spatial distance between pixels and normalizing gives:
$$f_2(s) = \frac{1}{k_s} \sum_{s' \in \Omega} g_d(s, s')\, C(I_s, I_{s'}), \qquad C(I_s, I_{s'}) = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2}$$
wherein $k_s$ is the normalizing denominator term; $C$ computes the color distance in CIELAB space; the function $g_d$ sets a weight based on the spatial distance between pixels; $s'$ denotes another pixel in space and $I_{s'}$ its corresponding luminance; $\Omega$ denotes the set of pixels of the image block; and $\Delta L^*$, $\Delta a^*$, $\Delta b^*$ respectively denote the distances between the two pixels on the three CIELAB components;
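A brute-force sketch of the $f_2(s)$ computation: each pixel's score is the $g_d$-weighted average CIELAB color distance to every other pixel in the block, normalized by $k_s$. A Gaussian spatial weight and the O(N²) pairwise evaluation are illustrative choices, and the σ parameter is an assumption:

```python
import numpy as np

def lab_contrast_saliency(lab, sigma=8.0):
    """f2(s): for each pixel, the spatially weighted average CIELAB color
    distance to every other pixel in the block, normalized by the sum of
    spatial weights k_s. O(N^2) pairwise -- small image blocks only."""
    H, W, _ = lab.shape
    pos = np.indices((H, W)).reshape(2, -1).T.astype(float)
    col = lab.reshape(-1, 3).astype(float)
    d_pos = np.linalg.norm(pos[:, None] - pos[None], axis=-1)
    g = np.exp(-(d_pos ** 2) / (2 * sigma ** 2))               # g_d weight
    d_col = np.linalg.norm(col[:, None] - col[None], axis=-1)  # CIELAB delta-E
    ks = g.sum(axis=1)                                         # k_s term
    return ((g * d_col).sum(axis=1) / ks).reshape(H, W)
```

A pixel whose Lab value differs from everything around it accumulates large weighted distances and therefore stands out in the resulting map.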
s3.3: extracting the local symmetry feature subgraph $f_3$ of the image block:
Detecting the local symmetry axes of the image in the image block, and taking the result as the local symmetry feature subgraph $f_3$ of the image block;
S3.4: extracting the semantic feature subgraph $f_4$ of the image block:
Extracting high-order features of the image in the image block (including persons, cars, and faces) with a target detection algorithm to obtain the semantic feature subgraph $f_4$ of the image block;
S3.5: extracting the motion information feature subgraph $f_5$ of the image block:
Detecting the sequence of image blocks in the visual window so that motion information is incorporated into the detection, and taking the result as the motion information feature subgraph $f_5$ of the group of image blocks.
6. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 5, wherein in S3.1 the steerable pyramid model is constructed at each layer with spatial filters of different orientations and bandwidths, and these spatial filters are applied to extract information in different directions of the grayscale image; and/or
in S3.1, the weights between different frequency components are assigned by a contrast sensitivity function (CSF).
7. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 5, wherein in S3.4 the high-order features of the images in the image blocks are extracted with a target detection algorithm based on a mixture of multi-scale deformable part models, while an image pyramid is used to extract features of the images at different levels.
8. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 5, wherein in S3.5 the Lucas-Kanade (LK) optical flow method is adopted to detect the sequence of image blocks in the visual window.
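Claim 8 names the LK optical flow method for step S3.5. A minimal single-point, single-scale Lucas-Kanade estimator is sketched below; a practical implementation (e.g. the pyramidal variant) would add coarse-to-fine refinement, and the window size and gradient scheme here are illustrative:

```python
import numpy as np

def lucas_kanade(prev, curr, x, y, win=9):
    """Estimate the optical flow (u, v) at pixel (x, y) by solving the
    Lucas-Kanade least-squares system over a win x win window."""
    prev = prev.astype(float)
    curr = curr.astype(float)
    # Central-difference spatial gradients and temporal difference
    Ix = np.zeros_like(prev)
    Iy = np.zeros_like(prev)
    Ix[:, 1:-1] = (prev[:, 2:] - prev[:, :-2]) / 2.0
    Iy[1:-1, :] = (prev[2:, :] - prev[:-2, :]) / 2.0
    It = curr - prev
    # Stack the brightness-constancy equations Ix*u + Iy*v = -It
    r = win // 2
    window = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
    A = np.stack([Ix[window].ravel(), Iy[window].ravel()], axis=1)
    b = -It[window].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```

Applied densely (or on a grid) over consecutive image blocks of the visual window, the resulting flow magnitudes give the motion information feature subgraph $f_5$.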
9. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 1, wherein in step S4 feature fusion is performed on the different feature subgraphs obtained in step S3 by a linear weighting method.
10. The method for detecting the saliency of panoramic video based on multi-channel features as claimed in claim 9, wherein during feature fusion a lower weight is assigned to the saliency feature subgraphs corresponding to visual windows at high latitudes, so as to suppress the likelihood of salient regions at the two poles.
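The linear fusion of claim 9 and the pole suppression of claim 10 can be sketched together: the feature subgraphs of each visual window are combined with fixed linear weights, and each window's fused map is then scaled by a latitude-dependent factor. Using cos(latitude) as the down-weighting is an assumption for illustration; the claims only require that high-latitude windows receive a lower weight:

```python
import numpy as np

def fuse_windows(window_features, feature_weights, window_lats):
    """Linearly fuse the feature subgraphs of each visual window, then
    down-weight each window by cos(latitude of its center) to suppress
    spurious saliency near the two poles."""
    fused = []
    for maps, lat in zip(window_features, window_lats):
        combined = sum(w * m for w, m in zip(feature_weights, maps))
        fused.append(np.cos(lat) * combined)
    return fused
```

An equatorial window (latitude 0) keeps its fused response unchanged, while a window near a pole is attenuated toward zero.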
CN201911000029.9A 2019-10-21 2019-10-21 Panoramic video significance detection method based on multichannel characteristics Active CN110827193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911000029.9A CN110827193B (en) 2019-10-21 2019-10-21 Panoramic video significance detection method based on multichannel characteristics


Publications (2)

Publication Number Publication Date
CN110827193A true CN110827193A (en) 2020-02-21
CN110827193B CN110827193B (en) 2023-05-09

Family

ID=69549745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911000029.9A Active CN110827193B (en) 2019-10-21 2019-10-21 Panoramic video significance detection method based on multichannel characteristics

Country Status (1)

Country Link
CN (1) CN110827193B (en)


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150310303A1 (en) * 2014-04-29 2015-10-29 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
CN105488812A (en) * 2015-11-24 2016-04-13 江南大学 Motion-feature-fused space-time significance detection method
CN106780297A (en) * 2016-11-30 2017-05-31 天津大学 Image high registration accuracy method under scene and Varying Illumination
CN106899840A (en) * 2017-03-01 2017-06-27 北京大学深圳研究生院 Panoramic picture mapping method
CN106951829A (en) * 2017-02-23 2017-07-14 南京邮电大学 A kind of notable method for checking object of video based on minimum spanning tree
CN108462868A (en) * 2018-02-12 2018-08-28 叠境数字科技(上海)有限公司 The prediction technique of user's fixation point in 360 degree of panorama VR videos
CN109064444A (en) * 2018-06-28 2018-12-21 东南大学 Track plates Defect inspection method based on significance analysis
CN109166178A (en) * 2018-07-23 2019-01-08 中国科学院信息工程研究所 A kind of significant drawing generating method of panoramic picture that visual characteristic is merged with behavioral trait and system
CN110070073A (en) * 2019-05-07 2019-07-30 国家广播电视总局广播电视科学研究院 Pedestrian's recognition methods again of global characteristics and local feature based on attention mechanism


Non-Patent Citations (2)

Title
张乾; 邓向冬; 宁金辉; 王惠明; 孙岩; 欧臻彦; 韦安明: "Research on the evaluation and improvement of home image quality for cable digital TV subscribers" *
苏群: "Saliency detection of panoramic video and its application in encoding and transmission" *

Cited By (15)

Publication number Priority date Publication date Assignee Title
CN111488886B (en) * 2020-03-12 2023-04-28 上海交通大学 Panoramic image significance prediction method, system and terminal for arranging attention features
CN111488886A (en) * 2020-03-12 2020-08-04 上海交通大学 Panorama image significance prediction method and system with attention feature arrangement and terminal
CN111488888A (en) * 2020-04-10 2020-08-04 腾讯科技(深圳)有限公司 Image feature extraction method and human face feature generation device
CN111832414A (en) * 2020-06-09 2020-10-27 天津大学 Animal counting method based on graph regular optical flow attention network
CN114529589A (en) * 2020-11-05 2022-05-24 北京航空航天大学 Panoramic video browsing interaction method
CN114529589B (en) * 2020-11-05 2024-05-24 北京航空航天大学 Panoramic video browsing interaction method
WO2022126921A1 (en) * 2020-12-18 2022-06-23 平安科技(深圳)有限公司 Panoramic picture detection method and device, terminal, and storage medium
CN113569636A (en) * 2021-06-22 2021-10-29 中国科学院信息工程研究所 Fisheye image feature processing method and system based on spherical features and electronic equipment
CN113569636B (en) * 2021-06-22 2023-12-05 中国科学院信息工程研究所 Fisheye image feature processing method and system based on spherical features and electronic equipment
CN114639171A (en) * 2022-05-18 2022-06-17 松立控股集团股份有限公司 Panoramic safety monitoring method for parking lot
CN114898120A (en) * 2022-05-27 2022-08-12 杭州电子科技大学 360-degree image salient target detection method based on convolutional neural network
CN115131589A (en) * 2022-08-31 2022-09-30 天津艺点意创科技有限公司 Image generation method for intelligent design of Internet literary works
CN115131589B (en) * 2022-08-31 2022-11-22 天津艺点意创科技有限公司 Image generation method for intelligent design of Internet literary works
CN117036154A (en) * 2023-08-17 2023-11-10 中国石油大学(华东) Panoramic video fixation point prediction method without head display and distortion
CN117036154B (en) * 2023-08-17 2024-02-02 中国石油大学(华东) Panoramic video fixation point prediction method without head display and distortion

Also Published As

Publication number Publication date
CN110827193B (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN110827193B (en) Panoramic video significance detection method based on multichannel characteristics
Lebreton et al. GBVS360, BMS360, ProSal: Extending existing saliency prediction models from 2D to omnidirectional images
Xu et al. Arid: A new dataset for recognizing action in the dark
US20210279971A1 (en) Method, storage medium and apparatus for converting 2d picture set to 3d model
CN109684925B (en) Depth image-based human face living body detection method and device
US5802220A (en) Apparatus and method for tracking facial motion through a sequence of images
DE112018007721T5 (en) Acquire and modify 3D faces using neural imaging and time tracking networks
CN110650368A (en) Video processing method and device and electronic equipment
CN108134937B (en) Compressed domain significance detection method based on HEVC
US20180144212A1 (en) Method and device for generating an image representative of a cluster of images
US20180357819A1 (en) Method for generating a set of annotated images
Xu et al. Saliency prediction on omnidirectional image with generative adversarial imitation learning
CN107749066A (en) A kind of multiple dimensioned space-time vision significance detection method based on region
CN106993188B (en) A kind of HEVC compaction coding method based on plurality of human faces saliency
CN106156714A (en) The Human bodys' response method merged based on skeletal joint feature and surface character
Han et al. A mixed-reality system for broadcasting sports video to mobile devices
CN107481067B (en) Intelligent advertisement system and interaction method thereof
CN108141568A (en) Osd information generation video camera, osd information synthesis terminal device 20 and the osd information shared system being made of it
CN112633217A (en) Human face recognition living body detection method for calculating sight direction based on three-dimensional eyeball model
CN109523590B (en) 3D image depth information visual comfort evaluation method based on sample
CN104298961B (en) Video method of combination based on Mouth-Shape Recognition
CN113673567A (en) Panorama emotion recognition method and system based on multi-angle subregion self-adaption
CN112954313A (en) Method for calculating perception quality of panoramic image
CN112488165A (en) Infrared pedestrian identification method and system based on deep learning model
CN113805824A (en) Electronic device and method for displaying image on display equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant