CN107886533A - Visual saliency detection method, device, equipment and storage medium for stereo images - Google Patents
Classifications
- G06T7/593 — Image analysis; depth or shape recovery from multiple images, from stereo images
- G06T7/90 — Image analysis; determination of colour characteristics
- G06T2207/10012 — Image acquisition modality: stereo images
- G06T2207/10024 — Image acquisition modality: color image
- G06T2207/20081 — Special algorithmic details: training; learning
- G06T2207/20084 — Special algorithmic details: artificial neural networks [ANN]
Abstract
The present invention is applicable to the field of computer technology and provides a visual saliency detection method, device, equipment and storage medium for stereo images. The method includes: when a visual saliency detection request for a stereo image is received, first obtaining the color information and depth information of the stereo image; then performing saliency prediction on the color information, on the depth information, and on the color and depth information jointly, to obtain a first, a second and a third saliency prediction respectively; then concatenating the first, second and third saliency predictions with a plurality of preset center-bias priors to obtain multi-channel cascade data; and finally performing multi-channel spatial-difference fusion on the multi-channel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.
Description
Technical field
The invention belongs to the field of computer technology, and in particular relates to a visual saliency detection method, device, equipment and storage medium for stereo images.
Background technology
In recent years, deep learning models such as convolutional neural networks have been widely applied to visual saliency detection and have markedly improved the performance of visual saliency models. As a result, a large number of deep-learning-based 2D visual saliency models have been proposed. Vig et al. were the first to attempt a visual saliency detection model built on convolutional neural networks, named the ensemble of deep networks (eDN). Later, Kummerer et al. proposed a saliency model that uses an existing neural network to extract deep learning features and then uses those features to compute the visual saliency of images. Srinivas et al. designed a saliency model that, owing to the spatial invariance of fully convolutional networks, uses a novel location-based convolutional network to model location-dependent patterns. Huang et al. proposed a deep-neural-network-based saliency method for narrowing the gap between model predictions and human eye-fixation behavior; the model uses image information at different scales and fine-tunes a deep neural network with objective functions based on saliency evaluation metrics. Marcella et al. further proposed a novel saliency attention model for natural images. However, all of these methods were proposed for 2D multimedia applications.
Unlike traditional 2D saliency models, only a small number of saliency models use depth maps to predict human-eye regions of interest in a 3D natural scene, fusing the resulting color and depth feature maps by linear summation to generate a final 3D saliency map. Some 3D image saliency computation models have also been proposed by extending traditional 2D visual saliency models. For example, Neil et al. proposed a stereo attention framework by extending existing attention models from 2D to the binocular domain. Zhang et al. used multiple perceptual stimuli in a stereoscopic visual attention model. To generate the final saliency of a 3D image, some models reweight 2D saliency maps with depth information. Lang et al. used eye-tracking results collected on 2D and 3D images for depth saliency analysis, in which 3D saliency maps are computed by extending previous 2D saliency detection models. Recently, Fang et al. proposed combining information such as color, luminance, texture and depth to generate the saliency map of a 3D image.
Although taking depth features into account has improved the performance of saliency detection models for stereo images, existing saliency detection models still face challenging problems in representing the content of stereo images. Traditional hand-crafted feature extraction methods struggle to extract high-level image semantic information, and traditional stereo-image saliency fusion methods cannot capture the spatial correlation between the color and depth information of a stereo image. In addition, linear fusion methods merge the extracted feature maps by simple summation, without considering their spatial differences. In summary, existing stereo-image saliency detection models lack diversified representations of image content and do not account for the spatial differences between features such as color and depth.
The content of the invention
It is an object of the present invention to provide a visual saliency detection method, device, equipment and storage medium for stereo images, aiming to solve the problem that existing stereo-image visual saliency detection methods produce inaccurate saliency detection because they lack diversified image content representations and ignore the spatial differences between color features and depth features.
In one aspect, the invention provides a visual saliency detection method for stereo images, the method including the following steps:
when a visual saliency detection request for a stereo image is received, obtaining the color information and depth information of the stereo image;
performing saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, performing saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and performing saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image;
concatenating the first saliency prediction, the second saliency prediction and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel cascade data;
performing multi-channel spatial-difference fusion on the multi-channel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereo image.
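As a minimal illustration, the steps above can be sketched end to end in NumPy. The placeholder predictor and the mean-based fusion below are hypothetical stand-ins for the three trained prediction networks and the trained cross-channel fusion network; only the data flow (three predictions, then a cascade with center-bias priors, then fusion into one saliency map) follows the claimed method:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_saliency(feature_map):
    # Hypothetical stand-in for a trained prediction network:
    # any per-pixel scoring of the input channels would do for this sketch.
    return sigmoid(feature_map.mean(axis=0, keepdims=True))

def center_bias_prior(h, w, sigma):
    # One Gaussian-shaped center-bias prior map of shape (1, H, W).
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))[None]

def stereo_saliency(color, depth, sigmas=(10.0, 20.0)):
    s_c = predict_saliency(color)                                     # first prediction
    s_d = predict_saliency(depth)                                     # second prediction
    s_cd = predict_saliency(np.concatenate([color, depth], axis=0))   # third (joint)
    h, w = color.shape[1:]
    priors = [center_bias_prior(h, w, s) for s in sigmas]
    cascade = np.concatenate([s_c, s_d, s_cd] + priors, axis=0)       # multi-channel cascade
    # Stand-in for the cross-channel fusion network: trained convolutional
    # weights would replace this simple channel mean.
    return sigmoid(cascade.mean(axis=0))

color = np.random.rand(3, 32, 48)   # color information (RGB channels)
depth = np.random.rand(1, 32, 48)   # depth information (one channel)
smap = stereo_saliency(color, depth)
```

Running the sketch on a 32x48 input yields one single-channel saliency map of the same spatial size, mirroring the claimed output.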
In another aspect, the invention provides a visual saliency detection device for stereo images, the device including:
an information acquisition unit, configured to obtain the color information and depth information of a stereo image when a visual saliency detection request for the stereo image is received;
a saliency prediction unit, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image;
a channel concatenation unit, configured to concatenate the first saliency prediction, the second saliency prediction and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel cascade data; and
a saliency map acquisition unit, configured to perform multi-channel spatial-difference fusion on the multi-channel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereo image.
In another aspect, the invention further provides an image detection equipment, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the visual saliency detection method for stereo images described above.
In another aspect, the invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the visual saliency detection method for stereo images described above.
The present invention first receives a visual saliency detection request for a stereo image and obtains the color information and depth information of the stereo image; then performs saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, performs saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and performs saliency prediction on the color information and depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image; then concatenates the first, second and third saliency predictions with a plurality of preset center-bias priors to obtain multi-channel cascade data; and finally performs multi-channel spatial-difference fusion on the multi-channel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.
Brief description of the drawings
Fig. 1 is a flowchart of the implementation of the visual saliency detection method for stereo images provided by Embodiment One of the present invention;
Fig. 2 is a schematic structural diagram of the visual saliency detection device for stereo images provided by Embodiment Two of the present invention;
Fig. 3 is a schematic structural diagram of the visual saliency detection device for stereo images provided by Embodiment Three of the present invention; and
Fig. 4 is a schematic structural diagram of the image detection equipment provided by Embodiment Four of the present invention.
Embodiment
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are merely illustrative of the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below in conjunction with specific embodiments:
Embodiment one:
Fig. 1 shows the implementation flow of the visual saliency detection method for stereo images provided by Embodiment One of the present invention. For convenience of description, only the parts related to this embodiment of the invention are shown; the details are as follows:
In step S101, when a visual saliency detection request for a stereo image is received, the color information and depth information of the stereo image are obtained.
This embodiment of the invention applies to a visual saliency detection system for stereo images, which predicts the positions a user attends to in a 3D natural scene and generates the corresponding saliency map. In this embodiment, when a visual saliency detection request for a stereo image is received, the color information and depth information of the stereo image are obtained for the subsequent visual saliency computation. The stereo image may be contained in the visual saliency detection request, or it may be transmitted independently.
In step S102, saliency prediction is performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image; saliency prediction is performed on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image; and saliency prediction is performed on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image.
In this embodiment of the invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer and an output layer; the depth saliency prediction network likewise includes a preset number of stacked convolutional modules, a classification layer, a linear interpolation layer and an output layer; and the joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer and an output layer.
Preferably, when saliency prediction is performed on the color information through the preset color saliency prediction network, feature extraction is first performed on the color information by the preset number of convolutional layers in the network to obtain a corresponding color feature map. The color feature map is then classified by the classification layer of the network, generating a dense color saliency prediction map and thereby diversifying the representation of image content; the classification layer includes one 3x3 convolution kernel and one output channel. Finally, according to the spatial resolution of the stereo image, the dense color saliency prediction map is upsampled by the linear interpolation layer of the network, a sigmoid (cross-entropy) operation is applied to the upsampled image, and the first saliency prediction of the stereo image (the color saliency prediction) is obtained and output through the output layer, thereby diversifying the image features used to characterize the color of the stereo image.
Preferably, when saliency prediction is performed on the depth information through the preset depth saliency prediction network, feature extraction is first performed on the depth information by the preset number of convolutional layers in the network to obtain a corresponding depth feature map. The depth feature map is then classified by the classification layer of the network, generating a dense depth saliency prediction map and thereby diversifying the representation of image content; the classification layer includes one 3x3 convolution kernel and one output channel. Finally, according to the spatial resolution of the stereo image, the dense depth saliency prediction map is upsampled by the linear interpolation layer of the network, a sigmoid (cross-entropy) operation is applied to the upsampled image, and the second saliency prediction of the stereo image (the depth saliency prediction) is obtained and output through the output layer, thereby diversifying the image features used to characterize the depth of the stereo image.
Specifically, when feature extraction is performed on the color information and the depth information by the preset number of convolutional layers, each convolutional layer can extract features according to a preset formula of the form F^l = R(W^l * F^(l-1) + b^l), where W^l and b^l denote randomly initialized convolution filter parameters and n ∈ {64, 128, 256, 512} denotes the total number of filters in layer l, finally yielding the color feature map Fc or the depth feature map Fd. The linear interpolation layer then performs the sigmoid (cross-entropy) operation according to the formula Sc/d = sigmoid(↑(ωi * Fc/d + bi)), where ωi and bi denote the weight vector and bias for pixel i respectively, ↑ denotes the upsampling operation, sigmoid denotes the sigmoid (cross-entropy) function, and Sc/d is the result of applying the operation to the upsampled image.
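Under the formulas above, one prediction stream amounts to stacked 3x3 convolutions with ReLU, a single-channel 3x3 classification convolution, upsampling to the input resolution, and a sigmoid. The sketch below uses random untrained filters, small layer widths, and nearest-neighbour upsampling in place of the linear interpolation layer, so it demonstrates only the shapes and operations, not a trained predictor:

```python
import numpy as np

def conv3x3(x, w, b):
    """'Same'-padded 3x3 convolution: x is (C,H,W), w is (N,C,3,3), b is (N,)."""
    c, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            patch = xp[:, i:i + h, j:j + wd]
            out += np.tensordot(w[:, :, i, j], patch, axes=([1], [0]))
    return out + b[:, None, None]

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
color = rng.random((3, 16, 16))   # toy color input

# Two stacked feature-extraction layers (the patent uses more, with 64..512 filters).
w1, b1 = 0.1 * rng.standard_normal((8, 3, 3, 3)), np.zeros(8)
w2, b2 = 0.1 * rng.standard_normal((8, 8, 3, 3)), np.zeros(8)
f_c = relu(conv3x3(relu(conv3x3(color, w1, b1)), w2, b2))   # feature map Fc

# Classification layer: one 3x3 kernel, one output channel -> dense prediction map.
wc, bc = 0.1 * rng.standard_normal((1, 8, 3, 3)), np.zeros(1)
dense = conv3x3(f_c, wc, bc)

# Upsample to the stereo image's resolution (nearest-neighbour here), then sigmoid.
up = dense.repeat(2, axis=1).repeat(2, axis=2)
s_c = sigmoid(up)   # first saliency prediction Sc
```

The depth stream is identical in structure, operating on the one-channel depth map instead of the color channels.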
Preferably, when saliency prediction is performed on the color information and the depth information through the preset joint saliency prediction network, feature extraction is first performed on the color information and the depth information by the two fully convolutional network streams of the network, obtaining the corresponding color feature map and depth feature map. The obtained color and depth feature maps are then cascaded into a joint color-depth feature map, which is classified by the classification layer of the network, generating a dense joint color-depth saliency prediction map and thereby diversifying the representation of image content. Finally, according to the spatial resolution of the stereo image, the dense joint saliency prediction map is upsampled by the linear interpolation layer of the network, a sigmoid (cross-entropy) operation is applied to the upsampled image, and the third saliency prediction of the stereo image (the joint color and depth saliency prediction) is obtained and output through the output layer, thereby diversifying the image features used to characterize the stereo image while realizing the computation of the spatial differences between the color features and the depth features. Here, each fully convolutional network stream consists of the preset number of stacked convolutional layers, and the classification layer includes one 3x3 convolution kernel and one output channel.
Specifically, when the obtained color feature map and depth feature map are cascaded, the 'Concat' layer first performs the feature cascade according to the formula Fc&d = Concat(Fc, Fd), obtaining the joint color-depth feature map Fc&d. The linear interpolation layer then performs the sigmoid (cross-entropy) operation according to the formula Sc&d = sigmoid(↑(ωi * Fc&d + bi)), where Sc&d is the result of applying the operation to the upsampled image.
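Under the formulas above, the joint stream's feature cascade is a channel-wise concatenation followed by the shared classification, upsampling and sigmoid steps. In this sketch the two streams' CNN feature maps are faked with random arrays, and a 1x1 channel mixing stands in for the 3x3 classification convolution:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
# Stand-ins for the color and depth feature maps Fc, Fd produced by the two
# fully convolutional network streams (8 channels each, 12x16 spatial size).
f_c = rng.random((8, 12, 16))
f_d = rng.random((8, 12, 16))

# 'Concat' layer: Fc&d = Concat(Fc, Fd) along the channel axis.
f_cd = np.concatenate([f_c, f_d], axis=0)

# Classification step collapses the joint features to one channel (1x1 channel
# mixing replaces the patent's 3x3 convolution for brevity), then upsample + sigmoid.
w = 0.1 * rng.standard_normal(f_cd.shape[0])
dense = np.tensordot(w, f_cd, axes=([0], [0]))[None]
s_cd = sigmoid(dense.repeat(2, axis=1).repeat(2, axis=2))   # third saliency prediction Sc&d
```

The concatenation doubles the channel count while leaving the spatial size untouched, which is what lets the shared classification layer see color and depth features jointly at every pixel.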
In step S103, the obtained first, second and third saliency predictions are concatenated with a plurality of preset center-bias priors to obtain multi-channel cascade data.
In this embodiment of the invention, the architecture of the preset cross-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer and an output layer. The cross-channel fusion network is used to fuse the spatial differences of center-dependent patterns and visual features, thereby improving the completeness and display quality of the saliency map.
In this embodiment of the invention, because image content and capture environments differ, the center bias is diverse rather than unique. Therefore, in order to learn center-surround features, the obtained first, second and third saliency predictions and the plurality of preset center-bias priors Icb are concatenated according to the formula SIC = Concat(Sc, Sd, Sc&d, Icb), generating the n-channel cascade data SIC.
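The cascade step itself is a plain channel concatenation of the three prediction maps with the center-bias prior maps Icb. The patent does not fix the priors' exact form here; the sketch below assumes isotropic Gaussians of several widths, a common way to model a non-unique center bias:

```python
import numpy as np

def gaussian_prior(h, w, sigma):
    """One isotropic Gaussian center-bias prior map of shape (1, H, W)."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))[None]

h, w = 32, 32
rng = np.random.default_rng(2)
# Stand-ins for the three saliency predictions Sc, Sd, Sc&d (one channel each).
s_c, s_d, s_cd = (rng.random((1, h, w)) for _ in range(3))

# Several priors of different widths model the diverse, non-unique center bias.
priors = [gaussian_prior(h, w, s) for s in (4.0, 8.0, 16.0)]

# SIC = Concat(Sc, Sd, Sc&d, Icb): the n-channel cascade data.
s_ic = np.concatenate([s_c, s_d, s_cd] + priors, axis=0)
```

With three predictions and three priors the cascade data has n = 6 channels; the fusion network downstream learns how to weight them per pixel.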
In step S104, multi-channel spatial-difference fusion is performed on the multi-channel cascade data through the preset cross-channel fusion network to obtain the saliency map of the stereo image.
In this embodiment of the invention, preferably, when multi-channel spatial-difference fusion is performed on the multi-channel cascade data through the preset cross-channel fusion network, the multi-channel cascade data is first input into the two convolutional layers of the network, each with 3x3 kernels, to obtain respectively the visual features and the center-bias patterns of the dense saliency prediction map. The regression convolutional layer of the network then performs a convolution-regression operation on the visual features and the center-bias patterns, computing the saliency map of the stereo image according to a formula of the form S3d = Sigmoid(ωr * R(ω * SIC + b) + br), so that by computing the spatial-difference information between the color features and the depth features, the accuracy of saliency detection is improved. Here the regression convolution classification layer includes one 3x3 convolution kernel and one output channel, Icb denotes the plurality of center-bias priors, R denotes the ReLU nonlinear operation, 'Sigmoid' is a cost function, and S3d denotes the saliency map of the stereo image.
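A sketch of the fusion stage under the description above: two parallel branches over the cascade data (one for visual features, one for center-bias patterns), the ReLU R, and a single-channel regression step with a sigmoid. The 1x1 channel mixing and random weights below are simplifications standing in for the network's trained 3x3 convolutions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mix(x, w):
    """1x1 'convolution': mixes channels of x (C,H,W) with weights w (N,C)."""
    return np.tensordot(w, x, axes=([1], [0]))

rng = np.random.default_rng(3)
s_ic = rng.random((6, 32, 32))              # multi-channel cascade data SIC

# Two parallel convolutional branches over the cascade data.
w_vis = 0.1 * rng.standard_normal((4, 6))   # visual-feature branch
w_cb = 0.1 * rng.standard_normal((4, 6))    # center-bias-pattern branch
feats = np.maximum(mix(s_ic, w_vis), 0.0)   # R(...) = ReLU
bias = np.maximum(mix(s_ic, w_cb), 0.0)

# Regression step fuses both branches into the final one-channel saliency map S3d.
w_r = 0.1 * rng.standard_normal((1, 8))
s_3d = sigmoid(mix(np.concatenate([feats, bias], axis=0), w_r))[0]
```

The output is a single-channel map at the cascade data's resolution, with values squashed into (0, 1) by the sigmoid, as the final saliency map should be.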
Embodiment two:
Fig. 2 shows the structure of the visual saliency detection device for stereo images provided by Embodiment Two of the present invention. For convenience of description, only the parts related to this embodiment of the invention are shown, including:
an information acquisition unit 21, configured to obtain the color information and depth information of a stereo image when a visual saliency detection request for the stereo image is received.
In this embodiment of the invention, when a visual saliency detection request for a stereo image is received, the information acquisition unit 21 obtains the color information and depth information of the stereo image for the subsequent visual saliency computation. The stereo image may be contained in the visual saliency detection request, or it may be transmitted independently.
a saliency prediction unit 22, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image.
In this embodiment of the invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer and an output layer; the depth saliency prediction network likewise includes a preset number of stacked convolutional modules, a classification layer, a linear interpolation layer and an output layer; and the joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer and an output layer.
a channel concatenation unit 23, configured to concatenate the obtained first, second and third saliency predictions with a plurality of preset center-bias priors to obtain multi-channel cascade data.
In this embodiment of the invention, the architecture of the preset cross-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer and an output layer. The cross-channel fusion network is used to fuse the spatial differences of center-dependent patterns and visual features, thereby improving the completeness and display quality of the saliency map.
In this embodiment of the invention, because image content and capture environments differ, the center bias is diverse rather than unique. Therefore, in order to learn center-surround features, the channel concatenation unit 23 concatenates the obtained first, second and third saliency predictions and the plurality of preset center-bias priors Icb according to the formula SIC = Concat(Sc, Sd, Sc&d, Icb), generating the n-channel cascade data SIC.
a saliency map acquisition unit 24, configured to perform multi-channel spatial-difference fusion on the multi-channel cascade data through the preset cross-channel fusion network to obtain the saliency map of the stereo image.
In this embodiment of the invention, when a visual saliency detection request for a stereo image is received, the information acquisition unit 21 first obtains the color information and depth information of the stereo image; the saliency prediction unit 22 then performs saliency prediction on the color information, on the depth information, and on the color and depth information jointly, obtaining the first, second and third saliency predictions; the channel concatenation unit 23 then concatenates the obtained first, second and third saliency predictions with the plurality of preset center-bias priors to obtain the multi-channel cascade data; finally, the saliency map acquisition unit 24 performs multi-channel spatial-difference fusion on the multi-channel cascade data through the preset cross-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.
In this embodiment of the invention, each unit of the visual saliency detection device for stereo images may be realized by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into a single software or hardware unit, without this limiting the present invention.
Embodiment three:
Fig. 3 shows the structure of the visual saliency detection device for stereo images provided by Embodiment Three of the present invention. For convenience of description, only the parts related to this embodiment of the invention are shown, including:
an information acquisition unit 31, configured to obtain the color information and depth information of a stereo image when a visual saliency detection request for the stereo image is received.
In this embodiment of the invention, when a visual saliency detection request for a stereo image is received, the information acquisition unit 31 obtains the color information and depth information of the stereo image for the subsequent visual saliency computation. The stereo image may be contained in the visual saliency detection request, or it may be transmitted independently.
a saliency prediction unit 32, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image.
In this embodiment of the invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer and an output layer; the depth saliency prediction network likewise includes a preset number of stacked convolutional modules, a classification layer, a linear interpolation layer and an output layer; and the joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer and an output layer.
Preferably, when saliency prediction is performed on the color information through the preset color saliency prediction network, feature extraction is first performed on the color information by the preset number of convolutional layers in the network to obtain a corresponding color feature map. The color feature map is then classified by the classification layer of the network, generating a dense color saliency prediction map and thereby diversifying the representation of image content; the classification layer includes one 3x3 convolution kernel and one output channel. Finally, according to the spatial resolution of the stereo image, the dense color saliency prediction map is upsampled by the linear interpolation layer of the network, a sigmoid (cross-entropy) operation is applied to the upsampled image, and the first saliency prediction of the stereo image is obtained and output through the output layer, thereby diversifying the image features used to characterize the color of the stereo image.
Preferably, when saliency prediction is performed on the depth information through the preset depth saliency prediction network, feature extraction is first performed on the depth information by the preset number of convolutional layers in the depth saliency prediction network to obtain a corresponding depth feature map. The classification layer in the depth saliency prediction network, which includes one 3x3 convolution kernel and one output channel, then classifies the depth feature map to generate a dense depth saliency prediction map, thereby enriching the representation of the image content. Finally, according to the spatial resolution of the stereoscopic image, the linear interpolation layer in the depth saliency prediction network upsamples the dense depth saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the second saliency prediction of the stereoscopic image is obtained and output through the output layer, thereby enriching the image features used to characterize the depth of the stereoscopic image.
Specifically, when feature extraction is performed on the color information and the depth information by the preset number of convolutional layers, each convolutional layer may extract features according to a preset formula of the form F^l = max(0, W^l * F^(l-1) + b^l), where W^l and b^l denote randomly initialized convolution filter parameters and n ∈ {64, 128, 256, 512} denotes the total number of filters in layer l, finally yielding a color feature map F_c or a depth feature map F_d. The linear interpolation layer then performs the cross-entropy operation according to the formula S_c/d = sigmoid(↑(ω_i * F_c/d + b_i)), where ω_i and b_i respectively denote the weight vector and the bias for pixel i, ↑ denotes the upsampling operation, sigmoid denotes the cross-entropy operation, and S_c/d is the result of performing the cross-entropy operation on the upsampled image.
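The per-layer feature extraction and the upsample-then-sigmoid prediction described above can be illustrated with a minimal NumPy sketch. All helper names (`conv3x3`, `predict_stream`), the random weights, the tiny feature sizes, and the 2x nearest-neighbour upsample standing in for the linear interpolation layer are illustrative assumptions, not the patented network:

```python
import numpy as np

def conv3x3(x, w, b):
    """3x3 convolution with zero padding; x: (H, W, Cin), w: (3, 3, Cin, Cout), b: (Cout,)."""
    H, W, _ = x.shape
    out = np.zeros((H, W, w.shape[-1]))
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    for i in range(H):
        for j in range(W):
            # Contract the 3x3xCin receptive field against the filter bank.
            out[i, j] = np.tensordot(xp[i:i+3, j:j+3, :], w, axes=3) + b
    return out

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_stream(x, conv_params, cls_w, cls_b, scale):
    """One saliency stream: stacked convs -> 1-channel classifier -> upsample -> sigmoid."""
    f = x
    for w, b in conv_params:                 # F^l = max(0, W^l * F^(l-1) + b^l)
        f = np.maximum(conv3x3(f, w, b), 0.0)
    s = conv3x3(f, cls_w, cls_b)             # classification layer: 3x3 kernel, 1 output channel
    s = np.repeat(np.repeat(s, scale, axis=0), scale, axis=1)  # stand-in for linear interpolation
    return sigmoid(s)                        # S_c/d = sigmoid(up(w_i * F_c/d + b_i))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 3))           # toy "color information" input
params = [(rng.standard_normal((3, 3, 3, 4)) * 0.1, np.zeros(4))]
cls_w = rng.standard_normal((3, 3, 4, 1)) * 0.1
s = predict_stream(x, params, cls_w, np.zeros(1), scale=2)
print(s.shape)                               # (16, 16, 1)
```

The same stream applies unchanged to the depth information, only the input and the learned filters differ.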
Preferably, when saliency prediction is performed on the color information and the depth information through the preset joint saliency prediction network, feature extraction is first performed on the color information and the depth information by the two fully convolutional network streams in the joint saliency prediction network to obtain a corresponding color feature map and depth feature map. The obtained color feature map and depth feature map are then cascaded to obtain a color-and-depth joint feature map, which the classification layer in the joint saliency prediction network classifies to generate a dense color-and-depth joint saliency prediction map, thereby enriching the representation of the image content. Finally, according to the spatial resolution of the stereoscopic image, the linear interpolation layer in the joint saliency prediction network upsamples the dense color-and-depth joint saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the third saliency prediction of the stereoscopic image is obtained and output through the output layer, thereby enriching the image features characterizing the depth of the stereoscopic image while computing the spatial difference between the color features and the depth features. Each fully convolutional network stream consists of a preset number of stacked convolutional layers, and the classification layer includes one 3x3 convolution kernel and one output channel.
Specifically, when the obtained color feature map and depth feature map are cascaded, the 'Concat' layer first cascades the features according to the formula F_c&d = Concat(F_c, F_d) to obtain the color-and-depth joint feature map F_c&d. The linear interpolation layer then performs the cross-entropy operation according to the formula S_c&d = Sigmoid(↑(ω_i * F_c&d + b_i)), where S_c&d is the result of performing the cross-entropy operation on the upsampled image.
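The 'Concat' cascade and the subsequent one-channel prediction can be sketched as follows. This is a minimal sketch: the 1x1 classifier weights and the 2x nearest-neighbour upsample are simplifying stand-ins for the 3x3 classification layer and the linear interpolation layer, and all sizes are toy values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
F_c = rng.standard_normal((8, 8, 4))   # color feature map from stream 1
F_d = rng.standard_normal((8, 8, 4))   # depth feature map from stream 2

# 'Concat' layer: F_c&d = Concat(F_c, F_d) along the channel axis.
F_cd = np.concatenate([F_c, F_d], axis=-1)            # (8, 8, 8)

# One-output-channel classification (1x1 stand-in for the 3x3 kernel),
# then upsample and apply the sigmoid: S_c&d = Sigmoid(up(w_i * F_c&d + b_i)).
w, b = rng.standard_normal(F_cd.shape[-1]) * 0.1, 0.0
logits = F_cd @ w + b                                  # (8, 8)
S_cd = sigmoid(np.repeat(np.repeat(logits, 2, axis=0), 2, axis=1))
print(S_cd.shape)                                      # (16, 16)
```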
Channel cascading unit 33 is configured to cascade the obtained first saliency prediction, second saliency prediction, and third saliency prediction with a plurality of preset center-bias priors to obtain multichannel cascade data.
In an embodiment of the present invention, the architecture of the preset cross-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer, and an output layer. The cross-channel fusion network fuses the center-dependence patterns with the spatial differences of the visual features, thereby improving the completeness and display quality of the saliency map.
In an embodiment of the present invention, because image content and capture environments differ, the center bias is diverse rather than unique. Therefore, in order to learn center-surround features, channel cascading unit 33 cascades the obtained first saliency prediction, second saliency prediction, and third saliency prediction with the plurality of preset center-bias priors I_cb according to the formula S_IC = Concat(S_c, S_d, S_c&d, I_cb), generating n-channel cascade data S_IC.
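The channel cascade S_IC = Concat(S_c, S_d, S_c&d, I_cb) can be illustrated as below. The Gaussian form and the three spreads of the center-bias priors are illustrative assumptions; the text only states that multiple priors are preset:

```python
import numpy as np

H, W = 16, 16
rng = np.random.default_rng(2)
S_c  = rng.random((H, W, 1))   # first saliency prediction (color)
S_d  = rng.random((H, W, 1))   # second saliency prediction (depth)
S_cd = rng.random((H, W, 1))   # third saliency prediction (joint)

def gaussian_prior(h, w, sigma):
    """Hypothetical center-bias prior: a centered 2-D Gaussian with spread sigma."""
    yy, xx = np.mgrid[0:h, 0:w]
    d2 = (yy - (h - 1) / 2) ** 2 + (xx - (w - 1) / 2) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))[..., None]

# A bank of priors with different spreads models the non-unique center bias.
I_cb = np.concatenate([gaussian_prior(H, W, s) for s in (2, 4, 8)], axis=-1)

# S_IC = Concat(S_c, S_d, S_c&d, I_cb): an n-channel cascade (here n = 6).
S_IC = np.concatenate([S_c, S_d, S_cd, I_cb], axis=-1)
print(S_IC.shape)              # (16, 16, 6)
```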
Saliency map acquiring unit 34 is configured to perform multichannel information spatial-difference fusion on the multichannel cascade data through the preset cross-channel fusion network to obtain the saliency map of the stereoscopic image.
In an embodiment of the present invention, preferably, when multichannel information spatial-difference fusion is performed on the multichannel cascade data through the preset cross-channel fusion network, the multichannel cascade data is first input into the two convolutional layers of the cross-channel fusion network, whose convolution kernel size is 3x3, to respectively obtain the visual features and the center-bias pattern of the dense saliency prediction map. The regression convolutional layer of the cross-channel fusion network then performs convolution and regression operations on the visual features and the center-bias pattern, computing the saliency map of the stereoscopic image according to a formula of the form S_3d = Sigmoid(R(ω * S_IC + b)), so that the spatial-difference information between the color features and the depth features is computed and the accuracy of saliency detection is improved. The regression convolution classification layer includes one 3x3 convolution kernel and one output channel; I_cb denotes the plurality of center-bias priors, R denotes the ReLU non-linear operation, 'Sigmoid' is a cost function, and S_3d denotes the saliency map of the stereoscopic image.
Therefore, preferably, the saliency prediction unit 32 includes:
a feature map acquiring unit 321, configured to perform feature extraction on the color information by the preset number of convolutional layers in the color saliency prediction network to obtain a corresponding color feature map;
a feature classification unit 322, configured to classify the color feature map by the classification layer in the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;
an upsampling unit 323, configured to upsample the dense color saliency prediction map by the linear interpolation layer in the color saliency prediction network according to the spatial resolution of the stereoscopic image; and
a cross-entropy prediction unit 324, configured to perform a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
Preferably, the saliency map acquiring unit 34 includes:
a convolution filtering unit 341, configured to input the multichannel cascade data into the first convolution filter and the second convolution filter of the cross-channel fusion network, each with a convolution kernel size of 3x3, to respectively obtain the visual features and the center-bias pattern of the dense saliency prediction map; and
an acquiring subunit 342, configured to perform convolution and regression operations on the visual features and the center-bias pattern by the regression convolutional layer of the cross-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
In an embodiment of the present invention, each unit of the visual saliency detection apparatus for a stereoscopic image may be implemented by a corresponding hardware or software unit; the units may be independent software or hardware units, or may be integrated into a single software or hardware unit, without limiting the present invention thereto.
Embodiment IV:
Fig. 4 shows the structure of the image detection device provided by Embodiment IV of the present invention. For convenience of description, only the parts related to this embodiment of the present invention are shown.
The image detection device 4 of this embodiment of the present invention includes a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, such as steps S101 to S104 shown in Fig. 1. Alternatively, when executing the computer program 42, the processor 40 implements the functions of the units in the above apparatus embodiments, for example the functions of units 21 to 24 shown in Fig. 2 and units 31 to 34 shown in Fig. 3.
In an embodiment of the present invention, when the processor 40 executes the computer program 42 to implement the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, a visual saliency detection request for a stereoscopic image is first received and the color information and depth information of the stereoscopic image are obtained. Saliency prediction is then performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image. The first saliency prediction, the second saliency prediction, and the third saliency prediction are then cascaded with a plurality of preset center-bias priors to obtain multichannel cascade data, and finally multichannel information spatial-difference fusion is performed on the multichannel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereoscopic image, thereby improving the accuracy of saliency detection.
For the steps implemented when the processor 40 executes the computer program 42 in the image detection device 4, reference may be made to the description of the method in Embodiment I, which is not repeated here.
Embodiment V:
An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, for example steps S101 to S104 shown in Fig. 1; or, when executed by a processor, implements the functions of the units in the above apparatus embodiments, for example the functions of units 21 to 24 shown in Fig. 2 and units 31 to 34 shown in Fig. 3.
In an embodiment of the present invention, a visual saliency detection request for a stereoscopic image is first received and the color information and depth information of the stereoscopic image are obtained. Saliency prediction is then performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image. The first saliency prediction, the second saliency prediction, and the third saliency prediction are then cascaded with a plurality of preset center-bias priors to obtain multichannel cascade data, and finally multichannel information spatial-difference fusion is performed on the multichannel cascade data through a preset cross-channel fusion network to obtain the saliency map of the stereoscopic image, thereby improving the accuracy of saliency detection. For the visual saliency detection method for a stereoscopic image implemented when the computer program is executed by a processor, reference may further be made to the description of the method steps in the preceding method embodiments, which is not repeated here.
The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example a memory such as a ROM/RAM, a magnetic disk, an optical disc, or a flash memory.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (10)
1. A visual saliency detection method for a stereoscopic image, characterized in that the method comprises the steps of:
when a visual saliency detection request for a stereoscopic image is received, obtaining color information and depth information of the stereoscopic image;
performing saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, performing saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and performing saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image;
cascading the first saliency prediction, the second saliency prediction, and the third saliency prediction with a plurality of preset center-bias priors to obtain multichannel cascade data; and
performing multichannel information spatial-difference fusion on the multichannel cascade data through a preset cross-channel fusion network to obtain a saliency map of the stereoscopic image.
2. The method of claim 1, characterized in that the step of performing saliency prediction on the color information through the preset color saliency prediction network comprises:
performing feature extraction on the color information by a preset number of convolutional layers in the color saliency prediction network to obtain a corresponding color feature map;
classifying the color feature map by a classification layer in the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;
upsampling the dense color saliency prediction map by a linear interpolation layer in the color saliency prediction network according to the spatial resolution of the stereoscopic image; and
performing a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
3. The method of claim 1, characterized in that the step of performing saliency prediction on the depth information through the preset depth saliency prediction network comprises:
performing feature extraction on the depth information by a preset number of convolutional layers in the depth saliency prediction network to obtain a corresponding depth feature map;
classifying the depth feature map by a classification layer in the depth saliency prediction network to generate a dense depth saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;
upsampling the dense depth saliency prediction map by a linear interpolation layer in the depth saliency prediction network according to the spatial resolution of the stereoscopic image; and
performing a cross-entropy operation on the upsampled image to obtain the second saliency prediction of the stereoscopic image.
4. The method of claim 1, characterized in that the step of performing saliency prediction on the color information and the depth information through the preset joint saliency prediction network comprises:
performing feature extraction on the color information and the depth information by two fully convolutional network streams in the joint saliency prediction network to obtain a corresponding color feature map and depth feature map;
performing feature cascading on the obtained color feature map and depth feature map to obtain a color-and-depth joint feature map;
classifying the color-and-depth joint feature map by a classification layer in the joint saliency prediction network to generate a dense color-and-depth joint saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;
upsampling the dense color-and-depth joint saliency prediction map by a linear interpolation layer in the joint saliency prediction network according to the spatial resolution of the stereoscopic image; and
performing a cross-entropy operation on the upsampled image to obtain the third saliency prediction of the stereoscopic image.
5. The method of claim 1, characterized in that the step of performing multichannel information spatial-difference fusion on the multichannel cascade data through the preset cross-channel fusion network comprises:
inputting the multichannel cascade data into a first convolution filter and a second convolution filter of the cross-channel fusion network, each with a convolution kernel size of 3x3, to respectively obtain visual features and a center-bias pattern of a dense saliency prediction map; and
performing convolution and regression operations on the visual features and the center-bias pattern by a regression convolutional layer of the cross-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
6. A visual saliency detection apparatus for a stereoscopic image, characterized in that the apparatus comprises:
an information acquisition unit, configured to obtain color information and depth information of a stereoscopic image when a visual saliency detection request for the stereoscopic image is received;
a saliency prediction unit, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image;
a channel cascading unit, configured to cascade the first saliency prediction, the second saliency prediction, and the third saliency prediction with a plurality of preset center-bias priors to obtain multichannel cascade data; and
a saliency map acquiring unit, configured to perform multichannel information spatial-difference fusion on the multichannel cascade data through a preset cross-channel fusion network to obtain a saliency map of the stereoscopic image.
7. The apparatus of claim 6, characterized in that the saliency prediction unit comprises:
a feature map acquiring unit, configured to perform feature extraction on the color information by a preset number of convolutional layers in the color saliency prediction network to obtain a corresponding color feature map;
a feature classification unit, configured to classify the color feature map by a classification layer in the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;
an upsampling unit, configured to upsample the dense color saliency prediction map by a linear interpolation layer in the color saliency prediction network according to the spatial resolution of the stereoscopic image; and
a cross-entropy prediction unit, configured to perform a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
8. The apparatus of claim 6, characterized in that the saliency map acquiring unit comprises:
a convolution filtering unit, configured to input the multichannel cascade data into a first convolution filter and a second convolution filter of the cross-channel fusion network, each with a convolution kernel size of 3x3, to respectively obtain visual features and a center-bias pattern of a dense saliency prediction map; and
an acquiring subunit, configured to perform convolution and regression operations on the visual features and the center-bias pattern by a regression convolutional layer of the cross-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
9. An image detection device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711014924.7A CN107886533B (en) | 2017-10-26 | 2017-10-26 | Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107886533A true CN107886533A (en) | 2018-04-06 |
CN107886533B CN107886533B (en) | 2021-05-04 |
Family
ID=61782458
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711014924.7A Active CN107886533B (en) | 2017-10-26 | 2017-10-26 | Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107886533B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108664967A (en) * | 2018-04-17 | 2018-10-16 | 上海交通大学 | A kind of multimedia page vision significance prediction technique and system |
CN109409435A (en) * | 2018-11-01 | 2019-03-01 | 上海大学 | A kind of depth perception conspicuousness detection method based on convolutional neural networks |
CN110942095A (en) * | 2019-11-27 | 2020-03-31 | 中国科学院自动化研究所 | Method and system for detecting salient object area |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102800086A (en) * | 2012-06-21 | 2012-11-28 | 上海海事大学 | Offshore scene significance detection method |
CN103020993A (en) * | 2012-11-28 | 2013-04-03 | 杭州电子科技大学 | Visual saliency detection method by fusing dual-channel color contrasts |
CN103996040A (en) * | 2014-05-13 | 2014-08-20 | 西北工业大学 | Bottom-up visual saliency generating method fusing local-global contrast ratio |
CN104063872A (en) * | 2014-07-04 | 2014-09-24 | 西安电子科技大学 | Method for detecting salient regions in sequence images based on improved visual attention model |
CN104574375A (en) * | 2014-12-23 | 2015-04-29 | 浙江大学 | Image significance detection method combining color and depth information |
CN104966286A (en) * | 2015-06-04 | 2015-10-07 | 电子科技大学 | 3D video saliency detection method |
CN105404888A (en) * | 2015-11-16 | 2016-03-16 | 浙江大学 | Saliency object detection method integrated with color and depth information |
CN105869173A (en) * | 2016-04-19 | 2016-08-17 | 天津大学 | Stereoscopic vision saliency detection method |
CN106157319A (en) * | 2016-07-28 | 2016-11-23 | 哈尔滨工业大学 | The significance detection method that region based on convolutional neural networks and Pixel-level merge |
CN106462771A (en) * | 2016-08-05 | 2017-02-22 | 深圳大学 | 3D image significance detection method |
CN106997478A (en) * | 2017-04-13 | 2017-08-01 | 安徽大学 | RGB D image well-marked target detection methods based on notable center priori |
US20170300788A1 (en) * | 2014-01-30 | 2017-10-19 | Hrl Laboratories, Llc | Method for object detection in digital image and video using spiking neural networks |
CN107292318A (en) * | 2017-07-21 | 2017-10-24 | 北京大学深圳研究生院 | Image significance object detection method based on center dark channel prior information |
CN107292875A (en) * | 2017-06-29 | 2017-10-24 | 西安建筑科技大学 | A kind of conspicuousness detection method based on global Local Feature Fusion |
Non-Patent Citations (4)
Title |
---|
FERREIRA L等: ""A method to compute saliency regions in 3D video based on fusion of feature maps"", 《PROCEEDINGS OF 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO》 * |
YUMING FANG等: ""Saliency Detection for Stereoscopic Images"", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 * |
WU JIANGUO et al.: ""Salient Object Detection of RGB-D Images Fusing Salient Depth Features"", 《JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY》 *
XU WEI et al.: ""Salient Object Detection Using Hierarchical Prior Estimation"", 《ACTA AUTOMATICA SINICA》 *
Also Published As
Publication number | Publication date |
---|---|
CN107886533B (en) | 2021-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3510561B1 (en) | Predicting depth from image data using a statistical model | |
CN101443817B (en) | Method and device for determining correspondence, preferably for the three-dimensional reconstruction of a scene | |
US8681150B2 (en) | Method, medium, and system with 3 dimensional object modeling using multiple view points | |
KR101393621B1 (en) | Method and system for analyzing a quality of three-dimensional image | |
Feng et al. | Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications | |
CN108961327A (en) | A kind of monocular depth estimation method and its device, equipment and storage medium | |
CN110866509B (en) | Action recognition method, device, computer storage medium and computer equipment | |
CN109690620A (en) | Threedimensional model generating means and threedimensional model generation method | |
CN107301664A (en) | Improved segmented stereo matching method based on similarity measure function | |
CN111563418A (en) | Asymmetric multi-mode fusion significance detection method based on attention mechanism | |
CN103384343B (en) | Method and device for filling image holes | |
CN108345892A (en) | Stereoscopic image saliency detection method, device, equipment and storage medium | |
CN101542529A (en) | Generation of depth map for an image | |
CN105898278B (en) | Stereoscopic video saliency detection method based on binocular multi-dimensional perception features | |
CN107886533A (en) | Visual saliency detection method, device, equipment and storage medium for stereoscopic images | |
JP2014096062A (en) | Image processing method and image processing apparatus | |
CN109644280B (en) | Method for generating hierarchical depth data of scene | |
CN108986197A (en) | 3D skeleton line construction method and device | |
CN110096993A (en) | Object detection apparatus and method for binocular stereo vision | |
Jiang et al. | Quality assessment for virtual reality technology based on real scene | |
Xiao et al. | Multi-scale attention generative adversarial networks for video frame interpolation | |
CN104243970A (en) | Objective quality evaluation method for 3D rendered images based on stereoscopic visual attention mechanism and structural similarity | |
JP6965299B2 (en) | Object detectors, object detection methods, programs, and moving objects | |
CN114494611A (en) | Intelligent three-dimensional reconstruction method, device, equipment and medium based on neural basis functions | |
Northam et al. | Stereoscopic 3D image stylization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||