CN107886533A - Visual saliency detection method, apparatus, device and storage medium for stereo images - Google Patents

Visual saliency detection method, apparatus, device and storage medium for stereo images

Info

Publication number
CN107886533A
CN107886533A (application CN201711014924.7A)
Authority
CN
China
Prior art keywords
saliency
image
stereo
prediction
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711014924.7A
Other languages
Chinese (zh)
Other versions
CN107886533B (en)
Inventor
张秋丹
王旭
江健民
周宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201711014924.7A
Publication of CN107886533A
Application granted
Publication of CN107886533B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/593Depth or shape recovery from multiple images from stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The present invention is applicable to the field of computer technology and provides a visual saliency detection method, apparatus, device and storage medium for stereo images. The method includes: when a visual saliency detection request for a stereo image is received, first obtaining the color information and depth information of the stereo image; then performing saliency prediction on the color information, on the depth information, and on the color and depth information jointly, to obtain a first saliency prediction, a second saliency prediction and a third saliency prediction; then concatenating the obtained first, second and third saliency predictions with a plurality of preset center-bias priors to obtain multi-channel concatenated data; and finally performing multi-channel spatial-difference fusion on the concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.

Description

Visual saliency detection method, apparatus, device and storage medium for stereo images
Technical field
The present invention belongs to the field of computer technology, and in particular relates to a visual saliency detection method, apparatus, device and storage medium for stereo images.
Background technology
Recently, deep learning models such as convolutional neural networks have been widely applied to visual saliency detection and have markedly improved the performance of visual saliency models. A large number of 2D visual saliency models based on deep learning have therefore been proposed. Vig et al. were the first to attempt to build a visual saliency detection model based on convolutional neural networks, named the ensemble of deep networks (eDN). Later, Kummerer et al. proposed a saliency model that uses an existing neural network model to extract deep-learning features and then uses these features to compute the visual saliency of an image. Srinivas et al. designed a saliency model that, owing to the spatial invariance of fully convolutional networks, uses a novel location-based convolutional network to model location-dependent patterns. Huang et al. proposed a saliency method based on deep neural networks to reduce the gap between model predictions and human eye-fixation behavior; the model fine-tunes a deep neural network using image information at different scales and an objective function based on saliency evaluation metrics. Marcella et al. further proposed a novel saliency attention model for natural images. However, all of these methods were proposed for 2D multimedia applications.
Unlike traditional 2D saliency models, only a small number of saliency models use a depth map to predict the positions of human eye interest in a 3D natural scene, and generate a final 3D saliency map by fusing the resulting color and depth feature maps with a linear summation method. Some 3D image saliency computation models have also been proposed by extending traditional 2D visual saliency models. For example, Neil et al. proposed a stereoscopic attention framework by extending an existing attention model from 2D to the binocular domain. Zhang et al. used multiple perceptual stimuli in a stereoscopic visual attention model. In some models, depth information is used to weight 2D saliency maps in order to generate the final saliency of a 3D image. The results of eye-tracking experiments conducted by Lang et al. on 2D and 3D images were used for depth saliency analysis, in which 3D saliency maps were computed by extending previous 2D saliency detection models. Recently, Fang et al. proposed combining color, luminance, texture, depth and other information to generate the saliency map of a 3D image.
Although taking depth features into account has improved the performance of saliency detection models for stereo images, existing saliency detection models still face several challenging problems in representing the content of stereo images. Traditional methods of manually extracting image features can hardly extract high-level semantic information from images, and traditional stereo-image saliency fusion methods cannot detect the spatial correlation between the color and depth information of a stereo image. In addition, linear fusion methods merge the extracted feature maps simply through summation, without taking spatial differences into account. In summary, existing stereo-image saliency detection models lack diversified representations of image content and do not consider the spatial differences between features such as color and depth.
Summary of the invention
The object of the present invention is to provide a visual saliency detection method, apparatus, device and storage medium for stereo images, intended to solve the problem of inaccurate saliency detection caused by existing stereo-image visual saliency detection methods lacking diversified representations of image content and ignoring the spatial differences between color features and depth features.
In one aspect, the present invention provides a visual saliency detection method for stereo images, the method comprising the following steps:
when a visual saliency detection request for a stereo image is received, obtaining the color information and depth information of the stereo image;
performing saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, performing saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and performing saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image;
concatenating the first saliency prediction, the second saliency prediction and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data;
performing multi-channel spatial-difference fusion on the multi-channel concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereo image.
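The four steps above can be sketched end to end. This is a minimal illustrative sketch only, not the patented implementation: the three prediction networks are replaced by a hypothetical placeholder function, and the inter-channel fusion network is reduced to a weighted channel sum; all names, shapes and the number of priors are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_saliency(feature_map):
    # Placeholder for a trained saliency prediction network: a real
    # implementation would run stacked convolutions; here we simply
    # squash the channel mean into [0, 1].
    return sigmoid(feature_map.mean(axis=0, keepdims=True))

h, w = 8, 8
rng = np.random.default_rng(0)
color = rng.random((3, h, w))   # color information (e.g. RGB channels)
depth = rng.random((1, h, w))   # depth information

s1 = predict_saliency(color)                           # first saliency prediction
s2 = predict_saliency(depth)                           # second saliency prediction
s3 = predict_saliency(np.concatenate([color, depth]))  # third (joint) prediction

# Concatenate the three predictions with preset center-bias priors.
priors = [rng.random((1, h, w)) for _ in range(4)]
cascade = np.concatenate([s1, s2, s3] + priors, axis=0)

# Inter-channel fusion, reduced here to a weighted sum over channels + sigmoid.
weights = rng.random((cascade.shape[0], 1, 1))
saliency_map = sigmoid((weights * cascade).sum(axis=0))
print(saliency_map.shape)  # (8, 8)
```

The single-channel output plays the role of the final saliency map; in the patent each stage is a learned convolutional network rather than these stand-ins.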
In another aspect, the present invention provides a visual saliency detection apparatus for stereo images, the apparatus comprising:
an information acquisition unit, configured to obtain the color information and depth information of a stereo image when a visual saliency detection request for the stereo image is received;
a saliency prediction unit, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image;
a channel concatenation unit, configured to concatenate the first saliency prediction, the second saliency prediction and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data; and
a saliency map acquisition unit, configured to perform multi-channel spatial-difference fusion on the multi-channel concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereo image.
In another aspect, the present invention further provides an image detection device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above visual saliency detection method for stereo images.
In another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above visual saliency detection method for stereo images.
The present invention first receives a visual saliency detection request for a stereo image and obtains the color information and depth information of the stereo image; then performs saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, performs saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction, and performs saliency prediction on the color and depth information through a preset joint saliency prediction network to obtain a third saliency prediction; then concatenates the first, second and third saliency predictions with a plurality of preset center-bias priors to obtain multi-channel concatenated data; and finally performs multi-channel spatial-difference fusion on the concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.
Brief description of the drawings
Fig. 1 is a flowchart of the visual saliency detection method for stereo images provided in Embodiment 1 of the present invention;
Fig. 2 is a schematic structural diagram of the visual saliency detection apparatus for stereo images provided in Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of the visual saliency detection apparatus for stereo images provided in Embodiment 3 of the present invention; and
Fig. 4 is a schematic structural diagram of the image detection device provided in Embodiment 4 of the present invention.
Detailed description of the embodiments
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment 1:
Fig. 1 shows the implementation flow of the visual saliency detection method for stereo images provided in Embodiment 1 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, detailed as follows:
In step S101, when a visual saliency detection request for a stereo image is received, the color information and depth information of the stereo image are obtained.
The embodiment of the present invention is applicable to a visual saliency detection system for stereo images, which predicts the positions a user attends to in a 3D natural scene and generates the corresponding saliency map. In the embodiment of the present invention, when a visual saliency detection request for a stereo image is received, the color information and depth information of the stereo image are obtained for subsequent visual saliency computation. The stereo image may be contained in the visual saliency detection request, or may be transmitted independently.
In step S102, saliency prediction is performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image; saliency prediction is performed on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image; and saliency prediction is performed on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image.
In the embodiment of the present invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer and an output layer; the depth saliency prediction network likewise includes a preset number of stacked convolution modules, a classification layer, a linear interpolation layer and an output layer; and the joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer and an output layer.
Preferably, when saliency prediction is performed on the color information through the preset color saliency prediction network, feature extraction is first performed on the color information by the preset number of convolutional layers in the color saliency prediction network to obtain corresponding color feature maps; the classification layer in the color saliency prediction network, which includes a 3x3 convolution kernel and one output channel, then classifies the color feature maps to generate a dense color saliency prediction map, thereby improving the diversity of image-content representation; finally, according to the spatial resolution of the stereo image, the linear interpolation layer in the color saliency prediction network upsamples the dense color saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the first saliency prediction of the stereo image (the color saliency prediction) is obtained and output through the output layer, thereby improving the diversity of the image features used to represent the color of the stereo image.
Preferably, when saliency prediction is performed on the depth information through the preset depth saliency prediction network, feature extraction is first performed on the depth information by the preset number of convolutional layers in the depth saliency prediction network to obtain corresponding depth feature maps; the classification layer in the depth saliency prediction network, which includes a 3x3 convolution kernel and one output channel, then classifies the depth feature maps to generate a dense depth saliency prediction map, thereby improving the diversity of image-content representation; finally, according to the spatial resolution of the stereo image, the linear interpolation layer in the depth saliency prediction network upsamples the dense depth saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the second saliency prediction of the stereo image (the depth saliency prediction) is obtained and output through the output layer, thereby improving the diversity of the image features used to represent the depth of the stereo image.
Specifically, when feature extraction is performed on the color information and the depth information by the preset number of convolutional layers, each convolutional layer may perform feature extraction according to a preset formula F^l = R(W^l * F^(l-1) + b^l), where W^l and b^l represent random convolution filter parameters and n ∈ {64, 128, 256, 512} represents the total number of filters of the l-th layer, finally yielding the color feature map F_c or the depth feature map F_d. The linear interpolation layer then performs a cross-entropy operation according to the formula S_c/d = sigmoid(↑(ω_i * F_c/d + b_i)), where ω_i and b_i respectively represent the weight vector and bias with respect to pixel i, ↑ represents the upsampling operation, sigmoid represents a cross-entropy operation, and S_c/d is the result of performing the cross-entropy operation on the upsampled image.
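The classification-and-upsampling step S_c/d = sigmoid(↑(ω_i * F_c/d + b_i)) can be sketched numerically. This is an illustrative sketch under stated assumptions, not the trained network: the 3x3 kernel is random, nearest-neighbour upsampling stands in for the linear interpolation layer, and all sizes are hypothetical.

```python
import numpy as np

def conv2d_same(x, w, b):
    """Naive single-channel 'same' convolution with an odd-sized kernel."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * w).sum() + b
    return out

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample(x, factor):
    # Nearest-neighbour upsampling stands in for the linear interpolation layer.
    return np.kron(x, np.ones((factor, factor)))

rng = np.random.default_rng(0)
feat = rng.random((16, 16))            # a coarse feature map F_c or F_d
w = rng.standard_normal((3, 3)) * 0.1  # omega_i: 3x3 classification-layer kernel
b = 0.0                                # b_i: bias

dense_pred = conv2d_same(feat, w, b)   # dense saliency prediction map
s = sigmoid(upsample(dense_pred, 4))   # S_{c/d} = sigmoid(up(omega*F + b))
print(s.shape)  # (64, 64)
```

The factor-4 upsampling restores the spatial resolution of the (hypothetical) input image, matching the role of the linear interpolation layer in the text.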
Preferably, when saliency prediction is performed on the color information and the depth information through the preset joint saliency prediction network, the two fully convolutional network streams in the joint saliency prediction network first perform feature extraction on the color information and depth information to obtain corresponding color feature maps and depth feature maps; the obtained color and depth feature maps are then concatenated to obtain a joint color-depth feature map; the classification layer in the joint saliency prediction network, which includes a 3x3 convolution kernel and one output channel, then classifies the joint color-depth feature map to generate a dense joint color-depth saliency prediction map, thereby improving the diversity of image-content representation; finally, according to the spatial resolution of the stereo image, the linear interpolation layer in the joint saliency prediction network upsamples the dense joint saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the third saliency prediction of the stereo image (the joint color-depth saliency prediction) is obtained and output through the output layer, thereby improving the diversity of the image features used to represent the stereo image while realizing the computation of the spatial differences between color features and depth features. Each fully convolutional network stream is composed of a preset number of stacked convolutional layers.
Specifically, when the obtained color and depth feature maps are concatenated, the 'Concat' layer first performs feature concatenation according to the formula F_c&d = Concat(F_c, F_d) to obtain the joint color-depth feature map F_c&d; the linear interpolation layer then performs the cross-entropy operation according to the formula S_c&d = sigmoid(↑(ω_i * F_c&d + b_i)), where S_c&d is the result of performing the cross-entropy operation on the upsampled image.
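The 'Concat' operation F_c&d = Concat(F_c, F_d) is a channel-axis concatenation. A minimal sketch, with hypothetical channel counts and channels-first layout:

```python
import numpy as np

rng = np.random.default_rng(0)
f_c = rng.random((64, 8, 8))  # color feature maps F_c (channels first)
f_d = rng.random((64, 8, 8))  # depth feature maps F_d

# 'Concat' layer: stack along the channel axis to form F_{c&d}.
f_cd = np.concatenate([f_c, f_d], axis=0)
print(f_cd.shape)  # (128, 8, 8)
```

Keeping both feature stacks side by side, rather than summing them, is what lets the subsequent classification layer learn cross-channel (color-depth) interactions.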
In step S103, the obtained first saliency prediction, second saliency prediction and third saliency prediction are concatenated with the plurality of preset center-bias priors to obtain multi-channel concatenated data.
In the embodiment of the present invention, the architecture of the preset inter-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer and an output layer. The inter-channel fusion network is used to fuse the center-dependent pattern and the spatial differences of visual features, thereby improving the completeness and display effect of the saliency map.
In the embodiment of the present invention, because different image contents and acquisition environments make the center bias diverse and non-unique, in order to learn center-surround features, the obtained first, second and third saliency predictions and the plurality of preset center-bias priors I_cb are concatenated according to the formula S_IC = Concat(S_c, S_d, S_c&d, I_cb) to generate the n-channel cascade data S_IC.
In step S104, multi-channel spatial-difference fusion is performed on the multi-channel concatenated data through the preset inter-channel fusion network to obtain the saliency map of the stereo image.
In the embodiment of the present invention, preferably, when multi-channel spatial-difference fusion is performed on the multi-channel concatenated data through the preset inter-channel fusion network, the multi-channel concatenated data are first input into the two convolutional layers of the inter-channel fusion network, whose convolution kernel size is 3x3, to respectively obtain the visual features and the center-bias pattern of the dense saliency prediction map; the regression convolution layer of the inter-channel fusion network then performs convolution and regression operations on the visual features and the center-bias pattern, so that the saliency map of the stereo image is computed according to the formula S_3d = Sigmoid(ω_2 * R(ω_1 * S_IC + b_1) + b_2), where ω_1 and b_1 denote the parameters of the two convolutional layers and ω_2 and b_2 those of the regression convolution layer, thereby improving the accuracy of saliency detection by computing the spatial-difference information between color features and depth features. The regression convolution classification layer includes a 3x3 convolution kernel and one output channel, I_cb represents the plurality of center-bias priors, R represents the ReLU nonlinear operation, 'Sigmoid' is a cost function, and S_3d represents the saliency map of the stereo image.
Embodiment 2:
Fig. 2 shows the structure of the visual saliency detection apparatus for stereo images provided in Embodiment 2 of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:
An information acquisition unit 21, configured to obtain the color information and depth information of a stereo image when a visual saliency detection request for the stereo image is received.
In the embodiment of the present invention, when a visual saliency detection request for a stereo image is received, the information acquisition unit 21 obtains the color information and depth information of the stereo image for subsequent visual saliency computation. The stereo image may be contained in the visual saliency detection request, or may be transmitted independently.
A saliency prediction unit 22, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereo image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereo image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereo image.
In the embodiment of the present invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer and an output layer; the depth saliency prediction network likewise includes a preset number of stacked convolution modules, a classification layer, a linear interpolation layer and an output layer; and the joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer and an output layer.
A channel concatenation unit 23, configured to concatenate the obtained first saliency prediction, second saliency prediction and third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data.
In the embodiment of the present invention, the architecture of the preset inter-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer and an output layer. The inter-channel fusion network is used to fuse the center-dependent pattern and the spatial differences of visual features, thereby improving the completeness and display effect of the saliency map.
In the embodiment of the present invention, because different image contents and acquisition environments make the center bias diverse and non-unique, in order to learn center-surround features, the channel concatenation unit 23 concatenates the obtained first, second and third saliency predictions and the plurality of preset center-bias priors I_cb according to the formula S_IC = Concat(S_c, S_d, S_c&d, I_cb) to generate the n-channel cascade data S_IC.
A saliency map acquisition unit 24, configured to perform multi-channel spatial-difference fusion on the multi-channel concatenated data through the preset inter-channel fusion network to obtain the saliency map of the stereo image.
In the embodiment of the present invention, when a visual saliency detection request for a stereo image is received, the information acquisition unit 21 first obtains the color information and depth information of the stereo image; the saliency prediction unit 22 then performs saliency prediction on the color information, on the depth information, and on the color and depth information jointly, to obtain the first, second and third saliency predictions; the channel concatenation unit 23 then concatenates the obtained first, second and third saliency predictions with the plurality of preset center-bias priors to obtain multi-channel concatenated data; and finally the saliency map acquisition unit 24 performs multi-channel spatial-difference fusion on the multi-channel concatenated data through the preset inter-channel fusion network to obtain the saliency map of the stereo image, thereby improving the accuracy of saliency detection.
In the embodiment of the present invention, each unit of the visual saliency detection apparatus for stereo images may be implemented by corresponding hardware or software units; each unit may be an independent software or hardware unit, or the units may be integrated into one software or hardware unit, which is not intended to limit the present invention.
Embodiment three:
Fig. 3 shows the structure of the vision significance detection means for the stereo-picture that the embodiment of the present invention three provides, in order to It is easy to illustrate, illustrate only the part related to the embodiment of the present invention, including:
Information acquisition unit 31, for when the vision significance for receiving stereo-picture detects request, obtaining stereogram The colouring information and depth information of picture.
In embodiments of the present invention, when the vision significance for receiving stereo-picture detects request, acquisition of information is passed through Unit 31 obtains the colouring information and depth information of stereo-picture, is calculated for follow-up vision significance.Stereo-picture can Included in vision significance detection request, can also independently be transmitted.
Conspicuousness predicting unit 32, for predicting that network carries out conspicuousness to colouring information by default color conspicuousness Prediction, to obtain the prediction of the first conspicuousness of stereo-picture, predict that network enters to depth information by default depth conspicuousness Row conspicuousness is predicted, to obtain the prediction of the second conspicuousness of stereo-picture, and predicts network pair by default joint conspicuousness Colouring information and depth information carry out conspicuousness prediction, to obtain the prediction of the 3rd conspicuousness of stereo-picture.
In the embodiment of the present invention, the color saliency prediction network includes a preset number of stacked convolutional layers, a classification layer, a linear interpolation layer, and an output layer. The depth saliency prediction network likewise includes a preset number of stacked convolutional modules, a classification layer, a linear interpolation layer, and an output layer. The joint saliency prediction network includes two fully convolutional network streams, a 'Concat' layer, a classification layer, a linear interpolation layer, and an output layer.
Preferably, when saliency prediction is performed on the color information through the preset color saliency prediction network, the preset number of convolutional layers of the color saliency prediction network first perform feature extraction on the color information to obtain the corresponding color feature map. The classification layer of the color saliency prediction network then classifies the color feature map to generate a dense color saliency prediction map, thereby enriching the diversity of the image-content characterization information; the classification layer includes one 3x3 convolution kernel and one output channel. Finally, according to the spatial resolution of the stereoscopic image, the linear interpolation layer of the color saliency prediction network upsamples the dense color saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the first saliency prediction of the stereoscopic image is obtained and output through the output layer, thereby enriching the image features characterizing the color of the stereoscopic image.
Preferably, when saliency prediction is performed on the depth information through the preset depth saliency prediction network, the preset number of convolutional layers of the depth saliency prediction network first perform feature extraction on the depth information to obtain the corresponding depth feature map. The classification layer of the depth saliency prediction network then classifies the depth feature map to generate a dense depth saliency prediction map, thereby enriching the diversity of the image-content characterization information; the classification layer includes one 3x3 convolution kernel and one output channel. Finally, according to the spatial resolution of the stereoscopic image, the linear interpolation layer of the depth saliency prediction network upsamples the dense depth saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the second saliency prediction of the stereoscopic image is obtained and output through the output layer, thereby enriching the image features characterizing the depth of the stereoscopic image.
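As an illustrative sketch only (not the patented implementation: the feature values, filter weights, and shapes below are assumptions), the classification layer shared by the color and depth branches can be modeled as a single 3x3 convolution with one output channel that collapses a multi-channel feature map into a dense saliency prediction map:

```python
import numpy as np

def classify_3x3(feat, weights, bias):
    """Classification-layer sketch: 3x3 convolution, zero padding,
    one output channel. feat: (C, H, W); weights: (C, 3, 3); bias: scalar."""
    c, h, w = feat.shape
    padded = np.pad(feat, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + 3, j:j + 3] * weights) + bias
    return out

rng = np.random.default_rng(0)
feat = rng.standard_normal((64, 8, 8))           # hypothetical 64-channel feature map
weights = rng.standard_normal((64, 3, 3)) * 0.01
dense_pred = classify_3x3(feat, weights, 0.0)
print(dense_pred.shape)                          # one dense prediction map, same H x W
```

Because the layer has a single output channel, the result has the spatial size of the input feature map but only one plane, which is what makes the prediction map "dense" in the sense used above.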
Specifically, when feature extraction is performed on the color information and the depth information through the preset number of convolutional layers, each convolutional layer may perform feature extraction according to a preset formula F_l = W_l^n * F_(l-1) + b_l^n, where W_l^n and b_l^n denote random convolution filter parameters and n ∈ {64, 128, 256, 512} denotes the total number of filters of layer l, finally yielding the color feature map Fc or the depth feature map Fd. The linear interpolation layer then performs the cross-entropy operation according to the formula Sc/d = sigmoid(↑(ωi*Fc/d + bi)), where ωi and bi respectively denote the weight vector and the bias relative to pixel i, ↑ denotes the upsampling operation, sigmoid denotes a cross-entropy operation, and Sc/d is the result of performing the cross-entropy operation on the upsampled image.
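The step Sc/d = sigmoid(↑(ωi*Fc/d + bi)) can be sketched as follows, under simplifying assumptions (nearest-neighbour upsampling stands in for the linear interpolation layer, and the per-pixel weight and bias are collapsed to scalars):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def upsample(pred, factor):
    """Nearest-neighbour upsampling by an integer factor (a stand-in
    for the linear interpolation layer)."""
    return np.kron(pred, np.ones((factor, factor)))

dense = np.array([[0.5, -1.0],
                  [2.0,  0.0]])   # hypothetical low-resolution prediction map
w, b = 1.0, 0.0                   # weight and bias, assumed scalar for the sketch
saliency = sigmoid(upsample(w * dense + b, factor=4))
print(saliency.shape)             # restored to image resolution, values in (0, 1)
```

The sigmoid squashes every upsampled response into (0, 1), so the output can be read directly as a per-pixel saliency probability.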
Preferably, when saliency prediction is performed on the color information and the depth information through the preset joint saliency prediction network, the two fully convolutional network streams of the joint saliency prediction network first perform feature extraction on the color information and the depth information to obtain the corresponding color feature map and depth feature map. Feature cascading is then performed on the obtained color feature map and depth feature map, yielding a color-and-depth joint feature map. The classification layer of the joint saliency prediction network then classifies the color-and-depth joint feature map to generate a dense color-and-depth joint saliency prediction map, thereby enriching the diversity of the image-content characterization information. Finally, according to the spatial resolution of the stereoscopic image, the linear interpolation layer of the joint saliency prediction network upsamples the dense color-and-depth joint saliency prediction map, a cross-entropy operation is performed on the upsampled image, and the third saliency prediction of the stereoscopic image is obtained and output through the output layer, thereby enriching the image features characterizing the depth of the stereoscopic image while realizing the computation of the spatial difference between the color features and the depth features. Each fully convolutional network stream is composed of a preset number of stacked convolutional layers, and the classification layer includes one 3x3 convolution kernel and one output channel.
Specifically, when feature cascading is performed on the obtained color feature map and depth feature map, the 'Concat' layer first performs the feature cascade according to the formula Fc&d = Concat(Fc, Fd) to obtain the color-and-depth joint feature map Fc&d. The linear interpolation layer then performs the cross-entropy operation according to the formula Sc&d = Sigmoid(↑(ωi*Fc&d + bi)), where Sc&d is the result of performing the cross-entropy operation on the upsampled image.
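The feature cascade Fc&d = Concat(Fc, Fd) is a plain channel-axis concatenation; a minimal sketch with assumed shapes:

```python
import numpy as np

f_c = np.zeros((64, 8, 8))   # hypothetical colour feature map (channels first)
f_d = np.ones((64, 8, 8))    # hypothetical depth feature map
f_cd = np.concatenate([f_c, f_d], axis=0)   # the 'Concat' layer
print(f_cd.shape)            # channels stack, spatial size unchanged
```

Concatenation preserves both modalities unchanged side by side, leaving it to the subsequent classification layer to learn how color and depth evidence should be weighed against each other.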
A channel concatenation unit 33, configured to concatenate the obtained first saliency prediction, second saliency prediction, and third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data.
In the embodiment of the present invention, the architecture of the preset inter-channel fusion network includes a 'Concat' layer, an input layer, two convolutional layers, a regression convolution classification layer, and an output layer. The inter-channel fusion network is configured to fuse the center-dependency pattern with the spatial differences of the visual features, thereby improving the completeness and the display effect of the saliency map.
In the embodiment of the present invention, because different image contents and acquisition environments make the center bias varied rather than unique, in order to learn center-surround features, the channel concatenation unit 33 concatenates the obtained first saliency prediction, second saliency prediction, and third saliency prediction with the preset multiple center-bias priors Icb according to the formula SIC = Concat(Sc, Sd, Sc&d, Icb), generating the n-channel concatenated data SIC.
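A sketch of the channel cascade SIC = Concat(Sc, Sd, Sc&d, Icb). The Gaussian form and the particular variances of the centre-bias priors are assumptions for illustration; the text only states that multiple priors are used:

```python
import numpy as np

def center_bias_prior(h, w, sigma):
    """One centre-bias prior: a 2-D Gaussian peaked at the image centre."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

h, w = 8, 8
s_c, s_d, s_cd = (np.random.rand(h, w) for _ in range(3))     # stand-in predictions
i_cb = [center_bias_prior(h, w, s) for s in (1.0, 2.0, 4.0)]  # assumed sigmas
s_ic = np.stack([s_c, s_d, s_cd] + i_cb, axis=0)
print(s_ic.shape)   # n-channel concatenated data, n = 3 + len(i_cb)
```

Using priors with several spreads lets the subsequent fusion network select, per image, how strongly attention concentrates at the centre instead of committing to a single fixed bias.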
A saliency map acquiring unit 34, configured to perform multi-channel information spatial-difference fusion on the multi-channel concatenated data through the preset inter-channel fusion network to obtain the saliency map of the stereoscopic image.
In the embodiment of the present invention, preferably, when multi-channel information spatial-difference fusion is performed on the multi-channel concatenated data through the preset inter-channel fusion network, the multi-channel concatenated data are first input into the two convolutional layers of the inter-channel fusion network whose kernel size is 3x3, so as to obtain respectively the visual features and the center-bias pattern of the dense saliency prediction map. The regression convolutional layer of the inter-channel fusion network then performs a convolution regression operation on the visual features and the center-bias pattern, so as to calculate the saliency map of the stereoscopic image according to the formula S3d = Sigmoid(ω2*R(ω1*SIC + b1) + b2), thereby improving the accuracy of saliency detection by computing the spatial difference information between the color features and the depth features. The regression convolution classification layer includes one 3x3 convolution kernel and one output channel and calculates the center-surround feature map according to this formula, where Icb (one of the channels of SIC) denotes the multiple center-bias priors, R denotes the ReLU nonlinear operation, 'Sigmoid' is a cost function, and S3d denotes the saliency map of the stereoscopic image.
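Under the same caveats (random stand-in filters, assumed shapes, and an assumed wiring of the two 3x3 branches into the regression layer — the trained parameters are of course not reproduced here), the fusion stage can be sketched end to end:

```python
import numpy as np

def conv3x3(x, w, b):
    """3x3 convolution, zero padding. x: (Cin, H, W); w: (Cout, Cin, 3, 3); b: (Cout,)."""
    cin, h, wd = x.shape
    cout = w.shape[0]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((cout, h, wd))
    for o in range(cout):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + 3, j:j + 3] * w[o]) + b[o]
    return out

relu = lambda x: np.maximum(x, 0.0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
s_ic = rng.random((6, 8, 8))                    # n-channel concatenated data
w_v = rng.standard_normal((8, 6, 3, 3)) * 0.1   # branch 1: visual features
w_cb = rng.standard_normal((8, 6, 3, 3)) * 0.1  # branch 2: centre-bias pattern
w_r = rng.standard_normal((1, 16, 3, 3)) * 0.1  # regression layer, one output channel
visual = relu(conv3x3(s_ic, w_v, np.zeros(8)))
pattern = relu(conv3x3(s_ic, w_cb, np.zeros(8)))
s_3d = sigmoid(conv3x3(np.concatenate([visual, pattern]), w_r, np.zeros(1)))[0]
print(s_3d.shape)   # fused saliency map, every value in (0, 1)
```

The two branches see the same concatenated input but learn different filters, so one can specialize in visual evidence and the other in the centre-bias channels before the single-channel regression collapses them into S3d.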
Preferably, the saliency prediction unit 32 includes:

A feature map acquiring unit 321, configured to perform feature extraction on the color information through the preset number of convolutional layers of the color saliency prediction network to obtain the corresponding color feature map;

A feature classification unit 322, configured to classify the color feature map through the classification layer of the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;

An upsampling unit 323, configured to upsample the dense color saliency prediction map through the linear interpolation layer of the color saliency prediction network according to the spatial resolution of the stereoscopic image; and

A cross-entropy prediction unit 324, configured to perform a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
Preferably, the saliency map acquiring unit 34 includes:

A convolutional filtering unit 341, configured to input the multi-channel concatenated data into the first convolution filter and the second convolution filter of the inter-channel fusion network whose kernel size is 3x3, to obtain respectively the visual features and the center-bias pattern of the dense saliency prediction map; and

An acquiring subunit 342, configured to perform a convolution regression operation on the visual features and the center-bias pattern through the regression convolutional layer of the inter-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
In the embodiment of the present invention, each unit of the visual saliency detection apparatus for a stereoscopic image may be implemented by corresponding hardware or software units. The units may be independent software or hardware units, or may be integrated into a single software or hardware unit, which is not intended to limit the present invention.
Embodiment four:
Fig. 4 shows the structure of the image detection device provided by Embodiment four of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
The image detection device 4 of the embodiment of the present invention includes a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, such as steps S101 to S104 shown in Fig. 1. Alternatively, when executing the computer program 42, the processor 40 implements the functions of the units in the above device embodiments, for example, the functions of units 21 to 24 shown in Fig. 2 or units 31 to 34 shown in Fig. 3.
In the embodiment of the present invention, when the processor 40 executes the computer program 42 to implement the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, a visual saliency detection request for a stereoscopic image is first received, and the color information and depth information of the stereoscopic image are obtained. Saliency prediction is then performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image. The first saliency prediction, the second saliency prediction, and the third saliency prediction are then concatenated with a plurality of preset center-bias priors to obtain multi-channel concatenated data. Finally, multi-channel information spatial-difference fusion is performed on the multi-channel concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereoscopic image, thereby improving the accuracy of saliency detection.
For the steps implemented when the processor 40 executes the computer program 42 in the image detection device 4, reference may be made to the description of the method in Embodiment one, which is not repeated here.
Embodiment five:
In the embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program. When executed by a processor, the computer program implements the steps in the above embodiments of the visual saliency detection method for a stereoscopic image, for example, steps S101 to S104 shown in Fig. 1. Alternatively, when executed by a processor, the computer program implements the functions of the units in the above device embodiments, for example, the functions of units 21 to 24 shown in Fig. 2 or units 31 to 34 shown in Fig. 3.
In the embodiment of the present invention, a visual saliency detection request for a stereoscopic image is first received, and the color information and depth information of the stereoscopic image are obtained. Saliency prediction is then performed on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image. The first saliency prediction, the second saliency prediction, and the third saliency prediction are then concatenated with a plurality of preset center-bias priors to obtain multi-channel concatenated data. Finally, multi-channel information spatial-difference fusion is performed on the multi-channel concatenated data through a preset inter-channel fusion network to obtain the saliency map of the stereoscopic image, thereby improving the accuracy of saliency detection. For the steps of the visual saliency detection method for a stereoscopic image implemented when the computer program is executed by the processor, reference may further be made to the description of the steps in the foregoing method embodiments, which is not repeated here.
The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, a memory such as a ROM/RAM, a magnetic disk, an optical disc, or a flash memory.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. A visual saliency detection method for a stereoscopic image, characterized in that the method comprises the following steps:

when a visual saliency detection request for a stereoscopic image is received, obtaining color information and depth information of the stereoscopic image;

performing saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, performing saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and performing saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image;

concatenating the first saliency prediction, the second saliency prediction, and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data;

performing multi-channel information spatial-difference fusion on the multi-channel concatenated data through a preset inter-channel fusion network to obtain a saliency map of the stereoscopic image.
2. The method according to claim 1, characterized in that the step of performing saliency prediction on the color information through the preset color saliency prediction network comprises:

performing feature extraction on the color information through a preset number of convolutional layers of the color saliency prediction network to obtain a corresponding color feature map;

classifying the color feature map through a classification layer of the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;

upsampling the dense color saliency prediction map through a linear interpolation layer of the color saliency prediction network according to a spatial resolution of the stereoscopic image;

performing a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
3. The method according to claim 1, characterized in that the step of performing saliency prediction on the depth information through the preset depth saliency prediction network comprises:

performing feature extraction on the depth information through a preset number of convolutional layers of the depth saliency prediction network to obtain a corresponding depth feature map;

classifying the depth feature map through a classification layer of the depth saliency prediction network to generate a dense depth saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;

upsampling the dense depth saliency prediction map through a linear interpolation layer of the depth saliency prediction network according to the spatial resolution of the stereoscopic image;

performing a cross-entropy operation on the upsampled image to obtain the second saliency prediction of the stereoscopic image.
4. The method according to claim 1, characterized in that the step of performing saliency prediction on the color information and the depth information through the preset joint saliency prediction network comprises:

performing feature extraction on the color information and the depth information through two fully convolutional network streams of the joint saliency prediction network to obtain a corresponding color feature map and a corresponding depth feature map;

performing feature cascading on the obtained color feature map and depth feature map to obtain a color-and-depth joint feature map;

classifying the color-and-depth joint feature map through a classification layer of the joint saliency prediction network to generate a dense color-and-depth joint saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;

upsampling the dense color-and-depth joint saliency prediction map through a linear interpolation layer of the joint saliency prediction network according to the spatial resolution of the stereoscopic image;

performing a cross-entropy operation on the upsampled image to obtain the third saliency prediction of the stereoscopic image.
5. The method according to claim 1, characterized in that the step of performing multi-channel information spatial-difference fusion on the multi-channel concatenated data through the preset inter-channel fusion network comprises:

inputting the multi-channel concatenated data into a first convolution filter and a second convolution filter of the inter-channel fusion network whose kernel size is 3x3, to obtain respectively visual features and a center-bias pattern of a dense saliency prediction map;

performing a convolution regression operation on the visual features and the center-bias pattern through a regression convolutional layer of the inter-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
6. A visual saliency detection apparatus for a stereoscopic image, characterized in that the apparatus comprises:

an information acquiring unit, configured to obtain color information and depth information of a stereoscopic image when a visual saliency detection request for the stereoscopic image is received;

a saliency prediction unit, configured to perform saliency prediction on the color information through a preset color saliency prediction network to obtain a first saliency prediction of the stereoscopic image, perform saliency prediction on the depth information through a preset depth saliency prediction network to obtain a second saliency prediction of the stereoscopic image, and perform saliency prediction on the color information and the depth information through a preset joint saliency prediction network to obtain a third saliency prediction of the stereoscopic image;

a channel concatenation unit, configured to concatenate the first saliency prediction, the second saliency prediction, and the third saliency prediction with a plurality of preset center-bias priors to obtain multi-channel concatenated data; and

a saliency map acquiring unit, configured to perform multi-channel information spatial-difference fusion on the multi-channel concatenated data through a preset inter-channel fusion network to obtain a saliency map of the stereoscopic image.
7. The apparatus according to claim 6, characterized in that the saliency prediction unit comprises:

a feature map acquiring unit, configured to perform feature extraction on the color information through a preset number of convolutional layers of the color saliency prediction network to obtain a corresponding color feature map;

a feature classification unit, configured to classify the color feature map through a classification layer of the color saliency prediction network to generate a dense color saliency prediction map, the classification layer including one 3x3 convolution kernel and one output channel;

an upsampling unit, configured to upsample the dense color saliency prediction map through a linear interpolation layer of the color saliency prediction network according to the spatial resolution of the stereoscopic image; and

a cross-entropy prediction unit, configured to perform a cross-entropy operation on the upsampled image to obtain the first saliency prediction of the stereoscopic image.
8. The apparatus according to claim 6, characterized in that the saliency map acquiring unit comprises:

a convolutional filtering unit, configured to input the multi-channel concatenated data into a first convolution filter and a second convolution filter of the inter-channel fusion network whose kernel size is 3x3, to obtain respectively visual features and a center-bias pattern of a dense saliency prediction map; and

an acquiring subunit, configured to perform a convolution regression operation on the visual features and the center-bias pattern through a regression convolutional layer of the inter-channel fusion network to obtain the saliency map of the stereoscopic image, the regression convolutional layer including one 3x3 convolution kernel and one output channel.
9. An image detection device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when executing the computer program, the processor implements the steps of the method according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, characterized in that when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 5 are implemented.
CN201711014924.7A 2017-10-26 2017-10-26 Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium Active CN107886533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711014924.7A CN107886533B (en) 2017-10-26 2017-10-26 Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium


Publications (2)

Publication Number Publication Date
CN107886533A true CN107886533A (en) 2018-04-06
CN107886533B CN107886533B (en) 2021-05-04

Family

ID=61782458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711014924.7A Active CN107886533B (en) 2017-10-26 2017-10-26 Method, device and equipment for detecting visual saliency of three-dimensional image and storage medium

Country Status (1)

Country Link
CN (1) CN107886533B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664967A (en) * 2018-04-17 2018-10-16 上海交通大学 A kind of multimedia page vision significance prediction technique and system
CN109409435A (en) * 2018-11-01 2019-03-01 上海大学 A kind of depth perception conspicuousness detection method based on convolutional neural networks
CN110942095A (en) * 2019-11-27 2020-03-31 中国科学院自动化研究所 Method and system for detecting salient object area

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800086A (en) * 2012-06-21 2012-11-28 上海海事大学 Offshore scene significance detection method
CN103020993A (en) * 2012-11-28 2013-04-03 杭州电子科技大学 Visual saliency detection method by fusing dual-channel color contrasts
CN103996040A (en) * 2014-05-13 2014-08-20 西北工业大学 Bottom-up visual saliency generating method fusing local-global contrast ratio
CN104063872A (en) * 2014-07-04 2014-09-24 西安电子科技大学 Method for detecting salient regions in sequence images based on improved visual attention model
CN104574375A (en) * 2014-12-23 2015-04-29 浙江大学 Image significance detection method combining color and depth information
CN104966286A (en) * 2015-06-04 2015-10-07 电子科技大学 3D video saliency detection method
CN105404888A (en) * 2015-11-16 2016-03-16 浙江大学 Saliency object detection method integrated with color and depth information
CN105869173A (en) * 2016-04-19 2016-08-17 天津大学 Stereoscopic vision saliency detection method
CN106157319A (en) * 2016-07-28 2016-11-23 哈尔滨工业大学 The significance detection method that region based on convolutional neural networks and Pixel-level merge
CN106462771A (en) * 2016-08-05 2017-02-22 深圳大学 3D image significance detection method
CN106997478A (en) * 2017-04-13 2017-08-01 安徽大学 RGB D image well-marked target detection methods based on notable center priori
US20170300788A1 (en) * 2014-01-30 2017-10-19 Hrl Laboratories, Llc Method for object detection in digital image and video using spiking neural networks
CN107292318A (en) * 2017-07-21 2017-10-24 北京大学深圳研究生院 Image significance object detection method based on center dark channel prior information
CN107292875A (en) * 2017-06-29 2017-10-24 西安建筑科技大学 A kind of conspicuousness detection method based on global Local Feature Fusion


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
FERREIRA L等: ""A method to compute saliency regions in 3D video based on fusion of feature maps"", 《PROCEEDINGS OF 2015 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO》 *
YUMING FANG等: ""Saliency Detection for Stereoscopic Images"", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
吴建国等: ""融合显著深度特征的RGB-D图像显著目标检测"", 《电子与信息学报》 *
徐威等: ""利用层次先验估计的显著性目标检测"", 《自动化学报》 *


Also Published As

Publication number Publication date
CN107886533B (en) 2021-05-04

Similar Documents

Publication Publication Date Title
EP3510561B1 (en) Predicting depth from image data using a statistical model
CN101443817B (en) Method and device for determining correspondence, preferably for the three-dimensional reconstruction of a scene
US8681150B2 (en) Method, medium, and system with 3 dimensional object modeling using multiple view points
KR101393621B1 (en) Method and system for analyzing a quality of three-dimensional image
Feng et al. Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications
CN108961327A (en) Monocular depth estimation method and device, equipment and storage medium
CN110866509B (en) Action recognition method, device, computer storage medium and computer equipment
CN109690620A (en) Three-dimensional model generation device and three-dimensional model generation method
CN107301664A (en) Improved segment-based stereo matching method using a similarity measure function
CN111563418A (en) Asymmetric multi-mode fusion significance detection method based on attention mechanism
CN103384343B (en) Method and device for filling image holes
CN108345892A (en) Stereo image saliency detection method, device, equipment and storage medium
CN101542529A (en) Generation of depth map for an image
CN105898278B (en) Stereoscopic video saliency detection method based on binocular multi-dimensional perception features
CN107886533A (en) Visual saliency detection method, device, equipment and storage medium for stereoscopic images
JP2014096062A (en) Image processing method and image processing apparatus
CN109644280B (en) Method for generating hierarchical depth data of scene
CN108986197A (en) 3D skeleton line construction method and device
CN110096993A (en) Object detection apparatus and method based on binocular stereo vision
Jiang et al. Quality assessment for virtual reality technology based on real scene
Xiao et al. Multi-scale attention generative adversarial networks for video frame interpolation
CN104243970A (en) 3D drawn image objective quality evaluation method based on stereoscopic vision attention mechanism and structural similarity
JP6965299B2 (en) Object detectors, object detection methods, programs, and moving objects
CN114494611A (en) Intelligent three-dimensional reconstruction method, device, equipment and medium based on nerve basis function
Northam et al. Stereoscopic 3D image stylization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant