CN113963170A - RGBD image saliency detection method based on interactive feature fusion - Google Patents

RGBD image saliency detection method based on interactive feature fusion

Info

Publication number
CN113963170A
CN113963170A
Authority
CN
China
Prior art keywords
image
convolution
fusion
saliency detection
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111039181.5A
Other languages
Chinese (zh)
Inventor
赵晓丽
张倬尧
陈正
方志军
叶翰辰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai University of Engineering Science
Original Assignee
Shanghai University of Engineering Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai University of Engineering Science filed Critical Shanghai University of Engineering Science
Priority to CN202111039181.5A priority Critical patent/CN113963170A/en
Publication of CN113963170A publication Critical patent/CN113963170A/en
Pending legal-status Critical Current

Classifications

    • G06F18/2415: Pattern recognition; analysing; classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate (G: Physics; G06F: Electric digital data processing)
    • G06N3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks (G06N: Computing arrangements based on specific computational models)
    • G06N3/08: Computing arrangements based on biological models; neural networks; learning methods

Abstract

The invention discloses an RGBD image saliency detection method based on interactive feature fusion. For each image in the training sample set, a multi-level convolutional neural network module first extracts features from the color image and the depth image at multiple levels. A cross feature fusion module then performs multi-level dot-product fusion on the color and depth features extracted by the deeper convolution levels to obtain initial saliency maps, and an Inception structure performs multi-scale fusion on the initial saliency maps to output a network-predicted saliency map. Finally, a focal loss is computed from the network-predicted saliency map and the target saliency map to learn the optimal parameters of the image saliency detection model, yielding a trained model that performs saliency detection on the RGB-D image to be processed. The method is simple, reliable, easy to operate and implement, and convenient to popularize and apply.

Description

RGBD image saliency detection method based on interactive feature fusion
Technical Field
The invention relates to the technical field of image processing, in particular to an RGBD image saliency detection method based on interactive feature fusion.
Background
In application fields such as autonomous driving, robotics and virtual reality, locating salient targets in a scene and filtering out information weakly related to the task are of great importance for reducing computational complexity and improving scene understanding, and constitute one of the core problems and research hotspots in computer vision.
In recent years, with the wide application of deep convolutional neural networks in image processing, saliency detection has developed rapidly, and a large number of saliency models based on visual features such as color and brightness have been proposed. Li et al. first constructed a multi-scale feature-based saliency model using a deep neural network in "Visual saliency based on multiscale deep features". Hou et al. proposed the DSS model in "Deeply Supervised Salient Object Detection with Short Connections", which uses a fully convolutional network (FCN) to extract multi-level, multi-scale features and fuses them through a skip-connection structure. Feng et al., in "Attentive Feedback Network for Boundary-aware Salient Object Detection", use a global perception module to refine the most salient features as a whole and an attentive feedback module to pass information between the corresponding encoders and decoders.
However, RGB image saliency detection faces two major challenges: first, when the target and the background have similar appearance, they are hard to distinguish with RGB information alone; second, when a single object contains several different colors, it is easily mis-detected as multiple objects. The depth map contains rich spatial structure and three-dimensional layout information and can provide many additional cues for distinguishing the target from the background while preserving the integrity of the detected region, so exploiting depth information can effectively improve saliency detection. Ciptadi et al. first introduced depth information on top of RGB in "An In Depth View of Saliency" and proposed an RGB-D-based saliency segmentation model. Peng et al. proposed a multi-stage RGB-D model in "RGBD salient object detection: a benchmark and algorithms", which simultaneously considers depth and appearance cues derived from low-level feature contrast, mid-level region grouping and high-level prior enhancement. Chen et al. designed a complementarity-aware fusion module in "Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection" to learn complementary color and depth information, and progressively fuse multi-level information from deep to shallow through cascaded modules with densely added layer-wise supervision. Piao et al. proposed a depth-induced multi-scale recurrent attention network in "Depth-induced Multi-scale Recurrent Attention Network for Saliency Detection", which uses a depth refinement block with a residual structure to fuse complementary color and depth information, combines multi-scale contextual features with depth information to accurately locate salient targets, and further improves performance with a recurrent attention module.
In summary, existing RGB-D saliency detection methods mostly attach sub-networks to a backbone to learn complementary color and depth information and then fuse the features, but most of them have very large network structures with many parameters and are difficult to train.
Disclosure of Invention
The invention provides an RGBD image saliency detection method based on interactive feature fusion and a novel interactive dual-stream saliency detection framework: it designs a Global-Local feature extraction convolution Block (GL Block) to acquire global features and guide local feature extraction, proposes a dot-product method for obtaining the common features of the color image and the depth image, and builds a Cross Feature Fusion Module (CFFM) to cross-fuse the feature information of the color image and the depth image.
The invention can be realized by the following technical scheme:
an RGBD image saliency detection method based on interactive feature fusion comprises the following steps:
firstly, establishing an image sample set for training;
step two, establishing an image saliency detection model;
for each image in the image sample set, a multi-level convolutional neural network module first extracts features from the color image and the depth image at multiple levels; a cross feature fusion module performs multi-level dot-product fusion on the color and depth features extracted by the deeper convolution levels to obtain initial saliency maps; an Inception structure then performs multi-scale fusion on the initial saliency maps to output a network-predicted saliency map; finally, a focal loss is computed from the network-predicted saliency map and the target saliency map to learn the optimal parameters of the image saliency detection model, yielding a trained image saliency detection model;
and step three, inputting the RGB-D image to be processed into the trained image saliency detection model, and outputting a corresponding saliency detection result, namely a saliency map, through model calculation.
Further, the cross feature fusion module comprises a first convolution, a second convolution and a third convolution; the first convolution extracts features from the color image features, the second convolution extracts features from the depth image features, the common features of the color image features and the depth image features are extracted by dot multiplication and fused, and the third convolution then applies convolution and activation operations so that the fused features can be merged back into the original color image features and the original depth image features respectively.
Further, the first convolution, the second convolution and the third convolution have the same structure.
Further, the multi-level convolutional neural network module comprises two identical branches acting on the color image and the depth image respectively; it adopts an FCN structure with five convolution layers, where the first layer uses a standard convolution block and the remaining layers use global-local feature extraction convolution blocks;
the global-local feature extraction convolution block comprises a global branch and a local branch; the local branch first reduces the input feature map to 1/4 of its original size with a convolution of stride 2 and then extracts local features with two identical convolutions of stride 1; the global branch extracts global features with a bottleneck structure; finally, the extracted global and local features are fused by dot multiplication.
Further, the convolutions with stride 1 use 3 × 3 kernels and the ReLU activation function.
Further, the focal loss function L_fl is defined as

L_fl = -α · y · (1 - ŷ)^γ · log(ŷ) - (1 - α) · (1 - y) · ŷ^γ · log(1 - ŷ)

wherein y and ŷ respectively denote the target saliency map and the network-predicted saliency map, γ denotes a constant, and α denotes a balance factor.
The beneficial technical effects of the invention are as follows:
The novel interactive dual-stream saliency detection framework detects salient regions well, generates accurate saliency maps, and improves the efficiency and accuracy of salient target detection. Comprehensive experiments on three public data sets, NJU2000, NLPR and STEREO, show that the method performs well on mainstream evaluation indexes. In addition, the method of the invention is simple, reliable, easy to operate and implement, and convenient to popularize and apply.
Drawings
FIG. 1 is a schematic diagram of the architecture of a dual stream network of the present invention;
FIG. 2 is a schematic diagram of the structure of the global-local feature extraction convolution block GL Block of the present invention;
FIG. 3 is a schematic structural diagram of a cross feature fusion module CFFM according to the present invention;
FIG. 4 is a graphical representation of the comparison of the results of significance testing using the method of the present invention with other methods;
FIG. 5 is a graph comparing P-R curves for significance testing using the method of the present invention with other methods;
FIG. 6 is a graph comparing model sizes for significance detection using the method of the present invention with other methods.
Detailed Description
The following detailed description of the preferred embodiments will be made with reference to the accompanying drawings.
The invention provides an RGBD image saliency detection method based on interactive feature fusion. As shown in FIG. 1, the network framework adopts a dual-stream network: the proposed global-local feature extraction convolution block GL Block acquires and fuses global and local features and replaces the original standard convolution blocks in the FCN to generate initial saliency maps; to obtain the common salient features of the color and depth information, a cross feature fusion module CFFM based on dot multiplication is proposed; considering that shallow features contain more noise, the invention applies the CFFM to cross-fuse color and depth features at the deeper levels of the FCN, thereby reducing redundant features; finally, the initial saliency maps are fused through an Inception structure to improve the scale adaptability of the network. The method comprises the following specific steps:
firstly, establishing an image sample set for training;
The color image, the depth map and the manually annotated saliency map of each RGB-D image in the sample set are scaled together so that the computing device can bear the computational load of the neural network, and operations such as random cropping and horizontal flipping are applied jointly to increase the diversity of the data; the color image and the depth image are then normalized to highlight the foreground features of the image.
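As an illustration of this joint preprocessing, a minimal sketch is given below, assuming a torchvision-based pipeline; the target size, crop ratio and ImageNet normalization statistics are assumptions not specified by the invention:

```python
import random
import torchvision.transforms.functional as TF

def preprocess(rgb, depth, gt, size=(224, 224), train=True):
    # Jointly resize the color image, depth map and ground-truth saliency map.
    rgb, depth, gt = [TF.resize(x, list(size)) for x in (rgb, depth, gt)]
    if train:
        # Joint horizontal flip with probability 0.5.
        if random.random() < 0.5:
            rgb, depth, gt = [TF.hflip(x) for x in (rgb, depth, gt)]
        # Joint random crop (crop ratio 0.9 is an assumption).
        h, w = int(size[0] * 0.9), int(size[1] * 0.9)
        top = random.randint(0, size[0] - h)
        left = random.randint(0, size[1] - w)
        rgb, depth, gt = [TF.crop(x, top, left, h, w) for x in (rgb, depth, gt)]
    # Normalize the color image (ImageNet statistics assumed); depth and GT are only scaled to [0, 1].
    rgb = TF.normalize(TF.to_tensor(rgb), [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    depth = TF.to_tensor(depth)
    gt = TF.to_tensor(gt)
    return rgb, depth, gt
```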
Step two, establishing an image saliency detection model;
1. and for each image in the image sample set, firstly, a multilevel convolutional neural network module is utilized to respectively extract the characteristics of the color image and the depth image in a multilevel mode.
For a segmentation network, the larger the receptive field, the larger the range the network can capture, the more information is available for analysis, and the better the segmentation effect. Shallow convolution layers have narrow receptive fields and retain a large amount of detail information, which benefits fine segmentation; deep convolution layers have relatively wide receptive fields and can learn abstract features that improve classification performance. The FCN adopts a skip-level structure that makes full use of shallow information to assist gradual upsampling and thereby obtain a refined segmentation; however, in the FCN the actual receptive field of the fc7 layer covers only 1/4 of the full image rather than the whole image, which is not sufficient for the task. To obtain a larger receptive field, one usually either increases the network depth or uses large convolution kernels; the former not only greatly increases the network burden but also easily causes gradient explosion or vanishing gradients, while the latter sharply increases the computation, hinders increasing the network depth, and degrades computational performance.
Based on the above problems, the invention designs a global-local feature extraction convolution block GL Block with a dual-branch structure for extracting local and global features, whose structure is shown in FIG. 2 and which is used in the dual-stream network shown in FIG. 1. The multi-level convolutional neural network module contains two branches acting on the color image and the depth image respectively, each containing five convolution blocks: the first is a standard convolution block and the rest are the GL Blocks provided by the invention. Upsampling is then performed by deconvolution, and shallow information is merged through skip connections, so that every convolution block can perform global feature extraction without increasing the network load, preserving computation speed and helping to optimize the whole network structure.
The GL Block provided by the invention has a dual-branch structure, i.e. a local branch and a global branch, which extract local and global features respectively. The local branch first reduces the input feature map to 1/4 of its original size with a convolution layer of stride 2, kernel size 3 × 3 and ReLU activation, and then extracts local features with two identical convolution layers of stride 1. To reduce the computation of the branch, the global branch adopts a bottleneck structure: it first explicitly extracts global features with a global average pooling layer, applies a series of convolution operations to integrate the global spatial information of the whole image, then learns the global feature distribution with Softmax, and finally fuses the global and local features by dot multiplication.
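A minimal PyTorch sketch of such a GL Block follows; the channel widths and the exact layout of the bottleneck (1 × 1 convolutions around the global average pooling) are assumptions, since only the stride/kernel pattern of the local branch and the pooling-convolution-Softmax pattern of the global branch are specified above:

```python
import torch.nn as nn

class GLBlock(nn.Module):
    """Global-local feature extraction block (sketch)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Local branch: a stride-2 3x3 conv halves H and W (1/4 of the area),
        # followed by two identical stride-1 3x3 convs with ReLU.
        self.local = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1), nn.ReLU(inplace=True),
        )
        # Global branch: bottleneck of global average pooling, 1x1 convs and a
        # channel-wise Softmax that models the global feature distribution.
        self.glob = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // 4, out_ch, 1),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        local = self.local(x)      # B x out_ch x H/2 x W/2
        glob = self.glob(x)        # B x out_ch x 1 x 1
        return local * glob        # dot-product (element-wise) fusion
```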
2. A cross feature fusion module performs multi-level dot-product fusion on the color and depth image features extracted by the deeper convolution levels of the multi-level convolutional neural network module to obtain initial saliency maps.
Because existing cross-modal feature fusion designs are mostly based on addition or concatenation, their structures are complex, the computation is large, and redundant noise is easily introduced. Inspired by the attention mechanism, the cross feature fusion module CFFM is therefore built on dot multiplication. As shown in FIG. 3, it is used to fuse the color image feature f_r ∈ R^(H×W×C), which provides vivid appearance and texture information, with the depth image feature f_d ∈ R^(H×W×C), which provides clear object shape, contour and spatial structure. Considering that shallow depth features contain a large amount of noise, the invention applies the cross feature fusion module at the deeper levels of the multi-level convolutional neural network module.
The cross feature fusion module fuses the color image feature f_r and the depth image feature f_d by dot multiplication. It comprises a first convolution and a second convolution: the first convolution performs feature extraction and channel compression on the color image feature f_r extracted by one branch of the multi-level convolutional neural network module, reducing the computation of subsequent processing, while the second convolution performs feature extraction and channel compression on the depth image feature f_d extracted by the other branch. The common features of f_r and f_d are then extracted by dot multiplication; the fused features have clear boundaries and semantic consistency. A third convolution then applies convolution and activation operations so that the fused features can be merged with the original color image feature f_r and depth image feature f_d, for example by restoring the channel number and adding the result to the original features. Through repeated cross feature fusion, the color image feature f_r and the depth image feature f_d gradually absorb each other's useful information so that they complement each other, reducing redundant information in the color image feature f_r and sharpening the boundaries of the depth image feature f_d. Finally, a 3 × 3 convolution restores the original channel number, and the result is added to the original color image feature f_r and depth image feature f_d to obtain refined features. The process can be expressed by the following formulas:
f_r = f_r + W_2(W_r(f_r) * W_d(f_d))

f_d = f_d + W_2(W_r(f_r) * W_d(f_d))

wherein W_r, W_d and W_2 are the network parameters of 3 × 3 convolutions used to compress and restore the channels.
The whole cross feature fusion module adopts a symmetric structure: the original color image feature f_r and depth image feature f_d extracted by the two branches of the multi-level convolutional neural network module are dot-multiplied, and the result is fed back into the corresponding branches. The more common information the two features share, the larger the product. The color image feature f_r passes detail information to the depth image feature f_d to refine its edges, and the depth image feature f_d passes saliency semantics to the color image feature f_r. Redundant information is discarded because it does not appear in the color and depth components at the same time, so the edges can be refined.
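A minimal PyTorch sketch of the cross feature fusion module implementing these formulas is given below; the channel compression ratio is an assumption:

```python
import torch.nn as nn

class CFFM(nn.Module):
    """Cross feature fusion module (sketch)."""
    def __init__(self, channels, reduce=4):
        super().__init__()
        mid = channels // reduce
        # W_r and W_d: 3x3 convolutions that extract features and compress channels.
        self.w_r = nn.Sequential(nn.Conv2d(channels, mid, 3, padding=1), nn.ReLU(inplace=True))
        self.w_d = nn.Sequential(nn.Conv2d(channels, mid, 3, padding=1), nn.ReLU(inplace=True))
        # W_2: 3x3 convolution that restores the original channel number.
        self.w_2 = nn.Sequential(nn.Conv2d(mid, channels, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, f_r, f_d):
        # Common features via element-wise (dot-product) multiplication.
        common = self.w_r(f_r) * self.w_d(f_d)
        fused = self.w_2(common)
        # Feed the common features back into both streams (residual addition).
        return f_r + fused, f_d + fused
```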
3. An Inception structure performs multi-scale fusion on the initial saliency maps and outputs a network-predicted saliency map; finally, a focal loss is computed from the network-predicted saliency map and the target saliency map to learn the optimal parameters of the image saliency detection model and obtain a trained image saliency detection model.
The Inception structure fuses the initial saliency maps output by the depth branch and the color branch and outputs the network-predicted saliency map. By connecting small and large convolution kernels in parallel, it achieves the intended fusion while compressing the number of model parameters.
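A possible PyTorch sketch of such an Inception-style fusion head follows; the branch kernel sizes (1 × 1, 3 × 3, 5 × 5) and channel widths are assumptions, since only the parallel use of small and large kernels is specified above:

```python
import torch
import torch.nn as nn

class InceptionFusion(nn.Module):
    """Multi-scale fusion of the color and depth initial saliency maps (sketch)."""
    def __init__(self, in_ch=2, mid_ch=16):
        super().__init__()
        # Parallel branches with small and large kernels.
        self.b1 = nn.Conv2d(in_ch, mid_ch, 1)
        self.b3 = nn.Conv2d(in_ch, mid_ch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, mid_ch, 5, padding=2)
        self.out = nn.Conv2d(3 * mid_ch, 1, 1)   # single-channel predicted saliency map

    def forward(self, sal_rgb, sal_depth):
        x = torch.cat([sal_rgb, sal_depth], dim=1)                  # B x 2 x H x W
        x = torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)  # multi-scale features
        return torch.sigmoid(self.out(x))
```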
The ordinary cross-entropy loss cannot cope with the imbalance between positive and negative samples and between foreground and background in real scenes. Therefore, the invention introduces the focal loss to solve this problem, whose formula is as follows:

L_fl = -α · y · (1 - ŷ)^γ · log(ŷ) - (1 - α) · (1 - y) · ŷ^γ · log(1 - ŷ)

wherein y and ŷ respectively denote the target saliency map and the network-predicted saliency map; γ is a constant that reduces the loss of easily classified samples so that the network pays more attention to hard samples; α is a balance factor that increases the contribution of the foreground to the loss function in order to balance the positive and negative samples.
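A minimal PyTorch sketch of this loss follows; the default values of γ and α are taken from the original Focal Loss paper and are assumptions, since no specific values are listed here:

```python
import torch

def focal_loss(pred, target, gamma=2.0, alpha=0.25, eps=1e-8):
    """Focal loss for a predicted saliency map `pred` in (0, 1) and a binary ground truth `target`."""
    pos = -alpha * (1 - pred).pow(gamma) * target * torch.log(pred + eps)
    neg = -(1 - alpha) * pred.pow(gamma) * (1 - target) * torch.log(1 - pred + eps)
    return (pos + neg).mean()
```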
And step three, inputting the RGB-D image to be processed into the trained image saliency detection model, and outputting a corresponding saliency detection result, namely a saliency map, through model calculation.
The model of the invention is implemented in PyTorch and trained on two GTX 1080Ti GPUs (11 GB) with the Adam optimizer; the training momentum, learning rate, weight decay and batch size are set to (0.9, 0.999), 0.0005, 1e-5 and 16, respectively. Since the model is end-to-end, no pre-training or other additional operations are required.
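A hypothetical optimizer configuration mirroring these hyper-parameters could look as follows; the one-layer network is only a stand-in for the full dual-stream model described above:

```python
import torch
import torch.nn as nn

# Placeholder network standing in for the dual-stream saliency model.
model = nn.Sequential(nn.Conv2d(4, 1, 3, padding=1), nn.Sigmoid())

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=0.0005,             # learning rate
    betas=(0.9, 0.999),    # training momentum terms
    weight_decay=1e-5,     # weight decay rate
)
batch_size = 16
```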
To verify the feasibility of the method, 1585 pictures from the NJU2000 data set are selected as a training set and 400 as a test set; 800 pictures from the NLPR data set as a training set and 200 as a test set; and 637 pictures from the STEREO data set as a training set and 160 as a test set. The experimental results in FIGS. 5-6 show that the proposed model consistently has advantages, accurately detects the salient regions of the image, and occupies fewer computing resources than other methods.
In the invention, Precision and Recall are used as evaluation indexes, and the P-R curve is drawn to evaluate the performance of the algorithm, as shown in FIG. 5; the calculation formulas are as follows:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

wherein TP, FP, TN and FN respectively denote the numbers of true positives, false positives, true negatives and false negatives.
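A minimal NumPy sketch of the pixel-wise P-R computation is given below; the number of binarization thresholds is an assumption:

```python
import numpy as np

def precision_recall(pred, gt, num_thresholds=255):
    """Compute P-R pairs for a saliency map `pred` and ground truth `gt`, both float arrays in [0, 1]."""
    precisions, recalls = [], []
    gt = gt > 0.5
    for t in np.linspace(0, 1, num_thresholds):
        binary = pred >= t
        tp = np.logical_and(binary, gt).sum()
        fp = np.logical_and(binary, ~gt).sum()
        fn = np.logical_and(~binary, gt).sum()
        precisions.append(tp / (tp + fp + 1e-8))
        recalls.append(tp / (tp + fn + 1e-8))
    return np.array(precisions), np.array(recalls)
```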
Although specific embodiments of the present invention have been described above, it will be appreciated by those skilled in the art that these are merely examples and that many variations or modifications may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is therefore defined by the appended claims.

Claims (6)

1. An RGBD image saliency detection method based on interactive feature fusion is characterized by comprising the following steps:
firstly, establishing an image sample set for training;
step two, establishing an image saliency detection model;
for each image in the image sample set, a multi-level convolutional neural network module first extracts features from the color image and the depth image at multiple levels; a cross feature fusion module performs multi-level dot-product fusion on the color and depth image features extracted by the deeper convolution levels to obtain initial saliency maps; an Inception structure then performs multi-scale fusion on the initial saliency maps to output a network-predicted saliency map; finally, a focal loss is computed from the network-predicted saliency map and the target saliency map to learn the optimal parameters of an image saliency detection model and obtain a trained image saliency detection model;
and step three, inputting the RGB-D image to be processed into the trained image saliency detection model, and outputting a corresponding saliency detection result, namely a saliency map, through model calculation.
2. The RGBD image saliency detection method based on interactive feature fusion of claim 1, characterized in that: the cross feature fusion module comprises a first convolution, a second convolution and a third convolution; the first convolution extracts features from the color image features, the second convolution extracts features from the depth image features, the common features of the color image features and the depth image features are extracted by dot multiplication and fused, and the third convolution then applies convolution and activation operations so that the fused features can be merged back into the original color image features and the original depth image features respectively.
3. The RGBD image saliency detection method based on interactive feature fusion of claim 2, characterized in that: the first convolution, the second convolution and the third convolution have the same structure.
4. The RGBD image saliency detection method based on interactive feature fusion of claim 1, characterized in that: the multi-level convolutional neural network module comprises two identical branches acting on the color image and the depth image respectively; it adopts an FCN structure with five convolution layers, where the first layer uses a standard convolution block and the remaining layers use global-local feature extraction convolution blocks;
the global-local feature extraction convolution block comprises a global branch and a local branch; the local branch first reduces the input feature map to 1/4 of its original size with a convolution of stride 2 and then extracts local features with two identical convolutions of stride 1; the global branch extracts global features with a bottleneck structure; finally, the extracted global and local features are fused by dot multiplication.
5. The RGBD image saliency detection method based on interactive feature fusion according to claim 4, characterized in that: the convolutions with stride 1 use 3 × 3 kernels and the ReLU activation function.
6. The RGBD image saliency detection method based on interactive feature fusion of claim 1, characterized in that: the focal loss function L_fl is defined as

L_fl = -α · y · (1 - ŷ)^γ · log(ŷ) - (1 - α) · (1 - y) · ŷ^γ · log(1 - ŷ)

wherein y and ŷ respectively denote the target saliency map and the network-predicted saliency map, γ denotes a constant, and α denotes a balance factor.
CN202111039181.5A 2021-09-06 2021-09-06 RGBD image saliency detection method based on interactive feature fusion Pending CN113963170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111039181.5A CN113963170A (en) 2021-09-06 2021-09-06 RGBD image saliency detection method based on interactive feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111039181.5A CN113963170A (en) 2021-09-06 2021-09-06 RGBD image saliency detection method based on interactive feature fusion

Publications (1)

Publication Number Publication Date
CN113963170A true CN113963170A (en) 2022-01-21

Family

ID=79461154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111039181.5A Pending CN113963170A (en) 2021-09-06 2021-09-06 RGBD image saliency detection method based on interactive feature fusion

Country Status (1)

Country Link
CN (1) CN113963170A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445442A (en) * 2022-01-28 2022-05-06 杭州电子科技大学 Multispectral image semantic segmentation method based on asymmetric cross fusion
CN115359019A (en) * 2022-08-25 2022-11-18 杭州电子科技大学 Steel surface defect detection method based on interactive features and cascade features
CN115457259A (en) * 2022-09-14 2022-12-09 华洋通信科技股份有限公司 Image rapid saliency detection method based on multi-channel activation optimization
CN115457259B (en) * 2022-09-14 2023-10-31 华洋通信科技股份有限公司 Image rapid saliency detection method based on multichannel activation optimization
CN117593517A (en) * 2024-01-19 2024-02-23 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network
CN117593517B (en) * 2024-01-19 2024-04-16 南京信息工程大学 Camouflage target detection method based on complementary perception cross-view fusion network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination