CN112348870A - Salient object detection method based on residual fusion - Google Patents

Salient object detection method based on residual fusion

Info

Publication number
CN112348870A
Authority
CN
China
Prior art keywords
feature
module
fusion
rgb
multiplied
Prior art date
Legal status
Granted
Application number
CN202011235626.2A
Other languages
Chinese (zh)
Other versions
CN112348870B (en)
Inventor
Zhang Lihe (张立和)
Jin Yu (金玉)
Current Assignee
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN202011235626.2A
Publication of CN112348870A
Application granted
Publication of CN112348870B
Legal status: Active

Classifications

    • G06T 7/55 Depth or shape recovery from multiple images
    • G06F 18/253 Fusion techniques of extracted features
    • G06N 3/045 Neural networks; combinations of networks
    • G06N 3/08 Neural networks; learning methods
    • G06T 3/4007 Scaling of whole images or parts thereof based on interpolation, e.g. bilinear interpolation
    • G06V 10/40 Extraction of image or video features
    • G06T 2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; pyramid transform
    • G06V 2201/07 Target detection


Abstract

The invention belongs to the technical field of artificial intelligence and provides a salient object detection method based on residual fusion. The method first constructs a saliency detection model, extracts multi-level RGB image features and depth image features through a dual-stream feature extraction module, further extracts deep-level features using residual modules, and progressively fuses features from the RGB feature extraction branch with those of the corresponding previous level through fusion modules, so as to train the final algorithm model. The invention realizes end-to-end saliency prediction, has low model complexity, and can fully and effectively exploit RGB image information and depth image information to predict salient regions.

Description

Salient object detection method based on residual fusion
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to deep learning, and particularly relates to an image saliency detection method.
Background
Salient object detection is an important first step in enabling a computer to understand its surrounding environment. Its task is to let a computer mimic the human attention mechanism and detect the attention-attracting regions in an image. These attention-attracting regions contain most of the visual information in the image. By screening out the image foreground regions that contain the main visual information, subsequent image-understanding steps can obtain cleaner and more accurate content information from the image, and the computation and storage resources spent on processing the image background regions can be reduced, which improves the overall performance of those subsequent steps. Usually, people focus only on the regions of an image that most attract the human eye, i.e., the foreground regions or salient objects, while ignoring the background regions. Therefore, a computer is used to simulate the human visual system for saliency detection.
However, most existing deep-learning-based saliency detection targets RGB images and relies on the color image alone while ignoring the corresponding depth information, which limits the accuracy and efficiency of saliency detection, especially when the foreground and background are difficult to distinguish; RGBD saliency detection emerged to address this. RGBD saliency detection aims to accurately detect salient objects from an image with the aid of its depth image. Although some progress has been made in RGBD saliency detection, there is still great room for improvement. First, although devices such as the Kinect and light-field cameras make depth images easier to acquire, they also introduce a certain amount of noise, and how to design a better algorithm to fit a model under this condition deserves careful consideration. Second, deep-learning-based saliency detection algorithms generally face the problem of how to better fuse RGB information and depth information: the RGB image contains a large amount of information such as color and texture, the depth image contains rich geometric and edge information, and the depth information also overlaps with some of the information contained in the RGB image, so how to better combine the two so that they complement each other and highlight the salient region more accurately is a problem worth considering at present.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to make up for the deficiencies of existing methods, an RGBD image salient object detection method based on residual fusion is provided, with the aim of achieving higher model accuracy.
The technical scheme of the invention is as follows:
a salient object detection method based on residual fusion comprises the following steps:
(1) constructing a significance detection model
The saliency detection model comprises a dual-stream feature extraction module, a multi-scale feature pyramid pooling module, a residual fusion module and a parallel upsampling module;
(2) carrying out channel copying on an original depth image corresponding to the RGB image I to obtain a depth image D;
(3) as shown in fig. 1, the RGB image I and its depth image D are input into the saliency detection model, and multi-level RGB image features {I_i, i = 1,2,3,4,5} and multi-level depth image features {D_i, i = 1,2,3,4,5} are extracted by the RGB image feature extraction branch (the top row in fig. 1) and the depth image feature extraction branch (the bottom row in fig. 1) of the dual-stream feature extraction module, respectively;
(4) adding a multi-scale feature pyramid pooling module to the final stage of the RGB image feature extraction branch, further extracting deep-level features through five residual fusion modules (res-fuse in fig. 1), and performing parallel upsampling step by step to obtain the final saliency prediction result P_final;
(5) the multi-scale feature pyramid pooling module comprises six sub-branches, as shown in fig. 2, and is used to obtain context information of the input feature data: the first sub-branch adopts a 1 × 1 convolutional layer; the second, third and fourth sub-branches adopt dilated convolutions with dilation rates of 3, 5 and 7, respectively; and the fifth sub-branch adopts global average pooling to obtain a 1 × 1 feature representation; the sixth sub-branch connects the input feature data to the output through a direct skip connection; the first four branches further strengthen the feature expression using 1 × 1 convolutional layers and dilated convolutions while keeping the feature size and number of channels unchanged; the feature representations obtained by convolution learning are further upsampled to the size of the input feature, using bilinear interpolation as the upsampling strategy; finally, the six sub-branches are combined by channel concatenation to obtain the multi-scale feature pyramid pooling representation of the input feature data;
(6) the residual fusion module is used for fusing the branch features {I_i, D_i, i = 1,2,3,4,5} from the feature extraction module, and is defined as follows:
res_fuse(I_5, D_5) = F_3(MSP(I_5)) + F_1(D_5)
res_fuse(I_i, D_i) = F_3(C(Up(res_fuse(I_{i+1}, D_{i+1})), I_i)) + F_1(D_i),  i = 1, 2, 3, 4
wherein res_fuse(·) represents residual fusion, Up(·) represents parallel upsampling, C(·) represents fusion of two inputs along the channel direction, MSP(·) represents the multi-scale feature pyramid pooling module, F_3(·) represents three consecutive convolution and ReLU operations, F_1(·) represents one convolution and ReLU operation, and + represents element-wise addition; for residual fusion, two cases are distinguished: when its input comes from the last stage of the feature extraction module, I_5 is passed directly through the multi-scale feature pyramid pooling module and the result is taken as the RGB feature; this RGB feature undergoes three consecutive convolution and ReLU operations, and the result is added element-wise to the feature obtained from D_5 after one convolution and ReLU operation to give the residual fusion result; otherwise, for the i-th level RGB image feature I_i and the i-th level depth image feature D_i of the feature extraction module, the residual fusion result obtained from I_{i+1} and D_{i+1} is first upsampled in parallel and then fused with I_i along the channel direction to serve as the RGB feature; the feature obtained after three consecutive convolution and ReLU operations on this RGB feature is added element-wise to the feature obtained from D_i after one convolution and ReLU operation to give the residual fusion result;
(7) the parallel upsampling comprises four sub-branches, as shown in fig. 4, which use convolutional layers with different receptive fields and are intended to capture different local structures; the responses produced by the four convolutional layers are then concatenated into a tensor feature of size H × W × 2C; in order to obtain a feature with half the original number of channels and twice the spatial size, the H × W × 2C tensor is divided along the channel dimension in units of C/2 and the groups are stitched and recombined along the H direction and the W direction respectively, finally yielding a feature of size 2H × 2W × C/2;
(8) the weight parameters are initialized with a VGG-16 model pre-trained on ImageNet; in the model training phase, the cross-entropy loss function is used as the objective function and optimized with the Adam algorithm, with the momentum set to 0.9, the weight decay set to 0.1, the base learning rate set to 1 × 10^-6, and the batch size set to 1.
The invention has the following beneficial effects: the method fully exploits the complementary information contained in the RGB image and the corresponding depth image, and achieves accurate prediction of the salient region in the RGBD image through residual fusion. In addition, the feature aggregation module and a reasonable upsampling scheme aggregate features of different scales, so that end-to-end saliency prediction can be achieved while fully and effectively using the RGB image information and the depth image information.
Drawings
Fig. 1 is a framework diagram of the RGBD saliency detection method based on residual fusion, where the top row represents the RGB image feature extraction branch and the bottom row represents the depth image feature extraction branch;
FIG. 2 is a schematic diagram of a multi-scale feature pyramid pooling module;
FIG. 3 is a schematic diagram of a residual fusion module;
fig. 4 is a schematic diagram of a parallel upsampling module.
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
The invention is implemented as follows:
(1) constructing a significance detection model
The saliency detection model comprises a dual-stream feature extraction module, a multi-scale feature pyramid pooling module, a residual fusion module and a parallel upsampling module.
(2) Perform channel replication on the depth image corresponding to the RGB image to obtain a three-channel image, i.e., the depth image D.
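For illustration, the channel replication of this step can be realized in a few lines; the following PyTorch sketch assumes the depth map is a single-channel tensor in B × 1 × H × W layout (the function name is illustrative, not taken from the patent):

```python
import torch

def depth_to_three_channels(depth: torch.Tensor) -> torch.Tensor:
    """Replicate a single-channel depth map (B x 1 x H x W) into three identical
    channels (B x 3 x H x W) so it can be fed to a VGG-style encoder branch."""
    assert depth.dim() == 4 and depth.size(1) == 1
    return depth.repeat(1, 3, 1, 1)
```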
(3) Input the RGB image I and its depth image D into the saliency detection model, and extract multi-level RGB image features {I_i, i = 1,2,3,4,5} and depth image features {D_i, i = 1,2,3,4,5} through the RGB image feature extraction branch (RGB encoder) and the depth image feature extraction branch (depth encoder) of the dual-stream feature extraction module, respectively.
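One way the two encoder branches could be realized is sketched below with a torchvision VGG-16 backbone, taking the five feature levels at the outputs of the five pooling stages; the slicing indices, the class name, and the use of torchvision are assumptions for illustration, since the patent only specifies a VGG-16 initialization and five feature levels:

```python
import torch.nn as nn
from torchvision import models

class VGGEncoder(nn.Module):
    """Extracts five levels of features from a VGG-16 backbone (one per pooling stage)."""
    def __init__(self, pretrained: bool = True):
        super().__init__()
        weights = models.VGG16_Weights.IMAGENET1K_V1 if pretrained else None
        feats = models.vgg16(weights=weights).features
        self.stage1 = feats[:5]     # conv1_1 .. pool1
        self.stage2 = feats[5:10]   # conv2_1 .. pool2
        self.stage3 = feats[10:17]  # conv3_1 .. pool3
        self.stage4 = feats[17:24]  # conv4_1 .. pool4
        self.stage5 = feats[24:31]  # conv5_1 .. pool5

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        f4 = self.stage4(f3)
        f5 = self.stage5(f4)
        return [f1, f2, f3, f4, f5]

# Dual-stream usage: one encoder for the RGB image, another for the replicated depth image.
# rgb_feats   = VGGEncoder()(I)   # {I_i, i = 1..5}
# depth_feats = VGGEncoder()(D)   # {D_i, i = 1..5}
```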
(4) Add a feature pyramid module at the final stage of the RGB encoder branch, further extract deep-level features through the residual fusion modules, upsample step by step, and apply supervision stage by stage during the extraction process to achieve global optimization.
(5) As shown in fig. 2, the multi-scale feature pooling module comprises six sub-branches to obtain context information of the input RGB image feature data: the first sub-branch adopts a 1 × 1 convolutional layer; the second, third and fourth sub-branches adopt dilated convolutions with dilation rates of 3, 5 and 7, respectively; and the fifth sub-branch adopts global average pooling to obtain a 1 × 1 feature representation; the sixth sub-branch connects the input features to the output through a direct skip connection. The first four branches further enhance the feature expression using 1 × 1 convolutional layers and dilated convolutions while keeping the feature size and number of channels unchanged. The feature representations obtained by convolution learning are further upsampled to the size of the input feature, using bilinear interpolation as the upsampling strategy. The fifth branch is concatenated to the final feature map after global average pooling, and finally the features of the six sub-branches are combined by channel concatenation to obtain the feature representation with fused multi-scale pooling.
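A possible PyTorch sketch of this module is given below; the 3 × 3 kernel used in the dilated branches, the per-branch channel widths, and the class name are assumptions, since the patent specifies only the six-branch structure, the dilation rates 3, 5 and 7, bilinear upsampling, and channel concatenation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScalePyramidPooling(nn.Module):
    """Six-branch context module: 1x1 conv, three dilated convs (rates 3/5/7),
    global average pooling, and an identity skip, fused by channel concatenation."""
    def __init__(self, channels: int):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.ReLU(inplace=True))
        self.branch2 = self._dilated(channels, 3)
        self.branch3 = self._dilated(channels, 5)
        self.branch4 = self._dilated(channels, 7)
        self.branch5 = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(channels, channels, 1), nn.ReLU(inplace=True))

    @staticmethod
    def _dilated(channels: int, rate: int) -> nn.Sequential:
        # 3x3 dilated conv with matching padding keeps the spatial size unchanged.
        return nn.Sequential(nn.Conv2d(channels, channels, 3, padding=rate, dilation=rate),
                             nn.ReLU(inplace=True))

    def forward(self, x):
        h, w = x.shape[2:]
        outs = [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)]
        # The global-pooling branch is upsampled back to the input size (bilinear, per the description).
        gap = F.interpolate(self.branch5(x), size=(h, w), mode='bilinear', align_corners=False)
        outs.append(gap)
        outs.append(x)  # direct skip connection of the input features
        return torch.cat(outs, dim=1)  # channel-wise concatenation of the six branches
```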
(6) The residual fusion module is used to fuse the branch features {I_i, D_i, i = 1,2,3,4,5} extracted by the encoders and is defined as follows:
res_fuse(I_5, D_5) = F_3(MSP(I_5)) + F_1(D_5)
res_fuse(I_i, D_i) = F_3(C(Up(res_fuse(I_{i+1}, D_{i+1})), I_i)) + F_1(D_i),  i = 1, 2, 3, 4
where res_fuse(·) represents residual fusion, Up(·) represents parallel upsampling, C(·) represents fusion of two features along the channel direction, and MSP(·), F_3(·), F_1(·) and the element-wise addition + are as defined above, as shown in fig. 3. For residual fusion there are three input sources: the RGB encoder feature of the corresponding size, the depth encoder feature, and the output feature of the previous residual fusion module (when the residual block is the rightmost one in fig. 1, i.e., i = 5, this last input does not exist). The RGB encoder feature of the corresponding size and the feature from the previous residual fusion module are first concatenated along the channel direction, and this concatenated feature and the depth encoder feature are then fed into the residual fusion module to obtain the residual fusion feature.
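The residual fusion described above could be sketched as follows; the kernel sizes, internal channel widths, and class name are assumptions, while the structure (three convolution-and-ReLU operations on the RGB side, one on the depth side, element-wise addition, and optional concatenation with the upsampled previous result) follows the definition above:

```python
import torch
import torch.nn as nn

class ResidualFusion(nn.Module):
    """Fuses an RGB-branch feature (possibly concatenated with the upsampled output
    of the previous fusion stage) with the depth feature of the same level."""
    def __init__(self, rgb_channels: int, depth_channels: int, out_channels: int):
        super().__init__()
        # rgb_channels must equal the channel count of the (possibly concatenated) RGB-side input.
        # Three consecutive convolution + ReLU operations on the RGB-side feature.
        self.rgb_path = nn.Sequential(
            nn.Conv2d(rgb_channels, out_channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1), nn.ReLU(inplace=True))
        # One convolution + ReLU operation on the depth-side feature.
        self.depth_path = nn.Sequential(
            nn.Conv2d(depth_channels, out_channels, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, rgb_feat, depth_feat, previous=None):
        if previous is not None:
            # Concatenate the upsampled result of the previous stage along the channel direction.
            rgb_feat = torch.cat([previous, rgb_feat], dim=1)
        return self.rgb_path(rgb_feat) + self.depth_path(depth_feat)  # element-wise addition
```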
(7) As shown in fig. 4, the parallel upsampling comprises four sub-branches that use convolutional layers with different receptive fields and are intended to capture different local structures. The responses produced by the four convolutional layers are then concatenated into a tensor feature of size H × W × 2C; in order to obtain a feature with half the original number of channels and twice the spatial size, the H × W × 2C tensor is divided along the channel dimension in units of C/2 and the groups are stitched and recombined along the H direction and the W direction respectively, finally yielding a feature of size 2H × 2W × C/2.
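The channel-to-space regrouping described here behaves like a pixel shuffle with an upscaling factor of 2, so a sketch can use torch.nn.PixelShuffle for that step; the kernel sizes 1, 3, 5 and 7 chosen to give the four branches different receptive fields, the even split of the 2C output channels across the branches, and the use of PixelShuffle (whose interleaving order may differ from the exact stitching intended by the patent) are assumptions:

```python
import torch
import torch.nn as nn

class ParallelUpsample(nn.Module):
    """Four parallel conv branches with different receptive fields produce an
    H x W x 2C tensor, which is regrouped into a 2H x 2W x C/2 feature."""
    def __init__(self, channels: int):
        super().__init__()
        branch_out = channels // 2  # four branches together give 2C output channels
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, branch_out, k, padding=k // 2) for k in (1, 3, 5, 7)])
        # Rearranges C/2-channel groups into 2x2 spatial blocks: (B, 2C, H, W) -> (B, C/2, 2H, 2W).
        self.shuffle = nn.PixelShuffle(upscale_factor=2)

    def forward(self, x):
        y = torch.cat([b(x) for b in self.branches], dim=1)  # B x 2C x H x W
        return self.shuffle(y)                               # B x C/2 x 2H x 2W
```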
(8) The weight parameters are initialized with a VGG-16 model pre-trained on ImageNet; in the model training phase, the cross-entropy loss function is used as the objective function and optimized with the Adam algorithm, with the momentum set to 0.9, the weight decay set to 0.1, the base learning rate set to 1 × 10^-6, and the batch size set to 1.
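A sketch of this training configuration is given below; the model and data loader are placeholders, binary cross-entropy is assumed as the concrete form of the cross-entropy objective for saliency maps, and the momentum of 0.9 is mapped onto Adam's beta1 parameter, since PyTorch's Adam has no separate momentum argument:

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader) -> None:
    """One-epoch training loop with the hyper-parameters stated in this step."""
    criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy over the saliency map (assumed)
    optimizer = torch.optim.Adam(model.parameters(),
                                 lr=1e-6,            # base learning rate 1 x 10^-6
                                 betas=(0.9, 0.999), # beta1 = 0.9 plays the role of the momentum term
                                 weight_decay=0.1)   # weight decay (attenuation rate) 0.1
    for rgb, depth, gt in loader:       # loader is expected to yield batches of size 1
        optimizer.zero_grad()
        pred = model(rgb, depth)        # final saliency prediction P_final (logits)
        loss = criterion(pred, gt)
        loss.backward()
        optimizer.step()
```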

Claims (1)

1. A salient object detection method based on residual fusion is characterized by comprising the following steps:
(1) constructing a significance detection model
The saliency detection model comprises a dual-stream feature extraction module, a multi-scale feature pyramid pooling module, a residual fusion module and a parallel upsampling module;
(2) carrying out channel copying on an original depth image corresponding to the RGB image I to obtain a depth image D;
(3) inputting an RGB image I and its depth image D into the saliency detection model, and extracting multi-level RGB image features {I_i, i = 1,2,3,4,5} and multi-level depth image features {D_i, i = 1,2,3,4,5} through the RGB image feature extraction branch and the depth image feature extraction branch of the dual-stream feature extraction module, respectively;
(4) adding a multi-scale feature pyramid pooling module to the final stage of the RGB image feature extraction branch, further extracting deep-level features through five residual fusion modules, and performing parallel upsampling step by step to obtain the final saliency prediction result P_final;
(5) the multi-scale feature pyramid pooling module comprises six sub-branches and is used to obtain context information of the input feature data: the first sub-branch adopts a 1 × 1 convolutional layer; the second, third and fourth sub-branches adopt dilated convolutions with dilation rates of 3, 5 and 7, respectively; and the fifth sub-branch adopts global average pooling to obtain a 1 × 1 feature representation; the sixth sub-branch connects the input feature data to the output through a direct skip connection; the first four branches further strengthen the feature expression using 1 × 1 convolutional layers and dilated convolutions while keeping the feature size and number of channels unchanged; the feature representations obtained by convolution learning are further upsampled to the size of the input feature, using bilinear interpolation as the upsampling strategy; finally, the six sub-branches are combined by channel concatenation to obtain the multi-scale feature pyramid pooling representation of the input feature data;
(6) the residual fusion module is used to fuse the branch features {I_i, D_i, i = 1,2,3,4,5} from the dual-stream feature extraction module and is defined as follows:
res_fuse(I_5, D_5) = F_3(MSP(I_5)) + F_1(D_5)
res_fuse(I_i, D_i) = F_3(C(Up(res_fuse(I_{i+1}, D_{i+1})), I_i)) + F_1(D_i),  i = 1, 2, 3, 4
wherein res_fuse(·) represents residual fusion, Up(·) represents parallel upsampling, C(·) represents fusion of two inputs along the channel direction, MSP(·) represents the multi-scale feature pyramid pooling module, F_3(·) represents three consecutive convolution and ReLU operations, F_1(·) represents one convolution and ReLU operation, and + represents element-wise addition; for residual fusion, two cases are distinguished: when its input comes from the last stage of the feature extraction module, I_5 is passed directly through the multi-scale feature pyramid pooling module and the result is taken as the RGB feature; this RGB feature undergoes three consecutive convolution and ReLU operations, and the result is added element-wise to the feature obtained from D_5 after one convolution and ReLU operation to give the residual fusion result; otherwise, for the i-th level RGB image feature I_i and the i-th level depth image feature D_i of the feature extraction module, the residual fusion result obtained from I_{i+1} and D_{i+1} is first upsampled in parallel and then fused with I_i along the channel direction to serve as the RGB feature; the feature obtained after three consecutive convolution and ReLU operations on this RGB feature is added element-wise to the feature obtained from D_i after one convolution and ReLU operation to give the residual fusion result;
(7) the parallel upsampling comprises four sub-branches, which use convolutional layers with different receptive fields and are intended to capture different local structures; the responses produced by the four convolutional layers are then concatenated into a tensor feature of size H × W × 2C; in order to obtain a feature with half the original number of channels and twice the spatial size, the H × W × 2C tensor is divided along the channel dimension in units of C/2 and the groups are stitched and recombined along the H direction and the W direction respectively, finally yielding a feature of size 2H × 2W × C/2;
(8) the weight parameters are initialized with a VGG-16 model pre-trained on ImageNet; in the model training phase, the cross-entropy loss function is used as the objective function and optimized with the Adam algorithm, with the momentum set to 0.9, the weight decay set to 0.1, the base learning rate set to 1 × 10^-6, and the batch size set to 1.
CN202011235626.2A 2020-11-06 2020-11-06 Salient object detection method based on residual fusion Active CN112348870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011235626.2A CN112348870B (en) 2020-11-06 2020-11-06 Salient object detection method based on residual fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011235626.2A CN112348870B (en) 2020-11-06 2020-11-06 Salient object detection method based on residual fusion

Publications (2)

Publication Number Publication Date
CN112348870A true CN112348870A (en) 2021-02-09
CN112348870B CN112348870B (en) 2022-09-30

Family

ID=74429671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011235626.2A Active CN112348870B (en) 2020-11-06 2020-11-06 Salient object detection method based on residual fusion

Country Status (1)

Country Link
CN (1) CN112348870B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205481A (en) * 2021-03-19 2021-08-03 浙江科技学院 Salient object detection method based on stepped progressive neural network
CN113298154A (en) * 2021-05-27 2021-08-24 安徽大学 RGB-D image salient target detection method
CN113344844A (en) * 2021-04-14 2021-09-03 山东师范大学 Target fruit detection method and system based on RGB-D multimode image information
CN113408350A (en) * 2021-05-17 2021-09-17 杭州电子科技大学 Innovative edge feature extraction method-based remote sensing image significance detection method
CN113486899A (en) * 2021-05-26 2021-10-08 南开大学 Saliency target detection method based on complementary branch network
CN113536973A (en) * 2021-06-28 2021-10-22 杭州电子科技大学 Traffic sign detection method based on significance
CN113763447A (en) * 2021-08-24 2021-12-07 北京的卢深视科技有限公司 Method for completing depth map, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210539A (en) * 2019-05-22 2019-09-06 西安电子科技大学 The RGB-T saliency object detection method of multistage depth characteristic fusion
CN110909594A (en) * 2019-10-12 2020-03-24 杭州电子科技大学 Video significance detection method based on depth fusion
CN111242138A (en) * 2020-01-11 2020-06-05 杭州电子科技大学 RGBD significance detection method based on multi-scale feature fusion
CN111582316A (en) * 2020-04-10 2020-08-25 天津大学 RGB-D significance target detection method
CN111798436A (en) * 2020-07-07 2020-10-20 浙江科技学院 Salient object detection method based on attention expansion convolution feature fusion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210539A (en) * 2019-05-22 2019-09-06 西安电子科技大学 The RGB-T saliency object detection method of multistage depth characteristic fusion
CN110909594A (en) * 2019-10-12 2020-03-24 杭州电子科技大学 Video significance detection method based on depth fusion
CN111242138A (en) * 2020-01-11 2020-06-05 杭州电子科技大学 RGBD significance detection method based on multi-scale feature fusion
CN111582316A (en) * 2020-04-10 2020-08-25 天津大学 RGB-D significance target detection method
CN111798436A (en) * 2020-07-07 2020-10-20 浙江科技学院 Salient object detection method based on attention expansion convolution feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Liu Zhengyi et al.: "RGB-D image saliency detection based on multi-modal feature fusion supervision", Journal of Electronics & Information Technology (电子与信息学报) *
Zhang Xiaojuan et al.: "Remote sensing image segmentation with full residual connections and multi-scale feature fusion", Journal of Remote Sensing (遥感学报) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205481A (en) * 2021-03-19 2021-08-03 浙江科技学院 Salient object detection method based on stepped progressive neural network
CN113344844A (en) * 2021-04-14 2021-09-03 山东师范大学 Target fruit detection method and system based on RGB-D multimode image information
CN113408350A (en) * 2021-05-17 2021-09-17 杭州电子科技大学 Innovative edge feature extraction method-based remote sensing image significance detection method
CN113408350B (en) * 2021-05-17 2023-09-19 杭州电子科技大学 Remote sensing image significance detection method based on edge feature extraction
CN113486899A (en) * 2021-05-26 2021-10-08 南开大学 Saliency target detection method based on complementary branch network
CN113298154A (en) * 2021-05-27 2021-08-24 安徽大学 RGB-D image salient target detection method
CN113298154B (en) * 2021-05-27 2022-11-11 安徽大学 RGB-D image salient object detection method
CN113536973A (en) * 2021-06-28 2021-10-22 杭州电子科技大学 Traffic sign detection method based on significance
CN113536973B (en) * 2021-06-28 2023-08-18 杭州电子科技大学 Traffic sign detection method based on saliency
CN113763447A (en) * 2021-08-24 2021-12-07 北京的卢深视科技有限公司 Method for completing depth map, electronic device and storage medium

Also Published As

Publication number Publication date
CN112348870B (en) 2022-09-30

Similar Documents

Publication Publication Date Title
CN112348870B (en) Salient object detection method based on residual fusion
CN111242138B (en) RGBD significance detection method based on multi-scale feature fusion
CN111582316B (en) RGB-D significance target detection method
CN109271933B (en) Method for estimating three-dimensional human body posture based on video stream
CN111292264B (en) Image high dynamic range reconstruction method based on deep learning
CN110363716B (en) High-quality reconstruction method for generating confrontation network composite degraded image based on conditions
CN111931787A (en) RGBD significance detection method based on feature polymerization
CN109005398B (en) Stereo image parallax matching method based on convolutional neural network
CN112767418B (en) Mirror image segmentation method based on depth perception
CN112489164B (en) Image coloring method based on improved depth separable convolutional neural network
CN113192073A (en) Clothing semantic segmentation method based on cross fusion network
CN113076957A (en) RGB-D image saliency target detection method based on cross-modal feature fusion
CN110929735A (en) Rapid significance detection method based on multi-scale feature attention mechanism
CN114612832A (en) Real-time gesture detection method and device
CN111899203A (en) Real image generation method based on label graph under unsupervised training and storage medium
CN110852199A (en) Foreground extraction method based on double-frame coding and decoding model
CN114926734A (en) Solid waste detection device and method based on feature aggregation and attention fusion
CN113888505B (en) Natural scene text detection method based on semantic segmentation
CN115908793A (en) Coding and decoding structure semantic segmentation model based on position attention mechanism
CN114283315A (en) RGB-D significance target detection method based on interactive guidance attention and trapezoidal pyramid fusion
CN113066074A (en) Visual saliency prediction method based on binocular parallax offset fusion
Wang et al. A multi-scale attentive recurrent network for image dehazing
CN113222016B (en) Change detection method and device based on cross enhancement of high-level and low-level features
CN115471718A (en) Construction and detection method of lightweight significance target detection model based on multi-scale learning
CN113256528B (en) Low-illumination video enhancement method based on multi-scale cascade depth residual error network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant