CN111814895A - Significance target detection method based on absolute and relative depth induction network - Google Patents
Significance target detection method based on absolute and relative depth induction network
- Publication number
- CN111814895A CN111814895A CN202010695446.6A CN202010695446A CN111814895A CN 111814895 A CN111814895 A CN 111814895A CN 202010695446 A CN202010695446 A CN 202010695446A CN 111814895 A CN111814895 A CN 111814895A
- Authority
- CN
- China
- Prior art keywords
- depth
- absolute
- feature
- network
- relative depth
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a salient object detection method based on absolute and relative depth induction networks, which comprises the following steps: training a depth-induced network with a residual network as the backbone; cross-modal feature fusion by an absolute depth induction module to locate objects; and construction of a spatial geometric model by a relative depth induction module to supplement detail information. The method extracts not only RGB image features from the residual network but also depth information to assist the salient object detection task. The absolute depth induction module fuses the RGB image features and the depth image information in a coarse-to-fine manner, avoiding the cluttered noise interference caused by the asynchrony of the two feature spaces; the relative depth induction module builds a spatial graph convolution model to explore spatial structure and geometric information and enhance the local feature representation capability. The method thereby improves detection accuracy and robustness, achieves excellent detection results, and has broad application prospects.
Description
Technical Field
The invention belongs to the technical field of salient object detection, and particularly relates to a salient object detection method based on absolute and relative depth induction networks.
Background
Salient object detection, which aims at locating and segmenting the most visually distinctive objects in an image, is a fundamental operation in computer image processing. In recent years, it has been widely applied in fields such as image retargeting, scene classification, visual tracking, and semantic segmentation. Before subsequent image processing operations, a computer can use saliency detection to filter out irrelevant information, greatly reducing the processing workload and improving efficiency.
Early salient object detection methods mainly relied on hand-crafted features such as brightness, color, and texture to detect salient objects in images. In recent years, with the development of CNNs, various deep-learning-based models have been proposed. In 2017, Hou et al. proposed short connections between layers and used them to aggregate feature maps from multiple scales. Also in 2017, Zhang et al. explored multi-level features at each scale and generated saliency maps in a recursive manner. In 2019, Feng et al. proposed an attention feedback module to better explore the structure of salient objects. However, these recent methods still struggle in extremely complex situations such as semantically complex backgrounds, low-light environments, and transparent objects. To address this problem, we propose to supplement the RGB image with depth information, so that the spatial structure and 3D geometric information of the scene can be explored and the effectiveness and robustness of the network improved.
The features extracted by traditional RGB-D salient object detection methods lack global context information and semantic cues. In recent years, how to effectively integrate depth and RGB features has been a key issue for this task. In 2019, Zhao et al. designed a contrast loss to explore the contrast prior in depth images; an attention map is then generated by fusing the refined depth and RGB features, and the final saliency map is output through a fluid pyramid integration strategy that fully exploits multi-scale cross-modal features. Also in 2019, Piao et al. hierarchically integrated depth and RGB images and refined the final saliency map with a recursive attention model. However, current methods fuse the depth and RGB features even though the two feature spaces are asynchronous, which introduces cluttered noise into the network.
In summary, the existing salient object detection techniques have the following shortcomings. First, most existing methods extract features only from the RGB image, which is not sufficient to distinguish a salient object from a cluttered background region. Second, most existing methods extract depth and RGB features through separate networks and fuse them directly with various strategies; however, the two modal feature spaces are not consistent, and fusing them directly leads to noisy responses in the prediction. Third, although salient objects can be accurately located with the absolute depth induction module, the detailed saliency information of local regions is not deeply explored, which limits further improvement of model performance.
Disclosure of Invention
Technical problem to be solved
Aiming at the shortcomings of the prior art, the invention provides a salient object detection method based on absolute and relative depth induction networks, which solves the problems mentioned in the background above.
(II) technical scheme
To achieve the above purpose, the invention provides the following technical scheme: a salient object detection method based on absolute and relative depth induction networks, comprising the following steps:
a. training the depth-induced network with a residual network as the backbone: remove the final pooling layer and fully-connected layer of ResNet-50, uniformly resize the network input images to 256 × 256 and normalize the data set, generate corresponding side-output maps from the feature maps produced by the five convolution blocks in a pyramid manner, and then perform top-down fusion in the network;
b. cross-modal feature fusion by the absolute depth induction module to locate objects: feed the depth image of the input into a group of convolutions to obtain a depth feature map with the same size as the Res2_x feature map, and apply the absolute depth induction module multiple times to recursively integrate the depth feature map with the RGB feature maps, realizing cross-modal feature fusion, avoiding the noise interference caused by naively fusing two asynchronous modal features, strengthening the interaction between depth and color features, and adaptively fusing the RGB and depth features at each scale;
c. construction of a spatial geometric model by the relative depth induction module to supplement detail information: first, the feature map from the final stage Res5_x of the decoding network is upsampled and integrated with the cross-modally fused feature maps obtained by the absolute depth induction module to generate new feature maps; the new feature maps and the depth map are then jointly input into the relative depth induction module to explore the spatial structure and detailed saliency information of the image, embedding the relative depth information in the network to improve the performance of the saliency model.
Further, in step a, bilinear interpolation is applied to the data set so that the network input images have the same size.
Further, when the side-output maps are generated in step a, the output feature maps of the four residual blocks are fed into a 1 × 1 convolution layer to reduce the channel dimension of the feature maps, yielding the side-output maps used for the subsequent top-down integration of multi-level feature maps.
Further, for the recursive integration of the depth feature map with the RGB feature maps described in step b, the absolute depth induction module is implemented by a Gated Recurrent Unit (GRU), which is designed for sequence problems: the multi-scale feature integration process is formulated as a sequence problem and each scale is treated as a time step.
Furthermore, at each time step, the dimension of the depth feature map is reduced, the depth and RGB feature maps are then concatenated and transformed by global max pooling to generate a new feature vector, and the RGB and depth feature maps are adaptively fused at each scale through a fully-connected layer and related operations.
Further, in step c, the spatial structure and detailed saliency information of the image are explored with the relative depth induction module, which mines relative depth information using a Graph Convolution Network (GCN).
Further, the proposed Graph Convolution Network (GCN) projects image pixels into a 3D space according to their spatial positions and depth values, overcoming the problem that pixels adjacent in 2D space are not necessarily strongly associated in the 3D point-cloud space; it propagates information within local regions according to short-range relative depth relations and successively enhances the local feature representation capability by exploring spatial structure and geometric information at multiple scales.
(III) advantageous effects
Compared with the prior art, the salient object detection method based on absolute and relative depth induction networks provided by the invention has the following beneficial effects:
the method not only extracts RGB image features from a residual error network, but also provides the method for assisting a saliency target detection task by using depth information, most of the existing RGB-D models only simply extract depth and RGB features and fusion the depth and RGB features heuristically, and an absolute depth induction module is used for cross-modal fusion of the RGB image features and the depth image information in a mode from coarse to fine, so that the disordered noise interference caused by the asynchronous characteristics of two spaces is avoided, and an object is accurately positioned; the relative depth induction module is utilized to establish a spatial graph convolution model to explore spatial structure and geometric information so as to enhance the local feature representation capability, thereby improving the detection accuracy and robustness, achieving excellent detection effect, being conductive to fusion with other fields and having wide application prospect.
Drawings
Fig. 1 is a flowchart of the salient object detection method based on absolute and relative depth induction networks proposed by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present invention provides the following technical solution: a salient object detection method based on absolute and relative depth induction networks, comprising the following steps.
deep induced network training with a residual error network as a backbone network: and removing the last pooling layer and the full-link layer of Resnet-50, uniformly adjusting the network input image to 256 multiplied by 256, carrying out normalization processing on the data set, and generating a corresponding side output image by the feature map generated by the five convolution blocks in a pyramid mode. Then the fusion operation is carried out from top to bottom in the network.
Training the depth-induced network with a residual network as the backbone: the final pooling layer and fully-connected layer of ResNet-50 are removed, and the backbone network comprises five convolution blocks, Conv1, Res2_x, ..., Res5_x. Given an input RGB image of size 3 × W × H, the convolution blocks generate feature maps f_l^rgb (l = 1, ..., 5) of decreasing spatial resolution. The shallower layers capture low-level information of the image, such as texture and spatial detail, while the deeper feature maps contain high-level semantic information. The feature maps are merged in a pyramidal fashion: a 1 × 1 convolution kernel reduces the channels of f_l^rgb to C, producing the side-output map f_l^side, and the multi-level feature maps are then integrated in a top-down manner,

f_l^dec = σ(W_l · CAT[f_l^side, UP(f_{l+1}^dec)] + b_l)   (1)

where σ is the ReLU activation function, CAT[·] is the concatenation of two feature maps along the channel dimension, UP is an upsampling operation with bilinear interpolation, and W_l, b_l are trainable parameters in the network.
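For concreteness, the following PyTorch-style sketch (an assumption on our part; the patent publishes no reference code, and module and parameter names are illustrative) shows one way the side-output maps and the top-down integration of equation (1) could be realized.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class PyramidDecoder(nn.Module):
    """Backbone side outputs and the top-down integration of equation (1)."""

    def __init__(self, side_channels=64):
        super().__init__()
        backbone = torchvision.models.resnet50()
        # Keep Conv1 and Res2_x..Res5_x; the final pooling and FC layers are dropped.
        self.stem = nn.Sequential(backbone.conv1, backbone.bn1,
                                  backbone.relu, backbone.maxpool)
        self.blocks = nn.ModuleList([backbone.layer1, backbone.layer2,
                                     backbone.layer3, backbone.layer4])
        # 1x1 convolutions reduce each block's channels to C, giving f_l^side.
        self.side = nn.ModuleList([nn.Conv2d(c, side_channels, 1)
                                   for c in (256, 512, 1024, 2048)])
        # W_l, b_l of equation (1): one 3x3 convolution per fusion step.
        self.fuse = nn.ModuleList([nn.Conv2d(2 * side_channels, side_channels, 3, padding=1)
                                   for _ in range(3)])

    def forward(self, rgb):
        sides, x = [], self.stem(rgb)
        for block, side in zip(self.blocks, self.side):
            x = block(x)
            sides.append(side(x))
        dec = sides[-1]                            # start from the deepest side output
        for skip, fuse in zip(reversed(sides[:-1]), self.fuse):
            up = F.interpolate(dec, size=skip.shape[-2:],
                               mode="bilinear", align_corners=False)   # UP(.)
            dec = F.relu(fuse(torch.cat([skip, up], dim=1)))           # sigma(W_l * CAT[.] + b_l)
        return sides, dec

# Usage: images are resized to 256 x 256 and normalized before entering the network.
# sides, decoded = PyramidDecoder()(torch.randn(1, 3, 256, 256))
```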
Cross-modal feature fusion by the absolute depth induction module to locate objects: first, the input depth image D of size W × H is fed into a set of convolution layers, generating a depth feature map f_d with the same size as the Res2_x feature map. The Absolute Depth Induction Module (ADIM) is then applied multiple times to recursively integrate the depth feature map with the RGB feature maps f_l^rgb, enhancing the interaction between depth and color features,

(f_d^l, f_l^adim) = ADIM(f_d^{l-1}, f_l^rgb)   (2)

where f_d^l is the updated depth feature and f_l^adim is the aggregated result of depth and RGB information in the l-th layer.
According to the above embodiment, ADIM is preferably implemented by a Gated Recurrent Unit (GRU), which is designed for sequence problems: the multi-scale feature integration process is formulated as a sequence problem and each scale is treated as a time step. At each time step, the RGB feature f_l^rgb is treated as the input to the GRU, and the depth feature f_d^{l-1} is treated as the hidden state of the previous step. The two feature maps are concatenated and transformed by a global max pooling (GMP) operation to generate a feature vector. A fully-connected layer is then applied to the feature vector to generate a reset gate r and an update gate z, whose values are normalized by a sigmoid function. In practice, the gate r controls the integration of the depth and RGB features, and the gate z controls the update of the depth feature. In this way, RGB and depth features can be adaptively fused at each scale, and the interaction between depth and RGB features is enhanced by the network. The generated multi-scale feature map f_l^adim is then combined with the feature map in the decoding stage, i.e., equation (1) is re-expressed as

f_l^dec = σ(W_l · CAT[f_l^side, f_l^adim, UP(f_{l+1}^dec)] + b_l)   (3)
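A minimal sketch of such a GRU-style module is given below, under the assumption that the gates act channel-wise on the feature maps and are produced from the globally pooled feature vector; the text above does not give the exact gating equations, so this is illustrative rather than the patented formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ADIM(nn.Module):
    """GRU-style absolute depth induction module (illustrative sketch)."""

    def __init__(self, channels):
        super().__init__()
        self.gates = nn.Linear(2 * channels, 2 * channels)        # reset gate r and update gate z
        self.candidate = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, f_rgb, f_depth):
        # f_rgb is the GRU input at this scale; f_depth is the hidden state from
        # the previous scale (each scale is one time step), resized to match.
        f_depth = F.interpolate(f_depth, size=f_rgb.shape[-2:],
                                mode="bilinear", align_corners=False)
        cat = torch.cat([f_rgb, f_depth], dim=1)
        vec = F.adaptive_max_pool2d(cat, 1).flatten(1)            # global max pooling -> feature vector
        r, z = torch.sigmoid(self.gates(vec)).chunk(2, dim=1)     # sigmoid-normalized gates
        r, z = r[..., None, None], z[..., None, None]
        # r controls how strongly the depth features are mixed into the RGB features;
        # z controls how much the depth/hidden state is updated, as in a standard GRU.
        mixed = torch.tanh(self.candidate(torch.cat([f_rgb, r * f_depth], dim=1)))
        f_new = (1 - z) * f_depth + z * mixed
        # f_new plays the role of both the updated depth feature f_d^l and the
        # aggregated cross-modal feature f_l^adim used in equation (3).
        return f_new
```

Applied once per scale, the output of one call becomes the hidden state passed to the next call, which is the recursion of equation (2).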
the relative depth induction module establishes a space geometric model to supplement detail information: the Relative Depth Induction Module (RDIM) is used in the decoding phase, first the feature map from the last phase of the decoding networkPerforming upsampling and feature mappingIntegrated together, the generated feature map is represented as described in equation (3)RDIM is then applied to the feature graphAnd depth image to embed relative depth information in the network
According to the above embodiment, RDIM is preferably implemented by a Graph Convolution Network (GCN). To explore the relative depth relationship between pixels, a graph G = (V, E) is first built on the feature map generated by ADIM, where V is the node set and E is the edge set. Each node n_i in the graph is treated as a point in a 3D coordinate system with coordinates (x_i, y_i, d_i), where (x_i, y_i) is the spatial location in the feature map and d_i is the corresponding depth value. The node set is V = {n_1, n_2, ..., n_k}, where k is the number of nodes. For each node, edges e_{i,j} ∈ E are defined between its 3D coordinates and its m nearest neighbors, and the weight w_{i,j} of edge e_{i,j} is computed as the relative depth value measuring the spatial correlation between nodes n_i and n_j,

w_{i,j} = |(x_i, y_i, d_i) − (x_j, y_j, d_j)|   (5)

To describe the semantic relationship between nodes n_i and n_j, an attribute feature a_{i,j} is defined for edge e_{i,j}. To further take the global context information of the image into account, global average pooling (GAP) is applied to the feature map to extract high-level semantic information and output a global feature vector f_g.
The spatial GCN consists of a set of stacked graph convolution layers (GCLs). For each GCL, the attribute feature a_{i,j} of edge e_{i,j} is first updated from the node features at positions (x_i, y_i) and (x_j, y_j) of the feature map, and each node is then updated with an MLP, where N(n_i) is the set of neighboring pixels of node n_i and w_{i,j} serves as the weight of edge e_{i,j}, so that RDIM focuses more on regions with larger relative distances; messages are propagated through the edges of neighboring nodes. The updated features of all nodes are then fed to a global max pooling layer to obtain an updated global feature vector f_g. Finally, the last GCL outputs the feature map f_l^rdim, which is the overall output of RDIM at scale l. By using the GCN to pass messages between nodes, the feature of each node is updated and refined according to its relationship with all its neighboring nodes. In our network, RDIM is applied at levels 3 and 4 of the decoding stage, and the generated RDIM feature maps are input to the next decoding stage.
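The sketch below illustrates one way a single graph convolution layer of this kind could be realized. The neighborhood size m, the distance-weighted aggregation, and the omission of the edge attribute a_{i,j} and of the global-vector update are simplifying assumptions made for brevity; the code favors clarity over efficiency (it builds a dense pairwise distance matrix) and is not the patented formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RDIM(nn.Module):
    """Relative depth induction via a single graph convolution layer (sketch)."""

    def __init__(self, channels, m=8):
        super().__init__()
        self.m = m
        self.node_mlp = nn.Sequential(
            nn.Linear(3 * channels, channels), nn.ReLU(),
            nn.Linear(channels, channels))

    def forward(self, feat, depth):
        b, c, h, w = feat.shape
        depth = F.interpolate(depth, size=(h, w), mode="bilinear", align_corners=False)
        # Node coordinates (x, y, d) for every pixel of the feature map.
        ys, xs = torch.meshgrid(torch.arange(h, device=feat.device, dtype=feat.dtype),
                                torch.arange(w, device=feat.device, dtype=feat.dtype),
                                indexing="ij")
        coords = torch.stack([xs.flatten(), ys.flatten()], dim=-1)               # (N, 2)
        coords = coords.unsqueeze(0).expand(b, -1, -1)
        coords = torch.cat([coords, depth.flatten(2).transpose(1, 2)], dim=-1)   # (B, N, 3)
        nodes = feat.flatten(2).transpose(1, 2)                                  # (B, N, C)
        # Relative-depth edge weights w_ij (equation (5)), restricted to the m
        # nearest neighbors of each node in the 3D coordinate space.
        dist = torch.cdist(coords, coords)                                       # (B, N, N)
        w_ij, idx = dist.topk(self.m + 1, largest=False)                         # includes the node itself
        w_ij, idx = w_ij[..., 1:], idx[..., 1:]
        neigh = torch.gather(nodes.unsqueeze(1).expand(-1, nodes.shape[1], -1, -1),
                             2, idx.unsqueeze(-1).expand(-1, -1, -1, c))         # (B, N, m, C)
        # Larger relative distance -> larger weight, so the module attends more
        # to regions with larger relative distances, as described above.
        agg = (w_ij.unsqueeze(-1) * neigh).sum(dim=2) / (w_ij.sum(dim=2, keepdim=True) + 1e-6)
        f_g = F.adaptive_avg_pool2d(feat, 1).flatten(1)                          # global context vector f_g
        f_g = f_g.unsqueeze(1).expand_as(nodes)
        updated = self.node_mlp(torch.cat([nodes, agg, f_g], dim=-1))            # MLP node update
        return updated.transpose(1, 2).reshape(b, c, h, w) + feat               # residual refinement
```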
The feature map generated in the last decoding stage is selected to predict the final saliency map, because it combines absolute and relative depth information. It is first upsampled with a bilinear interpolation operation so that its size matches the input, and is finally fed into a single-channel convolution layer to obtain the final saliency map S. During training, the final saliency map is supervised by the ground-truth map G through a cross-entropy loss function,

L = − Σ_{i,j} [ G_{i,j} log S_{i,j} + (1 − G_{i,j}) log(1 − S_{i,j}) ]

where G_{i,j} and S_{i,j} are the saliency values at position (i, j) of the ground-truth map and the saliency map, respectively.
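A compact sketch of this prediction head and its cross-entropy supervision is shown below; the function name, shapes, and the 1 × 1 convolution are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def saliency_loss(dec_feat, gt, predict_conv):
    """Final prediction and training loss as described above (a sketch).

    dec_feat:     feature map from the last decoding stage
    gt:           ground-truth saliency map, shape (B, 1, H, W), values in [0, 1]
    predict_conv: a single-channel convolution, e.g. nn.Conv2d(C, 1, 1)
    """
    logits = predict_conv(dec_feat)
    logits = F.interpolate(logits, size=gt.shape[-2:],
                           mode="bilinear", align_corners=False)   # upsample to input size
    s = torch.sigmoid(logits)                                      # final saliency map S
    loss = F.binary_cross_entropy(s, gt)                           # pixel-wise cross entropy
    return s, loss

# Usage (shapes assumed for illustration):
# conv = nn.Conv2d(64, 1, 1)
# s, loss = saliency_loss(torch.randn(2, 64, 64, 64), torch.rand(2, 1, 256, 256), conv)
# loss.backward()
```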
The invention not only extracts RGB image features from a residual network but also uses depth information to assist the salient object detection task. Most existing RGB-D models simply extract depth and RGB features and fuse them heuristically; here, an absolute depth induction module is designed to fuse the RGB image features and the depth image information in a coarse-to-fine manner, avoiding the cluttered noise interference caused by the asynchrony of the two feature spaces and accurately locating objects. Meanwhile, a relative depth induction module is designed to build a spatial graph convolution model that explores spatial structure and geometric information and enhances the local feature representation capability, thereby improving detection accuracy and robustness and achieving excellent detection results; the method is also conducive to integration with other fields and has broad application prospects.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
In the description of the present invention, it should be noted that the terms "upper", "lower", "inner", "outer", "front", "rear", "both ends", "one end", "the other end", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (7)
1. A salient object detection method based on absolute and relative depth induction networks is characterized by comprising the following steps:
a. training the depth-induced network with a residual network as the backbone: remove the final pooling layer and fully-connected layer of ResNet-50, uniformly resize the network input images to 256 × 256 and normalize the data set, generate corresponding side-output maps from the feature maps produced by the five convolution blocks in a pyramid manner, and then perform top-down fusion in the network;
b. cross-modal feature fusion by the absolute depth induction module to locate objects: feed the depth image of the input into a group of convolutions to obtain a depth feature map with the same size as the Res2_x feature map, and apply the absolute depth induction module multiple times to recursively integrate the depth feature map with the RGB feature maps, realizing cross-modal feature fusion, avoiding the noise interference caused by naively fusing two asynchronous modal features, strengthening the interaction between depth and color features, and adaptively fusing the RGB and depth features at each scale;
c. construction of a spatial geometric model by the relative depth induction module to supplement detail information: first, the feature maps from the final stage Res5_x of the decoding network are upsampled and integrated with the cross-modally fused feature maps obtained by the absolute depth induction module to generate new feature maps; the new feature maps and the depth map generated by the absolute depth induction module are then jointly input into the relative depth induction module to explore the spatial structure and detailed saliency information of the image, embedding the relative depth information in the network to improve the performance of the saliency model.
2. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: in step a, bilinear interpolation is applied to the data set so that the network input images have the same size.
3. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: when the side-output maps are generated in step a, the output feature maps of the four residual blocks are fed into a 1 × 1 convolution layer to reduce the channel dimension of the feature maps, yielding the side-output maps used for the subsequent top-down integration of multi-level feature maps.
4. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: for the recursive integration of the depth feature map with the RGB feature maps described in step b, the absolute depth induction module is implemented by a Gated Recurrent Unit (GRU), which is designed for sequence problems: the multi-scale feature integration process is formulated as a sequence problem and each scale is treated as a time step.
5. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: at each time step, the dimension of the depth feature map is reduced, the depth and RGB feature maps are then concatenated and transformed by global max pooling to generate a new feature vector, and the RGB and depth feature maps are adaptively fused at each scale through a fully-connected layer and related operations.
6. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: in step c, the spatial structure and detailed saliency information of the image are explored with the relative depth induction module, which mines relative depth information using a Graph Convolution Network (GCN).
7. The salient object detection method based on absolute and relative depth induction networks according to claim 1, wherein: the proposed Graph Convolution Network (GCN) projects image pixels into a 3D space according to their spatial positions and depth values, overcoming the problem that pixels adjacent in 2D space are not necessarily strongly associated in the 3D point-cloud space; it propagates information within local regions according to short-range relative depth relations and successively enhances the local feature representation capability by exploring spatial structure and geometric information at multiple scales.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010695446.6A CN111814895A (en) | 2020-07-17 | 2020-07-17 | Significance target detection method based on absolute and relative depth induction network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010695446.6A CN111814895A (en) | 2020-07-17 | 2020-07-17 | Significance target detection method based on absolute and relative depth induction network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111814895A true CN111814895A (en) | 2020-10-23 |
Family
ID=72866457
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010695446.6A Pending CN111814895A (en) | 2020-07-17 | 2020-07-17 | Significance target detection method based on absolute and relative depth induction network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111814895A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076947A (en) * | 2021-03-26 | 2021-07-06 | 东北大学 | RGB-T image significance detection system with cross-guide fusion |
CN113537279A (en) * | 2021-05-18 | 2021-10-22 | 齐鲁工业大学 | COVID-19 identification system based on similar residual convolution and LSTM |
CN113963081A (en) * | 2021-10-11 | 2022-01-21 | 华东师范大学 | Intelligent image chart synthesis method based on graph convolution network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351941A1 (en) * | 2016-06-03 | 2017-12-07 | Miovision Technologies Incorporated | System and Method for Performing Saliency Detection Using Deep Active Contours |
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
CN110210539A (en) * | 2019-05-22 | 2019-09-06 | 西安电子科技大学 | The RGB-T saliency object detection method of multistage depth characteristic fusion |
CN110399907A (en) * | 2019-07-03 | 2019-11-01 | 杭州深睿博联科技有限公司 | Thoracic cavity illness detection method and device, storage medium based on induction attention |
AU2020100274A4 (en) * | 2020-02-25 | 2020-03-26 | Huang, Shuying DR | A Multi-Scale Feature Fusion Network based on GANs for Haze Removal |
CN111242138A (en) * | 2020-01-11 | 2020-06-05 | 杭州电子科技大学 | RGBD significance detection method based on multi-scale feature fusion |
CN111242238A (en) * | 2020-01-21 | 2020-06-05 | 北京交通大学 | Method for acquiring RGB-D image saliency target |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351941A1 (en) * | 2016-06-03 | 2017-12-07 | Miovision Technologies Incorporated | System and Method for Performing Saliency Detection Using Deep Active Contours |
WO2019136946A1 (en) * | 2018-01-15 | 2019-07-18 | 中山大学 | Deep learning-based weakly supervised salient object detection method and system |
CN110210539A (en) * | 2019-05-22 | 2019-09-06 | 西安电子科技大学 | The RGB-T saliency object detection method of multistage depth characteristic fusion |
CN110399907A (en) * | 2019-07-03 | 2019-11-01 | 杭州深睿博联科技有限公司 | Thoracic cavity illness detection method and device, storage medium based on induction attention |
CN111242138A (en) * | 2020-01-11 | 2020-06-05 | 杭州电子科技大学 | RGBD significance detection method based on multi-scale feature fusion |
CN111242238A (en) * | 2020-01-21 | 2020-06-05 | 北京交通大学 | Method for acquiring RGB-D image saliency target |
AU2020100274A4 (en) * | 2020-02-25 | 2020-03-26 | Huang, Shuying DR | A Multi-Scale Feature Fusion Network based on GANs for Haze Removal |
Non-Patent Citations (2)
Title |
---|
- Liu Zhengyi; Duan Quntao; Shi Song; Zhao Peng: "RGB-D image saliency detection based on multi-modal feature fusion supervision", Journal of Electronics & Information Technology, no. 04, 15 April 2020 (2020-04-15), pages 206-213 *
- Chen Kai; Wang Yongxiong: "Saliency detection combining spatial attention and multi-level feature fusion", Journal of Image and Graphics, no. 06, 16 June 2020 (2020-06-16), pages 66-77 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113076947A (en) * | 2021-03-26 | 2021-07-06 | 东北大学 | RGB-T image significance detection system with cross-guide fusion |
CN113076947B (en) * | 2021-03-26 | 2023-09-01 | 东北大学 | Cross-guided fusion RGB-T image saliency detection system |
CN113537279A (en) * | 2021-05-18 | 2021-10-22 | 齐鲁工业大学 | COVID-19 identification system based on similar residual convolution and LSTM |
CN113963081A (en) * | 2021-10-11 | 2022-01-21 | 华东师范大学 | Intelligent image chart synthesis method based on graph convolution network |
CN113963081B (en) * | 2021-10-11 | 2024-05-17 | 华东师范大学 | Image chart intelligent synthesis method based on graph convolution network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cong et al. | Global-and-local collaborative learning for co-salient object detection | |
TWI821671B (en) | A method and device for positioning text areas | |
Leng et al. | Robust obstacle detection and recognition for driver assistance systems | |
Wang et al. | Background-driven salient object detection | |
CN111814895A (en) | Significance target detection method based on absolute and relative depth induction network | |
CN107392930B (en) | Quantum Canny edge detection method | |
Saber et al. | Partial shape recognition by sub-matrix matching for partial matching guided image labeling | |
Shen et al. | ICAFusion: Iterative cross-attention guided feature fusion for multispectral object detection | |
Wu et al. | Handmap: Robust hand pose estimation via intermediate dense guidance map supervision | |
CN111480169A (en) | Method, system and apparatus for pattern recognition | |
EP3992908A1 (en) | Two-stage depth estimation machine learning algorithm and spherical warping layer for equi-rectangular projection stereo matching | |
Wang et al. | Towards Weakly Supervised Semantic Segmentation in 3D Graph-Structured Point Clouds of Wild Scenes. | |
Chen et al. | Image splicing localization using residual image and residual-based fully convolutional network | |
Hu et al. | RGB-D image multi-target detection method based on 3D DSF R-CNN | |
Zhang et al. | R2Net: Residual refinement network for salient object detection | |
Yao et al. | As‐global‐as‐possible stereo matching with adaptive smoothness prior | |
CN114445479A (en) | Equal-rectangular projection stereo matching two-stage depth estimation machine learning algorithm and spherical distortion layer | |
Tang et al. | SDRNet: An end-to-end shadow detection and removal network | |
CN116385660A (en) | Indoor single view scene semantic reconstruction method and system | |
Zhou et al. | FANet: Feature aggregation network for RGBD saliency detection | |
Long et al. | Adaptive surface normal constraint for geometric estimation from monocular images | |
CN111914809B (en) | Target object positioning method, image processing method, device and computer equipment | |
Xie et al. | 3D surface segmentation from point clouds via quadric fits based on DBSCAN clustering | |
CN117315724A (en) | Open scene-oriented three-dimensional pedestrian detection method, system, equipment and medium | |
Haryono et al. | Oriented object detection in satellite images using convolutional neural network based on ResNeXt |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |