CN113128519A - Multi-modal multi-stitched RGB-D saliency object detection method - Google Patents

Multi-modal multi-stitched RGB-D saliency object detection method

Info

Publication number
CN113128519A
CN113128519A (application CN202110461176.7A; granted publication CN113128519B)
Authority
CN
China
Prior art keywords
image
depth
rgb
region
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110461176.7A
Other languages
Chinese (zh)
Other versions
CN113128519B (en)
Inventor
陈莉
赵志华
Current Assignee
Northwestern University
Original Assignee
Northwestern University
Priority date
Filing date
Publication date
Application filed by Northwestern University
Priority to CN202110461176.7A
Publication of CN113128519A
Application granted
Publication of CN113128519B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-modal multi-stitched RGB-D saliency object detection method comprising the following steps. S1: divide the image into non-overlapping sub-regions and, for each image sub-region, extract RGB image color information, depth information from the Depth image, and symmetric invariant LBP features, then form a region histogram from the symmetric invariant LBP features. S2: measure the correlation among the RGB image color information, the depth information of the Depth image, and the region histogram using class-conditional mutual information entropy, and fuse the three at the score level with an adaptive score-fusion algorithm to obtain a final score for each image sub-region. S3: detect the salient objects of the image from the final score of each image sub-region. The method detects salient objects quickly and efficiently while maintaining high detection accuracy.

Description

Multi-modal multi-stitched RGB-D saliency object detection method
Technical Field
The invention relates to the field of image detection, and in particular to a multi-modal multi-stitched RGB-D saliency object detection method.
Background
Salient object detection is an important component of computer vision, and as the field continues to develop, detection methods that are both more efficient and more accurate are urgently needed.
During the development of saliency detection, various methods have emerged that exploit, for example, the color features, position information, and texture features of images. Some conventional methods use center priors, edge priors, semantic priors, and so on. However, these models often fail when the color scene in the image is very complex, when there is no significant contrast between the object and the background, or when the lighting makes similar-looking objects difficult to distinguish by these features.
Disclosure of Invention
To solve these problems, the invention provides a multi-modal multi-stitched RGB-D saliency object detection method that detects salient objects quickly and efficiently while maintaining detection accuracy.
At the same time, the complementarity of the visible-light camera and the near-infrared camera is exploited: features are simultaneously extracted from the visible-light face images with a deep learning algorithm, and a fusion algorithm then performs hierarchical fusion of the features extracted by the deep learning models, so that the strengths of the modalities complement one another.
To achieve this purpose, the invention adopts the following technical scheme.
A multi-modal multi-stitched RGB-D saliency object detection method comprises the following steps:
S1, dividing the image into non-overlapping sub-regions and, for each image sub-region, extracting RGB image color information, depth information from the Depth image, and symmetric invariant LBP features, then forming a region histogram from the symmetric invariant LBP features;
S2, measuring the correlation among the RGB image color information, the depth information of the Depth image, and the region histogram with class-conditional mutual information entropy, and fusing the three at the score level with an adaptive score-fusion algorithm to obtain a final score for each image sub-region;
S3, detecting the salient objects of the image from the final score of each image sub-region.
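The patent names "symmetric invariant LBP" features for step S1 without fixing the exact variant. As an illustration only, a center-symmetric LBP (one common symmetric choice, here with 4 opposing neighbour pairs and a 16-bin histogram) shows how a region histogram can be formed; the function name and bin layout are assumptions, not the patent's specification.

```python
import numpy as np

def cs_lbp_histogram(gray, threshold=0.0):
    """Center-symmetric LBP histogram for one grayscale sub-region.

    Hypothetical stand-in for the patent's "symmetric invariant LBP":
    each interior pixel compares 4 center-symmetric neighbour pairs,
    yielding a 4-bit code (0-15) and thus a 16-bin histogram.
    """
    g = np.asarray(gray, dtype=np.float64)
    c = g[1:-1, 1:-1]  # interior pixels only
    # four center-symmetric neighbour pairs around each interior pixel
    pairs = [
        (g[:-2, 1:-1], g[2:, 1:-1]),   # N  vs S
        (g[:-2, 2:],   g[2:, :-2]),    # NE vs SW
        (g[1:-1, 2:],  g[1:-1, :-2]),  # E  vs W
        (g[2:, 2:],    g[:-2, :-2]),   # SE vs NW
    ]
    code = np.zeros_like(c, dtype=np.int64)
    for bit, (a, b) in enumerate(pairs):
        code |= ((a - b) > threshold).astype(np.int64) << bit
    hist = np.bincount(code.ravel(), minlength=16).astype(np.float64)
    return hist / hist.sum()  # normalised 16-bin region histogram
```

Each sub-region's grayscale patch would be passed independently, and the normalised histogram becomes that region's texture descriptor alongside the color and depth information.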
Further, in step S1, the extraction of the RGB image color information, the depth information of the Depth image, and the symmetric invariant LBP features of each sub-region is implemented with a deep convolutional network.
Further, in step S1, the targets carried in the image are first detected with the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection results.
Further, each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region.
Further, the Dssd_Xception_coco model uses the DSSD object detection algorithm: an Xception neural network is pre-trained on the COCO dataset, the model is then trained on a previously prepared dataset, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target carried in the image is finally obtained.
Further, in step S3, saliency object detection is performed from the final score of each image sub-region with the ResNet50 model.
Further, the method also includes recognizing the human-eye viewing angle of the image, where different viewing angles correspond to different image deflection-angle adjustment models, and the image deflection angle is adjusted with the corresponding model.
The invention has the following beneficial effects:
salient objects are detected quickly and efficiently, with high detection accuracy.
Drawings
FIG. 1 is a flowchart of the multi-modal multi-stitched RGB-D saliency object detection method according to embodiment 1 of the present invention.
FIG. 2 is a flowchart of the multi-modal multi-stitched RGB-D saliency object detection method according to embodiment 2 of the present invention.
Detailed Description
In order that the objects and advantages of the invention will be more clearly understood, the invention is further described in detail below with reference to examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
A multi-modal multi-stitched RGB-D saliency object detection method comprises the following steps:
S1, dividing the image into non-overlapping sub-regions and, for each image sub-region, extracting RGB image color information, depth information from the Depth image, and symmetric invariant LBP features, then forming a region histogram from the symmetric invariant LBP features;
S2, measuring the correlation among the RGB image color information, the depth information of the Depth image, and the region histogram with class-conditional mutual information entropy, and fusing the three at the score level with an adaptive score-fusion algorithm to obtain a final score for each image sub-region;
S3, detecting the salient objects of the image from the final score of each image sub-region with the ResNet50 model.
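The adaptive score-level fusion of step S2 is not given in closed form in the text. One plausible sketch weights each modality's sub-region scores by a scalar relevance (for instance, a class-conditional mutual-information estimate between that modality and saliency labels) and sums the min-max-normalised scores; all names and the exact weighting rule here are illustrative assumptions.

```python
import numpy as np

def adaptive_score_fusion(modality_scores, relevances):
    """Weighted score-level fusion across modalities.

    modality_scores: dict name -> per-sub-region scores (one value per region)
    relevances:      dict name -> scalar relevance, e.g. a class-conditional
                     mutual-information estimate for that modality.
    Hypothetical sketch; the patent does not spell out the fusion formula.
    """
    names = sorted(modality_scores)
    w = np.array([max(relevances[n], 0.0) for n in names])
    # normalise weights; fall back to uniform if all relevances are zero
    w = w / w.sum() if w.sum() > 0 else np.full(len(names), 1.0 / len(names))
    fused = 0.0
    for weight, n in zip(w, names):
        s = np.asarray(modality_scores[n], dtype=np.float64)
        rng = s.max() - s.min()
        # min-max normalise each modality so scores are comparable
        s = (s - s.min()) / rng if rng > 0 else np.zeros_like(s)
        fused = fused + weight * s
    return fused  # final score per image sub-region
```

A modality whose scores correlate more strongly with saliency thus contributes more to each sub-region's final score.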
In this embodiment, in step S1, the extraction of the RGB image color information, the depth information of the Depth image, and the symmetric invariant LBP features of each sub-region is implemented with a deep convolutional network.
In this embodiment, in step S1, the targets carried in the image are first detected with the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection results. Each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model uses the DSSD object detection algorithm: an Xception neural network is pre-trained on the COCO dataset, the model is then trained on a previously prepared dataset, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target carried in the image (such as people, trees, and furniture) is finally obtained.
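The division rule above (one sub-region per detected target, one for the remaining background) can be sketched as a label map built from detector boxes. The box format and the overwrite rule for overlapping boxes are assumptions; the text does not specify how overlaps are resolved.

```python
import numpy as np

def boxes_to_subregions(h, w, boxes):
    """Assign one sub-region label per detected target; label 0 = background.

    boxes: list of (x1, y1, x2, y2) pixel boxes, assumed to come from a
    detector such as the Dssd_Xception_coco model named in the text.
    Later boxes overwrite earlier ones where they overlap (an assumption).
    """
    labels = np.zeros((h, w), dtype=np.int32)
    for i, (x1, y1, x2, y2) in enumerate(boxes, start=1):
        labels[y1:y2, x1:x2] = i  # target i gets its own sub-region
    return labels
```

Every pixel left at label 0 forms the single background sub-region, matching the "remaining background" clause.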
Example 2
A multi-modal multi-stitched RGB-D saliency object detection method comprises the following steps:
S1, recognizing the human-eye viewing angle of the image, where different viewing angles correspond to different image deflection-angle adjustment models, and adjusting the image deflection angle with the corresponding model;
S2, dividing the image into non-overlapping sub-regions and, for each image sub-region, extracting RGB image color information, depth information from the Depth image, and symmetric invariant LBP features, then forming a region histogram from the symmetric invariant LBP features;
S3, measuring the correlation among the RGB image color information, the depth information of the Depth image, and the region histogram with class-conditional mutual information entropy, and fusing the three at the score level with an adaptive score-fusion algorithm to obtain a final score for each image sub-region;
S4, detecting the salient objects of the image from the final score of each image sub-region with the ResNet50 model.
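The final step turns per-sub-region scores into a detection output; one simple way to visualise it is to paint each sub-region's final score into a pixel-wise saliency map and normalise to [0, 1]. This painting step is an illustrative assumption, not the patent's stated ResNet50 procedure.

```python
import numpy as np

def scores_to_saliency_map(labels, scores):
    """Paint each sub-region's final score into a per-pixel saliency map.

    labels: int array, one sub-region id per pixel (0 = background)
    scores: dict region id -> final fused score for that sub-region
    Hypothetical visualisation; the patent feeds the scores to ResNet50.
    """
    sal = np.zeros(labels.shape, dtype=np.float64)
    for region_id, s in scores.items():
        sal[labels == region_id] = s
    mn, mx = sal.min(), sal.max()
    # normalise to [0, 1] unless the map is constant
    return (sal - mn) / (mx - mn) if mx > mn else sal
```

Thresholding such a map at a chosen value would yield a binary salient-object mask.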
In this embodiment, in step S2, the extraction of the RGB image color information, the depth information of the Depth image, and the symmetric invariant LBP features of each sub-region is implemented with a deep convolutional network.
In this embodiment, in step S2, the targets carried in the image are first detected with the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection results; each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model uses the DSSD object detection algorithm: an Xception neural network is pre-trained on the COCO dataset, the model is then trained on a previously prepared dataset, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target carried in the image (such as people, trees, and furniture) is finally obtained.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications also fall within its scope of protection.

Claims (7)

1. A multi-modal multi-stitched RGB-D saliency object detection method, characterized by comprising the following steps:
S1, dividing the image into non-overlapping sub-regions and, for each image sub-region, extracting RGB image color information, depth information from the Depth image, and symmetric invariant LBP features, then forming a region histogram from the symmetric invariant LBP features;
S2, measuring the correlation among the RGB image color information, the depth information of the Depth image, and the region histogram with class-conditional mutual information entropy, and fusing the three at the score level with an adaptive score-fusion algorithm to obtain a final score for each image sub-region;
S3, detecting the salient objects of the image from the final score of each image sub-region.
2. The method according to claim 1, wherein in step S1 the extraction of the RGB image color information, the depth information of the Depth image, and the symmetric invariant LBP features of each sub-region is implemented with a deep convolutional network.
3. The method according to claim 1, wherein in step S1 the targets carried in the image are first detected with the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection results.
4. The method according to claim 3, wherein each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region.
5. The method according to claim 3, wherein the Dssd_Xception_coco model uses the DSSD object detection algorithm: an Xception neural network is pre-trained on the COCO dataset, the model is then trained on a previously prepared dataset, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting the targets carried in the image is finally obtained.
6. The method according to claim 1, wherein in step S3 saliency object detection is performed from the final score of each image sub-region with the ResNet50 model.
7. The method according to claim 1, further comprising: recognizing the human-eye viewing angle of the image, wherein different viewing angles correspond to different image deflection-angle adjustment models, and the image deflection angle is adjusted with the corresponding model.
CN202110461176.7A 2021-04-27 2021-04-27 Multi-modal multi-stitched RGB-D saliency object detection method Active CN113128519B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-modal multi-stitched RGB-D saliency object detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-modal multi-stitched RGB-D saliency object detection method

Publications (2)

Publication Number Publication Date
CN113128519A (en) 2021-07-16
CN113128519B CN113128519B (en) 2023-08-08

Family

ID=76780202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110461176.7A Active CN113128519B (en) 2021-04-27 2021-04-27 Multi-modal multi-stitched RGB-D saliency object detection method

Country Status (1)

Country Link
CN (1) CN113128519B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015180527A1 (en) * 2014-05-26 2015-12-03 Tsinghua University Shenzhen Graduate School Image saliency detection method
CN107145892A (en) * 2017-05-24 2017-09-08 Peking University Shenzhen Graduate School Image salient object detection method based on an adaptive fusion mechanism
CN107909078A (en) * 2017-10-11 2018-04-13 Tianjin University Inter-image saliency detection method
CN108345892A (en) * 2018-01-03 2018-07-31 Shenzhen University Detection method, device, equipment and storage medium for stereo-image saliency
CN108846416A (en) * 2018-05-23 2018-11-20 Beijing Institute of New Technology Applications Extraction and processing method and system for specific images
CN111353508A (en) * 2019-12-19 2020-06-30 South China University of Technology Saliency detection method and device based on RGB-image pseudo-depth information


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PIPIT UTAMI: "A Study on Facial Expression Recognition in Assessing Teaching Skills: Datasets and Methods", Procedia Computer Science, vol. 161 *
吴建国; 邵婷; 刘政怡: "Salient object detection in RGB-D images fusing salient depth features", Journal of Electronics & Information Technology, no. 09 *
赵轩; 郭蔚; 刘京: "Stepwise superpixel aggregation and multi-modal fusion object detection in RGB-D images", Journal of Image and Graphics, no. 08 *

Also Published As

Publication number Publication date
CN113128519B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
Borji et al. Adaptive object tracking by learning background context
KR102596897B1 (en) Method of motion vector and feature vector based fake face detection and apparatus for the same
EP3916627A1 (en) Living body detection method based on facial recognition, and electronic device and storage medium
JP4755202B2 (en) Face feature detection method
Ghimire et al. A robust face detection method based on skin color and edges
Kumano et al. Pose-invariant facial expression recognition using variable-intensity templates
JP4597391B2 (en) Facial region detection apparatus and method, and computer-readable recording medium
CN111460884A (en) Multi-face recognition method based on human body tracking
CN110796101A (en) Face recognition method and system of embedded platform
CN111814655B (en) Target re-identification method, network training method thereof and related device
Li et al. Robust multiperson detection and tracking for mobile service and social robots
Mu Ear detection based on skin-color and contour information
Jabnoun et al. Visual substitution system for blind people based on SIFT description
Kerdvibulvech Human hand motion recognition using an extended particle filter
Chen et al. Fast eye detection using different color spaces
CN116386118B (en) Drama matching cosmetic system and method based on human image recognition
CN109993090B (en) Iris center positioning method based on cascade regression forest and image gray scale features
Nanda et al. A robust elliptical head tracker
CN113128519A (en) Multi-modal multi-stitched RGB-D saliency object detection method
Bagherian et al. Extract of facial feature point
Bongale et al. Implementation of 3D object recognition and tracking
CN114445691A (en) Model training method and device, electronic equipment and storage medium
WO2021056531A1 (en) Face gender recognition method, face gender classifier training method and device
Naveena et al. Partial face recognition by template matching
Sharma et al. Study and implementation of face detection algorithm using Matlab

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant