CN110135435B - Saliency detection method and device based on breadth learning system - Google Patents


Info

Publication number
CN110135435B
Authority
CN
China
Prior art keywords
image
saliency map
saliency
function
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910308906.2A
Other languages
Chinese (zh)
Other versions
CN110135435A (en)
Inventor
林晓
李想
王志杰
黄继风
郑晓妹
盛斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Normal University
Original Assignee
Shanghai Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Normal University filed Critical Shanghai Normal University
Priority to CN201910308906.2A
Publication of CN110135435A
Application granted
Publication of CN110135435B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a saliency detection method and device based on a breadth learning system. The method comprises the following steps: step S1: dividing the image into a plurality of superpixels, extracting the color, position, texture, prior and contrast information of each superpixel, and obtaining a feature vector for each superpixel; step S2: processing the image based on the obtained superpixel feature vectors to obtain an initial saliency map; step S3: establishing a conditional random field model for the initial saliency map, calculating the kernel matrix in the conditional random field by regression based on breadth learning, and taking the obtained optimal solution as the optimized saliency map; step S4: using the optimized saliency map for visual tracking, image classification, image segmentation, target recognition, image and video compression, image retrieval or image redirection. Compared with the prior art, the method combines color, spatial, texture, prior and contrast features in the feature extraction stage, and improves the detection effect.

Description

Saliency detection method and device based on breadth learning system
Technical Field
The present invention relates to saliency detection technologies, and in particular, to a saliency detection method and apparatus based on a breadth learning system.
Background
In recent years, saliency detection has become one of the hot topics in computer vision and has attracted the interest of many researchers. Many excellent algorithms have appeared in the field, but developing a simple and practical saliency model remains difficult. At present, saliency detection is widely applied in related fields such as visual tracking, image classification, image segmentation, target recognition, image and video compression, image retrieval, and image redirection.
Saliency detection algorithms can be divided into visual attention detection and salient object detection according to their detection models and functions. Visual attention detection estimates the trajectory of fixation points when human eyes observe an image, and has been widely studied in neurology; salient object detection extracts the entire salient object region and suppresses background noise.
Saliency detection can also be divided into bottom-up models and top-down models, depending on how the data is processed. A top-down model is trained on representative features in training samples and can therefore detect targets of fixed sizes and classes. In contrast, a bottom-up model is data-driven, requires no prior knowledge, and is generated by direct stimulation from low-level visual information. The computational complexity of a bottom-up model is therefore typically lower than that of a top-down model.
However, existing saliency algorithms consider too few features, so their final detection results are poor.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a saliency detection method and device based on a breadth learning system.
The purpose of the invention can be realized by the following technical scheme:
A saliency detection method based on a breadth learning system comprises the following steps:
step S1: dividing the image into a plurality of superpixels, extracting the color, position, texture, prior and contrast information of each superpixel, and obtaining a feature vector for each superpixel;
step S2: processing the image based on the obtained superpixel feature vectors to obtain an initial saliency map;
step S3: establishing a conditional random field model for the initial saliency map, calculating the kernel matrix in the conditional random field by regression based on breadth learning, and taking the obtained optimal solution as the optimized saliency map;
step S4: using the optimized saliency map for visual tracking, image classification, image segmentation, target recognition, image and video compression, image retrieval or image redirection.
The energy function of the initial saliency map is specifically:

E(ŷ) = Σ_i φ_u(ŷ_i) + Σ_{i<j} φ_p(ŷ_i, ŷ_j)

wherein: E(·) is the energy function, φ_u(·) is the unary term function, φ_p(·,·) is the binary term function, ŷ_i is the saliency value calculated for the i-th feature vector, and ŷ_j is the saliency value calculated for the j-th feature vector.
The mathematical expression of the unary term function is as follows:

φ_u(ŷ_i) = −log P(ŷ_i | f^R, y_i)

wherein: f is a feature vector, f^R is a set of R feature vectors, and P(ŷ_i | f^R, y_i) is the probability that, conditioned on f^R, the saliency of the i-th superpixel is updated from y_i to ŷ_i;
the mathematical expression of the bivariate function is:
Figure BDA0002030787620000028
wherein:
Figure BDA0002030787620000029
as a compatibility function, if
Figure BDA00020307876200000210
Taking 1, otherwise, taking 0, K being kernel matrix, L being number of Gaussian function, K(l)(. cndot.) is the l-th Gaussian function.
The step S3 specifically includes:
step S31: establishing a conditional random field model for the initial saliency map;
step S32: calculating the distance between the feature vectors;
step S33: calculating a penalty term based on the obtained distance:

C(i, j) = dist_f(f_i, f_j),  s_i, s_j ∈ S

wherein: C(i, j) is the penalty term, dist_f(f_i, f_j) is the distance between the i-th and j-th feature vectors, s_i is the i-th superpixel, s_j is the j-th superpixel, and S is the set of superpixels;
step S34: finally, minimizing the energy to obtain an optimal solution, and taking the obtained optimal solution as the optimized saliency map.
The mathematical expression of the distance is specifically as follows:

dist_f(f_i, f_j) = γ‖c_i − c_j‖₂

wherein: γ is a constant coefficient, c_i is the color mean of the i-th superpixel, c_j is the color mean of the j-th superpixel, and ‖·‖₂ is the L2 norm.
The number of superpixels obtained in step S1 is 200.
The constant coefficient γ is 0.8.
A saliency detection device based on a breadth learning system, characterized by comprising a memory, a processor, and a program stored in the memory and executed by the processor, wherein the processor implements the following steps when executing the program:
step S1: dividing the image into a plurality of superpixels, extracting the color, position, texture, prior and contrast information of each superpixel, and obtaining a feature vector for each superpixel;
step S2: processing the image based on the obtained superpixel feature vectors to obtain an initial saliency map;
step S3: establishing a conditional random field model for the initial saliency map, calculating the kernel matrix in the conditional random field by regression based on breadth learning, and taking the obtained optimal solution as the optimized saliency map.
Compared with the prior art, the invention has the following beneficial effects:
1) In the feature extraction stage, color features, spatial features, texture features, prior features and contrast features are combined, so that the feature vectors are more discriminative.
2) A set of basic breadth learning models is assembled into an ensemble and trained with a boosting algorithm. This method not only trains quickly but also produces accurate results.
3) A CRF with the learned kernel matrix is applied to optimize the initial result.
Drawings
FIG. 1 is a schematic flow chart of the main steps of the method of the present invention;
FIG. 2 is a schematic diagram of initial saliency map acquisition;
FIG. 3 is a diagram illustrating a comparison between an effect graph generated by the method of the present invention and an effect graph generated by another algorithm;
FIG. 4 is a qualitative comparison example of saliency maps generated by 9 classical and recent advanced models and by the method herein, wherein FIG. 4(a) is the original image, FIG. 4(b) is the corresponding ground-truth map, and FIG. 4(c) and the following are the resulting effect maps;
FIG. 5 is a schematic diagram of features to be extracted for a superpixel;
FIG. 6 is a program segment of the enhanced breadth learning algorithm.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
A saliency detection method based on a breadth learning system is implemented by a computer system in the form of a computer program, where the computer system is a saliency detection apparatus and includes a memory, a processor, and a program stored in the memory and executed by the processor, as shown in FIG. 1, and the processor implements the following steps when executing the program:
Step S1: dividing the image into a plurality of superpixels, extracting the color, position, texture, prior and contrast information of each superpixel, and obtaining a feature vector for each superpixel, wherein the number of superpixels obtained in step S1 is 200;
the process specifically adopts a Geodesic-SLIC algorithm to segment an image into N superpixels, and for each superpixel, 5 angles of the color, the position, the texture, the prior and the contrast of the superpixel are used for extracting a feature vector to generate 1 feature vector with 203 bits, which is shown in FIG. 5. Wherein the color and location features of the superpixel are both taken averages and the texture features, the present invention uses color histograms, gradient histograms, etc. In addition, the invention also refers to target priors and background priors to enhance the resolving power for the feature vectors. Finally, the color is calculated by the application, and the contrast of the texture is taken as the final component of the feature vector.
Step S2: processing the image based on the obtained superpixel feature vectors, as shown in FIG. 2, to obtain an initial saliency map;
the energy function of the initial saliency map is specifically:
Figure BDA0002030787620000041
wherein:
Figure BDA0002030787620000042
as a function of energy, phiu(. is a function of a unit term, phip(. cndot.) is a two-element function,
Figure BDA0002030787620000043
the calculated saliency value for the ith feature vector,
Figure BDA0002030787620000044
a calculated saliency value for the jth eigenvector.
The mathematical expression of the unary term function is:

φ_u(ŷ_i) = −log P(ŷ_i | f^R, y_i)

wherein: f is a feature vector, f^R is a set of R feature vectors, and P(ŷ_i | f^R, y_i) is the probability that, conditioned on f^R, the saliency of the i-th superpixel is updated from y_i to ŷ_i;
the mathematical expression of the bigram function is:
Figure BDA0002030787620000051
wherein:
Figure BDA0002030787620000052
as a compatibility function, if
Figure BDA0002030787620000053
Taking 1, otherwise, taking 0, K being kernel matrix, L being number of Gaussian function, K(l)(. cndot.) is the l-th Gaussian function.
Specifically, the breadth learning model and the AdaBoost algorithm are combined into an enhanced breadth learning system to learn the features extracted in the first step. Unlike deep network models, the breadth network has only two layers, a hidden layer and an output layer. Solving for its optimal parameters is a ridge regression problem, so the parameters of the breadth learning network can be obtained by matrix inversion. The training efficiency of breadth learning is therefore far higher than that of a deep learning network, and when combined with AdaBoost the learning system achieves high accuracy.
The breadth neural network used in the present application has two layers: a hidden layer consisting of a mapped-node layer U and an enhancement-node layer H, and an output layer Y. Assume the feature vector matrix is

F = [f_1; f_2; …; f_N]

which serves as the input to the breadth learning system. The calculation is as follows:

U = φ(F W_f + β_f)
H = ξ(U W_e + β_e)

wherein both the function φ(·) and the function ξ(·) are nonlinear functions, and the final output is Y = [U | H] W, where W is the parameter to be finally determined. Since solving for W is a ridge regression problem, it can be obtained in closed form by matrix inversion:

W = ([U | H]ᵀ[U | H] + λI)⁻¹ [U | H]ᵀ Y

where λ is the ridge regularization coefficient.
The source code of the enhanced breadth learning algorithm is shown in FIG. 6.
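As a concrete illustration of the two-layer computation above, the following is a minimal numpy sketch of breadth-learning regression under the stated equations U = φ(FW_f + β_f), H = ξ(UW_e + β_e) and Y = [U | H]W. The node counts n_map and n_enh, the tanh nonlinearities, and the ridge coefficient lam are illustrative assumptions; the enhanced system of FIG. 6 would additionally train several such learners on reweighted samples and combine them with AdaBoost-style weights.

    import numpy as np

    rng = np.random.default_rng(0)

    def train_bls(F, Y, n_map=50, n_enh=100, lam=1e-3):
        """Fit the output weights W of a two-layer breadth learning network."""
        D = F.shape[1]
        Wf, bf = rng.standard_normal((D, n_map)), rng.standard_normal(n_map)
        U = np.tanh(F @ Wf + bf)                      # mapped-node layer U
        We, be = rng.standard_normal((n_map, n_enh)), rng.standard_normal(n_enh)
        H = np.tanh(U @ We + be)                      # enhancement-node layer H
        A = np.hstack([U, H])                         # hidden layer [U | H]
        # ridge regression solved by matrix inversion: W = (A^T A + lam I)^-1 A^T Y
        W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
        return Wf, bf, We, be, W

    def predict_bls(params, F):
        """Forward pass: Y = [U | H] W."""
        Wf, bf, We, be, W = params
        U = np.tanh(F @ Wf + bf)
        H = np.tanh(U @ We + be)
        return np.hstack([U, H]) @ W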
Step S3: establishing a conditional random field model for the initial saliency map, calculating the kernel matrix in the conditional random field by regression based on breadth learning, and taking the obtained optimal solution as the optimized saliency map, as shown in FIGS. 3 and 4, specifically including:
step S31: establishing a conditional random field model for the initial saliency map;
step S32: calculating the distance between the feature vectors;
step S33: calculating a penalty term based on the obtained distance:

C(i, j) = dist_f(f_i, f_j),  s_i, s_j ∈ S

wherein: C(i, j) is the penalty term, dist_f(f_i, f_j) is the distance between the i-th and j-th feature vectors, s_i is the i-th superpixel, s_j is the j-th superpixel, and S is the set of superpixels;
step S34: finally, minimizing the energy to obtain an optimal solution, and taking the obtained optimal solution as the optimized saliency map.
The mathematical expression of the distance is specifically:

dist_f(f_i, f_j) = γ‖c_i − c_j‖₂

wherein: γ is a constant coefficient, set to 0.8 in this embodiment, c_i is the color mean of the i-th superpixel, c_j is the color mean of the j-th superpixel, and ‖·‖₂ is the L2 norm.
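A minimal sketch of the pairwise machinery of step S3 follows, using the reconstructed forms above: the penalty C(i, j) is taken directly as the color distance dist_f(f_i, f_j) = γ‖c_i − c_j‖₂, and a single Gaussian kernel with an assumed bandwidth sigma stands in for the kernel matrix K that the patent learns by breadth-learning regression. Since the compatibility μ is 1 exactly when two labels differ, only disagreeing superpixel pairs contribute pairwise energy.

    import numpy as np

    def pairwise_terms(colors, labels, gamma=0.8, sigma=10.0):
        """colors: (N, 3) superpixel color means; labels: (N,) binary saliency."""
        diff = colors[:, None, :] - colors[None, :, :]
        dist = gamma * np.linalg.norm(diff, axis=-1)        # dist_f(f_i, f_j)
        penalty = dist                                      # C(i, j), reconstructed form
        kernel = np.exp(-dist ** 2 / (2.0 * sigma ** 2))    # stand-in Gaussian kernel
        mu = (labels[:, None] != labels[None, :]).astype(float)  # compatibility mu
        iu = np.triu_indices(len(labels), k=1)              # pairs with i < j
        energy_p = (mu * kernel)[iu].sum()                  # sum of phi_p over all pairs
        return penalty, energy_p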
Thereafter, in step S4, the optimized saliency map is used for visual tracking, image classification, image segmentation, target recognition, image and video compression, image retrieval or image redirection.

Claims (5)

1. A saliency detection method based on a breadth learning system, characterized by comprising the following steps:
step S1: dividing the image into a plurality of superpixels, extracting the color, position, texture, prior and contrast information of each superpixel, and obtaining a feature vector for each superpixel;
step S2: processing the image based on the obtained superpixel feature vectors to obtain an initial saliency map;
step S3: establishing a conditional random field model for the initial saliency map, calculating the kernel matrix in the conditional random field by regression based on breadth learning, and taking the obtained optimal solution as the optimized saliency map;
step S4: using the optimized saliency map for visual tracking, image classification, image segmentation, target recognition, image and video compression, image retrieval or image redirection;
the energy function of the initial saliency map is specifically:
Figure FDA0002937498530000011
wherein:
Figure FDA0002937498530000012
as a function of energy, phiu(. is a function of a unit term, phip(. cndot.) is a two-element function,
Figure FDA0002937498530000013
the calculated saliency value for the ith feature vector,
Figure FDA0002937498530000014
a calculated saliency value for the jth eigenvector;
the mathematical expression of the unit term function is as follows:
Figure FDA0002937498530000015
wherein: f is a feature vector, fRFor a set of R feature vectors,
Figure FDA0002937498530000016
is in fRConditional on the significance of the ith super pixel being yiIs updated to
Figure FDA0002937498530000017
The probability of (a) of (b) being,
the mathematical expression of the bivariate function is:
Figure FDA0002937498530000018
wherein:
Figure FDA0002937498530000019
as a compatibility function, if
Figure FDA00029374985300000110
Taking 1, otherwise, taking 0, K being kernel matrix, L being number of Gaussian function, K(l)(. h) is the l-th Gaussian function;
the step S3 specifically includes:
step S31: a conditional random field model is built for the initial saliency map,
step S32: the distance between the feature vectors is calculated,
step S33: calculating a penalty term based on the obtained distance:

C(i, j) = dist_f(f_i, f_j),  s_i, s_j ∈ S

wherein: C(i, j) is the penalty term, dist_f(f_i, f_j) is the distance between the i-th and j-th feature vectors, s_i is the i-th superpixel, s_j is the j-th superpixel, S is the set of superpixels, f_i is the i-th feature vector, and f_j is the j-th feature vector,
step S34: finally, minimizing the energy to obtain an optimal solution, and taking the obtained optimal solution as the optimized saliency map.
2. The saliency detection method based on a breadth learning system according to claim 1, characterized in that the mathematical expression of the distance between the i-th and j-th feature vectors is specifically:

dist_f(f_i, f_j) = γ‖c_i − c_j‖₂

wherein: γ is a constant coefficient, c_i is the color mean of the i-th superpixel, c_j is the color mean of the j-th superpixel, and ‖·‖₂ is the L2 norm.
3. The saliency detection method based on a breadth learning system according to claim 1, characterized in that the number of superpixels obtained in the step S1 is 200.
4. The saliency detection method based on a breadth learning system according to claim 2, characterized in that the constant coefficient is 0.8.
5. A saliency detection apparatus based on a breadth learning system, comprising a memory, a processor, and a program stored in the memory and executed by the processor, the processor implementing the method of any one of claims 1-4 when executing the program.
CN201910308906.2A 2019-04-17 2019-04-17 Saliency detection method and device based on breadth learning system Active CN110135435B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910308906.2A CN110135435B (en) 2019-04-17 2019-04-17 Saliency detection method and device based on breadth learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910308906.2A CN110135435B (en) 2019-04-17 2019-04-17 Saliency detection method and device based on breadth learning system

Publications (2)

Publication Number Publication Date
CN110135435A CN110135435A (en) 2019-08-16
CN110135435B (en) 2021-05-18

Family

ID=67570332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910308906.2A Active CN110135435B (en) 2019-04-17 2019-04-17 Saliency detection method and device based on breadth learning system

Country Status (1)

Country Link
CN (1) CN110135435B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111726633B (en) * 2020-05-11 2021-03-26 河南大学 Compressed video stream recoding method based on deep learning and significance perception
CN111640129B (en) * 2020-05-25 2023-04-07 电子科技大学 Visual mortar recognition system applied to indoor wall construction robot
CN112800260B (en) * 2021-04-09 2021-08-20 北京邮电大学 Multi-label image retrieval method and device based on deep hash energy model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7940985B2 (en) * 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
CN103136766A (en) * 2012-12-28 2013-06-05 上海交通大学 Object significance detecting method based on color contrast and color distribution
CN104077609A (en) * 2014-06-27 2014-10-01 河海大学 Saliency detection method based on conditional random field
CN104574375A (en) * 2014-12-23 2015-04-29 浙江大学 Image significance detection method combining color and depth information
US9147255B1 (en) * 2013-03-14 2015-09-29 Hrl Laboratories, Llc Rapid object detection by combining structural information from image segmentation with bio-inspired attentional mechanisms
CN105590319A (en) * 2015-12-18 2016-05-18 华南理工大学 Method for detecting image saliency region for deep learning
CN107133955A (en) * 2017-04-14 2017-09-05 大连理工大学 A kind of collaboration conspicuousness detection method combined at many levels
CN108320286A (en) * 2018-02-28 2018-07-24 苏州大学 Image significance detection method, system, equipment and computer readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017180208A1 (en) * 2016-04-13 2017-10-19 Google Inc. Wide and deep machine learning models
CN105931189B (en) * 2016-05-10 2020-09-11 浙江大学 Video super-resolution method and device based on improved super-resolution parameterized model

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7940985B2 (en) * 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
CN103136766A (en) * 2012-12-28 2013-06-05 上海交通大学 Object significance detecting method based on color contrast and color distribution
US9147255B1 (en) * 2013-03-14 2015-09-29 Hrl Laboratories, Llc Rapid object detection by combining structural information from image segmentation with bio-inspired attentional mechanisms
CN104077609A (en) * 2014-06-27 2014-10-01 河海大学 Saliency detection method based on conditional random field
CN104574375A (en) * 2014-12-23 2015-04-29 浙江大学 Image significance detection method combining color and depth information
CN105590319A (en) * 2015-12-18 2016-05-18 华南理工大学 Method for detecting image saliency region for deep learning
CN107133955A (en) * 2017-04-14 2017-09-05 大连理工大学 A kind of collaboration conspicuousness detection method combined at many levels
CN108320286A (en) * 2018-02-28 2018-07-24 苏州大学 Image significance detection method, system, equipment and computer readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture";C. L. Philip Chen et al.;《IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS》;20180131;第29卷(第1期);第10-24页 *
"Performance Analysis of Machine Learning Classifiers for Intrusion Detection";Skhumbuzo Zwane et al.;《2018 International Conference on Intelligent and Innovative Computing Applications (ICONIC)》;20190107;第1-5页 *
"基于双层多尺度神经网络的显著性对象检测算法";李鑫 等;《微电子学与计算机》;20181130;第35卷(第11期);第1-7页 *

Also Published As

Publication number Publication date
CN110135435A (en) 2019-08-16

Similar Documents

Publication Publication Date Title
CN111860670B (en) Domain adaptive model training method, image detection method, device, equipment and medium
CN108256562B (en) Salient target detection method and system based on weak supervision time-space cascade neural network
CN112750140B (en) Information mining-based disguised target image segmentation method
CN112307958A (en) Micro-expression identification method based on spatiotemporal appearance movement attention network
CN112184752A (en) Video target tracking method based on pyramid convolution
CN109446889B (en) Object tracking method and device based on twin matching network
CN112052886A (en) Human body action attitude intelligent estimation method and device based on convolutional neural network
CN112288011B (en) Image matching method based on self-attention deep neural network
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN110135435B (en) Saliency detection method and device based on breadth learning system
CN111814611B (en) Multi-scale face age estimation method and system embedded with high-order information
CN112837344B (en) Target tracking method for generating twin network based on condition countermeasure
CN107862680B (en) Target tracking optimization method based on correlation filter
Yang et al. Visual tracking with long-short term based correlation filter
Yan et al. STDMANet: Spatio-temporal differential multiscale attention network for small moving infrared target detection
CN115761484A (en) Cloud detection method and device based on remote sensing image
CN114333062B (en) Pedestrian re-recognition model training method based on heterogeneous dual networks and feature consistency
CN116740362B (en) Attention-based lightweight asymmetric scene semantic segmentation method and system
CN113763274A (en) Multi-source image matching method combining local phase sharpness orientation description
CN113763417A (en) Target tracking method based on twin network and residual error structure
Wang et al. An improved convolutional neural network-based scene image recognition method
Pei et al. FGO-Net: Feature and Gaussian Optimization Network for visual saliency prediction
CN113420760A (en) Handwritten Mongolian detection and identification method based on segmentation and deformation LSTM
Guo et al. Face illumination normalization based on generative adversarial network
CN112419227B (en) Underwater target detection method and system based on small target search scaling technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant