CN107169954B - Image saliency detection method based on parallel convolutional neural network - Google Patents
- Publication number: CN107169954B (application CN201710253255.2A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- convolutional neural
- parallel convolutional
- layer
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
Abstract
The invention discloses an image saliency detection method based on a parallel convolutional neural network, comprising the following steps: (1) designing a parallel convolutional neural network structure; (2) designing two kinds of network input images and defining superpixel-based labels for the inputs; (3) performing dataset balancing and input preprocessing; (4) model training, where the model comprises a data preprocessing module and the parallel convolutional neural network structure; (5) computing a saliency map for a target image using the trained model. The method can effectively detect the internal semantics of a salient subject and its difference from the background, detects saliency from both global and local perspectives, and realizes automatic saliency detection for images.
Description
Technical Field
The invention relates to an image detection method, in particular to an image saliency detection method based on a parallel convolutional neural network.
Background
The purpose of image saliency detection is to identify the visually most prominent regions of an image, a very important problem in computer vision and image processing. As a preprocessing step, saliency detection has wide application in computer vision and image processing, such as multimedia information transmission, image and video reconstruction, and image and video quality assessment. Saliency detection is also widely used in high-level visual tasks such as object detection and identity recognition. As a well-studied topic, it has attracted a large number of saliency detection models from the research community.
Traditional saliency detection models fall into hand-crafted-feature-based methods and prior-knowledge-based methods. Hand-crafted-feature-based methods focus on designing features such as color, brightness, and texture; when an image has complex semantics, they cannot effectively detect the salient subject, and when the subject differs little from the background in color and brightness, they cannot effectively separate the two. Prior-knowledge-based methods define common characteristics of salient subjects; for example, background-prior methods assume that regions near the image border are background, but in some images the salient subject lies at the border, which limits such methods.
Disclosure of Invention
In order to overcome the above disadvantages and shortcomings of the prior art, the present invention aims to provide an image saliency detection method based on a parallel convolutional neural network that effectively detects the internal semantics of a salient subject and its difference from the background, detects saliency from both global and local perspectives, and realizes automatic saliency detection for images.
The purpose of the invention is realized by the following technical scheme:
an image saliency detection method based on a parallel convolutional neural network comprises the following steps:
(1) designing a parallel convolutional neural network structure; the parallel convolutional neural network structure comprises a global angle detection module CNN-G and a local angle detection module CNN-L;
the global angle detection module CNN-G is a single-path convolutional neural network; the local angle detection module CNN-L is a two-path parallel convolutional neural network; the global angle detection module CNN-G and the local angle detection module CNN-L are joined in parallel through a fully-connected layer;
(2) designing two kinds of network input images and defining superpixel-based labels for the inputs; the network input images comprise a global padded image and local cropped images;
the global padded image is centered on a superpixel, contains all information of the original image, represents global features, and serves as the input of the global angle detection module CNN-G;
a local cropped image is centered on a superpixel, contains the detail information of the superpixel's neighborhood, represents local features, and serves as the input of the local angle detection module CNN-L;
(3) performing dataset balancing and input preprocessing;
(4) model training: the model comprises a data preprocessing module and the parallel convolutional neural network structure;
(5) computing a saliency map for the target image using the trained model.
Defining the superpixel-based labels for the inputs in step (2) is specifically as follows:
a superpixel's label is determined by its overlap ratio with the ground-truth saliency map: if the overlap ratio is greater than a set threshold, the label is 1 and the superpixel is regarded as salient; if the overlap ratio is smaller than a set threshold, the label is 0 and the superpixel is regarded as non-salient.
The dataset balancing in step (3) specifically comprises:
all positive samples obtained from an image are used, and an equal number of negative samples is randomly selected; all samples are normalized to 256 × 256.
The first 5 layers of the parallel convolutional neural network structure in step (1) are convolutional layers: the first convolutional layer has 96 kernels of size 11 × 11 × 3; the second has 256 kernels of size 5 × 5 × 48; the third has 384 kernels of size 3 × 3 × 256; the fourth has 384 kernels of size 3 × 3 × 192; the fifth has 256 kernels of size 3 × 3 × 192. The first two and the fifth convolutional layers are each followed by a pooling layer and a normalization layer.
In step (1), convolutional layers at the same depth of the parallel convolutional neural network structure share parameters, so that scale-invariant features are learned.
In the step (4), the training of the parallel convolutional neural network comprises the following steps:
(4-1) initializing network parameters;
(4-2) setting training parameters;
(4-3) loading training data;
and (4-4) iteratively training.
Initializing the network parameters in step (4-1) is specifically: using a fine-tuning strategy, the first six layers of the parallel convolutional neural network are initialized with the first six layers' parameters of the AlexNet model; the fully-connected layers are initialized with random values.
The training parameters in step (4-2) are specifically: the initial learning rate of the first 5 layers of the parallel convolutional neural network is set to 0.0001; the initial learning rate of the fully-connected layer parameters is 0.001; during training, the learning rate is reduced by 40% after every 8 passes over the training set.
The iterative training of step (4-4): the parallel convolutional neural network is trained iteratively with a stochastic gradient descent algorithm; the network parameters are saved every 1000 iterations, and the optimal solution of the network is obtained through continued iteration.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention detects saliency from global and local perspectives simultaneously, effectively avoiding the drawbacks of detecting saliency from a single perspective; it also considers multi-scale information, making the detection results clearer and more complete.
2. Compared with methods that use individual pixels as the basic processing unit, the method greatly reduces the amount of computation while improving the algorithm's results to a certain extent.
3. The invention is based on a parallel convolutional neural network, and the trained model can adapt to various conditions, such as images with multiple salient subjects, salient subjects that are too large or too small, salient subjects at the image border, salient subjects similar to the background, and complex image backgrounds.
Drawings
Fig. 1 is a flowchart of an image saliency detection method based on a parallel convolutional neural network according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples, but the embodiments of the present invention are not limited thereto.
Examples
As shown in fig. 1, the image saliency detection method based on a parallel convolutional neural network of the present embodiment includes the following steps:
(1) Designing a parallel convolutional neural network structure; the parallel convolutional neural network structure comprises a global angle detection module CNN-G and a local angle detection module CNN-L.
The global angle detection module CNN-G is a single-path convolutional neural network; the local angle detection module CNN-L is a two-path parallel convolutional neural network; the global angle detection module CNN-G and the local angle detection module CNN-L are joined in parallel through a fully-connected layer.
The first six layers of the AlexNet network [A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, in: Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS), 2012, pp. 1097-1105] are used as the single-path reference network.
The input image size of the parallel convolutional neural network structure is 227 × 227 × 3, the three dimensions being width, height, and number of channels. The first 5 layers are convolutional layers: the first convolutional layer has 96 kernels of size 11 × 11 × 3; the second has 256 kernels of size 5 × 5 × 48; the third has 384 kernels of size 3 × 3 × 256; the fourth has 384 kernels of size 3 × 3 × 192; the fifth has 256 kernels of size 3 × 3 × 192. The first two and the fifth convolutional layers are each followed by a pooling layer and a normalization layer. CNN-G and CNN-L are joined in parallel through a fully-connected layer with 4096 neurons, so that the model detects saliency from both global and local perspectives. The last layer of the parallel convolutional neural network structure is an output layer with only 2 neurons, representing the saliency value of the superpixel to be predicted.
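Since the description gives only kernel counts and sizes, the strides and paddings below are assumed from AlexNet, which the single-path reference network is stated to follow. A minimal sketch tracing the spatial size of the feature maps through the convolutional stack:

```python
def out_size(n, k, s=1, p=0):
    """Spatial output size of a convolution/pooling layer:
    floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

n = 227                    # 227 x 227 x 3 input
n = out_size(n, 11, s=4)   # conv1: 96 kernels, 11 x 11 x 3   -> 55
n = out_size(n, 3, s=2)    # max pooling, 3 x 3 stride 2      -> 27
n = out_size(n, 5, p=2)    # conv2: 256 kernels, 5 x 5 x 48   -> 27
n = out_size(n, 3, s=2)    # max pooling                      -> 13
n = out_size(n, 3, p=1)    # conv3: 384 kernels, 3 x 3 x 256  -> 13
n = out_size(n, 3, p=1)    # conv4: 384 kernels, 3 x 3 x 192  -> 13
n = out_size(n, 3, p=1)    # conv5: 256 kernels, 3 x 3 x 192  -> 13
n = out_size(n, 3, s=2)    # max pooling                      -> 6
print(n)  # 6: each path hands a 6 x 6 x 256 volume to the 4096-neuron FC layer
```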
(2) Designing two kinds of network input images and defining superpixel-based labels for the inputs; the network input images comprise a global padded image and local cropped images. The global padded image is centered on a superpixel, contains all information of the original image, represents global features, and serves as the input of the global angle detection module CNN-G; a local cropped image is centered on a superpixel, contains the detail information of the superpixel's neighborhood, represents local features, and serves as the input of the local angle detection module CNN-L.
In this embodiment, an image is first segmented with the SLIC superpixel segmentation algorithm; then, centered on a given superpixel S, three input images (one global padded image and two local cropped images) are produced by padding or cropping, with any part extending beyond the original image area filled with the mean pixel value of the database. The three images of different sizes are then scaled to the same size, and each serves as the input of one of the three convolutional neural network paths in the parallel network.
How much of the original image's information each of the three input images contains is defined as follows: let (Wo, Ho) be the width and height of the original image and (Wp, Hp) the width and height of an input image; they are related by

(Wp, Hp) = 2 × (Wo, Ho) × cp,

where cp is the cropping factor. Since there are three different input images, cp takes three different values; in the present invention cp = [1, 1/4, 1/8]. For cp = 1, the input image is a padded image containing all information of the original image; it serves as the input of the global network and detects saliency from the global perspective. In the local network, cp = [1/4, 1/8]: the input images contain local detail information of the neighborhood of superpixel S at different scales, and the two cropped images serve as the inputs of the local network to detect local saliency in a multi-scale manner. The parallelism of the network thus gives the whole network the ability to detect saliency from global and local perspectives simultaneously.
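The padding-and-cropping step can be sketched as follows. The function name, the scalar mean fill value, and the integer rounding of the window size are illustrative assumptions, not the patent's exact implementation:

```python
import numpy as np

def crop_around(img, cy, cx, cp, fill):
    """Extract a window of size (2*H*cp, 2*W*cp) centered on the superpixel
    center (cy, cx); pixels outside the original image are set to `fill`
    (the database mean in the patent)."""
    H, W = img.shape[:2]
    h, w = int(2 * H * cp), int(2 * W * cp)
    canvas = np.full((h, w, img.shape[2]), fill, dtype=img.dtype)
    y0, x0 = cy - h // 2, cx - w // 2          # window top-left in image coords
    ys, ye = max(0, y0), min(H, y0 + h)        # overlap with the image
    xs, xe = max(0, x0), min(W, x0 + w)
    if ys < ye and xs < xe:
        canvas[ys - y0:ye - y0, xs - x0:xe - x0] = img[ys:ye, xs:xe]
    return canvas
```

With cp = 1 the window is twice the image in each dimension, so the whole original image always fits inside it (the global padded image); cp = 1/4 and 1/8 yield the two local cropped images.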
The label of a superpixel is defined as follows, where S is the superpixel and G is the ground-truth saliency map: (1) if |S ∩ G| / |S| > 0.9, the label is 1, indicating that the superpixel is salient; (2) if |S ∩ G| / |S| < 0.1, the label is 0, indicating that the superpixel is non-salient; (3) if 0.1 ≤ |S ∩ G| / |S| ≤ 0.9, the superpixel is discarded and not used as training data.
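The three-way label rule can be stated compactly; `superpixel_label` is a hypothetical helper name, not from the patent:

```python
def superpixel_label(overlap):
    """Map the overlap ratio |S ∩ G| / |S| to a training label:
    1 (salient), 0 (non-salient), or None (ambiguous, discarded)."""
    if overlap > 0.9:
        return 1
    if overlap < 0.1:
        return 0
    return None
```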
(3) Data set balancing processing and input preprocessing:
An unbalanced training set harms the classification results and weakens the ability to learn good features. When positive and negative samples are taken according to the method in (2), the number of positive samples obtained from the database is far smaller than the number of negative samples. To make the two consistent, during training all positive samples obtained from an image are used, an equal number of negative samples is randomly selected, and all samples are normalized to 256 × 256.
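A minimal sketch of the balancing step, assuming a simple uniform random subsample of the negatives (the `seed` parameter is an illustrative addition for reproducibility):

```python
import random

def balance_samples(positives, negatives, seed=0):
    """Keep all positive samples from an image; draw an equal-sized
    random subset of the negatives."""
    rng = random.Random(seed)
    k = min(len(positives), len(negatives))
    return list(positives), rng.sample(negatives, k)
```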
(4) Model training: the model comprises a data preprocessing module and a parallel convolutional neural network structure;
the parallel convolutional neural network comprises the following specific training steps:
(4-1) Network parameter initialization: using a fine-tuning strategy, the first six layers of the parallel convolutional neural network are initialized with the first six layers' parameters of the AlexNet model; the fully-connected layers are initialized with random values.
(4-2) Setting training parameters: the initial learning rate of the first 5 layers is set to 0.0001; the initial learning rate of the fully-connected layer parameters is 0.001; during training, the learning rate is reduced by 40% after every 8 passes over the training set.
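The schedule above ("reduce by 40% after every 8 passes") can be written as a step decay with factor 0.6; the helper below is a sketch, not the patent's code:

```python
def learning_rate(base_lr, epoch, decay=0.6, step=8):
    """Per-layer learning rate after `epoch` full passes over the training
    set: cut by 40% (multiply by 0.6) after every 8 passes."""
    return base_lr * decay ** (epoch // step)

# base rates from the patent: 1e-4 for the conv layers, 1e-3 for the FC layers
```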
(4-3) Training data are loaded: the training set consists of 6000 images randomly selected from the MSRA10K database and 3500 images randomly selected from the DUT-OMRON database; the validation set consists of 800 images randomly selected from the MSRA10K database and 468 images randomly selected from the DUT-OMRON database. The training and validation sets do not overlap.
(4-4) The parallel convolutional neural network is trained iteratively with a stochastic gradient descent algorithm; the network parameters are saved every 1000 iterations, and the optimal solution of the network is obtained through continued iteration. The network with high accuracy and a low loss on the validation set is selected as the optimal network of the invention.
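The checkpointing discipline can be sketched as a generic loop; `step_fn` stands in for one mini-batch SGD update of the parallel network, and `save_fn` for writing the parameters to disk (both names are illustrative):

```python
def train(step_fn, save_fn, n_iters, ckpt_every=1000):
    """Run step_fn once per iteration and snapshot the parameters
    every ckpt_every iterations (1000 in the patent)."""
    for it in range(1, n_iters + 1):
        step_fn(it)            # one mini-batch SGD update
        if it % ckpt_every == 0:
            save_fn(it)        # persist network parameters

saved = []
train(step_fn=lambda it: None, save_fn=saved.append, n_iters=2500)
print(saved)  # [1000, 2000]
```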
(5) A saliency map is computed using the trained model for the target image.
With the saliency detection model designed by the invention, once a user provides an image, the system computes its saliency map with the trained and learned deep model.
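Assembling the final saliency map from per-superpixel predictions might look as follows; `score_fn` stands in for a forward pass of the trained parallel CNN on a superpixel's global and local input images:

```python
import numpy as np

def saliency_map(seg, score_fn):
    """Build a per-pixel saliency map from superpixel predictions.

    seg: integer label image from SLIC (each pixel holds its superpixel id).
    score_fn: maps a superpixel id to the network's saliency probability.
    """
    out = np.zeros(seg.shape, dtype=float)
    for sp in np.unique(seg):
        out[seg == sp] = score_fn(sp)
    return out

seg = np.array([[0, 0, 1],
                [0, 1, 1]])
m = saliency_map(seg, {0: 0.2, 1: 0.9}.get)  # dict stands in for the CNN
```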
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (8)
1. An image saliency detection method based on a parallel convolutional neural network, characterized by comprising the following steps:
(1) designing a parallel convolutional neural network structure; the parallel convolutional neural network structure comprises a global angle detection module CNN-G and a local angle detection module CNN-L;
the global angle detection module CNN-G is a single-path convolutional neural network; the local angle detection module CNN-L is a two-path parallel convolutional neural network; the global angle detection module CNN-G and the local angle detection module CNN-L are joined in parallel through a fully-connected layer;
(2) designing two kinds of network input images and defining superpixel-based labels for the inputs; defining the superpixel-based labels for the inputs is specifically: a superpixel's label is determined by its overlap ratio with the ground-truth saliency map; if the overlap ratio is greater than a set threshold, the label is 1 and the superpixel is regarded as salient; if the overlap ratio is smaller than a set threshold, the label is 0 and the superpixel is regarded as non-salient;
the network input images comprise a global padded image and local cropped images;
the global padded image is centered on a superpixel, contains all information of the original image, represents global features, and serves as the input of the global angle detection module CNN-G;
a local cropped image is centered on a superpixel, contains the detail information of the superpixel's neighborhood, represents local features, and serves as the input of the local angle detection module CNN-L;
(3) performing dataset balancing and input preprocessing;
(4) model training: the model comprises a data preprocessing module and the parallel convolutional neural network structure;
(5) computing a saliency map for the target image using the trained model.
2. The image saliency detection method based on a parallel convolutional neural network of claim 1, characterized in that the dataset balancing of step (3) is specifically:
all positive samples obtained from an image are used, and an equal number of negative samples is randomly selected; all samples are normalized to 256 × 256.
3. The image saliency detection method based on a parallel convolutional neural network of claim 1, characterized in that the first 5 layers of the parallel convolutional neural network structure of step (1) are convolutional layers: the first convolutional layer has 96 kernels of size 11 × 11 × 3; the second has 256 kernels of size 5 × 5 × 48; the third has 384 kernels of size 3 × 3 × 256; the fourth has 384 kernels of size 3 × 3 × 192; the fifth has 256 kernels of size 3 × 3 × 192; the first two and the fifth convolutional layers are each followed by a pooling layer and a normalization layer.
4. The image saliency detection method based on a parallel convolutional neural network of claim 3, characterized in that in step (1) the parameters of same-layer convolutional layers of the parallel convolutional neural network structure are shared to learn scale-invariant features.
5. The parallel convolutional neural network-based image saliency detection method according to claim 3, wherein in step (4), the training of the parallel convolutional neural network comprises the following steps:
(4-1) initializing network parameters;
(4-2) setting training parameters;
(4-3) loading training data;
and (4-4) iteratively training.
6. The image saliency detection method based on a parallel convolutional neural network of claim 5, characterized in that the network parameter initialization of step (4-1) is specifically: using a fine-tuning strategy, the first six layers of the parallel convolutional neural network are initialized with the first six layers' parameters of the AlexNet model; the fully-connected layers are initialized with random values.
7. The image saliency detection method based on a parallel convolutional neural network of claim 5, characterized in that the training parameters of step (4-2) are specifically: the initial learning rate of the first 5 layers of the parallel convolutional neural network is set to 0.0001; the initial learning rate of the fully-connected layer parameters is 0.001; during training, the learning rate is reduced by 40% after every 8 passes over the training set.
8. The image saliency detection method based on a parallel convolutional neural network of claim 5, characterized in that the iterative training of step (4-4) is: the parallel convolutional neural network is trained iteratively with a stochastic gradient descent algorithm; the network parameters are saved every 1000 iterations, and the optimal solution of the network is obtained through continued iteration.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710253255.2A | 2017-04-18 | 2017-04-18 | Image saliency detection method based on parallel convolutional neural network |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN107169954A | 2017-09-15 |
| CN107169954B | 2020-06-19 |
Family
ID=59812176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710253255.2A Active CN107169954B (en) | 2017-04-18 | 2017-04-18 | Image significance detection method based on parallel convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107169954B (en) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107729819B (en) * | 2017-09-22 | 2020-05-19 | 华中科技大学 | Face labeling method based on sparse fully-convolutional neural network |
CN107966447B (en) * | 2017-11-14 | 2019-12-17 | 浙江大学 | workpiece surface defect detection method based on convolutional neural network |
CN107833220B (en) * | 2017-11-28 | 2021-06-11 | 河海大学常州校区 | Fabric defect detection method based on deep convolutional neural network and visual saliency |
CN108154150B (en) * | 2017-12-18 | 2021-07-23 | 北京工业大学 | Significance detection method based on background prior |
CN108364281B (en) * | 2018-01-08 | 2020-10-30 | 佛山市顺德区中山大学研究院 | Ribbon edge flaw defect detection method based on convolutional neural network |
CN108230243B (en) * | 2018-02-09 | 2021-04-27 | 福州大学 | Background blurring method based on salient region detection model |
CN108875555B (en) * | 2018-04-25 | 2022-02-25 | 中国人民解放军军事科学院军事医学研究院 | Video interest area and salient object extracting and positioning system based on neural network |
CN108647695A (en) * | 2018-05-02 | 2018-10-12 | 武汉科技大学 | Soft image conspicuousness detection method based on covariance convolutional neural networks |
CN109508627A (en) * | 2018-09-21 | 2019-03-22 | 国网信息通信产业集团有限公司 | The unmanned plane dynamic image identifying system and method for shared parameter CNN in a kind of layer |
CN112016548B (en) * | 2020-10-15 | 2021-02-09 | 腾讯科技(深圳)有限公司 | Cover picture display method and related device |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102176001B (en) * | 2011-02-10 | 2013-05-08 | 哈尔滨工程大学 | Permeable band ratio factor-based water depth inversion method |
JP6135283B2 (en) * | 2013-04-26 | 2017-05-31 | オムロン株式会社 | Image processing apparatus, image processing method, program, and recording medium |
CN107533801A (en) * | 2013-11-01 | 2018-01-02 | 国际智能技术公司 | Use the ground mapping technology of mapping vehicle |
CN104298976B (en) * | 2014-10-16 | 2017-09-26 | 电子科技大学 | Detection method of license plate based on convolutional neural networks |
WO2016197303A1 (en) * | 2015-06-08 | 2016-12-15 | Microsoft Technology Licensing, Llc. | Image semantic segmentation |
CN104933691B (en) * | 2015-06-25 | 2019-02-12 | 中国计量学院 | Image interfusion method based on the detection of phase spectrum vision significance |
CN105701508B (en) * | 2016-01-12 | 2017-12-15 | 西安交通大学 | Global local optimum model and conspicuousness detection algorithm based on multistage convolutional neural networks |
CN106157319B (en) * | 2016-07-28 | 2018-11-02 | 哈尔滨工业大学 | The conspicuousness detection method in region and Pixel-level fusion based on convolutional neural networks |
CN106447658B (en) * | 2016-09-26 | 2019-06-21 | 西北工业大学 | Conspicuousness object detection method based on global and local convolutional network |
CN106446914A (en) * | 2016-09-28 | 2017-02-22 | 天津工业大学 | Road detection based on superpixels and convolution neural network |
EP3151164A3 (en) * | 2016-12-26 | 2017-04-12 | Argosai Teknoloji Anonim Sirketi | A method for foreign object debris detection |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |