CN111027576A - Co-saliency detection method based on a co-saliency generative adversarial network

Co-saliency detection method based on a co-saliency generative adversarial network

Info

Publication number
CN111027576A
CN111027576A
Authority
CN
China
Prior art keywords
generator
saliency
co-saliency
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911368623.3A
Other languages
Chinese (zh)
Other versions
CN111027576B (en)
Inventor
钱晓亮
白臻
任航丽
曾黎
邢培旭
程塨
姚西文
刘向龙
岳伟超
王芳
刘玉翠
赵素娜
王慰
毋媛媛
吴青娥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Light Industry
Original Assignee
Zhengzhou University of Light Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University of Light Industry
Priority to CN201911368623.3A
Publication of CN111027576A
Application granted
Publication of CN111027576B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention provides a co-saliency detection method based on a co-saliency generative adversarial network, comprising the following steps: constructing a co-saliency generative adversarial network model; training the model in two stages: in the first training stage, the network is pre-trained on a labeled salient-object database, after which it is able to detect the saliency of a single image; in the second training stage, starting from the model parameters trained in the first stage, the network is trained on labeled co-saliency image groups whose co-salient objects belong to the same category, after which the trained model can be used directly for category-level co-saliency detection. The invention offers a simple training process, high detection efficiency, strong generality and high accuracy.

Description

Co-saliency detection method based on a co-saliency generative adversarial network
Technical Field
The invention relates to the technical field of computer vision and machine learning, and in particular to a co-saliency detection method based on a co-saliency generative adversarial network.
Background
With the advent of the big-data era, websites and mobile storage devices supply all kinds of information resources, and digital images and videos pervade our lives; it is therefore necessary to endow computers with the ability to quickly and accurately acquire and retain the most useful information. Co-saliency detection, inspired by the biological visual attention mechanism of humans, extracts the objects that are salient in common across multiple related scene images or video frames, automatically filtering out redundancy and noise in the images and reducing the time and space complexity of subsequent algorithms, so that computing resources can be allocated preferentially and the execution efficiency of downstream image tasks is improved.
Existing co-saliency detection methods are of many kinds; extracting the saliency features of a single image and capturing similarity cues across multiple images are the key sub-tasks involved. With the development of deep learning, existing co-saliency methods can be divided into two categories according to whether they employ deep learning. Non-deep-learning methods typically rely on hand-crafted features and manually defined similarity measures, so the extracted features and similarity information limit detection performance and strongly affect detection accuracy. Deep-learning-based methods, by contrast, extract more effective information with deep models and greatly improve co-saliency detection performance. However, the amount of labeled co-saliency data available is limited, which constrains the application of deep learning to this task.
Disclosure of Invention
Aiming at the technical problem that existing deep-learning-based co-saliency detection methods are limited by an insufficient amount of training data, which strongly affects detection accuracy, the invention provides a co-saliency detection method based on a co-saliency generative adversarial network; it effectively exploits the saliency of single images and the intrinsic correlation among images of the same category to detect the co-saliency of same-category images, with a simple training and detection procedure and high detection efficiency.
In order to achieve this purpose, the technical scheme of the invention is realized as follows: a co-saliency detection method based on a co-saliency generative adversarial network comprises the following steps:
Step one: construct a co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the co-saliency generative adversarial network model;
Step two: train the co-saliency generative adversarial network model in two stages: in the first training stage, pre-train the network on a labeled salient-object database; in the second training stage, starting from the model parameters trained in the first stage, train the network on co-saliency image groups whose co-salient objects belong to the same category;
Step three: category-level co-saliency detection: use the generator of the co-saliency generative adversarial network model trained in step two as a category co-saliency detector, take images belonging to the same category as its input, and directly output the co-saliency maps corresponding to the same-category images end to end.
In step one, the generator adopts a U-Net structure and is a fully convolutional network in which the kernel sizes, strides and padding values of the convolutional and deconvolutional layers are arranged symmetrically, and Dropout is applied to the last three convolutional layers and the first three deconvolutional layers; the discriminator is also a fully convolutional network that outputs a two-dimensional probability matrix after multiple convolutional layers and performs patch-level real/fake discrimination of its input image according to this matrix; the generator learns the mapping between the original image and the co-saliency ground-truth map in order to generate a co-saliency map, and the discriminator distinguishes the co-saliency map generated by the generator from the ground-truth map.
In both the first and second training stages of step two: when the generator is trained, the network parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised, and the parameters of the generator are updated; when the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises the probability of judging real samples to be real and lowers the probability of judging generated fake samples to be real, and the parameters of the discriminator are updated.
In the first training stage and the second training stage, the loss function of the generator is:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
where LG1 is the adversarial loss of the generator, LG2 is the pixel loss of the generator, and λ is a coefficient that adjusts the loss weight; θG is the network model parameter of the generator. The adversarial loss LG1 of the generator is:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss LG2 of the generator is:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding salient-object ground-truth map, G(·) denotes the pseudo saliency map generated by the generator, D(·,·) denotes the two-dimensional probability matrix output by the discriminator, and Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·); the function BCE(·,·) computes the cross-entropy loss between the two-dimensional probability matrix D(·,·) and the matrix Areal, its expression being given in formula (4), where x and y are its arguments; LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image.
The loss function of the discriminator is expressed as:
LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
where θD denotes the network model parameters of the discriminator, and Afake is a two-dimensional matrix of all zeros whose size matches that of the probability matrix D(·,·).
In the first training stage, a labeled salient-object database is used as the training sample set to train the co-saliency generative adversarial network, which automatically learns the mapping between the original images and the salient-object ground-truth maps; the specific implementation is as follows:
images with pixel-level labels from the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data; all original images and the corresponding salient-object ground-truth maps are resized to the input size of the generator; the original images are input to the generator to obtain pseudo saliency maps of the same size, which are compared with the salient-object ground-truth maps at the pixel level; each pseudo saliency map is concatenated with its original image along the channel dimension as a fake sample, and each salient-object ground-truth map is concatenated with its original image as a real sample, and both are fed to the discriminator; the Adam optimization algorithm is used to iteratively update the generator network model parameters θG and the discriminator network model parameters θD by minimizing the loss functions.
In step two, the image groups are divided according to the category of the common salient object contained in the images, and on the basis of the model parameters trained in the first stage, for a given category the corresponding image group is used for second-stage training of the co-saliency generative adversarial network, so that it learns the mapping between the original images and the co-saliency ground-truth maps; the specific implementation is as follows:
three public co-saliency detection databases grouped by salient-object category, CoSal2015, iCoseg and MSRC-A, are adopted; all original images and the corresponding co-saliency ground-truth maps are resized to the input size of the generator; 50% of the image data in each group is directly selected at random as category training samples to further train the co-saliency generative adversarial network trained in the first stage, so that the generator automatically learns to extract the common saliency information within the category samples, yielding a co-saliency detection model for a single image category.
If an image belongs to one of the training sample categories used in the second-stage training, the image is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image is resized to the input size of the generator, and the image output by the generator is the co-saliency map of the input image:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
Compared with the prior art, the invention has the following beneficial effects. Based on the co-saliency generative adversarial network, a two-stage training mechanism is adopted for co-saliency detection, whose advantages are: 1) using a salient-object database as training data alleviates the series of training problems that can arise from an insufficient amount of co-saliency data; 2) with the two-stage training mechanism, the salient-object data serve as first-stage training data to ensure that the network trained in this stage has single-image saliency detection capability; then, exploiting the memory effect of the network while it retains this capability, the co-saliency image groups serve as second-stage training data so that the network also acquires the ability to capture correlation information among images of the same category; fusing the single-image saliency information with the intra-group correlation information finally gives the network category-level co-saliency detection capability. Experiments show that the method has a simple training and detection procedure and high detection efficiency, significantly improves the generality and accuracy of co-saliency detection, and is of great value for the rapid acquisition of key information.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a comparison of the subjective results of the present invention and existing algorithms on the CoSal2015 database.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art without creative effort on the basis of the embodiments of the present invention fall within the scope of the present invention.
As shown in fig. 1, a co-saliency detection method based on a co-saliency generative adversarial network uses the trained co-saliency generative adversarial network to realize co-saliency detection and comprises the following steps:
Step one: construct the co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the model.
The co-saliency generative adversarial network model comprises a generator network model and a discriminator network model. A suitable generator network model is designed to learn the mapping between the original image and the co-saliency map, so that the co-saliency map generated by the generator is as close as possible to the co-saliency ground-truth map. A suitable discriminator network model is designed to distinguish the co-saliency map generated by the generator from the co-saliency ground-truth map as well as possible, in order to assist in training the generator.
The generator network model adopts a U-Net structure overall and is a fully convolutional network comprising 8 convolutional layers and 8 deconvolutional layers arranged symmetrically, which ensures that the image output by the generator has the same size as the input image. The kernel sizes, strides and padding values of the convolutional and deconvolutional layers are set symmetrically, and Dropout with rate 0.5 is added to the last 3 convolutional layers and the first 3 deconvolutional layers, i.e., 50% of the activations of the preceding layer are randomly set to zero, which effectively prevents overfitting. In the generator network architecture, short (skip) connections are added on top of the symmetric encoder-decoder structure, concatenating feature maps of the same size, following the U-Net design (Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proc. Med. Image Comput. Comput.-Assist. Interv., 2015, pp. 234-241). The purpose of the concatenation is to ensure that the generated image retains detail information such as object edges. The network structure of the generator G is shown in table 1.
Table 1 Network architecture of the generator
(The generator architecture table is provided as an image in the original publication.)
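Since that table is available only as an image, the following PyTorch sketch reconstructs a generator matching the textual description above: 8 convolutional and 8 deconvolutional layers with symmetric kernel size/stride/padding (4/2/1), U-Net skip connections, and Dropout(0.5) on the last 3 convolutional and first 3 deconvolutional layers. The channel widths, normalization placement and Tanh output are assumptions in the style of pix2pix, not values taken from the patent's table.

```python
import torch
import torch.nn as nn

class UNetGenerator(nn.Module):
    """Sketch of the generator G: 8 conv + 8 deconv layers, kernel 4,
    stride 2, padding 1, skip connections, and Dropout(0.5) on the last 3
    conv and first 3 deconv layers, as described in the text."""
    def __init__(self):
        super().__init__()
        widths = [64, 128, 256, 512, 512, 512, 512, 512]  # assumed channel widths
        self.encoders = nn.ModuleList()
        in_ch = 3
        for i, out_ch in enumerate(widths):
            block = [nn.Conv2d(in_ch, out_ch, 4, 2, 1)]
            if 0 < i < 7:
                block.append(nn.BatchNorm2d(out_ch))
            block.append(nn.LeakyReLU(0.2, inplace=True))
            if i >= 5:                          # Dropout on the last 3 conv layers
                block.append(nn.Dropout(0.5))
            self.encoders.append(nn.Sequential(*block))
            in_ch = out_ch
        self.decoders = nn.ModuleList()
        for i in range(8):
            in_ch = widths[7 - i] * (1 if i == 0 else 2)   # skip concatenation
            out_ch = widths[6 - i] if i < 7 else 3          # 256x256x3 output
            block = [nn.ConvTranspose2d(in_ch, out_ch, 4, 2, 1)]
            if i < 7:
                block += [nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
                if i < 3:                       # Dropout on the first 3 deconv layers
                    block.append(nn.Dropout(0.5))
            self.decoders.append(nn.Sequential(*block))
        self.final = nn.Tanh()                  # output range is an assumption

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
        for i, dec in enumerate(self.decoders):
            if i > 0:
                # concatenate the same-size encoder feature map (skip connection)
                x = torch.cat([x, skips[7 - i]], dim=1)
            x = dec(x)
        return self.final(x)
```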
The discriminator network module adopts an encoder structure and is likewise a fully convolutional network; the network structure of the discriminator D is shown in table 2. The output of the discriminator is a 28 × 28 two-dimensional matrix, in which each element is the probability that the corresponding patch of the input image is judged to be real, i.e., the input image is discriminated at the patch level; see the literature (Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, et al. Image-to-Image Translation with Conditional Adversarial Networks. arXiv preprint arXiv:1611.07004).
Table 2 Network architecture of the discriminator
Network layer Input Kernel size Stride Padding Output
Convolutional layer 1 256×256×6 4 2 1 128×128×64
Convolutional layer 2 128×128×64 4 2 1 64×64×128
Convolutional layer 3 64×64×128 4 2 1 32×32×256
Convolutional layer 4 32×32×256 3 1 0 30×30×512
Convolutional layer 5 30×30×512 3 1 0 28×28×1
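Table 2 fixes the discriminator's convolutional stack completely, so a faithful sketch is possible; only the activation placement (LeakyReLU/BatchNorm) and the final Sigmoid are assumptions, since the table lists the convolutions alone. The 6-channel input is the channel-wise concatenation of an RGB image with a (3-channel) saliency map:

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Discriminator D per Table 2: maps a 6-channel 256x256 input to a
    28x28 matrix of patch-level probabilities (patch-level discrimination)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 4, 2, 1),                            # 128x128x64
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128),     # 64x64x128
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256),    # 32x32x256
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, 3, 1, 0), nn.BatchNorm2d(512),    # 30x30x512
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, 3, 1, 0),                           # 28x28x1
            nn.Sigmoid(),                                         # patch probabilities
        )

    def forward(self, image, saliency_map):
        # channel-wise concatenation of the image and the saliency map (6 channels)
        return self.net(torch.cat([saliency_map, image], dim=1))
```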
Step two: carrying out two-stage training on the cooperative significance generation type confrontation network model: in the first training stage, a marked significant target database with abundant data volume is adopted to pre-train the cooperative significance generating type countermeasure network, and in the second training stage, based on model parameters trained in the first stage, the cooperative significance data group with cooperative significant targets belonging to the same category is adopted to perform category cooperative significance detection training on the cooperative significance generating type countermeasure network.
The invention designs a two-stage training mechanism, wherein the first training stage is used for pre-training to enable a generator in the cooperative significance generation type countermeasure network to have primary single-image significant target detection capability, and the second training stage is used for further enabling the generator to have the cooperative significant target detection capability of a plurality of images of the type.
During the two-stage training process, the loss function of the generator is expressed as:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
The loss function of the generator consists of an adversarial loss and a pixel loss: LG1 and LG2 denote the adversarial loss and the pixel loss of the generator in the two-stage training, respectively, and λ is a coefficient that adjusts the loss weight; the invention sets λ to 100, and θG are the network model parameters of the generator G. When the generator is trained, the parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised as much as possible, and the parameters of the generator are updated. The adversarial loss of the generator is expressed as:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss of the generator is expressed as:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding saliency ground-truth map; in the first stage Im and Sm come from the salient-object databases required for training, and in the second stage Im and Sm come from the co-saliency databases required for training. G(·) denotes the pseudo saliency map generated by the generator; the original image Im is concatenated with G(·) as the input of the discriminator, and D(·,·) denotes the probability matrix output by the discriminator. The parameters θD behind the probability matrix D(·,·) are the weights and biases of all neurons of the convolutional layers in the trained discriminator network, determined by the network structure, and the output of the discriminator is a two-dimensional probability matrix. The image input to the discriminator has size 256 × 256 × 3 and the output is a 28 × 28 two-dimensional probability matrix, so each element of the matrix is a probability value in [0,1] with which an image patch of about 9 × 9 pixels at the corresponding position is judged to be real by the discriminator. Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·), in which each element corresponds to an image patch of the input sample, so the probability of the corresponding patch being judged real by the discriminator is everywhere 1. The function BCE(·,·) computes the cross-entropy loss between the probability matrix D(·,·) and the matrix Areal; its expression is given in formula (4), where x and y are its arguments. LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image.
During the two-stage training process, the loss function of the discriminator is expressed as:
LD(θD) = BCE(D(Sm, Im; θD), Areal) + BCE(D(G(Im; θG), Im; θD), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
In formula (6), θD denotes the network model parameters of the discriminator, Areal is a two-dimensional matrix whose elements are all 1, and Afake is a two-dimensional matrix whose elements are all 0, its size matching that of the probability matrix D(·,·). When the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises as far as possible the probability of judging a real sample to be real and lowers the probability of judging a generated fake sample to be real, thereby updating the parameters of the discriminator.
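A minimal PyTorch sketch of the losses in formulas (1)-(6), assuming a generator `G(image)` and a discriminator `D(image, map)` that returns the 28 × 28 probability matrix, as in the sketches above:

```python
import torch
import torch.nn.functional as F

LAMBDA = 100  # λ in formula (1); the patent sets λ = 100

def generator_loss(D, G, image, gt_map):
    """LG = LG1 + λ·LG2, formulas (1), (3)-(5): adversarial BCE against the
    all-ones matrix Areal plus the 1-norm pixel loss against the ground truth."""
    fake = G(image)
    pred = D(image, fake)                        # 28x28 patch probability matrix
    a_real = torch.ones_like(pred)               # Areal: all elements are 1
    lg1 = F.binary_cross_entropy(pred, a_real)   # adversarial loss LG1
    lg2 = F.l1_loss(fake, gt_map)                # pixel loss LG2 = ||Sm - G(Im)||1
    return lg1 + LAMBDA * lg2

def discriminator_loss(D, G, image, gt_map):
    """LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake), formula (6);
    the generator output is detached so that only θD is updated."""
    pred_real = D(image, gt_map)
    pred_fake = D(image, G(image).detach())      # generator parameters fixed
    a_real = torch.ones_like(pred_real)          # Areal
    a_fake = torch.zeros_like(pred_fake)         # Afake: all elements are 0
    return (F.binary_cross_entropy(pred_real, a_real)
            + F.binary_cross_entropy(pred_fake, a_fake))
```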
In the first training stage, a labeled salient-object database with an abundant amount of data is used as the sample set for the first-stage training of the co-saliency generative adversarial network, so that the model learns the mapping between original images and salient-object ground-truth maps and acquires salient-object detection capability for single images. On the basis of the model parameters trained in the first stage, the model inherits this single-image saliency detection capability; the image groups are divided according to the category of the common salient object contained in the images, and for a given category the corresponding image group is used for the second-stage training of the co-saliency generative adversarial network. The network trained in this way acquires the ability to detect the object salient in common across multiple images.
First-stage training: the 21517 images with pixel-level labels in the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data. All original images and the corresponding salient-object ground-truth maps are resized to 256 × 256 × 3, and the original images are input to the generator. The generated image has size 256 × 256 × 3 and is compared with the salient-object ground-truth map at the pixel level; the generated image is concatenated with the original image along the channel dimension as a fake sample, and the ground-truth map is concatenated with the original image as a real sample, and both are fed to the discriminator. The Adam optimization algorithm is used to iteratively update the model parameters, with Batchsize, learning rate, Dropout rate and Epoch set to 1, 0.0002, 0.5 and 100, respectively. Batchsize is the number of samples used for one update of the network parameters during training, the learning rate is the magnitude of each parameter update during training, and Epoch is the number of passes over all training samples. Dropout is added only to convolutional layers 6-8 and deconvolutional layers 1-3, which gives the generator network structure a certain robustness.
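The alternating first-stage optimization just described can be sketched as follows, reusing the sketches above; the stand-in `dataset` replaces a real loader over PASCAL-1500, HKU-IS and DUTS, and the Adam settings beyond the stated learning rate of 0.0002 are PyTorch defaults rather than values from the patent:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the real data: (image, ground-truth map) pairs
# already resized to 256x256.
dataset = TensorDataset(torch.rand(8, 3, 256, 256), torch.rand(8, 3, 256, 256))
loader = DataLoader(dataset, batch_size=1)          # Batchsize = 1

G, D = UNetGenerator(), PatchDiscriminator()        # sketches from step one
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)   # learning rate 0.0002
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for epoch in range(100):                            # Epoch = 100 in stage one
    for image, gt_map in loader:
        # update D with G fixed: push real samples toward Areal, fakes toward Afake
        opt_d.zero_grad()
        discriminator_loss(D, G, image, gt_map).backward()
        opt_d.step()
        # update G with D fixed: raise the probability that fakes are judged real
        opt_g.zero_grad()
        generator_loss(D, G, image, gt_map).backward()
        opt_g.step()
```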
Second-stage training: on the basis of the co-saliency generative adversarial network model parameters trained in the first stage, i.e., with the first-stage parameters as the initial parameters of the second-stage training, the model is trained to acquire category-level co-saliency detection capability. Three public co-saliency detection databases in total, CoSal2015, iCoseg and MSRC-A, are used for model training. Before training, all original images and the corresponding salient-object ground-truth maps are resized to 256 × 256, and the images in the 3 databases are grouped according to whether their co-salient objects belong to the same category; 50% of the data in each group is then directly selected at random for the second-stage training. In the course of training on the samples of one category, the model is driven to learn and extract the common saliency information within the category samples, yielding a co-saliency detection model for that single image category. The training uses the same parameter settings as the first stage, except that Epoch is 400.
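The second stage can be sketched as per-category fine-tuning from the stage-one checkpoint; the checkpoint file names and the `category_loaders` mapping are hypothetical:

```python
import torch

# One co-saliency model per category, each fine-tuned from the stage-one
# parameters. `category_loaders` maps a category name to a loader over the
# 50% of that group selected for training.
for category, loader in category_loaders.items():
    G.load_state_dict(torch.load("stage1_G.pth"))   # inherit stage-one parameters
    D.load_state_dict(torch.load("stage1_D.pth"))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    for epoch in range(400):                        # Epoch = 400 in stage two
        for image, gt_map in loader:                # Batchsize = 1, as in stage one
            opt_d.zero_grad()
            discriminator_loss(D, G, image, gt_map).backward()
            opt_d.step()
            opt_g.zero_grad()
            generator_loss(D, G, image, gt_map).backward()
            opt_g.step()
    torch.save(G.state_dict(), f"cosal_G_{category}.pth")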
Step three: category-level co-saliency detection: the generator of the co-saliency generative adversarial network model trained in step two is used as a category co-saliency detector; an image belonging to the same category as the second-stage training samples is taken as its input, and the corresponding co-saliency map is output directly end to end.
The generator in the two-stage-trained co-saliency generative adversarial network model is used directly as the category co-saliency detector. If an image belongs to one of the training sample categories used in the second-stage training, it is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image size is unified to 256 × 256 × 3, and the image output by the generator is the co-saliency map of the input image, also of size 256 × 256 × 3. The image generated by the generator is used directly as the co-saliency map, as shown in the following formula:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
At this point, the co-saliency detection of a group of target images containing the same category is completed, i.e., image co-saliency detection is completed.
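For completeness, a minimal inference sketch corresponding to formula (8); apart from the 256 × 256 input size, the preprocessing details are assumptions:

```python
import torch
from PIL import Image
import torchvision.transforms.functional as TF

def detect_cosaliency(generator, image_path):
    """CoSm = G*(Im): resize the image to the 256x256 input size of the
    generator, run one forward pass, and return the co-saliency map."""
    img = Image.open(image_path).convert("RGB")
    x = TF.to_tensor(TF.resize(img, [256, 256])).unsqueeze(0)  # 1x3x256x256
    generator.eval()
    with torch.no_grad():
        cos_map = generator(x)
    return cos_map.squeeze(0)  # 256x256 co-saliency map (3 channels)

# Usage: the stage-two generator of the matching category acts as the
# detector; "cosal_G_frog.pth" and the image name are hypothetical.
# G.load_state_dict(torch.load("cosal_G_frog.pth"))
# cos_map = detect_cosaliency(G, "frog_01.jpg")
```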
The experiments of the invention were run on a workstation with an Intel(R) Xeon E5-2650 v3 @ 2.30GHz × 20 CPU, an NVIDIA GTX TITAN-XP GPU and 128 GB of memory; the software environment is Ubuntu 16.04 with the deep learning framework PyTorch 1.0.
In order to verify the detection performance and efficiency of the invention, the per-image detection time and the subjective results of the method were compared with those of 6 co-saliency detection methods on the CoSal2015 database. On the CoSal2015 database, under the same hardware environment, the detection times of the publicly available implementations of SACS-R, SACS, CBCS and ESMG were compared with the invention, as shown in table 3.
Table 3 Comparison of per-image detection time with existing algorithms on the CoSal2015 database
Algorithm SACS-R SACS CBCS ESMG The invention
Code type MATLAB MATLAB MATLAB MATLAB Python
Detection time 8.873 seconds 2.652 seconds 1.688 seconds 1.723 seconds 0.562 seconds
The methods involved in the subjective comparison include LDAW from the literature (D. Zhang, J. Han, C. Li, et al. Co-saliency Detection via Looking Deep and Wide. In Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 2994-3002), SACS-R and SACS from the literature (X. Cao, Z. Tao, B. Zhang, et al. Self-adaptively Weighted Co-saliency Detection via Rank Constraint. IEEE Trans. Image Process., vol. 23, no. 9, pp. 4175-4186, 2014), as well as the CBCS and ESMG methods listed in table 3. The subjective comparison results for the starfish, frog and rubber groups of the CoSal2015 database are shown in fig. 2.
As can be seen from fig. 2 and table 3, the invention takes the shortest time to detect the co-saliency of one image, i.e., it is the most efficient, and compared with the other existing methods the co-saliency maps obtained by the invention are closest to the ground-truth maps.
The method of the invention comprises the construction of a co-saliency generative adversarial network model and two-stage model training. In the first training stage, a labeled salient-object dataset with an abundant amount of data is used as the training dataset, alleviating the training problems caused by the insufficient amount of labeled data in the field of co-saliency while giving the trained network single-image saliency detection capability; the generator trained in the two stages is then used as a category co-saliency detector that outputs the co-saliency maps of category images end to end. The invention effectively exploits the memory effect of the network, the saliency of single images and the intrinsic correlation within same-category image groups to perform co-saliency detection across multiple images, with a simple training and detection procedure and high detection efficiency.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention; any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (7)

1. A co-saliency detection method based on a co-saliency generative adversarial network, characterized by comprising the following steps:
Step one: construct a co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the co-saliency generative adversarial network model;
Step two: train the co-saliency generative adversarial network model in two stages: in the first training stage, pre-train the network on a labeled salient-object database; in the second training stage, starting from the model parameters trained in the first stage, train the network on co-saliency image groups whose co-salient objects belong to the same category;
Step three: category-level co-saliency detection: use the generator of the co-saliency generative adversarial network model trained in step two as a category co-saliency detector, take images belonging to the same category as its input, and directly output the co-saliency maps corresponding to the same-category images end to end.
2. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 1, characterized in that the generator in step one adopts a U-Net structure and is a fully convolutional network in which the kernel sizes, strides and padding values of the convolutional and deconvolutional layers are arranged symmetrically, and Dropout is applied to the last three convolutional layers and the first three deconvolutional layers; the discriminator is also a fully convolutional network that outputs a two-dimensional probability matrix after multiple convolutional layers and performs patch-level real/fake discrimination of its input image according to this matrix; the generator learns the mapping between the original image and the co-saliency ground-truth map in order to generate a co-saliency map, and the discriminator distinguishes the co-saliency map generated by the generator from the ground-truth map.
3. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 1 or 2, characterized in that in the first and second training stages of step two: when the generator is trained, the network parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised, and the parameters of the generator are updated; when the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises the probability of judging real samples to be real and lowers the probability of judging generated fake samples to be real, and the parameters of the discriminator are updated.
4. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 3, characterized in that in the first training stage and the second training stage the loss function of the generator is:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
where LG1 is the adversarial loss of the generator, LG2 is the pixel loss of the generator, and λ is a coefficient that adjusts the loss weight; θG is the network model parameter of the generator; the adversarial loss LG1 of the generator is:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss LG2 of the generator is:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding salient-object ground-truth map, G(·) denotes the pseudo saliency map generated by the generator, D(·,·) denotes the two-dimensional probability matrix output by the discriminator, and Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·); the function BCE(·,·) computes the cross-entropy loss between the two-dimensional probability matrix D(·,·) and the matrix Areal, its expression being given in formula (4), where x and y are its arguments; LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image;
the loss function of the discriminator is expressed as:
LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
where θD denotes the network model parameters of the discriminator, and Afake is a two-dimensional matrix of all zeros whose size matches that of the probability matrix D(·,·).
5. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 4, characterized in that in the first training stage the labeled salient-object database is used as the training sample set to train the co-saliency generative adversarial network, which automatically learns the mapping between the original images and the salient-object ground-truth maps; the specific implementation is as follows:
images with pixel-level labels from the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data; all original images and the corresponding salient-object ground-truth maps are resized to the input size of the generator; the original images are input to the generator to obtain pseudo saliency maps of the same size, which are compared with the salient-object ground-truth maps at the pixel level; each pseudo saliency map is concatenated with its original image along the channel dimension as a fake sample, and each salient-object ground-truth map is concatenated with its original image as a real sample, and both are fed to the discriminator; the Adam optimization algorithm is used to iteratively update the generator network model parameters θG and the discriminator network model parameters θD by minimizing the loss functions.
6. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 5, characterized in that in step two the image groups are divided according to the category of the common salient object contained in the images, and on the basis of the model parameters trained in the first stage, for a given category the corresponding image group is used for second-stage training of the co-saliency generative adversarial network, so that it learns the mapping between the original images and the co-saliency ground-truth maps; the specific implementation is as follows:
three public co-saliency detection databases grouped by salient-object category, CoSal2015, iCoseg and MSRC-A, are adopted; all original images and the corresponding co-saliency ground-truth maps are resized to the input size of the generator; 50% of the image data in each group is directly selected at random as category training samples to further train the co-saliency generative adversarial network trained in the first stage, so that the generator automatically learns to extract the common saliency information within the category samples, yielding a co-saliency detection model for a single image category.
7. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 5 or 6, characterized in that if an image belongs to one of the training sample categories used in the second-stage training, the image is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image is resized to the input size of the generator, and the image output by the generator is the co-saliency map of the input image:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
CN201911368623.3A 2019-12-26 2019-12-26 Co-saliency detection method based on a co-saliency generative adversarial network Active CN111027576B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911368623.3A CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911368623.3A CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network

Publications (2)

Publication Number Publication Date
CN111027576A (en) 2020-04-17
CN111027576B (en) 2020-10-30

Family

ID=70213922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911368623.3A Active Co-saliency detection method based on a co-saliency generative adversarial network

Country Status (1)

Country Link
CN (1) CN111027576B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845471A (en) * 2017-02-20 2017-06-13 深圳市唯特视科技有限公司 A kind of vision significance Forecasting Methodology based on generation confrontation network
CN107123150A (en) * 2017-03-25 2017-09-01 复旦大学 The method of global color Contrast Detection and segmentation notable figure
CN107346436A (en) * 2017-06-29 2017-11-14 北京以萨技术股份有限公司 A kind of vision significance detection method of fused images classification
CN109711283A (en) * 2018-12-10 2019-05-03 广东工业大学 A kind of joint doubledictionary and error matrix block Expression Recognition algorithm
CN110110576A (en) * 2019-01-03 2019-08-09 北京航空航天大学 A kind of traffic scene thermal infrared semanteme generation method based on twin semantic network
CN109727264A (en) * 2019-01-10 2019-05-07 南京旷云科技有限公司 Image generating method, the training method of neural network, device and electronic equipment
CN110310343A (en) * 2019-05-28 2019-10-08 西安万像电子科技有限公司 Image processing method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LINA WEI et al.: "Group-wise deep co-saliency detection", 《ARXIV》 *
PHILLIP ISOLA et al.: "Image-to-Image Translation with Conditional Adversarial Networks", 《ARXIV》 *
XIAOLIANG QIAN et al.: "Hardness Recognition of Robotic Forearm Based on Semi-supervised Generative Adversarial Networks", 《FRONTIERS IN NEUROROBOTICS》 *
LI JIANWEI et al.: "Video salient object detection based on conditional generative adversarial networks" (基于条件生成对抗网络的视频显著性目标检测), 《Transducer and Microsystem Technologies (传感器与微系统)》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898507A (en) * 2020-07-22 2020-11-06 武汉大学 Deep learning method for predicting earth surface coverage category of label-free remote sensing image
CN112507933A (en) * 2020-12-16 2021-03-16 南开大学 Saliency target detection method and system based on centralized information interaction
CN112507933B (en) * 2020-12-16 2022-09-16 南开大学 Saliency target detection method and system based on centralized information interaction
CN112598043A (en) * 2020-12-17 2021-04-02 杭州电子科技大学 Cooperative significance detection method based on weak supervised learning
CN112598043B (en) * 2020-12-17 2023-08-18 杭州电子科技大学 Collaborative saliency detection method based on weak supervised learning
CN112651940A (en) * 2020-12-25 2021-04-13 郑州轻工业大学 Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112651940B (en) * 2020-12-25 2021-09-17 郑州轻工业大学 Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN114743027A (en) * 2022-04-11 2022-07-12 郑州轻工业大学 Weak supervision learning-guided cooperative significance detection method
CN114743027B (en) * 2022-04-11 2023-01-31 郑州轻工业大学 Weak supervision learning-guided cooperative significance detection method
CN116109496A (en) * 2022-11-15 2023-05-12 济南大学 X-ray film enhancement method and system based on double-flow structure protection network
CN116994006A (en) * 2023-09-27 2023-11-03 江苏源驶科技有限公司 Collaborative saliency detection method and system for fusing image saliency information
CN116994006B (en) * 2023-09-27 2023-12-08 江苏源驶科技有限公司 Collaborative saliency detection method and system for fusing image saliency information

Also Published As

Publication number Publication date
CN111027576B (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network
AU2019200270B2 (en) Concept mask: large-scale segmentation from semantic concepts
US10402448B2 (en) Image retrieval with deep local feature descriptors and attention-based keypoint descriptors
Gao et al. Change detection from synthetic aperture radar images based on channel weighting-based deep cascade network
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
CN110210513B (en) Data classification method and device and terminal equipment
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN110503076B (en) Video classification method, device, equipment and medium based on artificial intelligence
CN109993102B (en) Similar face retrieval method, device and storage medium
JP2018524678A (en) Business discovery from images
CN115937655B (en) Multi-order feature interaction target detection model, construction method, device and application thereof
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112149526B (en) Lane line detection method and system based on long-distance information fusion
GB2579262A (en) Space-time memory network for locating target object in video content
CN113569607A (en) Motion recognition method, motion recognition device, motion recognition equipment and storage medium
CN114692750A (en) Fine-grained image classification method and device, electronic equipment and storage medium
Fan et al. A novel sonar target detection and classification algorithm
WO2024027347A1 (en) Content recognition method and apparatus, device, storage medium, and computer program product
CN115066687A (en) Radioactivity data generation
Abdelaziz et al. Few-shot learning with saliency maps as additional visual information
CN116543250A (en) Model compression method based on class attention transmission
CN117011219A (en) Method, apparatus, device, storage medium and program product for detecting quality of article
Lu et al. A Traffic Sign Detection Network Based on PosNeg-Balanced Anchors and Domain Adaptation
CN114387489A (en) Power equipment identification method and device and terminal equipment
CN113343953A (en) FGR-AM method and system for remote sensing scene recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant