CN111027576A - Co-saliency detection method based on a co-saliency generative adversarial network

Co-saliency detection method based on a co-saliency generative adversarial network

Info

Publication number
CN111027576A
CN111027576A
Authority
CN
China
Prior art keywords
generator
saliency
co-saliency
training
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911368623.3A
Other languages
Chinese (zh)
Other versions
CN111027576B (en)
Inventor
钱晓亮
白臻
任航丽
曾黎
邢培旭
程塨
姚西文
刘向龙
岳伟超
王芳
刘玉翠
赵素娜
王慰
毋媛媛
吴青娥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Light Industry
Original Assignee
Zhengzhou University of Light Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University of Light Industry
Priority to CN201911368623.3A
Publication of CN111027576A
Application granted
Publication of CN111027576B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention provides a co-saliency detection method based on a co-saliency generative adversarial network, comprising the following steps: constructing a co-saliency generative adversarial network model; training the model in two stages: in the first training stage, the network is pre-trained on a labeled salient-object database, after which it is able to detect the saliency of a single image; in the second training stage, starting from the model parameters trained in the first stage, the network is trained on labeled co-saliency image groups whose co-salient objects belong to the same category, after which the trained model can be used directly for category-level co-saliency detection. The invention offers a simple training process, high detection efficiency, strong generality and high accuracy.

Description

Co-saliency detection method based on a co-saliency generative adversarial network
Technical Field
The invention relates to the technical field of computer vision and machine learning, and in particular to a co-saliency detection method based on a co-saliency generative adversarial network.
Background
With the advent of the big-data era, websites and mobile storage devices supply all kinds of information resources, and digital images and videos pervade our lives; it is therefore necessary to endow computers with the ability to quickly and accurately acquire and retain the most useful information. Co-saliency detection, inspired by the biological visual attention mechanism of humans, extracts the objects that are salient in common across multiple related scene images or video frames, automatically filtering out redundancy and noise in the images and reducing the time and space complexity of subsequent algorithms, so that computing resources can be allocated preferentially and the execution efficiency of downstream image tasks is improved.
Existing co-saliency detection methods are of many kinds; extracting the saliency features of a single image and capturing similarity cues across multiple images are the key sub-tasks involved. With the development of deep learning, existing co-saliency methods can be divided into two categories according to whether they employ deep learning. Non-deep-learning methods typically rely on hand-crafted features and manually defined similarity measures, so the extracted features and similarity information limit detection performance and strongly affect detection accuracy. Deep-learning-based methods, by contrast, extract more effective information with deep models and greatly improve co-saliency detection performance. However, the amount of labeled co-saliency data available is limited, which constrains the application of deep learning to this task.
Disclosure of Invention
Aiming at the technical problem that existing deep-learning-based co-saliency detection methods are limited by an insufficient amount of training data, which strongly affects detection accuracy, the invention provides a co-saliency detection method based on a co-saliency generative adversarial network; it effectively exploits the saliency of single images and the intrinsic correlation among images of the same category to detect the co-saliency of same-category images, with a simple training and detection procedure and high detection efficiency.
In order to achieve this purpose, the technical scheme of the invention is realized as follows: a co-saliency detection method based on a co-saliency generative adversarial network comprises the following steps:
Step one: construct a co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the co-saliency generative adversarial network model;
Step two: train the co-saliency generative adversarial network model in two stages: in the first training stage, pre-train the network on a labeled salient-object database; in the second training stage, starting from the model parameters trained in the first stage, train the network on co-saliency image groups whose co-salient objects belong to the same category;
Step three: category-level co-saliency detection: use the generator of the co-saliency generative adversarial network model trained in step two as a category co-saliency detector, take images belonging to the same category as its input, and directly output the co-saliency maps corresponding to the same-category images end to end.
In step one, the generator adopts a U-Net structure and is a fully convolutional network in which the kernel sizes, strides and padding values of the convolutional and deconvolutional layers are arranged symmetrically, and Dropout is applied to the last three convolutional layers and the first three deconvolutional layers; the discriminator is also a fully convolutional network that outputs a two-dimensional probability matrix after multiple convolutional layers and performs patch-level real/fake discrimination of its input image according to this matrix; the generator learns the mapping between the original image and the co-saliency ground-truth map in order to generate a co-saliency map, and the discriminator distinguishes the co-saliency map generated by the generator from the ground-truth map.
In both the first and second training stages of step two: when the generator is trained, the network parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised, and the parameters of the generator are updated; when the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises the probability of judging real samples to be real and lowers the probability of judging generated fake samples to be real, and the parameters of the discriminator are updated.
In the first training stage and the second training stage, the loss function of the generator is:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
where LG1 is the adversarial loss of the generator, LG2 is the pixel loss of the generator, and λ is a coefficient that adjusts the loss weight; θG is the network model parameter of the generator. The adversarial loss LG1 of the generator is:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss LG2 of the generator is:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding salient-object ground-truth map, G(·) denotes the pseudo saliency map generated by the generator, D(·,·) denotes the two-dimensional probability matrix output by the discriminator, and Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·); the function BCE(·,·) computes the cross-entropy loss between the two-dimensional probability matrix D(·,·) and the matrix Areal, its expression being given in formula (4), where x and y are its arguments; LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image.
The loss function of the discriminator is expressed as:
LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
where θD denotes the network model parameters of the discriminator, and Afake is a two-dimensional matrix of all zeros whose size matches that of the probability matrix D(·,·).
In the first training stage, a labeled salient-object database is used as the training sample set to train the co-saliency generative adversarial network, which automatically learns the mapping between the original images and the salient-object ground-truth maps; the specific implementation is as follows:
images with pixel-level labels from the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data; all original images and the corresponding salient-object ground-truth maps are resized to the input size of the generator; the original images are input to the generator to obtain pseudo saliency maps of the same size, which are compared with the salient-object ground-truth maps at the pixel level; each pseudo saliency map is concatenated with its original image along the channel dimension as a fake sample, and each salient-object ground-truth map is concatenated with its original image as a real sample, and both are fed to the discriminator; the Adam optimization algorithm is used to iteratively update the generator network model parameters θG and the discriminator network model parameters θD by minimizing the loss functions.
In step two, the image groups are divided according to the category of the common salient object contained in the images, and on the basis of the model parameters trained in the first stage, for a given category the corresponding image group is used for second-stage training of the co-saliency generative adversarial network, so that it learns the mapping between the original images and the co-saliency ground-truth maps; the specific implementation is as follows:
three public co-saliency detection databases grouped by salient-object category, CoSal2015, iCoseg and MSRC-A, are adopted; all original images and the corresponding co-saliency ground-truth maps are resized to the input size of the generator; 50% of the image data in each group is directly selected at random as category training samples to further train the co-saliency generative adversarial network trained in the first stage, so that the generator automatically learns to extract the common saliency information within the category samples, yielding a co-saliency detection model for a single image category.
If an image belongs to one of the training sample categories used in the second-stage training, the image is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image is resized to the input size of the generator, and the image output by the generator is the co-saliency map of the input image:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
Compared with the prior art, the invention has the following beneficial effects. Based on the co-saliency generative adversarial network, a two-stage training mechanism is adopted for co-saliency detection, whose advantages are: 1) using a salient-object database as training data alleviates the series of training problems that can arise from an insufficient amount of co-saliency data; 2) with the two-stage training mechanism, the salient-object data serve as first-stage training data to ensure that the network trained in this stage has single-image saliency detection capability; then, exploiting the memory effect of the network while it retains this capability, the co-saliency image groups serve as second-stage training data so that the network also acquires the ability to capture correlation information among images of the same category; fusing the single-image saliency information with the intra-group correlation information finally gives the network category-level co-saliency detection capability. Experiments show that the method has a simple training and detection procedure and high detection efficiency, significantly improves the generality and accuracy of co-saliency detection, and is of great value for the rapid acquisition of key information.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of the present invention.
FIG. 2 is a comparison of the subjective results of the present invention and existing algorithms on the CoSal2015 database.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art without creative effort on the basis of the embodiments of the present invention fall within the scope of the present invention.
As shown in fig. 1, a co-saliency detection method based on a co-saliency generative adversarial network uses the trained co-saliency generative adversarial network to realize co-saliency detection and comprises the following steps:
Step one: construct the co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the model.
The co-saliency generative adversarial network model comprises a generator network model and a discriminator network model. A suitable generator network model is designed to learn the mapping between the original image and the co-saliency map, so that the co-saliency map generated by the generator is as close as possible to the co-saliency ground-truth map. A suitable discriminator network model is designed to distinguish the co-saliency map generated by the generator from the co-saliency ground-truth map as well as possible, in order to assist in training the generator.
The generator network model adopts a U-Net structure overall and is a fully convolutional network comprising 8 convolutional layers and 8 deconvolutional layers arranged symmetrically, which ensures that the image output by the generator has the same size as the input image. The kernel sizes, strides and padding values of the convolutional and deconvolutional layers are set symmetrically, and Dropout with rate 0.5 is added to the last 3 convolutional layers and the first 3 deconvolutional layers, i.e., 50% of the activations of the preceding layer are randomly set to zero, which effectively prevents overfitting. In the generator network architecture, short (skip) connections are added on top of the symmetric encoder-decoder structure, concatenating feature maps of the same size, following the U-Net design (Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proc. Med. Image Comput. Comput.-Assist. Interv., 2015, pp. 234-241). The purpose of the concatenation is to ensure that the generated image retains detail information such as object edges. The network structure of the generator G is shown in table 1.
Table 1 Network architecture of the generator
(The generator architecture table is provided as an image in the original publication.)
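Since that table is available only as an image, the following PyTorch sketch reconstructs a generator matching the textual description above: 8 convolutional and 8 deconvolutional layers with symmetric kernel size/stride/padding (4/2/1), U-Net skip connections, and Dropout(0.5) on the last 3 convolutional and first 3 deconvolutional layers. The channel widths, normalization placement and Tanh output are assumptions in the style of pix2pix, not values taken from the patent's table.

```python
import torch
import torch.nn as nn

class UNetGenerator(nn.Module):
    """Sketch of the generator G: 8 conv + 8 deconv layers, kernel 4,
    stride 2, padding 1, skip connections, and Dropout(0.5) on the last 3
    conv and first 3 deconv layers, as described in the text."""
    def __init__(self):
        super().__init__()
        widths = [64, 128, 256, 512, 512, 512, 512, 512]  # assumed channel widths
        self.encoders = nn.ModuleList()
        in_ch = 3
        for i, out_ch in enumerate(widths):
            block = [nn.Conv2d(in_ch, out_ch, 4, 2, 1)]
            if 0 < i < 7:
                block.append(nn.BatchNorm2d(out_ch))
            block.append(nn.LeakyReLU(0.2, inplace=True))
            if i >= 5:                          # Dropout on the last 3 conv layers
                block.append(nn.Dropout(0.5))
            self.encoders.append(nn.Sequential(*block))
            in_ch = out_ch
        self.decoders = nn.ModuleList()
        for i in range(8):
            in_ch = widths[7 - i] * (1 if i == 0 else 2)   # skip concatenation
            out_ch = widths[6 - i] if i < 7 else 3          # 256x256x3 output
            block = [nn.ConvTranspose2d(in_ch, out_ch, 4, 2, 1)]
            if i < 7:
                block += [nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
                if i < 3:                       # Dropout on the first 3 deconv layers
                    block.append(nn.Dropout(0.5))
            self.decoders.append(nn.Sequential(*block))
        self.final = nn.Tanh()                  # output range is an assumption

    def forward(self, x):
        skips = []
        for enc in self.encoders:
            x = enc(x)
            skips.append(x)
        for i, dec in enumerate(self.decoders):
            if i > 0:
                # concatenate the same-size encoder feature map (skip connection)
                x = torch.cat([x, skips[7 - i]], dim=1)
            x = dec(x)
        return self.final(x)
```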
The discriminator network module adopts an encoder structure and is likewise a fully convolutional network; the network structure of the discriminator D is shown in table 2. The output of the discriminator is a 28 × 28 two-dimensional matrix, in which each element is the probability that the corresponding patch of the input image is judged to be real, i.e., the input image is discriminated at the patch level; see the literature (Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, et al. Image-to-Image Translation with Conditional Adversarial Networks. arXiv preprint arXiv:1611.07004).
Table 2 Network architecture of the discriminator
Network layer Input Kernel size Stride Padding Output
Convolutional layer 1 256×256×6 4 2 1 128×128×64
Convolutional layer 2 128×128×64 4 2 1 64×64×128
Convolutional layer 3 64×64×128 4 2 1 32×32×256
Convolutional layer 4 32×32×256 3 1 0 30×30×512
Convolutional layer 5 30×30×512 3 1 0 28×28×1
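Table 2 fixes the discriminator's convolutional stack completely, so a faithful sketch is possible; only the activation placement (LeakyReLU/BatchNorm) and the final Sigmoid are assumptions, since the table lists the convolutions alone. The 6-channel input is the channel-wise concatenation of an RGB image with a (3-channel) saliency map:

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Discriminator D per Table 2: maps a 6-channel 256x256 input to a
    28x28 matrix of patch-level probabilities (patch-level discrimination)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 4, 2, 1),                            # 128x128x64
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128),     # 64x64x128
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, 2, 1), nn.BatchNorm2d(256),    # 32x32x256
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 512, 3, 1, 0), nn.BatchNorm2d(512),    # 30x30x512
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, 3, 1, 0),                           # 28x28x1
            nn.Sigmoid(),                                         # patch probabilities
        )

    def forward(self, image, saliency_map):
        # channel-wise concatenation of the image and the saliency map (6 channels)
        return self.net(torch.cat([saliency_map, image], dim=1))
```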
Step two: carrying out two-stage training on the cooperative significance generation type confrontation network model: in the first training stage, a marked significant target database with abundant data volume is adopted to pre-train the cooperative significance generating type countermeasure network, and in the second training stage, based on model parameters trained in the first stage, the cooperative significance data group with cooperative significant targets belonging to the same category is adopted to perform category cooperative significance detection training on the cooperative significance generating type countermeasure network.
The invention designs a two-stage training mechanism, wherein the first training stage is used for pre-training to enable a generator in the cooperative significance generation type countermeasure network to have primary single-image significant target detection capability, and the second training stage is used for further enabling the generator to have the cooperative significant target detection capability of a plurality of images of the type.
During the two-stage training process, the loss function of the generator is expressed as:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
The loss function of the generator consists of an adversarial loss and a pixel loss: LG1 and LG2 denote the adversarial loss and the pixel loss of the generator in the two-stage training, respectively, and λ is a coefficient that adjusts the loss weight; the invention sets λ to 100, and θG are the network model parameters of the generator G. When the generator is trained, the parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised as much as possible, and the parameters of the generator are updated. The adversarial loss of the generator is expressed as:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss of the generator is expressed as:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding saliency ground-truth map; in the first stage Im and Sm come from the salient-object databases required for training, and in the second stage Im and Sm come from the co-saliency databases required for training. G(·) denotes the pseudo saliency map generated by the generator; the original image Im is concatenated with G(·) as the input of the discriminator, and D(·,·) denotes the probability matrix output by the discriminator. The parameters θD behind the probability matrix D(·,·) are the weights and biases of all neurons of the convolutional layers in the trained discriminator network, determined by the network structure, and the output of the discriminator is a two-dimensional probability matrix. The image input to the discriminator has size 256 × 256 × 3 and the output is a 28 × 28 two-dimensional probability matrix, so each element of the matrix is a probability value in [0,1] with which an image patch of about 9 × 9 pixels at the corresponding position is judged to be real by the discriminator. Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·), in which each element corresponds to an image patch of the input sample, so the probability of the corresponding patch being judged real by the discriminator is everywhere 1. The function BCE(·,·) computes the cross-entropy loss between the probability matrix D(·,·) and the matrix Areal; its expression is given in formula (4), where x and y are its arguments. LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image.
During the two-stage training process, the loss function of the discriminator is expressed as:
LD(θD) = BCE(D(Sm, Im; θD), Areal) + BCE(D(G(Im; θG), Im; θD), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
In formula (6), θD denotes the network model parameters of the discriminator, Areal is a two-dimensional matrix whose elements are all 1, and Afake is a two-dimensional matrix whose elements are all 0, its size matching that of the probability matrix D(·,·). When the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises as far as possible the probability of judging a real sample to be real and lowers the probability of judging a generated fake sample to be real, thereby updating the parameters of the discriminator.
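A minimal PyTorch sketch of the losses in formulas (1)-(6), assuming a generator `G(image)` and a discriminator `D(image, map)` that returns the 28 × 28 probability matrix, as in the sketches above:

```python
import torch
import torch.nn.functional as F

LAMBDA = 100  # λ in formula (1); the patent sets λ = 100

def generator_loss(D, G, image, gt_map):
    """LG = LG1 + λ·LG2, formulas (1), (3)-(5): adversarial BCE against the
    all-ones matrix Areal plus the 1-norm pixel loss against the ground truth."""
    fake = G(image)
    pred = D(image, fake)                        # 28x28 patch probability matrix
    a_real = torch.ones_like(pred)               # Areal: all elements are 1
    lg1 = F.binary_cross_entropy(pred, a_real)   # adversarial loss LG1
    lg2 = F.l1_loss(fake, gt_map)                # pixel loss LG2 = ||Sm - G(Im)||1
    return lg1 + LAMBDA * lg2

def discriminator_loss(D, G, image, gt_map):
    """LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake), formula (6);
    the generator output is detached so that only θD is updated."""
    pred_real = D(image, gt_map)
    pred_fake = D(image, G(image).detach())      # generator parameters fixed
    a_real = torch.ones_like(pred_real)          # Areal
    a_fake = torch.zeros_like(pred_fake)         # Afake: all elements are 0
    return (F.binary_cross_entropy(pred_real, a_real)
            + F.binary_cross_entropy(pred_fake, a_fake))
```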
In the first training stage, a labeled salient-object database with an abundant amount of data is used as the sample set for the first-stage training of the co-saliency generative adversarial network, so that the model learns the mapping between original images and salient-object ground-truth maps and acquires salient-object detection capability for single images. On the basis of the model parameters trained in the first stage, the model inherits this single-image saliency detection capability; the image groups are divided according to the category of the common salient object contained in the images, and for a given category the corresponding image group is used for the second-stage training of the co-saliency generative adversarial network. The network trained in this way acquires the ability to detect the object salient in common across multiple images.
First-stage training: the 21517 images with pixel-level labels in the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data. All original images and the corresponding salient-object ground-truth maps are resized to 256 × 256 × 3, and the original images are input to the generator. The generated image has size 256 × 256 × 3 and is compared with the salient-object ground-truth map at the pixel level; the generated image is concatenated with the original image along the channel dimension as a fake sample, and the ground-truth map is concatenated with the original image as a real sample, and both are fed to the discriminator. The Adam optimization algorithm is used to iteratively update the model parameters, with Batchsize, learning rate, Dropout rate and Epoch set to 1, 0.0002, 0.5 and 100, respectively. Batchsize is the number of samples used for one update of the network parameters during training, the learning rate is the magnitude of each parameter update during training, and Epoch is the number of passes over all training samples. Dropout is added only to convolutional layers 6-8 and deconvolutional layers 1-3, which gives the generator network structure a certain robustness.
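The alternating first-stage optimization just described can be sketched as follows, reusing the sketches above; the stand-in `dataset` replaces a real loader over PASCAL-1500, HKU-IS and DUTS, and the Adam settings beyond the stated learning rate of 0.0002 are PyTorch defaults rather than values from the patent:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the real data: (image, ground-truth map) pairs
# already resized to 256x256.
dataset = TensorDataset(torch.rand(8, 3, 256, 256), torch.rand(8, 3, 256, 256))
loader = DataLoader(dataset, batch_size=1)          # Batchsize = 1

G, D = UNetGenerator(), PatchDiscriminator()        # sketches from step one
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)   # learning rate 0.0002
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for epoch in range(100):                            # Epoch = 100 in stage one
    for image, gt_map in loader:
        # update D with G fixed: push real samples toward Areal, fakes toward Afake
        opt_d.zero_grad()
        discriminator_loss(D, G, image, gt_map).backward()
        opt_d.step()
        # update G with D fixed: raise the probability that fakes are judged real
        opt_g.zero_grad()
        generator_loss(D, G, image, gt_map).backward()
        opt_g.step()
```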
Second-stage training: on the basis of the co-saliency generative adversarial network model parameters trained in the first stage, i.e., with the first-stage parameters as the initial parameters of the second-stage training, the model is trained to acquire category-level co-saliency detection capability. Three public co-saliency detection databases in total, CoSal2015, iCoseg and MSRC-A, are used for model training. Before training, all original images and the corresponding salient-object ground-truth maps are resized to 256 × 256, and the images in the 3 databases are grouped according to whether their co-salient objects belong to the same category; 50% of the data in each group is then directly selected at random for the second-stage training. In the course of training on the samples of one category, the model is driven to learn and extract the common saliency information within the category samples, yielding a co-saliency detection model for that single image category. The training uses the same parameter settings as the first stage, except that Epoch is 400.
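The second stage can be sketched as per-category fine-tuning from the stage-one checkpoint; the checkpoint file names and the `category_loaders` mapping are hypothetical:

```python
import torch

# One co-saliency model per category, each fine-tuned from the stage-one
# parameters. `category_loaders` maps a category name to a loader over the
# 50% of that group selected for training.
for category, loader in category_loaders.items():
    G.load_state_dict(torch.load("stage1_G.pth"))   # inherit stage-one parameters
    D.load_state_dict(torch.load("stage1_D.pth"))
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    for epoch in range(400):                        # Epoch = 400 in stage two
        for image, gt_map in loader:                # Batchsize = 1, as in stage one
            opt_d.zero_grad()
            discriminator_loss(D, G, image, gt_map).backward()
            opt_d.step()
            opt_g.zero_grad()
            generator_loss(D, G, image, gt_map).backward()
            opt_g.step()
    torch.save(G.state_dict(), f"cosal_G_{category}.pth")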
Step three: category-level co-saliency detection: the generator of the co-saliency generative adversarial network model trained in step two is used as a category co-saliency detector; an image belonging to the same category as the second-stage training samples is taken as its input, and the corresponding co-saliency map is output directly end to end.
The generator in the two-stage-trained co-saliency generative adversarial network model is used directly as the category co-saliency detector. If an image belongs to one of the training sample categories used in the second-stage training, it is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image size is unified to 256 × 256 × 3, and the image output by the generator is the co-saliency map of the input image, also of size 256 × 256 × 3. The image generated by the generator is used directly as the co-saliency map, as shown in the following formula:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
At this point, the co-saliency detection of a group of target images containing the same category is completed, i.e., image co-saliency detection is completed.
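For completeness, a minimal inference sketch corresponding to formula (8); apart from the 256 × 256 input size, the preprocessing details are assumptions:

```python
import torch
from PIL import Image
import torchvision.transforms.functional as TF

def detect_cosaliency(generator, image_path):
    """CoSm = G*(Im): resize the image to the 256x256 input size of the
    generator, run one forward pass, and return the co-saliency map."""
    img = Image.open(image_path).convert("RGB")
    x = TF.to_tensor(TF.resize(img, [256, 256])).unsqueeze(0)  # 1x3x256x256
    generator.eval()
    with torch.no_grad():
        cos_map = generator(x)
    return cos_map.squeeze(0)  # 256x256 co-saliency map (3 channels)

# Usage: the stage-two generator of the matching category acts as the
# detector; "cosal_G_frog.pth" and the image name are hypothetical.
# G.load_state_dict(torch.load("cosal_G_frog.pth"))
# cos_map = detect_cosaliency(G, "frog_01.jpg")
```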
The experiments of the invention were run on a workstation with an Intel(R) Xeon E5-2650 v3 @ 2.30GHz × 20 CPU, an NVIDIA GTX TITAN-XP GPU and 128 GB of memory; the software environment is Ubuntu 16.04 with the deep learning framework PyTorch 1.0.
In order to verify the detection performance and efficiency of the invention, the per-image detection time and the subjective results of the method were compared with those of 6 co-saliency detection methods on the CoSal2015 database. On the CoSal2015 database, under the same hardware environment, the detection times of the publicly available implementations of SACS-R, SACS, CBCS and ESMG were compared with the invention, as shown in table 3.
Table 3 Comparison of per-image detection time with existing algorithms on the CoSal2015 database
Algorithm SACS-R SACS CBCS ESMG The invention
Code type MATLAB MATLAB MATLAB MATLAB Python
Detection time 8.873 seconds 2.652 seconds 1.688 seconds 1.723 seconds 0.562 seconds
The methods involved in the subjective comparison include LDAW from the literature (D. Zhang, J. Han, C. Li, et al. Co-saliency Detection via Looking Deep and Wide. In Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 2994-3002), SACS-R and SACS from the literature (X. Cao, Z. Tao, B. Zhang, et al. Self-adaptively Weighted Co-saliency Detection via Rank Constraint. IEEE Trans. Image Process., vol. 23, no. 9, pp. 4175-4186, 2014), as well as the CBCS and ESMG methods listed in table 3. The subjective comparison results for the starfish, frog and rubber groups of the CoSal2015 database are shown in fig. 2.
As can be seen from fig. 2 and table 3, the invention takes the shortest time to detect the co-saliency of one image, i.e., it is the most efficient, and compared with the other existing methods the co-saliency maps obtained by the invention are closest to the ground-truth maps.
The method of the invention comprises the construction of a co-saliency generative adversarial network model and two-stage model training. In the first training stage, a labeled salient-object dataset with an abundant amount of data is used as the training dataset, alleviating the training problems caused by the insufficient amount of labeled data in the field of co-saliency while giving the trained network single-image saliency detection capability; the generator trained in the two stages is then used as a category co-saliency detector that outputs the co-saliency maps of category images end to end. The invention effectively exploits the memory effect of the network, the saliency of single images and the intrinsic correlation within same-category image groups to perform co-saliency detection across multiple images, with a simple training and detection procedure and high detection efficiency.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention; any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (7)

1. A co-saliency detection method based on a co-saliency generative adversarial network, characterized by comprising the following steps:
Step one: construct a co-saliency generative adversarial network model: design the network architectures of the generator and the discriminator in the co-saliency generative adversarial network according to the characteristics of the co-saliency detection task, and build the co-saliency generative adversarial network model;
Step two: train the co-saliency generative adversarial network model in two stages: in the first training stage, pre-train the network on a labeled salient-object database; in the second training stage, starting from the model parameters trained in the first stage, train the network on co-saliency image groups whose co-salient objects belong to the same category;
Step three: category-level co-saliency detection: use the generator of the co-saliency generative adversarial network model trained in step two as a category co-saliency detector, take images belonging to the same category as its input, and directly output the co-saliency maps corresponding to the same-category images end to end.
2. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 1, characterized in that the generator in step one adopts a U-Net structure and is a fully convolutional network in which the kernel sizes, strides and padding values of the convolutional and deconvolutional layers are arranged symmetrically, and Dropout is applied to the last three convolutional layers and the first three deconvolutional layers; the discriminator is also a fully convolutional network that outputs a two-dimensional probability matrix after multiple convolutional layers and performs patch-level real/fake discrimination of its input image according to this matrix; the generator learns the mapping between the original image and the co-saliency ground-truth map in order to generate a co-saliency map, and the discriminator distinguishes the co-saliency map generated by the generator from the ground-truth map.
3. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 1 or 2, characterized in that in the first and second training stages of step two: when the generator is trained, the network parameters of the discriminator are fixed, the probability that the discriminator judges the image generated by the generator to be real is raised, and the parameters of the generator are updated; when the discriminator is trained, the parameters of the generator are fixed, so that the discriminator raises the probability of judging real samples to be real and lowers the probability of judging generated fake samples to be real, and the parameters of the discriminator are updated.
4. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 3, characterized in that in the first training stage and the second training stage the loss function of the generator is:
LG = LG1 + λ·LG2    (1)
θG* = argmin_θG LG(θG)    (2)
where LG1 is the adversarial loss of the generator, LG2 is the pixel loss of the generator, and λ is a coefficient that adjusts the loss weight; θG is the network model parameter of the generator; the adversarial loss LG1 of the generator is:
LG1 = BCE(D(G(Im), Im), Areal)    (3)
BCE(x, y) = −[y·log x + (1−y)·log(1−x)]    (4)
The pixel loss LG2 of the generator is:
LG2 = ||Sm − G(Im)||1    (5)
where Im and Sm denote the m-th input original image and its corresponding salient-object ground-truth map, G(·) denotes the pseudo saliency map generated by the generator, D(·,·) denotes the two-dimensional probability matrix output by the discriminator, and Areal is a two-dimensional matrix of all ones whose size matches that of the probability matrix D(·,·); the function BCE(·,·) computes the cross-entropy loss between the two-dimensional probability matrix D(·,·) and the matrix Areal, its expression being given in formula (4), where x and y are its arguments; LG2 is the 1-norm loss between the salient-object ground-truth map and the generated image;
the loss function of the discriminator is expressed as:
LD = BCE(D(Sm, Im), Areal) + BCE(D(G(Im), Im), Afake)    (6)
θD* = argmin_θD LD(θD)    (7)
where θD denotes the network model parameters of the discriminator, and Afake is a two-dimensional matrix of all zeros whose size matches that of the probability matrix D(·,·).
5. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 4, characterized in that in the first training stage the labeled salient-object database is used as the training sample set to train the co-saliency generative adversarial network, which automatically learns the mapping between the original images and the salient-object ground-truth maps; the specific implementation is as follows:
images with pixel-level labels from the salient-object databases PASCAL-1500, HKU-IS and DUTS are used as training data; all original images and the corresponding salient-object ground-truth maps are resized to the input size of the generator; the original images are input to the generator to obtain pseudo saliency maps of the same size, which are compared with the salient-object ground-truth maps at the pixel level; each pseudo saliency map is concatenated with its original image along the channel dimension as a fake sample, and each salient-object ground-truth map is concatenated with its original image as a real sample, and both are fed to the discriminator; the Adam optimization algorithm is used to iteratively update the generator network model parameters θG and the discriminator network model parameters θD by minimizing the loss functions.
6. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 5, characterized in that in step two the image groups are divided according to the category of the common salient object contained in the images, and on the basis of the model parameters trained in the first stage, for a given category the corresponding image group is used for second-stage training of the co-saliency generative adversarial network, so that it learns the mapping between the original images and the co-saliency ground-truth maps; the specific implementation is as follows:
three public co-saliency detection databases grouped by salient-object category, CoSal2015, iCoseg and MSRC-A, are adopted; all original images and the corresponding co-saliency ground-truth maps are resized to the input size of the generator; 50% of the image data in each group is directly selected at random as category training samples to further train the co-saliency generative adversarial network trained in the first stage, so that the generator automatically learns to extract the common saliency information within the category samples, yielding a co-saliency detection model for a single image category.
7. The co-saliency detection method based on a co-saliency generative adversarial network according to claim 5 or 6, characterized in that if an image belongs to one of the training sample categories used in the second-stage training, the image is fed to the generator of the trained co-saliency detection model of the corresponding category; before input the image is resized to the input size of the generator, and the image output by the generator is the co-saliency map of the input image:
CoSm = G*(Im)    (8)
where G*(·) denotes the generator of the two-stage-trained co-saliency generative adversarial network, and CoSm is the co-saliency map finally generated for the image Im to be detected.
CN201911368623.3A 2019-12-26 2019-12-26 Co-saliency detection method based on a co-saliency generative adversarial network Active CN111027576B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911368623.3A CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911368623.3A CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network

Publications (2)

Publication Number Publication Date
CN111027576A (en) 2020-04-17
CN111027576B (en) 2020-10-30

Family

ID=70213922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911368623.3A Active Co-saliency detection method based on a co-saliency generative adversarial network

Country Status (1)

Country Link
CN (1) CN111027576B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845471A (en) * 2017-02-20 2017-06-13 深圳市唯特视科技有限公司 A kind of vision significance Forecasting Methodology based on generation confrontation network
CN107123150A (en) * 2017-03-25 2017-09-01 复旦大学 The method of global color Contrast Detection and segmentation notable figure
CN107346436A (en) * 2017-06-29 2017-11-14 北京以萨技术股份有限公司 A kind of vision significance detection method of fused images classification
CN109711283A (en) * 2018-12-10 2019-05-03 广东工业大学 A kind of joint doubledictionary and error matrix block Expression Recognition algorithm
CN110110576A (en) * 2019-01-03 2019-08-09 北京航空航天大学 A kind of traffic scene thermal infrared semanteme generation method based on twin semantic network
CN109727264A (en) * 2019-01-10 2019-05-07 南京旷云科技有限公司 Image generating method, the training method of neural network, device and electronic equipment
CN110310343A (en) * 2019-05-28 2019-10-08 西安万像电子科技有限公司 Image processing method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LINA WEI et al.: "Group-wise deep co-saliency detection", 《ARXIV》 *
PHILLIP ISOLA et al.: "Image-to-Image Translation with Conditional Adversarial Networks", 《ARXIV》 *
XIAOLIANG QIAN et al.: "Hardness Recognition of Robotic Forearm Based on Semi-supervised Generative Adversarial Networks", 《FRONTIERS IN NEUROROBOTICS》 *
LI JIANWEI et al.: "Video salient object detection based on conditional generative adversarial networks" (基于条件生成对抗网络的视频显著性目标检测), 《Transducer and Microsystem Technologies (传感器与微系统)》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898507A (en) * 2020-07-22 2020-11-06 武汉大学 Deep learning method for predicting earth surface coverage category of label-free remote sensing image
CN112507933A (en) * 2020-12-16 2021-03-16 南开大学 Saliency target detection method and system based on centralized information interaction
CN112507933B (en) * 2020-12-16 2022-09-16 南开大学 Saliency target detection method and system based on centralized information interaction
CN112598043A (en) * 2020-12-17 2021-04-02 杭州电子科技大学 Cooperative significance detection method based on weak supervised learning
CN112598043B (en) * 2020-12-17 2023-08-18 杭州电子科技大学 Collaborative saliency detection method based on weak supervised learning
CN112651940A (en) * 2020-12-25 2021-04-13 郑州轻工业大学 Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112651940B (en) * 2020-12-25 2021-09-17 郑州轻工业大学 Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN114743027A (en) * 2022-04-11 2022-07-12 郑州轻工业大学 Weak supervision learning-guided cooperative significance detection method
CN114743027B (en) * 2022-04-11 2023-01-31 郑州轻工业大学 Weak supervision learning-guided cooperative significance detection method
CN116109496A (en) * 2022-11-15 2023-05-12 济南大学 X-ray film enhancement method and system based on double-flow structure protection network
CN116994006A (en) * 2023-09-27 2023-11-03 江苏源驶科技有限公司 Collaborative saliency detection method and system for fusing image saliency information
CN116994006B (en) * 2023-09-27 2023-12-08 江苏源驶科技有限公司 Collaborative saliency detection method and system for fusing image saliency information

Also Published As

Publication number Publication date
CN111027576B (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111027576B (en) Co-saliency detection method based on a co-saliency generative adversarial network
AU2019200270B2 (en) Concept mask: large-scale segmentation from semantic concepts
US10402448B2 (en) Image retrieval with deep local feature descriptors and attention-based keypoint descriptors
Gao et al. Change detection from synthetic aperture radar images based on channel weighting-based deep cascade network
CN108734210B (en) Object detection method based on cross-modal multi-scale feature fusion
CN110210513B (en) Data classification method and device and terminal equipment
CN110782420A (en) Small target feature representation enhancement method based on deep learning
CN110503076B (en) Video classification method, device, equipment and medium based on artificial intelligence
CN109993102B (en) Similar face retrieval method, device and storage medium
JP2018524678A (en) Business discovery from images
CN115937655B (en) Multi-order feature interaction target detection model, construction method, device and application thereof
CN112651940B (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
CN112149526B (en) Lane line detection method and system based on long-distance information fusion
GB2579262A (en) Space-time memory network for locating target object in video content
CN113569607A (en) Motion recognition method, motion recognition device, motion recognition equipment and storage medium
CN114692750A (en) Fine-grained image classification method and device, electronic equipment and storage medium
Fan et al. A novel sonar target detection and classification algorithm
WO2024027347A1 (en) Content recognition method and apparatus, device, storage medium, and computer program product
CN115066687A (en) Radioactivity data generation
Abdelaziz et al. Few-shot learning with saliency maps as additional visual information
CN116543250A (en) Model compression method based on class attention transmission
CN117011219A (en) Method, apparatus, device, storage medium and program product for detecting quality of article
Lu et al. A Traffic Sign Detection Network Based on PosNeg-Balanced Anchors and Domain Adaptation
CN114387489A (en) Power equipment identification method and device and terminal equipment
CN113343953A (en) FGR-AM method and system for remote sensing scene recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant