CN115115500A - Watermark embedding method combined with underwater image enhancement - Google Patents

Watermark embedding method combined with underwater image enhancement

Info

Publication number
CN115115500A
Authority
CN
China
Prior art keywords
watermark
image
feature map
enhancement
underwater
Prior art date
Legal status
Withdrawn
Application number
CN202210852829.9A
Other languages
Chinese (zh)
Inventor
骆挺
吴俊
何周燕
徐海勇
宋洋
Current Assignee
College of Science and Technology of Ningbo University
Original Assignee
College of Science and Technology of Ningbo University
Priority date
Filing date
Publication date
Application filed by College of Science and Technology of Ningbo University filed Critical College of Science and Technology of Ningbo University
Priority to CN202210852829.9A
Publication of CN115115500A


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 - General purpose image data processing
    • G06T1/0021 - Image watermarking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2201/00 - General purpose image data processing
    • G06T2201/005 - Image watermarking
    • G06T2201/0065 - Extraction of an embedded watermark; Reliable detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20112 - Image segmentation details
    • G06T2207/20132 - Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a watermark embedding method combined with underwater image enhancement, which comprises the following steps: a watermark encoder combining image enhancement is provided, integrating watermark embedding and image enhancement into a unified structure so that the enhancement process is taken into account during embedding; a residual attention module is added to the watermark encoder to strengthen attention to quality-degraded regions; the watermark information is redundantly embedded five times; a multi-scale downsampling fusion discriminator is constructed; and the discriminator, the watermark encoder and the watermark extractor are jointly trained until the loss function converges. Through the watermark encoder, the method comprehensively considers both the embedding of the watermark information and the enhancement of the original underwater image; the residual attention module adjusts the deep feature representation of the original underwater image, improving the visual quality of the watermark image and the robustness of the watermark embedding; and the joint training of the watermark encoder, the watermark extractor and the discriminator further improves the visual quality of the watermark image and the robustness of the watermark information.

Description

Watermark embedding method combined with underwater image enhancement
Technical Field
The invention belongs to the field of image watermarking, and particularly relates to a watermark embedding method combined with underwater image enhancement.
Background
Due to the selective absorption of light waves by the water medium and the scattering effect of particles in water, captured underwater images usually suffer from blurred details, low contrast, color distortion and the like. To address this problem, underwater image enhancement adjusts contrast, improves sharpness and corrects color through methods based on physical and non-physical models, so as to improve the visual quality of the image. However, with the development of multimedia and network technologies, enhanced images may be copied and tampered with by illegal users during transmission or sharing, so the corresponding copyrights need to be protected. Digital image watermarking technology can embed extra information into an image, effectively resolving copyright disputes.
Watermarking technologies can generally be divided into fragile watermarks and robust watermarks. Fragile watermarks are sensitive to image modification and are mainly used for image tampering detection. A robust watermark embeds the copyright identification of an image into the original underwater image such that relatively complete watermark information can still be extracted from the image after general image processing or malicious attacks. Conventional robust watermarks are generally based on a transform domain, embedding the watermark into the image by modifying the corresponding transform coefficients. Such methods have certain robustness against common image attacks; however, their generalization capability across different attacks is weak.
Deep learning exhibits excellent performance in different fields of computer vision and natural language processing, benefiting from its powerful feature extraction capability. With this in mind, researchers began exploring deep learning-based watermarking frameworks. Before the advent of deep learning techniques, most watermarking methods used machine learning tools to improve performance. Machine learning tools can improve the invisibility of a watermarking method and its robustness against geometric attacks, but the features used for training must be extracted manually, which greatly limits the achievable performance. Later, with the development of deep learning technology, watermarking techniques based on deep networks were proposed, which better address the generalization capability of the watermark against different attacks. These works aim to design end-to-end trainable deep watermarking models, improving invisibility and robustness by designing reasonable loss functions and adding a noise layer. However, embedding a watermark degrades the enhanced image, so this process conflicts with image enhancement. How to take image enhancement into account during watermark embedding has not been discussed in depth in existing watermarking methods.
Disclosure of Invention
The present invention aims to provide a watermark embedding method combined with underwater image enhancement to solve the problems of the prior art.
In order to achieve the above object, the present invention provides a watermark embedding method combined with underwater image enhancement, comprising:
acquiring an original underwater image, constructing a watermark encoder combined with image enhancement, and carrying out image enhancement and embedding initial binary watermark information on the original underwater image based on the watermark encoder to acquire a watermark image with an enhancement effect;
constructing a noise layer, inputting the watermark image and the original underwater image into the noise layer, and obtaining a noise image;
constructing a watermark extractor, extracting and decoding the watermark information of the noise image based on the watermark extractor, and obtaining target binary watermark information;
constructing a discriminator, and scoring the watermark image and the label image;
constructing a multi-modal loss function, and evaluating losses on the global content, color and texture information of the watermark image as well as a loss for watermark robustness;
and jointly training the watermark encoder combined with enhancement and the watermark extractor, training alternately with the discriminator, and updating iteratively during training until the multi-modal loss function converges.
Optionally, the watermark encoder includes: 5 downsampling convolution blocks, 5 upsampling convolution blocks, two normal convolution blocks, and a residual attention module.
Optionally, in the process of acquiring the noise image: the image is attacked by any one of Crop(p%), Cropout(p%), Dropout(p%), Resize(scale), or JPEG(Q), wherein Crop(p%) represents randomly cropping the image by a percentage of p%; Cropout(p%) means that a p% pixel region is selected from the watermark image, a (1-p%) pixel region is selected from the original underwater image, and the two are spliced into a new image; Dropout(p%) indicates that watermark image pixels are retained at a percentage of p%, with the remaining pixels filled from the original underwater image; Resize(scale) means enlarging or reducing the image by the factor scale; JPEG(Q) denotes JPEG compression of the image with quality factor Q.
Optionally, the watermark extractor includes: several convolution blocks, a global average pooling (GAP) layer, and a fully connected layer.
Optionally, the discriminator includes 4 convolution blocks and 1 convolution layer and adopts a multi-scale downsampling fusion strategy with a Markov Patch-GAN architecture, wherein the input of the first convolution block is the watermark image and the label image, the input of each of the second to fourth convolution blocks is the output of the previous convolution block together with a feature map of the label image downsampled to the corresponding size, and the convolution layer outputs a score.
Optionally, the watermark image is obtained as follows: the original underwater image passes through a first convolution block to obtain a first feature map; the watermark information is copied and expanded to the same size as the first feature map, channel-spliced with it, and the spliced result is input into a second convolution block to obtain a second feature map; the watermark information is copied and expanded to the same size as the second feature map, channel-spliced with it, and the spliced result is input into a first downsampling convolution block; a first loop process is then entered, sequentially obtaining a third, fourth, fifth and sixth feature map; the sixth feature map is input into a fifth downsampling convolution block to obtain a seventh feature map; the seventh feature map is input into the residual attention module to adjust the feature representation, obtaining an eighth feature map; a second loop process is then entered, sequentially obtaining a ninth, tenth, eleventh and twelfth feature map and the watermark image, wherein skip connections exist between mirrored layers, and the connected features are first channel-spliced and then input into the upsampling convolution block.
Optionally, the process by which the residual attention module adjusts the feature representation includes: calculating a one-dimensional channel attention using the interdependencies among channels, and multiplying the channel attention element-wise with the input to obtain a channel-attention-adjusted feature map; calculating a two-dimensional spatial attention using the interdependencies of spatial regions, and multiplying the spatial attention element-wise with the channel-attention-adjusted feature map; and finally performing a residual connection between the input feature map and the output feature map to obtain a residual attention feature map.
Optionally, the multi-modal loss function includes: the image global similarity loss, represented by the mean square error

$$\mathcal{L}_{i} = \frac{1}{CHW} \left\| GT - I_{en} \right\|_{2}^{2}$$

the perceptual loss, obtained by inputting the watermark image and the label image into a pre-trained VGG-19 network and extracting the high-level features output by the block5_conv2 layer, whose difference is calculated as

$$\mathcal{L}_{p} = \frac{1}{C_{\phi} H_{\phi} W_{\phi}} \left\| \phi(GT) - \phi(I_{en}) \right\|_{2}^{2}$$

the watermark loss, which minimizes the mean square error between the initial binary watermark information and the target binary watermark information, expressed as

$$\mathcal{L}_{m} = \frac{1}{L} \left\| M_{in} - M_{out} \right\|_{2}^{2}$$

and the adversarial loss of the watermark encoder, taken from the output of the discriminator on the watermark image and expressed as

$$\mathcal{L}_{adv} = \log f_{D}(I_{en}; \theta_{D})$$

where GT denotes the label image, I_en the watermark image, C the number of channels of the image, H the height of the image, W the width of the image, φ(·) the block5_conv2 feature extraction with output size C_φ × H_φ × W_φ, L the watermark length, M_in the initial binary watermark information, M_out the target binary watermark information, and θ_D the discriminator parameters. The goal of the joint training is to minimize the loss

$$\mathcal{L}_{WWE,E} = \lambda_{i} \mathcal{L}_{i} + \lambda_{p} \mathcal{L}_{p} + \lambda_{m} \mathcal{L}_{m} + \lambda_{adv} \mathcal{L}_{adv}$$

where λ_i, λ_p, λ_m and λ_adv are the relative weights of the respective terms, set to 1.0, 0.5, 0.2 and 0.3; during discriminator training, the following loss is minimized:

$$\mathcal{L}_{D} = \log f_{D}(GT; \theta_{D}) + \log \left( 1 - f_{D}(I_{en}; \theta_{D}) \right)$$
optionally, the first cyclic process is as follows: and copying and expanding the watermark information to the size which is the same as that of the Nth feature map, and performing channel splicing with the Nth feature map to be used as the input of an (N-1) th downsampling convolution block to obtain an (N + 1) th feature map, wherein N is from 2 to 5.
Optionally, the second loop process is as follows: the M-th feature map is input into the (M-7)-th upsampling convolution block to obtain the (M+1)-th feature map, where M runs from 8 to 11, and the twelfth feature map is input into the fifth upsampling convolution block to obtain the watermark image.
The invention has the technical effects that:
the method comprehensively considers the embedding of the watermark information and the enhancement of the original underwater image through the watermark encoder, adjusts the depth characteristic representation of the original underwater image through the residual error attention module, improves the visual quality of the watermark image and the robustness of the watermark embedding, and further improves the visual quality and the robustness of the watermark information through the combined training of the watermark encoder, the watermark extractor and the discriminator.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a network structure of a discriminator in an embodiment of the present invention;
FIG. 2 is a block diagram of a residual attention module according to an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Example one
As shown in fig. 1-2, the present embodiment provides a watermark embedding method combined with underwater image enhancement, including:
the model consists of four parts: (1) parameter is theta WWE In combination with the image enhancement watermark encoder WWE, the original underwater image I co And watermark information M in As input, and generates a watermark image I en (ii) a (2) A noise layer, the original underwater image and the watermark image as input, and generating a noise image I no (3) Parameter is theta E The watermark extractor E of (1), extracting the noise image I no As input, and extracts watermark information M out (4) Parameter is theta D The discriminator D of (1) for watermarking the image I en And label image GT as input, and outputs different scores S to discriminate I en The image quality of (a).
Watermark encoder WWE combined with image enhancement: WWE is based on U-Net, a U-shaped network with skip connections between mirrored layers. The idea of skip connections has proven very effective for image-to-image translation and image quality enhancement problems. Therefore, in the WWE network this structure is used to accomplish both watermark embedding and image enhancement.
Let the original underwater image be I_co with size C × H × W, where each pixel value lies in the range {0, ..., 255}. The embedded binary watermark information is a one-dimensional vector, denoted M_in ∈ {0,1}^L, where L represents the length of the binary watermark information and is used to control the watermark capacity. The watermark encoder WWE combined with image enhancement performs enhancement processing on the original underwater image and embeds the watermark information into the enhanced image to generate the watermark image I_en. This process can be expressed as:

$$I_{en} = f_{WWE}(I_{co}, M_{in}; \theta_{WWE})$$
Convolution blocks form the basic components of the network; each convolution block comprises a convolution layer (conv), an activation function and a batch normalization layer (BN). A Leaky Rectified Linear Unit (LeakyReLU) with a negative slope of 0.2 is used as the activation function in the WWE downsampling process, while the upsampling process uses a Rectified Linear Unit (ReLU). The ConvBlocks in the model have a convolution kernel size of 3 × 3, a stride of 2 and padding of 1 during downsampling, and a convolution kernel size of 3 × 3, a stride of 1 and padding of 1 during upsampling.
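For illustration, a minimal PyTorch sketch of such a convolution block follows; the module and argument names are illustrative, and the nearest-neighbour interpolation used to restore spatial size in the upsampling variant is an assumption, since the description only specifies the convolution parameters (the plain blocks are assumed to share the downsampling path's LeakyReLU):

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Basic block: conv -> activation -> BN, in the order listed in the
    description. mode is one of "plain", "down", "up"."""
    def __init__(self, in_ch: int, out_ch: int, mode: str = "plain"):
        super().__init__()
        layers = []
        if mode == "up":
            # assumed: interpolation doubles the spatial size before the
            # stride-1 convolution specified for upsampling
            layers.append(nn.Upsample(scale_factor=2, mode="nearest"))
        stride = 2 if mode == "down" else 1        # 3x3 kernel, padding 1
        act = nn.ReLU() if mode == "up" else nn.LeakyReLU(0.2)
        layers += [
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            act,
            nn.BatchNorm2d(out_ch),
        ]
        self.block = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)
```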
In the downsampling process, the multiple convolution kernels in each convolution block expand the number of channels of the image, providing richer features for image enhancement and more potential embedding positions for the watermark. However, the features captured by the convolution kernels have only weak spatial and channel correlations. Certain channels or spatial locations may exhibit poor invisibility for the embedded watermark, and different channels and spatial regions also suffer from inconsistent attenuation during image enhancement. To solve these problems, a Residual Attention Module (RAM) is designed, which enables the network to focus on image quality degradation areas and watermark-invisible areas and to give these areas greater weight. It is placed after the last convolution block of the downsampling path to enhance the performance of the WWE network.
In the watermark encoder combined with image enhancement, ConvBlock denotes an ordinary convolution block, ConvBlock-do and ConvBlock-up denote a downsampling convolution block and an upsampling convolution block, respectively, and RAM denotes the residual attention module. The kernel information is given in the form "number of output channels × (convolution kernel height × convolution kernel width × number of input channels)". The watermark encoder WWE combined with image enhancement consists of 5 downsampling convolution blocks, 5 upsampling convolution blocks, 2 ordinary convolution blocks and one residual attention module. During downsampling, the number of channels of the convolution layers gradually increases; conversely, during upsampling it gradually decreases. The original underwater image I_co passes through the first ConvBlock to obtain the feature representation I_co1. This step initially extracts texture features of the image and provides embedding locations for the watermark. Then, the binary watermark information M_in is expanded by replication to the same size as I_co1, channel-spliced with I_co1, and processed by the second ConvBlock to obtain I_co2. Next, M_in is expanded by replication to the same size as I_co2, channel-spliced with I_co2, and processed by the first ConvBlock-do to obtain I_co3. Similarly, the second to fourth ConvBlock-do take as input the replicated and expanded binary watermark information channel-spliced with the output feature map of the previous ConvBlock-do, and output I_co4, I_co5 and I_co6, respectively. The last ConvBlock-do takes only the output feature map of the previous ConvBlock-do as input and generates I_co7; I_co7 is then input into the RAM to adjust the feature representation, obtaining I_co8. Finally, I_co8 is input into the upsampling path, and the first to fifth ConvBlock-up output I_co9, I_co10, I_co11, I_co12 and I_en, respectively. In addition, this stage has skip connections between mirrored layers, namely between (I_co9, I_co6), (I_co10, I_co5), (I_co11, I_co4) and (I_co12, I_co3); specifically, the connected features are channel-spliced and then input into the ConvBlock-up. In order not to affect the image enhancement effect, the watermark information is embedded during the downsampling process. The watermark information is embedded five times in total, which adds redundancy to enhance the robustness of watermark extraction. The whole process essentially jointly encodes the watermark information and the image enhancement features using convolution operations, so as to obtain a watermark image with an enhancement effect. The repeated "copy and expand, then channel-splice" step is sketched below.
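The helper below is hypothetical; it assumes each watermark bit is replicated into one constant feature channel matching the current feature map's spatial size:

```python
import torch

def expand_and_concat(feat: torch.Tensor, msg: torch.Tensor) -> torch.Tensor:
    """Replicate an L-bit watermark to the spatial size of `feat` and
    splice it along the channel axis.

    feat: (B, C, H, W) feature map; msg: (B, L) binary watermark.
    Returns a (B, C + L, H, W) tensor fed to the next ConvBlock(-do).
    """
    _, _, h, w = feat.shape
    msg_map = msg.float()[:, :, None, None].expand(-1, -1, h, w)
    return torch.cat([feat, msg_map], dim=1)
```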
Noise layer: the noise layer is key to improving the robustness of the watermark model. During transmission of the watermark image over a communication channel, various image processing attacks inevitably occur. In order to extract complete watermark information, different attacks are added during training, which effectively improves the robustness of the watermark model against specific real attacks. Therefore, a noise sub-network is designed that simulates various attacks as a differentiable network layer in the iterative training of the network, with only one attack randomly selected in each training loop.
The noise layer includes five attacks: Crop(p%), Cropout(p%), Dropout(p%), Resize(scale) and JPEG(Q). Crop(p%) means randomly cropping the image by a percentage of p%. Cropout(p%) means selecting a p% pixel area from the watermark image I_en, selecting a (1-p%) pixel area from the original underwater image I_co, and splicing the two into a new image. Dropout(p%) means retaining watermark image I_en pixels at a percentage of p%, with the remaining pixels filled from the original underwater image I_co. Resize(scale) means enlarging or reducing the image by the factor scale. JPEG(Q) denotes JPEG compression of the image with quality factor Q.
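A minimal sketch of such a noise layer follows; the default parameter values are illustrative, the rectangular masks for Crop/Cropout are one plausible reading of the description, and real JPEG(Q) compression is left as a placeholder because it is not differentiable as written:

```python
import random
import torch
import torch.nn.functional as F

def noise_layer(i_en: torch.Tensor, i_co: torch.Tensor,
                p: float = 0.3, scale: float = 0.8, q: int = 50):
    """Apply one randomly chosen attack per training iteration.
    i_en: watermark image, i_co: original underwater image, both (B, C, H, W).
    """
    attack = random.choice(["crop", "cropout", "dropout", "resize", "jpeg"])
    b, c, h, w = i_en.shape
    ch, cw = int(h * p ** 0.5), int(w * p ** 0.5)   # p-percent-area rectangle
    if attack == "crop":
        # keep a random p% sub-rectangle of the watermark image
        y, x = random.randint(0, h - ch), random.randint(0, w - cw)
        return i_en[:, :, y:y + ch, x:x + cw]
    if attack == "cropout":
        # p% rectangle from the watermark image, the rest from the original
        mask = torch.zeros(1, 1, h, w, device=i_en.device)
        y, x = random.randint(0, h - ch), random.randint(0, w - cw)
        mask[:, :, y:y + ch, x:x + cw] = 1.0
        return i_en * mask + i_co * (1 - mask)
    if attack == "dropout":
        # keep p% of watermark-image pixels, fill the rest from the original
        mask = (torch.rand(b, 1, h, w, device=i_en.device) < p).float()
        return i_en * mask + i_co * (1 - mask)
    if attack == "resize":
        return F.interpolate(i_en, scale_factor=scale, mode="bilinear",
                             align_corners=False)
    # "jpeg": JPEG(Q) with quality factor q is non-differentiable; a
    # differentiable approximation would be substituted here during training
    return i_en
```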
The watermark extractor E: the extractor E learns to decode the watermark information, extracting the watermark information M_out from the received noise image I_no. This process can be expressed as:

$$M_{out} = f_{E}(I_{no}; \theta_{E})$$
Here ConvBlock is the same as in the WWE network, ConvBlock-K denotes K identical ConvBlock convolution blocks, GAP denotes global average pooling, and FC denotes a fully connected layer. The noise image I_no passes through the first ConvBlock to obtain a 64-channel feature representation I_no1. Then, I_no1 is input into the K identical ConvBlocks to obtain the 64-channel feature map representation I_no2. The purpose of this step is to extract rich deep features of the image from which the embedded watermark information can be recovered. Next, I_no2 is input into the GAP layer for global average pooling, obtaining a 1 × 1 × 64 tensor I_no3. Finally, I_no3 is reshaped into a 1 × 64 one-dimensional vector and processed by the FC layer to generate the final binary watermark information of length L. The essence of watermark reconstruction is to extract the watermark information from different levels of image features.
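A sketch of the extractor under these assumptions follows; the activation inside the blocks, K = 7 and the watermark length L = 30 are illustrative choices rather than values fixed by the description:

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # plain 3x3 / stride-1 convolution block (conv -> LeakyReLU -> BN);
    # the activation choice for the extractor is an assumption
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, 1, 1),
        nn.LeakyReLU(0.2),
        nn.BatchNorm2d(out_ch),
    )

class Extractor(nn.Module):
    """First ConvBlock -> K identical ConvBlocks -> GAP -> FC."""
    def __init__(self, in_ch: int = 3, width: int = 64,
                 k: int = 7, msg_len: int = 30):
        super().__init__()
        blocks = [conv_block(in_ch, width)]
        blocks += [conv_block(width, width) for _ in range(k)]
        self.features = nn.Sequential(*blocks)
        self.gap = nn.AdaptiveAvgPool2d(1)    # global average pooling
        self.fc = nn.Linear(width, msg_len)   # L-bit prediction

    def forward(self, i_no: torch.Tensor) -> torch.Tensor:
        f = self.gap(self.features(i_no)).flatten(1)   # (B, 64)
        return torch.sigmoid(self.fc(f))               # floats in (0, 1)
```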
The discriminator D: the discriminator D scores the watermark image I_en against the label image GT and outputs a score S, which can be expressed as:

$$S = f_{D}(I_{en}, GT; \theta_{D})$$
The purpose of the discriminator is to separate the two inputs by assigning the watermark image I_en a higher score and the label image GT a lower score, which in turn drives the encoder to enhance the similarity between the two. The discriminative power of the discriminator inevitably affects the performance of WWE, since the two are updated in a competitive relationship. Therefore, in order to improve the performance of WWE, a multi-scale downsampling fusion strategy is proposed in the model, and a Markov Patch-GAN architecture is adopted. This architecture assumes that image pixels are independent beyond the patch size, i.e., discrimination is based only on patch-level information. This assumption is important for capturing high-frequency characteristics such as local texture and style. As shown in fig. 1, the input of the first ConvBlock is the watermark image and the label image; the input of each of the second to fourth ConvBlocks is the output of the previous ConvBlock together with the label image downsampled to the corresponding size, and this downsampling of the label image GT is done using the same ConvBlock. Finally, a convolution operation is used to output a score map of size 16 × 16 × 1. The convolution kernel size in each convolution block is 3 × 3 with stride 2 and padding 1, using the ReLU activation function and batch normalization (BN). A sketch under these assumptions follows.
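The channel width below is illustrative; with a 256 × 256 input, the four stride-2 stages plus the final convolution yield the stated 16 × 16 × 1 score map:

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # 3x3 convolution, stride 2, padding 1, with ReLU and BN as described
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, 2, 1),
        nn.ReLU(),
        nn.BatchNorm2d(out_ch),
    )

class Discriminator(nn.Module):
    """Markov Patch-GAN critic: each stage fuses the previous stage's
    output with a same-size ConvBlock-downsampled view of the label image."""
    def __init__(self, in_ch: int = 3, w: int = 64):
        super().__init__()
        self.stage1 = conv_block(in_ch, w)
        self.stage2 = conv_block(2 * w, w)
        self.stage3 = conv_block(2 * w, w)
        self.stage4 = conv_block(2 * w, w)
        # parallel ConvBlocks that downsample the label image GT
        self.gt1 = conv_block(in_ch, w)
        self.gt2 = conv_block(w, w)
        self.gt3 = conv_block(w, w)
        self.score = nn.Conv2d(w, 1, 3, 1, 1)   # patch-level score map

    def forward(self, img: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
        g1 = self.gt1(gt)
        g2 = self.gt2(g1)
        g3 = self.gt3(g2)
        x = self.stage1(img)
        x = self.stage2(torch.cat([x, g1], dim=1))
        x = self.stage3(torch.cat([x, g2], dim=1))
        x = self.stage4(torch.cat([x, g3], dim=1))
        return torch.sigmoid(self.score(x))      # 16x16x1 for 256x256 input
```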
Residual attention module
An attention mechanism in deep learning enables the network to focus on important features and ignore irrelevant ones. It is applied in the present model, and the attention module with residual connection (RAM) adjusts the weights according to the importance of space and channel. Taking an input feature map F of size C × H × W computed by a convolution block as an example, the module first calculates a one-dimensional channel attention using the interdependencies between channels, and multiplies the channel attention element-wise with the input to obtain the channel-attention-adjusted feature map Q of size C × H × W. Second, it calculates a two-dimensional spatial attention of Q using the interdependencies of the spatial regions, multiplies the spatial attention element-wise with Q, and outputs a feature map U of size C × H × W that finally contains both channel attention and spatial attention. Finally, residual connection of the input feature map F and the output feature map U yields the residual attention feature map F' of size C × H × W. The structure of the residual attention module is shown in fig. 2.
For channel attention, spatial information of the input feature map F is first aggregated using average pooling and max pooling along the spatial direction, giving F^c_avg and F^c_max, both of size C × 1 × 1, which can be calculated with the following formula:

$$F^{c}_{avg} = \mathrm{AvgPool}(F), \qquad F^{c}_{max} = \mathrm{MaxPool}(F)$$
the two pooling operations being per channel
Figure BDA0003752403170000114
Global information is compressed into two scalars as a representation of spatial features. To model the correlation between each channel, propagation is performed using a shared network consisting of two fully-connected channels
Figure BDA0003752403170000115
And
Figure BDA0003752403170000116
and then, fusing the two feature vectors through element addition, and converting the fused feature vectors into channel attention through a sigmoid function. The channel attention CA is obtained by:
Figure BDA0003752403170000117
wherein σ (-) denotes a sigmoid function, δ (-) denotes a ReLU function,
Figure BDA0003752403170000118
and
Figure BDA0003752403170000119
the weights of the two fully connected layers are represented separately, and r is set to 16 in order to reduce the model computation cost. Finally, each element of CA is multiplied by each pass of F to compute Q, which can be expressed as:
Figure BDA00037524031700001110
wherein the content of the first and second substances,
Figure BDA00037524031700001111
representing pixel multiplication. Thus, a feature map after the attention of the channel is adjusted is obtained.
Similar to channel attention, channel information of the input feature map Q is first aggregated using average pooling and max pooling along the channel direction, giving Q^s_avg and Q^s_max, both of size 1 × H × W, which can be calculated with the following formula:

$$Q^{s}_{avg} = \mathrm{AvgPool}(Q), \qquad Q^{s}_{max} = \mathrm{MaxPool}(Q)$$
these two pooling operations compress all channel information into one channel as a representation of the channel characteristics. Next, the process of the present invention is described,
Figure BDA0003752403170000121
and
Figure BDA0003752403170000122
and (4) carrying out channel splicing, and obtaining the space attention through the convolution layer and the sigmoid function. The calculation of spatial attention SA may be expressed as:
Figure BDA0003752403170000123
wherein σ (·) represents a sigmoid function, Conv (·) represents convolution operation, and-represents channel splicing. In the convolution operation, the size of the convolution kernel is set to 7 × 7. Finally, each element of SA is multiplied by the element at the corresponding position of Q to calculate U, which can be expressed as U-SA × Q
Furthermore, to avoid the gradient vanishing problem and preserve the good characteristics of the original features, a residual connection is added: adding F to U yields F'. The overall process of the residual attention module can be expressed as the function:

$$F' = \mathrm{RAM}(F) = F + U$$
Thus, when a feature F passes through the RAM, important spatial regions and channels are given greater weight, and vice versa. F' has a stronger representation capability for image enhancement and watermark embedding.
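A minimal sketch of the RAM, assuming the CBAM-style pooling and the 7 × 7 spatial convolution described above:

```python
import torch
import torch.nn as nn

class ResidualAttentionModule(nn.Module):
    """Channel attention -> spatial attention -> residual connection,
    following the formulas above."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        # shared two-layer fully connected network for channel attention
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(),
            nn.Linear(channels // r, channels),
        )
        # 7x7 convolution over the two concatenated spatial pooling maps
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        # channel attention: CA = sigmoid(MLP(F_avg^c) + MLP(F_max^c))
        f_avg = f.mean(dim=(2, 3))               # (B, C) average pooling
        f_max = f.amax(dim=(2, 3))               # (B, C) max pooling
        ca = torch.sigmoid(self.mlp(f_avg) + self.mlp(f_max))
        q = f * ca[:, :, None, None]             # Q = CA (x) F
        # spatial attention: SA = sigmoid(Conv([Q_avg^s; Q_max^s]))
        q_avg = q.mean(dim=1, keepdim=True)      # (B, 1, H, W)
        q_max = q.amax(dim=1, keepdim=True)      # (B, 1, H, W)
        sa = torch.sigmoid(self.conv(torch.cat([q_avg, q_max], dim=1)))
        u = sa * q                               # U = SA (x) Q
        return f + u                             # F' = F + U
```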
Loss function
In the proposed model, the watermark encoder WWE combined with image enhancement and the extractor E work jointly in an end-to-end manner and are updated synchronously during training; the discriminator is optimized alternately, based on the idea of mutual adversarial training. The basic requirement of digital image watermarking technology is to ensure that the original underwater image I_co and the watermark image I_en are visually indistinguishable. However, unlike general digital image watermarking, the image enhancement process is taken into account when embedding the watermark, so the requirement of this model is to enhance the similarity between the watermark image I_en and the label image GT. To achieve this goal, the mean square error is used to represent the image global similarity loss:

$$\mathcal{L}_{i} = \frac{1}{CHW} \left\| GT - I_{en} \right\|_{2}^{2}$$
further, in order to encourage the generation of a watermark image whose content (i.e., feature representation) is similar to the tag image GT image, the watermark image I en And inputting the label image GT into a pre-trained VGG-19 network, and then respectively extracting the high-level features of the block5_ conv2 layer output and minimizing the difference between the high-level features and the high-level features. This difference is called the perceptual loss and is calculated as follows:
Figure BDA0003752403170000131
watermarking techniques require accurate extraction of watermark information from the watermark image. Embedded watermark information M in Each value of which is 0 or 1, and extracted watermark information M out Is a floating point number between 0 and 1. Minimizing M during training using mean square error loss in And M out The difference between them:
Figure BDA0003752403170000132
when the model training is completed and the model is actually applied, M is required to be added out Rounded to 0 or 1 to construct the true binary sequence. As described above, discriminator D employs a Markov Patch-GAN architecture that is effective in capturing high frequency information about texture and style. Thus, a discriminator is used on a watermark imageThe output of (c) is used as an adversarial loss of WWE encoder to enhance local texture and style consistency. It can be expressed as:
Figure BDA0003752403170000133
in summary, the training goal of the combined image enhanced watermark encoder WWE and extractor E is to minimize:
Figure BDA0003752403170000134
wherein λ is i ,λ p ,λ m ,λ adv The relative weight of each item is expressed and set to 1.0, 0.5, 0.2 and 0.3, respectively, according to the experimental results.
The discriminator strives to reduce the prediction score of the label image GT and to enlarge the prediction score of the watermark image I_en. To train the discriminator D, the following loss is minimized:

$$\mathcal{L}_{D} = \log f_{D}(GT; \theta_{D}) + \log \left( 1 - f_{D}(I_{en}; \theta_{D}) \right)$$
the enhanced watermark encoder WWE and extractor E are jointly trained and alternately trained with the discriminator D until the loss function converges. Wherein WWE and E together minimize a loss function
Figure BDA0003752403170000136
And D is responsible for minimizing
Figure BDA0003752403170000137
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A watermark embedding method in combination with underwater image enhancement, comprising the steps of:
acquiring an original underwater image, constructing a watermark encoder combined with image enhancement, and carrying out image enhancement and embedding initial binary watermark information on the original underwater image based on the watermark encoder to acquire a watermark image with an enhancement effect;
constructing a noise layer, inputting the watermark image and the original underwater image into the noise layer, and obtaining a noise image;
constructing a watermark extractor, extracting and decoding the watermark information of the noise image based on the watermark extractor, and obtaining target binary watermark information;
constructing a discriminator, and scoring the watermark image and the label image;
constructing a multi-modal loss function, and evaluating losses on the global content, color and texture information of the watermark image as well as a loss for watermark robustness;
and jointly training the watermark encoder combined with enhancement and the watermark extractor, training alternately with the discriminator, and updating iteratively during training until the multi-modal loss function converges.
2. The watermark embedding method in combination with underwater image enhancement as claimed in claim 1, wherein the watermark encoder comprises: 5 downsampling convolution blocks, 5 upsampling convolution blocks, two normal convolution blocks, and a residual attention module.
3. The watermark embedding method combined with underwater image enhancement as recited in claim 1, wherein the noise image is obtained as follows: the image is attacked by any one of Crop(p%), Cropout(p%), Dropout(p%), Resize(scale), or JPEG(Q), wherein Crop(p%) represents randomly cropping the image by a percentage of p%; Cropout(p%) means that a p% pixel region is selected from the watermark image, a (1-p%) pixel region is selected from the original underwater image, and the two are spliced into a new image; Dropout(p%) indicates that watermark image pixels are retained at a percentage of p%, with the remaining pixels filled from the original underwater image; Resize(scale) means enlarging or reducing the image by the factor scale; JPEG(Q) denotes JPEG compression of the image with quality factor Q.
4. The watermark embedding method in combination with underwater image enhancement as claimed in claim 1, wherein said watermark extractor comprises: several convolution blocks, a global average pooling (GAP) layer, and a fully connected layer.
5. The watermark embedding method in combination with underwater image enhancement according to claim 1, characterized in that the discriminator comprises 4 convolution blocks and 1 convolution layer and adopts a multi-scale downsampling fusion strategy with a Markov Patch-GAN architecture, wherein the input of the first convolution block is the watermark image and the label image, the input of each of the second to fourth convolution blocks is the output of the previous convolution block together with a feature map of the label image downsampled to the corresponding size, and the convolution layer outputs scores.
6. The watermark embedding method combined with underwater image enhancement as claimed in claim 2, wherein the watermark image is obtained as follows: the original underwater image passes through a first convolution block to obtain a first feature map; the watermark information is copied and expanded to the same size as the first feature map, channel-spliced with it, and the spliced result is input into a second convolution block to obtain a second feature map; the watermark information is copied and expanded to the same size as the second feature map, channel-spliced with it, and the spliced result is input into a first downsampling convolution block; a first loop process is then entered, sequentially obtaining a third, fourth, fifth and sixth feature map; the sixth feature map is input into a fifth downsampling convolution block to obtain a seventh feature map; the seventh feature map is input into the residual attention module to adjust the feature representation, obtaining an eighth feature map; a second loop process is then entered, sequentially obtaining a ninth, tenth, eleventh and twelfth feature map and the watermark image, wherein skip connections exist between mirrored layers, and the connected features are first channel-spliced and then input into the upsampling convolution block.
7. The watermark embedding method in combination with underwater image enhancement as claimed in claim 6, wherein the process by which the residual attention module adjusts the feature representation comprises: calculating a one-dimensional channel attention using the interdependencies among channels, and multiplying the channel attention element-wise with the input to obtain a channel-attention-adjusted feature map; calculating a two-dimensional spatial attention using the interdependencies of spatial regions, and multiplying the spatial attention element-wise with the channel-attention-adjusted feature map; and finally performing a residual connection between the input feature map and the output feature map to obtain a residual attention feature map.
8. The method of claim 1, wherein the multi-modal loss function comprises: representing image global similarity loss by mean square error
Figure FDA0003752403160000031
Inputting the watermark image and the label image into a pre-trained VGG-19 network, respectively extracting high-level features output by a block 5-conv 2 layer, wherein the difference is perception loss, and the calculation method is that
Figure FDA0003752403160000032
The difference between the initial binary watermark information and the target binary watermark information is minimized mean square error, and the expression is
Figure FDA0003752403160000033
The output result of the discriminator on the watermark image is taken as the adversarial loss of the watermark encoder, and is expressed as:
Figure FDA0003752403160000034
wherein GT index labels the image, I en Is a watermark image, C is the number of channels of the image, H is the height of the image, W is the width of the image, M in For the initial binary watermark information, M out For said target binary watermark information, θ D Is a discriminator parameter; the goal of the co-training is to minimize losses, expressed as:
Figure FDA0003752403160000035
wherein λ is i ,λ p ,λ m ,λ adv Relative weights for each term are set to 1.0, 0.5, 0.2, 0.3, respectively; during the discriminator training, the following losses are minimized:
Figure FDA0003752403160000041
9. The watermark embedding method combined with underwater image enhancement as recited in claim 6, wherein the first loop process is: the watermark information is copied and expanded to the same size as the N-th feature map and channel-spliced with the N-th feature map to serve as the input of the (N-1)-th downsampling convolution block, obtaining the (N+1)-th feature map, where N runs from 2 to 5.
10. The watermark embedding method combined with underwater image enhancement as claimed in claim 6, wherein the second loop process is: the M-th feature map is input into the (M-7)-th upsampling convolution block to obtain the (M+1)-th feature map, where M runs from 8 to 11, and the twelfth feature map is input into the fifth upsampling convolution block to obtain the watermark image.
CN202210852829.9A 2022-07-19 2022-07-19 Watermark embedding method combined with underwater image enhancement Withdrawn CN115115500A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210852829.9A CN115115500A (en) 2022-07-19 2022-07-19 Watermark embedding method combined with underwater image enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210852829.9A CN115115500A (en) 2022-07-19 2022-07-19 Watermark embedding method combined with underwater image enhancement

Publications (1)

Publication Number Publication Date
CN115115500A true CN115115500A (en) 2022-09-27

Family

ID=83333922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210852829.9A Withdrawn CN115115500A (en) 2022-07-19 2022-07-19 Watermark embedding method combined with underwater image enhancement

Country Status (1)

Country Link
CN (1) CN115115500A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115880125A (en) * 2023-03-02 2023-03-31 宁波大学科学技术学院 Soft fusion robust image watermarking method based on Transformer
CN116152116A (en) * 2023-04-04 2023-05-23 青岛哈尔滨工程大学创新发展中心 Underwater image enhancement method based on visual self-attention model
CN116152116B (en) * 2023-04-04 2023-07-21 青岛哈尔滨工程大学创新发展中心 Underwater image enhancement method based on visual self-attention model
CN116308985A (en) * 2023-05-23 2023-06-23 贵州大学 Robust watermarking method for diffusion tensor image
CN116308985B (en) * 2023-05-23 2023-07-25 贵州大学 Robust watermarking method for diffusion tensor image


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220927